Acoustic quality evaluation device, acoustic quality evaluation method, and program

The acoustic quality evaluation device evaluates loudspeaker communication systems under simulated noisy conditions using a data generation device and open-type headphones. It enables sound quality assessment through listening tests rather than conversation tests, while accounting for the effect of ambient noise on how acoustic echo cancellation is perceived.

JP7875485B2 · Active · Publication Date: 2026-06-18 · NIPPON TELEGRAPH & TELEPHONE CORP

Patent Information

Authority / Receiving Office
JP · JP
Patent Type
Patents
Current Assignee / Owner
NIPPON TELEGRAPH & TELEPHONE CORP
Filing Date
2022-12-07
Publication Date
2026-06-18

AI Technical Summary

Technical Problem

No method has been established for evaluating the sound quality of loudspeaker communication systems under ambient noise. Until now, such evaluation could only be done through conversation tests, which are costly, time-consuming, and poorly reproducible.

Method used

An acoustic quality evaluation device and method that uses a data generation device to simulate near-end and far-end speaker environments, incorporating ambient noise, and a listening test setup with open-type headphones to evaluate sound quality without requiring actual conversations, by presenting reference and evaluation sounds in stereo to evaluators.

Benefits of technology

Enables accurate acoustic quality evaluation in noisy environments through listening tests, eliminating the need for costly and time-consuming conversational tests, and providing consistent evaluation values.

✦ Generated by Eureka AI based on patent content.


Abstract

The present invention achieves appropriate acoustic quality evaluation for a loudspeaker hands-free communication system through a listening test without carrying out a conversation test even in a telephony environment having environmental noise. An acoustic quality evaluation device according to this disclosed technology evaluates an acoustic quality of a loudspeaker hands-free communication system including a first terminal device and a second terminal device and comprises a data storage unit, an acoustic output processing unit, and a noise output processing unit. The data storage unit records sound to be evaluated which has been picked up by the first terminal device and received by the second terminal device. The acoustic output processing unit outputs the sound to be evaluated to a head-mounted open acoustic device. The noise output processing unit outputs environmental noise surrounding the open acoustic device.

Description

【Technical Field】
【0001】 The disclosed technology relates to a technology for evaluating call quality, and particularly to a quality evaluation test technology for a loudspeaker communication system.
【Background Art】
【0002】 With the development of communication technology, opportunities to use a loudspeaker communication system, such as a hands-free loudspeaker call using a conference system or a smartphone, have increased because of the convenience of making a call without holding a device. An acoustic echo canceller (AEC) is used to remove acoustic echo and ambient noise, which are problems in a loudspeaker communication system, and to provide a comfortable call environment.
【0003】 Figure 1 schematically shows acoustic echo and AEC. A near-end speaker 101 and a far-end speaker 102 communicate with each other using a loudspeaker communication system. 103 and 104 are the microphone and loudspeaker on the near-end speaker side, and 105 and 106 are the microphone and loudspeaker on the far-end speaker side.
【0004】 "Hello" spoken by the near-end speaker 101 is output from the far-end loudspeaker 105 (107) and reaches the ear of the far-end speaker 102. In a loudspeaker communication system, the loudspeaker output 107 is also picked up by the far-end microphone 106 (feedback 108). If the acoustic echo of the near-end speaker's voice "Hello" picked up by the far-end microphone 106 is transmitted directly to the near-end side, the call becomes difficult or howling occurs. Therefore, a loudspeaker communication system includes an AEC 109 and transmits to the near-end side an audio signal from which the near-end speaker's voice has been removed or reduced. When the AEC also has a noise cancellation function, the noise around the far-end speaker is also removed or suppressed.
【0005】 If the effect of the AEC is too weak, acoustic echo remains; if it is too strong, the voice transmitted from the far end is also removed, distorted, or lost, making it difficult to hear.
Since the performance of an AEC depends on how accurately acoustic echo is eliminated, conventional AEC performance evaluations have mainly relied on objective evaluations (evaluations using computers, etc.) that focus on the amount of acoustic echo elimination. While objective evaluations are easy to perform because they can be done with computer processing, they do not necessarily match the quality that users experience during actual calls (also called "user-perceived quality").
【0006】 To evaluate acoustic echo and AEC-processed sound by subjective evaluation (human listening evaluation), the evaluator must perceive the acoustic echo, which is possible only when the evaluator is the one making the call. For this reason, quality evaluation through two-way conversation tests has been recommended for loudspeaker communication systems such as hands-free loudspeaker calls (see Non-Patent Document 1). However, conducting conversation tests requires know-how, is time-consuming and costly, and has low reproducibility.
【0007】 On the other hand, in calls using handsets or headsets, the voice transmitted from the far end is not affected by the near-end speaker's voice (for example, as acoustic echo), and the far-end voice alone can be evaluated. In this case, call quality can be evaluated by simplifying the conversation test into a listening test targeting one-way calls, and this test method is common in the evaluation of IP phone call quality. The listening test is more reproducible and requires less time than the conversation test, making it more convenient. Objective evaluation methods such as PESQ (Perceptual Evaluation of Speech Quality), which estimates the subjective evaluation values obtained from the listening test (also called "Listening MOS: Mean Opinion Score"), have also been established (see Non-Patent Document 2).
In recent years, methods have been proposed to apply subjective evaluations from listening tests and objective evaluations such as PESQ to loudspeaker communication systems (Non-Patent Document 3, Patent Document 1).
[Prior Art Documents]
[Patent Documents]
【0008】
[Patent Document 1] Japanese Patent Publication No. 2016-46694
[Non-Patent Documents]
【0009】
[Non-Patent Document 1] ITU-T, "ITU-T Recommendation P.800: Methods for subjective determination of transmission quality", ITU, 1996
[Non-Patent Document 2] ITU-T, "ITU-T Recommendation P.862: Perceptual evaluation of speech quality (PESQ): An objective method for end-to-end speech quality assessment of narrow-band telephone networks and speech codecs", ITU, 2002
[Non-Patent Document 3] Sachiko Kurihara, Suehiro Shimauchi, Katsuhiro Fukui, Noboru Harada, "Evaluation of User Experience Quality in Hands-Free Calling: Examination of a Subjective Evaluation Method Consistent with PESQ," IEICE Technical Report, vol. 117, no. 386, CQ2017-96, pp. 63-68, January 2018.
[Summary of the Invention]
[Problems that the invention aims to solve]
【0010】 With the widespread use of smartphones and PCs, opportunities to use hands-free voice chat in noisy environments are increasing. Here, noise refers to sounds such as office air conditioning noise, sounds inside a moving car, traffic noise at intersections, insect sounds, keyboard typing sounds, factory machinery noise, and the voices of multiple people (chatting noise), regardless of volume and of whether they occur indoors or outdoors. However, a method for evaluating the sound quality of loudspeaker communication systems under such ambient noise conditions has not yet been established.
【0011】 When a near-end speaker is in a "quiet environment," the loudspeaker output sound from the far-end terminal (the far-end speaker's voice, ambient noise, acoustic echo, distortion of the far-end speaker's voice due to AEC processing, residual echo, etc.)
is easily perceived in detail, so even slight distortion or noise superimposition results in a lower evaluation. On the other hand, when the near-end speaker is in an environment with ambient noise, the loudspeaker output from the far-end terminal is masked by the ambient noise around the near-end speaker, making it difficult to hear the noise components from the far-end terminal (the far-end speaker's ambient noise, acoustic echo, distortion of the far-end speaker's voice due to AEC processing, residual echo, etc.). As a result, even with some distortion or noise superimposition, the impact on the evaluation becomes smaller, and the evaluation tends to be higher than in a quiet environment.
【0012】 Thus, even if the output sound from the loudspeaker (= the sound being evaluated) is the same, its evaluation differs depending on whether the listener is in a quiet environment without ambient noise or in an environment with ambient noise. Until now, evaluation in such an environment could only be done through conversation tests. The purpose of the disclosed technology is to realize an acoustic quality evaluation technology that can obtain appropriate evaluation values through listening tests without conducting conversation tests, even in a call environment with ambient noise.
[Means for solving the problem]
【0013】 To solve the above problems, the disclosed acoustic quality evaluation device is a device for evaluating the acoustic quality of a loudspeaker communication system including a first terminal device and a second terminal device, and includes a data storage unit, an acoustic output processing unit, and a noise output processing unit. The data storage unit records the sound to be evaluated, which is picked up by the first terminal device and received by the second terminal device. The acoustic output processing unit outputs the sound to be evaluated to a head-mounted, open-type acoustic device.
The noise output processing unit outputs ambient noise that surrounds the open-type acoustic device.
[Effects of the Invention]
【0014】 According to the disclosed technology, even in a call environment with ambient noise, appropriate acoustic quality evaluation of a loudspeaker communication system can be achieved through a listening test without conducting a conversation test.
【Brief Description of the Drawings】
【0015】
[Figure 1] Diagram schematically showing acoustic echo and AEC.
[Figure 2] Diagram explaining the acoustic quality evaluation test by a listening test in a loudspeaker communication system.
[Figure 3] Functional block diagram of the acoustic quality evaluation system according to the first embodiment.
[Figure 4] Functional block diagram of the data generation device according to the first embodiment.
[Figure 5] Flowchart showing the operation of the near-end system of the data generation device.
[Figure 6] Flowchart showing the operation of the far-end system of the data generation device.
[Figure 7] Flowchart showing the operation of the data recording system of the data generation device.
[Figure 8] Functional block diagram of the acoustic quality evaluation device according to the first embodiment.
[Figure 9] Flowchart showing the operation of the acoustic quality evaluation device.
[Figure 10] Diagram showing an example of the display by the display unit.
[Figure 11] Diagram showing an example of a listening test room.
[Figure 12] Diagram showing a second example of a listening test room.
[Figure 13] Diagram showing a third example of a listening test room.
[Figure 14] Diagram showing an example of the functional configuration of a computer.
【Modes for Carrying Out the Invention】
【0016】 Hereinafter, embodiments of the disclosed technology will be described in detail. Components having the same function are denoted by the same reference numerals, and redundant descriptions are omitted.
【0017】 [Overview of the Listening Test]
First, the acoustic quality evaluation test by a listening test in a loudspeaker communication system is explained conceptually using Figure 2. In this acoustic quality evaluation test, a near-end speaker 201 and a far-end speaker 202 converse through a loudspeaker communication system 2, and an evaluator 203 located on the near-end speaker 201's side evaluates the quality of the loudspeaker communication system 2. A loudspeaker communication system is a communication system that transmits and receives acoustic signals between terminal devices each equipped with a microphone and a loudspeaker, wherein at least a portion of the sound output from the loudspeaker of a terminal device (for example, 210 "Hello") is picked up by the microphone of that terminal device (for example, sound leakage 211 occurs). Examples of loudspeaker communication systems include voice conferencing systems and video conferencing systems.
【0018】 In the loudspeaker communication system 2, the voice 209 of the near-end speaker is picked up by the microphone 204 on the near-end speaker's side, the resulting acoustic signal is transmitted to the far-end speaker's side via the network 208, and the sound represented by the acoustic signal is output from the loudspeaker 206 on the far-end speaker's side. Likewise, the voice of the far-end speaker is picked up by the microphone 207 on the far-end speaker's side, the resulting acoustic signal is transmitted to the near-end speaker's side via the network 208, and the sound represented by the acoustic signal is output from the loudspeaker 205 on the near-end speaker's side. However, at least a portion of the sound output from the loudspeaker 206 on the far-end speaker's side is also picked up by the microphone 207 on the far-end speaker's side.
That is, the far-end sound picked up by the microphone 207 on the far-end speaker's side is the far-end speaker's voice 212 "Hi" superimposed with the sound leakage 211 (acoustic echo) originating from the near-end speaker. In other words, it is the far-end speaker's voice 212 superimposed with the sound 210 originating from the near-end speaker after that sound has been degraded in the far-end speaker's space (211). When the near-end speaker 201 is not speaking, the sound leakage 211 originating from the near-end speaker is not superimposed, so the far-end speaker's voice is not degraded by it. The sound on the far-end speaker's side is also degraded by the superposition of ambient noise 213 on the far-end speaker's side.
【0019】 The acoustic signal transmitted to the near-end speaker's side may be a processed signal obtained by applying predetermined signal processing to the sound picked up by the microphone on the far-end speaker's side, or it may be obtained without such signal processing. An example of signal processing is processing that includes at least one of echo cancellation and noise cancellation. Here, echo cancellation refers to processing by an echo canceller in the broad sense, that is, any processing for reducing echoes. Broad-sense echo cancellation may be implemented, for example, by a narrow-sense echo canceller using an adaptive filter, by a voice switch, by echo reduction, by a combination of at least some of these techniques, or by a combination with other techniques (see Reference 1 below). Noise cancellation refers to processing that suppresses or removes noise components caused by any ambient noise, other than the voice of the far-end speaker, that occurs around the microphone of the far-end terminal (see Reference 2 below).
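As an illustration of the "narrow-sense echo canceller using an adaptive filter" mentioned in 【0019】, the following is a minimal sketch of a normalized LMS (NLMS) filter. It is not taken from the patent: NLMS is one common choice of adaptive algorithm, and the function names, tap count, step size, and toy echo path are all assumptions made for this example.

```python
import random


def simulate_echo(loudspeaker_signal, echo_path):
    """Convolve the loudspeaker signal with a toy echo path (assumed impulse response)."""
    out = []
    for n in range(len(loudspeaker_signal)):
        acc = 0.0
        for k, h in enumerate(echo_path):
            if n - k >= 0:
                acc += h * loudspeaker_signal[n - k]
        out.append(acc)
    return out


def nlms_echo_cancel(loudspeaker_signal, mic_signal, taps=8, mu=0.5, eps=1e-8):
    """Subtract an adaptively estimated echo from the microphone signal.

    loudspeaker_signal: what the terminal plays (the echo source).
    mic_signal: what the microphone picks up (echo plus any local sound).
    Returns the residual signal that would be sent back to the other side.
    """
    weights = [0.0] * taps   # current estimate of the echo path
    history = [0.0] * taps   # most recent loudspeaker samples, newest first
    residual = []
    for played, picked_up in zip(loudspeaker_signal, mic_signal):
        history = [played] + history[:-1]
        echo_estimate = sum(w * x for w, x in zip(weights, history))
        error = picked_up - echo_estimate
        norm = sum(x * x for x in history) + eps
        weights = [w + mu * error * x / norm for w, x in zip(weights, history)]
        residual.append(error)
    return residual


def signal_power(signal):
    """Mean square value of a signal."""
    return sum(s * s for s in signal) / len(signal)


# Toy run: white noise played by the loudspeaker, echo only at the microphone.
rng = random.Random(0)
played = [rng.uniform(-1.0, 1.0) for _ in range(4000)]
picked_up = simulate_echo(played, [0.6, 0.3, -0.2])
sent_back = nlms_echo_cancel(played, picked_up)
print(signal_power(picked_up[-500:]), signal_power(sent_back[-500:]))
```

In this toy run the microphone contains only echo, so the residual power should fall far below the microphone power once the filter has adapted. A real terminal also has local speech and noise in the microphone signal, which this sketch does not handle (no double-talk control).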
【0020】
[Reference 1] Knowledge Base: Forest of Knowledge, Group 2-6-Chapter 5, "Acoustic Echo Canceller", Institute of Electronics, Information and Communication Engineers
[Reference 2] Sumi Sakauchi, Yoichi Haneda, Masashi Tanaka, Junko Sasaki, Akitoshi Kataoka, "Acoustic Echo Canceller with Noise Suppression and Echo Suppression Functions," IEICE Transactions on Electronics, Information and Communication Engineers, Vol. J87-A, No. 4, pp. 448-457, April 2004.
【0021】 The disclosed technology provides an apparatus and a method for listening tests, particularly for situations where there is ambient noise on the near-end speaker's side. It differs from the technology disclosed in Patent Document 1, which was a listening test targeting a quiet environment with no noise around the near-end speaker.
【0022】 [First Embodiment]
Figure 3 shows a functional block diagram of an example of an acoustic quality evaluation system according to the first embodiment. The acoustic quality evaluation system 3 includes a data generation device 4 for testing and an acoustic quality evaluation device 32.
【0023】 [Data generation device]
Figure 4 shows a functional block diagram of the data generation device 4 according to the first embodiment. The data generation device 4 consists of a near-end system 41 that simulates a near-end speaker environment, a far-end system 42 that simulates a far-end speaker environment, and a data recording system 43 that records simulated communication between the simulated environments. The near-end system 41 and the far-end system 42 communicate via the network 44. The simulated communication recorded by the data recording system 43 is used later for acoustic quality evaluation.
【0024】 <Near-end system>
The near-end system includes a near-end ambient noise signal storage unit 410, a near-end speaker voice signal storage unit 411, playback units 412 and 413, a near-end terminal unit 414, and a signal processing unit 415.
<Far-end system>
The far-end system includes a far-end ambient noise signal storage unit 420, a far-end speaker voice signal storage unit 421, playback units 422 and 423, loudspeakers 424, 425 and 426, a microphone 427, a far-end terminal unit 428, and a signal processing unit 429.
<Data recording system>
The data recording system comprises a recording processing unit 430, a time adjustment processing unit 431, a data storage unit 432, output units 433, 434, 435, 436, 437, and 438, and a switch 439.
【0025】 Figures 5, 6, and 7 are flowcharts illustrating an example of the operation of the data generation device 4.
<Operation of the near-end system>
The operation of the near-end system is explained with reference mainly to Figures 4 and 5. The data generation device 4 reads the voice signal from the near-end speaker voice signal storage unit 411, reproduces it in the playback unit 413 (step S501), and inputs it to the near-end terminal unit 414. This input corresponds to the voice emitted by the near-end speaker. Simultaneously, the reproduced signal is output to the output units 433, 435, and 437 of the data recording system 43 (step S504). This output (the voice emitted by the near-end speaker) becomes the reference sound (described later) in the stereo listening test (described later). The data generation device 4 also reads a noise signal from the near-end ambient noise signal storage unit 410, reproduces it in the playback unit 412 (step S502), and inputs it to the near-end terminal unit 414. This input corresponds to the ambient noise around the near-end speaker.
【0026】 The signal processing unit 415 performs signal processing (echo cancellation and noise cancellation) on the voice and noise input to the near-end terminal unit 414 (step S503), and transmits the result to the far-end terminal unit 428 via the network 44 (step S505).
In parallel, the data generation device 4 outputs the audio from the far end received by the near-end terminal unit 414 to the recording processing unit 430 of the data recording system (step S507).
【0027】 <Operation of the far-end system>
The operation of the far-end system is explained with reference mainly to Figures 4 and 6. The far-end terminal unit 428 outputs audio from the loudspeaker 426 based on the signal received from the near-end terminal (step S602). This output corresponds to the near-end speaker's voice that is emitted by the loudspeaker communication system and heard by the far-end speaker. The data generation device 4 reads the voice signal from the far-end speaker voice signal storage unit 421, reproduces it in the playback unit 423 (step S603), and outputs it from the loudspeaker 425 (step S604). This output corresponds to the voice emitted by the far-end speaker. In parallel, the playback unit 423 outputs the reproduced sound to the time adjustment processing unit 431 of the data recording system (step S611). This output serves as the reference signal for subsequent quality evaluation. The data generation device 4 also reads a noise signal from the far-end ambient noise signal storage unit 420, reproduces it in the playback unit 422 (step S605), and outputs it from the loudspeaker 424 (step S606). This output corresponds to the ambient noise around the far-end speaker.
【0028】 The far-end terminal unit 428 uses the microphone 427 to pick up the output from the loudspeakers 426, 425, and 424 (step S607) and inputs it to the signal processing unit 429. The signal processing unit 429 applies signal processing (echo cancellation and noise cancellation) to the input audio signal (a superposition of the near-end speaker's voice, the far-end speaker's voice, and ambient noise) as needed (step S608) and transmits the result to the near-end terminal unit 414 via the network 44 (step S610).
At the same time, the signal processing unit 429 transmits a signal indicating whether or not signal processing has been performed to the recording processing unit 430 of the data recording system (step S609). This signal processing status signal is used when recording the sound to be evaluated (see below).
【0029】 <Operation of the data recording system>
The operation of the data recording system is explained with reference mainly to Figures 4 and 7. The data recording system 43 records the audio output of the near-end system 41 and the audio output of the far-end system 42 in the data storage unit 432 for later use in acoustic quality evaluation. The acoustic quality evaluation compares the far-end speaker's voice before echo and noise are superimposed (reference signal), the far-end speaker's voice with superimposed echo and noise that has not undergone signal processing (degraded signal 1), and the far-end speaker's voice with superimposed echo and noise that has undergone signal processing (degraded signal 2).
【0030】 <<Stereo Listening Test>>
To make the evaluator perceive acoustic echo during the listening test, the test sounds are presented in a stereo configuration. The far-end sound containing the acoustic echo (the sound to be evaluated) is presented to one ear, and the near-end sound from which the acoustic echo originates (the reference sound) is presented simultaneously to the other ear. This simulates a scenario in which the evaluator sits next to the near-end speaker, whose voice causes the acoustic echo, and listens to the conversation with the far-end speaker, so the test effectively serves as a simulated conversation test. The reference sound and the sound to be evaluated can be supplied to either ear, but it is preferable to supply the reference sound to the non-dominant ear (e.g., the right ear) and the sound to be evaluated to the dominant ear (e.g., the left ear).
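The stereo configuration of 【0030】 can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, samples are assumed to be floats in the range -1.0 to 1.0, and the 16-bit interleaved PCM layout is a common convention rather than anything specified in the text.

```python
import struct


def make_stereo_frames(reference_sound, evaluated_sound, reference_on_right=True):
    """Pair the reference sound and the sound to be evaluated as (left, right) frames.

    The shorter signal is padded with silence so both channels have equal length.
    By default the reference sound goes to the right channel and the sound to be
    evaluated to the left channel, following the preference stated in the text.
    """
    n = max(len(reference_sound), len(evaluated_sound))
    ref = list(reference_sound) + [0.0] * (n - len(reference_sound))
    ev = list(evaluated_sound) + [0.0] * (n - len(evaluated_sound))
    left, right = (ev, ref) if reference_on_right else (ref, ev)
    return list(zip(left, right))


def frames_to_pcm16(frames):
    """Convert (left, right) float frames to interleaved little-endian 16-bit PCM."""
    def quantize(value):
        clipped = max(-1.0, min(1.0, value))
        return int(round(clipped * 32767))
    return b"".join(struct.pack("<hh", quantize(l), quantize(r)) for l, r in frames)


# Toy signals: three reference samples, two samples of the sound to be evaluated.
frames = make_stereo_frames([0.1, 0.2, 0.3], [0.5, -0.5])
pcm = frames_to_pcm16(frames)
print(frames, len(pcm))
```

Keeping the channel assignment in a single parameter makes it easy to swap ears for an evaluator whose dominant ear is the right one.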
【0031】 <<Reference Sound>>
To record the test sounds described above, the data generation device 4 outputs the output of the playback unit 413 of the near-end system to the output units 433, 435, and 437 as the reference sound for the reference signal, the reference sound for degraded signal 1, and the reference sound for degraded signal 2, respectively (step S701), and records it in the data storage unit 432 (step S707).
【0032】 <<Sound to be evaluated - Reference signal>>
The data generation device 4 also outputs the output of the playback unit 423 to the output unit 438 (step S703) after the time adjustment processing unit 431 applies a delay equivalent to the network delay (step S702), and records it in the data storage unit 432 (step S707). The pair of signals obtained from the output unit 437 and the output unit 438 is hereinafter referred to as the reference signal pair.
【0033】 <<Sound to be evaluated - Degraded signal 1>>
The data generation device 4 records the signals received by the near-end terminal unit 414 in conjunction with the processing of the far-end system. Specifically, if the signal processing unit 429 of the far-end terminal does not perform signal processing such as echo cancellation or noise cancellation, the signal processing unit 429 outputs a "signal processing OFF signal" to the recording processing unit 430 (step S609). The recording processing unit 430 controls the switch 439 according to the signal processing OFF signal (step S704) and outputs the audio received from the far end to the output unit 434 (step S705). The audio signal output from the output unit 434 is recorded in the data storage unit 432 (step S707). The pair consisting of the audio signal obtained from the output unit 433 and the audio signal obtained from the output unit 434 is hereinafter referred to as degraded signal pair 1.
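The recording flow of steps S701 to S707 can be pictured with a small sketch. It covers the reference signal pair and the OFF case above, plus the ON case of 【0034】, under assumptions of my own: the class and function names and the dictionary keys are hypothetical, and the network delay is modeled simply as leading silence of a given number of samples.

```python
from dataclasses import dataclass, field


def apply_network_delay(signal, delay_samples):
    """Time adjustment: prepend silence equivalent to the network delay."""
    return [0.0] * delay_samples + list(signal)


@dataclass
class DataStore:
    """Stand-in for the data storage unit: name -> (reference sound, sound to be evaluated)."""
    pairs: dict = field(default_factory=dict)

    def record(self, name, reference_sound, evaluated_sound):
        self.pairs[name] = (list(reference_sound), list(evaluated_sound))


def record_reference_pair(store, near_voice, far_voice, delay_samples):
    """Reference signal pair: near-end voice plus delayed, undegraded far-end voice."""
    store.record("reference_pair", near_voice, apply_network_delay(far_voice, delay_samples))


def record_received_audio(store, near_voice, received_audio, processing_on):
    """Route received far-end audio by the processing status flag (the switch)."""
    name = "degraded_pair_2" if processing_on else "degraded_pair_1"
    store.record(name, near_voice, received_audio)
    return name


# Toy signals standing in for recorded audio.
store = DataStore()
near = [0.1, 0.2, 0.3]
record_reference_pair(store, near, [0.5, 0.4], delay_samples=2)
record_received_audio(store, near, [0.0, 0.6, 0.5], processing_on=False)
record_received_audio(store, near, [0.0, 0.5, 0.4], processing_on=True)
print(sorted(store.pairs))
```

Storing the near-end voice alongside every sound to be evaluated mirrors the way the same playback output feeds three output units in the text, so each pair can later be played back as one stereo item.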
【0034】 <<Sound to be evaluated - Degraded signal 2>>
When the signal processing unit 429 of the far-end terminal performs signal processing such as echo cancellation or noise cancellation, the signal processing unit 429 outputs a "signal processing ON signal" to the recording processing unit 430 (step S609). The recording processing unit 430 controls the switch 439 according to the signal processing ON signal (step S704) and outputs the audio received from the far end to the output unit 436 (step S706). The audio signal output from the output unit 436 is recorded in the data storage unit 432 (step S707). The pair consisting of the audio signal obtained from the output unit 435 and the audio signal obtained from the output unit 436 is hereinafter referred to as degraded signal pair 2. Degraded signal pair 1 and degraded signal pair 2 are sometimes collectively referred to as the degraded signal pair.
【0035】 [Acoustic Quality Evaluation Device]
The evaluator uses a binaural acoustic playback device such as headphones or earphones to listen alternately to, and compare, the sound that should be output from the loudspeaker on the near-end speaker's side when there is no sound leakage on the far-end speaker's side (i.e., the reference signal) and the sound that should be output from the loudspeaker on the near-end speaker's side when there is sound leakage on the far-end speaker's side (i.e., the sound to be evaluated), and then subjectively evaluates the call quality (opinion evaluation).
【0036】 The evaluators are presented with the test sounds in the stereo configuration described above. In this embodiment, the channel of the reference sound is denoted as "Rch," and the channel of the sound to be evaluated is denoted as "Lch."
【0037】 Figure 8 shows a functional block diagram of the acoustic quality evaluation device 8 according to the first embodiment. The acoustic quality evaluation device 8 can conduct tests simultaneously for multiple (N) evaluators.
Therefore, the N acoustic output processing units, display units, input units, and acoustic playback devices are shown as "XXX-1 ... XXX-N" in the figure; below, the notation "XXX" without a hyphen refers collectively to the N units. For example, "evaluator 850" refers collectively to evaluators 850-1 through 850-N.
【0038】 The acoustic quality evaluation device 8 acquires test sounds from the data storage unit 432 and ambient noise from the near-end ambient noise signal storage unit 410, and supplies them to the evaluator 850. The evaluator 850 wears an open-type binaural acoustic playback device 840, such as headphones or earphones (hereinafter referred to as "open-type headphones"). "Open-type" refers to headphones with low sound insulation (the property that prevents sound from leaking to the outside), and therefore a structure that lets ambient sound easily reach the wearer's ears. The evaluators 850 are provided with loudspeakers 860 that supply ambient noise. As described later, ambient noise can also be presented to multiple evaluators through a common loudspeaker, so the number of loudspeakers does not have to match the number of evaluators. The evaluator 850 listens to the reference sound and the sound to be evaluated in stereo while wearing the open-type headphones 840. Since the open-type headphones 840 do not block ambient noise, the evaluator 850 listens to the reference sound and the sound to be evaluated in an environment with ambient noise.
【0039】 Figure 9 is a flowchart illustrating an example of the operation of the acoustic quality evaluation device 8. The acoustic quality evaluation device 8 acquires a signal from the near-end ambient noise signal storage unit 410, reproduces it in the playback unit 806, and outputs it from the loudspeaker 860 (step S901). The playback control unit 801 determines the signal to be evaluated from among the signals recorded in the data storage unit 432 (step S902).
The display control unit 802 displays, on the display unit 820, an evaluation input screen for evaluating the signal determined by the playback control unit 801 (step S903). An example of the evaluation input screen is shown in Figure 10.
【0040】 The playback control unit 801 acquires a reference signal pair from the data storage unit 432 and outputs it to the open-type headphones 840 from the acoustic output processing unit 810 (step S904). Next, the playback control unit 801 acquires a degraded signal pair from the data storage unit 432 and outputs it to the open-type headphones 840 from the acoustic output processing unit 810 (step S905). The evaluator 850 inputs the evaluation using the display unit 820 and the input unit 830 (step S906). The acoustic quality evaluation device 8 determines whether all evaluations are complete. If not, it performs the next evaluation (No in step S907). If all evaluations are complete, it terminates the evaluation procedure (Yes in step S907). The aggregation unit 803 aggregates the evaluation results and records them in the aggregation result storage unit 805.
【0041】 [Listening Test Room]
<Listening test room example 1>
The disclosed technology is intended for evaluating loudspeaker communication systems under ambient noise. To simulate a situation in which the evaluators are surrounded by ambient noise, listening tests are conducted in an enclosed space, such as a soundproof room, equipped with loudspeakers. Figure 11 shows an example of a test room for conducting the acoustic quality evaluation of the first embodiment. The left figure 1101 is a top view of the soundproof room 1100, and the right figure 1102 is a side view of the soundproof room 1100. Inside the soundproof room, an evaluator 1104 wearing open-type headphones 1105 and multiple loudspeakers 1103 are placed, and ambient noise is output from the loudspeakers 1103. In this case, it is desirable for the evaluator to be positioned approximately equidistant from the multiple loudspeakers.
Furthermore, it is desirable to place the loudspeakers at a sufficient distance from the evaluator and at a height equal to or higher than the evaluator's ear level. The loudspeaker output can be either monaural (all loudspeakers output the same sound) or stereo. In either case, the signal-to-noise ratio and volume level should be measured near the evaluator's ear. In this context, stereo sound refers to sound that expresses the left-right position and depth of a sound source through the cooperation of two loudspeakers; stereo sound can simulate real-world ambient sounds more accurately.
【0042】 <Listening test room example 2>
Figure 12 shows a second example of a test room for performing the acoustic quality evaluation of the first embodiment. Figure 1201 is a top view of the soundproof room 1100. Inside the soundproof room, there are multiple evaluators 1104 wearing open-type headphones 1105 and multiple loudspeakers, which emit ambient noise. Figure 12 shows an example in which four loudspeakers 1202, 1203, 1204, and 1205 are arranged. The loudspeakers 1202 to 1205 should be placed at a sufficient distance from the evaluators 1104 and ideally installed in the upper part of the soundproof room 1100. The loudspeaker output can be either monaural (all loudspeakers output the same sound) or stereo. In either case, the signal-to-noise ratio and volume level should be measured near the ear of the evaluator 1104. When using stereo sound, the number of loudspeakers should be even, and the loudspeakers outputting the left channel and those outputting the right channel should be arranged alternately along the walls of the soundproof room. In the case of Figure 12, for example, loudspeakers 1202 and 1205 should output the left channel sound, and loudspeakers 1203 and 1204 should output the right channel sound.
It is desirable for the evaluators 1104 to be positioned in the center of the speakers 1202 to 1205, but an evaluator 1104 does not necessarily have to be in the center, as long as the evaluator is far enough away from the speakers. By arranging speakers 1202 to 1205 and the evaluators 1104 in this way, simultaneous listening by multiple people becomes possible.

【0043】 <Example of a test room 3>
Figure 13 shows a third example of a test room for performing the acoustic quality evaluation of the first embodiment. Diagram 1301 is a top view of the soundproof room 1100. Inside the soundproof room are multiple evaluators 1104 wearing open-type headphones 1105 and multiple speakers 1103, from which ambient noise is output. The speakers 1103 should be placed at a sufficient distance from the evaluators 1104 and at a height equal to or higher than the ear level of the evaluators 1104. The output of the speakers 1103 can be either monaural (all speakers output the same sound) or stereo. In either case, the signal-to-noise ratio and volume level are measured near the ear of an evaluator 1104. It is desirable that each evaluator 1104 be positioned approximately equidistant from the speakers 1103, but an evaluator's position does not necessarily have to be equidistant from the speakers, as long as the evaluator is sufficiently far from the speakers 1103. By arranging the speakers 1103 and the evaluators 1104 in this manner, simultaneous listening by multiple people becomes possible.

【0044】 The first embodiment of the disclosed technology has been described above in the order of specific examples of a data generation device, an acoustic quality evaluation device, and a listening test room, but several variations of the disclosed technology are possible. For example, in the first embodiment, ambient noise signals were prepared separately for the near end and the far end.
However, when generating data for evaluation tests, both the near-end and far-end noise signals may be supplied from a common noise signal storage unit. Furthermore, in order to faithfully simulate a public address system in a noisy environment, the first embodiment was configured to transmit the sound that would cause the acoustic echo from the near end to the far end. However, for listening tests in a noisy environment, it is also possible for the near end to receive only the far-end sound with far-end ambient noise superimposed on it, and to conduct the listening test in a noisy environment at the near end. In this case, the evaluator does not need to be supplied with the near-end sound that would cause the acoustic echo.

【0045】 <Regarding the combination of open-type headphones and speakers>
To elaborate, the disclosed technology adopts a configuration in which the sound to be evaluated is supplied from open-type headphones and ambient noise is supplied from speakers placed around the evaluator. The disclosed technology aims to evaluate "communication sound output from speakers of a public address system in a noisy environment," and conducts listening evaluations by simulating that environment. In real-world environments, ambient noise and the target sound generally originate at different locations. In such cases, humans can distinguish between ambient noise and the target sound through the cocktail party effect. One possible method for simulating a noisy environment is to supply the headphones with an electronically mixed version of the ambient noise and the sound to be evaluated. However, in this case, the ambient noise and the sound to be evaluated would be localized at the same position. Separating multiple sounds originating from the same location is not easy even for humans. For example, the sound to be evaluated would be buried in the noise, making it difficult to obtain a stable evaluation value.
In contrast, the disclosed technology supplies ambient noise from speakers (external localization) and the sound to be evaluated from headphones (internal localization). By supplying each sound from a different device, it simulates a situation where the ambient noise and the sound to be evaluated occur at different locations. As a result, the cocktail party effect comes into play, making it possible to distinguish between ambient noise and the sound to be evaluated, and thus enabling stable evaluation values to be obtained.

【0046】 [Programs, recording media]
The various processes described above can be carried out by loading a program that executes each step of the above method into the recording unit 2020 of the computer 2000 shown in Figure 14, and then causing the control unit 2010, input unit 2030, output unit 2040, display unit 2050, etc. to operate.

【0047】 The program describing this process can be recorded on a computer-readable recording medium. Any computer-readable recording medium can be used, such as a magnetic recording device, optical disc, magneto-optical recording medium, or semiconductor memory.

【0048】 Furthermore, this program may be distributed, for example, by selling, transferring, or lending portable recording media such as DVDs or CD-ROMs on which the program is recorded. Alternatively, the program may be stored in the storage device of a server computer and distributed by transferring the program from the server computer to other computers via a network.

【0049】 A computer executing such a program may, for example, first store the program recorded on a portable recording medium or a program transferred from a server computer in its own storage device. Then, when processing is to be executed, the computer reads the program stored in its own storage device and executes the processing according to the read program.
Alternatively, the computer may read the program directly from the portable recording medium and execute the processing according to that program, or it may execute the processing according to the received program each time a program is transferred to it from a server computer. Furthermore, the above processing may be executed by a so-called ASP (Application Service Provider) type service, in which the server computer does not transfer the program to this computer and the processing function is realized only through execution instructions and result acquisition. The program in this form includes information that is used for processing by an electronic computer and is equivalent to a program (data that is not a direct instruction to the computer but has the property of defining the processing of the computer, etc.).

【0050】 Furthermore, in this form, the device is configured by executing a predetermined program on a computer, but at least a part of these processes may be implemented in hardware.
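As a rough illustration of the kind of program this section refers to, the evaluation procedure of steps S904 to S907 and the subsequent aggregation could be sketched in Python as follows. This is only a sketch under stated assumptions, not the implementation of the disclosure: the names `Trial`, `run_listening_test`, `play`, and `ask_rating` are hypothetical, the rating is assumed to be an integer score, and aggregation is assumed to be a simple mean per condition, which the text does not specify.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Callable, Dict, List, Sequence


@dataclass
class Trial:
    """One evaluation item: a reference signal pair and a degraded signal pair."""
    condition: str                # hypothetical label for the condition under test
    reference: Sequence[float]    # placeholder for the reference signal pair
    degraded: Sequence[float]     # placeholder for the degraded signal pair


def run_listening_test(
    trials: List[Trial],
    play: Callable[[Sequence[float]], None],
    ask_rating: Callable[[Trial], int],
) -> Dict[str, float]:
    """Run steps S904 to S907 for every trial, then aggregate per condition."""
    ratings: Dict[str, List[int]] = {}
    for trial in trials:              # S907: continue until all trials are done
        play(trial.reference)         # S904: output the reference signal pair
        play(trial.degraded)          # S905: output the degraded signal pair
        rating = ask_rating(trial)    # S906: the evaluator enters a rating
        ratings.setdefault(trial.condition, []).append(rating)
    # Aggregation: assumed here to be the mean rating per condition.
    return {condition: mean(values) for condition, values in ratings.items()}
```

Passing `play` and `ask_rating` in as callables keeps the sketch independent of any particular audio or user-interface library. In a real setup, `play` would send the signal to the open-type headphones and `ask_rating` would read the evaluator's input from the evaluation input screen.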

Claims

[Claim 1] An acoustic quality evaluation device for evaluating the acoustic quality of a public address communication system including a first terminal device and a second terminal device, the acoustic quality evaluation device comprising: a data storage unit that records a sound to be evaluated, which was picked up by the first terminal device and received by the second terminal device; an acoustic output processing unit that outputs the sound to be evaluated to an open-type acoustic device worn on the head; and a noise output processing unit that outputs ambient noise surrounding the open-type acoustic device.

[Claim 2] The acoustic quality evaluation device according to claim 1, wherein the sound to be evaluated is a first degraded sound in which the voice of the user of the first terminal device is superimposed with incidental sounds including acoustic echo and/or ambient noise of the first terminal device.

[Claim 3] The acoustic quality evaluation device according to claim 1, wherein the sound to be evaluated is a second degraded sound obtained by signal processing a sound in which the voice of the user of the first terminal device is superimposed with incidental sounds including acoustic echo and/or ambient noise of the first terminal device.

[Claim 4] The acoustic quality evaluation device according to claim 2 or 3, wherein the data storage unit further records a reference sound, which is the voice of the user of the first terminal device and does not include the incidental sounds, and the acoustic output processing unit outputs the reference sound and the sound to be evaluated in sequence.
[Claim 5] An acoustic quality evaluation method for evaluating the acoustic quality of a public address communication system including a first terminal device and a second terminal device, the method comprising: a step in which a data storage unit records a sound to be evaluated, which was picked up by the first terminal device and received by the second terminal device; a step in which an acoustic output processing unit outputs the sound to be evaluated to an open-type acoustic device worn on the head; and a step in which a noise output processing unit outputs ambient noise surrounding the open-type acoustic device.

[Claim 6] The acoustic quality evaluation method according to claim 5, wherein the sound to be evaluated is a first degraded sound in which the voice of the user of the first terminal device is superimposed with incidental sounds including acoustic echo and/or ambient noise of the first terminal device.

[Claim 7] The acoustic quality evaluation method according to claim 5, wherein the sound to be evaluated is a second degraded sound obtained by signal processing a sound in which the voice of the user of the first terminal device is superimposed with incidental sounds including acoustic echo and/or ambient noise of the first terminal device.

[Claim 8] The acoustic quality evaluation method according to claim 6 or 7, further comprising a step in which the data storage unit records a reference sound, which is the voice of the user of the first terminal device and does not include the incidental sounds, wherein the step in which the acoustic output processing unit outputs the sound to be evaluated is a step of outputting the reference sound and the sound to be evaluated in sequence.

[Claim 9] A program for causing a computer to function as the acoustic quality evaluation device according to any one of claims 1 to 3.