A hearing aid pickup regulation method based on spatial unmasking

By constructing virtual spatial audio and applying beamforming algorithm control, the method addresses the insufficient speech recognition of bone conduction hearing aids for listeners with poor spatial demasking ability and improves the speech recognition performance of the hearing aid.

CN116367061B · Active · Publication Date: 2026-06-26 · GUANGZHOU UNIVERSITY

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
GUANGZHOU UNIVERSITY
Filing Date
2023-03-28
Publication Date
2026-06-26

AI Technical Summary

Technical Problem

Existing bone conduction hearing aids do not significantly improve speech recognition ability in listeners with poor spatial demasking ability because the hearing aid chosen by the listener is not matched with their own spatial demasking ability.

Method used

The method constructs virtual spatial audio, matches the perceived loudness of the bone conduction device to that of the air conduction device, performs speech recognition threshold tests, calculates the spatial separation angle between the target sound and the masking sound at which spatial demasking begins, and, when necessary, applies a GSC beamforming algorithm to enhance speech recognition.

Benefits of technology

It improves speech recognition with hearing aids, especially for listeners with poor spatial demasking ability, thus enhancing the practicality and speech recognition performance of hearing aids.

✦ Generated by Eureka AI based on patent content.


Abstract

The application provides a hearing aid pickup regulation method based on spatial unmasking, comprising the following steps: constructing virtual spatial audio, the virtual spatial audio comprising a target sound and a masking sound; adjusting the perceived loudness of bone conduction stimulation from a bone conduction device under the same virtual spatial audio to be equal to that of an air conduction device; performing a speech recognition threshold test under the virtual spatial audio, and calculating the spatial separation angle between the target sound and the masking sound at which spatial unmasking begins; and applying beamforming when that spatial separation angle is above a preset spatial separation angle threshold. The application adds a GSC beamforming algorithm for regulation, applied for listeners with poor spatial unmasking ability. While improving the speech recognition ability of listeners using the hearing aid, regulating according to the measured separation angle at which spatial unmasking begins keeps the algorithm complexity to a minimum and increases the practicality of the hearing aid.

Description

Technical Field

[0001] This invention belongs to the field of hearing aid technology, and specifically relates to a hearing aid pickup control method based on spatial demasking.

Background Technology

[0002] Currently, sound perception pathways are generally considered to be divided into two types: air conduction and bone conduction. Unlike air conduction, sound waves in bone conduction do not need to pass through the external auditory canal and middle ear. Therefore, wearing bone conduction devices does not block the ear canal, a characteristic that has led to the increasingly widespread application of bone conduction technology in hearing protection and communication fields. However, because the transcranial attenuation of bone conduction devices is much smaller than that of air conduction headphones, the spatial sound reproduction effect of bone conduction devices is not as good as that of air conduction.

[0003] Spatial separation between the target sound and the masking sound leads to a release from masking and thus improves speech intelligibility; this spatial benefit is often referred to as spatial release from masking (SRM), called spatial demasking in this document. Spatial demasking is quantified as the difference between the speech recognition thresholds (SRTs) measured with the masking and target sounds at the same location and with the two sounds spatially separated, where the SRT is typically expressed as the signal-to-noise ratio (SNR) at which 50% speech recognition accuracy is achieved. Spatial demasking capability generally comprises two aspects. The first aspect is the spatial separation angle between the target and masking sounds at which SRM begins to occur. The second aspect is the magnitude of the SRM produced at different spatial separation angles. This invention primarily focuses on the first aspect, namely the spatial separation angle between the target and masking sounds at which SRM begins to occur, and uses beamforming algorithms to improve speech intelligibility within the azimuth range where spatial demasking has not yet occurred.

[0004] Currently, most spatial demasking experiments, especially measurements using bone conduction devices, determine a listener's spatial demasking ability in an anechoic chamber or a listening room with appropriate acoustic treatment. Listeners choose hearing aids based on their individual hearing test results or use standardized hearing aids. However, listeners' spatial demasking abilities vary. Listeners with poor spatial demasking ability only exhibit spatial demasking when the spatial separation angle between the target sound and the masking sound is large, which results in poor speech recognition when the spatial separation angle is small. A standardized hearing aid may therefore not match a listener's speech recognition ability, leading to insignificant improvements in hearing ability even after wearing hearing aids.

Summary of the Invention

[0005] In view of the above-mentioned deficiencies of the prior art, the purpose of this invention is to provide a hearing aid pickup control method based on spatial demasking, comprising the following steps:

[0006] S1. Construct virtual spatial audio; the virtual spatial audio includes a target sound and a masking sound.

[0007] S2. Adjust the perceived loudness of bone conduction stimulation from the bone conduction device under the same virtual spatial audio to be equal to that of the air conduction device.

[0008] S3. Perform a speech recognition threshold test under the virtual spatial audio and calculate the spatial separation angle between the target sound and the masking sound at which spatial demasking begins.

[0009] S4. Apply beamforming when the spatial separation angle is above the preset spatial separation angle threshold.

[0010] Furthermore, in the virtual spatial audio, the target sound and the masking sound include the following settings:

[0011] Both the target sound and the masking sound are at 0°;

[0012] The target sound is at 0°, while the masking sounds are at 5°, 10°, 15°, 30°, 45°, 60°, 75°, 90°, 105°, 120°, 135°, 150°, 165°, and 180° respectively.

[0013] The 0° is the right side of the horizontal plane in front of the subject, with the subject as the reference frame.
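The spatial conditions listed above can be enumerated programmatically. The following is a minimal sketch; the names `MASKER_AZIMUTHS_DEG`, `TARGET_AZIMUTH_DEG` and `build_spatial_conditions` are illustrative and do not come from the patent.

```python
# Masker azimuths (degrees) listed in the text; the target stays at 0 degrees.
MASKER_AZIMUTHS_DEG = [0, 5, 10, 15, 30, 45, 60, 75, 90, 105, 120, 135, 150, 165, 180]
TARGET_AZIMUTH_DEG = 0


def build_spatial_conditions():
    """Return one (target_deg, masker_deg, separation_deg) tuple per condition."""
    return [
        (TARGET_AZIMUTH_DEG, masker, abs(masker - TARGET_AZIMUTH_DEG))
        for masker in MASKER_AZIMUTHS_DEG
    ]


conditions = build_spatial_conditions()
print(len(conditions))  # 15
```

The co-located condition (0°, 0°) plus the 14 separated conditions gives the 15 spatial conditions used in the threshold test.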

[0014] Furthermore, in step S3, the speech recognition threshold test includes the following steps:

[0015] S101. At the start of each test, fix the masking sound level at the first value and set the signal-to-mask ratio to the second value;

[0016] S102. During the presentation of each group of N sentences, the target sound level is adaptively adjusted, and the subject repeats the speech that was played;

[0017] S103. For each subject and each spatial condition, the arithmetic mean of the signal-to-mask ratios of the last n sentences out of N sentences is taken as SRT.

[0018] Further, in S102, the adaptive adjustment is as follows: a repetition accuracy threshold is set; trials whose repetition accuracy exceeds the preset threshold cause the target sound level to decrease, and conversely, trials whose repetition accuracy does not exceed the preset threshold cause the target sound level to increase; the step size of the target sound level decreases after the target sound level has decreased once and increased once.

[0019] Furthermore, the target sound level undergoes multiple reversals, where a reversal is a decrease followed by an increase in the target sound level; after each reversal, the step size of the target sound level is halved for the next reversal.

[0020] Furthermore, in step S3, the spatial demasking measurement method specifically involves measuring the speech recognition thresholds $\mathrm{SRT}_{\text{co-located}}$ and $\mathrm{SRT}_{\text{separated}}$ with the target sound and the masking sound in the same direction and spatially separated, respectively. From the measured speech recognition thresholds, the spatial demasking value is calculated as $\mathrm{SRM} = \mathrm{SRT}_{\text{co-located}} - \mathrm{SRT}_{\text{separated}}$.

[0021] Furthermore, in S4, when beamforming is applied, the generalized sidelobe cancellation (GSC) algorithm, the minimum variance distortionless response (MVDR) algorithm, or the linearly constrained minimum variance (LCMV) algorithm is used.

[0022] Further, when beamforming is applied, the generalized sidelobe cancellation algorithm proceeds as follows. The GSC beamforming structure includes three parts: a fixed beamformer, a blocking matrix, and an adaptive noise canceller. The multi-channel noisy signal collected by the microphone array passes through the fixed beamformer in the upper branch and the blocking matrix in the lower branch. The fixed beamformer outputs the desired signal plus some residual noise, and the blocking matrix blocks the desired signal to obtain a noise reference. The outputs of the upper and lower branches are fed to the adaptive noise canceller, which further removes the residual noise of the upper branch to obtain the enhanced signal. GSC decomposes the weight vector into an adaptive part and a non-adaptive part. The adaptive weights lie in the orthogonal complement of the constraint subspace, and the non-adaptive weights lie in the constraint subspace. GSC thus consists of a main path and an auxiliary path: the target signal passes through the main path, and noise and interference pass through the auxiliary path. The weight vector can be expressed as:

[0023] $w = w_q - B w_a$

[0024] Here, $w_q = (CC^H)^{-1} C f$ is the non-adaptive weight vector, $f$ is a $P \times 1$ constraint vector, and the constraint subspace is represented by an $M \times P$ constraint matrix $C$, where the number of microphones $M$ must be less than the number of linear constraints $P$; $w_a = (B^H R_x B)^{-1} B^H R_x w_q$ is the adaptive weight vector. The minimum variance subspace is represented by an $M \times (M-P)$ blocking matrix $B$. The purpose of the blocking matrix $B$ is to ensure that the target signal does not enter the auxiliary path, so that only noise passes through its output. The column vectors that make up $B$ lie in the orthogonal complement of the constraint subspace. Since the constraint matrix and the blocking matrix are mutually orthogonal, they satisfy $B^H C = 0$. The blocking matrix is used to block the desired audio signal, causing the main output to... The output after projection by the blocking matrix is $z = B^H x$, and the adaptive weight vector can then be expressed as $w_a = R_z^{-1} p_z$, which is the Wiener solution that minimizes the mean square error between the main and auxiliary paths, where $R_z = B^H R B$ is the covariance matrix of $z$ and $p_z = B^H R w_q$ is the cross-correlation vector between $z$ and $y_c$.
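To make the decomposition $w = w_q - B w_a$ concrete, here is a minimal numerical sketch for the smallest case, M = 2 microphones with a single constraint (P = 1), so that the blocking matrix has M − P = 1 column. Several things are assumptions of this sketch and not taken from the patent: the target is taken to arrive in phase at both microphones (C = [1, 1]ᵀ, f = 1), all quantities are real-valued, the quiescent weight is written in the textbook form C (CᴴC)⁻¹ f, which satisfies Cᴴ w_q = f, and the covariance values are made up.

```python
def gsc_weights_two_mics(R):
    """Closed-form GSC weights for M = 2 microphones and P = 1 constraint.

    R is the 2x2 real, symmetric input covariance matrix.
    Assumption: the target arrives in phase, so C = [1, 1]^T and f = 1.
    """
    c = [1.0, 1.0]                            # constraint (steering) vector C
    f = 1.0                                   # desired response toward the target
    c_norm_sq = sum(ci * ci for ci in c)      # C^H C (a scalar when P = 1)
    w_q = [ci * f / c_norm_sq for ci in c]    # C (C^H C)^-1 f = [0.5, 0.5]
    b = [1.0, -1.0]                           # blocking vector B, B^H C = 0

    R_wq = [sum(R[i][j] * w_q[j] for j in range(2)) for i in range(2)]
    R_b = [sum(R[i][j] * b[j] for j in range(2)) for i in range(2)]
    p_z = sum(b[i] * R_wq[i] for i in range(2))   # B^H R w_q (scalar here)
    r_z = sum(b[i] * R_b[i] for i in range(2))    # B^H R B   (scalar here)
    w_a = p_z / r_z                               # Wiener solution R_z^-1 p_z
    w = [w_q[i] - b[i] * w_a for i in range(2)]   # w = w_q - B w_a
    return w_q, w_a, w


def output_power(w, R):
    """w^H R w for a real-valued weight vector w."""
    return sum(w[i] * R[i][j] * w[j] for i in range(2) for j in range(2))


# Made-up covariance: unit-power target (in phase on both microphones) plus
# uncorrelated noise of power 1 on microphone 1 and power 4 on microphone 2.
R = [[1.0 + 1.0, 1.0], [1.0, 1.0 + 4.0]]
w_q, w_a, w = gsc_weights_two_mics(R)
print(w_q, w_a, w)
```

In this example the combined weights come out at approximately [0.8, 0.2]: they still sum to 1, so the in-phase target passes undistorted, while the noisier microphone is weighted down and the output power drops below that of the fixed beamformer alone.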

[0025] Compared with the prior art, the present invention has the following obvious and prominent substantive features and significant advantages:

[0026] 1. The hearing aid pickup control method based on spatial demasking uses Mandarin speech test materials in the speech recognition threshold test, which can predict the subject's speech recognition ability.

[0027] 2. This hearing aid pickup control method based on spatial demasking adds a GSC beamforming algorithm for control. No algorithm control is needed for listeners with good spatial demasking ability, while a beamforming algorithm is added for listeners with poor spatial demasking ability. While improving speech recognition with the hearing aid, the method controls the pickup based on the measured separation angle at which spatial demasking begins, which can minimize the complexity of the algorithm and increase the practicality of the hearing aid.

Attached Figure Description

[0028] Figure 1 is a schematic diagram of the method flow of the present invention;

[0029] Figure 2 is a diagram showing the location of the sound source;

[0030] Figure 3 is a flowchart illustrating the speech recognition threshold test based on bone conduction oscillators;

[0031] Figure 4 is a schematic diagram of the speech recognition threshold results of 11 subjects;

[0032] Figure 5 is a schematic diagram of the spatial demasking results for 11 subjects;

[0033] Figure 6 is a schematic diagram of the Generalized Sidelobe Cancellation (GSC) beamforming algorithm.

Detailed Implementation

[0034] The embodiments of the present invention will be described in detail below. The embodiments described below are implemented based on the technical solution of the present invention, and detailed implementation methods and specific operation processes are given. However, the protection scope of the present invention is not limited to the embodiments described below.

[0035] The purpose of this invention is to provide a hearing aid pickup control method based on spatial demasking which, as shown in Figure 1, includes the following steps:

[0036] Virtual spatial audio is constructed using head-related transfer functions (HRTFs); virtual spatial sound at different azimuth angles is generated by convolving the HRTF with the target sound and the masking sound. The virtual spatial audio includes a target sound and a masking sound. Specifically, the target sound is Mandarin speech test material, and the masking sound is babble noise. As shown in the sound source orientation diagram of Figure 2, the target sound and masking sound in the virtual spatial audio are set as follows: both the target sound and the masking sound are at 0°; or the target sound is at 0° while the masking sounds are at 5°, 10°, 15°, 30°, 45°, 60°, 75°, 90°, 105°, 120°, 135°, 150°, 165°, and 180° respectively, and 0° is the right side of the horizontal plane in front of the subject with the subject as the reference frame.
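The convolution step can be sketched as follows. This is an illustration only: the impulse responses are made-up three-sample toys, not measured HRTF data, and the helper names `convolve`, `render_binaural` and `mix` are hypothetical. A real setup would load a measured head-related impulse response (HRIR) pair for each azimuth.

```python
def convolve(signal, impulse_response):
    """Direct-form linear convolution of two finite sequences."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out


def render_binaural(mono, hrir_left, hrir_right):
    """Place a mono source at the direction encoded by an HRIR pair."""
    return convolve(mono, hrir_left), convolve(mono, hrir_right)


def mix(a, b):
    """Sample-wise sum of two sequences, zero-padding the shorter one."""
    n = max(len(a), len(b))
    a = a + [0.0] * (n - len(a))
    b = b + [0.0] * (n - len(b))
    return [x + y for x, y in zip(a, b)]


# Toy (made-up) impulse responses: a frontal source reaches both ears equally;
# a source to the right reaches the left ear later and attenuated.
HRIR_FRONT = ([1.0, 0.0, 0.0], [1.0, 0.0, 0.0])   # (left ear, right ear)
HRIR_RIGHT = ([0.0, 0.0, 0.5], [1.0, 0.0, 0.0])

target = [1.0, 0.5, -0.25]   # stand-in for the speech material
masker = [0.2, -0.2, 0.2]    # stand-in for the babble noise

tgt_l, tgt_r = render_binaural(target, *HRIR_FRONT)
msk_l, msk_r = render_binaural(masker, *HRIR_RIGHT)
left, right = mix(tgt_l, msk_l), mix(tgt_r, msk_r)
```

Each spatial condition would repeat this with the target rendered at 0° and the masker rendered with the HRIR pair of the masker azimuth for that condition.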

[0037] The perceived loudness of bone conduction stimulation from the bone conduction device under the same virtual spatial audio is adjusted to be equal to that of the air conduction device. An equal-loudness matching method is adopted, specifically: the air conduction stimulation is presented at a level of 65 dB SPL through air conduction headphones, and the bone conduction stimulation is presented by the bone conduction transducer at the mastoid process. A noise stimulus (located at a horizontal angle of 0°) is then played alternately through the bone conduction device and the air conduction headphones. The subject adjusts the signal applied to the bone conduction transducer until its perceived loudness matches that of the 65 dB SPL air conduction stimulation.

[0038] Speech recognition threshold tests are performed under the virtual spatial audio; specifically, the speech recognition thresholds of the subjects are measured under 15 spatial conditions using bone conduction stimulation. Figure 3 shows a flowchart of the speech recognition threshold test using a bone conduction oscillator. An adaptive procedure is used for the SRT test to obtain the speech recognition threshold at 50% accuracy. In each experimental condition, the masking sound level is fixed at 65 dB SPL, and the signal-to-mask ratio (SPR) is initially set to 5 dB and then adaptively adjusted during the presentation of each group of 20 sentences. Subjects repeat the played speech; a trial is scored correct when more than 5 words are correctly repeated, and incorrect when 5 or fewer words are correctly repeated. A correct trial decreases the SPR for the next trial, while an incorrect trial increases it. An alternating decrease and increase of the SPR is defined as a reversal. The initial step size is 8 dB, decreasing to 4 dB after the first reversal and to 2 dB after the second reversal. For each subject and each spatial condition, the arithmetic mean of the SPRs of the last 8 sentences is taken as the SRT. Subjects are asked to repeat as much of each sentence as possible, with each sentence presented one to four times depending on the subject's response. Throughout the process, no feedback is given to the subjects on the correctness of their answers.
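The adaptive track described above can be sketched in a few lines. The numeric defaults (5 dB starting SPR, 8/4/2 dB steps, 20 sentences, mean of the last 8) come from the text; everything else is an assumption of this sketch: a reversal is counted at every change of direction, the reduced step is applied starting with the trial on which the reversal occurs, and the listener is a hypothetical deterministic stand-in whose scoring rule is supplied as a callback.

```python
def run_srt_track(is_correct, n_sentences=20, n_average=8,
                  start_spr_db=5.0, step_sizes_db=(8.0, 4.0, 2.0)):
    """Run one adaptive track and return (srt_db, list of presented SPRs).

    is_correct(spr_db) -> bool scores one sentence; in the text a sentence is
    correct when more than 5 words are repeated correctly.
    """
    spr = start_spr_db
    presented = []
    reversals = 0
    last_direction = 0  # -1 = SPR went down, +1 = SPR went up, 0 = no move yet
    for _ in range(n_sentences):
        presented.append(spr)
        direction = -1 if is_correct(spr) else +1
        if last_direction != 0 and direction != last_direction:
            reversals += 1  # the track changed direction
        step = step_sizes_db[min(reversals, len(step_sizes_db) - 1)]
        spr += direction * step
        last_direction = direction
    srt = sum(presented[-n_average:]) / n_average
    return srt, presented


# Hypothetical deterministic listener: correct whenever the SPR is >= -7 dB.
def listener(spr_db):
    return spr_db >= -7.0


srt, track = run_srt_track(listener)
print(srt)  # -8.0
```

With the masking sound fixed at 65 dB SPL, the target level on each trial is simply 65 dB plus the current SPR. For this toy listener the track settles into alternating between -7 dB and -9 dB, so the mean of the last 8 sentences is -8 dB.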

[0039] The spatial separation angle between the target sound and the masking sound at which spatial demasking begins is calculated; spatial demasking is calculated from the subject's speech recognition threshold results. Figure 4 shows the speech recognition threshold results for a total of 11 subjects. The speech recognition thresholds $\mathrm{SRT}_{\text{co-located}}$ and $\mathrm{SRT}_{\text{separated}}$ are measured using bone conduction virtual sound playback, and the spatial demasking value is calculated as $\mathrm{SRM} = \mathrm{SRT}_{\text{co-located}} - \mathrm{SRT}_{\text{separated}}$, from which the spatial separation angle between the target sound and the masking sound at which spatial demasking begins is obtained. The spatial demasking results of the 11 subjects are shown in Figure 5.
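A small sketch of this calculation follows. The SRM formula is the one given in the text; the rule for deciding that spatial demasking has "begun" is not specified there, so the 1 dB criterion below is purely an assumption, as are the function names and the example SRT values (which are made up and are not the data of the 11 subjects).

```python
def spatial_release(srt_by_angle_db):
    """Map each separated masker angle to SRM = SRT(0 deg) - SRT(angle)."""
    srt_colocated = srt_by_angle_db[0]
    return {
        angle: srt_colocated - srt
        for angle, srt in sorted(srt_by_angle_db.items())
        if angle != 0
    }


def onset_angle(srt_by_angle_db, criterion_db=1.0):
    """Smallest separation angle whose SRM reaches criterion_db, or None."""
    for angle, srm in spatial_release(srt_by_angle_db).items():
        if srm >= criterion_db:
            return angle
    return None


# Made-up SRTs (dB) keyed by masker azimuth (degrees) for one listener.
example_srt = {0: -4.0, 5: -4.2, 10: -4.5, 15: -5.5, 30: -7.0, 45: -8.0}
print(onset_angle(example_srt))  # 15
```

A lower (more negative) SRT in the separated condition gives a positive SRM, so the onset angle is the first tested separation at which the listener starts to benefit from the separation.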

[0040] Beamforming is applied when the spatial separation angle is above a preset threshold. Based on the spatial separation angle at which spatial demasking begins, it is determined whether a beamforming algorithm is needed for adjustment. Specifically, a spatial separation angle α is set as the standard for evaluating spatial demasking capability. If the spatial separation angle at which the subject begins to show spatial demasking is less than α, no beamforming algorithm is needed; otherwise, a beamforming algorithm is required to enhance speech recognition at small spatial separation angles.
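The decision rule can be written as a one-line comparison. In this sketch the value of α, the subject labels and the handling of a listener who shows no spatial demasking at any tested angle (`None`) are all assumptions; the text does not fix a value for α.

```python
def needs_beamforming(onset_angle_deg, alpha_deg):
    """Return True when the onset angle is not below the threshold alpha.

    onset_angle_deg is None when no spatial demasking was observed at any
    tested angle; treating that case as "needs beamforming" is an assumption.
    """
    if onset_angle_deg is None:
        return True
    return onset_angle_deg >= alpha_deg


ALPHA_DEG = 15  # hypothetical threshold; the text leaves the value of alpha open
decisions = {
    subject: needs_beamforming(angle, ALPHA_DEG)
    for subject, angle in {"S01": 10, "S02": 30, "S03": None}.items()
}
print(decisions)  # {'S01': False, 'S02': True, 'S03': True}
```

Only listeners flagged `True` would have the beamforming stage enabled, which is how the method avoids running the extra algorithm for listeners who already benefit from small separations.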

[0041] When beamforming is applied, the generalized sidelobe cancellation (GSC) algorithm, the minimum variance distortionless response (MVDR) algorithm, or the linearly constrained minimum variance (LCMV) algorithm can be used. Specifically, beamforming is controlled with the GSC beamforming algorithm using MATLAB. The structure of the GSC algorithm is shown in Figure 6. The GSC beamforming structure comprises three parts: a fixed beamformer, a blocking matrix, and an adaptive noise canceller. The multi-channel noisy signal acquired by the microphone array passes through the fixed beamformer in the upper branch and the blocking matrix in the lower branch. The fixed beamformer outputs the desired signal plus some residual noise, while the blocking matrix blocks the desired signal to obtain a noise reference. The outputs of the upper and lower branches are fed to the adaptive noise canceller, which further removes the residual noise in the upper branch to obtain the enhanced signal. GSC decomposes the weight vector into adaptive weights and non-adaptive weights. The adaptive weights lie in the orthogonal complement of the constraint subspace, while the non-adaptive weights lie in the constraint subspace. GSC thus consists of a main path and an auxiliary path: the target signal passes through the main path, while noise and interference pass through the auxiliary path. The weight vector can be expressed as:

[0042] $w = w_q - B w_a$

[0043] Here, $w_q = (CC^H)^{-1} C f$ is the non-adaptive weight vector, $f$ is a $P \times 1$ constraint vector, and the constraint subspace is represented by an $M \times P$ constraint matrix $C$, where the number of microphones $M$ must be less than the number of linear constraints $P$; $w_a = (B^H R_x B)^{-1} B^H R_x w_q$ is the adaptive weight vector. The minimum variance subspace is represented by an $M \times (M-P)$ blocking matrix $B$. The purpose of the blocking matrix $B$ is to ensure that the target signal does not enter the auxiliary path, so that only noise passes through its output. The column vectors that make up $B$ lie in the orthogonal complement of the constraint subspace. Since the constraint matrix and the blocking matrix are mutually orthogonal, they satisfy $B^H C = 0$. The blocking matrix is used to block the desired audio signal, causing the main output to... The output after projection by the blocking matrix is $z = B^H x$, and the adaptive weight vector can then be expressed as $w_a = R_z^{-1} p_z$, which is the Wiener solution that minimizes the mean square error between the main and auxiliary paths, where $R_z = B^H R B$ is the covariance matrix of $z$ and $p_z = B^H R w_q$ is the cross-correlation vector between $z$ and $y_c$. When the lower branch contains little of the target signal, GSC performs well; however, when the sound source moves or reverberation is severe, the target signal in $z$ exceeds a certain level, which causes leakage of the desired signal. In the subsequent adaptive filtering, the noise reference and the desired speech signal of the upper branch then cancel each other, resulting in distortion of the desired speech and a decrease in algorithm performance.
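The three-part structure (fixed beamformer, blocking matrix, adaptive noise canceller) can be illustrated with a sample-by-sample sketch for two microphones. The patent does not specify the adaptation rule, so a normalized LMS (NLMS) update is swapped in here as one common choice; the in-phase target, the filter length, the step size `mu` and the synthetic sinusoidal signals are likewise assumptions made only for this illustration.

```python
import math


def gsc_nlms(mic1, mic2, n_taps=4, mu=0.05, eps=1e-8):
    """Two-microphone time-domain GSC; returns the enhanced signal.

    Fixed beamformer: average of the two channels (target assumed in phase).
    Blocking matrix:  difference of the two channels (cancels that target).
    Adaptive noise canceller: NLMS filter driven by the blocked signal.
    """
    weights = [0.0] * n_taps
    history = [0.0] * n_taps          # most recent noise-reference samples
    enhanced = []
    for x1, x2 in zip(mic1, mic2):
        y_c = 0.5 * (x1 + x2)         # upper branch: target + residual noise
        z = x1 - x2                   # lower branch: noise reference
        history = [z] + history[:-1]
        noise_estimate = sum(w * h for w, h in zip(weights, history))
        e = y_c - noise_estimate      # enhanced output sample
        norm = eps + sum(h * h for h in history)
        weights = [w + mu * e * h / norm for w, h in zip(weights, history)]
        enhanced.append(e)
    return enhanced


def residual_power(signal, reference, start):
    """Mean squared deviation from the clean target from index `start` on."""
    pairs = list(zip(signal[start:], reference[start:]))
    return sum((s - r) ** 2 for s, r in pairs) / len(pairs)


# Synthetic signals (made up): a sinusoidal target common to both microphones
# and a sinusoidal interferer that reaches only microphone 1.
n = 2000
target = [math.sin(2 * math.pi * 0.01 * k) for k in range(n)]
noise = [0.8 * math.sin(2 * math.pi * 0.13 * k + 1.0) for k in range(n)]
mic1 = [t + v for t, v in zip(target, noise)]
mic2 = list(target)

fixed_only = [0.5 * (a + b) for a, b in zip(mic1, mic2)]
out = gsc_nlms(mic1, mic2)
print(residual_power(fixed_only, target, 1000), residual_power(out, target, 1000))
```

Because the target is identical on both channels in this toy setup, the difference signal contains no target and the canceller can remove the interferer without touching the speech. If the target were not perfectly in phase (source movement, reverberation), part of it would leak into `z`, which is the signal-cancellation problem described above.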

[0044] The preferred embodiments of the present invention have been described in detail above. It should be understood that those skilled in the art can make numerous modifications and variations based on the concept of the present invention without creative effort. Therefore, all technical solutions that can be obtained by those skilled in the art based on the concept of the present invention through logical analysis, reasoning, or limited experimentation on the basis of existing technology should be within the scope of protection defined by the claims.

Claims

1. A hearing aid pickup control method based on spatial demasking, characterized in that it includes the following steps: S1. Construct virtual spatial audio; the virtual spatial audio includes a target sound and a masking sound; S2. Adjust the perceived loudness of the bone conduction device under the same virtual spatial audio to be equal to that of the air conduction device; S3. Perform a speech recognition threshold test under the virtual spatial audio and calculate the spatial separation angle between the target sound and the masking sound at which spatial demasking begins; S4. Apply beamforming when the spatial separation angle is above the preset spatial separation angle threshold.

2. The hearing aid pickup control method based on spatial demasking according to claim 1, characterized in that, in the virtual spatial audio, the target sound and the masking sound include the following settings: both the target sound and the masking sound are at 0°; the target sound is at 0° while the masking sounds are at 5°, 10°, 15°, 30°, 45°, 60°, 75°, 90°, 105°, 120°, 135°, 150°, 165°, and 180° respectively; the 0° is the right side of the horizontal plane in front of the subject, with the subject as the reference frame.

3. The hearing aid pickup control method based on spatial demasking according to claim 1, characterized in that, in step S3, the speech recognition threshold test includes the following steps: S101. At the start of each test, fix the masking sound level at the first value and set the signal-to-mask ratio to the second value; S102. During the presentation of each group of N sentences, the target sound level is adaptively adjusted, and the subject repeats the speech that was played; S103. For each subject and each spatial condition, the arithmetic mean of the signal-to-mask ratios of the last n sentences out of N sentences is taken as the SRT.

4. The hearing aid pickup control method based on spatial demasking according to claim 3, characterized in that, in S102, the adaptive adjustment is as follows: a repetition accuracy threshold is set; trials whose repetition accuracy is higher than the preset accuracy threshold cause the target sound level to decrease, and otherwise the target sound level increases; the step size of the target sound level decreases after the target sound level has decreased once and increased once.

5. The hearing aid pickup control method based on spatial demasking according to claim 4, characterized in that the target sound level is reversed multiple times, where a reversal is a decrease followed by an increase in the target sound level; after each reversal, the step size of the target sound level is halved for the next reversal.

6. The hearing aid pickup control method based on spatial demasking according to claim 1 or 3, characterized in that, in S3, the spatial demasking measurement method specifically involves measuring the speech recognition thresholds $\mathrm{SRT}_{\text{co-located}}$ and $\mathrm{SRT}_{\text{separated}}$ with the target sound and the masking sound in the same direction and spatially separated, respectively; from the measured speech recognition thresholds, the spatial demasking value is calculated as $\mathrm{SRM} = \mathrm{SRT}_{\text{co-located}} - \mathrm{SRT}_{\text{separated}}$.

7. The hearing aid pickup control method based on spatial demasking according to claim 1, characterized in that, in S4, when beamforming is applied, the generalized sidelobe cancellation algorithm, the minimum variance distortionless response algorithm, or the linearly constrained minimum variance algorithm is used.

8. The hearing aid pickup control method based on spatial demasking according to claim 7, characterized in that, when beamforming is applied, the generalized sidelobe cancellation algorithm proceeds as follows: the GSC beamforming structure consists of three parts: a fixed beamformer, a blocking matrix, and an adaptive noise canceller. The multi-channel noisy signal acquired by the microphone array passes through the fixed beamformer in the upper branch and the blocking matrix in the lower branch. The fixed beamformer outputs the desired signal plus some residual noise, and the blocking matrix blocks the desired signal to obtain a noise reference. The outputs of the upper and lower branches are fed to the adaptive noise canceller, which further cancels the residual noise of the upper branch to obtain the enhanced signal. GSC decomposes the weight vector into an adaptive part and a non-adaptive part. The adaptive weights lie in the orthogonal complement of the constraint subspace, and the non-adaptive weights lie in the constraint subspace. GSC thus consists of a main path and an auxiliary path: the target signal passes through the main path, and noise and interference pass through the auxiliary path. The weight vector can be expressed as $w = w_q - B w_a$, where $w_q = (CC^H)^{-1} C f$ is the non-adaptive weight vector, $f$ is a $P \times 1$ constraint vector, and the constraint subspace is represented by an $M \times P$ constraint matrix $C$, where the number of microphones $M$ needs to be less than the number of linear constraints $P$; $w_a = (B^H R_x B)^{-1} B^H R_x w_q$ is the adaptive weight vector. The minimum variance subspace is represented by an $M \times (M-P)$ blocking matrix $B$. The purpose of the blocking matrix $B$ is to ensure that the target signal does not enter the auxiliary path, so that only noise passes through its output. The column vectors constituting $B$ lie in the orthogonal complement of the constraint subspace. Since the constraint matrix and the blocking matrix are mutually orthogonal, they satisfy $B^H C = 0$. The blocking matrix is used to block the desired speech signal, causing the main output to... The output after projection by the blocking matrix is $z = B^H x$, and the adaptive weight vector is then expressed as $w_a = R_z^{-1} p_z$, which is the Wiener solution that minimizes the mean square error between the main and auxiliary paths, where $R_z = B^H R B$ is the covariance matrix of $z$ and $p_z = B^H R w_q$ is the cross-correlation vector between $z$ and $y_c$.