Environmental audio processing method, apparatus and earphone for earphone

CN116347287B · Active · Publication Date: 2026-06-16 · SHENZHEN JIELI MICROELECTRONICS TECH CO LTD

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Patents(China)
Current Assignee / Owner: SHENZHEN JIELI MICROELECTRONICS TECH CO LTD
Filing Date: 2023-03-17
Publication Date: 2026-06-16

IPC: H04R1/10

CPC: H04R1/10; H04R2201/10; Y02D30/70

Abstract

The application discloses an environmental audio processing method and apparatus for earphones, and earphones. In a first collection mode, multiple microphones collect and identify the environmental sound around the user to obtain a first environmental sound source. When target sound source information is identified in the first environmental sound source, the first environmental sound source is played to the user. While the user's head position changes, the earphones switch to a second collection mode, in which a preset number of microphones collect the environmental sound around the user to obtain a second environmental sound source. The energy of the target sound source in the first environmental sound source and the energy of the target sound source in the second environmental sound source are superimposed in opposite phase to obtain a superimposed energy difference. A prompt signal that varies with the superimposed energy difference is output, the intensity of its variation being positively or negatively correlated with the magnitude of the energy difference. The user thereby knows whether the current orientation is close to the target sound source and can quickly locate its direction, which improves the user experience when wearing the earphones.

Description

Technical Field

[0001] This invention relates to the field of audio processing technology, and more specifically to an environmental audio processing method and apparatus for headphones, and to headphones.

Background Technology

[0002] Currently, headphones are among the best-selling consumer electronics products, carried around for listening to music, making calls, and playing online games. Existing headphones offer good active noise cancellation, effectively eliminating most external noise signals. However, when users wear noise-canceling headphones to listen to music or play games, they often become immersed in their own world. This can prevent them from catching greetings or exchanging information with others, making it difficult to respond promptly to conversations from friends or superiors. This can lead to delays and negatively impact communication efficiency in daily life, significantly reducing the user experience of active noise cancellation.

[0003] In existing technologies, there are generally two approaches: one is the passive approach, which does not process external sounds, requiring users to remove their active noise-canceling headphones for further communication or conversation; the other is the direct signal amplification approach, which amplifies specific voice signals upon capture to enable smooth communication while the user is listening to music. However, the passive approach is inconvenient for the user, while the direct amplification approach can misidentify specific voice signals, resulting in a poor experience for users listening to music or playing games.

[0004] Therefore, how to indicate the target sound source to users when they are wearing headphones has become an urgent technical problem to be solved.

Summary of the Invention

[0005] Based on the above situation, the main objective of this invention is to provide an environmental audio processing method, apparatus, and headphones for headphones, so as to prompt the user for the target sound source when the user wears the headphones.

[0006] To achieve the above objectives, the technical solution adopted by the present invention is as follows:

[0007] In a first aspect, embodiments of the present invention disclose an environmental audio processing method for headphones. The headphones are equipped with multiple microphones forming a microphone array, enabling the headphones to collect ambient audio from the user's surroundings when worn. The environmental audio processing method includes:

[0008] Step S100: In the first acquisition mode, multiple microphones are used to acquire and identify the ambient sound of the user's environment to obtain the first ambient sound source.

[0009] Step S200: When the target sound source information is detected in the first ambient sound source, play the first ambient sound source to the user and execute step S300.

[0010] Step S300: During the change of the user's head position, switch to the second acquisition mode and use a preset number of microphones to collect the ambient sound of the user's environment to obtain the second ambient sound source. In the second acquisition mode, the number of microphones collecting ambient sound is less than the number of microphones collecting ambient sound in the first acquisition mode.

[0011] Step S400: The energy of the target sound source in the first ambient sound source and the energy of the target sound source in the second ambient sound source are superimposed in opposite phases to obtain the superimposed energy difference;

[0012] Step S500: Output a prompt signal indicating the change based on the superimposed energy difference. The intensity of the change in the prompt signal is positively or negatively correlated with the magnitude of the energy difference, so that the user can know whether the current orientation is close to the target sound source.

[0013] Optionally, between step S200 and step S400, the following step is further included:

[0014] When the target sound source information is detected in the first ambient sound source, noise reduction processing is performed on the first ambient sound source.

[0015] Calculate the energy of the target sound source after noise reduction in the first ambient sound source.

[0016] Optionally, in step S500, the prompt signal is a prompt tone and / or a vibration prompt from the headphones.

[0017] Optionally, the intensity of the change in the prompt signal is negatively correlated with the magnitude of the energy difference; the smaller the energy difference, the louder the prompt sound and / or the faster the vibration frequency.

[0018] Optionally, in step S300, the preset number of microphones is two microphones, and the audio ranges captured by the two microphones at least partially overlap.

[0019] Optionally, the process may further include the following after step S400:

[0020] Step S610: Compare the superimposed energy difference with a preset value; when the superimposed energy difference is less than the preset value, proceed to step S620.

[0021] Step S620: Switch the working mode of the headphones to transparency mode so that the user can know the sound of the user's current environment.

[0022] Optionally, between step S610 and step S620, the following step is further included:

[0023] Step S630: Detect whether the duration for which the user's head posture is maintained exceeds the preset duration; if it exceeds the preset duration, proceed to step S620.

[0024] In a second aspect, embodiments of the present invention disclose an environmental audio processing device for headphones. The headphones are equipped with multiple microphones forming a microphone array, enabling the headphones to collect ambient audio around the user when worn. The environmental audio processing device includes:

[0025] The first acquisition module is used to acquire and identify the ambient sound of the user's environment using multiple microphones in the first acquisition mode to obtain the first ambient sound source.

[0026] The target sound source identification module is used to play the first ambient sound source to the user and execute the second acquisition module when the target sound source information is identified in the first ambient sound source.

[0027] The second acquisition module is used to switch to the second acquisition mode when the user's head position changes, and use a preset number of microphones to collect the ambient sound of the user's environment to obtain the second ambient sound source. The number of microphones collecting ambient sound in the second acquisition mode is less than the number of microphones collecting ambient sound in the first acquisition mode.

[0028] The phase-inverse superposition module is used to superimpose, in opposite phases, the energy of the target sound source in the first ambient sound source and the energy of the target sound source in the second ambient sound source to obtain the superimposed energy difference;

[0029] The prompting module is used to output a changing prompting signal based on the superimposed energy difference. The intensity of the change in the prompting signal is positively or negatively correlated with the magnitude of the energy difference, so that the user can know whether the current orientation is close to the target sound source.

[0030] Optionally, it also includes:

[0031] The noise reduction module is used to perform noise reduction processing on the first ambient sound source when the target sound source information is detected in the first ambient sound source.

[0032] The energy calculation module is used to calculate the energy of the target sound source in the first ambient sound source after noise reduction processing.

[0033] Optionally, the prompt signal may be a prompt tone and / or a vibration prompt from the headphones.

[0034] Optionally, the intensity of the change in the prompt signal is negatively correlated with the magnitude of the energy difference; the smaller the energy difference, the louder the prompt sound and / or the faster the vibration frequency.

[0035] Optionally, the preset number of microphones is two microphones, and the audio ranges captured by the two microphones at least partially overlap.

[0036] Optionally, it also includes:

[0037] The comparison module is used to compare the superimposed energy difference with a preset value; when the superimposed energy difference is less than the preset value, the mode switching module is executed.

[0038] The mode switching module is used to switch the headphones' working mode to transparency mode, so that users can be aware of the sound of their current environment.

[0039] Optionally, it also includes:

[0040] The posture detection module is used to detect whether the duration of the user's head posture exceeds the preset duration; when the duration of the user's head posture exceeds the preset duration, the mode switching module is executed.

[0041] In a third aspect, embodiments of the present invention disclose a computer-readable storage medium having a computer program stored thereon, wherein the computer program stored in the storage medium is executed to implement the method disclosed in the first aspect above.

[0042] In a fourth aspect, embodiments of the present invention disclose a chip for an audio device having an integrated circuit, the integrated circuit being designed to implement the method disclosed in the first aspect above.

[0043] In a fifth aspect, embodiments of the present invention disclose an earphone, comprising:

[0044] A processor for implementing the method disclosed in the first aspect above.

[0045] [Beneficial Effects]

[0046] According to embodiments of the present invention, an environmental audio processing method, apparatus, and headset for headphones are disclosed. The method utilizes multiple microphones to collect and identify ambient sounds in the user's environment to obtain a first ambient sound source. When a target sound source is detected within the first ambient sound source, the first ambient sound source is played to the user, thereby indicating the presence of a target sound source in the user's current environment. Furthermore, as the user's head position changes, a preset number of microphones are used to collect ambient sounds in the user's environment to obtain a second ambient sound source. By inversely superimposing the energy of the target sound source in the first ambient sound source and the energy of the target sound source in the second ambient sound source, a superimposed energy difference is obtained. A prompt signal positively or negatively correlated with the magnitude of the energy difference is output. This allows the user to determine whether they are approaching a target sound source during movement, thereby achieving the goal of quickly locating the specific direction and position of the target sound source and improving the user experience when wearing headphones.

[0047] Other beneficial effects of the present invention will be explained in detail through the introduction of specific technical features and technical solutions in specific embodiments. Those skilled in the art should be able to understand the beneficial technical effects of these features and solutions from that introduction.

Attached Figure Description

[0048] The embodiments of the present invention will now be described with reference to the accompanying drawings. In the drawings:

[0049] Figure 1 is a flowchart of an environmental audio processing method for headphones disclosed in this embodiment;

[0050] Figure 2 is a schematic diagram of the ambient sound acquisition in the first acquisition mode of the headphones disclosed in this embodiment;

[0051] Figure 3 is a schematic diagram of the ambient sound acquisition in the second acquisition mode of the headphones disclosed in this embodiment;

[0052] Figure 4 is a schematic diagram of a two-microphone acquisition method disclosed in this embodiment;

[0053] Figure 5 is a schematic diagram illustrating an example of a superimposed energy difference waveform disclosed in this embodiment;

[0054] Figure 6 is a schematic diagram of an environmental audio processing device for headphones disclosed in this embodiment.

Detailed Implementation

[0055] The present invention is described below based on embodiments, but the present invention is not limited to these embodiments. In the following detailed description of the present invention, some specific details are described in detail, but well-known methods, processes, procedures, and elements are not described in detail in order to avoid obscuring the essence of the present invention.

[0056] Furthermore, those skilled in the art should understand that the accompanying drawings provided herein are for illustrative purposes only and are not necessarily drawn to scale.

[0057] Unless the context explicitly requires it, the words "comprising," "including," and similar terms throughout the specification and claims should be interpreted as encompassing rather than being exclusive or exhaustive; that is, meaning "including but not limited to."

[0058] In the description of this invention, it should be understood that the terms "first," "second," etc., are used for descriptive purposes only and should not be construed as indicating or implying relative importance. Furthermore, in the description of this invention, unless otherwise stated, "a plurality of" means two or more.

[0059] To alert users to target sound sources when they wear headphones, this embodiment discloses an environmental audio processing method for headphones. In a specific embodiment, multiple microphones are provided on the headphones to form a microphone array, enabling the headphones to collect ambient audio around the user when worn. Specifically, when multiple microphones are provided, they can be arranged around the headphones to allow the headphones to collect ambient audio outside the ears. For a pair of headphones, microphones are provided on the left and right earpieces respectively, thereby achieving 360° sound pickup around the paired headphones. This embodiment does not limit the arrangement of the microphones, as long as the microphone array can collect ambient audio over a sufficiently wide angle.

[0060] Please refer to Figure 1, which is a flowchart of an environmental audio processing method for headphones disclosed in this embodiment. The environmental audio processing method for headphones includes steps S100, S200, S300, S400, and S500, wherein:

[0061] Step S100: In the first acquisition mode, multiple microphones are used to acquire and identify the ambient sound of the user's surroundings to obtain the first ambient sound source. Please refer to Figure 2, which illustrates the acquisition of ambient sound in the first acquisition mode of the headphones disclosed in this embodiment. When multiple microphones are arranged to form a microphone array, the microphone array can achieve 360-degree positioning, that is, it can acquire ambient audio around the user in 360°. In this embodiment, the first ambient sound source refers to the ambient audio acquired by the multiple microphones. After acquiring the first ambient sound source, speech recognition can be performed on the first ambient sound source to extract the target sound source.

[0062] Step S200: When the target sound source information is detected in the first ambient sound source, the first ambient sound source is played to the user. Please refer to Figure 2. When a user is wearing the headphones and a target sound source is present in the surrounding environment (e.g., a friend calling the user), this target sound source is captured by the multiple microphones as part of the first ambient sound source. Voice recognition technology can then identify the target sound source information from the first ambient sound source. In this embodiment, the target sound source information can be keywords. Specifically, the user can set keywords according to their actual needs, such as "Zhang San," "Zhang Gong," "Xiao Zhang," etc. In a specific embodiment, when the target sound source information is detected in the first ambient sound source, the first ambient sound source can be played to the user through the headphones' speaker to attract the user's attention, and step S300 is then executed.
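The keyword check in step S200 can be sketched as follows. This is only a minimal illustration, assuming a speech recognizer has already converted the first ambient sound source into a text transcript; the function name `contains_target_keyword` and the constant `USER_KEYWORDS` are hypothetical and do not come from the patent.

```python
def contains_target_keyword(transcript: str, keywords: list[str]) -> bool:
    """Return True if any user-configured keyword appears in the transcript."""
    normalized = transcript.casefold()
    # Empty keywords are skipped so they cannot match every transcript.
    return any(kw.casefold() in normalized for kw in keywords if kw)


# Hypothetical user-configured keywords (names the user answers to).
USER_KEYWORDS = ["Zhang San", "Zhang Gong", "Xiao Zhang"]

if contains_target_keyword("Hey Xiao Zhang, do you have a minute?", USER_KEYWORDS):
    # At this point the method would play the first ambient sound source
    # to the user and continue with step S300.
    print("target sound source information detected")
```

A plain substring match is used here for brevity; a real recognizer would more likely report keyword hits directly.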

[0063] It should be noted that, in the specific implementation process, when the first ambient sound source is played to the user, the target sound source can undergo secondary amplification. The amplification can enhance the energy of the target sound source, or reduce the ambient noise in the first ambient sound source. Of course, it can also combine the above enhancement and noise reduction processes, so that the user can better identify the target sound source.

[0064] Step S300: During the change in the user's head position, the system switches to a second acquisition mode, using a preset number of microphones to collect ambient sounds to obtain a second ambient sound source. Specifically, after the first ambient sound source is played to the user through the headphones' speakers, a user who identifies the target sound source and intends to communicate further with the speaker will search for the target sound source, meaning the user's head position will change. During this process, the position of the headphones changes together with the user's head position, and this change in head position can be detected, for example, by a gyroscope. In this embodiment, when a change in the user's head position is detected, the system switches to the second acquisition mode to collect the second ambient sound source.
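The mode switch described above can be sketched as a small decision function. This is an illustrative sketch only: the patent states that a gyroscope can detect the head movement but gives no threshold, so `TURN_THRESHOLD_DPS`, `AcquisitionMode`, and `select_mode` are assumed names and values.

```python
from enum import Enum


class AcquisitionMode(Enum):
    FIRST = "all microphones"
    SECOND = "preset subset of microphones"


# Assumed threshold: yaw rate (degrees per second) above which the head
# is treated as turning. The patent does not specify a value.
TURN_THRESHOLD_DPS = 15.0


def select_mode(yaw_rate_dps: float, target_detected: bool) -> AcquisitionMode:
    """Pick the acquisition mode from the gyroscope yaw rate and the S200 result."""
    head_is_turning = abs(yaw_rate_dps) > TURN_THRESHOLD_DPS
    if target_detected and head_is_turning:
        return AcquisitionMode.SECOND
    return AcquisitionMode.FIRST
```

For example, `select_mode(40.0, True)` selects the second mode, while a still head or a missing keyword keeps the first mode.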

[0065] Please refer to Figure 3, which illustrates the acquisition of ambient sound in the second acquisition mode of the headphones disclosed in this embodiment. In this embodiment, the number of microphones acquiring ambient sound in the second acquisition mode is less than the number of microphones acquiring ambient sound in the first acquisition mode. Because fewer microphones are used to acquire ambient sound in the second acquisition mode, the acquisition angle is limited (i.e., it cannot achieve 360-degree acquisition around the user). For example, when the preset number of microphones are facing the target sound source, the target sound source is within the acquisition range of the preset number of microphones and can be acquired by the microphones. Conversely, when the preset number of microphones are not facing the target sound source (e.g., facing away from the target sound source), the target sound source is outside the acquisition range and cannot be acquired by the microphones.

[0066] Specifically, in step S300, the preset number of microphones is two microphones, and the audio ranges picked up by the two microphones at least partially overlap. Typically, each microphone has a pickup range of 360°. The partial overlap of the audio ranges picked up by the two microphones ensures that the two microphones face the same direction rather than opposite directions. For example, this avoids placing one microphone on the left earpiece and the other on the right earpiece, in which case the target sound source could be picked up when the user turns to one side and still be picked up when the user turns to the opposite side. Please refer to Figure 4, which illustrates a two-microphone acquisition method disclosed in this embodiment. In the diagram, "Microphone 1" and "Microphone 2" represent two different microphones. During the subsequent execution of the environmental noise reduction algorithm, the pointing area of the two microphones' acquisition range is the audio range acquired by the preset number of microphones. That is, when the target sound source is within this pointing area, it can be acquired by the preset number of microphones; conversely, when the target sound source is outside this pointing area, the preset number of microphones cannot acquire the target sound source. In this embodiment, the pointing area is defined as the range within a preset angle (e.g., 60°) with the line connecting "Microphone 1" and "Microphone 2" as the central axis. This area is indicated by the dashed line in Figure 4.
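The pointing-area test can be illustrated with a simple bearing comparison. This sketch makes two assumptions that the patent does not settle: directions are expressed as horizontal bearings in degrees, and the preset angle (60° in the example) is the full opening angle centred on the microphone axis. The function name `in_pointing_area` is hypothetical.

```python
def in_pointing_area(axis_deg: float, source_deg: float,
                     preset_angle_deg: float = 60.0) -> bool:
    """True if the source bearing lies inside the pointing area.

    Assumption: preset_angle_deg is the full opening angle, so the source
    may deviate from the microphone axis by at most half of it.
    """
    # Signed difference between the bearings, wrapped into [-180, 180).
    diff = (source_deg - axis_deg + 180.0) % 360.0 - 180.0
    return abs(diff) <= preset_angle_deg / 2.0
```

With the axis at 0°, a source at 20° falls inside the area and a source at 45° falls outside; the wrap-around step keeps the test correct when the axis is near 350° and the source near 10°.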

[0067] Step S400: The energy of the target sound source in the first ambient sound source and the energy of the target sound source in the second ambient sound source are superimposed in opposite phases to obtain the superimposed energy difference. Because the range over which the preset number of microphones collect ambient audio is limited in the second acquisition mode, as the user's head turns, when the target sound source is outside the microphones' acquisition range, there will be no energy of the target sound source in the second ambient sound source; when the target sound source is within the microphones' acquisition range, there will be energy of the target sound source in the second ambient sound source; and the closer the target sound source is to the center of the acquisition range of the preset number of microphones, the stronger the energy of the target sound source in the second ambient sound source. In the first acquisition mode, because multiple microphones are used to collect the first ambient sound source, the energy of the target sound source in the first ambient sound source remains basically the same as the head turns. Therefore, the superimposed energy difference indicates whether the orientation of the acquisition range of the preset number of microphones is close to the target sound source.
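One way to read the opposite-phase superposition of step S400 is as a sample-by-sample subtraction followed by an energy measurement. The sketch below takes that reading as an assumption; the patent does not state whether the superposition is applied to waveforms or to energy values, and the helper names `energy` and `superimposed_energy_difference` are hypothetical.

```python
def energy(samples: list[float]) -> float:
    """Mean-square energy of a block of samples (0.0 for an empty block)."""
    return sum(s * s for s in samples) / len(samples) if samples else 0.0


def superimposed_energy_difference(first: list[float], second: list[float]) -> float:
    """Energy left after adding the second block in opposite phase to the first.

    Adding a phase-inverted copy equals subtracting sample by sample.
    Assumes both blocks are time-aligned and equally long.
    """
    if len(first) != len(second):
        raise ValueError("blocks must have the same length")
    residual = [a - b for a, b in zip(first, second)]
    return energy(residual)


target_in_first = [0.5, -0.5, 0.5, -0.5]
facing_source = target_in_first            # target fully captured in mode 2
facing_away = [0.0, 0.0, 0.0, 0.0]         # target outside the pickup range

print(superimposed_energy_difference(target_in_first, facing_source))  # 0.0
print(superimposed_energy_difference(target_in_first, facing_away))    # 0.25
```

Under this reading the residual energy shrinks toward zero as the microphones point at the target and rises toward the full target energy as they point away.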

[0068] Step S500: Output a prompt signal indicating changes based on the superimposed energy difference. As described in step S400, the superimposed energy difference indicates whether the orientation of a preset number of microphone acquisition ranges is close to the target sound source. Therefore, the intensity of the output prompt signal is positively or negatively correlated with the magnitude of the energy difference. The intensity of the prompt signal can convey information about whether the orientation of the microphone acquisition range is close to the target sound source, meaning the user can know whether their current orientation is close to the target sound source. Specifically, during the user's search for the target sound source, the headphones will simultaneously turn with the user's head position to acquire the current second ambient sound source. The energy of the target sound source in the second ambient sound source is compared with the energy of the target sound source in the first ambient sound source. This comparison reflects the angle between the preset number of microphone acquisition angles and the target sound source. As the current acquisition angle decreases relative to the target sound source, the user is getting closer to the correct target sound source direction, and the headphones can provide feedback.

[0069] In this embodiment, audio energy is obtained by superimposing the target sound source in the second ambient sound source at different locations with the target sound source in the first ambient sound source in reverse phase. The audio energy gradually decreases as it is superimposed, thus guiding the user to turn towards the correct sound source direction. During the turning process, within a preset time window interval, second ambient sound source information is acquired at different locations. Voice noise reduction is performed to extract the human voice, and the average sound energy threshold is calculated. This threshold is then superimposed with the target sound source in the first ambient sound source obtained from the multi-microphone array in reverse phase to obtain the superimposed audio signal. When the energy value of the superimposed audio signal is within a preset range (small range), the current angle is considered to be the angle where the target sound source exists; when the energy value of the superimposed audio signal exceeds the preset range, the current angle is considered not to be the angle where the target sound source exists.
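The time-window check in this paragraph can be sketched as a running average compared against a preset upper bound. This is a hedged illustration: the window length, the bound, and the class name `WindowedDirectionCheck` are assumptions, and the input is taken to be one residual energy value per audio frame after opposite-phase superposition.

```python
from collections import deque


class WindowedDirectionCheck:
    """Average residual energy over a fixed window and test it against a preset bound."""

    def __init__(self, window_frames: int, preset_max: float) -> None:
        self._values = deque(maxlen=window_frames)  # oldest frame drops out automatically
        self._preset_max = preset_max

    def update(self, residual_energy: float) -> bool:
        """Add one frame; return True if the current angle is judged to contain the target."""
        self._values.append(residual_energy)
        window_full = len(self._values) == self._values.maxlen
        mean = sum(self._values) / len(self._values)
        # Only decide once a full window has been observed.
        return window_full and mean <= self._preset_max
```

Averaging over a window keeps a single quiet frame from being mistaken for the target direction while the head is still sweeping.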

[0070] Furthermore, in this solution, the dual microphones of the headphones retain only a certain pickup angle in front. Therefore, if the user is facing away from the target sound source, the target sound source signal is filtered out, and the audio signal energy value calculated after superposition exceeds the preset range. Conversely, when the user faces the target sound source, the energy value calculated after superimposing the target sound source signal in opposite phase falls within the preset range, which correctly guides the user toward the direction in which the target sound source is in front of them.

[0071] To help those skilled in the art better understand the principle of the superimposed energy difference, please refer to Figure 5, which is a schematic diagram illustrating an example of a superimposed energy difference waveform disclosed in this embodiment. It shows the audio signal energy values obtained by superimposing, in opposite phase, the second ambient sound source collected by the preset number of microphones at different head positions and the first ambient sound source. P0 represents the energy value of the target sound source in the first ambient sound source; P1 represents the energy value of the second ambient sound source and the first ambient sound source superimposed in opposite phase when the target sound source is directly to the left or right of the user wearing headphones; when the user turns to the opposite direction of the sound source, that is, the target sound source is directly behind the user, the energy value of the inverse superimposed signal is as shown in P2, and its inverse superimposed signal is within a preset range; P1 represents the target sound source being directly in front of the user wearing headphones, the superimposed audio signal energy value is within a preset range and is less than P2, that is, the user shown in P1 is closer to the target sound source.

[0072] In an optional embodiment, in step S500, the prompt signal is a prompt tone and / or a vibration prompt from the headphones. When the prompt signal is a prompt tone, the prompt tone can be played through the headphones' speaker; it can be a semantic prompt tone, or a prompt tone such as a beep or a whistle. When the prompt signal is a vibration prompt, a vibration motor can be used to achieve the vibration prompt.

[0073] In an optional embodiment, the intensity of the change in the prompt signal is negatively correlated with the magnitude of the energy difference; the smaller the energy difference, the louder the prompt sound and / or the faster the vibration frequency.
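The negative correlation between the energy difference and the prompt can be sketched as a linear mapping. The patent only fixes the direction of the relationship, so the linear shape, the normalization by the reference energy of the target in the first ambient sound source, and the vibration limits `MIN_VIBRATION_HZ` and `MAX_VIBRATION_HZ` are all assumptions made for illustration.

```python
# Assumed vibration range; the patent gives no concrete values.
MIN_VIBRATION_HZ = 1.0
MAX_VIBRATION_HZ = 8.0


def prompt_levels(energy_diff: float, reference_energy: float) -> tuple[float, float]:
    """Map the energy difference to (tone volume in 0..1, vibration rate in Hz).

    Negative correlation: the smaller the difference, the louder the tone
    and the faster the vibration.
    """
    if reference_energy <= 0.0:
        return 0.0, MIN_VIBRATION_HZ
    # Normalize and clamp so that 0 means "facing the source", 1 means "facing away".
    ratio = min(max(energy_diff / reference_energy, 0.0), 1.0)
    closeness = 1.0 - ratio
    volume = closeness
    vibration_hz = MIN_VIBRATION_HZ + closeness * (MAX_VIBRATION_HZ - MIN_VIBRATION_HZ)
    return volume, vibration_hz
```

With a reference energy of 0.25, an energy difference of 0.0 yields full volume and the fastest vibration, while a difference of 0.25 yields silence and the slowest vibration.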

[0074] To more accurately calculate the energy of the target sound source, in an optional embodiment, between steps S200 and S400, the method further includes: when target sound source information is detected in the first ambient sound source, performing noise reduction processing on the first ambient sound source; and calculating the energy of the target sound source in the first ambient sound source after noise reduction processing. In this embodiment, by performing noise reduction processing on the first ambient sound source, the object of calculation when calculating the energy of the target sound source is the target sound source in the first ambient sound source after noise reduction processing, thereby better reflecting the energy of the target sound source. Consequently, when obtaining the superimposed energy difference through inverse superposition, a more accurate superimposed energy difference can be obtained.

[0075] In an optional embodiment, when performing step S200 to play the first ambient sound source to the user, the target sound source can be played after noise reduction, so that the user can better identify the target sound source.

[0076] To facilitate communication between the user and the speaker of the target sound source, in an optional embodiment, after step S400, the following steps are included: Step S610, comparing the superimposed energy difference with a preset value; when the superimposed energy difference is less than the preset value, step S620 is executed; Step S620, the working mode of the headphones is switched to transparency mode so that the user can be aware of the sound of the user's current environment. Specifically, when the superimposed energy difference is less than the preset value, it is assumed that the user is facing the target sound source. At this time, the headphones can be switched to transparency mode so that the user can hear the ambient sound. In specific implementation, transparency mode can be a non-active noise cancellation mode or a mode where the microphone collects the ambient sound and plays the ambient sound through the speaker. Specifically, when the headphones are active noise cancellation headphones, the active noise cancellation function can be turned off so that the ambient sound can be transmitted to the user's ears; when the headphones are passive noise cancellation headphones, the ambient sound can be collected by the microphone and played through the speaker, thereby also transmitting the ambient sound to the user, so that the user can communicate with the speaker of the target sound source. Of course, active noise-canceling headphones can also implement transparency mode by collecting the ambient sound with the microphones and playing it through the speaker.

[0077] To reduce the misidentification of target sound sources, in an optional embodiment, between step S610 and step S620, the following step is further included:

[0078] Step S630: Detect whether the user has held the current head posture for longer than a preset duration; if so, proceed to step S620. Specifically, when step S610 initially determines that a target sound source exists in the current direction, it can be further determined whether the user's dwell time reaches the preset duration. If the dwell time exceeds the preset duration, the headset switches to transparency mode. This avoids keyword misjudgments caused by factors such as advertising words or duplicate names, reducing the probability of accidentally switching to transparency mode. In this embodiment, the above steps prevent the headset from mistakenly switching to transparency mode or interrupting the user's multimedia experience when the user does not expect a response from an external call source. They also avoid misjudgments triggered by audio that contains content similar to the wake word, such as advertisements.
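
The dwell check of step S630 can be sketched as a small timer that gates step S620. The disclosure does not state how head posture is sensed, so the `head_moved` flag (for example, derived from a motion sensor), the class name `DwellGate`, and the use of caller-supplied timestamps in seconds are assumptions made for illustration.

```python
from typing import Optional


class DwellGate:
    """Allows step S620 only after the head posture has been held long enough."""

    def __init__(self, preset_duration_s: float) -> None:
        self.preset_duration_s = preset_duration_s
        self._held_since: Optional[float] = None

    def update(self, facing_target: bool, head_moved: bool, now_s: float) -> bool:
        """Return True once the posture has been held longer than the preset duration.

        facing_target is the result of step S610; head_moved is a hypothetical
        flag indicating that the head posture changed since the last update.
        """
        if not facing_target or head_moved:
            # Posture broken or S610 failed: restart the dwell measurement.
            self._held_since = None
            return False
        if self._held_since is None:
            self._held_since = now_s
        return (now_s - self._held_since) > self.preset_duration_s
```

With a preset duration of 1.0 s, a brief glance toward an advertisement that happens to contain the keyword would reset the timer before it elapses, so transparency mode is not entered.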

[0079] It should be noted that when the headphone's working mode is switched to transparency mode, multimedia can be further paused (music playback is paused), allowing users to better understand the sound of their current environment.

[0080] This embodiment also discloses an environmental audio processing device for headphones. The headphones are equipped with multiple microphones forming a microphone array, enabling the headphones to collect ambient audio from the user's surroundings when worn. Please refer to Figure 6, which is a schematic diagram of an environmental audio processing device for headphones disclosed in this embodiment. The environmental audio processing device includes: a first acquisition module 100, a target sound source identification module 200, a second acquisition module 300, a phase-inverse superposition module 400, and a prompt module 500, wherein:

[0081] The first acquisition module 100 is used in the first acquisition mode to acquire and identify the ambient sound of the user's environment using multiple microphones to obtain the first ambient sound source.

[0082] The target sound source identification module 200 is used to play the first ambient sound source to the user and to invoke the second acquisition module 300 when target sound source information is identified in the first ambient sound source.

[0083] The second acquisition module 300 is used to switch to the second acquisition mode during the change of the user's head position, and use a preset number of microphones to collect the ambient sound of the user's environment to obtain the second ambient sound source. In the second acquisition mode, the number of microphones collecting ambient sound is less than the number of microphones collecting ambient sound in the first acquisition mode.

[0084] The phase-inverse superposition module 400 is used to inversely superimpose the energy of the target sound source in the first ambient sound source and the energy of the target sound source in the second ambient sound source to obtain the superimposed energy difference.
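
One possible reading of the inverse superposition is sketched below: the second energy is added with inverted sign and the magnitude of the result is taken as the superimposed energy difference. This interpretation, the function name, and the energy values are assumptions; the disclosure does not give an explicit formula.

```python
def superimposed_energy_difference(energy_first: float, energy_second: float) -> float:
    """Add the second energy with inverted sign and return the magnitude (assumed reading)."""
    if energy_first < 0 or energy_second < 0:
        raise ValueError("energies must be non-negative")
    return abs(energy_first + (-energy_second))


# Hypothetical values: energy of the target in the first ambient sound source,
# then three successive readings from the second acquisition mode as the head turns.
energy_first = 0.40
second_readings = [0.10, 0.25, 0.38]

differences = [
    superimposed_energy_difference(energy_first, e) for e in second_readings
]
```

Under these assumed values the difference shrinks with each reading, which corresponds to the user's orientation approaching the direction of the target sound source.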

[0085] The prompt module 500 is used to output a changing prompt signal based on the superimposed energy difference. The intensity of the change in the prompt signal is positively or negatively correlated with the magnitude of the energy difference, so that the user can know whether the current orientation is close to the target sound source.

[0086] In an optional embodiment, it further includes:

[0087] The noise reduction module is used to perform noise reduction processing on the first ambient sound source when the target sound source information is detected in the first ambient sound source.

[0088] The energy calculation module is used to calculate the energy of the target sound source in the first ambient sound source after noise reduction processing.

[0089] In an optional embodiment, the prompt signal is a prompt tone and / or a vibration prompt from the headphones.

[0090] In an optional embodiment, the intensity of the change in the prompt signal is negatively correlated with the magnitude of the energy difference; the smaller the energy difference, the louder the prompt sound and / or the faster the vibration frequency.
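
The negative correlation described here can be sketched as a linear mapping from the energy difference to a prompt volume and a vibration frequency. The linear form, the normalising constant `max_diff`, and the frequency range are assumptions for illustration only; the disclosure requires only that a smaller difference produce a louder tone and / or a faster vibration.

```python
from typing import Tuple


def prompt_levels(
    energy_diff: float,
    max_diff: float,
    min_hz: float = 1.0,
    max_hz: float = 10.0,
) -> Tuple[float, float]:
    """Map an energy difference to (volume in [0, 1], vibration frequency in Hz)."""
    if max_diff <= 0:
        raise ValueError("max_diff must be positive")
    # Clamp the normalised difference to [0, 1] before inverting it.
    ratio = min(max(energy_diff / max_diff, 0.0), 1.0)
    closeness = 1.0 - ratio  # larger when the difference is smaller
    volume = closeness
    vibration_hz = min_hz + closeness * (max_hz - min_hz)
    return volume, vibration_hz
```

With the default range, a difference of zero gives full volume and the fastest vibration, and a difference at or above `max_diff` gives silence and the slowest vibration.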

[0091] In an optional embodiment, the preset number of microphones is two microphones, and the audio ranges captured by the two microphones at least partially overlap.

[0092] In an optional embodiment, it further includes:

[0093] The comparison module is used to compare the superimposed energy difference with a preset value; when the superimposed energy difference is less than the preset value, the mode switching module is executed.

[0094] The mode switching module is used to switch the headphones' working mode to transparency mode, so that users can be aware of the sound of their current environment.

[0095] In an optional embodiment, it further includes:

[0096] The posture detection module is used to detect whether the duration of the user's head posture exceeds the preset duration; when the duration of the user's head posture exceeds the preset duration, the mode switching module is executed.

[0097] This embodiment also discloses a computer-readable storage medium, such as a chip or optical disc, on which a computer program is stored. The computer program stored in the storage medium is used to be executed to implement the method disclosed in the above embodiment.

[0098] This embodiment also discloses a chip for an audio device having an integrated circuit, which is designed to implement the methods disclosed in the above embodiments.

[0099] This embodiment also discloses an earphone, which can be a wired earphone or a wireless earphone. The earphone includes a processor for implementing the method disclosed in the above embodiment.

[0100] According to embodiments of the present invention, an environmental audio processing method, apparatus, and headset for headphones are disclosed. The method utilizes multiple microphones to collect and identify ambient sounds in the user's environment to obtain a first ambient sound source. When a target sound source is detected within the first ambient sound source, the first ambient sound source is played to the user, thereby indicating the presence of a target sound source in the user's current environment. Furthermore, as the user's head position changes, a preset number of microphones are used to collect ambient sounds in the user's environment to obtain a second ambient sound source. By inversely superimposing the energy of the target sound source in the first ambient sound source and the energy of the target sound source in the second ambient sound source, a superimposed energy difference is obtained. A prompt signal positively or negatively correlated with the magnitude of the energy difference is output. This allows the user to determine whether they are approaching a target sound source during movement, thereby achieving the goal of quickly locating the specific direction and position of the target sound source and improving the user experience when wearing headphones.

[0101] This solution enables users wearing active noise-canceling headphones to quickly locate the specific direction and position of the speaker they intend to communicate with and to hear the content of the speech, thus assisting and guiding users to quickly find the location of the target speaker.

[0102] Furthermore, this avoids the headphones mistakenly switching to transparency mode or interrupting the user's multimedia experience when the user does not expect a response from an external audio source.

[0103] It should be noted that the computer-readable storage medium described in the embodiments of this disclosure is not limited to the embodiments given above. For example, it can also be an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination thereof. More specific examples of computer-readable storage media may include, but are not limited to: electrical connections having one or more wires, portable computer disks, hard disks, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disk read-only memory (CD-ROM), optical storage devices, magnetic storage devices, or any suitable combination thereof. In the embodiments of this disclosure, the computer-readable storage medium can be any tangible medium that contains or stores a program that can be used by or in conjunction with an instruction execution system, apparatus, or device.

[0104] It will be understood by those skilled in the art that the above-described preferred solutions can be freely combined and superimposed without conflict. The flowcharts and block diagrams in the accompanying drawings illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of this disclosure. In this regard, each block in a flowchart or block diagram may represent a module, segment, or portion of code containing one or more executable instructions for implementing a specified logical function. It should also be noted that in some alternative implementations, the functions indicated in the blocks may occur in a different order than those indicated in the drawings; for example, two consecutively indicated blocks may actually be executed substantially in parallel, or sometimes in reverse order, depending on the functions involved. It should also be noted that each block in the block diagrams and / or flowcharts, and combinations of blocks in the block diagrams and / or flowcharts, can be implemented using a dedicated hardware-based system that performs the specified function or operation, or using a combination of dedicated hardware and computer instructions. The numbering of each step in this document is for ease of explanation and reference only and is not intended to limit the order of execution. The specific execution order is determined by the technology itself, and those skilled in the art can determine various permissible and reasonable orders based on the technology itself.

[0105] It should be noted that the use of step numbers (letters or numbers) to refer to certain specific method steps in this invention is merely for the purpose of convenience and brevity in description, and is by no means intended to restrict the order of these method steps. Those skilled in the art will understand that the order of the relevant method steps should be determined by the technology itself and should not be unduly restricted by the existence of step numbers. Those skilled in the art can determine various permissible and reasonable orderings of steps based on the technology itself.

[0106] Those skilled in the art will understand that, without conflict, the above-mentioned preferred solutions can be freely combined and superimposed.

[0107] It should be understood that the above embodiments are merely exemplary and not restrictive. Various obvious or equivalent modifications or substitutions that can be made by those skilled in the art regarding the above details without departing from the basic principles of the present invention will be included within the scope of the claims of the present invention.

Claims

1. An environmental audio processing method for headphones, wherein the headphones are equipped with multiple microphones forming a microphone array, so that the headphones can collect ambient audio around the user when worn by the user, characterized in that the environmental audio processing method includes:
Step S100: In the first acquisition mode, multiple microphones are used to acquire and identify the ambient sound of the user's environment to obtain the first ambient sound source.
Step S200: When the presence of target sound source information in the first ambient sound source is detected, the first ambient sound source is played to the user, and step S300 is executed; the target sound source information includes keywords.
Step S300: During the change of the user's head position, switch to the second acquisition mode and use a preset number of microphones to collect the ambient sound of the user's environment to obtain the second ambient sound source. The number of microphones collecting ambient sound in the second acquisition mode is less than the number of microphones collecting ambient sound in the first acquisition mode.
Step S400: The energy of the target sound source in the first ambient sound source and the energy of the target sound source in the second ambient sound source are superimposed in reverse phase to obtain the superimposed energy difference; the superimposed energy difference can reflect the size of the angle between the sampling angle of the preset number of microphones and the angle of the target sound source. As the angle between the current sampling angle and the angle of the target sound source becomes smaller, the user gets closer and closer to the correct direction of the target sound source.
Step S500: Output a prompt signal that changes according to the superimposed energy difference, wherein the intensity of the change in the prompt signal is positively or negatively correlated with the magnitude of the energy difference, so that the user can know whether the current orientation is close to the target sound source.

2. The environmental audio processing method as described in claim 1, characterized in that, between step S200 and step S400, the method further includes: when the presence of target sound source information in the first ambient sound source is detected, performing noise reduction processing on the first ambient sound source; and calculating the energy of the target sound source in the first ambient sound source after noise reduction processing.

3. The environmental audio processing method as described in claim 1, characterized in that, in step S500, the prompt signal is a prompt tone and / or a vibration prompt from the headphones.

4. The environmental audio processing method as described in claim 3, characterized in that the intensity of the change in the prompt signal is negatively correlated with the magnitude of the energy difference: the smaller the energy difference, the louder the prompt tone and / or the faster the vibration frequency.

5. The environmental audio processing method according to any one of claims 1-4, characterized in that, in step S300, the preset number of microphones is two microphones, and the audio ranges collected by the two microphones at least partially overlap.

6. The environmental audio processing method according to any one of claims 1-4, characterized in that the method further includes the following after step S400: Step S610: Compare the superimposed energy difference with a preset value; when the superimposed energy difference is less than the preset value, proceed to step S620. Step S620: Switch the working mode of the headphones to transparency mode so that the user can hear the sound of the user's current environment.

7. The environmental audio processing method as described in claim 6, characterized in that, between steps S610 and S620, the method further includes: Step S630: Detect whether the user has maintained the current head posture for longer than the preset duration; when the user has maintained the head posture for longer than the preset duration, execute step S620.

8. An ambient audio processing device for headphones, wherein the headphones are provided with a plurality of microphones forming a microphone array, so as to enable the headphones to collect ambient audio around the user when worn by the user, characterized in that the environmental audio processing device includes:
The first acquisition module (100) is used to acquire and identify the ambient sound of the user's environment using multiple microphones in the first acquisition mode to obtain the first ambient sound source.
The target sound source identification module (200) is used to play the first ambient sound source to the user and execute the second acquisition module (300) when the target sound source information is identified in the first ambient sound source; the target sound source information includes keywords.
The second acquisition module (300) is used to switch to the second acquisition mode during the change of the user's head position, and use a preset number of microphones to collect the ambient sound of the user's environment to obtain the second ambient sound source. The number of microphones collecting ambient sound in the second acquisition mode is less than the number of microphones collecting ambient sound in the first acquisition mode.
The phase-inverting superposition module (400) is used to invert and superimpose the energy of the target sound source in the first ambient sound source and the energy of the target sound source in the second ambient sound source to obtain the superimposed energy difference; the superimposed energy difference can reflect the size of the angle between the sampling angle of the preset number of microphones and the angle of the target sound source. As the angle between the current sampling angle and the angle of the target sound source becomes smaller, the user gets closer and closer to the correct direction of the target sound source.
The prompting module (500) is used to output a changing prompting signal based on the superimposed energy difference, wherein the intensity of the change in the prompting signal is positively or negatively correlated with the magnitude of the energy difference, so as to inform the user whether the current orientation is close to the target sound source.

9. The environmental audio processing apparatus as described in claim 8, characterized in that it further includes: a noise reduction module, used to perform noise reduction processing on the first ambient sound source when the presence of target sound source information is detected in the first ambient sound source; and an energy calculation module, used to calculate the energy of the target sound source in the first ambient sound source after noise reduction processing.

10. The environmental audio processing apparatus as claimed in claim 8, characterized in that the prompt signal is a prompt tone and / or a vibration prompt from the headphones.

11. The environmental audio processing apparatus as claimed in claim 10, characterized in that the intensity of the change in the prompt signal is negatively correlated with the magnitude of the energy difference: the smaller the energy difference, the louder the prompt tone and / or the faster the vibration frequency.

12. The environmental audio processing apparatus according to any one of claims 8-11, characterized in that the preset number of microphones is two microphones, and the audio ranges captured by the two microphones at least partially overlap.

13. The environmental audio processing apparatus according to any one of claims 8-11, characterized in that it further includes: a comparison module, used to compare the superimposed energy difference with a preset value, the mode switching module being executed when the superimposed energy difference is less than the preset value; and a mode switching module, used to switch the working mode of the headphones to transparency mode so that the user can hear the sound of the user's current environment.

14. The environmental audio processing apparatus as described in claim 13, characterized in that it further includes: a posture detection module, used to detect whether the user has maintained the current head posture for longer than the preset duration, the mode switching module being executed when the user has maintained the head posture for longer than the preset duration.

15. A computer-readable storage medium having a computer program stored thereon, characterized in that the computer program stored in the storage medium is used to be executed to implement the method as described in any one of claims 1-7.

16. A chip for an audio device, having an integrated circuit thereon, characterized in that the integrated circuit is designed to implement the method as described in any one of claims 1-7.

17. An earphone, characterized in that it includes: a processor for implementing the method as described in any one of claims 1-7.