Method and device for activation tone detection
A technique concerning sound frames and silence detection, applied in the field of communication, that addresses the inaccurate detection of existing voice activity detection (VAD) solutions and improves detection accuracy and user experience.
Examples
Embodiment 1
[0090] This embodiment provides a VAD method. As shown in Figure 4, the method includes:
[0091] Step S402: Obtain the output results of the two existing VADs.
[0092] Step S404: Obtain the subband signal and spectrum amplitude of the current frame.
[0093] In the embodiment of the present invention, an audio stream with a frame length of 20 ms and a sampling rate of 32 kHz is taken as an example for specific description. The joint voice activity detection method provided by the embodiment of the present invention is also applicable to other frame lengths and sampling rates.
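As a worked illustration of the example parameters above, a 20 ms frame at 32 kHz contains 640 samples. The following is a minimal sketch of that arithmetic and of cutting a signal into such frames. The function names `samples_per_frame` and `split_frames` are hypothetical and do not appear in the source, and dropping a trailing partial frame is an assumption made only for this illustration.

```python
def samples_per_frame(sample_rate_hz, frame_ms):
    """Number of samples in one frame (hypothetical helper, not from the source)."""
    return sample_rate_hz * frame_ms // 1000


def split_frames(signal, frame_len):
    """Cut a sequence of samples into consecutive, non-overlapping frames.

    Assumption for this sketch: a trailing partial frame is dropped.
    """
    n_full = len(signal) // frame_len
    return [signal[i * frame_len:(i + 1) * frame_len] for i in range(n_full)]


# Example parameters from the text: 20 ms frames at 32 kHz.
frame_len = samples_per_frame(32000, 20)
print(frame_len)  # 640
```

The same helpers work unchanged for other frame lengths and sampling rates, which matches the statement that the method is not tied to these particular values.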
[0094] The time-domain signal of the current frame is input to the filter bank unit, and sub-band filtering is performed to obtain the filter bank sub-band signal.
[0095] In this embodiment, a 40-channel filter bank is used, and the technical solution provided by the embodiment of the present invention is also applicable to filter banks with other numbers of channe...
Embodiment 2
[0239] Step S432 of Embodiment 1 may also be implemented in the following manner:
[0240] A final joint VAD decision result is obtained according to at least one feature in feature group 1, at least one feature in feature group 2, and the decision results of two existing voice activity detectors (VADs).
[0241] Assume that the two existing VADs are VAD_A and VAD_B, that their output flags are vada_flag and vadb_flag, and that the output flag of the joint VAD is vad_flag. A VAD flag of 0 indicates an inactive audio frame, and 1 indicates an active audio frame. The specific judgment process is as follows:
[0242] a) Select vadb_flag as the initial value of vad_flag;
[0243] b) If the noise type is silent, the frequency-domain signal-to-noise ratio is greater than a set threshold (for example, 0.2), and the initial value of vad_flag is 0, select vada_flag as the output of the joint VAD and end the judgment; otherwise, perform step c);
[0244] c) If the smooth long...
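Steps a) and b) above can be sketched as follows. This is only an illustration under stated assumptions: the function name `joint_vad_steps_a_b`, the boolean `noise_is_silent`, and the parameter name `freq_snr` are hypothetical, and step c) and any later steps are truncated in this excerpt, so they are not implemented. The function instead reports whether the judgment ended at step b).

```python
def joint_vad_steps_a_b(vada_flag, vadb_flag, noise_is_silent, freq_snr,
                        snr_threshold=0.2):
    """Sketch of steps a) and b) of the Embodiment 2 decision.

    Returns (vad_flag, decided). `decided` is True when step b) ends the
    judgment; when it is False, the caller would continue with step c),
    which is not available in this excerpt and is not modeled here.
    """
    # Step a): start from the output of VAD_B.
    vad_flag = vadb_flag

    # Step b): silent noise type, frequency-domain SNR above the threshold,
    # and an inactive initial decision -> take the output of VAD_A and stop.
    if noise_is_silent and freq_snr > snr_threshold and vad_flag == 0:
        return vada_flag, True

    # Otherwise the judgment would proceed to step c).
    return vad_flag, False


# VAD_B says inactive, VAD_A says active, silent background, SNR 0.5 > 0.2.
print(joint_vad_steps_a_b(vada_flag=1, vadb_flag=0,
                          noise_is_silent=True, freq_snr=0.5))  # (1, True)
```

The threshold default of 0.2 is the example value given in step b); the text describes it as a set threshold, so other values may be used.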
Embodiment 3
[0251] Step S432 of Embodiment 1 may also be implemented in the following manner:
[0252] A final joint VAD decision result is obtained according to at least one feature in feature group 1, at least one feature in feature group 2, and the decision results of two existing voice activity detectors (VADs).
[0253] Assume that the two existing VADs are VAD_A and VAD_B, that their output flags are vada_flag and vadb_flag, and that the output flag of the joint VAD is vad_flag. A VAD flag of 0 indicates an inactive audio frame, and 1 indicates an active audio frame. The specific judgment process is as follows:
[0254] a) Select vadb_flag as the initial value of vad_flag;
[0255] b) If the noise type is silent, go to step c); otherwise, go to step d);
[0256] c) If the smoothed long-term frequency-domain signal-to-noise ratio is greater than 12.5 and music_backgound_f is 0, then vad_flag is set to vada_flag; otherwise, the initial value of vad_flag selected in step a) is used as the joint...
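The silent-noise branch of this procedure (steps a) and c)) can be sketched as below. The sketch is an illustration only: the function name `joint_vad_silent_branch` and the parameter name `lt_snr_smooth` are hypothetical, while `music_backgound_f` is spelled as in the source. Step d), which handles non-silent noise, is not present in this excerpt, so the function covers only the case where step b) has already selected step c).

```python
def joint_vad_silent_branch(vada_flag, vadb_flag, lt_snr_smooth,
                            music_backgound_f, snr_threshold=12.5):
    """Sketch of steps a) and c) for the case where the noise type is silent.

    Step d) (non-silent noise) is not available in this excerpt and is
    deliberately not modeled.
    """
    # Step a): start from the output of VAD_B.
    vad_flag = vadb_flag

    # Step c): high smoothed long-term frequency-domain SNR and no music
    # background -> use the output of VAD_A; otherwise keep the initial value.
    if lt_snr_smooth > snr_threshold and music_backgound_f == 0:
        vad_flag = vada_flag

    return vad_flag


# High SNR, no music background: the joint decision follows VAD_A.
print(joint_vad_silent_branch(vada_flag=0, vadb_flag=1,
                              lt_snr_smooth=15.0, music_backgound_f=0))  # 0
```

Unlike step b) of Embodiment 2, the condition here does not depend on the initial value of vad_flag, so VAD_A can override VAD_B in either direction.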


