Method and device for activation tone detection

A voice activity detection (VAD) technique, applied in the field of communication, that addresses the inaccurate detection of existing VAD schemes and thereby improves detection accuracy and user experience.

Active Publication Date: 2018-08-31
ZTE CORP

AI Technical Summary

Problems solved by technology

[0004] To address technical problems in the related art, such as inaccurate detection by existing VAD schemes, the present invention provides a method and device for voice activity detection (VAD) that at least solve these problems.

Examples

Embodiment 1

[0090] This embodiment provides a VAD method. As shown in Figure 4, the method includes:

[0091] Step S402: Obtain the output results of the two existing VADs;

[0092] Step S404: Obtain the subband signal and spectrum amplitude of the current frame;

[0093] This embodiment uses an audio stream with a frame length of 20 ms and a sampling rate of 32 kHz as a specific example. The joint VAD method provided by the embodiment of the present invention also applies to other frame lengths and sampling rates.

[0094] The time-domain signal of the current frame is input to the filter bank unit, which performs sub-band filtering to obtain the filter bank sub-band signals.
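As a quick check of the example figures in [0093], a 20 ms frame at 32 kHz contains 640 samples. The sketch below is only illustrative: the function names are hypothetical, and splitting into consecutive non-overlapping frames is an assumption, since the excerpt does not say how frames are formed.

```python
def samples_per_frame(sample_rate_hz: int, frame_ms: int) -> int:
    """Number of samples in one frame of the given duration."""
    return sample_rate_hz * frame_ms // 1000


def split_into_frames(signal, frame_len):
    """Split a sequence into consecutive frames of frame_len samples.

    Assumption: frames do not overlap and a trailing partial frame is dropped.
    """
    n_frames = len(signal) // frame_len
    return [signal[i * frame_len:(i + 1) * frame_len] for i in range(n_frames)]


# Example values from [0093]: 20 ms frames at a 32 kHz sampling rate.
FRAME_LEN = samples_per_frame(32_000, 20)
```

With other frame lengths or sampling rates, only the two arguments to `samples_per_frame` change.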

[0095] In this embodiment, a 40-channel filter bank is used, and the technical solution provided by the embodiment of the present invention is also applicable to filter banks with other numbers of channe...

Embodiment 2

[0239] Step S432 of Embodiment 1 may also be implemented in the following manner:

[0240] A final joint VAD decision result is obtained according to at least one feature in feature group 1, at least one feature in feature group 2, and two existing voice activity detection (VAD) decision results.

[0241] Assume that the two existing VADs are VAD_A and VAD_B, with output flags vada_flag and vadb_flag respectively, and that the output flag of the joint VAD is vad_flag. A flag value of 0 indicates an inactive audio frame and 1 indicates an active audio frame. The specific decision process is as follows:

[0242] a) Select vadb_flag as the initial value of vad_flag;

[0243] b) If the noise type is silent, the frequency-domain signal-to-noise ratio is greater than a set threshold (for example, 0.2), and the initial value of vad_flag is 0, select vada_flag as the output of the joint VAD and end the decision; otherwise, perform step c);

[0244] c) If the smooth long...
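Steps a) and b) of this embodiment can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name and the parameters `noise_is_silent` and `freq_snr` are hypothetical, and because step c) is cut off in this excerpt, the sketch returns `None` wherever the flow would continue to step c).

```python
from typing import Optional

SNR_THRESHOLD = 0.2  # example threshold mentioned in step b)


def joint_vad_steps_a_b(vada_flag: int,
                        vadb_flag: int,
                        noise_is_silent: bool,
                        freq_snr: float,
                        snr_threshold: float = SNR_THRESHOLD) -> Optional[int]:
    """Return the joint VAD flag if step b) decides it, else None.

    None means the decision would continue with step c), which is
    truncated in the source and therefore not implemented here.
    """
    # Step a): start from VAD_B's decision.
    vad_flag = vadb_flag

    # Step b): silent noise, SNR above the threshold, and an inactive
    # initial decision -> defer to VAD_A and stop.
    if noise_is_silent and freq_snr > snr_threshold and vad_flag == 0:
        return vada_flag

    return None  # hand over to step c) (not shown in the excerpt)
```

For example, `joint_vad_steps_a_b(1, 0, True, 0.5)` takes VAD_A's decision, whereas any input that fails one of the three conditions in step b) falls through to the elided step c).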

Embodiment 3

[0251] Step S432 of Embodiment 1 may also be implemented in the following manner:

[0252] A final joint VAD decision result is obtained according to at least one feature in feature group 1, at least one feature in feature group 2, and two existing voice activity detection (VAD) decision results.

[0253] Assume that the two existing VADs are VAD_A and VAD_B, with output flags vada_flag and vadb_flag respectively, and that the output flag of the joint VAD is vad_flag. A flag value of 0 indicates an inactive audio frame and 1 indicates an active audio frame. The specific decision process is as follows:

[0254] a) Select vadb_flag as the initial value of vad_flag;

[0255] b) If the noise type is silent, go to step c); otherwise, go to step d);

[0256] c) If the smooth long-term frequency-domain signal-to-noise ratio is greater than 12.5 and music_backgound_f is 0, then vad_flag is set to vada_flag; otherwise the initial value of vad_flag selected in step a) is used as the joint...
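Steps a) to c) of this embodiment can be sketched as follows, under stated assumptions. The function name and the parameters `noise_is_silent` and `smooth_lt_snr` are hypothetical. Step c) is truncated in this excerpt; the sketch reads it as keeping the initial vad_flag as the joint output and ending the decision there, which is an interpretation rather than something the excerpt confirms. Step d) is not shown at all, so the sketch returns `None` when the flow would go there.

```python
from typing import Optional


def joint_vad_steps_a_c(vada_flag: int,
                        vadb_flag: int,
                        noise_is_silent: bool,
                        smooth_lt_snr: float,
                        music_backgound_f: int,
                        snr_threshold: float = 12.5) -> Optional[int]:
    """Return the joint VAD flag for the silent-noise branch, else None.

    None means the flow would continue with step d), which does not
    appear in the excerpt and is therefore not implemented here.
    """
    # Step a): start from VAD_B's decision.
    vad_flag = vadb_flag

    # Step b): only the silent-noise case is handled by step c).
    if not noise_is_silent:
        return None  # step d) (not shown in the excerpt)

    # Step c): high long-term SNR and no music background -> use VAD_A.
    # Assumption: otherwise the initial value is kept as the output.
    if smooth_lt_snr > snr_threshold and music_backgound_f == 0:
        vad_flag = vada_flag
    return vad_flag
```

The identifier `music_backgound_f` is spelled as it appears in the source text.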

Abstract

The present invention provides a method and device for voice activity detection (VAD). The method includes: acquiring at least one first-type feature parameter from a first feature group, at least one second-type feature parameter from a second feature group, and at least two existing VAD decision results, where the first-type and second-type feature parameters are feature parameters used for VAD; and performing voice activity detection according to the first-type feature parameter, the second-type feature parameter, and the at least two existing VAD decision results to obtain a joint VAD decision result. This addresses technical problems in the related art such as inaccurate detection by VAD schemes, improves VAD accuracy, and in turn improves user experience.

Description

Technical field

[0001] The present invention relates to the communication field, and in particular to a method and device for voice activity detection (VAD).

Background

[0002] In a normal voice call, the user is sometimes talking and sometimes listening, so the call contains inactive phases. Under normal circumstances, the total inactive phase of both parties exceeds 50% of the total speech coding time of both parties. During the inactive phase there is only background noise, which usually carries no useful information. Taking advantage of this fact, speech and audio signal processing uses a VAD algorithm to detect active and inactive sound and processes each differently. Many modern speech coding standards, such as AMR and AMR-WB, support the VAD function. In terms of efficiency, the VAD of these encoders does not per...

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G10L25/78
CPC: G10L25/78, G10L21/0208, G10L2025/783, G10L21/038, G10L25/21, G10L25/84
Inventor: 朱长宝, 袁浩
Owner: ZTE CORP