Voice activation detection (VAD), and method and apparatus for the VAD

A technique based on feature parameters of the current frame, applied in the field of voice activity detection. It addresses detection errors and low VAD efficiency, with the aim of good detection performance.

Active Publication Date: 2014-07-02
ZTE CORP

AI Technical Summary

Problems solved by technology

In terms of efficiency, the VAD of these encoders does not perform well under all typical background noises.
Especially in non-stationary noise, the VAD efficiency of these ...


Examples


Embodiment 2

[0136] Embodiment 2 of the voice activity detection (VAD) method of the present invention performs polyphase filtering on the input audio signal in sub-frames to obtain the filter bank sub-band signals, then performs a time-frequency transform on the filter bank sub-band signals and calculates the spectrum amplitude. Signal features are extracted from the filter bank sub-band signals and from the spectrum amplitude, yielding the value of each feature parameter. From these feature parameter values, the background noise flag and the tonality flag of the current frame are obtained. The SNR parameter of the current frame is calculated from the frame energy parameter of the current frame and the background noise energy. According to the calculated SNR parameter of the current frame, the VAD decision of the previous frame, and each feature parameter, it is determined whether the ...

Embodiment 1

[0213] In Embodiment 1 and Embodiment 2, the VAD decision is calculated from the tonality flag, the signal-to-noise ratio parameter, the spectral centroid feature parameter, and the frame energy parameter. As shown in Figure 3, the process includes the following steps:

[0214] Step 301: Calculate the long-term signal-to-noise ratio lt_snr as the ratio of the average long-term active-speech signal energy to the average long-term background noise energy, both calculated in the previous frame;

[0215] See step 307 for the calculation and definition of the average long-term active-speech signal energy E_fg and the average long-term background noise energy E_bg. The long-term signal-to-noise ratio lt_snr is calculated as follows:

[0216] In this formula, the long-term signal-to-noise ratio lt_snr is expressed in the logarithmic domain.
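The formula itself is not reproduced in this extract. As a minimal sketch, assuming only what steps 301 and [0216] state (a ratio of E_fg to E_bg expressed logarithmically), the computation might look like the following. The base-10 logarithm, the function name `long_term_snr`, and the `eps` floor are assumptions made for illustration and are not taken from the patent text.

```python
import math


def long_term_snr(e_fg: float, e_bg: float, eps: float = 1e-12) -> float:
    """Log-domain ratio of long-term active-speech energy to noise energy.

    e_fg: average long-term active-speech signal energy (E_fg)
    e_bg: average long-term background noise energy (E_bg)
    eps:  hypothetical floor that avoids log(0) and division by zero
    """
    # Assumption: base-10 logarithm; the source only says "logarithm".
    return math.log10(max(e_fg, eps) / max(e_bg, eps))


# Usage: equal speech and noise energy gives a log ratio of zero.
lt_snr = long_term_snr(5.0, 5.0)
```

A larger lt_snr indicates that recent active speech has been well above the noise floor, which is the kind of long-term context the later steps of the decision can draw on.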

[0217] Step 302: Calculate the average value of the full-band SNR SNR2 of several recent fr...


Abstract

The invention relates to voice activity detection (VAD), and a method and apparatus for VAD. The method comprises: obtaining the sub-band signals and the spectrum amplitude of a current frame; calculating, from the sub-band signals, the frame energy parameter of the current frame and the value of a spectral centroid feature parameter; calculating the signal-to-noise ratio parameter of the current frame from the background noise energy estimated in a previous frame, the frame energy parameter of the current frame, and the signal-to-noise ratio sub-band energy; and calculating a VAD decision from a tonality flag, the signal-to-noise ratio parameter, the spectral centroid feature parameter and the frame energy parameter. The method and apparatus provided by the invention can improve detection accuracy for non-stationary noise (such as office noise) and music.
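To make the feature parameters named in the abstract more concrete, here is a rough sketch using textbook definitions. It is not the patent's exact formulation: the per-sub-band energy input, the 1-based sub-band weighting in the centroid, the decibel scaling of the SNR, and all function names are assumptions chosen for illustration.

```python
import math
from typing import Sequence

EPS = 1e-12  # hypothetical floor to avoid division by zero and log(0)


def frame_energy(subband_energies: Sequence[float]) -> float:
    """Total frame energy as the sum of per-sub-band energies."""
    return float(sum(subband_energies))


def spectral_centroid(subband_energies: Sequence[float]) -> float:
    """Energy-weighted mean sub-band index (1-based), a textbook centroid."""
    total = sum(subband_energies)
    weighted = sum((k + 1) * e for k, e in enumerate(subband_energies))
    return weighted / max(total, EPS)


def frame_snr_db(energy: float, noise_energy: float) -> float:
    """Full-band SNR in dB from frame energy and a prior noise estimate."""
    return 10.0 * math.log10(max(energy, EPS) / max(noise_energy, EPS))


# Usage: a flat four-band frame measured against a noise estimate of 1.0.
bands = [25.0, 25.0, 25.0, 25.0]
energy = frame_energy(bands)
centroid = spectral_centroid(bands)
snr = frame_snr_db(energy, noise_energy=1.0)
```

In a scheme like the one the abstract outlines, values of this kind would feed the final decision together with the tonality flag; the decision thresholds are not given in this extract, so none are sketched here.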

Description

Technical field

[0001] The present invention relates to voice activity detection (VAD) and to methods and devices for voice activity detection (including background noise detection, tonal signal detection, correction of the number of active-speech hangover frames for the current frame in the VAD decision, and adjustment of the signal-to-noise ratio threshold in the VAD decision).

Background technique

[0002] In a normal voice call, the user is sometimes talking and sometimes listening, so the call contains inactive-speech phases. Under normal circumstances, the total inactive-speech time of both parties in the call exceeds 50% of the total voice coding time of both parties. During the inactive-speech phase there is only background noise, which usually carries no useful information. Taking advantage of this fact, in speech and audio signal processing, active speech and inactive speech are detected by the activation tone d...


Application Information

IPC(8): G10L25/93, G10L21/0208
Inventor: 江东平, 袁浩, 朱长宝
Owner ZTE CORP