Voice activity detection method in complex background noise

A voice activity detection technology for complex background noise, applied in voice analysis, instruments, etc. It addresses the unsatisfactory detection results of single statistical judgments, reduces the probability of false detections and missed detections, and offers a simple, reliable algorithm with good real-time performance.

Active Publication Date: 2011-09-21
西安烽火电子科技有限责任公司

AI Technical Summary

Problems solved by technology

[0004] Generally speaking, a single statistical judgment gives unsatisfactory detection results and is often suitable only for certain conditions.

Embodiment Construction

[0030] The present invention will be described in detail below in conjunction with specific embodiments.

[0031] Because noise is random, its autocorrelation is small on average and its standard deviation is also small. By contrast, the autocorrelation of a speech signal has a relatively large mean and a large standard deviation, and the variance of the autocorrelation also changes considerably from one frame of the speech signal to the next. The variance of the autocorrelation and the corresponding statistics are therefore used to judge whether speech is present, that is, to perform VAD.
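The contrast described in [0031] can be illustrated with a small sketch. It is not the patent's statistic, which is not defined in this excerpt: the normalization by lag-0 energy, the lag range `max_lag=40`, the function names, and the 200 Hz sine used as a stand-in for voiced speech are all assumptions made for illustration.

```python
import math
import random
import statistics


def autocorr(frame, max_lag):
    """Normalized autocorrelation r[k] / r[0] for lags 1..max_lag (assumed form)."""
    energy = sum(s * s for s in frame)
    if energy == 0.0:
        return [0.0] * max_lag
    n = len(frame)
    return [
        sum(frame[i] * frame[i + k] for i in range(n - k)) / energy
        for k in range(1, max_lag + 1)
    ]


def autocorr_spread(frame, max_lag=40):
    """Standard deviation of the normalized autocorrelation across lags."""
    return statistics.pstdev(autocorr(frame, max_lag))


random.seed(0)
N = 160  # one 20 ms frame at 8 kHz

# White Gaussian noise: autocorrelation stays close to zero at every lag.
noise = [random.gauss(0.0, 1.0) for _ in range(N)]

# Hypothetical stand-in for voiced speech: a periodic 200 Hz tone.
tone = [math.sin(2 * math.pi * 200 * n / 8000) for n in range(N)]

noise_spread = autocorr_spread(noise)
tone_spread = autocorr_spread(tone)
print(f"noise spread: {noise_spread:.3f}, tone spread: {tone_spread:.3f}")
```

Under these assumptions the periodic frame shows a much larger spread than the noise frame, which is the property the paragraph relies on. Real speech is less regular than a pure tone, so the gap would be smaller in practice.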

[0032] Typically, the voice sampling frequency is 8 kHz and the data frame length is 20 ms (a voice signal is generally considered to be essentially stationary over 10 ms to 30 ms), so the number of points processed each time is N = 8000 × 0.02 = 160. Adjacent frames overlap by 20% to 50%; therefore, the actual length...
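The frame arithmetic in [0032] can be checked with a short sketch. The function names are hypothetical, and "overlap" is assumed here to mean the fraction of a frame shared with the next one, so the hop between frame starts is `frame_len * (1 - overlap)`.

```python
def frame_params(sample_rate_hz=8000, frame_ms=20, overlap=0.5):
    """Return (frame length, hop size) in samples.

    overlap is the assumed fraction of each frame shared with the next.
    """
    frame_len = int(round(sample_rate_hz * frame_ms / 1000))
    hop = int(round(frame_len * (1.0 - overlap)))
    return frame_len, hop


def split_frames(samples, frame_len, hop):
    """Split samples into full-length overlapping frames; a short tail is dropped."""
    last_start = len(samples) - frame_len
    return [samples[s:s + frame_len] for s in range(0, last_start + 1, hop)]


frame_len, hop = frame_params()            # 8 kHz, 20 ms, 50% overlap
one_second = list(range(8000))             # placeholder samples, not real audio
frames = split_frames(one_second, frame_len, hop)
print(frame_len, hop, len(frames))
```

With 50% overlap the hop is 80 samples (10 ms); with 20% overlap it would be 128 samples (16 ms).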

Abstract

The invention discloses a voice activity detection method for complex background noise. The method comprises the following steps in sequence:

(1) applying the TEO (Teager Energy Operator) to the data;
(2) pre-emphasizing the input data x(n);
(3) band-pass filtering;
(4) framing and windowing;
(5) calculating the evolution value of the autocorrelation of each frame and its standard deviation;
(6) calculating Stati for the 20 frames of the initial stage, together with its mean, mean(Stati), and its standard deviation, std(Stati), and comparing std(Stati) with a preset threshold to judge whether voice is present;
(7) calculating subsequent data;
(8) calculating Stati for FrameN consecutive frames and making a secondary determination from mean(Stati) and std(Stati);
(9) taking the speech interval Speechmin as 100-200 ms and the duration Silencemin as 500-1,000 ms: when Statusfinal is 0 and Status is 1 for Ns consecutive frames (a value related to FrameN), judging that voice has started; when Statusfinal is 1 and Status is 0 for NE consecutive frames (a value also related to FrameN), judging that voice has ended; and finally determining the actual end points of the voice.
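The final step of the abstract, which turns per-frame decisions into speech start and end points, can be sketched as a two-state machine. This is a minimal illustration and not the patent's implementation: the function name, the `n_start` and `n_end` parameters (standing in for Ns and NE), and the choice to date each boundary back to the first frame of the qualifying run are assumptions.

```python
def smooth_decisions(status, n_start, n_end):
    """Convert per-frame 0/1 decisions into (start, end) frame-index pairs.

    status  : per-frame decisions (1 = speech-like, 0 = noise-like)
    n_start : consecutive 1s needed to declare speech start (plays the role of Ns)
    n_end   : consecutive 0s needed to declare speech end (plays the role of NE)
    The end index is exclusive.
    """
    segments = []
    final = 0      # smoothed state, playing the role of Statusfinal
    run = 0        # consecutive frames that contradict the current state
    start = None
    for i, s in enumerate(status):
        if final == 0:
            run = run + 1 if s == 1 else 0
            if run >= n_start:
                final = 1
                start = i - n_start + 1   # assumed: onset is the first frame of the run
                run = 0
        else:
            run = run + 1 if s == 0 else 0
            if run >= n_end:
                final = 0
                segments.append((start, i - n_end + 1))
                run = 0
    if final == 1:                        # speech still active at end of input
        segments.append((start, len(status)))
    return segments


decisions = [0, 1, 0, 1, 1, 1, 1, 0, 1, 1, 0, 0, 0, 0, 1, 0]
print(smooth_decisions(decisions, n_start=3, n_end=3))
```

In this example the isolated 1 at frame 1 and the single 0 at frame 7 are both ignored, so brief glitches neither start nor end a speech segment.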

Description

Technical field

[0001] The present invention relates generally to digital signal processing systems and, more specifically, to a voice activity detection (VAD) method in complex background noise, especially for real-time voice detection with limited computing resources, such as military radio voice services.

Background technique

[0002] Voice activity detection (VAD), also known as end-point detection (EPD), aims to correctly distinguish speech from various background noises. It has very important applications in speech signal processing (more generally, acoustic signal processing). In speech recognition, the voiced and silent segments of the voice signal are usually separated by an endpoint detection algorithm, and the voiced segments are then recognized from specific features of the voice. Studies have shown that: Even in a quiet enviro...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G10L11/02, G10L25/48
Inventor: 梁峰, 张凡, 曹军勤, 杨勇
Owner: 西安烽火电子科技有限责任公司