Speech endpoint detection method for a speech recognition system

A speech endpoint detection technology, applied in speech recognition, speech analysis, instruments, etc. It addresses the problems of too many sub-bands, difficulty in accurately detecting speech endpoints, and unreliable signal-to-noise ratio estimation, and it requires only a small amount of computation.

Active Publication Date: 2006-05-17
INST OF ACOUSTICS CHINESE ACAD OF SCI +1
Cites: 0; Cited by: 41

AI Technical Summary

Problems solved by technology

Because the number of sub-bands is large, the signal-to-noise ratio estimate is unreliable; and because many empirical thresholds are used, tuning is complicated and the method suits only a few noise types.
In short, it is difficult to detect speech endpoints accurately in a strong-noise environment using energy alone.
[0010] In the patent document


Detailed Description of the Embodiments

[0036] The present invention will be further described below in conjunction with the accompanying drawings and preferred embodiments.

[0037] Figure 2 is a flow chart of the speech endpoint detection method for a speech recognition system provided by the present invention. As shown in the figure:

[0038] Step 101: Input the digitized speech data and divide it into frames. Typically the frame length is 25 ms and the frame shift is 10 ms. Then proceed to step 102 and step 105. Steps 102 and 105 can be performed at the same time, or step 102 can be performed first and step 105 performed later, when the formant trajectory is needed for endpoint detection.
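The framing in Step 101 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name frame_signal and the 8 kHz sampling rate are assumptions made for the example; only the 25 ms frame length and 10 ms frame shift come from the text.

```python
def frame_signal(samples, sample_rate=8000, frame_ms=25, shift_ms=10):
    """Split a sequence of samples into overlapping frames.

    sample_rate is an assumed value; the 25 ms / 10 ms defaults
    follow the typical values given in Step 101.
    """
    frame_len = sample_rate * frame_ms // 1000  # samples per frame
    shift = sample_rate * shift_ms // 1000      # samples between frame starts
    frames = []
    start = 0
    # Keep only complete frames; a trailing partial frame is dropped.
    while start + frame_len <= len(samples):
        frames.append(samples[start:start + frame_len])
        start += shift
    return frames


# One second of silent dummy audio at the assumed 8 kHz rate.
frames = frame_signal([0.0] * 8000)
print(len(frames), len(frames[0]))  # 98 200
```

At 8 kHz each frame holds 200 samples and consecutive frames start 80 samples apart, so neighbouring frames overlap by 15 ms.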

[0039] Step 102: Perform an FFT on the speech data, frame by frame.
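A per-frame FFT as in Step 102 might look like the sketch below. It uses a textbook recursive radix-2 FFT from the Python standard library so the example is self-contained. Zero-padding each frame to the next power of two and returning a power spectrum are assumptions of this example; the text does not specify an FFT size, a window, or the spectral quantity used. The names fft and frame_power_spectrum are hypothetical.

```python
import cmath
import math


def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def frame_power_spectrum(frame):
    """Zero-pad one frame to a power of two (assumed) and return |X[k]|^2
    for the non-negative frequency bins 0 .. n_fft/2."""
    n_fft = 1
    while n_fft < len(frame):
        n_fft *= 2
    padded = list(frame) + [0.0] * (n_fft - len(frame))
    spectrum = fft(padded)
    return [abs(c) ** 2 for c in spectrum[: n_fft // 2 + 1]]


# A 25 ms frame (200 samples at an assumed 8 kHz) of a 1 kHz tone.
frame = [math.sin(2 * math.pi * 1000 * n / 8000) for n in range(200)]
power = frame_power_spectrum(frame)
peak_bin = max(range(len(power)), key=lambda k: power[k])
print(len(power), peak_bin)  # 129 32
```

With a 256-point FFT at 8 kHz each bin spans 31.25 Hz, so the 1 kHz tone peaks at bin 32.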

[0040] Step 103: Divide the spectrum into sub-bands based on the first 10 frames of speech data. The principle and specific procedure for dividing the sub-bands are as follows:

[0041] For most...

Abstract

A speech endpoint detection method for a speech recognition system comprises: dividing the input speech data into frames; performing an FFT on the speech data frame by frame; dividing the whole speech spectrum into sub-bands with high and low signal-to-noise ratios and calculating a noise threshold for each sub-band; making a preliminary decision on the speech endpoints according to the noise threshold of each sub-band; and making a precise decision on the speech endpoints according to the formants.
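The preliminary, threshold-based decision described in the abstract can be illustrated with the rough sketch below. Several parts are assumptions made only for the example, because the excerpt does not give the details: the sub-band edges are fixed by hand (the patent derives them from the first 10 frames), each threshold is taken as the mean plus k standard deviations of the sub-band energy over the leading frames, and a frame is flagged as speech when any sub-band exceeds its threshold. The formant-based refinement is not shown. All function names are hypothetical.

```python
import math


def band_energies(power_spectrum, band_edges):
    """Sum the power inside each [lo, hi) range of FFT bins."""
    return [sum(power_spectrum[lo:hi]) for lo, hi in band_edges]


def noise_thresholds(spectra, band_edges, n_noise_frames=10, k=3.0):
    """Per-sub-band threshold = mean + k * std over the leading frames.

    Assumed rule: the leading frames are treated as noise only.
    """
    lead = [band_energies(s, band_edges) for s in spectra[:n_noise_frames]]
    thresholds = []
    for b in range(len(band_edges)):
        vals = [e[b] for e in lead]
        mean = sum(vals) / len(vals)
        var = sum((v - mean) ** 2 for v in vals) / len(vals)
        thresholds.append(mean + k * math.sqrt(var))
    return thresholds


def preliminary_speech_flags(spectra, band_edges, thresholds):
    """Flag a frame as speech if any sub-band energy exceeds its threshold."""
    flags = []
    for s in spectra:
        energies = band_energies(s, band_edges)
        flags.append(any(e > t for e, t in zip(energies, thresholds)))
    return flags


# Synthetic 8-bin power spectra: 10 noise frames, 3 "speech" frames, 2 noise frames.
noise = [1.0] * 8
speech = [1.0, 1.0, 50.0, 50.0, 1.0, 1.0, 1.0, 1.0]
spectra = [noise] * 10 + [speech] * 3 + [noise] * 2
edges = [(0, 4), (4, 8)]  # two hand-picked sub-bands (assumption)
thr = noise_thresholds(spectra, edges)
flags = preliminary_speech_flags(spectra, edges, thr)
print(flags.index(True), flags.count(True))  # 10 3
```

In this toy input the first speech frame is frame 10 and three frames are flagged, so the preliminary start and end points fall at frames 10 and 12.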

Description

Technical field

[0001] The invention relates to the field of automatic speech recognition, and in particular to a speech endpoint detection method.

Background

[0002] In a speech recognition system, the input signal contains speech together with background noise. Finding the speech segments in the input signal, that is, locating where each segment starts and ends, is called endpoint detection, start and end point detection, or "Voice Activity Detection". The accuracy of endpoint detection directly affects the performance of the speech recognition system, in terms of both accuracy and speed. First, good endpoint detection helps the system extract speech features accurately and improves recognition accuracy. Second, if the speech recognition system performs calculations only while speech is being input and skips the noise segments, the amount of calculation is greatly reduced, and t...

Application Information

IPC(8): G10L15/05
Inventor: 潘接林, 国雁萌, 韩疆, 刘晓星, 颜永红
Owner INST OF ACOUSTICS CHINESE ACAD OF SCI