Time-domain self-adaptive speech detection method based on dynamic noise estimation

A speech detection technology based on dynamic noise estimation, applied in speech analysis, instruments, and related fields. It addresses problems such as low recognition rates and poor adaptation to quiet environments, shortens system processing time, and improves accuracy.

Active Publication Date: 2016-11-09
成都启英泰伦科技有限公司

AI Technical Summary

Problems solved by technology

Some literature shows that low recognition rates in practical applications are largely due to speech not being processed correctly: a large amount of non-speech information seriously degrades the accuracy of a speech recognition system, especially when the application environment is noisy. Correct speech detection can reduce the system's computational load, shorten processing time, reduce the transmission power of mobile terminals, save channel resources, and improve recognition accuracy. In complex background noise in particular, the performance of a speech recognition system depends largely on the quality of its speech detection, so every speech recognition system needs speech detection that is robust, accurate, real-time, and adaptive.
[0003] At present, when speech recognition is used on mobile terminals, especially mobile phones or voice remote controls, the start and end of speech are mainly determined by pressing a button. However, this approach is very inconvenient for the many applications that involve speaking from a distance. For smart devices and robots that support hands-free voice recognition, an automatic speech detection system is an essential component.
[0004] The current mainstream approach to automatic speech detection relies on short-term energy and zero-crossing rate in the time domain, and on the mean square error of frequency-band energy in the frequency domain. Specifically, the short-term energy, zero-crossing rate, or band-energy mean square error is computed and compared with an empirical threshold. Experiments have shown that comparing short-term energy or zero-crossing rate alone is not suitable for noisy environments, especially because the background noise changes whenever the application environment changes, while the band-energy mean square error method is not suitable for quiet environments.
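
The fixed-threshold approach described in [0004] can be illustrated with a minimal sketch. The function names, the default threshold values, and the rule that combines the two features are hypothetical choices made for illustration; the source only states that each feature is compared with an empirical threshold.

```python
def short_time_energy(frame):
    """Sum of squared samples in one frame (assumed definition of short-term energy)."""
    return sum(x * x for x in frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    if len(frame) < 2:
        return 0.0
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def is_speech_fixed_threshold(frame, energy_threshold=0.5, zcr_threshold=0.6):
    """Hypothetical rule: enough energy and not too many zero crossings.

    Both thresholds are fixed, illustrative values, not taken from the source.
    """
    return (short_time_energy(frame) > energy_threshold
            and zero_crossing_rate(frame) < zcr_threshold)
```

Because both thresholds are constants, a rise in the background noise floor can push noise-only frames over `energy_threshold`, which is the weakness in noisy or changing environments that the paragraph above points out.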

Examples

Detailed Description of the Embodiments

[0030] The present invention is described in further detail below with reference to examples and specific embodiments. This should not be interpreted as limiting the scope of the subject matter of the invention to the following examples; all techniques realized on the basis of the content of the present invention fall within its scope.

[0031] As shown in Figure 1, a time-frequency-domain adaptive speech detection method based on dynamic noise estimation comprises the following steps:

[0032] Step 1: load the current frame of data, which is speech data in the time domain;

[0033] Step 2: compute the energy sum of each frame of time-domain speech data as the time-domain short-term energy, and transform each frame of time-domain speech data into frequency-domain data by FFT;

[0034] Step 3: select sub-band data in a certain frequency range of the freque...
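
Steps 1 and 2 can be sketched as follows. This is a minimal illustration under stated assumptions: the "energy sum" is taken to be the sum of squared samples, a naive O(N^2) DFT stands in for the FFT so that the example has no dependencies, and the names `dft` and `process_frame` are hypothetical. Step 3 is truncated in the source and is not sketched.

```python
import cmath


def short_time_energy(frame):
    """Step 2a: sum of squared samples (assumed meaning of 'energy sum')."""
    return sum(x * x for x in frame)


def dft(frame):
    """Step 2b: naive discrete Fourier transform, a stand-in for an FFT.

    Produces the same values an FFT would, only more slowly.
    """
    n = len(frame)
    return [
        sum(frame[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def process_frame(frame):
    """Steps 1-2: take one time-domain frame, return (energy, spectrum)."""
    return short_time_energy(frame), dft(frame)


# Usage: a 4-sample frame holding one cycle of a cosine.
energy, spectrum = process_frame([1.0, 0.0, -1.0, 0.0])
magnitudes = [abs(c) for c in spectrum]
```

In practice a library FFT would replace `dft`; the naive version is used here only to keep the sketch self-contained.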

Abstract

The invention relates to the fields of information processing technology and sensor signal processing, and in particular to a time-domain self-adaptive automatic speech detection method based on dynamic noise estimation. Speech detection is carried out based on the time-domain short-time energy of the sound and the variation of frequency-domain short-time energy within a certain range; the optimal result is then selected according to the dynamically estimated magnitude of the background noise energy. This greatly improves the accuracy of speech recognition and its adaptability to environmental change.
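
The idea of choosing between two detection results according to a dynamically estimated noise level can be sketched as below. Everything specific here is an assumption made for illustration: the source does not give the noise-tracking formula or the selection rule, so the exponential smoothing, the parameter `alpha`, the `quiet_limit` value, and both function names are hypothetical. The direction of the selection (time-domain result when quiet, frequency-domain result when noisy) is inferred from the background discussion, not stated in the abstract.

```python
def update_noise_estimate(noise_energy, frame_energy, is_speech, alpha=0.9):
    """Track background noise energy over frames judged non-speech.

    Assumed scheme: exponential smoothing; the estimate is frozen during speech.
    """
    if is_speech:
        return noise_energy
    return alpha * noise_energy + (1 - alpha) * frame_energy


def select_decision(time_decision, freq_decision, noise_energy, quiet_limit=0.01):
    """Assumed selection rule: trust the time-domain result when the
    background is quiet, otherwise the frequency-domain result."""
    return time_decision if noise_energy < quiet_limit else freq_decision
```

A caller would run both detectors on each frame, pass their results to `select_decision`, and then feed the chosen result back into `update_noise_estimate` so that the noise level follows the environment.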

Description

Technical field

[0001] The invention relates to the fields of information processing technology and sensor signal processing, in particular to a time-frequency-domain adaptive speech detection method based on dynamic noise estimation.

Background

[0002] Speech recognition is a hot topic in artificial intelligence applications and has been widely used in various fields. Speech detection is an important part of implementing a real-time speech recognition system; its purpose is to distinguish speech segments from non-speech segments in complex real environments. Some literature shows that low recognition rates in practical applications are largely due to speech not being processed correctly: a large amount of non-speech information seriously degrades the accuracy of a speech recognition system, especially when the application environment is noisy. The correct voice detection tech...

Application Information

IPC(8): G10L21/0224, G10L21/0232, G10L25/60, G10L25/75
CPC: G10L21/0224, G10L21/0232, G10L25/60, G10L25/75, G10L2025/786
Inventor: 何云鹏
Owner: 成都启英泰伦科技有限公司