
Model combination type speech recognition method based on GMM (Gaussian mixture model) noise estimation

A noise estimation and speech recognition technology for use in speech recognition, speech analysis, instruments, etc. It addresses computational loads that real-time systems cannot accept, a limited range of application, and the inability to estimate noise, with the stated effect of saving power and prolonging battery life.

Status: Inactive
Publication Date: 2016-02-24
HOHAI UNIV

AI Technical Summary

Problems solved by technology

Model-domain methods mainly comprise model adaptation and model combination. The former adjusts the parameters of the acoustic model using a small amount of test speech from the actual environment and can handle arbitrary speech variability. The latter combines the clean-speech acoustic model with a single-Gaussian noise model to generate a noisy-speech acoustic model for acoustic decoding, and can only handle speech variability caused by environmental noise.
[0004] Compared with feature-domain methods, model adaptation achieves higher compensation accuracy, but at a very large computational cost.
This is because a large-vocabulary speech recognition system has many basic speech units, usually hundreds, each with its own acoustic model. In model adaptation, every acoustic model must take part in the adaptive parameter estimation, so the computational load is difficult for a real-time system to accept.
Traditional model combination takes its noise parameters from noise estimated during speech gaps. In continuous speech in a non-stationary environment, however, there may not be enough speech gaps, so the noise cannot be estimated and the model parameters cannot be updated in time, which limits its application.
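
As a rough illustration of the model combination idea described above, the sketch below merges the Gaussian means of a clean-speech model with the mean of a single-Gaussian noise model. It assumes log-spectral features and uses the well-known log-add approximation, which is not necessarily the combination rule used in the patent. The function name `log_add_combine` and all numeric values are hypothetical.

```python
import math


def log_add_combine(clean_means, noise_mean):
    """Combine clean-speech Gaussian means with one noise mean.

    Assumption: all means live in the log-spectral domain, so additive
    noise becomes log(exp(speech) + exp(noise)) per frequency band.
    Variances are left untouched, the usual log-add simplification.
    """
    combined = []
    for mu_x in clean_means:
        combined.append([
            # Numerically stable log(exp(a) + exp(b)).
            max(a, b) + math.log1p(math.exp(-abs(a - b)))
            for a, b in zip(mu_x, noise_mean)
        ])
    return combined


# Hypothetical clean-speech model: two Gaussian means over two bands.
clean_means = [[2.0, 0.0], [5.0, 1.0]]
# Hypothetical single-Gaussian noise mean.
noise_mean = [0.0, 0.0]

noisy_means = log_add_combine(clean_means, noise_mean)
```

Each combined mean is at least as large as both the clean mean and the noise mean in that band, which matches the intuition that additive noise can only raise the log-spectral energy.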



Examples


Embodiment Construction

[0015] The present invention is further illustrated below with reference to specific embodiments. It should be understood that these embodiments only illustrate the invention and are not intended to limit its scope. After reading this disclosure, modifications by those skilled in the art to various equivalent forms of the invention all fall within the scope defined by the appended claims of the present application.

[0016] The overall framework of the model combination speech recognition method based on Gaussian mixture model noise estimation is shown in Figure 1. The core of the invention is the noise estimation module, whose structure is shown in Figure 2. The specific implementations of the noise estimation module and the model combination module are described in detail below.

[0017] 1. Noise estimation

[0018] The present invention only considers the additive backgro...


Abstract

The invention discloses a model-combination speech recognition method based on GMM (Gaussian mixture model) noise estimation. A GMM containing a small number of Gaussian units is used to estimate the noise parameters in noisy test speech in real time and to monitor changes in the noise. The noise parameters are estimated at fixed time intervals and updated once per interval, and silent segments are processed as noisy speech. Besides being used for model combination, the estimated noise parameters are stored in memory for the noise-change decision in the next interval. Noise monitoring proceeds as follows: first, the noise parameters of the previous interval are read from memory; these are then combined with a clean-speech GMM to obtain a noisy-speech GMM, which is used to compute the probability of the noisy test speech in the current interval; the resulting average log-likelihood is compared with the average log-likelihood output by the noise parameter estimation submodule. If the difference in likelihood values exceeds a threshold, the noise is considered to have changed; otherwise it is considered unchanged.
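
The noise monitoring steps in the abstract can be sketched roughly as follows. This is a minimal illustration under several assumptions, not the patented implementation: the GMM uses diagonal covariances, the clean-speech GMM and the noise mean are combined with a log-add approximation on the means, and the decision uses the difference between the two average log-likelihoods. All names (`combine`, `avg_log_likelihood`, `noise_changed`) and all numbers are hypothetical, and the estimation submodule's output is replaced here by a stand-in value.

```python
import math


def log_add(a, b):
    # Numerically stable log(exp(a) + exp(b)).
    return max(a, b) + math.log1p(math.exp(-abs(a - b)))


def combine(clean_gmm, noise_mean):
    # clean_gmm: list of (weight, mean_vector, variance_vector).
    # Assumption: only the means are compensated (log-add approximation).
    return [(w, [log_add(m, n) for m, n in zip(mu, noise_mean)], var)
            for w, mu, var in clean_gmm]


def avg_log_likelihood(gmm, frames):
    # Average per-frame log-likelihood under a diagonal-covariance GMM.
    total = 0.0
    for x in frames:
        comp = []
        for w, mu, var in gmm:
            ll = math.log(w)
            for xi, mi, vi in zip(x, mu, var):
                ll -= 0.5 * (math.log(2.0 * math.pi * vi) + (xi - mi) ** 2 / vi)
            comp.append(ll)
        peak = max(comp)
        total += peak + math.log(sum(math.exp(c - peak) for c in comp))
    return total / len(frames)


def noise_changed(clean_gmm, stored_noise_mean, frames, estimator_avg_ll, threshold):
    # Score the current interval with the previous interval's noise estimate
    # and compare against the estimation submodule's average log-likelihood.
    old_ll = avg_log_likelihood(combine(clean_gmm, stored_noise_mean), frames)
    return (estimator_avg_ll - old_ll) > threshold


# Hypothetical one-dimensional clean-speech GMM and stored noise estimate.
clean_gmm = [(0.5, [1.0], [1.0]), (0.5, [4.0], [1.0])]
stored_noise = [0.0]
frames = [[6.0], [6.1], [6.2]]

# Stand-ins for the estimation submodule's average log-likelihood.
same_ll = avg_log_likelihood(combine(clean_gmm, [0.0]), frames)
louder_ll = avg_log_likelihood(combine(clean_gmm, [6.0]), frames)

unchanged = noise_changed(clean_gmm, stored_noise, frames, same_ll, 0.5)
changed = noise_changed(clean_gmm, stored_noise, frames, louder_ll, 0.5)
```

When the new estimate equals the stored one, the two likelihoods coincide and no change is flagged. When the current frames are explained much better by a louder noise estimate, the gap exceeds the threshold and a change is flagged.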

Description

Technical field

[0001] The invention relates to a model-combination speech recognition method based on GMM noise estimation. Specifically, it is a model combination method that improves the noise robustness of the system: noise parameters extracted in the test environment are used to adjust the parameters of the acoustic model of the speech recognition system so that they match the noisy speech feature parameters extracted in the actual environment. It belongs to the technical field of speech recognition.

Background technique

[0002] Automatic speech recognition technology can provide convenient input interfaces for electronic devices, and has been widely used in mobile devices such as mobile phones, tablet computers, and navigators. However, in practical applications, speech variability such as environmental noise is inevitable, which usually leads to a sharp decline in the performance of the speech recognition system, so it is necessary to take measures to improve the environmental ro...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G10L15/14
CPC: G10L15/144
Inventor: 吕勇
Owner: HOHAI UNIV