Model combination type speech recognition method based on GMM (Gaussian mixture model) noise estimation

A noise estimation and speech recognition technology, applied in speech recognition, speech analysis, instruments, etc. It addresses the problems that the computation is hard for a real-time system to accept, that the application range is limited, and that noise estimation cannot be performed, achieving the effect of saving power and prolonging battery life.

Status: Inactive; Publication Date: 2016-02-24

HOHAI UNIV

14 Cites, 10 Cited by


Problems solved by technology

Model-domain methods mainly include model adaptation and model combination. The former adjusts the parameters of the acoustic model using a small amount of test speech from the actual environment, and can be used to deal with arbitrary speech variability. The latter combines the clean speech acoustic model with a single Gaussian noise model to generate an acoustic model of noisy speech for acoustic decoding, and can only be used to deal with speech variability caused by environmental noise.

[0004] Compared with feature-domain methods, model adaptation can achieve higher compensation accuracy, but it requires a huge amount of computation.

This is because a large vocabulary speech recognition system has many basic speech units, usually hundreds, and each basic speech unit corresponds to an acoustic model. In model adaptation, every acoustic model must take part in the adaptive parameter estimation, so the amount of computation is difficult for a real-time system to accept.

The noise parameters of traditional model combination come from noise estimation during speech gaps. In continuous speech in a non-stationary environment, however, there may not be enough speech gaps, so noise estimation cannot be performed and the model parameters cannot be updated in time. Its applications are therefore limited.

Embodiment Construction

[0015] The present invention is further illustrated below with reference to specific embodiments. It should be understood that these embodiments are only used to illustrate the present invention and are not intended to limit its scope. After reading the present invention, modifications in various equivalent forms made by those skilled in the art all fall within the scope defined by the appended claims of the present application.

[0016] The overall framework of the model combination speech recognition method based on Gaussian mixture model noise estimation is shown in Figure 1. The core of the present invention is the noise estimation module, whose structure is shown in Figure 2. The specific implementations of the noise estimation module and the model combination module are described in detail below.

[0017] 1. Noise estimation

[0018] The present invention only considers the additive backgro...

Abstract

The invention discloses a model combination speech recognition method based on GMM (Gaussian mixture model) noise estimation. The method uses a GMM with a small number of Gaussian units to estimate the noise parameters of noisy test speech in real time and to monitor changes in the noise. The noise parameters are estimated over fixed time intervals and updated once per interval, and silent segments are processed as noisy speech. Besides being used for model combination, the estimated noise parameters are stored in memory for judging noise change in the next time interval. Noise monitoring proceeds as follows: first, the noise parameters of the previous time interval are read from memory; then they are combined with a clean speech GMM to obtain a noisy speech GMM, and the probability of the noisy test speech of the current time interval is computed under that model; the resulting average log likelihood value is compared with the average log likelihood value output by the noise parameter estimation submodule. The noise is considered to have changed if the likelihood value is larger than a threshold value, and unchanged otherwise.
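The monitoring step described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the features are assumed to be log-spectral, the combination step uses the log-add approximation as a stand-in (the patent's actual combination formula is not given in this excerpt), only the means are compensated, and the "comparison" is interpreted as the difference between the estimator's average log likelihood and the monitor's. All function and parameter names are hypothetical.

```python
import math


def log_add_combine(clean_means, noise_mean):
    """Combine clean-speech GMM means with a noise mean.

    Assumption: log-spectral features and the log-add approximation,
    mu_y = log(exp(mu_x) + exp(mu_n)); variances are left unchanged.
    """
    return [
        [math.log(math.exp(m) + math.exp(n)) for m, n in zip(mean, noise_mean)]
        for mean in clean_means
    ]


def avg_log_likelihood(frames, weights, means, variances):
    """Average per-frame log likelihood under a diagonal-covariance GMM."""
    total = 0.0
    for x in frames:
        comp = []
        for w, mu, var in zip(weights, means, variances):
            ll = math.log(w)
            for xi, mi, vi in zip(x, mu, var):
                ll += -0.5 * (math.log(2.0 * math.pi * vi) + (xi - mi) ** 2 / vi)
            comp.append(ll)
        # log-sum-exp over mixture components for numerical stability
        top = max(comp)
        total += top + math.log(sum(math.exp(c - top) for c in comp))
    return total / len(frames)


def noise_changed(frames, weights, clean_means, variances,
                  prev_noise_mean, estimator_avg_ll, threshold):
    """Return True if the noise is judged to have changed.

    Assumption: the decision compares the gap between the estimator's
    average log likelihood (current noise estimate) and the monitor's
    (previous interval's noise estimate) against the threshold.
    """
    noisy_means = log_add_combine(clean_means, prev_noise_mean)
    monitor_ll = avg_log_likelihood(frames, weights, noisy_means, variances)
    return (estimator_avg_ll - monitor_ll) > threshold
```

In use, `prev_noise_mean` would be the value read back from memory for the previous interval, and `estimator_avg_ll` would come from the noise parameter estimation submodule for the current interval. The threshold itself is a tuning parameter whose value this excerpt does not specify.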

Description

Technical field

[0001] The invention relates to a model combination speech recognition method based on GMM noise estimation. Specifically, it is a model combination method for improving the noise robustness of the system: the noise parameters extracted in the test environment are used to adjust the parameters of the acoustic model of the speech recognition system so that they match the noisy speech feature parameters extracted in the actual environment. It belongs to the technical field of speech recognition.

Background

[0002] Automatic speech recognition technology can provide convenient input interfaces for electronic devices, and has been widely used in mobile devices such as mobile phones, tablet computers, and navigators. However, in practical applications, speech variability such as environmental noise is inevitable, which usually leads to a sharp decline in the performance of the speech recognition system, so it is necessary to take measures to improve the environmental ro...


Application Information

Patent Type & Authority: Applications (China)

IPC(8): G10L15/14

CPC: G10L15/144

Inventor: 吕勇

Owner: HOHAI UNIV
