An acoustic model training method for high-precision continuous speech recognition

An acoustic model training and speech recognition technology, applied in speech recognition, speech analysis, instruments, etc. It addresses the inability to fit all characteristics of the training data, poor recognition of long sentences, and the difficulty of improving continuous speech recognition. Its effects are improved decoding accuracy and a reduced overall difference between easily confused correct and wrong results.

Active Publication Date: 2021-12-31

成都启英泰伦科技有限公司

Problems solved by technology

[0004] To address poor recognition of long sentences, the usual practice in the industry is to train on a large speech corpus. This improves overall performance, but the training method based on neural networks and hidden Markov models, and the final performance of the decoding model, hit a bottleneck. The root cause is that the gradient descent algorithm of deep learning, as represented by neural networks, cannot fit all characteristics of the training data without limit. The decoding model is only a mathematical model of a limited space of training data with a certain sample size, and it does not represent the unlimited variety of real-world data. It is therefore difficult to improve continuous speech recognition when both training and the decoding model rely on limited data.

Embodiment

[0114] Suppose the task is to recognize one of four short phrases (hello, smart, housekeeper, open). Assume the decoding space contains 4 different paths, one for each possible recognition result, and that the given text annotation is the phrase "housekeeper".

[0115] In the traditional processing method, the maximum likelihood objective function only maximizes the value of log P(housekeeper). To give a plain example, traditional training is like teaching literacy from pictures by repetition: this is "housekeeper", this is "housekeeper", this is "housekeeper". The method proposed by the present invention differs from this traditional teaching approach: while repeatedly emphasizing the correct content, it also repeatedly emphasizes the negative content. Then, based on the above example, the optimization objective function of the...
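The contrast in [0115] can be illustrated with a small sketch. The source text is cut off before it states the proposed objective, so the second function below is not the patent's formula. It is a generic MMI-style discriminative objective, swapped in as an assumption, and the path scores are made-up numbers.

```python
import math

# Hypothetical unnormalized log-scores for the four decoding paths.
# These values are invented for illustration only.
scores = {"hello": -9.0, "smart": -8.0, "housekeeper": -5.0, "open": -5.5}
reference = "housekeeper"


def ml_objective(path_scores, ref):
    """Maximum likelihood: only the reference path's log-score counts."""
    return path_scores[ref]


def discriminative_objective(path_scores, ref):
    """MMI-style stand-in: reference log-score minus log-sum over all paths.

    Raising this value requires both a higher reference score and
    lower scores on the competing (negative) paths.
    """
    peak = max(path_scores.values())
    log_total = peak + math.log(
        sum(math.exp(value - peak) for value in path_scores.values())
    )
    return path_scores[ref] - log_total


# Same scores, except the confusable competitor "open" is pushed down.
suppressed = dict(scores, open=-7.5)

ml_before = ml_objective(scores, reference)
ml_after = ml_objective(suppressed, reference)
disc_before = discriminative_objective(scores, reference)
disc_after = discriminative_objective(suppressed, reference)

print(ml_before, ml_after)
print(disc_before, disc_after)
```

Under these assumptions, suppressing the competing path leaves the maximum likelihood value unchanged, while the discriminative value increases. That is the sense in which emphasizing negative content gives the training signal something extra to optimize.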

Abstract

The invention belongs to the technical field of speech recognition and discloses an acoustic model training method for high-precision continuous speech recognition, comprising the following steps. Step 1: prepare the training corpus and extract speech features. Step 2: calculate the acoustic model. Step 3: initialize the acoustic model. Step 4: iterate the acoustic model initialized in Step 3 for the determined number of training iterations. Step 5: after training, select the acoustic models with the highest decoding accuracy, then average and merge their parameters to obtain the final acoustic model. The invention optimizes how the speech model expresses recognition sequences. Through pre-decoding, it adjusts the model parameters while confirming the correctly labeled text, so that the overall difference between easily confused correct and wrong results is reduced as far as possible, which improves the decoding accuracy of the acoustic model parameters.
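Step 5 of the abstract can be sketched as follows. This is a minimal illustration, assuming that "select and average" means keeping the checkpoints with the best decoding accuracy and taking the element-wise mean of their parameters. The function name, the `top_k` parameter, and the checkpoint layout are hypothetical and do not come from the patent.

```python
def average_best_models(checkpoints, top_k):
    """Average the parameters of the top_k most accurate checkpoints.

    checkpoints: list of (decoding_accuracy, params) pairs, where params
    maps a parameter name to a list of floats (a hypothetical layout).
    """
    if not 1 <= top_k <= len(checkpoints):
        raise ValueError("top_k must be between 1 and len(checkpoints)")

    # Keep the checkpoints with the highest decoding accuracy.
    best = sorted(checkpoints, key=lambda item: item[0], reverse=True)[:top_k]

    merged = {}
    for name in best[0][1]:
        vectors = [params[name] for _, params in best]
        # Element-wise mean across the selected checkpoints.
        merged[name] = [sum(values) / top_k for values in zip(*vectors)]
    return merged


# Made-up checkpoints from three training iterations.
checkpoints = [
    (0.90, {"w": [1.0, 2.0]}),
    (0.95, {"w": [3.0, 4.0]}),
    (0.80, {"w": [100.0, 100.0]}),
]
final_params = average_best_models(checkpoints, top_k=2)
print(final_params)
```

In this example the least accurate checkpoint is excluded, so its outlying parameter values do not affect the merged model.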

Description

Technical field

[0001] The invention belongs to the technical field of speech recognition, and in particular relates to an acoustic model training method for high-precision continuous speech recognition.

Background

[0002] The traditional acoustic modeling method is based on the hidden Markov framework and uses a Gaussian mixture model (GMM) to describe the probability distribution of speech acoustic features. Because the hidden Markov model is a typical shallow learning structure, a simple structure that transforms the original input signal into a feature space, its performance is limited on massive data. Later, the academic community combined neural networks with hidden Markov models: a hybrid model models the output probability distribution to improve the overall effect, but the improvement remains very limited.

[0003] Continuous speech recognition is similar to natural speech. It is a p...

Application Information

Patent Type & Authority: Patents (China)

IPC(8): G10L15/06, G10L15/26

CPC: G10L15/063, G10L15/26

Inventors: 游萌, 高君效

Owner: 成都启英泰伦科技有限公司
