Improvement method of Ngram model for voice recognition

A speech recognition language-model technique, applied in speech recognition, speech analysis, instruments, etc. It addresses problems such as abnormal query results, confused language-model memory, and unsuitability for multi-word queries, with the effect of improving the recognition rate.

Active Publication Date: 2013-03-13
INST OF AUTOMATION CHINESE ACAD OF SCI
2 Cites, 51 Cited by

AI Technical Summary

Problems solved by technology

The time complexity of the RNN language model is more than a hundred times that of the Ngram model.
Its query speed is unacceptable for speech recognition.
In addition, because an RNN keeps a memory of its previous inputs, it is not suited to querying multiple words at the same time.
Doing so confuses the language model's memory and produces seriously abnormal query results.
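
The memory problem described above can be illustrated with a toy sketch. Everything here is an assumption made for illustration: `ToyRNNLM`, its single scalar hidden state, its fixed weights, and the bigram values are hypothetical and are not the model used in the patent. The sketch only shows that a stateful model gives different answers for the same hypothesis once queries from another hypothesis are interleaved, while an Ngram lookup is unaffected by query order.

```python
import math


class ToyRNNLM:
    """Hypothetical one-unit recurrent model; not the patent's RNN."""

    def __init__(self):
        self.h = 0.0  # hidden state: remembers every word fed so far

    def step(self, word_id):
        # The new state mixes the old state with the current word,
        # so the output depends on the full query history.
        self.h = math.tanh(0.5 * self.h + 0.3 * word_id)
        return self.h  # stands in for the model's output score


def score_alone(words):
    """Feed one hypothesis to a fresh model, one word at a time."""
    lm = ToyRNNLM()
    return [lm.step(w) for w in words]


def score_interleaved(words_a, words_b):
    """Feed two hypotheses alternately through ONE shared model."""
    lm = ToyRNNLM()
    outputs_a = []
    for wa, wb in zip(words_a, words_b):
        outputs_a.append(lm.step(wa))
        lm.step(wb)  # hypothesis B overwrites the memory A relies on
    return outputs_a


# Toy bigram table (assumed values): a lookup has no hidden state.
BIGRAM = {("turn", "left"): 0.6, ("turn", "right"): 0.4}


def ngram_query(history, word):
    """Stateless query: the result never depends on earlier queries."""
    return BIGRAM[(history, word)]


hyp_a, hyp_b = [1, 2, 3], [3, 1, 2]
alone = score_alone(hyp_a)
mixed = score_interleaved(hyp_a, hyp_b)
print(alone[1], mixed[1])  # the two values differ after the first word
```

A decoder that expands many hypotheses at once would therefore need one copy of the recurrent state per hypothesis, which adds to the cost already noted above.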



Embodiment Construction

[0025] In order to make the object, technical solution and advantages of the present invention clearer, the present invention will be described in further detail below in conjunction with specific embodiments and with reference to the accompanying drawings.

[0026] Figure 1 is a flow chart of an Ngram model improvement method for speech recognition according to the present invention. The method is intended for speech recognition in a specific field and significantly improves the performance of the Ngram model when little training corpus is available, for example in the voice navigation, place-name recognition, and control-command recognition functions of a vehicle navigation system.

[0027] The Ngram model improvement method for speech recognition specifically includes the following steps:

[0028] Step S101: Convert Ngram into an equivalent WFSA (Weighted Finite State Automata, Weighted Finite Stat...


Abstract

The invention discloses a method for improving an Ngram model for speech recognition, comprising the following steps: converting the original Ngram model for speech recognition into an equivalent WFSA (Weighted Finite-State Automaton) network NET1; optimizing NET1 by using an RNN (Recurrent Neural Network) so that, when the training text is scored with NET1, the output probability of each sentence in the training text is maximized; converting NET1 into a WFST (Weighted Finite State Transducer) pronunciation network NET2 carrying speech model probabilities by using a pronunciation dictionary; optimizing the pronunciation network NET2 by using a phoneme confusion matrix so that the sentence error rate is minimized; and converting the pronunciation network NET2 back into an improved Ngram model, which is then used for speech recognition.
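
The first step of the abstract, converting an Ngram model into an equivalent WFSA, can be sketched for the bigram case. This is a minimal illustration under stated assumptions and not the patent's procedure: the function names, the `<s>` and `</s>` boundary symbols, and the probabilities are hypothetical, and backoff arcs are left out. Each state is a one-word history, and each arc carries a word with weight -log P(word | history), so the weight of a path equals the negative log probability that the bigram model assigns to the sentence.

```python
import math


def bigram_to_wfsa(bigram_probs):
    """Build arcs[state][word] = (next_state, weight).

    States are one-word histories; weight is -log P(word | history).
    Backoff is deliberately omitted in this sketch.
    """
    arcs = {}
    for (history, word), prob in bigram_probs.items():
        arcs.setdefault(history, {})[word] = (word, -math.log(prob))
    return arcs


def path_weight(arcs, sentence, start="<s>", end="</s>"):
    """Sum arc weights along the path spelled out by the sentence."""
    state, total = start, 0.0
    for word in list(sentence) + [end]:
        state, weight = arcs[state][word]  # KeyError for unseen bigrams
        total += weight
    return total


# Toy bigram probabilities (assumed values for a command-style task).
BIGRAMS = {
    ("<s>", "turn"): 0.5,
    ("<s>", "go"): 0.5,
    ("turn", "left"): 0.5,
    ("turn", "right"): 0.5,
    ("go", "home"): 1.0,
    ("left", "</s>"): 1.0,
    ("right", "</s>"): 1.0,
    ("home", "</s>"): 1.0,
}

net1 = bigram_to_wfsa(BIGRAMS)
weight = path_weight(net1, ["turn", "left"])
print(weight, math.exp(-weight))  # path weight and the sentence probability
```

Because the arc weights are the only free parameters in this representation, later optimization steps can adjust them directly, and the adjusted weights can be read back as Ngram probabilities.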

Description

Technical field

[0001] The invention discloses an Ngram model improvement method for speech recognition, in particular an Ngram model improvement method for specific speech recognition tasks with a small corpus.

Background

[0002] 1. The language model plays a pivotal role in speech recognition. The acoustic model, the language model, and the decoding algorithm together constitute a complete speech recognition framework.

[0003] 2. The traditional Ngram (N-gram) model is the most widely used language model. Its advantages are that queries are fast and that it can easily be converted into a WFST (Weighted Finite State Transducer). After conversion to a WFST, the recognition speed can increase by an order of magnitude. However, the Ngram model makes a series of assumptions about the data distribution; when the training data distribution differs from those assumptions, especially when the amount of training data is small, its performance will be grea...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G10L15/16
Inventors: 柯登峰, 徐波
Owner: INST OF AUTOMATION CHINESE ACAD OF SCI