Recognition model optimization method and system for improving speech recognition accuracy

Speech recognition model and speech recognition technology, applied in speech recognition, speech analysis, instruments, etc., which can solve problems such as unsuitability for processing open corpora, low recognition accuracy, and poor consistency of knowledge expression.

Active Publication Date: 2021-08-31
西安博达软件股份有限公司
6 Cites, 0 Cited by

AI Technical Summary

Problems solved by technology

[0003] Language models in common use today fall into two broad types. The first is the statistical language model built from a large-scale corpus. Because its implementation is limited by the space and time available to the system, it can only reflect close-range (local) relationships in the language and cannot handle long-distance recursion.
The second is the rule-based language model. This method classifies the Chinese vocabulary system according to syntax and semantics and, by determining the morphological, syntactic, and semantic relations of natural language, attempts to identify homophones uniquely on a large scale. It is suited to processing closed corpora and can reflect the long-distance constraints and recursive phenomena of the language, but it has poor robustness, is not suited to processing open corpora, and expresses knowledge inconsistently.
[0004] In existing methods for recognizing speech as text, words in the original text are sometimes recognized as different words with the same pronunciation (homophones), so recognition accuracy is low.
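The homophone problem in [0004] and the local nature of a statistical language model in [0003] can be illustrated with a small sketch. This is not the patent's method: the toy corpus, the English homophone pair "their"/"there", and the helper names `bigram_prob` and `pick_homophone` are all assumptions chosen for illustration. The model only looks at the immediately preceding word, which is exactly the close-range limitation described above.

```python
from collections import Counter

# Toy corpus (assumption); a real system would use a large-scale corpus.
corpus = "they left their bags over there because their car was there".split()

unigrams = Counter(corpus)
bigrams = Counter(zip(corpus, corpus[1:]))


def bigram_prob(prev, word, vocab_size):
    """Add-one smoothed estimate of P(word | prev)."""
    return (bigrams[(prev, word)] + 1) / (unigrams[prev] + vocab_size)


def pick_homophone(prev, candidates):
    """Choose the candidate the bigram model prefers after `prev`."""
    vocab_size = len(unigrams)
    return max(candidates, key=lambda w: bigram_prob(prev, w, vocab_size))


print(pick_homophone("left", ["there", "their"]))  # their
print(pick_homophone("over", ["their", "there"]))  # there
```

Because the decision depends only on one word of context, the same sketch would fail whenever the disambiguating word is far from the homophone, which is the long-distance weakness the background section attributes to statistical models.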


Examples

Embodiment

[0049] As shown in Figure 1, in a first aspect, an embodiment of the present invention provides a recognition model optimization method for improving the accuracy of speech recognition, comprising the following steps:

[0050] S1. Input speech training data into the CTC model of the DeepSpeech speech recognition system and obtain the speech recognition sequence output by the CTC model.

[0051] In some embodiments of the present invention, speech training data is input into the CTC model of the DeepSpeech speech recognition system (an end-to-end automatic speech recognition system), and the CTC model (a neural-network-based connectionist temporal classification model) is trained and optimized. The output loss of the CTC model over the training set S is computed as

$$L(S) = -\ln \prod_{(x,z) \in S} p(z \mid x) = -\sum_{(x,z) \in S} \ln p(z \mid x),$$

and the CTC model then computes the most likely output speech recognition sequence z, which is text data.
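The loss in [0051] and the idea of reading out a likely sequence z can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not DeepSpeech code: `ctc_loss` simply takes the values p(z|x) as given rather than computing them with the CTC forward algorithm, the blank symbol `_` is a hypothetical choice, and `greedy_decode` uses best-path decoding, which only approximates the most likely sequence.

```python
import math

BLANK = "_"  # hypothetical blank symbol used by this sketch


def ctc_loss(pair_probs):
    """L(S) = -sum over (x, z) in S of ln p(z|x).

    `pair_probs` holds one already-computed p(z|x) per training pair.
    """
    return -sum(math.log(p) for p in pair_probs)


def greedy_decode(frame_probs):
    """Best-path decoding: argmax label per frame, merge repeats, drop blanks."""
    best_path = [max(frame, key=frame.get) for frame in frame_probs]
    output = []
    previous = None
    for label in best_path:
        if label != previous and label != BLANK:
            output.append(label)
        previous = label
    return "".join(output)


# Four frames of per-label probabilities (made-up values).
frames = [
    {"h": 0.6, BLANK: 0.3, "i": 0.1},
    {"h": 0.5, BLANK: 0.4, "i": 0.1},
    {BLANK: 0.7, "h": 0.2, "i": 0.1},
    {"i": 0.8, BLANK: 0.1, "h": 0.1},
]
print(greedy_decode(frames))       # hi
print(ctc_loss([0.5, 0.25]))       # equals ln 8, i.e. -ln(0.5 * 0.25)
```

The second call shows the identity in the formula: the negative log of the product of the pair probabilities equals the sum of the negative logs.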

[0052] S2. Input the speech recognition sequence into the preset language mod...

Abstract

The invention discloses a recognition model optimization method for improving speech recognition accuracy. The method comprises the steps of inputting speech training data into a CTC model of a DeepSpeech speech recognition system and obtaining a speech recognition sequence output by the CTC model; inputting the speech recognition sequence into a preset language model to obtain an output probability value for each single word in the speech recognition sequence; and optimizing and adjusting the CTC model according to the output probability value of each single word to obtain an optimized speech recognition model. The invention also discloses a recognition model optimization system for improving speech recognition accuracy. The invention relates to the technical field of speech recognition. By combining the CTC model with the language model, the invention effectively improves speech recognition accuracy.
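The second step of the abstract, obtaining an output probability value for each single word of the recognized sequence from a preset language model, might look roughly like the sketch below. Everything here is an assumption for illustration: the excerpt does not say what kind of language model is used or how the CTC model is then adjusted, so the sketch uses a toy character-level bigram model with add-one smoothing, hypothetical helper names `train_char_bigram` and `char_probs`, and stops at reporting the probabilities.

```python
from collections import Counter

START = "<s>"  # hypothetical sentence-start marker


def train_char_bigram(texts):
    """Count character contexts and bigrams from a toy reference corpus."""
    contexts, bigrams, vocab = Counter(), Counter(), set()
    for text in texts:
        chars = [START] + list(text)
        vocab.update(chars)
        contexts.update(chars[:-1])
        bigrams.update(zip(chars, chars[1:]))
    return contexts, bigrams, vocab


def char_probs(sequence, contexts, bigrams, vocab):
    """Return (character, add-one smoothed P(char | previous char)) pairs."""
    chars = [START] + list(sequence)
    size = len(vocab)
    return [
        (cur, (bigrams[(prev, cur)] + 1) / (contexts[prev] + size))
        for prev, cur in zip(chars, chars[1:])
    ]


# Toy "preset language model" trained on two reference phrases (assumption).
model = train_char_bigram(["语音识别", "语音模型"])

# A recognized sequence in which 音 was replaced by the homophone 因.
probs = char_probs("语因识别", *model)
for char, p in probs:
    print(char, round(p, 3))

# The character the language model finds least plausible.
suspect = min(probs, key=lambda cp: cp[1])[0]
print("lowest-probability character:", suspect)
```

In this toy run the homophone error receives the lowest probability, which is the kind of per-word signal the abstract says is fed back into the optimization of the CTC model.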

Description

Technical field

[0001] The invention relates to the technical field of speech recognition, in particular to a recognition model optimization method and system for improving the accuracy of speech recognition.

Background

[0002] Speech recognition is generally divided into two stages:
1) Speech recognition stage: this stage uses the acoustic model of speech to convert natural sound signals into digitally expressed syllables that a machine can process.
2) Speech understanding stage: this stage converts the result of the previous stage, that is, the syllables, into Chinese characters. It relies on the knowledge in the language model for understanding.
The most important part of speech recognition is establishing a language model to improve recognition accuracy.

[0003] Language models commonly used today can generally be divided into two types: one is a statistical language model based on a large-scale corpus; Becaus...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G10L15/00, G10L15/06, G10L15/18, G10L15/26
CPC: G10L15/005, G10L15/063, G10L15/18, G10L15/26
Inventor: 李传咏, 赵莉, 卢颖, 陈宁, 刘睿
Owner: 西安博达软件股份有限公司