Recognition model optimization method and system for improving speech recognition accuracy

A speech recognition model and speech recognition technology, applied in speech recognition, speech analysis, instruments, and related fields, that addresses problems such as unsuitability for processing open corpora, low recognition accuracy, and poor consistency of knowledge expression.

Active Publication Date: 2021-08-31

西安博达软件股份有限公司

6 Cites, 0 Cited by


AI Technical Summary

Problems solved by technology

[0003] Language models in common use today can generally be divided into two types. The first is the statistical language model based on a large-scale corpus. Because its implementation is limited by the space and time available to the system, it can reflect only the constraints between adjacent words and cannot deal with long-distance dependencies or recursion in language.

The second is the rule-based language model. This method classifies the Chinese vocabulary system according to syntax and semantics and, by determining the morphological, syntactic, and semantic relations of natural language, tries to achieve large-scale, essentially unique identification of homophones. It is suitable for processing closed corpora and can reflect the long-distance constraints and recursive phenomena of language, but it has poor robustness, is not suitable for processing open corpora, and expresses knowledge inconsistently.
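The locality limit of a statistical model described in [0003] can be illustrated with a minimal bigram sketch. This is not the patent's language model: the function names, the toy English corpus, and the unsmoothed maximum-likelihood estimate are all assumptions chosen for illustration.

```python
from collections import Counter

def train_bigram(sentences):
    """Count bigrams and their left contexts from whitespace-tokenized text."""
    context_counts = Counter()
    bigram_counts = Counter()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split()
        for prev, word in zip(tokens, tokens[1:]):
            context_counts[prev] += 1
            bigram_counts[(prev, word)] += 1
    return context_counts, bigram_counts

def bigram_prob(model, prev, word):
    """Maximum-likelihood estimate of P(word | prev); 0.0 for unseen contexts."""
    context_counts, bigram_counts = model
    if context_counts[prev] == 0:
        return 0.0
    return bigram_counts[(prev, word)] / context_counts[prev]

# Toy corpus, invented for illustration only.
corpus = [
    "the dog barks",
    "the dogs bark",
    "the dog near the trees barks",
]
model = train_bigram(corpus)

# The model conditions only on the previous word. After "trees" it cannot
# see that the distant subject "dog" is singular; it only knows which word
# followed "trees" in the training data.
p_after_trees = bigram_prob(model, "trees", "barks")
p_after_dog = bigram_prob(model, "dog", "barks")
```

Raising the n-gram order widens the context window, but the number of counts to store grows quickly, which is the space and time limit the paragraph refers to.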

[0004] In existing speech-to-text methods, words in the original speech are sometimes recognized as different words with the same pronunciation (homophones), so recognition accuracy is low.
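The homophone problem in [0004] can be sketched as a choice between same-sounding candidates, scored by a language model given the preceding word. The probability table, the floor value, and the function name below are hypothetical and use English words only for readability; the patent itself targets Chinese text.

```python
import math

# Hypothetical bigram log-probabilities, invented for illustration only.
TOY_BIGRAM_LOGP = {
    ("turn", "right"): math.log(0.30),
    ("turn", "write"): math.log(0.001),
    ("please", "write"): math.log(0.20),
    ("please", "right"): math.log(0.002),
}
# Assumed floor for word pairs missing from the table.
UNSEEN_LOGP = math.log(1e-6)

def pick_homophone(prev_word, candidates):
    """Return the candidate with the highest log P(candidate | prev_word)."""
    return max(
        candidates,
        key=lambda word: TOY_BIGRAM_LOGP.get((prev_word, word), UNSEEN_LOGP),
    )

# "right" and "write" sound the same; only the context separates them.
choice_after_turn = pick_homophone("turn", ["write", "right"])
choice_after_please = pick_homophone("please", ["right", "write"])
```

An acoustic model alone assigns similar scores to both candidates, so the context term is what decides between them in this sketch.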

Embodiment

[0049] As shown in Figure 1, in a first aspect, an embodiment of the present invention provides a recognition model optimization method for improving the accuracy of speech recognition, including the following steps:

[0050] S1. Input speech training data into the CTC model of the DeepSpeech speech recognition system and obtain the speech recognition sequence output by the CTC model;

[0051] In some embodiments of the present invention, speech training data is input into the CTC model of the DeepSpeech speech recognition system (an end-to-end automatic speech recognition system), and the CTC model (a neural-network-based connectionist temporal classification model) is trained and optimized. The output loss of the CTC model is calculated as $L(S) = -\ln \prod_{(x,z) \in S} p(z \mid x) = -\sum_{(x,z) \in S} \ln p(z \mid x)$. CTC then computes the most likely output speech recognition sequence z, which is text data.

[0052] S2. Input the speech recognition sequence into the preset language mod...
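The loss in [0051] can be sketched in plain Python, assuming the standard CTC forward recursion for p(z|x) and assuming label index 0 is the blank. The function names are hypothetical, and the greedy decoder is a common approximation of the most likely sequence; the excerpt does not say which decoder DeepSpeech or the patent uses.

```python
import math

BLANK = 0  # assumed index of the CTC blank label

def ctc_prob(probs, target):
    """p(z|x): total probability of all frame alignments that collapse to target.

    probs is a list of per-frame label distributions; target is a list of
    non-blank label indices.
    """
    # Extended target with blanks around every label: [b, z1, b, z2, ..., b].
    ext = [BLANK]
    for label in target:
        ext += [label, BLANK]
    size = len(ext)
    alpha = [0.0] * size
    alpha[0] = probs[0][BLANK]
    if size > 1:
        alpha[1] = probs[0][ext[1]]
    for frame in probs[1:]:
        new = [0.0] * size
        for s in range(size):
            total = alpha[s]
            if s >= 1:
                total += alpha[s - 1]
            # Skipping the blank is allowed only between two different labels.
            if s >= 2 and ext[s] != BLANK and ext[s] != ext[s - 2]:
                total += alpha[s - 2]
            new[s] = total * frame[ext[s]]
        alpha = new
    return alpha[-1] + (alpha[-2] if size > 1 else 0.0)

def ctc_loss(dataset):
    """L(S) = -sum of ln p(z|x) over the (x, z) pairs in S."""
    return -sum(math.log(ctc_prob(probs, target)) for probs, target in dataset)

def greedy_decode(probs):
    """Best-path approximation: argmax per frame, merge repeats, drop blanks."""
    decoded, prev = [], None
    for frame in probs:
        best = max(range(len(frame)), key=frame.__getitem__)
        if best != prev and best != BLANK:
            decoded.append(best)
        prev = best
    return decoded

# Two frames, two labels (blank and label 1), uniform outputs. Three
# alignments collapse to [1]: (1,1), (1,blank), (blank,1), each 0.25.
uniform = [[0.5, 0.5], [0.5, 0.5]]
p_single = ctc_prob(uniform, [1])
loss = ctc_loss([(uniform, [1])])
```

Working with raw probabilities keeps the sketch short; a practical implementation would run the recursion in log space to avoid underflow on long utterances.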


Abstract

The invention discloses a recognition model optimization method for improving speech recognition accuracy. The method comprises the steps of inputting speech training data into a CTC model of a DeepSpeech speech recognition system and obtaining the speech recognition sequence output by the CTC model; inputting the speech recognition sequence into a preset language model to obtain an output probability value for each single word in the speech recognition sequence; and optimizing and adjusting the CTC model according to the output probability value of each single word to obtain an optimized speech recognition model. The invention also discloses a recognition model optimization system for improving speech recognition accuracy. The invention relates to the technical field of speech recognition. By combining the CTC model with the language model, the invention effectively improves speech recognition accuracy.
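The second step of the abstract, obtaining an output probability for each single word of the recognition sequence, might look like the sketch below. The toy language model, the floor value, and the threshold are hypothetical. How the patent feeds these probabilities back into the CTC model is not given in this excerpt, so the sketch stops at flagging low-probability positions.

```python
# Hypothetical conditional probabilities P(token | previous token),
# invented for illustration only.
TOY_LM = {
    ("<s>", "turn"): 0.4,
    ("turn", "right"): 0.3,
    ("turn", "write"): 0.001,
    ("right", "here"): 0.2,
    ("write", "here"): 0.05,
}
FLOOR = 1e-6  # assumed probability for pairs missing from the table

def token_probabilities(tokens, lm=TOY_LM, floor=FLOOR):
    """Return the language model probability of each token given its predecessor."""
    probs = []
    prev = "<s>"
    for token in tokens:
        probs.append(lm.get((prev, token), floor))
        prev = token
    return probs

def low_confidence_positions(tokens, threshold=0.01):
    """Indices whose language model probability falls below the threshold."""
    return [i for i, p in enumerate(token_probabilities(tokens)) if p < threshold]

# A recognition sequence containing a likely homophone error at index 1.
recognized = ["turn", "write", "here"]
per_token = token_probabilities(recognized)
suspect = low_confidence_positions(recognized)
```

Positions flagged this way are the natural candidates for whatever adjustment the full description specifies.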

Description

Technical field

[0001] The invention relates to the technical field of speech recognition, in particular to a recognition model optimization method and system for improving the accuracy of speech recognition.

Background

[0002] Speech recognition is generally divided into two stages: 1) Speech recognition stage: this stage uses the acoustic model of speech to convert natural sound signals into machine-processable digital representations of syllables. 2) Speech understanding stage: this stage converts the result of the previous stage, that is, the syllables, into Chinese characters, and it needs the knowledge of a language model to do so. The most important part of speech recognition is establishing a language model to improve recognition accuracy.

[0003] Language models commonly used today can generally be divided into two types: one is a statistical language model based on a large-scale corpus; Becaus...


Application Information


Patent Type & Authority: Applications (China)

IPC(8): G10L15/00, G10L15/06, G10L15/18, G10L15/26

CPC: G10L15/005, G10L15/063, G10L15/18, G10L15/26

Inventor: 李传咏, 赵莉, 卢颖, 陈宁, 刘睿

Owner: 西安博达软件股份有限公司
