Voice recognition method and system based on incremental word graph re-scoring

A speech recognition and word graph technology, applied in speech recognition, speech analysis, instruments, etc. It addresses insufficient recognition speed, an overly large final word graph, and the influence of result differences, so as to improve recognition accuracy, achieve self-adaptation, and speed up generation.

Pending Publication Date: 2020-11-10

XI AN JIAOTONG UNIV

Cites: 0. Cited by: 7.


Problems solved by technology

[0004] In existing methods, ensuring the recognition accuracy of one-pass decoding requires a larger beam search width, which makes the final word graph too large, and recognition is still not fast enough.

One existing method replaces one-pass decoding with a larger beam search width by two-pass decoding. Although decoding is about 2-3 times faster at lower WERs, the results of the two passes may differ substantially, because a beam value that is too small affects the final result.
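The beam-width trade-off described above can be illustrated with a minimal sketch. It assumes hypotheses are scored by accumulated cost (lower is better) and that pruning keeps every hypothesis within `beam` of the best one; the function name `beam_prune` and the example costs are hypothetical and do not come from the patent.

```python
from typing import Dict


def beam_prune(hypotheses: Dict[str, float], beam: float) -> Dict[str, float]:
    """Keep only hypotheses whose cost is within `beam` of the lowest cost."""
    best = min(hypotheses.values())
    return {h: c for h, c in hypotheses.items() if c <= best + beam}


# Hypothetical partial hypotheses with accumulated costs (illustrative values).
costs = {"a": 1.0, "b": 2.5, "c": 4.0, "d": 9.0}

narrow = beam_prune(costs, beam=2.0)  # fewer survivors: faster, but may drop the true path
wide = beam_prune(costs, beam=5.0)    # more survivors: safer, but a larger word graph

print(sorted(narrow), sorted(wide))
```

A wider beam keeps more hypotheses alive at each step, which is why the resulting word graph grows and decoding slows down.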

GPU parallel computing is relatively expensive, and whether such a decoder can be used widely in industrial scenarios remains an open question.


Embodiment Construction

[0058] First, the methods and terms involved in the present invention will be described.

[0059] 1) Finite-state acceptor (FSA): A weighted finite-state transducer (WFST) consists of a set of states and directed transitions between states. Each transition stores three pieces of information, namely an input label, an output label, and a weight, written in the format "input_label:output_label/weight". The decoding network mentioned in the present invention is a WFST. An FSA can be seen as a simplification of an FST in which each transition has only an input label.

[0060] 2) State-level word graph: a directed acyclic graph with input labels, output labels, and weight values on transition edges. The input label is the alignment information, and the output label is the word result.

[0061] 3) Word-level word graph: also called compressed word graph, which is obtained by determinizing the state-level word graph. The difference from the state-level word graph is that its alignment in...
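The terms defined above can be made concrete with a small sketch of a transition record. It assumes a plain Python representation; the class `Arc`, the helpers `project_input` and `words_on_path`, the `<eps>` empty-label convention, and the example labels are all illustrative assumptions, not the patent's data structures.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Arc:
    """One WFST transition: input label, output label, and weight."""
    src: int
    dst: int
    ilabel: str    # in a state-level word graph: alignment information
    olabel: str    # in a state-level word graph: word, or "<eps>" for none (assumed convention)
    weight: float

    def __str__(self) -> str:
        # The "input_label:output_label/weight" notation from the text.
        return f"{self.ilabel}:{self.olabel}/{self.weight}"


def project_input(arcs: List[Arc]) -> List[Tuple[int, int, str, float]]:
    """View the arcs as an acceptor (FSA): keep only the input labels."""
    return [(a.src, a.dst, a.ilabel, a.weight) for a in arcs]


def words_on_path(arcs: List[Arc]) -> List[str]:
    """Read the word sequence off a path by collecting non-empty output labels."""
    return [a.olabel for a in arcs if a.olabel != "<eps>"]


# A hypothetical single path through a state-level word graph.
lattice_path = [
    Arc(0, 1, "t1", "<eps>", 0.5),
    Arc(1, 2, "t2", "hello", 1.25),
    Arc(2, 3, "t3", "world", 0.75),
]

print(str(lattice_path[1]))
print(words_on_path(lattice_path))
```

Keeping the alignment on the input side and the words on the output side is what lets the same graph serve both as a record of the alignment and as a source of word hypotheses.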


Abstract

The invention discloses a voice recognition method and system based on incremental word graph re-scoring. The method comprises the following steps: a speech signal to be recognized is obtained and acoustic features are extracted; the likelihood probability corresponding to the acoustic features is calculated using a trained acoustic model; a decoder constructs the corresponding decoding network, obtains a state-level word graph from the decoding network, and obtains a word-level word graph by updating and determinizing that word graph; the state-level word graphs of the remaining decoding network are determinized and combined with the word-level word graphs already obtained to generate a decoded word graph; a target word graph is obtained through a finite-state transducer merging algorithm from the one-pass decoded word graph and a re-scoring language model trained on a small corpus; and the optimal-cost path of the target word graph is obtained, the corresponding word sequence is extracted, and that word sequence is taken as the final recognition result. The invention reduces the amount of determinization computation that an ordinary decoder performs after decoding finishes, which accelerates decoding; it also reduces the word error rate of speech recognition in a specific scene, improving accuracy.
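The last two steps of the abstract, re-scoring and optimal-cost path extraction, can be sketched on a toy word-level graph. This is a simplified stand-in and not the patent's method: instead of a finite-state transducer merging algorithm, it swaps per-word language model costs using hypothetical unigram tables (`old_lm`, `new_lm`) and then runs a shortest-path pass over an acyclic graph whose state ids are assumed to be topologically ordered. All names and numbers are illustrative.

```python
from typing import Dict, List, Tuple

# (src_state, dst_state, word, cost); cost includes the first-pass LM cost.
WordArc = Tuple[int, int, str, float]


def rescore(arcs: List[WordArc], old_lm: Dict[str, float],
            new_lm: Dict[str, float]) -> List[WordArc]:
    """Replace each arc's first-pass LM cost with the re-scoring LM cost."""
    return [(s, d, w, c - old_lm[w] + new_lm[w]) for s, d, w, c in arcs]


def best_path(arcs: List[WordArc], start: int, final: int) -> Tuple[float, List[str]]:
    """Lowest-cost path in a DAG, assuming src < dst on every arc."""
    best: Dict[int, Tuple[float, List[str]]] = {start: (0.0, [])}
    for s, d, w, c in sorted(arcs):  # sorting by src visits states in topological order
        if s not in best:
            continue
        cost = best[s][0] + c
        if d not in best or cost < best[d][0]:
            best[d] = (cost, best[s][1] + [w])
    return best[final]


# Hypothetical first-pass word graph and LM tables (costs, lower is better).
graph = [(0, 1, "recognize", 2.0), (0, 1, "wreck", 1.5),
         (1, 2, "speech", 1.0), (1, 2, "beach", 0.8)]
old_lm = {"recognize": 1.0, "wreck": 0.5, "speech": 0.5, "beach": 0.3}
new_lm = {"recognize": 0.2, "wreck": 2.0, "speech": 0.1, "beach": 1.5}

before_cost, before_words = best_path(graph, 0, 2)
after_cost, after_words = best_path(rescore(graph, old_lm, new_lm), 0, 2)
print(before_words, after_words)
```

In this toy case the in-domain table changes which path is cheapest, which is the effect the abstract attributes to re-scoring with a language model trained on a small corpus.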

Description

Technical field

[0001] The invention belongs to the technical field of speech recognition, and in particular relates to a speech recognition method and system based on incremental word graph re-scoring.

Background art

[0002] In recent years, with the rapid development of the artificial intelligence industry, speech recognition technology has received more and more attention from academia and industry. As a front-end technology in the field of speech interaction, speech recognition plays a vital role. It is widely used in many human-computer interaction systems, such as intelligent customer service systems, chat robots, personal intelligent assistants, and smart homes.

[0003] At present, traditional speech recognition technology is mainly built on the HMM-DNN framework. The advantage of this kind of modeling is that a speech recognition system with good accuracy can be obtained by training on a relatively small amount of data. The decoder is an extremely important compone...


Application Information


IPC(8): G10L15/02, G10L15/183, G10L15/26

CPC: G10L15/02, G10L15/183

Inventor: 范建存, 马一航

Owner: XI AN JIAOTONG UNIV
