Training method and system for RNN transducer model, and device

A training method for speech recognition models, applied in speech analysis, speech recognition, instruments, etc. It addresses the problems of poorly learned speech-data alignment information and low speech recognition accuracy, and achieves faster model convergence, improved accuracy, and improved performance.

Active Publication Date: 2020-01-14
INST OF AUTOMATION CHINESE ACAD OF SCI
Cites: 1; Cited by: 12

AI Technical Summary

Problems solved by technology

[0003] To solve the above-mentioned problem in the prior art, namely that the end-to-end speech transcription model cannot learn the alignment information of speech data well, which results in low speech recognition accuracy, the first aspect of the present invention proposes a training method for an end-to-end speech transcription model, the method comprising:



Embodiment Construction

[0037] To make the purpose, technical solutions, and advantages of the present invention clearer, the technical solutions in the embodiments of the present invention are described clearly and completely below in conjunction with the accompanying drawings. The described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by persons of ordinary skill in the art based on the embodiments of the present invention, without creative effort, fall within the protection scope of the present invention.

[0038] The application is described in further detail below in conjunction with the accompanying drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the related invention, not to limit it. It should also be noted that, for convenience of description, only the parts related to the related invention are sho...


Abstract

The invention belongs to the technical field of electronic signal processing and relates in particular to a training method, system, and device for an RNN transducer model. It aims to solve the problem that the RNN transducer model cannot learn the alignment information of speech data well. The training method comprises the following steps: extracting features from speech training data to obtain a speech feature sequence; force-aligning the speech feature sequence with a GMM-HMM (Gaussian Mixture Model-Hidden Markov Model) model to obtain alignment annotations, and splicing different frames of speech features; training the RNN transducer model on the spliced speech feature sequence and the text annotation training data to obtain the probability distribution of each word in a preset word list and the negative logarithmic loss value; acquiring an alignment loss value; taking a weighted average of the alignment loss value and the negative logarithmic loss value to obtain a combined loss value, and updating the model parameters through the backpropagation algorithm; and iteratively training the model. With this training method and system, the alignment information of the speech data can be learned accurately.
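Two of the steps in the abstract, frame splicing and the weighted combination of the two losses, can be illustrated with a minimal Python sketch. This is not the patent's implementation. The function names, the context window sizes, the edge padding by repeating boundary frames, the use of per-frame cross-entropy as the alignment loss, and the single `alignment_weight` parameter are all assumptions made here for illustration; the abstract only states that frames are spliced, that an alignment loss is acquired, and that the two loss values are averaged with weights.

```python
import math
from typing import List, Sequence


def splice_frames(frames: Sequence[Sequence[float]],
                  left: int = 1, right: int = 1) -> List[List[float]]:
    """Concatenate each frame with its neighbors into one longer vector.

    Assumption: edges are padded by repeating the first/last frame.
    """
    if not frames:
        return []
    last = len(frames) - 1
    spliced = []
    for t in range(len(frames)):
        window: List[float] = []
        for offset in range(-left, right + 1):
            idx = min(max(t + offset, 0), last)  # clamp to valid range
            window.extend(frames[idx])
        spliced.append(window)
    return spliced


def alignment_loss(frame_probs: Sequence[Sequence[float]],
                   alignment: Sequence[int]) -> float:
    """Mean cross-entropy of per-frame distributions against
    forced-alignment labels (assumed form of the alignment loss)."""
    if len(frame_probs) != len(alignment) or not alignment:
        raise ValueError("need one alignment label per frame")
    total = 0.0
    for probs, label in zip(frame_probs, alignment):
        total -= math.log(max(probs[label], 1e-12))  # avoid log(0)
    return total / len(alignment)


def combined_loss(transducer_nll: float, align_loss: float,
                  alignment_weight: float = 0.5) -> float:
    """Weighted average of the transducer negative log loss and the
    alignment loss; alignment_weight is a hypothetical hyperparameter."""
    if not 0.0 <= alignment_weight <= 1.0:
        raise ValueError("alignment_weight must be in [0, 1]")
    return (1.0 - alignment_weight) * transducer_nll \
        + alignment_weight * align_loss
```

In a real training loop the combined value would be the quantity that is backpropagated; here it is only computed, since the abstract does not specify the model architecture or the optimizer.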

Description

Technical field

[0001] The invention belongs to the technical field of electronic signal processing, and in particular relates to a training method, system, and device for an end-to-end speech transcription model.

Background

[0002] As the entry point of human-computer interaction, speech recognition is an important research direction in the field of artificial intelligence. Traditional speech recognition methods generally use a hybrid model based on the Gaussian mixture model and hidden Markov model (GMM-HMM). The system as a whole has many components that are trained separately, and its performance cannot meet requirements. With the in-depth application of deep learning technology in speech recognition, end-to-end speech recognition has achieved remarkable results. In particular, the recently proposed end-to-end speech transcription model based on the recurrent neural network (RNN Transducer Model) not only greatly simplifies the steps of the speech recognition system...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G10L15/14, G10L15/16, G10L15/06, G10L15/02, G10L15/26, G10L25/24
CPC: G10L15/144, G10L15/148, G10L15/16, G10L15/063, G10L15/02, G10L15/26, G10L25/24
Inventor: 陶建华, 田正坤, 易江燕
Owner: INST OF AUTOMATION CHINESE ACAD OF SCI