Streaming phonetic transcription system based on self-attention mechanism

A self-attention-based streaming technology, applied in speech analysis, speech recognition, biological neural network models, etc. It addresses problems such as the low training efficiency of recurrent networks and the inability of standard self-attention to be applied to streaming sequence modeling tasks, thereby improving modeling capability as well as training and computational efficiency.

Active Publication Date: 2019-11-19
Applicant: 北京中科智极科技有限公司

AI Technical Summary

Problems solved by technology

However, the recurrent computation of a recurrent neural network is relatively inefficient during training.
The self-attention mechanism can also model long-distance dependencies, but it requires the complete sequence as input; although it is computationally efficient, it cannot be applied to streaming sequence modeling tasks.



Examples


Embodiment Construction

[0051] In order to make the object, technical solution and advantages of the present invention clearer, the present invention will be further described in detail below with reference to the accompanying drawings and embodiments. However, it should be understood that the specific embodiments described here are only used to explain the present invention, and are not intended to limit the scope of the present invention. Also, in the following description, descriptions of well-known structures and techniques are omitted to avoid unnecessarily obscuring the concept of the present invention.

[0052] In the streaming speech transcription system based on the self-attention mechanism of the present invention, the self-attention mechanism is used instead of a recurrent neural network to model sequential information, and streaming speech transcription is realized by limiting the scope of the self-attention mechanism and stacking a multi-layer structure. With only a small performance ...
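As an illustration of the restricted-scope idea described above, the sketch below shows a self-attention layer whose attention is masked to a bounded left context and a small right look-ahead, so that each position only needs a limited window of future frames; stacking several such layers then determines the encoder's overall receptive field. This is a minimal sketch assuming a PyTorch implementation with hypothetical window sizes and projection setup, not the patent's actual configuration.

```python
import torch
import torch.nn.functional as F

def windowed_self_attention(x, w_q, w_k, w_v, left=16, right=2):
    """Self-attention restricted to a bounded window around each frame.

    x: (T, d) sequence of frame features.
    left/right: number of past/future frames each position may attend to
    (hypothetical values; stacking L such layers gives a total
    look-ahead of roughly L * right frames).
    """
    T, d = x.shape
    q, k, v = x @ w_q, x @ w_k, x @ w_v            # (T, d) projections
    scores = (q @ k.t()) / d ** 0.5                 # (T, T) attention logits

    # Mask out keys outside [i - left, i + right] for each query i,
    # which is what keeps the layer usable in a streaming setting.
    idx = torch.arange(T)
    offset = idx[None, :] - idx[:, None]            # key index minus query index
    mask = (offset < -left) | (offset > right)
    scores = scores.masked_fill(mask, float('-inf'))

    return F.softmax(scores, dim=-1) @ v            # (T, d) context vectors

# Example usage with random projections on a 100-frame, 64-dim input.
d = 64
x = torch.randn(100, d)
w_q, w_k, w_v = (torch.randn(d, d) / d ** 0.5 for _ in range(3))
out = windowed_self_attention(x, w_q, w_k, w_v)
print(out.shape)  # torch.Size([100, 64])
```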



Abstract

The invention discloses a streaming phonetic transcription system based on a self-attention mechanism. The system comprises a feature front-end processing module, a self-attention audio coding network module, a self-attention prediction network module, and a joint network module. The feature front-end processing module is used for receiving an input acoustic feature and converting it into a vector of a specific dimensionality; the self-attention audio coding network module is connected with the feature front-end processing module and is used for receiving the processed acoustic feature and obtaining an encoded acoustic state vector; the self-attention prediction network module is used for generating a language state vector from the prediction label output at the previous time step; and the joint network module is connected with the self-attention audio coding network module and the self-attention prediction network module, and is used for combining the acoustic state and the language state and computing the probability of a new prediction label. The invention provides a streaming feed-forward speech encoder based on the self-attention mechanism, improving the computational efficiency and accuracy over a traditional speech encoder.
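To make the interaction between the modules concrete, the following minimal sketch (assuming PyTorch; the dimensions, names, and tanh-based combination are illustrative choices, not the configuration disclosed in the patent) shows how a joint network can combine the acoustic state vectors from the audio coding network with the language state vectors from the prediction network and compute probabilities over prediction labels, in the style of a transducer model.

```python
import torch
import torch.nn as nn

class JointNetwork(nn.Module):
    """Combines an acoustic state and a language state into label probabilities.

    Dimensions and layer choices below are illustrative assumptions,
    not the configuration disclosed in the patent.
    """
    def __init__(self, enc_dim=256, pred_dim=256, joint_dim=512, vocab_size=5000):
        super().__init__()
        self.enc_proj = nn.Linear(enc_dim, joint_dim)    # acoustic state projection
        self.pred_proj = nn.Linear(pred_dim, joint_dim)  # language state projection
        self.out = nn.Linear(joint_dim, vocab_size)      # output label distribution

    def forward(self, acoustic_state, language_state):
        # acoustic_state: (B, T, enc_dim) from the self-attention audio encoder
        # language_state: (B, U, pred_dim) from the self-attention prediction network
        joint = torch.tanh(
            self.enc_proj(acoustic_state).unsqueeze(2)     # (B, T, 1, joint_dim)
            + self.pred_proj(language_state).unsqueeze(1)  # (B, 1, U, joint_dim)
        )
        return self.out(joint).log_softmax(dim=-1)         # (B, T, U, vocab_size)

# Example: 50 encoder frames, 10 predicted labels.
joint = JointNetwork()
h_enc = torch.randn(1, 50, 256)
h_pred = torch.randn(1, 10, 256)
log_probs = joint(h_enc, h_pred)
print(log_probs.shape)  # torch.Size([1, 50, 10, 5000])
```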

Description

Technical Field

[0001] The invention relates to the technical field of signal processing in the electronics industry, and in particular to a streaming speech transcription system based on a self-attention mechanism.

Background Technique

[0002] Speech is one of the main means by which humans exchange information. Speech recognition technology enables computers to recognize human speech and transcribe it into corresponding text. Early research in speech recognition mainly used methods based on Gaussian mixture models and hidden Markov models; with the development of deep neural networks, the Gaussian mixture model was gradually replaced by deep neural networks. In recent years, with the development of computer technology, end-to-end models have attracted more and more attention because of their simplified pipeline and elegant model structure.

[0003] The recurrent neural network speech transcription system uses a recurrent neural network as the basic network fr...


Application Information

Patent Type & Authority: Application (China)
IPC(8): G10L 15/16; G10L 15/183; G06N 3/04
CPC: G10L 15/16; G10L 15/183; G06N 3/045
Inventors: 温正棋, 田正坤
Owner: 北京中科智极科技有限公司