Streaming phonetic transcription system based on self-attention mechanism

A self-attention-based streaming technology, applied in speech analysis, speech recognition, biological neural network models, etc. It addresses problems such as the low training efficiency of recurrent networks and the inability of standard self-attention to be applied to streaming sequence modeling tasks, thereby improving modeling capability as well as training and computational efficiency.

Active Publication Date: 2019-11-19
Applicant: 北京中科智极科技有限公司

AI Technical Summary

Problems solved by technology

However, the recurrent computation of a recurrent neural network is relatively inefficient during training.
The self-attention mechanism can also model long-distance dependencies, but it requires the complete sequence as input; although it is computationally efficient, it cannot be applied to streaming sequence modeling tasks.



Examples


Embodiment Construction

[0051] In order to make the object, technical solution and advantages of the present invention clearer, the present invention will be further described in detail below with reference to the accompanying drawings and embodiments. However, it should be understood that the specific embodiments described here are only used to explain the present invention, and are not intended to limit the scope of the present invention. Also, in the following description, descriptions of well-known structures and techniques are omitted to avoid unnecessarily obscuring the concept of the present invention.

[0052] In the streaming speech transcription system based on the self-attention mechanism of the present invention, the self-attention mechanism is used instead of a recurrent neural network to model sequential information, and streaming speech transcription is realized by limiting the scope of the self-attention mechanism and stacking a multi-layer structure. With only a small performance ...
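As an illustration of the restricted-scope idea described above, the sketch below shows a self-attention layer whose attention is masked to a bounded left context and a small right look-ahead, so that each position only needs a limited window of future frames; stacking several such layers then determines the encoder's overall receptive field. This is a minimal sketch assuming a PyTorch implementation with hypothetical window sizes and projection setup, not the patent's actual configuration.

```python
import torch
import torch.nn.functional as F

def windowed_self_attention(x, w_q, w_k, w_v, left=16, right=2):
    """Self-attention restricted to a bounded window around each frame.

    x: (T, d) sequence of frame features.
    left/right: number of past/future frames each position may attend to
    (hypothetical values; stacking L such layers gives a total
    look-ahead of roughly L * right frames).
    """
    T, d = x.shape
    q, k, v = x @ w_q, x @ w_k, x @ w_v            # (T, d) projections
    scores = (q @ k.t()) / d ** 0.5                 # (T, T) attention logits

    # Mask out keys outside [i - left, i + right] for each query i,
    # which is what keeps the layer usable in a streaming setting.
    idx = torch.arange(T)
    offset = idx[None, :] - idx[:, None]            # key index minus query index
    mask = (offset < -left) | (offset > right)
    scores = scores.masked_fill(mask, float('-inf'))

    return F.softmax(scores, dim=-1) @ v            # (T, d) context vectors

# Example usage with random projections on a 100-frame, 64-dim input.
d = 64
x = torch.randn(100, d)
w_q, w_k, w_v = (torch.randn(d, d) / d ** 0.5 for _ in range(3))
out = windowed_self_attention(x, w_q, w_k, w_v)
print(out.shape)  # torch.Size([100, 64])
```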



Abstract

The invention discloses a streaming phonetic transcription system based on a self-attention mechanism. The system comprises a feature front-end processing module, a self-attention audio coding network module, a self-attention prediction network module, and a joint network module. The feature front-end processing module is used for receiving an input acoustic feature and converting it into a vector of a specific dimensionality; the self-attention audio coding network module is connected with the feature front-end processing module and is used for receiving the processed acoustic feature and obtaining an encoded acoustic state vector; the self-attention prediction network module is used for generating a language state vector from the prediction label output at the previous time step; and the joint network module is connected with the self-attention audio coding network module and the self-attention prediction network module, and is used for combining the acoustic state and the language state and computing the probability of a new prediction label. The invention provides a streaming feed-forward speech encoder based on the self-attention mechanism, improving the computational efficiency and accuracy over a traditional speech encoder.
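To make the interaction between the modules concrete, the following minimal sketch (assuming PyTorch; the dimensions, names, and tanh-based combination are illustrative choices, not the configuration disclosed in the patent) shows how a joint network can combine the acoustic state vectors from the audio coding network with the language state vectors from the prediction network and compute probabilities over prediction labels, in the style of a transducer model.

```python
import torch
import torch.nn as nn

class JointNetwork(nn.Module):
    """Combines an acoustic state and a language state into label probabilities.

    Dimensions and layer choices below are illustrative assumptions,
    not the configuration disclosed in the patent.
    """
    def __init__(self, enc_dim=256, pred_dim=256, joint_dim=512, vocab_size=5000):
        super().__init__()
        self.enc_proj = nn.Linear(enc_dim, joint_dim)    # acoustic state projection
        self.pred_proj = nn.Linear(pred_dim, joint_dim)  # language state projection
        self.out = nn.Linear(joint_dim, vocab_size)      # output label distribution

    def forward(self, acoustic_state, language_state):
        # acoustic_state: (B, T, enc_dim) from the self-attention audio encoder
        # language_state: (B, U, pred_dim) from the self-attention prediction network
        joint = torch.tanh(
            self.enc_proj(acoustic_state).unsqueeze(2)     # (B, T, 1, joint_dim)
            + self.pred_proj(language_state).unsqueeze(1)  # (B, 1, U, joint_dim)
        )
        return self.out(joint).log_softmax(dim=-1)         # (B, T, U, vocab_size)

# Example: 50 encoder frames, 10 predicted labels.
joint = JointNetwork()
h_enc = torch.randn(1, 50, 256)
h_pred = torch.randn(1, 10, 256)
log_probs = joint(h_enc, h_pred)
print(log_probs.shape)  # torch.Size([1, 50, 10, 5000])
```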

Description

Technical Field

[0001] The invention relates to the technical field of signal processing in the electronics industry, and in particular to a streaming speech transcription system based on a self-attention mechanism.

Background Technique

[0002] Speech is one of the main means by which humans exchange information. Speech recognition technology enables computers to recognize human speech and transcribe it into corresponding text. Early research in speech recognition mainly used methods based on Gaussian mixture models and hidden Markov models; with the development of deep neural networks, the Gaussian mixture model was gradually replaced by deep neural networks. In recent years, with the development of computer technology, end-to-end models have attracted more and more attention because of their simplified pipeline and elegant model structure.

[0003] The recurrent neural network speech transcription system uses a recurrent neural network as the basic network fr...


Application Information

Patent Type & Authority: Application (China)
IPC(8): G10L 15/16; G10L 15/183; G06N 3/04
CPC: G10L 15/16; G10L 15/183; G06N 3/045
Inventors: 温正棋, 田正坤
Owner: 北京中科智极科技有限公司