Speech recognition method and device, equipment and storage medium

Speech recognition and speech recognition model technology, applied in speech recognition, speech analysis, instruments, etc. It addresses problems such as poor recognition performance, difficulty in improving model recognition accuracy, and difficulty in ensuring monotonicity, with the effects of facilitating speech recognition, improving accuracy, and promoting monotonicity.

Pending Publication Date: 2022-03-08
UNIV OF SCI & TECH OF CHINA +1

AI Technical Summary

Problems solved by technology

However, conventional attention-based end-to-end models have difficulty guaranteeing monotonic attention. Specifically, the model's attention is unconstrained and unordered, which makes it difficult to improve recognition accuracy; recognition performance is often poor, especially under streaming recognition requirements.
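
To make the monotonicity problem concrete, the following minimal sketch (not taken from the patent) checks whether an attention matrix is ordered, in the sense that the frame receiving the most attention never moves backward as decoding proceeds. The function names `peak_frames` and `is_monotonic` and the toy matrices are hypothetical, and "the peak frame index is non-decreasing" is only one possible definition of monotonicity, assumed here for illustration.

```python
from typing import List, Sequence


def peak_frames(attention: Sequence[Sequence[float]]) -> List[int]:
    """For each output token (row), return the index of the frame with the largest weight."""
    return [max(range(len(row)), key=row.__getitem__) for row in attention]


def is_monotonic(attention: Sequence[Sequence[float]]) -> bool:
    """True if the peak frame index never decreases from one token to the next."""
    peaks = peak_frames(attention)
    return all(a <= b for a, b in zip(peaks, peaks[1:]))


# Toy data (assumed): 3 output tokens attending over 4 encoder frames.
ordered = [
    [0.7, 0.2, 0.1, 0.0],
    [0.1, 0.6, 0.2, 0.1],
    [0.0, 0.1, 0.2, 0.7],
]
# Same rows, but the second token now peaks on a later frame than the third.
unordered = [ordered[0], ordered[2], ordered[1]]

print(is_monotonic(ordered))    # peaks are 0, 1, 3
print(is_monotonic(unordered))  # peaks are 0, 3, 1
```

An unconstrained attention model can produce either pattern. In a streaming setting, where later frames are not yet available, the second pattern is the kind of behavior that may hurt recognition.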

Detailed Description of Embodiments

[0066] The technical solutions of the embodiments of the present application are applicable to speech recognition application scenarios, and the accuracy and recognition efficiency of end-to-end speech recognition can be improved by adopting the technical solutions of the embodiments of the present application.

[0067] Speech recognition is widely used in home appliances, communications, automotive electronics, medical care, home services, consumer electronics and other fields.

[0068] Currently, end-to-end speech recognition is the most commonly used speech recognition solution. The mainstream end-to-end models fall into three types: end-to-end ASR based on CTC (Connectionist Temporal Classification), encoder-decoder models based on attention, and end-to-end ASR based on RNN-T (Recurrent Neural Network Transducer). Although these three end-to-end models have shown excellent performance in the field of speech recognition, each has it...

Abstract

The invention provides a speech recognition method and device, equipment and a storage medium. The method comprises: acquiring encoding features obtained by an encoder encoding the acoustic features of the speech to be recognized, wherein the encoder is trained according to a first recognition result of a speech sample and the text label of the speech sample, and the first recognition result of the speech sample is obtained according to the encoding features produced by the encoder encoding the acoustic features of the speech sample; determining the attention coefficient of the recognition result of the speech sample on the encoding feature of each frame output by the encoder; and determining a speech recognition result of the speech to be recognized according to the encoding features of the speech to be recognized. This technical scheme can improve speech recognition accuracy.
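
The attention coefficient mentioned in the abstract can be pictured as a normalized weight over the encoder's per-frame outputs. The sketch below is an illustration under stated assumptions, not the patent's implementation: it assumes dot-product scoring between a query vector (standing in for a representation of the recognition result) and each frame's encoding feature, followed by a softmax. The names `attention_coefficients`, `query`, and `frames`, and the toy values, are hypothetical.

```python
import math
from typing import List, Sequence


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax: subtract the max score before exponentiating."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_coefficients(
    query: Sequence[float], frames: Sequence[Sequence[float]]
) -> List[float]:
    """One coefficient per encoder frame, using dot-product scoring (an assumption)."""
    scores = [sum(q * f for q, f in zip(query, frame)) for frame in frames]
    return softmax(scores)


# Toy data (assumed): 3 encoder frames with 2-dimensional encoding features.
frames = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [2.0, 0.0]

coeffs = attention_coefficients(query, frames)
print(coeffs)  # frames 0 and 2 score equally; frame 1 scores lowest
```

The coefficients are non-negative and sum to 1, so they can be read as how strongly each frame contributes to the current output.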

Description

Technical field

[0001] The present application relates to the technical field of speech recognition and, more specifically, proposes a speech recognition method, device, equipment and storage medium.

Background

[0002] Automatic Speech Recognition (ASR) is a technology that allows machines to convert speech signals into corresponding text or commands through recognition and understanding, that is, to allow machines to understand human speech.

[0003] At present, end-to-end speech recognition is the mainstream scheme, among which the attention-based end-to-end scheme has the best recognition performance. However, conventional attention-based end-to-end models have difficulty guaranteeing monotonic attention. Specifically, the model's attention is unconstrained and unordered, which makes it difficult to improve recognition accuracy, especially However, wh...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G10L15/02, G10L15/06, G10L15/26, G10L19/00
CPC: G10L15/02, G10L15/063, G10L15/26, G10L19/0018
Inventor: 刘丹, 韩凯, 魏思
Owner UNIV OF SCI & TECH OF CHINA