
Speech recognition method based on simplified and improved Transformer model

A speech recognition and modeling technology, applied in speech recognition, speech analysis, instruments, and related fields. It addresses problems such as high complexity, difficulty in training speech recognition models, and models that cannot run on the target device, with the effects of reducing equipment requirements, shortening training time, and improving computing speed and convergence rate.

Pending Publication Date: 2021-12-21
WUHAN XINCHANG TECH CO LTD

AI Technical Summary

Problems solved by technology

[0006] A Transformer model with a deep network structure is more likely to fall into a local optimum, which makes a Transformer-based end-to-end speech recognition model harder to train.
In addition, the self-attention mechanism in the Transformer has high computational complexity when it computes across frames in parallel.
As a result, a deep Transformer has a large number of parameters and requires substantial computing power, so the model may not be able to run when deployed on edge devices and embedded devices.
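
To make the complexity concern concrete, here is a minimal sketch of single-head scaled dot-product self-attention in plain Python. It is not the patent's implementation: the function name `self_attention`, the toy input, and the use of the input itself as query, key and value (no learned projections) are assumptions made for illustration. The point is that the score matrix holds one entry per pair of frames, so its size grows quadratically with the number of frames T.

```python
import math


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(frames):
    """Single-head self-attention with Q = K = V = frames.

    Assumption: no learned projection matrices, to keep the sketch short.
    frames is a list of T feature vectors, each of dimension d.
    """
    t = len(frames)
    d = len(frames[0])
    scale = 1.0 / math.sqrt(d)
    # T x T score matrix: every frame is compared with every other frame.
    scores = [
        [scale * sum(q[k] * kv[k] for k in range(d)) for kv in frames]
        for q in frames
    ]
    weights = [softmax(row) for row in scores]
    # Each output frame is a weighted sum of all input frames.
    output = [
        [sum(weights[i][j] * frames[j][k] for j in range(t)) for k in range(d)]
        for i in range(t)
    ]
    return weights, output


# Toy input: T = 3 frames of dimension d = 2 (hypothetical values).
frames = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights, output = self_attention(frames)
print(len(weights), "x", len(weights[0]), "attention weights")
```

Doubling the number of frames quadruples the number of attention weights, which is one reason long utterances are costly for a deep stack of such layers.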



Examples


Embodiment Construction

[0037] To explain the present invention more fully, it is described in detail below with reference to the drawings and specific embodiments. The following implementations are used only to illustrate the technical solution of the present invention more clearly, not to limit its scope of protection.

[0038] To solve the above technical problem, the present invention adopts the following technical scheme: a speech recognition method based on a simplified and improved Transformer model, used to recognize speech features, comprising the following steps:

[0039] Step A: Preprocess the speech signal and extract a 40-dimensional MFCC Fbank (Mel-frequency cepstral coefficient filter bank) feature;

[0040] Step B: Convolve the extracted 40-dimensional Fbank (filter bank) feature with a CNN (convolutional neural network);

[0041] Step C: Input the features after conv...
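
As an illustration of Step A, the following sketch builds 40 triangular Mel filters and applies them to one frame's power spectrum to obtain 40 log filter-bank energies. The source does not give the preprocessing parameters, so the 16 kHz sample rate, the 512-point FFT, and the names `mel_filterbank` and `log_fbank` are assumptions. The sketch covers only the filter-bank stage: framing, windowing and the FFT are taken as already done, and no DCT is applied (a DCT over these energies would yield cepstral coefficients).

```python
import math


def hz_to_mel(hz):
    """Convert frequency in Hz to the Mel scale (common 2595*log10 form)."""
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_filterbank(n_filters=40, n_fft=512, sample_rate=16000):
    """Return n_filters triangular filters over n_fft // 2 + 1 FFT bins.

    Assumed defaults: 16 kHz audio and a 512-point FFT.
    """
    n_bins = n_fft // 2 + 1
    low_mel = hz_to_mel(0.0)
    high_mel = hz_to_mel(sample_rate / 2.0)
    # n_filters + 2 edge points, equally spaced on the Mel scale.
    step = (high_mel - low_mel) / (n_filters + 1)
    mel_points = [low_mel + i * step for i in range(n_filters + 2)]
    bin_points = [
        int(math.floor((n_fft + 1) * mel_to_hz(m) / sample_rate))
        for m in mel_points
    ]
    filters = []
    for m in range(1, n_filters + 1):
        left, center, right = bin_points[m - 1], bin_points[m], bin_points[m + 1]
        row = [0.0] * n_bins
        for k in range(left, center):      # rising edge of the triangle
            row[k] = (k - left) / (center - left)
        for k in range(center, right):     # falling edge, peak at center
            row[k] = (right - k) / (right - center)
        filters.append(row)
    return filters


def log_fbank(power_spectrum, filters, floor=1e-10):
    """Log filter-bank energies for one frame's power spectrum."""
    return [
        math.log(max(sum(w * p for w, p in zip(row, power_spectrum)), floor))
        for row in filters
    ]


filters = mel_filterbank()
# Hypothetical flat power spectrum standing in for a real FFT frame.
frame_power = [1.0] * (512 // 2 + 1)
features = log_fbank(frame_power, filters)
print(len(features))
```

Stacking one such 40-dimensional vector per frame gives the time-by-frequency feature map that a convolutional front end, as in Step B, would take as input.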



Abstract

The invention relates to a speech recognition method based on a simplified and improved Transformer model, and belongs to the technical field of speech recognition. The method comprises the following steps: A, preprocessing a speech signal and extracting multi-dimensional MFCC Fbank features; B, convolving the extracted Fbank features with a CNN (convolutional neural network); C, inputting the convolved features into the improved Transformer network structure; D, using the CTC loss as the loss function of the acoustic model, performing prediction with a beam search algorithm, and optimizing with an Adam optimizer; E, training the model, iteratively optimizing it to an optimal model structure; and F, performing model verification. The method greatly reduces the number of model parameters while still lowering the word error rate, improves the computation speed and convergence speed relative to the original model, shortens the training time, and effectively improves the efficiency of speech recognition.
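
Step D pairs a CTC-trained acoustic model with beam search at prediction time. The abstract does not say which beam search variant is used, so the following is only a sketch of one common choice, CTC prefix beam search without a language model. The function names, the two-symbol alphabet and the toy probabilities are assumptions. A greedy decoder is included for comparison, because the example is chosen so that the two disagree.

```python
from collections import defaultdict


def ctc_greedy(probs, alphabet, blank=0):
    """Pick the best symbol per frame, collapse repeats, drop blanks."""
    out, prev = [], None
    for frame in probs:
        s = max(range(len(frame)), key=frame.__getitem__)
        if s != blank and s != prev:
            out.append(s)
        prev = s
    return "".join(alphabet[s] for s in out)


def ctc_prefix_beam_search(probs, alphabet, beam_width=3, blank=0):
    """CTC prefix beam search without a language model (assumed variant).

    probs: list of T frames, each a list of per-symbol probabilities.
    Each prefix tracks two masses: paths ending in blank, and paths
    ending in a non-blank symbol.
    """
    beams = {(): (1.0, 0.0)}
    for frame in probs:
        next_beams = defaultdict(lambda: [0.0, 0.0])
        for prefix, (p_b, p_nb) in beams.items():
            for s, p in enumerate(frame):
                if s == blank:
                    # A blank keeps the prefix unchanged.
                    next_beams[prefix][0] += (p_b + p_nb) * p
                elif prefix and prefix[-1] == s:
                    # Repeated symbol: merges into the same prefix unless
                    # a blank separated the two occurrences.
                    next_beams[prefix][1] += p_nb * p
                    next_beams[prefix + (s,)][1] += p_b * p
                else:
                    next_beams[prefix + (s,)][1] += (p_b + p_nb) * p
        ranked = sorted(
            next_beams.items(), key=lambda kv: kv[1][0] + kv[1][1], reverse=True
        )
        beams = {prefix: (pb, pnb) for prefix, (pb, pnb) in ranked[:beam_width]}
    best = max(beams.items(), key=lambda kv: kv[1][0] + kv[1][1])[0]
    return "".join(alphabet[s] for s in best)


# Toy example: index 0 is the CTC blank, index 1 is the symbol "a".
alphabet = ["-", "a"]
probs = [[0.6, 0.4], [0.6, 0.4]]
greedy_text = ctc_greedy(probs, alphabet)
beam_text = ctc_prefix_beam_search(probs, alphabet)
print(repr(greedy_text), repr(beam_text))
```

In this toy case the single most likely frame-level path is blank, blank (probability 0.36), so greedy decoding returns the empty string. The three paths that collapse to "a" sum to 0.64, which the prefix search accounts for by merging them.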

Description

Technical field

[0001] The invention relates to a speech recognition method based on a simplified and improved Transformer model, and belongs to the technical field of speech recognition.

Background

[0002] The Transformer, as a new deep learning framework, has attracted the attention of more and more researchers and has become a current research hotspot. The self-attention mechanism in the Transformer model is inspired by the way humans focus only on important things: it learns only the important information in the input sequence. For the speech recognition task, the focus is on transcribing the input speech sequence into the corresponding text. Previously, a speech recognition system was composed of an acoustic model, a pronunciation dictionary and a language model, whereas a Transformer can integrate the acoustic, pronunciation and language models into a single neural network to form an end-to-end speech recognition sys...


Application Information

IPC(8): G10L25/24, G10L25/30, G10L15/06
CPC: G10L25/24, G10L25/30, G10L15/063
Inventor: 李玮, 刘鑫, 吕锋, 罗幼喜
Owner: WUHAN XINCHANG TECH CO LTD