
Continuous voice recognition method based on deep long and short term memory recurrent neural network

A technology combining recurrent neural networks and long short-term memory, applied in speech recognition, speech analysis, instruments, etc. It addresses problems such as complex acoustic environments interfering with continuous speech recognition performance, the limited noise resistance and robustness of acoustic models, and the large parameter scale of deep neural network methods.

Active Publication Date: 2015-04-22
TSINGHUA UNIV

AI Technical Summary

Problems solved by technology

[0006] However, the complexity of the actual acoustic environment still seriously affects and interferes with the performance of continuous speech recognition systems. Even with the best deep neural network methods, continuous speech recognition on data sets recorded under complex conditions, including noise, music, spontaneous speech, and repetition, achieves only about a 70% recognition rate, so the noise resistance and robustness of the acoustic model in a continuous speech recognition system still need to be improved.
In addition, deep neural network methods have a large number of parameters, and most of the computation must be performed on GPU devices, which is difficult for ordinary CPUs. Therefore, this type of method still falls short of the requirements of large-scale commercialization.




Detailed Description of the Embodiments

[0034] The implementation of the present invention will be described in detail below in conjunction with the drawings and examples.

[0035] The present invention proposes a method and device for a robust deep long short-term memory recurrent neural network acoustic model, especially for continuous speech recognition scenarios. The method and device are not limited to continuous speech recognition and may be applied to any method or device related to speech recognition.

[0036] Step 1. Establish two deep long short-term memory recurrent neural network modules with the same structure, each containing multiple long short-term memory layers and linear recurrent projection layers, and feed the original clean speech signal and the noisy speech signal as inputs to the two modules respectively, as sketched in the code below.
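A minimal sketch of this two-module setup, assuming a PyTorch-style implementation in which each LSTM layer uses the built-in proj_size option as its linear recurrent projection; the layer count, hidden size, projection size, and feature dimension below are illustrative placeholders, not values taken from the patent.

```python
import torch
import torch.nn as nn

class DeepLSTMPModule(nn.Module):
    """Deep LSTM module: each layer is an LSTM with a linear recurrent projection."""
    def __init__(self, feat_dim=40, hidden_size=1024, proj_size=512, num_layers=3):
        super().__init__()
        layers = []
        in_dim = feat_dim
        for _ in range(num_layers):
            # proj_size > 0 attaches a linear recurrent projection to the LSTM layer.
            layers.append(nn.LSTM(in_dim, hidden_size, proj_size=proj_size,
                                  batch_first=True))
            in_dim = proj_size
        self.layers = nn.ModuleList(layers)

    def forward(self, x):
        # x: (batch, T, feat_dim); return every layer's output so that
        # corresponding layers of the clean and noisy modules can be compared.
        layer_outputs = []
        for lstm in self.layers:
            x, _ = lstm(x)
            layer_outputs.append(x)          # (batch, T, proj_size)
        return layer_outputs

# Two modules with the same structure: one receives clean speech features,
# the other receives noisy speech features.
clean_module = DeepLSTMPModule()
noisy_module = DeepLSTMPModule()

clean_feats = torch.randn(8, 200, 40)   # placeholder clean-speech feature frames
noisy_feats = torch.randn(8, 200, 40)   # placeholder noisy-speech feature frames
clean_layers = clean_module(clean_feats)
noisy_layers = noisy_module(noisy_feats)
```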

[0037] Figure 1 is a flow chart of the deep long short-term memory recurrent neural network module of the present invention, which includes the following:

[0038] Input 101 is the speech signal x = [x_1, ..., x_T] (T is the ti...



Abstract

The invention provides a continuous speech recognition method based on a deep long short-term memory recurrent neural network. In this method, a noisy speech signal and the original clean speech signal are used as training samples, and two deep long short-term memory recurrent neural network modules with the same structure are established. The difference between each long short-term memory layer of one module and the corresponding layer of the other module is obtained through a cross-entropy calculation, the cross-entropy parameter is updated through a linear recurrent projection layer, and a deep long short-term memory recurrent neural network acoustic model that is robust to environmental noise is finally obtained. By establishing this acoustic model, the method improves the recognition rate for noisy speech signals and avoids the problem that, because deep neural networks have a large number of parameters, most of the computation must be completed on a GPU; the method therefore has low computational complexity and a fast convergence rate. It can be widely applied to machine learning fields involving speech recognition, such as speaker recognition, keyword recognition, and human-machine interaction.
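The following is a hedged sketch of the layer-wise cross-entropy comparison described in the abstract, reusing the per-layer outputs of the two modules from the earlier sketch. It assumes each layer's output is turned into a distribution with a softmax before the cross entropy is computed, and that the resulting penalty is added to the ordinary training loss; the exact formulation in the patent may differ.

```python
import torch
import torch.nn.functional as F

def layerwise_cross_entropy(clean_layer_outputs, noisy_layer_outputs):
    """Sum of cross entropies between corresponding layers of the two modules."""
    total = 0.0
    for clean_out, noisy_out in zip(clean_layer_outputs, noisy_layer_outputs):
        p_clean = F.softmax(clean_out, dim=-1)           # clean layer as the reference distribution
        log_p_noisy = F.log_softmax(noisy_out, dim=-1)   # noisy layer to be pulled toward it
        # Cross entropy H(p_clean, p_noisy), averaged over frames and batch items.
        total = total - (p_clean * log_p_noisy).sum(dim=-1).mean()
    return total

# Usage idea: add the penalty to the ordinary acoustic-model loss so the noisy
# branch learns representations close to those produced from clean speech.
# loss = acoustic_loss + layerwise_cross_entropy(clean_layers, noisy_layers)
```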

Description

Technical Field

[0001] The invention belongs to the field of audio technology, and in particular relates to a continuous speech recognition method based on a deep long short-term memory recurrent neural network.

Background Technique

[0002] With the rapid development of information technology, speech recognition technology now has the conditions for large-scale commercialization. At present, speech recognition mainly uses continuous speech recognition technology based on statistical models, and its main goal is to find the word sequence with the highest probability for a given speech sequence. A continuous speech recognition system usually includes an acoustic model, a language model, and a decoding method. As the core technology of continuous speech recognition, acoustic modeling methods have developed rapidly in recent years. The commonly used acoustic model is the Gaussian Mixture Model-Hidden Markov Model (GMM-HMM). The principle is: train the Gaussian Mixture Model to obtai...
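For reference, the statistical formulation mentioned above is usually written as the following decoding rule (standard notation, not reproduced verbatim from the patent), where the acoustic model supplies P(X|W) and the language model supplies P(W) for a speech sequence X and candidate word sequence W:

```latex
W^{*} = \arg\max_{W} P(W \mid X)
      = \arg\max_{W} \frac{P(X \mid W)\,P(W)}{P(X)}
      = \arg\max_{W} P(X \mid W)\,P(W)
```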


Application Information

Patent Type & Authority: Application (China)
IPC(8): G10L15/16
CPC: G10L15/16
Inventors: 杨毅, 孙甲松
Owner: TSINGHUA UNIV