Continuous voice recognition method based on deep long and short term memory recurrent neural network

A technique based on recurrent neural networks and long short-term memory, applied in speech recognition, speech analysis, instruments, and related fields. It addresses three problems: the acoustic environment interferes with the performance of continuous speech recognition systems, the noise resistance and robustness of the acoustic model need to be improved, and deep neural network methods have a large number of parameters.

Active Publication Date: 2015-04-22

TSINGHUA UNIV

5 Cites, 63 Cited by

AI Technical Summary

Problems solved by technology

[0006] However, the complexity of real acoustic environments still seriously degrades the performance of continuous speech recognition systems. Even the best deep neural network methods achieve a recognition rate of only about 70% on continuous speech recognition data sets recorded under complex conditions such as noise, music, spoken language, and repetition. The noise resistance and robustness of the acoustic model in continuous speech recognition systems therefore need to be improved.

In addition, deep neural network methods have a large number of parameters, and most of the computation must be carried out on GPU devices because it is too heavy for ordinary CPUs. This type of method therefore still falls short of the requirements for large-scale commercialization.

Embodiment Construction

[0034] The implementation of the present invention will be described in detail below in conjunction with the drawings and examples.

[0035] The present invention proposes a method and device for a robust deep long short-term memory neural network acoustic model, especially for continuous speech recognition scenarios. These methods and devices are not limited to continuous speech recognition, and may be any methods and devices related to speech recognition.

[0036] Step 1. Establish two deep long short-term memory recurrent neural network modules with the same structure, each including multiple long short-term memory layers and linear recurrent projection layers, and send the original clean speech signal and the noisy signal as inputs to the two modules, respectively.
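As a rough illustration of Step 1, the following is a minimal sketch in plain Python of two identically structured modules, each a stack of long short-term memory layers followed by a linear recurrent projection. It is not the patent's implementation: the class and function names (`LSTMPLayer`, `build_module`, `run_module`), the layer sizes, the random initialization, the omission of biases and peephole connections, and the choice to initialize both modules with the same seed are all assumptions made for the example.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def matvec(W, v):
    # Matrix-vector product on nested lists.
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


class LSTMPLayer:
    """One LSTM layer with a linear recurrent projection (hypothetical sizes, no biases)."""

    def __init__(self, n_in, n_cell, n_proj, rng):
        self.n_cell, self.n_proj = n_cell, n_proj
        # One weight matrix per gate, acting on [input ; previous projected output].
        self.W = {g: rand_matrix(n_cell, n_in + n_proj, rng) for g in "ifoc"}
        # Linear projection of the cell output; no nonlinearity is applied.
        self.W_proj = rand_matrix(n_proj, n_cell, rng)

    def forward(self, frames):
        c = [0.0] * self.n_cell   # cell state
        r = [0.0] * self.n_proj   # projected recurrent output
        outputs = []
        for x in frames:
            z = list(x) + r
            i = [sigmoid(a) for a in matvec(self.W["i"], z)]    # input gate
            f = [sigmoid(a) for a in matvec(self.W["f"], z)]    # forget gate
            o = [sigmoid(a) for a in matvec(self.W["o"], z)]    # output gate
            g = [math.tanh(a) for a in matvec(self.W["c"], z)]  # candidate cell input
            c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
            m = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
            r = matvec(self.W_proj, m)  # fed back at the next time step
            outputs.append(r)
        return outputs


def build_module(n_in, n_cell, n_proj, n_layers, seed):
    # Stack of layers; every layer after the first takes the projected size as input.
    rng = random.Random(seed)
    sizes = [n_in] + [n_proj] * n_layers
    return [LSTMPLayer(sizes[k], n_cell, n_proj, rng) for k in range(n_layers)]


def run_module(module, frames):
    # Returns the output sequence of every layer, so layers can be compared later.
    per_layer = []
    for layer in module:
        frames = layer.forward(frames)
        per_layer.append(frames)
    return per_layer


# Toy data: 5 frames of 4 features; the noisy copy adds Gaussian noise (assumption).
rng = random.Random(0)
clean = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(5)]
noisy = [[v + rng.gauss(0, 0.3) for v in frame] for frame in clean]

# Two modules with the same structure (same seed is an assumption of this sketch).
clean_module = build_module(n_in=4, n_cell=8, n_proj=3, n_layers=2, seed=1)
noisy_module = build_module(n_in=4, n_cell=8, n_proj=3, n_layers=2, seed=1)

clean_out = run_module(clean_module, clean)
noisy_out = run_module(noisy_module, noisy)
```

Keeping the per-layer outputs of both modules, rather than only the final layer, is a design choice of the sketch: it makes a layer-by-layer comparison between the clean and noisy modules possible.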

[0037] Figure 1 is a flow chart of the deep long short-term memory recurrent neural network module of the present invention, which includes the following:

[0038] Input 101 is the speech signal x = [x_1, ..., x_T] (T is the ti...

Abstract

The invention provides a continuous speech recognition method based on a deep long short-term memory recurrent neural network. In the method, a noisy speech signal and the original clean speech signal are used as training samples, and two deep long short-term memory recurrent neural network modules with the same structure are established. The difference between each long short-term memory layer of one module and the corresponding layer of the other module is obtained through a cross-entropy calculation, the cross-entropy parameter is updated through a linear recurrent projection layer, and a deep long short-term memory recurrent neural network acoustic model that is robust to environmental noise is finally obtained. By establishing this acoustic model, the method improves the recognition rate for noisy speech signals and avoids the problem that, because deep neural networks have a large number of parameters, most of the computation must be done on a GPU. The method has low computational complexity and a fast convergence rate. It can be widely applied in machine learning fields that involve speech recognition, such as speaker recognition, keyword recognition, and human-machine interaction.
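The layer-wise comparison described in the abstract can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say how layer outputs are turned into distributions, so the softmax normalization, the per-frame averaging, the function names, and the toy output values below are all hypothetical.

```python
import math


def softmax(v):
    # Normalize a layer output vector into a probability distribution (assumption).
    m = max(v)
    exps = [math.exp(x - m) for x in v]
    s = sum(exps)
    return [e / s for e in exps]


def cross_entropy(p, q, eps=1e-12):
    # H(p, q) = -sum_k p_k * log(q_k); eps guards against log(0).
    return -sum(pk * math.log(qk + eps) for pk, qk in zip(p, q))


def layer_cross_entropy(clean_frames, noisy_frames):
    """Mean per-frame cross entropy between the outputs of two corresponding layers."""
    total = 0.0
    for c, n in zip(clean_frames, noisy_frames):
        total += cross_entropy(softmax(c), softmax(n))
    return total / len(clean_frames)


# Hypothetical outputs of one layer in each module: 2 frames of 3 values.
clean_layer = [[0.2, -0.1, 0.4], [0.0, 0.3, -0.2]]
noisy_layer = [[0.1, 0.0, 0.5], [0.2, 0.1, -0.3]]

same = layer_cross_entropy(clean_layer, clean_layer)  # clean layer against itself
diff = layer_cross_entropy(clean_layer, noisy_layer)  # clean layer against noisy layer
```

Because cross entropy is smallest when the two distributions coincide, `diff` exceeds `same` here, and the gap grows as the noisy layer's output drifts away from the clean one. A training procedure could use such a value as the quantity to reduce, although the update rule itself is not specified in this excerpt.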

Description

Technical field

[0001] The invention belongs to the field of audio technology, and in particular relates to a continuous speech recognition method based on a deep long short-term memory recurrent neural network.

Background art

[0002] With the rapid development of information technology, speech recognition technology is now ready for large-scale commercialization. At present, speech recognition mainly uses continuous speech recognition technology based on statistical models, whose main goal is to find the most probable word sequence for a given speech sequence. A continuous speech recognition system usually includes an acoustic model, a language model, and a decoding method. As the core technology of continuous speech recognition, acoustic modeling has developed rapidly in recent years. The commonly used acoustic model is the Gaussian Mixture Model-Hidden Markov Model (GMM-HMM). The principle is: train the Gaussian Mixture Model to obtai...

Application Information

Patent Type & Authority: Applications (China)

IPC(8): G10L15/16

CPC: G10L15/16

Inventors: 杨毅 (Yang Yi), 孙甲松 (Sun Jiasong)

Owner: TSINGHUA UNIV
