Voice identification method using long-short term memory model recurrent neural network

A recurrent neural network and long short-term memory technology, applied in speech recognition, speech analysis, instrumentation, etc., which can solve problems such as inconsistency between training objectives

Inactive Publication Date: 2017-01-11
SHENZHEN WEITESHI TECH


Problems solved by technology

[0004] To address the problem of network performance in speech recognition, and the fact that the training objective of each component of a traditional speech recognition system is inconsistent with the training objective of the system as a whole, the object of the present invention is to provide a speech recognition method using a long short-term memory recurrent neural network. Model parameters can be obtained through training and then used to recognize speech and text data.


Image

Smart Image Click on the blue labels to locate them in the text.
Viewing Examples
Smart Image
  • Voice identification method using long-short term memory model recurrent neural network
  • Voice identification method using long-short term memory model recurrent neural network
  • Voice identification method using long-short term memory model recurrent neural network


Embodiment Construction

[0058] It should be noted that, in the case of no conflict, the embodiments in the present application and the features in the embodiments can be combined with each other. The present invention will be further described in detail below in conjunction with the drawings and specific embodiments.

[0059] Figure 1 is a flowchart of the training process of the present invention, covering the speech data and text data, the acoustic model and language model, the RNN transducer, decoding, and the model parameters.

[0060] The speech data and text data are the material on which training is performed.

[0061] In the acoustic model and language model step, the speech data and text data are processed using an acoustic model and a language model.

[0062] The RNN transducer predicts the correspondence between each phoneme and the previous phoneme, thereby producing a jointly trained acoustic and language model. The RNN transducer determines a separate distribution Pr(k | t, u) for a target sequence z of length U, t...
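The separate distribution Pr(k | t, u) of an RNN transducer can be sketched as follows. This is a minimal illustration, not the patent's implementation: it assumes an additive joint of a transcription-network output f_t and a prediction-network output g_u, and all dimensions and weights here are hypothetical random values.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

# Hypothetical dimensions: T encoder frames, U target labels,
# K output symbols (including blank), H hidden units.
T, U, K, H = 4, 3, 5, 8
rng = np.random.default_rng(0)
f = rng.normal(size=(T, H))      # transcription (encoder) outputs f_t
g = rng.normal(size=(U + 1, H))  # prediction network outputs g_u (u=0 is start)
W = rng.normal(size=(H, K))      # output projection

def pr_k_given_tu(t, u):
    """Separate distribution Pr(k | t, u): combine f_t and g_u, then softmax."""
    h = np.tanh(f[t] + g[u])     # simple additive joint network (assumption)
    return softmax(h @ W)

p = pr_k_given_tu(1, 2)
assert np.isclose(p.sum(), 1.0)  # a valid distribution over the K symbols
```

Each (t, u) pair thus gets its own distribution over output symbols, which is what lets the acoustic side (f) and the language side (g) be trained jointly.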



Abstract

The invention discloses a speech recognition method using a long short-term memory (LSTM) recurrent neural network. The method comprises training and recognition. The training process comprises introducing speech data and text data to generate jointly trained acoustic and language models, and using an RNN transducer to perform decoding and obtain the model parameters. The recognition process comprises converting the speech input into a spectrogram through a Fourier transform, performing beam-search decoding with the LSTM recurrent neural network, and finally producing a recognition result. The method adopts recurrent neural networks (RNNs) and trains them end-to-end using connectionist temporal classification (CTC). LSTM units perform well here and, combined with multi-level representations, prove effective in deep networks. From the speech features (the input end) to the character string (the output end) there is only one neural network model (an end-to-end model), and the network can be trained directly with an objective function that serves as a proxy for the word error rate (WER), which avoids wasted effort optimizing separate per-component objective functions.
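The first recognition step described above, converting speech input into a spectrogram via a Fourier transform, can be sketched with a short-time Fourier transform. This is a minimal NumPy illustration; the frame length, hop size, window choice, and the synthetic test tone are assumptions, not values from the patent.

```python
import numpy as np

def spectrogram(signal, frame_len=256, hop=128):
    """Log-magnitude spectrogram: window frames, FFT each, take log magnitude."""
    window = np.hanning(frame_len)
    frames = np.stack([signal[i:i + frame_len] * window
                       for i in range(0, len(signal) - frame_len + 1, hop)])
    spec = np.abs(np.fft.rfft(frames, axis=1))  # magnitude spectrum per frame
    return np.log(spec + 1e-8)                  # log compression

# 1 s of a 440 Hz tone at an assumed 8 kHz sampling rate (synthetic example)
t = np.arange(8000) / 8000.0
sig = np.sin(2 * np.pi * 440 * t)
S = spectrogram(sig)  # shape: (num_frames, frame_len // 2 + 1)
```

The resulting time-frequency matrix S is the kind of input-feature frame sequence the recurrent network then consumes frame by frame.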

Description

technical field

[0001] The invention relates to the field of speech recognition, and in particular to a speech recognition method using a long short-term memory recurrent neural network.

Background technique

[0002] Speech recognition is widely used in smart devices, smart homes, and other fields, but so far its error rate remains relatively high and the results are not ideal. Recurrent neural network (RNN) models can be used to model the relationship between two sequences. In traditional RNNs, however, there is a one-to-one correspondence between the label sequence and the input sequence, which is unsuitable for sequence modeling in speech recognition: the recognized character or phoneme sequence is much shorter than the sequence of input feature frames. Such sequences therefore cannot be modeled directly with an RNN.

[0003] The present invention adopts recurrent neural networks (RNNs) and uses connectionist temporal classification (CTC) to train the RNNs through an ...
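The length mismatch described in [0002] is exactly what CTC addresses: it defines a many-to-one map from frame-level alignments (which include a blank symbol) onto shorter label sequences, by first merging repeated labels and then removing blanks. A minimal sketch of that collapse rule (the function name and example alignment are illustrative, not from the patent):

```python
def ctc_collapse(alignment, blank=0):
    """CTC's many-to-one map: merge consecutive repeats, then drop blanks."""
    out = []
    prev = None
    for k in alignment:
        if k != prev and k != blank:
            out.append(k)
        prev = k  # remember previous symbol to merge repeats
    return out

# A frame-level path of length 8 maps to a label sequence of length 3,
# which is how CTC reconciles long input frame sequences with short outputs.
assert ctc_collapse([0, 1, 1, 0, 2, 2, 0, 3]) == [1, 2, 3]
```

Training with CTC sums the probabilities of all alignments that collapse to the target sequence, so no frame-level labels are needed.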

Claims


Application Information

Patent Type & Authority: Application (China)
IPC(8): G10L15/06, G10L15/08, G10L15/16
CPC: G10L15/063, G10L15/08, G10L15/16, G10L2015/0631
Inventor: 夏春秋
Owner: SHENZHEN WEITESHI TECH