Voice identification method using long-short term memory model recurrent neural network

A recurrent neural network and long short-term memory technology, applied in speech recognition, speech analysis, instrumentation, etc., which can solve problems such as inconsistency between training objectives

Inactive Publication Date: 2017-01-11
SHENZHEN WEITESHI TECH


Problems solved by technology

[0004] To address the problem of network performance in speech recognition, and the fact that the training objective of each component of a traditional speech recognition system is inconsistent with the training objective of the system as a whole, the object of the present invention is to provide a speech recognition method using a long short-term memory recurrent neural network. Model parameters can be obtained through training and then used to recognize speech and text data.


Image

Smart Image Click on the blue labels to locate them in the text.
Viewing Examples
Smart Image
  • Voice identification method using long-short term memory model recurrent neural network
  • Voice identification method using long-short term memory model recurrent neural network
  • Voice identification method using long-short term memory model recurrent neural network


Embodiment Construction

[0058] It should be noted that, in the case of no conflict, the embodiments in the present application and the features in the embodiments can be combined with each other. The present invention will be further described in detail below in conjunction with the drawings and specific embodiments.

[0059] Figure 1 is a flowchart of the training process of the present invention, covering the speech data and text data, the acoustic model and language model, the RNN transducer, decoding, and the model parameters.

[0060] The speech data and text data are the material on which training is performed.

[0061] In the acoustic model and language model step, the speech data and text data are processed using an acoustic model and a language model.

[0062] The RNN transducer predicts the correspondence between each phoneme and the previous phoneme, thereby producing a jointly trained acoustic and language model. The RNN transducer determines a separate distribution Pr(k | t, u) for a target sequence z of length U, t...
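The separate distribution Pr(k | t, u) of an RNN transducer can be sketched as follows. This is a minimal illustration, not the patent's implementation: it assumes an additive joint of a transcription-network output f_t and a prediction-network output g_u, and all dimensions and weights here are hypothetical random values.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

# Hypothetical dimensions: T encoder frames, U target labels,
# K output symbols (including blank), H hidden units.
T, U, K, H = 4, 3, 5, 8
rng = np.random.default_rng(0)
f = rng.normal(size=(T, H))      # transcription (encoder) outputs f_t
g = rng.normal(size=(U + 1, H))  # prediction network outputs g_u (u=0 is start)
W = rng.normal(size=(H, K))      # output projection

def pr_k_given_tu(t, u):
    """Separate distribution Pr(k | t, u): combine f_t and g_u, then softmax."""
    h = np.tanh(f[t] + g[u])     # simple additive joint network (assumption)
    return softmax(h @ W)

p = pr_k_given_tu(1, 2)
assert np.isclose(p.sum(), 1.0)  # a valid distribution over the K symbols
```

Each (t, u) pair thus gets its own distribution over output symbols, which is what lets the acoustic side (f) and the language side (g) be trained jointly.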



Abstract

The invention discloses a speech recognition method using a long short-term memory (LSTM) recurrent neural network. The method comprises training and recognition. The training process comprises introducing speech data and text data to generate jointly trained acoustic and language models, and using an RNN transducer to perform decoding and obtain the model parameters. The recognition process comprises converting the speech input into a spectrogram through a Fourier transform, performing beam-search decoding with the LSTM recurrent neural network, and finally producing a recognition result. The method adopts recurrent neural networks (RNNs) and trains them end-to-end using connectionist temporal classification (CTC). LSTM units perform well here and, combined with multi-level representations, prove effective in deep networks. From the speech features (the input end) to the character string (the output end) there is only one neural network model (an end-to-end model), and the network can be trained directly with an objective function that serves as a proxy for the word error rate (WER), which avoids wasted effort optimizing separate per-component objective functions.
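The first recognition step described above, converting speech input into a spectrogram via a Fourier transform, can be sketched with a short-time Fourier transform. This is a minimal NumPy illustration; the frame length, hop size, window choice, and the synthetic test tone are assumptions, not values from the patent.

```python
import numpy as np

def spectrogram(signal, frame_len=256, hop=128):
    """Log-magnitude spectrogram: window frames, FFT each, take log magnitude."""
    window = np.hanning(frame_len)
    frames = np.stack([signal[i:i + frame_len] * window
                       for i in range(0, len(signal) - frame_len + 1, hop)])
    spec = np.abs(np.fft.rfft(frames, axis=1))  # magnitude spectrum per frame
    return np.log(spec + 1e-8)                  # log compression

# 1 s of a 440 Hz tone at an assumed 8 kHz sampling rate (synthetic example)
t = np.arange(8000) / 8000.0
sig = np.sin(2 * np.pi * 440 * t)
S = spectrogram(sig)  # shape: (num_frames, frame_len // 2 + 1)
```

The resulting time-frequency matrix S is the kind of input-feature frame sequence the recurrent network then consumes frame by frame.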

Description

technical field

[0001] The invention relates to the field of speech recognition, and in particular to a speech recognition method using a long short-term memory recurrent neural network.

Background technique

[0002] Speech recognition is widely used in smart devices, smart homes, and other fields, but so far its error rate remains relatively high and the results are not ideal. Recurrent neural network (RNN) models can be used to model the relationship between two sequences. In traditional RNNs, however, there is a one-to-one correspondence between the label sequence and the input sequence, which is unsuitable for sequence modeling in speech recognition: the recognized character or phoneme sequence is much shorter than the sequence of input feature frames. Such sequences therefore cannot be modeled directly with an RNN.

[0003] The present invention adopts recurrent neural networks (RNNs) and uses connectionist temporal classification (CTC) to train the RNNs through an ...
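The length mismatch described in [0002] is exactly what CTC addresses: it defines a many-to-one map from frame-level alignments (which include a blank symbol) onto shorter label sequences, by first merging repeated labels and then removing blanks. A minimal sketch of that collapse rule (the function name and example alignment are illustrative, not from the patent):

```python
def ctc_collapse(alignment, blank=0):
    """CTC's many-to-one map: merge consecutive repeats, then drop blanks."""
    out = []
    prev = None
    for k in alignment:
        if k != prev and k != blank:
            out.append(k)
        prev = k  # remember previous symbol to merge repeats
    return out

# A frame-level path of length 8 maps to a label sequence of length 3,
# which is how CTC reconciles long input frame sequences with short outputs.
assert ctc_collapse([0, 1, 1, 0, 2, 2, 0, 3]) == [1, 2, 3]
```

Training with CTC sums the probabilities of all alignments that collapse to the target sequence, so no frame-level labels are needed.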

Claims


Application Information

Patent Type & Authority: Application (China)
IPC(8): G10L15/06, G10L15/08, G10L15/16
CPC: G10L15/063, G10L15/08, G10L15/16, G10L2015/0631
Inventor: 夏春秋
Owner: SHENZHEN WEITESHI TECH