Deep bidirectional LSTM acoustic model based on Maxout neurons

Acoustic model and neuron technology, applied in speech analysis, speech recognition, instruments, and related fields. It addresses training algorithms that cannot be applied to DBLSTM directly and delays in speech recognition.

Active Publication Date: 2017-10-27

CHONGQING UNIV OF POSTS & TELECOMM

Cites: 8; Cited by: 33

Problems solved by technology

Because a DBLSTM depends on both directions at every time step, the training algorithm discussed earlier in the patent cannot be applied to DBLSTM training directly.

In LVCSR, DBLSTM is not suited to low-latency recognition, so it may delay the entire speech recognition process.

Embodiment Construction

[0054] The technical solutions in the embodiments of the present invention will be described clearly and in detail below with reference to the drawings in the embodiments of the present invention. The described embodiments are only some of the embodiments of the invention.

[0055] The technical solution by which the present invention solves the problems described above is as follows:

[0056] Figure 1 shows the structure of a single LSTM cell, which differs from that of a standard RNN. For a standard RNN, given an input sequence $x = (x_1, x_2, \ldots, x_T)$, the hidden state vector $h = (h_1, h_2, \ldots, h_T)$ and the output vector $y = (y_1, y_2, \ldots, y_T)$ are computed as

[0057] $h_t = H(W_{xh} x_t + W_{hh} h_{t-1} + b_h)$

[0058] $y_t = W_{hy} h_t + b_y$

[0059] where $W$ denotes the weight matrices between layers; $b_h$ and $b_y$ are the bias vectors of the hidden layer and the output layer, respectively; and $H$ is the activation function of the hidden layer. ...
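The recurrence in [0057] and [0058] can be traced by hand with a small sketch. The following is an illustration only, not code from the patent: the function names `matvec` and `rnn_forward`, the zero initial state, the choice of tanh for H, and the toy weights are all assumptions made for the example. Plain Python lists are used to keep it dependency-free.

```python
import math


def matvec(W, v):
    """Multiply matrix W (a list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def rnn_forward(xs, W_xh, W_hh, W_hy, b_h, b_y, H=math.tanh):
    """Apply h_t = H(W_xh x_t + W_hh h_{t-1} + b_h) and y_t = W_hy h_t + b_y."""
    h_prev = [0.0] * len(b_h)  # assumption: the initial state h_0 is zero
    hs, ys = [], []
    for x_t in xs:
        pre = [a + b + c for a, b, c in
               zip(matvec(W_xh, x_t), matvec(W_hh, h_prev), b_h)]
        h_t = [H(p) for p in pre]                               # hidden state
        y_t = [a + b for a, b in zip(matvec(W_hy, h_t), b_y)]   # output
        hs.append(h_t)
        ys.append(y_t)
        h_prev = h_t
    return hs, ys


# Toy setup (hypothetical values): 2 inputs, 2 hidden units, 1 output, T = 3.
xs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
W_xh = [[0.5, -0.5], [0.25, 0.75]]
W_hh = [[0.1, 0.0], [0.0, 0.1]]
W_hy = [[1.0, -1.0]]
b_h = [0.0, 0.0]
b_y = [0.5]

hs, ys = rnn_forward(xs, W_xh, W_hh, W_hy, b_h, b_y)
print(hs[0])  # first hidden state: [tanh(0.5), tanh(0.25)]
```

Each h_t depends only on the current input and the previous state, so a single forward pass sees past context only. A bidirectional network runs a second pass over the reversed sequence to supply future context as well.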

Abstract

The invention discloses an acoustic model based on a deep bidirectional long short-term memory (DBLSTM) recurrent neural network (RNN). The DBLSTM network is divided into three main parts. In the fully connected part, Maxout neurons replace the original sigmoid neurons to address the vanishing and exploding gradient problems common in RNNs, and Dropout regularization is used during training to prevent the network from overfitting. In the multi-layer BLSTM part, a context-sensitive-chunk back-propagation through time (CSC-BPTT) algorithm is provided for training the network, to accommodate the DBLSTM's bidirectional dependency at each time step. A selection connection layer placed after the multi-layer BLSTM part transforms the DBLSTM output into the input of the fully connected part. The acoustic model achieves a higher speech recognition rate.
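Two of the ideas named in the abstract can be illustrated with a short sketch. It is not the patent's implementation. The group size `k`, the chunk length, and the left and right context sizes are hypothetical parameters, because the text shown here does not specify them. The helper names `maxout` and `csc_chunks` are likewise invented for the example. A Maxout unit outputs the maximum over a group of linear pre-activations. Context-sensitive chunking, as assumed here, splits an utterance into fixed-length chunks and pads each with a few neighboring frames, so that both LSTM directions have some context at the chunk boundaries.

```python
def maxout(z, k):
    """Maxout over consecutive groups of k pre-activations."""
    if len(z) % k != 0:
        raise ValueError("len(z) must be a multiple of k")
    return [max(z[i:i + k]) for i in range(0, len(z), k)]


def csc_chunks(T, chunk_len, left, right):
    """Split T frames into chunks with appended context frames.

    Returns (ctx_start, main_start, main_end, ctx_end) index tuples;
    frames in [main_start, main_end) form the chunk itself, and the
    remaining frames in [ctx_start, ctx_end) serve as context only.
    """
    chunks = []
    for start in range(0, T, chunk_len):
        end = min(start + chunk_len, T)
        chunks.append((max(0, start - left), start, end, min(T, end + right)))
    return chunks


# Six pre-activations grouped in pairs give three Maxout outputs.
print(maxout([0.2, -1.0, 0.7, 0.1, 0.0, -0.3], 2))  # [0.2, 0.7, 0.0]

# A 10-frame utterance, chunks of 4 frames, 2 context frames per side.
print(csc_chunks(10, 4, 2, 2))  # [(0, 0, 4, 6), (2, 4, 8, 10), (6, 8, 10, 10)]
```

Unlike a sigmoid, a Maxout unit is piecewise linear and does not saturate, which is the usual argument for why it eases vanishing gradients. Chunking bounds the span over which each direction must be unrolled during training.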

Description

Technical field

[0001] The invention belongs to the field of artificial-intelligence speech recognition and mainly concerns the application of deep neural networks to acoustic models of speech.

Background

[0002] Research on and application of deep neural networks (DNNs) have greatly advanced automatic speech recognition (ASR) technology. In large vocabulary continuous speech recognition (LVCSR) systems, DNN-based acoustic models show clear advantages over the traditional Gaussian mixture model-hidden Markov model (GMM-HMM) acoustic model. Seide F et al. applied the DNN-HMM acoustic model to the Switchboard telephone transcription task and reduced the word error rate (WER) by 33%. Research on and extension of the DNN-HMM acoustic model have brought unprecedented progress to ASR technology.

[0003] Previous research has mainly focused on feed-forward neural networks for processing contextual acoustic featu...

Application Information

Patent Type & Authority: Applications (China)

IPC(8): G10L15/16, G10L15/14, G10L15/06

CPC: G10L15/063, G10L15/144, G10L15/16

Inventors: 罗元, 刘宇, 张毅

Owner: CHONGQING UNIV OF POSTS & TELECOMM
