Speech recognition method based on model pre-training and bidirectional LSTM

A speech recognition and pre-training technology, applied in speech recognition, speech analysis, instruments, etc., that addresses the poor noise robustness of neural networks.

Active Publication Date: 2018-10-19

BEIJING INSTITUTE OF TECHNOLOGY

7 Cites, 24 Cited by

AI Technical Summary

Problems solved by technology

[0013] The purpose of the present invention is to solve the problem of the poor noise robustness of neural networks under high-noise conditions, and to propose a speech recognition method based on model pre-training and bidirectional LSTM.

Examples

Embodiment 1

[0078] This embodiment describes the speech recognition method based on pre-training and bidirectional LSTM according to the present invention.

[0079] Step A: Input the speech signal to be processed;

[0080] Specifically, in this embodiment, MATLAB is used to superimpose noise signals on clean speech at SNRs of 9:1 and 7:3; each input speech signal to be processed is a file in '.wav' format;
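The mixing in [0080] can be sketched as follows. This is only an illustration in Python rather than the MATLAB used in the embodiment, and it assumes that a ratio such as 9:1 means speech power to noise power; the excerpt does not say whether the ratio refers to power or amplitude. The function name `mix_at_power_ratio` and its parameters are hypothetical.

```python
import math


def mix_at_power_ratio(speech, noise, speech_part, noise_part):
    """Add noise to speech so that speech power : noise power equals
    speech_part : noise_part (assumed interpretation of the ratio)."""
    if len(noise) < len(speech):
        raise ValueError("noise must be at least as long as speech")
    noise = noise[:len(speech)]

    # Mean power of each signal.
    p_speech = sum(s * s for s in speech) / len(speech)
    p_noise = sum(n * n for n in noise) / len(noise)
    if p_noise == 0:
        raise ValueError("noise has zero power")

    # Gain that brings the noise to the target power.
    target_noise_power = p_speech * noise_part / speech_part
    gain = math.sqrt(target_noise_power / p_noise)
    return [s + gain * n for s, n in zip(speech, noise)]


# Toy samples standing in for the contents of two '.wav' files.
clean = [1.0, -1.0, 1.0, -1.0]
babble = [0.5, 0.5, -0.5, -0.5]
noisy_9_1 = mix_at_power_ratio(clean, babble, 9, 1)
noisy_7_3 = mix_at_power_ratio(clean, babble, 7, 3)
```

Under this interpretation, 9:1 corresponds to a milder noise level than 7:3, so the two mixtures give a lighter and a heavier noise condition.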

[0081] Step B: speech signal preprocessing;

[0082] In this embodiment, the speech signal input in step A is passed through a high-pass filter with a filter coefficient of 0.96;
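A minimal sketch of such a filter, assuming it is the first-order pre-emphasis filter y[n] = x[n] - a·x[n-1] commonly used in speech front ends, with a = 0.96. The excerpt gives only the coefficient, so the filter form and the function name `pre_emphasis` are assumptions.

```python
def pre_emphasis(signal, coeff=0.96):
    """First-order high-pass (pre-emphasis) filter:
    y[0] = x[0], y[n] = x[n] - coeff * x[n-1] for n >= 1."""
    if not signal:
        return []
    out = [signal[0]]
    for n in range(1, len(signal)):
        out.append(signal[n] - coeff * signal[n - 1])
    return out


# A constant (low-frequency) input is strongly attenuated after the
# first sample, which is the high-pass behaviour described in [0082].
filtered = pre_emphasis([1.0, 1.0, 1.0, 1.0])
```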

[0083] The speech signal processed by the high-pass filter is divided into frames with a frame length of 25 ms and a frame shift of 12.5 ms, converting the speech signal to be processed that was input in step A into a short-time speech signal T(n) in units of frames;
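The framing in [0083] can be sketched as below. The 16 kHz sample rate in the usage lines and the choice to drop a trailing partial frame are assumptions, since the excerpt states neither; `frame_signal` is a hypothetical name.

```python
def frame_signal(signal, sample_rate, frame_ms=25.0, shift_ms=12.5):
    """Split a signal into overlapping frames of frame_ms milliseconds,
    advancing by shift_ms milliseconds between frame starts."""
    frame_len = int(round(sample_rate * frame_ms / 1000.0))
    shift = int(round(sample_rate * shift_ms / 1000.0))
    if frame_len < 1 or shift < 1:
        raise ValueError("frame length and shift must be at least one sample")

    frames = []
    start = 0
    # Assumption: a trailing partial frame is dropped rather than padded.
    while start + frame_len <= len(signal):
        frames.append(signal[start:start + frame_len])
        start += shift
    return frames


# At an assumed 16 kHz, 25 ms is 400 samples and 12.5 ms is 200 samples,
# so consecutive frames overlap by half.
frames = frame_signal(list(range(1000)), sample_rate=16000)
```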

[0084] Each short-time speech frame is multiplied by a Hamming window function with a coefficient of 0.46 to obtain the frame si...
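A sketch of the windowing step, assuming the standard Hamming window w(n) = (1 - a) - a·cos(2πn / (N - 1)) with a = 0.46, which matches the value quoted in [0084]. The names `hamming_window` and `apply_window` are hypothetical, and the truncated remainder of the paragraph is not reconstructed here.

```python
import math


def hamming_window(length, alpha=0.46):
    """Return Hamming window coefficients for a frame of `length` samples."""
    if length < 1:
        raise ValueError("length must be positive")
    if length == 1:
        return [1.0]
    return [
        (1.0 - alpha) - alpha * math.cos(2.0 * math.pi * n / (length - 1))
        for n in range(length)
    ]


def apply_window(frame, alpha=0.46):
    """Multiply one short-time frame sample by sample with the window."""
    window = hamming_window(len(frame), alpha)
    return [x * w for x, w in zip(frame, window)]


# The window tapers both ends of the frame toward 0.08 and leaves the
# centre sample almost unchanged.
windowed = apply_window([1.0] * 5)
```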

Abstract

The invention discloses a speech recognition method based on model pre-training and bidirectional LSTM, and belongs to the field of deep learning and speech recognition. The method comprises the steps of: 1) inputting a to-be-processed speech signal; 2) preprocessing the to-be-processed speech signal; 3) extracting the Mel-frequency cepstral coefficients and their dynamic differences to obtain speech features; 4) constructing a bidirectional LSTM structure; 5) optimizing the bidirectional LSTM with a maxout function to obtain maxout-biLSTM; 6) performing model pre-training; 7) training on the noise-containing speech signal using the pre-trained maxout-biLSTM to obtain a result. The method replaces the original activation function of the bidirectional LSTM with the maxout activation function and uses model pre-training to improve the robustness of the acoustic model in noisy environments. It can be used for building and training speech recognition models in high-noise environments.
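Two of the building blocks named in the abstract can be sketched in a few lines. The first function computes dynamic difference (delta) features with the common regression formula over ±2 frames. The second evaluates one maxout unit, which outputs the maximum of several affine pieces. Both are generic textbook forms under stated assumptions: the window width, the edge handling, and the way maxout is wired into the LSTM are not given in this excerpt, and all names are hypothetical.

```python
def delta_features(frames, width=2):
    """Delta of per-frame feature vectors using
    d_t = sum_n n * (c_{t+n} - c_{t-n}) / (2 * sum_n n^2), n = 1..width.
    Assumption: edge frames are repeated when t +/- n is out of range."""
    denom = 2 * sum(n * n for n in range(1, width + 1))
    last = len(frames) - 1
    deltas = []
    for t in range(len(frames)):
        row = []
        for d in range(len(frames[t])):
            acc = 0.0
            for n in range(1, width + 1):
                ahead = frames[min(t + n, last)][d]
                behind = frames[max(t - n, 0)][d]
                acc += n * (ahead - behind)
            row.append(acc / denom)
        deltas.append(row)
    return deltas


def maxout_unit(x, weights, biases):
    """One maxout unit: max over k affine pieces w_j . x + b_j."""
    pieces = [
        sum(w_i * x_i for w_i, x_i in zip(w, x)) + b
        for w, b in zip(weights, biases)
    ]
    return max(pieces)


# Toy cepstral track that rises by 1 per frame: interior deltas equal 1.
mfcc = [[float(t)] for t in range(6)]
deltas = delta_features(mfcc)

# With pieces x and -x, a maxout unit reproduces the absolute value,
# showing that the learned pieces define the activation shape.
activation = maxout_unit([-3.0], weights=[[1.0], [-1.0]], biases=[0.0, 0.0])
```

Because a maxout unit is piecewise linear and always passes through the gradient of its winning piece, it avoids the saturation of sigmoid or tanh activations, which is one commonly cited motivation for using it.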

Description

Technical field

[0001] The present invention relates to a speech recognition method based on model pre-training and bidirectional LSTM, and in particular to a speech recognition method based on pre-training, the maxout activation function and a bidirectional LSTM model, which can significantly improve the noise robustness of neural networks in high-noise environments. It belongs to the field of deep learning and speech recognition.

Background technique

[0002] With the continuous development and wide application of computer software and hardware technology, speech recognition technology has advanced rapidly, and speech recognition research has attracted growing attention. In recent years, deep learning has been applied successfully in speech recognition and has produced good results. However, the performance of speech recognition systems tends to drop sharply in the high-noise environments of real life. The ess...

Application Information

Patent Type & Authority: Applications (China)

IPC (8): G10L15/20, G10L15/16, G10L15/06, G10L25/24, G10L25/18, G10L25/45, G10L25/30

CPC: G10L15/063, G10L15/16, G10L15/20, G10L25/18, G10L25/24, G10L25/30, G10L25/45

Inventor: 金福生, 王茹楠, 张俊逸, 韩翔宇

Owner: BEIJING INSTITUTE OF TECHNOLOGY
