Speech recognition method based on convolutional neural network

A speech recognition method based on a convolutional neural network, applied in the field of speech recognition, which addresses the problems of existing acoustic models such as long training time, complex modeling process and limited application, and achieves the effects of a simple modeling process and easy training.

Active Publication Date: 2019-01-25
JIANGNAN UNIV

AI Technical Summary

Problems solved by technology

[0003] In order to solve the problems of long training time, complex modeling process and limited application of existing acoustic models, the present invention provides a speech recognition method based on a convolutional neural network, which is better at extracting high-level features; the modeling process is simple, the model is easy to train, its generalization performance is better, and it can be applied more widely to various speech recognition scenarios.


Embodiment Construction

[0049] As shown in Figures 1 to 5, the technical solution of the present invention realizes an end-to-end acoustic model on the basis of the DCNN (Deep Convolutional Neural Network) model and the CTC (Connectionist Temporal Classification) method; it comprises the following steps:

[0050] S1: Input the original speech, preprocess the original speech signal, and perform the related transformation processing;
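For illustration only: the patent excerpt does not specify which transformations step S1 performs, but a common preprocessing chain for speech signals is pre-emphasis, framing and windowing. The sketch below assumes those operations and illustrative parameter values (0.97 pre-emphasis coefficient, 25 ms frames, 10 ms shift); none of them are taken from the invention.

```python
# Hypothetical preprocessing sketch for S1: pre-emphasis, framing, Hamming window.
# All parameter values are assumed defaults, not values specified by the patent.
import numpy as np

def preprocess(signal, sample_rate=16000, frame_ms=25, shift_ms=10, pre_emph=0.97):
    # Pre-emphasis: boost the high-frequency part of the spectrum.
    emphasized = np.append(signal[0], signal[1:] - pre_emph * signal[:-1])

    frame_len = int(sample_rate * frame_ms / 1000)    # 400 samples at 16 kHz
    frame_shift = int(sample_rate * shift_ms / 1000)  # 160 samples at 16 kHz
    num_frames = 1 + (len(emphasized) - frame_len) // frame_shift  # assumes >= one frame of audio

    # Cut the signal into overlapping frames and apply a Hamming window to each frame.
    window = np.hamming(frame_len)
    frames = np.stack([
        emphasized[i * frame_shift : i * frame_shift + frame_len] * window
        for i in range(num_frames)
    ])
    return frames  # shape: (num_frames, frame_len)
```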

[0051] S2: Extract the key feature parameters reflecting the characteristics of the speech signal to form a feature vector sequence;
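The excerpt does not name a specific feature type for step S2. As one hedged illustration, the sketch below extracts MFCC features with librosa to form a feature vector sequence; the choice of MFCCs and all parameter values are assumptions, not part of the invention.

```python
# Hypothetical feature-extraction sketch for S2 using librosa; MFCC is an assumed
# feature choice and the parameter values are illustrative defaults.
import librosa
import numpy as np

def extract_features(wav_path, sr=16000, n_mfcc=13):
    # Load the waveform at the assumed sampling rate.
    y, sr = librosa.load(wav_path, sr=sr)
    # 13 MFCCs per 25 ms frame with a 10 ms shift (assumed values).
    mfcc = librosa.feature.mfcc(
        y=y, sr=sr, n_mfcc=n_mfcc,
        n_fft=int(0.025 * sr), hop_length=int(0.010 * sr),
    )
    # Transpose so each row is one frame's feature vector: (num_frames, n_mfcc).
    return mfcc.T.astype(np.float32)
```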

[0052] S3: Build an acoustic model: taking the DCNN network model as the basis and connectionist temporal classification (CTC) as the loss function, construct an end-to-end acoustic model;

[0053] The structure of the acoustic model comprises multiple convolutional layers, two fully connected layers and the CTC loss function, arranged in sequence; the structure of the convolutional layers is: the first layer and...
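For illustration, the sketch below builds an acoustic model of the kind described above, a stack of convolutional layers followed by two fully connected layers trained with the CTC loss, using PyTorch. The number of layers, channel widths, hidden size and vocabulary size are assumed values, since the exact configuration is not reproduced in this excerpt.

```python
# Hypothetical DCNN + CTC acoustic-model sketch in PyTorch.  Only the overall
# shape (conv stack -> two fully connected layers -> CTC loss) follows the
# description above; all layer sizes and the vocabulary size are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DCNNCTCAcousticModel(nn.Module):
    def __init__(self, n_feats=80, n_labels=1000):
        super().__init__()
        # Convolutional stack over (batch, 1, time, feature) inputs; pooling
        # only along the feature axis so the time resolution is preserved.
        self.conv = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(kernel_size=(1, 2)),
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(kernel_size=(1, 2)),
        )
        # Two fully connected layers; the last output unit is the CTC blank.
        self.fc1 = nn.Linear(64 * (n_feats // 4), 512)
        self.fc2 = nn.Linear(512, n_labels + 1)
        self.blank = n_labels

    def forward(self, feats):
        # feats: (batch, time, n_feats)
        x = self.conv(feats.unsqueeze(1))          # (batch, 64, time, n_feats // 4)
        x = x.permute(0, 2, 1, 3).flatten(2)       # (batch, time, 64 * n_feats // 4)
        x = F.relu(self.fc1(x))
        return F.log_softmax(self.fc2(x), dim=-1)  # per-frame log-probabilities

# One training step with CTC as the loss function (dummy data for illustration).
model = DCNNCTCAcousticModel()
ctc = nn.CTCLoss(blank=model.blank, zero_infinity=True)
feats = torch.randn(4, 200, 80)                    # 4 utterances, 200 frames each
log_probs = model(feats).permute(1, 0, 2)          # CTCLoss expects (time, batch, classes)
targets = torch.randint(0, 1000, (4, 20))          # dummy label sequences
loss = ctc(log_probs, targets,
           input_lengths=torch.full((4,), 200, dtype=torch.long),
           target_lengths=torch.full((4,), 20, dtype=torch.long))
loss.backward()
```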



Abstract

The invention provides a speech recognition method based on a convolutional neural network, which is better at extracting high-level features, has a simple modeling process, is easy to train, has better model generalization performance, and can be applied more widely to various speech recognition scenes. The method comprises the following steps: S1, preprocessing the input original speech signal; S2, extracting the key feature parameters reflecting the characteristics of the speech signal to form a feature vector sequence; S3, on the basis of the DCNN network model and taking the connectionist temporal classifier CTC as a loss function, constructing an end-to-end acoustic model; S4, training the acoustic model to obtain a trained acoustic model; S5, inputting the feature vector sequence to be recognized obtained in step S2 into the trained acoustic model to obtain a recognition result; and S6, performing subsequent operations on the basis of the recognition result obtained in step S5, that is, obtaining the word string that the speech signal outputs with maximum probability, i.e. the language text after the original speech is recognized.
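To make steps S5 and S6 concrete: one simple way to turn the acoustic model's per-frame CTC outputs into a label string is best-path (greedy) decoding, which picks the most probable symbol at each frame, collapses repeats, and removes blanks. The patent excerpt does not state which decoding strategy is actually used, so the sketch below is illustrative only.

```python
# Hypothetical best-path (greedy) CTC decoding sketch for steps S5/S6.
# `log_probs` is a (time, num_classes) array of per-frame log-probabilities from
# the acoustic model; `blank` is the CTC blank index (assumed here to be the last class).
import numpy as np

def greedy_ctc_decode(log_probs, blank):
    best_path = np.argmax(log_probs, axis=-1)  # most probable symbol per frame
    decoded, prev = [], None
    for symbol in best_path:
        # Collapse repeated symbols, then drop blanks.
        if symbol != prev and symbol != blank:
            decoded.append(int(symbol))
        prev = symbol
    return decoded  # label indices; mapped to text via the recognition lexicon
```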

Description

Technical field

[0001] The invention relates to the technical field of speech recognition, in particular to a speech recognition method based on a convolutional neural network.

Background technique

[0002] In speech recognition technology, the GMM-HMM (Gaussian Mixture Model - Hidden Markov Model) has long occupied a dominant position as the acoustic model for speech. However, owing to the characteristics of the GMM-HMM model itself, the GMM-HMM acoustic model first requires an alignment operation in which the data of every frame must be aligned with its corresponding label; this alignment process is cumbersome and complicated and results in a long training time. Moreover, because the model is a combination of a GMM and an HMM, the concrete modeling process is relatively complex, and there are certain limitations in specific applications of speech recognition technology.

Contents of the invention

[0003] In order to solve the problems of long tra...


Application Information

Patent Type & Authority: Application (China)
IPC(8): G10L15/06; G10L15/16
CPC: G10L15/063; G10L15/16; G10L2015/0631
Inventor: 曹毅, 张威, 翟明浩, 刘晨, 黄子龙, 李巍
Owner: JIANGNAN UNIV