Speech recognition modeling method and speech recognition modeling device

A speech recognition modeling technology, applied in the fields of speech recognition, speech analysis, and instruments. It addresses problems such as poor speech recognition performance, modeling methods constrained by state-based units, and the complexity of speech signals, with the effect of improving recognition speed and recognition accuracy.

Active Publication Date: 2016-05-04
BAIDU ONLINE NETWORK TECH (BEIJING) CO LTD

AI Technical Summary

Problems solved by technology

[0003] Speech signals are typical time-series signals with short-term stationary characteristics, but factors such as background noise, channel, and speaker (e.g., gender, age, speech rate, and/or accent) make the speech signal very complex.
[0004] However, existing speech recognition methods are all based on hybrid methods, and their modeling units are all traditional state modeling units. State modeling greatly restricts all existing modeling methods, and a speech recognition model built with such state modeling units has poor recognition performance.



Embodiment Construction

[0020] Embodiments of the present invention are described in detail below, examples of which are shown in the drawings, wherein the same or similar reference numerals designate the same or similar elements or elements having the same or similar functions throughout. The embodiments described below by referring to the figures are exemplary only for explaining the present invention and should not be construed as limiting the present invention. On the contrary, the embodiments of the present invention include all changes, modifications and equivalents coming within the spirit and scope of the appended claims.

[0021] Figure 1 is a flowchart of an embodiment of the speech recognition modeling method of the present invention. As shown in Figure 1, the method may include:

[0022] Step 101: converting the speech signal into a sequence of feature vectors, and converting the annotated text corresponding to the speech signal i...
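The text side of Step 101 can be sketched as follows. This is only an illustration, not the patent's implementation: it assumes the annotated text has already been transcribed into toneless pinyin syllables, and that the modeling units are the initial consonant and vowel (final) units of each syllable. The names INITIALS, split_syllable, and to_modeling_units are hypothetical, and the initial inventory deliberately leaves out "y" and "w", which some lexicons also treat as initials.

```python
# Hypothetical inventory of Mandarin initial consonants (toneless pinyin).
# Two-letter initials come first so that "sh" is matched before "s".
INITIALS = ["zh", "ch", "sh", "b", "p", "m", "f", "d", "t", "n", "l",
            "g", "k", "h", "j", "q", "x", "r", "z", "c", "s"]


def split_syllable(syllable):
    """Split one pinyin syllable into [initial, final], or [final] if it has no initial."""
    for initial in INITIALS:
        if syllable.startswith(initial) and len(syllable) > len(initial):
            return [initial, syllable[len(initial):]]
    return [syllable]  # zero-initial syllable such as "an"


def to_modeling_units(syllables):
    """Flatten a list of syllables into one sequence of initial/final units."""
    units = []
    for syllable in syllables:
        units.extend(split_syllable(syllable))
    return units


# "shi bie" (recognition) followed by a zero-initial syllable.
units = to_modeling_units(["shi", "bie", "an"])
print(units)
```

A context-dependent variant would attach the neighbouring units to each label; that is not shown here because the source text does not specify the labelling scheme.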


Abstract

The invention provides a speech recognition modeling method and a speech recognition modeling device. The method comprises the following steps: a speech signal is converted into a feature vector sequence, and the annotated text corresponding to the speech signal is converted into a modeling unit sequence, wherein each modeling unit in the sequence is either a complete initial consonant or vowel (final) pronunciation unit, or a context-dependent initial consonant or vowel pronunciation unit; a blank label is added before or after any modeling unit in the modeling unit sequence; and, based on connectionist temporal classification (CTC), a speech recognition model is trained on the feature vector sequence and the modeling unit sequence with the blank labels added. With this CTC-based deep neural network modeling of initial consonants and vowels, the recognition speed and recognition accuracy of the resulting speech recognition model can be improved.
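The blank-label step and the CTC objective described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the patented training procedure: it follows the common CTC convention of placing a blank before the first unit and after every unit, it takes per-frame label posteriors as a plain list of lists (in practice they would come from the neural network), and the names BLANK, add_blanks, and ctc_probability are hypothetical.

```python
import math

BLANK = "<blank>"  # assumed name for the blank label


def add_blanks(units):
    """Return the unit sequence with a blank before the first unit and after each unit."""
    extended = [BLANK]
    for unit in units:
        extended.extend([unit, BLANK])
    return extended


def ctc_probability(frame_probs, units, symbols):
    """CTC forward pass: total probability of all frame alignments that yield `units`.

    frame_probs[t][k] is the posterior of symbols[k] at frame t.
    """
    ext = add_blanks(units)
    col = [symbols.index(label) for label in ext]
    alpha = [0.0] * len(ext)
    alpha[0] = frame_probs[0][col[0]]      # start in the leading blank ...
    if len(ext) > 1:
        alpha[1] = frame_probs[0][col[1]]  # ... or in the first unit
    for frame in frame_probs[1:]:
        prev = alpha
        alpha = [0.0] * len(ext)
        for s in range(len(ext)):
            total = prev[s]                # stay on the same label
            if s >= 1:
                total += prev[s - 1]       # advance by one label
            # Skip over a blank only between two different non-blank units.
            if s >= 2 and ext[s] != BLANK and ext[s] != ext[s - 2]:
                total += prev[s - 2]
            alpha[s] = total * frame[col[s]]
    tail = alpha[-2] if len(ext) > 1 else 0.0
    return alpha[-1] + tail                # end in the last unit or the final blank


# Toy case: one unit "a", two frames, uniform posteriors over {blank, a}.
# The valid alignments are (a, a), (a, blank), and (blank, a).
symbols = [BLANK, "a"]
frame_probs = [[0.5, 0.5], [0.5, 0.5]]
probability = ctc_probability(frame_probs, ["a"], symbols)
loss = -math.log(probability)  # the quantity a CTC trainer would minimise
print(add_blanks(["sh", "i"]), probability, loss)
```

Summing over alignments is what lets the model be trained without frame-level state labels. Production systems compute the same recursion in log space for numerical stability.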

Description

Technical field

[0001] The invention relates to the technical field of speech recognition, and in particular to a modeling method and device for speech recognition.

Background

[0002] Existing large-vocabulary Chinese speech recognition methods are mainly based on hybrid methods, such as Gaussian Mixture Model (GMM) + Hidden Markov Model (HMM), or deep neural network (DNN) + HMM. Specifically, in speech recognition based on statistical hybrid modeling, methods for estimating the state posterior probabilities of the hidden Markov model include Gaussian mixture models, deep neural networks (specifically, deep multi-layer perceptrons), deep convolutional neural networks, deep recurrent neural networks, and combinations of several of these.

[0003] Speech signals are typical time-series signals with short-term statio...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G10L15/06
CPC: G10L15/06, G10L15/063
Inventors: 白锦峰, 苏丹, 胡娜, 贾磊
Owner: BAIDU ONLINE NETWORK TECH (BEIJING) CO LTD