Speech recognition optimization decoding method integrating guide probability

A speech recognition decoding method, applicable to speech recognition, speech analysis, instruments and related fields. It addresses two problems: existing decoders do not use the position information of the speech frame to be recognized, and they do not enhance the search in selected local regions of the acoustic feature space.

Status: Inactive
Publication date: 2013-03-20
INST OF AUTOMATION CHINESE ACAD OF SCI
Cites: 4; cited by: 17

AI Technical Summary

Problems solved by technology

[0008] The purpose of the present invention is to remedy two shortcomings of existing speech recognition decoding technology: it does not use the position of the speech frame to be recognized in the acoustic feature space, and it does not enhance the search in selected local regions of that space.

Examples

Embodiment Construction

[0027] In order to make the object, technical solution and advantages of the present invention clearer, the present invention will be further described in detail below in conjunction with specific embodiments and with reference to the accompanying drawings.

[0028] To bring the position of a speech frame in the acoustic feature space into the decoding process, the present invention collects statistics on the response relationship between phonemes and the Gaussian components of a Universal Background Model (UBM). By establishing the correspondence between each Gaussian component in the UBM and the phonemes, it obtains the position of the speech frame in the acoustic feature space, expressed as the probability that the frame belongs to different parts of that space; this is the guide probability. When decoding, use the position information of the spee...
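
The statistics described in [0028] can be sketched roughly as follows. This is a minimal illustration, not the patent's implementation: it assumes a diagonal-covariance UBM, takes the "main" Gaussian of a frame to be the component with the highest weighted likelihood, and normalizes the counts per Gaussian component so that each row reads as P(phoneme | main Gaussian). The source does not state the direction of normalization, and the function names `main_gaussian` and `guide_probabilities` are hypothetical.

```python
import numpy as np

def main_gaussian(frame, weights, means, variances):
    """Return the index of the UBM component that responds most strongly to one frame.

    Assumption: diagonal-covariance GMM. weights has shape (K,), means and
    variances have shape (K, D), and frame has shape (D,).
    """
    log_norm = -0.5 * np.sum(np.log(2.0 * np.pi * variances), axis=1)
    log_exp = -0.5 * np.sum((frame - means) ** 2 / variances, axis=1)
    return int(np.argmax(np.log(weights) + log_norm + log_exp))

def guide_probabilities(main_idx, phoneme_ids, n_components, n_phonemes, smoothing=0.0):
    """Count (main Gaussian, phoneme) co-occurrences and normalize each row.

    main_idx[t] is the main Gaussian of training frame t and phoneme_ids[t]
    is the phoneme assigned to that frame by forced segmentation.
    Assumption: rows are normalized per Gaussian component.
    """
    counts = np.full((n_components, n_phonemes), float(smoothing))
    # np.add.at accumulates correctly even when index pairs repeat.
    np.add.at(counts, (np.asarray(main_idx), np.asarray(phoneme_ids)), 1.0)
    totals = counts.sum(axis=1, keepdims=True)
    # Components that never respond keep an all-zero row instead of NaN.
    return np.divide(counts, totals, out=np.zeros_like(counts), where=totals > 0)
```

A small positive `smoothing` value keeps phonemes that were never observed with a given component from receiving a probability of exactly zero, which matters if the result is later used in the log domain.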

Abstract

The invention discloses a speech recognition decoding method that integrates a guide probability. Traditional speech recognition systems make insufficient use of the position information of a speech frame in the acoustic feature space. To overcome this, the invention provides a guide probability model that describes the probability that a speech frame belongs to different parts of the acoustic feature space and uses it to guide the decoding process. The method comprises the following steps: training a universal background model that describes the whole acoustic feature space; computing the main Gaussian component of each speech frame in the universal background model; using the acoustic model of the recognition system to perform forced segmentation of the training corpus, obtaining the phoneme of each speech frame; counting the response frequency between phonemes and main Gaussian components; normalizing the response frequencies to obtain the guide probability; and fusing the guide probability into the total score computation of a speech recognition path, guiding the decoder to enhance or weaken that path.
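
The final step, fusing the guide probability into the path score, might look like the sketch below. The abstract does not give the fusion formula, so the log-linear combination, the weights `lm_weight` and `guide_weight`, the `floor` value and the function name `fused_path_score` are all assumptions made for illustration.

```python
import math

def fused_path_score(acoustic_logp, lm_logp, guide_probs,
                     lm_weight=10.0, guide_weight=1.0, floor=1e-10):
    """Combine acoustic, language-model and guide scores for one decoding path.

    guide_probs holds one guide probability per frame on the path, i.e. the
    probability linking that frame's main Gaussian to the phoneme the path
    assigns to it. Assumption: scores are combined log-linearly.
    """
    # Floor each probability so an unseen (Gaussian, phoneme) pair does not
    # produce log(0) and remove the path outright.
    guide_logp = sum(math.log(max(p, floor)) for p in guide_probs)
    return acoustic_logp + lm_weight * lm_logp + guide_weight * guide_logp
```

Under this assumed form, a path whose phonemes agree with the main Gaussians of its frames loses little score and is effectively enhanced relative to a path that disagrees, while setting `guide_weight` to 0 recovers a conventional score.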

Description

Technical field

[0001] The invention relates to the field of speech recognition, in particular to acoustic modeling and decoding for speech recognition.

Background

[0002] At present, speech recognition systems generally use the Hidden Markov Model as the basic model for acoustic modeling and decoding. To account for the influence of neighboring sounds on each speech unit, triphone models are often used to improve the recognition rate. Once context is taken into account, however, the number of models and parameters increases dramatically. Taking a Chinese large-vocabulary continuous speech recognition system as an example, the basic phoneme set contains only 191 initials and tonal finals, while the corresponding triphone models total more than 200,000. Even after parameter sharing at the model, state and Gaussian component levels, the number of parameters is still huge. This will not only bring about insufficient parame...

Application Information

IPC(8): G10L15/00, G10L15/06, G10L15/08
Inventor: 刘文举 (Liu Wenju), 杨占磊 (Yang Zhanlei)
Owner INST OF AUTOMATION CHINESE ACAD OF SCI