
Chinese speech recognition system based on heterogeneous model differentiated fusion

A discriminative speech recognition technique, applied in the field of speech recognition systems, that addresses problems such as a growing search space, poorly matched models, and suboptimal continuous speech recognition results.

Status: Inactive
Publication Date: 2008-12-31
SHANGHAI JIAO TONG UNIV
Cites: 0; Cited by: 14

AI Technical Summary

Problems solved by technology

[0004] A search of the prior art found two relevant papers. Lei Xin et al. published "Improved Tone Modeling for Mandarin Broadcast News Speech Recognition" in the proceedings of the International Conference on Speech and Language Processing, pp. 1277-1280, Sep. 2006. Wang Huanliang et al. published "Improved Mandarin Speech Recognition by Lattice Rescoring with Enhanced Tone Models" in The 5th International Symposium on Chinese Spoken Language Processing, pp. 445-443, 2006. Both use heuristic methods, chosen from experience or found by search, to select global weights for the spectral feature model and the tone model, and then fuse the heterogeneous models with those weights. This approach usually does not give the best continuous speech recognition results. On the one hand, the spectral feature model and the tone model are trained independently and therefore do not match well during continuous speech recognition; on the other hand, global model weights cannot model specific phonetic or semantic contexts.
Moreover, if the number of heterogeneous models increases, the search space also increases exponentially, which also increases the difficulty of manual selection.
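The weight-selection problem described above can be illustrated with a minimal Python sketch. It assumes a log-linear fusion (a weighted sum of per-model log scores) and an exhaustive grid search; neither is stated in the source, and the function names, the toy scores, and the grid values are hypothetical.

```python
from itertools import product


def fuse_scores(log_scores, weights):
    """Weighted sum of per-model log scores (ASSUMED log-linear fusion)."""
    return sum(w * s for w, s in zip(weights, log_scores))


def grid_search_weights(utterances, n_models, grid):
    """Try every global weight combination on the grid.

    utterances: list of candidate lists; each candidate is
                (per-model log scores, is_correct).
    Returns the best weight tuple and the number of combinations tried.
    """
    best_weights, best_correct, tried = None, -1, 0
    for weights in product(grid, repeat=n_models):
        tried += 1
        correct = 0
        for candidates in utterances:
            # Pick the candidate with the highest fused score.
            top = max(candidates, key=lambda c: fuse_scores(c[0], weights))
            correct += top[1]
        if correct > best_correct:
            best_weights, best_correct = weights, correct
    return best_weights, tried


# Hypothetical toy data: (spectral log score, tone log score), is_correct.
TOY_UTTERANCES = [
    [((-9.0, -8.0), False), ((-10.0, -4.0), True)],
    [((-11.0, -6.0), False), ((-12.0, -3.0), True)],
]
```

With a grid of 3 values, two models need 3 ** 2 = 9 evaluations and three models need 3 ** 3 = 27, which is the exponential growth the paragraph above refers to. The single weight tuple is also applied to every candidate regardless of its phonetic context.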



Examples


Embodiment Construction

[0029] The embodiments of the present invention are described in detail below. This embodiment is implemented on the basis of the technical solution of the present invention, and a detailed implementation and specific operating procedure are given, but the scope of protection of the present invention is not limited to the following embodiment.

[0030] This embodiment is described using a recognition system with tonal syllable output and Chinese character output, built on a large-vocabulary, speaker-independent Chinese speech recognition system with 28,000 command words.

[0031] As shown in Figure 1, this embodiment includes a model probability weight assignment module, a discriminative model probability weight training module, a model probability weight smoothing module, and a discriminative-fusion speech recognition module, wherein:

[0032] The model probability weight assignment module is responsible for generating and initializing context-dependent ...


Abstract

The invention relates to a Chinese speech recognition system based on discriminative fusion of heterogeneous models, in the field of speech recognition technology. The system comprises a model probability weight assignment module, a discriminative model probability weight training module, a model probability weight smoothing module, and a discriminative-fusion speech recognition module. The weight assignment module generates and initializes a context-dependent set of model probability weights for every arc of a lattice. The discriminative weight training module uses the minimum tone error criterion to discriminatively train on the outputs of the heterogeneous models, obtains the accumulated minimum tone error statistic, and derives the discriminative model probability weight sets from it. The weight smoothing module smooths the context-dependent model probability weight sets it receives as input. The discriminative-fusion speech recognition module performs recognition and produces output using the smoothed weight sets. The system achieves a relative reduction in the recognition error rate.
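As a rough illustration of how smoothed, context-dependent weights might be applied to lattice arcs, here is a minimal Python sketch. It is not the patented method: the minimum tone error training step is omitted (trained weights are taken as given), the count-based interpolation toward global weights is an assumed smoothing scheme that the source does not name, and `Arc`, `smooth_weights`, `arc_score`, `path_score`, and `tau` are all hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Arc:
    context: str          # hypothetical context key, e.g. a tonal syllable
    spectral_logp: float  # log score from the spectral feature model
    tone_logp: float      # log score from the tone model


def smooth_weights(trained, counts, global_weights, tau=10.0):
    """Interpolate each context's weights with the global weights.

    ASSUMPTION: count-based interpolation; contexts seen rarely in
    training are pulled more strongly toward the global weights.
    """
    smoothed = {}
    for ctx, w in trained.items():
        n = counts.get(ctx, 0)
        lam = n / (n + tau)
        smoothed[ctx] = tuple(
            lam * wi + (1.0 - lam) * gi for wi, gi in zip(w, global_weights)
        )
    return smoothed


def arc_score(arc, weights, global_weights):
    """Fuse the two model scores with the weights for this arc's context."""
    # Contexts without their own weight set fall back to the global weights.
    w_spec, w_tone = weights.get(arc.context, global_weights)
    return w_spec * arc.spectral_logp + w_tone * arc.tone_logp


def path_score(path, weights, global_weights):
    """Score a lattice path as the sum of its fused arc scores."""
    return sum(arc_score(a, weights, global_weights) for a in path)
```

A decoder would compare `path_score` across competing lattice paths; the difference from a single global weight pair is that each arc can weight the tone model more or less heavily depending on its context.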

Description

Technical field

[0001] The present invention relates to a system in the technical field of speech recognition, in particular to a Chinese speech recognition system based on discriminative fusion of heterogeneous models.

Background art

[0002] At present, large-vocabulary continuous speech recognition systems are increasingly moving toward multi-modal and multi-information fusion. Using a variety of heterogeneous models to reduce confusability is an important way for current speech recognition systems to improve recognition performance. Chinese speech recognition is a special case of using multiple heterogeneous models. A major difference between Chinese and English speech recognition is that Chinese is a tonal language. The national standard lists 6763 commonly used Chinese characters. A syllable is the natural unit of Chinese pronunciation, and a squa...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G10L15/08
Inventor: 朱杰, 黄浩
Owner: SHANGHAI JIAO TONG UNIV