Chinese speech recognition system based on heterogeneous model differentiated fusion

A discriminative speech recognition technology, applied in the field of speech recognition systems. It addresses three problems of existing heterogeneous-model fusion: the weight search space grows as models are added, independently trained models do not match well, and the best continuous speech recognition performance is not obtained.

Status: Inactive. Publication date: 2008-12-31

SHANGHAI JIAO TONG UNIV

0 Cites, 14 Cited by

Problems solved by technology

[0004] A search of the prior art found two relevant works: Lei Xin et al., "Improved Tone Modeling for Mandarin Broadcast News Speech Recognition", in Proceedings of the International Conference on Speech and Language Processing, pp. 1277-1280, Sep. 2006; and Wang Huanliang et al., "Improved Mandarin Speech Recognition by Lattice Rescoring with Enhanced Tone Models", in The 5th International Symposium on Chinese Spoken Language Processing, pp. 445-443, 2006. Both fuse heterogeneous models using global weights for the spectral feature model and the tone model, selected heuristically, either from experience or by search. This approach usually does not achieve the best continuous speech recognition performance, for two reasons. First, the spectral feature model and the tone model are trained independently and therefore do not match well during continuous speech recognition. Second, global model weights cannot capture specific phonetic or semantic contexts.

Moreover, if the number of heterogeneous models increases, the search space also increases exponentially, which also increases the difficulty of manual selection.
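To make the search-space argument concrete, the following sketch assumes a log-linear fusion in which each model's log score is scaled by one global weight, and it counts the candidates an exhaustive grid search would have to try. The function names, the data layout, the toy scores and the grid of 10 values per weight are illustrative assumptions, not details taken from the patent.

```python
from itertools import product


def fused_score(log_scores, weights):
    """Log-linear fusion: weighted sum of per-model log scores."""
    return sum(w * s for w, s in zip(weights, log_scores))


def grid_size(num_models, values_per_weight):
    """Number of weight combinations in an exhaustive grid search."""
    return values_per_weight ** num_models


def best_global_weights(hypotheses, reference, grid):
    """Return the first global weight vector whose top hypothesis is correct.

    hypotheses: list of (label, [log score per model]) pairs
    (a hypothetical data layout chosen for this sketch).
    """
    num_models = len(hypotheses[0][1])
    for weights in product(grid, repeat=num_models):
        best = max(hypotheses, key=lambda h: fused_score(h[1], weights))
        if best[0] == reference:
            return weights
    return None


# Two models (spectral, tone) with toy log scores for two tonal syllables.
hyps = [("ma1", [-10.0, -5.0]), ("ma3", [-11.0, -1.0])]
found = best_global_weights(hyps, "ma3", [0.0, 0.5, 1.0])
sizes = [grid_size(n, 10) for n in (2, 3, 4)]
print(found, sizes)
```

With 10 candidate values per weight, two models need 100 trials, three need 1,000 and four need 10,000, which is the exponential growth the paragraph above refers to.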

Embodiment Construction

[0029] An embodiment of the present invention is described in detail below. The embodiment is implemented on the basis of the technical solution of the present invention, and a detailed implementation and specific operating procedure are given; however, the scope of protection of the present invention is not limited to the following embodiment.

[0030] The embodiment is described using tonal-syllable output and Chinese-character output recognition systems built on a 28,000-command-word, large-vocabulary, speaker-independent Chinese speech recognition system.

[0031] As shown in Figure 1, the embodiment comprises a model probability weight assignment module, a discriminative model probability weight training module, a model probability weight smoothing module and a discriminative-fusion speech recognition module, wherein:

[0032] The model probability weight assignment module is responsible for generating and initializing context-dependent ...
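A minimal sketch of what generating and initializing context-dependent weight sets could look like. The `Arc` fields, the choice of (previous unit, current unit) as the context key, and the initial value of 1.0 for every model are assumptions made for illustration; the source text is truncated before it specifies them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Arc:
    """One lattice arc (hypothetical layout for this sketch)."""
    prev_unit: str      # left context, e.g. the preceding tonal syllable
    unit: str           # unit hypothesised on this arc
    log_scores: tuple   # one log score per heterogeneous model


def context_key(arc):
    """Assumed context definition: (left neighbour, current unit)."""
    return (arc.prev_unit, arc.unit)


def init_weight_sets(arcs, num_models, initial=1.0):
    """Create one weight vector per distinct context, all set to `initial`."""
    weight_sets = {}
    for arc in arcs:
        weight_sets.setdefault(context_key(arc), [initial] * num_models)
    return weight_sets


lattice = [
    Arc("<s>", "ni3", (-4.2, -0.7)),
    Arc("ni3", "hao3", (-3.9, -1.1)),
    Arc("ni3", "hao4", (-4.5, -2.3)),
    Arc("<s>", "ni3", (-4.0, -0.9)),  # same context as the first arc
]
weights = init_weight_sets(lattice, num_models=2)
print(len(weights))
```

Arcs that share a context share one weight vector, so the four arcs above yield three weight sets.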

Abstract

The invention relates to a Chinese speech recognition system based on discriminative fusion of heterogeneous models, in the field of speech recognition technology. The system comprises a model probability weight assignment module, a discriminative model probability weight training module, a model probability weight smoothing module and a discriminative-fusion speech recognition module. The weight assignment module generates and initializes a context-dependent set of model probability weights for every arc of a lattice. The discriminative training module trains on the outputs of the heterogeneous models under the minimum tone error criterion to obtain accumulated minimum tone error statistics, from which the discriminative model probability weight sets are derived. The smoothing module smooths the context-dependent model probability weight sets it receives as input. The discriminative-fusion speech recognition module performs recognition and produces its output using the smoothed weight sets. The system can reduce the recognition error rate of speech recognition.
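The smoothing and fusion steps can be illustrated with a small sketch. It assumes count-based interpolation between each context-dependent weight vector and a global back-off vector, and a log-linear combination of model scores on each arc. The abstract does not state the smoothing formula, so `tau`, the counts and all names here are hypothetical.

```python
def smooth_weights(context_weights, counts, global_weights, tau=10.0):
    """Interpolate each context's weights toward the global weights.

    Contexts seen rarely in training (small count) stay close to the
    global weights; frequent contexts keep more of their own estimate.
    """
    smoothed = {}
    for ctx, w in context_weights.items():
        n = counts.get(ctx, 0)
        lam = n / (n + tau)
        smoothed[ctx] = [lam * wc + (1.0 - lam) * wg
                         for wc, wg in zip(w, global_weights)]
    return smoothed


def arc_score(log_scores, ctx, smoothed, global_weights):
    """Fused arc score; unseen contexts fall back to the global weights."""
    weights = smoothed.get(ctx, global_weights)
    return sum(w * s for w, s in zip(weights, log_scores))


global_w = [1.0, 1.0]                      # (spectral, tone) back-off weights
trained = {("ni3", "hao3"): [1.0, 3.0]}    # toy context-dependent estimate
smoothed = smooth_weights(trained, {("ni3", "hao3"): 30}, global_w, tau=10.0)
seen = arc_score([-4.0, -1.0], ("ni3", "hao3"), smoothed, global_w)
unseen = arc_score([-4.0, -1.0], ("ni3", "hao4"), smoothed, global_w)
print(smoothed[("ni3", "hao3")], seen, unseen)
```

In this toy case the context was observed 30 times, so its tone weight moves three quarters of the way from the global value toward the trained value, while a context with no trained weights is scored with the global weights alone.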

Description

Technical field

[0001] The present invention relates to a system in the technical field of speech recognition, in particular to a Chinese speech recognition system based on discriminative fusion of heterogeneous models.

Background art

[0002] At present, large-vocabulary continuous speech recognition systems are increasingly developing in the direction of multi-modal and multi-information fusion. Using a variety of heterogeneous models to reduce the confusion of speech recognition systems is an important means for current speech recognition systems to improve recognition performance. A special case of using multiple heterogeneous models is the Chinese speech recognition system. A major difference between Chinese speech recognition and English speech recognition is that Chinese is a tonal language. There are 6763 commonly used Chinese characters listed in the national standard. A syllable is the natural unit of Chinese pronunciation, and a squa...

Application Information

Patent Type & Authority: Applications (China)

IPC(8): G10L15/08

Inventors: 朱杰 (Zhu Jie), 黄浩 (Huang Hao)

Owner: SHANGHAI JIAO TONG UNIV
