A method for building an acoustic model and a speech decoding method based on the model

The acoustic model and its building method, applied in speech analysis, speech recognition, and related fields, address the difficulties of speech recognition for agglutinative languages, including modeling units that are unsuitable and suffer from aggravated short-sightedness, and achieve improved overall performance and reduced confusion.

Status: Inactive. Publication Date: 2017-10-03

INST OF ACOUSTICS CHINESE ACAD OF SCI +1

5 Cites, 0 Cited by

AI Technical Summary

Problems solved by technology

However, the number of word-level units in an agglutinative language grows dramatically because of agglutination, and the number of common words far exceeds what the dictionary can accommodate, so the word is not suitable as the basic modeling unit of the language model. At the same time, the next-smaller natural language unit (the phoneme or the word; the sub-unit differs by language) is also unsuitable as the basic modeling unit of the language model, because the agglutinative characteristics aggravate the short-sightedness of units at this level, that is, the short span of context each unit covers.

Second, in terms of acoustic models, the tight joining of phonemes in agglutinative languages leads to a large amount of co-articulation: the same phoneme is pronounced in many different ways depending on its position.

No effective solution to the second problem has yet been found, and it remains one of the difficulties that plague speech recognition for agglutinative languages.

Embodiment

[0047] The embodiment of the present invention uses allophone separation to refine and classify the Korean phoneme set. The steps are: extract speech features from the Korean training data; compute the triphone Gaussian mixture model statistics for the Korean basic phoneme set of 40 phonemes; use the self-clustering method to derive the decision tree question set from the statistics; use the decision tree to separate allophones, yielding 30 allophones; update the phoneme set, the labels, and the dictionary according to the separation results; train the acoustic model on the labels containing allophones, so that the acoustic model uses a new phoneme set of 70 phonemes; and decode with the new acoustic model and a dictionary containing allophones, instead of an acoustic model and dictionary that use only the base phonemes.
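The update of the phoneme set, labels, and dictionary can be illustrated with a minimal Python sketch. It is not the patent's implementation: the `Node` class, the `relabel` helper, the toy phonemes, the single hand-written context question, and the `k_2` allophone name are all hypothetical and chosen only to show how a per-phoneme decision tree could map each phoneme occurrence to an allophone according to its neighbours.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Node:
    """Decision-tree node (hypothetical): a context question or a leaf label."""
    question: Optional[Callable[[str, str], bool]] = None  # (left, right) -> bool
    yes: Optional["Node"] = None
    no: Optional["Node"] = None
    label: Optional[str] = None  # set on leaves only


def classify(node: Node, left: str, right: str) -> str:
    """Walk the tree until a leaf is reached and return its allophone label."""
    while node.label is None:
        node = node.yes if node.question(left, right) else node.no
    return node.label


def relabel(phones, trees, boundary="sil"):
    """Replace each base phoneme by the allophone its tree selects.

    Contexts are always read from the original base-phoneme sequence.
    Phonemes without a tree are left unchanged.
    """
    out = []
    for i, p in enumerate(phones):
        left = phones[i - 1] if i > 0 else boundary
        right = phones[i + 1] if i + 1 < len(phones) else boundary
        tree = trees.get(p)
        out.append(classify(tree, left, right) if tree else p)
    return out


# Toy data (assumed, not from the patent): "k" after a vowel becomes "k_2".
VOWELS = {"a", "i", "u"}
trees = {
    "k": Node(
        question=lambda left, right: left in VOWELS,
        yes=Node(label="k_2"),
        no=Node(label="k"),
    ),
}

base_phonemes = {"k", "a", "i", "u", "n"}
new_phoneme_set = base_phonemes | {"k_2"}  # base set plus separated allophones

lexicon = {"kaki": ["k", "a", "k", "i"]}
new_lexicon = {word: relabel(pron, trees) for word, pron in lexicon.items()}
```

In the embodiment the same bookkeeping would take the 40 base phonemes plus the 30 separated allophones to the 70-phoneme set; the training labels would be rewritten in the same way as the dictionary entries here.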

[0048] The embodiments of the present invention use the isotop...

Abstract

The present invention provides a method for building an acoustic model and a speech decoding method based on the model. The method includes: step 101) computing, from the training data, the triphone Gaussian mixture model statistics required for the acoustic model; step 102) using the self-clustering method to derive the decision tree question set from the statistics, then using the decision tree algorithm to split and cluster the statistics based on that question set, thereby obtaining the allophones; step 103) merging the basic phoneme set with the allophones to form a phoneme set containing allophones, and processing the original speech annotation through the decision tree to obtain a phoneme annotation containing allophones; step 104) training the acoustic model with an acoustic model training method, based on the phoneme set and the annotation containing allophones, to generate an acoustic model containing allophones. The present invention aims to solve the problem of high acoustic model confusion in speech recognition systems for agglutinative languages.
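The splitting in step 102 can be sketched as follows, under stated assumptions. The abstract does not say how a split is scored, so this sketch swaps in the likelihood-gain criterion commonly used in phonetic decision-tree clustering, with each cluster modeled by a single diagonal Gaussian built from accumulated statistics (count, sum, sum of squares). The function names, the toy statistics, and the two hand-written questions are hypothetical; in the patent the question set comes from the self-clustering method rather than being written by hand.

```python
import math


def pool(stats):
    """Sum (count, sum_x, sum_x2) statistics over a list of triphones."""
    n = sum(s[0] for s in stats)
    dim = len(stats[0][1])
    sx = [sum(s[1][d] for s in stats) for d in range(dim)]
    sxx = [sum(s[2][d] for s in stats) for d in range(dim)]
    return n, sx, sxx


def log_likelihood(stats, var_floor=1e-3):
    """Approximate log-likelihood of pooled data under one diagonal Gaussian.

    Uses the maximum-likelihood mean and variance; the variance is floored,
    so the value is approximate when the floor is active.
    """
    n, sx, sxx = pool(stats)
    ll = 0.0
    for d in range(len(sx)):
        mean = sx[d] / n
        var = max(sxx[d] / n - mean * mean, var_floor)
        ll += -0.5 * n * (math.log(2.0 * math.pi * var) + 1.0)
    return ll


def best_question(triphones, questions):
    """Return (question name, gain) for the split with the largest gain.

    triphones: dict mapping (left, centre, right) -> (count, sum_x, sum_x2)
    questions: dict mapping a name -> predicate over the context tuple
    """
    parent = log_likelihood(list(triphones.values()))
    best = (None, 0.0)
    for name, q in questions.items():
        yes = [s for ctx, s in triphones.items() if q(ctx)]
        no = [s for ctx, s in triphones.items() if not q(ctx)]
        if not yes or not no:
            continue  # the question does not split this node
        gain = log_likelihood(yes) + log_likelihood(no) - parent
        if gain > best[1]:
            best = (name, gain)
    return best


# Toy one-dimensional statistics (assumed): "k" after a vowel sounds different.
triphones = {
    ("a", "k", "i"): (10, [10.0], [12.0]),
    ("i", "k", "a"): (10, [12.0], [16.0]),
    ("n", "k", "a"): (10, [-10.0], [12.0]),
    ("s", "k", "u"): (10, [-11.0], [14.0]),
}
questions = {
    "L_vowel": lambda ctx: ctx[0] in {"a", "i", "u"},
    "R_is_a": lambda ctx: ctx[2] == "a",
}
name, gain = best_question(triphones, questions)
```

With these toy numbers the left-context question separates the two groups of means far better than the right-context question, so it is selected. A full tree would apply the same search recursively to each child until the gain or the data count falls below a threshold, and each resulting leaf would correspond to one allophone.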

Description

Technical field

[0001] The invention relates to the field of speech recognition and is aimed mainly at speech recognition systems for agglutinative languages.

Background

[0002] In linguistic morphology, languages can be divided into analytic languages and synthetic languages according to whether they rely on changes in word endings to express grammatical relationships. Agglutinative languages are a kind of synthetic language with a high degree of inflection. Their word-level units are usually formed by joining a large number of morphemes, which is called the agglutinative characteristic. Because speech recognition systems were originally designed for analytic and quasi-analytic languages, such as Chinese and English, the agglutinative characteristic brings many new problems to traditional speech recognition systems, which therefore need further improvement.

[0003] The proble...

Application Information

Patent Type & Authority: Patents (China)

IPC(8): G10L15/183

Inventors: 颜永红, 徐及, 潘接林

Owner: INST OF ACOUSTICS CHINESE ACAD OF SCI
