Feature parameter vector grouping quantization method in speech recognition

A feature parameter vector quantization technique for speech recognition, applied in the field of biometrics, that addresses problems such as the lack of any guarantee that the algorithm converges to the global optimum and the large amount of computation.

Inactive Publication Date: 2009-11-18

HARBIN INST OF TECH SHENZHEN GRADUATE SCHOOL

0 Cites, 4 Cited by


AI Technical Summary

Problems solved by technology

The LBG algorithm is essentially a modification of the K-means algorithm, and both are steepest descent algorithms. The two algorithms can only reach a local optimum, and cann...


Examples


Specific Embodiment 1

[0011] Specific Embodiment 1: referring to Fig. 1, this embodiment consists of the following steps:

[0012] Step A1: Group all speech data participating in training according to their characteristics.

[0013] Step A2: Select the speech recognition information group to be processed, and set the required number N of codewords.

[0014] Step A3: Extract the Mel frequency cepstral coefficients (MFCCs) of all speech signals contained in the current speech recognition information group.

[0015] Step A4: Randomly select the first N MFCCs of the current speech signal as the initial center values of the N codewords of the MFCC codebook, and set the initial average distortion $D_{old} = \infty$; also set a loop improvement parameter step with an initial value of $10^6$.

[0016] Step A5: According to the nearest-neighbor rule based on the Euclidean distance, map all MFCC feature parameters of the current speech signal to the current N codeword regions, forming N regions.

[0017] Step A6: Find ea...
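Steps A4 and A5 can be sketched in a few lines of Python. The text is truncated from step A6 onward, so the centroid update and the stopping test below are a generic Lloyd-style iteration chosen for illustration, not the patent's exact procedure. The names `train_codebook`, `max_iter`, `tol`, and `seed` are hypothetical, the initial codewords are drawn with `random.sample`, and the `step` parameter from A4 is not modeled because its use is not visible in the text.

```python
import random


def sq_dist(x, z):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(x, z))


def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def train_codebook(frames, n_codewords, max_iter=50, tol=1e-6, seed=0):
    """Train one codebook for one speech group (illustrative sketch)."""
    rng = random.Random(seed)
    # A4: pick N feature vectors as the initial codeword centers.
    codebook = [list(v) for v in rng.sample(frames, n_codewords)]
    d_old = float("inf")  # A4: initial average distortion D_old
    for _ in range(max_iter):
        # A5: nearest-neighbor partition of all frames into N regions.
        regions = [[] for _ in codebook]
        d_new = 0.0
        for x in frames:
            i = min(range(len(codebook)), key=lambda j: sq_dist(x, codebook[j]))
            regions[i].append(x)
            d_new += sq_dist(x, codebook[i])
        d_new /= len(frames)
        # Assumed update: move each codeword to the centroid of its region;
        # an empty region keeps its previous codeword.
        codebook = [centroid(r) if r else c for r, c in zip(regions, codebook)]
        # Assumed stopping test: the average distortion stopped improving.
        if d_old - d_new <= tol:
            break
        d_old = d_new
    return codebook


frames = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
          [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]]
print(sorted(train_codebook(frames, 2)))
```

On this toy input the two codewords settle on the centroids of the two well-separated clusters, whichever frames are drawn as initial centers.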

Specific Embodiment 2

[0025] Specific Embodiment 2: building on Specific Embodiment 1, this embodiment further details the nearest-neighbor rule described in step A5:

[0026] Step B1: Calculate the distortion measure based on the Euclidean distance. The distortion measure is defined as the square of the distance between the vector $x$ and the corresponding feature point $z_i$ in the feature space: $d^2(x, z_i) = \sum_{k=1}^{N} (x_k - (z_i)_k)^2$ ...
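As a small illustration of step B1, the squared Euclidean distortion and the nearest-neighbor mapping it drives could be written as follows. The function names `squared_distance` and `nearest_codeword` are hypothetical, and the sum runs over the components of the feature vector.

```python
def squared_distance(x, z):
    """Squared Euclidean distance between vector x and codeword z."""
    return sum((xk - zk) ** 2 for xk, zk in zip(x, z))


def nearest_codeword(x, codebook):
    """Index of the codeword with the smallest distortion to x."""
    return min(range(len(codebook)),
               key=lambda i: squared_distance(x, codebook[i]))


codebook = [[0.0, 0.0], [4.0, 4.0]]
frame = [3.0, 5.0]
print(squared_distance(frame, codebook[1]))  # (3-4)^2 + (5-4)^2 = 2.0
print(nearest_codeword(frame, codebook))     # 1
```

Using the squared distance avoids a square root and gives the same nearest codeword as the plain distance, because squaring preserves the ordering of non-negative values.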

Specific Embodiment 3

[0028] Specific Embodiment 3: building on Specific Embodiment 1, this embodiment further details the method described in step A6 for eliminating empty cells and evening out the number of vectors per cell:

[0029] Step C1: Examine each region. If no input vector has been assigned to a region, that region is an empty cell and is marked.

[0030] Step C2: Find the region with the largest number of vectors by item-by-item comparison, identify the feature vectors belonging to this region, and reassign vectors from it to the empty cell marked in the previous step until the number of vectors in the empty cell is half the number of vectors in the largest region.

[0031] Step C3: Find the centers of the re-divided regions to form a new codebook.

[0032] Step C4: Return to step C1 until no empty cell remains.

[0033] The above algorithm steps illu...
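Steps C1 to C4 can be sketched as follows. The text does not say which vectors of the largest region are moved, so this sketch simply moves the last half in list order; that choice, the function name `fill_empty_cells`, and the error raised when too few vectors exist are all assumptions. Because the sketch never re-partitions inside the loop, the centers of step C3 are computed once at the end.

```python
def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def fill_empty_cells(regions):
    """Fill empty cells from the largest region; return (regions, codebook)."""
    regions = [list(r) for r in regions]
    while True:
        # C1: mark the regions that received no input vector.
        empty = [i for i, r in enumerate(regions) if not r]
        if not empty:
            break  # C4: stop once no empty cell remains
        target = empty[0]
        # C2: find the region holding the most vectors.
        largest = max(range(len(regions)), key=lambda i: len(regions[i]))
        half = len(regions[largest]) // 2
        if half == 0:
            raise ValueError("not enough vectors to fill every cell")
        # Assumption: move the last half of the largest region, in list order.
        regions[target] = regions[largest][-half:]
        regions[largest] = regions[largest][:-half]
    # C3: the centers of the re-divided regions form the new codebook.
    codebook = [centroid(r) for r in regions]
    return regions, codebook


regions = [[[0.0], [1.0], [2.0], [3.0]], [], [[10.0]]]
new_regions, codebook = fill_empty_cells(regions)
print(new_regions)  # [[[0.0], [1.0]], [[2.0], [3.0]], [[10.0]]]
print(codebook)     # [[0.5], [2.5], [10.0]]
```

In a full training loop the new codebook would normally feed back into the nearest-neighbor partition of step A5.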


Abstract

A feature parameter vector grouping quantization method in speech recognition relates to vector quantization in speech recognition and belongs to the field of biometric recognition. Current feature parameter vector quantization methods in speech recognition generally build a single codebook for multiple words. Because the speech information of different words differs greatly and the range of the codebook is limited, frames of different words can fall into the same codeword region after quantization, which makes it difficult to improve the subsequent recognition rate. The grouping quantization method has all the training speech for each word form its own speech group, so the codebook obtained by vector quantization is highly relevant to the information of its group. Feature frames that carry the same information in different speakers' utterances of the same word then fall into the same codeword region with very high probability, which provides a basis for the accurate calculation of the subsequent parameters.
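The grouping idea in the abstract can be illustrated with a short sketch that pools the frames of every utterance of a word and trains one codebook per group. It assumes the groups are keyed by word label. The names `group_frames_by_word`, `build_group_codebooks`, and `mean_frame_codebook` are hypothetical, and the trainer shown is a deliberately trivial placeholder (one codeword at the mean frame), not the quantizer described in the patent.

```python
from collections import defaultdict


def group_frames_by_word(utterances):
    """Pool the feature frames of every utterance of the same word."""
    groups = defaultdict(list)
    for word, frames in utterances:
        groups[word].extend(frames)
    return dict(groups)


def build_group_codebooks(utterances, train):
    """Train one codebook per word group with the supplied trainer."""
    groups = group_frames_by_word(utterances)
    return {word: train(frames) for word, frames in groups.items()}


def mean_frame_codebook(frames):
    """Placeholder trainer: a single codeword at the mean of the frames."""
    return [[sum(col) / len(frames) for col in zip(*frames)]]


utterances = [
    ("yes", [[1.0, 2.0], [3.0, 4.0]]),  # speaker 1
    ("no", [[10.0, 10.0]]),             # speaker 1
    ("yes", [[5.0, 6.0]]),              # speaker 2
]
codebooks = build_group_codebooks(utterances, mean_frame_codebook)
print(codebooks["yes"])  # [[3.0, 4.0]]
```

Passing the trainer as a parameter keeps the grouping logic independent of whichever quantization routine is used for each group.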

Description

Technical field

[0001] The invention relates to a speech recognition method, in particular to a vector quantization method used in speech recognition, and belongs to the field of biometric recognition.

Background art

[0002] Speech recognition is a biometric technology in which a machine converts human speech signals into the corresponding text or commands through a process of recognition and understanding. Speech recognition mainly comprises three modules: parameter description, model approximation, and inference judgment. Once the feature parameters of the speech signal have been extracted, applying reasonable and effective vector quantization coding to them is an important step in the parameter description stage.

[0003] The vector quantization coding process mainly includes two parts:

[0004] Optimal division: Under the premise of a given number of vector quantizations, all codebooks are mapped to eac...


Application Information


IPC(8): G10L19/02, G10L15/06

Inventor 王明江余芳张爽李硕刘德

Owner HARBIN INST OF TECH SHENZHEN GRADUATE SCHOOL
