Speech recognition system and method using noise padding and normalization in dynamic time warping

A speech recognition technology for noisy environments, applied in speech recognition, speech analysis, instruments, etc., that addresses problems such as loss of efficiency, inaccurate word endpoints, and inability to adopt

Status: Inactive; Publication Date: 2002-02-13
D S P C TECH

AI Technical Summary

Problems solved by technology

[0005] 3. The phrase to be recognized is a single word;
[0007] 5. The ambient noise is unknown to the system until the instant the user presses the push-to-talk (PTT) button to start speaking;
[0009] 7. The system has only limited fast-access memory, so it is impossible to run the DTW matching algorithm against all reference templates in a real-time, word-recognition manner.
[0012] 2. In the recognition stage, the word endpoints estimated by the VAD are not accurate.
[0028] The above approach has two disadvantages: (a) it cannot be adopted for the tasks defined in items (2) to (7) above; and (b) the one-shot approach suffers from an efficiency loss in dealing with acoustic mismatch (Problem 1), since there is no precise information on the noise level in the primary algorithm.
This creates a problem with the exclusion mechanism, because two-syllable utterances are more likely to be excluded than single-syllable utterances.



Examples


Embodiment Construction

[0054] Figure 3 shows the system of the present invention. The system includes a feature extractor 50, a feature buffer 52, a voice activity detector (VAD) 54, a template database 56, two feature transformers 58A and 58B, a comparison unit 60 and a decision unit 62. According to the preferred embodiment of the present invention, the comparison unit 60 is a noise-adapted dynamic time warping (DTW) unit, and the system also includes a template padder 64, a wide token builder 66, a noise and peak energy estimator 68, and a gain and noise adapter 70, all of which are described in detail below.

[0055] In operation, feature extractor 50 extracts features such as autocorrelation coefficients or filter bank energies for each frame of the input signal and provides them to voice activity detector 54 and feature buffer 52. The buffer 52 stores the features of each frame in frame order, keeping a record of these frames for a predetermined length of time. The voic...
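The per-frame feature extraction and fixed-length buffering described in [0055] can be illustrated with a minimal sketch. It assumes non-overlapping frames and plain (unnormalized) autocorrelation coefficients; the names `autocorrelation`, `extract_features` and `FeatureBuffer`, and the parameters `frame_len`, `order` and `max_frames`, are hypothetical and do not come from the patent text.

```python
from collections import deque


def autocorrelation(frame, order):
    """Return autocorrelation coefficients r[0..order] of one frame."""
    n = len(frame)
    return [
        sum(frame[i] * frame[i + k] for i in range(n - k))
        for k in range(order + 1)
    ]


def extract_features(signal, frame_len, order):
    """Split the signal into non-overlapping frames (an assumption of this
    sketch) and compute one feature vector per frame."""
    return [
        autocorrelation(signal[start:start + frame_len], order)
        for start in range(0, len(signal) - frame_len + 1, frame_len)
    ]


class FeatureBuffer:
    """Keeps the most recent max_frames feature vectors in frame order."""

    def __init__(self, max_frames):
        # A bounded deque drops the oldest frame once it is full.
        self._frames = deque(maxlen=max_frames)

    def push(self, features):
        self._frames.append(features)

    def frames(self):
        return list(self._frames)


if __name__ == "__main__":
    buffer = FeatureBuffer(max_frames=2)
    for vector in extract_features([1, 2, 3, 4, 5, 6], frame_len=2, order=1):
        buffer.push(vector)
    print(buffer.frames())
```

A bounded buffer of this kind lets a later stage look back at frames that arrived before the detected start of speech, which is useful when the endpoint estimate is imprecise.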


Abstract

Speech recognition uses a wide token builder (66), a gain and noise adapter (70) and noise-adapted Dynamic Time Warping (60). The wide token builder produces a padded test token expanded with at least one blank frame before and after the input test utterance. The gain and noise adapter adapts each padded reference template with noise and gain qualities, producing adapted reference templates that have noise frames wherever a blank frame was originally placed and noise-adapted speech where speech exists. Dynamic Time Warping (DTW) is performed on the noise-adapted templates.
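The flow in the abstract (pad the template with blank frames, fill blanks with noise and adapt the speech frames, then run DTW) can be sketched as follows. This is an illustration under strong assumptions, not the patented method: each frame is reduced to a single energy value, noise is assumed to add linearly to gained speech, and the DTW is the textbook recurrence with an absolute-difference local cost. All function names and the `BLANK` marker are hypothetical.

```python
BLANK = None  # hypothetical marker for a blank (padding) frame


def pad_template(template, n_pad=1):
    """Add n_pad blank frames before and after a reference template."""
    return [BLANK] * n_pad + list(template) + [BLANK] * n_pad


def adapt_template(padded, noise, gain):
    """Replace blank frames with the noise level and adapt speech frames.

    Assumption of this sketch: features are frame energies, so the
    adapted speech frame is gain * speech + noise.
    """
    return [noise if f is BLANK else gain * f + noise for f in padded]


def dtw_distance(a, b):
    """Textbook DTW distance between two sequences of scalar frames."""
    inf = float("inf")
    cost = [[inf] * (len(b) + 1) for _ in range(len(a) + 1)]
    cost[0][0] = 0.0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # advance in a only
                cost[i][j - 1],      # advance in b only
                cost[i - 1][j - 1],  # advance in both
            )
    return cost[len(a)][len(b)]


if __name__ == "__main__":
    noise, gain = 1.0, 1.0
    # Wide test token: noisy speech with extra noise frames around it.
    test_token = [1.0, 1.0, 6.0, 10.0, 1.0]
    templates = {"match": [5.0, 9.0], "other": [2.0, 3.0]}
    for name, template in templates.items():
        adapted = adapt_template(pad_template(template), noise, gain)
        print(name, dtw_distance(test_token, adapted))
```

Because the padded template carries noise frames at both ends, surplus noise frames in the test token can align with them at little cost, so an inaccurate endpoint estimate does not have to be absorbed by the speech frames.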

Description

Field of the invention

[0001] The present invention relates generally to speech recognition, and more particularly to speaker-dependent recognition techniques in noisy environments.

Background of the invention

[0002] Speech recognition in noisy environments is a long-studied and still unsolved task. This task is characterized by the following parameters:
[0003] 1. Recognition is speaker-dependent, and the reference templates are generated from the user's utterances in a designated "training session";
[0004] 2. It is desirable to reduce the number of training utterances to a minimum (1-3), which, in the prior art, would make the Dynamic Time Warping (DTW) matching algorithm more efficient than the Hidden Markov Model (HMM) algorithm;
[0005] 3. The phrase to be recognized is a single word;
[0006] 4. The training stage is relatively low-noise, but additional environmental noise needs to be dealt with during recognition;
[0007] 5. The ambient n...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G10L15/12, G10L15/20
CPC: G10L21/0216, G10L15/12, G10L15/20
Inventor: 阿多姆·艾瑞尔
Owner: D S P C TECH