Method and device for extracting MFCC coefficients of a speech signal, and Mel filtering method

A speech signal and coefficient technology, applied in speech analysis, speech recognition, instruments, etc.

Active Publication Date: 2009-11-11
VIMICRO ELECTRONICS CORP
0 Cites, 13 Cited by

AI Technical Summary

Problems solved by technology

[0006] The technical problem to be solved by the invention is to provide a method and device for extracting MFCC coefficients of a speech signal, so as to solve the problems present in the HTK MFCC coefficient extraction method.


Examples

Embodiment 1

[0056] Figure 1 is a flowchart of the method for extracting MFCC coefficients of a speech signal described in Embodiment 1.

[0057] S101: when performing Mel filtering, increase the number of subbands of the Mel filter bank, perform Mel filtering over the frequency range, and obtain a Mel filtering output corresponding to each subband.

[0058] That is, the original dimension of the Mel filter bank (the number of subbands) is increased, and the signal is then filtered over the full frequency band. Because of the mapping between Mel frequency and linear frequency, the number of subbands in the low-frequency range of the signal band (i.e., the linear frequency band) increases accordingly, which ensures sufficient frequency resolution for low-frequency signals. At the same time, however, the number of subbands in the high-frequency range also increases. Since high-frequency signals are susceptible t...
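The effect of enlarging the filter bank can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the patent text: the Mel mapping 2595·log10(1 + f/700), the triangular filter shape, the sizes 24 and 40, the 257-bin flat spectrum, and the 8 kHz split and 16 kHz upper edge. The function names are hypothetical.

```python
import math

def hz_to_mel(f_hz):
    # Common Mel mapping (assumed here; the text does not give the formula).
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_edges(n_subbands, f_max_hz):
    """n_subbands + 2 edge frequencies (Hz), uniformly spaced on the Mel axis."""
    mel_max = hz_to_mel(f_max_hz)
    return [mel_to_hz(mel_max * i / (n_subbands + 1)) for i in range(n_subbands + 2)]

def triangular_filterbank(n_subbands, n_bins, f_max_hz):
    """bank[k][b] is the weight of subband k at linear bin b (bins span 0..f_max_hz)."""
    edges = mel_edges(n_subbands, f_max_hz)
    bin_hz = f_max_hz / (n_bins - 1)
    bank = []
    for k in range(1, n_subbands + 1):
        lo, mid, hi = edges[k - 1], edges[k], edges[k + 1]
        row = []
        for b in range(n_bins):
            f = b * bin_hz
            if lo < f <= mid:
                row.append((f - lo) / (mid - lo))   # rising slope
            elif mid < f < hi:
                row.append((hi - f) / (hi - mid))   # falling slope
            else:
                row.append(0.0)
        bank.append(row)
    return bank

def mel_filter(power_spectrum, bank):
    """One Mel filtering output per subband."""
    return [sum(w * p for w, p in zip(row, power_spectrum)) for row in bank]

def count_low(n_subbands, f_max_hz, split_hz):
    """Number of subbands whose center frequency lies below split_hz."""
    centers = mel_edges(n_subbands, f_max_hz)[1:-1]
    return sum(1 for c in centers if c < split_hz)

F_MAX_HZ, SPLIT_HZ, N_BINS = 16000.0, 8000.0, 257   # illustrative values
flat_spectrum = [1.0] * N_BINS                       # placeholder power spectrum

for n in (24, 40):                                   # hypothetical original / extended sizes
    outputs = mel_filter(flat_spectrum, triangular_filterbank(n, N_BINS, F_MAX_HZ))
    low = count_low(n, F_MAX_HZ, SPLIT_HZ)
    first_width = mel_edges(n, F_MAX_HZ)[2]          # width of the lowest subband in Hz
    print(n, len(outputs), low, n - low, round(first_width, 1))
```

With the larger bank, more subbands fall below the split and the lowest subband becomes narrower, which is the finer low-frequency resolution the paragraph describes; the high-frequency count grows as well.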

Embodiment 2

[0070] The present invention is mainly applied to wideband signal processing with a frequency range of 0-16 kHz, because a 16 kHz wideband signal largely provides the feature information required for speech recognition. A 16 kHz wideband signal is used as the example in the detailed description below, where 0-8 kHz is the low-frequency range and 8-16 kHz is the high-frequency range. The present invention is, of course, not limited to the 0-16 kHz frequency range.
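To see how the 0-8 kHz / 8-16 kHz split looks on the Mel axis, the short calculation below may help. It assumes the widely used mapping Mel(f) = 2595·log10(1 + f/700), which this excerpt does not state, so treat the numbers as illustrative only.

```python
import math

def hz_to_mel(f_hz):
    # Assumed Mel mapping; the excerpt does not specify which formula is used.
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

LOW_EDGE_HZ = 8000.0     # upper edge of the low-frequency range in the text
FULL_BAND_HZ = 16000.0   # upper edge of the full band in the text

# Share of the Mel axis (0..Mel(16 kHz)) occupied by the low-frequency range.
low_share = hz_to_mel(LOW_EDGE_HZ) / hz_to_mel(FULL_BAND_HZ)

print(f"Mel(8 kHz)  = {hz_to_mel(LOW_EDGE_HZ):.1f}")
print(f"Mel(16 kHz) = {hz_to_mel(FULL_BAND_HZ):.1f}")
print(f"low-range share of the Mel axis = {low_share:.3f}")
```

Under this assumption roughly four fifths of the Mel axis lies below 8 kHz, so subbands spaced uniformly in Mel fall mostly in the low-frequency range, while the remaining ones each cover a wide stretch of the 8-16 kHz band.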

[0071] Figure 2 is a flowchart of the method for extracting MFCC coefficients of a speech signal described in Embodiment 2.

[0072] S201: speech enhancement processing.

[0073] In this embodiment, speech enhancement processing is performed on the signal over the whole 16 kHz range at once. The purpose of speech enhancement is to recover the original speech as cleanly as possible from the noisy speech signal. Many enhancement algorithms are in common use, such as spectral subt...

Abstract

The invention provides a method and a device for extracting MFCC coefficients of a speech signal, which aim to solve the problems of the HTK MFCC coefficient extraction method. The method comprises the following steps: pre-emphasis, windowing, fast Fourier transform, power spectrum estimation, Mel filtering, nonlinear transform and discrete cosine transform. When Mel filtering is performed, the number of subbands of the Mel filter bank is increased, Mel filtering is carried out over the frequency range, and a Mel filtering output is obtained for each subband. The subbands in the high-frequency range are then aggregated, giving a Mel filtering output for each aggregated subband. The nonlinear transform and discrete cosine transform are then applied to the Mel filtering outputs of the low-frequency range and of the aggregated high-frequency range, and the MFCC coefficients are finally extracted. The invention ensures sufficient frequency resolution for low-frequency signals while aggregating the subbands in the high-frequency range, which improves robustness to interference at high frequencies, thereby optimizing the extracted MFCC coefficients and improving the accuracy of speech recognition.
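The aggregation step and the two transforms that follow it can be sketched as below. The abstract does not say how subbands are aggregated, so summing each pair of adjacent high-frequency outputs is purely an assumption, as are the function names, the group size of 2, the 8 kHz split, the unnormalized DCT-II and the sample values.

```python
import math

def aggregate_high_band(mel_out, centers_hz, split_hz, group=2):
    """Keep low-range outputs; sum each run of `group` adjacent high-range outputs.

    Summation is an assumed aggregation rule, not one stated in the source.
    """
    low = [v for v, c in zip(mel_out, centers_hz) if c < split_hz]
    high = [v for v, c in zip(mel_out, centers_hz) if c >= split_hz]
    merged = [sum(high[i:i + group]) for i in range(0, len(high), group)]
    return low + merged

def log_dct(mel_out, n_coeffs, floor=1e-10):
    """Nonlinear transform (natural log) followed by an unnormalized DCT-II."""
    logs = [math.log(max(v, floor)) for v in mel_out]   # floor avoids log(0)
    n = len(logs)
    return [
        sum(x * math.cos(math.pi * k * (j + 0.5) / n) for j, x in enumerate(logs))
        for k in range(n_coeffs)
    ]

# Hypothetical Mel filtering outputs and subband center frequencies (Hz).
mel_out = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
centers_hz = [1000.0, 3000.0, 6000.0, 9000.0, 11000.0, 14000.0]

aggregated = aggregate_high_band(mel_out, centers_hz, split_hz=8000.0)
coeffs = log_dct(aggregated, n_coeffs=4)
print(aggregated)
print(coeffs)
```

In this toy case the three low-range outputs pass through unchanged and the three high-range outputs collapse into two, so the cepstral coefficients are computed from fewer, wider high-frequency subbands.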

Description

Technical field

[0001] The invention relates to the technical field of speech recognition, in particular to a method and device for extracting MFCC coefficients of a speech signal and to a Mel filtering method.

Background

[0002] In speech recognition processing, Mel-scale Frequency Cepstral Coefficients (MFCC) are among the most commonly used feature parameters. MFCC models the auditory characteristics of the human ear, reflects the perceptual characteristics of human speech, and captures speaker-specific characteristics from the speaker's speech signal; it has achieved high recognition rates in practical speech recognition applications. The standard MFCC coefficient extraction process includes pre-emphasis, windowing, FFT (Fast Fourier Transform), power spectrum estimation, Mel filtering, a nonlinear transform (taking the logarithm) and DCT (Discrete Cosine Transform, disc...
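The first four stages of the standard process named above (pre-emphasis, windowing, Fourier transform, power spectrum estimation) can be sketched for a single frame as follows. This is a minimal illustration, not the HTK implementation: the pre-emphasis coefficient 0.97, the Hamming window, the untouched first sample, the 64-sample test frame, and the use of a naive DFT in place of an FFT are all assumptions chosen to keep the example dependency-free.

```python
import cmath
import math

def pre_emphasis(frame, alpha=0.97):
    """y[n] = x[n] - alpha * x[n-1]; the first sample is left unchanged (assumption)."""
    return [frame[0]] + [frame[n] - alpha * frame[n - 1] for n in range(1, len(frame))]

def hamming(frame):
    """Apply a Hamming window to the frame."""
    n = len(frame)
    return [x * (0.54 - 0.46 * math.cos(2.0 * math.pi * i / (n - 1)))
            for i, x in enumerate(frame)]

def power_spectrum(frame):
    """|X[k]|^2 for k = 0..N/2, using a naive O(N^2) DFT instead of an FFT."""
    n = len(frame)
    spec = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(frame))
        spec.append(abs(acc) ** 2)
    return spec

# Hypothetical test frame: a sine with exactly 8 cycles in 64 samples.
N = 64
frame = [math.sin(2.0 * math.pi * 8 * i / N) for i in range(N)]

spec = power_spectrum(hamming(pre_emphasis(frame)))
peak_bin = spec.index(max(spec))
print(len(spec), peak_bin)
```

The resulting power spectrum is what a Mel filter bank would then weight and sum per subband before the logarithm and DCT are applied.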

Application Information

IPC(8): G10L15/02
Inventors: 张晨, 冯宇红
Owner: VIMICRO ELECTRONICS CORP