Speech digital recognition method based on MFCC

A spoken-digit recognition technology, applied in speech recognition, speech analysis, instruments, etc., that addresses problems such as reduced calculation accuracy and achieves accurate, fast speech recognition.

Inactive Publication Date: 2019-12-31
GUANGZHOU UNIVERSITY

AI Technical Summary

Problems solved by technology

Due to the nonlinear correspondence between the Mel frequency and the Hz freq



Examples


Embodiment

[0024] The present invention relates to a method for recognizing spoken digits in a speaker's audio. First, audio files of the digits 0-9 are recorded as training data; each recording should contain the digit pronounced in different tones. The input audio is then sampled and the sampled speech signal is preprocessed. Endpoint detection is performed on the preprocessed signal to extract the speech signal of each single digit, and the MFCC features of each spoken digit are then extracted. Finally, the mean square error (MSE) is used to compare the MFCC features of each input digit against the MFCC features of every digit in the training template, and the best match identifies the digit.
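The final matching step can be illustrated with a minimal sketch. It assumes each spoken digit has already been reduced to a fixed-length MFCC feature vector (for example by averaging over frames); the source does not say how utterances of different lengths are aligned. The function names `mse` and `recognize_digit` and the toy template values are hypothetical, chosen only for illustration.

```python
from typing import Dict, Sequence


def mse(a: Sequence[float], b: Sequence[float]) -> float:
    """Mean square error between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def recognize_digit(features: Sequence[float],
                    templates: Dict[int, Sequence[float]]) -> int:
    """Return the digit whose template has the smallest MSE to `features`."""
    return min(templates, key=lambda digit: mse(features, templates[digit]))


# Toy 3-dimensional "MFCC" vectors, invented purely for illustration.
templates = {
    0: [1.0, 0.0, 0.0],
    1: [0.0, 1.0, 0.0],
    2: [0.0, 0.0, 1.0],
}
print(recognize_digit([0.1, 0.9, 0.2], templates))  # prints 1
```

Because the comparison is a single pass over each template, the cost grows only with the number of templates and the feature length.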

[0025] As shown in Figure 1, the recognition method of the present invention includes the following steps:

[0026] S1. Sample the input speech signal and preprocess the sampled speech signal; ...


Abstract

The invention relates to speech recognition technology, and in particular to a spoken-digit recognition method based on MFCC. The method comprises the following steps: sampling an input speech signal and preprocessing the sampled signal; performing endpoint detection on the preprocessed signal to extract single-digit speech signals; extracting the MFCC features of each digit's speech signal; and matching those MFCC features, using the mean square error (MSE), against an MFCC digit speech parameter template obtained through training, so as to recognize the digits in the speech signal. By combining MFCC features with MSE matching, the method achieves a high recognition rate while avoiding a large amount of computation; recognition is therefore efficient, and the method can be applied in complex environments.
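As an illustration of the endpoint detection step, here is a minimal sketch based on short-time energy with a fixed threshold. This is an assumption: the text shown here does not state which endpoint detection technique the patent uses. The names `short_time_energy` and `detect_segments`, the frame length, and the threshold are all hypothetical.

```python
from typing import List, Sequence, Tuple


def short_time_energy(signal: Sequence[float], frame_len: int) -> List[float]:
    """Energy of consecutive non-overlapping frames of `frame_len` samples."""
    return [sum(s * s for s in signal[i:i + frame_len])
            for i in range(0, len(signal) - frame_len + 1, frame_len)]


def detect_segments(signal: Sequence[float], frame_len: int,
                    threshold: float) -> List[Tuple[int, int]]:
    """Return (start, end) sample indices of runs of frames above `threshold`."""
    energies = short_time_energy(signal, frame_len)
    segments: List[Tuple[int, int]] = []
    start = None
    for idx, energy in enumerate(energies):
        if energy > threshold and start is None:
            start = idx * frame_len              # a voiced run begins
        elif energy <= threshold and start is not None:
            segments.append((start, idx * frame_len))  # the run ends
            start = None
    if start is not None:                        # signal ended while voiced
        segments.append((start, len(energies) * frame_len))
    return segments


# Synthetic signal: silence, a loud burst, silence, a quieter burst, silence.
signal = [0.0] * 4 + [1.0] * 4 + [0.0] * 4 + [0.5] * 8 + [0.0] * 4
print(detect_segments(signal, frame_len=4, threshold=0.5))
```

Each returned pair marks one candidate single-digit segment that could then be passed to feature extraction. Real recordings would typically need overlapping frames and a noise-adaptive threshold.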

Description

Technical field

[0001] The invention relates to speech recognition technology, in particular to a spoken-digit recognition method based on MFCC.

Background

[0002] With the development of computer and information technology, voice interaction has become an essential means of human-computer interaction, and speech recognition is an important direction in the development of computer technology. Speech recognition has built up a substantial body of theory and has a very wide range of applications; closely related areas such as voice dialing, emotion recognition, and voiceprint recognition all have broad prospects.

[0003] Mel frequency cepstral coefficients (MFCCs) are a feature widely used in automatic speech and speaker recognition. The mel is the unit of subjective pitch, while the hertz (Hz) is the unit of objective pitch. The Mel frequency is proposed based on the auditory characteristics of the human ear, and it has ...
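The nonlinear relationship between the Mel scale and frequency in Hz can be illustrated with one widely used conversion, mel = 2595 · log10(1 + f / 700). This is a sketch under the assumption that this common convention applies; the text shown here does not state which formula the patent uses. The function names `hz_to_mel` and `mel_to_hz` are hypothetical.

```python
import math


def hz_to_mel(f_hz: float) -> float:
    """Convert a frequency in Hz to mels (common 2595 * log10 convention)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel: float) -> float:
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


# Equal steps in Hz cover fewer and fewer mels as frequency rises.
for f in (0.0, 1000.0, 2000.0, 4000.0, 8000.0):
    print(f"{f:7.0f} Hz -> {hz_to_mel(f):7.1f} mel")
```

Under this convention 1000 Hz maps to roughly 1000 mel. Above that point the scale compresses, which is why Mel filter banks place wider filters at higher frequencies.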


Application Information

IPC(8): G10L15/08, G10L15/14, G10L15/26, G10L25/18, G10L25/24, G10L25/27, G10L25/87
CPC: G10L15/083, G10L15/144, G10L15/26, G10L25/18, G10L25/24, G10L25/27, G10L25/87
Inventor: 朱静, 杨盛元, 尹邦政, 陈明希, 杨强, 魏慧棠, 何海城, 李浩明
Owner: GUANGZHOU UNIVERSITY