Speech recognition method and device and terminal

A speech recognition and speech signal technology, applied in the electronics field, that addresses problems such as feature parameters that describe formant characteristics inadequately or that do not take human auditory characteristics into account.

Active Publication Date: 2016-06-15
シェンジェンインムーテクノロジーシーオーエルティーディー

AI Technical Summary

Problems solved by technology

LPC and LPCC do not take the auditory characteristics of the human ear into account, do not use a nonlinear frequency transformation, and cannot accurately describe the characteristics of the speaker.
MFCC parameters simulate the human ear's perception of different parts of the speech spectrum and so take its auditory characteristics into account. MFCC features perform better, have low computational complexity, and offer good recognition performance and robustness. However, the traditional MFCC feature parameters suffer from serious spectral energy leakage and do not adequately describe the formant characteristics of the speech signal. As a result, the traditional speech recognition process based on MFCC feature parameters has high redundancy, so that at a low signal-to-noise ratio the speech recognition system is not robust and the recognition rate drops significantly.

Examples

Embodiment 1

[0047] An embodiment of the present invention provides a speech recognition method, including:

[0048] S101. Acquire a frame of speech signal and extract d-dimensional MFCC parameters from it, where d is a positive integer and typically d = 24;

[0049] S102. Perform cepstrum calculation on the d-dimensional MFCC parameters to obtain d-dimensional cepstrum MFCC parameters;

[0050] S103. Perform iterative processing on the cepstrum MFCC parameters in each dimension according to the preset number of iterations to obtain d-dimensional iterative cepstrum MFCC parameters;

[0051] S104. Identify the speech signal based on the d-dimensional iterative cepstrum MFCC parameters.
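
Step S101 above can be illustrated with a minimal sketch of conventional single-frame MFCC extraction. This is a textbook pipeline, not necessarily the one used in the patent. Everything except d = 24 is an assumption made for illustration: the 16 kHz sample rate, the 512-point FFT, the 26 mel filters, the Hamming window, and all function names. The cepstrum calculation (S102) and the iterative processing (S103) are not sketched, because their details are not given in this excerpt.

```python
import numpy as np


def hz_to_mel(f):
    # Common mel-scale formula: the nonlinear frequency warping behind MFCC.
    return 2595.0 * np.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    # Inverse of hz_to_mel.
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def mel_filterbank(n_filters, n_fft, sample_rate):
    # Triangular filters whose centers are equally spaced on the mel scale.
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0),
                             n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sample_rate).astype(int)
    fbank = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        for k in range(left, center):       # rising edge
            fbank[i - 1, k] = (k - left) / (center - left)
        for k in range(center, right):      # falling edge, peak at center
            fbank[i - 1, k] = (right - k) / (right - center)
    return fbank


def dct2(x, n_out):
    # Unnormalized DCT-II, keeping the first n_out coefficients.
    n = len(x)
    k = np.arange(n_out)[:, None]
    i = np.arange(n)[None, :]
    return np.cos(np.pi * k * (2 * i + 1) / (2 * n)) @ x


def mfcc_frame(frame, sample_rate=16000, d=24, n_filters=26, n_fft=512):
    # S101 (assumed conventional form): window, power spectrum,
    # mel filterbank energies, log, DCT, keep d coefficients.
    windowed = np.asarray(frame, dtype=float) * np.hamming(len(frame))
    power = np.abs(np.fft.rfft(windowed, n_fft)) ** 2 / n_fft
    energies = mel_filterbank(n_filters, n_fft, sample_rate) @ power
    log_energies = np.log(np.maximum(energies, 1e-10))  # avoid log(0)
    return dct2(log_energies, d)
```

Calling `mfcc_frame` on a 25 ms frame (400 samples at the assumed 16 kHz) returns a vector of d = 24 coefficients, which would be the input to step S102.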

[0052] This embodiment of the present invention enhances the anti-noise performance of speech recognition in the feature space: the traditional MFCC parameters are iterated through cepstrum calculation to obtain the dynamic change trajectory of the MFCC parameter featur...

Embodiment 2

[0101] The present invention provides a speech recognition device, which is the device embodiment corresponding to Embodiment 1, including:

[0102] A parameter extraction module 30, configured to acquire a frame of speech signal and extract d-dimensional MFCC parameters from the speech signal;

[0103] A cepstrum module 32, configured to perform cepstrum calculation on the d-dimensional MFCC parameters to obtain d-dimensional cepstrum MFCC parameters;

[0104] An iterative module 34, configured to iteratively process the cepstrum MFCC parameters of each dimension according to the preset number of iterations to obtain d-dimensional iterative cepstrum MFCC parameters;

[0105] An identification module 36, configured to identify the speech signal based on the d-dimensional iterative cepstrum MFCC parameters.
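
The way the four modules chain together can be sketched as follows. This only illustrates the data flow between modules 30, 32, 34 and 36. The class name, field names, and callable signatures are hypothetical, and the internals of each module are deliberately left to the caller because this excerpt does not specify them.

```python
from dataclasses import dataclass
from typing import Callable, Sequence

Vector = Sequence[float]


@dataclass
class SpeechRecognitionDevice:
    # Each field stands in for one module of Embodiment 2 (hypothetical API).
    extract: Callable[[Vector], Vector]        # parameter extraction module 30
    cepstrum: Callable[[Vector], Vector]       # cepstrum module 32
    iterate: Callable[[Vector, int], Vector]   # iterative module 34
    identify: Callable[[Vector], str]          # identification module 36
    n_iterations: int = 1                      # preset number of iterations

    def recognize(self, frame: Vector) -> str:
        mfcc = self.extract(frame)                           # d-dim MFCC
        cep = self.cepstrum(mfcc)                            # d-dim cepstrum MFCC
        features = self.iterate(cep, self.n_iterations)      # iterative cepstrum MFCC
        return self.identify(features)                       # recognition result
```

Passing the modules in as callables keeps the sketch neutral about how each stage is implemented; any concrete cepstrum or iteration routine can be plugged in without changing the flow.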

[0106] This embodiment of the present invention enhances the anti-noise performance of speech recognition in the feature space, and iterates the traditional MFCC parameters throu...

Embodiment 3

[0120] An embodiment of the present invention provides a terminal, where the terminal includes the speech recognition device described in Embodiment 2. The terminal in the embodiment of the present invention specifically refers to a terminal with a voice recognition function, including a mobile phone, a tablet computer, a PDA (Personal Digital Assistant), a notebook computer, and the like.

[0121] For the specific implementation of this embodiment, refer to Embodiments 1 and 2. This embodiment has the technical effects of Embodiments 1 and 2, which are not repeated here.

Abstract

The invention discloses a speech recognition method, device and terminal, intended to improve the anti-noise performance of existing speech recognition approaches. The method comprises the following steps: acquiring a frame of speech signal and extracting d-dimensional MFCC parameters from the speech signal; performing cepstrum calculation on the d-dimensional MFCC parameters to obtain d-dimensional cepstrum MFCC parameters; iteratively processing the cepstrum MFCC parameters of each dimension according to a preset number of iterations to obtain d-dimensional iterative cepstrum MFCC parameters; and recognizing the speech signal based on the d-dimensional iterative cepstrum MFCC parameters.

Description

Technical field

[0001] The present invention relates to the electronics field, and in particular to a speech recognition method, device and terminal.

Background technique

[0002] Speech feature extraction is a critical step in the speech recognition process. At present, speech recognition algorithms mainly preprocess the noisy signal (for example, by filtering) in the signal space to obtain a cleaner speech signal, but the recognition rate in noisy environments is still unsatisfactory. How to accurately and effectively extract feature parameters that reflect the characteristics of speech is therefore an important research topic. The robustness and accuracy of the feature parameters directly affect the accuracy of speech recognition, and they also have a great influence on the real-time performance of the recognition system.

[0003] At present, the feature parameter extraction methods mainly include pitch, formant, Linear Predictive Coding (LPC), Linear Pred...

Application Information

IPC(8): G10L15/26
CPC: G10L15/26
Inventor: 黎小松, 傅文治, 胡绩强, 汪平炜
Owner: シェンジェンインムーテクノロジーシーオーエルティーディー