Speech recognition method and device and terminal

A speech recognition and speech signal processing technique, applied in the electronic field, that addresses problems in existing feature parameters such as insufficient description of formant characteristics and no consideration of human auditory characteristics.

Active Publication Date: 2016-06-15

シェンジェンインムーテクノロジーシーオーエルティーディー

Problems solved by technology

LPC and LPCC do not take the auditory characteristics of the human ear into account, do not use a nonlinear frequency transformation, and cannot accurately describe the characteristics of the speaker.

MFCC parameters model how the human ear perceives different regions of the speech spectrum, so they take the ear's auditory characteristics into account. MFCC features perform well, have low computational complexity, and offer good recognition performance and robustness. However, the traditional MFCC feature parameters suffer from serious spectral energy leakage and do not adequately describe the formant characteristics of the speech signal. As a result, the traditional speech recognition process based on MFCC feature parameters has high redundancy, which leads to poor robustness and a significant drop in recognition rate at low signal-to-noise ratios.

Examples

Embodiment 1

[0047] An embodiment of the present invention provides a speech recognition method, including:

[0048] S101. Acquire a frame of speech signal, and extract d-dimensional MFCC parameters from the speech signal, where d is a positive integer and typically d = 24;

[0049] S102. Perform cepstrum calculation on the d-dimensional MFCC parameters to obtain d-dimensional cepstrum MFCC parameters;

[0050] S103. Perform iterative processing on the cepstrum MFCC parameters in each dimension according to the preset number of iterations to obtain d-dimensional iterative cepstrum MFCC parameters;

[0051] S104. Identify the speech signal based on the d-dimensional iterative cepstrum MFCC parameters.
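Step S101 can be illustrated with a conventional MFCC computation. The sketch below is a minimal, standard-library-only example and not the patent's implementation: the Hamming window, the 26-filter mel bank, the 8 kHz sample rate, and the function names are all assumptions chosen for illustration; only d = 24 comes from the text. Steps S102 and S103 are not shown because this excerpt does not give their formulas.

```python
import cmath
import math


def hz_to_mel(f_hz):
    """Common mel-scale approximation (a nonlinear frequency warping)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def power_spectrum(frame):
    """Hamming-window the frame and return |DFT|^2 / N for bins 0..N/2."""
    n = len(frame)
    windowed = [x * (0.54 - 0.46 * math.cos(2.0 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    spectrum = []
    for k in range(n // 2 + 1):
        # Naive DFT: acceptable for one short frame; an FFT is used in practice.
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(windowed))
        spectrum.append(abs(acc) ** 2 / n)
    return spectrum


def mel_filterbank(num_filters, n_fft, sample_rate):
    """Triangular filters whose edges are equally spaced on the mel scale."""
    top = hz_to_mel(sample_rate / 2.0)
    mel_points = [i * top / (num_filters + 1) for i in range(num_filters + 2)]
    bins = [int(math.floor((n_fft + 1) * mel_to_hz(m) / sample_rate))
            for m in mel_points]
    bank = []
    for j in range(1, num_filters + 1):
        left, center, right = bins[j - 1], bins[j], bins[j + 1]
        row = [0.0] * (n_fft // 2 + 1)
        for k in range(left, center):       # rising edge
            row[k] = (k - left) / (center - left)
        for k in range(center, right):      # falling edge, peak at center
            row[k] = (right - k) / (right - center)
        bank.append(row)
    return bank


def mfcc_frame(frame, sample_rate, d=24, num_filters=26):
    """Return d MFCCs for one frame (num_filters=26 is an assumption)."""
    spectrum = power_spectrum(frame)
    bank = mel_filterbank(num_filters, len(frame), sample_rate)
    # Log filterbank energies; the floor avoids log(0) on silent frames.
    log_energy = [math.log(max(sum(w * p for w, p in zip(row, spectrum)), 1e-10))
                  for row in bank]
    # DCT-II of the log energies gives the cepstral coefficients.
    return [sum(e * math.cos(math.pi * n * (j + 0.5) / num_filters)
                for j, e in enumerate(log_energy))
            for n in range(d)]


# Usage: one 256-sample frame of a 440 Hz tone sampled at 8 kHz.
tone = [math.sin(2.0 * math.pi * 440.0 * t / 8000.0) for t in range(256)]
mfcc = mfcc_frame(tone, sample_rate=8000)
print(len(mfcc))
```

The d-dimensional vector returned here would be the input to the cepstrum and iteration steps described in S102 and S103.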

[0052] This embodiment enhances the anti-noise performance of speech recognition in the feature space: the traditional MFCC parameters are iterated through cepstrum calculation to obtain the dynamic change trajectory of the MFCC parameter featur...

Embodiment 2

[0101] The present invention provides a speech recognition device, which is the device counterpart of the method in Embodiment 1 and includes:

[0102] A parameter extraction module 30, configured to acquire a frame of speech signal and extract d-dimensional MFCC parameters from the speech signal;

[0103] A cepstrum module 32, configured to perform cepstrum calculation on the d-dimensional MFCC parameters to obtain d-dimensional cepstrum MFCC parameters;

[0104] An iteration module 34, configured to iteratively process the cepstrum MFCC parameters of each dimension according to the preset number of iterations to obtain d-dimensional iterative cepstrum MFCC parameters;

[0105] An identification module 36, configured to identify the speech signal based on the d-dimensional iterative cepstrum MFCC parameters.
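The four-module structure can be sketched as a small composition of callables. This is only an illustrative skeleton under stated assumptions: the class name `SpeechRecognitionDevice`, the field names, the default iteration count, and the reading of "iterative processing" as applying one update function a preset number of times are all hypothetical. The lambdas in the usage example are stand-ins and not the patent's cepstrum or iteration formulas, which this excerpt does not give.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

Vector = List[float]


@dataclass
class SpeechRecognitionDevice:
    """Hypothetical wiring of modules 30, 32, 34 and 36."""
    extract: Callable[[Sequence[float]], Vector]   # parameter extraction module 30
    cepstrum: Callable[[Vector], Vector]           # cepstrum module 32
    iterate_once: Callable[[Vector], Vector]       # one pass of iteration module 34
    identify: Callable[[Vector], str]              # identification module 36
    num_iterations: int = 3                        # preset count; value is an assumption

    def recognize(self, frame: Sequence[float]) -> str:
        params = self.extract(frame)               # d-dimensional MFCC parameters
        params = self.cepstrum(params)             # d-dimensional cepstrum MFCC parameters
        for _ in range(self.num_iterations):       # iterate a preset number of times
            params = self.iterate_once(params)
        return self.identify(params)               # recognition result


# Usage with placeholder modules (stand-ins only, not real signal processing).
device = SpeechRecognitionDevice(
    extract=lambda frame: list(frame[:4]),
    cepstrum=lambda v: [x * 2.0 for x in v],
    iterate_once=lambda v: [x + 1.0 for x in v],
    identify=lambda v: "high" if sum(v) > 10 else "low",
    num_iterations=2,
)
label = device.recognize([1.0, 2.0, 3.0, 4.0, 5.0])
print(label)
```

Keeping each module behind a plain callable mirrors the one-to-one mapping between the device modules and steps S101 to S104, so any one stage can be swapped without touching the others.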

[0106] This embodiment enhances the anti-noise performance of speech recognition in the feature space, and iterates the traditional MFCC parameters throu...

Embodiment 3

[0120] An embodiment of the present invention provides a terminal, where the terminal includes the speech recognition device described in Embodiment 2. The terminal in the embodiment of the present invention specifically refers to a terminal with a voice recognition function, including a mobile phone, a tablet computer, a PDA (Personal Digital Assistant), a notebook computer, and the like.

[0121] For the specific implementation of this embodiment, refer to Embodiments 1 and 2. This embodiment has the same technical effects as Embodiments 1 and 2, which are not repeated here.

Abstract

The invention discloses a speech recognition method, device, and terminal that aim to improve the anti-noise performance of existing speech recognition approaches. The method comprises the following steps: acquiring a frame of speech signal and extracting d-dimensional MFCC parameters from the speech signal; performing cepstrum calculation on the d-dimensional MFCC parameters to obtain d-dimensional cepstrum MFCC parameters; iteratively processing the cepstrum MFCC parameters of each dimension according to a preset number of iterations to obtain d-dimensional iterative cepstrum MFCC parameters; and recognizing the speech signal based on the d-dimensional iterative cepstrum MFCC parameters.

Description

Technical field

[0001] The present invention relates to the electronic field, in particular to a speech recognition method, device, and terminal.

Background

[0002] Speech feature extraction is a critical step in the speech recognition process. At present, speech recognition algorithms mainly preprocess the noisy signal (for example, by filtering) in the signal space to obtain a cleaner speech signal, but the recognition rate in noisy environments is still unsatisfactory. How to accurately and effectively extract feature parameters that reflect the characteristics of speech is therefore an important research topic. The robustness and accuracy of the feature parameters directly affect the accuracy of speech recognition, and they also strongly influence the real-time performance of the recognition system.

[0003] At present, the feature parameter extraction methods mainly include pitch, formant, Linear Predictive Coding (LPC), Linear Pred...

Application Information

IPC(8): G10L15/26

CPC: G10L15/26

Inventor: 黎小松, 傅文治, 胡绩强, 汪平炜

Owner シェンジェンインムーテクノロジーシーオーエルティーディー
