Isolated digit speech recognition classification system and method combining principal component analysis (PCA) with restricted Boltzmann machine (RBM)

A digit speech recognition and classification technology, applied in speech recognition, speech analysis, instruments, etc. It addresses the problem that a single type of method struggles to combine the strengths of different approaches.

Inactive. Publication Date: 2015-12-30
CHANGAN UNIV
3 Cites, 23 Cited by

AI Technical Summary

Problems solved by technology

Each recognition method has its own strengths, and it is difficult for any single type of method to combine the advantages of all of them.

Examples

Embodiment Construction

[0072] The present invention is described in further detail below with reference to the accompanying drawings:

[0073] Referring to Figures 1 to 3, an isolated digit speech recognition classification system combining PCA and RBM comprises an isolated digit speech input module, an MFCC and first-order difference MFCC feature extraction module, a PCA linear dimensionality reduction module, an RBM nonlinear dimensionality reduction module, and a Softmax classification and recognition module.

[0074] The isolated digit speech input module samples or reads the isolated digit speech signal. The sampling frequency is 12.5 kHz, and each sample is quantized to 16 bits. The sampled results are saved as files for use by the subsequent MFCC and first-order difference MFCC feature extraction module.
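As a rough illustration of the input stage in [0074], the sketch below writes and reads mono audio at 12.5 kHz with 16-bit quantization using Python's standard `wave` module. The WAV container, the single channel, and the helper names `quantize_16bit`, `write_wav`, and `read_wav` are assumptions made for illustration; the text only states that the sampled results are saved as files.

```python
import io
import math
import struct
import wave

SAMPLE_RATE_HZ = 12_500   # sampling frequency stated in [0074]
SAMPLE_WIDTH_BYTES = 2    # 16-bit quantization stated in [0074]


def quantize_16bit(samples):
    """Map floats in [-1.0, 1.0] to signed 16-bit integers, clipping anything outside."""
    return [max(-32768, min(32767, int(round(s * 32767)))) for s in samples]


def write_wav(target, samples):
    """Write mono 16-bit PCM at 12.5 kHz to a path or a binary file-like object."""
    pcm = quantize_16bit(samples)
    with wave.open(target, "wb") as wf:
        wf.setnchannels(1)                      # mono is an assumption
        wf.setsampwidth(SAMPLE_WIDTH_BYTES)
        wf.setframerate(SAMPLE_RATE_HZ)
        wf.writeframes(struct.pack(f"<{len(pcm)}h", *pcm))


def read_wav(source):
    """Return (sample_rate, list of 16-bit samples) from a mono 16-bit WAV."""
    with wave.open(source, "rb") as wf:
        n = wf.getnframes()
        raw = wf.readframes(n)
        return wf.getframerate(), list(struct.unpack(f"<{n}h", raw))


# Round trip through an in-memory buffer with a 0.1 s synthetic tone.
buf = io.BytesIO()
tone = [0.5 * math.sin(2 * math.pi * 440 * t / SAMPLE_RATE_HZ) for t in range(1250)]
write_wav(buf, tone)
buf.seek(0)
rate, pcm = read_wav(buf)
```

A file path can be passed in place of the in-memory buffer when the samples should persist for the feature extraction module.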

[0075] The MFCC and first-order difference MFCC feature extraction module extracts the Mel-frequency cepstral coefficients (MFCC) and the first-order difference MFCC of the speech signal: First, the in...
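Paragraph [0075] is cut off before it gives the difference formula, so the following is only a sketch of one common way to compute first-order difference (delta) coefficients and append them to the static MFCC frames. The regression form, the window half-width `n=2`, the edge handling, and the names `delta` and `combine` are all assumptions, not details taken from the patent.

```python
def delta(frames, n=2):
    """First-order difference of a sequence of feature frames.

    frames: list of equal-length lists, one list of coefficients per time frame.
    n: half-width of the regression window (assumed value).
    """
    num_frames = len(frames)
    denom = 2 * sum(k * k for k in range(1, n + 1))
    out = []
    for t in range(num_frames):
        row = []
        for d in range(len(frames[t])):
            acc = 0.0
            for k in range(1, n + 1):
                ahead = frames[min(t + k, num_frames - 1)][d]  # repeat last frame at the edge
                behind = frames[max(t - k, 0)][d]              # repeat first frame at the edge
                acc += k * (ahead - behind)
            row.append(acc / denom)
        out.append(row)
    return out


def combine(frames, n=2):
    """Concatenate each static frame with its delta frame."""
    return [static + d for static, d in zip(frames, delta(frames, n))]


# Toy input: coefficient 0 rises linearly over time, coefficient 1 is constant.
ramp = [[float(t), 2.0] for t in range(5)]
ramp_delta = delta(ramp)
ramp_combined = combine(ramp)
```

For the linear ramp, the delta of the first coefficient is 1.0 away from the edges and the delta of the constant coefficient is 0.0, which is the behavior expected of a slope estimate.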

Abstract

The invention discloses an isolated digit speech recognition classification system and method combining principal component analysis (PCA) with a restricted Boltzmann machine (RBM). First, Mel-frequency cepstral coefficients (MFCC) are combined with first-order difference MFCC to make a preliminary extraction of the dynamic speech features of an isolated digit. Next, PCA performs linear dimensionality reduction on the combined MFCC features and unifies the dimensions of the resulting features. The RBM then performs nonlinear dimensionality reduction on these new features. Finally, a Softmax classifier recognizes and classifies the digit speech features obtained after nonlinear dimensionality reduction. By combining PCA linear dimensionality reduction, unification of feature dimensions, and RBM nonlinear dimensionality reduction, the invention greatly improves the feature representation and classification capabilities of the model, raises the recognition accuracy for isolated digit speech, and provides an efficient solution for high-accuracy recognition of isolated digit speech.
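The chain of stages in the abstract (PCA, then RBM, then Softmax) can be sketched with NumPy as below. This is a minimal illustration under several assumptions that do not come from the patent: the inputs are already fixed-length feature vectors (how the patent unifies utterances of different lengths is not shown in this excerpt), the RBM has Bernoulli units trained with one-step contrastive divergence, PCA outputs are min-max scaled to [0, 1] before entering the RBM, and all sizes, learning rates, and the synthetic data are placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)


def pca_fit(X, k):
    """Return the mean and the top-k principal directions of X (rows are samples)."""
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:k]


def pca_transform(X, mean, components):
    return (X - mean) @ components.T


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def rbm_train(V, n_hidden, epochs=50, lr=0.05):
    """Train a Bernoulli RBM with one-step contrastive divergence (CD-1)."""
    n_visible = V.shape[1]
    W = 0.01 * rng.standard_normal((n_visible, n_hidden))
    b_v = np.zeros(n_visible)
    b_h = np.zeros(n_hidden)
    for _ in range(epochs):
        h_prob = sigmoid(V @ W + b_h)                          # positive phase
        h_sample = (rng.random(h_prob.shape) < h_prob).astype(float)
        v_recon = sigmoid(h_sample @ W.T + b_v)                # reconstruction
        h_recon = sigmoid(v_recon @ W + b_h)                   # negative phase
        W += lr * (V.T @ h_prob - v_recon.T @ h_recon) / len(V)
        b_v += lr * (V - v_recon).mean(axis=0)
        b_h += lr * (h_prob - h_recon).mean(axis=0)
    return W, b_h


def rbm_hidden(V, W, b_h):
    """Hidden-unit activation probabilities, used as the nonlinear features."""
    return sigmoid(V @ W + b_h)


def softmax(z):
    z = z - z.max(axis=1, keepdims=True)                       # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def softmax_train(H, y, n_classes, epochs=300, lr=0.5):
    """Fit a Softmax classifier by batch gradient descent on cross-entropy."""
    theta = np.zeros((H.shape[1], n_classes))
    bias = np.zeros(n_classes)
    onehot = np.eye(n_classes)[y]
    for _ in range(epochs):
        p = softmax(H @ theta + bias)
        theta -= lr * H.T @ (p - onehot) / len(H)
        bias -= lr * (p - onehot).mean(axis=0)
    return theta, bias


# Synthetic stand-in data: 200 fixed-length vectors for 10 digit classes.
y = rng.integers(0, 10, size=200)
centers = rng.standard_normal((10, 60))
X = centers[y] + 0.3 * rng.standard_normal((200, 60))

mean, comps = pca_fit(X, k=20)                  # linear dimensionality reduction
Z = pca_transform(X, mean, comps)
lo, hi = Z.min(axis=0), Z.max(axis=0)
V = (Z - lo) / (hi - lo)                        # scale to [0, 1] (assumption)
W, b_h = rbm_train(V, n_hidden=12)              # nonlinear dimensionality reduction
H = rbm_hidden(V, W, b_h)
theta, bias = softmax_train(H, y, n_classes=10)
probs = softmax(H @ theta + bias)
pred = probs.argmax(axis=1)                     # predicted digit per sample
```

The sketch keeps the three stages as separate functions so that each one can be swapped out, for example replacing the Bernoulli RBM with a variant that accepts real-valued inputs directly.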

Description

Technical field

[0001] The invention belongs to the field of speech recognition, and in particular relates to an isolated digit speech recognition classification system and method combining PCA and RBM.

Background

[0002] Digit speech recognition has broad research and application value. Common methods include dynamic time warping (DTW), principal component analysis (PCA), and artificial neural networks (ANN). Based on dynamic programming, DTW solves the problem of matching templates against utterances of different lengths. However, DTW is computationally expensive, and its recognition performance depends on endpoint detection. PCA can reduce and unify the dimensionality of data, but it is essentially a linear dimensionality reduction method based on an optimal orthogonal transformation; it cannot retain the nonlinear characteristics of the original data, and it is difficult to obtain a compa...
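The dynamic-programming idea behind DTW mentioned in [0002] can be sketched as follows. This is a textbook formulation with a Euclidean frame distance and an unconstrained step pattern, included only to illustrate the background method; it is not part of the PCA and RBM approach, and the function name `dtw_distance` is hypothetical.

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two sequences of feature vectors.

    a, b: lists of equal-dimension vectors; the sequences may differ in length.
    """
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] is the best cumulative cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = sum((x - y) ** 2 for x, y in zip(a[i - 1], b[j - 1])) ** 0.5
            cost[i][j] = d + min(cost[i - 1][j],      # a advances, b repeats
                                 cost[i][j - 1],      # b advances, a repeats
                                 cost[i - 1][j - 1])  # both advance
    return cost[n][m]


template = [[0.0], [1.0], [2.0]]
stretched = [[0.0], [0.0], [1.0], [1.0], [2.0], [2.0]]  # same shape, spoken slower
different = [[0.0], [1.0], [3.0]]

same_shape_cost = dtw_distance(template, stretched)
different_cost = dtw_distance(template, different)
```

The nested loops fill an (n+1) by (m+1) table, which reflects the computational cost that the background section lists as a drawback of DTW.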

Application Information

IPC(8): G10L15/26, G10L15/08
Inventor: 宋青松, 田正鑫, 安毅生, 赵祥模
Owner: CHANGAN UNIV