Voice emotion recognition method based on multi-fractal and information fusion

A voice emotion recognition method based on multi-fractal analysis and information fusion, applied in the information technology field. It addresses the inability of traditional acoustic features to describe the chaotic characteristics of voice signals and the resulting low recognition accuracy.

Publication Date: 2014-12-24 (status: Inactive)
PEKING UNIV SHENZHEN GRADUATE SCHOOL

AI Technical Summary

Problems solved by technology

However, the generation of speech signals is a complex, non-stationary, and nonlinear process with an underlying chaotic mechanism. Traditional acoustic features lack the ability to describe the chaotic characteristics of speech signals.
[0005] Modeling methods for speech signals fall into linear and nonlinear approaches: linear methods include the K-nearest-neighbor method, principal component analysis, and so on, while nonlinear methods include hidden Markov...



Examples


Embodiment Construction

[0048] The present invention will be described in further detail below in conjunction with the accompanying drawings.

[0049] The present invention provides a speech emotion recognition method based on nonlinear analysis, in which:

[0050] Step 1: The Mandarin emotional speech database of Beihang University is used as the speech emotion corpus. The database covers seven emotion categories: sadness, anger, surprise, fear, joy, disgust, and calmness. For each of anger, joy, sadness, and calmness, 180 speech samples are selected, giving a total of 720 speech samples for emotion recognition. Of these, the first 260 speech samples are used to train the recognition model, and the last 180 speech samples are used to test its performance; a minimal sketch of this split follows.
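A minimal sketch of the corpus split described in Step 1, assuming the selected samples (180 each for four emotions) are stored one directory per emotion in a fixed order; the directory layout, file naming, and `build_sample_list` helper are illustrative assumptions, not part of the patent.

```python
# Sketch of the Step 1 corpus split; paths and layout are hypothetical.
from pathlib import Path

EMOTIONS = ["anger", "joy", "sadness", "calm"]
SAMPLES_PER_EMOTION = 180

def build_sample_list(root: str) -> list[tuple[Path, str]]:
    """Collect (path, label) pairs: 180 per emotion, 720 in total."""
    samples = []
    for emotion in EMOTIONS:
        wavs = sorted(Path(root, emotion).glob("*.wav"))[:SAMPLES_PER_EMOTION]
        samples.extend((wav, emotion) for wav in wavs)
    return samples

samples = build_sample_list("beihang_mandarin_db")  # hypothetical root directory
train_set = samples[:260]    # first 260 samples train the model (per the patent text)
test_set = samples[-180:]    # last 180 samples test its performance
```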

[0051] Step 2: The chaotic character of the speech signal is verified with the Lyapunov exponent, as shown in Figure 2. The Lyapunov exponent measures the average rate at which the orbits generated from two nearby initial values diverge or converge; a positive largest Lyapunov exponent indicates chaotic dynamics.
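The patent only names the Lyapunov exponent as the chaos test; below is a hedged sketch of one common estimator of the largest exponent (Rosenstein-style mean log divergence of nearest-neighbor pairs). The embedding parameters, Theiler window, and this particular estimator are illustrative assumptions, suitable only for short frames because of the pairwise distance matrix.

```python
# Sketch: largest Lyapunov exponent of a short speech frame.
# Parameters (dim, tau, theiler) are assumptions, not the patent's values.
import numpy as np

def largest_lyapunov(x: np.ndarray, dim: int = 5, tau: int = 2,
                     n_steps: int = 50, theiler: int = 20) -> float:
    """Estimate the largest Lyapunov exponent (per sample) of series x."""
    # Delay embedding: rows are state vectors [x[i], x[i+tau], ...].
    n = len(x) - (dim - 1) * tau
    emb = np.column_stack([x[i * tau: i * tau + n] for i in range(dim)])

    # Nearest neighbor of each point, excluding temporally close points.
    dists = np.linalg.norm(emb[:, None, :] - emb[None, :, :], axis=2)
    idx = np.arange(n)
    dists[np.abs(idx[:, None] - idx[None, :]) < theiler] = np.inf
    nn = dists.argmin(axis=1)

    # Mean log divergence of each neighbor pair, tracked k steps forward.
    div = []
    for k in range(1, n_steps):
        valid = (idx + k < n) & (nn + k < n)
        d = np.linalg.norm(emb[idx[valid] + k] - emb[nn[valid] + k], axis=1)
        div.append(np.mean(np.log(d[d > 0])))

    # The slope of the divergence curve approximates the largest exponent;
    # a positive slope indicates chaotic dynamics.
    return np.polyfit(np.arange(1, n_steps), div, 1)[0]
```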



Abstract

The invention discloses a voice emotion recognition method based on multi-fractal analysis and information fusion. The method comprises the following steps: first, voice sample data are extracted from a voice library, and a training set and a testing set of voice samples are established; second, nonlinear feature values used for voice emotion recognition are extracted from the training set, the nonlinear features comprising the multi-fractal spectrum and the generalized Hurst exponent of the voice signal; third, the training set is preprocessed and the nonlinear feature values serve as inputs for training multiple weak classifiers; fourth, the trained weak classifiers are combined into a strong classifier, which is tested on the voice samples in the testing set; fifth, new voice signals are classified by the tested strong classifier, recognizing the emotion category corresponding to each voice signal. The method greatly improves the accuracy of voice emotion recognition.
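The abstract's feature step (multi-fractal spectrum and generalized Hurst exponent) and fusion step (weak classifiers combined into a strong classifier) can be sketched as follows. This is a minimal illustration, assuming MFDFA (multifractal detrended fluctuation analysis) to estimate h(q) and AdaBoost over decision stumps for the fusion; the patent does not disclose its exact estimator or boosting variant here, so the function names, q range, and scales are assumptions.

```python
# Sketch: generalized Hurst exponents h(q) via MFDFA, then AdaBoost fusion.
import numpy as np
from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier

def generalized_hurst(x, q_values, scales):
    """h(q) via MFDFA with linear detrending; x should exceed max(scales)."""
    profile = np.cumsum(x - np.mean(x))            # integrated, mean-removed signal
    h = []
    for q in q_values:
        log_fq = []
        for s in scales:
            segs = len(profile) // s
            f2 = []
            for v in range(segs):
                seg = profile[v * s:(v + 1) * s]
                t = np.arange(s)
                fit = np.polyval(np.polyfit(t, seg, 1), t)  # local linear trend
                f2.append(np.mean((seg - fit) ** 2))
            f2 = np.asarray(f2)
            if q == 0:                              # limit formula for q -> 0
                fq = np.exp(0.5 * np.mean(np.log(f2)))
            else:
                fq = np.mean(f2 ** (q / 2)) ** (1 / q)
            log_fq.append(np.log(fq))
        # h(q) is the slope of log F_q(s) versus log s.
        h.append(np.polyfit(np.log(scales), log_fq, 1)[0])
    return np.asarray(h)

# Feature vector per utterance: h(q) over a range of q. The multifractal
# spectrum follows from h(q) via the Legendre transform (not shown):
# tau(q) = q*h(q) - 1, alpha = tau'(q), f(alpha) = q*alpha - tau(q).
q_values = np.arange(-5, 6)                 # assumed q range
scales = np.array([64, 128, 256, 512])      # assumed window sizes (samples)

# Fusion step: decision stumps as weak classifiers, boosted into a strong one.
strong = AdaBoostClassifier(DecisionTreeClassifier(max_depth=1), n_estimators=100)
# strong.fit(train_features, train_labels); strong.predict(test_features)
```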

Description

Technical field

[0001] The invention relates to nonlinear feature extraction from speech signals and to a speech emotion recognition method based on such nonlinear features, in particular to a speech emotion recognition method based on multi-fractal analysis and information fusion. The invention belongs to the field of information technology.

Background technique

[0002] The emotion in a speech signal is one of the important bases for judging human emotions. A speech signal contains a large amount of non-semantic information alongside the semantic information generated according to pronunciation rules; the semantic information carries the linguistic content, while the non-semantic information carries, among other things, the speaker's emotional factors. Traditional speech recognition is concerned only with the accuracy of speech semantics and ignores the emotional information in speech signals, whose features are usually treated as difference noise and pattern variation...


Application Information

IPC(8): G10L25/63; G10L15/00
Inventors: 刘宏 (Liu Hong), 张文娟 (Zhang Wenjuan)
Owner: PEKING UNIV SHENZHEN GRADUATE SCHOOL