Voice emotion recognition system and voice emotion recognition method

A speech emotion recognition technology, applied in the fields of speech emotion recognition systems and deep neural networks, that addresses problems such as poor versatility and achieves reduced size, training time, and complexity.

CN110534133A | Active | Publication Date: 2019-12-03 | 珠海亿智电子科技有限公司

Examples

Embodiment

[0050] The present invention will be further described below in combination with specific embodiments.

[0051] As shown in Figure 1, a speech emotion recognition system comprises an audio preprocessing module, a CNN module, a pyramidal FSMN module, a time-step attention module, and an output module, connected in sequence. The CNN module has a convolutional layer, and the pyramidal FSMN module has a pyramid memory block structure.

[0052] The speech emotion recognition system in this embodiment is an end-to-end feed-forward deep neural network. The present invention improves the basic network of the classic FSMN and DFSMN structures by adding a convolutional layer to it, which extracts lower-level features.
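For orientation, the following is a minimal sketch of a memory block in the style of the published FSMN and DFSMN literature, where each output frame is the current hidden frame plus weighted past and future frames. It is not taken from this patent: the function name `fsmn_memory_block`, its parameters, and the single shared `stride` are assumptions made for illustration, and this excerpt does not specify how the pyramid memory block structure chooses its taps.

```python
def fsmn_memory_block(hidden, lookback, lookahead, stride=1):
    """Apply an FSMN-style memory block to a sequence of hidden vectors.

    hidden:    list of T frames, each a list of D floats
    lookback:  coefficient vectors a_1..a_N1, applied to past frames
    lookahead: coefficient vectors c_1..c_N2, applied to future frames
    stride:    spacing between tapped frames (hypothetical single value;
               frames outside the sequence are treated as zero)
    """
    num_frames = len(hidden)
    dim = len(hidden[0])
    output = []
    for t in range(num_frames):
        memory = list(hidden[t])  # identity term: the current frame
        for i, coeff in enumerate(lookback, start=1):
            src = t - i * stride
            if src >= 0:
                for d in range(dim):
                    memory[d] += coeff[d] * hidden[src][d]
        for j, coeff in enumerate(lookahead, start=1):
            src = t + j * stride
            if src < num_frames:
                for d in range(dim):
                    memory[d] += coeff[d] * hidden[src][d]
        output.append(memory)
    return output


# Toy usage: three 1-dimensional frames, one past tap and one future tap.
frames = [[1.0], [2.0], [3.0]]
print(fsmn_memory_block(frames, lookback=[[0.5]], lookahead=[[0.25]]))
```

Because the block only uses fixed-size weighted taps over neighbouring frames, it stays feed-forward, which is consistent with the feed-forward structure described above.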

[0053] The audio data used in the present invention is in the mainstream WAV format, with a sampling frequency of 16000 Hz. The original audio data is divided into frames, and a Fourier transform is applied to each frame. The length of each f...
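The framing and Fourier transform step can be sketched as follows. This is an illustrative example only: the text above is cut off before it gives the frame length, so `FRAME_LEN` and `HOP` below are placeholder values, not the patent's parameters, and the helper names are hypothetical. A naive DFT is used to keep the sketch free of dependencies; a practical implementation would use an FFT.

```python
import cmath
import math


def frame_signal(samples, frame_len, hop):
    """Split samples into overlapping frames; a trailing partial frame is dropped."""
    return [samples[start:start + frame_len]
            for start in range(0, len(samples) - frame_len + 1, hop)]


def magnitude_spectrum(frame):
    """Magnitudes of DFT bins 0..N/2, computed naively in O(N^2)."""
    n = len(frame)
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(frame)))
            for k in range(n // 2 + 1)]


def spectrogram(samples, frame_len, hop):
    """One magnitude spectrum per frame: rows are time steps, columns are bins."""
    return [magnitude_spectrum(f) for f in frame_signal(samples, frame_len, hop)]


SAMPLE_RATE = 16000  # sampling frequency stated in the text
FRAME_LEN = 64       # placeholder, NOT the value from the patent
HOP = 32             # placeholder, NOT the value from the patent

# 0.1 s of a 1000 Hz test tone standing in for real WAV samples.
tone = [math.sin(2 * math.pi * 1000 * n / SAMPLE_RATE) for n in range(1600)]
spec = spectrogram(tone, FRAME_LEN, HOP)
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
print(len(spec), len(spec[0]), peak_bin * SAMPLE_RATE / FRAME_LEN)
```

With these placeholder settings each bin spans 250 Hz, so the strongest bin of the test tone should correspond to 1000 Hz.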

Abstract

The invention discloses a speech emotion recognition system comprising an audio preprocessing module, a CNN module, a pyramid FSMN module, a time-step attention module, and an output module, connected in sequence. The invention also discloses a speech emotion recognition method applied to this system, comprising the following steps: (1) preprocessing the speech to obtain a spectrogram feature map; (2) operating on the spectrogram feature map to construct a feature map containing shallow audio information; (3) further processing that feature map to obtain deeper semantic and contextual information; (4) processing the feature map carrying the deeper semantic and contextual information to obtain the feature vector most relevant to the speaker's emotion across the whole utterance; and (5) outputting the emotion category corresponding to the whole utterance. Compared with the prior art, the method greatly improves speech emotion recognition performance.
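Step 4 reduces a sequence of per-time-step features to a single utterance-level vector. One common way to do this is softmax-weighted pooling over time steps, sketched below. The abstract does not state the form of the attention used, so this is an assumption for illustration; the function name `attention_pool` is hypothetical, and the relevance `scores` are passed in directly here, whereas a trained model would compute them from the features.

```python
import math


def attention_pool(features, scores):
    """Pool per-time-step feature vectors into one vector.

    features: list of T vectors, each a list of D floats
    scores:   list of T relevance scores (assumed given; a model would learn them)
    Returns (pooled_vector, weights), where weights = softmax(scores).
    """
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(features[0])
    pooled = [sum(w * frame[d] for w, frame in zip(weights, features))
              for d in range(dim)]
    return pooled, weights


# Toy usage: the second time step gets a higher score, so it dominates the result.
pooled, weights = attention_pool([[1.0, 0.0], [3.0, 2.0]], scores=[0.0, 2.0])
print(pooled, weights)
```

With equal scores the pooling reduces to a plain average over time; unequal scores let the time steps judged most relevant to emotion contribute more to the utterance-level vector.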

Description

Technical field

[0001] The invention relates to the technical field of artificial intelligence and speech recognition, and in particular to a speech emotion recognition system and a speech emotion recognition method. It is an end-to-end deep neural network technology that improves on DFSMN.

Background

[0002] With the continuous improvement of speech recognition technology and the wide application of speech recognition equipment, human-computer interaction is becoming increasingly common in daily life. However, most of these devices can recognize only the text-level content of human speech and cannot recognize the emotional state of the speaker. Speech emotion recognition has many useful applications in human-centered services and human-computer interaction, such as intelligent service robots, automated call centers, and distance education. It has attracted considerable research attention, and many methods have been proposed. D...

Application Information

Patent Timeline
03 Dec 2019: Publication of CN110534133A
IPC: G10L25/30; G10L25/63; G06N3/04; G06N3/063
CPC: G10L25/30; G10L25/63; G06N3/063; G06N3/045
Inventors: 殷绪成; 曹秒