Speech emotion recognition method for extracting depth space attention characteristics based on spectrogram

A technology for speech emotion recognition and deep feature extraction, applied in speech recognition, speech analysis, character and pattern recognition, etc., which addresses the problem that latent properties between the time and frequency domains (such as their relevance) are neglected by conventional features.

Active Publication Date: 2019-04-16
HANGZHOU DIANZI UNIV
5 Cites · 15 Cited by

AI Technical Summary

Problems solved by technology

[0004] In the study of emotional features for speech emotion recognition, many scholars have made attempts such as selecting traditional features, selecting specified features to test the recognition effect, or selecting processed features (such as first-order differences) for the same purpose. However, these sequential features have certain limitations: choosing frequency-domain features attends to the frequency dimension while ignoring the influence of the time dimension, whereas choosing time-domain features ignores the influence of the frequency dimension. In both cases, the latent properties hidden between the time and frequency domains (such as their relevance) are ignored.
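The spectrogram addresses this limitation because it retains both dimensions at once. A minimal sketch of the idea, using a short-time Fourier transform on a synthetic signal (the sample rate and frame parameters below are illustrative assumptions, not values taken from the patent):

```python
import numpy as np
from scipy.signal import stft

# The STFT keeps both the time axis and the frequency axis, so neither
# is discarded as it would be with purely time-domain or purely
# frequency-domain features.
sr = 16000                              # assumed sample rate (Hz)
t = np.arange(sr) / sr                  # one second of audio
signal = np.sin(2 * np.pi * 440 * t)    # synthetic stand-in for speech

# 400-sample frames with 160-sample hop (assumed framing parameters)
f, times, Z = stft(signal, fs=sr, nperseg=400, noverlap=240)
log_spec = np.log1p(np.abs(Z))          # log-magnitude spectrogram

print(log_spec.shape)                   # (frequency bins, time frames)
```

The resulting matrix is exactly the joint time-frequency representation that the method feeds into the subsequent networks.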




Embodiment Construction

[0074] The embodiments of the present invention will be described in detail below in conjunction with specific embodiments and drawings.

[0075] Before describing the specific technical solution of the present invention, some abbreviations and symbols are first defined and the system model is introduced. The basic experimental settings are a learning rate I of 0.001 and an input batch B of 400. The number of network layers is determined under optimal performance; the convolutional part is based on VGGNet, with the specific layer settings established through repeated experiments (see Table 1 for details). The F_CRNN network structure uses random initialization for the model weights and biases. For convenience, the hybrid neural network (CRNN) referred to below is the optimized network. All algorithms adopt supervised training, the category labels of the data are used only during training, and the experimental results are presented in t...
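The stated settings can be collected in a small configuration sketch. The learning rate and batch value come from the text above; the backbone note is the text's description, and everything else (field names, the dataclass itself) is an illustrative assumption, since Table 1 is not reproduced here:

```python
from dataclasses import dataclass

@dataclass
class FCRNNConfig:
    learning_rate: float = 0.001      # "learning rate I" from the text
    batch: int = 400                  # "input batch B" from the text
    conv_backbone: str = "VGG-style"  # convolutional part based on VGGNet
    supervised: bool = True           # labels used only during training

cfg = FCRNNConfig()
print(cfg.learning_rate, cfg.batch)
```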


PUM

No PUM

Abstract

The invention discloses a speech emotion recognition method for extracting depth space attention characteristics based on a spectrogram, which comprises the following steps: a. preprocessing speech in a standard emotion database marked with specified emotion labels to generate a spectrogram; b. sending the spectrogram into an Itti model to extract SEF characteristics; c. sending the spectrogram into a speech emotion model to extract shallow features; d. taking the SEF features and the shallow features as input and sending them to a CSWNet to generate calibration weight features; and e. sending the calibration weight features to the network layers behind the CRNN, extracting deep emotion features, and performing emotion classification through a Softmax classifier to generate the final classification result. Compared with the traditional model, the disclosed method improves the average recognition rate by 8.43% while only slightly increasing model complexity, distinguishes dissimilar emotions markedly, and generalizes well.
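The data flow of steps a through e can be sketched as follows. Every function here is a hypothetical placeholder standing in for the patent's actual components (the Itti-style saliency extractor, CSWNet, and CRNN are not specified in this summary); only the wiring between the steps follows the text:

```python
import numpy as np

def make_spectrogram(wave):   # step a: preprocess speech into a spectrogram
    return np.abs(np.fft.rfft(wave.reshape(10, -1), axis=1))

def extract_sef(spec):        # step b: saliency (SEF) features, placeholder
    return spec / (spec.max() + 1e-8)

def shallow_features(spec):   # step c: shallow features, placeholder
    return spec - spec.mean()

def cswnet(sef, shallow):     # step d: calibration weight features
    return sef * shallow      # element-wise weighting as a stand-in

def classify(weighted):       # step e: deep features + Softmax classifier
    logits = weighted.sum(axis=1)
    e = np.exp(logits - logits.max())
    return e / e.sum()

wave = np.random.default_rng(0).standard_normal(1000)
spec = make_spectrogram(wave)
probs = classify(cswnet(extract_sef(spec), shallow_features(spec)))
print(probs.shape, float(probs.sum()))
```

The point of the sketch is the fusion in step d: the saliency features gate the shallow features before the deeper layers and the Softmax see them.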

Description

Technical field

[0001] The invention belongs to the field of artificial intelligence emotion recognition, and in particular relates to a speech emotion recognition method for extracting deep space attention features based on spectrograms.

Background technique

[0002] With the rapid development of artificial intelligence technology and the robot industry, people continually put forward higher requirements for interaction with artificial intelligence. However, most artificial intelligence systems so far cannot recognize and respond to various human emotions. At present, research on human-computer interaction focuses mainly on image emotion recognition and speech emotion recognition (Speech Emotion Recognition, SER). Image emotion recognition is aimed mainly at recognizing human facial expressions, but issues such as speed limit its accurate technical implementation, and its high hardware requirements make it difficult to be wi...


Application Information

Patent Timeline
no application
Patent Type & Authority: Applications (China)
IPC (8): G10L15/02, G10L15/06, G10L15/16, G06K9/62, G06N3/04
CPC: G10L15/02, G10L15/063, G10L15/16, G06N3/045, G06F18/24
Inventor: 王金华, 应娜, 朱辰都
Owner HANGZHOU DIANZI UNIV