Speech emotion recognition method for extracting depth space attention characteristics based on spectrogram

A technology for speech emotion recognition and deep feature extraction, applied in speech recognition, speech analysis, character and pattern recognition, etc., which addresses the problem that latent properties between the time and frequency domains (such as their relevance) are neglected by conventional features.

Active Publication Date: 2019-04-16
HANGZHOU DIANZI UNIV
5 Cites · 15 Cited by

AI Technical Summary

Problems solved by technology

[0004] In the study of emotional features for speech emotion recognition, many scholars have made attempts such as selecting traditional features, selecting specified features to test the recognition effect, or selecting processed features (such as first-order differences) for the same purpose. However, these sequential features have certain limitations: choosing frequency-domain features attends to the frequency dimension while ignoring the influence of the time dimension, whereas choosing time-domain features ignores the influence of the frequency dimension. In both cases, the latent properties hidden between the time and frequency domains (such as their relevance) are ignored.
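The spectrogram addresses this limitation because it retains both dimensions at once. A minimal sketch of the idea, using a short-time Fourier transform on a synthetic signal (the sample rate and frame parameters below are illustrative assumptions, not values taken from the patent):

```python
import numpy as np
from scipy.signal import stft

# The STFT keeps both the time axis and the frequency axis, so neither
# is discarded as it would be with purely time-domain or purely
# frequency-domain features.
sr = 16000                              # assumed sample rate (Hz)
t = np.arange(sr) / sr                  # one second of audio
signal = np.sin(2 * np.pi * 440 * t)    # synthetic stand-in for speech

# 400-sample frames with 160-sample hop (assumed framing parameters)
f, times, Z = stft(signal, fs=sr, nperseg=400, noverlap=240)
log_spec = np.log1p(np.abs(Z))          # log-magnitude spectrogram

print(log_spec.shape)                   # (frequency bins, time frames)
```

The resulting matrix is exactly the joint time-frequency representation that the method feeds into the subsequent networks.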




Embodiment Construction

[0074] The embodiments of the present invention will be described in detail below in conjunction with specific embodiments and drawings.

[0075] Before describing the specific technical solution of the present invention, some abbreviations and symbols are first defined and the system model is introduced. The basic experimental settings are a learning rate I of 0.001 and an input batch B of 400. The number of network layers is determined under optimal performance; the convolutional part is based on VGGNet, with the specific layer settings established through repeated experiments (see Table 1 for details). The F_CRNN network structure uses random initialization for the model weights and biases. For convenience, the hybrid neural network (CRNN) referred to below is the optimized network. All algorithms adopt supervised training, the category labels of the data are used only during training, and the experimental results are presented in t...
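The stated settings can be collected in a small configuration sketch. The learning rate and batch value come from the text above; the backbone note is the text's description, and everything else (field names, the dataclass itself) is an illustrative assumption, since Table 1 is not reproduced here:

```python
from dataclasses import dataclass

@dataclass
class FCRNNConfig:
    learning_rate: float = 0.001      # "learning rate I" from the text
    batch: int = 400                  # "input batch B" from the text
    conv_backbone: str = "VGG-style"  # convolutional part based on VGGNet
    supervised: bool = True           # labels used only during training

cfg = FCRNNConfig()
print(cfg.learning_rate, cfg.batch)
```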


PUM

No PUM

Abstract

The invention discloses a speech emotion recognition method for extracting depth space attention characteristics based on a spectrogram, which comprises the following steps: a. preprocessing speech in a standard emotion database marked with specified emotion labels to generate a spectrogram; b. sending the spectrogram into an Itti model to extract SEF characteristics; c. sending the spectrogram into a speech emotion model to extract shallow features; d. taking the SEF features and the shallow features as input and sending them to a CSWNet to generate calibration weight features; and e. sending the calibration weight features to the network layers behind the CRNN, extracting deep emotion features, and performing emotion classification through a Softmax classifier to generate the final classification result. Compared with the traditional model, the disclosed method improves the average recognition rate by 8.43% while only slightly increasing model complexity, distinguishes dissimilar emotions markedly, and generalizes well.
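The data flow of steps a through e can be sketched as follows. Every function here is a hypothetical placeholder standing in for the patent's actual components (the Itti-style saliency extractor, CSWNet, and CRNN are not specified in this summary); only the wiring between the steps follows the text:

```python
import numpy as np

def make_spectrogram(wave):   # step a: preprocess speech into a spectrogram
    return np.abs(np.fft.rfft(wave.reshape(10, -1), axis=1))

def extract_sef(spec):        # step b: saliency (SEF) features, placeholder
    return spec / (spec.max() + 1e-8)

def shallow_features(spec):   # step c: shallow features, placeholder
    return spec - spec.mean()

def cswnet(sef, shallow):     # step d: calibration weight features
    return sef * shallow      # element-wise weighting as a stand-in

def classify(weighted):       # step e: deep features + Softmax classifier
    logits = weighted.sum(axis=1)
    e = np.exp(logits - logits.max())
    return e / e.sum()

wave = np.random.default_rng(0).standard_normal(1000)
spec = make_spectrogram(wave)
probs = classify(cswnet(extract_sef(spec), shallow_features(spec)))
print(probs.shape, float(probs.sum()))
```

The point of the sketch is the fusion in step d: the saliency features gate the shallow features before the deeper layers and the Softmax see them.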

Description

Technical field

[0001] The invention belongs to the field of artificial intelligence emotion recognition, and in particular relates to a speech emotion recognition method for extracting deep space attention features based on spectrograms.

Background technique

[0002] With the rapid development of artificial intelligence technology and the robot industry, people continually put forward higher requirements for interaction with artificial intelligence. However, most artificial intelligence systems so far cannot recognize and respond to various human emotions. At present, research on human-computer interaction focuses mainly on image emotion recognition and speech emotion recognition (Speech Emotion Recognition, SER). Image emotion recognition is aimed mainly at recognizing human facial expressions, but issues such as speed limit its accurate technical implementation, and its high hardware requirements make it difficult to be wi...


Application Information

Patent Timeline
no application
Patent Type & Authority: Applications (China)
IPC (8): G10L15/02, G10L15/06, G10L15/16, G06K9/62, G06N3/04
CPC: G10L15/02, G10L15/063, G10L15/16, G06N3/045, G06F18/24
Inventor: 王金华, 应娜, 朱辰都
Owner HANGZHOU DIANZI UNIV