Speech emotion recognition method based on multi-scale deep convolution recurrent neural network

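A speech emotion recognition technique using a recurrent neural network, applied in fields such as speech analysis and instruments. It addresses the problem that existing methods ignore how speech spectrum segments of different lengths differ in their ability to discriminate emotions.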

Active Publication Date: 2018-10-30

TAIZHOU UNIV

Cites: 3; Cited by: 36

AI Technical Summary

Problems solved by technology

However, existing speech emotion recognition methods based on deep learning ignore the fact that speech spectrum segments of different lengths differ in how well they discriminate between emotion types (see: Mao Q, et al. Learning salient features for speech emotion recognition using convolutional neural networks. IEEE Transactions on Multimedia, 2014, 16(8): 2203-2213).

Embodiment Construction

[0045] The technical solutions of the present invention will be further described below in conjunction with the accompanying drawings and embodiments.

[0046] Figure 1 is a flowchart of the present invention, which mainly comprises the following steps:

[0047] Step 1: Generation of three-channel speech spectrum segments;

[0048] Step 2: Use a deep convolutional neural network (CNN) to extract features of speech spectrum segments at different scales;

[0049] Step 3: Use a long short-term memory (LSTM) network to perform temporal modeling of the speech spectrum segment sequences at different scales, and output the emotion recognition result for the entire utterance;

[0050] Step 4: Use a score-level fusion method to fuse the recognition results obtained by CNN+LSTM at different scales, and output the final speech emotion recognition result.
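The fusion in Step 4 (called score-level fusion in the abstract) can be illustrated with a minimal sketch. It assumes each scale's CNN+LSTM branch outputs one class-probability vector per utterance and that the vectors are combined by a weighted average; the function name `fuse_scores`, the weights, and the example scores are hypothetical and are not taken from the patent, which may use a different fusion rule.

```python
def fuse_scores(scores_per_scale, weights=None):
    """Fuse per-scale class scores by a weighted average (an assumed rule).

    scores_per_scale: one list of class scores per scale, all the same length.
    weights: optional per-scale weights; equal weights are used if omitted.
    Returns the fused score list and the index of the highest fused score.
    """
    n_scales = len(scores_per_scale)
    if weights is None:
        weights = [1.0] * n_scales
    total = sum(weights)
    n_classes = len(scores_per_scale[0])
    # Weighted average of the scores for each class across all scales.
    fused = [
        sum(w * s[c] for w, s in zip(weights, scores_per_scale)) / total
        for c in range(n_classes)
    ]
    label = max(range(n_classes), key=fused.__getitem__)
    return fused, label


# Hypothetical scores for three emotion classes from three scales.
scores = [
    [0.6, 0.3, 0.1],
    [0.2, 0.5, 0.3],
    [0.3, 0.4, 0.3],
]
fused, label = fuse_scores(scores)
```

With equal weights, no single scale dominates the decision; unequal weights would let a more reliable scale contribute more, if such a preference were justified by validation data.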

[0051] 1. The implementation of each step in the flowchart of the present invention is described in detail as follows in conjunction with...

Abstract

The invention discloses a speech emotion recognition method based on a multi-scale deep convolutional recurrent neural network. The method comprises the following steps: (1) three-channel speech spectrum segments are generated; (2) features of speech spectrum segments at different scales are extracted using a convolutional neural network (CNN); (3) temporal modeling of the speech spectrum segment sequences at different scales is performed using a long short-term memory (LSTM) network, and the emotion recognition result for the whole utterance is output; (4) the recognition results obtained by CNN+LSTM at different scales are fused using a score-level fusion method, and the final speech emotion recognition result is output. The method can effectively improve natural speech emotion recognition performance in real-world environments and can be applied in fields such as artificial intelligence, robotics, and natural human-computer interaction.
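As a rough illustration of what "segments at different scales" could mean in steps (1) and (2), the sketch below cuts an utterance of a given number of spectrogram frames into fixed-length, overlapping segments at several segment lengths. The segment lengths, hop sizes, frame count, and function names are all hypothetical; this excerpt does not state the values or the exact segmentation rule the patent uses.

```python
def segment_bounds(n_frames, seg_len, hop):
    """Return (start, end) frame indices of fixed-length segments.

    The end index is exclusive. Trailing frames that do not fill a whole
    segment are dropped, and an utterance shorter than seg_len yields [].
    """
    if seg_len <= 0 or hop <= 0:
        raise ValueError("seg_len and hop must be positive")
    return [(s, s + seg_len) for s in range(0, n_frames - seg_len + 1, hop)]


def multi_scale_segments(n_frames, scales):
    """Map each segment length to its list of segment bounds.

    scales: iterable of (seg_len, hop) pairs, one pair per scale.
    """
    return {seg_len: segment_bounds(n_frames, seg_len, hop)
            for seg_len, hop in scales}


# Hypothetical utterance of 100 frames, cut at two illustrative scales.
segments = multi_scale_segments(100, [(32, 16), (64, 32)])
```

Each scale produces its own segment sequence, which is the kind of input a per-scale CNN followed by an LSTM would consume before the per-scale results are fused.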

Description

Technical field

[0001] The invention relates to the fields of speech signal processing and pattern recognition, and in particular to a speech emotion recognition method based on a multi-scale deep convolutional recurrent neural network.

Background

[0002] Human speech not only contains rich textual information but also carries audio information that conveys the speaker's emotional expression, such as changes in pitch, intensity, and cadence. How to enable a computer to automatically recognize a speaker's emotional state from the speech signal, known as "speech emotion recognition", has become a hot research topic in artificial intelligence, pattern recognition, and affective computing. This research aims to enable a computer to acquire, recognize, and respond to a user's emotional information by analyzing the speaker's speech signal, so as to achieve more harmonious and natural interaction between the user and the comput...

Application Information

Patent Type & Authority: Applications (China)

IPC(8): G10L25/63, G10L25/30

CPC: G10L25/30, G10L25/63

Inventors: 张石清, 赵小明

Owner: TAIZHOU UNIV
