Speech emotion recognition method and device based on meta-multi-task learning

A speech emotion recognition and multi-task learning technology, applied in speech recognition, speech analysis, instruments, and related fields, that addresses the insufficient accuracy of speech emotion recognition.

Pending Publication Date: 2021-05-28

GUANGDONG UNIV OF TECH

7 Cites, 0 Cited by

Problems solved by technology

[0004] The present invention provides a method and device for speech emotion recognition based on meta-multi-task learning, in order to overcome the insufficient accuracy of speech emotion recognition in the prior art.

Examples

Embodiment 1

[0058] This embodiment provides a speech emotion recognition method based on meta-multi-task learning. As shown in Figure 1, the method mainly includes the following two key stages:

[0059] 1) Learning the correlation between auxiliary tasks by combining meta-learning and multi-task learning; this corresponds to the Multi-train Stage.

[0060] 2) Learning the ability to transfer knowledge from the auxiliary tasks to the main task; this corresponds to the Knowledge Transfer Stage.

[0061] As shown in Figure 2, the speech emotion recognition method based on meta-multi-task learning specifically includes the following steps:

[0062] 1) Dataset collection: IEMOCAP can be used, a dataset that describes emotions in both the dimensional emotion space and the discrete emotion space. Generally speaking, speech emotion can be represented in a continuous emotion space, such as the Valence-Arousal space, or by discrete emotion spac...

Embodiment 2

[0080] This embodiment provides a device for speech emotion recognition based on meta-multi-task learning. The device can implement the method described in Embodiment 1. As shown in Figure 3, the device includes:

[0081] 1) The acquisition unit is specifically configured as:

[0082] From the acquired speech dataset, select the data whose discrete-space emotion label is happiness, anger, sadness, or neutrality. In addition to the discrete emotion space label, each speech segment is also annotated with a dimensional emotion space label. In this embodiment, the Valence-Activation-Dominance space is selected as the dimensional emotion space.
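The selection step performed by the acquisition unit can be sketched roughly as follows. This is only an illustration under stated assumptions: the `Utterance` record, its field names, the helper `select_utterances`, and the toy label values are hypothetical and do not come from the patent or from the IEMOCAP distribution.

```python
from dataclasses import dataclass


# Hypothetical record layout; field names are illustrative, not from the patent.
@dataclass
class Utterance:
    utt_id: str
    emotion: str       # discrete emotion space label
    valence: float     # dimensional emotion space labels
    activation: float  # (Valence-Activation-Dominance)
    dominance: float


# The four discrete classes named in this embodiment.
TARGET_EMOTIONS = {"happiness", "anger", "sadness", "neutrality"}


def select_utterances(utterances):
    """Keep only utterances whose discrete label is one of the four target emotions."""
    return [u for u in utterances if u.emotion in TARGET_EMOTIONS]


# Toy data with made-up label values, for illustration only.
corpus = [
    Utterance("u1", "happiness", 4.0, 3.5, 3.0),
    Utterance("u2", "frustration", 2.0, 3.5, 3.5),
    Utterance("u3", "neutrality", 3.0, 2.5, 3.0),
]
selected = select_utterances(corpus)
print([u.utt_id for u in selected])
```

Each retained record keeps both label types, so the discrete label and the three dimensional labels remain available to later units.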

[0083] 2) The data processing unit is specifically configured as:

[0084] Slice the voice data in advance so that the slices are of approximately equal length and none exceeds 3 seconds, and then use Fourier transform, filtering, and other acoustic processing methods to extract the spectrogram from th...
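A minimal sketch of the slicing and Fourier-transform steps, using only NumPy, might look like the following. The function names, the 16 kHz sample rate, the FFT size, and the hop length are assumptions chosen for illustration; the filtering step mentioned above is omitted, and the patent's actual parameters are not given in this excerpt.

```python
import math

import numpy as np


def slice_audio(signal, sample_rate, max_seconds=3.0):
    """Split a 1-D signal into near-equal slices, each at most max_seconds long."""
    max_len = int(max_seconds * sample_rate)
    n_slices = max(1, math.ceil(len(signal) / max_len))
    # array_split yields pieces whose lengths differ by at most one sample.
    return np.array_split(signal, n_slices)


def magnitude_spectrogram(x, n_fft=512, hop=160):
    """Short-time Fourier transform magnitude, shape (frames, n_fft // 2 + 1)."""
    if len(x) < n_fft:
        x = np.pad(x, (0, n_fft - len(x)))  # zero-pad very short slices
    window = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack(
        [x[i * hop : i * hop + n_fft] * window for i in range(n_frames)]
    )
    return np.abs(np.fft.rfft(frames, axis=1))


# Usage on a synthetic 7-second, 440 Hz tone (assumed 16 kHz sample rate).
sr = 16000
t = np.arange(7 * sr) / sr
signal = np.sin(2 * np.pi * 440.0 * t)
slices = slice_audio(signal, sr)
specs = [magnitude_spectrogram(s) for s in slices]
print(len(slices), specs[0].shape)
```

Choosing the number of slices first and then dividing evenly keeps the slice lengths approximately equal, whereas cutting fixed 3-second windows would leave a short remainder at the end.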

Abstract

The invention relates to a speech emotion recognition method and device based on meta-multi-task learning. The method comprises the following steps: combining meta-learning and multi-task learning to learn the correlation between auxiliary tasks, and learning the ability to transfer knowledge from the auxiliary tasks to the main task. The main advantages of the method are as follows. For speech emotion recognition, the correlation between emotion in the continuous space and in the discrete space is taken into account. On the support set, meta-learning can learn the correlation between auxiliary tasks as multi-task learning does, and multi-task learning can share a learner as meta-learning does. On the query set, a knowledge transfer mechanism is introduced so that the model can capture the correlation between the main task and the auxiliary tasks. The device comprises an acquisition unit, a data processing unit, a metadata generation unit, an initialization unit, a meta-training unit, a meta-prediction fine-tuning unit, and a meta-prediction recognition unit. According to the invention, the accuracy of speech emotion recognition is significantly improved.

Description

Technical field

[0001] The present invention relates to the field of computer speech, and more specifically to a speech emotion recognition method and device based on meta-multi-task learning.

Background art

[0002] The development of science and technology is increasingly changing the way people live, and computers are gradually developing towards being able to communicate, think, and make decisions like human beings. Human-computer interaction technology promotes more natural and intelligent interaction between humans and computers. Speech emotion recognition is an important topic in human-computer interaction and artificial intelligence, and plays an important role in practical applications such as electronic distance learning, disease treatment, lie detection, and customer service call center systems. Deep learning plays an important role in the research of speech emotion recognition, such a...

Application Information

Patent Type & Authority: Applications (China)

IPC(8): G10L25/63, G10L25/03, G10L15/06

CPC: G10L25/63, G10L25/03, G10L15/063

Inventor: 蔡瑞初, 郭锴槟, 许柏炎

Owner: GUANGDONG UNIV OF TECH
