Attention mechanism-based multi-modal emotion feature learning and recognition method

A technology for multi-modal emotion feature learning and recognition, applied in the field of affective computing, which addresses the problem that traditional methods do not comprehensively consider the influence of single-modal emotion features, enhancing feature extraction ability and improving recognition accuracy.

Active Publication Date: 2020-10-09
JIANGSU UNIV

AI Technical Summary

Problems solved by technology

[0004] Although traditional multimodal emotion recognition methods can promote final emotion recognition by fusing the emotional features of different modalities, most of them do not comprehensively consider the influence of single-modal emotional features.




Embodiment Construction

[0026] The technical solutions in the embodiments of the present invention will be described clearly and completely below with reference to the accompanying drawings.

[0027] Figure 1 shows the general idea of the invention. First, preprocessing and feature extraction are performed on the samples of the audio modality and the text modality respectively, to obtain the FBank acoustic features of the audio samples and the word vector features of the text samples. Second, the obtained original features are used respectively as the input features of the audio emotion feature encoder CBiLSTM and the text emotion feature encoder BiLSTM, which extract the emotional semantic features of each modality. Then, audio attention, modal jump attention and text attention learning are applied to the encoded features, extracting four complementary emotional features: emotionally significant audio features, semantically aligned audio features, semantically aligned text features and emotionally significant text features. Finally, the four features are fused and classified to obtain the corresponding emotion category.
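To make this pipeline concrete, below is a minimal PyTorch sketch of the architecture just described: a convolution-plus-BiLSTM reading of the CBiLSTM audio encoder over FBank frames, a BiLSTM text encoder over word vectors, dot-product attention pooling for the four complementary features, and a classifier over the fused features. All layer sizes, the dot-product attention form, and the interpretation of "CBiLSTM" are illustrative assumptions; the patent text here does not fix these details.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DotAttention(nn.Module):
    """Scores each time step of `keys` against a query vector and
    returns the attention-weighted sum over time."""
    def forward(self, query, keys):
        # query: (batch, dim), keys: (batch, time, dim)
        scores = torch.bmm(keys, query.unsqueeze(2)).squeeze(2)  # (batch, time)
        weights = F.softmax(scores, dim=1)
        return torch.bmm(weights.unsqueeze(1), keys).squeeze(1)  # (batch, dim)

class MultiModalEmotionNet(nn.Module):
    def __init__(self, fbank_dim=40, word_dim=300, hidden=128, n_classes=4):
        super().__init__()
        # "CBiLSTM" audio encoder, read here as a 1-D conv front end
        # over FBank frames followed by a BiLSTM (an assumption).
        self.audio_conv = nn.Conv1d(fbank_dim, hidden, kernel_size=5, padding=2)
        self.audio_lstm = nn.LSTM(hidden, hidden, batch_first=True, bidirectional=True)
        # BiLSTM text encoder over word vectors.
        self.text_lstm = nn.LSTM(word_dim, hidden, batch_first=True, bidirectional=True)
        self.attn = DotAttention()
        # Fusion of the four complementary features, then classification.
        self.classifier = nn.Linear(8 * hidden, n_classes)

    def forward(self, fbank, words):
        # fbank: (batch, frames, fbank_dim); words: (batch, tokens, word_dim)
        a = F.relu(self.audio_conv(fbank.transpose(1, 2))).transpose(1, 2)
        a, _ = self.audio_lstm(a)          # (batch, frames, 2*hidden)
        t, _ = self.text_lstm(words)       # (batch, tokens, 2*hidden)
        a_q, t_q = a[:, -1], t[:, -1]      # last states as summary queries
        emo_audio = self.attn(a_q, a)      # audio attention: emotionally significant audio
        aligned_audio = self.attn(t_q, a)  # modal jump attention: semantically aligned audio
        aligned_text = self.attn(a_q, t)   # modal jump attention: semantically aligned text
        emo_text = self.attn(t_q, t)       # text attention: emotionally significant text
        fused = torch.cat([emo_audio, aligned_audio, aligned_text, emo_text], dim=1)
        return self.classifier(fused)      # emotion-class logits
```

A forward pass with dummy inputs, e.g. `MultiModalEmotionNet()(torch.randn(2, 300, 40), torch.randn(2, 20, 300))`, yields `(2, n_classes)` logits; a standard choice would be to train the whole network end to end with a cross-entropy loss against the emotion labels.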



Abstract

The invention relates to an attention mechanism-based multi-modal emotion feature learning and recognition method, the method comprising the steps of: carrying out feature extraction on audio and text samples to obtain FBank acoustic features and word vector features; taking the obtained features as the original input features of an audio emotion feature encoder and a text emotion feature encoder respectively, and extracting emotion semantic features of the different modalities through the encoders; performing audio attention, modal jump attention and text attention learning on the obtained emotion semantic features respectively, and extracting four complementary emotion features, namely an emotionally significant audio feature, a semantically aligned audio feature, a semantically aligned text feature and an emotionally significant text feature; and fusing the four features and then classifying to obtain the corresponding emotion category. The invention solves the problem of a low emotion recognition rate caused by intra-modal emotion-irrelevant factors and inter-modal emotion semantic inconsistency in traditional multi-modal emotion recognition, and can effectively improve multi-modal emotion recognition accuracy.
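For the front-end features the abstract names, the following is a minimal sketch assuming torchaudio's Kaldi-style FBank computation (40 mel bins) and pretrained GloVe vectors from torchtext as the word vector features; the bin count, the GloVe choice, and the file name are assumptions, not settings taken from the patent.

```python
import torchaudio
from torchtext.vocab import GloVe

# FBank acoustic features for the audio modality.
waveform, sr = torchaudio.load("utterance.wav")            # hypothetical file
fbank = torchaudio.compliance.kaldi.fbank(
    waveform[:1],            # Kaldi-style fbank expects a mono waveform
    num_mel_bins=40,
    sample_frequency=sr)     # -> (frames, 40)

# Word vector features for the text modality.
glove = GloVe(name="6B", dim=300)                          # downloads on first use
words = glove.get_vecs_by_tokens("i am so happy".split())  # -> (tokens, 300)
```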

Description

Technical Field

[0001] The invention belongs to the field of affective computing, and in particular relates to a method for learning and recognizing multi-modal emotion features based on an attention mechanism.

Background Technique

[0002] In people's daily interactions, emotions often play a very important role, and the perception of emotional information helps people understand each other's mental states and behaviors. Likewise, emotional information is crucial for sustaining long-term interaction between humans and machines, and automatic speech emotion recognition is an effective way to bridge the communication gap between humans and computers. With the rapid development and popularization of the Internet, people have put forward higher requirements for human-computer interaction systems and expect the machines they interact with to observe, understand and generate emotional features in a human-like way. Therefore, multimodal emotion recognition…


Application Information

IPC(8): G06F40/30; G06F16/35; G06K9/62; G06N3/04; G06N3/08; G10L25/03; G10L25/63
CPC: G06F40/30; G06F16/35; G06N3/049; G06N3/08; G10L25/63; G10L25/03; G06N3/045; G06F18/241; G06F18/253; Y02D10/00
Inventors: 薛艳飞 (Xue Yanfei), 张建明 (Zhang Jianming), 毛启容 (Mao Qirong)
Owner: JIANGSU UNIV