Multi-modal emotion recognition method based on attention feature fusion

A feature fusion and emotion recognition technology, applied in the field of affective computing, that addresses the problems of fusion schemes that cannot reflect each modality's degree of influence and that ignore the differences between features of different modalities, achieving maximal information utilization, simple and effective execution, and an improved recognition effect.

Publication Date: 2019-04-12 (Inactive)
SHANDONG UNIV

AI Technical Summary

Problems solved by technology

The implementation of feature-layer fusion is simple and effective, making full use of the information in the different modal features, but the disadvantage is that it ignores the differences between features of different modalities and cannot reflect the degree to which each modality influences the final result.



Examples


Embodiment 1

[0064] A method for multimodal emotion recognition based on attention feature fusion, comprising the following steps:

[0065] (1) Preprocess the data of the multiple modalities so that it meets the input requirements of the model corresponding to each modality;

[0066] (2) Perform feature extraction on the data of the multiple modalities preprocessed in step (1);

[0067] (3) Perform feature fusion on the modality features extracted in step (2). Traditional feature-layer fusion concatenates the feature vectors of the three modalities into a single joint feature vector, which is then sent to a classifier for classification. However, because the features of different modalities influence the final recognition result to different degrees, and in order to effectively obtain the influence weight of each modality's features on the final result from the distribution of the data set, an attention mechanism is used to assign a weight to the data...
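As an illustration of step (3), the following is a minimal, hypothetical PyTorch sketch of attention-based feature-layer fusion: each modality's feature vector receives a learned scalar weight, normalized by softmax across the three modalities, and the weights are trained jointly with the rest of the network. The module name, the shared linear scorer, and the common feature dimension `dim` are assumptions for illustration, not details disclosed in the patent.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AttentionFusion(nn.Module):
    """Sketch of attention-weighted feature-layer fusion (illustrative,
    not the patented implementation). Each modality feature receives a
    learned scalar weight, trained jointly with the rest of the network."""

    def __init__(self, dim: int):
        super().__init__()
        # One shared scorer maps each modality feature to an unnormalized score.
        self.scorer = nn.Linear(dim, 1)

    def forward(self, text_feat, audio_feat, video_feat):
        # Stack the three modality features: (batch, 3, dim)
        feats = torch.stack([text_feat, audio_feat, video_feat], dim=1)
        # Score each modality, then normalize across modalities: (batch, 3, 1)
        weights = F.softmax(self.scorer(feats), dim=1)
        # Weight each modality's features, then flatten into the joint vector.
        weighted = feats * weights              # (batch, 3, dim)
        return weighted.flatten(start_dim=1)    # (batch, 3 * dim)

# Usage: fused = AttentionFusion(dim=128)(t, a, v); feed `fused` to a classifier.
```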

Embodiment 2

[0070] The multimodal emotion recognition method based on attention feature fusion described in Embodiment 1, as shown in Figure 1, with the difference that in step (1) the data of the multiple modalities includes text data, voice data, and video data.

[0071] For text data, the preprocessing process includes converting the text data into mathematical data by training word vectors, that is, converting each word in a piece of text into a word-vector representation so that it meets the input requirements of the bidirectional LSTM model. The bidirectional LSTM model comprises, in order, a word-vector layer, a bidirectional LSTM layer, a first Dropout layer, and a first fully connected layer. The word-vector layer converts each word in the text into a word-vector representation, the bidirectional LSTM layer extracts text features, the first Dropout layer prevents the bidirectional LSTM model from overfitting, and the first fully connected ...
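A minimal PyTorch sketch of the text branch as just described (word-vector layer, bidirectional LSTM, Dropout, fully connected layer) is given below. The vocabulary size, embedding dimension, hidden size, output dimension, dropout rate, and the choice of the last time step as the sentence representation are all illustrative assumptions, not details from the patent.

```python
import torch
import torch.nn as nn

class TextBranch(nn.Module):
    """Sketch of the text branch in Embodiment 2: word-vector layer,
    bidirectional LSTM, first Dropout layer, first fully connected layer.
    All sizes here are illustrative assumptions."""

    def __init__(self, vocab_size=10000, embed_dim=300, hidden=128,
                 feat_dim=128, p_drop=0.5):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim)   # word-vector layer
        self.bilstm = nn.LSTM(embed_dim, hidden, batch_first=True,
                              bidirectional=True)              # text feature extractor
        self.dropout = nn.Dropout(p_drop)                      # first Dropout layer
        self.fc = nn.Linear(2 * hidden, feat_dim)              # first fully connected layer

    def forward(self, token_ids):                 # token_ids: (batch, seq_len)
        x = self.embedding(token_ids)             # (batch, seq_len, embed_dim)
        out, _ = self.bilstm(x)                   # (batch, seq_len, 2 * hidden)
        # Assumption: use the final time step's output as the sentence feature.
        feat = self.dropout(out[:, -1, :])
        return self.fc(feat)                      # (batch, feat_dim)
```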



Abstract

The invention relates to a multi-modal emotion recognition method based on attention feature fusion. The method performs final emotion recognition mainly using data of three modalities: text, voice, and video. It comprises the following steps. First, feature extraction is performed on the data of the three modalities: a bidirectional LSTM extracts text features, a convolutional neural network extracts features from the voice modality, and a three-dimensional convolutional neural network extracts video features. Feature fusion is then performed on the features of the three modalities using an attention-based feature-layer fusion scheme. This changes the traditional feature-layer fusion mode: complementary information between the different modalities is fully utilized, the features of the different modalities are given certain weights, and the weights are learned jointly with the network through training, so that the method better conforms to the overall distribution of the data and the final recognition effect is clearly improved.
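To make the pipeline in the abstract concrete, below are hypothetical stubs for the voice branch (a CNN, here assumed to operate on a spectrogram) and the video branch (a 3D CNN over short frame clips). The patent excerpt does not disclose layer configurations, so every kernel size, channel count, and input shape here is an assumption.

```python
import torch
import torch.nn as nn

class AudioBranch(nn.Module):
    """Hypothetical CNN voice-feature extractor; the input is assumed to be
    a 1 x freq x time spectrogram (the patent gives no layer details)."""

    def __init__(self, feat_dim=128):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),              # global pooling to (batch, 32, 1, 1)
        )
        self.fc = nn.Linear(32, feat_dim)

    def forward(self, spec):                      # spec: (batch, 1, freq, time)
        return self.fc(self.conv(spec).flatten(1))

class VideoBranch(nn.Module):
    """Hypothetical 3D-CNN video-feature extractor over short frame clips."""

    def __init__(self, feat_dim=128):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv3d(3, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool3d(2),
            nn.Conv3d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool3d(1),              # global pooling to (batch, 32, 1, 1, 1)
        )
        self.fc = nn.Linear(32, feat_dim)

    def forward(self, clip):                      # clip: (batch, 3, frames, H, W)
        return self.fc(self.conv(clip).flatten(1))
```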

Description

Technical field

[0001] The invention relates to a multimodal emotion recognition method based on attention feature fusion, and belongs to the technical field of affective computing.

Background technique

[0002] In the 1990s, the concept of affective computing appeared in various fields of computing. Affective computing is concerned with human emotions; computing over the factors that are triggered by, or can influence, human emotions opened the door to research on emotion recognition. The purpose of this research is to promote a highly harmonious human-computer interaction experience in the information society, so that computers possess a more comprehensive artificial intelligence. When people express emotions, they rarely use only one mode of expression, and to a certain extent different modes of expression complement one another when conveying emotional information. Combining information from multiple modalities for emotion recognition...


Application Information

IPC(8): G06K 9/00, G06K 9/62, G06F 17/27, G06N 3/04
CPC: G06V 20/41, G06V 20/46, G06N 3/045, G06F 18/241
Inventor: 李玉军, 宋绪靖, 马浩洋
Owner: SHANDONG UNIV