Multi-modal emotion classification method based on text, voice and video fusion

A multi-modal emotion classification technology based on text, voice and video fusion, applied in character and pattern recognition, database clustering/classification, instruments, etc. It addresses the problems of unstable accuracy and high labeling cost, offers good flexibility, improves classification accuracy, and is easy to implement.

Pending Publication Date: 2019-09-27
NANJING UNIV OF SCI & TECH

AI Technical Summary

Problems solved by technology

[0003] Before the rise of machine learning methods, sentiment analysis was mainly done manually, with high cost and unstable accuracy.
Traditional machine learning and traditional multimodal methods mainly rely on the idea of feature engineering, using manually extracted features.



Examples


Embodiment

[0055] As shown in Figure 4, this embodiment takes Carnegie Mellon University's MOSI dataset as an example: the raw data of the three modalities is first obtained and then preprocessed.
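The summary does not disclose the preprocessing details. Below is a minimal per-modality sketch, assuming librosa for audio MFCCs and OpenCV for frame sampling; these tool choices and all parameter values are illustrative assumptions, not named in the patent.

```python
# Hypothetical per-modality preprocessing for a MOSI-style segment.
# Library choices (librosa, OpenCV) are illustrative; the patent does not
# name specific tools.
import cv2          # pip install opencv-python
import librosa      # pip install librosa
import numpy as np

def preprocess_text(subtitle: str) -> list[str]:
    """Lowercase and whitespace-tokenize one subtitle line (text modality)."""
    return subtitle.lower().split()

def preprocess_audio(wav_path: str, sr: int = 16000) -> np.ndarray:
    """Load the segment's audio and extract MFCCs (frames x coefficients)."""
    waveform, _ = librosa.load(wav_path, sr=sr)
    mfcc = librosa.feature.mfcc(y=waveform, sr=sr, n_mfcc=13)
    return mfcc.T

def preprocess_video(mp4_path: str, num_frames: int = 16) -> np.ndarray:
    """Uniformly sample num_frames RGB frames from the segment's clip."""
    cap = cv2.VideoCapture(mp4_path)
    total = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
    indices = np.linspace(0, max(total - 1, 0), num_frames).astype(int)
    frames = []
    for idx in indices:
        cap.set(cv2.CAP_PROP_POS_FRAMES, int(idx))
        ok, frame = cap.read()
        if ok:
            frames.append(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
    cap.release()
    return np.stack(frames)
```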

[0056] Each segment is annotated with an emotion label, and the corresponding video subtitle data (text modality), synchronized audio data (audio modality), and video data (video modality) are aligned. For example (a data-structure sketch follows these examples):

[0057] An ordinary sample: "I love this movie." From the semantics alone, the emotion category can be directly labeled positive;

[0058] A semantically ambiguous sample: "The movie is sick." Combined with a loud voice in the audio and an obvious frown in the video, the emotion category can be labeled negative.
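The patent only requires that the three modalities of one segment be aligned under a single emotion label; a minimal sketch of such a record, with hypothetical field names and placeholder values:

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class AlignedSegment:
    """One labeled segment with its three aligned modalities.

    Field names are illustrative; the patent only specifies that subtitle
    text, synchronized audio, and video for the same segment are grouped
    with one emotion label.
    """
    text_tokens: list[str]      # subtitle tokens (text modality)
    audio_features: np.ndarray  # e.g. MFCC frames (audio modality)
    video_frames: np.ndarray    # sampled RGB frames (video modality)
    label: int                  # e.g. 1 = positive, 0 = negative

# The ambiguous example above: the text alone ("The movie is sick.") is
# unclear, but the loud voice and the frown push the label to negative.
sample = AlignedSegment(
    text_tokens="the movie is sick .".split(),
    audio_features=np.zeros((120, 13)),        # placeholder features
    video_frames=np.zeros((16, 224, 224, 3)),  # placeholder frames
    label=0,
)
```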

[0059] In the training phase, the original samples are fed into the tensor-fusion-based multimodal emotion classification model for training, yielding the emotion classification model, which is then used to judge the emotion category of the test samples during testing; ...
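The summary names tensor fusion but does not disclose the network details. Below is a minimal PyTorch sketch in the spirit of outer-product tensor fusion (as in Zadeh et al.'s Tensor Fusion Network), where per-modality embedding vectors are combined via their three-way outer product before classification; all layer sizes and the two-class output are assumptions.

```python
import torch
import torch.nn as nn

class TensorFusionClassifier(nn.Module):
    """Sketch of an outer-product tensor fusion emotion classifier.

    Assumes each modality has already been encoded into a fixed-size
    vector; dimensions are illustrative.
    """
    def __init__(self, d_text=32, d_audio=16, d_video=16, n_classes=2):
        super().__init__()
        # Fused tensor size: (d_text+1)*(d_audio+1)*(d_video+1); appending
        # a constant 1 preserves unimodal and bimodal interaction terms.
        fused_dim = (d_text + 1) * (d_audio + 1) * (d_video + 1)
        self.classifier = nn.Sequential(
            nn.Linear(fused_dim, 128), nn.ReLU(), nn.Linear(128, n_classes)
        )

    def forward(self, z_t, z_a, z_v):
        ones = torch.ones(z_t.size(0), 1, device=z_t.device)
        z_t = torch.cat([z_t, ones], dim=1)   # (B, d_text+1)
        z_a = torch.cat([z_a, ones], dim=1)   # (B, d_audio+1)
        z_v = torch.cat([z_v, ones], dim=1)   # (B, d_video+1)
        # Batched 3-way outer product -> (B, d_t+1, d_a+1, d_v+1)
        fused = torch.einsum('bi,bj,bk->bijk', z_t, z_a, z_v)
        return self.classifier(fused.flatten(start_dim=1))

# Smoke test with random modality embeddings.
model = TensorFusionClassifier()
logits = model(torch.randn(4, 32), torch.randn(4, 16), torch.randn(4, 16))
print(logits.shape)  # torch.Size([4, 2])
```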



Abstract

The invention discloses a multi-modal emotion classification method based on text, voice and video fusion. The method comprises the steps of: obtaining multi-modal data, preprocessing it, and dividing it into a training set and a test set; constructing an end-to-end multi-modal emotion classification model based on tensor fusion and training the model on the training set; and applying the preprocessing operation of step 1 to the test set and carrying out emotion classification using the tensor fusion emotion classification model obtained in step 2. Through the multi-modal emotion classification model, the invention can better capture ambiguous, deep emotional information.
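A compact sketch of the three steps in the abstract (split, train, classify), reusing the hypothetical TensorFusionClassifier defined above; the loss, optimizer, epoch count, and random placeholder data are assumptions, not from the patent.

```python
import torch
import torch.nn as nn
# Assumes the TensorFusionClassifier sketch defined earlier is in scope.

# Step 1: preprocessed tensors split into training and test sets (random
# placeholders here; real inputs would come from the aligned MOSI segments).
def random_split_data(n=100):
    data = (torch.randn(n, 32), torch.randn(n, 16), torch.randn(n, 16),
            torch.randint(0, 2, (n,)))
    cut = int(0.8 * n)
    return tuple(t[:cut] for t in data), tuple(t[cut:] for t in data)

train, test = random_split_data()
model = TensorFusionClassifier()          # Step 2: tensor fusion model
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()
for epoch in range(5):                    # train on the training set
    opt.zero_grad()
    loss = loss_fn(model(*train[:3]), train[3])
    loss.backward()
    opt.step()

# Step 3: classify the (identically preprocessed) test set.
with torch.no_grad():
    preds = model(*test[:3]).argmax(dim=1)
    acc = (preds == test[3]).float().mean().item()
print(f"test accuracy: {acc:.2f}")
```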

Description

Technical field

[0001] The invention belongs to natural language processing technology, and specifically relates to a multimodal emotion classification method based on the fusion of text, voice and video.

Background technique

[0002] At present, social media sites produce a large amount of video data rich in emotional information every day, giving rise to a large number of multimodal opinion mining and sentiment analysis technologies for text, voice, and video. This technology is not only an academic frontier and research hotspot in the fields of natural language processing and sentiment analysis, but also an important problem that urgently needs to be solved in application domains; it has immeasurable application value and social significance, yet also poses great challenges.

[0003] Before the rise of machine learning methods, sentiment analysis was mainly done manually, which was costly and unstable. Traditional machine learning and traditional multimodal methods ...


Application Information

IPC(8): G06F16/906, G06K9/62
CPC: G06F16/906, G06F18/241
Inventor: 夏睿 (Xia Rui), 李晟华 (Li Shenghua)
Owner: NANJING UNIV OF SCI & TECH