Multi-modal irony object detection method based on multi-scale cross-modal neural network
A technology relating to object detection and neural networks, applied in the fields of biological neural network models, neural learning methods, neural architectures, etc.; it addresses problems such as the inadequacy and incompleteness of existing methods and achieves high-performance results
Examples
Embodiment 1
[0051] Embodiment 1 of this application provides, as shown in Figure 1, a multi-modal sarcasm object detection method for social tweets based on a multi-scale cross-modal encoding neural network, for real-time detection of multi-modal sarcasm objects in sarcastic tweets:
[0052] S101. Train a multi-modal sarcasm object detection neural network;
[0053] A directly initialized neural network cannot work as-is, so the constructed network must be trained on an existing data set. After training on the training set, the test set is used to evaluate the performance of the trained network weights. This process is repeated, and the relevant hyperparameters are continually adjusted according to the evaluation results on the test set (Exact Match (EM, absolute matching rate) and F1 score for the textual sarcasm object detection task; AP, AP50, and AP75 for visual sarcasm object detection), finally obtaining a result w...
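By way of illustration, the following is a minimal sketch of the train-evaluate-tune loop described above. The model factory, data loaders, metric functions, and hyperparameter search ranges are placeholders assumed for this example and are not specified in this application:

```python
# Minimal sketch of the outer train/evaluate/tune loop (assumptions: a model
# factory, dict-style batches, and caller-supplied metric functions such as
# EM/F1 for text and AP/AP50/AP75 for vision).
import itertools
import torch
from torch import nn, optim

def evaluate(model, loader, metric_fns):
    """Run the model on a held-out split and return the averaged metrics."""
    model.eval()
    scores = {name: 0.0 for name in metric_fns}
    with torch.no_grad():
        for batch in loader:
            preds = model(batch["inputs"])
            for name, fn in metric_fns.items():
                scores[name] += float(fn(preds, batch["targets"]))
    return {name: s / max(len(loader), 1) for name, s in scores.items()}

def tune(model_factory, train_loader, val_loader, metric_fns, search_space):
    """Repeat training over a small hyperparameter grid and keep the weights
    that score best on the evaluation metrics."""
    best_score, best_state = float("-inf"), None
    for lr, epochs in itertools.product(search_space["lr"], search_space["epochs"]):
        model = model_factory()
        optimizer = optim.AdamW(model.parameters(), lr=lr)
        loss_fn = nn.CrossEntropyLoss()
        for _ in range(epochs):
            model.train()
            for batch in train_loader:
                optimizer.zero_grad()
                loss = loss_fn(model(batch["inputs"]), batch["targets"])
                loss.backward()
                optimizer.step()
        scores = evaluate(model, val_loader, metric_fns)
        summary = sum(scores.values())  # simple aggregate used only for selection
        if summary > best_score:
            best_score, best_state = summary, model.state_dict()
    return best_state, best_score
```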
Embodiment 2
[0063] On the basis of Embodiment 1, Embodiment 2 of this application provides a specific implementation of step S101 in Embodiment 1, as shown in Figure 2:
[0064] S201. Collect the multi-modal sarcastic tweet data required for training;
[0065] As mentioned in the summary of the invention above, positive samples, i.e., samples with sarcastic meaning, are selected as the base data from the multi-modal sarcasm detection data set used in the paper "Multi-Modal Sarcasm Detection in Twitter with Hierarchical Fusion Model".
[0066] S202. Perform data labeling and divide the data set;
[0067] On the basis of the existing sarcasm dataset, multi-modal sarcasm object annotation is performed, including visual sarcasm object annotation and textual sarcasm object annotation, yielding a labeled dataset. The dataset is then divided into a training set, a validation set, and a test set at an appropriate ratio.
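A minimal sketch of such a split is shown below; the 8:1:1 ratio, the random seed, and the sample field names are assumptions made for illustration and are not fixed by this application:

```python
# Shuffle the annotated samples and divide them into train/validation/test splits.
import random

def split_dataset(samples, train_ratio=0.8, val_ratio=0.1, seed=42):
    samples = list(samples)
    random.Random(seed).shuffle(samples)
    n_train = int(len(samples) * train_ratio)
    n_val = int(len(samples) * val_ratio)
    return (samples[:n_train],
            samples[n_train:n_train + n_val],
            samples[n_train + n_val:])

# Each sample pairs a tweet image/text with its sarcasm-object annotations.
dataset = [{"image": f"img_{i}.jpg", "text": f"tweet {i}",
            "visual_boxes": [], "text_spans": []} for i in range(100)]
train_set, val_set, test_set = split_dataset(dataset)
```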
[0068] S203. Train the neural network model;
[0069] Use...
Embodiment 3
[0081] On the basis of Embodiments 1 to 2, Embodiment 3 of the present application provides a specific implementation of step S104 in Embodiment 1, as shown in Figures 3 and 4:
[0082] S301. Input the preprocessed image-text pair into the neural network;
[0083] The images and texts of the multi-modal sarcasm object detection data samples that have undergone the preprocessing steps are input into the neural network, which processes the data of the two modalities separately: one-hot encoding is applied to the text data and normalization to the image data.
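The following is a minimal sketch of this preprocessing step; the vocabulary handling (index 0 reserved for out-of-vocabulary tokens) and the per-channel normalization statistics (ImageNet means and standard deviations) are assumptions made for the example:

```python
# One-hot encode the text tokens and normalize the image pixel values.
import numpy as np

def one_hot_encode(tokens, vocab):
    """Map each token to a one-hot row vector over the vocabulary (index 0 = OOV)."""
    mat = np.zeros((len(tokens), len(vocab) + 1), dtype=np.float32)
    for i, tok in enumerate(tokens):
        mat[i, vocab.get(tok, 0)] = 1.0
    return mat

def normalize_image(image_uint8,
                    mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225)):
    """Scale pixels to [0, 1] and standardize each RGB channel."""
    img = image_uint8.astype(np.float32) / 255.0
    return (img - np.array(mean)) / np.array(std)

vocab = {"sarcastic": 1, "tweet": 2}
text_features = one_hot_encode("what a sarcastic tweet".split(), vocab)
image_tensor = normalize_image(np.random.randint(0, 256, (224, 224, 3), dtype=np.uint8))
```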
[0084] S302. Perform feature extraction and representation on the text;
[0085] The text of the multi-modal sarcastic tweet is input into a pre-trained language model (such as BERT, RoBERTa, or BERTweet), which extracts and encodes the text features; the output of the model's last layer is selected as the final representation of the text.
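A minimal sketch of this step using the Hugging Face Transformers library is given below; the "bert-base-uncased" checkpoint and the 128-token maximum length are illustrative choices, not the configuration specified by this application:

```python
# Encode a tweet with a pre-trained language model and take the last-layer
# hidden states as the final text representation.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
encoder = AutoModel.from_pretrained("bert-base-uncased")

def encode_text(tweet: str) -> torch.Tensor:
    inputs = tokenizer(tweet, return_tensors="pt", truncation=True, max_length=128)
    with torch.no_grad():
        outputs = encoder(**inputs)
    return outputs.last_hidden_state  # shape: (1, seq_len, hidden_dim)

text_repr = encode_text("Great, another Monday. Exactly what I needed.")
```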
[0086] S303. Perform multi-scale feature extraction on the image...