A cross-modal image-text association anomaly detection method

An anomaly detection and cross-modal technology, applied in the fields of neural learning methods, character and pattern recognition, and biological neural network models. It addresses the difficulty of fully learning the associations between data of different modalities and of establishing links between them, with the effect of improving accuracy and robustness.

Active Publication Date: 2022-06-21

FUDAN UNIV

Cites: 9; Cited by: 0


AI Technical Summary

Problems solved by technology

[0003] However, modeling the relationship between cross-modal data is very difficult. Data in different modalities differ greatly. For example, the representation of images is continuous, while the representation of text is usually discrete, so it is difficult to establish links between data of different modalities at this level.

Some traditional machine-learning-based algorithms for image-text association anomaly detection map heterogeneous data features into a common latent space by introducing canonical correlation analysis. In this common latent space, the correlation coefficient between data of different modalities can be calculated directly. Because the data of different modalities are heterogeneous, however, it is difficult for these 'shallow' models to fully learn the correlation between them.
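The correlation step of such a 'shallow' baseline can be illustrated with a minimal sketch. It assumes that an image feature and a text feature have already been projected into the common latent space (the projection itself is not shown), and it uses the Pearson coefficient as the correlation measure, which is an assumption rather than something the patent text specifies. The function name `pearson` is hypothetical.

```python
import math


def pearson(u, v):
    """Pearson correlation coefficient between two equal-length vectors.

    Assumption: u and v are an image feature and a text feature that were
    already mapped into the same latent space by some earlier step.
    """
    if len(u) != len(v) or len(u) < 2:
        raise ValueError("vectors must have the same length (at least 2)")
    mean_u = sum(u) / len(u)
    mean_v = sum(v) / len(v)
    # Covariance numerator and the two standard-deviation terms.
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    norm_u = math.sqrt(sum((a - mean_u) ** 2 for a in u))
    norm_v = math.sqrt(sum((b - mean_v) ** 2 for b in v))
    if norm_u == 0 or norm_v == 0:
        # A constant vector carries no correlation information.
        return 0.0
    return cov / (norm_u * norm_v)
```

A low coefficient for a pair would then be read as a sign that the image and the text are weakly associated; where to place the cut-off is left open here.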


Examples


Embodiment 1

[0027] The present invention provides a cross-modal image-text association anomaly detection method, whose flowchart is shown in Figure 1. The method is divided into three stages: the image multi-label classification stage, the text multi-label classification stage, and the association anomaly detection stage. The details are as follows:

[0028] 1. Image multi-label classification stage

[0029] The input of the image multi-label classification stage is the image of the image-text pair to be detected. The image multi-label classification model consists of a CNN encoder and an RNN decoder. The CNN encoder extracts important visual features from the image; the extracted features are then fed into the RNN decoder, which generates a sequence of labels to predict the final labels of the image. The image is first preprocessed: in the preprocessing stage, it is resized to a fixed shape of 288×288, and then the pixels of the three channels of the i...
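The resizing part of the preprocessing can be sketched as follows. This is only an illustration under stated assumptions: the image is represented as nested lists of shape H x W x 3, and nearest-neighbor sampling is used because the excerpt does not say which interpolation the method applies. The per-channel step that follows the resize is cut off in the text above and is therefore not sketched. The names `TARGET_SIZE` and `resize_nearest` are hypothetical.

```python
TARGET_SIZE = 288  # fixed side length stated in paragraph [0029]


def resize_nearest(image, size=TARGET_SIZE):
    """Resize an H x W x 3 nested-list image to size x size.

    Assumption: nearest-neighbor sampling; the source does not name the
    interpolation method actually used.
    """
    height = len(image)
    width = len(image[0])
    return [
        [image[(y * height) // size][(x * width) // size] for x in range(size)]
        for y in range(size)
    ]
```

In practice an image library would normally perform this step; the pure-Python version is kept only so that the sketch has no dependencies.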


Abstract

The invention belongs to the technical field of computer multimedia and specifically relates to a cross-modal image-text association anomaly detection method. The invention judges whether the association of an image-text pair is abnormal through the following steps: 1) in the image multi-label classification stage, the image is input into a CNN-RNN-based encoder-decoder to accurately extract the label information of the image; 2) in the text multi-label classification stage, the text is input into a BiLSTM-based network to obtain the label information of the text; 3) in the association anomaly detection stage, the classification results of the image and the text are fused to determine whether the image-text pair is abnormal. The proposed method can accurately detect abnormal image-text pair associations, and the model is strongly robust.
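Step 3 can be pictured with a minimal sketch. The abstract does not state how the two classification results are fused, so the rule below (Jaccard overlap of the two predicted label sets compared against a threshold) is purely an assumption for illustration, as are the names `jaccard` and `is_association_abnormal` and the default threshold of 0.2.

```python
def jaccard(a, b):
    """Jaccard similarity of two label sets (0.0 when both are empty)."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def is_association_abnormal(image_labels, text_labels, threshold=0.2):
    """Flag an image-text pair whose predicted label sets barely overlap.

    Assumption: the fusion rule and the threshold are illustrative only;
    the patent's actual fusion step is not given in this excerpt.
    """
    return jaccard(image_labels, text_labels) < threshold
```

For example, an image labeled `{"dog", "grass"}` paired with text labeled `{"dog", "park"}` shares one of three distinct labels, so it passes under the default threshold, while a pair with no labels in common is flagged.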

Description

Technical field

[0001] The invention relates to a cross-modal image-text association anomaly detection method and belongs to the technical field of computer multimedia.

Background

[0002] With the application of related technologies such as the mobile Internet, the Internet of Things, and social media networks, the amount of data that can be collected and analyzed is growing rapidly, and the carrier of information is developing from traditional text records to richer multimedia records. Unlike written records, which contain a large number of abstract concepts, multimedia content mostly consists of concrete sensory information. How to let artificial intelligence learn to understand multimedia content and associate abstract textual semantic information with intuitive multimedia content has become a subject of increasing attention in recent years. Image-text association anomaly detection is an important research topic in this area. In t...


Application Information


Patent Type & Authority: Patents (China)

IPC(8): G06V10/44, G06V10/764, G06V10/82, G06K9/62, G06N3/04, G06N3/08

CPC: G06N3/08, G06V10/44, G06N3/048, G06N3/044, G06N3/045, G06F18/241

Inventor: 金城, 王尚尚, 吴渊

Owner: FUDAN UNIV
