Adversarial Cross-media Retrieval Method Based on Constrained Text Space

An adversarial cross-media retrieval technology, applied in the field of computer vision, that addresses degraded cross-media retrieval performance, the loss of action and interaction information contained in images, and extracted features that are poorly suited to cross-media retrieval.

Active Publication Date: 2021-07-30

PEKING UNIV SHENZHEN GRADUATE SCHOOL

8 Cites, 0 Cited by


AI Technical Summary
This helps you quickly interpret patents by identifying the three key elements:
Problems solved by technology
Method used
Benefits of technology

Problems solved by technology

However, the feature extractors are also pre-trained on datasets that differ from those used for cross-media retrieval, so the extracted features are not well suited to cross-media retrieval.

[0004] The second defect lies in the choice of the isomorphic feature space.

As a result, this feature also loses the rich action and interaction information contained in the image, which shows that the Word2Vec space is not an effective text feature space for cross-media retrieval.

[0005] The third defect lies in the difference in feature distribution between data of different modalities.

Although existing methods map the features of different modalities to an isomorphic feature space, a modality gap remains between them, and the feature distributions still differ noticeably, which degrades cross-media retrieval performance.

Method used


Embodiment Construction

[0063] The invention is further described below through embodiments, in conjunction with the accompanying drawings, without limiting the scope of the invention in any way.

[0064] The invention provides an adversarial cross-media retrieval method based on a restricted text space. The method learns the restricted text space and uses it to measure the similarity between images and texts. Working in this space, it extracts image and text features suited to cross-media retrieval by simulating human cognition, maps image features from the image space to the text space, and introduces an adversarial training mechanism that aims to continuously reduce the difference in feature distribution between data of different modalities during learning. The feature extraction network, feature mapping network, modality classifier and their implementation in the present invention, as well as the training steps of the network are described i...
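
The mapping and similarity-measurement idea in paragraph [0064] can be illustrated with a minimal sketch. It is not the patent's implementation: the single linear layer standing in for the feature mapping network, the use of cosine similarity as the similarity measure, and all function names and values are assumptions made for illustration only.

```python
import math

def map_to_text_space(image_feature, weights, bias):
    # Hypothetical stand-in for the feature mapping network:
    # one linear layer projecting an image feature into the text space.
    return [sum(w * x for w, x in zip(row, image_feature)) + b
            for row, b in zip(weights, bias)]

def cosine_similarity(u, v):
    # Assumed similarity measure; the source does not name one.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def rank_texts(image_feature, text_features, weights, bias):
    # Map the image into the text space, then order the candidate
    # texts from most to least similar.
    mapped = map_to_text_space(image_feature, weights, bias)
    scores = [cosine_similarity(mapped, t) for t in text_features]
    return sorted(range(len(text_features)),
                  key=lambda i: scores[i], reverse=True)

# Toy data: a 3-dimensional image feature and a 2-dimensional text space.
image_feature = [1.0, 0.0, 2.0]
weights = [[1.0, 0.0, 0.0],
           [0.0, 0.0, 1.0]]
bias = [0.0, 0.0]
text_features = [[2.0, -1.0], [1.0, 2.0], [2.0, 1.0]]

ranking = rank_texts(image_feature, text_features, weights, bias)
```

Once both modalities live in the same space, text-to-image retrieval can reuse the same scoring function with the roles of query and candidates swapped.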


Abstract

The present invention discloses an adversarial cross-media retrieval method based on a restricted text space. It designs a feature extraction network, a feature mapping network, and a modality classifier; learns a restricted text space; extracts image and text features suitable for cross-media retrieval; and maps image features from the image space to the text space. Through an adversarial training mechanism, the difference in feature distribution between data of different modalities is continuously reduced during learning, and cross-media retrieval is thereby realized. The invention better fits human behavior in cross-media retrieval tasks, obtains image and text features that are better suited to those tasks, and compensates for the limited expressive power of pre-trained features. It also introduces adversarial learning: the max-min game between the modality classifier and the feature mapping network further improves retrieval accuracy.
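
The max-min game between the modality classifier and the feature mapping network can be sketched as a pair of opposing losses. This is only an illustration under stated assumptions: the logistic classifier, the binary cross-entropy objective, the label convention (1 for image, 0 for text), and every name and number below are hypothetical and are not taken from the patent.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def prob_image(feature, weights, bias):
    # Hypothetical logistic modality classifier: probability that a
    # text-space feature originated from an image.
    return sigmoid(sum(w * x for w, x in zip(weights, feature)) + bias)

def modality_loss(image_feats, text_feats, weights, bias):
    # Binary cross-entropy the classifier tries to minimize
    # (assumed labels: 1 = image, 0 = text).
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for f in image_feats:
        total -= math.log(prob_image(f, weights, bias) + eps)
    for f in text_feats:
        total -= math.log(1.0 - prob_image(f, weights, bias) + eps)
    return total / (len(image_feats) + len(text_feats))

def mapping_adversarial_loss(image_feats, text_feats, weights, bias):
    # The mapping network plays the opposite side of the game:
    # its loss falls as the classifier's loss rises.
    return -modality_loss(image_feats, text_feats, weights, bias)

# Features that are easy to tell apart versus features that coincide.
separable = modality_loss([[2.0]], [[-2.0]], [3.0], 0.0)
overlapping = modality_loss([[0.1]], [[0.1]], [3.0], 0.0)
# A classifier with zero weights outputs 0.5 for everything (chance level).
chance = modality_loss([[2.0]], [[-2.0]], [0.0], 0.0)
```

In this toy setting the classifier's loss is small when the two modalities are easy to separate and rises above the chance level when their features coincide, which is the direction in which the mapping network pushes during adversarial training.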

Description

Technical field

[0001] The invention relates to the technical field of computer vision, and in particular to an adversarial cross-media retrieval method based on a restricted text space.

Background

[0002] With the advent of the Web 2.0 era, large amounts of multimedia data (images, text, video, audio, etc.) began to accumulate and spread on the Internet. Unlike traditional single-modal retrieval tasks, cross-media retrieval performs bidirectional retrieval between data of different modalities, such as retrieving images with text and retrieving text with images. However, because multimedia data are inherently heterogeneous, their similarity cannot be measured directly. The core problem of this type of task is therefore how to find a homogeneous mapping space in which the similarity between heterogeneous multimedia data can be measured directly. In the current field of cross-media retrieval, people have done a lot of research on the basis of this problem,...


Application Information


Patent Type & Authority: Patents (China)

IPC(8): G06F16/2458, G06F16/58, G06F16/28, G06N3/08

CPC: G06F16/2462, G06F16/285, G06F16/5846, G06N3/084, H04N21/44008, G06N3/08, G06N3/044, G06N3/045

Inventor: 王文敏, 余政, 王荣刚, 李革, 王振宇, 赵辉, 高文

Owner: PEKING UNIV SHENZHEN GRADUATE SCHOOL
