Adversarial Cross-media Retrieval Method Based on Constrained Text Space

An adversarial cross-media retrieval technology, applied in the field of computer vision, that addresses degraded cross-media retrieval performance, the loss of action and interaction information contained in images, and features that are poorly suited to cross-media retrieval.

Active Publication Date: 2021-07-30
PEKING UNIV SHENZHEN GRADUATE SCHOOL

AI Technical Summary

Problems solved by technology

However, existing feature extractors are pre-trained on datasets that differ from those used for cross-media retrieval, so the extracted features are not well suited to the task.
[0004] The second defect lies in the choice of the isomorphic feature space.
As a result, such a feature also loses the rich action and interaction information contained in the image, which shows that the Word2Vec space is not an effective text feature space for cross-media retrieval.
[0005] The third defect lies in the differing feature distributions of data from different modalities.
Although existing methods map the features of different modalities into an isomorphic feature space, a modality gap remains and the feature distributions still differ noticeably, which degrades cross-media retrieval performance.

Examples

Embodiment Construction

[0063] The present invention is further described below through embodiments, in conjunction with the accompanying drawings, without limiting the scope of the invention in any way.

[0064] The invention provides an adversarial cross-media retrieval method based on a constrained text space. The method learns the constrained text space and uses it to measure the similarity between images and texts. Working in this space, it extracts image and text features suited to cross-media retrieval by simulating human cognition, maps image features from the image space to the text space, and introduces an adversarial training mechanism that continuously reduces the difference in feature distribution between modalities during learning. The feature extraction network, feature mapping network, modality classifier and their implementation in the present invention, as well as the training steps of the network are described i...
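The adversarial mechanism described above can be illustrated with a small sketch. This is not the patent's implementation: the linear classifier, the binary cross-entropy loss, and every function name and toy value below are assumptions chosen only to show the idea that a modality classifier tries to tell mapped image features from text features, while the mapping side is rewarded when the classifier fails.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def classifier_prob_text(feature, weights, bias):
    """Hypothetical linear modality classifier: P(feature came from text)."""
    return sigmoid(sum(w * x for w, x in zip(weights, feature)) + bias)

def modality_classifier_loss(image_feats, text_feats, weights, bias):
    """Binary cross-entropy; images are labelled 0 and texts 1 (an assumption)."""
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for f in image_feats:
        total += -math.log(1.0 - classifier_prob_text(f, weights, bias) + eps)
    for f in text_feats:
        total += -math.log(classifier_prob_text(f, weights, bias) + eps)
    return total / (len(image_feats) + len(text_feats))

def mapping_adversarial_loss(image_feats, text_feats, weights, bias):
    """The mapping side gains when the classifier loses, so it minimizes the negation."""
    return -modality_classifier_loss(image_feats, text_feats, weights, bias)

# Toy 2-D features in the shared space (illustrative values only).
weights, bias = [1.0, 0.0], 0.0
separated_images = [[-2.0, 0.0], [-3.0, 0.0]]
separated_texts = [[2.0, 0.0], [3.0, 0.0]]
aligned_images = [[0.0, 0.5]]
aligned_texts = [[0.0, -0.5]]

# Distributions that are easy to tell apart give the classifier a low loss.
separated_loss = modality_classifier_loss(separated_images, separated_texts, weights, bias)
# Distributions the classifier cannot separate push its loss to log(2).
aligned_loss = modality_classifier_loss(aligned_images, aligned_texts, weights, bias)
```

In this toy setting the classifier's loss rises as the two feature distributions become harder to separate, which is the direction the mapping side is trained toward.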

Abstract

The present invention discloses an adversarial cross-media retrieval method based on a constrained text space. It designs a feature extraction network, a feature mapping network, and a modality classifier; learns a constrained text space; extracts image and text features suitable for cross-media retrieval; and maps image features from the image space to the text space. Through an adversarial training mechanism, the difference in feature distribution between data of different modalities is continuously reduced during learning, and cross-media retrieval is thereby realized. The invention fits human behavior in cross-media retrieval tasks more closely; it obtains image and text features better suited to such tasks, compensating for the limited expressive power of pre-trained features; and it introduces adversarial learning, in which a minimax game between the modality classifier and the feature mapping network further improves retrieval accuracy.
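For orientation, a game between a modality classifier and a feature mapping network is often written in the generic adversarial form below. This is the standard textbook formulation, given here as an assumption; the source does not state the patent's actual loss. The symbols are hypothetical: D is the modality classifier with parameters \theta_D, M is the feature mapping network with parameters \theta_M, t is a text feature, and v is an image feature.

```latex
\min_{\theta_M} \max_{\theta_D} \;
  \mathbb{E}_{t}\!\left[\log D(t;\theta_D)\right]
  + \mathbb{E}_{v}\!\left[\log\!\left(1 - D\!\left(M(v;\theta_M);\theta_D\right)\right)\right]
```

Under this form, the classifier is trained to assign high probability to text features and low probability to mapped image features, while the mapping network is trained to make the two indistinguishable.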

Description

Technical field

[0001] The invention relates to the technical field of computer vision, in particular to an adversarial cross-media retrieval method based on a constrained text space.

Background art

[0002] With the advent of the Web 2.0 era, a large amount of multimedia data (images, text, video, audio, etc.) began to accumulate and spread on the Internet. Unlike traditional single-modal retrieval tasks, cross-media retrieval performs bidirectional retrieval between data of different modalities, such as retrieving images with text and retrieving text with images. However, because multimedia data are inherently heterogeneous, their similarity cannot be measured directly. The core problem of this type of task is therefore how to find a homogeneous mapping space in which the similarity between heterogeneous multimedia data can be measured directly. In the current field of cross-media retrieval, people have done a lot of research on the basis of this problem,...
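Once both modalities live in a common space, bidirectional retrieval reduces to ranking by a similarity score. The sketch below assumes cosine similarity and uses made-up three-dimensional feature vectors; the source does not specify the metric, and the function names are hypothetical.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def rank_candidates(query, candidates):
    """Return candidate indices ordered from most to least similar to the query."""
    scores = [cosine_similarity(query, c) for c in candidates]
    return sorted(range(len(candidates)), key=lambda i: scores[i], reverse=True)

# Toy features assumed to already lie in the shared space (illustrative values only).
text_features = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
image_features = [[0.9, 0.1, 0.0], [0.1, 0.8, 0.1], [0.0, 0.2, 0.9]]

# Image query -> ranked texts, and text query -> ranked images.
image_to_text = [rank_candidates(img, text_features) for img in image_features]
text_to_image = [rank_candidates(txt, image_features) for txt in text_features]
```

The same ranking function serves both retrieval directions, which is the practical benefit of a homogeneous space.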

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F16/2458, G06F16/58, G06F16/28, G06N3/08
CPC: G06F16/2462, G06F16/285, G06F16/5846, G06N3/084, H04N21/44008, G06N3/08, G06N3/044, G06N3/045
Inventor: 王文敏, 余政, 王荣刚, 李革, 王振宇, 赵辉, 高文
Owner: PEKING UNIV SHENZHEN GRADUATE SCHOOL