Image-text cross-modal retrieval method based on self-supervised adversarial training

An image-text cross-modal retrieval technique based on self-supervised adversarial training. The problems it lists are: a degraded image-text cross-modal retrieval effect, generated samples that the discriminator cannot successfully discriminate, and obstacles to efficient cross-modal retrieval.

Active Publication Date: 2021-03-12

GUIZHOU UNIV

8 Cites, 4 Cited by

Problems solved by technology

Adversarial training alternately trains the generator and the discriminator, so that the discriminator cannot successfully discriminate the samples produced by the generator.
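The alternating schedule described above can be sketched on a toy problem. This is a minimal illustration, not the patent's network: the one-dimensional linear generator, the logistic discriminator, the Gaussian "real" samples, the learning rate, and the name `train_adversarial` are all assumptions made for the example.

```python
import math
import random


def sigmoid(t):
    # Numerically safe logistic function.
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def train_adversarial(steps=200, lr=0.05, seed=0):
    """Alternate one discriminator update and one generator update per step."""
    rng = random.Random(seed)
    a, b = 1.0, 0.0  # toy generator: G(z) = a * z + b
    w, c = 0.1, 0.0  # toy discriminator: D(x) = sigmoid(w * x + c)
    history = []
    for _ in range(steps):
        z = rng.gauss(0.0, 1.0)
        real = rng.gauss(4.0, 1.0)  # hypothetical "real" sample
        fake = a * z + b

        # Discriminator step (generator frozen):
        # minimise -log D(real) - log(1 - D(fake)).
        d_real = sigmoid(w * real + c)
        d_fake = sigmoid(w * fake + c)
        grad_w = -(1.0 - d_real) * real + d_fake * fake
        grad_c = -(1.0 - d_real) + d_fake
        w -= lr * grad_w
        c -= lr * grad_c

        # Generator step (discriminator frozen):
        # minimise -log D(fake), i.e. try to fool the discriminator.
        d_fake = sigmoid(w * fake + c)
        grad_fake = -(1.0 - d_fake) * w
        a -= lr * grad_fake * z
        b -= lr * grad_fake

        history.append((d_real, d_fake))
    return (a, b), (w, c), history


generator, discriminator, history = train_adversarial()
print(generator, discriminator, history[-1])
```

Each loop iteration updates the discriminator with the generator held fixed, then updates the generator with the discriminator held fixed, which is the alternation the text refers to.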

[0004] Because of the large gap between image features and text features, traditional image-text cross-modal retrieval methods have difficulty mapping the features of the two modalities into the same space for retrieval, which hinders efficient cross-modal retrieval.

In particular, image features contain redundant information, and features of different modalities cannot be mapped well into the same space, which degrades the effect of image-text cross-modal retrieval.

Embodiment Construction

[0050] Specific embodiments of the present invention will be described below in conjunction with the accompanying drawings, so that those skilled in the art can better understand the present invention. It should be noted that in the following description, when detailed descriptions of known functions and designs may dilute the main content of the present invention, these descriptions will be omitted here.

[0051] Figure 1 is a flow chart of a specific embodiment of the image-text cross-modal retrieval method based on self-supervised adversarial training of the present invention.

[0052] In this example, as shown in Figure 1, the image-text cross-modal retrieval method based on self-supervised adversarial training of the present invention comprises the following steps:

[0053] Step S1: Build the Embedding Generative Network

[0054] In this example, as shown in Figure 2, two independent single-layer feedforward neural networks are constructed as image networ...
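As a rough illustration of "two independent single-layer feedforward neural networks", the sketch below builds one network per modality with separate weights. The source text is truncated here, so the input and output dimensions, the tanh activation, the initialization, and the names `SingleLayerNet`, `image_net`, and `text_net` are assumptions for the example rather than details of the patent.

```python
import math
import random


class SingleLayerNet:
    """One fully connected layer followed by tanh (activation is an assumption)."""

    def __init__(self, in_dim, out_dim, rng):
        scale = 1.0 / math.sqrt(in_dim)
        # One weight row per output unit, drawn uniformly in [-scale, scale].
        self.weights = [
            [rng.uniform(-scale, scale) for _ in range(in_dim)]
            for _ in range(out_dim)
        ]
        self.bias = [0.0] * out_dim

    def forward(self, x):
        # y_j = tanh(sum_i W[j][i] * x[i] + b[j])
        return [
            math.tanh(sum(w * v for w, v in zip(row, x)) + b)
            for row, b in zip(self.weights, self.bias)
        ]


rng = random.Random(0)
# Independent parameters per modality; the dimensions are illustrative only.
image_net = SingleLayerNet(in_dim=8, out_dim=4, rng=rng)
text_net = SingleLayerNet(in_dim=6, out_dim=4, rng=rng)

image_code = image_net.forward([0.5] * 8)  # hypothetical image feature vector
text_code = text_net.forward([0.1] * 6)    # hypothetical text feature vector
print(len(image_code), len(text_code))
```

The two networks accept inputs of different sizes but emit vectors of the same length, so their outputs can be compared or passed to a common downstream layer.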

Abstract

The invention discloses an image-text cross-modal retrieval method based on self-supervised adversarial training. A shared network is constructed within the embedding generation network to map image features and text features into the shared subspace of image codes and text codes, respectively, which reduces the gap between image features and text features. In addition, a self-supervised adversarial network is constructed, and the embedding generation network is trained adversarially with its assistance. This filters redundant information out of the image features, so that features of different modalities are better mapped into the same space and better aligned, improving the effect of image-text cross-modal retrieval.
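Once both modalities live in the same space, retrieval reduces to ranking candidates by similarity to the query code. The sketch below shows that final step only. Cosine similarity, the two-dimensional codes, and the names `cosine_similarity` and `rank_by_similarity` are assumptions for illustration; the abstract does not state which similarity measure the method uses.

```python
import math


def cosine_similarity(u, v):
    # Cosine of the angle between two non-zero vectors of equal length.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def rank_by_similarity(query_code, gallery_codes):
    """Return gallery indices ordered from most to least similar to the query."""
    scores = [cosine_similarity(query_code, g) for g in gallery_codes]
    return sorted(range(len(gallery_codes)), key=lambda i: scores[i], reverse=True)


# Hypothetical codes already mapped into the shared subspace.
text_query = [1.0, 0.0]
image_gallery = [[0.0, 1.0], [1.0, 0.1], [-1.0, 0.0]]

ranking = rank_by_similarity(text_query, image_gallery)
print(ranking)  # [1, 0, 2]
```

The same function serves both directions: pass an image code as the query and a list of text codes as the gallery to retrieve text for an image.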

Description

Technical field

[0001] The invention belongs to the technical field of image-text cross-modal retrieval and, more specifically, relates to an image-text cross-modal retrieval method based on self-supervised adversarial training.

Background

[0002] With the development of cloud computing, the Internet of Things, mobile phones, social media and other information technologies, data on the Internet is growing explosively, and the era of big data has arrived. In this era, how to perform fast image-text cross-modal retrieval has become a focus of attention.

[0003] Self-supervised learning trains a model using labels derived from the data's own attributes to obtain image features, avoiding expensive manual labeling, but the learned features are often related only to the visual information of the image. Adversarial training alternately trains the generator and the discriminator, so that the samples generated by the generator cannot be successfully discriminated...

Application Information

Patent Type & Authority: Applications (China)

IPC(8): G06K9/46, G06K9/62, G06N3/04, G06N3/08

CPC: G06N3/084, G06V10/44, G06N3/044, G06N3/045, G06F18/213

Inventor: 杨阳, 何仕远, 王阳超

Owner: GUIZHOU UNIV
