Zero-sample cross-modal retrieval method based on multi-modal feature synthesis

A multi-modal, cross-modal retrieval technology, applied in the field of cross-modal retrieval, which addresses the problems that existing methods ignore the mutual correlation between raw data representations and category embeddings and are not optimized for the cross-modal retrieval problem.

Active Publication Date: 2020-07-17
UNIV OF ELECTRONICS SCI & TECH OF CHINA


Problems solved by technology

Existing zero-shot learning methods of this kind are usually designed to solve traditional classification problems and are not optimized for cross-modal retrieval; moreover, they often focus only on the mapping from raw data representations to category embeddings, ignoring the interrelationship between the two.



Examples


Embodiment

[0058] Figure 1 is a flowchart of the zero-sample cross-modal retrieval method based on multimodal feature synthesis of the present invention.

[0059] In this embodiment, as shown in Figure 1, the zero-sample cross-modal retrieval method based on multimodal feature synthesis of the present invention comprises the following steps:

[0060] S1. Extract multimodal data features

[0061] Multimodal data include images, text, and so on. These raw data are expressed in forms that humans can perceive, but computers cannot process them directly; their features must be extracted and represented as numbers that computers can process.

[0062] Download N sets of multimodal data containing images, texts, and category labels shared by the images and texts. These data belong to C categories, and the images and texts under each category share a category label. Then use the convolutional neural network VGG Net to extract image features v_i, use the Doc2vec network to extract text features t_i, use the ...
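As a concrete illustration of step S1, the sketch below extracts VGG image features and Doc2vec text features. The library choices (torchvision, gensim), the VGG-19 variant, the 4096-dimensional fc7 output, and the 300-dimensional text vectors are assumptions for illustration; the patent itself only names VGG Net and Doc2vec.

```python
# Minimal sketch of step S1: VGG image features v_i and Doc2vec text
# features t_i. Library and dimension choices are illustrative assumptions.
import torch
from torchvision import models, transforms
from gensim.models.doc2vec import Doc2Vec, TaggedDocument
from PIL import Image

# Image branch: pretrained VGG-19 truncated after fc7 (4096-d output).
vgg = models.vgg19(weights=models.VGG19_Weights.IMAGENET1K_V1)
vgg.classifier = torch.nn.Sequential(*list(vgg.classifier.children())[:-1])
vgg.eval()

preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

def image_feature(path: str) -> torch.Tensor:
    """Return the 4096-d VGG feature v_i for one image."""
    x = preprocess(Image.open(path).convert("RGB")).unsqueeze(0)
    with torch.no_grad():
        return vgg(x).squeeze(0)

# Text branch: train Doc2vec on the corpus, then infer a vector t_i per text.
corpus = [TaggedDocument(words=doc.split(), tags=[i])
          for i, doc in enumerate(["a dog running on grass",
                                   "an airplane in the sky"])]
d2v = Doc2Vec(corpus, vector_size=300, min_count=1, epochs=40)

def text_feature(text: str):
    """Return the 300-d Doc2vec feature t_i for one text."""
    return d2v.infer_vector(text.split())
```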



Abstract

The invention discloses a zero-sample cross-modal retrieval method based on multi-modal feature synthesis. The method uses two generative adversarial networks to synthesize feature representations of the different modalities from class embeddings shared by the data of both modalities, then maps the original modal data and the synthesized modal data into a common subspace and aligns their distributions. In this way, the relationship between data of the same category in different modalities is established, and knowledge is transferred to unseen categories. A cycle-consistency constraint further reduces the difference between the original semantic features and the reconstructed semantic features and firmly establishes, within each modality, the association between the original representations and the semantic features, so that the common semantic space is more robust and the accuracy of zero-sample cross-modal retrieval is improved.
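For clarity, the following is a minimal PyTorch sketch of two ideas from the abstract: a per-modality generator conditioned on a shared class embedding, and a cycle-consistency term that reconstructs that embedding from the synthesized features. Layer sizes, the Regressor module, and the L1 form of the cycle loss are illustrative assumptions, not the patent's exact architecture; the adversarial alignment in the common subspace is omitted here.

```python
# Sketch of conditional feature synthesis plus cycle consistency.
# All module names and dimensions are illustrative assumptions.
import torch
import torch.nn as nn

class Generator(nn.Module):
    """Synthesize a modality feature from noise + a shared class embedding."""
    def __init__(self, noise_dim=100, emb_dim=300, feat_dim=4096):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(noise_dim + emb_dim, 1024), nn.LeakyReLU(0.2),
            nn.Linear(1024, feat_dim), nn.ReLU())

    def forward(self, z, c):                 # c: shared class embedding
        return self.net(torch.cat([z, c], dim=1))

class Regressor(nn.Module):
    """Map a (real or synthesized) feature back to the class embedding,
    used for the cycle-consistency term."""
    def __init__(self, feat_dim=4096, emb_dim=300):
        super().__init__()
        self.net = nn.Linear(feat_dim, emb_dim)

    def forward(self, f):
        return self.net(f)

# One generator per modality (image / text), conditioned on the same c.
gen_img, gen_txt = Generator(feat_dim=4096), Generator(feat_dim=300)
reg_img, reg_txt = Regressor(feat_dim=4096), Regressor(feat_dim=300)

z = torch.randn(8, 100)                      # noise
c = torch.randn(8, 300)                      # shared class embeddings
fake_img, fake_txt = gen_img(z, c), gen_txt(z, c)

# Cycle consistency: reconstructed embeddings should match the original c.
cycle_loss = nn.functional.l1_loss(reg_img(fake_img), c) \
           + nn.functional.l1_loss(reg_txt(fake_txt), c)
```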

Description

Technical field

[0001] The invention belongs to the technical field of cross-modal retrieval, and more specifically relates to a zero-sample cross-modal retrieval method based on multimodal feature synthesis.

Background technique

[0002] The goal of cross-modal retrieval is to use a query from one modality (such as text) to search for semantically similar instances in another modality (such as images). Because the distributions and feature representations of data from different modalities are inconsistent, it is difficult to measure the similarity between data of different modalities directly. Existing methods usually establish a common subspace, map the data of the different modalities into that subspace to obtain unified representations, and then compute the similarity between data of different modalities with some measurement method; the instances most similar to the query are returned as the retrieval results, thereby achieving cross-modal retrieval.

[0003] Howe...
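The common-subspace retrieval scheme described in paragraph [0002] can be sketched as follows, assuming both modalities have already been mapped into the shared space. Cosine similarity here stands in for the unspecified "measurement method".

```python
# Minimal sketch of the retrieval step from [0002]: rank one modality's
# items in the common subspace by similarity to a query from the other.
# Cosine similarity is an assumption; the patent leaves the measure open.
import numpy as np

def retrieve(query_vec, gallery, top_k=5):
    """Return indices of the top_k gallery items most similar to the query.

    query_vec: (d,) unified representation of the query (e.g., a text).
    gallery:   (n, d) unified representations of the other modality.
    """
    q = query_vec / np.linalg.norm(query_vec)
    g = gallery / np.linalg.norm(gallery, axis=1, keepdims=True)
    sims = g @ q                       # cosine similarity to each candidate
    return np.argsort(-sims)[:top_k]   # highest similarity first

# Usage: a text query against 1000 image features in a 128-d common subspace.
rng = np.random.default_rng(0)
images = rng.normal(size=(1000, 128))
text_query = rng.normal(size=128)
print(retrieve(text_query, images))
```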


Application Information

Patent Type & Authority: Application (China)
IPC (8): G06F16/583, G06F16/55, G06F16/38, G06F16/35, G06K9/62, G06N3/04, G06N3/08
CPC: G06F16/583, G06F16/55, G06F16/38, G06F16/355, G06N3/08, G06N3/045, G06F18/22
Inventor: 徐行, 张明, 林凯毅, 杨阳, 邵杰, 申恒涛
Owner: UNIV OF ELECTRONICS SCI & TECH OF CHINA