
Zero-sample cross-modal retrieval method based on multi-modal feature synthesis

A multi-modal, cross-modal technology applied in the field of cross-modal retrieval, which addresses the problems that existing methods ignore mutual correlations and are not optimized for the cross-modal retrieval task.

Active Publication Date: 2020-07-17
UNIV OF ELECTRONICS SCI & TECH OF CHINA

AI Technical Summary

Problems solved by technology

Existing zero-shot learning methods of this kind are usually designed for traditional classification problems; they are not optimized for cross-modal retrieval and often focus only on the mapping from raw data representations to category embeddings, ignoring the interrelationships between them.



Examples


Embodiment

[0058] Figure 1 is a flowchart of the zero-sample cross-modal retrieval method based on multimodal feature synthesis according to the present invention.

[0059] In this embodiment, as shown in Figure 1, the zero-sample cross-modal retrieval method based on multimodal feature synthesis of the present invention comprises the following steps:

[0060] S1. Extract multimodal data features

[0061] Multimodal data includes images, text, and other modalities. Such raw data is expressed in forms that humans can understand but that computers cannot process directly, so features must be extracted from it and represented as numbers that computers can operate on.

[0062] Download N sets of multimodal data containing images, texts, and shared image-text category labels. These data belong to C categories, and the images and texts under each category share a category label. Then use the convolutional neural network VGG Net to extract image features v_i, and use the Doc2vec network to extract text features t_i, using the ...
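To make step S1 concrete, here is a minimal sketch of the feature extraction, assuming torchvision for the VGG image features and gensim for the Doc2vec text features. The patent names only "VGG Net" and "Doc2vec"; the specific choice of VGG-19, the 4096-d fully-connected layer, the 300-d text vectors, and all variable names are illustrative assumptions, not details from the patent.

```python
# Sketch of step S1: extract image features v_i with VGG and text features t_i
# with Doc2vec. Model variants and dimensions are assumptions for illustration.
import torch
import torchvision.models as models
import torchvision.transforms as T
from PIL import Image
from gensim.models.doc2vec import Doc2Vec, TaggedDocument

# Image branch: take the 4096-d activation of VGG-19's first fully-connected layer.
vgg = models.vgg19(weights=models.VGG19_Weights.DEFAULT).eval()
feature_net = torch.nn.Sequential(
    vgg.features, vgg.avgpool, torch.nn.Flatten(), *list(vgg.classifier[:2])
)

preprocess = T.Compose([
    T.Resize(256), T.CenterCrop(224), T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

def image_feature(path: str) -> torch.Tensor:
    """Return the feature vector v_i for one image file."""
    x = preprocess(Image.open(path).convert("RGB")).unsqueeze(0)
    with torch.no_grad():
        return feature_net(x).squeeze(0)  # shape: (4096,)

# Text branch: train Doc2vec on the corpus, then infer a vector t_i per text.
# The two-document corpus here is a stand-in for the N downloaded texts.
corpus = [TaggedDocument(words=doc.lower().split(), tags=[i])
          for i, doc in enumerate(["a dog running on grass", "a red sports car"])]
doc2vec = Doc2Vec(corpus, vector_size=300, min_count=1, epochs=40)

def text_feature(text: str):
    """Return the feature vector t_i for one text."""
    return doc2vec.infer_vector(text.lower().split())  # shape: (300,)
```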



Abstract

The invention discloses a zero-sample cross-modal retrieval method based on multi-modal feature synthesis. The method employs two generative adversarial networks to synthesize feature representations of the different modalities from the class embeddings shared by the data of both modalities, then maps the original modal data and the synthesized modal data into a common subspace and aligns their distributions. In this way, the relation between different-modality data of the same category is established, and knowledge is transferred to unseen categories. A cycle-consistency constraint further reduces the difference between the original semantic features and the reconstructed semantic features, and establishes the association between the original representation and the semantic features within each modality, so that the common semantic space is more robust and the accuracy of zero-sample cross-modal retrieval is improved.
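The abstract names two ingredients: per-modality generators that synthesize features from the shared class embedding, and a cycle-consistency constraint that reconstructs that embedding from the synthesized features. The sketch below shows one such loss computation for the image modality, assuming a conditional discriminator and a plain GAN loss; network sizes, the loss weighting, and all names (G_img, D_img, R_img) are illustrative assumptions, not the patent's exact architecture.

```python
# Sketch of the abstract's feature-synthesis losses for one modality (image);
# the text modality is symmetric. All dimensions and names are assumptions.
import torch
import torch.nn as nn

EMB, FEAT, NOISE = 300, 4096, 100   # assumed embedding/feature/noise sizes

def mlp(din, dout):
    return nn.Sequential(nn.Linear(din, 1024), nn.ReLU(), nn.Linear(1024, dout))

G_img = mlp(EMB + NOISE, FEAT)      # class embedding + noise -> synthetic feature
D_img = mlp(FEAT + EMB, 1)          # conditional discriminator (real vs. fake)
R_img = mlp(FEAT, EMB)              # cycle branch: feature -> class embedding
bce, l1 = nn.BCEWithLogitsLoss(), nn.L1Loss()

def generator_step(class_emb):
    """Compute the generator's adversarial + cycle-consistency loss."""
    batch = class_emb.size(0)
    z = torch.randn(batch, NOISE)
    v_fake = G_img(torch.cat([class_emb, z], dim=1))        # synthesized v_i
    # Adversarial term: the synthesized feature should fool the discriminator.
    adv = bce(D_img(torch.cat([v_fake, class_emb], dim=1)),
              torch.ones(batch, 1))
    # Cycle-consistency term: the synthesized feature must map back to the
    # class embedding it was generated from.
    cyc = l1(R_img(v_fake), class_emb)
    return adv + cyc

# Usage with random stand-in class embeddings (batch of 8):
loss = generator_step(torch.randn(8, EMB))
loss.backward()
```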

Description

technical field

[0001] The invention belongs to the technical field of cross-modal retrieval, and more specifically relates to a zero-sample cross-modal retrieval method based on multimodal feature synthesis.

Background technique

[0002] The goal of cross-modal retrieval is to use a query from one modality (such as text) to search for semantically similar instances in another modality (such as image). Because the distributions and feature representations of data from different modalities are inconsistent, it is difficult to measure the similarity between different modalities directly. Existing methods usually establish a common subspace, map the data of the different modalities into this subspace to obtain a unified representation, and then compute cross-modal similarities with some measurement method; the instances most similar to the query are returned as the retrieval results, thus achieving cross-modal retrieval.

[0003] Howe...
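Paragraph [0002] describes the generic common-subspace scheme that the invention builds on. As a sketch of that scheme only, the snippet below projects both modalities into a shared space with placeholder linear maps and ranks candidates by cosine similarity; how such a space is actually learned is the patent's contribution and is not shown here. Dimensions and names are assumptions.

```python
# Sketch of common-subspace cross-modal retrieval as described in [0002]:
# untrained placeholder projections plus cosine-similarity ranking.
import torch
import torch.nn.functional as F

FEAT_IMG, FEAT_TXT, COMMON = 4096, 300, 256   # assumed dimensions
P_img = torch.nn.Linear(FEAT_IMG, COMMON)     # image features -> common subspace
P_txt = torch.nn.Linear(FEAT_TXT, COMMON)     # text features  -> common subspace

def retrieve(text_query, image_gallery, k=5):
    """Return indices of the k gallery images most similar to the text query."""
    q = F.normalize(P_txt(text_query), dim=-1)      # (COMMON,)
    g = F.normalize(P_img(image_gallery), dim=-1)   # (N, COMMON)
    sims = g @ q                                    # cosine similarities, (N,)
    return sims.topk(k).indices

# Usage with random stand-in features: one text query against 100 images.
print(retrieve(torch.randn(FEAT_TXT), torch.randn(100, FEAT_IMG)))
```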


Application Information

Patent Type & Authority: Application (China)
IPC (8): G06F16/583; G06F16/55; G06F16/38; G06F16/35; G06K9/62; G06N3/04; G06N3/08
CPC: G06F16/583; G06F16/55; G06F16/38; G06F16/355; G06N3/08; G06N3/045; G06F18/22
Inventor: 徐行, 张明, 林凯毅, 杨阳, 邵杰, 申恒涛
Owner: UNIV OF ELECTRONICS SCI & TECH OF CHINA