Cross-modal retrieval method based on modal specificity and shared feature learning

A feature learning and cross-modal retrieval technology, applied to neural learning methods, database retrieval, and digital data information retrieval. It addresses problems such as existing methods' limited combination of modality-specific and modality-shared information and the resulting loss of effective information, achieving good semantic discrimination and reduced distribution differences between modalities.

Active Publication Date: 2021-05-14
NANJING UNIV OF POSTS & TELECOMM

AI Technical Summary

Problems solved by technology

In addition, existing cross-modal retrieval work rarely considers both modality-specific information and modality-shared information during feature extraction, resulting in the loss of effective information.



Examples


Embodiment 1

[0054] Referring to figure 1, this embodiment provides a cross-modal retrieval method based on modality-specific and shared feature learning, comprising the following steps:

[0055] Step S1, obtaining a cross-modal retrieval data set, and dividing the cross-modal retrieval data set into a training set and a test set;

[0056] Specifically, in this embodiment, the data sets are obtained through conventional channels such as the Internet and include Wikipedia and NUS-WIDE; both consist of labeled image-text pairs. A minimal sketch of this step follows below.
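As a minimal Python sketch of step S1 (the patent does not specify a split ratio or storage format, so the tuple layout, ratio, and helper name here are illustrative assumptions):

```python
# Hypothetical sketch of step S1: shuffling labeled image-text pairs
# and dividing them into training and test sets. The (image_path,
# text, label) layout and 80/20 split are assumptions, not from the patent.
import random

def split_dataset(pairs, train_ratio=0.8, seed=42):
    """pairs: list of (image_path, text, label) tuples."""
    rng = random.Random(seed)
    shuffled = pairs[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]

# Toy usage in the Wikipedia / NUS-WIDE style of labeled image-text pairs:
pairs = [("img_0001.jpg", "a photo of a bridge", "architecture"),
         ("img_0002.jpg", "an essay about a painting", "art")]
train_set, test_set = split_dataset(pairs)
```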

[0057] Step S2, performing feature extraction on the text and images in the training set;

[0058] Specifically, in this embodiment, image features are extracted from the fc7 layer (the second of the three fully connected layers) of the VGG-19 model, and text features are extracted with a bag-of-words model.

[0059] In this embodiment, the VGG-19 model used includes 16 convolutional layers and 3 fully connected layers. The network structure is: the first part cons...
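A hedged Python sketch of step S2 under common assumptions: fc7 features are obtained by truncating the torchvision VGG-19 classifier after its second fully connected layer and ReLU, and bag-of-words text features come from scikit-learn's CountVectorizer. The preprocessing values are standard ImageNet defaults, not specified by the patent text.

```python
# Sketch of step S2: 4096-d fc7 image features via torchvision's VGG-19,
# bag-of-words text features via scikit-learn. Assumptions are noted inline.
import torch
from torchvision import models, transforms
from PIL import Image
from sklearn.feature_extraction.text import CountVectorizer

# Truncate the classifier after fc7 + ReLU so the model outputs
# 4096-d features instead of 1000-way class scores.
vgg = models.vgg19(weights=models.VGG19_Weights.IMAGENET1K_V1)
vgg.classifier = torch.nn.Sequential(*list(vgg.classifier.children())[:5])
vgg.eval()

# Standard ImageNet preprocessing (an assumption; the patent is silent on it).
preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

@torch.no_grad()
def image_features(path):
    x = preprocess(Image.open(path).convert("RGB")).unsqueeze(0)
    return vgg(x).squeeze(0)          # shape: (4096,)

# Bag-of-words text features: one raw count per vocabulary word.
texts = ["a photo of a bridge", "an essay about a painting"]
vectorizer = CountVectorizer()
text_features = vectorizer.fit_transform(texts).toarray()
```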



Abstract

The invention discloses a cross-modal retrieval method based on modality-specific and shared feature learning, comprising the following steps: S1, acquiring a cross-modal retrieval data set and dividing it into a training set and a test set; S2, performing feature extraction on the texts and images respectively; S3, extracting modality-specific features and modality-shared features; S4, generating hash codes corresponding to the modal samples through a hash network; S5, training the network by combining the loss function of the adversarial auto-encoder network with the loss function of the hash network; and S6, performing cross-modal retrieval on the samples in the test set using the network trained in step S5. In the method, a hash network is designed that projects the encoded features of the image channel, the encoded features of the text channel, and the modality-shared features into a common Hamming space, and modeling is performed using label information together with the modality-specific and shared features, so that the output hash codes have better semantic discrimination both between and within modalities.
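The abstract does not give the hash network's architecture; a minimal PyTorch sketch of the general idea (a fully connected projection into Hamming space, relaxed with tanh during training and binarized with sign for retrieval) might look like the following. The code length and single-layer design are illustrative assumptions, not the patented network:

```python
# Hypothetical hash-network sketch: project features to a short code,
# relax with tanh for gradient-based training, binarize with sign
# for retrieval. Sizes are assumptions; the patent does not fix them.
import torch
import torch.nn as nn

class HashNet(nn.Module):
    def __init__(self, in_dim=4096, code_len=64):
        super().__init__()
        self.fc = nn.Linear(in_dim, code_len)

    def forward(self, features):
        # tanh keeps the relaxed codes in (-1, 1) during training.
        return torch.tanh(self.fc(features))

net = HashNet()
relaxed = net(torch.randn(8, 4096))   # continuous codes used by the losses
binary = torch.sign(relaxed)          # {-1, +1} codes used at retrieval time

# Retrieval compares codes by Hamming distance; for +/-1 codes this is
# (code_len - dot_product) / 2.
hamming = (binary.size(1) - binary @ binary.t()) / 2
```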

Description

Technical field

[0001] The invention relates to a cross-modal retrieval method, in particular to a cross-modal retrieval method based on modality-specific and shared feature learning.

Background technique

[0002] In recent years, massive amounts of multimodal data have flooded our lives. Take news on the Internet as an example: a report usually includes a textual introduction, photos taken by reporters are often laid out on the same page, and there may even be exclusive video and audio coverage. Multimodal data such as text, images, video, and audio are an important means of efficiently obtaining the same information from multiple angles. Users need not only retrieval within a single modality but also a more flexible retrieval method: accurately retrieving related data in one modality from a query in another. Cross-modal retrieval has therefore become a hot topic widely discussed in academia. However, multimodal data usually have rel...


Application Information

Patent Type & Authority Applications(China)
IPC IPC(8): G06F16/9032G06F16/901G06N3/04G06N3/08
CPCG06F16/9032G06F16/9014G06N3/04G06N3/08
Inventor 吴飞罗晓开季一木黄庆花高广谓蒋国平
Owner NANJING UNIV OF POSTS & TELECOMM