Cross-modal retrieval method based on modal specificity and shared feature learning

A feature learning and cross-modal retrieval technology, applied to neural learning methods, database retrieval, and digital data information retrieval. It addresses problems such as existing methods' limited combination of modality-specific and modality-shared information and the resulting loss of effective information, achieving good semantic discrimination and reduced distribution differences between modalities.

Active Publication Date: 2021-05-14
NANJING UNIV OF POSTS & TELECOMM

AI Technical Summary

Problems solved by technology

In addition, existing cross-modal retrieval work rarely considers both modality-specific information and modality-shared information during feature extraction, resulting in the loss of effective information.



Examples


Embodiment 1

[0054] Referring to figure 1, this embodiment provides a cross-modal retrieval method based on modality-specific and shared feature learning, comprising the following steps:

[0055] Step S1, obtaining a cross-modal retrieval data set, and dividing the cross-modal retrieval data set into a training set and a test set;

[0056] Specifically, in this embodiment, the data sets are obtained through conventional channels such as the Internet and include Wikipedia and NUS-WIDE; both consist of labeled image-text pairs. A minimal sketch of this step follows below.
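As a minimal Python sketch of step S1 (the patent does not specify a split ratio or storage format, so the tuple layout, ratio, and helper name here are illustrative assumptions):

```python
# Hypothetical sketch of step S1: shuffling labeled image-text pairs
# and dividing them into training and test sets. The (image_path,
# text, label) layout and 80/20 split are assumptions, not from the patent.
import random

def split_dataset(pairs, train_ratio=0.8, seed=42):
    """pairs: list of (image_path, text, label) tuples."""
    rng = random.Random(seed)
    shuffled = pairs[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]

# Toy usage in the Wikipedia / NUS-WIDE style of labeled image-text pairs:
pairs = [("img_0001.jpg", "a photo of a bridge", "architecture"),
         ("img_0002.jpg", "an essay about a painting", "art")]
train_set, test_set = split_dataset(pairs)
```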

[0057] Step S2, performing feature extraction on the text and images in the training set;

[0058] Specifically, in this embodiment, image features are extracted from the fc7 layer (the second of the three fully connected layers) of the VGG-19 model, and text features are extracted with a bag-of-words model.

[0059] In this embodiment, the VGG-19 model used includes 16 convolutional layers and 3 fully connected layers. The network structure is: the first part cons...
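A hedged Python sketch of step S2 under common assumptions: fc7 features are obtained by truncating the torchvision VGG-19 classifier after its second fully connected layer and ReLU, and bag-of-words text features come from scikit-learn's CountVectorizer. The preprocessing values are standard ImageNet defaults, not specified by the patent text.

```python
# Sketch of step S2: 4096-d fc7 image features via torchvision's VGG-19,
# bag-of-words text features via scikit-learn. Assumptions are noted inline.
import torch
from torchvision import models, transforms
from PIL import Image
from sklearn.feature_extraction.text import CountVectorizer

# Truncate the classifier after fc7 + ReLU so the model outputs
# 4096-d features instead of 1000-way class scores.
vgg = models.vgg19(weights=models.VGG19_Weights.IMAGENET1K_V1)
vgg.classifier = torch.nn.Sequential(*list(vgg.classifier.children())[:5])
vgg.eval()

# Standard ImageNet preprocessing (an assumption; the patent is silent on it).
preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

@torch.no_grad()
def image_features(path):
    x = preprocess(Image.open(path).convert("RGB")).unsqueeze(0)
    return vgg(x).squeeze(0)          # shape: (4096,)

# Bag-of-words text features: one raw count per vocabulary word.
texts = ["a photo of a bridge", "an essay about a painting"]
vectorizer = CountVectorizer()
text_features = vectorizer.fit_transform(texts).toarray()
```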



Abstract

The invention discloses a cross-modal retrieval method based on modality-specific and shared feature learning, comprising the following steps: S1, acquiring a cross-modal retrieval data set and dividing it into a training set and a test set; S2, performing feature extraction on the texts and images respectively; S3, extracting modality-specific features and modality-shared features; S4, generating hash codes corresponding to the modal samples through a hash network; S5, training the network by combining the loss function of the adversarial auto-encoder network with the loss function of the hash network; and S6, performing cross-modal retrieval on the samples in the test set using the network trained in step S5. In the method, a hash network is designed that projects the encoded features of the image channel, the encoded features of the text channel, and the modality-shared features into a common Hamming space, and modeling is performed using label information together with the modality-specific and shared features, so that the output hash codes have better semantic discrimination both between and within modalities.
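The abstract does not give the hash network's architecture; a minimal PyTorch sketch of the general idea (a fully connected projection into Hamming space, relaxed with tanh during training and binarized with sign for retrieval) might look like the following. The code length and single-layer design are illustrative assumptions, not the patented network:

```python
# Hypothetical hash-network sketch: project features to a short code,
# relax with tanh for gradient-based training, binarize with sign
# for retrieval. Sizes are assumptions; the patent does not fix them.
import torch
import torch.nn as nn

class HashNet(nn.Module):
    def __init__(self, in_dim=4096, code_len=64):
        super().__init__()
        self.fc = nn.Linear(in_dim, code_len)

    def forward(self, features):
        # tanh keeps the relaxed codes in (-1, 1) during training.
        return torch.tanh(self.fc(features))

net = HashNet()
relaxed = net(torch.randn(8, 4096))   # continuous codes used by the losses
binary = torch.sign(relaxed)          # {-1, +1} codes used at retrieval time

# Retrieval compares codes by Hamming distance; for +/-1 codes this is
# (code_len - dot_product) / 2.
hamming = (binary.size(1) - binary @ binary.t()) / 2
```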

Description

Technical field

[0001] The invention relates to a cross-modal retrieval method, in particular to a cross-modal retrieval method based on modality-specific and shared feature learning.

Background technique

[0002] In recent years, massive amounts of multimodal data have flooded our lives. Take news on the Internet as an example: a report usually includes a textual introduction, photos taken by reporters are often laid out on the same page, and there may even be exclusive video and audio coverage. Multimodal data such as text, images, video, and audio are an important means of efficiently obtaining the same information from multiple angles. Users need not only retrieval within a single modality but also a more flexible retrieval method: accurately retrieving related data in one modality from a query in another. Cross-modal retrieval has therefore become a hot topic widely discussed in academia. However, multimodal data usually have rel...


Application Information

Patent Type & Authority Applications(China)
IPC IPC(8): G06F16/9032G06F16/901G06N3/04G06N3/08
CPCG06F16/9032G06F16/9014G06N3/04G06N3/08
Inventor 吴飞罗晓开季一木黄庆花高广谓蒋国平
Owner NANJING UNIV OF POSTS & TELECOMM