Cross-modal retrieval method based on deep self-supervised sorting hash

A deep cross-modal retrieval technique, applied in the field of pattern recognition, that addresses large coding errors and poor retrieval performance and produces more robust hash codes.

Active Publication Date: 2021-07-02
NANJING UNIV OF POSTS & TELECOMM

AI Technical Summary

Problems solved by technology

[0003] Cross-modal retrieval has developed considerably, and many shallow cross-modal hash retrieval methods have been proposed. These shallow methods perform hash learning on handcrafted features. Their common shortcoming is that the feature crafting process and the hash learning process are completely independent, so the handcrafted features may not be fully compatible with hash learning.
Another reason for unsatisfactory retrieval performance is that most existing deep hashing cross-modal retrieval methods discard the full labels of the data and use only the cross-modal similarity matrix for supervision, so the learned hash codes lack semantic information and are not accurate enough.
In addition, most cross-modal retrieval methods encode with binary space partitioning functions, which can produce large encoding errors and further degrade retrieval performance.
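
To make the point about discarded labels concrete, the following minimal sketch builds a cross-modal similarity matrix from multi-hot label vectors. The rule used here (two samples are similar if they share at least one label) is an assumption borrowed from common practice in cross-modal hashing, not a definition taken from this patent, and the function name and toy label vectors are hypothetical.

```python
def similarity_matrix(labels_a, labels_b):
    """Return S where S[i][j] = 1 if sample i of modality A and sample j of
    modality B share at least one label, else 0 (assumed rule, see lead-in)."""
    return [
        [1 if any(x and y for x, y in zip(la, lb)) else 0 for lb in labels_b]
        for la in labels_a
    ]


# Hypothetical multi-hot label vectors over 4 classes.
image_labels = [
    [1, 1, 0, 0],
    [0, 0, 1, 0],
]
text_labels = [
    [1, 0, 0, 0],
    [0, 1, 0, 1],
    [0, 0, 0, 1],
]

S = similarity_matrix(image_labels, text_labels)
for row in S:
    print(row)
```

In this toy case the first image is marked similar to the first two texts with the same value 1, even though it shares a different label with each. The matrix no longer records which labels were shared, which is the loss of semantic information described above.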

Examples

Embodiment Construction

[0037] The technical solution of the present invention is described in further detail below with reference to the accompanying drawings and a specific implementation. The present invention provides a cross-modal retrieval algorithm based on deep self-supervised sorting hash; the overall process is shown in Figure 1.

[0038] Step (1): Obtain a training dataset, where each sample includes text, images and labels. Here we use three widely used benchmark multimodal datasets, namely Wiki, MIRFlickr and NUS-WIDE.

[0039] Step (2): Use the label information to train the label network. The specific method is:

[0040] The purpose of the label network is to learn the semantic features of instances, which guide the feature learning of the image and text networks. Semantic feature learning: a 4-layer fully connected network is used. The input layer takes the label of the instance; the second layer has 4096 nodes, uses the ReLU activation function and performs local n...

Abstract

The invention provides a cross-modal retrieval method based on deep self-supervised sorting hash. First, a label network is learned that preserves the similarity relationship between semantic features and their corresponding hash codes; through this label network, multi-label information is used to bridge the semantic relevance among different modalities. Second, end-to-end feature learning networks are designed for images and text. On the one hand, this maintains the semantic correlation between the label network and the image and text networks; on the other hand, the learned features are well matched to the specific cross-modal retrieval task. To address the sensitivity of binary partition function coding to partition thresholds, a sorting-based coding function is adopted: as long as the relative order of the dimensions does not change, the hash code does not change, so the hash function is less sensitive to any particular threshold and the resulting hash codes are more robust.
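
The robustness argument can be illustrated with a small sketch. The excerpt does not spell out the patent's coding function, so the order-based encoder below (emit the index of the largest value in each group of dimensions, in the style of winner-take-all hashing) is an assumption used only for illustration; `sign_code`, `rank_code`, `group_size` and the feature values are hypothetical.

```python
def sign_code(features, threshold=0.0):
    """Binary partition coding: one bit per dimension, set by a threshold."""
    return [1 if v > threshold else 0 for v in features]


def rank_code(features, group_size=4):
    """Order-based coding (assumed scheme): for each consecutive group of
    dimensions, emit the index of the largest value in that group."""
    code = []
    for start in range(0, len(features), group_size):
        group = features[start:start + group_size]
        code.append(max(range(len(group)), key=group.__getitem__))
    return code


# Hypothetical 8-dimensional feature vector and a slightly shifted copy.
features = [0.05, -0.02, 0.40, 0.10, -0.30, 0.01, -0.04, -0.20]
shifted = [v - 0.06 for v in features]

print(sign_code(features), sign_code(shifted))  # bits near the threshold flip
print(rank_code(features), rank_code(shifted))  # relative order is unchanged
```

A uniform shift of 0.06 pushes two values across the zero threshold, so the thresholded code changes, while the order-based code stays the same because no pair of dimensions within a group changes its relative order.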

Description

Technical field

[0001] The invention relates to pattern recognition, in particular to a cross-modal retrieval method based on deep self-supervised sorting hash.

Background technique

[0002] Cross-modal retrieval has become a compelling topic in recent years due to the explosion of multimedia data on search engines and social media. Cross-modal retrieval aims to use data from one modality (e.g. text) to search for semantically similar instances in another modality (e.g. image). Since data from different modalities usually have incomparable feature representations and distributions, they must be mapped into a common feature space. To meet the requirements of low storage cost and high query speed in practical applications, hashing has attracted much attention in the field of cross-modal retrieval. It maps high-dimensional multi-modal data to a common Hamming space, and after obtaining the hash code, the similarity between multi-modal data can be calc...
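
As a rough illustration of why hashing gives low storage cost and fast queries, the sketch below packs binary codes into integers and ranks database items by Hamming distance using XOR and a bit count. This is a generic example of retrieval in Hamming space, not the patent's own procedure; the helper names, the 8-bit codes and the item names are hypothetical.

```python
def pack_bits(bits):
    """Pack a list of 0/1 values into a single integer, first bit most significant."""
    value = 0
    for b in bits:
        value = (value << 1) | b
    return value


def hamming_distance(code_a, code_b):
    """Count the bit positions in which two packed codes differ."""
    return bin(code_a ^ code_b).count("1")


# Hypothetical 8-bit code for a text query and codes for three images.
query = pack_bits([1, 0, 1, 1, 0, 0, 1, 0])
database = {
    "image_a": pack_bits([1, 0, 1, 1, 0, 0, 1, 1]),
    "image_b": pack_bits([0, 1, 0, 0, 1, 1, 0, 1]),
    "image_c": pack_bits([1, 0, 1, 0, 0, 1, 1, 0]),
}

# Rank images from most to least similar to the text query.
ranked = sorted(database, key=lambda name: hamming_distance(query, database[name]))
print(ranked)
```

Each comparison costs one XOR and one bit count regardless of the original feature dimensionality, which is what makes large-scale cross-modal search in Hamming space cheap.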

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/31, G06F16/953, G06F40/30, G06N3/04, G06N3/08
CPC: G06F16/325, G06F16/953, G06N3/084, G06N3/045
Inventor: 荆晓远, 钱金星, 吴飞, 董西伟
Owner: NANJING UNIV OF POSTS & TELECOMM