Cross-modal retrieval method and device based on deep adversarial discrete hash learning

A cross-modal hashing technology, applied to cross-modal retrieval methods and devices based on deep adversarial discrete hash learning, that addresses problems such as unstable optimization and large quantization errors. It achieves robust optimization, improved retrieval accuracy, and strong semantic learning ability.

Pending Publication Date: 2020-08-28
ZHEJIANG UNIV OF TECH
Cites: 4; Cited by: 13

AI Technical Summary

Problems solved by technology

Under such relaxed conditions, large quantization errors, unstable optimization, and related problems arise.

Detailed Description of the Embodiments

[0025] In order to make the purpose, technical solution and advantages of the present application clearer, the present application will be further described in detail below in conjunction with the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain the present application, not to limit the present application.

[0026] Hash learning maps data to binary strings through machine learning, which can significantly reduce data storage and communication overhead and thereby improve the efficiency of the learning system. The purpose of hash learning is to learn a binary hash code representation of the data such that the hash codes preserve, as far as possible, the neighbor relationships of the original space, that is, they preserve similarity. Specifically, each data point is represented as a compact binary string (hash code), and two similar points in the original space sho...
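The idea of a similarity-preserving hash code can be illustrated with a minimal sketch. It assumes the simplest possible binarization, the sign of each feature, which is not the learned hash function of the application; the helper names `binarize` and `hamming_distance` and the feature vectors are hypothetical.

```python
def binarize(features):
    """Map a real-valued vector to a {-1, +1} code via the sign function.

    Assumption: sign thresholding stands in for a learned hash function.
    """
    return [1 if x >= 0 else -1 for x in features]


def hamming_distance(code_a, code_b):
    """Count the positions at which two equal-length codes differ."""
    if len(code_a) != len(code_b):
        raise ValueError("codes must have the same length")
    return sum(1 for a, b in zip(code_a, code_b) if a != b)


# Hypothetical 8-dimensional features: x and y are close, z is far from both.
x = [0.9, -0.4, 0.7, 0.2, -0.8, 0.5, -0.1, 0.3]
y = [0.8, -0.5, 0.6, 0.1, -0.7, 0.4, 0.2, 0.3]
z = [-0.9, 0.6, -0.7, -0.3, 0.8, -0.5, 0.4, -0.2]

code_x, code_y, code_z = binarize(x), binarize(y), binarize(z)
d_xy = hamming_distance(code_x, code_y)  # small for similar points
d_xz = hamming_distance(code_x, code_z)  # large for dissimilar points
```

With these made-up vectors the nearby pair differs in one bit and the distant pair in all eight, which is the neighbor-preserving behavior the paragraph describes.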

Abstract

The invention discloses a cross-modal retrieval method and device based on deep adversarial discrete hash learning. The method comprises the following steps: an image network is formed from three fully connected layers, a text network is formed from four fully connected layers, and the final hash values are obtained through a tanh activation function. Adversarial training of the two networks ensures that the feature representations of the two modalities tend to be consistent. A weighted cosine triplet loss function guarantees semantic similarity while also preserving the relevance ranking of similar samples. The discreteness of the hash values is maintained during optimization, which reduces quantization errors. The result is two robust hash functions that preserve semantic similarity, reduce the heterogeneity gap, and have small cumulative error. By learning the hash functions through adversarial training while preserving semantic similarity and relevance ranking, the method improves retrieval precision and yields a hash learning method with stronger semantic learning capability.
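The two networks and the triplet loss named in the abstract can be sketched roughly as follows. Only the layer counts (three and four fully connected layers) and the tanh output come from the abstract; the layer widths, the ReLU hidden activation, the margin, the scalar `weight`, and all function names are assumptions made for illustration. The adversarial training and the discrete optimization are not shown.

```python
import math
import random


def make_mlp(layer_sizes, rng):
    """Create (weights, biases) for a stack of fully connected layers."""
    layers = []
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
        layers.append((weights, [0.0] * n_out))
    return layers


def forward(layers, x):
    """Hidden layers use ReLU (an assumption); the output layer uses tanh."""
    for index, (weights, biases) in enumerate(layers):
        x = [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]
        is_last = index == len(layers) - 1
        x = [math.tanh(v) if is_last else max(0.0, v) for v in x]
    return x


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return sum(a * b for a, b in zip(u, v)) / (norm_u * norm_v)


def weighted_cosine_triplet_loss(anchor, positive, negative, weight=1.0, margin=0.5):
    """Hinge on cosine similarities, scaled by a per-triplet weight.

    Assumption: the weighting scheme and margin here are placeholders.
    """
    return weight * max(0.0, margin - cosine(anchor, positive) + cosine(anchor, negative))


rng = random.Random(0)
code_len = 16
image_net = make_mlp([32, 24, 24, code_len], rng)     # three fully connected layers
text_net = make_mlp([20, 24, 24, 24, code_len], rng)  # four fully connected layers

image_feature = [rng.uniform(-1, 1) for _ in range(32)]
matching_text = [rng.uniform(-1, 1) for _ in range(20)]
other_text = [rng.uniform(-1, 1) for _ in range(20)]

anchor = forward(image_net, image_feature)
positive = forward(text_net, matching_text)
negative = forward(text_net, other_text)
loss = weighted_cosine_triplet_loss(anchor, positive, negative, weight=2.0)
```

Because of the tanh output, every component of the relaxed code lies in [-1, 1]; taking the sign of each component would give a binary code.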

Description

Technical field

[0001] The invention relates to the technical field of image big data processing and analysis within computer vision and natural language processing, and in particular to a cross-modal retrieval method and device based on deep adversarial discrete hash learning.

Background

[0002] With the development of modern network technology, a large amount of multimodal data, including text, audio, video and images, is generated in daily life every day. Efficient retrieval from such a large amount of multimodal data has become a great challenge, and image-to-text and text-to-image retrieval are the most widely studied tasks. Retrieval based on hash learning is widely used in various retrieval tasks because of its high efficiency and convenient storage. Hash learning learns an optimal hash function and maps high-dimensional data to binary codes while preserving the similarity between the data in the original...
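The efficiency and storage argument for hash-based retrieval can be made concrete with a small sketch. It assumes that image and text codes already exist and live in a shared Hamming space; the 8-bit codes, the item names, and the helpers `pack_bits`, `hamming` and `rank_by_hamming` are all hypothetical.

```python
def pack_bits(bits):
    """Pack a sequence of 0/1 bits into a single Python integer."""
    value = 0
    for bit in bits:
        value = (value << 1) | (1 if bit else 0)
    return value


def hamming(a, b):
    """Hamming distance between two packed codes: popcount of their XOR."""
    return bin(a ^ b).count("1")


def rank_by_hamming(query_code, database):
    """Return item names sorted by ascending Hamming distance to the query."""
    return sorted(database, key=lambda name: (hamming(query_code, database[name]), name))


# Hypothetical 8-bit image codes, assumed to come from a learned image hash function.
image_db = {
    "img_cat": pack_bits([1, 0, 1, 1, 0, 0, 1, 0]),
    "img_dog": pack_bits([1, 0, 1, 0, 0, 1, 1, 0]),
    "img_car": pack_bits([0, 1, 0, 0, 1, 1, 0, 1]),
}

# Hypothetical text query code, assumed to come from a learned text hash function.
text_query = pack_bits([1, 0, 1, 1, 0, 0, 1, 1])

ranking = rank_by_hamming(text_query, image_db)
```

Each item is stored as one small integer and compared with a single XOR plus a bit count, which is why text-to-image lookup over binary codes is cheap compared with distance computations on high-dimensional real vectors.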

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/33, G06F16/35, G06F16/53, G06F16/55, G06K9/62, G06N3/04, G06N3/08
CPC: G06F16/334, G06F16/35, G06F16/53, G06F16/55, G06N3/084, G06N3/045, G06F18/241
Inventor: 白琮, 曾超, 马青, 张敬林, 陈胜勇
Owner: ZHEJIANG UNIV OF TECH