
Cross-modal hash retrieval method based on triple deep networks

A triple-deep-network and triplet technology, applied in the field of computer vision, which solves the problem of low retrieval accuracy and achieves the effects of improving accuracy, enriching semantic information, and increasing discriminativeness.

Active Publication Date: 2018-06-15
XIDIAN UNIV

AI Technical Summary

Problems solved by technology

[0004] The purpose of the present invention is to address the above-mentioned deficiencies in the prior art and to propose a cross-modal hash retrieval method based on a triple deep network, which is used to solve the technical problem of low retrieval accuracy in existing cross-modal hash retrieval methods.



Embodiment Construction

[0036] The present invention is described in further detail below in conjunction with the accompanying drawings and specific embodiments.

[0037] Referring to figure 1, the present invention comprises the following steps:

[0038] Step 1) Preprocess the data:

[0039] Determine the data of the two modalities: image data and text data. Use the word2vec method to extract the Bag-of-words features of the text data, expressing the text in vector form for computer processing, and extract the original pixel features of the image data to retain the original image information. Take 80% of the image data as image training data and the rest as image query data; the text data corresponding to the image training data is used as text training data, and the rest is used as text query data.
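
A minimal sketch of this preprocessing step (illustrative only: the helper name, feature dimensions, and corpus handling are assumptions, and a simple count-based bag-of-words stands in for the word2vec-derived text features mentioned above):

```python
# Hedged sketch of Step 1 (assumed helper names and dimensions; a count-based
# bag-of-words stands in for the word2vec-derived text features).
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer

def preprocess(image_arrays, text_corpus, train_ratio=0.8, seed=0):
    """Build features for both modalities and split pairs 80/20 into train/query."""
    # Text modality: bag-of-words vectors so the text is in a computable vector form.
    text_feats = CountVectorizer().fit_transform(text_corpus).toarray().astype(np.float32)

    # Image modality: flattened raw pixels, retaining the original image information.
    image_feats = np.stack([img.reshape(-1) for img in image_arrays]).astype(np.float32)

    # 80% of the pairs become training data; the remainder becomes query data.
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(image_feats))
    n_train = int(train_ratio * len(idx))
    train_idx, query_idx = idx[:n_train], idx[n_train:]
    return ((image_feats[train_idx], text_feats[train_idx]),
            (image_feats[query_idx], text_feats[query_idx]))
```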

[0040] Step 2) Get the hash codes of image training data and text training data:

[0041] Input the Bag-of-words feature of the text training data into the text deep network to obtain ...
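
Because the excerpt is truncated here, the following is only a hedged sketch of how a text deep network might map Bag-of-words features to binary hash codes; the class name TextHashNet, the layer sizes, and the 32-bit code length are assumptions, not the patent's disclosed architecture:

```python
# Illustrative only: the patent's exact text-network architecture is not shown in
# this truncated excerpt. Layer sizes and the 32-bit code length are assumptions.
import torch
import torch.nn as nn

class TextHashNet(nn.Module):
    def __init__(self, bow_dim, code_len=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(bow_dim, 512), nn.ReLU(),
            nn.Linear(512, code_len), nn.Tanh(),  # relaxed outputs in (-1, 1)
        )

    def forward(self, bow):
        return self.net(bow)  # real-valued relaxation of the hash code

def hash_codes(model, bow_batch):
    """Binarize the relaxed network outputs into {-1, +1} hash codes."""
    with torch.no_grad():
        return torch.sign(model(bow_batch))
```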


Abstract

The invention provides a cross-modal hash retrieval method based on triple deep networks. The method is used to solve the technical problem of low retrieval precision in existing cross-modal hash retrieval methods and includes the following steps: preprocessing the data and dividing it into training data and query data; acquiring hash codes of the image training data and text training data; using triple supervisory information to establish an objective loss function; carrying out orderly iterative optimization of the objective loss function; calculating hash codes of the image query data and text query data; and acquiring retrieval results for the query data. According to the solution provided by the invention, the triple information is used to construct the objective loss function, which enriches the semantic information; an intra-modal loss function is added at the same time, which improves the discriminability of the method; and the precision of cross-modal retrieval is thereby effectively improved. The method can be used for Internet-of-Things information retrieval and image-text mutual-search services of e-commerce, mobile equipment and the like.
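
As a rough illustration of a triplet objective with an added intra-modal term, in the spirit of the abstract above (a sketch under assumptions: the Euclidean distance, fixed margin, and weighting factor alpha are placeholders, not the patent's exact loss formulation):

```python
# Sketch under assumptions: Euclidean distance, a fixed margin, and weight `alpha`
# are placeholders; this is not the patent's exact objective loss function.
import torch.nn.functional as F

def triplet_term(anchor, positive, negative, margin=1.0):
    """Standard margin-based triplet term on (anchor, positive, negative) codes."""
    d_pos = F.pairwise_distance(anchor, positive)
    d_neg = F.pairwise_distance(anchor, negative)
    return F.relu(d_pos - d_neg + margin).mean()

def objective_loss(img_a, txt_p, txt_n, img_p, img_n, alpha=0.5, margin=1.0):
    # Cross-modal term: image anchor against positive/negative text codes.
    inter_modal = triplet_term(img_a, txt_p, txt_n, margin)
    # Intra-modal term: the same image anchor against same-modality codes,
    # added to make the learned codes more discriminative.
    intra_modal = triplet_term(img_a, img_p, img_n, margin)
    return inter_modal + alpha * intra_modal
```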

Description

Technical field

[0001] The invention belongs to the technical field of computer vision and relates to mutual retrieval between large-scale image data and text data, specifically a cross-modal hash retrieval method based on a triple deep network, which can be used for Internet of Things information retrieval and image-text mutual-search services for e-commerce and mobile devices.

Background technique

[0002] With the rapid development of Internet technology and social networking sites, massive amounts of multimedia data, such as text, images, video, and audio, are generated every day. The mutual retrieval of cross-modal data has become a research hotspot in the field of information retrieval. The hash method is a very effective information retrieval method, with the advantages of low memory consumption and fast retrieval. Hash methods can be divided into single-modal hash methods, multi-modal hash methods, and cross-modal hash methods. The query data and retrieva...


Application Information

IPC(8): G06F17/30, G06K9/62, G06N3/08
CPC: G06F16/35, G06F16/583, G06N3/084, G06F18/22
Inventor: 邓成, 陈兆佳, 李超, 杨二昆, 杨延华
Owner: XIDIAN UNIV