Image-text cross-modal retrieval based on multi-layer semantic deep Hash algorithm

A hash algorithm combined with cross-modal technology, applied in the field of cross-modal retrieval. It addresses the problems that real-world data carry multiple labels and that existing methods cannot fully preserve the associations between data, which degrades retrieval results, and it achieves the effect of improving retrieval accuracy.

Inactive Publication Date: 2019-08-09
BEIJING JIAOTONG UNIV

AI Technical Summary

Problems solved by technology

However, with the continuous growth of multimedia data, deep-learning feature representations face storage and retrieval-efficiency challenges because of their high dimensionality, which makes them unsuitable for large-scale multimedia retrieval tasks.
At the same time, the problem of cross-modal retrieval also faces th...

Method used




Detailed Description of the Embodiments

[0018] The present invention will be further described below in conjunction with specific examples:

[0019] In the present invention, the image and text modalities are taken as examples for discussion.

[0020] The present invention provides an image-text cross-modal retrieval method based on a multi-layer semantic deep hashing algorithm (Deep Multi-Level Semantic Hashing for Cross-modal Retrieval, DMSH), which comprises three modules: a deep feature extraction module, a similarity matrix generation module, and a hash code learning module, as shown in Figure 1.
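As a rough illustration of how these modules could fit together, the sketch below wires two modality-specific encoders to per-modality hash layers; all class, method, and parameter names are hypothetical and are not taken from the patent. The similarity matrix generation module is not part of the network itself: it builds the supervision matrix from labels and is used only in the loss (a sketch of it appears after the Abstract below).

```python
# Minimal, hypothetical skeleton of the two learned branches; names are
# illustrative assumptions, not taken from the patent.
import torch
import torch.nn as nn

class DMSHPipeline(nn.Module):
    def __init__(self, image_encoder: nn.Module, text_encoder: nn.Module,
                 feat_dim: int, hash_bits: int):
        super().__init__()
        # Deep feature extraction module: one encoder branch per modality.
        self.image_encoder = image_encoder
        self.text_encoder = text_encoder
        # Hash code learning module: one fully connected hash layer per modality.
        self.image_hash = nn.Linear(feat_dim, hash_bits)
        self.text_hash = nn.Linear(feat_dim, hash_bits)

    def forward(self, images: torch.Tensor, texts: torch.Tensor):
        # tanh keeps training differentiable; taking sign() of these outputs
        # yields the final binary hash codes at retrieval time.
        img_code = torch.tanh(self.image_hash(self.image_encoder(images)))
        txt_code = torch.tanh(self.text_hash(self.text_encoder(texts)))
        return img_code, txt_code
```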

[0021] Table 1 Image feature extraction network structure


[0024] The deep feature extraction module uses deep neural networks to extract image and text features. The deep convolutional neural network CNN-F is used for image feature extraction, and its network configuration is shown in Table 1. In the text feature extraction stage, the text data is first modeled...
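The following is a minimal sketch of such a two-branch feature extractor in PyTorch, assuming a CNN-F-style convolutional backbone for the image branch and, since paragraph [0024] is truncated and Table 1 is not reproduced above, an assumed bag-of-words multi-layer perceptron for the text branch; the layer sizes are illustrative, not the patent's.

```python
import torch.nn as nn

class CNNFImageNet(nn.Module):
    """Image branch in the style of CNN-F. Layer sizes follow the published
    CNN-F architecture; the patent's Table 1 is not reproduced above, so the
    exact configuration here is an assumption."""
    def __init__(self, feat_dim: int = 4096):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 64, kernel_size=11, stride=4), nn.ReLU(inplace=True),
            nn.MaxPool2d(2),
            nn.Conv2d(64, 256, kernel_size=5, padding=2), nn.ReLU(inplace=True),
            nn.MaxPool2d(2),
            nn.Conv2d(256, 256, kernel_size=3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(256, 256, kernel_size=3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(256, 256, kernel_size=3, padding=1), nn.ReLU(inplace=True),
            nn.MaxPool2d(2),
        )
        self.fc = nn.Sequential(
            nn.Flatten(),
            nn.LazyLinear(4096), nn.ReLU(inplace=True), nn.Dropout(0.5),  # infers flattened size
            nn.Linear(4096, feat_dim), nn.ReLU(inplace=True),
        )

    def forward(self, x):
        return self.fc(self.features(x))

class TextMLP(nn.Module):
    """Assumed text branch: bag-of-words vectors passed through a two-layer
    perceptron (the patent's text model is truncated in the excerpt above)."""
    def __init__(self, vocab_size: int, feat_dim: int = 4096):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(vocab_size, 8192), nn.ReLU(inplace=True),
            nn.Linear(8192, feat_dim), nn.ReLU(inplace=True),
        )

    def forward(self, x):
        return self.net(x)
```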



Abstract

The invention relates to an image-text cross-modal retrieval model that combines deep learning with hashing methods. To overcome the limitation that traditional deep-learning-based cross-modal hashing methods reduce multi-label data to a single label, the invention provides a deep cross-modal hashing algorithm based on multi-layer semantics. The similarity between data items is defined through the co-occurrence relation of their multi-label annotations and is used as the supervision information for network training. A loss function that jointly considers multi-layer semantic similarity and binary similarity is designed to train the network, so that feature extraction and hash code learning are unified in one framework and end-to-end learning is realized. The algorithm makes full use of the semantic correlation information between data and improves retrieval accuracy.
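As a rough illustration of the label co-occurrence similarity described above, the sketch below scores each image-text pair by the fraction of labels the two samples share; the patent only states that similarity is derived from multi-label co-occurrence, so the exact normalization used here is an assumption.

```python
import numpy as np

def multilevel_similarity(labels_a: np.ndarray, labels_b: np.ndarray) -> np.ndarray:
    """Pairwise similarity from multi-label annotations.

    labels_a: (n, c) binary label matrix for one modality's samples.
    labels_b: (m, c) binary label matrix for the other modality's samples.
    Returns an (n, m) matrix in [0, 1]; 0 means no shared label and 1 means
    identical label sets. The normalization (shared labels divided by labels
    in the union) is an assumed choice, not the patent's exact definition.
    """
    shared = labels_a @ labels_b.T                         # co-occurring labels per pair
    union = (labels_a.sum(1, keepdims=True)
             + labels_b.sum(1, keepdims=True).T - shared)  # labels present in either sample
    return np.where(union > 0, shared / np.maximum(union, 1), 0.0)

# Example: two images vs. two texts annotated with 4 candidate labels.
imgs = np.array([[1, 0, 1, 0], [0, 1, 0, 0]])
txts = np.array([[1, 0, 0, 0], [0, 1, 1, 0]])
print(multilevel_similarity(imgs, txts))  # graded values such as 0.5 and 0.33, not just 0/1
```

These graded values, rather than a single same-or-different flag, serve as the supervision signal the abstract refers to: the training loss can push the hash codes of an image-text pair closer together the higher their label-based similarity is.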

Description

Technical Field

[0001] The invention relates to the field of cross-modal retrieval, and in particular to an image-text cross-modal retrieval algorithm based on multi-layer semantics that combines deep learning with a hashing method.

Background

[0002] With the development of the mobile Internet and the popularization of smartphones, digital cameras and other devices, multimedia data on the Internet is growing explosively. In the field of information retrieval, the continuous growth of multimedia big data has created demand for cross-modal retrieval applications. However, current mainstream search engines, such as Baidu, Google and Bing, only return results in a single modality. In addition, as deep learning has achieved a series of breakthroughs in computer vision and natural language processing, combining multimedia big data with artificial intelligence is a common development trend for both fields. Therefore, combi...

Claims


Application Information

IPC(8): G06F16/583; G06F16/58
CPC: G06F16/5846; G06F16/5866; G06F16/583
Inventors: 冀振燕, 姚伟娜, 杨文韬, 皮怀雨
Owner: BEIJING JIAOTONG UNIV