Universal cross-modal retrieval model based on deep hash

A cross-modal retrieval and modeling technology, applied in network data retrieval, other database retrieval, biological neural network models, etc., which addresses the problems of retrieval delay and inefficiency, large storage space requirements, and the neglect of retrieval efficiency in existing models.

Pending Publication Date: 2021-07-06
CHINA UNIV OF PETROLEUM (EAST CHINA)


Problems solved by technology

[0004] Most existing techniques model directly on the extracted feature values to achieve cross-modal retrieval, which is very time-consuming on large-scale datasets and requires a large amount of storage space.



Examples


Embodiment 1

[0030] As shown in Figure 1, a universal cross-modal retrieval model based on deep hashing comprises an image model 1, a text model 2, a binary code conversion model 3 and a Hamming space 4, wherein:

[0031] The image model 1 extracts image features, abstracting the features and semantics of the original images;

[0032] The text model 2 converts the text data into vector form and extracts the features and semantics of the text;

[0033] The binary code conversion model 3 converts the features and semantics extracted by the image and text models into binary codes, thereby mapping data points from the original feature spaces of the different modalities into the common Hamming space;

[0034] The Hamming space 4 is the common subspace of the image and text feature spaces, in which the Hamming distance between the binary hash codes of the query data and of the original data is computed for similarity ranking (see the sketch below).
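Paragraphs [0031] to [0034] name the four components but the visible text does not give concrete layer definitions. The following is a minimal sketch of the text model, the binary code conversion, and the Hamming-space ranking, assuming PyTorch; the MLP text encoder, the tanh/sign binarization, and the 64-bit code length are illustrative assumptions, not taken from the patent.

```python
# Minimal sketch of components 2-4; layer sizes, the MLP text encoder and the
# tanh/sign binarization are assumptions for illustration, not patent text.
import torch
import torch.nn as nn

HASH_BITS = 64  # assumed code length

class TextModel(nn.Module):
    """Text model 2: maps a text vector (e.g. bag-of-words) to a continuous
    code in (-1, 1) with HASH_BITS dimensions."""
    def __init__(self, vocab_size=10000, hidden=512):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(vocab_size, hidden), nn.ReLU(),
            nn.Linear(hidden, HASH_BITS), nn.Tanh(),
        )

    def forward(self, x):
        return self.net(x)

def to_binary_codes(continuous_codes: torch.Tensor) -> torch.Tensor:
    """Binary code conversion model 3: threshold (-1, 1) activations to {0, 1}."""
    return (continuous_codes > 0).to(torch.uint8)

def hamming_rank(query_code: torch.Tensor, db_codes: torch.Tensor) -> torch.Tensor:
    """Hamming space 4: rank database items by Hamming distance to the query."""
    distances = (query_code != db_codes).sum(dim=1)   # differing bits per item
    return torch.argsort(distances)                   # smallest distance first
```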

[0035] The image model 1 mainly recommends the use of CNN models such as ResNet and DenseNet...
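Paragraph [0035] only recommends CNN backbones such as ResNet or DenseNet for the image model. Below is a minimal sketch under that assumption, using torchvision's ResNet-50 with its classification head replaced by a hash projection; the tanh head and the 64-bit code length are illustrative, not specified by the patent.

```python
# Sketch of image model 1: ResNet-50 backbone (torchvision) whose classifier
# is replaced by a hash projection; the tanh head and code length are assumed.
import torch.nn as nn
from torchvision import models

class ImageModel(nn.Module):
    def __init__(self, hash_bits=64):
        super().__init__()
        backbone = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
        backbone.fc = nn.Identity()          # keep the 2048-d pooled feature
        self.backbone = backbone
        self.hash_layer = nn.Sequential(nn.Linear(2048, hash_bits), nn.Tanh())

    def forward(self, images):
        features = self.backbone(images)     # abstracted image features and semantics
        return self.hash_layer(features)     # continuous codes in (-1, 1)
```

A DenseNet backbone could be substituted in the same way, with the hash layer's input dimension adjusted to that backbone's feature size.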



Abstract

The invention discloses a universal cross-modal retrieval model based on deep hashing. The model comprises an image model, a text model, a binary code conversion model and a Hamming space. The image model performs feature and semantic extraction on the image data; the text model performs feature and semantic extraction on the text data; the binary code conversion model converts the original features into binary codes; the Hamming space is a common subspace of the image and text data, in which the similarity of cross-modal data can be computed directly. The model combines deep learning and hash learning to solve cross-modal retrieval: data points in the original feature spaces are mapped to binary codes in the common Hamming space, and similarity ranking is carried out by computing the Hamming distance between the codes of the query data and the codes of the original data, yielding the retrieval result and greatly improving retrieval efficiency. Because the binary codes replace the original data in storage, the storage capacity required by retrieval tasks is greatly reduced.
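The abstract states that binary codes replace the original feature storage and that retrieval reduces to Hamming-distance ranking. A minimal NumPy sketch of that ranking step is given below, assuming 64-bit codes packed into bytes; the database size and all array names are illustrative.

```python
# Sketch of the storage and ranking step described in the abstract: binary
# codes are packed into bytes instead of storing float features, and
# similarity ranking is XOR plus popcount. Sizes are illustrative.
import numpy as np

rng = np.random.default_rng(0)
db_codes = rng.integers(0, 2, size=(100_000, 64), dtype=np.uint8)  # stored hash codes
query_code = rng.integers(0, 2, size=(1, 64), dtype=np.uint8)      # hash code of the query

# Binary codes replace the original data: 64 bits pack into 8 bytes per item.
db_packed = np.packbits(db_codes, axis=1)
query_packed = np.packbits(query_code, axis=1)

# Hamming distance = number of differing bits: XOR the packed codes, then count set bits.
xor = np.bitwise_xor(db_packed, query_packed)
distances = np.unpackbits(xor, axis=1).sum(axis=1)

# Similarity ranking: smaller Hamming distance means more similar.
ranking = np.argsort(distances)
print(ranking[:10], distances[ranking[:10]])
```

Storing 8 bytes per item instead of, say, a 2048-dimensional float32 feature vector (8 KB) reduces storage by roughly three orders of magnitude, which is the efficiency claim made in the abstract.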

Description

Technical field

[0001] The invention relates to the field of cross-modal retrieval, in particular to cross-modal retrieval between images and texts.

Background technique

[0002] In recent years, with the vigorous development of the Internet and the popularity of smart devices and social networks, multimedia data on the Internet has grown explosively. These massive data come in various modalities such as text, image, video, and audio, and the same thing is often described by data of many different modalities. Such data are "heterogeneous and multi-source" in form but semantically interrelated. People's demand for information acquisition is no longer satisfied by single-modal retrieval, and realizing cross-modal retrieval through knowledge collaboration across different modalities has become a research hotspot in recent years.

[0003] Deep learning has made breakthroughs in single-modal fields such as natural language processing, image recognition, and speech recognition. The powerful abstraction...

Claims


Application Information

IPC(8): G06F16/953, G06F40/30, G06K9/62, G06N3/04, G06N3/08
CPC: G06F16/953, G06F40/30, G06N3/049, G06N3/08, G06N3/045, G06F18/213, G06F18/214
Inventor: 段友祥, 陈宁, 孙歧峰
Owner: CHINA UNIV OF PETROLEUM (EAST CHINA)