A supervised fast discrete multimodal hash retrieval method and system

A multi-modal, supervised technology applied in the field of cross-modal retrieval. It addresses the problems of time-consuming, high-complexity hash code learning, and achieves the effects of enhanced discriminative power, high learning efficiency, and avoidance of high computational complexity.

Status: Inactive. Publication Date: 2019-03-08
SHANDONG NORMAL UNIV


Problems solved by technology

Therefore, the hash codes obtained by these methods contain only limited semantic information.
[0005] 2) High computational complexity: such methods have to learn the hash code bit by bit, which is time-consuming when dealing with large datasets.



Examples


Embodiment 1

[0054] This embodiment discloses a supervised fast discrete multimodal hash retrieval method, comprising the following steps:

[0055] Step 1: Obtain the multimodal training data set O_train, where each sample contains pairs of multimodal data features, such as image and text;

[0056] Step 2: Using a joint multimodal feature map, project the multimodal training dataset O_train to a joint multimodal intermediate representation;

[0057] Step 2 specifically comprises:

[0058] First, the data features of each modality in the multimodal training data set O_train are transformed into a nonlinear embedding φ_m(x^(m)):

[0059]

[0060] Among them, {x^(m)}_{m=1,...,M} is the training data of the m-th modality, and there are M modalities in total; the anchor point set is formed by randomly selecting a part of the samples from the training samples of the corresponding modality; N is the total number of training samples of the modality; P is t...
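The embedding formula itself is not reproduced in this excerpt. A common choice for such anchor-based nonlinear embeddings is a Gaussian (RBF) kernel over P anchors randomly sampled from the training data of the same modality; the sketch below assumes that form. The function name `nonlinear_embedding`, the bandwidth heuristic, and the anchor count are illustrative assumptions, not taken from the patent.

```python
import numpy as np

def nonlinear_embedding(X, anchors, sigma=None):
    """Anchor-based nonlinear embedding phi_m(x^(m)) (illustrative).

    X:       (N, d) feature matrix of one modality.
    anchors: (P, d) anchor points sampled from the same modality.
    sigma:   kernel bandwidth; if None, a mean-distance heuristic is used
             (an assumption, not specified by the patent).
    Returns: (N, P) matrix whose i-th row is phi_m(x_i^(m)).
    """
    # Squared Euclidean distances between every sample and every anchor.
    d2 = (np.sum(X ** 2, axis=1, keepdims=True)
          + np.sum(anchors ** 2, axis=1)
          - 2.0 * X @ anchors.T)
    d2 = np.maximum(d2, 0.0)  # guard against tiny negative values
    if sigma is None:
        sigma = np.sqrt(d2.mean()) + 1e-12
    # Gaussian (RBF) kernel responses to the anchors.
    return np.exp(-d2 / (2.0 * sigma ** 2))

# Example: P = 500 anchors drawn at random from the training samples.
rng = np.random.default_rng(0)
X_train = rng.standard_normal((2000, 512))            # e.g. image features
anchors = X_train[rng.choice(len(X_train), 500, replace=False)]
phi = nonlinear_embedding(X_train, anchors)            # shape (2000, 500)
```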

Embodiment 2

[0117] The purpose of this embodiment is to provide a computer system.

[0118] A computer system, comprising a memory, a processor, and a computer program stored in the memory and operable on the processor, wherein, when the processor executes the program, the following is realized:

[0119] Receive a multimodal training dataset, where each sample contains pairs of multimodal data features;

[0120] Using the joint multimodal feature map, project the multimodal training dataset into a joint multimodal intermediate representation;

[0121] For the joint multimodal intermediate representation of the multimodal training data set, construct a supervised fast discrete multimodal hash objective function; solve the objective function to obtain a hash function;

[0122] Receive the multimodal retrieval data set and the multimodal test data set, project the samples therein into a joint multimodal intermediate representation, and then project them into the Hamming space according to the hash function to obtain the h...
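The excerpt names a supervised fast discrete multimodal hash objective but does not reproduce it. The following is a minimal sketch of how such a hash function could be learned from the joint intermediate representation, assuming a typical supervised discrete hashing pattern (alternating closed-form updates with a sign-based discrete step). The names `learn_hash_function` and `hash_codes`, and the specific regression terms, are assumptions for illustration, not the patent's actual objective.

```python
import numpy as np

def learn_hash_function(Phi, Y, code_len=32, lam=1.0, iters=5, seed=0):
    """Illustrative supervised discrete hash learning.

    Phi: (N, K) joint multimodal intermediate representation.
    Y:   (N, C) label matrix used as supervision (one-hot or multi-hot).
    Returns (W, B): projection matrix and binary codes in {-1, +1}.
    """
    rng = np.random.default_rng(seed)
    N, K = Phi.shape
    B = np.sign(rng.standard_normal((N, code_len)))   # random discrete init
    I_K = np.eye(K)
    for _ in range(iters):
        # W-step: ridge regression from the representation to the codes.
        W = np.linalg.solve(Phi.T @ Phi + lam * I_K, Phi.T @ B)
        # G-step: linear mapping from codes to labels (the supervision term).
        G = np.linalg.solve(B.T @ B + lam * np.eye(code_len), B.T @ Y)
        # B-step: discrete update that keeps B in {-1, +1} in one shot.
        B = np.sign(Y @ G.T + Phi @ W)
        B[B == 0] = 1
    return W, B

def hash_codes(Phi, W):
    """Project an intermediate representation into Hamming space."""
    H = np.sign(Phi @ W)
    H[H == 0] = 1
    return H
```

Keeping B discrete during optimization, and updating all bits jointly rather than bit by bit, is what "fast discrete" hashing methods of this kind aim for, matching the problems the patent says it avoids.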

Embodiment 3

[0124] The purpose of this embodiment is to provide a computer-readable storage medium.

[0125] A computer-readable storage medium, on which a computer program is stored, and when the program is executed by a processor, the following steps are performed:

[0126] Receive a multimodal training dataset, where each sample contains pairs of multimodal data features;

[0127] Using the joint multimodal feature map, project the multimodal training dataset into a joint multimodal intermediate representation;

[0128] For the joint multimodal intermediate representation of the multimodal training data set, construct a supervised fast discrete multimodal hash objective function; solve the objective function to obtain a hash function;

[0129] Receive the multimodal retrieval data set and the multimodal test data set, project the samples therein into a joint multimodal intermediate representation, and then project them into the Hamming space according to the hash function to obtain the hash co...
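Once hash codes are available, retrieving samples for the test (query) set reduces to ranking the retrieval set by Hamming distance. A minimal sketch of that final step using bit packing and XOR/popcount; the packing scheme and 64-bit code length here are illustrative assumptions.

```python
import numpy as np

def pack_codes(H):
    """Pack {-1, +1} codes into uint8 bit-strings for fast XOR/popcount."""
    return np.packbits((H > 0).astype(np.uint8), axis=1)

def hamming_rank(query_bits, db_bits):
    """Return retrieval-set indices sorted by Hamming distance to the query."""
    xor = np.bitwise_xor(query_bits, db_bits)       # (N_db, n_bytes)
    dist = np.unpackbits(xor, axis=1).sum(axis=1)   # popcount per row
    return np.argsort(dist, kind="stable")

# Example: rank a retrieval set of 10,000 64-bit codes for one test query.
rng = np.random.default_rng(1)
db = np.sign(rng.standard_normal((10000, 64))); db[db == 0] = 1
q = np.sign(rng.standard_normal((1, 64)));      q[q == 0] = 1
ranking = hamming_rank(pack_codes(q), pack_codes(db))
```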



Abstract

The invention discloses a supervised fast discrete multi-modal hash retrieval method and system. The method includes: receiving a multi-modal training data set, wherein each sample contains a pair of multi-modal data features; projecting the multi-modal training data set to a joint multi-modal intermediate representation by using a joint multi-modal feature map; constructing, for the joint multi-modal intermediate representation of the multi-modal training data set, a supervised fast discrete multi-modal hash objective function, and solving the objective function to obtain a hash function; receiving a multi-modal retrieval data set and a multi-modal test data set, projecting their samples into the joint multi-modal intermediate representation, and then projecting them into the Hamming space according to the hash function to obtain hash codes; and, based on the hash codes, retrieving samples from the multi-modal retrieval data set for the multi-modal test data set. The invention learns discrete hash codes for heterogeneous multi-modal data while ensuring both learning efficiency and retrieval precision.

Description

Technical Field

[0001] The invention belongs to the technical field of cross-modal retrieval, and in particular relates to a supervised fast discrete multi-modal hash retrieval method and system.

Background Technique

[0002] Owing to its fast similarity computation and low storage cost, hashing can significantly improve the speed of large-scale data retrieval. Therefore, many researchers have devoted themselves to learning-based hashing techniques, especially for single-modal and cross-modal retrieval.

[0003] In multimedia retrieval, target data objects are usually described by heterogeneous multimodal features, where the features of different modalities have their own attributes and can exhibit unique data characteristics from different aspects. For example, an image is usually represented by heterogeneous image and text features, and a video can be fully represented by multiple features (such as image, text, audio and time channels). In order to support large-scale multime...


Application Information

Patent Type & Authority Applications(China)
IPC IPC(8): G06F16/43G06K9/62
CPCG06F18/214
Inventor 张化祥芦旭李静朱磊刘丽王振华郭培莲
Owner SHANDONG NORMAL UNIV