
Image retrieval method based on deep cross-modal hashing with a joint semantic matrix

An image retrieval technology based on a joint semantic matrix, applied in the field of deep learning and image retrieval, which addresses problems such as the unsatisfactory application performance of existing methods

Active Publication Date: 2021-09-14
OCEAN UNIV OF CHINA
Cites: 10 · Cited by: 0

AI Technical Summary

Problems solved by technology

Although deep cross-modal hashing algorithms have been studied in depth, their final application performance is still not ideal.



Examples


Embodiment 1

[0051] An image retrieval method based on deep cross-modal hashing with a joint semantic matrix (DCSJM), including the following steps (the specific process is shown in Figure 1):

[0052] S1: Randomly obtain a batch of image-text pair data, and construct a label matrix T;

[0053] S2: The image and text data are sent into the pretrained VGG19 model and the Word2Vec model to obtain image and text features (see the upper-left part of Figure 1); specifically, the image data passes through ImgCNN (the image network) to obtain image features, and the text passes through TextCNN (the text network) to obtain text features;

[0054] S3: Use the features obtained in S2 to construct a joint semantic matrix (see the dotted box on the right of Figure 1: the image similarity matrix is obtained by computing the cosine distance between image features, with I1, I2, and I3 taken as examples in the figure; the corresponding texts are denoted T1, T2, and T3);
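A sketch of this similarity computation, assuming cosine similarity per modality and a simple average as the fusion rule (the excerpt does not show the patent's exact fusion formula, so the average is an assumption):

```python
import numpy as np

def cosine_sim(F):
    """Row-wise cosine similarity matrix of a feature matrix F (m x d)."""
    Fn = F / np.linalg.norm(F, axis=1, keepdims=True)
    return Fn @ Fn.T

rng = np.random.default_rng(0)
m = 4                                    # batch size (I1..I4 / T1..T4)
img_feat = rng.normal(size=(m, 4096))    # placeholder image features
txt_feat = rng.normal(size=(m, 300))     # placeholder text features

S_img = cosine_sim(img_feat)             # image similarity matrix
S_txt = cosine_sim(txt_feat)             # text similarity matrix

# Fuse the two modality similarities into one joint semantic matrix
# (simple average assumed here).
S_joint = 0.5 * (S_img + S_txt)
print(np.round(S_joint, 3))
```

The joint matrix is symmetric with unit diagonal, since each instance is maximally similar to itself in both modalities.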

[0055] S4: Use the label matri...

Embodiment 2

[0058] Embodiment 2 (this embodiment is a detailed elaboration of Embodiment 1)

[0059] An image retrieval method based on Deep Cross-Modal Hashing (DCSJM) of Joint Semantic Matrix, comprising the following steps:

[0060] S1: Let n denote the number of image-text pair instances used in the experiment, denoted as O = {o_i}_{i=1}^n with o_i = (x_i, y_i), where x_i represents the i-th image instance and y_i represents the i-th text instance. Each image-text pair corresponds to a class vector l_i ∈ {0,1}^c, where c represents the number of categories; if the i-th instance belongs to the j-th category, then l_ij = 1, otherwise l_ij = 0. Construct a label matrix T for each batch of data.
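The label matrix and the pairwise similarity it induces can be sketched as follows (the category assignments below are hypothetical):

```python
import numpy as np

# Hypothetical mini-batch of n = 3 image-text pairs over c = 4 categories;
# labels[i] lists the category indices of the i-th pair.
n, c = 3, 4
labels = [[0], [1, 3], [0, 2]]

# T[i, j] = 1 if instance i belongs to category j, else 0.
T = np.zeros((n, c), dtype=int)
for i, cats in enumerate(labels):
    T[i, cats] = 1

# T @ T.T > 0 gives the pairwise semantic-similarity indicator that
# supervised cross-modal hashing methods typically derive from labels.
S_label = (T @ T.T > 0).astype(int)
print(T)
print(S_label)
```

Instances 0 and 2 share category 0, so S_label marks them similar; instance 1 shares no category with the others.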

[0061] S2: The image and text data are respectively sent into the pretrained VGG19 model and the Word2Vec model to obtain image and text features. First, some definitions used to construct the joint semantic matrix are introduced, including the symbol denoting the batch size; the specific description is as follows, using ...

Embodiment 3

[0087] Embodiment 3 (this embodiment is verified by specific experimental data)

[0088] For the specific process of this embodiment, refer to Embodiment 2.

[0089] Experiments are performed on the widely used Pascal Sentence dataset. This dataset is a subset of Pascal VOC and contains 1000 image-text description pairs from 20 categories. In the experiments, the 19-layer VGGNet is used to learn the image data representation, with the 4096-dimensional feature learned by the fc7 layer used as the image representation vector. For text data, a sentence CNN is used to learn a 300-dimensional representation vector for each text.

[0090] Results on the Pascal Sentence dataset:

[0091] The hyperparameters are validated multiple times and finally set to = 0.0001, = 0.1, and = 0.0001. In the experiments, the hyperparameters in the other loss functions are adjusted according to the actual situation.

[0092] Figure 2 shows the mAP values at different code lengths in ...
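For reference, the mAP reported here averages, over all queries, the average precision (AP) of each ranked retrieval list; a minimal AP sketch on toy data:

```python
import numpy as np

def average_precision(relevant, ranking):
    """AP of one query: `ranking` lists database indices sorted by
    ascending Hamming distance; `relevant` is a boolean relevance mask."""
    hits, precision_sum = 0, 0.0
    for rank, idx in enumerate(ranking, start=1):
        if relevant[idx]:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / max(relevant.sum(), 1)

# Toy example: one query against 5 database items.
relevant = np.array([True, False, True, False, False])
ranking = [2, 1, 0, 3, 4]          # retrieval order for the query
ap = average_precision(relevant, ranking)
print(round(ap, 3))                # hits at ranks 1 and 3 -> (1/1 + 2/3) / 2
```

mAP is then the mean of these AP values across the query set, computed separately for each code length (bit count).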



Abstract

The invention discloses an image retrieval method based on deep cross-modal hashing with a joint semantic matrix. The method includes: randomly obtaining a batch of image-text pair data and constructing a label matrix; sending the images and texts into the pretrained VGG19 model and Word2Vec model respectively to obtain image features and text features, and constructing a joint semantic matrix; using the label matrix and the joint semantic matrix as supervision information to build a deep cross-modal supervised hashing framework, setting an improved objective function, and supervising the training of the network parameters; repeating the above steps until the number of training iterations reaches a set value, yielding a well-trained deep cross-modal supervised hashing model; and, after preprocessing the image data to be retrieved, inputting it into the trained deep cross-modal supervised hashing model for retrieval and outputting the retrieval results. Experiments verify that the model proposed by the present invention achieves better retrieval performance than existing baseline methods.

Description

Technical Field

[0001] The invention belongs to the technical field combining deep learning and image retrieval, and in particular relates to an image retrieval method based on deep cross-modal hashing with a joint semantic matrix.

Background Technique

[0002] With the development of science and technology and the arrival of the big data era, approximate nearest neighbor (ANN) methods play an important role in machine learning and image retrieval applications. Hashing has been widely studied by researchers due to its high efficiency and low storage cost in solving ANN search problems. The main principle of hashing is to map data from the original space to the Hamming space while preserving, as much as possible, the similarity between points in the original space and in the Hamming space. The resulting binary codes can be used for large-scale retrieval and other applications, which not only greatly reduces storage space but also improves search speed.

[0003] In most current appl...
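The Hamming-space retrieval principle described above can be illustrated with a small sketch (the 8-bit codes below are made up for illustration):

```python
import numpy as np

def hamming_dist(query, codes):
    """Hamming distances between a query code and a matrix of database
    codes, all stored as 0/1 arrays (one bit per column)."""
    return np.count_nonzero(codes != query, axis=1)

# Toy 8-bit binary codes for a database of 4 items plus one query.
db = np.array([[0, 1, 1, 0, 1, 0, 0, 1],
               [0, 1, 1, 0, 1, 0, 1, 1],
               [1, 0, 0, 1, 0, 1, 1, 0],
               [0, 0, 1, 0, 1, 0, 0, 1]])
query = np.array([0, 1, 1, 0, 1, 0, 0, 1])

d = hamming_dist(query, db)            # [0, 1, 8, 1]
order = np.argsort(d, kind="stable")   # nearest neighbors first: [0, 1, 3, 2]
print(d, order)
```

Because the distance is a bitwise count, it can be computed with XOR and popcount on packed codes, which is what makes hashing-based retrieval fast and storage-light at scale.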

Claims


Application Information

Patent Type & Authority: Patent (China)
IPC(8): G06F16/55 · G06F16/58 · G06K9/62 · G06N3/08
CPC: G06F16/55 · G06F16/5866 · G06N3/08 · G06F18/214
Inventor: 曹媛, 陈娜, 桂杰
Owner OCEAN UNIV OF CHINA