Cross-modal retrieval method based on graph convolutional neural network

A graph convolutional neural network and cross-modal retrieval technology, applied in the field of end-to-end cross-modal retrieval, which solves the problems of insufficient utilization of multi-modal data, poor ability to represent data across modalities, and low retrieval accuracy in existing methods, and achieves strong cross-modal retrieval and representation ability while narrowing the semantic gap between modalities.

Active Publication Date: 2020-08-28
ZHEJIANG UNIV OF TECH
Cites: 3 | Cited by: 18

Problems solved by technology

[0003] In order to overcome the shortcomings of existing cross-modal retrieval methods, such as insufficient utilization of multi-modal data, poor ability to represent data across different modalities, and low retrieval accuracy, the present invention provides a high-precision cross-modal retrieval method based on a graph convolutional neural network that makes full use of multi-modal data.




Detailed Description of the Embodiments

[0032] In order to make the objectives, technical solutions and advantages of the present invention clearer, the present invention will be further described below in conjunction with specific embodiments and with reference to the accompanying drawings.

[0033] Referring to Figure 1, a cross-modal retrieval method based on graph convolutional neural networks includes four processes: network construction, data set preprocessing, network training, and retrieval and accuracy testing.

[0034] The multi-modal data set used in this embodiment contains a total of 4,500 pairs of multi-modal data. Each pair includes an image, a set of image-related data, a paragraph of text, and a set of text-related data, and is marked with a category label; the labels fall into three categories.
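
As a concrete illustration only, one such pair could be organized as in the following minimal Python sketch; the field names and types are assumptions for illustration, since the patent does not specify a storage format.

```python
from dataclasses import dataclass
from typing import List

import numpy as np

@dataclass
class MultiModalPair:
    """One of the 4,500 multi-modal pairs described above (field names assumed)."""
    image: np.ndarray                # the image, e.g. an H x W x 3 array
    image_related: List[np.ndarray]  # the set of image-related data
    text: str                        # a paragraph of text
    text_related: List[str]          # the set of text-related data
    label: int                       # category label in {0, 1, 2}: three classes
```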

[0035] The cross-modal retrieval method based on graph convolutional neural network includes the following steps:

[0036] Step 1: Network construction. The process is as follows: ...
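
The details of Step 1 are truncated in this excerpt. For background, graph convolutional networks of the kind named in the title are commonly built from layers implementing the propagation rule of Kipf and Welling; the following is a minimal PyTorch sketch of such a layer, offered as a generic illustration rather than the patent's specific architecture.

```python
import torch
import torch.nn as nn

class GraphConvLayer(nn.Module):
    """One graph convolution step: H' = ReLU(A_hat @ H @ W), where A_hat is
    the adjacency matrix with self-loops, symmetrically normalized
    (the standard Kipf-Welling formulation; assumed here, not quoted
    from the patent)."""

    def __init__(self, in_dim: int, out_dim: int):
        super().__init__()
        self.linear = nn.Linear(in_dim, out_dim, bias=False)  # the weight W

    def forward(self, h: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        # Add self-loops, then normalize: A_hat = D^-1/2 (A + I) D^-1/2.
        a = adj + torch.eye(adj.size(0), device=adj.device)
        d_inv_sqrt = a.sum(dim=1).pow(-0.5)
        a_hat = d_inv_sqrt.unsqueeze(1) * a * d_inv_sqrt.unsqueeze(0)
        return torch.relu(a_hat @ self.linear(h))
```

In a two-branch design of the kind the abstract describes, one stack of such layers could encode a graph over image features and another a graph over text features, with both branches projecting into a shared semantic space.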



Abstract

The invention discloses a cross-modal retrieval method based on a graph convolutional neural network. The method comprises four processes: network construction, data set preprocessing, network training, and retrieval and precision testing. Semantic representations in the image modality and the text modality are learned separately using graph convolutional neural networks, which helps capture the latent relationships among features within each modality. The method further introduces the associated data of a third modality to reduce the semantic gap between modalities, and can significantly improve the accuracy and stability of cross-modal retrieval, thereby realizing accurate cross-modal retrieval.
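
To make the retrieval and precision-testing stage concrete, here is a minimal sketch of nearest-neighbor cross-modal retrieval by cosine similarity in a learned common space, with a simple label-based precision measure. This is a generic illustration under assumed names, not the patent's exact evaluation protocol.

```python
import numpy as np

def retrieve(query_emb: np.ndarray, gallery_embs: np.ndarray) -> np.ndarray:
    """Rank gallery items (e.g. texts) for one query (e.g. an image) by
    cosine similarity of their embeddings in the common semantic space."""
    q = query_emb / np.linalg.norm(query_emb)
    g = gallery_embs / np.linalg.norm(gallery_embs, axis=1, keepdims=True)
    return np.argsort(-(g @ q))  # gallery indices, most similar first

def precision_at_k(ranked_labels: np.ndarray, query_label: int, k: int = 10) -> float:
    """Fraction of the top-k retrieved items sharing the query's category label."""
    return float(np.mean(ranked_labels[:k] == query_label))
```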

Description

Technical field

[0001] The present invention relates to the field of multi-modal retrieval, and in particular to an end-to-end cross-modal retrieval method.

Background technique

[0002] Cross-modal retrieval is a method that uses data of one modality as a query and returns retrieval results in other modalities. It is widely used for matching image and text data; for example, in a traditional image-to-text cross-modal retrieval task, the most similar text is returned for a query image. In recent years, with the rapid development of deep learning, most current cross-modal retrieval methods use neural networks to retrieve multi-modal data directly without relying on labels. However, these methods simply combine cross-modal retrieval algorithms with deep neural networks, for example by selecting several image features and applying dimensionality reduction to them; most do not make full use of the latent deep information in the multi-modal data, ...


Application Information

IPC(8): G06N 3/04; G06N 3/08; G06K 9/62; G06F 16/48; G06F 16/45; G06F 16/43
CPC: G06N 3/08; G06F 16/45; G06F 16/43; G06F 16/48; G06N 3/045; G06F 18/251; G06F 18/214; Y02D 10/00
Inventors: Bai Cong (白琮), Zhou Pengfei (周鹏飞)
Owner: ZHEJIANG UNIV OF TECH