Image and text retrieval method based on graph neural network structure modeling

A technology for image and text retrieval based on graph neural network structure modeling, applied in digital data information retrieval, special data processing applications, instruments, etc. It addresses the problem that fine-grained alignment is rarely considered, which limits the overall similarity calculation of images and texts and affects retrieval accuracy.

Active Publication Date: 2020-06-23
UNIV OF SCI & TECH OF CHINA

AI Technical Summary

Problems solved by technology

However, these methods seldom consider the fine-grained alignment between visual elements and text elements. This limits the overall similarity calculation between images and texts and affects the final retrieval accuracy.

Examples

Embodiment Construction

[0013] The technical solutions in the embodiments of the present invention will be clearly and completely described below in conjunction with the accompanying drawings in the embodiments of the present invention. Obviously, the described embodiments are only some of the embodiments of the present invention, not all of them. Based on the embodiments of the present invention, all other embodiments obtained by persons of ordinary skill in the art without making creative efforts belong to the protection scope of the present invention.

[0014] An embodiment of the present invention provides an image and text retrieval method based on graph neural network structure modeling. As shown in Figure 1, which illustrates the process of the whole method, the main procedure is the same in the training and testing phases, specifically:

[0015] Training phase: extract the visual elements and initial text elements of a single picture and text pair, and introduce an attention mechanism to re-represent each text e...
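The attention step mentioned in [0015] can be illustrated with a minimal NumPy sketch. The paragraph is truncated in this listing, so everything below is an assumption rather than the patented procedure: the function names, the cosine-similarity scoring, the `temperature` parameter, and the choice to re-represent each text element as an attention-weighted sum of visual elements are all hypothetical.

```python
import numpy as np

def l2_normalize(x, axis=-1, eps=1e-8):
    """Scale vectors along `axis` to (approximately) unit length."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)

def softmax(x, axis=-1):
    """Numerically stable softmax along `axis`."""
    shifted = x - x.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)

def attend_text_to_regions(text_feats, region_feats, temperature=0.1):
    """Assumed attention scheme: each text element attends over all
    visual elements and is paired with a weighted sum of them."""
    scores = l2_normalize(text_feats) @ l2_normalize(region_feats).T
    weights = softmax(scores / temperature, axis=1)  # (n_text, n_regions)
    attended = weights @ region_feats                # (n_text, dim)
    return attended, weights

def image_text_similarity(text_feats, region_feats, temperature=0.1):
    """Average cosine similarity between each text element and its
    attended visual context (an illustrative pooling choice)."""
    attended, _ = attend_text_to_regions(text_feats, region_feats, temperature)
    cos = np.sum(l2_normalize(text_feats) * l2_normalize(attended), axis=1)
    return float(cos.mean())

# Toy inputs: 5 text elements and 6 visual elements, 16-dim features.
rng = np.random.default_rng(0)
text_feats = rng.normal(size=(5, 16))
region_feats = rng.normal(size=(6, 16))
attended, weights = attend_text_to_regions(text_feats, region_feats)
score = image_text_similarity(text_feats, region_feats)
```

Each row of `weights` sums to one, so every text element is matched against a convex combination of visual elements; a lower `temperature` makes that combination concentrate on fewer regions.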

Abstract

The invention discloses an image and text retrieval method based on graph neural network structure modeling. The method comprises the following steps: an attention mechanism is employed to represent the fine-grained visual and text elements extracted from a picture and a text, so that the similarity of the picture and the text can be calculated better; a graph structure is constructed adaptively from the visual and text elements, and features are updated through graph convolution, so that the intra-modal and inter-modal relations of the visual and text elements are better considered; and a constraint mechanism is introduced between different picture-text pairs and in the alignment of visual and text elements, so that fine-grained text elements can more easily correspond to the matching picture regions. The reliability of picture-text-level similarity calculation is thereby improved, and the accuracy of picture and text retrieval is improved.
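The graph step in the abstract can be sketched as follows. This is only an illustration under stated assumptions, not the method as claimed: the abstract does not say how the adaptive graph is built, so the sketch assumes edge weights come from a row-wise softmax over cosine similarities and that a single graph-convolution layer of the form ReLU(A·H·W) updates the features. The names `build_adaptive_adjacency`, `graph_conv`, and `temperature` are hypothetical.

```python
import numpy as np

def build_adaptive_adjacency(nodes, temperature=0.1):
    """Assumed construction: edge weights are a row-wise softmax over
    cosine similarities between all node features."""
    normed = nodes / (np.linalg.norm(nodes, axis=1, keepdims=True) + 1e-8)
    scores = normed @ normed.T / temperature
    scores = scores - scores.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    return weights / weights.sum(axis=1, keepdims=True)

def graph_conv(nodes, adjacency, weight):
    """One graph-convolution update: aggregate neighbours, project, ReLU."""
    return np.maximum(adjacency @ nodes @ weight, 0.0)

# Toy inputs: 4 visual elements and 3 text elements, 8-dim features.
rng = np.random.default_rng(0)
visual = rng.normal(size=(4, 8))
text = rng.normal(size=(3, 8))

# Stacking both modalities in one graph gives intra-modal edges
# (visual-visual, text-text) and inter-modal edges (visual-text).
nodes = np.concatenate([visual, text], axis=0)
adjacency = build_adaptive_adjacency(nodes)
projection = rng.normal(scale=0.1, size=(8, 8))  # stand-in for a learned weight
updated = graph_conv(nodes, adjacency, projection)
```

Because the adjacency is computed from the features themselves, the graph changes with each image-text pair, which is one plausible reading of "adaptive" here.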

Description

Technical field

[0001] The invention relates to the technical field of multimedia retrieval, in particular to an image and text retrieval method based on graph neural network structure modeling.

Background technique

[0002] With the influx of massive multimedia data into the Internet, multimedia retrieval technologies spanning multiple different modal data (visual, text, voice, etc.) play an increasingly important role.

[0003] Traditional image retrieval techniques often use tags to retrieve images. This process is often unidirectional and can only utilize discrete labeled data. The two-way retrieval of images and texts contains richer semantics and is more in line with the human habit of using natural language. However, there is a big difference between the data of the two different modalities of vision and text. In order to achieve cross-modal retrieval of images and texts, it is necessary to integrate computer vision and natural language understanding well.

[0004...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/432, G06F16/483, G06F16/438
CPC: G06F16/432, G06F16/434, G06F16/483, G06F16/438, Y02D10/00
Inventor: 张勇东, 张天柱, 魏曦
Owner: UNIV OF SCI & TECH OF CHINA