Image visual relationship anaphora positioning method based on attention pyramid graph network

An image visual relationship anaphora positioning method, applied in the field of image visual relationship anaphora positioning. It addresses problems such as large intra-class variance of visual features, visual relationships that are hard to distinguish, and difficult model learning, and thereby improves model accuracy and positioning accuracy.

Active Publication Date: 2020-06-05
TONGJI UNIV

AI Technical Summary

Problems solved by technology

Symmetric stacked attention transfer methods rely on a single small-scale attention feature map and may therefore fail to localize small entities effectively.
In addition, the symmetric stacked attention transfer method models relationships from the visual features of the image. The visual features of a given relationship category in a visual relationship triplet have large intra-class variance, so relying on visual features makes the model harder to train and the visual relationships harder to distinguish.
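
The first problem can be illustrated with a minimal sketch. It is not taken from the patent: the 8x8 grid, the `avg_pool` helper, and the pooling factor of 4 are assumptions chosen only to show how the response of a small entity is diluted when a single coarse attention map is used.

```python
def avg_pool(grid, k):
    """Average-pool a square 2D list with a k x k window and stride k."""
    n = len(grid) // k
    return [
        [
            sum(grid[i * k + di][j * k + dj] for di in range(k) for dj in range(k))
            / (k * k)
            for j in range(n)
        ]
        for i in range(n)
    ]


# Hypothetical 8x8 attention map in which a small entity occupies one cell.
fine = [[0.0] * 8 for _ in range(8)]
fine[2][5] = 1.0

# A single small-scale map (2x2 here) spreads that response over 16 cells.
coarse = avg_pool(fine, 4)

peak_fine = max(max(row) for row in fine)
peak_coarse = max(max(row) for row in coarse)
print(peak_fine, peak_coarse)
```

In this toy setting the peak response falls from 1.0 to 1/16 at the coarse scale, and the entity can only be placed within a 4x4 block rather than at a single cell.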

Examples

Embodiment Construction

[0033] The present invention will be described in detail below in conjunction with the accompanying drawings and specific embodiments. This embodiment is carried out on the premise of the technical solution of the present invention, and detailed implementation and specific operation process are given, but the protection scope of the present invention is not limited to the following embodiments.

[0034] This embodiment provides an image visual relationship anaphora positioning method based on an attention pyramid graph network. The method processes an input image with an attention pyramid graph network model and obtains the corresponding visual relationship anaphora entity positioning map, thereby realizing anaphora positioning.

[0035] In this method, the structure of the attention pyramid graph network model is shown in Figure 1. It includes an attention feature pyramid network (Attention Pyramid Networks) and a relationship conduction graph network (Relationship...

Abstract

The invention relates to an image visual relationship anaphora positioning method based on an attention pyramid graph network. The method processes an input picture with an attention pyramid graph network model to obtain the corresponding visual relationship anaphora entity positioning map, thereby realizing anaphora positioning. The attention pyramid graph network model comprises an attention feature pyramid network and a relationship conduction graph network. The attention feature pyramid network acquires multi-scale attention feature maps from the input picture, and the relationship conduction graph network obtains the final visual relationship anaphora entity positioning map from those multi-scale attention feature maps. Compared with the prior art, the method offers advantages such as high positioning precision and high robustness.
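
The two-stage flow described in the abstract (multi-scale attention features first, a final positioning result second) can be sketched as follows. This is an illustrative stand-in under stated assumptions, not the patented model: the helpers `avg_pool`, `upsample`, `fuse_pyramid`, and `argmax2d`, the scale factors `(1, 2, 4)`, and the plain averaging used in place of the relationship conduction graph network are all hypothetical.

```python
def avg_pool(grid, k):
    """Average-pool a square 2D list with a k x k window and stride k."""
    n = len(grid) // k
    return [
        [
            sum(grid[i * k + di][j * k + dj] for di in range(k) for dj in range(k))
            / (k * k)
            for j in range(n)
        ]
        for i in range(n)
    ]


def upsample(grid, k):
    """Nearest-neighbour upsampling of a 2D list by an integer factor k."""
    rows, cols = len(grid) * k, len(grid[0]) * k
    return [[grid[i // k][j // k] for j in range(cols)] for i in range(rows)]


def fuse_pyramid(fine, factors=(1, 2, 4)):
    """Build maps at several scales and average them at full resolution.

    Averaging is only a placeholder (an assumption) for whatever the
    relationship conduction graph network does with the multi-scale maps.
    """
    size = len(fine)
    fused = [[0.0] * size for _ in range(size)]
    for k in factors:
        level = upsample(avg_pool(fine, k), k)
        for i in range(size):
            for j in range(size):
                fused[i][j] += level[i][j] / len(factors)
    return fused


def argmax2d(grid):
    """Return the (row, col) index of the largest value in a 2D list."""
    cells = ((i, j) for i in range(len(grid)) for j in range(len(grid[0])))
    return max(cells, key=lambda p: grid[p[0]][p[1]])


# Hypothetical 8x8 attention map with one small entity at row 2, column 5.
fine = [[0.0] * 8 for _ in range(8)]
fine[2][5] = 1.0

fused = fuse_pyramid(fine)
location = argmax2d(fused)
print(location, fused[2][5])
```

Because the finest scale is kept alongside the coarser ones, the fused map in this toy example still peaks at the small entity's own cell, while the coarser levels add surrounding context.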

Description

Technical field

[0001] The invention relates to an image visual relationship anaphora positioning method, in particular to an image visual relationship anaphora positioning method based on an attention pyramid graph network.

Background art

[0002] In recent years, referring expression comprehension has received increasing attention in the fields of computer vision and natural language processing. The task aims to locate a specific referent entity. It helps resolve ambiguity between entities of the same category and therefore plays an important role in application scenarios such as image retrieval in the security field and human-robot interaction. However, traditional referring expression comprehension tasks evaluate both the natural language module and the computer vision module of a model, so it is difficult to judge whether an error is caused by the language module or the vision module. To alleviate the need for ...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/58, G06N3/08
CPC: G06F16/58, G06N3/084
Inventor: 王瀚漓, 朱健
Owner: TONGJI UNIV