Relation visual attention mechanism-based scene graph generation method

A scene graph generation technique based on a relation visual attention mechanism, applied in the field of computer vision. It addresses poor network interpretability and the lack of a suitable attention mechanism, and aims for strong interpretability, less redundant interaction between predictions and features, and improved accuracy.

Active Publication Date: 2020-04-10
XIDIAN UNIV
Cites: 4; Cited by: 10

AI Technical Summary

Problems solved by technology

[0007] Because neither of the two methods above establishes a suitable attention mechanism, the features the network uses for the final relationship classification do not actually focus on the region where the two targets are related, which results in poor network interpretability.



Examples


Embodiment Construction

[0042] The embodiments and effects of the present invention will be further described in detail below with reference to the accompanying drawings.

[0043] Building on the attention mechanism, the present invention exploits the fact that, in the scene graph task, every relationship involves an interaction between a subject and an object, and assumes that a relationship must occur in a region where the two targets touch or are close to each other. On the basis of target detection, a relationship attention transfer function is proposed. By learning this function through alternating iterations, the network not only learns a better relationship representation, but that representation also corresponds more closely to the real region where the relationship between the two targets occurs. Its implementation plan is to first construct the image features of the data set; o...
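The geometric intuition in [0043] can be illustrated with a minimal sketch. It assumes axis-aligned bounding boxes in (x1, y1, x2, y2) form; the names `Box`, `union_box`, `contact_region`, and `midpoint_of_centers` are hypothetical and do not come from the patent. The learned relationship attention transfer function is not reproduced here: the midpoint between the two box centers only stands in as a crude starting guess for where a relationship might occur.

```python
from typing import NamedTuple, Optional, Tuple


class Box(NamedTuple):
    """Axis-aligned bounding box; the coordinate layout is an assumption."""
    x1: float
    y1: float
    x2: float
    y2: float


def union_box(a: Box, b: Box) -> Box:
    """Smallest box enclosing both subject and object (the 'union' region)."""
    return Box(min(a.x1, b.x1), min(a.y1, b.y1),
               max(a.x2, b.x2), max(a.y2, b.y2))


def contact_region(a: Box, b: Box) -> Optional[Box]:
    """Overlap of the two boxes, or None when they do not overlap."""
    x1, y1 = max(a.x1, b.x1), max(a.y1, b.y1)
    x2, y2 = min(a.x2, b.x2), min(a.y2, b.y2)
    if x1 >= x2 or y1 >= y2:
        return None
    return Box(x1, y1, x2, y2)


def center(box: Box) -> Tuple[float, float]:
    """Geometric center of a box."""
    return ((box.x1 + box.x2) / 2, (box.y1 + box.y2) / 2)


def midpoint_of_centers(a: Box, b: Box) -> Tuple[float, float]:
    """Hypothetical stand-in for a learned attention target:
    the point halfway between the subject and object centers."""
    (ax, ay), (bx, by) = center(a), center(b)
    return ((ax + bx) / 2, (ay + by) / 2)


subject = Box(0, 0, 2, 2)
obj = Box(1, 1, 4, 3)
print(union_box(subject, obj))
print(contact_region(subject, obj))
print(midpoint_of_centers(subject, obj))
```

In this sketch the union box bounds the area from which relationship features could be pooled, while the contact region and the midpoint give two simple candidates for the "close area between the two targets" that the text describes.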



Abstract

The invention discloses a scene graph generation method based on a relation visual attention mechanism, which mainly aims to solve the problems of redundant relation prediction and poor interpretability in the prior art. According to the embodiments of the invention, the method includes the following steps: 1) obtaining the categories and bounding boxes of targets in images through target detection, and establishing a fully connected relation graph; 2) sparsifying the relation graph by analyzing the data set to obtain a sparse relation graph representation; 3) learning a relation attention transfer function through alternating iterations, transferring the attention on subjects and objects to the place where the relation occurs by means of union-region features, and learning accurate relation representations; and 4) classifying the learned relation representations and combining them into a final scene graph. Based on the internal connection by which a relation arises between two targets, the method establishes a relation attention mechanism that attends accurately to the region where the relation occurs, generates the scene graph accurately, and improves the interpretability of the network. The method can be used for image description and visual question answering tasks.
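Steps 1) and 2) of the abstract can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say how the data set is analyzed, so the sketch assumes that a candidate edge is kept when its (subject category, object category) pair appears at least `min_count` times among training triples. The function names and the `min_count` parameter are hypothetical.

```python
from collections import Counter
from itertools import permutations
from typing import List, Sequence, Tuple

Triple = Tuple[str, str, str]  # (subject category, predicate, object category)
Edge = Tuple[int, int]         # (subject index, object index)


def build_full_graph(labels: Sequence[str]) -> List[Edge]:
    """Step 1: every ordered pair of distinct detections is a candidate edge."""
    return list(permutations(range(len(labels)), 2))


def category_pair_counts(training_triples: Sequence[Triple]) -> Counter:
    """Count how often each (subject, object) category pair is related
    in the training annotations (assumed form of 'analyzing the data set')."""
    return Counter((subj, obj) for subj, _predicate, obj in training_triples)


def sparsify(edges: Sequence[Edge], labels: Sequence[str],
             pair_counts: Counter, min_count: int = 1) -> List[Edge]:
    """Step 2: keep only edges whose category pair is frequent enough."""
    return [(i, j) for i, j in edges
            if pair_counts[(labels[i], labels[j])] >= min_count]


labels = ["person", "bag", "sky"]
triples = [("person", "carrying", "bag"), ("person", "holding", "bag")]

full = build_full_graph(labels)
sparse = sparsify(full, labels, category_pair_counts(triples))
print(len(full), sparse)
```

With three detections the fully connected graph has six directed candidate edges. Under the assumed co-occurrence rule, only the person-to-bag edge survives, so the later attention and classification steps would process far fewer pairs.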

Description

Technical field

[0001] The invention belongs to the field of computer vision, and in particular relates to a method for generating a scene graph, which can be used for image description and visual question answering tasks.

Background technique

[0002] With the development of deep learning, computer understanding of images has reached a new level. From object detection to semantic segmentation to instance detection, computer vision has made great progress. However, many problems remain for deeper image understanding. Because the objects in an image do not exist independently of one another, tasks such as object detection cannot capture the relationships between instances, for example a person carrying a bag versus a person holding a bag. Although the detected object categories may be the same, the relationship classes between them differ. In order to make computers understand images further like humans, Johnson et al. proposed the scen...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06K9/62, G06N3/04
CPC: G06N3/045, G06F18/24, G06F18/214
Inventors: 刘芳, 李玲玲, 王思危, 焦李成, 陈璞华, 古晶, 刘旭, 郭雨薇
Owner: XIDIAN UNIV