Visual relation detection method and device based on scene graph high-order semantic structure

The detection method applies scene graph technology in the field of image processing. It addresses the difficulty of obtaining triplet labels of the correct type and quantity. Its stated effects are optimized computational complexity, a simple and direct position encoding step, and reduced hardware requirements.

Active Publication Date: 2021-08-10
SHENZHEN GRADUATE SCHOOL TSINGHUA UNIV

AI Technical Summary

Problems solved by technology

[0007] To solve the technical problem that annotations of the correct type and quantity, and complete triplet annotations, are difficult to obtain, the present invention proposes a visual relationship detection method and device based on the high-order semantic structure of the scene graph.



Examples


Embodiment Construction

[0036] For a clearer understanding of the technical features, purposes, and effects of the present invention, specific implementations of the invention are now described with reference to the accompanying drawings.

[0037] The visual relationship detection method based on the high-order semantics of the scene graph proposed by the embodiment of the present invention specifically includes:

[0038] S1, visual feature extraction: the category and position of every object in the picture are predicted by a convolutional neural network (CNN) and a region-based convolutional neural network (RCNN). The category of an object is a number, generally obtained by sequentially encoding the objects that may appear in the input data. The position of an object is a box determined by two points, the upper-left and lower-right corners of the box, and each point has an abscissa and an ordinate value. At the same time, it also ...
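The output format described in S1 can be illustrated with a minimal sketch. It assumes that "sequentially encoding" means assigning integer ids in first-seen order, and that image coordinates grow rightward and downward so the lower-right corner has the larger values. The names `build_category_index` and `Detection` are hypothetical and do not come from the patent, and no detector is run here.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Tuple


def build_category_index(names: Iterable[str]) -> Dict[str, int]:
    """Assign each distinct category name an integer id in first-seen order.

    Assumption: this is one plausible reading of "sequentially encoding"
    the objects that may appear in the input data.
    """
    index: Dict[str, int] = {}
    for name in names:
        if name not in index:
            index[name] = len(index)
    return index


@dataclass(frozen=True)
class Detection:
    """One detected object: a numeric category plus a two-point box."""

    category_id: int
    top_left: Tuple[float, float]      # (abscissa, ordinate) of upper-left corner
    bottom_right: Tuple[float, float]  # (abscissa, ordinate) of lower-right corner

    def __post_init__(self) -> None:
        # Assumed image convention: x grows rightward, y grows downward.
        if (self.bottom_right[0] < self.top_left[0]
                or self.bottom_right[1] < self.top_left[1]):
            raise ValueError("lower-right corner must not precede upper-left corner")

    @property
    def width(self) -> float:
        return self.bottom_right[0] - self.top_left[0]

    @property
    def height(self) -> float:
        return self.bottom_right[1] - self.top_left[1]
```

A caller would build the index once from the dataset vocabulary and then wrap each detector output, for example `Detection(index["horse"], (10.0, 20.0), (110.0, 80.0))`.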


Abstract

The invention provides a visual relation detection method and device based on a high-order semantic structure of a scene graph. The method comprises the following steps: predicting the categories and positions of all objects in a picture, outputting a visual feature vector for each object, and pairing all detected objects two by two; based on the pairing result, extracting a joint visual feature vector and encoding the positions to obtain a position code; inputting the categories of all objects into a hierarchical semantic clustering algorithm to obtain a high-level semantic feature vector for each object; semantically encoding the output of the hierarchical semantic clustering algorithm; generating relation classifier weights; and combining the visual feature vector, the joint visual feature vector, and the position code into a unified feature vector, computing the dot product of the unified feature vector with the relation classifier weights, and finally obtaining the conditional probability of each relation between every two objects, which forms the scene graph.
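The final scoring step of the abstract can be sketched as follows. This is a hedged illustration and not the patented implementation: it assumes that "combining" means concatenation, that pairs are ordered (subject, object), and that a softmax turns the dot-product scores into conditional probabilities. The function name `relation_probabilities` and all argument names are hypothetical. The feature vectors, position codes, and classifier weights are taken as given inputs, since the abstract does not specify how they are computed.

```python
import math
from itertools import permutations
from typing import Dict, List, Sequence, Tuple

Pair = Tuple[int, int]


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot product of two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(x * y for x, y in zip(a, b))


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax (an assumed normalization, not from the source)."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def relation_probabilities(
    object_features: Sequence[Sequence[float]],
    joint_features: Dict[Pair, Sequence[float]],
    position_codes: Dict[Pair, Sequence[float]],
    classifier_weights: Sequence[Sequence[float]],
) -> Dict[Pair, List[float]]:
    """Score every ordered object pair against every relation class.

    object_features[i]    : visual feature vector of object i
    joint_features[(i,j)] : joint visual feature vector of the pair
    position_codes[(i,j)] : position code of the pair
    classifier_weights[r] : weight vector for relation class r
    """
    result: Dict[Pair, List[float]] = {}
    for i, j in permutations(range(len(object_features)), 2):
        # Assumption: the unified feature vector is a plain concatenation.
        unified = (
            list(object_features[i])
            + list(object_features[j])
            + list(joint_features[(i, j)])
            + list(position_codes[(i, j)])
        )
        scores = [dot(unified, w) for w in classifier_weights]
        result[(i, j)] = softmax(scores)
    return result
```

With n detected objects this produces n(n-1) probability vectors, one per ordered pair, each summing to 1 over the relation classes.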

Description

Technical field

[0001] The invention relates to the field of image processing, in particular to a visual relationship detection method and device based on a high-order semantic structure of a scene graph.

Background art

[0002] The main goal of the visual relationship detection task is to recognize and localize the visual triplet relationships (subject, relation, object) present in an image. Recognition means identifying the category of the target object, and localization means returning the bounding box of the target object. Understanding a visual scene often takes more than recognizing individual objects: even a perfect object detector would struggle to perceive the subtle difference between a person feeding a horse and a person standing next to it. Learning the rich semantic relationships among these objects is the purpose of visual relationship detection. The key to a deeper understanding of the visual scene is to build a structured...
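The (subject, relation, object) triplet and the horse example from the background can be made concrete with a small sketch. The `Triplet` type and the `objects_in` helper are hypothetical names chosen for illustration. Bounding boxes are omitted to keep the example short.

```python
from typing import NamedTuple, Set


class Triplet(NamedTuple):
    """One visual relationship: (subject, relation, object)."""

    subject: str
    relation: str
    object: str


def objects_in(graph: Set[Triplet]) -> Set[str]:
    """Return the set of object categories mentioned by a scene graph."""
    return {t.subject for t in graph} | {t.object for t in graph}


# Two scenes that an object detector alone cannot tell apart:
feeding = {Triplet("person", "feeding", "horse")}
standing = {Triplet("person", "standing next to", "horse")}
```

Both graphs contain the same objects, so `objects_in(feeding) == objects_in(standing)` holds, yet the graphs differ in the relation. That difference is the information visual relationship detection is meant to recover.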


Application Information

Patent Type & Authority: Applications (China)
IPC (8): G06K9/62, G06K9/00, G06N3/04, G06N3/08
CPC: G06N3/08, G06V20/00, G06N3/045, G06F18/23, G06F18/24, G06F18/214
Inventors: 袁春, 魏萌
Owner: SHENZHEN GRADUATE SCHOOL TSINGHUA UNIV