3D Scene Object Detection Modeling and Detection Method Based on Natural Language Description

This natural-language-driven 3D scene technique, applied in the fields of artificial intelligence and computer vision, addresses insufficient 3D target positioning accuracy. Its stated effects are overcoming multi-modal feature problems, improving accuracy, and breaking through the bottleneck of cross-modal feature domain differences.

Active Publication Date: 2021-08-31

XIDIAN UNIV

8 Cites, 0 Cited by

Problems solved by technology

[0004] The purpose of the present invention is to provide a 3D scene target detection modeling and detection method based on natural language description, in order to solve the problem of insufficient 3D target positioning accuracy in the prior art.

Examples

Embodiment 1

[0069] A graph network construction method is disclosed in this embodiment. On the basis of the above embodiments, the following technical features are also disclosed. The method includes the following sub-steps:

[0070] Step a: The input is the natural language description Q of the 3D scene. An offline language parser extracts the noun phrases and relational phrases from Q, and a bidirectional GRU encodes each set separately to obtain the noun phrase feature representations and the relational phrase feature representations, where i, j, N are positive integers and N is the total number of noun phrases;

[0071] Step b: Build a language scene graph with the noun phrases P as nodes and the relational phrases R as edges, attaching each noun phrase feature as a node feature and each relational phrase feature as an edge feature;

[0072] Step c: Update each noun phrase node p by aggregating all adjacent nodes and edge features that have edges with the specified noun phrase node through t...
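Steps b and c can be illustrated with a minimal sketch in plain Python. It is not the patented implementation: the toy feature vectors stand in for the GRU encodings, and the helper names (`build_language_graph`, `attention_update`), the dot-product attention scores, the "neighbor feature plus edge feature" message, and the residual update are all assumptions chosen for illustration.

```python
import math

def build_language_graph(noun_feats, relations):
    """Build an adjacency map for the language scene graph.
    noun_feats: one feature vector per noun phrase (the nodes).
    relations: (i, j, edge_feat) triples linking noun phrases i and j."""
    neighbors = {i: [] for i in range(len(noun_feats))}
    for i, j, edge_feat in relations:
        # Assumption: a relation connects both endpoints symmetrically.
        neighbors[i].append((j, edge_feat))
        neighbors[j].append((i, edge_feat))
    return neighbors

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_update(noun_feats, neighbors):
    """One round of attention aggregation over adjacent nodes and edges.
    Each node adds an attention-weighted sum of its messages to itself."""
    updated = []
    for i, feat in enumerate(noun_feats):
        if not neighbors[i]:
            updated.append(list(feat))  # isolated node stays unchanged
            continue
        # Hypothetical message: neighbor feature plus connecting edge feature.
        messages = [[n + e for n, e in zip(noun_feats[j], edge_feat)]
                    for j, edge_feat in neighbors[i]]
        weights = softmax([dot(feat, m) for m in messages])
        agg = [sum(w * m[d] for w, m in zip(weights, messages))
               for d in range(len(feat))]
        updated.append([f + a for f, a in zip(feat, agg)])
    return updated

# Toy example: three noun phrases and two relational phrases.
noun_feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
relations = [(0, 1, [0.5, 0.5]), (1, 2, [0.0, 1.0])]
neighbors = build_language_graph(noun_feats, relations)
updated = attention_update(noun_feats, neighbors)
```

In this toy graph the middle noun phrase has two neighbors, so its update is a weighted mix of two messages, while the other two nodes each absorb a single message.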

Embodiment 2

[0098] As shown in Figure 3, construct the visual relationship graph of the 3D target candidate boxes. Update each node o_{i,k} by aggregating the features of all neighboring nodes and edges through the attention mechanism, obtaining node features with global context awareness.

[0099] As shown in Figure 3, for each noun node in the language prior graph of Figure 2, select the 25 candidate boxes with the highest scores as the nodes o_{i,k} of the visual relationship graph of the 3D target candidate boxes, where i=1,2,3 and k=1,...,25. Following the edges that exist in the language prior graph, construct edges u_{i,j,k,l}, where i=1,2,3, j=1,2,3, k=1,...,25, l=1,...,25. Construct the visual relationship graph of the 3D target candidate boxes, and update each node by aggregating all adjacent node and edge features through the attention mechanism. Based on each updated pair of nodes with edges, the node features and the...
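The node and edge construction described in this embodiment can be sketched as follows. This is an illustrative reading only: the function name `build_visual_graph`, the score lists, and the 0-based indices are assumptions, and the candidate scores here are synthetic rather than the output of any detection network.

```python
def build_visual_graph(scores_per_noun, language_edges, top_k=25):
    """Select top-scoring candidate boxes per noun node and link them.
    scores_per_noun[i]: candidate-box scores for noun node i.
    language_edges: (i, j) pairs that exist in the language prior graph.
    Returns the kept candidate indices, nodes (i, k), and edges (i, j, k, l)."""
    kept = {}
    for i, scores in enumerate(scores_per_noun):
        ranked = sorted(range(len(scores)), key=lambda c: scores[c], reverse=True)
        kept[i] = ranked[:top_k]  # indices of the highest-scoring candidates
    nodes = [(i, k) for i in kept for k in range(len(kept[i]))]
    # An edge u_{i,j,k,l} exists only where the language graph links i and j.
    edges = [(i, j, k, l)
             for (i, j) in sorted(language_edges)
             for k in range(len(kept[i]))
             for l in range(len(kept[j]))]
    return kept, nodes, edges

# Synthetic scores: 3 noun nodes, 30 candidates each, all scores distinct.
scores_per_noun = [[(c * 7) % 30 for c in range(30)] for _ in range(3)]
language_edges = {(0, 1), (1, 2)}
kept, nodes, edges = build_visual_graph(scores_per_noun, language_edges)
```

With three noun nodes and 25 candidates each, the sketch yields 75 nodes, and each language edge expands into 25 × 25 candidate pairs, which shows why restricting edges to those present in the language prior graph keeps the visual graph manageable.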

Abstract

The invention discloses a modeling and detection method for three-dimensional scene target detection based on natural language description. The method includes: ① designing a language prior graph network, which represents the generated noun phrases and relational phrases as a graph; ② constructing an initialization prediction network for 3D object bounding candidate boxes in a point cloud scene; ③ pruning redundant 3D object initialization candidate boxes and updating the remainder, guided by the noun phrase features updated in the language prior graph; ④ constructing a visual relationship graph network of the 3D target candidate boxes, and using node and edge matching similarity scores to locate the final 3D object. By constructing language prior graphs and visual relationship graphs, the invention efficiently captures global context dependencies. It also develops a cross-modal graph matching strategy that avoids increasing the amount of computation and effectively improves target positioning accuracy in large-scale 3D point cloud scenes.
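The abstract does not spell out how the matching scores are computed, so the following is only a hedged sketch of one way node and edge similarities could be combined to pick a final candidate. The use of cosine similarity, the sum of node and best-edge scores, and the name `locate_target` are all assumptions, not details taken from the patent.

```python
import math

def cosine(a, b):
    """Cosine similarity, returning 0.0 if either vector is all zeros."""
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return sum(x * y for x, y in zip(a, b)) / (na * nb) if na and nb else 0.0

def locate_target(lang_node, lang_edge, cand_nodes, cand_edges):
    """Score every candidate box for the target noun phrase.
    lang_node: language feature of the target noun phrase.
    lang_edge: language feature of one relation attached to it.
    cand_nodes[k]: visual feature of candidate box k.
    cand_edges[k]: visual edge features from candidate k to related candidates.
    Returns the index of the best candidate and the list of all scores."""
    scores = []
    for node_feat, edge_feats in zip(cand_nodes, cand_edges):
        node_score = cosine(lang_node, node_feat)
        # Assumption: keep the best-matching visual edge for this candidate.
        edge_score = max((cosine(lang_edge, e) for e in edge_feats), default=0.0)
        scores.append(node_score + edge_score)
    best = max(range(len(scores)), key=lambda k: scores[k])
    return best, scores

# Toy features: candidate 1 matches both the noun phrase and the relation.
lang_node = [1.0, 0.0]
lang_edge = [0.0, 1.0]
cand_nodes = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
cand_edges = [[[1.0, 0.0]], [[0.0, 1.0]], [[0.0, 1.0]]]
best, scores = locate_target(lang_node, lang_edge, cand_nodes, cand_edges)
```

Candidates 0 and 1 look identical on node features alone; the edge term is what separates them, which is the intuition behind matching relations as well as objects.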

Description

Technical field

[0001] The invention belongs to the field of artificial intelligence and computer vision, and in particular relates to a three-dimensional scene target detection modeling and detection method based on natural language description.

Background

[0002] In recent years, with the widespread application of lidar and depth cameras, mobile robots can better obtain 3D information about their work scenes, and 3D point cloud scene understanding based on deep learning has attracted a lot of attention. When humans issue instructions to mobile robots in natural language and the robots locate target objects in the three-dimensional scene from that description, the intelligence level of mobile robots improves greatly. 3D point cloud object localization based on natural language description still faces several problems, such as how to abstract the relational features of a free-form language description, and how to integrate the natur...

Application Information

Patent Type & Authority: Patents (China)

IPC(8): G06T7/73, G06K9/62, G06F40/289

CPC: G06T7/73, G06F40/289, G06T2207/10028, G06T2207/20081, G06V2201/07, G06F18/22, G06F18/253

Inventor: 冯明涛, 张亮, 朱光明, 宋娟, 沈沛意

Owner: XIDIAN UNIV
