3D Scene Object Detection Modeling and Detection Method Based on Natural Language Description

A technique combining natural language and 3D scenes, applied in the fields of artificial intelligence and computer vision, that addresses insufficient 3D target positioning accuracy. Its stated effects are overcoming multi-modal feature problems, improving accuracy, and breaking through the bottleneck of cross-modal feature domain differences.

Active Publication Date: 2021-08-31
XIDIAN UNIV

AI Technical Summary

Problems solved by technology

[0004] The purpose of the present invention is to provide a 3D scene target detection modeling and detection method based on natural language description, in order to solve the problem of insufficient 3D target positioning accuracy in the prior art.

Examples

Embodiment 1

[0069] A graph network construction method is disclosed in this embodiment. On the basis of the above embodiments, the following technical features are also disclosed. The method includes the following sub-steps:

[0070] Step a: The input is the natural language description Q of the 3D scene. An offline language parser extracts the noun phrases and the relational phrases, and a bidirectional GRU encodes each set separately to obtain the noun phrase feature representations and the relational phrase feature representations. Here i, j, N are positive integers, and N is the total number of noun phrases;

[0071] Step b: Establish a language scene graph with the noun phrases P as the nodes and the relational phrases R as the edges. Each noun phrase feature is associated with its node as the node feature, and each relational phrase feature is associated with its edge as the edge feature;

[0072] Step c: Update each noun phrase node p by aggregating the features of all adjacent nodes and edges that share an edge with that node through t...
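
Steps a and b can be sketched as a small data structure. The following is a minimal illustration under stated assumptions, not the patent's implementation: the class `LanguageSceneGraph`, the function `build_language_graph`, the example sentence, and the 0-based indexing are all hypothetical, the toy 2-d vectors stand in for the bidirectional GRU encodings, and storing each relation under an ordered pair (i, j) is an assumption about edge direction. The update rule of step c is truncated in the text, so it is not reproduced here.

```python
from dataclasses import dataclass, field


@dataclass
class LanguageSceneGraph:
    """Noun phrases as nodes, relational phrases as edges (hypothetical class)."""
    node_features: dict = field(default_factory=dict)  # node index -> feature vector
    edge_features: dict = field(default_factory=dict)  # (i, j) -> feature vector

    def neighbors(self, i):
        """Indices of nodes that share an edge with node i, in either direction."""
        out = set()
        for (a, b) in self.edge_features:
            if a == i:
                out.add(b)
            elif b == i:
                out.add(a)
        return sorted(out)


def build_language_graph(noun_feats, relation_feats):
    """Build the graph from N noun vectors and a {(i, j): vector} relation map."""
    n = len(noun_feats)
    graph = LanguageSceneGraph()
    for i, feat in enumerate(noun_feats):
        graph.node_features[i] = list(feat)
    for (i, j), feat in relation_feats.items():
        # An edge must link two distinct, existing noun phrase nodes.
        if not (0 <= i < n and 0 <= j < n) or i == j:
            raise ValueError(f"edge ({i}, {j}) does not link two distinct noun phrases")
        graph.edge_features[(i, j)] = list(feat)
    return graph


# Hypothetical parse of "the chair next to the table under the lamp":
# nouns 0=chair, 1=table, 2=lamp; relations (0, 1)="next to", (1, 2)="under".
nouns = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
relations = {(0, 1): [0.2, 0.8], (1, 2): [0.9, 0.1]}
graph = build_language_graph(nouns, relations)
print(graph.neighbors(1))  # prints [0, 2]
```

Keeping node and edge features in separate maps mirrors the split between noun phrase features and relational phrase features described in step b.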

Embodiment 2

[0098] As shown in Figure 3, construct the visual relationship graph of the 3D target candidate boxes. Update each node o_{i,k} by aggregating the features of all neighboring nodes and edges through the attention mechanism, obtaining node features with global context awareness.

[0099] As shown in Figure 3, for each noun node in the language prior graph of Figure 2, the top 25 candidate boxes with the highest scores are selected as the nodes o_{i,k} of the visual relationship graph of the 3D target candidate boxes, where i=1,2,3 and k=1,...,25. Following the rules that determine which edges exist in the language prior graph, construct the edges u_{i,j,k,l}, where i=1,2,3, j=1,2,3, k=1,...,25, l=1,...,25. Construct the visual relationship graph of the 3D target candidate boxes, and update each node by aggregating the features of all adjacent nodes and edges through the attention mechanism. Based on each updated pair of nodes with edges, the node features and the...
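
The node and edge construction described above can be sketched as follows. This is an illustration under stated assumptions rather than the patent's implementation: the function names `build_visual_graph`, `softmax`, and `attention_update` are hypothetical, indices are 0-based instead of the 1-based indices in the text, and the attention form is not specified in the text, so plain dot-product attention over neighbor node features with a residual sum is assumed. Edge features are left out of the update for brevity.

```python
import math

TOP_K = 25  # candidate boxes kept per noun node, as stated in the text


def build_visual_graph(language_edges, num_nouns, top_k=TOP_K):
    """Return node ids (i, k) and edge ids (i, j, k, l) of the visual graph."""
    nodes = [(i, k) for i in range(num_nouns) for k in range(top_k)]
    # An edge links candidates of nouns i and j only if the language graph has (i, j).
    edges = [(i, j, k, l)
             for (i, j) in language_edges
             for k in range(top_k)
             for l in range(top_k)]
    return nodes, edges


def softmax(scores):
    """Numerically stable softmax over a non-empty list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_update(node_feat, neighbor_feats):
    """Add the attention-weighted sum of neighbor features to node_feat.

    Dot-product scoring is an assumption; the text only names "attention".
    """
    if not neighbor_feats:
        return list(node_feat)
    scores = [sum(a * b for a, b in zip(node_feat, nf)) for nf in neighbor_feats]
    weights = softmax(scores)
    context = [sum(w * nf[d] for w, nf in zip(weights, neighbor_feats))
               for d in range(len(node_feat))]
    return [x + c for x, c in zip(node_feat, context)]


# Three noun nodes with language edges (0, 1) and (1, 2), a hypothetical example.
nodes, edges = build_visual_graph([(0, 1), (1, 2)], num_nouns=3)
print(len(nodes), len(edges))  # prints 75 1250
```

With 3 noun nodes and 25 candidates each, the graph has 75 nodes, and every language edge expands into 25 × 25 = 625 candidate-pair edges.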

Abstract

The invention discloses a modeling and detection method for 3D scene target detection based on natural language description. The method includes: ① designing a language prior graph network that represents the generated noun phrases and relational phrases as a graph; ② constructing an initialization prediction network for bounding candidate boxes of 3D objects in a point cloud scene; ③ guided by the noun phrase features updated by the language prior graph, pruning redundant initial 3D object candidate boxes and updating the remaining ones; ④ constructing a visual relationship graph network of the 3D target candidate boxes; and matching similarity scores with the edges to locate the final 3D object. By constructing the language prior graph and the visual relationship graph, the invention efficiently captures global context dependencies. It also develops a cross-modal graph matching strategy that avoids increasing the amount of computation and effectively improves target positioning accuracy in large-scale 3D point cloud scenes.

Description

Technical field

[0001] The invention belongs to the field of artificial intelligence and computer vision, and in particular relates to a 3D scene target detection modeling and detection method based on natural language description.

Background

[0002] In recent years, with the widespread application of lidar and depth cameras, mobile robots can better obtain 3D information about their work scenes, and 3D point cloud scene understanding based on deep learning has attracted a lot of attention. When humans issue instructions to mobile robots in natural language and the robots locate target objects in the 3D scene from that description, the intelligence level of mobile robots improves greatly. Locating objects in a 3D point cloud from a natural language description raises several problems, such as how to abstract the relationship characteristics of a free-form language description, and how to integrate the natur...

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06T7/73, G06K9/62, G06F40/289
CPC: G06T7/73, G06F40/289, G06T2207/10028, G06T2207/20081, G06V2201/07, G06F18/22, G06F18/253
Inventor: 冯明涛, 张亮, 朱光明, 宋娟, 沈沛意
Owner: XIDIAN UNIV