3D Scene Object Detection Modeling and Detection Method Based on Natural Language Description
A technique combining natural language description with 3D scenes, applied in the fields of artificial intelligence and computer vision. It addresses insufficient 3D target localization accuracy: by handling multi-modal features jointly, it improves accuracy and reduces the domain gap between cross-modal features.
Examples
Embodiment 1
[0069] This embodiment discloses a graph network construction method. Building on the above embodiments, it further discloses the following technical features. The method comprises the following sub-steps:
[0070] Step a: The input is the natural language description Q of the 3D scene. An offline language parser extracts the noun phrases and the relational phrases, and a bidirectional GRU encodes each of them separately to obtain the noun phrase feature representations and the relational phrase feature representations. Here i, j, and N are positive integers, and N is the total number of noun phrases;
[0071] Step b: Build a language scene graph with the noun phrases P as nodes and the relational phrases R as edges. The noun phrase features are attached as node features, and the relational phrase features are attached as edge features;
[0072] Step c: Update each noun phrase node p by aggregating the features of all adjacent nodes and edges, i.e. those that share an edge with the given noun phrase node, through t...
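Steps b and c can be illustrated with a minimal sketch. The excerpt truncates the actual update rule, so everything below is an assumption made for illustration only: edges are treated as undirected, a message is the sum of the neighbor feature and the edge feature, the attention weights come from a scaled dot product followed by a softmax, and the aggregate is added back as a residual. The function names `build_language_graph` and `update_nodes` are hypothetical, and the phrase features are plain lists standing in for the bidirectional GRU outputs of step a.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def build_language_graph(noun_feats, rel_feats):
    """Step b: nodes are noun phrases, edges are relational phrases.

    noun_feats: list of node feature vectors.
    rel_feats: dict mapping (i, j) to the edge feature vector.
    Returns, per node, a list of (neighbor index, edge feature).
    Assumption: edges are treated as undirected.
    """
    neighbors = {i: [] for i in range(len(noun_feats))}
    for (i, j), feat in rel_feats.items():
        neighbors[i].append((j, feat))
        neighbors[j].append((i, feat))
    return neighbors

def update_nodes(noun_feats, neighbors):
    """Step c (assumed form): attention-weighted aggregation of neighbors."""
    updated = []
    for i, p in enumerate(noun_feats):
        if not neighbors[i]:
            updated.append(list(p))  # isolated node: left unchanged
            continue
        # Assumed message: neighbor node feature plus connecting edge feature.
        messages = [[a + b for a, b in zip(noun_feats[j], e)]
                    for j, e in neighbors[i]]
        weights = softmax([dot(p, m) / math.sqrt(len(p)) for m in messages])
        agg = [sum(w * m[d] for w, m in zip(weights, messages))
               for d in range(len(p))]
        updated.append([x + y for x, y in zip(p, agg)])  # residual update
    return updated

# Toy example: three noun phrases, one relational phrase linking 0 and 1.
noun_feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
rel_feats = {(0, 1): [0.5, 0.5]}
neighbors = build_language_graph(noun_feats, rel_feats)
updated = update_nodes(noun_feats, neighbors)
```

In this toy graph, nodes 0 and 1 each absorb information from the other through the shared edge, while node 2 has no edge and keeps its original feature.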
Embodiment 2
[0098] As shown in Figure 3, construct the visual relationship graph of the 3D target candidate boxes. Update each node o_{i,k} by aggregating the features of all neighboring nodes and edges through the attention mechanism, obtaining node features with global context awareness.
[0099] As shown in Figure 3, for each noun node in the language prior graph (Figure 2), the 25 highest-scoring candidate boxes are selected as the nodes o_{i,k} of the visual relationship graph of the 3D target candidate boxes, where i=1,2,3 and k=1,...,25. Edges u_{i,j,k,l} are constructed wherever the corresponding edge exists in the language prior graph, where i=1,2,3, j=1,2,3, k=1,...,25, l=1,...,25. With the visual relationship graph constructed in this way, each node is updated by aggregating the features of all adjacent nodes and edges through the attention mechanism. Based on each updated pair of nodes with edges, the node features and the...
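The node and edge construction in [0099] can be sketched as follows. This is an illustrative outline only: the function names `top_k_candidates` and `build_visual_graph`, the toy scores, and the choice of which noun pairs are linked in the language prior graph are all assumptions, and the edge features and attention update are omitted.

```python
from itertools import product

def top_k_candidates(scores, k=25):
    """Return the indices of the k highest-scoring candidate boxes."""
    order = sorted(range(len(scores)), key=lambda idx: scores[idx], reverse=True)
    return order[:k]

def build_visual_graph(scores_per_noun, language_edges, k=25):
    """Build nodes o_{i,k} and edge indices u_{i,j,k,l}.

    scores_per_noun: dict mapping noun node i to its candidate box scores.
    language_edges: pairs (i, j) that have an edge in the language prior graph.
    Candidate positions k and l are 1-based to match the text.
    """
    nodes = {i: top_k_candidates(s, k) for i, s in scores_per_noun.items()}
    edges = []
    for i, j in language_edges:
        # An edge exists between every candidate of i and every candidate of j,
        # but only when (i, j) is an edge in the language prior graph.
        for k_pos, l_pos in product(range(1, len(nodes[i]) + 1),
                                    range(1, len(nodes[j]) + 1)):
            edges.append((i, j, k_pos, l_pos))
    return nodes, edges

# Toy data: 3 noun nodes with 30 scored candidate boxes each (made-up scores),
# and an assumed language prior graph with edges 1-2 and 2-3.
scores_per_noun = {i: [((7 * c + i) % 30) / 30.0 for c in range(30)]
                   for i in (1, 2, 3)}
language_edges = [(1, 2), (2, 3)]
nodes, edges = build_visual_graph(scores_per_noun, language_edges)
```

With 25 candidates per noun node, each edge of the language prior graph expands into 25 × 25 = 625 candidate-level edges, so the two assumed language edges yield 1250 edges in total.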