An unmanned aerial vehicle (UAV) vision-language navigation method based on a target hierarchy tree
The method combines a target hierarchy tree with a large language model and a multimodal encoder to address the problem of aligning visual and textual information for UAVs in complex environments, enabling efficient and accurate navigation decisions and target recognition.
Patent Information
- Authority / Receiving Office: CN · China
- Patent Type: Patents (China)
- Current Assignee / Owner: BEIHANG UNIV
- Filing Date: 2024-09-24
- Publication Date: 2026-06-16
AI Technical Summary
In complex environments, UAVs have difficulty accurately identifying and understanding the visual targets named in navigation instructions, especially in multi-view and multi-granularity scenarios. Existing methods have difficulty achieving fine-grained alignment between visual and textual information.
The method performs UAV vision-language navigation based on a target hierarchy tree. A target parsing module obtains text features from the navigation instruction; a large language model generates a first-order logic program; a target localization module constructs the hierarchy tree and extracts visual features; finally, a multimodal encoder integrates the navigation information to produce the navigation decision.
The method improves how accurately a UAV identifies and understands navigation targets in complex environments, improves the quality of navigation decisions and the scalability of the system, and reduces system upgrade and maintenance costs.
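The step from a first-order logic program to a target hierarchy tree can be illustrated with a minimal Python sketch. Everything concrete here is an assumption for illustration only: the summary does not specify the syntax of the logic program, so the `part_of(child, parent)` fact format, the `TargetNode` class, and the example targets are hypothetical. The large language model call is replaced by a hard-coded program string, and visual feature extraction and the multimodal encoder are not modeled.

```python
import re
from dataclasses import dataclass, field

# Hypothetical fact syntax: predicate(arg1, arg2), one fact per line.
# The patent summary does not define the actual format.
FACT_RE = re.compile(r"^\s*(\w+)\(\s*(\w+)\s*,\s*(\w+)\s*\)\s*$")


@dataclass
class TargetNode:
    """One target in the hierarchy, from coarse (root) to fine (leaves)."""
    name: str
    children: list = field(default_factory=list)


def parse_facts(program):
    """Parse a logic program string into (predicate, arg1, arg2) tuples."""
    facts = []
    for line in program.strip().splitlines():
        match = FACT_RE.match(line)
        if match is None:
            raise ValueError(f"unrecognized fact: {line!r}")
        facts.append(match.groups())
    return facts


def build_hierarchy(facts):
    """Build a tree from part_of(child, parent) facts and return its root."""
    nodes = {}
    has_parent = set()
    for predicate, child, parent in facts:
        if predicate != "part_of":  # other predicates are ignored in this sketch
            continue
        parent_node = nodes.setdefault(parent, TargetNode(parent))
        child_node = nodes.setdefault(child, TargetNode(child))
        parent_node.children.append(child_node)
        has_parent.add(child)
    roots = [node for name, node in nodes.items() if name not in has_parent]
    if len(roots) != 1:
        raise ValueError(f"expected exactly one root, found {len(roots)}")
    return roots[0]


def levels(root):
    """Return target names level by level, coarsest granularity first."""
    result = []
    frontier = [root]
    while frontier:
        result.append([node.name for node in frontier])
        frontier = [child for node in frontier for child in node.children]
    return result


# Stand-in for the program a large language model might emit for an
# instruction such as "fly to the antenna on the building's rooftop".
PROGRAM = """
part_of(building, campus)
part_of(rooftop, building)
part_of(entrance, building)
part_of(antenna, rooftop)
"""

tree = build_hierarchy(parse_facts(PROGRAM))
for depth, names in enumerate(levels(tree)):
    print(depth, names)
```

Each level of the resulting tree corresponds to one granularity of target. In a full system, a localization step could match each level against visual features at the corresponding scale, which is the kind of fine-grained alignment the summary describes.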
Figure CN119197529B_ABST