Distributed multi-view visual radar unmanned aerial vehicle positioning method and system integrating sensing and control
By employing a distributed multi-view vision-radar fusion method, and leveraging the complementary characteristics of radar and cameras, high-precision and low-cost UAV positioning is achieved, overcoming the limitations of single-modal positioning and meeting wide-area requirements.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- PEKING UNIV
- Filing Date
- 2024-12-30
- Publication Date
- 2026-06-30
AI Technical Summary
Existing UAV positioning methods suffer from inaccurate positioning and high costs due to the limitations of single-mode positioning, and are difficult to avoid perception failures, thus failing to meet wide-area requirements.
A distributed multi-view vision-radar UAV positioning method integrating sensing and control is adopted. Through multi-view modal fusion, using strong directional pulse Doppler radar and visible light camera, combined with distributed sensing terminals and computing platforms, high-quality sensing and high-precision positioning are achieved.
It significantly improves the positioning range and accuracy of drones, reduces system costs, and maintains high flexibility and low computational complexity in complex environments.
Figure CN122307536A_ABST
Abstract
Description
Technical Field
[0001] This invention belongs to the field of intelligent Internet of Things technology and relates to UAV positioning technology, in particular to a distributed multi-view vision-radar UAV positioning method and system that integrates sensing and control.
Background Technology
[0002] Drones have become an important component of low-altitude airspace and a foundation for the development of the low-altitude economy, with a total market value exceeding $4 billion and projected to grow at an annual rate of 2.26%. However, due to their ease of acquisition and ability to carry various payloads, the misuse of drones has also caused serious safety problems. Nearly a thousand drone incidents are reported globally each month, with unauthorized drones potentially causing serious harm to individuals and facilities. Furthermore, drone interference in cyberspace presents new security threats. The main reason for the increasingly serious threats posed by drones is their wide flight range, high speed, remote control capability, and difficulty in detection, making effective drone monitoring difficult and thus hindering efforts to prevent drones from entering no-fly zones and causing serious damage.
[0003] Currently, UAV localization methods focus on single-modal localization, such as visual localization and radar localization. The modal limitations of these methods, such as the limited classification capabilities of radar and the narrow field of view of vision, can lead to inaccurate localization and high costs. Some cross-modal methods overcome these limitations by utilizing complementary cross-modal properties. However, these methods typically require meticulous mining of insignificant UAV features, resulting in high computational complexity and potentially unsatisfactory results. This is mainly due to the "weakest link" effect of single-view deployment, where localization results are severely affected by the worst-performing perception capability; individual perception is constrained by the worst-performing modality, and the global result is influenced by the worst-performing perception outcome.
[0004] Existing UAV positioning technology therefore does not use multi-view modalities for positioning, which makes it difficult to reduce the "weakest link" effect of current methods. The result is perception failure, poor positioning algorithm performance, and difficulty in meeting wide-area requirements.
Summary of the Invention
[0005] To address the shortcomings of the existing technologies, this invention provides a distributed multi-view vision-radar UAV positioning method and system that integrates sensing and control. By utilizing multi-view modalities for UAV positioning, it can avoid sensing failures and meet wide-area requirements, significantly improving the positioning range and accuracy of UAVs.
[0006] To achieve the above objectives, the present invention adopts the following technical solution:
[0007] A sensor-control integrated distributed multi-view vision-radar UAV localization method and system are disclosed. The system comprises a system architecture, a localization workflow, a sensor control strategy that provides high-quality perceptual input for the localization process, and the localization method itself. The method is a wide-area UAV localization algorithm based on an abstraction-binding two-stage multi-view collaborative fusion model. It keeps the collaborative fusion of distributed cross-modal perceptual data low in complexity, flexible, and accurate, and thereby supports high-precision, wide-area localization.
[0008] The architecture of the integrated sensing and control distributed multi-view vision-radar UAV positioning system includes: radar, vision camera, data transmitter, data processor, distributed sensing terminal, and computing platform. Unlike traditional UAV detection systems, this system supplements the low-saliency modal perception results from a single viewpoint through multi-view fusion, improving the saliency and range of UAV perception. Specifically, it uses a highly directional pulse Doppler radar with global perception capabilities and a fine-grained but limited field-of-view visible light camera as cross-modal sensors. Through distributed multi-view deployment, it achieves effective multi-view cross-modal UAV positioning and monitoring. The radar and vision camera are mounted on the distributed sensing terminal for real-time detection and acquisition of radar or visual image data of the current area. The data transmitter, mounted on the distributed sensing terminal and computing platform, enables the mutual transmission of information and data such as radar points, visual perception images, and operational commands between the terminal and the platform. The data processor, mounted on the computing platform, processes and analyzes the results obtained from the radar and vision camera, determines the control parameters of the vision camera, and outputs the UAV's position. The distributed sensing terminal system comprises multiple distributed sensing terminals, each equipped with a low-altitude Doppler phased array radar or a visual camera for real-time monitoring, and a data transmitter to acquire sensing results in real time and send them to the computing platform, while simultaneously receiving sensor control commands from the computing platform. In practical deployment, since the radar's sensing range is greater than that of the visual camera, the system adopts a "one-to-many" deployment approach. 
That is, if one sensing terminal in the distributed sensing terminal system carries a radar sensor, other sensing terminals within the radar's range carry lower-cost visual cameras. Therefore, on the one hand, the "one-to-many" deployment approach reduces system costs; on the other hand, it fully utilizes the radar's three-dimensional sensing capabilities and the visual camera's fine-grained sensing capabilities, achieving sensory complementarity and ensuring that multiple visual cameras within the radar's sensing range can participate in the fine-grained sensing of potential UAVs. The computing platform receives real-time distributed sensing detection data, calculates the data from each sensing terminal equipped with a visual camera, and sends back control commands to the distributed sensing terminals to achieve higher-quality visual information (i.e., clearer, larger-scale UAV imaging results), realizes distributed multi-view visual-radar sensing information fusion, and provides the UAV's position information from each visual camera's perspective.
[0009] The radar is a low-altitude Doppler phased array radar based on the Doppler principle. Within the airspace centered on the radar, with its effective range as the radius, it provides the three-dimensional position coordinates $p_t = [x, y, z]_t$ of potential UAVs (where x, y, z are the three-dimensional coordinates with the radar as the origin and t is the current timestamp), together with the flight speed $v_t$ and direction $D_t = [\varepsilon, \varphi]_t$ (where ε and φ are the horizontal and vertical directions, respectively). A potential UAV, i.e., a detected target, may be a UAV with a deviated position, another low-altitude target, or a false detection. Low-altitude Doppler phased array radar also has a low sampling rate (e.g., the sampling interval needs to be more than 6 seconds).
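As an illustration of the radar output described above, the following minimal sketch stores one detection and decomposes the flight speed along the reported direction into x, y, z components. The class and field names are hypothetical, and the angle convention (ε measured as azimuth from the +x axis, φ as elevation above the x-y plane, both in radians) is an assumption; the patent text does not state the convention.

```python
import math
from dataclasses import dataclass
from typing import Tuple


@dataclass
class RadarDetection:
    """One radar return. Field names are illustrative, not from the patent."""
    x: float        # position, radar at the origin
    y: float
    z: float
    speed: float    # flight speed v_t
    epsilon: float  # horizontal direction, radians (assumed azimuth from +x)
    phi: float      # vertical direction, radians (assumed elevation above x-y plane)
    t: float        # timestamp of the radar sample


def velocity_components(det: RadarDetection) -> Tuple[float, float, float]:
    """Decompose speed v_t along direction D_t = [epsilon, phi] into x, y, z."""
    horizontal = det.speed * math.cos(det.phi)
    return (
        horizontal * math.cos(det.epsilon),
        horizontal * math.sin(det.epsilon),
        det.speed * math.sin(det.phi),
    )
```

Under this convention the magnitude of the returned vector always equals the reported speed, which gives a quick sanity check on the decomposition.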
[0010] A visual camera consists of a gimbal and an imaging module. The imaging module can achieve high frame rate 2D imaging and acquire fine-grained images, but its sensing range is lower than that of radar. The gimbal allows for adjustment of the camera's pitch, azimuth, and zoom levels. In summary, radar has 3D and wide-area sensing capabilities, but its imaging granularity (providing object coordinates) is coarse and its imaging speed is slow. A visual camera has fine-grained sensing capabilities and a fast imaging speed, but its sensing range is limited, requiring adjustment by a gimbal, and its overall range is smaller than that of radar; furthermore, the imaging result is 2D.
[0011] Data transmitter: Used for encoding, sending, receiving, and decoding information and data. It mainly includes an integrated data transceiver device. Based on existing wired or wireless communication protocols, it can encode, send, receive, and decode information and data such as radar points, visual perception images, and operation commands.
[0012] Data processor: including a vision camera control module based on the motion characteristics of the UAV, and a distributed multi-view vision-radar UAV positioning method integrating sensing and control.
[0013] Distributed Sensing Terminals: This system comprises multiple distributed sensing terminals. Each terminal includes a low-altitude Doppler phased array radar or visual camera for real-time monitoring, and a data transmitter to acquire sensing results and send them to a computing platform, while simultaneously receiving sensor control commands from the platform. In practical deployment, since the radar's sensing range is greater than that of a visual camera, a "one-to-many" deployment approach is adopted. This means that if one sensing terminal in the distributed system carries a radar sensor, other sensing terminals within the radar's range carry lower-cost visual cameras. Therefore, this "one-to-many" deployment reduces system costs while fully utilizing the radar's 3D sensing capabilities and the visual camera's fine-grained sensing capabilities, achieving sensory complementarity and ensuring that multiple visual cameras within the radar's sensing range can participate in the fine-grained sensing of potential UAVs.
[0014] The computing platform is equipped with a data transmitter and a data processor. The data transmitter receives sensing data from the distributed sensing terminals and inputs it into the data processor, while also sending control commands from the data processor back to the distributed sensing terminals. The data processor includes sensor control strategies and a distributed multi-view vision-radar UAV localization method. The vision camera control module determines the monitoring area to be monitored by the vision camera based on the candidate UAV positions given by the radar, calculates the parameters of each sensing terminal carrying a vision camera, determines its pitch angle, azimuth angle, and zoom ratio, and transmits these parameters to the distributed sensing terminals via the data transmitter. This enables the distributed sensing terminals to control and adjust the vision cameras. The integrated sensing and control distributed multi-view vision-radar UAV localization method achieves the fusion of distributed multi-view vision-radar sensing information and provides the UAV position information from the perspectives of each vision camera.
[0015] The integrated sensing and control distributed multi-view vision-radar UAV positioning system effectively reduces the overall high latency issues caused by factors such as low radar sensor sampling speed, varying visual camera adjustment delays (each visual camera needs to rotate at different azimuth and pitch angles and perform different scaling ratios, resulting in different data delivery times), and varying transmission delays within the distributed system. The system's workflow includes the following steps:
[0016] 1) Radar sampling and transmission: Each time the radar sensing terminal samples, it transmits the sampling results to the computing platform.
[0017] 2) Selection of candidate points for drones: For radar sensing terminals, the radar detection result point with the closest average distance to all monitoring devices in the radar sensing results is selected as the next candidate point. The selection process can be expressed as follows:
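[0018] $$p_t^{*} = \arg\min_{p_t} \frac{1}{n} \sum_{i=1}^{n} \left\| p_t - p_{mon}^{i} \right\|$$
[0019] where $p_t$ is a radar candidate point with timestamp t, $p_{mon}^{i}$ is the coordinates of the i-th sensing terminal, and n is the total number of sensing terminals. The selected point $p_t^{*}$ is the radar candidate point with the smallest mean distance $\frac{1}{n}\sum_{i=1}^{n}\|p_t - p_{mon}^{i}\|$ to the sensing terminals.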
[0020] 3) Calculation and transmission of control parameters of visual camera control module based on UAV motion characteristics: The control parameters of the visual camera sensing terminal corresponding to the radar terminal are calculated based on the radar candidate point coordinates and UAV motion characteristics, and the parameters are transmitted back to each visual camera sensing terminal through the data transmitter.
[0021] 4) Sensing terminal control adjustment and sensing result feedback: Based on the control parameters, each sensing terminal adjusts its own visual camera parameters. After the adjustment is completed, the sensing results are transmitted to the computing platform in real time.
[0022] 5) Distributed Multi-View Vision-Radar UAV Localization Platform: Based on radar perception results and multi-view visual information, the distributed multi-view vision-radar UAV localization platform fuses and calculates this information before outputting the result. Specifically, during the fusion calculation process, the first transmitted perception results are fused, and subsequently received perception results are mapped to the already fused result, ensuring that the fusion process does not require waiting for all views to be in place. The fused result is then fused with each view to finally output the localization result for each view. Notably, a maximum time T is set during the process; for perception terminals that have not transmitted data after the time limit, their view is not included in the localization process.
[0023] 6) Loop: Eliminate potential UAV candidate points in the intersection of the perception areas of the visual perception terminal, and use the remaining results to select the next candidate point.
[0024] 7) If the radar information is updated, return to step 1) and start positioning again.
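Step 2) of the workflow above, selecting the radar point with the smallest average distance to all sensing terminals, can be sketched as follows. This is a minimal illustration, not the patented implementation; the function names are hypothetical and points are assumed to be (x, y, z) tuples in the radar-centered coordinate system.

```python
import math


def mean_distance(point, terminals):
    """Average Euclidean distance from one radar point to every sensing terminal."""
    return sum(math.dist(point, term) for term in terminals) / len(terminals)


def select_candidate(radar_points, terminals):
    """Return the radar point whose mean distance to all terminals is smallest."""
    if not radar_points or not terminals:
        raise ValueError("need at least one radar point and one terminal")
    return min(radar_points, key=lambda p: mean_distance(p, terminals))
```

For step 6), the same function could be called again on the radar points that remain after the already-observed candidates have been removed.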
[0025] This invention designs a visual camera control module based on the motion characteristics of unmanned aerial vehicles (UAVs). Based on these characteristics, the module predicts the UAV's flight range and calculates control parameters for each visual camera, including pitch, rotation, and zoom, enabling more targeted image sampling and providing high-quality perception information for subsequent fusion localization. Radar perception results can provide the location of potential UAVs. However, the real-time movement of the UAV can cause it to move out of the line of sight during the visual camera adjustment process (i.e., the time difference between selecting a candidate point and the visual camera being in position) and radar perception delay (candidate points are selected from radar perception results, but radar information is not updated in real time, and there is a time difference between sampling the current candidate point and the time the visual camera uses it for fusion localization). This makes it difficult to perceive UAV information.
[0026] In this scenario, the vision camera control module based on the UAV's motion characteristics uses the UAV's dynamic equations to obtain its flight range over future time periods and implements sensor control guided by the UAV's motion. Because the UAV's movement is limited by internal (torsional) and external (drag, gravity) forces, its speed and direction follow certain patterns, which makes its flight range predictable in the near future. Specifically, given the flight speed $v_t$ and direction $D_t = [\varepsilon, \varphi]_t$ (where ε and φ are the horizontal and vertical directions, respectively, and t is the current timestamp), the possible location of the UAV within a certain future timeframe can be predicted as follows:
[0027]
[0028] Here the predicted quantity is the flight velocity of the UAV in a given time slot at roll angle ω, pitch angle τ, and the corresponding yaw angle; R is the rotation matrix; the velocity vector is the flight speed $v_t$ decomposed along direction $D_t$ into x, y, and z components; $T_{max}$ is the maximum reference thrust of the UAV; g is the acceleration due to gravity; and the linear damping term is determined using a fixed reference value. The superscript T denotes the transpose operation.
[0029] For each time slot, the velocities corresponding to the UAV's pitch and roll angles are calculated in six directions: forward, backward, left, right, up, and down. Then, assuming that the velocity changes linearly, the flight distance in each of the six directions is calculated. The flight range of the UAV can be represented as a sphere that is tangent to the flight distances in the six directions.
[0030] Based on this, the line of sight of the visual camera can be represented as Where Ψ represents the horizontal rotational velocity, ρ represents the vertical rotational velocity, t represents the current moment, and l t Indicates the current line of sight, l St This indicates the direction of the next line of sight. The direction of the line connecting the visual camera and the drone's center of motion is indicated as follows:
[0031]
[0032] The quantities in this formula are the UAV's center of motion and the direction of the line connecting the visual camera to that center; arccos(·) is the arccosine function, arctan(·) is the arctangent function, and $p_{mon}$ is the coordinates of the visual camera in a three-dimensional coordinate system with the radar as the origin. ||·|| denotes the Euclidean distance. The subscripts x, y, and z denote the x, y, and z coordinates of the calculated result, respectively.
[0033] Setting the next line of sight equal to this direction gives the rotation time of the vision camera, and the camera's rotation angles can then be calculated using the following formula:
[0034]
[0035] where θ and φ are the rotation angle and the pitch angle, respectively.
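The aiming step described above can be sketched as follows: compute the direction from the camera position to the UAV's center of motion, then estimate how long the gimbal needs to reach it given its horizontal and vertical rotational velocities. The angle convention used here (azimuth from `atan2`, polar angle from `acos` of the normalized z component) is an assumption chosen to match the arccos and arctan functions mentioned in the text, and all function names are hypothetical.

```python
import math


def line_of_sight(p_mon, center):
    """Direction from camera position p_mon to the UAV motion center.

    Returns (azimuth, polar) in radians; the convention is an assumption.
    """
    dx, dy, dz = (c - m for c, m in zip(center, p_mon))
    dist = math.sqrt(dx * dx + dy * dy + dz * dz)
    if dist == 0.0:
        raise ValueError("camera and motion center coincide")
    azimuth = math.atan2(dy, dx)   # horizontal angle in the x-y plane
    polar = math.acos(dz / dist)   # angle measured from the +z axis
    return azimuth, polar


def rotation_time(current, target, psi, rho):
    """Time for the gimbal to turn from `current` to `target` (azimuth, polar).

    psi and rho are the horizontal and vertical rotational velocities (rad/s).
    Both axes are assumed to move simultaneously, so the slower axis dominates.
    """
    d_az = abs(target[0] - current[0]) % (2.0 * math.pi)
    d_az = min(d_az, 2.0 * math.pi - d_az)  # take the shorter way round
    d_polar = abs(target[1] - current[1])
    return max(d_az / psi, d_polar / rho)
```

The returned rotation time is what the prediction horizon has to cover: the UAV's flight range should be predicted at least that far ahead, so the target is still in view when the camera arrives.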
[0036] This allows the maximum scaling factor of the visual camera to be determined. Since the visual camera is already aimed at the center of the UAV's motion, the maximum field of view is the angle between the two lines of sight tangent to the UAV's predicted flight range, which ensures that the camera observes the whole predicted range at its maximum zoom level. Given the radius of the UAV's flight range, the maximum field of view can be calculated using the following formula:
[0037]
[0038] Based on the maximum field of view, the maximum scaling factor χ can be calculated from the imaging principle of the visual camera. During perception, the scaling factor is smoothly increased from 1× to the maximum scaling factor χ× so that clear, large-scale UAV images are obtained.
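The geometry described above can be sketched as follows. The angle between two lines tangent to a sphere of radius r, seen from distance d, is 2·arcsin(r/d). The conversion from field of view to zoom factor assumes a simple pinhole camera, where the zoom factor is the ratio of the tangents of the half-angles; that model, the linear zoom ramp, and all function and parameter names are assumptions for illustration only.

```python
import math


def max_field_of_view(radius, distance):
    """Angle (radians) between the two sight lines tangent to the flight-range sphere."""
    if distance <= radius:
        raise ValueError("camera lies inside the predicted flight range")
    return 2.0 * math.asin(radius / distance)


def max_zoom(fov_required, fov_at_1x, zoom_limit):
    """Largest zoom that still keeps `fov_required` in frame (pinhole assumption)."""
    chi = math.tan(fov_at_1x / 2.0) / math.tan(fov_required / 2.0)
    return max(1.0, min(chi, zoom_limit))


def zoom_schedule(chi, steps):
    """Evenly spaced zoom levels from 1x up to chi (assumed linear ramp)."""
    if steps < 2:
        raise ValueError("need at least two steps")
    return [1.0 + (chi - 1.0) * k / (steps - 1) for k in range(steps)]
```

For example, a flight range of radius 1 seen from distance 2 subtends an angle of π/3, and `zoom_schedule(5.0, 5)` steps through 1×, 2×, 3×, 4×, 5×.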
[0039] A distributed multi-view vision-radar UAV localization method is proposed, which includes a wide-area UAV localization algorithm based on an abstraction-binding two-stage multi-view collaborative fusion. This algorithm fuses UAV candidate points with visual image features, specifically including visual-radar feature fusion, multi-view feature fusion, and localization output. This achieves the collaborative fusion of distributed multi-view features, enabling accurate UAV localization.
[0040] B1: Vision-Radar Cross-Modal Feature Fusion: First, the first three layers of a lightweight deep learning model (ShuffleNet) are used as a shallow Convolutional Neural Network (CNN) to extract visual features. During this process, each viewpoint (visual terminal) shares the same CNN parameters to ensure that the features extracted from each viewpoint have the same dimensionality.
[0041] Next, this invention maps the UAV's motion range to the visual information from each viewpoint to achieve cross-modal feature fusion. Specifically, the UAV's motion range derived from radar information is mapped onto the visual information of each viewpoint and assigned a high confidence level, yielding a confidence map. Then, based on cross-modal correlation, the confidence map and the visual features are fused by an attention-based Transformer module $g_{sa}$ using the following formula:
[0042]
[0043] where the mapping function, given the control parameter $V = [p_v, \theta, \varphi, \chi]$ (viewpoint position, rotation angle, pitch angle, zoom), maps the UAV's flight range predicted by the vision camera control module onto the features $x_i$ of viewpoint i. After mapping, the area belonging to the flight range is assigned a value of 1, and the rest of the area is assigned a value of 0.5.
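The confidence map described above (1 inside the mapped flight range, 0.5 elsewhere) can be illustrated with a small sketch. It assumes the predicted flight range has already been projected into the feature map as a circle with a known center and radius in feature-map cells; that projection depends on the camera parameters and is not shown. The function name and arguments are hypothetical.

```python
def confidence_map(height, width, center, radius):
    """Build a height x width map: 1.0 inside the projected flight range, 0.5 outside.

    center is (col, row) and radius is in feature-map cells; both are assumed
    to come from projecting the predicted flight range into this viewpoint.
    """
    cx, cy = center
    r2 = radius * radius
    return [
        [1.0 if (col - cx) ** 2 + (row - cy) ** 2 <= r2 else 0.5
         for col in range(width)]
        for row in range(height)
    ]
```

Using 0.5 rather than 0 outside the region keeps the remaining image content available to the fusion module, so a UAV that has drifted slightly outside the predicted range is down-weighted instead of discarded.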
[0044] B2: Multi-view feature fusion: This includes a feature abstraction process based on meta-networks, which abstracts multi-view features into a unified feature space to ensure consistent observation of potential UAV regions; and binding of the unified feature space, which binds the abstracted unified feature space back to each viewpoint to enhance the UAV region representation of each viewpoint.
[0045] B2.1: Feature abstraction process based on meta-networks: The first viewpoint $x_i$ to arrive is taken as the unified feature space S. For the abstraction process, S and $x_i$ are flattened along the spatial dimensions, and a residual convolutional network $f_c$ with 1×1 convolutions of weights W is then used to fuse the viewpoint into the feature space S:
[0046]
[0047] where $S_a$ and $x_a$ denote the flattened feature space S and viewpoint $x_i$, respectively. The sigmoid function σ activates the output of $f_c$ to obtain the association weights between each element of the feature space S and each element of viewpoint $x_i$. $f_c$ is a convolutional neural network containing three 1×1 residual convolutional blocks. × denotes matrix multiplication, which ensures that every element of the flattened unified feature space is related to the elements of viewpoint $x_i$. To establish these correlations, $S_a$ is copied column-wise as many times as there are spatial elements in viewpoint $x_i$. The result of the operation therefore has the same dimensions as $S_a$ while also containing information from viewpoint $x_i$; adding it to the original $S_a$ updates the unified feature space.
[0048] A meta-network m is used to fit the weights W. Taking $S_a$, $x_a$, and the control parameter $V = [p_v, \theta, \varphi, \chi]$ as input, the meta-network computes W as follows:
[0049]
[0050] where [·] denotes the concatenation operation, FC is a fully connected layer, and ReLU is the ReLU activation function. In this case, the process of updating W changes from value correlation mining based on the first derivative to adaptive mining, based on the second derivative, of the spatial correlation between the two spaces that underlies the value correlation, thus avoiding failures in updating the unified feature space.
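The abstraction step can be illustrated with a heavily simplified sketch. The exact fusion formula is not reproduced in this text, so the code below is only one plausible reading of the description: sigmoid-activated association weights between every element of the flattened feature space and every element of the flattened viewpoint, a matrix multiplication that pulls viewpoint information into the shape of the feature space, and a residual addition. A single weight matrix `w` stands in for the residual 1×1 convolution network and for the meta-network that would generate its weights; all names are hypothetical.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def abstract_into(s_a, x_a, w):
    """Fold one flattened viewpoint into the flattened unified feature space.

    s_a: N x C feature space, x_a: M x C viewpoint, w: C x C stand-in weights.
    Returns an N x C matrix, the same shape as s_a.
    """
    scores = matmul(matmul(s_a, w), transpose(x_a))        # N x M raw associations
    assoc = [[sigmoid(v) for v in row] for row in scores]  # association weights in (0, 1)
    update = matmul(assoc, x_a)                            # N x C, carries viewpoint info
    return [[s + u for s, u in zip(s_row, u_row)]          # residual addition
            for s_row, u_row in zip(s_a, update)]
```

Because the output has the same shape as `s_a`, the function can be applied once per arriving viewpoint without knowing in advance how many viewpoints will report.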
[0051] B2.2: Binding of Unified Feature Space: After obtaining the feature space S, the features are rebound to each viewpoint based on the fused features. The process is represented as follows:
[0052]
[0053] where the output is the set of features used for further localization after binding. The binding parameter represents the degree of association between each element in the unified feature space S and the k-th element in viewpoint i. This binding parameter is strengthened by a binding network, represented as:
[0054]
[0055] where $f_b$ is a residual CNN with the same structure as $f_c$, and its weights are those of a 1×1 convolution. Finally, reshaping the output yields the fused features of each viewpoint $x_i$.
[0056] B3: Cooperative Position Output: A visual detector, consisting of the remaining layers of the lightweight neural network ShuffleNet and the detector part of the object detection algorithm YOLO, is applied to output the pixel-level UAV position in each viewpoint v together with the corresponding confidence level.
[0057] Compared with the prior art, the beneficial effects of the present invention are as follows:
[0058] This invention provides a distributed multi-view vision-radar UAV localization method and system integrating sensing and control. It employs a novel, calibration-free, abstract-binding two-stage strategy to achieve real-time and fine-grained fusion. This strategy adaptively abstracts all features into a unified feature space, and then enhances each viewpoint based on this feature space. This invention utilizes multi-view modalities for UAV localization, avoiding perception failures and meeting wide-area requirements, significantly improving the performance of UAV localization algorithms. The computational and data complexity of this invention is far lower than many current methods, while offering finer fusion granularity and greater flexibility.
Attached Figure Description
[0059] Figure 1 is a schematic diagram of the architecture of a distributed multi-view vision-radar UAV positioning system integrating sensing and control.
[0060] Figure 2 is a flowchart of a sensor-control integrated distributed multi-view vision-radar UAV positioning method.
[0061] Figure 3 is a flowchart of the vision camera control module method.
[0062] Figure 4 is a flowchart of the abstract-binding two-stage collaborative fusion algorithm.
[0063] Figure 5 is a schematic diagram of the system device structure in an actual deployment of this method.
Detailed Implementation
[0064] The present invention will be further described in detail below with reference to the accompanying drawings and embodiments.
[0065] This invention provides a distributed multi-view visual radar UAV positioning method and system integrating sensing and control. The system includes a radar, a visual camera, a data transmitter, a data processor, a distributed sensing terminal, and a computing platform. It designs a sensor control method and a wide-area UAV positioning algorithm based on an abstract-binding two-stage multi-view collaborative fusion, utilizing multi-view modalities for UAV positioning to achieve collaborative fusion of distributed cross-modal sensing data and high-precision wide-area UAV positioning. The technical solution provided by this invention can avoid UAV sensing failures and meets the needs of wide-area applications, significantly improving the positioning range and accuracy of UAVs.
[0066] This invention proposes a distributed multi-view vision-radar UAV positioning method and system integrating sensing and control. Figure 1 is a schematic diagram of the system architecture, which comprises two core components: a distributed sensing terminal and a computing platform.
[0067] Distributed Sensing Terminals: This system comprises multiple distributed sensing terminals. Each terminal includes a low-altitude Doppler phased array radar or visual camera for real-time monitoring, and a data transmitter to acquire sensing results and send them to a computing platform, while simultaneously receiving sensor control commands from the platform. In practical deployment, since the radar's sensing range is greater than that of a visual camera, a "one-to-many" deployment approach is adopted. This means that if one sensing terminal in the distributed system carries a radar sensor, other sensing terminals within the radar's range carry lower-cost visual cameras. Therefore, this "one-to-many" deployment reduces system costs while fully utilizing the radar's 3D sensing capabilities and the visual camera's fine-grained sensing capabilities, achieving sensory complementarity and ensuring that multiple visual cameras within the radar's sensing range can participate in the fine-grained sensing of potential UAVs.
[0068] The computing platform is equipped with a data transmitter and a data processor. The data transmitter receives sensor data from the distributed sensing terminals and inputs it into the data processor, while also sending control commands from the data processor back to the distributed sensing terminals. The data processor includes a sensor control strategy and a distributed multi-view vision-radar UAV localization method. The sensor control strategy determines the monitoring area to be monitored by the vision camera based on the candidate UAV positions given by the radar, calculates the parameters of each sensing terminal carrying a vision camera, determines its pitch angle, azimuth angle, and zoom ratio, and transmits these parameters to the distributed sensing terminals via the data transmitter. This enables the distributed sensing terminals to control and adjust the vision cameras. The integrated sensing and control distributed multi-view vision-radar UAV localization method achieves the fusion of distributed multi-view vision-radar sensing information and provides the UAV position information from the perspectives of each vision camera.
[0069] When the integrated sensing and control distributed multi-view vision-radar UAV positioning system is in operation, it can effectively reduce the overall high latency problems caused by low radar sensor sampling speed, different visual camera adjustment delays (each visual camera needs to rotate at different azimuth angles, pitch angles and perform different magnifications, so the time it takes to provide sensing data is different) and different transmission delays in the distributed system.
[0070] Figure 2 is a system flowchart. The system's workflow includes the following steps:
[0071] 1) Radar sampling and transmission: Each time the radar sensing terminal samples, it transmits the sampling results to the computing platform.
[0072] 2) Selection of candidate points for drones: For radar sensing terminals, the radar detection result point with the closest average distance to all monitoring devices in the radar sensing results is selected as the next candidate point. The selection process can be expressed as follows:
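[0073] $$p_t^{*} = \arg\min_{p_t} \frac{1}{n} \sum_{i=1}^{n} \left\| p_t - p_{mon}^{i} \right\|$$
[0074] where $p_t$ is a radar candidate point with timestamp t, $p_{mon}^{i}$ is the coordinates of the i-th sensing terminal, and n is the total number of sensing terminals. The selected point $p_t^{*}$ is the radar candidate point with the smallest mean distance $\frac{1}{n}\sum_{i=1}^{n}\|p_t - p_{mon}^{i}\|$ to the sensing terminals.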
[0075] 3) The computing platform calculates and transmits control parameters based on the visual camera control module of the UAV's motion characteristics: Based on the radar candidate point coordinates and the UAV's motion characteristics, the control parameters of the visual camera sensing terminal corresponding to the radar terminal are calculated, and the parameters are transmitted back to each visual camera sensing terminal through the data transmitter.
[0076] 4) Sensing terminal control adjustment and sensing result feedback: Based on the control parameters, each sensing terminal adjusts its own visual camera parameters. After the adjustment is completed, the sensing results are transmitted to the computing platform in real time.
[0077] 5) Distributed multi-view vision-radar UAV localization: The computing platform fuses the radar perception results with the multi-view visual information and outputs the result. During fusion, the perception results that arrive first are fused first, and each subsequently received perception result is mapped onto the already fused result, so the fusion process does not have to wait for all views to be in place. The fused result is then fused with each view, and the localization result for each view is output. A maximum waiting time T is set during the process; perception terminals that have not transmitted data within T are excluded from localization.
[0078] 6) Loop: Eliminate potential UAV candidate points in the intersection of the perception areas of the visual perception terminal, and use the remaining results to select the next candidate point.
[0079] 7) If the radar information is updated, return to step 1) and start positioning again.
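As a non-limiting illustration of steps 2) and 5) above, the following Python sketch selects the radar point with the smallest average distance to all terminals and orders the views that arrive before the maximum time T. The function names `select_candidate` and `views_within_deadline`, the array layout, and the arrival-time dictionary are assumptions made for this example and are not taken from the embodiment.

```python
import numpy as np


def select_candidate(radar_points, terminal_positions):
    """Return (index, point) of the radar point with the smallest mean distance
    to all sensing terminals. Both inputs are sequences of 3D coordinates."""
    pts = np.asarray(radar_points, dtype=float)            # shape (m, 3)
    terms = np.asarray(terminal_positions, dtype=float)    # shape (n, 3)
    # Pairwise Euclidean distances between every point and every terminal: (m, n)
    dists = np.linalg.norm(pts[:, None, :] - terms[None, :, :], axis=-1)
    idx = int(np.argmin(dists.mean(axis=1)))
    return idx, pts[idx]


def views_within_deadline(arrivals, t_max):
    """Hypothetical helper: given {view_name: arrival_time}, return the views that
    arrived no later than t_max, ordered by arrival (earliest view is fused first)."""
    on_time = [(t, view) for view, t in arrivals.items() if t <= t_max]
    return [view for _, view in sorted(on_time)]


terminals = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
points = [(5.0, 0.0, 0.0), (20.0, 0.0, 0.0)]
index, candidate = select_candidate(points, terminals)
fusion_order = views_within_deadline({"cam_a": 0.4, "cam_b": 0.1, "cam_c": 2.0}, t_max=1.0)
```

In this sketch the view `cam_c` is dropped because it arrives after the deadline, which mirrors the exclusion rule in step 5).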
[0080] II. This invention designs a visual camera control module based on the motion characteristics of unmanned aerial vehicles (UAVs).
[0081] Figure 3 is a flowchart of the visual camera control module based on the motion characteristics of a drone.
[0082] This invention designs a visual camera control module based on the motion characteristics of unmanned aerial vehicles (UAVs) to calculate the visual camera control parameters. It uses a guidance algorithm based on UAV motion to control the pitch, rotation, and zoom of each visual camera for more targeted image sampling, providing high-quality perception information for subsequent fusion and localization. Radar perception results give the location of potential UAVs. However, because the UAV moves in real time, it can leave the camera's line of sight for two reasons: the camera adjustment delay (the time between selecting a candidate point and the visual camera being in position) and the radar perception delay (candidate points are selected from radar perception results, but radar information is not updated in real time, so there is a gap between the sampling time of the current candidate point and the time at which the visual cameras fuse and localize it). This makes it difficult to perceive the UAV.
[0083] To address this, the visual camera control module based on the drone's motion characteristics uses the drone's dynamic equations to obtain its flight range over future time periods and implements sensor control guided by the drone's motion. Since the drone's movement is constrained by internal (torsional) and external (drag, gravity) forces, its speed and direction follow certain patterns, which makes its flight range predictable in the near future. Specifically, based on the flight speed $v_t$ and direction $D_t = [\varepsilon, \varphi]_t$ (where ε and φ represent the horizontal and vertical directions, respectively, and t represents the current timestamp), the possible location of the drone within a certain future time frame can be predicted as follows:
[0084]
[0085] Here, the predicted quantity is the flight velocity of the drone in a given time slot at roll angle ω, pitch angle τ, and a given yaw angle; R is the rotation matrix; the initial velocity term is the flight speed $v_t$ decomposed along direction $D_t$ into x, y, z components; $T_{max}$ is the maximum reference thrust of the drone; g is the acceleration due to gravity; the linear damping term is determined using a fixed reference value; and the superscript T denotes the transpose operation.
[0086] For each time slot, this invention calculates the velocity corresponding to the pitch and roll angles of the UAV in six directions: forward, backward, left, right, up, and down. Then, assuming that velocity changes linearly, it calculates the flight distance in each of the six directions. The flight range of the drone can be represented as a spherical region that is centered on the drone's center of motion and is the tangent sphere of the flight distances in the six directions.
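As a non-limiting illustration of the flight range estimate described above, the following Python sketch computes a per-direction flight distance under linearly changing speed and then builds a sphere around the six reachable extremes. The trapezoidal distance rule follows from the linear-velocity assumption; the sphere construction (midpoint of the axis extents as center, largest distance to an extreme as radius) and the axis convention are assumptions of this example, since the exact construction is not recoverable from the text.

```python
import numpy as np

DIRECTIONS = ("forward", "backward", "left", "right", "up", "down")


def flight_distance(v_start, v_end, dt):
    """Distance covered in dt when speed changes linearly from v_start to v_end."""
    return 0.5 * (v_start + v_end) * dt


def flight_range_sphere(position, distances):
    """Return (center, radius) of a sphere enclosing the six reachable extremes.

    Assumed axis convention: forward/backward = +x/-x, left/right = +y/-y,
    up/down = +z/-z. `distances` maps each name in DIRECTIONS to a distance >= 0.
    """
    d = {name: float(distances[name]) for name in DIRECTIONS}
    pos = np.asarray(position, dtype=float)
    extremes = pos + np.array([
        [d["forward"], 0.0, 0.0], [-d["backward"], 0.0, 0.0],
        [0.0, d["left"], 0.0], [0.0, -d["right"], 0.0],
        [0.0, 0.0, d["up"]], [0.0, 0.0, -d["down"]],
    ])
    # Center: midpoint of the extent along each axis (assumption of this sketch).
    center = 0.5 * (extremes.max(axis=0) + extremes.min(axis=0))
    # Radius: distance to the farthest extreme, so every extreme lies inside.
    radius = float(np.linalg.norm(extremes - center, axis=1).max())
    return center, radius


step = flight_distance(v_start=1.0, v_end=3.0, dt=1.0)
center, radius = flight_range_sphere((1.0, 1.0, 1.0), {name: step for name in DIRECTIONS})
```

When the six distances are equal, the sphere is centered on the current position; unequal distances shift the center toward the direction in which the UAV can travel farther.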
[0087] Based on this, the direction of the line of sight can be expressed in terms of the following quantities: Ψ represents the horizontal rotational velocity, ρ represents the vertical rotational velocity, t represents the current moment, $l_t$ indicates the current line of sight, and $l_{St}$ indicates the direction of the next line of sight. The direction of the line connecting the visual camera and the drone's center of motion is given as follows:
[0088]
[0089] Here, $c_t$ denotes the drone's center of motion, $\hat{l}_{c_t}$ denotes the direction of the line connecting the visual camera and the drone's center of motion, arccos(·) is the arccosine function, arctan(·) is the arctangent function, $p_{mon}$ is the position of the visual camera, and ||·|| denotes the Euclidean distance. The subscripts x, y, and z denote the x, y, and z coordinates of the calculated result, respectively.
[0090] Setting the next line of sight equal to this direction yields the rotation time of the visual camera, and the camera rotation can be calculated using the following formula:
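As a non-limiting illustration of the direction calculation described above, the following Python sketch computes two angles for the line from the visual camera to the drone's center of motion. The angle convention (azimuth from the arctangent in the x-y plane, polar angle from the arccosine of the z component over the Euclidean distance) is an assumption of this example; the text names the functions used but the exact arrangement is not recoverable.

```python
import math


def line_of_sight(camera_pos, center):
    """Return (azimuth, polar) in radians for the line from camera_pos to center.

    azimuth: angle in the x-y plane, measured from the +x axis (uses atan2 so the
             quadrant is resolved; assumed convention).
    polar:   angle from the +z axis, computed with arccos (assumed convention).
    """
    dx, dy, dz = (c - p for c, p in zip(center, camera_pos))
    dist = math.sqrt(dx * dx + dy * dy + dz * dz)
    if dist == 0.0:
        raise ValueError("camera and center of motion coincide; direction undefined")
    azimuth = math.atan2(dy, dx)
    polar = math.acos(dz / dist)
    return azimuth, polar


azimuth, polar = line_of_sight((0.0, 0.0, 0.0), (1.0, 1.0, 0.0))
```

A target at the same height as the camera gives a polar angle of 90 degrees under this convention; a pitch angle measured from the horizontal would be its complement.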
[0091]
[0092] where θ and φ represent the rotation angle and the pitch angle, respectively.
[0093] This allows the maximum scaling factor of the visual camera to be determined. Since the visual camera is already aimed at the drone's center of motion, the maximum field of view is the angle between the two lines of sight tangent to the drone's monitoring range. Given the radius of that range, the maximum field of view can be calculated using the following formula:
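As a non-limiting illustration of the rotation step, the following Python sketch derives the angular offsets and the rotation time from the horizontal rotational velocity Ψ and the vertical rotational velocity ρ. It assumes that both axes rotate simultaneously at constant speed, so the rotation time is the larger of the two per-axis times; this timing model and the name `rotation_plan` are assumptions of the example.

```python
import math


def rotation_plan(current, target, pan_speed, tilt_speed):
    """Return (d_rotation, d_pitch, time) to move from current to target.

    current, target: (rotation angle, pitch angle) in radians.
    pan_speed, tilt_speed: horizontal and vertical rotational velocity in rad/s.
    """
    # Shortest signed horizontal rotation, wrapped into [-pi, pi).
    d_rotation = (target[0] - current[0] + math.pi) % (2.0 * math.pi) - math.pi
    d_pitch = target[1] - current[1]
    # Assumption: both axes move at the same time, so the slower axis sets the time.
    time_needed = max(abs(d_rotation) / pan_speed, abs(d_pitch) / tilt_speed)
    return d_rotation, d_pitch, time_needed


d_rot, d_pitch, t_rot = rotation_plan((0.0, 0.0), (math.pi / 2, math.pi / 4), 1.0, 0.5)
```

The resulting time can be fed back into the flight range prediction, since the UAV keeps moving while the camera turns.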
[0094]
[0095] Based on the maximum field of view, the maximum scaling factor χ can be calculated from the imaging principle of the visual camera. During perception, the scaling factor increases smoothly from 1× to the maximum scaling factor χ×.
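As a non-limiting illustration of the field-of-view and zoom calculation, the following Python sketch uses the tangent-line geometry of a sphere and a pinhole camera model. The relation 2·arcsin(r/d) follows from the tangent lines to a sphere of radius r seen from distance d; the zoom relation tan(base/2)/tan(target/2), the base field of view, and the optical zoom limit are assumptions of this example and are not stated in the text.

```python
import math


def max_field_of_view(distance, radius):
    """Angle (radians) between the two lines of sight tangent to a sphere of the
    given radius whose center is at the given distance from the camera."""
    if distance <= radius:
        raise ValueError("camera lies inside the monitoring range; no tangent lines")
    return 2.0 * math.asin(radius / distance)


def scaling_factor(base_fov, target_fov, max_zoom):
    """Assumed pinhole model: focal length scales with 1 / tan(fov / 2), so the
    zoom needed to narrow base_fov to target_fov is the ratio of the tangents.
    The result is clamped to [1, max_zoom]."""
    chi = math.tan(base_fov / 2.0) / math.tan(target_fov / 2.0)
    return min(max(chi, 1.0), max_zoom)


fov = max_field_of_view(distance=2.0, radius=1.0)
chi = scaling_factor(base_fov=math.pi / 2, target_fov=fov, max_zoom=30.0)
```

A smooth ramp from 1× to χ× can then be produced by interpolating between 1.0 and `chi` over the perception interval.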
[0096] III. This invention designs a distributed multi-view vision-radar UAV positioning method.
[0097] Figure 4 is a flowchart of the multi-view collaborative fusion wide-area UAV positioning algorithm based on a two-stage abstraction-binding process.
[0098] The distributed multi-view vision-radar UAV localization method is implemented by the multi-view collaborative fusion wide-area UAV localization algorithm based on the two-stage abstraction-binding process, and includes the following steps.
[0099] Visual-radar cross-modal feature fusion: First, the first three layers of a lightweight deep learning model (ShuffleNet) are used as a shallow convolutional neural network (CNN) to extract visual features. During this process, each viewpoint (visual terminal) shares the same CNN parameters to ensure that the features extracted from each viewpoint have the same dimensionality.
[0100] Next, this invention maps the UAV's motion range onto the visual information from each viewpoint to achieve cross-modal feature fusion. Specifically, the UAV's motion range derived from radar information is mapped onto the visual information of each viewpoint and assigned a high confidence level to obtain a confidence map. Then, based on cross-modal correlation, the confidence map and the visual features are fused by an attention-based Transformer module $g_{sa}$, as follows:
[0101]
[0102] where the mapping function, given the control parameters $V = [p_v, \theta, \varphi, \chi]$ (viewpoint position, rotation angle, pitch angle, zoom), maps the UAV's motion range onto the features $x_i$ of viewpoint i. The area covered by the mapped motion range is assigned a value of 1, and the rest of the area is assigned a value of 0.5.
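As a non-limiting illustration of the confidence map described above, the following Python sketch assigns 1 inside the mapped motion range and 0.5 elsewhere. It assumes that the projection of the motion range onto the feature map has already been computed and can be approximated by a disk with a known pixel center and radius; the projection itself and the subsequent fusion by $g_{sa}$ are not shown.

```python
import numpy as np


def confidence_map(height, width, center_uv, radius_px, inside=1.0, outside=0.5):
    """Return a (height, width) map with `inside` where a pixel falls in the
    projected motion range and `outside` elsewhere.

    center_uv: (u, v) pixel coordinates of the projected center of motion
               (u = column, v = row); assumed to be given by a prior projection.
    radius_px: radius of the projected range in pixels (assumed disk shape).
    """
    v, u = np.mgrid[0:height, 0:width]
    mask = (u - center_uv[0]) ** 2 + (v - center_uv[1]) ** 2 <= radius_px ** 2
    return np.where(mask, inside, outside)


cmap = confidence_map(5, 5, center_uv=(2, 2), radius_px=1)
```

Using 0.5 rather than 0 outside the range keeps the visual features of the remaining area available to the fusion module instead of suppressing them entirely.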
[0103] Multi-view feature fusion includes a feature abstraction process based on meta-networks, which abstracts multi-view features into a unified feature space to ensure consistent observation of potential UAV regions; and binding of the unified feature space, which binds the abstracted unified feature space back to each viewpoint to enhance the UAV region representation of each viewpoint.
[0104] Let the first viewpoint to be in place, $x_i$, serve as the unified feature space S. In the abstraction process, S and $x_i$ are flattened along their spatial dimensions, and a residual convolutional network $f_c$ built from 1×1 convolutions with weights W is then used to fuse the viewpoint into the feature space S:
[0105]
[0106] where $S_a$ and $x_a$ denote the flattened feature space S and the flattened viewpoint $x_i$, respectively. The sigmoid function σ activates the output of $f_c$ to obtain the association weights between each element of the feature space S and each element of the viewpoint $x_i$. $f_c$ is a convolutional neural network containing three 1×1 residual convolutional blocks. × denotes matrix multiplication. To ensure that each element of the flattened unified feature space can be associated with every element of the viewpoint $x_i$, $S_a$ is replicated column-wise as many times as $x_i$ has spatial elements. The result of the operation therefore has the same dimensions as $S_a$ while also containing information from the viewpoint $x_i$; adding it to the original $S_a$ updates the unified feature space.
[0107] A meta-network $f_m$ is used to fit the weights W. Taking $S_a$, $x_a$, and the control parameters $V = [p_v, \theta, \varphi, \chi]$ as input, the meta-network computes W as follows:
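As a non-limiting illustration of the data flow in the abstraction step, the following Python sketch flattens S and the viewpoint features, forms sigmoid-activated association weights between every pair of elements, aggregates the viewpoint features with those weights, and adds the result to the flattened feature space. The residual network $f_c$ is replaced here by a simple bilinear form with a weight matrix `W`, which is an assumption made only to keep the example short; the explicit column-wise replication is implicit in the pairwise product.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def abstract_update(S, x, W):
    """Update the unified feature space S with viewpoint features x.

    S: array of shape (C, Hs, Ws); x: array of shape (C, Hx, Wx);
    W: array of shape (C, C), a stand-in for the 1x1 convolution weights
       (assumption: f_c is approximated by the bilinear form S_a W x_a^T).
    """
    channels = S.shape[0]
    S_a = S.reshape(channels, -1).T      # flattened feature space, (Ns, C)
    x_a = x.reshape(channels, -1).T      # flattened viewpoint, (Nx, C)
    # Association weight between each element of S and each element of x: (Ns, Nx)
    assoc = sigmoid(S_a @ W @ x_a.T)
    # Matrix multiplication brings viewpoint information back to the shape of S_a,
    # and the residual addition updates the unified feature space.
    S_new = S_a + assoc @ x_a
    return S_new.T.reshape(S.shape)


S0 = np.zeros((2, 2, 2))
x0 = np.ones((2, 3, 3))
S1 = abstract_update(S0, x0, W=np.zeros((2, 2)))
```

Because later viewpoints are folded into S one at a time, the same update can be applied to each view as it arrives, without waiting for all views.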
[0108]
[0109] where [·] denotes the concatenation operation, FC is a fully connected layer, and ReLU is the rectified linear unit activation function. In this way, the process of updating W changes from mining value correlations based on the first derivative to adaptively mining, based on the second derivative, the spatial correlation between the two spaces that underlies the value correlation, which avoids failures in updating the unified feature space.
[0110] After obtaining the feature space S, the features are re-bound to each viewpoint based on the fused features. The process is represented as follows:
[0111]
[0112] where the output denotes the features used for further localization after binding. The binding parameter represents the degree of association between each element in the unified feature space S and the k-th element in viewpoint i. Enhancing this binding parameter yields a binding network, represented as:
[0113]
[0114] where $f_b$ is a residual CNN with the same structure as $f_c$, and its weights are those of a 1×1 convolution. Finally, reshaping the result yields the fused features of each viewpoint $x_i$.
[0115] Cooperative position output: A visual detector, consisting of the remaining layers of the lightweight neural network ShuffleNet and the detector portion of the object detection algorithm YOLO, outputs the pixel-level drone position in each viewpoint v together with the corresponding confidence level.
[0116] In a specific implementation, this invention provides a distributed multi-view vision-radar UAV positioning system device integrating sensing and control. As shown in Figure 5, the structure of the integrated sensing and control distributed multi-view vision-radar UAV positioning system includes:
[0117] 50111 Vision Camera: Includes a gimbal and imaging module for acquiring fine-grained images.
[0118] 50121 Radar: Used to acquire the three-dimensional position coordinates $p_t = [x, y, z]_t$ of potential UAVs in the airspace centered on the radar with its effective range as the radius (where x, y, z represent the three-dimensional coordinates with the radar as the origin, and t represents the current timestamp), together with the flight speed $v_t$ and direction $D_t = [\varepsilon, \varphi]_t$ (where ε and φ represent the horizontal and vertical directions, respectively).
[0119] 50112 Data Transmitter: Used for encoding, sending, receiving, and decoding information and data.
[0120] 5021 Data Processor: Includes a visual camera control module based on the motion characteristics of UAVs, and a distributed multi-view visual-radar UAV positioning method integrating sensing and control.
[0121] The 501 Distributed Sensing Terminal: This system comprises multiple distributed sensing terminals. Each terminal includes a low-altitude Doppler phased array radar or visual camera for real-time monitoring, and a data transmitter to acquire sensing results and send them to a computing platform, while simultaneously receiving sensor control commands from the platform. In practical deployment, since the radar's sensing range is greater than that of a visual camera, the system employs a "one-to-many" deployment approach. This means that if one sensing terminal in the distributed system carries a radar sensor, other sensing terminals within the radar's range carry lower-cost visual cameras. Therefore, this "one-to-many" deployment reduces system costs while fully utilizing the radar's 3D sensing capabilities and the visual camera's fine-grained sensing capabilities, achieving sensory complementarity and ensuring that multiple visual cameras within the radar's sensing range can participate in the fine-grained sensing of potential UAVs.
[0122] The 502 computing platform is equipped with a data transmitter and a data processor. The data transmitter receives sensor data from the distributed sensing terminals and passes it to the data processor, and it sends control commands from the data processor back to the distributed sensing terminals. The data processor implements a sensor control strategy and the distributed multi-view vision-radar UAV localization method. The sensor control strategy determines the area each vision camera should monitor from the candidate UAV positions given by the radar, calculates the pitch angle, azimuth angle, and zoom ratio for each sensing terminal carrying a vision camera, and transmits these parameters to the distributed sensing terminals via the data transmitter so that the terminals can adjust their vision cameras. The integrated sensing and control distributed multi-view vision-radar UAV localization method fuses the distributed multi-view vision-radar sensing information and provides the UAV position from the perspective of each vision camera.
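As a non-limiting illustration of the "one-to-many" deployment described for the distributed sensing terminals, the following Python sketch groups visual camera terminals under each radar terminal whose sensing range covers them. The dictionary layout, the terminal names, and the use of a single spherical range per radar are assumptions of this example.

```python
import math


def assign_cameras(radars, cameras):
    """Group camera terminals by the radar terminals that cover them.

    radars:  {radar_name: (position, sensing_range)} with 3D positions.
    cameras: {camera_name: position}.
    Returns {radar_name: [camera_name, ...]}; a camera covered by several
    radars appears under each of them, and an uncovered camera appears nowhere.
    """
    groups = {name: [] for name in radars}
    for cam_name, cam_pos in cameras.items():
        for radar_name, (radar_pos, sensing_range) in radars.items():
            if math.dist(cam_pos, radar_pos) <= sensing_range:
                groups[radar_name].append(cam_name)
    return groups


groups = assign_cameras(
    radars={"radar_1": ((0.0, 0.0, 0.0), 100.0)},
    cameras={"cam_a": (30.0, 40.0, 0.0), "cam_b": (300.0, 0.0, 0.0)},
)
```

Here `cam_a` lies 50 units from the radar and is grouped under it, while `cam_b` is outside the sensing range and would need another radar terminal to guide it.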
[0123] It should be noted that the terms such as "upper", "lower", "left", "right", "front", and "back" used in the invention are only for clarity of description and are not intended to limit the scope of the invention. Changes or adjustments to their relative relationships, without substantially altering the technical content, should also be considered within the scope of the invention.
[0124] The above are merely preferred embodiments of the present invention. The scope of protection of the present invention is not limited to the above embodiments. All technical solutions falling within the scope of the present invention's concept are within the scope of protection of the present invention. It should be noted that for those skilled in the art, any improvements and modifications made without departing from the principles of the present invention should be considered within the scope of protection of the present invention.
Claims
1. A method for locating unmanned aerial vehicles (UAVs) using a distributed multi-view visual radar integrating sensing and control, characterized in that a sensor control method and a multi-view collaborative fusion wide-area UAV localization algorithm are designed, and multi-view modalities are used for UAV localization, achieving collaborative fusion of distributed cross-modal sensing data and high-precision wide-area UAV localization; the method includes the following steps:
1) The radar sensing terminal acquires the sampling results and transmits them;
2) Selecting candidate drone points: from the radar sensing results, the radar detection result point with the smallest average distance to all monitoring devices is selected as the next candidate drone point;
3) A sensor control method is designed: the flight range of the UAV is predicted based on the motion characteristics of the UAV, the control parameters of the visual camera sensing terminal corresponding to the radar sensing terminal are calculated, and the control parameters are sent back to each visual camera sensing terminal;
4) Based on the control parameters of the visual camera sensing terminal corresponding to the radar sensing terminal, each sensing terminal adjusts its visual camera parameters; after the adjustment is completed, the sensing results are transmitted to the computing platform in real time;
5) A multi-view collaborative fusion wide-area UAV localization algorithm is designed to perform distributed multi-view vision-radar UAV localization: based on the radar perception results and the multi-view visual information, fusion calculation is performed to obtain the distributed multi-view vision-radar UAV localization results;
6) Return to step 2): potential UAV candidate points in the intersection of the perception areas of the visual perception terminals are eliminated, and the remaining results are used to select the next UAV candidate point;
7) If the radar information is updated, return to step 1) and start again;
Through the above steps, distributed multi-view vision-radar UAV positioning integrating sensing and control is achieved.
2. The integrated sensing and control distributed multi-view visual radar UAV positioning method as described in claim 1, characterized in that in step 3) the calculation of the control parameters of the visual camera sensing terminal corresponding to the radar sensing terminal includes the following steps:
31) Based on the drone's flight speed and direction, predict the drone's location over a future period of time;
32) Calculate the linear velocity corresponding to the pitch and roll angles of the UAV in each time slot in six directions: forward, backward, left, right, up, and down; based on the change in linear velocity, calculate the flight distance in the six directions;
33) Represent the flight range of the UAV as a sphere centered on the UAV's center of motion, the sphere being the circumscribed sphere of the flight distances in the six directions;
34) Obtain the direction of the line connecting the visual camera and the drone's center of motion;
35) Obtain the rotation time of the visual camera and calculate the rotation angle of the visual camera;
36) Calculate the maximum field of view of the UAV and determine the maximum scaling factor of the visual camera based on the maximum field of view;
The control parameters of the visual camera sensing terminal corresponding to the radar sensing terminal are thus calculated.
3. The integrated sensing and control distributed multi-view visual radar UAV positioning method as described in claim 2, characterized in that in step 34) the direction of the line connecting the visual camera and the drone's center of motion is represented by a formula in which $c_t$ denotes the drone's center of motion; $\hat{l}_{c_t}$ denotes the direction of the line connecting the visual camera and the drone's center of motion; arccos(·) represents the arccosine function; arctan(·) represents the arctangent function; $p_{mon}$ represents the coordinates of the visual camera in the three-dimensional coordinate system with the radar coordinates as the origin; ||·|| represents the Euclidean distance calculation; and the subscripts x, y, and z represent the x, y, and z coordinates of the calculated result, respectively.
4. The integrated sensing and control distributed multi-view visual radar UAV positioning method as described in claim 3, characterized in that step 35) calculates the camera's rotation angle by a formula in which θ and φ represent the rotation angle and pitch angle of the UAV, respectively; Ψ represents the horizontal rotation speed; ρ represents the vertical rotation speed; and $\hat{l}_{c_t}$ represents the direction of the line connecting the visual camera and the drone's center of motion.
5. The integrated sensing and control distributed multi-view visual radar UAV positioning method as described in claim 4, characterized in that step 36) calculates the maximum field of view of the UAV from the radius of the drone's monitoring range, the drone's center of motion, and $p_{mon}$, where $p_{mon}$ represents the coordinates of the visual camera in the three-dimensional coordinate system with the radar coordinates as the origin.
6. The integrated sensing and control distributed multi-view visual radar UAV positioning method as described in claim 2, characterized in that in step 5) the multi-view collaborative fusion wide-area UAV localization algorithm includes visual-radar cross-modal feature fusion, multi-view feature fusion, and localization output:
51) Visual-radar cross-modal feature fusion, including: using a lightweight deep learning model as a shallow convolutional neural network to extract visual features, where each viewpoint, i.e., each visual terminal, shares the same network parameters to ensure that the features extracted from each viewpoint have the same dimension; mapping the drone's motion range onto the visual information from each viewpoint to achieve cross-modal feature fusion, where the motion range derived from radar information is mapped onto the visual information of each viewpoint and assigned high confidence to obtain a confidence map; and, based on cross-modal correlation, fusing the confidence map and the visual features using the attention-based Transformer module $g_{sa}$;
52) Multi-view feature fusion, including a feature abstraction process based on a meta-network and a binding process of the unified feature space; the feature abstraction process based on the meta-network abstracts the multi-view features into a unified feature space and achieves consistent observation of potential UAV areas; the binding process of the unified feature space binds the abstracted unified feature space back to each viewpoint and enhances the UAV region representation of each viewpoint;
53) Cooperative positioning output: a visual detector, consisting of the remainder of the lightweight neural network ShuffleNet and the detector part of the object detection algorithm YOLO, is applied to output the pixel-level UAV position in each viewpoint v and the corresponding confidence level.
7. The integrated sensing and control distributed multi-view visual radar UAV positioning method as described in claim 6, characterized in that in step 51) the confidence map and the visual features are fused by a formula in which the drone's range of motion is mapped, under the control parameters $V = [p_v, \theta, \varphi, \chi]$ (the viewpoint position, rotation angle, pitch angle, and scaling factor, respectively), onto the features $x_i$ of viewpoint i by a mapping function; the mapped elements belonging to the motion-range region and the rest of the region are assigned different values: the region belonging to the mapped motion range is assigned the value 1, and the rest of the region is assigned the value 0.5.
8. The integrated sensing and control distributed multi-view visual radar UAV positioning method as described in claim 7, characterized in that step 52) includes:
521) Feature abstraction process based on the meta-network: let the first viewpoint to be in place, $x_i$, serve as the unified feature space S; S and $x_i$ are flattened along their spatial dimensions, and a residual convolutional network $f_c$ with 1×1 convolutions with weights W is used to fuse the viewpoint into the feature space S, where $S_a$ and $x_a$ denote the flattened feature space S and the flattened viewpoint $x_i$, respectively; the sigmoid function σ activates the output of $f_c$ to obtain the association weights between each element of the feature space S and each element of the viewpoint $x_i$; $f_c$ is a convolutional neural network containing three 1×1 residual convolutional blocks; × represents matrix multiplication; $S_a$ is replicated column-wise as many times as $x_i$ has spatial elements, which ensures that each element of the flattened unified feature space can be associated with every element of the viewpoint $x_i$; the result of the operation has the same dimensions as $S_a$ and also contains information from the viewpoint $x_i$, and adding it to the original $S_a$ updates the unified feature space; a meta-network $f_m$ is used to fit the weights W, taking $S_a$, $x_a$, and the control parameters $V = [p_v, \theta, \varphi, \chi]$ as input, where [·] represents the concatenation operation, FC is a fully connected layer, and ReLU is the rectified linear unit activation function;
522) Binding process of the unified feature space: after the feature space S is obtained, the features are bound back to each viewpoint according to the fused features, yielding the features used for further localization after binding; the binding parameter represents the degree of association between each element in the unified feature space S and the k-th element in viewpoint i; enhancing the binding parameter yields a binding network in which $f_b$ is a residual CNN with the same structure as $f_c$ and with weights of a 1×1 convolution; finally, reshaping the result yields the fused features of each viewpoint $x_i$.
9. A distributed multi-view visual radar UAV positioning system integrating sensing and control, implemented using the method described in claim 1, characterized in that, This includes radar, visual cameras, data transmitters, data processors, distributed sensing terminals, and computing platforms; among which, The radar uses a low-altitude Doppler phased array radar to obtain the three-dimensional position coordinates of potential UAVs in the airspace; The visual camera includes a gimbal and an imaging module; the imaging module is used to achieve 2D imaging at a high frame rate and acquire fine-grained images; the gimbal is used to adjust the camera's pitch angle, azimuth angle, and zoom level. The data transmitter includes an integrated data transceiver unit for encoding, sending, receiving, and decoding information and data; The data processor includes a visual camera control module and a distributed multi-view visual-radar UAV positioning module integrating sensing and control. The visual camera control module is used to determine the monitoring area to be monitored by the visual camera based on the candidate UAV positions acquired by the radar, and calculates the parameters of each sensing terminal carrying the visual camera, determining the pitch angle, azimuth angle, and zoom ratio, and transmits them to the distributed sensing terminal through the data transmitter, so that the distributed sensing terminal can control and adjust the visual camera. The distributed multi-view visual-radar UAV positioning module integrating sensing and control is used to realize the fusion of distributed multi-view visual-radar sensing information and obtain the UAV position information from the perspective of each visual camera. 
The distributed sensing terminal includes multiple distributed sensing terminals, including a low-altitude Doppler phased array radar or visual camera and a data transmitter; the low-altitude Doppler phased array radar or visual camera is used to achieve real-time monitoring. The computing platform is equipped with a data transmitter and a data processor. The data transmitter receives the sensing data from the distributed sensing terminals, inputs it to the data processor, and sends the control commands from the data processor back to the distributed sensing terminals.
10. The integrated sensing and control distributed multi-view visual radar UAV positioning system as described in claim 9, characterized in that, The distributed sensing terminals adopt a one-to-many deployment method, that is, if there is a sensing terminal equipped with a radar sensor in the distributed sensing terminals, the sensing terminals within the radar's range are equipped with visual cameras.