Low-altitude unmanned aerial vehicle panoramic intelligent early warning method and device for natural resource monitoring
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- NANJING UNIV OF POSTS & TELECOMM
- Filing Date
- 2026-04-01
- Publication Date
- 2026-06-26
AI Technical Summary
Existing technologies for monitoring natural resources using drones suffer from several problems: a disconnect between panoramic data acquisition and target detection; insufficient efficiency in AI algorithm feature extraction and fusion; low accuracy in detecting small and occluded targets; and inadequate utilization of drone pose information.
By simultaneously collecting multi-view images and POS data from multiple types of drones, a variable-scale sparse small target detection model is constructed. The feature extraction and multi-scale fusion mechanism is optimized, and spatial positioning is performed by combining drone pose information, thus achieving deep coupling between panoramic data and target detection.
The method improves the efficiency of UAV data acquisition and intelligent monitoring, achieves high-precision small target detection and accurate spatial positioning, and solves the problems of panoramic image resolution compression, viewpoint distortion and insufficient utilization of pose information.
Figure CN121963004B_ABST
Abstract
Description
Technical Field
[0001] This application relates to the field of natural resource survey and monitoring technology, and in particular to a low-altitude unmanned aerial vehicle (UAV) panoramic intelligent early warning method and device for natural resource monitoring.
Background Technology
[0002] Existing UAV monitoring technologies for natural resources suffer from the following problems: First, there is a disconnect between panoramic data acquisition and target detection. Current UAV panoramic acquisition focuses primarily on scene reconstruction, lacking deep integration with target detection tasks. While the acquired 720° panoramic images contain rich spatial information, direct input into detection models results in resolution compression and viewpoint distortion, leading to the loss of target features and failing to meet high-precision detection requirements. Second, the efficiency of AI-based feature extraction and fusion for UAV monitoring of natural resources is insufficient. Traditional backbone networks experience a significant increase in computational load and parameter count when processing high-resolution panoramic images, leading to a decrease in inference speed.
[0003] Meanwhile, the feature pyramid structure of the neck network (such as PAFPN) is prone to misalignment between high-level semantic information and low-level detail information during cross-scale fusion, which degrades the detection performance for multi-scale targets. Third, the detection accuracy for small and occluded targets in natural resource UAV monitoring is low: existing detection heads mostly adopt coupled structures, in which the classification and regression tasks interfere with each other; in panoramic images, small targets occupy a small proportion of the image, carry little texture information, and are easily occluded by the background, making it difficult for the model to extract their features effectively and resulting in missed detections and false detections. Fourth, the pose information of images collected by UAVs for natural resource monitoring is not fully utilized: UAV-collected POS data contains rich spatial pose information, but existing technologies mostly store it as independent metadata without effectively fusing it with image features, so the detection results lack accurate geographic coordinates and cannot directly support subsequent spatial analysis and decision-making.
[0004] Given the advantages of UAV panoramic data acquisition and analysis, such as convenient data collection, high flight efficiency, large coverage area, low equipment requirements, and non-confidential data, this application focuses on how to achieve deep integration of UAV panoramic data with natural resource monitoring operations. By overcoming the challenges of intelligent identification models for suspected illegal and irregular behaviors and monocular omnidirectional spatial positioning of panoramic data, a low-altitude UAV panoramic intelligent early warning method and device for natural resource monitoring is formed.
Summary of the Invention
[0005] To address the aforementioned issues, this application discloses a panoramic intelligent early warning method and device for low-altitude unmanned aerial vehicles (UAVs) for natural resource monitoring. This application is adaptable to small and lightweight UAVs, can significantly improve the efficiency of UAV data acquisition, and can greatly enhance the level of intelligent monitoring and spatial positioning, providing a new technical reference for natural resource monitoring and supervision business scenarios.
[0006] A panoramic intelligent early warning method for low-altitude unmanned aerial vehicles (UAVs) for natural resource monitoring includes the following steps:
[0007] Step S1: Conduct panoramic data collection based on multiple types of drones: acquire multi-view drone images and drone panoramic images synthesized from multi-view drone images, and record POS data simultaneously;
[0008] Step S2: Driven by natural resource monitoring operations, classify and label suspected illegal and non-compliant targets to construct a panoramic intelligent sample dataset of low-altitude UAVs for natural resource monitoring.
[0009] Step S3: Construct and train a variable-scale sparse small target detection model for panoramic data;
[0010] Step S4: Perform projection transformation on the collected drone panoramic image, and use the trained variable-scale sparse small target detection model to perform image-by-image target detection and inference on the transformed drone panoramic image to extract suspected illegal and irregular targets.
[0011] Step S5: Based on the multi-view UAV images and their POS data, spatially locate the detected suspected illegal or irregular targets and obtain their geographic coordinates;
[0012] Step S6: Integrate the low-altitude UAV panoramic intelligent early warning device and execute steps S1 to S5.
[0013] Furthermore, the specific steps of step S1 are as follows:
[0014] Step S1.1: Prepare multiple types of drones. The drones are equipped with cameras and integrated with IMU inertial measurement units, RTK-level GPS modules and wireless data transmission links to synchronously collect road images and POS data.
[0015] Step S1.2: Input the panoramic data acquisition points and flight path planning, set the drone's flight parameters, and plan the drone's data acquisition flight path;
[0016] Step S1.3: The UAV conducts panoramic data acquisition; after the UAV flies to the panoramic data acquisition point, it acquires multiple multi-view UAV images at the same location, and the UAV imaging chip or subsequent processing combines the multiple multi-view UAV images into a single UAV panoramic image.
[0017] Step S1.4: Output and store multiple multi-view drone images and drone panoramic images; synchronously record the POS data corresponding to each image, and complete local and backup storage and data integrity verification.
[0018] Furthermore, the specific steps of step S2 are as follows:
[0019] Step S2.1: Driven by natural resource monitoring operations, classify and code suspected illegal and irregular targets; determine a list of categories for suspected illegal and irregular targets to be identified, and assign a unique code to each category; the list of categories includes at least the following 20 types of targets: excavators, soil turners, dump trucks, transport vehicles, road rollers, bulldozers, pile drivers, cranes, mixers, material hoists, tower cranes, sheds and prefabricated houses, dust nets, scaffolding, brick houses under construction, piles of soil, piles of bricks, piles of steel bars, flames, and thick smoke;
[0020] Step S2.2: Label suspected illegal and irregular activities driven by natural resource monitoring business to form a panoramic intelligent sample dataset of low-altitude UAVs for natural resource monitoring;
[0021] Step S2.3: Divide the low-altitude UAV panoramic intelligent sample dataset for natural resource monitoring into a training set and a validation set in a 7:3 ratio; this will be used to train and evaluate the model.
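The 7:3 split in step S2.3 can be sketched as follows. This is a minimal illustration, not the applicant's tooling: the function name `split_dataset`, the fixed shuffle seed, and the sample names are all assumptions introduced here.

```python
import random

def split_dataset(sample_ids, train_parts=7, val_parts=3, seed=0):
    """Shuffle sample IDs and split them train:val = train_parts:val_parts."""
    ids = list(sample_ids)
    random.Random(seed).shuffle(ids)  # fixed seed keeps the split reproducible
    # Integer arithmetic avoids floating-point rounding at the boundary.
    n_train = len(ids) * train_parts // (train_parts + val_parts)
    return ids[:n_train], ids[n_train:]

# Hypothetical sample names, for illustration only.
samples = [f"pano_{i:04d}" for i in range(10)]
train_set, val_set = split_dataset(samples)
```

With ten samples, seven land in the training set and three in the validation set; every sample appears in exactly one of the two.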
[0022] Furthermore, the specific steps of step S3 are as follows:
[0023] Step S3.1: Constructing a variable-scale sparse small target detection model based on YOLOv8-seg: The variable-scale sparse small target detection model includes a backbone network, a neck network, a head network, and a post-processing module;
[0024] In this process, the feature extraction module of the backbone network embeds an SE attention module after each bottleneck layer in the original Darknet Bottleneck sequence. The SE attention module compresses the spatial dimension through global average pooling, generates channel weights through convolution, ReLU activation, deconvolution, and the Sigmoid function, and repeats this process n times. Then, the channel weights are multiplied by the original feature map for feature recalibration.
[0025] Simultaneously, the standard ConvModule modules in the backbone and neck network are replaced with ConvModule_Dino modules; the ConvModule_Dino module sequentially performs convolution, BatchNorm2d normalization, and DinoV3 ViT Block processing, where the processing flow of the DinoV3 ViT Block is as follows:
[0026] First, the input state is denoted as state one. State one is subjected to layer normalization (LN) and multi-head self-attention (MSA) to obtain state two. State two is connected with the residual of state one and then subjected to layer normalization (LN) again to obtain state three. State three is activated by multilayer perceptron (MLP) and GELU to obtain state four. Finally, state one, state three and state four are fused with residuals to obtain multi-level feature maps.
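The SE recalibration described in step S3.1 (global average pooling, a convolution with ReLU, a second layer with Sigmoid, then channel-wise multiplication) can be sketched in NumPy. This is a hedged illustration: it assumes the two layers act as a channel reduction followed by a channel expansion, as in the standard squeeze-and-excitation design, and models them as matrix products because a 1×1 convolution on a 1×1 pooled map reduces to one. The function name, the reduction ratio `r`, and the random weights are hypothetical.

```python
import numpy as np

def se_recalibrate(x, w_reduce, w_expand):
    """Squeeze-and-excitation on a feature map x of shape (C, H, W)."""
    squeezed = x.mean(axis=(1, 2))                 # global average pooling -> (C,)
    hidden = np.maximum(w_reduce @ squeezed, 0.0)  # channel reduction + ReLU
    logits = w_expand @ hidden                     # expansion back to C channels
    weights = 1.0 / (1.0 + np.exp(-logits))        # Sigmoid -> weights in (0, 1)
    # Feature recalibration: scale every channel by its weight.
    return x * weights[:, None, None], weights

rng = np.random.default_rng(0)
C, H, W, r = 8, 4, 4, 4                            # hypothetical sizes
x = rng.normal(size=(C, H, W))
w_reduce = rng.normal(size=(C // r, C))            # stand-in for learned weights
w_expand = rng.normal(size=(C, C // r))
y, weights = se_recalibrate(x, w_reduce, w_expand)
```

The output keeps the input shape; only the relative strength of each channel changes.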
[0027] Step S3.2: Design a multi-scale feature fusion mechanism: Input multi-level feature maps, the neck network adopts a CSP structure, integrating a path aggregation network and a feature pyramid network; fuse the multi-level feature maps output by the backbone network through upsampling and downsampling operations to generate feature pyramids of three scales: 80×80 pixels, 40×40 pixels, and 20×20 pixels.
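The three pyramid scales in step S3.2 are consistent with a 640×640 input (the scaling size given for step S4.3) divided by downsampling strides of 8, 16 and 32. The strides are an assumption taken from the stock YOLOv8 layout; the source does not state them.

```python
def pyramid_grid_sizes(input_size=640, strides=(8, 16, 32)):
    """Side length of each feature map for a square input, one per stride."""
    return [input_size // s for s in strides]

sizes = pyramid_grid_sizes()
print(sizes)  # [80, 40, 20]
```

Read this way, the sizes are feature-map grid dimensions: the 80×80 level has the finest cells and is the one most relevant to small targets.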
[0028] Step S3.3: Model Training and Optimization: The variable-scale sparse small target detection model is trained end-to-end using the training set. Data augmentation strategy is adopted, and the model parameters are iteratively updated through the gradient descent algorithm until the loss converges, thus obtaining the trained variable-scale sparse small target detection model.
[0029] Step S3.4: Model Validation and Deployment: After training, the performance of the variable-scale sparse small target detection model is evaluated using a test set. The metrics include precision, recall, mean average precision (mAP), and frame rate. Finally, the weights of the optimized variable-scale sparse small target detection model are converted into a deployment format and integrated into embedded devices or cloud platforms to achieve real-time small target detection.
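Two of the evaluation metrics named in step S3.4 reduce to simple ratios, sketched below. The function names and the counts are illustrative assumptions; how detections are matched to ground truth (for example, the IoU threshold) is not specified in the source and is left out.

```python
def precision_recall(tp, fp, fn):
    """Precision and recall from true-positive, false-positive, false-negative counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

def frame_rate(num_images, elapsed_seconds):
    """Average number of images processed per second."""
    return num_images / elapsed_seconds

# Hypothetical counts: 8 correct detections, 2 false alarms, 2 missed targets.
p, r = precision_recall(tp=8, fp=2, fn=2)
fps = frame_rate(num_images=50, elapsed_seconds=2.0)
```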
[0030] Furthermore, the specific steps of step S4 are as follows:
[0031] Step S4.1: Startup and configuration of the client application, run the PyQt client application;
[0032] Step S4.2: Integrate and initialize the inference engine, load the trained variable-scale sparse small target detection model, and allocate memory space;
[0033] Step S4.3: Preprocess the input drone panoramic image, including image scaling, normalization and noise suppression, call the inference engine to perform detection, and output the category of suspected illegal and irregular targets, target bounding box, confidence score and target polygon mask;
[0034] Step S4.4: Result visualization and interactive management. The user terminal overlays the bounding boxes and polygon masks of suspected illegal and irregular natural resource targets onto the original image in real time, and simultaneously displays the list and statistical information of suspected illegal and irregular natural resource targets.
[0035] Furthermore, in step S4.2, the inference engine is a TensorRT-optimized inference engine; in step S4.3, the image scaling size is 640×640 pixels.
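The preprocessing in step S4.3 (scaling to 640×640, normalization, noise suppression) can be sketched in NumPy. The specific choices here are assumptions made for illustration: nearest-neighbour scaling without preserving aspect ratio, division by 255 for normalization, a 3×3 mean filter for noise suppression, and a channel-first output layout. The source names the three operations but not their implementation.

```python
import numpy as np

def preprocess(image, size=640):
    """image: uint8 array (H, W, 3). Returns float32 array (3, size, size)."""
    h, w = image.shape[:2]
    # Scaling: nearest-neighbour sampling onto a size x size grid.
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    resized = image[rows][:, cols].astype(np.float32)
    # Normalization: map pixel values from [0, 255] to [0, 1].
    normalized = resized / 255.0
    # Noise suppression: 3x3 mean filter with edge padding.
    padded = np.pad(normalized, ((1, 1), (1, 1), (0, 0)), mode="edge")
    smoothed = sum(padded[dy:dy + size, dx:dx + size]
                   for dy in range(3) for dx in range(3)) / 9.0
    # HWC -> CHW, the layout most detection engines expect.
    return np.ascontiguousarray(smoothed.transpose(2, 0, 1), dtype=np.float32)

pano = np.full((100, 200, 3), 255, dtype=np.uint8)  # hypothetical all-white image
tensor = preprocess(pano)
```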
[0036] Furthermore, the specific steps of step S5 are as follows:
[0037] Step S5.1: Preprocess the multi-view drone images and drone panoramic images to obtain their metadata. The multi-view drone images refer to images taken by the drone from multiple angles in the air. The drone's built-in stitching and fusion algorithm is used to form the drone panoramic image.
[0038] Step S5.2: Based on metadata, construct a pixel spatial adjacency graph between multi-view UAV images, specifically including:
[0039] Step S5.2.1: Use the K-MEANS clustering algorithm to group the multi-view drone images according to the pitch angle value. The number of clusters is set according to the number of pitch angle groups in the actual drone panoramic image shooting.
[0040] Step S5.2.2: Sort the grouped multi-view drone images in ascending order of yaw angle, within the range of [-180, 180];
[0041] Step S5.2.3: Align the adjacent grouped multi-view drone images according to the yaw angle and connect them in pairs to generate a pixel space adjacency graph;
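Steps S5.2.1 to S5.2.3 can be sketched as below. This is one possible reading, stated as assumptions: images in the same pitch group are linked to their yaw neighbours in a closed ring (yaw wraps at ±180°), and each image is linked to the image with the closest yaw in the next pitch group. The small one-dimensional K-means, the function names, and the sample metadata are all hypothetical.

```python
def kmeans_1d(values, k, iters=20):
    """Tiny 1-D k-means; returns one cluster label per value."""
    lo, hi = min(values), max(values)
    centers = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]  # evenly spread start
    labels = [0] * len(values)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: abs(v - centers[c])) for v in values]
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            if members:
                centers[c] = sum(members) / len(members)
    return labels

def yaw_gap(a, b):
    """Smallest absolute yaw difference in degrees, wrapping at +/-180."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)

def build_adjacency(images, n_pitch_groups):
    """images: dicts with 'id', 'pitch', 'yaw' in degrees. Returns a set of edges."""
    labels = kmeans_1d([im["pitch"] for im in images], n_pitch_groups)
    groups = {}
    for im, lab in zip(images, labels):
        groups.setdefault(lab, []).append(im)
    # Order groups by mean pitch, then sort each group by ascending yaw.
    rings = sorted(groups.values(), key=lambda g: sum(i["pitch"] for i in g) / len(g))
    rings = [sorted(g, key=lambda im: im["yaw"]) for g in rings]
    edges = set()
    for ring in rings:  # neighbours within one pitch group
        for i, im in enumerate(ring):
            nxt = ring[(i + 1) % len(ring)]
            if nxt["id"] != im["id"]:
                edges.add(frozenset((im["id"], nxt["id"])))
    for upper, lower in zip(rings, rings[1:]):  # links between adjacent groups
        for im in upper:
            nearest = min(lower, key=lambda o: yaw_gap(im["yaw"], o["yaw"]))
            edges.add(frozenset((im["id"], nearest["id"])))
    return edges

yaws = [-135.0, -45.0, 45.0, 135.0]
images = [{"id": "n0", "pitch": -90.0, "yaw": 0.0}]
images += [{"id": f"m{i}", "pitch": -45.0, "yaw": y} for i, y in enumerate(yaws)]
images += [{"id": f"h{i}", "pitch": 0.0, "yaw": y} for i, y in enumerate(yaws)]
edges = build_adjacency(images, n_pitch_groups=3)
```

In this example the single downward-looking image `n0` forms its own group and is attached to the oblique ring through its closest-yaw neighbour.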
[0042] Step S5.3: Calculate the edge weights between adjacent images in the adjacency graph based on the LoFTR algorithm. The formula for calculating the edge weights is as follows:
[0043] ;
[0044] where the two symbols denote, respectively, the edge weight and the number of inlier points matched in the image pair;
[0045] Step S5.4: Read the Euler angle information of adjacent multi-view UAV image pairs, establish a rotation matrix, and unify the multi-view UAV images to a positive orientation through homography transformation. The formula for calculating the rotation matrix is:
[0046] ;
[0047] where the symbols denote, respectively, the perspective transformation matrix and three rotation matrices, each describing a rotation about one coordinate axis by its corresponding angle in degrees;
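A rotation matrix built from three per-axis rotations, and the homography it induces between two views that share a camera centre, can be sketched as follows. The composition order used here, R = Rz(yaw)·Ry(pitch)·Rx(roll), is an assumption: the source formula is not reproduced in this text, and the applicant's axis order and sign conventions may differ. The intrinsic matrix `K` is a hypothetical example.

```python
import math

def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def rot_x(deg):
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def rot_y(deg):
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]

def rot_z(deg):
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def euler_to_rotation(roll, pitch, yaw):
    """Assumed Z-Y-X composition: R = Rz(yaw) @ Ry(pitch) @ Rx(roll)."""
    return matmul(rot_z(yaw), matmul(rot_y(pitch), rot_x(roll)))

def rotation_homography(K, R):
    """Pure-rotation homography H = K R K^-1 for K = [[fx,0,cx],[0,fy,cy],[0,0,1]]."""
    fx, fy, cx, cy = K[0][0], K[1][1], K[0][2], K[1][2]
    K_inv = [[1.0 / fx, 0.0, -cx / fx], [0.0, 1.0 / fy, -cy / fy], [0.0, 0.0, 1.0]]
    return matmul(K, matmul(R, K_inv))

K = [[1000.0, 0.0, 320.0], [0.0, 1000.0, 240.0], [0.0, 0.0, 1.0]]  # hypothetical
R = euler_to_rotation(roll=0.0, pitch=0.0, yaw=90.0)
H = rotation_homography(K, R)
```

A homography of this form is valid only when the two views differ by rotation alone, which matches images taken from the same hover point.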
[0048] Step S5.5: Perform geospatial registration between the directly-below (nadir) drone image in the multi-view drone image set and the high-resolution remote sensing image;
[0049] Step S5.6: Based on the pixel spatial adjacency graph and the absolute pose of the registered directly-below (nadir) UAV image in the multi-view UAV image set, successively decompose and transfer the pose transformation relationship through minimum-weight-distance path search, so as to complete the geospatial registration of all remaining multi-view UAV images with the high-resolution remote sensing image, and output the simulated orthorectified multi-view UAV image set; the formula for calculating the absolute pose of the multi-view UAV images is:
[0050] ;
[0051] where the symbols denote, respectively: the absolute pose of the current multi-view drone image; the rotation matrix from the simulated orthorectified multi-view drone image to the current multi-view drone image; the translation matrix from the simulated orthorectified multi-view drone image to the current multi-view drone image; the rotation matrix from the simulated orthorectified multi-view UAV image to the high-resolution remote sensing image; and the translation matrix from the simulated orthorectified multi-view UAV image to the high-resolution remote sensing image.
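The minimum-weight path search and pose transfer of step S5.6 can be sketched with Dijkstra's algorithm. Several things here are assumptions rather than the applicant's formulas: edge weights are taken as the reciprocal of the inlier count (so well-matched pairs are preferred), poses are represented as 3×3 homogeneous matrices, and the pairwise transforms are simple translations chosen so the result is easy to check by hand.

```python
import heapq

def matmul3(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def translation(dx, dy):
    return [[1.0, 0.0, dx], [0.0, 1.0, dy], [0.0, 0.0, 1.0]]

def dijkstra_parents(edges, source):
    """edges: {(u, v): weight}, undirected. Returns {node: parent on min-weight path}."""
    graph = {}
    for (u, v), w in edges.items():
        graph.setdefault(u, []).append((v, w))
        graph.setdefault(v, []).append((u, w))
    dist, parent = {source: 0.0}, {source: None}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v], parent[v] = nd, u
                heapq.heappush(heap, (nd, v))
    return parent

def propagate_poses(parent, relative, root_pose):
    """relative[(u, v)] maps image v coordinates into image u coordinates."""
    poses = {}
    def pose_of(node):
        if node not in poses:
            p = parent[node]
            poses[node] = (root_pose if p is None
                           else matmul3(pose_of(p), relative[(p, node)]))
        return poses[node]
    for node in parent:
        pose_of(node)
    return poses

# Hypothetical inlier counts per image pair; weight = 1 / inliers (assumption).
inliers = {("nadir", "a"): 200, ("a", "b"): 100, ("nadir", "b"): 10}
weights = {pair: 1.0 / n for pair, n in inliers.items()}
relative = {("nadir", "a"): translation(10.0, 0.0),
            ("a", "b"): translation(0.0, 5.0),
            ("nadir", "b"): translation(10.0, 5.0)}
parent = dijkstra_parents(weights, "nadir")
poses = propagate_poses(parent, relative, root_pose=translation(100.0, 200.0))
```

Image `b` is reached through `a` rather than directly, because the two well-matched hops cost less than the single weakly matched edge.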
[0052] Step S5.7: Based on the pixel coordinates of the suspected illegal or irregular target on the drone panoramic image, traverse and locate the corresponding multi-view drone images, and then calculate the geographic coordinates of the suspected illegal or irregular target from the simulated orthorectified multi-view drone image set and its registration information.
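The final pixel-to-geographic conversion in step S5.7 can be sketched as two mappings: a homography from the drone image into the registered remote sensing raster, followed by the raster's affine geotransform. The six-coefficient geotransform follows the GDAL convention; treating the registration result as a single homography, and all numeric values, are assumptions for illustration.

```python
def apply_homography(H, x, y):
    """Map a pixel (x, y) through a 3x3 homography with perspective division."""
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return u / w, v / w

def pixel_to_geo(H_image_to_raster, geotransform, px, py):
    """Drone-image pixel -> raster (col, row) -> map coordinates."""
    col, row = apply_homography(H_image_to_raster, px, py)
    gt = geotransform  # GDAL order: (x0, dx_col, dx_row, y0, dy_col, dy_row)
    x_geo = gt[0] + col * gt[1] + row * gt[2]
    y_geo = gt[3] + col * gt[4] + row * gt[5]
    return x_geo, y_geo

# Hypothetical registration: the drone image sits 100 columns and 50 rows into
# a north-up raster with 0.5 m pixels.
H = [[1.0, 0.0, 100.0], [0.0, 1.0, 50.0], [0.0, 0.0, 1.0]]
geotransform = (500000.0, 0.5, 0.0, 3500000.0, 0.0, -0.5)
target_xy = pixel_to_geo(H, geotransform, px=20.0, py=10.0)
```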
[0053] Furthermore, the specific steps of step S5.5 are as follows:
[0054] Step S5.5.1: Based on the metadata of the UAV image directly below in the multi-view UAV image set, crop out the corresponding local high-resolution remote sensing image from the high-resolution remote sensing image.
[0055] Step S5.5.2: Using the pose information of the directly-below (nadir) drone image in the multi-view drone image set, convert it into a simulated orthorectified multi-view drone image through perspective transformation; the calculation formula for perspective transformation is:
[0056] ;
[0057] where the symbols denote, respectively: the output of the perspective transformation; the perspective transformation function; the directly-below (nadir) drone image in the original multi-view drone image set that is to be transformed; the camera intrinsic parameter matrix; the inverse of the rotation matrix that transforms the view from the initial perspective to the bird's-eye-view (BEV) perspective; and the rotation matrix that transforms the simulated orthorectified multi-view UAV image into the high-resolution remote sensing image.
[0058] Step S5.5.3: Use the LoFTR algorithm to perform image matching between the local high-resolution remote sensing image and the simulated orthorectified multi-view UAV image to obtain initial matching point pairs; perform inverse transformation on the initial matching point pairs to obtain matching point pairs between the simulated orthorectified multi-view UAV image and the local high-resolution remote sensing image; use the PyCOLMAP tool to solve and correct the absolute pose of the simulated orthorectified multi-view UAV image using the matching point pairs.
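The inverse transformation of matching points in step S5.5.3 can be sketched as applying the inverse of the perspective-transformation homography to the UAV-side point of every match. This assumes the inverse step maps points from the transformed view back to the image before transformation, which is one reading of the text. The homography and the match list are hypothetical, and the matching and pose-solving stages themselves are not shown.

```python
def inv3(m):
    """Inverse of a 3x3 matrix via the adjugate."""
    (a, b, c), (d, e, f), (g, h, i) = m
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    if abs(det) < 1e-12:
        raise ValueError("homography is singular")
    return [[(e * i - f * h) / det, (c * h - b * i) / det, (b * f - c * e) / det],
            [(f * g - d * i) / det, (a * i - c * g) / det, (c * d - a * f) / det],
            [(d * h - e * g) / det, (b * g - a * h) / det, (a * e - b * d) / det]]

def apply_homography(H, x, y):
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return u / w, v / w

def unwarp_matches(H_warp, matches):
    """matches: [((x_warped, y_warped), (x_ref, y_ref)), ...].

    Returns the same pairs with the UAV-side point mapped back through H_warp^-1;
    the reference-image point is left unchanged.
    """
    H_inv = inv3(H_warp)
    return [(apply_homography(H_inv, xw, yw), ref) for (xw, yw), ref in matches]

# Hypothetical warp (scale by 2, shift by (10, 20)) and one hypothetical match.
H_warp = [[2.0, 0.0, 10.0], [0.0, 2.0, 20.0], [0.0, 0.0, 1.0]]
matches = [((20.0, 30.0), (412.0, 388.0))]
restored = unwarp_matches(H_warp, matches)
```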
[0059] A low-altitude unmanned aerial vehicle (UAV) panoramic intelligent early warning device for natural resource monitoring includes: a labeling and training device for sample labeling, model training and optimization; a monitoring device for collecting real-time status data of the area to be inspected; a positioning device for obtaining precise location information; a server that communicates with the labeling and training device, monitoring device, positioning device and client respectively, for receiving and processing data and issuing inspection tasks; a client for displaying inspection status, early warning information and processing results to the user; a controller that communicates with the server for controlling the UAV's inspection operations; a drone nest for providing parking and resupply support for the UAV; and a UAV equipped with a camera for executing inspection tasks according to instructions and transmitting data back.
[0060] The beneficial effects of this application are:
[0061] 1. To address the problem of disconnect between panoramic data acquisition and target detection in existing technologies, this application achieves deep coupling between panoramic data acquisition and target detection tasks. It simultaneously acquires multi-view images, panoramic images, and high-precision POS data through multiple types of UAVs. Furthermore, it performs preprocessing such as projection transformation and orthorectification on the panoramic data before detection and inference, effectively solving the problem of target feature loss caused by panoramic image resolution compression and viewpoint distortion.
[0062] 2. To address the issue of insufficient efficiency in feature extraction and fusion of existing AI algorithms, this application optimizes the feature extraction and multi-scale fusion mechanism. Based on YOLOv8-seg, a variable-scale sparse small target detection network is constructed. The backbone network feature extraction module is improved to CSPlayer_2Conv_SE and an SE attention module is embedded. The standard ConvModule is replaced with ConvModule_Dino. The neck network integrates PAN and FPN to generate a multi-scale feature pyramid, which reduces the computational load and parameter count of high-resolution panoramic image processing and improves the algorithm inference speed.
[0063] 3. To address the issue of low detection accuracy for small and occluded targets in existing technologies, this application constructs a variable-scale sparse small target detection network that optimizes the detection task processing logic. At the same time, it combines data augmentation strategies, a combined loss function of classification, regression, and segmentation, and a task-aligned detection logic to enhance the model's ability to extract features of small and occluded targets in panoramic images.
[0064] 4. To address the problem of insufficient utilization of UAV pose information in existing technologies, this application fully explores and integrates POS spatial pose information and image features collected by UAVs. Through a series of steps, such as constructing a multi-view image pixel spatial adjacency graph, homography transformation to positive orientation pose, geospatial registration with high-resolution remote sensing images, and pixel coordinate to geographic coordinate conversion, it achieves accurate spatial positioning of suspected illegal and irregular targets.
Attached Figure Description
[0065] Figure 1 is a flowchart of the low-altitude UAV panoramic intelligent early warning method for natural resource monitoring of this application;
[0066] Figure 2 is a schematic diagram of the construction and training of the variable-scale sparse small target detection network of this application;
[0067] Figure 3 is a flowchart of the geospatial registration between the directly-below (nadir) drone image in the multi-view drone image set and the high-resolution remote sensing image in this application;
[0068] Figure 4 is a schematic diagram of the low-altitude unmanned aerial vehicle (UAV) panoramic intelligent early warning device for natural resource monitoring of this application;
[0069] Figure 5 is an overview of the experimental area of this application, with a schematic diagram of the panoramic data collection points and data collection routes within the experimental area;
[0070] Figure 6 shows the prediction results of this application, taking sheds and prefabricated houses as an example, where (a) is the original UAV image and (b) is the model prediction result for sheds and prefabricated houses.
[0071] List of reference numerals in the attached diagram:
[0072] 10-device; 01-labeling and training equipment; 02-monitoring equipment; 03-positioning equipment; 04-server; 05-client; 11-controller; 12-UAV; 13-nest.
Detailed Implementation
[0073] The present application will be further explained below with reference to the accompanying drawings and specific embodiments. It should be understood that the following specific embodiments are for illustrative purposes only and are not intended to limit the scope of the present application. It should be noted that the terms "front", "rear", "left", "right", "up" and "down" used in the following description refer to the directions in the accompanying drawings, and the terms "inner" and "outer" refer to the directions toward or away from the geometric center of a specific component, respectively.
[0074] As shown in Figure 1, in this embodiment the low-altitude UAV panoramic intelligent early warning method for natural resource monitoring specifically includes the following steps:
[0075] Step S1: Conduct panoramic data collection based on multiple types of drones: acquire multi-view drone images and drone panoramic images synthesized from multi-view drone images, and record POS data simultaneously;
[0076] The specific steps of step S1 are as follows:
[0077] Step S1.1: Prepare multiple types of drones. The drones are equipped with cameras and integrated with IMU inertial measurement units, RTK-level GPS modules and wireless data transmission links to synchronously collect road images and POS data.
[0078] Step S1.2: Input the panoramic data acquisition points and flight path planning, set the drone's flight parameters, and plan the drone's data acquisition flight path;
[0079] Step S1.3: The UAV conducts panoramic data acquisition; after the UAV flies to the panoramic data acquisition point, it acquires multiple multi-view UAV images at the same location, and the UAV imaging chip or subsequent processing combines the multiple multi-view UAV images into a single UAV panoramic image.
[0080] Step S1.4: Output and store multiple multi-view drone images and drone panoramic images; synchronously record the POS data corresponding to each image, and complete local and backup storage and data integrity verification.
[0081] Step S2: Driven by natural resource monitoring operations, classify and label suspected illegal and non-compliant targets to construct a panoramic intelligent sample dataset of low-altitude UAVs for natural resource monitoring.
[0082] The specific steps of step S2 are as follows:
[0083] Step S2.1: Driven by natural resource monitoring operations, classify and code suspected illegal and irregular targets; determine a list of categories for suspected illegal and irregular targets to be identified, and assign a unique code to each category; the list of categories includes at least the following 20 types of targets: excavators, soil turners, dump trucks, transport vehicles, road rollers, bulldozers, pile drivers, cranes, mixers, material hoists, tower cranes, sheds and prefabricated houses, dust nets, scaffolding, brick houses under construction, piles of soil, piles of bricks, piles of steel bars, flames, and thick smoke;
[0084] Step S2.2: Label suspected illegal and irregular activities driven by natural resource monitoring business to form a panoramic intelligent sample dataset of low-altitude UAVs for natural resource monitoring;
[0085] Step S2.3: Divide the low-altitude UAV panoramic intelligent sample dataset for natural resource monitoring into a training set and a validation set in a 7:3 ratio; this will be used to train and evaluate the model.
[0086] Step S3: Construct and train a variable-scale sparse small target detection model for panoramic data; as shown in Figure 2, the specific steps of step S3 are as follows:
[0087] Step S3.1: Constructing a variable-scale sparse small target detection model based on YOLOv8-seg: The variable-scale sparse small target detection model includes a backbone network, a neck network, a head network, and a post-processing module;
[0088] In this process, the feature extraction module of the backbone network embeds an SE attention module after each bottleneck layer in the original Darknet Bottleneck sequence. The SE attention module compresses the spatial dimension through global average pooling, generates channel weights through convolution, ReLU activation, deconvolution, and the Sigmoid function, and repeats this process n times. Then, the channel weights are multiplied by the original feature map for feature recalibration.
[0089] Simultaneously, the standard ConvModule modules in the backbone and neck network are replaced with ConvModule_Dino modules; the ConvModule_Dino module sequentially performs convolution, BatchNorm2d normalization, and DinoV3 ViT Block processing, where the processing flow of the DinoV3 ViT Block is as follows:
[0090] First, the input state is denoted as state one. State one is subjected to layer normalization (LN) and multi-head self-attention (MSA) to obtain state two. State two is connected with the residual of state one and then subjected to layer normalization (LN) again to obtain state three. State three is activated by multilayer perceptron (MLP) and GELU to obtain state four. Finally, state one, state three and state four are fused with residuals to obtain multi-level feature maps.
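The four-state flow described above can be written out in NumPy to make the data path explicit. This is a hedged sketch of the description as worded, not the DinoV3 reference implementation: the learnable scale and shift of layer normalization are omitted, the GELU uses the common tanh approximation, the MLP is taken as two linear layers with GELU between them, and every size and weight is a hypothetical stand-in.

```python
import numpy as np

def layer_norm(x, eps=1e-6):
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def gelu(x):
    # tanh approximation of GELU
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x ** 3)))

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def msa(x, wq, wk, wv, wo, heads):
    """Multi-head self-attention over tokens x of shape (n, d)."""
    n, d = x.shape
    dh = d // heads
    split = lambda t: t.reshape(n, heads, dh).transpose(1, 0, 2)  # (heads, n, dh)
    q, k, v = split(x @ wq), split(x @ wk), split(x @ wv)
    attn = softmax(q @ k.transpose(0, 2, 1) / np.sqrt(dh))        # (heads, n, n)
    merged = (attn @ v).transpose(1, 0, 2).reshape(n, d)
    return merged @ wo

def vit_block(state1, p, heads=2):
    state2 = msa(layer_norm(state1), p["wq"], p["wk"], p["wv"], p["wo"], heads)
    state3 = layer_norm(state2 + state1)       # residual with state one, then LN
    state4 = gelu(state3 @ p["w1"]) @ p["w2"]  # MLP with GELU activation
    return state1 + state3 + state4            # residual fusion of the three states

rng = np.random.default_rng(0)
n, d, hidden = 5, 8, 16                        # hypothetical token count and widths
params = {name: 0.1 * rng.normal(size=shape) for name, shape in
          [("wq", (d, d)), ("wk", (d, d)), ("wv", (d, d)), ("wo", (d, d)),
           ("w1", (d, hidden)), ("w2", (hidden, d))]}
tokens = rng.normal(size=(n, d))
out = vit_block(tokens, params)
```

In a convolutional pipeline the feature map would first be flattened into tokens of shape (H·W, C) and reshaped back afterwards; that step is left out here.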
[0091] Step S3.2: Design a multi-scale feature fusion mechanism: Input multi-level feature maps, the neck network adopts a CSP structure, integrating a path aggregation network and a feature pyramid network; fuse the multi-level feature maps output by the backbone network through upsampling and downsampling operations to generate feature pyramids of three scales: 80×80 pixels, 40×40 pixels, and 20×20 pixels.
[0092] Step S3.3: Model Training and Optimization: The variable-scale sparse small target detection model is trained end-to-end using the training set. Data augmentation strategy is adopted, and the model parameters are iteratively updated through the gradient descent algorithm until the loss converges, thus obtaining the trained variable-scale sparse small target detection model.
[0093] Step S3.4: Model Validation and Deployment: After training, the performance of the variable-scale sparse small target detection model is evaluated using a test set. The metrics include precision, recall, mean average precision (mAP), and frame rate. Finally, the weights of the optimized variable-scale sparse small target detection model are converted into a deployment format and integrated into embedded devices or cloud platforms to achieve real-time small target detection.
[0094] Step S4: Perform projection transformation on the acquired drone panoramic image, and use the trained variable-scale sparse small target detection model to perform image-by-image target detection and inference on the transformed drone panoramic image to extract suspected illegal and irregular targets; the specific steps of step S4 are as follows:
[0095] Step S4.1: Startup and configuration of the client application, run the PyQt client application;
[0096] Step S4.2: Integrate and initialize the inference engine, load the trained variable-scale sparse small target detection model, and allocate memory space; wherein, the inference engine in step S4.2 is a TensorRT-optimized inference engine; and the image scaling size in step S4.3 is 640×640 pixels.
[0097] Step S4.3: Preprocess the input drone panoramic image, including image scaling, normalization and noise suppression, call the inference engine to perform detection, and output the category of suspected illegal and irregular targets, target bounding box, confidence score and target polygon mask;
[0098] Step S4.4: Result visualization and interactive management. The user terminal overlays the bounding boxes and polygon masks of suspected illegal and irregular natural resource targets onto the original image in real time, and simultaneously displays the list and statistical information of suspected illegal and irregular natural resource targets.
[0099] Step S5: Based on the multi-view UAV images and their POS data, spatially locate the detected suspected illegal or irregular targets and obtain their geographic coordinates;
[0100] The specific steps of step S5 are as follows:
[0101] Step S5.1: Preprocess the multi-view drone images and drone panoramic images to obtain their metadata. The multi-view drone images refer to images taken by the drone from multiple angles in the air. The drone's built-in stitching and fusion algorithm is used to form the drone panoramic image.
[0102] Step S5.2: Based on metadata, construct a pixel spatial adjacency graph between multi-view UAV images, specifically including:
[0103] Step S5.2.1: Use the K-MEANS clustering algorithm to group the multi-view drone images according to the pitch angle value. The number of clusters is set according to the number of pitch angle groups in the actual drone panoramic image shooting.
[0104] Step S5.2.2: Sort the grouped multi-view drone images in ascending order of yaw angle, within the range of [-180, 180];
[0105] Step S5.2.3: Align the adjacent grouped multi-view drone images according to the yaw angle and connect them in pairs to generate a pixel space adjacency graph;
[0106] Step S5.3: Calculate the edge weights between adjacent images in the adjacency graph based on the LoFTR algorithm. The formula for calculating the edge weights is as follows:
[0107] ;
[0108] where the two symbols denote, respectively, the edge weight and the number of inlier points matched in the image pair;
[0109] Step S5.4: Read the Euler angle information of adjacent multi-view UAV image pairs, establish a rotation matrix, and unify the multi-view UAV images to a positive orientation through homography transformation. The formula for calculating the rotation matrix is:
[0110] [formula image not reproduced];
[0111] where the left-hand side of the formula is the perspective transformation matrix, and the three factors on the right-hand side are the rotation matrices for rotation by the corresponding Euler angle, in degrees, about each of the three coordinate axes;
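As an illustration of step S5.4, the following Python sketch builds a rotation matrix from three Euler angles and turns it into a homography. The multiplication order `Rz(yaw) · Ry(pitch) · Rx(roll)` and the axis assignment are assumptions, because the formula itself is not reproduced in this text; the function names are hypothetical. The relation H = K · R · K⁻¹ is the standard homography for a camera that rotates about its optical centre, with K the intrinsic matrix.

```python
import math


def rot_x(deg):
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]


def rot_y(deg):
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]


def rot_z(deg):
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def matmul(a, b):
    """3x3 matrix product on nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def euler_to_rotation(yaw, pitch, roll):
    # ASSUMPTION: Z-Y-X composition order; the source formula is not available.
    return matmul(rot_z(yaw), matmul(rot_y(pitch), rot_x(roll)))


def rotation_homography(fx, fy, cx, cy, rotation):
    """H = K * R * K^-1 for a pinhole camera without skew."""
    k = [[fx, 0.0, cx], [0.0, fy, cy], [0.0, 0.0, 1.0]]
    k_inv = [[1.0 / fx, 0.0, -cx / fx], [0.0, 1.0 / fy, -cy / fy], [0.0, 0.0, 1.0]]
    return matmul(k, matmul(rotation, k_inv))
```

The resulting 3×3 matrix can be passed to any perspective-warp routine to bring an image pair to a common orientation before matching.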
[0112] Step S5.5: Perform geospatial registration of the directly downward-looking (nadir) drone image in the multi-view drone image set with the high-resolution remote sensing image; the specific steps of step S5.5 are as follows:
[0113] Step S5.5.1: Based on the metadata of the UAV image directly below in the multi-view UAV image set, crop out the corresponding local high-resolution remote sensing image from the high-resolution remote sensing image.
[0114] Step S5.5.2: Using the pose information of the nadir drone image in the multi-view drone image set, convert that image into an orthorectified multi-view drone image through perspective transformation; the calculation formula for the perspective transformation is:
[0115] [formula image not reproduced];
[0116] where the symbols in the formula denote the following: the output of the perspective transformation; the perspective transformation function; the original nadir drone image in the multi-view drone image set that is to be transformed; the camera intrinsic parameter matrix; the inverse of the rotation matrix that transforms the view from the initial perspective to the BEV (bird's-eye view) perspective; and the rotation matrix that transforms the orthorectified multi-view UAV image to the high-resolution remote sensing image.
[0117] Step S5.5.3: Use the LoFTR algorithm to perform image matching between the local high-resolution remote sensing image and the orthorectified multi-view UAV image to obtain initial matching point pairs; apply the inverse transformation to the initial matching point pairs to obtain matching point pairs between the orthorectified multi-view UAV image and the local high-resolution remote sensing image; then use the PyCOLMAP tool with these matching point pairs to solve for and correct the absolute pose of the orthorectified multi-view UAV image.
[0118] Step S5.6: Based on the pixel spatial adjacency graph and the absolute pose of the registered nadir UAV image in the multi-view UAV image set, perform a minimum-weight path search and successively decompose and transfer the pose transformation relationship along the path, thereby completing the geospatial registration of all remaining multi-view UAV images with the high-resolution remote sensing image, and output the simulated orthorectified multi-view UAV image set; the formula for calculating the absolute pose of the multi-view UAV images is:
[0119] [formula image not reproduced];
[0120] where the symbols in the formula denote the following: the absolute pose of the current multi-view drone image; the rotation matrix from the orthorectified multi-view drone image to the current multi-view drone image; the translation matrix from the orthorectified multi-view drone image to the current multi-view drone image; the rotation matrix from the orthorectified multi-view UAV image to the high-resolution remote sensing image; and the translation matrix from the orthorectified multi-view UAV image to the high-resolution remote sensing image.
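The pose transfer of step S5.6 can be sketched in Python as a Dijkstra search from the registered nadir image followed by composition of rigid transforms along the resulting tree. Several things here are assumptions made for illustration: the edge weight `1 / (1 + inliers)` (the source formula is not reproduced in this text), the convention that `relative[(parent, child)]` maps child-frame coordinates into the parent frame, and all function names.

```python
import heapq


def edge_weight(inliers):
    # ASSUMPTION: more inlier matches means a cheaper edge; not the source formula.
    return 1.0 / (1.0 + inliers)


def shortest_path_tree(graph, root):
    """Dijkstra; graph maps node -> {neighbour: weight}. Returns predecessor map."""
    dist = {root: 0.0}
    prev = {}
    heap = [(0.0, root)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    return prev


def compose(outer, inner):
    """Rigid transform x -> R_o (R_i x + t_i) + t_o, as an (R, t) pair."""
    (ro, to), (ri, ti) = outer, inner
    r = [[sum(ro[i][k] * ri[k][j] for k in range(3)) for j in range(3)]
         for i in range(3)]
    t = [sum(ro[i][k] * ti[k] for k in range(3)) + to[i] for i in range(3)]
    return r, t


def propagate(graph, root, root_pose, relative):
    """Absolute (R, t) per image, chained from the root along minimum-weight paths.

    A fuller implementation would also invert a relative transform when only
    the opposite direction (child, parent) is available.
    """
    prev = shortest_path_tree(graph, root)
    poses = {root: root_pose}

    def pose_of(node):
        if node not in poses:
            parent = prev[node]
            poses[node] = compose(pose_of(parent), relative[(parent, node)])
        return poses[node]

    for node in prev:
        pose_of(node)
    return poses
```

Because poorly matched pairs carry a high weight, the search routes each image's pose through the chain of best-matched neighbours rather than through a direct but weak link.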
[0121] Step S5.7: Based on the pixel coordinates of the suspected illegal or irregular target on the drone panoramic image, traverse the multi-view drone images to locate the corresponding image, and then calculate the geographic coordinates of the target from the orthorectified multi-view drone image set using the registration information.
[0122] Step S6: Integrate the low-altitude UAV panoramic intelligent early warning device and execute steps S1 to S5.
[0123] As shown in Figure 4, in this embodiment the low-altitude UAV panoramic intelligent early warning device 10 for natural resource monitoring comprises: a labeling and training device 01 for performing sample labeling, model training and optimization; a monitoring device 02 for collecting real-time status data of the area to be inspected; a positioning device 03 for obtaining precise location information; a server 04 that is communicatively connected to the labeling and training device 01, the monitoring device 02, the positioning device 03 and the client 05, for receiving and processing data and issuing inspection tasks; a client 05 for displaying inspection status, early warning information and processing results to the user; a controller 11 that communicates with the server 04 and controls the inspection operation of the UAV 12; a drone nest 13 for providing parking and resupply support for the UAV 12; and a UAV 12 equipped with a camera for executing inspection tasks according to instructions and transmitting data back.
[0124] This embodiment conducts a panoramic data acquisition experiment based on multiple types of drones:
[0125] Experimental environment: The experimental hardware server uses an Intel i13 processor, a 5090TI GPU with 48GB of GPU memory, and the experimental operating system is CentOS. The experimental software includes Python 3.12, Visual Studio Code, PyTorch, QGIS, etc.
[0126] An overview of the experimental area, together with the panoramic data collection points and data collection routes of this application, is shown in Figure 5. This study selected simulation data from three areas generated from the real environment of Longtan Street for experimental analysis. The experimental parameters were set as follows: 5 flights, with a coverage radius of 750 m.
[0127] The multiple types of drones mainly comprise airport (dock-based) drones and mobile personal drones. This application selects DJI 3rd generation airport drones and DJI M4E personal drones. Flight routes are planned one day at a time, with several take-off and landing points calculated from the mission scope. The take-off and landing points need to be located in areas with good transportation access, far away from high-voltage power transmission lines and from major linear infrastructure such as high-speed railways.
[0128] Personal drone flight paths are also automatically generated by the system based on the effective coverage radius, and then set by the pilot based on the drone's endurance and the site environment. This mode emphasizes the safety and legality of flight data collection, and pilots must hold a CAAC drone license. It is suitable for flexible and efficient data collection of local areas and targets with short-term changes.
[0129] The panoramic image export size is 1448×7200, and the recommended effective coverage radius is 800 m, without distinguishing between mountainous areas and plains. When airspace conditions permit, the recommended flight altitude is 300 m in mountainous areas and 200 m in plains. The position and attitude information that needs to be recorded in the image includes: longitude of the shooting point, latitude of the shooting point, true altitude of the shooting point, altitude of the shooting point, starting azimuth angle, aircraft fuselage yaw angle, aircraft fuselage roll angle, and aircraft fuselage azimuth angle.
[0130] Experiment on classification and sample labeling of suspected illegal and non-compliant targets:
[0131] This application annotated 17,000 preprocessed UAV images, obtaining a total of 50,473 annotation results. The XML annotation files generated during training sample labeling were converted into the MS COCO dataset format, in which the annotation data was then organized.
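The XML-to-COCO conversion mentioned above can be sketched as follows. The text does not state which XML schema the labeling tool produced, so this sketch assumes Pascal VOC-style files (`filename`, `size`, and `object` elements with a `bndbox`); the function name `voc_to_coco` is hypothetical. Only bounding boxes are converted here; polygon masks would need an additional `segmentation` field.

```python
import xml.etree.ElementTree as ET


def voc_to_coco(xml_strings, category_names):
    """Convert Pascal VOC-style XML annotation strings to a COCO-style dict."""
    cat_ids = {name: i + 1 for i, name in enumerate(category_names)}
    coco = {
        "images": [],
        "annotations": [],
        "categories": [{"id": i, "name": n} for n, i in cat_ids.items()],
    }
    ann_id = 1
    for image_id, text in enumerate(xml_strings, start=1):
        root = ET.fromstring(text)
        size = root.find("size")
        coco["images"].append({
            "id": image_id,
            "file_name": root.findtext("filename"),
            "width": int(size.findtext("width")),
            "height": int(size.findtext("height")),
        })
        for obj in root.iter("object"):
            box = obj.find("bndbox")
            xmin, ymin, xmax, ymax = (
                float(box.findtext(tag)) for tag in ("xmin", "ymin", "xmax", "ymax")
            )
            width, height = xmax - xmin, ymax - ymin
            coco["annotations"].append({
                "id": ann_id,
                "image_id": image_id,
                "category_id": cat_ids[obj.findtext("name")],
                # COCO boxes are [x, y, width, height] from the top-left corner.
                "bbox": [xmin, ymin, width, height],
                "area": width * height,
                "iscrowd": 0,
            })
            ann_id += 1
    return coco
```

The returned dictionary can be written to disk with `json.dump` to obtain a single annotation file per dataset split.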
[0132] Training experiments for a variable-scale sparse small target detection model for panoramic data:
[0133] In the training process of the variable-scale sparse small target detection model for panoramic data, the relevant configuration files were first deployed and their parameters set: the training set and validation set were stored in the dataset directory, the architecture definition files of each version of YOLO were placed in the \cfg\models path, and the dataset-related configurations were uniformly managed through the self-created zrzy.yaml file. At the same time, the training script train.py, the prediction script predict.py, and the basic model file yolov8n.pt were prepared. The core configuration files were then modified in a targeted manner. First, zrzy.yaml was adjusted: the dataset root path (path) was set to zrzy, the training set path (train) to images/train, and the validation set path (val) to images/val, and the twenty categories of natural resource monitoring targets were defined in names. Then, train.py was modified to specify the basic training model and configure the training parameters (model.train(data='road_damage.yaml',workers=0,epochs=300,batch=16)). Finally, the train.py script was run to start the training of the variable-scale sparse small target detection model for panoramic data.
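The configuration described above can be sketched in Python. The sketch generates the content of zrzy.yaml and wraps the quoted training call using the Ultralytics `YOLO` class. Some details are assumptions: the English class labels are taken from the category list in claim 3 and may not match the labels actually used in the dataset; the helper names `dataset_yaml` and `train` are hypothetical; and the `data` argument points at zrzy.yaml, whereas the call quoted in the text names road_damage.yaml.

```python
# ASSUMPTION: label spellings follow the category list in claim 3.
CLASS_NAMES = [
    "excavators", "soil turners", "dump trucks", "transport vehicles",
    "road rollers", "bulldozers", "pile drivers", "cranes", "mixers",
    "material hoists", "tower cranes", "sheds and prefabricated houses",
    "dust nets", "scaffolding", "brick houses under construction",
    "piles of soil", "piles of bricks", "piles of steel bars",
    "flames", "thick smoke",
]


def dataset_yaml(root="zrzy", train="images/train", val="images/val",
                 names=CLASS_NAMES):
    """Return the text of a YOLO dataset configuration file."""
    lines = [f"path: {root}", f"train: {train}", f"val: {val}", "names:"]
    lines += [f"  {index}: {name}" for index, name in enumerate(names)]
    return "\n".join(lines) + "\n"


def train(config_path="zrzy.yaml", weights="yolov8n.pt"):
    """Start training with the parameters quoted in the text."""
    # Imported lazily so the configuration helper works without the package.
    from ultralytics import YOLO

    model = YOLO(weights)
    return model.train(data=config_path, workers=0, epochs=300, batch=16)
```

A typical run would write `dataset_yaml()` to zrzy.yaml and then call `train()`; swapping `weights` for a segmentation checkpoint would be needed to train the mask branch.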
[0134] During model training, the loss function (Loss) showed a decreasing trend, with the bounding-box loss eventually stabilizing around 1.0, while the classification loss approached 0. The model was saved every 20 epochs, and the epoch with the highest precision was saved as "best.pt". Model precision gradually increased with each training epoch, eventually stabilizing around 0.68, while recall also increased, stabilizing around 0.6. mAP50 likewise increased with each training epoch, eventually stabilizing around 0.6. The model ultimately converged at these levels and can be used for practical detection tasks.
[0135] Experiment on UAV panoramic data projection transformation and image-by-image target detection inference:
[0136] Inference and recognition were carried out for the 20 categories of interest. The model achieves good detection results on the image data, with few false positives and false negatives, and the targets can be detected in most cases. Figure 6 shows the prediction result of this application for a prefabricated shed house: Figure 6 (a) is the original drone image, and Figure 6 (b) is the model's prediction of the prefabricated sheds. As can be seen from these two figures, the classification results for the prefabricated sheds are relatively accurate, the boundaries of the detection boxes fit the actual ground features well, and the actual application effect is good.
[0137] Experiment on evaluation of monocular spatial positioning accuracy of panoramic data:
[0138] Figure 3 is a flowchart of the geospatial registration between the nadir UAV image in the multi-view UAV image set of this application and the high-resolution remote sensing image. Based on the panoramic data monocular omnidirectional spatial positioning algorithm, the solution accuracy and effectiveness of the bidirectional conversion algorithm between the geographic coordinates and panoramic coordinates of the low-altitude UAV panoramic image were verified through experiments and system tests. Monocular spatial positioning can be achieved within a range of 800 meters. As the distance from the UAV aerial photography point increases, the bidirectional positioning error shows an upward trend.
[0139] Experiment on Algorithm for Converting Geographic Coordinate Latitude and Longitude Values to Panoramic Pixel Orientation Values / Experiment on Algorithm for Converting Panoramic Pixel Orientation Values to Geographic Coordinate Latitude and Longitude Values: Multiple ground points with known geographic coordinates are selected as markers. The panoramic coordinates of these points are calculated using UAV panoramic images, and then converted back to geographic coordinates. The conversion error is calculated.
[0140] Cross-validation of ground control points was conducted to compare the consistency of coordinates before and after the transformation, and the error distribution and standard deviation were statistically analyzed. The bidirectional positioning error increases with the distance from the drone's aerial photography point and shows a significant linear correlation with that distance. The overall positioning error remains within an acceptable range, at approximately one-thousandth.
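The round-trip evaluation described above can be sketched in Python as follows. For each control point it computes the distance from the aerial photography point, the error between the known and the recovered geographic coordinates, and their ratio, plus a Pearson correlation between distance and error. The haversine distance on a spherical Earth, the reading of "one-thousandth" as error divided by distance, and all function names are assumptions; no measured values from the experiment are reproduced here.

```python
import math

EARTH_RADIUS_M = 6371008.8  # mean Earth radius; spherical approximation


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp, dl = p2 - p1, math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def round_trip_errors(station, pairs):
    """Rows of (distance_m, error_m, relative_error) per control point.

    station: (lat, lon) of the aerial photography point.
    pairs: iterable of ((lat, lon) known, (lat, lon) recovered after round trip).
    """
    rows = []
    for known, recovered in pairs:
        distance = haversine_m(*station, *known)
        error = haversine_m(*known, *recovered)
        relative = error / distance if distance else float("nan")
        rows.append((distance, error, relative))
    return rows


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)
```

Feeding the distance and error columns of `round_trip_errors` into `pearson` gives a single number with which to check the reported linear trend.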
[0141] The technical means disclosed in this application are not limited to the technical means disclosed in the above embodiments, but also include technical solutions composed of any combination of the above technical features.
Claims
1. A low-altitude unmanned aerial vehicle (UAV) panoramic intelligent early warning method for natural resource monitoring, characterized in that it comprises the following steps: Step S1: Conduct panoramic data collection based on multiple types of drones: acquire multi-view drone images and drone panoramic images synthesized from the multi-view drone images, and record POS data simultaneously; Step S2: Driven by natural resource monitoring operations, classify and label suspected illegal and non-compliant targets to construct a low-altitude UAV panoramic intelligent sample dataset for natural resource monitoring; Step S3: Construct and train a variable-scale sparse small target detection model for panoramic data; the specific steps of step S3 are as follows: Step S3.1: Construct a variable-scale sparse small target detection model based on YOLOv8-seg: the variable-scale sparse small target detection model includes a backbone network, a neck network, a head network, and a post-processing module; the feature extraction module of the backbone network embeds an SE attention module after each bottleneck layer in the original Darknet Bottleneck sequence; the SE attention module compresses the spatial dimension through global average pooling, generates channel weights through convolution, ReLU activation, deconvolution, and the Sigmoid function, and repeats this process n times; the channel weights are then multiplied by the original feature map for feature recalibration; at the same time, the standard ConvModule modules in the backbone and neck networks are replaced with ConvModule_Dino modules; the ConvModule_Dino module sequentially performs convolution, BatchNorm2d normalization, and DinoV3 ViT Block processing, where the processing flow of the DinoV3 ViT Block is as follows: first, the input state is denoted as state one; state one is subjected to layer normalization (LN) and multi-head self-attention (MSA) to obtain state two; state two is added to state one through a residual connection and then subjected to layer normalization (LN) again to obtain state three; state three is passed through a multilayer perceptron (MLP) with GELU activation to obtain state four; finally, state one, state three and state four are fused through residual connections to obtain multi-level feature maps; Step S4: Perform projection transformation on the collected drone panoramic image, and use the trained variable-scale sparse small target detection model to perform image-by-image target detection and inference on the transformed drone panoramic image to extract suspected illegal and irregular targets; Step S5: Based on the multi-view UAV images and their POS data, spatially locate the detected suspected illegal or irregular targets and obtain their geographic coordinates; Step S6: Integrate the low-altitude UAV panoramic intelligent early warning device and execute steps S1 to S5.
2. The low-altitude UAV panoramic intelligent early warning method for natural resource monitoring according to claim 1, characterized in that, The specific steps of step S1 are as follows: Step S1.1: Prepare multiple types of drones. The drones are equipped with cameras and integrated with IMU inertial measurement units, RTK-level GPS modules and wireless data transmission links to synchronously collect road images and POS data. Step S1.2: Input the panoramic data acquisition points and flight path planning, set the drone's flight parameters, and plan the drone's data acquisition flight path; Step S1.3: The UAV conducts panoramic data acquisition; after the UAV flies to the panoramic data acquisition point, it acquires multiple multi-view UAV images at the same location, and the UAV imaging chip or subsequent processing combines the multiple multi-view UAV images into a single UAV panoramic image. Step S1.4: Output and store multiple multi-view drone images and drone panoramic images; synchronously record the POS data corresponding to each image, and complete local and backup storage and data integrity verification.
3. The low-altitude UAV panoramic intelligent early warning method for natural resource monitoring according to claim 2, characterized in that, The specific steps of step S2 are as follows: Step S2.1: Driven by natural resource monitoring operations, classify and code suspected illegal and irregular targets; determine the list of categories of suspected illegal and irregular targets that need to be identified, and assign a unique code to each category; The category list includes at least the following 20 categories of targets: excavators, soil turners, dump trucks, transport vehicles, road rollers, bulldozers, pile drivers, cranes, mixers, material hoists, tower cranes, sheds and prefabricated houses, dust nets, scaffolding, brick houses under construction, piles of soil, piles of bricks, piles of steel bars, flames, and thick smoke. Step S2.2: Label suspected illegal and irregular activities driven by natural resource monitoring business to form a panoramic intelligent sample dataset of low-altitude UAVs for natural resource monitoring; Step S2.3: Divide the low-altitude UAV panoramic intelligent sample dataset for natural resource monitoring into a training set and a validation set in a 7:3 ratio; Used for training and evaluating models.
4. The low-altitude UAV panoramic intelligent early warning method for natural resource monitoring according to claim 3, characterized in that, Step S3 further includes the following steps: Step S3.2: Design a multi-scale feature fusion mechanism: Input multi-level feature maps, the neck network adopts a CSP structure, integrating a path aggregation network and a feature pyramid network; fuse the multi-level feature maps output by the backbone network through upsampling and downsampling operations to generate feature pyramids of three scales: 80×80 pixels, 40×40 pixels, and 20×20 pixels. Step S3.3: Model Training and Optimization: The variable-scale sparse small target detection model is trained end-to-end using the training set. Data augmentation strategy is adopted, and the model parameters are iteratively updated through the gradient descent algorithm until the loss converges, thus obtaining the trained variable-scale sparse small target detection model. Step S3.4: Model Validation and Deployment: After training, the performance of the variable-scale sparse small target detection model is evaluated using a test set. The metrics include precision, recall, mean precision, and frame rate. Finally, the weights of the optimized variable-scale sparse small target detection model are converted into a deployment format and integrated into embedded devices or cloud platforms to achieve real-time small target detection.
5. The low-altitude UAV panoramic intelligent early warning method for natural resource monitoring according to claim 4, characterized in that, The specific steps of step S4 are as follows: Step S4.1: Startup and configuration of the client application, run the PyQt client application; Step S4.2: Integrate and initialize the inference engine, load the trained variable-scale sparse small target detection model, and allocate memory space; Step S4.3: Preprocess the input drone panoramic image, including image scaling, normalization and noise suppression, call the inference engine to perform detection, and output the category of suspected illegal and irregular targets, target bounding box, confidence score and target polygon mask; Step S4.4: Result visualization and interactive management. The user terminal overlays the bounding boxes and polygon masks of suspected illegal and irregular natural resource targets onto the original image in real time, and simultaneously displays the list and statistical information of suspected illegal and irregular natural resource targets.
6. The low-altitude UAV panoramic intelligent early warning method for natural resource monitoring according to claim 5, characterized in that, In step S4.2, the inference engine is a TensorRT-optimized inference engine; in step S4.3, the image scaling size is 640×640 pixels.
7. The low-altitude UAV panoramic intelligent early warning method for natural resource monitoring according to claim 6, characterized in that the specific steps of step S5 are as follows: Step S5.1: Preprocess the multi-view drone images and drone panoramic images to obtain their metadata; the multi-view drone images refer to images taken by the drone from multiple angles in the air, and the drone's built-in stitching and fusion algorithm is used to form the drone panoramic image; Step S5.2: Based on the metadata, construct a pixel spatial adjacency graph between multi-view UAV images, specifically including: Step S5.2.1: Use the K-MEANS clustering algorithm to group the multi-view drone images according to the pitch angle value, the number of clusters being set according to the number of pitch angle groups in the actual drone panoramic image shooting; Step S5.2.2: Sort the grouped multi-view drone images in ascending order of yaw angle, within the range of [-180, 180]; Step S5.2.3: Align the adjacent grouped multi-view drone images according to the yaw angle and connect them in pairs to generate the pixel spatial adjacency graph; Step S5.3: Calculate the edge weights between adjacent images in the adjacency graph based on the LoFTR algorithm; the formula for calculating the edge weights is as follows: [formula image not reproduced]; where the formula relates the edge weight to the number of inlier points matched in the image pair; Step S5.4: Read the Euler angle information of adjacent multi-view UAV image pairs, establish a rotation matrix, and unify the multi-view UAV images to a positive orientation through homography transformation; the formula for calculating the rotation matrix is: [formula image not reproduced]; where the left-hand side of the formula is the perspective transformation matrix, and the three factors on the right-hand side are the rotation matrices for rotation by the corresponding Euler angle, in degrees, about each of the three coordinate axes; Step S5.5: Perform geospatial registration of the directly downward-looking (nadir) drone image in the multi-view drone image set with the high-resolution remote sensing image; Step S5.6: Based on the pixel spatial adjacency graph and the absolute pose of the registered nadir UAV image in the multi-view UAV image set, perform a minimum-weight path search and successively decompose and transfer the pose transformation relationship along the path, thereby completing the geospatial registration of all remaining multi-view UAV images with the high-resolution remote sensing image, and output the simulated orthorectified multi-view UAV image set; the formula for calculating the absolute pose of the multi-view UAV images is: [formula image not reproduced]; where the symbols in the formula denote the following: the absolute pose of the current multi-view drone image; the rotation matrix from the orthorectified multi-view drone image to the current multi-view drone image; the translation matrix from the orthorectified multi-view drone image to the current multi-view drone image; the rotation matrix from the orthorectified multi-view UAV image to the high-resolution remote sensing image; and the translation matrix from the orthorectified multi-view UAV image to the high-resolution remote sensing image; Step S5.7: Based on the pixel coordinates of the suspected illegal or irregular target on the drone panoramic image, traverse the multi-view drone images to locate the corresponding image, and then calculate the geographic coordinates of the target from the orthorectified multi-view drone image set using the registration information.
8. The low-altitude UAV panoramic intelligent early warning method for natural resource monitoring according to claim 7, characterized in that the specific steps of step S5.5 are as follows: Step S5.5.1: Based on the metadata of the nadir UAV image in the multi-view UAV image set, crop out the corresponding local high-resolution remote sensing image from the high-resolution remote sensing image; Step S5.5.2: Using the pose information of the nadir drone image in the multi-view drone image set, convert that image into an orthorectified multi-view drone image through perspective transformation; the calculation formula for the perspective transformation is: [formula image not reproduced]; where the symbols in the formula denote the following: the output of the perspective transformation; the perspective transformation function; the original nadir drone image in the multi-view drone image set that is to be transformed; the camera intrinsic parameter matrix; the inverse of the rotation matrix that transforms the view from the initial perspective to the BEV (bird's-eye view) perspective; and the rotation matrix that transforms the orthorectified multi-view UAV image to the high-resolution remote sensing image; Step S5.5.3: Use the LoFTR algorithm to perform image matching between the local high-resolution remote sensing image and the orthorectified multi-view UAV image to obtain initial matching point pairs; apply the inverse transformation to the initial matching point pairs to obtain matching point pairs between the orthorectified multi-view UAV image and the local high-resolution remote sensing image; then use the PyCOLMAP tool with these matching point pairs to solve for and correct the absolute pose of the orthorectified multi-view UAV image.
9. A panoramic intelligent early warning device for low-altitude unmanned aerial vehicles (UAVs) for natural resource monitoring, used to implement the panoramic intelligent early warning method for low-altitude UAVs for natural resource monitoring as described in any one of claims 1 to 8, characterized in that, The device (10) includes a labeling and training device (01) for performing sample labeling, model training and optimization; a monitoring device (02) for collecting real-time status data of the area to be inspected; a positioning device (03) for obtaining precise location information; a server (04) for communicating with the labeling and training device (01), the monitoring device (02), the positioning device (03) and the client (05) respectively, for receiving and processing data and issuing inspection tasks; a client (05) for displaying inspection status, early warning information and processing results to the user; a controller (11) for communicating with the server (04) for controlling the inspection operation of the drone (12); a nest (13) for providing parking and resupply support for the drone (12); and a drone (12) equipped with a camera for performing inspection tasks according to instructions and transmitting data back.