Unmanned aerial vehicle detection method and device for power line channel ground objects

By using a heterogeneous dual-backbone network architecture for parallel feature extraction and frequency domain fusion, the problem of limited receptive field and attenuation of edge details in the detection of ground features along power line corridors by convolutional neural networks is solved, thus achieving high-precision change detection.

CN122244731A · Pending · Publication Date: 2026-06-19 · WUYI UNIV

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
WUYI UNIV
Filing Date
2026-03-25
Publication Date
2026-06-19

AI Technical Summary

Technical Problem

Existing convolutional neural networks suffer from limited receptive field and attenuation of edge details in detecting ground feature changes along power line corridors, leading to target localization errors and coarse boundary recognition, making it difficult to improve the accuracy of ground feature change detection.

Method used

By adopting a heterogeneous dual-backbone network architecture, and through parallel feature extraction and frequency domain fusion, combined with global and local enhancement features, the accuracy of change detection is improved.

Benefits of technology

It enables multi-scale and multi-view change detection, effectively improving the accuracy of surface change detection, and can capture global contextual relationships and preserve detailed features.

✦ Generated by Eureka AI based on patent content.

Figure CN122244731A_ABST

Abstract

This disclosure presents a UAV-based method and apparatus for detecting ground objects along power line corridors. The method acquires a first sample image and a second sample image of a target area along the flight path at a first and a second time node, respectively. A dual-backbone network extracts features from the first and second sample images in parallel, yielding first and second image features. Frequency domain fusion is performed on these image features to obtain global enhancement features, local enhancement features, and frequency domain fusion features. These features are input into a decoder to obtain difference features, and a change detection model is trained based on the difference features. The trained change detection model is then used to perform change detection on a first and a second image to be processed. The multiple target difference features output by the decoder are summed pixel by pixel, and a binary change detection map is generated from the summation result, thus improving the accuracy of change detection.

Description

Technical Field

[0001] This disclosure relates to the field of unmanned aerial vehicle (UAV) remote sensing technology, and in particular to UAV detection methods and devices for power line corridor features.

Background Technology

[0002] In the protection of power facilities, detecting changes in transmission lines and surrounding features is a crucial prerequisite for ensuring the safe operation of the power grid. Remote sensing detection technology based on UAV aerial photography has become the mainstream solution in this field. With the development of deep learning, convolutional neural networks (CNNs) are widely used in feature change detection tasks, using CNNs to perform image recognition on aerial images to achieve change detection. However, the inherent locality of convolution operations results in a limited receptive field, making it difficult to effectively capture global contextual relationships even with multiple stacked layers. Furthermore, edge detail attenuation is a common problem during downsampling. After multiple convolution operations, fine information such as feature boundaries and contours is gradually lost, leading to target localization errors and coarse boundary recognition, severely restricting the overall accuracy improvement of surface change detection.

Summary of the Invention

[0003] The following is an overview of the subject matter described in detail in this disclosure. This overview is not intended to limit the scope of the claims.

[0004] This disclosure provides a UAV detection method for ground features along power line corridors, which effectively improves the accuracy of ground change detection.

[0005] In one aspect, this disclosure provides a UAV detection method for ground features along power line corridors, including:
A first sample image of the target area is acquired at a first time node along the flight path, and a second sample image of the target area is acquired at a second time node along the flight path, wherein the first time node is before the second time node;
The first backbone network and the second backbone network are invoked to perform parallel feature extraction on the first sample image and the second sample image, respectively, to obtain the first image feature of the first sample image and the second image feature of the second sample image;
The first image feature and the second image feature are stitched together to obtain a preprocessed feature; the preprocessed feature is decomposed into a first low-frequency component and a first high-frequency component; a long-range dependency is modeled based on the first low-frequency component to obtain a global enhancement feature; edge features are enhanced based on the first high-frequency component to obtain a local enhancement feature; and the global enhancement feature and the local enhancement feature are fused to obtain a frequency domain fusion feature, wherein the frequency domain fusion feature contains the difference information between the first image feature and the second image feature;
The local enhancement features, the global enhancement features, and the frequency domain fusion features are input into the decoder for decoding to obtain the difference features, and the change detection model is trained based on the difference features;
In response to the change detection request, the trained change detection model is invoked to perform change detection on the first image to be processed and the second image to be processed, the multiple target difference features output by the decoder are summed pixel by pixel, and a binary change detection map is generated based on the summation result.
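The final step, pixel-by-pixel summation followed by binarization, can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `binary_change_map`, the use of a sigmoid, and the 0.5 threshold are assumptions, since the text only states that the decoder outputs are summed and a binary map is generated from the sum.

```python
import math


def binary_change_map(diff_maps, threshold=0.5):
    """Sum several H x W difference maps pixel by pixel, then binarize.

    Assumption: the maps hold unnormalized scores, so the sum is squashed
    with a sigmoid before thresholding. The source does not specify this.
    """
    height, width = len(diff_maps[0]), len(diff_maps[0][0])
    result = []
    for r in range(height):
        row = []
        for c in range(width):
            summed = sum(m[r][c] for m in diff_maps)      # pixel-wise sum
            prob = 1.0 / (1.0 + math.exp(-summed))        # assumed sigmoid
            row.append(1 if prob > threshold else 0)      # 1 = changed
        result.append(row)
    return result


# Hypothetical outputs of the three decoder branches for a 2 x 2 image.
high = [[2.0, -1.0], [0.5, -3.0]]
low = [[1.0, -1.0], [-1.0, -1.0]]
fused = [[1.0, -2.0], [1.0, -1.0]]
change_map = binary_change_map([high, low, fused])
```

A pixel is marked as changed only when the three branches agree strongly enough in aggregate, which is why a branch with a weak negative score can be outvoted by the other two.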

[0006] In another aspect, this disclosure also provides a UAV detection device for ground features along power line corridors, including:
The image acquisition module, used to acquire a first sample image of the target area at a first time node along the flight path, and to acquire a second sample image of the target area at a second time node along the flight path, wherein the first time node is before the second time node;
The feature extraction module, used to call the first backbone network and the second backbone network to perform parallel feature extraction on the first sample image and the second sample image respectively, so as to obtain the first image feature of the first sample image and the second image feature of the second sample image;
The frequency domain fusion module, used to stitch the first image feature and the second image feature to obtain preprocessed features, decompose the preprocessed features into a first low-frequency component and a first high-frequency component, model long-range dependencies based on the first low-frequency component to obtain global enhancement features, enhance edge features based on the first high-frequency component to obtain local enhancement features, and fuse the global enhancement features and the local enhancement features to obtain frequency domain fusion features, wherein the frequency domain fusion features contain difference information between the first image feature and the second image feature;
The model training module, used to input the local enhancement features, the global enhancement features, and the frequency domain fusion features into the decoder for decoding to obtain the difference features, and to train the change detection model based on the difference features;
The change detection module, used to respond to change detection requests, call the trained change detection model to perform change detection on the first image to be processed and the second image to be processed, sum the multiple target difference features output by the decoder pixel by pixel, and generate a binary change detection map based on the summation result.

[0007] Furthermore, the feature extraction module is also used for: The first sample image is embedded with features to obtain image embedding features. The image embedding features are then split into first embedding features and second embedding features along the channel dimension. Global modeling is performed on the first embedded feature to obtain spatial features. Frequency domain transformation is performed on the second embedded feature to obtain frequency domain features. The spatial features and the frequency domain features are fused at the channel level to obtain the first global feature output by the second backbone network.

[0008] Furthermore, the feature extraction module is also used for: Wavelet decomposition is performed on the second embedded feature to obtain the second low-frequency component and the second high-frequency component; The second low-frequency component and the second high-frequency component are convolved to obtain mixed frequency features. The mixed frequency features are then subjected to inverse wavelet transform to obtain frequency domain features.

[0009] Furthermore, the frequency domain fusion module is also used for: Large kernel convolution feature extraction is performed on the first low frequency component to obtain the first intermediate feature. Global context modeling is then performed on the first intermediate feature to obtain the global enhanced feature. Convolutional feature extraction is performed on the first high-frequency component to obtain the second intermediate feature, and edge enhancement is performed on the second intermediate feature to obtain the local enhancement feature; The global enhancement features and the local enhancement features are added element-wise to obtain the frequency domain fusion features.

[0010] Furthermore, the model training module is also used for: The local enhancement features from the two sets of output features are input into the high-frequency branch, where an absolute difference operation yields the high-frequency difference features; The global enhancement features from the two sets of output features are input into the low-frequency branch, where an absolute difference operation yields the low-frequency difference features; The frequency domain fusion features from the two sets of output features are input into the fusion branch, where an absolute difference operation yields the fusion difference features.
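The absolute difference operation used in each decoder branch is simple enough to show directly. The sketch below assumes each feature is a single-channel H x W map stored as nested lists; the function name `abs_difference` is hypothetical, and a real decoder would operate on multi-channel tensors.

```python
def abs_difference(feat_a, feat_b):
    """Element-wise |a - b| between two feature maps of identical shape."""
    return [
        [abs(a - b) for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(feat_a, feat_b)
    ]


# Hypothetical local enhancement features from the two sets of outputs.
local_1 = [[1.0, -2.0], [0.0, 4.0]]
local_2 = [[0.5, 1.0], [0.0, 1.0]]
high_freq_diff = abs_difference(local_1, local_2)
```

The same function would be applied to the global enhancement features and to the frequency domain fusion features to obtain the low-frequency and fusion difference features.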

[0011] Furthermore, the model training module is also used for: The high-frequency loss is determined based on the high-frequency difference features and the real labels, the low-frequency loss is determined based on the low-frequency difference features and the real labels, and the fusion loss is determined based on the fusion difference features and the real labels. Weight coefficients are configured for the high-frequency loss, the low-frequency loss, and the fusion loss respectively. The high-frequency loss, the low-frequency loss, and the fusion loss are weighted and summed based on the weight coefficients to obtain the target loss. The change detection model is trained based on the target loss.
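The weighted three-branch loss can be sketched as below. The text does not name the per-branch loss, so binary cross-entropy is an assumption here, as are the function names `bce` and `target_loss`, the default weights, and the premise that each branch has already been mapped to per-pixel change probabilities.

```python
import math


def bce(pred, label, eps=1e-7):
    """Mean binary cross-entropy over flat lists of probabilities and 0/1 labels."""
    total = 0.0
    for p, y in zip(pred, label):
        p = min(max(p, eps), 1.0 - eps)   # clamp to avoid log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(pred)


def target_loss(high, low, fused, label, weights=(1.0, 1.0, 1.0)):
    """Weighted sum of the high-frequency, low-frequency and fusion losses."""
    w_high, w_low, w_fused = weights      # assumed weight coefficients
    return (w_high * bce(high, label)
            + w_low * bce(low, label)
            + w_fused * bce(fused, label))


label = [1, 0, 1, 0]
perfect = [1.0, 0.0, 1.0, 0.0]
uncertain = [0.5, 0.5, 0.5, 0.5]
loss = target_loss(uncertain, perfect, perfect, label, weights=(2.0, 1.0, 1.0))
```

Supervising each branch separately means the high-frequency and low-frequency paths both receive a direct training signal instead of relying only on the fused output.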

[0012] Furthermore, the change detection module is also used for: Initialize flight parameters, including area radius, number of path nodes, flight start point, and flight end point; Construct a virtual environment map, set obstacle areas in the virtual environment map, and determine the flight start point, flight end point, and path nodes; Construct a node evaluation function, and determine the node evaluation value of each node to be expanded based on the node evaluation function; Multiple target path nodes are determined based on node evaluation values, and a flight path is generated based on these multiple target path nodes.

[0013] Furthermore, the change detection module is also used for: Determine the actual cost from the flight start point to the current path node, the estimated cost from the current path node to the flight end point, the shooting angle cost of the shooting angle of the current path node, the safe distance cost between the current path node and obstacles, and the coverage cost of each path node. A node evaluation function is constructed based on the actual cost, the estimated cost, the shooting angle cost, the safe distance cost, and the coverage cost.

[0014] Furthermore, the change detection module is also used for: Arrange the node evaluation values of each node to be expanded in ascending order, determine the node to be expanded with the smallest node evaluation value as the target path node, and record the current path node as the parent node of the target path node. When the target path node is the flight end point and the coverage cost is less than the preset coverage cost threshold, the flight path is generated by tracing back from the flight end point.
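The path generation described in the preceding paragraphs resembles a best-first grid search in which the node evaluation value adds extra per-node costs to the usual actual and estimated costs. The following is a simplified sketch under stated assumptions: a 4-connected grid, unit step cost, a Manhattan-distance estimate, and a single hypothetical `extra_cost` callback standing in for the shooting angle, safe distance, and coverage costs. The coverage threshold check at termination is omitted.

```python
import heapq


def plan_path(grid, start, goal, extra_cost=None):
    """Best-first search on a grid where grid[r][c] == 1 marks an obstacle.

    Node evaluation value = actual cost + estimated cost + extra_cost(node).
    Returns the list of (row, col) nodes from start to goal, or None.
    """
    rows, cols = len(grid), len(grid[0])
    extra_cost = extra_cost or (lambda node: 0.0)

    def estimate(node):                       # estimated cost to the end point
        return abs(node[0] - goal[0]) + abs(node[1] - goal[1])

    open_heap = [(estimate(start) + extra_cost(start), start)]
    actual = {start: 0}                       # actual cost from the start point
    parent = {start: None}
    closed = set()
    while open_heap:
        _, current = heapq.heappop(open_heap)  # smallest evaluation value
        if current == goal:                    # trace back through parents
            path = []
            while current is not None:
                path.append(current)
                current = parent[current]
            return path[::-1]
        if current in closed:
            continue
        closed.add(current)
        r, c = current
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if not (0 <= nr < rows and 0 <= nc < cols):
                continue
            if grid[nr][nc] == 1 or nxt in closed:
                continue
            new_actual = actual[current] + 1
            if new_actual < actual.get(nxt, float("inf")):
                actual[nxt] = new_actual
                parent[nxt] = current          # record the parent node
                value = new_actual + estimate(nxt) + extra_cost(nxt)
                heapq.heappush(open_heap, (value, nxt))
    return None


def obstacle_penalty(grid, weight=0.5):
    """Hypothetical safe distance cost: penalize obstacles in the 3 x 3 neighborhood."""
    rows, cols = len(grid), len(grid[0])

    def cost(node):
        r, c = node
        near = sum(
            1
            for dr in (-1, 0, 1)
            for dc in (-1, 0, 1)
            if 0 <= r + dr < rows and 0 <= c + dc < cols
            and grid[r + dr][c + dc] == 1
        )
        return weight * near

    return cost


grid = [
    [0, 0, 0, 0],
    [1, 1, 1, 0],
    [0, 0, 0, 0],
]
path = plan_path(grid, (0, 0), (2, 0), extra_cost=obstacle_penalty(grid))
```

Because the extra costs are added to the evaluation value only, they steer which node is expanded first but do not change the recorded actual cost, so the traced-back path is always a valid sequence of adjacent free cells.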

[0015] The embodiments disclosed herein include at least the following beneficial effects: By acquiring a first sample image of the target area at a first time node along the flight path, and acquiring a second sample image of the target area at a second time node along the flight path, the first backbone network and the second backbone network are invoked to perform parallel feature extraction on the first sample image and the second sample image, respectively, to obtain the first image feature of the first sample image and the second image feature of the second sample image. By employing a heterogeneous dual backbone network architecture for feature extraction, the problem of easy loss of edge information can be effectively solved. Next, the first image feature and the second image feature are stitched together to obtain preprocessed features. The preprocessed features are decomposed into a first low-frequency component and a first high-frequency component. Long-range dependencies are modeled based on the first low-frequency component to obtain global enhanced features. Edge features are enhanced based on the first high-frequency component to obtain local enhanced features. The global enhanced features and local enhanced features are fused to obtain frequency domain fusion features. Differentiated processing is performed on the features output by different backbone networks, so that the high-frequency component can retain detailed features and the low-frequency component can capture a large range of spatial dependencies. Based on this, local enhancement features, global enhancement features, and frequency domain fusion features are input into the decoder for decoding to obtain difference features. A change detection model is trained based on the difference features. In response to change detection requests, the trained change detection model is called to perform change detection on the first and second images to be processed. 
The multiple target difference features output by the decoder are summed pixel by pixel. The decoder can realize multi-scale and multi-view change detection. A binary change detection map is generated based on the summation result, which effectively improves the accuracy of surface change detection.

[0016] Other features and advantages of this disclosure will be set forth in the following description and will be apparent in part from the description or may be learned by practicing this disclosure.

Attached Figure Description

[0017] The accompanying drawings are provided to further understand the technical solutions of this disclosure and constitute a part of the specification. They are used together with the embodiments of this disclosure to explain the technical solutions of this disclosure and do not constitute a limitation on the technical solutions of this disclosure.

[0018] Figure 1 is a schematic diagram of an optional overall structure of the change detection model provided in an embodiment of this disclosure;
Figure 2 is an optional flowchart of a UAV detection method for ground features along power line corridors provided in an embodiment of this disclosure;
Figure 3 is an optional schematic diagram of a spatial-frequency domain fusion module provided in an embodiment of this disclosure;
Figure 4 is an optional schematic diagram of a frequency domain fusion module provided in an embodiment of this disclosure;
Figure 5 is an optional flowchart illustrating the generation of a flight path provided in an embodiment of this disclosure;
Figure 6 is a schematic diagram of an optional overall process for a UAV detection method for ground features along power line corridors provided in an embodiment of this disclosure;
Figure 7 is a schematic diagram of an optional structure of a UAV detection device for power line corridor features provided in an embodiment of this disclosure.

Detailed Implementation

[0019] To make the objectives, technical solutions, and advantages of this disclosure clearer, the following detailed description is provided in conjunction with the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative and are not intended to limit the scope of this disclosure.

[0020] It should be noted that in the various specific embodiments of this disclosure, when processing is required based on data related to the characteristics of the target object, such as target object attribute information or a set of attribute information, the permission or consent of the target object will be obtained first. Furthermore, the collection, use, and processing of this data will comply with relevant laws, regulations, and standards. The target object can be a user. In addition, when embodiments of this disclosure require obtaining target object attribute information, separate permission or consent from the target object will be obtained through pop-ups or redirection to a confirmation page. Only after obtaining the target object's separate permission or consent will the necessary target object-related data for the normal operation of the embodiments of this disclosure be obtained.

[0021] In this disclosure, the terms "module" or "unit" refer to a computer program or part of a computer program that has a predetermined function and works with other related parts to achieve a predetermined goal, and can be implemented wholly or partially using software, hardware (such as processing circuitry or memory), or a combination thereof. Similarly, a processor (or multiple processors or memory) can be used to implement one or more modules or units. Furthermore, each module or unit can be part of an overall module or unit that includes the functionality of that module or unit.

[0022] In terms of power facility protection, monitoring changes in transmission lines and surrounding features is a crucial prerequisite for ensuring the safe operation of the power grid. However, with urban expansion and land scarcity, illegal encroachment within power line corridors is becoming increasingly serious, including unauthorized construction and vegetation overgrowth. These violations can easily lead to significant hazards such as electric shock, short circuits, and fires, threatening the safe operation of the power grid. Therefore, timely detection and handling of such violations can effectively maintain the stable operation of the power system.

[0023] In existing technologies, surface change detection can be broadly categorized into three types. The first type involves manual on-site inspections, where workers patrol along power line corridors, relying on visual assessment to determine encroachment. This method is inefficient and struggles to cover large areas. The second type utilizes drone aerial photography, where cameras mounted on aircraft transmit images back to workers for manual identification. This method is an improvement over the first, but still suffers from subjectivity, fatigue, and oversight, particularly in its ability to identify small targets and subtle changes. The third type involves remote sensing image analysis, which uses image processing and pattern recognition techniques to extract change features from image data collected at different times. Compared to the first two methods, remote sensing offers wider coverage, is not constrained by terrain, acquires data rapidly, has a short update cycle, and allows for quick comparison of temporal changes in the same area, demonstrating significant advantages in terms of information volume and timeliness.

[0024] With the development of deep learning, its introduction has significantly improved the detection accuracy and automation of remote sensing image analysis methods. Convolutional neural network (CNN)-based methods are widely used due to their powerful feature extraction capabilities; however, the inherent locality of convolution operations limits their receptive field, making it difficult to effectively capture long-distance spatial dependencies even with multi-layer stacking. The Transformer architecture enhances its global modeling capabilities through self-attention mechanisms, but the computational cost increases quadratically with sequence length, resulting in huge resource overhead when dealing with high-resolution images. Emerging state-space models achieve linear complexity modeling through selective mechanisms, significantly reducing computational costs while maintaining global modeling capabilities; however, existing schemes do not adequately utilize the spatial structure of images and do not deeply explore frequency domain features. Furthermore, mainstream methods generally suffer from edge detail attenuation during downsampling. After multiple convolution operations, fine information such as ground feature boundaries and contours is gradually lost, leading to target localization errors and coarse boundary recognition, severely restricting the overall accuracy of surface change detection.

[0025] Based on this, the present disclosure provides a UAV detection method for ground features along power line corridors, which effectively improves the accuracy of change detection.

[0026] The UAV detection method for power line corridor features provided in this embodiment is implemented using a change detection model. Referring to Figure 1, which is a schematic diagram of an optional overall structure of the change detection model provided in this embodiment, the change detection model includes a heterogeneous dual-backbone module, a frequency domain fusion module, and a decoder. The heterogeneous dual-backbone module includes a first backbone network and a second backbone network, and the network architectures of the first backbone network and the second backbone network are different. The first backbone network includes an embedding layer and multiple convolutional inverted residual modules, and the second backbone network includes an embedding layer and multiple spatial-frequency domain fusion modules. The decoder is a three-branch decoder, with each branch used to process a different type of feature.

[0027] Referring to Figure 2, which is an optional flowchart of a UAV detection method for ground features along power line corridors provided in an embodiment of the present disclosure, the method includes, but is not limited to, the following steps S201 to S205.

[0028] Step S201: Acquire the first sample image of the target area at the first time node along the flight path, and acquire the second sample image of the target area at the second time node along the flight path.

[0029] The flight path is the actual flight trajectory of the UAV, a continuous spatial curve connecting the flight start point, flight end point, and waypoints. The first and second time points are the image acquisition times, with the first time point preceding the second time point. The target area is the region where the image is acquired. There is at least one target area along the flight path, and each target area includes at least one instance object. For example, if the flight path is a power line corridor, then the target area could be at least one specific area within the power line corridor, and the instance objects within the target area could be buildings, trees, signs, roads, etc.

[0030] Specifically, a first sample image of the target area is acquired at the first time point along the flight path, and a second sample image of the target area is acquired at the second time point along the same flight path, with the flight altitude and angle being basically the same for both acquisitions. Next, the acquired first and second sample images are manually labeled to mark the areas of change between the two time points. These areas of change include newly added buildings, such as unauthorized constructions or house extensions; trees encroaching on safety distances; newly added non-compliant facilities, such as billboards, temporary buildings, or stockpiled materials; and ground changes, such as ground hardening or excavation. The first and second sample images of the same target area are used as sample pairs to construct a ground feature change detection dataset.

[0031] Next, the change detection dataset is preprocessed to improve data quality. The first step is image registration, where a registration model is used to detect and match feature points in the first and second sample images. The registration model can be implemented using at least one of the SIFT or ORB algorithms. An affine transformation is then applied for geometric correction, so that the first and second sample images are precisely aligned and pixel-level registration accuracy is achieved. The second step is image denoising, using Gaussian filtering to remove noise from both the first and second sample images. The third step is image enhancement, employing histogram equalization to improve image contrast and bring out detail. The fourth step is data augmentation, using random cropping, random rotation, random flipping, and color jitter to expand the training samples and improve the generalization ability of the change detection model. Specifically, the cropping size of the random cropping can be 256×256 pixels; the random rotation angle can be at least one of 0°, 90°, 180°, and 270°; the random flip can be at least one of horizontal flipping or vertical flipping; and the color jitter can be at least one of adjusting brightness, adjusting contrast, and adjusting saturation.
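One practical point in the augmentation step is that the geometric transforms must be identical for both images of a pair and for the label, otherwise the change mask no longer lines up. The sketch below illustrates this with single-channel images stored as nested lists. The function names and the small default crop size are assumptions for illustration, and color jitter is left out because it would apply to the images only, not to the label.

```python
import random


def rotate90(img):
    """Rotate a 2-D list by 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]


def augment_pair(img_a, img_b, label, crop=2, rng=None):
    """Apply one shared random crop, rotation and flip to both images and the label."""
    rng = rng or random.Random()
    height, width = len(img_a), len(img_a[0])
    top = rng.randint(0, height - crop)       # shared crop origin
    left = rng.randint(0, width - crop)
    turns = rng.choice([0, 1, 2, 3])          # 0, 90, 180 or 270 degrees
    flip_h = rng.random() < 0.5
    flip_v = rng.random() < 0.5

    def apply(img):
        out = [row[left:left + crop] for row in img[top:top + crop]]
        for _ in range(turns):
            out = rotate90(out)
        if flip_h:
            out = [row[::-1] for row in out]  # horizontal flip
        if flip_v:
            out = out[::-1]                   # vertical flip
        return out

    return apply(img_a), apply(img_b), apply(label)


image = [[r * 4 + c for c in range(4)] for r in range(4)]
aug_a, aug_b, aug_label = augment_pair(image, image, image, crop=2,
                                       rng=random.Random(0))
```

With the 256×256 crop size mentioned in the text, the call would pass `crop=256` on full-resolution sample images.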

[0032] Step S202: Call the first backbone network and the second backbone network to perform parallel feature extraction on the first sample image and the second sample image respectively, to obtain the first image feature of the first sample image and the second image feature of the second sample image.

[0033] Specifically, the feature extraction process in step S202 is performed by a dual-branch backbone network. The first backbone network is a branch backbone network based on a convolutional neural network. It uses an inverted residual block structure to extract features from the first and second sample images, focusing mainly on high-frequency detail information and local texture information. Specifically, it can be a MobileNetV2 network. For the input first and second sample images, the first backbone network extracts the local detail features and texture information of each image through layer-by-layer convolution operations, and outputs three multi-scale features with different resolutions. For example, taking the feature dimensions of the first and second sample images as H×W×3, the feature dimensions of its three multi-scale outputs can be …, … and …. By extracting high-frequency detail information and local texture features, the first backbone network can effectively capture detailed changes such as edges or contours of instance objects in the target area.

[0034] The second backbone network is a frequency-domain guided branch backbone network, primarily targeting change detection tasks. In one possible implementation, when the second backbone network extracts features from the first sample image, the first sample image is input into the embedding layer and feature embedding is performed on it to obtain image embedding features. The image embedding features are then input sequentially into multiple spatial-frequency domain fusion modules for spatial-frequency domain information fusion. Referring to Figure 3, which is an optional schematic diagram of a spatial-frequency domain fusion module provided in an embodiment of the present disclosure, within the spatial-frequency domain fusion module the image embedding feature $X$ is split along the channel dimension into a first embedded feature $X_1$ and a second embedded feature $X_2$. The splitting process can be represented by the following formula, where $\mathrm{Split}(\cdot)$ is the feature splitting function, $C_1$ is the number of channels of the first embedded feature $X_1$, and $C_2$ is the number of channels of the second embedded feature $X_2$.

[0035] $$(X_1, X_2) = \mathrm{Split}(X)$$

[0036] Next, in the upper branch, global modeling is performed on the first embedded feature using a four-way cross-scan mechanism, and the feature size is then dynamically adjusted through the baseline size parameters to obtain spatial features. The four-way cross-scanning mechanism includes four scanning directions: forward raster scan, column-first scan, reverse raster scan, and reverse column-first scan. Through the four-way cross-scanning mechanism, complementary contextual information can be captured from multiple spatial directions, comprehensively modeling long-range dependencies in two-dimensional space. The global modeling process can be represented by the following formula, where SS2D denotes the four-way cross-scanning mechanism.

[0037]
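To make the four scanning directions concrete, the sketch below lists the order in which the positions of an H x W feature map would be visited in each direction. It shows only the traversal orders; the sequence modeling applied along each order is not shown, and the function name `four_way_scans` is hypothetical.

```python
def four_way_scans(height, width):
    """Return the (row, col) visiting order for each of the four scan directions."""
    row_major = [(r, c) for r in range(height) for c in range(width)]
    col_major = [(r, c) for c in range(width) for r in range(height)]
    return {
        "forward_raster": row_major,              # left to right, top to bottom
        "column_first": col_major,                # top to bottom, left to right
        "reverse_raster": row_major[::-1],        # forward raster, reversed
        "reverse_column_first": col_major[::-1],  # column-first, reversed
    }


scans = four_way_scans(2, 3)
```

Each order flattens the same two-dimensional grid into a different one-dimensional sequence, so a position that is far from its vertical neighbor in raster order becomes adjacent to it in column-first order.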

[0038] Next, in the lower branch, a frequency domain transformation is performed on the second embedded feature to obtain frequency domain features. Finally, the spatial features and the frequency domain features are fused at the channel level to obtain the first global feature output by the second backbone network. The fusion process can be represented by the following formula.

[0039]

[0040] Finally, the second backbone network outputs three multi-scale features at different resolutions. Note that the feature dimensions of the multi-scale features output by the second backbone network are the same as those output by the first backbone network. For example, taking the feature dimension of the first sample image as H×W×3, the feature dimensions of its three multi-scale outputs can be …, … and …. Because it models global context and long-range dependencies at linear complexity, the second backbone network can effectively capture the linear structural features of power line corridors along the flight path.

[0041] It can be understood that the second backbone network extracts features from the second sample image in the same way as described above, so the process is not repeated here.

[0042] In one possible implementation, when the frequency domain transformation is performed on the second embedded feature to obtain the frequency domain feature, referring again to Figure 3, the second embedded feature $X_2$ is input into the first transform module and processed with the discrete wavelet transform (DWT). The discrete wavelet transform can be represented as $(X_L, X_H) = \mathrm{DWT}(X_2)$, which yields the second low-frequency component $X_L$ and the second high-frequency component $X_H$. Convolution is then applied to the second low-frequency component and the second high-frequency component to enhance change-sensitive features such as edges and textures, resulting in mixed frequency features. These mixed frequency features are input into the second transform module for the inverse discrete wavelet transform (IDWT) to restore spatial resolution, yielding frequency domain features with the same feature dimensions as the second embedded feature. The entire frequency domain transformation process can be represented by the following formula.

[0043]
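The DWT, enhancement, and IDWT sequence described in [0042] can be illustrated with a minimal pure-Python sketch. The disclosure does not name the wavelet, so a single-level 2D Haar transform is assumed here, and the learned convolution is replaced by a hypothetical scalar gain `hf_gain` on the high-frequency sub-bands. The function names are illustrative only.

```python
def haar_dwt2(x):
    """Single-level 2D Haar DWT of a 2D list with even height and width.

    Returns (LL, LH, HL, HH), each with half the height and width of x.
    """
    ll, lh, hl, hh = [], [], [], []
    for i in range(0, len(x), 2):
        rows = ([], [], [], [])
        for j in range(0, len(x[0]), 2):
            a, b = x[i][j], x[i][j + 1]
            c, d = x[i + 1][j], x[i + 1][j + 1]
            rows[0].append((a + b + c + d) / 2.0)  # low-pass in both directions
            rows[1].append((a - b + c - d) / 2.0)  # difference across columns
            rows[2].append((a + b - c - d) / 2.0)  # difference across rows
            rows[3].append((a - b - c + d) / 2.0)  # diagonal difference
        ll.append(rows[0])
        lh.append(rows[1])
        hl.append(rows[2])
        hh.append(rows[3])
    return ll, lh, hl, hh


def haar_idwt2(ll, lh, hl, hh):
    """Inverse of haar_dwt2: restores the original spatial resolution."""
    h, w = len(ll), len(ll[0])
    out = [[0.0] * (2 * w) for _ in range(2 * h)]
    for i in range(h):
        for j in range(w):
            s, p, q, r = ll[i][j], lh[i][j], hl[i][j], hh[i][j]
            out[2 * i][2 * j] = (s + p + q + r) / 2.0
            out[2 * i][2 * j + 1] = (s - p + q - r) / 2.0
            out[2 * i + 1][2 * j] = (s + p - q - r) / 2.0
            out[2 * i + 1][2 * j + 1] = (s - p - q + r) / 2.0
    return out


def frequency_branch(x, hf_gain=1.0):
    """DWT -> scale high-frequency sub-bands (stand-in for conv) -> IDWT."""
    ll, lh, hl, hh = haar_dwt2(x)
    scale = lambda band: [[hf_gain * v for v in row] for row in band]
    return haar_idwt2(ll, scale(lh), scale(hl), scale(hh))
```

With `hf_gain=1.0` the branch reproduces its input, which shows that the output keeps the same dimensions as the input feature. A gain above 1 amplifies edge and texture content, which is only a rough analogue of the learned convolution in the disclosure.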

[0044] Step S203: The first image features and the second image features are stitched together to obtain preprocessed features. The preprocessed features are decomposed into a first low-frequency component and a first high-frequency component. Long-range dependence is modeled based on the first low-frequency component to obtain global enhancement features. Edge features are enhanced based on the first high-frequency component to obtain local enhancement features. The global enhancement features and local enhancement features are fused to obtain frequency domain fusion features.

[0045] The process of step S203 is executed in the frequency domain fusion module.

[0046] Specifically, refer to Figure 4, which is an optional schematic diagram of the frequency domain fusion module provided in this embodiment. After the first image features and the second image features are stitched together, the stitching result is preprocessed using a 1×1 convolution to unify the channel dimensions, resulting in preprocessed features. The preprocessing can be represented by the following formula, whose terms denote the first image feature, the second image feature, and the stitching (concatenation) operation.

[0047]

[0048] Next, the preprocessed features are input into the first transform module for discrete wavelet transform, which decomposes them into the first low-frequency component and the first high-frequency component. The discrete wavelet transform can be expressed by the following formula.

[0049]

[0050] For the first low-frequency component, a 7×7 large-receptive-field convolution is used for feature extraction to obtain the first intermediate feature. This first intermediate feature is then input into the global context module to model long-range dependencies, resulting in the global enhancement feature. The processing of the first low-frequency component can be represented by the following formula, in which GCB denotes the processing of the global context module.

[0051]

[0052] For the first high-frequency component, the first high-frequency components are concatenated, and features are extracted from the concatenation result to obtain the second intermediate feature. The Sobel operator is then used to enhance the edges of the second intermediate feature to obtain the local enhancement feature. The processing of the first high-frequency component can be represented by the following formula.

[0053]
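The Sobel-based edge enhancement in [0052] can be sketched as below. The disclosure states only that the Sobel operator enhances the edges of the second intermediate feature. Adding the gradient magnitude back onto the feature with a hypothetical weight `alpha`, and replicating border pixels, are assumptions made for illustration.

```python
import math

SOBEL_X = ((-1, 0, 1), (-2, 0, 2), (-1, 0, 1))
SOBEL_Y = ((-1, -2, -1), (0, 0, 0), (1, 2, 1))


def sobel_magnitude(img):
    """Gradient magnitude of a 2D list using 3x3 Sobel kernels.

    Border pixels are handled by clamping indices (edge replication).
    """
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            gx = gy = 0.0
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    y = min(max(i + di, 0), h - 1)
                    x = min(max(j + dj, 0), w - 1)
                    gx += SOBEL_X[di + 1][dj + 1] * img[y][x]
                    gy += SOBEL_Y[di + 1][dj + 1] * img[y][x]
            out[i][j] = math.hypot(gx, gy)
    return out


def enhance_edges(img, alpha=1.0):
    """Assumed enhancement rule: feature + alpha * Sobel gradient magnitude."""
    mag = sobel_magnitude(img)
    return [[v + alpha * m for v, m in zip(row, mrow)]
            for row, mrow in zip(img, mag)]
```

On a feature map containing a vertical step, the response is non-zero only in the columns adjacent to the step, which is the behaviour the local enhancement branch relies on to sharpen boundaries.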

[0054] Finally, the global enhancement features and the local enhancement features are each upsampled so that their spatial resolution matches that of the first and second image features. The resulting low-frequency enhancement features and high-frequency enhancement features are then added element-wise to obtain the frequency domain fusion features.

[0055] It should be noted that the frequency domain fusion module outputs three types of features: frequency domain fusion features, low-frequency enhancement features, and high-frequency enhancement features. The frequency domain fusion features contain a complete spatial-frequency domain feature representation. The high-frequency enhancement features contain local features that are highly sensitive to changes in details such as building outlines and tree edges. The low-frequency enhancement features contain global features that capture large-scale land use changes and overall structural changes. Outputting these three types of features makes it possible to capture local detail changes and global semantic changes simultaneously, and to achieve multi-scale information complementarity through the fusion features, thereby improving the robustness of the change detection model in complex environments.

[0056] Step S204: Input the local enhancement features, global enhancement features, and frequency domain fusion features into the decoder for decoding to obtain the difference features, and train the change detection model based on the difference features.

[0057] Here, the frequency domain fusion features, local enhancement features, and global enhancement features contain difference information between the first image features and the second image features at different levels of granularity.

[0058] In one possible implementation, the change detection model performs two frequency fusion processes in parallel to obtain two independent sets of output features, corresponding to the first and second sample images. Each set of output features includes local enhancement features, global enhancement features, and frequency domain fusion features. The decoder includes a high-frequency branch, a low-frequency branch, and a fusion branch. When the local enhancement features, global enhancement features, and frequency domain fusion features are input into the decoder for decoding, the local enhancement features from the two sets of output features are input into the high-frequency branch for the absolute difference operation to obtain high-frequency difference features; the global enhancement features from the two sets of output features are input into the low-frequency branch for the absolute difference operation to obtain low-frequency difference features; and the frequency domain fusion features from the two sets of output features are input into the fusion branch for the absolute difference operation to obtain fused difference features.

[0059] In any branch, the corresponding features from the two sets of output features are input into the temporal difference module, where the absolute difference operation calculates the feature difference between the first sample image and the second sample image. The calculation can be expressed by the following formula, whose terms denote the difference feature, the output feature corresponding to the first sample image, and the output feature corresponding to the second sample image.

[0060]

[0061] Next, the high-frequency difference features, low-frequency difference features, and fused difference features obtained are refined by convolution, and then the spatial resolution of the refined features is restored to the same resolution as the first sample image and the second sample image by upsampling.
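As a minimal sketch of one decoder branch as described in [0059] and [0061], the following computes the element-wise absolute difference between the two temporal features and then upsamples the result. The convolutional refinement is omitted, and nearest-neighbour upsampling is an assumption, since the disclosure does not name the interpolation method.

```python
def abs_difference(feat_t1, feat_t2):
    """Element-wise |F1 - F2| for two 2D feature maps of equal size."""
    return [[abs(a - b) for a, b in zip(r1, r2)]
            for r1, r2 in zip(feat_t1, feat_t2)]


def upsample_nearest(feat, factor):
    """Nearest-neighbour upsampling by an integer factor (assumed method)."""
    out = []
    for row in feat:
        wide = [v for v in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


def decoder_branch(feat_t1, feat_t2, factor):
    """Temporal difference followed by resolution restoration."""
    return upsample_nearest(abs_difference(feat_t1, feat_t2), factor)
```

The same function would be applied three times, once each to the local enhancement, global enhancement, and frequency domain fusion features of the two images.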

[0062] In one possible implementation, the change detection model is trained on the difference features as follows. The high-frequency loss is obtained by calculating the cross-entropy loss between the high-frequency difference features and the ground-truth label; the low-frequency loss is obtained by calculating the cross-entropy loss between the low-frequency difference features and the ground-truth label; and the fusion loss is obtained by calculating the cross-entropy loss between the fused difference features and the ground-truth label. Weight coefficients are assigned to the high-frequency loss, low-frequency loss, and fusion loss respectively, and the three losses are weighted and summed based on these coefficients to obtain the target loss. The change detection model is then trained based on the target loss. The target loss can be expressed by the following formula.
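A minimal sketch of this weighted sum is given below. It assumes that each branch outputs a per-pixel change probability and that the cross-entropy is the binary form averaged over pixels. Both points, and all function and parameter names, are assumptions rather than details taken from the disclosure.

```python
import math


def binary_cross_entropy(prob_map, label_map, eps=1e-7):
    """Mean binary cross-entropy between a probability map and 0/1 labels."""
    total, count = 0.0, 0
    for p_row, g_row in zip(prob_map, label_map):
        for p, g in zip(p_row, g_row):
            p = min(max(p, eps), 1.0 - eps)  # avoid log(0)
            total += -(g * math.log(p) + (1 - g) * math.log(1.0 - p))
            count += 1
    return total / count


def target_loss(p_high, p_low, p_fused, label,
                w_high=1.0, w_low=1.0, w_fused=1.0):
    """Weighted sum of the high-frequency, low-frequency, and fusion losses.

    The default weights of 1 follow the embodiment described in the text.
    """
    l_high = binary_cross_entropy(p_high, label)
    l_low = binary_cross_entropy(p_low, label)
    l_fused = binary_cross_entropy(p_fused, label)
    return w_high * l_high + w_low * l_low + w_fused * l_fused
```

Supervising all three branches against the same label pushes the high-frequency and low-frequency branches to be individually predictive instead of relying only on the fused output.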

[0063] In the target loss formula, g represents the ground-truth label, and the remaining terms denote the high-frequency difference feature, the low-frequency difference feature, the fused difference feature, the high-frequency loss, the low-frequency loss, the fusion loss, and the weighting coefficients of the high-frequency loss, the low-frequency loss, and the fusion loss. In this embodiment of the disclosure, the three weighting coefficients are all set to 1.

Step S205: In response to the change detection request, the trained change detection model is invoked to perform change detection on the first image to be processed and the second image to be processed. The multiple target difference features output by the decoder are summed pixel by pixel, and a binary change detection map is generated based on the summation result.

[0064] Specifically, in response to a change detection request, the system acquires a first image to be processed, captured at a historical time point, and a second image to be processed, captured at the current time point, both along the same flight path. The trained change detection model is then invoked to perform change detection on the two images. After dual-backbone feature extraction and frequency domain fusion, the high-frequency difference features, low-frequency difference features, and fused difference features output by the decoder are summed pixel by pixel. The maximum value of the summation result is then taken to generate a binary change detection map. This binary change detection map consists of 0s and 1s: a pixel value of 1 represents a detected change area, and a pixel value of 0 represents a no-change area.
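The summation and binarization step can be sketched as follows. The phrase "the maximum value of the summation result is taken" is interpreted here as a per-pixel argmax over two class scores (no change, change). This interpretation, and the `[H][W][2]` layout of each difference feature, are assumptions.

```python
def binary_change_map(diff_high, diff_low, diff_fused):
    """Sum three branch outputs pixel by pixel and take the larger class.

    Each input is a nested list of shape [H][W][2] holding the scores for
    (no change, change) at every pixel.
    """
    result = []
    for row_h, row_l, row_f in zip(diff_high, diff_low, diff_fused):
        out_row = []
        for s_h, s_l, s_f in zip(row_h, row_l, row_f):
            no_change = s_h[0] + s_l[0] + s_f[0]
            change = s_h[1] + s_l[1] + s_f[1]
            out_row.append(1 if change > no_change else 0)
        result.append(out_row)
    return result
```

Summing before the argmax lets a confident branch outvote two uncertain ones, which is one plausible reason for combining the three outputs this way.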

[0065] Next, the binarized change detection map undergoes post-processing. Morphological operations are used to extract independent change regions through connected component analysis, filtering out regions with areas smaller than a threshold. Finally, a change detection report is generated based on the analysis results. The report includes information such as the GPS coordinates of the change region, the type of change, the distance from the power line, and the risk level. Based on the change detection report, it is determined whether any non-compliant situations have occurred in the corresponding areas, and corresponding handling plans are formulated to promptly eliminate safety hazards.
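The region extraction and area filtering in [0065] can be sketched with a breadth-first connected component search. Four-connectivity is an assumption, and the morphological operations and the report fields (GPS coordinates, change type, distance from the power line, risk level) are left out because the disclosure gives no detail on how they are computed.

```python
from collections import deque


def filter_small_regions(mask, min_area):
    """Keep only connected regions of 1s whose pixel count is >= min_area.

    Returns (filtered_mask, regions), where regions is a list of pixel lists.
    """
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    filtered = [[0] * w for _ in range(h)]
    regions = []
    for i in range(h):
        for j in range(w):
            if mask[i][j] != 1 or seen[i][j]:
                continue
            queue = deque([(i, j)])
            seen[i][j] = True
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < h and 0 <= nx < w
                            and mask[ny][nx] == 1 and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if len(pixels) >= min_area:
                regions.append(pixels)
                for y, x in pixels:
                    filtered[y][x] = 1
    return filtered, regions
```

Each surviving region's pixel list could then feed the report, for example by converting its centroid to GPS coordinates using the flight metadata.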

[0066] In one possible implementation, before the first sample image of the target area is acquired along the flight path at the first time point, the UAV's flight path must first be determined. Refer to Figure 5, which is a schematic diagram of an optional process for generating a flight path according to an embodiment of this disclosure. The first step is to initialize flight parameters, including the region radius, number of path nodes, flight start point, flight end point, number of iterations, safety distance threshold, and optimal shooting angle range. The number of path nodes can be the number of power poles along the power line, the safety distance threshold can be determined based on the voltage level, and the optimal shooting angle range can specifically be a downward angle of 45° to 60°. The second step is to construct a virtual environment map, set obstacle areas and safety distance buffer zones in the virtual environment map, and determine the flight start point, flight end point, and path nodes. The obstacle areas can be power poles, buildings, trees, etc.; the safety distance buffer zone is the area extending to both sides of the power line; and the path nodes are nodes distributed along the power line corridor. The third step is to construct a node evaluation function and determine the node evaluation value of each node to be expanded based on the node evaluation function. The fourth step is to determine multiple target path nodes based on the node evaluation values and generate a flight path based on these target path nodes.

[0067] In one possible implementation, the node evaluation function is constructed as follows. The actual cost is determined based on the flight distance from the flight start point to the current path node, and the estimated cost is determined based on the Manhattan distance or Euclidean distance between the current path node and the flight end point. The shooting angle cost is determined based on the actual shooting angle and the optimal shooting angle at the current path node.

[0068] The shooting angle cost depends on the actual shooting angle of the current path node n, the optimal shooting angle, and the maximum angular deviation; this embodiment sets the maximum angular deviation to 30°. The safe distance cost is determined based on the distance between the current path node and the nearest obstacle.

[0069] The safe distance cost depends on the distance from the current path node n to the nearest obstacle and the safe distance threshold. The coverage cost of the flight path is determined based on the currently covered safe distance range and the total safe distance range.

[0070] The coverage cost depends on the area of the covered safe distance range and the area of the total safe distance range.

[0071] Next, cost weights are assigned to the shooting angle cost, safety distance cost, and coverage cost. Based on these cost weights, the actual cost, estimated cost, shooting angle cost, safety distance cost, and coverage cost are weighted and summed to obtain the node evaluation function. The node evaluation function can be expressed by the following formula:

[0072] In this formula, the terms denote the actual cost, the estimated cost, the cost weight of the shooting angle cost, the cost weight of the safe distance cost, and the cost weight of the coverage cost.
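The node evaluation function described in [0067] to [0072] can be sketched as below. Only the overall structure (actual cost plus estimated cost plus three weighted penalty terms) and the 30° maximum angular deviation come from the text. The normalised form of each individual cost, the default weights of 1, and all names are assumptions for illustration.

```python
def angle_cost(theta, theta_opt, theta_max=30.0):
    """Assumed form: angular deviation normalised by the maximum deviation."""
    return min(abs(theta - theta_opt) / theta_max, 1.0)


def safety_cost(dist_to_obstacle, d_safe):
    """Assumed form: zero beyond the safe distance, rising to 1 at contact."""
    return max(0.0, (d_safe - dist_to_obstacle) / d_safe)


def coverage_cost(covered_area, total_area):
    """Assumed form: fraction of the safe distance range not yet covered."""
    return 1.0 - covered_area / total_area


def node_evaluation(g, h, c_angle, c_safe, c_cov,
                    w_angle=1.0, w_safe=1.0, w_cov=1.0):
    """f(n) = g(n) + h(n) + weighted angle, safety, and coverage costs."""
    return g + h + w_angle * c_angle + w_safe * c_safe + w_cov * c_cov
```

For example, a node with actual cost 10, estimated cost 5, a 7.5° angular deviation, half the safe distance to the nearest obstacle, and a quarter of the corridor covered evaluates to 16.5 under these assumed forms.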

[0073] By constructing node evaluation functions from multiple dimensions, this approach breaks through the limitations of traditional evaluation functions that only focus on path length. It achieves an improvement from geometric shortest to comprehensive optimization, ensuring shooting quality, safe operation, and coverage integrity while taking into account path efficiency. This effectively enhances the robustness and scene adaptability of path planning, providing a more accurate and comprehensive evaluation basis for intelligent decision-making in complex task scenarios.

[0074] In one possible implementation, the nodes to be expanded are the nodes surrounding the current path node. The target path nodes are determined based on the node evaluation values, and the flight path is generated from them, as shown in Figure 5: the nodes to be expanded around the current path node are acquired and stored in a priority queue; the node evaluation value of each node is calculated; the evaluation values are arranged in ascending order; the node with the lowest evaluation value is identified as the target path node; and the current path node is recorded as the parent node of the target path node. The coverage cost is then calculated based on the node with the lowest evaluation value. When the covered safe distance area reaches a preset coverage threshold and the target path node indicates the flight end point, the flight path is generated by backtracking sequentially from the flight end point.
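The expand, rank, record-parent, and backtrack loop in [0074] resembles an A*-style best-first search. The sketch below runs it on a small occupancy grid with a priority queue from the standard library. The grid representation, 4-neighbour expansion, unit step cost, and the `extra_cost` callback (a stand-in for the weighted angle, safety, and coverage terms) are assumptions, and the coverage threshold condition is omitted for brevity.

```python
import heapq


def plan_path(grid, start, goal, extra_cost=lambda node: 0.0):
    """Best-first search on a 2D grid where 1 marks an obstacle cell.

    Returns the list of (row, col) nodes from start to goal, or None.
    """
    h, w = len(grid), len(grid[0])

    def heuristic(node):
        # Manhattan distance to the flight end point.
        return abs(node[0] - goal[0]) + abs(node[1] - goal[1])

    open_heap = [(heuristic(start), 0, start)]
    best_g = {start: 0}
    parent = {start: None}
    closed = set()
    while open_heap:
        _, g, node = heapq.heappop(open_heap)  # lowest evaluation value first
        if node in closed:
            continue
        closed.add(node)
        if node == goal:
            path = []
            while node is not None:  # backtrack through recorded parents
                path.append(node)
                node = parent[node]
            return path[::-1]
        for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nxt = (node[0] + dy, node[1] + dx)
            if not (0 <= nxt[0] < h and 0 <= nxt[1] < w):
                continue
            if grid[nxt[0]][nxt[1]] == 1:
                continue
            new_g = g + 1
            if new_g < best_g.get(nxt, float("inf")):
                best_g[nxt] = new_g
                parent[nxt] = node
                f = new_g + heuristic(nxt) + extra_cost(nxt)
                heapq.heappush(open_heap, (f, new_g, nxt))
    return None
```

Because `extra_cost` is added to the evaluation value but not to the accumulated distance, the returned path is not guaranteed to be the shortest one; it trades length against the additional penalties, in line with the multi-dimensional evaluation described above.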

[0075] Refer to Figure 6, which is a schematic diagram of an optional overall process of the UAV detection method for power line corridor features provided in this embodiment of the disclosure. The method can be applied to outdoor detection, specifically to power line change detection scenarios. The overall principle of the method is described below. First, the change detection model is trained. The training parameters are as follows: the Adam optimizer is used for parameter optimization, with momentum parameters β1 = 0.9 and β2 = 0.999; the initial learning rate is set to 0.001, and a linear decay strategy gradually reduces the learning rate; the batch size is set to 16; and the maximum number of training epochs is set to 200. During training, model performance is evaluated on the validation set every 5 epochs, and the model with the highest F1-score on the validation set is saved as the best model for inference.
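The learning rate schedule and checkpoint selection described in [0075] can be sketched as follows. The disclosure does not state the final learning rate, so decaying linearly to zero at the last epoch is an assumption, and the F1 values in the usage note are made up for illustration.

```python
def linear_decay_lr(epoch, initial_lr=0.001, max_epochs=200):
    """Learning rate at a given epoch under linear decay to zero (assumed)."""
    return initial_lr * (1.0 - epoch / max_epochs)


def select_best_epoch(f1_by_epoch, eval_interval=5):
    """Pick the evaluated epoch with the highest validation F1-score.

    Only epochs that are multiples of eval_interval are considered.
    """
    best_epoch, best_f1 = None, -1.0
    for epoch in sorted(f1_by_epoch):
        if epoch % eval_interval != 0:
            continue
        if f1_by_epoch[epoch] > best_f1:
            best_epoch, best_f1 = epoch, f1_by_epoch[epoch]
    return best_epoch, best_f1
```

With hypothetical validation scores `{5: 0.81, 10: 0.88, 15: 0.86}`, `select_best_epoch` returns epoch 10, so the weights saved at that evaluation would be used for inference.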

[0076] Next, the drone's flight path is determined by initializing flight parameters and constructing a virtual environment map. A node evaluation function is then built, and the nodes to be expanded around the current path node are acquired and stored in a priority queue. Their evaluation values are calculated and arranged in ascending order, and the node with the lowest evaluation value is identified as the target path node. The current path node is recorded as the parent node of the target path node. Based on the node with the lowest evaluation value, the coverage cost is calculated. When the covered safe distance area reaches a preset coverage threshold and the target path node indicates the flight end point, the flight path is generated by backtracking from the end point.

[0077] Next, the drone acquires a first image to be processed along the flight path at the first time point and a second image to be processed along the same flight path at the second time point. Both images are then input into the change detection network for change detection. As shown in Figure 1, the first local features and the first global features of the first image to be processed are extracted through the first backbone network and the second backbone network, respectively, and the second local features and the second global features of the second image to be processed are extracted in the same way. The first local features and the first global features are input into a frequency domain fusion module for spatial-frequency domain information fusion. Simultaneously, the second local features and the second global features are input into another frequency domain fusion module for spatial-frequency domain information fusion, resulting in high-frequency enhancement features, low-frequency enhancement features, and fused enhancement features corresponding to the first and second images to be processed. Subsequently, the high-frequency enhancement features of the two images are input into the high-frequency branch, where the absolute difference operation calculates the feature difference between the two images to obtain the high-frequency difference features. Similarly, the low-frequency enhancement features are input into the low-frequency branch to obtain the low-frequency difference features, and the fused enhancement features are input into the fusion branch to obtain the fused difference features. The high-frequency difference features, low-frequency difference features, and fused difference features are summed pixel by pixel, and the maximum value of the summation result is taken to generate a binary change detection map.

[0078] Next, the binary change detection map is returned for post-processing. Morphological operations are used to extract independent changed regions through connected component analysis and filter regions with areas smaller than a threshold. A change detection report is generated based on the analysis results. Operators can then use the change detection report, combined with the actual site conditions, to further determine whether any intrusion has occurred in the corresponding areas and formulate appropriate handling plans for confirmed intrusions.

[0079] In summary, the UAV detection method for power line corridor features provided in this embodiment acquires a first sample image of the target area at a first time node along the flight path, and a second sample image of the target area at a second time node along the flight path. It then calls a first backbone network and a second backbone network to perform parallel feature extraction on the first and second sample images, respectively, to obtain first image features of the first sample image and second image features of the second sample image. By employing a heterogeneous dual-backbone network architecture for feature extraction, the problem of easy loss of edge information can be effectively solved. Next, the first and second image features are stitched together to obtain preprocessed features. These preprocessed features are decomposed into a first low-frequency component and a first high-frequency component. Long-range dependencies are modeled based on the first low-frequency component to obtain global enhanced features, and edge features are enhanced based on the first high-frequency component to obtain local enhanced features. The global and local enhanced features are then fused to obtain frequency domain fusion features. Differential processing is applied to the features output by different backbone networks, ensuring that the high-frequency component retains detailed features and the low-frequency component captures a wide range of spatial dependencies. Based on this, local enhancement features, global enhancement features, and frequency domain fusion features are input into the decoder for decoding to obtain difference features. A change detection model is trained based on the difference features. In response to change detection requests, the trained change detection model is called to perform change detection on the first and second images to be processed. The multiple target difference features output by the decoder are summed pixel by pixel. 
The decoder can realize multi-scale and multi-view change detection. A binary change detection map is generated based on the summation result, which effectively improves the accuracy of surface change detection.

[0080] It is understood that although the steps in the above flowcharts are shown sequentially according to the arrows, these steps are not necessarily executed in the order indicated by the arrows. Unless explicitly stated in this embodiment, there is no strict order restriction on the execution of these steps, and they can be executed in other orders. Moreover, at least some steps in the above flowcharts may include multiple steps or multiple stages. These steps or stages are not necessarily completed at the same time, but can be executed at different times. The execution order of these steps or stages is not necessarily sequential, but can be performed alternately or in turn with other steps or at least some of the steps or stages in other steps.

[0081] Refer to Figure 7, which is a schematic diagram of an optional structure of a drone detection device for power line corridor features provided in an embodiment of this disclosure. The drone detection device 700 for power line corridor features includes:
The image acquisition module 701, used to acquire a first sample image of the target area at a first time node along the flight path, and to acquire a second sample image of the target area at a second time node along the flight path, wherein the first time node is before the second time node;
The feature extraction module 702, used to call the first backbone network and the second backbone network to perform parallel feature extraction on the first sample image and the second sample image respectively, so as to obtain the first image feature of the first sample image and the second image feature of the second sample image;
The frequency domain fusion module 703, used to stitch together the first image features and the second image features to obtain preprocessed features, decompose the preprocessed features into a first low-frequency component and a first high-frequency component, model long-range dependence based on the first low-frequency component to obtain global enhancement features, enhance edge features based on the first high-frequency component to obtain local enhancement features, and fuse the global enhancement features and local enhancement features to obtain frequency domain fusion features, wherein the frequency domain fusion features contain the difference information between the first image features and the second image features;
The model training module 704, used to input the local enhancement features, global enhancement features, and frequency domain fusion features into the decoder for decoding to obtain difference features, and to train the change detection model based on the difference features;
The change detection module 705, used to respond to change detection requests, call the trained change detection model to perform change detection on the first image to be processed and the second image to be processed, sum the multiple target difference features output by the decoder pixel by pixel, and generate a binary change detection map based on the summation result.

[0082] The terms “first,” “second,” “third,” “fourth,” etc. (if present) in this disclosure and the foregoing drawings are used to distinguish similar objects and are not necessarily used to describe a particular order or sequence. It should be understood that such data can be interchanged where appropriate to describe embodiments of this disclosure, for example, those that can be implemented in orders other than those illustrated or described herein. Furthermore, the terms “comprising” and “having,” and any variations thereof, are intended to cover a non-exclusive inclusion; for example, a process, method, system, product, or apparatus that comprises a series of steps or units is not necessarily limited to those steps or units explicitly listed, but may include other steps or units not explicitly listed or inherent to such processes, methods, products, or apparatuses.

[0083] It should be understood that in this disclosure, "at least one item" means one or more, and "more than one" means two or more. "And / or" is used to describe the relationship between related objects, indicating that three relationships can exist. For example, "A and / or B" can represent three cases: only A exists, only B exists, and both A and B exist simultaneously, where A and B can be singular or plural. The character " / " generally indicates that the preceding and following related objects are in an "or" relationship. "At least one of the following" or similar expressions refer to any combination of these items, including any combination of single or plural items. For example, at least one of a, b, or c can represent: a, b, c, "a and b", "a and c", "b and c", or "a and b and c", where a, b, and c can be single or multiple.

[0084] It should be understood that in the description of the embodiments of this disclosure, "multiple" means two or more, "greater than", "less than", "exceeding" etc. are understood to exclude the number itself, and "above", "below", "within" etc. are understood to include the number itself.

[0085] In the embodiments provided in this disclosure, it should be understood that the disclosed systems, apparatuses, and methods can be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative; for instance, the division of units is only a logical functional division, and in actual implementation, there may be other division methods. For example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. Furthermore, the coupling or direct coupling or communication connection shown or discussed may be through some interfaces; the indirect coupling or communication connection between apparatuses or units may be electrical, mechanical, or other forms.

[0086] The units described as separate components may or may not be physically separate. The components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units can be selected to achieve the purpose of this embodiment according to actual needs.

[0087] Furthermore, the functional units in the various embodiments of this disclosure can be integrated into one processing unit, or each unit can exist physically separately, or two or more units can be integrated into one unit. The integrated unit can be implemented in hardware or as a software functional unit.

[0088] If the integrated unit is implemented as a software functional unit and sold or used as an independent product, it can be stored in a computer-readable storage medium. Based on this understanding, the technical solution of this disclosure, in essence, or the part that contributes to the prior art, or all or part of the technical solution, can be embodied in the form of a software product. This computer software product is stored in a storage medium and includes several instructions to cause a computer device (which may be a personal computer, server, or network device, etc.) to execute all or part of the steps of the methods of the various embodiments of this disclosure. The aforementioned storage medium includes various media capable of storing program code, such as USB flash drives, portable hard drives, read-only memory (ROM), random access memory (RAM), magnetic disks, or optical disks.

[0089] It should also be understood that the various implementation methods provided in this disclosure can be combined arbitrarily to achieve different technical effects.

[0090] The above is a detailed description of the preferred embodiments of this disclosure. However, this disclosure is not limited to the above embodiments. Those skilled in the art can make various equivalent modifications or substitutions without departing from the spirit of this disclosure. All such equivalent modifications or substitutions are included within the scope defined by the claims of this disclosure.

Claims

1. A UAV (unmanned aerial vehicle) detection method for ground features along power line corridors, characterized in that it includes:
A first sample image of the target area is acquired at a first time node along the flight path, and a second sample image of the target area is acquired at a second time node along the flight path, wherein the first time node is before the second time node;
The first backbone network and the second backbone network are invoked to perform parallel feature extraction on the first sample image and the second sample image, respectively, to obtain the first image feature of the first sample image and the second image feature of the second sample image;
The first image feature and the second image feature are stitched together to obtain a preprocessed feature; the preprocessed feature is decomposed into a first low-frequency component and a first high-frequency component; a long-range dependency is modeled based on the first low-frequency component to obtain a global enhancement feature; edge features are enhanced based on the first high-frequency component to obtain a local enhancement feature; and the global enhancement feature and the local enhancement feature are fused to obtain a frequency domain fusion feature, wherein the frequency domain fusion feature contains the difference information between the first image feature and the second image feature;
The local enhancement features, the global enhancement features, and the frequency domain fusion features are input into the decoder for decoding to obtain the difference features, and the change detection model is trained based on the difference features;
In response to the change detection request, the trained change detection model is invoked to perform change detection on the first image to be processed and the second image to be processed, the multiple target difference features output by the decoder are summed pixel by pixel, and a binary change detection map is generated based on the summation result.

2. The UAV detection method for ground features along power line corridors according to claim 1, characterized in that, before acquiring the first sample image of the target area along the flight path at the first time point, the UAV detection method for power line corridor features further includes:
Initializing flight parameters, including area radius, number of path nodes, flight start point, and flight end point;
Constructing a virtual environment map, setting obstacle areas in the virtual environment map, and determining the flight start point, flight end point, and path nodes;
Constructing a node evaluation function, and determining the node evaluation value of each node to be expanded based on the node evaluation function;
Determining multiple target path nodes based on the node evaluation values, and generating a flight path based on these multiple target path nodes.

3. The UAV detection method for ground features along power line corridors according to claim 2, characterized in that constructing the node evaluation function comprises: determining an actual cost from the flight start point to the current path node, an estimated cost from the current path node to the flight end point, a shooting angle cost for the shooting angle at the current path node, a safe distance cost between the current path node and obstacles, and a coverage cost of each path node; and constructing the node evaluation function based on the actual cost, the estimated cost, the shooting angle cost, the safe distance cost, and the coverage cost.
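
One way to read the node evaluation function of claim 3 is as a weighted sum of its five cost terms. The claim does not specify how the terms are combined, so the weighted-sum form, the Euclidean estimate, the safe-distance penalty, and all names and default values below are hypothetical.

```python
import math


def node_evaluation(actual_cost, node, goal, angle_cost, obstacle_distance,
                    coverage_cost, safe_distance=2.0,
                    weights=(1.0, 1.0, 1.0, 1.0, 1.0)):
    """Combine the five cost terms of claim 3 into one value (assumed form)."""
    # Estimated cost: straight-line distance to the flight end point (assumption).
    estimated_cost = math.dist(node, goal)
    # Safe distance cost: penalize nodes closer to an obstacle than safe_distance.
    safe_distance_cost = max(0.0, safe_distance - obstacle_distance)
    terms = (actual_cost, estimated_cost, angle_cost,
             safe_distance_cost, coverage_cost)
    return sum(w * t for w, t in zip(weights, terms))


value = node_evaluation(
    actual_cost=3.0, node=(0.0, 0.0), goal=(3.0, 4.0),
    angle_cost=0.5, obstacle_distance=1.0, coverage_cost=0.25,
)
```

With unit weights the example evaluates to 3.0 + 5.0 + 0.5 + 1.0 + 0.25 = 9.75; a lower value marks a more attractive node.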

4. The UAV detection method for ground features along power line corridors according to claim 3, characterized in that the nodes to be expanded are the nodes surrounding the current path node, and determining multiple target path nodes based on the node evaluation values and generating the flight path based on the multiple target path nodes comprises: arranging the node evaluation values of the nodes to be expanded in ascending order, determining the node to be expanded that corresponds to the smallest node evaluation value as the target path node, and recording the current path node as the parent node of the target path node; and, when the target path node is the flight end point and the coverage cost is less than a preset coverage cost threshold, generating the flight path by tracing back from the flight end point through the recorded parent nodes.
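
The expand, select-smallest, record-parent, and trace-back loop of claim 4 resembles an A*-style search. The sketch below is a simplified stand-in under stated assumptions: it runs on a small 4-connected grid, uses only the actual and estimated costs, and omits the shooting angle, safe distance, and coverage terms as well as the coverage threshold check. The function name `plan_path` and the grid are hypothetical.

```python
import heapq


def plan_path(grid, start, goal):
    """A*-style search on a 4-connected grid; 1 marks an obstacle cell."""
    rows, cols = len(grid), len(grid[0])

    def estimate(node):
        # Manhattan distance to the flight end point (assumed heuristic).
        return abs(node[0] - goal[0]) + abs(node[1] - goal[1])

    open_heap = [(estimate(start), 0, start)]  # (evaluation, actual cost, node)
    parent = {start: None}
    best_cost = {start: 0}
    while open_heap:
        # The heap yields the node with the smallest evaluation value first.
        _, cost, current = heapq.heappop(open_heap)
        if current == goal:
            path = []
            while current is not None:  # trace back through parent nodes
                path.append(current)
                current = parent[current]
            return path[::-1]
        if cost > best_cost[current]:
            continue  # stale heap entry
        r, c = current
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if not (0 <= nr < rows and 0 <= nc < cols) or grid[nr][nc] == 1:
                continue
            new_cost = cost + 1
            if new_cost < best_cost.get(nxt, float("inf")):
                best_cost[nxt] = new_cost
                parent[nxt] = current  # record current node as the parent
                heapq.heappush(open_heap, (new_cost + estimate(nxt), new_cost, nxt))
    return None


grid = [
    [0, 0, 0],
    [1, 1, 0],
    [0, 0, 0],
]
path = plan_path(grid, start=(0, 0), goal=(2, 0))
```

In this grid the obstacle row forces the path around the right-hand side, so the traced-back route visits seven cells.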

5. The UAV detection method for ground features along power line corridors according to claim 1, characterized in that the first image feature includes a first global feature output by the second backbone network, and invoking the second backbone network to extract features from the first sample image comprises: performing feature embedding on the first sample image to obtain image embedding features; splitting the image embedding features along the channel dimension into a first embedded feature and a second embedded feature; performing global modeling on the first embedded feature to obtain spatial features; performing a frequency domain transformation on the second embedded feature to obtain frequency domain features; and fusing the spatial features and the frequency domain features at the channel level to obtain the first global feature output by the second backbone network.

6. The UAV detection method for ground features along power line corridors according to claim 5, characterized in that performing the frequency domain transformation on the second embedded feature to obtain the frequency domain features comprises: performing wavelet decomposition on the second embedded feature to obtain a second low-frequency component and a second high-frequency component; convolving the second low-frequency component and the second high-frequency component to obtain mixed frequency features; and performing an inverse wavelet transform on the mixed frequency features to obtain the frequency domain features.
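
The decompose, mix, and inverse-transform sequence of claim 6 can be sketched with a single-level 1D Haar wavelet. The claim names neither the wavelet family nor the convolution; the Haar averaging form and the per-band gains standing in for the convolution are assumptions, and all names are hypothetical.

```python
def haar_decompose(signal):
    """Single-level Haar decomposition of an even-length sequence."""
    low = [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    high = [(signal[i] - signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    return low, high


def haar_reconstruct(low, high):
    """Inverse of haar_decompose."""
    signal = []
    for l, h in zip(low, high):
        signal.extend([l + h, l - h])
    return signal


row = [4.0, 2.0, 5.0, 5.0]           # one row of a hypothetical feature map
low, high = haar_decompose(row)      # averages and half-differences

# Stand-in for the convolution step: a per-band gain (assumed weights),
# here doubling the high-frequency band to emphasize edges.
mixed_low = [1.0 * v for v in low]
mixed_high = [2.0 * v for v in high]

restored = haar_reconstruct(low, high)              # lossless round trip
enhanced = haar_reconstruct(mixed_low, mixed_high)  # edge contrast increased
```

Reconstructing the unmodified bands returns the original row, while the doubled high-frequency band widens the step between the first two samples from 2.0 to 4.0.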

7. The UAV detection method for ground features along power line corridors according to claim 1, characterized in that modeling long-range dependencies based on the first low-frequency component to obtain the global enhancement feature, enhancing edge features based on the first high-frequency component to obtain the local enhancement feature, and fusing the global enhancement feature and the local enhancement feature to obtain the frequency domain fusion feature comprises: performing large-kernel convolutional feature extraction on the first low-frequency component to obtain a first intermediate feature, and performing global context modeling on the first intermediate feature to obtain the global enhancement feature; performing convolutional feature extraction on the first high-frequency component to obtain a second intermediate feature, and performing edge enhancement on the second intermediate feature to obtain the local enhancement feature; and adding the global enhancement feature and the local enhancement feature element-wise to obtain the frequency domain fusion feature.

8. The UAV detection method for ground features along power line corridors according to claim 1, characterized in that the change detection model performs two frequency fusion processes in parallel to obtain two sets of output features, each set of output features including local enhancement features, global enhancement features, and frequency domain fusion features; the decoder includes a high-frequency branch, a low-frequency branch, and a fusion branch; and inputting the local enhancement features, the global enhancement features, and the frequency domain fusion features into the decoder for decoding to obtain the difference features comprises: inputting the local enhancement features of the two sets of output features into the high-frequency branch for an absolute difference operation to obtain high-frequency difference features; inputting the global enhancement features of the two sets of output features into the low-frequency branch for an absolute difference operation to obtain low-frequency difference features; and inputting the frequency domain fusion features of the two sets of output features into the fusion branch for an absolute difference operation to obtain fusion difference features.
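
The three decoder branches of claim 8 each apply an absolute difference to the matching features of the two output sets. The sketch below shows only that operation; any further decoder layers are outside its scope, and the dictionary keys, function names, and sample values are hypothetical.

```python
def abs_difference(feat_a, feat_b):
    """Element-wise absolute difference of two same-sized 2D feature maps."""
    return [
        [abs(a - b) for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(feat_a, feat_b)
    ]


def decode(outputs_a, outputs_b):
    """Route each feature type to its branch (assumed mapping per claim 8)."""
    branch_inputs = {"high": "local", "low": "global", "fusion": "fused"}
    return {
        branch: abs_difference(outputs_a[key], outputs_b[key])
        for branch, key in branch_inputs.items()
    }


# Two hypothetical sets of output features from the parallel fusion processes.
outputs_a = {"local": [[1.0, 0.0]], "global": [[0.5, 0.5]], "fused": [[1.5, 0.5]]}
outputs_b = {"local": [[0.0, 0.0]], "global": [[0.5, 1.0]], "fused": [[0.5, 1.0]]}

diffs = decode(outputs_a, outputs_b)
```

Each entry of `diffs` corresponds to one branch: `diffs["high"]` holds the high-frequency difference features, `diffs["low"]` the low-frequency ones, and `diffs["fusion"]` the fusion difference features.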

9. The UAV detection method for ground features along power line corridors according to claim 8, characterized in that training the change detection model based on the difference features comprises: determining a high-frequency loss based on the high-frequency difference features and the ground-truth labels, determining a low-frequency loss based on the low-frequency difference features and the ground-truth labels, and determining a fusion loss based on the fusion difference features and the ground-truth labels; configuring weight coefficients for the high-frequency loss, the low-frequency loss, and the fusion loss, respectively; computing a weighted sum of the high-frequency loss, the low-frequency loss, and the fusion loss based on the weight coefficients to obtain a target loss; and training the change detection model based on the target loss.
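
The target loss of claim 9 is a weighted sum of three per-branch losses. The claim does not name the loss function or the weight values; the binary cross-entropy, the weights, and the flattened per-pixel predictions below are assumptions chosen only to make the weighting concrete.

```python
import math


def bce(pred, label, eps=1e-7):
    """Mean binary cross-entropy over flattened pixels (assumed loss)."""
    total = 0.0
    for p, y in zip(pred, label):
        p = min(max(p, eps), 1 - eps)  # clamp to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(pred)


def target_loss(preds, label, weights):
    """Weighted sum of the high-frequency, low-frequency, and fusion losses."""
    return sum(weights[name] * bce(preds[name], label) for name in preds)


label = [1, 0]  # hypothetical ground-truth change labels for two pixels
preds = {
    "high": [0.5, 0.5],
    "low": [0.5, 0.5],
    "fusion": [0.5, 0.5],
}
weights = {"high": 0.25, "low": 0.25, "fusion": 0.5}  # assumed coefficients

loss = target_loss(preds, label, weights)
```

Because every prediction here is 0.5 and the assumed weights sum to 1, the target loss equals ln 2, roughly 0.693; in practice the three branch losses differ and the weights control how much each branch drives training.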

10. A UAV detection device for ground features along power line corridors, characterized in that it comprises: an image acquisition module, used to acquire a first sample image of a target area at a first time node along a flight path, and to acquire a second sample image of the target area at a second time node along the flight path, wherein the first time node precedes the second time node; a feature extraction module, used to invoke a first backbone network and a second backbone network to perform parallel feature extraction on the first sample image and the second sample image, respectively, to obtain a first image feature of the first sample image and a second image feature of the second sample image; a frequency domain fusion module, used to concatenate the first image feature and the second image feature to obtain a preprocessed feature, decompose the preprocessed feature into a first low-frequency component and a first high-frequency component, model long-range dependencies based on the first low-frequency component to obtain a global enhancement feature, enhance edge features based on the first high-frequency component to obtain a local enhancement feature, and fuse the global enhancement feature and the local enhancement feature to obtain a frequency domain fusion feature, wherein the frequency domain fusion feature contains difference information between the first image feature and the second image feature; a model training module, used to input the local enhancement feature, the global enhancement feature, and the frequency domain fusion feature into a decoder for decoding to obtain difference features, and to train a change detection model based on the difference features; and a change detection module, used to respond to a change detection request by invoking the trained change detection model to perform change detection on a first image to be processed and a second image to be processed, summing multiple target difference features output by the decoder pixel by pixel, and generating a binary change detection map based on the summation result.