Method, device, storage medium and product for detecting external obstacles of power transmission line

By improving the backbone and neck network structure of the YOLO-SAS model and combining it with the SIoU loss function, the problem of insufficient accuracy and efficiency in detecting small external obstacles on power transmission lines was solved, achieving efficient and accurate obstacle detection.

CN118429615B · Active · Publication Date: 2026-06-23 · YUNCHENG POWER SUPPLY COMPANY OF STATE GRID SHANXI ELECTRIC POWER

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Patents(China)
Current Assignee / Owner: YUNCHENG POWER SUPPLY COMPANY OF STATE GRID SHANXI ELECTRIC POWER
Filing Date: 2024-04-23
Publication Date: 2026-06-23


Abstract

The application discloses a detection method and device for external damage obstacles of a power transmission line, a storage medium and a product, and relates to the technical field of power transmission line inspection. The method comprises the following steps: inputting an image of a target area power transmission line into a target detection model to obtain whether there is an external damage obstacle on the target area power transmission line and the category of the external damage obstacle; the target detection model is obtained by training a YOLO-SAS model; the backbone network of the YOLO-SAS model is obtained by replacing a CBS module in the backbone network of a YOLOv7 network with a ShuffleNetv2 model; and the neck network of the YOLO-SAS model is obtained by adding an ACmix module to the neck network of the YOLOv7 network. The application can improve the detection precision and efficiency of small target external damage obstacles of the power transmission line.

Description

Technical Field

[0001] This invention relates to the field of power transmission line inspection technology, and in particular to a method, equipment, storage medium, and product for detecting external obstructions on power transmission lines.

Background Technology

[0002] Currently, common types of external obstacles that damage power transmission lines mainly include illegal construction, illegal buildings, and excessively tall vegetation. Common power transmission line inspection methods can be divided into two main categories: line patrol and online monitoring. Line patrol is further divided into manual patrol and drone patrol: manual patrol requires regular on-site observation along the line, which is resource-intensive and inefficient; drone patrol reduces patrol costs by using drones equipped with cameras to take pictures along the line, but still requires manual image screening to identify potential hazards, resulting in low detection efficiency and difficulty meeting practical needs.

[0003] Online monitoring acquires monitoring data by installing sensors or cameras on power transmission lines, and then uses computer vision technology to extract information from the monitoring data to accurately identify potential hazards. This eliminates the need for manual screening and improves detection efficiency. With the continuous development of computer vision technology, deep learning algorithms based on deep convolutional neural networks (DCNN) have achieved remarkable results in target detection, which also has important application value in the field of detecting potential hazards caused by external forces on power transmission lines.

[0004] Current mainstream DCNN-based object detection algorithms can be broadly divided into two-stage and one-stage detection networks, depending on whether a candidate box generation stage is involved. Two-stage detection networks first acquire target candidate regions, then extract target features, classify the target, and regress bounding boxes. These algorithms offer high accuracy but suffer from slow detection speed. Typical algorithms include Faster R-CNN, Mask R-CNN, and Cascade R-CNN. One-stage detection networks directly perform target classification and bounding box regression after feature extraction, outputting the predicted target location and category. Typical algorithms include the YOLO series and SSD. Among these, the YOLO series is widely used because of its high speed and accuracy. In recent years, scholars have studied the detection of external damage obstacles on power transmission lines. Hao Shuai et al. proposed a YOLOv3 power transmission line fault detection method based on the convolutional block attention module (CBAM) to address the problem that the targets to be detected on power transmission lines are easily affected by complex backgrounds and partial occlusion. Wei Xianzhe et al. used an instance segmentation neural network algorithm with partial bounding box annotations, transferring features from the detection branch to the mask branch, achieving an average recognition accuracy of over 91% for common external damage categories. Long Leyun et al. added the CBAM attention mechanism to YOLOv5 to enhance the model's feature extraction capabilities and added a multi-scale domain adaptive network for adversarial training on the training set, enhancing the model's generalization ability to different weather conditions and scenarios and achieving an average accuracy of 92.2%. However, although the above algorithms have improved the detection of external damage obstacles, there is still considerable room for improvement in detection accuracy and efficiency for small targets.

Summary of the Invention

[0005] The purpose of this invention is to provide a method, equipment, storage medium, and product for detecting external obstacles on power transmission lines, which can improve the detection accuracy and efficiency of small-target external obstacles on power transmission lines.

[0006] To achieve the above objectives, the present invention provides the following solution:

[0007] A method for detecting external obstructions on power transmission lines, comprising:

[0008] Acquire images of power transmission lines in the target area;

[0009] The image of the transmission line in the target area is input into the target detection model to determine whether there are any external damage obstacles on the transmission line in the target area and the category of the external damage obstacles. The target detection model is obtained by training the YOLO-SAS model. The YOLO-SAS model includes a backbone network, a neck network, and a head network connected in sequence. The backbone network of the YOLO-SAS model is obtained by replacing the CBS module in the backbone network of the YOLOv7 network with the ShuffleNetv2 model. The neck network of the YOLO-SAS model is obtained by adding the ACmix module to the neck network of the YOLOv7 network. The SPPCSPC module in the neck network is connected to the first Concat module and the first CBS module in the neck network through the ACmix module. The head network of the YOLO-SAS model has the same structure as the head network of the YOLOv7 network.

[0010] Optionally, the neck network of the YOLO-SAS model includes: SPPCSPC module, ACmix module, four CBS modules, two Upsample modules, four ELAN-W modules, four Concat modules, and two MP-2 modules;

[0011] The inputs of the SPPCSPC module, the second CBS module, and the third CBS module are all connected to the backbone network. The output of the SPPCSPC module is connected to the input of the ACmix module, and the output of the ACmix module is connected to the inputs of the first CBS module and the first Concat module, respectively. The output of the first CBS module is connected to the input of the first Upsample module. The outputs of the first Upsample module and the second CBS module are both connected to the input of the second Concat module. The output of the second Concat module is connected to the input of the second ELAN-W module, and the output of the second ELAN-W module is connected to the inputs of the fourth CBS module and the fourth Concat module, respectively. The output of the fourth CBS module is connected to the input of the second Upsample module. Next, the outputs of the second Upsample module and the third CBS module are both connected to the input of the third Concat module; the output of the third Concat module is connected to the input of the fourth ELAN-W module; the output of the fourth ELAN-W module is connected to the input of the second MP-2 module and the head network, respectively; the output of the second MP-2 module is connected to the input of the fourth Concat module; the output of the fourth Concat module is connected to the input of the third ELAN-W module, and the output of the third ELAN-W module is connected to the input of the first MP-2 module and the head network, respectively; the output of the first MP-2 module is connected to the input of the first Concat module; the output of the first Concat module is connected to the input of the first ELAN-W module, and the output of the first ELAN-W module is connected to the head network.
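The neck wiring described above can be checked mechanically by writing it down as a directed graph. The following is a minimal sketch, not part of the claimed method: the node labels (for example "MP-2_1" for the first MP-2 module) and the helper name execution_order are illustrative assumptions, and "backbone" and "head" stand in for the neighbouring networks as single nodes.

```python
from graphlib import TopologicalSorter

# Each key maps a module to the modules whose outputs feed it.
# Labels are illustrative; "backbone" and "head" are collapsed into one node each.
NECK_INPUTS = {
    "SPPCSPC": ["backbone"],
    "CBS2": ["backbone"],
    "CBS3": ["backbone"],
    "ACmix": ["SPPCSPC"],
    "CBS1": ["ACmix"],
    "Upsample1": ["CBS1"],
    "Concat2": ["Upsample1", "CBS2"],
    "ELAN-W2": ["Concat2"],
    "CBS4": ["ELAN-W2"],
    "Upsample2": ["CBS4"],
    "Concat3": ["Upsample2", "CBS3"],
    "ELAN-W4": ["Concat3"],
    "MP-2_2": ["ELAN-W4"],
    "Concat4": ["MP-2_2", "ELAN-W2"],
    "ELAN-W3": ["Concat4"],
    "MP-2_1": ["ELAN-W3"],
    "Concat1": ["MP-2_1", "ACmix"],
    "ELAN-W1": ["Concat1"],
    "head": ["ELAN-W1", "ELAN-W3", "ELAN-W4"],
}


def execution_order(inputs):
    """Return one valid evaluation order; raises CycleError if the wiring loops."""
    return list(TopologicalSorter(inputs).static_order())


ORDER = execution_order(NECK_INPUTS)
```

Because a topological order exists, the described connections form a feed-forward graph: the top-down path (Upsample) runs first, and the bottom-up path (MP-2) ends at the first Concat module, which also receives the ACmix output.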

[0012] Optionally, the loss function used during the training of the YOLO-SAS model is the SIoU function.

[0013] Optionally, the backbone network of the YOLO-SAS model includes a ShuffleNetv2 model module, four ELAN modules, and three MP-1 modules;

[0014] The ShuffleNetv2 model module includes four ShuffleNetv2 models. The output of the first ShuffleNetv2 model is connected to the input of the second ShuffleNetv2 model, the output of the second ShuffleNetv2 model is connected to the input of the third ShuffleNetv2 model, the output of the third ShuffleNetv2 model is connected to the input of the fourth ShuffleNetv2 model, and the output of the fourth ShuffleNetv2 model is connected to the input of the first ELAN module. The output of the first ELAN module is connected to the inputs of the first MP-1 module and the third CBS module, respectively; the output of the first MP-1 module is connected to the input of the second ELAN module; the output of the second ELAN module is connected to the input of the second MP-1 module; the output of the second MP-1 module is connected to the input of the third ELAN module; the output of the third ELAN module is connected to the inputs of the second CBS module and the third MP-1 module, respectively; the output of the third MP-1 module is connected to the input of the fourth ELAN module; and the output of the fourth ELAN module is connected to the input of the SPPCSPC module.

[0015] Optionally, the output of the first ELAN-W module is connected to the input of the first RepConv module in the head network; the output of the third ELAN-W module is connected to the input of the second RepConv module in the head network; and the output of the fourth ELAN-W module is connected to the input of the third RepConv module in the head network.

[0016] A computer device includes: a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor executes the computer program to implement the above-described method for detecting external obstacles to power transmission lines.

[0017] A computer-readable storage medium having a computer program stored thereon, which, when executed by a processor, implements the above-described method for detecting external obstacles to power transmission lines.

[0018] A computer program product includes a computer program that, when executed by a processor, implements the above-described method for detecting external obstructions to power transmission lines.

[0019] According to specific embodiments provided by the present invention, the present invention discloses the following technical effects:

[0020] This invention inputs an image of the transmission line in the target area into a target detection model to determine whether there are any external damage obstacles on the transmission line and the category of such obstacles. The target detection model is obtained by training the YOLO-SAS model. The backbone network of the YOLO-SAS model is obtained by replacing the CBS module in the backbone network of the YOLOv7 network with the ShuffleNetv2 model, and the neck network of the YOLO-SAS model is obtained by adding an ACmix module to the neck network of the YOLOv7 network. Introducing the ShuffleNetv2 model reduces the number of parameters in the YOLO-SAS model and improves detection efficiency. Introducing the ACmix attention mechanism into the neck network enhances the feature extraction capability of the model and improves its accuracy in recognizing small targets. Together, these changes can improve the detection accuracy and efficiency for small external damage obstacles on transmission lines.

Description of the Attached Figures

[0021] To more clearly illustrate the technical solutions in the embodiments of the present invention or the prior art, the drawings used in the embodiments will be briefly introduced below. Obviously, the drawings described below are only some embodiments of the present invention. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.

[0022] Figure 1 is a flowchart of a method for detecting external obstacles on power transmission lines provided in an embodiment of the present invention;

[0023] Figure 2 is a diagram of the YOLOv7 network structure;

[0024] Figure 3 is a structural diagram of the YOLO-SAS model provided by the present invention;

[0025] Figure 4 is a diagram of the ShuffleNetv2 model structure;

[0026] Figure 5 is a diagram of the ACmix module structure;

[0027] Figure 6 is a graph of the label information in the experimental dataset;

[0028] Figure 7 shows the loss curves of different IoU loss functions on the training and validation sets;

[0029] Figure 8 is a comparison chart of the detection results of each comparative experimental model;

[0030] Figure 9 is a comparison chart of detection results in multi-target scenarios;

[0031] Figure 10 is a comparison chart of detection results in complex nighttime scenes;

[0032] Figure 11 is a diagram of the internal structure of a computer device.

Detailed Implementation

[0033] The technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of the present invention, and not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.

[0034] To make the above-mentioned objects, features and advantages of the present invention more apparent and understandable, the present invention will be further described in detail below with reference to the accompanying drawings and specific embodiments.

[0035] This invention provides a method for detecting external obstructions on power transmission lines. As shown in Figure 1, the method includes:

[0036] Step 101: Obtain images of the transmission lines in the target area.

[0037] Step 102: Input the image of the transmission line in the target area into the target detection model to obtain whether there are any external obstacles on the transmission line in the target area and the category of the external obstacles; the target detection model is obtained by training the YOLO-SAS model; the YOLO-SAS model includes a backbone network, a neck network and a head network connected in sequence; the backbone network of the YOLO-SAS model is obtained by replacing the CBS module in the backbone network of the YOLOv7 network with the ShuffleNetv2 model; the neck network of the YOLO-SAS model is obtained by adding the ACmix module to the neck network of the YOLOv7 network, wherein the SPPCSPC module in the neck network is connected to the first Concat module and the first CBS module in the neck network through ACmix respectively; the head network of the YOLO-SAS model has the same structure as the head network of the YOLOv7 network.

[0038] YOLOv7, proposed in 2022 by the authors of YOLOv4, is a relatively advanced end-to-end object detection model. It improves upon YOLOv5 by optimizing the network structure, data augmentation, and activation functions to enhance expressive power and detection accuracy. The YOLOv7 network consists of four parts: Input, Backbone, Neck, and Head. As shown in Figure 2, the input image is first preprocessed and resized to 640×640×3 to facilitate feature extraction in the backbone network. The head network then outputs three feature maps of different sizes (large, medium, and small). Finally, the detection results are output after structural reparameterization and convolution. YOLOv7 incorporates an Extended Efficient Layer Aggregation Network (E-ELAN) into its backbone feature extraction network. This allows the network to learn more diverse features without disrupting the original gradient path, while accelerating model convergence, enhancing learning capabilities, and improving robustness. The enhanced feature extraction network introduces a Spatial Pyramid Pooling, Cross Stage Partial Connection (SPPCSPC) module to increase the receptive field and achieve multi-scale feature fusion. A Reparameterized Convolutional Layer (RepConv) is added to the prediction head layer, borrowing from RepVGG to adjust the number of network channels in the output features, improving inference speed and reducing model complexity. This invention, building upon YOLOv7, introduces ShuffleNetv2 as the backbone network model, embeds an ACmix attention mechanism module in the neck, and uses the SIoU loss function for bounding box regression to achieve lightweight and accurate detection of external damage obstacles. 
The improved network structure is shown in Figure 3. The neck network of the YOLO-SAS model includes: an SPPCSPC module, an ACmix module, four CBS modules, two Upsample modules, four ELAN-W modules, four Concat modules, and two MP-2 modules. The inputs of the SPPCSPC module, the second CBS module, and the third CBS module are all connected to the backbone network. The output of the SPPCSPC module is connected to the input of the ACmix module, and the output of the ACmix module is connected to the inputs of the first CBS module and the first Concat module, respectively. The output of the first CBS module is connected to the input of the first Upsample module. The outputs of the first Upsample module and the second CBS module are both connected to the input of the second Concat module. The output of the second Concat module is connected to the input of the second ELAN-W module, and the output of the second ELAN-W module is connected to the inputs of the fourth CBS module and the fourth Concat module, respectively. The output of the fourth CBS module is connected to the input of the second Upsample module. 
Next, the outputs of the second Upsample module and the third CBS module are both connected to the input of the third Concat module; the output of the third Concat module is connected to the input of the fourth ELAN-W module; the output of the fourth ELAN-W module is connected to the input of the second MP-2 module and the head network, respectively; the output of the second MP-2 module is connected to the input of the fourth Concat module; the output of the fourth Concat module is connected to the input of the third ELAN-W module, and the output of the third ELAN-W module is connected to the input of the first MP-2 module and the head network, respectively; the output of the first MP-2 module is connected to the input of the first Concat module; the output of the first Concat module is connected to the input of the first ELAN-W module, and the output of the first ELAN-W module is connected to the head network. The ACmix attention mechanism combines traditional convolution and self-attention mechanisms, fusing the local feature extraction capability of traditional convolution with the global correlation capability of self-attention to obtain better feature representation. In traditional convolution, input features are mapped to a rich set of intermediate features by the convolution operation. Convolutional operations can effectively capture local features, and the granularity of these local features can be controlled by adjusting parameters such as kernel size and stride. In the self-attention module, intermediate features are used to compute multi-head self-attention, allowing the model to focus on different parts of the input features and effectively capture global information. 
The ACmix attention mechanism aggregates the output features of the convolutional and self-attention paths to obtain the final feature representation. This approach avoids two costly projection operations while retaining the advantages of both convolution and self-attention. The structure is shown in Figure 5:

[0039] First, the input feature map of size H×W×C is projected through three 1×1 convolutions and reshaped into N blocks, resulting in an intermediate feature set of 3N feature maps. In the convolutional path with kernel size k, a lightweight fully connected layer is used to generate k² feature maps, which are divided into N groups. The generated features are then shifted and aggregated, so that the input features are processed in a convolutional manner and information is collected from the local receptive field. In the self-attention path, the intermediate features are gathered into N groups, each containing 3 feature maps, one from each 1×1 convolution. The three feature maps in a group serve as the query, key, and value, respectively, following the traditional multi-head self-attention model. Finally, the outputs of the two paths are summed, as shown in Equation (1), where α and β are the parameters learned for the convolution and self-attention paths, respectively, and their values are set to 1. $F_{out}$ is the final output of the ACmix module, and $F_{att}$ and $F_{conv}$ are the outputs of the self-attention path and the convolution path, respectively.

[0040] $F_{out} = \alpha F_{conv} + \beta F_{att}$ (1).
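As an illustration of the grouping and of Equation (1), the following minimal sketch groups the 3N intermediate maps into N (query, key, value) triples and fuses the two path outputs element by element. The function names and the nested-list representation of feature maps are assumptions made for this example only; a real implementation would use tensors and learned layers.

```python
def group_qkv(q_maps, k_maps, v_maps):
    """Pair the N maps from each 1x1 convolution into N (query, key, value) groups."""
    if not (len(q_maps) == len(k_maps) == len(v_maps)):
        raise ValueError("each projection must produce the same number of maps")
    return list(zip(q_maps, k_maps, v_maps))


def acmix_fuse(f_conv, f_att, alpha=1.0, beta=1.0):
    """Element-wise F_out = alpha * F_conv + beta * F_att over nested lists."""
    if isinstance(f_conv, (int, float)):
        return alpha * f_conv + beta * f_att
    if len(f_conv) != len(f_att):
        raise ValueError("the two paths must produce outputs of the same shape")
    return [acmix_fuse(c, a, alpha, beta) for c, a in zip(f_conv, f_att)]


# Toy 2x2 outputs from the convolution path and the self-attention path.
F_CONV = [[1.0, 2.0], [3.0, 4.0]]
F_ATT = [[0.5, 0.5], [0.5, 0.5]]
F_OUT = acmix_fuse(F_CONV, F_ATT)  # alpha = beta = 1, as stated in the text
```

With α = β = 1 the fusion reduces to a plain sum of the two paths; other values would weight one path against the other.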

[0041] The backbone network of the YOLO-SAS model includes a ShuffleNetv2 model module, four ELAN modules, and three MP-1 modules. The ShuffleNetv2 model module comprises four ShuffleNetv2 models; the output of the first ShuffleNetv2 model is connected to the input of the second ShuffleNetv2 model, the output of the second ShuffleNetv2 model is connected to the input of the third ShuffleNetv2 model, the output of the third ShuffleNetv2 model is connected to the input of the fourth ShuffleNetv2 model, and the output of the fourth ShuffleNetv2 model is connected to the input of the first ELAN module. The output of the first ELAN module is... The first MP-1 module's output is connected to the input of the second ELAN module; the second ELAN module's output is connected to the input of the second MP-1 module; the second MP-1 module's output is connected to the input of the third ELAN module; the third ELAN module's output is connected to both the input of the second CBS module and the input of the third MP-1 module; the third MP-1 module's output is connected to the input of the fourth ELAN module; and the fourth ELAN module's output is connected to the input of the SPPCSPC module. YOLOv7 uses CSP-Darknet53 as its backbone network, inheriting the advantages of the Darknet series and improving upon them to better extract target features. The Darknet network has a complex structure and a large number of model parameters, which affects detection speed. To address this issue, the lightweight ShuffleNetv2 network is introduced to optimize the YOLOv7 backbone network, reducing the number of model parameters and thus improving detection efficiency. ShuffleNetv2 is a lightweight convolutional neural network designed to reduce computational cost and parameter count while maintaining high accuracy. It is an upgrade from ShuffleNetv1. ShuffleNetv1 suffered from slow performance due to excessive use of 1×1 group convolutions. 
Therefore, ShuffleNetv2 introduces a channel split operation, and its basic unit replaces group convolutions with regular convolutions after the channel split. Figure 4 shows the basic unit of the ShuffleNetv2 module. First, the input feature map undergoes channel splitting and is divided into two branches. The left branch performs no operation, while the right branch contains two regular convolutions and one depthwise convolution (DWConv). The two branches are then concatenated to fuse features. Finally, a channel shuffle operation mixes the feature maps from different groups, enabling cross-channel information exchange.
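The split, concat, and shuffle steps of the basic unit can be sketched on a plain list of channels. This is only an illustration of the data flow: the names channel_shuffle and shufflenet_unit are hypothetical, and right_branch is a placeholder for the two regular convolutions and the depthwise convolution, which are not modelled here.

```python
def channel_shuffle(channels, groups):
    """Interleave channels across groups: view as (groups, n/groups), transpose, flatten."""
    n = len(channels)
    if n % groups != 0:
        raise ValueError("channel count must be divisible by the number of groups")
    per_group = n // groups
    return [channels[g * per_group + i] for i in range(per_group) for g in range(groups)]


def shufflenet_unit(channels, right_branch):
    """Channel split -> identity (left) / right_branch (right) -> concat -> shuffle."""
    half = len(channels) // 2
    left = channels[:half]                 # left branch: no operation
    right = right_branch(channels[half:])  # right branch: placeholder for the convolutions
    return channel_shuffle(left + right, groups=2)


SHUFFLED = channel_shuffle(list(range(8)), groups=2)
UNIT_OUT = shufflenet_unit(["a", "b", "c", "d"], lambda xs: [x.upper() for x in xs])
```

After the shuffle, channels from the untouched left branch and the processed right branch alternate, which is how information crosses between the two halves in the next unit.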

[0042] The head network consists of three RepConv modules and one Detect module. The outputs of the three RepConv modules are all connected to the input of the Detect module. The output of the first ELAN-W module is connected to the input of the first RepConv module in the head network. The output of the third ELAN-W module is connected to the input of the second RepConv module in the head network. The output of the fourth ELAN-W module is connected to the input of the third RepConv module in the head network. The YOLOv7 coordinate loss function is calculated using CIoU, and the specific calculation formula is as follows:

[0043]

[0044]

[0045]

[0046] In the formula, IoU represents the intersection over union between the predicted bounding box and the ground truth bounding box; $b$ and $b^{gt}$ represent the center points of the predicted bounding box and the ground truth bounding box, respectively; ρ represents the Euclidean distance between the two center points; c represents the diagonal length of the smallest enclosing region containing both the predicted and ground truth bounding boxes; α is a positive trade-off parameter; v measures the consistency of the aspect ratio between the predicted and ground truth bounding boxes; $w$ and $h$ represent the width and height of the predicted bounding box; and $w^{gt}$ and $h^{gt}$ represent the width and height of the ground truth bounding box. Building on DIoU, CIoU adds a penalty term for the width and height of the detection box, making the box regression more stable and avoiding the divergence problems that can occur during training with IoU and GIoU.
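For reference, the CIoU loss as it is commonly published in the literature can be written with the symbols described above. This is a sketch of the standard definition and is assumed, not confirmed, to match the formulas of this embodiment:

```latex
\begin{aligned}
L_{CIoU} &= 1 - IoU + \frac{\rho^{2}(b, b^{gt})}{c^{2}} + \alpha v \\
\alpha &= \frac{v}{(1 - IoU) + v} \\
v &= \frac{4}{\pi^{2}} \left( \arctan\frac{w^{gt}}{h^{gt}} - \arctan\frac{w}{h} \right)^{2}
\end{aligned}
```

Note that v depends only on the ratio of width to height, which is why it is described as a relative value.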

[0047] However, CIoU only considers the center point distance, overlap area, and aspect ratio of the boxes. The aspect ratio is a relative value and does not reflect the true differences in width and height or their confidence. CIoU also does not account for the direction mismatch between the ground truth box and the predicted box, which can hinder the model's optimization and reduce detection efficiency. Therefore, to accelerate detection and improve the convergence of the loss function, this invention replaces the original bounding box loss function CIoU with SIoU.

[0048] The SIoU loss function consists of four cost terms: Angle cost, Distance cost, Shape cost, and IoU cost. By considering the vector angle of the desired regression, it redefines the angle penalty metric and allows the predicted bounding box to quickly drift to the nearest axis, effectively reducing the total number of degrees of freedom. Its core calculation formula is as follows:

[0049]

[0050] Where Δ represents the distance cost, describing the distance between the center points. Its penalty cost is positively correlated with the angle cost, and its specific expression is:

[0051] $\Delta = \sum_{t=x,y} \left(1 - e^{-\rho_t (2 - \Lambda)}\right)$ (6)

[0052] Where Λ represents the Angle cost, and t represents the x and y coordinates, describing the minimum angle between the center point and the x and y axes. The specific calculation formula is as follows:

[0053]

[0054] Where Ω represents the Shape cost, defined from the differences in width and height between the two boxes relative to the maximum width and height of the two boxes, and θ controls how much attention is paid to the shape cost. The specific calculation formula is as follows:

[0055]
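The four cost terms can be combined as in the following minimal sketch, which follows the SIoU formulation as commonly published and agrees with Equation (6) for the distance cost. The box format (center x, center y, width, height), the default θ = 4, the small constant eps, and the function name siou_loss are assumptions for illustration; the exact formulas of this embodiment may differ in detail.

```python
import math


def siou_loss(pred, gt, theta=4.0, eps=1e-9):
    """SIoU loss for two boxes given as (cx, cy, w, h); a sketch, not a reference implementation."""
    px, py, pw, ph = pred
    gx, gy, gw, gh = gt

    # Corner coordinates of both boxes.
    p_x1, p_x2, p_y1, p_y2 = px - pw / 2, px + pw / 2, py - ph / 2, py + ph / 2
    g_x1, g_x2, g_y1, g_y2 = gx - gw / 2, gx + gw / 2, gy - gh / 2, gy + gh / 2

    # IoU cost.
    inter_w = max(0.0, min(p_x2, g_x2) - max(p_x1, g_x1))
    inter_h = max(0.0, min(p_y2, g_y2) - max(p_y1, g_y1))
    inter = inter_w * inter_h
    iou = inter / (pw * ph + gw * gh - inter + eps)

    # Width and height of the smallest enclosing box.
    cw = max(p_x2, g_x2) - min(p_x1, g_x1)
    ch = max(p_y2, g_y2) - min(p_y1, g_y1)

    # Angle cost (Lambda): depends on the angle of the line joining the centers.
    dx, dy = gx - px, gy - py
    sigma = math.hypot(dx, dy)
    sin_alpha = min(1.0, abs(dy) / sigma) if sigma > 0 else 0.0
    angle_cost = 1.0 - 2.0 * math.sin(math.asin(sin_alpha) - math.pi / 4) ** 2

    # Distance cost (Delta), Equation (6), with rho_t the squared normalized offsets.
    gamma = 2.0 - angle_cost
    rho_x = (dx / (cw + eps)) ** 2
    rho_y = (dy / (ch + eps)) ** 2
    distance_cost = (1 - math.exp(-gamma * rho_x)) + (1 - math.exp(-gamma * rho_y))

    # Shape cost (Omega): relative width and height differences, raised to theta.
    omega_w = abs(pw - gw) / max(pw, gw)
    omega_h = abs(ph - gh) / max(ph, gh)
    shape_cost = (1 - math.exp(-omega_w)) ** theta + (1 - math.exp(-omega_h)) ** theta

    return 1.0 - iou + (distance_cost + shape_cost) / 2.0
```

For two identical boxes all costs vanish and the loss is approximately zero; for boxes that do not overlap, the IoU term alone contributes 1 and the distance cost adds a further penalty that still provides a useful gradient direction.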

[0056] The present invention also provides another embodiment to illustrate the effect of the method for detecting external obstacles to transmission lines provided in the above embodiments, specifically including:

[0057] 3.1 Experimental Environment and Training Strategy

[0058] The development environment was PyTorch 2.0.1 with CUDA 11.3 and Python 3.9.0. Server configuration details are shown in Table 1. The input training samples were 640×640 three-channel images, with an initial learning rate of 0.01, momentum of 0.937, weight decay of 0.0005, and a batch size of 16, trained for a total of 300 epochs.

[0059] Table 1 Server Configuration

[0060]

[0061]

[0062] In object detection tasks, the loss function mainly includes classification loss, localization loss, and object confidence loss. Classification loss uses the cross-entropy loss function to measure the difference between the predicted and actual classes. Localization loss typically uses the mean squared error (MSE) loss function to measure the difference between the predicted and actual bounding box positions. Object confidence loss, also known as object-specific loss, measures the difference in IoU (Intersection over Union) between the predicted and actual bounding boxes. In this embodiment of the invention, the loss function consists of classification loss, localization loss, and object confidence loss; the total loss = classification loss + localization loss + object confidence loss.

[0063] 3.2 Dataset

[0064] The dataset consists of power transmission line images captured by video surveillance devices in various scenarios, covering five typical categories of potential hazards: trucks, cranes, excavators, hoists, and trees. The raw images were annotated with LabelMe software to label the potential hazards threatening the power transmission lines. The original images and label files were then converted and integrated according to the YOLOv7 dataset format to generate a power transmission line hazard detection dataset. The constructed dataset contains 1307 images, divided into training and testing sets in an 8:2 ratio. The label information in the dataset is shown in Figure 6.
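An 8:2 split of 1307 images could be produced as in the sketch below. The text does not state the exact image counts or how the split was drawn, so the random shuffle, the fixed seed, the rounding down of the training share, and the name split_dataset are all assumptions for illustration.

```python
import random


def split_dataset(items, train_ratio=0.8, seed=0):
    """Shuffle reproducibly and split into (train, test) by the given ratio."""
    items = list(items)
    random.Random(seed).shuffle(items)
    n_train = int(len(items) * train_ratio)  # rounds down
    return items[:n_train], items[n_train:]


# Image indices stand in for the 1307 image files.
TRAIN, TEST = split_dataset(range(1307))
```

Under these assumptions the split yields 1045 training images and 262 testing images.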

[0065] 3.3 Evaluation Indicators

[0066] When evaluating object detection algorithms, detection accuracy, detection speed, and memory usage all need to be considered. Therefore, this invention uses seven metrics to evaluate the model's performance accurately and objectively: precision (P), recall (R), average precision (AP), mean average precision (mAP), frames per second (FPS), giga floating-point operations (GFLOPs), and number of parameters.

[0067]

[0068]

[0069]

[0070]

[0071] In the formula, TP is the number of samples correctly identified as this category; FP is the number of samples from other categories misidentified as this category; FN is the number of samples of this category misidentified as other categories; AP is the average precision for each category; P is the precision; and R is the recall.
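The accuracy metrics can be computed from these counts as in the following minimal sketch, using the standard definitions P = TP / (TP + FP) and R = TP / (TP + FN). The function names are hypothetical, and computing AP as the area under the precision-recall curve with all-point interpolation is an assumption; the text does not state which interpolation was used.

```python
def precision(tp, fp):
    """P = TP / (TP + FP); defined as 0 when nothing was predicted."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp, fn):
    """R = TP / (TP + FN); defined as 0 when there are no ground truth objects."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def average_precision(recalls, precisions):
    """Area under the precision-recall curve; recalls must be in ascending order."""
    r = [0.0] + list(recalls) + [1.0]
    p = [0.0] + list(precisions) + [0.0]
    # Make precision non-increasing from right to left (the interpolation envelope).
    for i in range(len(p) - 2, -1, -1):
        p[i] = max(p[i], p[i + 1])
    return sum((r[i + 1] - r[i]) * p[i + 1] for i in range(len(r) - 1))


def mean_average_precision(ap_per_class):
    """mAP is the mean of the per-category AP values."""
    return sum(ap_per_class) / len(ap_per_class)
```

For example, 8 correct detections with 2 false positives give a precision of 0.8, and the same 8 detections with 8 missed objects give a recall of 0.5.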

[0072] 3.4 Experimental Results and Analysis

[0073] 3.4.1 Attention Mechanism Selection Experiment

[0074] Based on the original YOLOv7 network, this embodiment compares the ACmix attention mechanism with the SE, ECA, CA, and CBAM attention mechanisms and analyzes the impact of each on the model's detection performance. The same parameters and experimental environment were used during training, and the results are shown in Table 2. The model with the ACmix attention mechanism achieves the best detection accuracy, while the changes in parameter count, computational cost, and detection speed are negligible. Therefore, adding the ACmix attention mechanism to the model can effectively improve its detection accuracy.

[0075] Table 2 Comparison of recognition results for different attention mechanisms

[0076]

[0077] 3.4.2 Loss Function Selection Experiment

[0078] The original YOLOv7 network uses CIoU as the loss function between the predicted bounding box and the ground-truth bounding box. However, because CIoU does not consider the direction of the mismatch between the two boxes during training, its convergence is slow. Therefore, this embodiment compares different IoU loss functions to analyze the impact of each on model performance. The experimental results are shown in Table 3, and the loss curves of the different loss functions on the training and validation sets are shown in parts (a) and (b) of Figure 7, respectively. Combining Figure 7 and Table 3, SIoU fluctuates less overall than CIoU, its loss decreases faster, and it converges to a smaller loss. Its mAP@0.5 improves by 0.7% and its mAP@0.5:0.95 decreases by 0.2%, giving the best overall performance.
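For orientation, a single-box SIoU loss can be sketched as below. This follows the commonly published SIoU formulation (IoU term plus angle-aware distance cost and shape cost); the form actually used in this embodiment is not given in this section, so the function name, the box layout `(x1, y1, x2, y2)`, the shape exponent `theta=4`, and the `eps` guard are all assumptions.

```python
import math

def siou_loss(pred, gt, theta=4.0, eps=1e-9):
    """SIoU loss for two axis-aligned boxes given as (x1, y1, x2, y2)."""
    px1, py1, px2, py2 = pred
    gx1, gy1, gx2, gy2 = gt
    pw, ph = px2 - px1, py2 - py1
    gw, gh = gx2 - gx1, gy2 - gy1

    # Plain IoU.
    iw = max(0.0, min(px2, gx2) - max(px1, gx1))
    ih = max(0.0, min(py2, gy2) - max(py1, gy1))
    inter = iw * ih
    iou = inter / (pw * ph + gw * gh - inter + eps)

    # Width and height of the smallest box enclosing both boxes.
    cw = max(px2, gx2) - min(px1, gx1)
    ch = max(py2, gy2) - min(py1, gy1)

    # Offset between box centers and its length.
    dx = (gx1 + gx2 - px1 - px2) / 2
    dy = (gy1 + gy2 - py1 - py2) / 2
    sigma = math.hypot(dx, dy) + eps

    # Angle cost: 0 when the centers are axis-aligned, 1 at 45 degrees.
    sin_alpha = min(abs(dx), abs(dy)) / sigma
    angle_cost = math.sin(2 * math.asin(sin_alpha))

    # Distance cost, modulated by the angle cost.
    gamma = 2 - angle_cost
    rho_x = (dx / (cw + eps)) ** 2
    rho_y = (dy / (ch + eps)) ** 2
    distance_cost = (1 - math.exp(-gamma * rho_x)) + (1 - math.exp(-gamma * rho_y))

    # Shape cost: penalizes width and height mismatch.
    omega_w = abs(pw - gw) / (max(pw, gw) + eps)
    omega_h = abs(ph - gh) / (max(ph, gh) + eps)
    shape_cost = (1 - math.exp(-omega_w)) ** theta + (1 - math.exp(-omega_h)) ** theta

    return 1 - iou + (distance_cost + shape_cost) / 2

print(siou_loss((0, 0, 10, 10), (2, 0, 12, 10)))
```

The angle term is what distinguishes this loss from CIoU in the sketch: the distance penalty depends on how the center offset is oriented, not only on its length.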

[0079] Table 3 Comparison of identification results for different IoU loss functions

[0080]

[0081] 3.4.3 Ablation Experiment

[0082] To evaluate the contribution of each improved module, YOLOv7 was used as the baseline model under the same environment and parameter settings, and the model's performance was verified as different modules were added. The experimental results are shown in Table 4, where "√" indicates that the corresponding method was added.

[0083] Table 4 Ablation Experiment Results

[0084]

[0085]

[0086] Group A gives the experimental results of the original YOLOv7 algorithm and serves as the baseline for the other six groups. Its detection accuracy is 89.7%, its speed is 65.8 frames/second, its parameter count is 36.5M, and its computational cost is 103.2 GFLOPs. In Group B, replacing the CBS module in the backbone with the ShuffleNetv2 module reduced the parameter count, computational cost, and detection mAP by 12M, 18.6 GFLOPs, and 1.3%, respectively, while increasing detection speed by 1.7 frames/second. This indicates that introducing the ShuffleNetv2 network reduces the number of parameters and the computational cost and improves detection speed, but at the cost of some accuracy. In Group C, embedding the ACmix attention mechanism in the SPPCSPC layer of the neck network improved detection accuracy by 1.2%, indicating that the attention mechanism enhances the model's feature extraction and feature integration capabilities. In Group D, changing the loss function to SIoU improved detection speed by 2.9 frames/second, demonstrating that SIoU can accelerate model convergence and reduce the loss value. Groups E and F, both based on the ShuffleNetv2 backbone, added the ACmix attention mechanism and the optimized loss function, respectively; in both, the parameter count and computational cost decreased to varying degrees. Group G used the full YOLO-SAS model. Compared with the previous groups, its speed increased to 68.8 frames/second, its parameter count and computational cost were significantly reduced, and its detection accuracy was the highest. This indicates that although introducing the ShuffleNetv2 network lowers detection accuracy, embedding the ACmix attention mechanism and improving the loss function can restore the accuracy to its pre-improvement level or even raise it further.
In summary, the YOLO-SAS model of this invention improves average detection accuracy while increasing computation speed, reducing time consumption, and improving real-time detection. It effectively balances accuracy against lightweight design and makes the detection of abnormal obstacles on power transmission lines feasible.
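The deltas quoted for Groups B, C, and D can be turned into absolute values with simple arithmetic. The sketch below uses only the baseline figures and deltas stated above; the resulting absolute values are derived here and are not quoted from Table 4, and treating the accuracy changes as percentage points is an assumption. The dictionary keys and the helper name `apply_delta` are hypothetical.

```python
# Baseline (Group A) values as stated in the text.
baseline = {"mAP": 89.7, "FPS": 65.8, "params_M": 36.5, "GFLOPs": 103.2}

# Reported changes relative to the baseline (accuracy assumed to be in percentage points).
deltas = {
    "B": {"params_M": -12.0, "GFLOPs": -18.6, "mAP": -1.3, "FPS": 1.7},
    "C": {"mAP": 1.2},
    "D": {"FPS": 2.9},
}

def apply_delta(base, delta):
    """Return the base metrics with the given changes applied, rounded to one decimal."""
    return {k: round(v + delta.get(k, 0.0), 1) for k, v in base.items()}

derived = {group: apply_delta(baseline, d) for group, d in deltas.items()}
for group, metrics in derived.items():
    print(group, metrics)
```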

[0087] 3.4.4 Comparative Experiment

[0088] To further evaluate the practicality of the YOLO-SAS model provided in the above embodiments of the present invention, it was compared with the current mainstream object detection models Faster R-CNN, SSD, YOLOv3, YOLOv5m, YOLOv5s, YOLOX, YOLOv7-tiny, and YOLOv7 on the same dataset, using the same training method and parameters. The experimental results are shown in Table 5.

[0089] Table 5 Comparison of experimental results

[0090]

[0091] Experimental results show that the YOLO-SAS model significantly outperforms the other models in detection accuracy and speed, achieving a mean average precision (mAP) of 92.6%. It also detects occluded targets effectively, demonstrating the superior performance of the YOLO-SAS model proposed in this invention. Compared with the original YOLOv7 network, introducing the lightweight ShuffleNetv2 network significantly reduces both the number of parameters and the computational cost, giving better accuracy and efficiency. Although the YOLO-SAS model provided by this invention is slightly inferior to models such as YOLOv5s, YOLOX, and YOLOv7-tiny in parameter count and floating-point operations, it offers higher detection speed, making it more suitable for deployment on edge devices with limited computing power and meeting the needs of intelligent power transmission line inspection and maintenance.
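Detection speed in frames per second is typically obtained by timing repeated inference calls. The sketch below shows one way to do this; `measure_fps`, the `infer` callable, and the warm-up count are hypothetical, and the measurement procedure actually used for Table 5 is not described in the text.

```python
import time

def measure_fps(infer, images, warmup=2):
    """Time infer() over all images and return images processed per second."""
    for img in images[:warmup]:
        infer(img)  # warm-up calls are excluded from the timing
    start = time.perf_counter()
    for img in images:
        infer(img)
    elapsed = time.perf_counter() - start
    return len(images) / elapsed if elapsed > 0 else float("inf")

# Stand-in for a detector: any callable that takes one image.
def dummy_infer(img):
    return sum(img)

fps = measure_fps(dummy_infer, [[0] * 1000 for _ in range(50)])
print(f"{fps:.1f} frames/second")
```

On a real device, the same input resolution and batch size should be used for every model being compared, since both strongly affect the measured rate.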

[0092] To demonstrate the comparison more intuitively, this embodiment of the invention selects representative, hard-to-detect images for visual analysis. Figure 8 compares the detection results of the experimental models: part (a) is the original image, and parts (b), (c), (d), (e), and (f) show the detection results of the Faster R-CNN, YOLOv5m, YOLOX, YOLOv7, and YOLO-SAS models, respectively. In Figure 8, the English text above each bounding box indicates the hazard category identified by the model, and the number is the model's confidence in that prediction. As can be seen from Figure 8, the Faster R-CNN model has serious false positives and false negatives. Although the YOLOv5m and YOLOX models perform better, they still produce false positives and false negatives.

[0093] Figure 9 compares detection results in multi-target scenes; the detection image resolution is 1024 × 1024. In Figure 9, part (a) is the original image, and parts (b), (c), (d), (e), and (f) show the detection results of the Faster R-CNN, YOLOv5m, YOLOX, YOLOv7, and YOLO-SAS models, respectively. As can be seen from Figure 9, when an image contains many external-damage targets, the average accuracy of the YOLO-SAS model provided by this invention is much higher than that of the other models. Although the Faster R-CNN model has good accuracy, its detection time is too long, making it unsuitable for detecting potential external-damage targets. Figure 10 compares detection results in complex nighttime scenes. In Figure 10, part (a) is the original image, and parts (b), (c), (d), (e), and (f) show the detection results of the Faster R-CNN, YOLOv5m, YOLOX, YOLOv7, and YOLO-SAS models, respectively. As can be seen from Figure 10, in scenes with complex backgrounds and low lighting, the other models exhibit more severe false detections and missed detections. The YOLO-SAS model provided in this invention, however, mitigates the interference of background noise and significantly reduces the false and missed detections of small targets seen in the original model.
In summary, in power transmission line scenarios the YOLO-SAS model is better adapted to actual conditions and fuses more feature information about small-target obstacles at different scales, which yields a more significant improvement in the detection of external obstacles.

[0094] 4. Experimental Conclusions

[0095] To address the false detections, missed detections, inaccurate identification, and low detection efficiency of the original YOLOv7 model when detecting small external obstacles in power transmission line environments, this invention proposes a YOLO-SAS model for detecting external obstacles on power transmission lines. First, it introduces the ShuffleNetv2 network to reduce the number of model parameters and improve detection speed. Second, it introduces an ACmix attention module in the neck layer to reduce the impact of complex backgrounds on feature extraction, strengthen the model's feature extraction capability, and improve its attention to and recognition accuracy for small targets. Finally, it uses the SIoU loss function for bounding box regression, leveraging its smoothness to improve the stability and convergence speed of model training. Experimental results show that the YOLO-SAS model achieves a mean average precision (mAP) of 92.6% in power transmission line scenarios, a 2.9% improvement over the original YOLOv7 network. The number of parameters and the computational cost are reduced by 11.1% and 45.8%, respectively, and the detection speed is increased to 68.8 frames per second, detecting approximately 300 images per second. Compared with other existing models, it exhibits superior performance and effectiveness.

[0096] The present invention has the following technical effects:

[0097] (1) The lightweight network ShuffleNetv2 was introduced into the backbone of the YOLOv7 network, which effectively reduced the number of parameters in the YOLO-SAS model and significantly improved the detection speed.

[0098] (2) The ACmix attention mechanism module was embedded into the neck of the network, which enhanced the feature extraction and feature integration capabilities of the YOLO-SAS model and improved the recognition accuracy of small targets.

[0099] (3) The loss function of the original network was optimized by using the SIoU loss function, which improved the localization performance of the YOLO-SAS model.

[0100] In one embodiment, a computer device is also provided, including a memory, a processor, and a computer program stored in the memory and executable on the processor. When the processor executes the computer program, it implements the method for detecting external obstacles on power transmission lines described in the above method embodiments. The computer device may be a database, and its internal structure may be as shown in Figure 11. The computer device includes a processor, memory, input/output (I/O) interfaces, and a communication interface. The processor, memory, and I/O interfaces are connected via a system bus, and the communication interface is also connected to the system bus via the I/O interfaces. The processor provides computational and control capabilities. The memory includes non-volatile storage media and internal memory. The non-volatile storage media stores the operating system, computer programs, and a database. The internal memory provides the environment for running the operating system and computer programs stored in the non-volatile storage media. The database stores pending transactions. The I/O interfaces are used for exchanging information between the processor and external devices. The communication interface is used for communicating with external terminals via a network connection. When the computer program is executed by the processor, it implements a data processing method.

[0101] In one embodiment, a computer-readable storage medium is provided having a computer program stored thereon, which, when executed by a processor, implements the method for detecting external obstacles to power transmission lines as described in the above method embodiments.

[0102] In one embodiment, a computer program product is provided, including a computer program that, when executed by a processor, implements the method for detecting external obstacles to power transmission lines as described in the above method embodiments.

[0103] It should be noted that the object information (including but not limited to object device information, object personal information, etc.) and data (including but not limited to data used for analysis, data stored, data displayed, etc.) involved in this application are all information and data authorized by the object or fully authorized by all parties, and the collection, use and processing of related data must comply with relevant laws, regulations and standards.

[0104] Those skilled in the art will understand that all or part of the processes in the methods of the above embodiments can be implemented by a computer program instructing related hardware. The computer program can be stored in a non-volatile computer-readable storage medium, and when executed, it can include the processes of the embodiments of the above methods. Any references to memory, databases, or other media used in the embodiments provided in this application can include at least one of non-volatile and volatile memory. Non-volatile memory can include read-only memory (ROM), magnetic tape, floppy disk, flash memory, optical memory, high-density embedded non-volatile memory, resistive random access memory (ReRAM), magnetic random access memory (MRAM), ferroelectric random access memory (FRAM), phase change memory (PCM), graphene memory, etc. Volatile memory can include random access memory (RAM) or external cache memory, etc. By way of illustration and not limitation, RAM can take many forms, such as Static Random Access Memory (SRAM) or Dynamic Random Access Memory (DRAM). The databases involved in the embodiments provided in this application may include at least one type of relational database and non-relational database. Non-relational databases may include, but are not limited to, blockchain-based distributed databases. The processors involved in the embodiments provided in this application may be general-purpose processors, central processing units, graphics processing units, digital signal processors, programmable logic devices, quantum computing-based data processing logic devices, etc., and are not limited to these.

[0105] The technical features of the above embodiments can be combined in any way. For the sake of brevity, not all possible combinations of the technical features in the above embodiments are described. However, as long as there is no contradiction in the combination of these technical features, they should be considered to be within the scope of this specification.

[0106] Specific examples have been used in the embodiments of this invention to illustrate the principles and implementation methods of the invention. The descriptions of the above embodiments are only for the purpose of helping to understand the method and core ideas of this invention; at the same time, those skilled in the art will recognize that, based on the ideas of this invention, there will be changes in the specific implementation methods and application scope. Therefore, the content of this specification should not be construed as a limitation of this invention.

Claims

1. A method for detecting external obstructions on power transmission lines, characterized in that it comprises:
acquiring an image of the power transmission line in a target area;
inputting the image of the power transmission line in the target area into a target detection model to determine whether there is an external obstacle on the power transmission line in the target area and the category of the external obstacle;
wherein the target detection model is obtained by training a YOLO-SAS model; the YOLO-SAS model includes a backbone network, a neck network, and a head network connected in sequence; the backbone network of the YOLO-SAS model is obtained by replacing the CBS module in the backbone network of the YOLOv7 network with the ShuffleNetv2 model; the neck network of the YOLO-SAS model is obtained by adding an ACmix module to the neck network of the YOLOv7 network; the SPPCSPC module in the neck network is connected to the first Concat module and the first CBS module in the neck network through the ACmix module; the head network of the YOLO-SAS model has the same structure as the head network of the YOLOv7 network;
the neck network of the YOLO-SAS model includes: an SPPCSPC module, an ACmix module, four CBS modules, two Upsample modules, four ELAN-W modules, four Concat modules, and two MP-2 modules; the inputs of the SPPCSPC module, the second CBS module, and the third CBS module are all connected to the backbone network; the output of the SPPCSPC module is connected to the input of the ACmix module, and the output of the ACmix module is connected to the inputs of the first CBS module and the first Concat module, respectively; the output of the first CBS module is connected to the input of the first Upsample module; the outputs of the first Upsample module and the second CBS module are both connected to the input of the second Concat module; the output of the second Concat module is connected to the input of the second ELAN-W module, and the output of the second ELAN-W module is connected to the inputs of the fourth CBS module and the fourth Concat module, respectively; the output of the fourth CBS module is connected to the input of the second Upsample module; the outputs of the second Upsample module and the third CBS module are both connected to the input of the third Concat module; the output of the third Concat module is connected to the input of the fourth ELAN-W module; the output of the fourth ELAN-W module is connected to the input of the second MP-2 module and to the head network, respectively; the output of the second MP-2 module is connected to the input of the fourth Concat module; the output of the fourth Concat module is connected to the input of the third ELAN-W module, and the output of the third ELAN-W module is connected to the input of the first MP-2 module and to the head network, respectively; the output of the first MP-2 module is connected to the input of the first Concat module; the output of the first Concat module is connected to the input of the first ELAN-W module, and the output of the first ELAN-W module is connected to the head network;
the backbone network of the YOLO-SAS model includes a ShuffleNetv2 model module, four ELAN modules, and three MP-1 modules; the ShuffleNetv2 model module includes four ShuffleNetv2 models; the output of the first ShuffleNetv2 model is connected to the input of the second ShuffleNetv2 model, the output of the second ShuffleNetv2 model is connected to the input of the third ShuffleNetv2 model, the output of the third ShuffleNetv2 model is connected to the input of the fourth ShuffleNetv2 model, and the output of the fourth ShuffleNetv2 model is connected to the input of the first ELAN module; the output of the first ELAN module is connected to the inputs of the first MP-1 module and the third CBS module, respectively; the output of the first MP-1 module is connected to the input of the second ELAN module; the output of the second ELAN module is connected to the input of the second MP-1 module; the output of the second MP-1 module is connected to the input of the third ELAN module; the output of the third ELAN module is connected to the inputs of the second CBS module and the third MP-1 module, respectively; the output of the third MP-1 module is connected to the input of the fourth ELAN module; the output of the fourth ELAN module is connected to the input of the SPPCSPC module;
the output of the first ELAN-W module is connected to the input of the first RepConv module in the head network; the output of the third ELAN-W module is connected to the input of the second RepConv module in the head network; and the output of the fourth ELAN-W module is connected to the input of the third RepConv module in the head network.

2. The method for detecting external obstructions on power transmission lines according to claim 1, characterized in that the loss function used during the training of the YOLO-SAS model is the SIoU function.

3. A computer device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that the processor executes the computer program to implement the method for detecting external obstructions on power transmission lines as described in any one of claims 1-2.

4. A computer-readable storage medium having a computer program stored thereon, characterized in that, when executed by a processor, the computer program implements the method for detecting external obstructions on power transmission lines as described in any one of claims 1-2.

5. A computer program product, comprising a computer program, characterized in that, when executed by a processor, the computer program implements the method for detecting external obstructions on power transmission lines as described in any one of claims 1-2.
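The neck wiring recited in claim 1 can be written down as a small dependency graph and checked for consistency. The sketch below is only a reading aid: the node labels (for example `MP-2_1`, `RepConv1`) are hypothetical names for the ordinal modules in the claim, and the three backbone feeds are collapsed into a single `backbone` node for simplicity, although in the claim they come from different backbone stages.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Each key is a module; its list holds the modules whose outputs feed it.
NECK_INPUTS = {
    "SPPCSPC": ["backbone"],
    "CBS2": ["backbone"],
    "CBS3": ["backbone"],
    "ACmix": ["SPPCSPC"],
    "CBS1": ["ACmix"],
    "Upsample1": ["CBS1"],
    "Concat2": ["Upsample1", "CBS2"],
    "ELAN-W2": ["Concat2"],
    "CBS4": ["ELAN-W2"],
    "Upsample2": ["CBS4"],
    "Concat3": ["Upsample2", "CBS3"],
    "ELAN-W4": ["Concat3"],
    "MP-2_2": ["ELAN-W4"],
    "Concat4": ["MP-2_2", "ELAN-W2"],
    "ELAN-W3": ["Concat4"],
    "MP-2_1": ["ELAN-W3"],
    "Concat1": ["MP-2_1", "ACmix"],
    "ELAN-W1": ["Concat1"],
    # Head inputs, one RepConv per detection scale.
    "RepConv1": ["ELAN-W1"],
    "RepConv2": ["ELAN-W3"],
    "RepConv3": ["ELAN-W4"],
}

# static_order() raises CycleError if the wiring contains a loop.
order = list(TopologicalSorter(NECK_INPUTS).static_order())
neck_modules = [n for n in NECK_INPUTS if not n.startswith("RepConv")]
print(len(neck_modules), "neck modules")
print(" -> ".join(order))
```

The module count produced this way matches the inventory in the claim (one SPPCSPC, one ACmix, four CBS, two Upsample, four ELAN-W, four Concat, and two MP-2 modules), and every Concat module has exactly two inputs.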