Insulator defect detection method, device, and medium for high-resolution images

By generating a clipping window based on defect annotation boxes and using a semi-decoupled detection head with a priori-driven paradigm, the accuracy and efficiency issues of insulator defect detection in high-resolution images are solved, achieving efficient and accurate detection on UAV edge computing devices.

CN122199518A · Pending · Publication Date: 2026-06-12 · STATE GRID SICHUAN ELECTRIC POWER CORP ELECTRIC POWER RES INST +1

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Applications (China)
Current Assignee / Owner: STATE GRID SICHUAN ELECTRIC POWER CORP ELECTRIC POWER RES INST
Filing Date: 2026-04-22
Publication Date: 2026-06-12

AI Technical Summary

Technical Problem

In high-resolution images from drone inspections, existing technologies struggle to balance detail preservation and the integrity of defective targets with limited computing power, resulting in low detection accuracy.

Method used

By generating a clipping window based on defect annotation boxes, the target detection model is trained, and only the region of interest is detected during the inference phase. Defect detection is performed using a semi-decoupled detection head based on the prior-driven paradigm.

Benefits of technology

It enables efficient and accurate insulator defect detection with limited computing power, improving detection accuracy and computational efficiency while reducing the demand for computing power.



Abstract

This invention discloses a method, device, and medium for insulator defect detection in high-resolution images, relating to the field of image processing technology. The method includes: acquiring multiple labeled high-resolution inspection images, each carrying insulator defect bounding boxes and defect types; determining multiple candidate cropping windows based on the defect bounding boxes in each image, each window including a cropping center point and a cropping size; calculating the number of defect bounding boxes covered by each window, and cropping the image using the window with the most bounding boxes to obtain a high-density sub-image, improving the quality of training samples; training a defect detection model based on the sub-images; generating regions of interest (ROIs) for unlabeled inspection images; inputting only the ROIs into the model to obtain defect detection boxes and their types, significantly reducing computational overhead, and mapping the coordinates back to the original image coordinate system to obtain the final detection result, thereby achieving efficient and accurate detection of insulator defects in high-resolution images with limited computing power.

Description

Technical Field

[0001] This invention relates to the field of image data processing technology, and specifically to a method, device, and medium for detecting insulator defects in high-resolution images.

Background Technology

[0002] With the rapid development of UAV remote sensing technology and power line inspection equipment, high-resolution images acquired by UAVs have been widely used for defect detection of insulators in power transmission lines. By analyzing the inspection images using target detection models, the automatic location and identification of insulators and their defects can be achieved.

[0003] However, UAV inspection images are typically characterized by high resolution, large image size, and complex backgrounds, while the UAV itself or its onboard edge computing devices are generally limited in terms of computing power, storage, and power consumption. To adapt to the fixed input size of the target detection model, existing technologies usually perform overall image compression, but this leads to the loss of fine-grained spatial information in the image. This makes it difficult to effectively distinguish small-scale targets such as insulator defects from the background after scaling, severely affecting detection accuracy.

[0004] To avoid feature loss caused by overall compression, another existing technique employs image cropping, dividing the inspection image into multiple sub-images for separate detection. This method can amplify the relative scale of defective targets, reduce background interference, and improve the detection performance of small targets. However, traditional cropping strategies are mostly task-independent fixed rule divisions. On the one hand, this can easily generate a large number of sub-images that do not contain valid targets, resulting in a waste of computational resources; on the other hand, fixed cropping methods can easily lead to the separation of defective targets, destroying their complete structural features and hindering model learning.

[0005] In summary, how to achieve efficient and accurate detection of insulator defects while maintaining the detail of high-resolution images and the integrity of defect targets under limited computing power has become an urgent technical problem to be solved.

Summary of the Invention

[0006] The technical problem to be solved by this invention is how to achieve efficient and accurate detection of insulator defects in high-resolution images under limited computing power. The purpose is to provide an insulator defect detection method, device and medium for high-resolution images, thereby solving the above-mentioned problem.

[0007] This invention is achieved through the following technical solution:

[0008] In a first aspect, the present invention provides an insulator defect detection method for high-resolution images, comprising:

[0009] Acquire multiple labeled inspection images; each inspection image is a high-resolution image containing insulators and carrying tag information; the tag information includes defect annotation boxes on the insulators and the corresponding defect types;

[0010] Based on the defect annotation boxes in each inspection image, multiple candidate cropping windows are determined; each candidate cropping window includes a cropping center point and a cropping size; the cropping size does not exceed the width and height of each inspection image;

[0011] Calculate the number of defect annotation boxes covered by each candidate cropping window, and crop each inspection image according to the candidate cropping window that covers the most defect annotation boxes to obtain a sub-image;

[0012] Train the target detection model on the multiple sub-images cropped from the multiple inspection images to obtain a trained defect detection model.

[0013] Based on the unlabeled inspection image, at least one region of interest is generated; the region of interest is an image region that may contain insulators.

[0014] Each region of interest is input into the defect detection model to obtain the defect detection box and its defect type within each region of interest. The coordinates of the defect detection box are then mapped back to the coordinate system of the unlabeled inspection image to obtain the insulator defect detection result of the unlabeled inspection image.

[0015] Optionally, the cropping size is determined by the following constraint:

[0016] $n \cdot D \le \min(w, h)$

[0017] Where D is the size base, which takes values that are positive integer multiples of 32; n is the size scaling factor; w is the width of the inspection image; h is the height of the inspection image; and min() represents the minimum value function.

[0018] Optionally, the defect detection model includes:

[0019] The feature extraction layer is used to scale and extract features from the input image to be detected according to a preset ratio, and output multiple feature maps at different scales; the image to be detected is the sub-image or the region of interest.

[0020] A multi-scale feature fusion layer is used to perform cross-scale fusion of feature maps of different scales to generate an enhanced fused feature map.

[0021] The detection output layer employs a semi-decoupled detection head based on a priori-driven paradigm, including:

[0022] The feature decoupling module is used to decouple the fused feature map into mutually independent category feature maps and location feature maps to form category prediction branches and location prediction branches; wherein, the location prediction branch uses the prior anchor box generated based on the defect annotation box as the regression benchmark.

[0023] The branch fusion module is used to concatenate the category feature map and the location feature map;

[0024] The unified prediction module is used to perform 1×1 convolution, feature map reshaping, and feature map dimension permutation on the concatenated feature map, and output the defect detection box and defect type of the image to be detected.

[0025] Optionally, the prior anchor frame is generated through the following steps:

[0026] Generate an initial population of anchor box sets; each anchor box set contains a group of anchor boxes, each anchor box being represented by its width and height.

[0027] The average maximum intersection over union (IoU) between the anchor boxes and the defect annotation boxes is used as the fitness function;

[0028] The initial set of anchor frames is iteratively optimized based on the fitness function until a preset termination condition is reached, and the set of anchor frames with the highest fitness is taken as the prior anchor frames.

[0029] Optionally, the step of iteratively optimizing the initial anchor box set based on the fitness function until a preset termination condition is reached, and using the anchor box set with the highest fitness as the prior anchor boxes, includes:

[0030] In each iteration, the selection probability of each anchor box set is calculated according to the fitness function;

[0031] Multiple parent anchor frame sets are selected from the current anchor frame set based on the selection probability;

[0032] Perform a crossover operation on the corresponding anchor boxes in any two parent anchor box sets to generate offspring anchor boxes;

[0033] Perform a mutation operation on the offspring anchor frame;

[0034] Repeat the selection, crossover, and mutation operations until the preset termination condition is met, and use the set of anchor frames with the highest fitness as the prior anchor frames.

[0035] Optionally, the fitness function is expressed as follows:

[0036] $F\left(A_m^{(t)}\right) = \frac{1}{N} \sum_{i=1}^{N} \max_{1 \le j \le k} \mathrm{IoU}\left(b_i, a_{m,j}^{(t)}\right)$

[0037] Where t represents the generation number; $A_m^{(t)}$ represents the m-th anchor box set in generation t; $F(A_m^{(t)})$ is the fitness of anchor box set $A_m^{(t)}$ with respect to the defect annotation boxes; N is the total number of defect annotation boxes; IoU represents the intersection over union function; $b_i$ represents the i-th defect annotation box; $a_{m,j}^{(t)}$ represents the j-th anchor box in anchor box set $A_m^{(t)}$; k is the number of anchor boxes contained in each anchor box set.

[0038] Optionally, the step of determining multiple candidate cropping windows based on the defect annotation boxes in each inspection image includes:

[0039] Obtain the center point coordinates of each defect annotation box in each inspection image, perform cluster analysis on the center point coordinates, and determine the neighborhood where the cluster center is located as the defect-dense area.

[0040] Multiple candidate clipping windows are generated centered on the region with dense defects.

[0041] Optionally, the coordinates of the defect detection box are mapped back to the coordinate system of the unlabeled inspection image, calculated using the following formula:

[0042] $(x_g, y_g) = (x_r, y_r) + (x_l, y_l) \odot \left(\frac{w_r}{w_{in}}, \frac{h_r}{h_{in}}\right)$;

[0043] where $(x_g, y_g)$ are the global coordinates of the top-left corner of the defect detection box in the unlabeled inspection image; $(x_l, y_l)$ are the local pixel coordinates of the top-left corner of the defect detection box within the corresponding region of interest sub-image; $(x_r, y_r)$ are the global coordinates of the top-left corner of the region of interest in the unlabeled inspection image; $w_r$ and $h_r$ are the width and height of the region of interest in the unlabeled inspection image, respectively; $w_{in}$ and $h_{in}$ are the width and height of the region of interest sub-image input into the object detection model during inference; and $\odot$ denotes element-wise multiplication.

[0044] In a second aspect, the present invention provides a computer device comprising a memory and a processor, wherein the memory stores a computer program, and the processor executes the computer program to implement an insulator defect detection method for high-resolution images as described in any one of the first aspects.

[0045] In a third aspect, the present invention provides a computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements an insulator defect detection method for high-resolution images as described in any one of the first aspects.

[0046] Compared with the prior art, the present invention has the following advantages and beneficial effects:

[0047] This invention provides a method for insulator defect detection in high-resolution images. During the training phase, a cropping window is generated based on the distribution of defect bounding boxes to ensure the defect target is fully contained, enabling the model to learn complete defect morphological features and improve detection accuracy. Simultaneously, guided cropping is performed based on the principle of "covering the largest number of defect bounding boxes," generating high-density sub-images. Compared to random cropping or sliding window generation of numerous redundant samples, this invention uses fewer sub-images to cover richer defect information, reducing the need for massive training data and accelerating model convergence. During the inference phase, only regions of interest are detected, avoiding unnecessary computation caused by full image traversal. The detection results are then restored to the original large image through coordinate mapping, preserving the detailed information of the high-resolution image while significantly reducing computational requirements, thereby achieving efficient and accurate insulator defect detection.

Description of the Drawings

[0048] To more clearly illustrate the technical solutions of the exemplary embodiments of the present invention, the accompanying drawings used in the embodiments will be briefly described below. It should be understood that the following drawings only show some embodiments of the present invention and should not be considered as a limitation of the scope. For those skilled in the art, other related drawings can be obtained based on these drawings without creative effort. In the drawings:

[0049] Figure 1 is a flowchart of the insulator defect detection method for high-resolution images provided in an embodiment of this application;

[0050] Figure 2 is a schematic diagram of the structure of the defect detection model provided in an embodiment of this application;

[0051] Figure 3 is a schematic diagram of the structure of the semi-decoupled detection head provided in an embodiment of this application;

[0052] Figure 4 is a schematic diagram of the specific process of generating prior anchor boxes using a genetic algorithm, as provided in an embodiment of this application.

Detailed Implementation

[0053] To make the objectives, technical solutions, and advantages of the present invention clearer, the present invention will be further described in detail below with reference to the embodiments and accompanying drawings. The illustrative embodiments and descriptions of the present invention are only used to explain the present invention and are not intended to limit the present invention.

[0054] Existing drone-based insulator defect detection technologies still have the following shortcomings:

[0055] 1. There is a serious scale mismatch between high-resolution images and the fixed input size of the detection model, and the overall scaling process leads to the loss of small-scale defect features;

[0056] 2. Traditional clipping or sliding window methods lack task guidance, have large computational redundancy, and low resource utilization efficiency;

[0057] 3. Inconsistent processing strategies between the training and inference phases affect the stability of the model in actual deployment;

[0058] 4. The detection model structure is too general and lacks targeted optimization for the characteristics of insulator defects, making it difficult to balance detection accuracy and inference efficiency under low computing power conditions.

[0059] Therefore, this application provides a method for detecting insulator defects in high-resolution images, the specific process of which is shown in Figure 1. The method is described below with reference to Figure 1.

[0060] S1. Obtain multiple labeled inspection images.

[0061] In the specific implementation process, high-resolution inspection images containing transmission line insulators are acquired by imaging equipment carried by drones. These are visible-light images with resolutions of approximately 3000×5000 to 6000×8000 pixels, the specific resolution being determined by the requirements of the inspection task. This is significantly higher than the fixed input size (640×640) of the target detection model.

[0062] The collected inspection images are manually or automatically annotated to generate labeled images carrying tag information. The tag information includes defect annotation boxes on the insulators and their corresponding defect types; wherein, the defect annotation box is the smallest bounding rectangle containing the defect target, and the defect type includes at least one of the following: insulator spontaneous explosion, breakage, dirt, and corrosion.

[0063] S2. Based on the defect annotation boxes in each inspection image, determine multiple candidate clipping windows.

[0064] In one possible embodiment, the coordinates of the center point of each defect annotation box in each inspection image are obtained, cluster analysis is performed on the center point coordinates, and the neighborhood where the cluster center is located is determined as the defect-dense region; multiple candidate clipping windows are generated with the defect-dense region as the center.

[0065] In this embodiment, cluster analysis is used to locate densely populated defect regions, and cropping windows are generated centered on these regions. This allows the cropping strategy to focus on the effective regions in the image that actually contain defects. Compared to traditional sliding window or regular grid cropping methods, this method avoids generating a large number of invalid cropping windows that do not contain any defective targets on the entire background area of the image, significantly reducing redundant computation and saving training resources.

[0066] Each candidate cropping window includes a cropping center point and a cropping size. The cropping center point can be selected from the cluster center point or a point within its neighborhood. The cropping size does not exceed the width and height of each inspected image and is determined by the following constraints:

[0067] $n \cdot D \le \min(w, h)$

[0068] Where D is the size base, a constant that is a positive integer multiple of 32. This choice reflects the overall downsampling factor of 32 commonly used in object detection models: using a multiple of 32 as the size base keeps the feature maps aligned in size during model downsampling, avoiding feature loss or computational redundancy due to size mismatch. n is the size scaling factor, a variable used to control the specific size of the cropping window. Theoretically, n can take any positive integer that satisfies the constraint. In practical applications, the value of n is determined comprehensively based on factors such as data distribution, defect target scale, and model input requirements. w is the width of the inspection image; h is the height of the inspection image; and min() takes the smaller of the width and height of the inspection image, i.e., the shorter side of the image.

[0069] In this embodiment, the constraint ensures that the size of the cropped sub-image never exceeds the boundary of the original image. At the same time, through the multiple design of the size basis D, the sub-image can adapt to the downsampling factor of the model, ensuring the integrity of feature extraction.
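As a rough illustration of the size constraint, the following sketch enumerates candidate cropping sizes and builds windows that stay inside the image. It assumes a square window whose side is n × D and which must not exceed min(w, h), as the definitions of D and n suggest; the function names, the square shape, and the shift-to-fit behavior are illustrative assumptions rather than details from the application.

```python
def candidate_sizes(img_w, img_h, base=640):
    """Return every crop side n * base (n = 1, 2, ...) that fits in the image.

    `base` plays the role of the size base D and must be a positive
    multiple of 32 (assumption: the crop is square with side n * D).
    """
    if base <= 0 or base % 32 != 0:
        raise ValueError("base must be a positive multiple of 32")
    limit = min(img_w, img_h)  # shorter side of the inspection image
    return [n * base for n in range(1, limit // base + 1)]


def make_window(cx, cy, size, img_w, img_h):
    """Square window (x0, y0, x1, y1) of side `size` centred near (cx, cy).

    The window is shifted, not shrunk, so that it stays inside the image.
    """
    x0 = min(max(int(round(cx - size / 2)), 0), img_w - size)
    y0 = min(max(int(round(cy - size / 2)), 0), img_h - size)
    return (x0, y0, x0 + size, y0 + size)


# Hypothetical 4000 x 3000 image with a defect cluster near the top-right corner.
sizes = candidate_sizes(4000, 3000, base=640)
windows = [make_window(3900, 100, s, 4000, 3000) for s in sizes]
```

Shifting the window instead of shrinking it keeps every candidate at a size that is a multiple of 32, which is the property the size base is meant to guarantee.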

[0070] S3. Calculate the number of defect annotation boxes covered by each candidate cropping window, and crop each inspection image according to the candidate cropping window that covers the most defect annotation boxes to obtain a sub-image.

[0071] In the specific implementation process, the number of defect annotation boxes covered by each candidate cropping window is calculated, and the candidate cropping window covering the largest number of defect annotation boxes is selected as the optimal cropping window for the current inspection image. Then, based on the cropping center point and cropping size of the optimal cropping window, the original inspection image is cropped to obtain sub-images.

[0072] In one possible embodiment, if multiple candidate cropping windows cover the same number of defect annotation boxes, the cropping window that is closest to the input size required by the target detection model is selected, and each inspection image is cropped to reduce the scaling during subsequent model processing, thereby preserving the original image details to the greatest extent.
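The selection rule in S3, including the tie-break toward the model input size, can be sketched as follows. This is a minimal illustration that assumes a box counts as "covered" only when it lies entirely inside the window and that windows are square; both the containment test and the names `covers` and `best_window` are assumptions made for this example.

```python
def covers(window, box):
    """True if the annotation box lies entirely inside the window.

    Both arguments are (x0, y0, x1, y1) in original-image pixels.
    """
    wx0, wy0, wx1, wy1 = window
    bx0, by0, bx1, by1 = box
    return wx0 <= bx0 and wy0 <= by0 and bx1 <= wx1 and by1 <= wy1


def best_window(windows, boxes, model_input=640):
    """Pick the window covering the most boxes.

    Ties are broken by the window side closest to the model input size.
    """
    def key(window):
        count = sum(covers(window, b) for b in boxes)
        side = window[2] - window[0]
        return (-count, abs(side - model_input))

    return min(windows, key=key)


# Hypothetical annotation boxes and candidate windows.
boxes = [(100, 100, 150, 160), (400, 380, 450, 420), (1500, 900, 1540, 950)]
windows = [(0, 0, 640, 640), (0, 0, 1280, 1280), (1000, 600, 1640, 1240)]
chosen = best_window(windows, boxes)
```

In this example the first two windows both cover two boxes, so the 640-pixel window is chosen because it needs no rescaling before entering the model.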

[0073] S4. The target detection model is trained based on multiple sub-images obtained by cropping multiple inspection images to obtain a trained defect detection model.

[0074] In the specific implementation process, multiple sub-images cropped from multiple inspection images are used as training samples. These sub-images are input into the target detection model in batches, and the prediction results are calculated through forward propagation. Defect annotation boxes are used as supervision signals to construct a loss function to optimize the model parameters. The loss function includes two parts: category loss and location loss. Through multiple rounds of iterative optimization, the model's detection performance on the validation set reaches the preset requirements or the loss function converges, resulting in a well-trained defect detection model.

[0075] S5. Generate at least one region of interest based on the unlabeled inspection image.

[0076] In the specific implementation process, unlabeled high-resolution inspection images are input into a pre-trained coarse screening model. The model outputs several candidate regions and their confidence scores. Candidate regions with confidence scores higher than a preset threshold are selected as Regions of Interest (ROIs). Finally, ROIs are cropped from the unlabeled inspection images. ROIs are image regions that may contain insulators; each ROI is defined by its bounding box coordinates in the original image (such as the coordinates of the top-left corner, width, and height). The coarse screening model is a simple, low-parameter model that can quickly identify regions in an image that may contain targets (especially small or densely packed targets) with low computational cost, while effectively suppressing a large number of background regions that do not contain targets.
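The thresholding and cropping part of S5 might look like the sketch below. The coarse screening model itself is not specified in the text, so its output is represented here by a hand-written list of (region, score) pairs; the threshold value, the nested-list image, and the function names are all illustrative assumptions.

```python
def select_rois(candidates, threshold=0.5):
    """Keep candidate regions whose confidence exceeds the threshold.

    Each candidate is ((x, y, w, h), score) in original-image pixels.
    """
    return [region for region, score in candidates if score > threshold]


def crop(image_rows, roi):
    """Cut the ROI (x, y, w, h) out of an image stored as a list of pixel rows."""
    x, y, w, h = roi
    return [row[x:x + w] for row in image_rows[y:y + h]]


# Stand-in for the coarse screening model output (hypothetical values).
candidates = [((2, 1, 3, 2), 0.91), ((0, 0, 2, 2), 0.12), ((1, 2, 2, 2), 0.67)]
rois = select_rois(candidates, threshold=0.5)

# Tiny 6 x 4 "image" whose pixel value encodes its row and column.
image = [[10 * r + c for c in range(6)] for r in range(4)]
patch = crop(image, rois[0])
```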

[0077] S6. Input each region of interest into the defect detection model to obtain the defect detection box and its defect type within each region of interest. Map the coordinates of the defect detection box back to the coordinate system of the unlabeled inspection image to obtain the insulator defect detection results of the unlabeled inspection image.

[0078] In the implementation process, each region of interest (ROI) is input into the defect detection model. The model automatically scales the ROI according to a preset ratio to fit the model's fixed input size. Forward inference is then performed, outputting the defect detection results within that ROI, including one or more defect detection boxes and their corresponding defect types. Each defect detection box is defined by its relative coordinates within the ROI and its category confidence score. The relative coordinates include the top-left corner coordinates, width, and height of the defect detection box.

[0079] The coordinates of the defect detection box are then mapped back to the coordinate system of the unlabeled inspection image. Finally, based on the global coordinates of the defect detection box obtained after mapping and its corresponding defect type, the defect detection box is drawn on the original unlabeled inspection image, the defect category name and confidence score are labeled, and an intuitive defect detection visualization result is generated.

[0080] In one possible embodiment, the coordinates of the defect detection box are mapped back to the coordinate system of the unlabeled inspection image, calculated using the following formula:

[0081] $(x_g, y_g) = (x_r, y_r) + (x_l, y_l) \odot \left(\frac{w_r}{w_{in}}, \frac{h_r}{h_{in}}\right)$;

[0082] where $(x_g, y_g)$ are the global coordinates of the top-left corner of the defect detection box in the unlabeled inspection image; $(x_l, y_l)$ are the local pixel coordinates of the top-left corner of the defect detection box within the corresponding region of interest sub-image; $(x_r, y_r)$ are the global coordinates of the top-left corner of the region of interest in the unlabeled inspection image; $w_r$ and $h_r$ are the width and height of the region of interest in the unlabeled inspection image, respectively; $w_{in}$ and $h_{in}$ are the width and height of the region of interest sub-image input into the object detection model during inference; and $\odot$ denotes element-wise multiplication.
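The mapping from ROI-local coordinates back to the original image can be sketched as below. The top-left corner follows the description in the text (ROI offset plus local coordinates scaled by the ratio of ROI size to model input size); scaling the box width and height by the same ratios is an added assumption for completeness, and the function name and tuple layout are illustrative.

```python
def to_global(local_box, roi, input_size):
    """Map a box from the resized ROI back to original-image pixels.

    local_box:  (x, y, w, h) in the ROI image as fed to the model
    roi:        (x, y, w, h) of the ROI in the original image
    input_size: (width, height) of the model input
    """
    lx, ly, lw, lh = local_box
    rx, ry, rw, rh = roi
    in_w, in_h = input_size
    sx, sy = rw / in_w, rh / in_h  # per-axis scale from model input to ROI
    return (rx + lx * sx, ry + ly * sy, lw * sx, lh * sy)


# Hypothetical ROI of 1280 x 960 pixels resized to a 640 x 640 model input.
global_box = to_global((320, 160, 64, 32), (2000, 1000, 1280, 960), (640, 640))
```

Because the ROI is not necessarily square, the horizontal and vertical scale factors differ (2.0 and 1.5 here), which is why the mapping is applied element-wise.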

[0083] Please refer to Figure 2, which is a schematic diagram of the defect detection model provided in an embodiment of this application. The defect detection model includes: a feature extraction layer, a multi-scale feature fusion layer, and a detection output layer.

[0084] The feature extraction layer is used to scale and extract features from the input image to be detected according to a preset ratio, and output multiple feature maps at different scales. During the model training phase, the input image to be detected for the feature extraction layer is the sub-image generated by S3, and during the model inference phase, the input image to be detected for the feature extraction layer is the region of interest generated by S5.

[0085] The multi-scale feature fusion layer is used to fuse feature maps of different scales across scales to generate an enhanced fused feature map; the detection output layer is used to predict the location and category of the fused feature map.

[0086] In one possible embodiment, the detection output layer employs a semi-decoupled detection head based on a priori-driven paradigm; please refer to Figure 3, which is a schematic diagram of the structure of a semi-decoupled detection head provided in an embodiment of this application. The semi-decoupled detection head includes: a feature decoupling module, a branch fusion module, and a unified prediction module.

[0087] The feature decoupling module is used to decouple the fused feature map into independent category feature maps and location feature maps to form category prediction branches and location prediction branches. The location prediction branch uses prior anchor boxes generated based on defect annotation boxes as regression benchmarks. Each branch includes a 3×3 depthwise separable convolution (3×3 DWConv) for further processing of its respective feature map.

[0088] The branch fusion module is used to concatenate the category feature map and the location feature map;

[0089] The unified prediction module is used to perform 1×1 convolution (1×1 Conv), feature map reshaping, and feature map dimension permutation on the concatenated feature map, and output the defect detection box and its defect type of the image to be detected.

[0090] Among them, feature map reshaping is used to flatten the spatial dimension so that each spatial location corresponds to a candidate detection box; feature map dimension permutation is used to adjust the order of channel dimension and spatial dimension so that the output format is adapted to subsequent loss calculation or result parsing.

[0091] The semi-decoupled detection head of the priori-driven paradigm proposed in this application has significant advantages in terms of lightweight design, and its parameter calculation formula is as follows:

[0092]

[0093] in, The parameters representing the semi-decoupled detection head in the prior-driven paradigm. This represents the number of parameters in a 3×3 depth-separable convolution. This represents the number of parameters in a 1×1 convolution, where K represents the kernel size. Indicates the number of input channels. Indicates the number of output channels.

[0094] In contrast, the formula for calculating the parameters of a standard detection head is as follows:

[0095]

[0096] in, This indicates the parameter values ​​of the standard detection head. This represents the number of parameters in the location prediction branch. The number of parameters representing the category prediction branch. and N represents the number of channels in the intermediate layer of the standard detection head, N represents the number of categories being detected, and K represents the kernel size. Indicates the number of input channels. Indicates the number of output channels.

[0097] From the above formula, we can see that... The computational load is ,and The computational load is The significant differences between the two demonstrate the advantage in parameter quantity between the semi-decoupled detection head of the prior-driven paradigm proposed in this invention and the standard detection head. Obviously, the semi-decoupled detection head proposed in this invention is significantly lighter than the standard detection head and is more suitable for deployment on edge devices with limited computing power.

[0098] In this embodiment, the present invention employs a semi-decoupled detection head with prior constraints. This structure performs relatively independent feature modeling for the category prediction and location prediction of the defect target, forming parallel category prediction branches and location prediction branches, reducing feature coupling and mutual interference between the two tasks. Simultaneously, a prior anchor frame generated based on the geometric characteristics of the defect annotation box is introduced into the location prediction branch as a regression benchmark, transforming the model from "directly predicting absolute coordinates" to "predicting the offset relative to the prior anchor frame," significantly reducing the learning difficulty of target localization. Since the prior anchor frame reflects the true size distribution of the defect, the model does not need to learn the target scale from zero during prediction, thus achieving stable localization accuracy even with limited training samples.

[0099] The defect detection model adopts a lightweight structural design, replacing standard convolution with operations such as depthwise separable convolution and 1×1 convolution, significantly reducing the number of model parameters and computational load. While maintaining detection accuracy, the model can be deployed on hardware platforms with limited computing power, such as UAV-borne edge computing devices, meeting the real-time requirements of actual inspection scenarios.
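To give a sense of the saving from replacing standard convolution with depthwise separable convolution, the sketch below compares the weight counts of the two layer types. It uses the general textbook counts (biases ignored), not the head-level parameter formulas of this application, and the channel numbers are hypothetical.

```python
def standard_conv_params(k, c_in, c_out):
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(k, c_in, c_out):
    """Weights in a k x k depthwise convolution followed by a 1 x 1 pointwise convolution."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


# Hypothetical layer: 3 x 3 kernel, 256 input and 256 output channels.
std = standard_conv_params(3, 256, 256)
dws = depthwise_separable_params(3, 256, 256)
ratio = std / dws
```

For this layer the depthwise separable variant needs 67,840 weights against 589,824 for the standard convolution, roughly an 8.7-fold reduction.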

[0100] In one possible embodiment, the prior anchor boxes are generated using a genetic algorithm. As shown in Figure 4, the prior anchor boxes are generated through the following steps:

[0101] S11. Generate the initial anchor box sets;

[0102] Generate an initial population of anchor box sets $\mathcal{A}^{(0)} = \{A_1^{(0)}, A_2^{(0)}, \dots, A_M^{(0)}\}$, where the superscript 0 denotes generation 0 and M is the maximum number of anchor box sets. Each anchor box set $A_m^{(0)}$ contains a group of anchor boxes, each represented by its width w and height h:

[0103] $$A_m^{(0)} = \{(w_1, h_1), (w_2, h_2), \dots, (w_k, h_k)\}$$

[0104] where k is the number of anchor boxes contained in each anchor box set.
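A minimal sketch of step S11 follows. The text does not say how the initial widths and heights are chosen, so uniform sampling within the range of the annotated box sizes is assumed; `init_population` and its parameters are hypothetical names.

```python
import random

def init_population(boxes_wh, M, k, rng):
    """Create M anchor box sets, each holding k (width, height) pairs.

    Assumption: sizes are sampled uniformly between the smallest and
    largest annotated width and height.
    """
    w_lo, w_hi = min(w for w, _ in boxes_wh), max(w for w, _ in boxes_wh)
    h_lo, h_hi = min(h for _, h in boxes_wh), max(h for _, h in boxes_wh)
    return [
        [(rng.uniform(w_lo, w_hi), rng.uniform(h_lo, h_hi)) for _ in range(k)]
        for _ in range(M)
    ]

# Hypothetical annotation sizes in pixels.
annotations = [(10, 20), (40, 30), (25, 25)]
population = init_population(annotations, M=5, k=3, rng=random.Random(0))
```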

[0105] S12. Use the average maximum intersection-over-union (IoU) between the anchor boxes and the defect annotation boxes as the fitness function.

[0106] The fitness function evaluates the quality of an anchor box set and is expressed as follows:

[0107] $$F\left(A_m^{(t)}\right) = \frac{1}{N} \sum_{i=1}^{N} \max_{1 \le j \le k} \mathrm{IoU}\left(b_i, a_{m,j}^{(t)}\right)$$

[0108] where t is the generation number; $A_m^{(t)}$ is the m-th anchor box set in generation t; $F(A_m^{(t)})$ is the fitness of anchor box set $A_m^{(t)}$ with respect to the defect annotation boxes; N is the total number of defect annotation boxes; IoU is the intersection-over-union function; $b_i$ is the i-th defect annotation box; $a_{m,j}^{(t)}$ is the j-th anchor box in anchor box set $A_m^{(t)}$; and k is the number of anchor boxes contained in each anchor box set.
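The fitness described above can be sketched in a few lines. Because anchor boxes carry only a width and a height, the IoU here is computed as if the anchor and the annotation box shared the same top-left corner; that convention is an assumption, as are the names `wh_iou` and `fitness`.

```python
def wh_iou(box_wh, anchor_wh):
    """IoU of two boxes given only (width, height), assuming aligned corners."""
    bw, bh = box_wh
    aw, ah = anchor_wh
    inter = min(bw, aw) * min(bh, ah)
    union = bw * bh + aw * ah - inter
    return inter / union

def fitness(anchor_set, boxes_wh):
    """Average, over all annotation boxes, of the best IoU with any anchor in the set."""
    return sum(max(wh_iou(b, a) for a in anchor_set) for b in boxes_wh) / len(boxes_wh)

# Hypothetical data: two anchors evaluated against two annotation boxes.
score = fitness([(10, 10), (20, 10)], [(10, 10), (10, 20)])
```

In this toy case the first annotation box is matched exactly (IoU 1.0) and the second reaches at most 0.5, so the set scores 0.75.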

[0109] S13. Iteratively optimize the initial anchor box set based on the fitness function until the preset termination condition is reached, and take the anchor box set with the largest fitness as the prior anchor box.

[0110] Specifically, S13 includes the following sub-steps:

[0111] S131. In each iteration, calculate the selection probability of each anchor box set according to the fitness function;

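[0112] $$P\left(A_m^{(t)}\right) = \frac{F\left(A_m^{(t)}\right)}{\sum_{l=1}^{M} F\left(A_l^{(t)}\right)}$$

[0113] where $P(A_m^{(t)})$ is the probability that anchor box set $A_m^{(t)}$ is selected, $F(\cdot)$ is the fitness function, and M is the maximum number of anchor box sets.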

[0114] S132. Select multiple parent anchor box sets from the current anchor box set based on selection probability. The higher the fitness of the anchor box set, the greater the probability of it being selected.
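Steps S131 and S132 can be sketched as fitness-proportional (roulette-wheel) selection. This selection scheme is an assumption consistent with the statement that higher fitness yields a higher selection probability; the function names are hypothetical.

```python
import random

def selection_probabilities(fitnesses):
    """Normalize fitness values so that they sum to 1."""
    total = sum(fitnesses)
    return [f / total for f in fitnesses]

def select_parents(population, fitnesses, n, rng):
    """Draw n parents with replacement, weighted by fitness."""
    return rng.choices(population, weights=fitnesses, k=n)

# Hypothetical population of three anchor box sets and their fitness values.
population = [[(10, 10)], [(20, 15)], [(40, 30)]]
fitnesses = [0.2, 0.3, 0.5]
probs = selection_probabilities(fitnesses)
parents = select_parents(population, fitnesses, n=4, rng=random.Random(0))
```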

[0115] S133. Perform a crossover operation on the corresponding anchor boxes in any two parent anchor box sets to generate child anchor boxes.

[0116] The crossover operation is implemented using the following formula:

[0117] $$a_{\text{child}} = \alpha\, a_{p_1} + (1 - \alpha)\, a_{p_2}$$

[0118] where $a_{p_1}$ and $a_{p_2}$ are the corresponding anchor boxes in the two parent anchor box sets; $a_{\text{child}}$ is the child anchor box; and $\alpha$ is a random number uniformly distributed in the interval [0,1].
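A sketch of the crossover step follows, assuming an arithmetic (blend) crossover in which each child anchor box is a random convex combination of the two corresponding parent anchor boxes. Drawing a fresh random number for every anchor pair is an assumption, and `crossover` is a hypothetical name.

```python
import random

def crossover(parent_a, parent_b, rng):
    """Blend corresponding (width, height) anchors of two parent sets."""
    child = []
    for (w1, h1), (w2, h2) in zip(parent_a, parent_b):
        alpha = rng.random()  # uniform random weight in [0, 1)
        child.append((alpha * w1 + (1 - alpha) * w2,
                      alpha * h1 + (1 - alpha) * h2))
    return child

# Hypothetical parent anchor box sets with k = 2 anchors each.
parent_a = [(10.0, 12.0), (40.0, 20.0)]
parent_b = [(14.0, 18.0), (30.0, 28.0)]
child = crossover(parent_a, parent_b, random.Random(0))
```

Each child anchor lies between its two parents in both width and height, so crossover explores the interior of the region spanned by the current population.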

[0119] S134. Perform a mutation operation on the child anchor boxes.

[0120] The mutation operation is performed using the following formula:

[0121] $$a' = a + \delta, \quad \delta \sim \mathcal{N}(0, \sigma^2)$$

[0122] where $a$ is the anchor box before mutation; $a'$ is the anchor box after mutation; and δ is a random number drawn from a normal distribution with a mean of 0 and a standard deviation of σ.
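The mutation step can be sketched as adding Gaussian noise to each anchor dimension. Additive noise, independent draws for width and height, and the lower bound `min_size` that keeps sizes positive are all assumptions made for this illustration; `mutate` is a hypothetical name.

```python
import random

def mutate(anchor_set, sigma, rng, min_size=1.0):
    """Perturb each anchor's width and height with zero-mean Gaussian noise."""
    return [
        (max(min_size, w + rng.gauss(0.0, sigma)),
         max(min_size, h + rng.gauss(0.0, sigma)))
        for w, h in anchor_set
    ]

# Hypothetical child anchor box set before mutation.
child = [(12.0, 15.0), (35.0, 24.0)]
mutated = mutate(child, sigma=2.0, rng=random.Random(0))
```

A small σ keeps the search local around good anchor sets, while a larger σ helps escape poor local optima at the cost of slower convergence.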

[0123] S135. Repeat the selection, crossover, and mutation operations until the preset termination condition is met, and use the set of anchor boxes with the highest fitness as the prior anchor boxes.

[0124] Repeat steps S131-S134 until a preset termination condition is met. The termination condition includes any of the following: reaching the preset maximum number of iterations, fitness convergence (the rate of change of the fitness function value across multiple generations is less than a preset threshold), or meeting the recall threshold (the average maximum intersection-union ratio of the anchor box set and the defect annotation boxes reaches a preset recall threshold). After the iteration ends, the anchor box set with the highest fitness is used as the final generated prior anchor boxes.
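The three termination conditions can be sketched as a generic loop that is independent of the specific operators. Here `fitness_fn` and `next_generation` are hypothetical callables standing in for the fitness function and for one round of selection, crossover, and mutation. Measuring convergence as `patience` consecutive generations without an improvement larger than `tol` is an assumed reading of "rate of change less than a preset threshold".

```python
def run_until_terminated(population, fitness_fn, next_generation,
                         max_gen=200, tol=1e-4, patience=10, target=None):
    """Iterate generations and report the best individual and the stop reason."""
    best, best_fit, stall = None, float("-inf"), 0
    for _ in range(max_gen):
        fits = [fitness_fn(ind) for ind in population]
        i = max(range(len(population)), key=fits.__getitem__)
        gain = fits[i] - best_fit
        if gain > 0:
            best, best_fit = population[i], fits[i]
        # Count generations whose improvement does not exceed the tolerance.
        stall = 0 if gain > tol else stall + 1
        if target is not None and best_fit >= target:
            return best, best_fit, "target"
        if stall >= patience:
            return best, best_fit, "converged"
        population = next_generation(population, fits)
    return best, best_fit, "max_generations"

# Toy usage: individuals are numbers, fitness is the number itself,
# and each generation moves every individual up by 0.25 (capped at 1.0).
step = lambda pop, fits: [min(p + 0.25, 1.0) for p in pop]
result = run_until_terminated([0.0], lambda x: x, step, target=0.9)
```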

[0125] In this embodiment, the genetic algorithm uses the average maximum intersection-over-union (IoU) between the anchor boxes and the defect annotation boxes as the fitness function. Through iterative optimization via selection, crossover, and mutation operations, the generated prior anchor boxes accurately reflect the true size distribution of insulator defects. Compared with traditional anchor box generation methods based on empirical settings or simple clustering, the anchor boxes generated by the genetic algorithm are more representative and adaptable, covering the diverse scales of various defects (such as self-explosion, breakage, and contamination). The genetic algorithm runs only once, offline, before model training, and the generated prior anchor boxes can be embedded in the semi-decoupled detection head without increasing the computational burden during the model inference stage.

[0126] To verify the effectiveness of the present invention, a comparative experiment was designed based on a dataset of transmission line images in a real power grid environment to evaluate the comprehensive performance of the present invention in insulator location, identification and defect detection from multiple dimensions.

[0127] Step 1: Model training.

[0128] All 4717 real images were divided into training, validation, and test sets in a 7:2:1 ratio. The batch size was set to 32, the training epochs were 200, the learning rate was set to 2e-3, the AdamW optimizer was used, and a cosine annealing scheduler was used to dynamically adjust the learning rate to improve the training effect.
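The training setup above can be made concrete with a small sketch. The exact per-split image counts are not given in the text, so the rounding used below is an assumption. The cosine annealing formula is the standard one, with the minimum learning rate and the per-epoch update granularity also assumed.

```python
import math

def split_counts(n, ratios=(7, 2, 1)):
    """Split n samples by integer ratios; the remainder goes to the last split."""
    total = sum(ratios)
    train = n * ratios[0] // total
    val = n * ratios[1] // total
    return train, val, n - train - val

def cosine_lr(epoch, total_epochs=200, base_lr=2e-3, min_lr=0.0):
    """Cosine annealing from base_lr down to min_lr over total_epochs."""
    cos_term = 1 + math.cos(math.pi * epoch / total_epochs)
    return min_lr + 0.5 * (base_lr - min_lr) * cos_term

train_n, val_n, test_n = split_counts(4717)
schedule = [cosine_lr(e) for e in (0, 100, 200)]
```

Under these assumptions the split is 3301 / 943 / 473 images, and the learning rate falls from 2e-3 at the start to half that value at epoch 100 and to zero at epoch 200.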

[0129] Step 2: Model evaluation.

[0130] The model's overall performance was verified using a pre-defined test set. The definitions of the different partitioning methods and inference strategies involved in the experiment are shown in Table 1.

[0131] Table 1

[0132]

[0133] To verify the effectiveness of the block partitioning strategy, four mainstream models—YOLOv8, YOLOv9, YOLOv10, and YOLO11—were selected, and comparative experiments were conducted using Method 1 (no optimization, direct compressed inference) and Method 2 (ordinary sequential block partitioning), respectively. The experimental results are shown in Table 2.

[0134] Table 2

[0135]

[0136] Experimental results show that the detection accuracy of the model using Method 2 is significantly better than that of Method 1, with an average improvement of 5.6 percentage points. This indicates that in high-resolution image scenarios, block processing can effectively avoid the loss of small target information caused by image compression, thereby improving defect detection accuracy.

[0137] Comparative experiments were conducted using several mainstream lightweight models, including YOLO11, GC-YOLO, BC-YOLO, FINet, and AE-YOLO. Each model was trained and inferred using Method 2 (ordinary sequential block partitioning) and Method 3 (the method of this invention), respectively. The experimental results are shown in Table 3.

[0138] Table 3

[0139]

[0140] Experimental results show that for all comparative models, Method 3 achieves higher detection accuracy than Method 2, verifying the effectiveness of the block partitioning and inference strategy proposed in this invention. The defect detection model proposed in this invention achieves a mAP of 90.1% under Method 2, significantly outperforming the other comparative models; under Method 3, its mAP further improves to 92.9%, demonstrating the superiority of the defect detection model proposed in this invention.

[0141] To comprehensively evaluate the overall performance of the model proposed in this invention, several mainstream detection models, including the NanoDet series and the RT-DETR series, were selected and were trained and run for inference using Method 3 proposed in this invention. The detection accuracy, number of parameters, and computational cost were compared. The experimental results are shown in Table 4.

[0142] Table 4

[0143]

[0144] Experimental results show that the model of this invention outperforms all comparative models, with a mean average precision (mAP) of 92.9%. The model of this invention has only 2.17M parameters and a computational cost of only 4.7 GFLOPs, both lower than those of all comparative models. This demonstrates that the model of this invention achieves the highest detection accuracy while maintaining very low parameter and computational costs.

[0145] In summary, this application provides a method for detecting insulator defects in high-resolution images, which has the following technical features and beneficial effects:

[0146] First, the guided cropping strategy during the training phase improves sample quality and training efficiency. During model training, by optimizing the cropping position and range, the cropped sub-images can cover more effective target regions, thus generating training samples with higher information density. This strategy effectively reduces computational redundancy and feature interference caused by irrelevant background regions, improving both training sample quality and computational efficiency during the training phase.
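The guided cropping idea can be sketched as choosing, among candidate windows, the one that covers the most defect annotation boxes. In this sketch a window is a square given by its center and side length, and "covered" means fully contained; both conventions, as well as the function names, are assumptions.

```python
def is_covered(box, window):
    """True if box (x1, y1, x2, y2) lies entirely inside the square window."""
    x1, y1, x2, y2 = box
    cx, cy, size = window
    half = size / 2
    return cx - half <= x1 and cy - half <= y1 and x2 <= cx + half and y2 <= cy + half

def count_covered(boxes, window):
    """Number of annotation boxes fully inside the window."""
    return sum(is_covered(b, window) for b in boxes)

def best_window(boxes, candidates):
    """Candidate window (cx, cy, size) covering the most annotation boxes."""
    return max(candidates, key=lambda win: count_covered(boxes, win))

# Hypothetical annotations: two nearby defects and one isolated defect.
boxes = [(10, 10, 30, 30), (40, 40, 60, 60), (500, 500, 520, 520)]
candidates = [(35, 35, 64), (510, 510, 64)]
chosen = best_window(boxes, candidates)
```

Here the window centered between the two nearby defects wins, yielding a sub-image with higher defect density than a window around the isolated defect.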

[0147] Second, adaptive region of interest (ROI) detection during the inference phase avoids full-image traversal. During inference, ROIs that may contain defective targets are quickly identified at a coarse-grained level, and then fine-grained detection is performed only on these ROIs, thus avoiding full-image traversal inference across the entire high-resolution image. This strategy significantly reduces the computational overhead of the inference phase while preserving the original image details, meeting the computational constraints of edge devices.
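After detection on a region of interest, the local detection coordinates have to be mapped back to the original image. The sketch below follows the quantities defined in claim 8: the ROI's top-left corner, the ROI's width and height in the original image, and the width and height of the ROI sub-image fed to the model. It assumes the ROI is resized to the model input size without padding; `map_to_global` is a hypothetical name.

```python
def map_to_global(local_xy, roi_xy, roi_wh, input_wh):
    """Map a point from ROI-input pixel coordinates to original-image coordinates."""
    scale_x = roi_wh[0] / input_wh[0]  # horizontal resize factor
    scale_y = roi_wh[1] / input_wh[1]  # vertical resize factor
    return (roi_xy[0] + local_xy[0] * scale_x,
            roi_xy[1] + local_xy[1] * scale_y)

# Hypothetical case: a 1280 x 1280 ROI at (1000, 2000) resized to 640 x 640.
corner = map_to_global((320, 160), (1000, 2000), (1280, 1280), (640, 640))
```

In this example each input pixel corresponds to two original pixels, so the local corner (320, 160) maps to (1640, 2320) in the original image.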

[0148] Third, the consistency of data distribution between training and inference enhances detection stability. During the training phase, task-guided cropping amplifies the relative scale of defective targets, enabling the model to learn clear target features. During the inference phase, adaptive region-of-interest detection reproduces similar target scale characteristics in the original high-resolution image. This design ensures that the data distribution encountered by the model during training remains consistent with the actual data distribution detected during inference, effectively enhancing the stability and generalization ability of the detection results.

[0149] Fourth, prior constraints and a semi-decoupled structure improve the detection accuracy of small targets. Considering the relatively stable characteristics of insulator defects in terms of geometry and size distribution, a prior geometric constraint based on a genetic algorithm is introduced into the detection model. A relatively decoupled prediction structure is also adopted, modeling category prediction and location prediction independently. This design significantly reduces the learning difficulty for the model to accurately locate small-scale, weak-feature defect targets, thus improving the recall and localization accuracy of defect detection.

[0150] Based on the same inventive concept, this application also provides a computer device, which includes a processor, a memory, and a computer program stored in the memory. The computer program is executed by the processor to implement the aforementioned insulator defect detection method for high-resolution images.

[0151] Based on the same inventive concept, this application also provides a computer storage medium storing a computer program, which is executed by a processor to implement the aforementioned insulator defect detection method for high-resolution images.

[0152] In some embodiments, the computer-readable storage medium may be a memory such as FRAM, ROM, PROM, EPROM, EEPROM, flash memory, magnetic surface memory, optical disk, or CD-ROM; or it may be a device including one or any combination of the above-mentioned memories. The computer may be a variety of computing devices, including smart terminals and servers.

[0153] In some embodiments, executable instructions may take the form of a program, software, software module, script, or code, written in any form of programming language (including compiled or interpreted languages, or declarative or procedural languages), and may be deployed in any form, including as a standalone program or as a module, component, subroutine, or other unit suitable for use in a computing environment.

[0154] As an example, executable instructions may, but do not necessarily, correspond to files in the file system. They may be stored as part of a file that holds other programs or data, for example, in one or more scripts in a Hyper Text Markup Language (HTML) document, in a single file dedicated to the program in question, or in multiple collaborative files (e.g., a file that stores one or more modules, subroutines, or code sections).

[0155] As an example, executable instructions can be deployed to execute on a single computing device, or on multiple computing devices located in one location, or on multiple computing devices distributed across multiple locations and interconnected via a communication network.

[0156] It should be noted that, in this document, the terms "comprising," "including," or any other variations thereof are intended to cover non-exclusive inclusion, such that a process, method, article, or system that comprises a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or system. Unless otherwise specified, an element defined by the phrase "comprising one..." does not exclude the presence of other identical elements in the process, method, article, or system that includes that element.

[0157] The sequence numbers of the embodiments in this application are for descriptive purposes only and do not represent the superiority or inferiority of the embodiments.

[0158] The above specific embodiments further illustrate the purpose, technical solution, and beneficial effects of the present invention. It should be understood that the above are merely specific embodiments of the present invention and are not intended to limit the scope of protection of the present invention. Any modifications, equivalent substitutions, improvements, etc., made within the spirit and principles of the present invention should be included within the scope of protection of the present invention.

Claims

1. A method for detecting insulator defects in high-resolution images, characterized in that the method includes: acquiring multiple labeled inspection images, wherein each inspection image is a high-resolution image containing insulators and carrying label information, and the label information includes defect annotation boxes on the insulators and the corresponding defect types; determining multiple candidate cropping windows based on the defect annotation boxes in each inspection image, wherein each candidate cropping window includes a cropping center point and a cropping size, and the cropping size does not exceed the width and height of the inspection image; calculating the number of defect annotation boxes covered by each candidate cropping window, and cropping each inspection image according to the candidate cropping window that covers the most defect annotation boxes to obtain a sub-image; training an object detection model on the multiple sub-images cropped from the multiple inspection images to obtain a trained defect detection model; generating at least one region of interest based on an unlabeled inspection image, wherein the region of interest is an image region that may contain insulators; and inputting each region of interest into the defect detection model to obtain the defect detection boxes and their defect types within each region of interest, and mapping the coordinates of the defect detection boxes back to the coordinate system of the unlabeled inspection image to obtain the insulator defect detection result of the unlabeled inspection image.

2. The insulator defect detection method for high-resolution images according to claim 1, characterized in that the cropping size is determined by the following constraints: ; where D is the size base, which takes values that are positive integer multiples of 32; n is the size scaling factor; w is the width of the inspection image; h is the height of the inspection image; and min() is the minimum value function.

3. The insulator defect detection method for high-resolution images according to claim 1, characterized in that the defect detection model includes: a feature extraction layer, used to scale the input image to be detected according to a preset ratio, extract features from it, and output multiple feature maps at different scales, wherein the image to be detected is the sub-image or the region of interest; a multi-scale feature fusion layer, used to perform cross-scale fusion of the feature maps at different scales to generate an enhanced fused feature map; and a detection output layer, which employs a semi-decoupled detection head based on a prior-driven paradigm and includes: a feature decoupling module, used to decouple the fused feature map into mutually independent category feature maps and location feature maps to form a category prediction branch and a location prediction branch, wherein the location prediction branch uses the prior anchor boxes generated based on the defect annotation boxes as the regression benchmark; a branch fusion module, used to concatenate the category feature map and the location feature map; and a unified prediction module, used to perform 1×1 convolution, feature map reshaping, and feature map dimension permutation on the concatenated feature map and to output the defect detection boxes and defect types of the image to be detected.

4. The insulator defect detection method for high-resolution images according to claim 3, characterized in that the prior anchor boxes are generated through the following steps: generating initial anchor box sets, wherein each anchor box set contains a group of anchor boxes and each anchor box is represented by its width and height; using the average maximum intersection-over-union (IoU) between the anchor boxes and the defect annotation boxes as the fitness function; and iteratively optimizing the initial anchor box sets based on the fitness function until a preset termination condition is reached, and taking the anchor box set with the highest fitness as the prior anchor boxes.

5. The insulator defect detection method for high-resolution images according to claim 4, characterized in that iteratively optimizing the initial anchor box sets based on the fitness function until a preset termination condition is reached, and taking the anchor box set with the highest fitness as the prior anchor boxes, includes: in each iteration, calculating the selection probability of each anchor box set according to the fitness function; selecting multiple parent anchor box sets from the current anchor box sets based on the selection probability; performing a crossover operation on the corresponding anchor boxes in any two parent anchor box sets to generate child anchor boxes; performing a mutation operation on the child anchor boxes; and repeating the selection, crossover, and mutation operations until the preset termination condition is met, and using the anchor box set with the highest fitness as the prior anchor boxes.

6. The insulator defect detection method for high-resolution images according to claim 4, characterized in that the fitness function is expressed as follows: $F\left(A_m^{(t)}\right) = \frac{1}{N} \sum_{i=1}^{N} \max_{1 \le j \le k} \mathrm{IoU}\left(b_i, a_{m,j}^{(t)}\right)$; where t is the generation number; $A_m^{(t)}$ is the m-th anchor box set in generation t; $F(A_m^{(t)})$ is the fitness of anchor box set $A_m^{(t)}$ with respect to the defect annotation boxes; N is the total number of defect annotation boxes; IoU is the intersection-over-union function; $b_i$ is the i-th defect annotation box; $a_{m,j}^{(t)}$ is the j-th anchor box in anchor box set $A_m^{(t)}$; and k is the number of anchor boxes contained in each anchor box set.

7. The insulator defect detection method for high-resolution images according to claim 1, characterized in that the step of determining multiple candidate cropping windows based on the defect annotation boxes in each inspection image includes: obtaining the center point coordinates of each defect annotation box in each inspection image, performing cluster analysis on the center point coordinates, and determining the neighborhood in which each cluster center is located as a defect-dense area; and generating multiple candidate cropping windows centered on the defect-dense area.

8. The insulator defect detection method for high-resolution images according to claim 1, characterized in that the coordinates of the defect detection box are mapped back to the coordinate system of the unlabeled inspection image using the following formula: $\begin{pmatrix} x_g \\ y_g \end{pmatrix} = \begin{pmatrix} x_r \\ y_r \end{pmatrix} + \begin{pmatrix} x_l \\ y_l \end{pmatrix} \odot \begin{pmatrix} W_r / W_{in} \\ H_r / H_{in} \end{pmatrix}$; where $(x_g, y_g)$ are the global coordinates of the top-left corner of the defect detection box in the unlabeled inspection image; $(x_l, y_l)$ are the local pixel coordinates of the top-left corner of the defect detection box within the corresponding region of interest sub-image; $(x_r, y_r)$ are the global coordinates of the top-left corner of the region of interest in the unlabeled inspection image; $W_r$ and $H_r$ are the width and height of the region of interest in the unlabeled inspection image, respectively; $W_{in}$ and $H_{in}$ are the width and height of the region of interest sub-image input into the object detection model during inference; and $\odot$ denotes element-wise multiplication.

9. A computer device, characterized in that the computer device includes a memory and a processor, wherein the memory stores a computer program and the processor executes the computer program to implement the insulator defect detection method for high-resolution images as described in any one of claims 1-8.

10. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program which, when executed by a processor, implements the insulator defect detection method for high-resolution images as described in any one of claims 1-8.