A power transmission line image segmentation method, device, electronic equipment, storage medium and program product

By using a residual block structure to perform feature fusion and deep semantic information processing on transmission line images, the method addresses the poor robustness and high computational overhead of existing methods in complex backgrounds and achieves high-precision transmission line segmentation.

CN122265652A · Pending · Publication Date: 2026-06-23 · STATE GRID JIANGSU ELECTRIC POWER CO XUZHOU POWER SUPPLY CO +1

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Applications(China)
Current Assignee / Owner: STATE GRID JIANGSU ELECTRIC POWER CO XUZHOU POWER SUPPLY CO
Filing Date: 2026-04-27
Publication Date: 2026-06-23

Application Information

IPC: G06V10/26; G06V10/80; G06V10/774; G06V10/764; G06V10/25; G06V10/82; G06V20/70; G06N3/0464; G06N20/00

AI Tagging

Application Domain

Character and pattern recognition; Biological models

Technology Topics

Pattern recognition; Image segmentation

AI Technical Summary

Technical Problem

Existing transmission line segmentation methods are not robust in complex backgrounds and are difficult to adapt to changes in lighting, image noise, and occlusion interference. Furthermore, deep learning models based on self-attention mechanisms have high computational costs, making them difficult to deploy and promote in practical power inspection scenarios.

Method used

A residual block structure extracts features from the transmission line image. The initial image features are fused with the residual block outputs, and the fused feature maps are combined with deep semantic information to perform semantic segmentation and improve segmentation accuracy.

Benefits of technology

It achieves high-precision segmentation of power transmission lines in complex backgrounds, improves detection accuracy and robustness, and is highly adaptable to power inspection scenarios.

✦ Generated by Eureka AI based on patent content.

Abstract

The application discloses a power transmission line image segmentation method and device, electronic equipment, storage medium and program product. Specifically, the power transmission line image is determined; the initial image features corresponding to the power transmission line image are input into a residual block structure to obtain residual block output; a feature fusion operation is performed on the initial image features and the residual block output to obtain at least one fusion feature map corresponding to the power transmission line image; and based on the initial image features, the deep semantic information corresponding to each fusion feature map and the residual block output, a semantic segmentation map corresponding to the power transmission line image is obtained. The application realizes the extraction and expression of the edge information of the power transmission line image through the residual block structure, realizes the fusion of multi-scale information by performing the feature fusion operation, improves the detection accuracy of the power transmission line, finally obtains the semantic segmentation map, realizes the more accurate extraction of the structure of the power transmission line, and improves the segmentation precision of the power transmission line image based on the deep semantic information.

Description

Technical Field

[0001] This invention relates to the field of communication technology, and in particular to a method, apparatus, electronic device, storage medium, and program product for segmenting images of power transmission lines.

Background Technology

[0002] Existing transmission line segmentation methods are based on traditional image processing methods, relying on shallow features such as gray-level distribution and edge morphology in the image, and combining filtering, geometric constraints or edge detection algorithms to extract the transmission lines.

[0003] However, existing methods exhibit poor robustness in complex environments, struggling to adapt to unstructured factors such as lighting variations, image noise, and occlusion interference. Furthermore, some studies have attempted to introduce deep learning model architectures based on self-attention mechanisms to enhance global dependency modeling capabilities, but these structures generally suffer from high computational overhead. Therefore, there is currently a lack of a transmission line segmentation technology that can adapt to complex environments and be deployed and widely applied in practical power line inspection scenarios.

Summary of the Invention

[0004] This invention provides a method, apparatus, electronic device, storage medium, and program product for segmenting power transmission line images, thereby achieving semantic segmentation of power transmission line images and improving the segmentation accuracy of power transmission line images.

[0005] According to one aspect of the present invention, a method for segmenting images of power transmission lines is provided, comprising:
determining a power transmission line image, wherein the power transmission line image includes the power transmission line and the scene in which the power transmission line is located;
inputting the initial image features corresponding to the power transmission line image into a residual block structure to obtain a residual block output, wherein the residual block structure consists of at least one residual block, the residual block output indicates information about the power transmission line image, and the initial image features include the edge features of the power transmission line in the power transmission line image;
performing a feature fusion operation on the initial image features and the residual block output to obtain at least one fused feature map corresponding to the power transmission line image; and
obtaining a semantic segmentation map corresponding to the power transmission line image based on the initial image features, each of the fused feature maps, and the deep semantic information corresponding to the residual block output, wherein the deep semantic information includes the context information output by the residual block, and the semantic segmentation map includes the image formed by the power transmission line.

[0006] According to another aspect of the present invention, a power transmission line image segmentation apparatus is provided, comprising:
a determination module, used to determine a power transmission line image, wherein the power transmission line image includes the power transmission line and the scene in which the power transmission line is located;
an input module, used to input the initial image features corresponding to the power transmission line image into a residual block structure to obtain a residual block output, wherein the residual block structure consists of at least one residual block, the residual block output indicates information about the power transmission line image, and the initial image features include the edge features of the power transmission line in the power transmission line image;
a fusion module, used to perform a feature fusion operation on the initial image features and the residual block output to obtain at least one fused feature map corresponding to the power transmission line image; and
a segmentation module, used to obtain a semantic segmentation map corresponding to the power transmission line image based on the initial image features, each of the fused feature maps, and the deep semantic information corresponding to the residual block output, wherein the deep semantic information includes the context information output by the residual block, and the semantic segmentation map includes the image formed by the power transmission line.

[0007] According to another aspect of the present invention, an electronic device is provided, the electronic device comprising: at least one processor; and a memory communicatively connected to the at least one processor; wherein the memory stores a computer program executable by the at least one processor, and the computer program is executed by the at least one processor to enable the at least one processor to perform the power transmission line image segmentation method according to any embodiment of the present invention.

[0008] According to another aspect of the present invention, a computer-readable storage medium is provided, the computer-readable storage medium storing computer instructions for causing a processor to execute and implement the power line image segmentation method according to any embodiment of the present invention.

[0009] According to another aspect of the present invention, a computer program product is provided, the computer program product comprising a computer program that, when executed by a processor, implements the power transmission line image segmentation method according to any embodiment of the present invention.

[0010] The technical solution of this invention involves determining a transmission line image; inputting the initial image features corresponding to the transmission line image into a residual block structure to obtain the residual block output; performing a feature fusion operation on the initial image features and the residual block output to obtain at least one fused feature map corresponding to the transmission line image; and obtaining a semantic segmentation map corresponding to the transmission line image based on the initial image features, each fused feature map, and the deep semantic information corresponding to the residual block output. The residual block structure enables the extraction and representation of edge information from the transmission line image, the feature fusion operation achieves the fusion of multi-scale information, improving the accuracy of transmission line detection, and finally, the semantic segmentation map is obtained. This achieves more accurate extraction of the transmission line structure and improves the segmentation accuracy of the transmission line image based on deep semantic information.

[0011] It should be understood that the description in this section is not intended to identify key or essential features of the embodiments of the present invention, nor is it intended to limit the scope of the invention. Other features of the invention will become readily apparent from the following description.

Attached Figure Description

[0012] To more clearly illustrate the technical solutions in the embodiments of the present invention, the accompanying drawings used in the description of the embodiments will be briefly introduced below. Obviously, the accompanying drawings described below are only some embodiments of the present invention. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.

[0013] Figure 1 is a flowchart of a power transmission line image segmentation method provided in Embodiment 1 of the present invention;
Figure 2 is a schematic diagram of a power transmission line image segmentation process according to Embodiment 1 of the present invention;
Figure 3 is a schematic diagram of a feature fusion module provided according to Embodiment 1 of the present invention;
Figure 4 is a flowchart of a semantic segmentation map determination method provided in Embodiment 2 of the present invention;
Figure 5 is a schematic diagram of a context extraction module provided according to Embodiment 2 of the present invention;
Figure 6 is a schematic diagram of the structure of a power transmission line image segmentation device according to Embodiment 3 of the present invention;
Figure 7 is a block diagram of an electronic device provided according to Embodiment 4 of the present invention.

Detailed Implementation

[0014] To enable those skilled in the art to better understand the present invention, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings of the embodiments of the present invention. Obviously, the described embodiments are only some embodiments of the present invention, and not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort should fall within the scope of protection of the present invention.

[0015] It should be noted that the terms "first," "second," etc., in the specification, claims, and accompanying drawings of this invention are used to distinguish similar objects and are not necessarily used to describe a specific order or sequence. It should be understood that such data can be interchanged where appropriate so that the embodiments of the invention described herein can be implemented in orders other than those illustrated or described herein. Furthermore, the terms "comprising" and "having," and any variations thereof, are intended to cover a non-exclusive inclusion; for example, a process, method, system, product, or apparatus that comprises a series of steps or units is not necessarily limited to those steps or units explicitly listed, but may include other steps or units not explicitly listed or inherent to such processes, methods, products, or apparatus.

[0016] Embodiment 1

Figure 1 is a flowchart of a transmission line image segmentation method according to Embodiment 1 of the present invention. This embodiment is applicable to the segmentation of transmission line images. The method can be executed by a transmission line image segmentation device, which can be implemented in hardware and/or software. The transmission line image segmentation device can be configured in an electronic device, such as a PC or a server. As shown in Figure 1, the method includes:

S110. Determine the power transmission line image.

[0017] The image of the power transmission line includes the power transmission line and the scene in which the power transmission line is located.

[0018] In this embodiment, the power transmission line image can be understood as an image containing the scene where the power transmission line is located. The power transmission line image can be an image obtained by preprocessing images taken in various scenes.

[0019] Specifically, images containing power transmission lines are acquired under various complex scenarios, including different shooting angles, lighting conditions, weather conditions, monitoring perspectives, and power transmission line distribution densities. These images are then incorporated into a dataset. Images can be acquired using equipment deployed on power towers, primarily from an upward shooting angle. The acquired images undergo preprocessing, including downsampling, cropping, and selection, to obtain the power transmission line images.

[0020] S120. Input the initial image features corresponding to the transmission line image into the residual block structure to obtain the residual block output.

[0021] The residual block structure consists of at least one residual block, the residual block output indicates information about the transmission line image, and the initial image features include the edge features of the transmission line in the transmission line image.

[0022] In this embodiment, the initial image features can be understood as the edge features of the initially extracted transmission line image. The residual block structure can be understood as a structure composed of different residual blocks, which can be used to capture information from the transmission line image. The residual block output can be understood as the result obtained after processing the initial image features through the residual block structure.

[0023] Specifically, initial image features of the transmission line image can be extracted through an input module, which can be a module composed of convolutional layers and activation functions. The initial image features are then input into a residual block structure to obtain the residual block output. This residual block structure includes at least one residual block. The initial image features are input into the first residual block to obtain its output. The output of the first residual block is then used as the input to the second residual block to obtain its output, and so on. Finally, the outputs of all the residual blocks together are used as the residual block output.

[0024] For example, Figure 2 is a schematic diagram of a power transmission line image segmentation process according to Embodiment 1 of the present invention. As shown in Figure 2, the transmission line image is input into an Input Projection module to perform preliminary feature extraction, obtaining initial image features indicating the edge features of the transmission line image. This input module includes a 3×3 convolutional layer and a LeakyReLU activation function. The initial image features are then input into a residual block structure composed of ResBlocks to obtain the residual block output.
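The input projection described above can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the implementation of the invention: the channel counts, the zero ("same") padding, the negative slope of 0.01, and the function names conv3x3, leaky_relu and input_projection are all assumptions that the text does not specify.

```python
import numpy as np

def conv3x3(image, kernels):
    """Zero-padded 3x3 convolution (cross-correlation, as in most DL frameworks).

    image:   (C_in, H, W) array
    kernels: (C_out, C_in, 3, 3) array
    returns: (C_out, H, W) array
    """
    _, h, w = image.shape
    padded = np.pad(image, ((0, 0), (1, 1), (1, 1)))
    out = np.zeros((kernels.shape[0], h, w), dtype=np.float64)
    for dy in range(3):
        for dx in range(3):
            patch = padded[:, dy:dy + h, dx:dx + w]
            # Contract over input channels for this kernel tap.
            out += np.einsum("oc,chw->ohw", kernels[:, :, dy, dx], patch)
    return out

def leaky_relu(x, negative_slope=0.01):
    """LeakyReLU; the slope value is an assumed default."""
    return np.where(x >= 0, x, negative_slope * x)

def input_projection(image, kernels, negative_slope=0.01):
    """3x3 convolution followed by LeakyReLU, yielding initial image features."""
    return leaky_relu(conv3x3(image, kernels), negative_slope)
```

In a trained network the kernels would be learned; here they are simply passed in so that the data flow is visible.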

[0025] S130. Perform a feature fusion operation on the initial image features and the residual block output to obtain at least one fused feature map corresponding to the transmission line image.

[0026] In this embodiment, the fused feature map can be understood as the feature map obtained by fusing the residual block output with the initial image features. The fused feature map retains the complete structural information of the transmission line.

[0027] For example, as shown in Figure 2, the feature fusion operation can be implemented through the Feature Fusion Module (FSFF), of which there can be three instances with different parameters. For each feature fusion module, the initial image features and the residual block output can be input into the feature fusion module to obtain the fused feature map corresponding to that feature fusion module.

[0028] S140. Based on the initial image features, each of the fused feature maps, and the deep semantic information corresponding to the residual block output, a semantic segmentation map corresponding to the power transmission line image is obtained.

[0029] The deep semantic information includes the contextual information output by the residual block, and the semantic segmentation map includes an image formed by the transmission line.

[0030] In this embodiment, deep semantic information can be understood as information obtained from the residual block output through context extraction. Deep semantic information can be the context information output by the residual block, and it can be used to enrich the semantic information contained in the initial image features. A semantic segmentation map can be understood as the image obtained after semantic segmentation of the transmission line image. The semantic segmentation map contains only the transmission line, and its image quality is the same as that of the transmission line image.

[0031] Specifically, context extraction is performed on the residual block output to obtain deep semantic information. This deep semantic information and the residual block output can be input together into the decoding module for decoding. The decoded result is then input into the output module, which corresponds to the input module. Next, the output of the output module, the decoded result, and the deep semantic information are upsampled and convolved to unify their number of channels. Finally, the unified results are fused channel by channel to obtain the semantic segmentation map.
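The last step of this paragraph (upsample, convolve to a common channel count, fuse) might look roughly like the NumPy sketch below. Several points are assumptions rather than statements of the text: nearest-neighbour upsampling, a 1×1 convolution for channel unification, and reading "fused channel by channel" as an element-wise sum. The names upsample_to and unify_and_fuse are hypothetical.

```python
import numpy as np

def upsample_to(x, height, width):
    """Nearest-neighbour upsampling of a (C, h, w) map to (C, height, width)."""
    _, h, w = x.shape
    if height % h or width % w:
        raise ValueError("target size must be an integer multiple of the input size")
    return x.repeat(height // h, axis=1).repeat(width // w, axis=2)

def unify_and_fuse(feature_maps, projections, height, width):
    """Bring every map to the same size and channel count, then sum them.

    feature_maps: list of (C_i, h_i, w_i) arrays
    projections:  list of (C_out, C_i) 1x1-convolution weights (assumed kernel size)
    """
    fused = None
    for fmap, weight in zip(feature_maps, projections):
        resized = upsample_to(fmap, height, width)
        unified = np.einsum("oc,chw->ohw", weight, resized)
        # Element-wise sum is one possible reading of "channel by channel" fusion.
        fused = unified if fused is None else fused + unified
    return fused
```

With a single output channel, the fused result has the same height and width as the input image, which is consistent with the statement that the segmentation map matches the size of the power transmission line image.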

[0032] For example, as shown in Figure 2, Dout is the final semantic segmentation map, and the size of the semantic segmentation map is the same as the size of the power transmission line image.

[0033] The technical solution of this invention involves determining a transmission line image; inputting the initial image features corresponding to the transmission line image into a residual block structure to obtain the residual block output; performing a feature fusion operation on the initial image features and the residual block output to obtain at least one fused feature map corresponding to the transmission line image; and obtaining a semantic segmentation map corresponding to the transmission line image based on the initial image features, each fused feature map, and the deep semantic information corresponding to the residual block output. The residual block structure enables the extraction and representation of edge information from the transmission line image, the feature fusion operation achieves the fusion of multi-scale information, improving the accuracy of transmission line detection, and finally, the semantic segmentation map is obtained. This achieves more accurate extraction of the transmission line structure and improves the segmentation accuracy of the transmission line image based on deep semantic information.

[0034] Based on the above embodiments, modified embodiments of the above embodiments are proposed. It should be noted that, in order to keep the description brief, only the differences from the above embodiments are described in the modified embodiments.

[0035] In one embodiment, performing a feature fusion operation on the initial image features and the residual block output to obtain at least one fused feature map corresponding to the transmission line image includes:
determining at least one feature fusion scale corresponding to the feature fusion operation; and
for each feature fusion scale, performing a feature fusion operation on the initial image features and the residual block output according to the feature fusion scale to obtain a fused feature map.

[0036] In this embodiment, the feature fusion scale can be understood as a parameter for implementing the feature fusion operation. The feature fusion scale is related to the detailed information contained in the fused feature map.

[0037] Specifically, the feature fusion modules that implement the feature fusion operation are determined, and a feature fusion scale is set for each feature fusion module. For each feature fusion module, the feature fusion operation is performed on the initial image features and residual block output according to the feature fusion scale corresponding to that feature fusion module, to obtain the fused feature map at that feature fusion scale.

[0038] Optionally, performing a feature fusion operation on the initial image features and the residual block output to obtain a fused feature map includes:
performing a first downsampling on the initial image features to obtain a first context feature;
determining the first residual block output, the second residual block output, the third residual block output, and the fourth residual block output included in the residual block output;
performing a second downsampling on the first residual block output to obtain first abstract semantic information, wherein the downsampling factor of the first downsampling is greater than that of the second downsampling;
performing convolution on the second residual block output to obtain an integrated feature;
performing a first upsampling on the third residual block output to obtain second abstract semantic information;
performing a second upsampling on the fourth residual block output to obtain a second context feature, wherein the upsampling factor of the second upsampling is greater than that of the first upsampling;
concatenating the first context feature, the first abstract semantic information, the integrated feature, the second abstract semantic information, and the second context feature to obtain concatenated information; and
performing efficient additive attention calculation on the concatenated information to obtain the fused feature map.

[0039] In this embodiment, downsampling can be understood as an operation used to reduce spatial resolution and suppress overfitting. Different downsampling factors can achieve different effects. The first downsampling and the second downsampling are downsampling with different downsampling factors, where the downsampling factor of the first downsampling is greater than that of the second downsampling. Upsampling can be understood as an operation used to improve spatial resolution. Different upsampling factors can achieve different effects. The first upsampling and the second upsampling are upsampling with different upsampling factors, where the upsampling factor of the second upsampling is greater than that of the first upsampling.

[0040] The first, second, third, and fourth residual block outputs are the outputs of the respective residual blocks contained in the residual block structure. The first context feature can be understood as a coarse-grained context feature extracted from the initial image features. The first abstract semantic information can be understood as relatively abstract semantic information extracted from the first residual block output. The integrated feature can be understood as the feature extracted from the second residual block output, keeping the same spatial size as the second residual block output. The second abstract semantic information can be understood as more abstract semantic information extracted from the third residual block output. The second context feature can be understood as a fine-grained context feature extracted from the fourth residual block output. The concatenated information can be understood as the information obtained by concatenating, along the channel dimension, the first context feature, the first abstract semantic information, the integrated feature, the second abstract semantic information, and the second context feature.

[0041] For example, Figure 3 is a schematic diagram of a feature fusion module provided according to Embodiment 1 of the present invention. As shown in Figure 3, the feature fusion module that performs the feature fusion operation contains five parallel branches. IP represents the initial image features, which are shallower, high-resolution features containing more spatial detail. RB0, RB1, RB2, and RB3 are intermediate features derived from network layers of different depths, representing different scales and levels of semantic abstraction. The IP branch extracts the first context feature by performing the first downsampling on the initial image features (convolution conv and 4× downsampling, Maxpooling×4). The RB0 branch obtains the first abstract semantic information by performing the second downsampling on the first residual block output (convolution conv and 2× downsampling, Maxpooling×2). The RB1 branch directly performs a convolution conv on the second residual block output to achieve feature integration and obtain the integrated feature. The RB2 branch obtains the second abstract semantic information by performing the first upsampling on the third residual block output (convolution conv and 2× upsampling, Upsample×2). The RB3 branch obtains the second context feature by performing the second upsampling on the fourth residual block output (convolution conv and 4× upsampling, Upsample×4). The outputs of the five parallel branches are all adjusted to the same size as RB1, and the first context feature, the first abstract semantic information, the integrated feature, the second abstract semantic information, and the second context feature are concatenated to obtain the concatenated information. The concatenated information is then input into the EASA efficient additive attention module, which performs the efficient additive attention computation and outputs the fused feature map.
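The five-branch resizing and concatenation can be illustrated with the following NumPy sketch. It assumes that IP, RB0, RB1, RB2 and RB3 have spatial sizes of 4×, 2×, 1×, 1/2× and 1/4× that of RB1, which is what the stated pooling and upsampling factors imply but is not given explicitly. The 1×1 convolution, nearest-neighbour upsampling, and the names max_pool, upsample_nearest, conv1x1 and fuse_branches are likewise assumptions, and the attention step is left out.

```python
import numpy as np

def max_pool(x, factor):
    """Non-overlapping max pooling on (C, H, W); H and W must be divisible by factor."""
    c, h, w = x.shape
    return x.reshape(c, h // factor, factor, w // factor, factor).max(axis=(2, 4))

def upsample_nearest(x, factor):
    """Nearest-neighbour upsampling on (C, H, W)."""
    return x.repeat(factor, axis=1).repeat(factor, axis=2)

def conv1x1(x, weight):
    """Per-pixel linear map; weight has shape (C_out, C_in). Kernel size is assumed."""
    return np.einsum("oc,chw->ohw", weight, x)

def fuse_branches(ip, rb0, rb1, rb2, rb3, weights):
    """Resize all five inputs to the spatial size of rb1 and concatenate on channels."""
    branches = [
        max_pool(conv1x1(ip, weights[0]), 4),           # first context feature
        max_pool(conv1x1(rb0, weights[1]), 2),          # first abstract semantic information
        conv1x1(rb1, weights[2]),                       # integrated feature
        upsample_nearest(conv1x1(rb2, weights[3]), 2),  # second abstract semantic information
        upsample_nearest(conv1x1(rb3, weights[4]), 4),  # second context feature
    ]
    return np.concatenate(branches, axis=0)
```

The returned array plays the role of the concatenated information that is subsequently passed to the attention module.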

[0042] In the EASA efficient additive attention module, the input is the concatenated information $x$. It is converted by two linear transformation matrices $W_q$ and $W_k$ into a query $Q$ and a key $K$, where $Q, K \in \mathbb{R}^{n \times d}$, $n$ is the token length, and $d$ is the dimension of the embedding vector. The query $Q$ is multiplied by a learnable parameter vector $w_a \in \mathbb{R}^{d}$ to generate the global attention query vector $\alpha \in \mathbb{R}^{n}$. The query $Q$ is then pooled according to these learned attention weights to obtain a global query vector $q \in \mathbb{R}^{d}$. The interaction between the global query vector $q$ and the key $K$ is encoded through element-wise multiplication to form a global context matrix. This global context matrix captures information from each token and flexibly learns relevance in the input sequence. Furthermore, a linear transformation is introduced in the query-key interaction to learn the hidden representation of the tokens. Therefore, the fused feature map output by EASA can be represented as $\hat{x} = \hat{Q} + T(K * q)$, where $\hat{Q}$ represents the normalized query, $*$ denotes element-wise multiplication, and $T$ represents a linear transformation function.
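As a rough illustration of the computation described in this paragraph, the NumPy sketch below treats each spatial position of the concatenated information as one token. The variable names, the L2 normalization of the query, key and attention weights, and the scaling by the square root of the embedding dimension are assumptions borrowed from common implementations of efficient additive attention; the text itself only names a normalized query and a linear transformation.

```python
import numpy as np

def l2_normalize(x, axis, eps=1e-12):
    """Divide by the L2 norm along `axis` (assumed normalization scheme)."""
    return x / np.maximum(np.linalg.norm(x, axis=axis, keepdims=True), eps)

def efficient_additive_attention(x, w_q, w_k, w_a, w_t):
    """Additive attention with a single global query; cost is linear in the token count.

    x:   (n, d_in) tokens         w_q, w_k: (d_in, d) projection matrices
    w_a: (d,) learnable vector    w_t:      (d, d) linear transformation
    """
    d = w_q.shape[1]
    q = l2_normalize(x @ w_q, axis=1)             # normalized query, (n, d)
    k = l2_normalize(x @ w_k, axis=1)             # key, (n, d)
    alpha = (q @ w_a) / np.sqrt(d)                # one attention weight per token, (n,)
    alpha = l2_normalize(alpha, axis=0)
    global_q = (alpha[:, None] * q).sum(axis=0)   # pooled global query, (d,)
    context = k * global_q[None, :]               # element-wise interaction, (n, d)
    return q + context @ w_t                      # normalized query + transformed context
```

Because every token interacts only with the single pooled query vector, no n×n attention matrix is formed, which is the reason this style of attention is cheaper than standard self-attention.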

[0043] In one embodiment, the step of inputting the initial image features corresponding to the transmission line image into the residual block structure to obtain the residual block output includes:
extracting initial image features from the transmission line image;
determining, according to the order of the residual blocks in the residual block structure, the first residual block, the second residual block, the third residual block, and the fourth residual block that make up the residual block structure;
inputting the initial image features into the first residual block to obtain the first residual block output;
inputting the first residual block output into the second residual block to obtain the second residual block output;
inputting the second residual block output into the third residual block to obtain the third residual block output;
inputting the third residual block output into the fourth residual block to obtain the fourth residual block output; and
determining the first residual block output, the second residual block output, the third residual block output, and the fourth residual block output as the residual block output.

[0044] For example, the first residual block, the second residual block, the third residual block, and the fourth residual block are all residual blocks that make up the residual block structure. These residual blocks may contain the same or different numbers of layers. For example, the first residual block has 3 layers (ResBlocks×3), the second residual block has 4 layers (ResBlocks×4), the third residual block has 6 layers (ResBlocks×6), and the fourth residual block has 3 layers (ResBlocks×3). The initial image features are input into the first residual block (ResBlocks×3) to obtain the first residual block output; the first residual block output is input into the second residual block (ResBlocks×4) to obtain the second residual block output; the second residual block output is input into the third residual block (ResBlocks×6) to obtain the third residual block output; and the third residual block output is input into the fourth residual block (ResBlocks×3) to obtain the fourth residual block output. The outputs of the first residual block, the second residual block, the third residual block, and the fourth residual block are defined as residual block outputs.
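The chaining of the four residual blocks with depths 3, 4, 6 and 3 can be sketched as follows. The internal layout of a ResBlock is not given in the text, so the form x + ReLU(Wx) with a 1×1 convolution is purely an assumption, as is keeping the resolution and channel count constant across blocks for simplicity. The names residual_layer, run_stage, residual_block_structure and make_stage_weights are hypothetical.

```python
import numpy as np

STAGE_DEPTHS = (3, 4, 6, 3)  # layers per residual block, from the example in the text

def residual_layer(x, weight):
    """One assumed residual layer: identity shortcut plus ReLU of a 1x1 convolution."""
    return x + np.maximum(np.einsum("oc,chw->ohw", weight, x), 0.0)

def run_stage(x, weights):
    """Apply all layers of one residual block in sequence."""
    for w in weights:
        x = residual_layer(x, w)
    return x

def residual_block_structure(initial_features, stage_weights):
    """Feed each block's output into the next and keep every block's output."""
    outputs = []
    x = initial_features
    for weights in stage_weights:
        x = run_stage(x, weights)
        outputs.append(x)
    return outputs  # [first, second, third, fourth] residual block output

def make_stage_weights(channels, depths=STAGE_DEPTHS, seed=0):
    """Random stand-in weights; a real model would learn these."""
    rng = np.random.default_rng(seed)
    return [[0.1 * rng.standard_normal((channels, channels)) for _ in range(n)]
            for n in depths]
```

Returning all four intermediate outputs, rather than only the last one, is what allows the later fusion step to draw on features from several depths.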

[0045] In one embodiment, determining the transmission line image includes: Acquire initial images of the power transmission line, which include the acquired images of the power transmission line and the scene in which the power transmission line is located; The initial transmission line image is downsampled to obtain a downsampled image; The downsampled image is cropped at a set interval to obtain at least one cropped image. For each cropped image, if the cropped image contains the power transmission line, the cropped image is determined to be a power transmission line image.

[0046] In this embodiment, the initial transmission line image can be understood as an image that includes the transmission line and the scene in which it is located. The downsampled image can be understood as the image obtained after performing a downsampling operation on the initial transmission line image. The set interval can be a cropping interval set for the initial transmission line image. The cropped image can be understood as the image obtained after cropping the downsampled image; one downsampled image can be cropped into multiple cropped images.

[0047] For example, due to the inconsistent resolution of the initial power transmission line images, the initial power transmission line images are first downsampled to obtain downsampled images. Then, the downsampled images are cropped at intervals of 300 pixels to expand the sample size, and the scale of the cropped images is 1280*960. Finally, power transmission line images are selected; for each cropped image, if it contains a power transmission line, it is identified as a power transmission line image.
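The cropping step can be sketched as a sliding window. The sketch assumes that the 300-pixel interval is the step between crop origins and that 1280*960 means width by height; neither reading is confirmed by the text. The function names and the `contains_line` predicate are hypothetical.

```python
from typing import Callable, List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom)


def crop_origins(length: int, crop: int, stride: int = 300) -> List[int]:
    """Start offsets of windows of size `crop` along one axis."""
    if length < crop:
        return []  # the downsampled image is smaller than one crop
    return list(range(0, length - crop + 1, stride))


def crop_boxes(width: int, height: int, crop_w: int = 1280,
               crop_h: int = 960, stride: int = 300) -> List[Box]:
    """All crop boxes that fit entirely inside a width x height image."""
    return [(x, y, x + crop_w, y + crop_h)
            for y in crop_origins(height, crop_h, stride)
            for x in crop_origins(width, crop_w, stride)]


def select_line_crops(boxes: List[Box],
                      contains_line: Callable[[Box], bool]) -> List[Box]:
    """Keep only crops that contain a power transmission line.

    `contains_line` is a caller-supplied check, for example one based on
    the annotation mask; the text does not say how the check is done.
    """
    return [b for b in boxes if contains_line(b)]
```

Overlapping windows are what expand the sample size: a 1880 x 1260 downsampled image yields six 1280 x 960 crops under these assumptions.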

[0048] Embodiment 2 Figure 4 is a flowchart of a semantic segmentation map determination method according to Embodiment 2 of the present invention. This embodiment elaborates on the determination of the semantic segmentation map described in the above embodiment. As shown in Figure 4, the method includes: S210. Determine the power transmission line image.

[0049] S220. Input the initial image features corresponding to the transmission line image into the residual block structure to obtain the residual block output.

[0050] S230. Perform a feature fusion operation on the initial image features and the residual block output to obtain at least one fused feature map corresponding to the transmission line image.

[0051] S240. Determine the deep semantic information corresponding to the output of the residual block.

[0052] Specifically, context extraction can be performed on the deepest (bottom-level) residual block output to obtain deep semantic information. Context extraction further extracts and enhances the semantic information contained in the initial image features.

[0053] Optionally, the residual block output includes a first residual block output, a second residual block output, a third residual block output, and a fourth residual block output, and determining the deep semantic information corresponding to the residual block output includes: Determine the fourth residual block output contained in the residual block output; Context extraction is performed on the output of the fourth residual block to obtain parallel branch information, which includes horizontal global information, horizontal fine-grained information, spatial context information, vertical structure information, and vertical global information. The parallel branch information is added element-wise to the output of the fourth residual block to obtain deep semantic information.

[0054] In this embodiment, parallel branch information can be understood as information obtained after context extraction, and parallel branch information may include features of different scales and directions.

[0055] For example, Figure 5 is a schematic diagram of a context extraction module provided according to Embodiment 2 of the present invention. As shown in Figure 5, context extraction is performed on the output of the fourth residual block through five parallel branches. Branch B1 is the horizontal branch: it processes the output of the fourth residual block with vertical global average pooling of size Pool(H,1), a Conv1×1 convolution, BN, and a ReLU activation function to obtain horizontal global information. Branch B2 is a Conv3×1 convolution branch consisting of three consecutive 3×1 convolutional layers whose channel counts grow by ×1, ×2, and ×4 in sequence; processing the output of the fourth residual block through branch B2 yields horizontal fine-grained information. Branch B3 is a Conv3×3 convolution branch consisting of several consecutive 3×3 convolutional layers whose channel counts grow by ×1, ×2, and ×4 in sequence; it processes the output of the fourth residual block to obtain multi-scale spatial context information with strong local spatial features. Branch B4 is a Conv1×3 convolution branch composed of three stacked 1×3 convolutions, again with channel counts growing by ×1, ×2, and ×4, which yields vertical structure information. Branch B5 is the vertical pooling branch: it processes the output of the fourth residual block with horizontal global average pooling of size Pool(1,W), a Conv1×1 convolution, BN, and a ReLU activation function to obtain vertical global information. The horizontal global information, horizontal fine-grained information, spatial context information, vertical structure information, and vertical global information are concatenated along the channel dimension to obtain a fused feature map. A Conv1×1 convolution then integrates the information in this fused feature map and controls the number of channels, yielding the parallel branch information. Finally, the parallel branch information is added element-wise to the output of the fourth residual block to obtain the deep semantic information.
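The two pooling branches and the final residual addition can be sketched on a single-channel map. The sketch assumes that Pool(H,1) denotes a kernel spanning the full height (one average per column) and Pool(1,W) a kernel spanning the full width (one average per row). The convolution branches B2 to B4, BN, and ReLU are omitted, and a plain average of the two branches stands in for the concatenation followed by Conv1×1. All function names are hypothetical.

```python
from typing import List

Map2D = List[List[float]]


def pool_over_height(fm: Map2D) -> Map2D:
    """Pool(H,1) as assumed here: average each column, giving a 1 x W strip."""
    h, w = len(fm), len(fm[0])
    return [[sum(fm[r][c] for r in range(h)) / h for c in range(w)]]


def pool_over_width(fm: Map2D) -> Map2D:
    """Pool(1,W) as assumed here: average each row, giving an H x 1 strip."""
    return [[sum(row) / len(row)] for row in fm]


def broadcast(strip: Map2D, h: int, w: int) -> Map2D:
    """Repeat a 1 x W or H x 1 strip so that it covers an H x W map."""
    sh, sw = len(strip), len(strip[0])
    return [[strip[r % sh][c % sw] for c in range(w)] for r in range(h)]


def deep_semantic(fm: Map2D) -> Map2D:
    """Merge the pooled branches and add them to the input element-wise."""
    h, w = len(fm), len(fm[0])
    b1 = broadcast(pool_over_height(fm), h, w)  # horizontal global information
    b5 = broadcast(pool_over_width(fm), h, w)   # vertical global information
    # Stand-in for concatenation + Conv1x1: average the branch responses.
    merged = [[(b1[r][c] + b5[r][c]) / 2 for c in range(w)] for r in range(h)]
    # Residual connection: parallel branch information + block output.
    return [[fm[r][c] + merged[r][c] for c in range(w)] for r in range(h)]
```

Strip-shaped pooling suits long, thin targets such as conductors, since each strip summarizes an entire row or column at the cost of a single average.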

[0056] S250. Decode the initial image features, the deep semantic information, and each of the fused feature maps to obtain at least one decoded feature map.

[0057] In this embodiment, the decoded feature map can be understood as the image obtained after decoding the initial image features, deep semantic information and each fused feature map. The size of the decoded feature map is the same as the size of the transmission line image, and the decoded feature map can contain the details of the transmission line.

[0058] For example, as shown in Figure 2, each decoding module includes convolution, batch normalization, and ReLU activation operations, and uses upsampling to gradually restore the decoded feature map to the same size as the transmission line image. The fused feature maps output from the FSFF module are input into the corresponding decoding modules (Decoder Blocks), and each decoding module also receives the output of the previous decoding module. Specifically, the deep semantic information D5 and the fused feature map are used as input to the first decoding module to obtain the output D4; then the output of the first decoding module and the fused feature map are used as input to the second decoding module to obtain the output D3; then the output of the second decoding module and the fused feature map are used as input to the third decoding module to obtain the output D2; finally, the output of the third decoding module and the initial image features are used as input to the output module to obtain the output D1. The output module can be a decoding module. The outputs D1, D2, D3, and D4 of the decoding modules are used as the decoded feature maps.
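The wiring of the decoding modules can be sketched as below. The sketch assumes that each stage doubles the spatial size and that the skip input is already at the doubled size; nearest-neighbour upsampling followed by element-wise addition is a hypothetical stand-in for the convolution, batch normalization, and ReLU operations inside a real decoding module.

```python
from typing import Dict, List, Sequence

Map2D = List[List[float]]


def upsample2x(fm: Map2D) -> Map2D:
    """Nearest-neighbour 2x upsampling of a 2-D map."""
    out = []
    for row in fm:
        wide = [v for v in row for _ in range(2)]  # repeat each column
        out.append(wide)
        out.append(list(wide))                     # repeat each row
    return out


def decode_block(prev: Map2D, skip: Map2D) -> Map2D:
    """Toy decoding module: upsample the previous output, then add the skip."""
    up = upsample2x(prev)
    return [[a + b for a, b in zip(r_up, r_skip)] for r_up, r_skip in zip(up, skip)]


def run_decoder(d5: Map2D, fused_maps: Sequence[Map2D],
                initial_features: Map2D) -> Dict[str, Map2D]:
    """Chain D5 -> D4 -> D3 -> D2 -> D1.

    `fused_maps` holds the three fused feature maps, deepest first; the
    last stage takes the initial image features as its skip input.
    """
    outputs = {}
    x = d5
    for name, skip in zip(("D4", "D3", "D2"), fused_maps):
        x = decode_block(x, skip)
        outputs[name] = x
    outputs["D1"] = decode_block(x, initial_features)
    return outputs
```

Each stage sees both coarse context from the previous stage and finer detail from its skip input, which is how thin line structures can be recovered during upsampling.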

[0059] S260. Convolve the decoded feature maps and the deep semantic information to obtain the semantic segmentation map corresponding to the power transmission line image.

[0060] The image quality of the semantic segmentation map is the same as that of the transmission line image, and the image quality includes resolution and channels.

[0061] Specifically, the decoded feature maps and deep semantic information are convolved, and the convolution results are fused channel by channel to obtain the semantic segmentation map corresponding to the power transmission line image. The semantic segmentation map is a binary image with the same image quality as the power transmission line image; for example, the semantic segmentation map has the same resolution and channels as the power transmission line image.

[0062] For example, as shown in Figure 2, the decoded feature maps D1, D2, D3, D4 and the deep semantic information D5 are each convolved, and the convolved results are fused channel by channel to obtain the semantic segmentation map Dout.
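One possible reading of this output stage is sketched below: each map is reduced to a single channel by a 1×1 convolution, the single-channel results are stacked as channels, and a second 1×1 convolution fuses them. The sketch assumes all maps have already been brought to a common spatial size; the weights and the threshold used to produce a binary map are hypothetical.

```python
from typing import List, Sequence

Map2D = List[List[float]]
Map3D = List[Map2D]  # indexed as [channel][row][col]


def conv1x1(fm: Map3D, weights: Sequence[float], bias: float = 0.0) -> Map2D:
    """1x1 convolution from C channels to one: a weighted sum per pixel."""
    h, w = len(fm[0]), len(fm[0][0])
    return [[sum(wt * ch[r][c] for wt, ch in zip(weights, fm)) + bias
             for c in range(w)] for r in range(h)]


def fuse_outputs(maps: Sequence[Map3D], head_weights: Sequence[Sequence[float]],
                 fuse_weights: Sequence[float],
                 threshold: float = 0.0) -> List[List[int]]:
    """Convolve each map, fuse the results channel by channel, then binarize."""
    heads = [conv1x1(m, wts) for m, wts in zip(maps, head_weights)]
    logits = conv1x1(heads, fuse_weights)  # stacked heads act as channels
    return [[1 if v > threshold else 0 for v in row] for row in logits]
```

Calling `fuse_outputs` with the five maps D1 to D5 would give Dout under these assumptions.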

[0063] The technical solution of this invention involves determining the deep semantic information corresponding to the output of the residual block; decoding the initial image features, the deep semantic information, and each fused feature map to obtain at least one decoded feature map; and convolving each decoded feature map and the deep semantic information to obtain a semantic segmentation map corresponding to the power transmission line image. The deep semantic information enhances feature representation and improves the segmentation accuracy of the power transmission line image, and convolving the decoded feature maps and the deep semantic information yields the semantic segmentation map, so that the structure of the power transmission line is extracted more accurately, with good adaptability and robustness in power transmission line segmentation.

[0064] Embodiment 3 Figure 6 is a schematic diagram of the structure of a power transmission line image segmentation device according to Embodiment 3 of the present invention. As shown in Figure 6, the device includes: a determining module 310, used to determine the transmission line image, wherein the transmission line image includes the transmission line and the scene in which the transmission line is located; an input module 320, used to input the initial image features corresponding to the transmission line image into the residual block structure to obtain the residual block output, wherein the residual block structure consists of at least one residual block, the residual block output indicates information about the transmission line image, and the initial image features include the edge features of the transmission line in the transmission line image; a fusion module 330, used to perform a feature fusion operation on the initial image features and the residual block output to obtain at least one fused feature map corresponding to the transmission line image; and a segmentation module 340, used to obtain a semantic segmentation map corresponding to the power transmission line image based on the initial image features, each of the fused feature maps, and the deep semantic information corresponding to the output of the residual block, wherein the deep semantic information includes the context information output by the residual block, and the semantic segmentation map includes the image formed by the power transmission line.

[0065] The power transmission line image segmentation apparatus provided in this embodiment of the invention determines a power transmission line image through a determination module; inputs the initial image features corresponding to the power transmission line image into a residual block structure through an input module to obtain a residual block output; performs a feature fusion operation on the initial image features and the residual block output through a fusion module to obtain at least one fused feature map corresponding to the power transmission line image; and obtains a semantic segmentation map corresponding to the power transmission line image through a segmentation module based on the initial image features, each fused feature map, and the deep semantic information corresponding to the residual block output. Through the cooperation between the modules, the residual block structure realizes the extraction and expression of edge information of the power transmission line image, the feature fusion operation realizes the fusion of multi-scale information, improves the accuracy of power transmission line detection, and finally obtains a semantic segmentation map, achieving more accurate extraction of the power transmission line structure and improving the segmentation accuracy of the power transmission line image based on deep semantic information.

[0066] In one embodiment, the segmentation module 340 includes: The first determining unit is used to determine the deep semantic information corresponding to the output of the residual block; A decoding unit is used to perform decoding operations on the initial image features, the deep semantic information, and each of the fused feature maps to obtain at least one decoded feature map. A convolutional unit is used to convolve each of the decoded feature maps and the deep semantic information to obtain a semantic segmentation map corresponding to the power transmission line image. The image quality of the semantic segmentation map is the same as that of the power transmission line image, and the image quality includes resolution and channels.

[0067] In one embodiment, the first determining unit is specifically used for: Determine the fourth residual block output contained in the residual block output; Context extraction is performed on the output of the fourth residual block to obtain parallel branch information, which includes horizontal global information, horizontal fine-grained information, spatial context information, vertical structure information, and vertical global information. The parallel branch information is added element-wise to the output of the fourth residual block to obtain deep semantic information.

[0068] In one embodiment, the fusion module 330 includes: The second determining unit is used to determine at least one feature fusion scale corresponding to the feature fusion operation; The fusion unit is used to perform feature fusion operations on the initial image features and the residual block output according to the feature fusion scale for each feature fusion scale, so as to obtain a fused feature map.

[0069] In one embodiment, the fusion unit is specifically used for: A first downsampling is performed on the initial image features to obtain the first context feature; Determine the first residual block output, the second residual block output, the third residual block output, and the fourth residual block output included in the residual block output; A second downsampling is performed on the output of the first residual block to obtain the first abstract semantic information, wherein the downsampling factor of the first downsampling is greater than the downsampling factor of the second downsampling; Convolution is performed on the output of the second residual block to obtain the integrated feature; A first upsampling is performed on the output of the third residual block to obtain the second abstract semantic information; A second upsampling is performed on the output of the fourth residual block to obtain the second context feature, wherein the upsampling factor of the second upsampling is greater than the upsampling factor of the first upsampling; The first context feature, the first abstract semantic information, the integrated feature, the second abstract semantic information, and the second context feature are concatenated to obtain the concatenated information; Efficient additive attention calculation is performed on the concatenated information to obtain a fused feature map.
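The relationship between the sampling factors can be illustrated with a small planning helper. The strides used below (2, 4, 8, 16, 32 relative to the input image) are hypothetical values chosen only to show the pattern; the text does not state them. `resample_plan` is likewise a hypothetical name, and the attention step is not covered.

```python
from typing import List, Sequence, Tuple


def resample_plan(strides: Sequence[int],
                  target_index: int = 2) -> List[Tuple[str, int]]:
    """Return (operation, factor) per input so that all reach one scale.

    `strides` lists the stride of the initial image features followed by
    the four residual block outputs; the target scale is that of
    `strides[target_index]` (the second residual block output by default).
    """
    target = strides[target_index]
    plan = []
    for s in strides:
        if s < target:
            plan.append(("down", target // s))  # finer map: downsample
        elif s > target:
            plan.append(("up", s // target))    # coarser map: upsample
        else:
            plan.append(("keep", 1))            # already at target: convolve only
    return plan


# Assumed strides: initial features, then residual block outputs 1 to 4.
ASSUMED_STRIDES = [2, 4, 8, 16, 32]
```

With the assumed strides the plan is down ×4, down ×2, keep, up ×2, up ×4, which matches the stated ordering: the first downsampling factor exceeds the second, and the second upsampling factor exceeds the first. Once aligned, the five maps can be concatenated along the channel dimension.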

[0070] In one embodiment, the input module 320 is specifically used for: Extract initial image features from the transmission line image; According to the order of each residual block in the residual block structure, the first residual block, the second residual block, the third residual block and the fourth residual block that make up the residual block structure are determined; The initial image features are input into the first residual block to obtain the first residual block output; The output of the first residual block is input into the second residual block to obtain the output of the second residual block; The output of the second residual block is input into the third residual block to obtain the output of the third residual block; The output of the third residual block is input into the fourth residual block to obtain the output of the fourth residual block; The first residual block output, the second residual block output, the third residual block output, and the fourth residual block output are determined as residual block outputs.

[0071] In one embodiment, the determining module 310 is specifically used for: Acquire initial images of the power transmission line, which include the acquired images of the power transmission line and the scene in which the power transmission line is located; The initial transmission line image is downsampled to obtain a downsampled image; The downsampled image is cropped at a set interval to obtain at least one cropped image. For each cropped image, if the cropped image contains the power transmission line, the cropped image is determined to be a power transmission line image.

[0072] The power transmission line image segmentation device provided in this embodiment of the invention can execute the power transmission line image segmentation method provided in any embodiment of the invention. Through the cooperation and collaborative work between the modules, the segmentation of the power transmission line image is completed, and it has the corresponding functional modules and beneficial effects of the execution method.

[0073] Embodiment 4 According to embodiments of the present invention, the present invention also provides an electronic device, a computer-readable storage medium, and a computer program product.

[0074] Figure 7 is a block diagram of an electronic device according to Embodiment 4 of the present invention that implements the power transmission line image segmentation method described in the embodiments of the present invention. The electronic device is intended to represent various forms of digital computers, such as laptop computers, desktop computers, workstations, personal digital assistants, servers, blade servers, mainframe computers, and other suitable computers. The electronic device may also represent various forms of mobile devices, such as personal digital processors, cellular phones, smartphones, wearable devices (such as helmets, glasses, watches, etc.), and other similar computing devices. The components shown herein, their connections and relationships, and their functions are merely illustrative and are not intended to limit the implementation of the invention described and/or claimed herein.

[0075] As shown in Figure 7, the electronic device 410 includes at least one processor 411 and a memory, such as a read-only memory (ROM) 412 or a random access memory (RAM) 413, communicatively connected to the at least one processor 411. The memory stores computer programs executable by the at least one processor. The processor 411 can perform various appropriate actions and processes based on the computer program stored in the ROM 412 or loaded from storage unit 418 into the RAM 413. The RAM 413 may also store various programs and data required for the operation of the electronic device 410. The processor 411, ROM 412, and RAM 413 are interconnected via a bus 414. An input/output (I/O) interface 415 is also connected to the bus 414.

[0076] Multiple components in the electronic device are connected to the I / O interface 415, including: an input unit 416, such as a keyboard, mouse, etc.; an output unit 417, such as various types of displays, speakers, etc.; a storage unit 418, such as a disk, optical disk, etc.; and a communication unit 419, such as a network card, modem, wireless transceiver, etc. The communication unit 419 allows the electronic device to exchange information / data with other devices through computer networks such as the Internet and / or various telecommunications networks.

[0077] Processor 411 can be a variety of general-purpose and / or special-purpose processing components with processing and computing capabilities. Some examples of processor 411 include, but are not limited to, a central processing unit (CPU), a graphics processing unit (GPU), various special-purpose artificial intelligence (AI) computing chips, various processors running machine learning model algorithms, a digital signal processor (DSP), and any suitable processor, controller, microcontroller, etc. Processor 411 performs the various methods and processes described above, such as the power line image segmentation method.

[0078] In some embodiments, the power transmission line image segmentation method may be implemented as a computer program tangibly contained in a computer-readable storage medium, such as storage unit 418. In some embodiments, part or all of the computer program may be loaded and / or mounted on electronic device 410 via ROM 412 and / or communication unit 419. When the computer program is loaded into RAM 413 and executed by processor 411, one or more steps of the power transmission line image segmentation method described above may be performed. Alternatively, in other embodiments, processor 411 may be configured to perform the power transmission line image segmentation method by any other suitable means (e.g., by means of firmware).

[0079] Various embodiments of the systems and techniques described above herein can be implemented in digital electronic circuit systems, integrated circuit systems, field-programmable gate arrays (FPGAs), application-specific integrated circuits (ASICs), application-specific standard products (ASSPs), systems-on-a-chip (SoCs), complex programmable logic devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include implementations in one or more computer programs that can be executed and/or interpreted on a programmable system including at least one programmable processor, which may be a dedicated or general-purpose programmable processor, capable of receiving data and instructions from a storage system, at least one input device, and at least one output device, and transmitting data and instructions to the storage system, the at least one input device, and the at least one output device.

[0080] Computer programs used to implement the methods of the present invention may be written in any combination of one or more programming languages. These computer programs may be provided to a processor of a general-purpose computer, a special-purpose computer, or other programmable data processing device, such that when executed by the processor, the computer programs cause the functions / operations specified in the flowcharts and / or block diagrams to be performed. The computer programs may be executed entirely on a machine, partially on a machine, or as a standalone software package, partially on a machine and partially on a remote machine, or entirely on a remote machine or server.

[0081] In the context of this invention, a computer-readable storage medium can be a tangible medium that may contain or store a computer program for use by or in conjunction with an instruction execution system, apparatus, or device. A computer-readable storage medium may include, but is not limited to, electronic, magnetic, optical, electromagnetic, infrared, or semiconductor systems, apparatus, or devices, or any suitable combination thereof. Alternatively, a computer-readable storage medium may be a machine-readable signal medium. More specific examples of machine-readable storage media include electrical connections based on one or more wires, portable computer disks, hard disks, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fibers, portable compact disk read-only memory (CD-ROM), optical storage devices, magnetic storage devices, or any suitable combination thereof.

[0082] To provide interaction with a user, the systems and techniques described herein can be implemented on an electronic device having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to the user; and a keyboard and pointing device (e.g., a mouse or trackball) through which the user provides input to the electronic device. Other types of devices can also be used to provide interaction with the user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user can be received in any form (including sound input, voice input, or tactile input).

[0083] The systems and technologies described herein can be implemented in computing systems that include backend components (e.g., as data servers), or middleware components (e.g., application servers), or frontend components (e.g., user computers with graphical user interfaces or web browsers through which users can interact with implementations of the systems and technologies described herein), or any combination of such backend, middleware, or frontend components. The components of the system can be interconnected via digital data communication of any form or medium (e.g., communication networks). Examples of communication networks include local area networks (LANs), wide area networks (WANs), blockchain networks, and the Internet.

[0084] A computing system can include clients and servers. Clients and servers are generally located far apart and typically interact through communication networks. The client-server relationship is created by computer programs running on the respective computers and having a client-server relationship with each other. The server can be a cloud server, also known as a cloud computing server or cloud host, which is a hosting product within the cloud computing service system to address the shortcomings of traditional physical hosts and VPS services, such as high management difficulty and weak business scalability.

[0085] In some embodiments, the computer program product includes a computer program that, when executed by a processor, implements the power line image segmentation method provided in the embodiments of the present invention.

[0086] The technical solution of this invention provides a method, apparatus, electronic device, storage medium, and program product for segmenting power transmission line images. It involves determining a power transmission line image; inputting initial image features corresponding to the power transmission line image into a residual block structure to obtain residual block output; performing a feature fusion operation on the initial image features and the residual block output to obtain at least one fused feature map corresponding to the power transmission line image; and obtaining a semantic segmentation map corresponding to the power transmission line image based on the initial image features, each fused feature map, and the deep semantic information corresponding to the residual block output. The residual block structure enables the extraction and representation of edge information in the power transmission line image, the feature fusion operation achieves the fusion of multi-scale information, improving the accuracy of power transmission line detection, and finally, the semantic segmentation map is obtained. This achieves more accurate extraction of the power transmission line structure and improves the segmentation accuracy of the power transmission line image based on deep semantic information.

[0087] It should be understood that the various forms of processes shown above can be used, with steps reordered, added, or deleted. For example, the steps described in this invention can be executed in parallel, sequentially, or in different orders, as long as the desired result of the technical solution of this invention can be achieved, and this is not limited herein.

[0088] The specific embodiments described above do not constitute a limitation on the scope of protection of this invention. Those skilled in the art should understand that various modifications, combinations, sub-combinations, and substitutions can be made according to design requirements and other factors. Any modifications, equivalent substitutions, and improvements made within the spirit and principles of this invention should be included within the scope of protection of this invention.

Claims

1. A power transmission line image segmentation method, characterized in that it comprises: determining a power transmission line image, wherein the power transmission line image includes the power transmission line and the scene in which the power transmission line is located; inputting the initial image features corresponding to the transmission line image into the residual block structure to obtain the residual block output, wherein the residual block structure consists of at least one residual block, the residual block output indicates information about the transmission line image, and the initial image features include the edge features of the transmission line in the transmission line image; performing a feature fusion operation on the initial image features and the residual block output to obtain at least one fused feature map corresponding to the transmission line image; and obtaining a semantic segmentation map corresponding to the transmission line image based on the initial image features, each of the fused feature maps, and the deep semantic information corresponding to the output of the residual block, wherein the deep semantic information includes the context information output by the residual block, and the semantic segmentation map includes the image formed by the transmission line.

2. The method according to claim 1, characterized in that, The step of obtaining a semantic segmentation map corresponding to the transmission line image based on the initial image features, each of the fused feature maps, and the deep semantic information corresponding to the residual block output includes: Determine the deep semantic information corresponding to the output of the residual block; Decoding operations are performed on the initial image features, the deep semantic information, and each of the fused feature maps to obtain at least one decoded feature map; The decoded feature maps and the deep semantic information are convolved to obtain the semantic segmentation map corresponding to the power transmission line image. The image quality of the semantic segmentation map is the same as that of the power transmission line image, and the image quality includes resolution and channels.

3. The method according to claim 2, characterized in that, The residual block output includes a first residual block output, a second residual block output, a third residual block output, and a fourth residual block output. Determining the deep semantic information corresponding to the residual block output includes: Determine the fourth residual block output contained in the residual block output; Context extraction is performed on the output of the fourth residual block to obtain parallel branch information, which includes horizontal global information, horizontal fine-grained information, spatial context information, vertical structure information, and vertical global information. The parallel branch information is added element-wise to the output of the fourth residual block to obtain deep semantic information.

4. The method according to claim 1, characterized in that, The step of performing a feature fusion operation on the initial image features and the residual block output to obtain at least one fused feature map corresponding to the transmission line image includes: Determine at least one feature fusion scale corresponding to the feature fusion operation; For each feature fusion scale, a feature fusion operation is performed on the initial image features and the residual block output according to the feature fusion scale to obtain a fused feature map.

5. The method according to claim 4, characterized in that the step of performing a feature fusion operation on the initial image features and the residual block output to obtain a fused feature map includes: performing a first downsampling on the initial image features to obtain a first context feature; determining the first residual block output, the second residual block output, the third residual block output, and the fourth residual block output included in the residual block output; performing a second downsampling on the first residual block output to obtain first abstract semantic information, wherein the downsampling factor of the first downsampling is greater than the downsampling factor of the second downsampling; performing convolution on the second residual block output to obtain an integrated feature; performing a first upsampling on the third residual block output to obtain second abstract semantic information; performing a second upsampling on the fourth residual block output to obtain a second context feature, wherein the upsampling factor of the second upsampling is greater than the upsampling factor of the first upsampling; concatenating the first context feature, the first abstract semantic information, the integrated feature, the second abstract semantic information, and the second context feature to obtain concatenated information; and performing efficient additive attention calculation on the concatenated information to obtain the fused feature map.
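The resampling pattern in claim 5 can be checked with a small bookkeeping sketch. It assumes, purely for illustration, that the initial image features and the four residual block outputs sit at strides 2, 4, 8, 16, and 32 relative to the input image, and that fusion happens at the resolution of the second residual block output; the claim states neither the strides nor the target, and `fusion_plan` is a hypothetical helper. The concatenation and the efficient additive attention step are not modelled here.

```python
def fusion_plan(strides, target_index):
    """List the resampling needed to bring every source to one resolution.

    strides: stride of each source relative to the input image, ordered as
             initial features, then residual block outputs 1 to 4.
    target_index: index of the source whose resolution is the fusion scale.
    """
    names = ["initial", "res1", "res2", "res3", "res4"]
    target = strides[target_index]
    plan = []
    for name, stride in zip(names, strides):
        if stride < target:
            # Finer than the target: reduce resolution.
            plan.append((name, "downsample", target // stride))
        elif stride > target:
            # Coarser than the target: increase resolution.
            plan.append((name, "upsample", stride // target))
        else:
            # Already at the target resolution: convolution only.
            plan.append((name, "conv", 1))
    return plan

# Assumed strides; not taken from the patent text.
plan = fusion_plan([2, 4, 8, 16, 32], target_index=2)
```

Under these assumed strides the initial features are downsampled by 4 and the first residual block output by 2, while the third and fourth outputs are upsampled by 2 and 4, which matches the ordering of factors required by the claim.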

6. The method according to claim 1, characterized in that the step of inputting the initial image features corresponding to the transmission line image into the residual block structure to obtain the residual block output includes: extracting the initial image features from the transmission line image; determining, according to the order of the residual blocks in the residual block structure, the first residual block, the second residual block, the third residual block, and the fourth residual block that make up the residual block structure; inputting the initial image features into the first residual block to obtain the first residual block output; inputting the first residual block output into the second residual block to obtain the second residual block output; inputting the second residual block output into the third residual block to obtain the third residual block output; inputting the third residual block output into the fourth residual block to obtain the fourth residual block output; and determining the first residual block output, the second residual block output, the third residual block output, and the fourth residual block output as the residual block output.
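The chaining described in claim 6 amounts to feeding each block's output into the next block while keeping every intermediate result. The sketch below shows only that control flow: features are flat lists of floats, and `make_residual_block` is a hypothetical helper that wraps an arbitrary transform in a skip connection. Real residual blocks would also change resolution and channel count, which is omitted here.

```python
def make_residual_block(transform):
    """Wrap a transform in a skip connection: block(x) = x + transform(x)."""
    def block(x):
        return [xi + ti for xi, ti in zip(x, transform(x))]
    return block

def run_residual_structure(initial_features, blocks):
    """Pass features through the blocks in order and keep every output."""
    outputs = []
    x = initial_features
    for block in blocks:
        x = block(x)        # output of this block feeds the next one
        outputs.append(x)   # retained for later fusion and decoding
    return outputs

# Four identical toy blocks; the halving transform is an arbitrary stand-in.
blocks = [make_residual_block(lambda x: [0.5 * v for v in x]) for _ in range(4)]
outputs = run_residual_structure([2.0, 4.0], blocks)
```

Returning the full list rather than only the last element reflects the claim, in which all four outputs together form the residual block output used by the fusion and segmentation steps.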

7. The method according to claim 1, characterized in that the process of determining the transmission line image includes: acquiring an initial transmission line image, which is a captured image of the power transmission line and the scene in which the power transmission line is located; downsampling the initial transmission line image to obtain a downsampled image; cropping the downsampled image at a set interval to obtain at least one cropped image; and, for each cropped image, determining the cropped image to be a transmission line image if the cropped image contains the power transmission line.
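The preprocessing in claim 7 can be sketched as a downsample, a sliding-window crop, and a filter. The claim does not say how a crop is judged to contain the power transmission line; the sketch assumes a binary label mask aligned with the image is available, which is an assumption, as are the function names, the strided downsampling, and the square crop size.

```python
def downsample(image, factor):
    """Keep every factor-th pixel in both directions (simple strided sampling)."""
    return [row[::factor] for row in image[::factor]]

def crop_windows(image, size, step):
    """Yield (top, left, crop) for square crops taken at a fixed interval."""
    h, w = len(image), len(image[0])
    for top in range(0, h - size + 1, step):
        for left in range(0, w - size + 1, step):
            yield top, left, [row[left:left + size]
                              for row in image[top:top + size]]

def select_line_crops(image, mask, factor, size, step):
    """Return crops of the downsampled image whose mask crop has a line pixel."""
    small_image = downsample(image, factor)
    small_mask = downsample(mask, factor)
    kept = []
    for (top, left, crop), (_, _, mask_crop) in zip(
            crop_windows(small_image, size, step),
            crop_windows(small_mask, size, step)):
        if any(any(row) for row in mask_crop):
            kept.append((top, left, crop))
    return kept

# Toy 8x8 image with a line along the top row of the assumed label mask.
image = [[r * 8 + c for c in range(8)] for r in range(8)]
mask = [[1 if r == 0 else 0 for _ in range(8)] for r in range(8)]
kept = select_line_crops(image, mask, factor=2, size=2, step=2)
```

Strided sampling can drop lines that are only one pixel wide, so a practical pipeline would likely use an area-averaging or interpolating resize instead; the simple version is used here to keep the example dependency-free.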

8. A transmission line image segmentation device, characterized by comprising: a determination module, used to determine a power transmission line image, wherein the power transmission line image includes the power transmission line and the scene in which the power transmission line is located; an input module, used to input the initial image features corresponding to the transmission line image into the residual block structure to obtain the residual block output, wherein the residual block structure consists of at least one residual block, the residual block output indicates information about the transmission line image, and the initial image features include the edge features of the transmission line in the transmission line image; a fusion module, used to perform a feature fusion operation on the initial image features and the residual block output to obtain at least one fused feature map corresponding to the transmission line image; and a segmentation module, used to obtain a semantic segmentation map corresponding to the power transmission line image based on the initial image features, each of the fused feature maps, and the deep semantic information corresponding to the residual block output, wherein the deep semantic information includes the context information of the residual block output, and the semantic segmentation map includes the image formed by the power transmission line.

9. An electronic device, characterized in that the electronic device includes: at least one processor; and a memory communicatively connected to the at least one processor; wherein the memory stores a computer program that can be executed by the at least one processor, the computer program being executed by the at least one processor to enable the at least one processor to perform the power transmission line image segmentation method according to any one of claims 1-7.

10. A computer-readable storage medium, characterized in that the computer-readable storage medium stores computer instructions that cause a processor to execute the power transmission line image segmentation method according to any one of claims 1-7.

11. A computer program product, characterized in that the computer program product includes a computer program that, when executed by a processor, implements the power transmission line image segmentation method according to any one of claims 1-7.