A method for identifying a tower pole in foggy weather and related products

By combining multi-scale analysis and adaptive thresholding techniques with multi-layer convolutional neural networks and feature fusion, the problem of tower identification in foggy weather has been solved, achieving high-precision and high-reliability tower identification and ensuring the stability of power, communication and transportation systems.

CN118865118B · Active · Publication Date: 2026-06-19 · SICHUAN POWER EHV OVERHAUL


Patent Information

Authority / Receiving Office: CN · China
Patent Type: Patents(China)
Current Assignee / Owner: SICHUAN POWER EHV OVERHAUL
Filing Date: 2024-07-04
Publication Date: 2026-06-19


IPC: G06V20/10; G06V10/44; G06V10/80; G06V10/82; G06N3/0464

AI Tagging

Application Domain

Character and pattern recognition Biological models


Abstract

This invention relates to the field of power maintenance technology, specifically to a method and related products for identifying power towers in foggy weather. The method includes: acquiring the coordinate data of the power tower and obtaining the original image at the coordinate location; preprocessing the original image to obtain a fog-free image; performing edge detection on the fog-free image using multi-scale analysis and adaptive thresholding technology to identify slender structures in the fog-free image; and identifying the power tower from multiple slender structures using feature fusion and attention mechanisms. This invention obtains a clearer fog-free image through multi-level blurring and Laplacian decomposition; performs edge detection using multi-scale analysis and adaptive thresholding technology to identify slender structures in the fog-free image; and finally, accurately identifies the power tower from multiple slender structures through the application of feature fusion and attention mechanisms. This effectively solves the problem of power tower identification under foggy weather conditions, improves identification accuracy and reliability, and ensures the safe operation of power, communication, and transportation sectors.


Description

Technical Field

[0001] This invention relates to the field of power maintenance technology, specifically to a method for identifying power towers in foggy weather and related products.

Background Technology

[0002] Tower poles are widely used in power, communication, and transportation sectors. Their main function is to support power lines, antennas, and other equipment, ensuring the stability of signal transmission and power supply. The primary purpose of tower pole identification is to monitor their status in real time under various weather conditions, promptly identify and address potential safety hazards, and ensure the safe operation of lines and equipment. However, foggy weather conditions significantly reduce visibility, affecting image quality and posing challenges to tower pole monitoring and maintenance.

[0003] Traditional image recognition methods often perform poorly when dealing with these slender structures, struggling to accurately identify the location and shape of towers. Existing technologies also struggle in complex scenes, and in foggy weather, slender structures in images are more easily obscured by noise and blur, further increasing the difficulty of automatic recognition algorithms.

Summary of the Invention

[0004] The technical problem to be solved by this invention is that the existing technology does not achieve ideal recognition results in foggy weather. The purpose is to provide a method and related products for identifying towers in foggy weather. By combining multi-scale analysis and adaptive threshold technology for tower identification, higher recognition accuracy and robustness are achieved.

[0005] This invention is achieved through the following technical solution:

[0006] A method for identifying tower poles in foggy weather includes:

[0007] Obtain the coordinate data of the tower and the original image at the coordinate location;

[0008] Preprocess the original image to obtain a haze-free image;

[0009] By combining multi-scale analysis and adaptive thresholding techniques, edge detection is performed on fog-free images to identify slender structures in fog-free images;

[0010] Towers are identified from multiple slender structures by combining feature fusion and attention mechanisms.

[0011] Specifically, methods for preprocessing the original image include:

[0012] The original image is blurred in multiple layers using a Gaussian filter, where $(x,y)$ are pixel coordinates, $I(x,y)$ is the original image, $\sigma_i$ is the standard deviation of the Gaussian kernel for the $i$-th layer, and $G_i(x,y)$ is the Gaussian blurred image of the $i$-th layer;

[0013] To obtain multi-scale details, subtract the next layer of Gaussian blurred image from the current one: $L(x,y,i)=I(x,y)*G_i(x,y)-I(x,y)*G_{i+1}(x,y)$, where $L(x,y,i)$ is the Laplacian decomposition result of the $i$-th layer;

[0014] Estimate the background light $B(x,y)$ and the transmittance $T(x,y)$, where $\omega$ is the transmittance calculation constant and $\Omega(x,y)$ is the local region centered on pixel $(x,y)$;

[0015] Obtain the corrected image, where $\beta$ is the atmospheric scattering coefficient, $d(x,y)$ is the scene depth of pixel $(x,y)$, and $e$ is the natural constant;

[0016] Improve the contrast of the corrected image through local adaptive histogram equalization to obtain the fog-free image, where $\alpha$ is the enhancement factor and $N(x,y)$ is the neighborhood centered at pixel $(x,y)$.

[0017] Optionally, the scene depth d(x,y) of pixel (x,y) can be obtained as follows:

[0018] A first image $I_1(x,y)$ is obtained at a first position, and after the camera is translated a certain distance, a second image $I_2(x,y)$ is obtained at a second position.

[0019] Determine the coordinate data $(X_t, Y_t, Z_t)$ of the tower, the coordinate data $(X_{d1}, Y_{d1}, Z_{d1})$ of the first position, and the coordinate data $(X_{d2}, Y_{d2}, Z_{d2})$ of the second position, all in the global geographic coordinate system;

[0020] Find the best match between the points in the first image and the corresponding points in the second image, and obtain the disparity $\mathrm{disp}(x,y)$;

[0021] Calculate the scene depth $d(x,y)$, where $f$ is the focal length of the camera and the baseline is the distance between the first and second positions.

[0022] Furthermore, after obtaining the scene depth, the scene depth is verified. The verification methods include:

[0023] Determine the distance between the first position and the tower.

[0024] Determine the distance between the second position and the tower.

[0025] Determine the reference distance $d_0$ from the two distances above.

[0026] Set an error threshold $d_T$. If $|d(x,y)-d_0|\le d_T$, the scene depth is accepted; if $|d(x,y)-d_0|>d_T$, the first and second images are reacquired and the scene depth is recalculated.

[0027] Optionally, methods for obtaining disparity disp(x,y) include:

[0028] The first and second images are aligned on the horizontal line using image correction methods.

[0029] Determine the disparity search range $[0,\mathrm{disp}_{\max}]$ and randomly sample a preset disparity $\mathrm{disp}$ within it, where $\mathrm{disp}_{\max}$ is the maximum disparity;

[0030] Based on the preset disparity, calculate the sum of absolute differences $\mathrm{SAD}(x,y,\mathrm{disp})=\sum_{(i,j)\in W}\left|I_1(x+i,y+j)-I_2(x+i-\mathrm{disp},y+j)\right|$, where $W$ is a window surrounding the point $(x,y)$;

[0031] Based on the preset disparity, calculate the normalized cross-correlation $\mathrm{NCC}(x,y,\mathrm{disp})$, which uses the average pixel value of the selected window $W$ in the first image and the average pixel value of the selected window $W$ in the second image;

[0032] The preset disparity $\mathrm{disp}$ is sampled multiple times to obtain the disparity value $\mathrm{disp}_1$ that minimizes the sum of absolute differences $\mathrm{SAD}(x,y,\mathrm{disp})$ and the disparity value $\mathrm{disp}_2$ that maximizes the normalized cross-correlation $\mathrm{NCC}(x,y,\mathrm{disp})$.

[0033] Obtain the disparity $\mathrm{disp}(x,y)$ from $\mathrm{disp}_1$ and $\mathrm{disp}_2$.

[0034] Specifically, methods for edge detection include:

[0035] Apply Gaussian filtering at different scales $\sigma_i$ and calculate the gradient of the haze-free image $I_{\text{enhanced}}(x,y)$, using the responses of the Gaussian derivative filter at scale $\sigma_i$ in the x-direction and in the y-direction;

[0036] Enhance the consistency of the gradient direction of each pixel across scales and obtain the normalization factor $\Theta_{\text{cons}}(x,y)$ from the gradient direction at each scale $\sigma_i$ and $\theta_{\text{mean}}$, the average of the gradient directions over all scales at pixel $(x,y)$;

[0037] Determine the adaptive threshold $T_{\text{adaptive}}(x,y)=\mu_{\text{local}}(x,y)+k\cdot(\sigma_{\text{global}}-\sigma_{\text{local}}(x,y))\cdot\Theta_{\text{cons}}(x,y)$, where $\mu_{\text{local}}(x,y)$ is the average brightness within the local neighborhood of pixel $(x,y)$, $\sigma_{\text{global}}$ is the standard deviation of the haze-free image, $\sigma_{\text{local}}(x,y)$ is the local standard deviation at pixel $(x,y)$, and $k$ is the adjustment factor;

[0038] Non-maximum suppression and edge-tracking connectivity are used to perform edge detection on haze-free images to obtain images of slender structures.

[0039] Specifically, methods for identifying towers include:

[0040] A multi-layer convolutional neural network is constructed, with batch normalization and a ReLU activation applied after each convolutional layer, followed by max pooling, to extract feature maps of the slender-structure image $E_{\text{final}}(x,y)$: $F_l=\max(\mathrm{ReLU}(\mathrm{BN}(W_l*F_{l-1}+b_l))),\ l=1,2,\dots,L$, where $F_l$ is the feature map of layer $l$, $W_l$ is the weight of the $l$-th convolutional layer, $b_l$ is the bias of the $l$-th convolutional layer, BN is the batch normalization function, ReLU is the ReLU activation function, $\max$ denotes max pooling, and $*$ is convolution;

[0041] The feature maps from different layers are fused to obtain a fused feature map $F_{\text{fusion}}$, where $\alpha_l$ are the learned fusion weights;

[0042] Feature selection enhancement is performed on the fused features based on the normalization factor to obtain the enhanced fused features $F_{\text{input}}=\mathrm{concat}(F_{\text{fusion}},\Theta_{\text{cons}})$, where concat means concatenating features along the channel dimension;

[0043] A spatial attention mechanism enhances important features and suppresses irrelevant ones through the attention map $A$, giving the attention-weighted feature map $F_{\text{att}}=A\odot F_{\text{input}}$ with $A=\sigma_A(W_A*F_{\text{input}}+b_A)\odot\Theta_{\text{cons}}$, where $W_A$ is the weight of the attention layer, $b_A$ is the bias of the attention layer, $\sigma_A$ is the Sigmoid activation function, and $\odot$ is element-wise multiplication;

[0044] The slender structure is finally classified using a fully connected layer and a Softmax function: $Y=\mathrm{Softmax}(W_C\cdot F_{\text{att}}+b_C)$, where $W_C$ is the weight of the classification layer and $b_C$ is the bias of the classification layer.

[0045] A tower identification terminal in foggy weather includes a memory, a processor, and a computer program stored in the memory and executable on the processor. When the processor executes the computer program, it implements the tower identification method in foggy weather as described above.

[0046] A computer-readable storage medium storing a computer program that, when executed by a processor, implements a method for identifying tower poles in foggy weather as described above.

[0047] A computer program product includes a computer program / instructions that, when executed by a processor, implement a method for identifying towers in foggy weather as described above.

[0048] Compared with the prior art, the present invention has the following advantages and beneficial effects:

[0049] This invention effectively removes the effects of fog through multi-level blurring and Laplacian decomposition, obtaining clearer fog-free images and improving image quality. It combines multi-scale analysis and adaptive thresholding technology for edge detection, identifying slender structures in fog-free images. Finally, by applying feature fusion and attention mechanisms, it accurately identifies towers from multiple slender structures. This effectively solves the problem of tower identification under foggy weather conditions, improving recognition accuracy and reliability, and ensuring the safe operation of power, communication, and transportation sectors.

Attached Figure Description

[0050] The accompanying drawings illustrate exemplary embodiments of the present invention and, together with the description thereof, serve to explain the principles of the invention. These drawings are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, but do not constitute a limitation on the embodiments of the present invention.

[0051] Figure 1 is a flowchart illustrating a method for identifying tower poles in foggy weather according to the present invention.

[0052] Figure 2 is a flowchart illustrating the method for preprocessing an original image according to the present invention.

[0053] Figure 3 is a flowchart illustrating the method for obtaining pixel scene depth according to the present invention.

[0054] Figure 4 is a flowchart illustrating the method for obtaining parallax according to the present invention.

Detailed Implementation

[0055] To make the objectives, technical solutions, and advantages of this invention clearer, the invention will be further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are for illustrative purposes only and are not intended to limit the scope of the invention.

[0056] It should also be noted that, for ease of description, only the parts relevant to the present invention are shown in the accompanying drawings.

[0057] Where there is no conflict, the embodiments and features described in the present invention can be combined with each other. The present invention will now be described in detail with reference to the accompanying drawings and embodiments.

[0058] Example 1

[0059] As shown in Figure 1, a method for identifying tower poles in foggy weather is provided, including:

[0060] Obtain the coordinate data of the tower and the original image at the coordinate location; the coordinate data of the tower can come from GPS, Beidou positioning or other positioning systems, or it can be directly from historical data that has already been recorded.

[0061] Preprocessing is performed on the original image to obtain a haze-free image; preprocessing removes the haze effect from the original image. Common preprocessing techniques include multi-level blurring and Laplacian decomposition to separate multi-scale details in the image. Simultaneously, by estimating background light and transmittance, the image is corrected and contrast is enhanced, ultimately resulting in a clear, haze-free image. Preprocessing improves image quality, making subsequent edge detection and tower recognition more accurate.

[0062] Multi-scale analysis and adaptive thresholding are combined to perform edge detection on the haze-free image and identify elongated structures in it. Multi-scale analysis processes the image with filters of different scales to acquire features at each scale. Adaptive thresholding dynamically adjusts the edge detection threshold based on the local characteristics of the image. Combining the two allows better detection of elongated structures, such as the contours and edges of towers.

[0063] This method combines feature fusion and attention mechanisms to identify tower poles from multiple slender structures. Feature fusion integrates features at different levels to improve recognition accuracy. Attention mechanisms are a method in neural networks that improves performance by focusing on important features and suppressing irrelevant features. In the tower pole recognition process, feature fusion and attention mechanisms can accurately identify tower poles from multiple detected slender structures.

[0064] Finally, features are extracted through a multi-layer convolutional neural network, and feature selection and enhancement are performed in conjunction with an attention mechanism to ensure the robustness and accuracy of the recognition.

[0065] Specific application examples are as follows:

[0066] Suppose that within an area managed by a power company, multiple power transmission towers have been installed. These towers are located in mountainous areas and are frequently affected by fog. Traditional monitoring methods are unable to accurately identify and monitor the towers' status under foggy conditions.

[0067] When installing towers, first determine the specific location of each tower and record its coordinates.

[0068] The system uses drones or ground-mounted cameras to acquire raw images of various coordinate areas. Then, through preprocessing, edge detection, and tower recognition, the final tower is identified, and subsequent monitoring is carried out.

[0069] Example 2

[0070] As shown in Figure 2, methods for preprocessing the original image include:

[0071] The original image is blurred in multiple layers using a Gaussian filter, where $(x,y)$ are pixel coordinates, $I(x,y)$ is the original image, $\sigma_i$ is the standard deviation of the Gaussian kernel for the $i$-th layer, and $G_i(x,y)$ is the Gaussian blurred image of the $i$-th layer. The Gaussian filter smooths the image by applying a Gaussian function, reducing noise and fine detail. Using Gaussian filters with different standard deviations generates a stack of progressively blurred images.

[0072] To obtain multi-scale details, subtract the next layer of Gaussian blurred image from the current one: $L(x,y,i)=I(x,y)*G_i(x,y)-I(x,y)*G_{i+1}(x,y)$, where $L(x,y,i)$ is the Laplacian decomposition result of the $i$-th layer. Extracting image detail at different scales provides more accurate data for the subsequent background light and transmittance estimation.
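The layer subtraction in [0072] can be sketched in plain Python as follows. This is a minimal illustration, not the patented implementation: it reads the formula literally, treating $G_i$ as a Gaussian kernel convolved with $I$, and the function names, the kernel radius of $3\sigma$, the border clamping, and the choice of $\sigma$ values are all assumptions made for the example.

```python
import math

def gaussian_kernel_1d(sigma):
    """Normalized 1-D Gaussian kernel with an assumed radius of ceil(3*sigma)."""
    radius = max(1, math.ceil(3 * sigma))
    weights = [math.exp(-(k * k) / (2 * sigma * sigma)) for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def blur_rows(image, kernel):
    """Convolve every row with the kernel, clamping indices at the image border."""
    radius = len(kernel) // 2
    width = len(image[0])
    return [
        [sum(kernel[k + radius] * row[min(max(x + k, 0), width - 1)]
             for k in range(-radius, radius + 1))
         for x in range(width)]
        for row in image
    ]

def transpose(image):
    return [list(col) for col in zip(*image)]

def gaussian_blur(image, sigma):
    """Separable 2-D Gaussian blur: rows first, then columns."""
    kernel = gaussian_kernel_1d(sigma)
    return transpose(blur_rows(transpose(blur_rows(image, kernel)), kernel))

def dog_layers(image, sigmas):
    """Return the blurred stack and L(x,y,i) = I*G_i - I*G_{i+1} for consecutive sigmas."""
    blurred = [gaussian_blur(image, s) for s in sigmas]
    layers = [
        [[a - b for a, b in zip(row_a, row_b)] for row_a, row_b in zip(img_a, img_b)]
        for img_a, img_b in zip(blurred, blurred[1:])
    ]
    return blurred, layers

# A single bright pixel on a dark 7x7 background (illustrative input).
image = [[0.0] * 7 for _ in range(7)]
image[3][3] = 1.0
blurred, layers = dog_layers(image, sigmas=[1.0, 2.0, 4.0])
```

With three scales the sketch yields two detail layers; the finest layer is positive at the bright pixel because the narrower blur keeps more of the peak than the wider one.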

[0073] Estimate the background light $B(x,y)$ and the transmittance $T(x,y)$, where $\omega$ is the transmittance calculation constant and $\Omega(x,y)$ is the local region centered on pixel $(x,y)$. The background light $B(x,y)$ is determined by the maximum value in the local region, and the transmittance $T(x,y)$ is calculated from the minimum value in the local region and the constant $\omega$. The background light estimate identifies the brightest area in the scene, and the transmittance estimate describes how strongly light is attenuated as it propagates through the atmosphere.

[0074] Obtain the corrected image, where $\beta$ is the atmospheric scattering coefficient, $d(x,y)$ is the scene depth of pixel $(x,y)$, and $e$ is the natural constant. Using the background light and transmittance, the original image is dehazed to obtain a corrected, clearer image that restores the true color and detail of the scene.

[0075] Improve the contrast of the corrected image through local adaptive histogram equalization to obtain the fog-free image, where $\alpha$ is the enhancement factor and $N(x,y)$ is the neighborhood centered at pixel $(x,y)$. Local adaptive histogram equalization enhances image contrast and further improves image clarity.
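The background light and transmittance estimation described in [0073] and [0074] might look roughly like the sketch below. The exact formulas are not reproduced in this text, so the code substitutes commonly used forms and should be read as an assumption throughout: $B$ is the local maximum, $T = 1 - \omega \cdot \min_{\Omega}(I/B)$, and the corrected image is recovered with the standard scattering model $J = (I - B)/\max(T, t_{\min}) + B$. The names `local_extreme`, `dehaze`, and `t_min` are hypothetical, and the depth-dependent term $e^{-\beta d(x,y)}$ mentioned in [0074] is not modelled.

```python
def local_extreme(image, radius, pick):
    """Apply pick (min or max) over the (2r+1)x(2r+1) window around each pixel."""
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [image[j][i]
                      for j in range(max(0, y - radius), min(h, y + radius + 1))
                      for i in range(max(0, x - radius), min(w, x + radius + 1))]
            out[y][x] = pick(window)
    return out

def dehaze(image, radius=1, omega=0.9, t_min=0.1):
    """Grayscale dehazing sketch; every formula here is an assumed stand-in."""
    # Background light B(x,y): brightest value in the local region.
    background = local_extreme(image, radius, max)
    # Transmittance T(x,y): one minus omega times the local minimum of I/B.
    ratio = [[p / max(b, 1e-6) for p, b in zip(row_i, row_b)]
             for row_i, row_b in zip(image, background)]
    darkest = local_extreme(ratio, radius, min)
    transmittance = [[1.0 - omega * v for v in row] for row in darkest]
    # Corrected image, clamped to the [0, 1] intensity range.
    corrected = [[min(1.0, max(0.0, (p - b) / max(t, t_min) + b))
                  for p, b, t in zip(row_i, row_b, row_t)]
                 for row_i, row_b, row_t in zip(image, background, transmittance)]
    return background, transmittance, corrected

# A washed-out dark spot in the middle of a bright hazy patch (illustrative input).
hazy = [[0.8, 0.8, 0.8],
        [0.8, 0.5, 0.8],
        [0.8, 0.8, 0.8]]
background, transmittance, corrected = dehaze(hazy)
```

In this toy input the dark spot becomes darker after correction while the surrounding pixels stay at the background level, which is the contrast recovery the paragraph describes. The contrast equalization step of [0075] is not included.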

[0076] Example 3

[0077] As shown in Figure 3, the method for obtaining the scene depth $d(x,y)$ of pixel $(x,y)$ is as follows:

[0078] A first image $I_1(x,y)$ is obtained at a first position, and a second image $I_2(x,y)$ is obtained at a second position after the camera is translated a certain distance. Taking images from different positions provides the data required to calculate disparity and depth.

[0079] Determine the coordinate data $(X_t, Y_t, Z_t)$ of the tower, the coordinate data $(X_{d1}, Y_{d1}, Z_{d1})$ of the first position, and the coordinate data $(X_{d2}, Y_{d2}, Z_{d2})$ of the second position, all in the global geographic coordinate system. Global geographic coordinate systems include commonly used systems such as WGS-84, CGCS2000, Beijing 54, and Xi'an 80, which describe any location on the Earth's surface. With precise geographic coordinates, the distance between the two positions and their positional relationship to the tower can be calculated.

[0080] Find the best match between the points in the first image and the corresponding points in the second image, and obtain the disparity disp(x,y). By matching the corresponding points in the first and second images, the disparity disp(x,y) is obtained, which is the pixel offset of the same object in the two images at different positions. The disparity reflects the positional change of the object under different viewpoints and is a key parameter for calculating depth information.

[0081] Calculate the scene depth $d(x,y)$, where $f$ is the focal length of the camera (the distance from the optical center of the camera lens to the image sensor) and the baseline is the distance between the first and second positions. The depth of each pixel is calculated from the disparity and the camera parameters, giving the distance from the scene point at each pixel to the camera.

[0082] After obtaining the scene depth, the scene depth is verified. Verification methods include:

[0083] Determine the distance between the first position and the tower.

[0084] Determine the distance between the second position and the tower.

[0085] Determine the reference distance $d_0$: averaging the distances from the two positions to the tower gives the reference value against which the calculated scene depth is verified.

[0086] Error verification ensures the accuracy and reliability of the calculated depth information: an error threshold $d_T$ is set. If $|d(x,y)-d_0|\le d_T$, the scene depth is accepted; if $|d(x,y)-d_0|>d_T$, the first and second images are reacquired and the scene depth is recalculated.
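The depth calculation and its verification in this example can be illustrated with a short sketch. The depth formula itself is not reproduced in this text, so the code assumes the usual rectified-stereo relation, depth = focal length × baseline / disparity, and assumes the coordinates have already been converted to a local metric Cartesian frame. It also takes the reference distance $d_0$ to be the mean of the two camera-to-tower distances. All function names and numeric values are hypothetical.

```python
import math

def euclidean(p, q):
    """Straight-line distance between two points given as (X, Y, Z) tuples in metres."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def depth_from_disparity(disp_px, focal_px, baseline_m):
    """Assumed rectified-stereo relation: depth = f * baseline / disparity."""
    if disp_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disp_px

def verify_depth(depth, tower, pos1, pos2, d_t):
    """Compare the depth with d0, the mean distance from both positions to the tower."""
    d0 = 0.5 * (euclidean(pos1, tower) + euclidean(pos2, tower))
    return abs(depth - d0) <= d_t, d0

# Illustrative geometry: tower 100 m ahead, camera moved 1 m sideways.
tower = (0.0, 100.0, 0.0)
pos1 = (-0.5, 0.0, 0.0)
pos2 = (0.5, 0.0, 0.0)
baseline = euclidean(pos1, pos2)
depth = depth_from_disparity(disp_px=10.0, focal_px=1000.0, baseline_m=baseline)
accepted, d0 = verify_depth(depth, tower, pos1, pos2, d_t=1.0)
```

When `accepted` is false, the procedure in [0086] calls for reacquiring both images and recomputing the depth; the sketch only reports the decision.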

[0087] Example 4

[0088] As shown in Figure 4, the methods for obtaining the disparity $\mathrm{disp}(x,y)$ include:

[0089] Image correction methods align the first and second images on a horizontal line. A stereo correction method can be used, the basic principle of which is to align the two images to the same reference plane using camera calibration techniques and geometric transformations. The steps include calibrating the camera using a calibration board or known reference points to determine its intrinsic and extrinsic parameters; then using these parameters to perform geometric transformations on the images, remapping the points in the images onto the corrected image plane, thereby aligning the images on a horizontal line. This ultimately eliminates tilt and distortion in the images, aligning the two images on a horizontal line and improving the accuracy of parallax calculation.

[0090] Determine the disparity search range $[0,\mathrm{disp}_{\max}]$ and randomly sample a preset disparity $\mathrm{disp}$ within it, where $\mathrm{disp}_{\max}$ is the maximum disparity. Limiting the search range bounds the candidates, and each sampled disparity is used to calculate how well the image points match.

[0091] Based on the preset disparity, calculate the sum of absolute differences $\mathrm{SAD}(x,y,\mathrm{disp})=\sum_{(i,j)\in W}\left|I_1(x+i,y+j)-I_2(x+i-\mathrm{disp},y+j)\right|$, where $W$ is a window around the point $(x,y)$. The sum of absolute differences measures the total pixel difference between the two windows at a given disparity; the smaller it is, the better the match.

[0092] Based on the preset disparity, calculate the normalized cross-correlation $\mathrm{NCC}(x,y,\mathrm{disp})$, which uses the average pixel value of the selected window $W$ in the first image and the average pixel value of the selected window $W$ in the second image. Normalized cross-correlation evaluates the correlation between corresponding points in the two images; the higher the correlation, the better the match.

[0093] The preset disparity $\mathrm{disp}$ is sampled multiple times to obtain the disparity value $\mathrm{disp}_1$ that minimizes the sum of absolute differences $\mathrm{SAD}(x,y,\mathrm{disp})$ and the disparity value $\mathrm{disp}_2$ that maximizes the normalized cross-correlation $\mathrm{NCC}(x,y,\mathrm{disp})$.

[0094] The optimal disparity is determined by combining the results of the sum of absolute differences and the normalized cross-correlation.
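The matching steps of this example can be sketched for a single pixel as below. Three points are assumptions rather than statements of the source: the NCC is written in its common zero-mean form, the disparity range is enumerated exhaustively instead of being sampled at random, and the final combination of the two candidates (a plain average here) stands in for a formula that is not reproduced in this text. Function names are hypothetical.

```python
import math

def window(image, x, y, r, shift=0):
    """Flattened (2r+1)x(2r+1) window centred at (x - shift, y)."""
    return [image[y + j][x + i - shift]
            for j in range(-r, r + 1) for i in range(-r, r + 1)]

def sad(w1, w2):
    """Sum of absolute differences between two windows."""
    return sum(abs(a - b) for a, b in zip(w1, w2))

def ncc(w1, w2):
    """Zero-mean normalized cross-correlation (assumed form)."""
    m1, m2 = sum(w1) / len(w1), sum(w2) / len(w2)
    num = sum((a - m1) * (b - m2) for a, b in zip(w1, w2))
    den = math.sqrt(sum((a - m1) ** 2 for a in w1) * sum((b - m2) ** 2 for b in w2))
    return num / den if den > 0 else 0.0

def match_disparity(img1, img2, x, y, r, disp_max):
    """Return (disp1 from SAD, disp2 from NCC, their average)."""
    # Keep only disparities whose shifted window stays inside the image.
    candidates = [d for d in range(disp_max + 1) if x - d - r >= 0]
    w1 = window(img1, x, y, r)
    disp1 = min(candidates, key=lambda d: sad(w1, window(img2, x, y, r, d)))
    disp2 = max(candidates, key=lambda d: ncc(w1, window(img2, x, y, r, d)))
    return disp1, disp2, 0.5 * (disp1 + disp2)

# Vertical-stripe test pattern; the second image is the first shifted by 2 pixels.
row = [3, 9, 1, 7, 4, 8, 2, 6, 0, 5, 9, 3]
img1 = [row[:] for _ in range(3)]
img2 = [[r_[x + 2] if x + 2 < len(r_) else 0 for x in range(len(r_))] for r_ in img1]
result = match_disparity(img1, img2, x=6, y=1, r=1, disp_max=4)
# result == (2, 2, 2.0)
```

Both criteria agree on the known 2-pixel shift for this synthetic pair. On real foggy images they can disagree, which is presumably why the method combines the two.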

[0095] Example 5

[0096] Specific methods for edge detection include:

[0097] To capture edges and details of different sizes in the image, apply Gaussian filtering at different scales $\sigma_i$ and calculate the gradient of the haze-free image $I_{\text{enhanced}}(x,y)$, using the responses of the Gaussian derivative filter at scale $\sigma_i$ in the x-direction and in the y-direction. The gradient describes the rate and direction of change in image brightness.

[0098] Enhance the consistency of the gradient direction of each pixel across scales and obtain the normalization factor $\Theta_{\text{cons}}(x,y)$ from the gradient direction at each scale $\sigma_i$ and $\theta_{\text{mean}}$, the average of the gradient directions over all scales at pixel $(x,y)$. Enhancing the consistency of gradient directions across scales improves the accuracy of edge detection.

[0099] Determine the adaptive threshold $T_{\text{adaptive}}(x,y)=\mu_{\text{local}}(x,y)+k\cdot(\sigma_{\text{global}}-\sigma_{\text{local}}(x,y))\cdot\Theta_{\text{cons}}(x,y)$, where $\mu_{\text{local}}(x,y)$ is the average brightness within the local neighborhood of pixel $(x,y)$, $\sigma_{\text{global}}$ is the standard deviation of the haze-free image, $\sigma_{\text{local}}(x,y)$ is the local standard deviation at pixel $(x,y)$, and $k$ is the adjustment coefficient. The adaptive threshold dynamically adjusts the sensitivity of edge detection to the local features of the image.

[0100] Non-maximum suppression and edge-tracking connectivity are used to perform edge detection on haze-free images to obtain images of slender structures. By using non-maximum suppression, pixels with the largest local gradient magnitudes can be preserved while other pixels are suppressed. Then, by tracing edge connections, an image of a slender structure can be obtained.
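The adaptive threshold of [0099] is simple enough to evaluate directly. The sketch below computes $T_{\text{adaptive}}$ per pixel for a small grayscale image; it assumes the consistency factor $\Theta_{\text{cons}}$ has already been computed (its formula is not reproduced in this text), and the neighborhood radius, the value of $k$, and the function names are illustrative choices.

```python
import math

def mean_std(values):
    """Population mean and standard deviation of a list of numbers."""
    m = sum(values) / len(values)
    return m, math.sqrt(sum((v - m) ** 2 for v in values) / len(values))

def adaptive_threshold(image, theta_cons, k=0.5, radius=1):
    """T(x,y) = mu_local + k * (sigma_global - sigma_local) * theta_cons."""
    h, w = len(image), len(image[0])
    _, sigma_global = mean_std([p for row in image for p in row])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Local neighborhood, truncated at the image border.
            neighborhood = [image[j][i]
                            for j in range(max(0, y - radius), min(h, y + radius + 1))
                            for i in range(max(0, x - radius), min(w, x + radius + 1))]
            mu_local, sigma_local = mean_std(neighborhood)
            out[y][x] = mu_local + k * (sigma_global - sigma_local) * theta_cons[y][x]
    return out

# Illustrative input: a bright vertical stripe with full direction consistency.
image = [[0.1, 0.9, 0.1],
         [0.1, 0.9, 0.1],
         [0.1, 0.9, 0.1]]
theta_cons = [[1.0] * 3 for _ in range(3)]
thresholds = adaptive_threshold(image, theta_cons)
```

Where the local standard deviation exceeds the global one, the second term is negative and the threshold drops below the local mean, so locally textured regions are thresholded more permissively; where $\Theta_{\text{cons}}$ is zero the threshold reduces to the local mean.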

[0101] The purpose of Non-Maximum Suppression (NMS) is to preserve edge points with local maxima and suppress non-edge points during edge detection, thereby forming fine and continuous edges. Specific methods include:

[0102] For each pixel, calculate its gradient magnitude and direction. Assume the image gradient magnitude and direction have already been obtained through the preceding Gaussian filtering and gradient calculation steps.

[0103] For each pixel, its gradient magnitude is compared with the gradient magnitudes of its neighboring pixels along the gradient direction. This step involves sub-pixel interpolation calculations to obtain the gradient magnitude at non-integer pixel locations.

[0104] Based on the gradient direction, the gradient magnitude of the current pixel is compared with the gradient magnitudes in both the positive and negative directions along that gradient direction. If the gradient magnitude of the current pixel is not the maximum among these three values, the gradient magnitude of that pixel is set to 0; otherwise, its gradient magnitude is retained. The specific operation is as follows:

[0105] Horizontal gradient direction (0 degrees): Compare the gradient magnitude of the current pixel with the gradient magnitudes of its two neighboring pixels to the left and right.

[0106] Vertical gradient direction (90 degrees): Compare the gradient magnitude of the current pixel with the gradient magnitudes of its two adjacent pixels above and below it.

[0107] Diagonal gradient direction (45 degrees): Compare the gradient magnitude of the current pixel with the gradient magnitudes of its two neighboring pixels at the top left and the bottom right.

[0108] Anti-diagonal gradient direction (135 degrees): Compare the gradient magnitude of the current pixel with the gradient magnitudes of its two neighboring pixels at the upper right and the lower left.
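The four comparisons listed above can be sketched as a quantized non-maximum suppression. This is an illustrative simplification: gradient directions are rounded to the nearest of the four listed angles instead of using the sub-pixel interpolation mentioned in [0103], border pixels are simply left at zero, and the image y-axis is assumed to point downward. The function name and the test values are hypothetical.

```python
def non_max_suppression(magnitude, direction_deg):
    """Keep a pixel only if it is not smaller than both neighbours along its gradient."""
    h, w = len(magnitude), len(magnitude[0])
    out = [[0.0] * w for _ in range(h)]
    # (dx, dy) step along each quantized gradient direction, with y pointing down.
    offsets = {0: (1, 0), 45: (1, 1), 90: (0, 1), 135: (-1, 1)}
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            angle = direction_deg[y][x] % 180
            # Round to the nearest of 0, 45, 90, 135 degrees.
            sector = int(((angle + 22.5) // 45) % 4) * 45
            dx, dy = offsets[sector]
            m = magnitude[y][x]
            if m >= magnitude[y + dy][x + dx] and m >= magnitude[y - dy][x - dx]:
                out[y][x] = m
    return out

# A vertical ridge whose gradient points horizontally (0 degrees).
magnitude = [[0.0, 1.0, 3.0, 2.0, 0.0] for _ in range(3)]
direction = [[0.0] * 5 for _ in range(3)]
thinned = non_max_suppression(magnitude, direction)
# thinned[1] == [0.0, 0.0, 3.0, 0.0, 0.0]
```

Only the ridge maximum survives in the interior row, which is the thinning effect that lets the subsequent edge tracking produce one-pixel-wide outlines of slender structures.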

[0109] Example 6

[0110] The following are methods for identifying towers:

[0111] A multi-layer convolutional neural network is constructed, with batch normalization and a ReLU activation applied after each convolutional layer, followed by max pooling, to extract feature maps of the slender-structure image $E_{\text{final}}(x,y)$: $F_l=\max(\mathrm{ReLU}(\mathrm{BN}(W_l*F_{l-1}+b_l))),\ l=1,2,\dots,L$, where $F_l$ is the feature map of layer $l$, $W_l$ is the weight of the $l$-th convolutional layer, $b_l$ is the bias of the $l$-th convolutional layer, BN is the batch normalization function, ReLU is the ReLU activation function, $\max$ denotes max pooling, and $*$ is convolution. That is, a multi-layer convolutional neural network (CNN) performs feature extraction on the input slender-structure image $E_{\text{final}}(x,y)$. The operations after each convolutional layer are batch normalization (BN) and the ReLU activation, followed by max pooling. This extracts high-level image features layer by layer and enhances the slender-structure information related to the tower.

[0112] The feature maps from different layers are fused to obtain a fused feature map $F_{\text{fusion}}$, where $\alpha_l$ are the learned fusion weights. Feature fusion is the weighted combination of feature maps extracted from multiple convolutional layers to obtain a more expressive fused feature map. The fusion weights $\alpha_l$ are parameters learned through training and represent the importance of the feature map of layer $l$ in the fusion; they can be determined through fully connected layers or other learning mechanisms. The feature maps of all layers are then summed with their weights $\alpha_l$ to obtain the fused feature map $F_{\text{fusion}}$. The main purpose of feature fusion is to combine feature maps from different convolutional layers so that the fused map contains more information: each layer's feature map $F_l$ captures different details and patterns in the image, and weighted fusion draws on all of them to improve the richness and accuracy of the overall feature representation.
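The weighted summation described in [0112] can be written, under assumptions, as $F_{\text{fusion}}=\sum_l \alpha_l F_l$. The sketch below illustrates it for single-channel maps that are assumed to have already been resized to a common shape (pooling would otherwise give each layer a different resolution). Normalizing the raw weights with a softmax is also an assumption; the text only says the weights are learned. Names and values are hypothetical.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of raw scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def fuse(feature_maps, raw_weights):
    """Weighted sum of same-sized feature maps with softmax-normalized weights."""
    alphas = softmax(raw_weights)
    h, w = len(feature_maps[0]), len(feature_maps[0][0])
    fused = [[sum(a * fm[y][x] for a, fm in zip(alphas, feature_maps))
              for x in range(w)] for y in range(h)]
    return alphas, fused

# Two illustrative 2x2 feature maps with equal raw weights.
f1 = [[1.0, 2.0], [3.0, 4.0]]
f2 = [[3.0, 4.0], [5.0, 6.0]]
alphas, f_fusion = fuse([f1, f2], raw_weights=[0.0, 0.0])
# alphas == [0.5, 0.5]; f_fusion is the element-wise average of f1 and f2
```

In a trained network the raw weights would be parameters updated by backpropagation; here they are fixed numbers so the arithmetic is easy to follow.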

[0113] Feature selection enhancement is performed on the fused features based on the normalization factor to obtain enhanced fused features, F. input =concat(F fusion ,Θ cons ), where concat represents concatenating features along the channel dimension; feature selection enhancement is achieved by fusing the fused feature map F fusion With normalization factor Θ cons Concatenate the features to obtain the enhanced feature map F. input The main purpose of feature selection enhancement is to introduce a normalization factor Θ. consThis improves the model's focus on important features. The regularization factor provides information about the consistency of gradient directions; by concatenating it with the fused feature map, the model can simultaneously consider the combined information of the feature map and the consistency of gradient directions, thereby improving recognition accuracy.

[0114] By employing a spatial attention mechanism, important features are enhanced and irrelevant features are suppressed through attention map A, resulting in attention-weighted feature map F. att =A⊙F input A = σ A (W A *F input +b A )⊙Θ cons Among them, W A b represents the weights of the attention layer. A For the bias of the attention layer, σ A Here, ⊙ represents the Sigmoid activation function;

[0115] The slender structure is finally classified using a fully connected layer and a Softmax function: $Y = \mathrm{Softmax}(W_C \cdot F_{att} + b_C)$, where $W_C$ is the weight of the classification layer and $b_C$ is the bias of the classification layer. The fully connected layer converts the high-dimensional feature map into class scores, and the Softmax function converts the class scores into a probability distribution for the classification decision.
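A minimal sketch of the classification step in [0115] follows. It assumes the attention-weighted feature map is flattened into a vector before the fully connected layer and uses two classes purely for illustration; the flattening, the class count, and the function names are assumptions rather than details from the patent.

```python
import numpy as np

def softmax(scores):
    # Subtracting the maximum keeps the exponentials numerically stable.
    shifted = scores - np.max(scores)
    exp = np.exp(shifted)
    return exp / exp.sum()

def classify(f_att, w_c, b_c):
    """Y = Softmax(W_C · F_att + b_C) on the flattened feature map.

    f_att: (C, H, W) attention-weighted feature map
    w_c:   (num_classes, C*H*W) fully connected weights
    b_c:   (num_classes,) bias
    """
    features = f_att.reshape(-1)          # flatten to a vector (assumption)
    scores = w_c @ features + b_c         # class scores
    return softmax(scores)                # probability distribution

# Usage: with zero weights the result depends only on the bias.
f_att = np.ones((2, 2, 2))
w_c = np.zeros((2, 8))
probs_equal = classify(f_att, w_c, np.array([0.0, 0.0]))
probs_biased = classify(f_att, w_c, np.array([0.0, np.log(3.0)]))
```

The class with the highest probability would be taken as the decision, for example "tower pole" versus "other slender structure" in a two-class setup.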

[0116] In summary, the model uses a spatial attention mechanism to enhance important information in the feature map, and performs the final classification through a fully connected layer and the Softmax function to identify the presence and location of the tower.

[0117] Example 7

[0118] A tower identification terminal for foggy weather includes a memory, a processor, and a computer program stored in the memory and executable on the processor. When the processor executes the computer program, it implements the tower identification method for foggy weather as described above.

[0119] Memory is used to store software programs and modules. The processor executes various terminal functions and data processing by running the software programs and modules stored in memory. Memory can mainly include a program storage area and a data storage area. The program storage area can store the operating system, at least one executable program required for a given function, etc.

[0120] The data storage area can store data created through use of the terminal. Furthermore, the memory can include high-speed random access memory, and may also include non-volatile memory, such as at least one disk storage device, flash memory, or other volatile solid-state storage devices.

[0121] A computer-readable storage medium storing a computer program that, when executed by a processor, implements the above-described method for identifying tower poles in foggy weather.

[0122] Without loss of generality, computer-readable media can include computer storage media and communication media. Computer storage media include volatile and non-volatile, removable and non-removable media implemented using any method or technology for storing information such as computer-readable instructions, data structures, program modules, or other data. Computer storage media include RAM, ROM, EPROM, EEPROM, flash memory or other solid-state storage technologies, CD-ROM, DVD or other optical storage, magnetic tape cassettes, magnetic tape, disk storage, or other magnetic storage devices. Of course, those skilled in the art will recognize that computer storage media are not limited to the above-mentioned types. The aforementioned system memories and mass storage devices can be collectively referred to as memory.

[0123] A computer program product includes a computer program / instructions that, when executed by a processor, implement the above-described method for identifying tower poles in foggy weather.

[0124] Computer program products include computer programs or instruction sets used to perform specific tasks or achieve specific functions. These programs or instructions are designed to be executed by a processor to implement a series of predefined steps or operations. The program product may be stored in various forms of computer storage media, such as memory, hard disks, solid-state drives, optical discs, or other forms of digital storage devices. It may exist in the form of compiled binary code or in the form of scripts or bytecode that can be executed by an interpreter. Through carefully designed algorithms and logical instructions, the program product enables the processor to process data in a specific order and manner, performing various functions such as data analysis, user interaction, and device control.

[0125] In the description of this specification, the references to terms such as "one embodiment / mode," "some embodiments / modes," "example," "specific example," or "some examples," etc., indicate that a specific feature, structure, material, or characteristic described in connection with that embodiment / mode or example is included in at least one embodiment / mode or example of this application. In this specification, the illustrative expressions of the above terms do not necessarily refer to the same embodiment / mode or example. Moreover, the specific features, structures, materials, or characteristics described may be combined in any suitable manner in one or more embodiments / modes or examples. Furthermore, without contradiction, those skilled in the art can combine and integrate the different embodiments / modes or examples described in this specification, as well as the features of different embodiments / modes or examples.

[0126] Those skilled in the art should understand that the above embodiments are merely for illustrating the present invention and are not intended to limit the scope of the invention. Those skilled in the art can make other changes or modifications based on the above invention, and these changes or modifications still fall within the scope of the present invention.

Claims

1. A method for identifying a tower pole in foggy weather, characterized in that it comprises: obtaining coordinate data of the tower pole and an original image at the coordinate location; preprocessing the original image to obtain a fog-free image; performing edge detection on the fog-free image by combining multi-scale analysis and adaptive thresholding, to identify slender structures in the fog-free image; and identifying the tower pole from the multiple slender structures by combining feature fusion and an attention mechanism; wherein the edge detection comprises: computing the gradient of the fog-free image by applying Gaussian filtering at different scales, the gradient being formed from the responses of the Gaussian derivative filter at each scale in each of its two directions; enhancing the consistency of the gradient direction of each pixel across the different scales to obtain the normalization factor, which is defined from the gradient direction at each scale and the average direction of the pixel across all scales; determining the adaptive threshold, which is defined from the average brightness within a local neighborhood of the pixel, the standard deviation of the fog-free image, the local standard deviation at the pixel, and adjustment coefficients; and performing edge detection on the fog-free image using non-maximum suppression and edge-tracking connection to obtain an image of the slender structures.

2. The method of claim 1, wherein preprocessing the original image comprises: blurring the original image at multiple levels using a Gaussian filter, the blur being defined in terms of the pixel coordinates, the original image, the standard deviation of the Gaussian kernel at each layer, and the Gaussian-blurred image of that layer; subtracting the Gaussian-blurred image of the next layer from the Gaussian-blurred image of the current layer to obtain multi-scale details, namely the Laplacian decomposition result of each layer; estimating the background light and the transmittance, using a constant for calculating the transmittance and a local region centered on the pixel; obtaining a corrected image, defined in terms of the atmospheric scattering coefficient, the scene depth of the pixel, and the natural constant; and enhancing the contrast of the corrected image by local adaptive histogram equalization to obtain the fog-free image, using an enhancement factor and a neighborhood centered on the pixel.

3. The method for identifying a tower pole in foggy weather of claim 1, wherein the scene depth of a pixel is obtained by: obtaining a first image at a first position, and obtaining a second image at a second position after translating by a distance; determining the coordinate data of the tower pole, the coordinates of the first position, and the coordinates of the second position in the global geographic coordinate system; obtaining the best match between points in the first image and their corresponding points in the second image, and thereby obtaining the disparity; and calculating the scene depth, using the focal length of the camera and the distance between the first position and the second position.

4. The method for identifying a tower pole in foggy weather of claim 3, wherein after the scene depth is obtained, the scene depth is verified, the verification comprising: determining the distance between the first position and the tower pole; determining the distance between the second position and the tower pole; determining a reference distance for the two positions; and setting an error threshold, wherein under one condition on the error threshold the scene depth is determined, and under the other condition the first image and the second image are re-acquired and the scene depth is recomputed.

5. The method for identifying a tower pole in foggy weather of claim 3, wherein acquiring the disparity comprises: aligning the first image and the second image on the same horizontal line using image rectification; determining a disparity search range bounded by a maximum disparity, and randomly selecting a preset disparity within that range; calculating the sum of absolute differences for the preset disparity over a window around the point; calculating the normalized cross-correlation for the preset disparity, using the average pixel value of the selected window in the first image and the average pixel value of the selected window in the second image; sampling the preset disparity multiple times to obtain the disparity value that minimizes the sum of absolute differences and the disparity value that maximizes the normalized cross-correlation; and thereby acquiring the disparity.

6. The method of claim 1, wherein identifying the tower pole comprises: constructing a multi-layer convolutional neural network in which batch normalization and a ReLU activation function are applied after each convolutional layer, followed by max pooling, to extract feature maps from the image of the slender structures, wherein the feature map of the $l$-th layer is computed from the convolutional-layer weights and bias of the $l$-th layer through convolution, batch normalization, and the ReLU activation function; fusing the feature maps of different layers to obtain a fused feature map $F_{fusion} = \sum_{l} \alpha_l F_l$, wherein $\alpha_l$ is a learned fusion weight; performing feature selection enhancement on the fused features based on the normalization factor to obtain the enhanced fused features $F_{input} = \mathrm{concat}(F_{fusion}, \Theta_{cons})$, where concat denotes concatenating features along the channel dimension; employing a spatial attention mechanism in which important features are enhanced and irrelevant features are suppressed through the attention map $A$, resulting in the attention-weighted feature map $F_{att} = A \odot F_{input}$, with $A = \sigma_A(W_A * F_{input} + b_A) \odot \Theta_{cons}$, wherein $W_A$ is the weight of the attention layer, $b_A$ is the bias of the attention layer, $\sigma_A$ is the Sigmoid activation function, and $\odot$ denotes element-wise multiplication; and performing final classification of the slender structure by a fully connected layer and a Softmax function, $Y = \mathrm{Softmax}(W_C \cdot F_{att} + b_C)$, wherein $W_C$ is the weight of the classification layer and $b_C$ is the bias of the classification layer.

7. A terminal for identifying a tower pole in foggy weather, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that, when the processor executes the computer program, it implements the method for identifying a tower pole in foggy weather according to any one of claims 1-6.

8. A computer-readable storage medium storing a computer program, characterized in that, when the computer program is executed by a processor, it implements the method for identifying a tower pole in foggy weather according to any one of claims 1-6.

9. A computer program product comprising a computer program / instructions, characterized in that, when the computer program / instructions are executed by a processor, they implement the method for identifying a tower pole in foggy weather according to any one of claims 1-6.