Infrared sensor and image enhancement intelligent analysis system for unmanned aerial vehicle inspection
By using adaptive thermal drift compensation and infrared image processing based on the Laplacian pyramid structure, combined with a temperature gradient-driven attention mechanism, selective enhancement of both large-scale and small targets is achieved. This solves the problem of small targets being easily obscured during UAV inspections and improves detection performance.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- BEIJING WUZHISHU TECHNOLOGY CO LTD
- Filing Date
- 2026-03-10
- Publication Date
- 2026-06-12
AI Technical Summary
Existing UAV inspection infrared image enhancement technologies struggle to simultaneously optimize and enhance large-scale and small targets without increasing computational complexity and inference latency. This results in small target features being easily obscured by the texture of large targets, leading to poor detection performance.
An uncooled infrared focal plane array sensor with an adaptive thermal drift compensation algorithm acquires infrared images. Combined with a Laplacian pyramid structure and a temperature gradient-driven adaptive attention mechanism, selective enhancement and suppression of large-scale and small targets are achieved through multi-scale feature decomposition and cross-scale feature fusion.
It effectively solves the problem that small defects with low contrast in infrared images are easily overwhelmed by large targets at high temperatures, and realizes multi-scale adaptive enhancement and refined anomaly recognition, thereby improving the detection sensitivity and reliability in UAV inspection scenarios.
Figure CN122200433A_ABST
Abstract
Description
Technical Field
[0001] This invention relates to the field of unmanned aerial vehicle (UAV) inspection technology, and in particular to an infrared sensing and image enhancement intelligent analysis system for UAV inspection.
Background Technology
[0002] Infrared inspection technology using drones has become an important means of monitoring the condition of critical infrastructure such as power facilities, photovoltaic panels, and industrial equipment. In typical power transmission line inspection scenarios, drones equipped with infrared thermal imaging sensors perform non-contact temperature measurements and defect identification on equipment such as towers, insulators, conductors, and fittings. However, due to limitations in drone flight altitude, field of view, and safe distance requirements, a single frame of infrared image often includes both large-scale targets (such as an entire power transmission tower, with an imaging size of 1024×1024 pixels or more) and small targets (such as pin-level defects or equalizing ring connection points, with an imaging size that may be less than 32×32 pixels), with the target size difference reaching more than 10 times.
[0003] Existing infrared image enhancement technologies mainly employ end-to-end processing methods based on deep learning. Typical technical approaches include the following. (1) Multi-scale feature pyramid networks (FPN) and their variants. FPN extracts features at different resolution levels through top-down feature fusion. However, to reduce computational complexity, the low-level high-resolution features need to be downsampled multiple times (typically by a total factor of 32) to generate the top-level semantic features. During this process, the spatial information of small targets is gradually compressed and may be completely lost in successive pooling operations. Studies have shown that when the target size is smaller than 16×16 pixels, after processing by the ResNet-50 backbone network, its feature response intensity decays to less than 15% of the original value, making it difficult for the network to distinguish between real defects and background noise.
[0004] (2) Deformable convolutional networks. By introducing learnable offsets, the sampling positions of the convolutional kernel can be adaptively adjusted to cover targets of different scales. While this technique can improve multi-scale detection performance to some extent, it has the following drawbacks. First, offset learning increases the number of model parameters and the amount of computation by about 3 times, making it difficult to meet the real-time requirement of 25-30 fps on UAV-borne embedded platforms (such as the NVIDIA Jetson Xavier NX, with a computing power of 21 TOPS). Second, deformable convolution has a limited clustering effect on the sampling points of small targets. When the target size is less than 1/10 of the receptive field of the convolutional kernel, there may be fewer than 3 effective sampling points, which cannot form a stable feature representation.
[0005] (3) Super-resolution reconstruction and detail enhancement. Super-resolution techniques based on generative adversarial networks or diffusion models can magnify low-resolution infrared images by 4-8 times to recover the details of minute targets. However, such methods are computationally intensive, with inference latency typically exceeding 500 ms, and are prone to generating false textures. In infrared thermal imaging, false temperature details can lead to serious risks of false detections or missed detections.
[0006] (4) Multi-branch parallel processing architecture. Some existing technologies detect large and small targets separately, using input images of different resolutions to process targets of different scales. While this method can alleviate the scale difference problem, it requires maintaining two independent network forward inference processes, doubling memory usage and power consumption. Furthermore, the fusion strategy for large and small target detection results is complex and prone to duplicate detection or missed detection of the same target.
[0007] In summary, existing UAV inspection infrared image enhancement technologies suffer from a technical contradiction: enhancing the features of small targets and preserving the texture of large targets are mutually exclusive. Specifically, this contradiction manifests as follows: When deep networks and downsampling operations are used to extract the global semantics and texture details of large targets (such as power transmission towers), the features of small targets (such as pin-level defects) are over-compressed at shallow layers, leading to missed defects. Conversely, if high-resolution feature maps are retained to enhance small targets, the network's receptive field is limited, failing to capture the contextual semantics of large targets, and the computational complexity increases dramatically, exceeding the real-time processing capabilities of airborne platforms. The root cause of this problem lies in the fact that existing feature extraction networks employ fixed spatial sampling strategies and a single-scale processing flow, lacking an adaptive scale selection mechanism tailored to the characteristics of infrared thermal imaging. In infrared images, the contrast of temperature anomaly regions (potential defects) is often low, and the thermal signal intensity of small defects may only be 2-3°C higher than the background, making their feature responses easily overwhelmed by the high-temperature thermal signals of large targets. Therefore, there is an urgent need for an image enhancement method that can dynamically adjust the receptive field and spatial resolution within a single path, achieving simultaneous optimization and enhancement of multi-scale targets without increasing model complexity and inference latency.
Summary of the Invention
[0008] To address the aforementioned technical problems, this invention provides an infrared sensing and image enhancement intelligent analysis system for unmanned aerial vehicle (UAV) inspection.
[0009] To achieve the above objectives, the technical solution adopted by the present invention is as follows: This invention discloses an infrared sensing and image enhancement intelligent analysis system for unmanned aerial vehicle (UAV) inspection, characterized by comprising: The infrared acquisition module is used to acquire raw infrared thermal radiation image sequences containing multi-scale temperature anomaly targets through an uncooled infrared focal plane array sensor configured with an adaptive thermal drift compensation algorithm. The image preprocessing module is used to perform physical model-based non-uniformity correction and guided filtering-based detail-preserving noise reduction on the original infrared thermal radiation image sequence to generate a standard infrared thermal radiation image. The multi-scale feature decomposition module is used to input the standard infrared thermal radiation image into a feature decomposition network with a Laplacian pyramid structure to separate the base layer thermal radiation feature map, which represents the temperature distribution trend of large-scale targets, from the detail layer thermal radiation feature map, which represents the fine details of abrupt temperature changes in small targets. An adaptive attention module is used to perform temperature gradient-driven spatial attention weight calculations on the base layer thermal radiation feature map and the detail layer thermal radiation feature map respectively, generating a low-resolution spatial attention weight matrix corresponding to the base layer thermal radiation feature map and a high-resolution spatial attention weight matrix corresponding to the detail layer thermal radiation feature map. 
The cross-scale feature fusion module is used to perform selective feature enhancement and suppression on the base layer thermal radiation feature map and the detail layer thermal radiation feature map by constructing a feature fusion network with a learnable scale selection gating mechanism based on the low-resolution spatial attention weight matrix and the high-resolution spatial attention weight matrix, and output a multi-scale optimized thermal radiation feature map. The contrast enhancement and detail reconstruction module is used to perform adaptive temperature dynamic range compression based on histogram specification and small target edge enhancement based on unsharp masking on the multi-scale optimized thermal radiation feature map to generate an enhanced infrared thermal radiation image. The multi-scale anomaly analysis module is used to input the enhanced infrared thermal radiation image into a deep neural network equipped with a scale-aware detection head, simultaneously perform semantic segmentation of large-scale temperature anomaly targets and localization and identification of micro-scale temperature anomaly targets, and output inspection analysis results including temperature measurement values and defect category confidence levels.
[0010] Furthermore, the infrared acquisition module includes: An uncooled infrared focal plane array sensor unit is used to convert incident infrared thermal radiation signals into raw electrical signals through a microbolometer array. The microbolometer array has a pixel pitch of 17 micrometers, a thermal sensitivity of less than 50 millikelvin, and a frame rate of 30 Hz. A multispectral filter switching unit is configured at the optical front end of the uncooled infrared focal plane array sensor unit. It is used to switch narrowband filters with different cutoff wavelengths in the long-wave infrared band of 8 micrometers to 14 micrometers according to the scene temperature distribution range in the original infrared thermal radiation image sequence, so as to suppress environmental stray heat radiation interference in a specific wavelength range. The sensor housing temperature monitoring unit includes a high-precision thermistor array attached to the surface of the metal housing of the uncooled infrared focal plane array sensor unit, which is used to collect the temperature drift data of the metal housing under sunlight conditions in real time. A thermal drift compensation calculation unit is used to receive the temperature drift data and perform pixel-by-pixel temperature drift compensation calculation on the original electrical signal based on a pre-calibrated housing temperature-pixel response nonlinear mapping model to generate an infrared thermal radiation electrical signal after thermal drift correction. The housing temperature-pixel response nonlinear mapping model is pre-established by collecting standard temperature data of a blackbody radiation source under different housing temperature conditions. 
The analog-to-digital conversion and image formatting unit is used to convert the thermally drift-corrected infrared thermal radiation electrical signal into a 14-bit digital signal and encapsulate it into the original infrared thermal radiation image sequence with timestamp and location information.
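The housing-temperature compensation described above can be illustrated with a minimal sketch. The patent does not state the form of the nonlinear mapping model, so this example assumes a per-pixel quadratic polynomial fitted from blackbody calibration frames; the function names `fit_drift_model` and `compensate`, the polynomial degree, and the use of a reference frame are all illustrative assumptions rather than the claimed method.

```python
import numpy as np

def fit_drift_model(housing_temps, blackbody_frames, reference_frame, degree=2):
    """Fit a per-pixel polynomial: housing temperature -> response drift.

    housing_temps:    (K,) housing temperatures recorded during calibration
    blackbody_frames: (K, H, W) raw frames of a constant-temperature blackbody
    reference_frame:  (H, W) frame taken at the reference housing temperature
    Returns coefficients of shape (degree + 1, H, W), highest power first.
    The polynomial form is an assumption; the source only says "nonlinear".
    """
    k, h, w = blackbody_frames.shape
    drift = (blackbody_frames - reference_frame).reshape(k, h * w)
    coeffs = np.polyfit(housing_temps, drift, degree)  # one fit per pixel column
    return coeffs.reshape(degree + 1, h, w)

def compensate(raw_frame, housing_temp, coeffs):
    """Subtract the drift predicted for the current housing temperature."""
    powers = float(housing_temp) ** np.arange(coeffs.shape[0] - 1, -1, -1)
    predicted_drift = np.tensordot(powers, coeffs, axes=1)  # (H, W)
    return raw_frame - predicted_drift
```

In use, the coefficients would be fitted once offline and `compensate` applied to every frame with the latest thermistor reading.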
[0011] Furthermore, the image preprocessing module includes: The non-uniformity correction unit is used to suppress fixed pattern noise in the original infrared thermal radiation image sequence through cascaded processing based on the two-point correction algorithm and the scene adaptive correction algorithm. The two-point correction algorithm uses a high-temperature blackbody and a low-temperature blackbody to pre-calibrate the pixel response gain and offset coefficient of the uncooled infrared focal plane array sensor unit. The scene adaptive correction algorithm dynamically updates the offset coefficient based on the local statistical characteristics of the original infrared thermal radiation image sequence. The temperature-guided filtering unit is used to take the output of the original infrared thermal radiation image sequence after processing by the non-uniformity correction unit as the guide image and the original infrared thermal radiation image sequence as the input image to perform joint bilateral filtering. The weight kernel of the joint bilateral filtering is jointly determined by the temperature spatial gradient of the guide image and the pixel spatial distance of the input image, so as to maintain the sharpness of temperature change edges while suppressing random noise. The bad pixel detection and replacement unit is used to identify overheated and undercooled pixels in the uncooled infrared focal plane array sensor unit by using a sliding window statistical method, and to perform interpolation replacement based on the median temperature of the effective neighboring pixels of the overheated and undercooled pixels. 
The image normalization and format conversion unit is used to map the image data processed by the bad pixel detection and replacement unit to a 16-bit unsigned integer dynamic range, and to add the UAV platform attitude information, GPS coordinate information and acquisition timestamp information corresponding to the original infrared thermal radiation image sequence to generate the standard infrared thermal radiation image.
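As a rough illustration of two steps in this module, the sketch below shows a two-point gain/offset calibration and a sliding-window bad pixel replacement. It assumes the common convention of normalizing each pixel to the array-mean response, and it uses a fixed absolute deviation threshold; the function names, the 3×3 window, and the threshold value are hypothetical, and the scene-adaptive offset update is not shown.

```python
import numpy as np

def two_point_coefficients(low_frame, high_frame):
    """Per-pixel gain and offset from low- and high-temperature blackbody frames.

    Each pixel is mapped so that its response at both calibration points
    equals the array-mean response (an assumed normalization target).
    """
    mean_low, mean_high = low_frame.mean(), high_frame.mean()
    gain = (mean_high - mean_low) / (high_frame - low_frame)
    offset = mean_low - gain * low_frame
    return gain, offset

def replace_bad_pixels(frame, window=3, threshold=50.0):
    """Flag pixels that deviate from the median of their neighbours and
    replace them with that median. Returns (fixed_frame, bad_pixel_mask).

    Simplification: neighbours are not themselves screened for validity.
    """
    pad = window // 2
    padded = np.pad(frame, pad, mode="reflect")
    views = np.lib.stride_tricks.sliding_window_view(padded, (window, window))
    flat = views.reshape(frame.shape[0], frame.shape[1], window * window)
    neighbours = np.delete(flat, (window * window) // 2, axis=-1)  # drop centre
    local_median = np.median(neighbours, axis=-1)
    bad = np.abs(frame - local_median) > threshold
    return np.where(bad, local_median, frame), bad
```

A corrected frame is then `gain * raw + offset`, followed by `replace_bad_pixels` before normalization.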
[0012] Furthermore, the multi-scale feature decomposition module includes: The Gaussian pyramid construction unit is used to perform multiple Gaussian blurring and downsampling operations on the standard infrared thermal radiation image to construct a Gaussian pyramid with five scale levels. The bottom layer of the Gaussian pyramid is the standard infrared thermal radiation image, and the top layer is a low-frequency approximate image whose resolution is reduced by a factor of 16. The standard deviation of the Gaussian kernel used for blurring increases exponentially with the level. The Laplacian pyramid generation unit is used to generate a Laplacian pyramid with four detail levels by performing upsampling and difference operations on adjacent levels of the Gaussian pyramid. Each detail level of the Laplacian pyramid represents the spatial temperature variation information of the standard infrared thermal radiation image in a specific frequency band. The base layer thermal radiation feature extraction unit is used to input the low-frequency approximate image of the top layer of the Gaussian pyramid into the first convolutional neural network branch, and extract the base layer thermal radiation feature map through three serial convolutional layers with 5×5 convolutional kernels and the ReLU activation function. The spatial resolution of the base layer thermal radiation feature map is 1/16 of that of the standard infrared thermal radiation image, and the channel dimension is 256. The detail layer thermal radiation feature extraction unit is used to input each detail level of the Laplacian pyramid into four parallel sub-networks of the second convolutional neural network branch. 
Each parallel sub-network includes two serial convolutional layers with 3×3 convolutional kernels and a ReLU activation function to extract four detail layer thermal radiation feature sub-maps, which are then fused into the detail layer thermal radiation feature map through channel concatenation. The spatial resolution of the detail layer thermal radiation feature map is consistent with that of the standard infrared thermal radiation image, and the channel dimension is 128. The feature map dimension alignment unit is used to compress the channel dimension of the base layer thermal radiation feature map to 128 through a convolution operation with a 1×1 convolution kernel, and to restore the spatial resolution of the base layer thermal radiation feature map to that of the detail layer thermal radiation feature map through bilinear interpolation upsampling. This alignment supports the parallel processing of the subsequent adaptive attention module.
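The pyramid construction can be sketched as follows, using NumPy only. This is an illustrative decomposition, not the patented network: the base sigma, the growth factor, nearest-neighbour upsampling, and the function names are assumptions, and the convolutional feature extraction branches are omitted.

```python
import numpy as np

def blur(img, sigma):
    """Separable Gaussian blur with edge padding."""
    radius = max(1, int(3 * sigma))
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-x**2 / (2 * sigma**2))
    kernel /= kernel.sum()
    padded = np.pad(img, radius, mode="edge")
    rows = np.apply_along_axis(np.convolve, 1, padded, kernel, mode="valid")
    return np.apply_along_axis(np.convolve, 0, rows, kernel, mode="valid")

def upsample(img, shape):
    """Nearest-neighbour 2x upsampling, cropped to the target shape."""
    up = np.repeat(np.repeat(img, 2, axis=0), 2, axis=1)
    return up[: shape[0], : shape[1]]

def build_pyramids(img, levels=5, base_sigma=1.0, growth=1.2):
    """Return (gaussian, laplacian) lists: 5 Gaussian levels, 4 detail levels."""
    gaussian = [img.astype(np.float64)]
    for level in range(levels - 1):
        sigma = base_sigma * growth**level  # sigma grows exponentially with level
        gaussian.append(blur(gaussian[-1], sigma)[::2, ::2])
    laplacian = [
        gaussian[i] - upsample(gaussian[i + 1], gaussian[i].shape)
        for i in range(levels - 1)
    ]
    return gaussian, laplacian

def reconstruct(top, laplacian):
    """Invert the decomposition: add detail levels back from coarse to fine."""
    current = top
    for detail in reversed(laplacian):
        current = detail + upsample(current, detail.shape)
    return current
```

Because each detail level stores exactly what the upsampled coarser level misses, `reconstruct` recovers the input, which is a convenient sanity check on any pyramid implementation.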
[0013] Furthermore, the adaptive attention module includes: The temperature gradient calculation unit is used to calculate the first-order partial derivatives of the thermal radiation feature map of the base layer and the thermal radiation feature map of the detail layer in the horizontal and vertical directions, respectively, and to generate the temperature gradient magnitude map of the base layer and the temperature gradient magnitude map of the detail layer by gradient magnitude synthesis. The gradient magnitude synthesis is calculated using the Euclidean norm. The global temperature statistical feature extraction unit is used to perform global average pooling and global max pooling operations on the base layer thermal radiation feature map to generate the base layer global average temperature vector and the base layer global maximum temperature vector, respectively. The same operation is performed on the detail layer thermal radiation feature map to generate the detail layer global average temperature vector and the detail layer global maximum temperature vector. The spatial attention weight generation unit is used to concatenate the base layer temperature gradient magnitude map with the base layer global average temperature vector and the base layer global maximum temperature vector, and then input the result into a first multilayer perceptron network to output the low-resolution spatial attention weight matrix. It also concatenates the detail layer temperature gradient magnitude map with the detail layer global average temperature vector and the detail layer global maximum temperature vector, and then inputs the result into a second multilayer perceptron network to output the high-resolution spatial attention weight matrix. Both the first and second multilayer perceptron networks include two fully connected layers and a sigmoid activation function. 
Each element of both the low-resolution and high-resolution spatial attention weight matrices has a value range of 0 to 1. The temperature-sensitive region enhancement constraint unit is used to perform hard threshold truncation processing on the low-resolution spatial attention weight matrix and the high-resolution spatial attention weight matrix according to a preset temperature anomaly determination threshold, setting the weight elements below the temperature anomaly determination threshold to zero, so as to suppress the attention response of the background region and enhance the feature salience of the temperature anomaly region.
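A single-channel sketch of the attention computation is given below. It assumes the global average and maximum are broadcast to every pixel and concatenated with the gradient magnitude, and that the two fully connected layers use a ReLU between them; neither detail is stated in the source. The weights `w1`, `b1`, `w2`, `b2` stand in for trained parameters, and the threshold of 0.5 is a placeholder for the preset temperature anomaly determination threshold.

```python
import numpy as np

def gradient_magnitude(feature_map):
    """Euclidean norm of the first-order partial derivatives."""
    gy, gx = np.gradient(feature_map)  # derivatives along rows, then columns
    return np.sqrt(gx**2 + gy**2)

def attention_weights(feature_map, w1, b1, w2, b2, threshold=0.5):
    """Spatial attention in [0, 1] with hard-threshold truncation.

    w1: (3, hidden), b1: (hidden,), w2: (hidden, 1), b2: (1,)
    """
    h, w = feature_map.shape
    features = np.stack(
        [
            gradient_magnitude(feature_map),
            np.full((h, w), feature_map.mean()),  # global average pooling
            np.full((h, w), feature_map.max()),   # global max pooling
        ],
        axis=-1,
    )
    hidden = np.maximum(features @ w1 + b1, 0.0)  # ReLU is an assumption
    logits = (hidden @ w2 + b2)[..., 0]
    weights = 1.0 / (1.0 + np.exp(-logits))       # sigmoid
    return np.where(weights >= threshold, weights, 0.0)
```

The hard threshold zeroes weak responses outright rather than merely attenuating them, so flat background regions contribute nothing to the later fusion stage.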
[0014] Furthermore, the cross-scale feature fusion module includes: A scale selection gating mechanism construction unit is used to construct a learnable scale selection gating mechanism including a first gating branch and a second gating branch. The first gating branch receives the element-wise product of the base layer thermal radiation feature map and the low-resolution spatial attention weight matrix. The second gating branch receives the element-wise product of the detail layer thermal radiation feature map and the high-resolution spatial attention weight matrix. Both the first gating branch and the second gating branch include a convolutional layer with a 3×3 convolutional kernel and a Tanh activation function. The gating coefficient prediction unit is used to concatenate the output feature map of the first gating branch and the output feature map of the second gating branch in the channel dimension and input them into the gating coefficient prediction network. The gating coefficient prediction network includes two serial convolutional layers with 1×1 convolutional kernels and a Softmax activation function, and outputs a base layer gating coefficient matrix and a detail layer gating coefficient matrix. The sum of each channel element of the base layer gating coefficient matrix and the detail layer gating coefficient matrix is 1. An adaptive feature weighting fusion unit is used to perform a channel-by-channel element-by-element multiplication operation on the base layer gating coefficient matrix and the base layer thermal radiation feature map weighted by the low-resolution spatial attention weight matrix to generate a scale-adaptive base layer feature map, and to perform a channel-by-channel element-by-element multiplication operation on the detail layer gating coefficient matrix and the detail layer thermal radiation feature map weighted by the high-resolution spatial attention weight matrix to generate a scale-adaptive detail feature map. 
The multi-scale optimized feature synthesis unit is used to perform element-wise addition operations on the scale-adaptive base feature map and the scale-adaptive detail feature map, and refine the features through a residual convolutional layer with a 3×3 convolutional kernel to output the multi-scale optimized thermal radiation feature map. The input and output of the residual convolutional layer are added by skip connections.
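The competitive gating can be sketched as a two-way Softmax over the base and detail branches, so that the two coefficients sum to 1 at every position. In this sketch the gating logits are passed in directly; in the described system they would come from the gating branches and the 1×1 convolutional prediction network, and the final residual refinement layer is omitted. Function names and tensor shapes are illustrative assumptions.

```python
import numpy as np

def softmax_gate(base_logits, detail_logits):
    """Two-way softmax; returns (base_gate, detail_gate) summing to 1."""
    peak = np.maximum(base_logits, detail_logits)  # subtract max for stability
    exp_base = np.exp(base_logits - peak)
    exp_detail = np.exp(detail_logits - peak)
    total = exp_base + exp_detail
    return exp_base / total, exp_detail / total

def fuse(base_feat, detail_feat, base_attn, detail_attn, base_logits, detail_logits):
    """Gate the attention-weighted feature maps and add them element-wise.

    Features and logits: (C, H, W); attention matrices: (H, W), broadcast
    across channels.
    """
    base_gate, detail_gate = softmax_gate(base_logits, detail_logits)
    weighted_base = base_gate * (base_feat * base_attn)
    weighted_detail = detail_gate * (detail_feat * detail_attn)
    return weighted_base + weighted_detail
```

Because the gates compete, raising the detail coefficient at a location necessarily lowers the base coefficient there, which is what lets small-target detail win over large-target trend where it matters.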
[0015] Furthermore, the contrast enhancement and detail reconstruction module includes: The temperature dynamic range analysis unit is used to calculate the maximum and minimum pixel temperatures of the multi-scale optimized thermal radiation feature map, determine the original temperature dynamic range, and construct a transfer function that nonlinearly maps the original temperature dynamic range to the target display temperature dynamic range based on the preset target display temperature dynamic range. The transfer function uses an S-curve to compress the contrast of the high-temperature and low-temperature regions and expand the contrast of the medium-temperature region. The histogram specification calculation unit is used to calculate the target histogram distribution according to the transfer function, and to map the actual temperature histogram of the multi-scale optimized thermal radiation feature map to the target histogram distribution through the histogram specification algorithm to generate a contrast-expanded thermal radiation feature map. The high-frequency detail extraction unit is used to input the contrast-expanded thermal radiation feature map into the Laplacian sharpening operator, and extract the high-frequency detail layer corresponding to the temperature change edges by calculating the difference between the contrast-expanded thermal radiation feature map and its Gaussian-blurred version. The unsharp mask enhancement unit is used to perform element-wise multiplication of the high-frequency detail layer with a preset detail enhancement gain coefficient to generate a detail enhancement mask, and to perform element-wise addition of the detail enhancement mask with the contrast-expanded thermal radiation feature map to enhance the gradient changes at the edges of small targets and generate the enhanced infrared thermal radiation image. 
The temperature quantization and color mapping unit is used to quantize the floating-point temperature data of the enhanced infrared thermal radiation image into an 8-bit unsigned integer, and perform temperature-color mapping according to a preset pseudo-color lookup table to output a pseudo-color enhanced infrared thermal radiation image suitable for manual interpretation and subsequent algorithm analysis.
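The sketch below illustrates the S-curve mapping, the unsharp mask step, and 8-bit quantization. For brevity it applies a logistic S-curve directly to the normalized temperatures instead of going through histogram specification, and it uses a fixed 3×3 binomial blur in place of a configurable Gaussian; the steepness, the gain, and the function names are assumed values for illustration only.

```python
import numpy as np

def s_curve(temps, steepness=6.0):
    """Map temperatures to [0, 1], expanding mid-range contrast and
    compressing both extremes."""
    lo, hi = float(temps.min()), float(temps.max())
    if hi == lo:
        return np.full(temps.shape, 0.5)
    x = (temps - lo) / (hi - lo)
    sig = lambda v: 1.0 / (1.0 + np.exp(-steepness * (v - 0.5)))
    return (sig(x) - sig(0.0)) / (sig(1.0) - sig(0.0))

def blur3(img):
    """3x3 binomial blur ([1, 2, 1] / 4, separable) with edge padding."""
    p = np.pad(img, 1, mode="edge")
    rows = (p[:, :-2] + 2 * p[:, 1:-1] + p[:, 2:]) / 4
    return (rows[:-2] + 2 * rows[1:-1] + rows[2:]) / 4

def unsharp_mask(img, gain=1.5):
    """Add back the high-frequency detail layer scaled by the gain."""
    detail = img - blur3(img)
    return img + gain * detail

def to_uint8(img):
    """Quantize a [0, 1] image to 8-bit for pseudo-color lookup."""
    return np.clip(np.round(img * 255), 0, 255).astype(np.uint8)
```

A typical chain would be `to_uint8(np.clip(unsharp_mask(s_curve(t)), 0, 1))`, after which the 8-bit values index a pseudo-color lookup table.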
[0016] Furthermore, the multi-scale anomaly analysis module includes: A shared feature encoding network is used to input the enhanced infrared thermal radiation image into a deep convolutional neural network based on the ResNet-50 architecture with the last three downsampling layers removed, and to extract a shared feature map with a multi-scale receptive field. The spatial resolution of the shared feature map is 1 / 8 of that of the enhanced infrared thermal radiation image, and the channel dimension is 2048. The large-scale target semantic segmentation branch is used to input the shared feature map into the first feature pyramid network. The first feature pyramid network generates a multi-level segmentation feature map that integrates high-level semantic information and low-level spatial information through top-down path aggregation and lateral connection. It outputs pixel-level category prediction and temperature mask of large-scale temperature anomaly targets through a semantic segmentation head with 3×3 convolution kernels. The large-scale temperature anomaly targets include the transmission tower body, conductor splicing pipe, and overall structure of insulator string. The micro-scale target detection branch is used to input the shared feature map into the second feature pyramid network. The second feature pyramid network adds an additional downsampling level with deformable convolution kernels on the basis of the first feature pyramid network to expand the receptive field coverage of micro-targets. It outputs the bounding box coordinates, defect category confidence and temperature measurement value of micro-scale temperature anomaly targets through the target detection network with scale-aware detection head. The micro-scale temperature anomaly targets include pin-level connection points, equalizing ring fastening bolts and single sheds of insulators. 
The cross-branch feature enhancement unit is used to apply the temperature mask output by the large-scale target semantic segmentation branch as an attention weight to spatially weight the shared feature map, thereby enhancing the ability of the micro-scale target detection branch to focus on the temperature anomaly region. At the same time, it feeds back the bounding box coordinates of the micro-scale temperature anomaly targets detected by the micro-scale target detection branch to the large-scale target semantic segmentation branch to correct the edge localization accuracy of the large-scale temperature anomaly targets. The temperature measurement calibration unit is used to establish a local temperature reference based on the pixel temperature statistics in the temperature mask output by the large-scale target semantic segmentation branch, perform relative temperature difference calculation on the temperature measurement values output by the micro-scale target detection branch, and generate a normalized temperature anomaly index. The normalized temperature anomaly index is used to eliminate systematic measurement errors introduced by environmental radiation and emissivity settings. The inspection analysis result generation unit is used to structurally encapsulate the pixel-level category prediction of the large-scale temperature anomaly target, the bounding box coordinates and defect category confidence of the micro-scale temperature anomaly target, the normalized temperature anomaly index and the corresponding global positioning system coordinate information, and output the inspection analysis result including defect location, defect type, severity level and recommended treatment measures.
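The source does not give a formula for the normalized temperature anomaly index. One plausible reading, sketched here purely as an assumption, is a relative temperature difference against the mean of the masked reference region, scaled by that region's standard deviation. The function name `anomaly_index` and the `eps` guard are hypothetical.

```python
import numpy as np

def anomaly_index(temperature_map, reference_mask, detection_temp, eps=1e-6):
    """Relative temperature difference of a detection against a local reference.

    temperature_map: (H, W) temperatures in degrees Celsius
    reference_mask:  (H, W) boolean mask from the segmentation branch
    detection_temp:  temperature reported for a micro-scale detection
    This formula is an assumed illustration, not taken from the source.
    """
    reference = temperature_map[reference_mask]
    return (detection_temp - reference.mean()) / (reference.std() + eps)
```

Under this definition a constant additive bias in the temperature readings cancels out, since it shifts the detection and the reference by the same amount; multiplicative errors such as a wrong emissivity setting are only partly compensated.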
[0017] The technological advancements achieved by this invention compared to existing technologies are as follows. Existing feature extraction networks employ fixed spatial sampling strategies and single-scale processing procedures, and lack adaptive scale selection mechanisms tailored to the characteristics of infrared thermal imaging. In particular, the thermal signals of tiny defects in infrared images are only 2–3°C higher than the background and are easily obscured by large-scale high-temperature regions. To address these deficiencies, this invention offers the following advantages: (i) It constructs a multi-scale thermal radiation feature decomposition mechanism. By using a Gaussian-Laplacian pyramid structure, the large-scale temperature distribution trend and the fine details of abrupt temperature changes are separated and modeled, avoiding the over-response of single-scale convolutional networks to high-temperature regions and improving the distinguishability of weak thermal anomalies at the structural level.
[0018] (ii) It introduces an adaptive spatial attention mechanism driven by temperature gradient, which combines the first-order spatial temperature gradient with global temperature statistical features to achieve selective enhancement of low-contrast abnormal regions and significantly suppress the interference of background thermal noise and large-area high-temperature regions.
[0019] (iii) It constructs a learnable scale selection gating mechanism. Through the competitive gating coefficients of Softmax, dynamic weight allocation across scales is achieved, so that the system can automatically adjust the proportion of trend information and detail information in different scenarios, avoiding the problem of features being overwhelmed that fixed sampling strategies cause.
[0020] (iv) By using S-curve temperature dynamic range compression and unsharp mask enhancement, the contrast in the mid-temperature zone is improved and the gradient changes at the edges of small targets are strengthened, so that weak thermal defects that are only 2–3°C higher than the background can still form a stable response.
[0021] (v) During the intelligent analysis phase, large-scale structure segmentation and micro-target detection are achieved simultaneously, and a local temperature benchmark is established for relative temperature difference calibration to improve the stability and cross-scene consistency of temperature measurement results.
[0022] In summary, this invention effectively solves the problem that small defects with low contrast in infrared images are easily obscured by large targets at high temperatures, and achieves multi-scale adaptive enhancement and refined anomaly recognition, significantly improving the detection sensitivity and reliability in UAV inspection scenarios.
Attached Figure Description
[0023] The accompanying drawings are provided to further illustrate the invention and form part of the specification. They are used together with the embodiments of the invention to explain the invention and do not constitute a limitation thereof.
[0024] In the attached drawings: Figure 1 is a system structure diagram of the present invention.
Detailed Implementation
[0025] The following specific embodiments can be combined with each other, and the same or similar concepts or processes may not be described again in some embodiments. The embodiments of the present invention will now be described with reference to the accompanying drawings.
[0026] As shown in Figure 1, this invention discloses an infrared sensing and image enhancement intelligent analysis system for unmanned aerial vehicle (UAV) inspection, comprising:
The infrared acquisition module, which acquires raw infrared thermal radiation image sequences, including multi-scale temperature anomaly targets, through an uncooled infrared focal plane array sensor.
The image preprocessing module, which performs non-uniformity correction and detail-preserving noise reduction on the raw infrared thermal radiation image sequence to generate a standard infrared thermal radiation image.
The multi-scale feature decomposition module, which inputs the standard infrared thermal radiation image into the Laplacian pyramid structure network to separate the base-layer thermal radiation feature map from the detail-layer thermal radiation feature map.
The adaptive attention module, which calculates spatial attention weights for the base-layer and detail-layer thermal radiation feature maps based on the temperature gradient, generating a low-resolution and a high-resolution spatial attention weight matrix.
The cross-scale feature fusion module, which, based on the spatial attention weight matrices, performs selective enhancement and suppression on the base-layer and detail-layer thermal radiation feature maps through a learnable scale selection gating mechanism, and outputs a multi-scale optimized thermal radiation feature map.
The contrast enhancement and detail reconstruction module, which performs temperature dynamic range compression and unsharp mask enhancement on the multi-scale optimized thermal radiation feature map to generate an enhanced infrared thermal radiation image.
The multi-scale anomaly analysis module, which inputs the enhanced infrared thermal radiation image into a deep neural network equipped with a scale-aware detection head, simultaneously completing semantic segmentation of large-scale temperature anomaly targets and localization and identification of small-scale temperature anomaly targets, and outputting inspection analysis results including temperature measurements and defect category confidence levels.
[0027] Specifically, the infrared acquisition module includes: In this embodiment, the infrared acquisition module is used to stably acquire raw infrared thermal radiation image sequences, including targets with multi-scale temperature anomalies, during UAV inspection flights. This module achieves high-precision acquisition of infrared thermal radiation signals and suppression of temperature drift through the coordinated operation of an uncooled infrared focal plane array sensor unit, a multispectral filter switching unit, a sensor housing temperature monitoring unit, a thermal drift compensation calculation unit, and an analog-to-digital conversion and image formatting unit.
[0028] (i) Uncooled infrared focal plane array sensor unit. The uncooled infrared focal plane array sensor unit adopts a microbolometer array structure to convert the incident long-wave infrared thermal radiation signal into a resistance change signal, which a bias circuit then converts into a voltage output signal.
[0029] 1. Infrared radiation to electrical signal conversion mechanism. Within the 8μm–14μm long-wave infrared band, the thermal radiation of the inspected target obeys Planck's radiation law, and its spectral radiation power per unit area can be expressed as:
$$M(\lambda, T) = \frac{2\pi h c^{2}}{\lambda^{5}} \cdot \frac{1}{\exp\left(\frac{h c}{\lambda k_B T}\right) - 1}$$
where $\lambda$ is the wavelength; $T$ is the thermodynamic temperature of the target; $h$ is Planck's constant; $c$ is the speed of light; and $k_B$ is the Boltzmann constant.
[0030] When a microbolometer pixel absorbs incident radiation, the temperature of its sensitive thin film increases, causing a change in resistance. The pixel's output voltage can be expressed as:
$$V_i = G_i \, \Delta R_i + O_i$$
where $V_i$ is the raw voltage output of the $i$-th pixel; $G_i$ is the pixel response gain; $\Delta R_i$ is the temperature-induced change in resistance; and $O_i$ is the fixed offset of the pixel.
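To give a feel for the signal levels involved, the following minimal sketch integrates Planck's law over the 8μm–14μm band and compares a 300 K background with a target 3 K warmer. The function names, the trapezoidal integration, and the 300 K background are illustrative assumptions, not part of the disclosed embodiment.

```python
import math

H = 6.62607015e-34   # Planck constant, J*s
C = 2.99792458e8     # speed of light, m/s
KB = 1.380649e-23    # Boltzmann constant, J/K

def spectral_exitance(wavelength_m, temp_k):
    """Planck spectral radiant exitance M(lambda, T) in W / (m^2 * m)."""
    prefactor = 2.0 * math.pi * H * C ** 2 / wavelength_m ** 5
    return prefactor / math.expm1(H * C / (wavelength_m * KB * temp_k))

def band_exitance(temp_k, lo_m=8e-6, hi_m=14e-6, steps=600):
    """Trapezoidal integral of M over [lo_m, hi_m], in W / m^2."""
    step = (hi_m - lo_m) / steps
    total = 0.5 * (spectral_exitance(lo_m, temp_k) + spectral_exitance(hi_m, temp_k))
    for i in range(1, steps):
        total += spectral_exitance(lo_m + i * step, temp_k)
    return total * step

background = band_exitance(300.0)   # assumed 300 K background
defect = band_exitance(303.0)       # target 3 K warmer than the background
relative_gain = (defect - background) / background
print(f"band exitance at 300 K: {background:.1f} W/m^2, relative gain: {relative_gain:.3%}")
```

Under these assumptions a 3 K difference changes the in-band radiation by only a few percent, which illustrates why a low NETD and drift compensation matter for weak thermal defects.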
[0031] 2. Pixel performance parameter constraints In this embodiment: the pixel pitch is 17μm; the thermal sensitivity (NETD) is less than 50mK; and the frame rate is 30Hz. These parameters ensure sufficient resolution for small temperature differences (2–3°C), avoid motion blur at the drone's inspection flight speed, and simultaneously cover both large-scale and small-scale targets in a single frame image.
[0032] (ii) Multispectral filter switching unit. The multispectral filter switching unit is located at the optical front end of the uncooled infrared focal plane array sensor unit and is used to dynamically adjust the spectral response range.
[0033] When high-temperature targets (such as wire splicing pipes) are present in the scene, this embodiment switches to a short cutoff wavelength filter to avoid saturation of radiation energy in the high-temperature area; when the overall temperature difference of the scene is small (such as inspection on a cloudy day), a broadband filter is selected to improve the signal-to-noise ratio. Let the filter transmission function be $\tau(\lambda)$. The effective radiated power received by a pixel is:
$$P_{eff} = \int_{\lambda_1}^{\lambda_2} \tau(\lambda) \, M(\lambda, T) \, d\lambda$$
where $M(\lambda, T)$ is the spectral radiation power per unit area given by Planck's radiation law and $[\lambda_1, \lambda_2]$ is the sensor's response band. Changing $\tau(\lambda)$ regulates the radiative energy received within different temperature ranges.
[0034] By limiting the transmittance of specific wavelength bands, interference from surface-reflected infrared radiation, residual solar radiation, and water vapor absorption bands in the air can be suppressed, thereby improving the thermal contrast stability of the original infrared thermal radiation image sequence.
[0035] (III) Sensor Housing Temperature Monitoring Unit During drone inspections, changes in sunlight and ambient airflow can cause temperature drift in the sensor housing, leading to pixel baseline drift. The sensor housing temperature monitoring unit includes a high-precision thermistor array attached to the surface of the metal housing, which outputs the housing temperature in real time. This temperature data serves as the input parameter for the subsequent thermal drift compensation calculation unit.
[0036] (iv) Thermal drift compensation calculation unit. First, the drift mechanism is modeled. The pixel output is related not only to the target temperature but also to the housing temperature. In this embodiment, a nonlinear mapping model between housing temperature and pixel response is established:
$$\Delta V_i = f_i(T_h)$$
where $T_h$ is the measured housing temperature and $f_i(\cdot)$ is a pixel-by-pixel nonlinear mapping function, established from blackbody radiation data under multiple temperature conditions.
[0037] Then, pixel-by-pixel compensation is applied to the original voltage signal $V_i$:
$$V_i^{c} = V_i - f_i(T_h)$$
giving the infrared thermal radiation electrical signal after thermal drift correction.
[0038] This pixel-by-pixel compensation method can avoid local errors caused by uniform offset across the entire frame, ensure that the true temperature difference of small-scale temperature anomalies is not overwhelmed by drift errors, and maintain temperature response stability under strong sunlight conditions.
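The pixel-by-pixel compensation described above can be sketched as follows. This is a minimal illustration that assumes the per-pixel mapping function is a quadratic polynomial in housing temperature, fitted from calibration frames recorded against a constant blackbody; the source does not specify the functional form, and the names `fit_drift_model` and `compensate` are hypothetical.

```python
import numpy as np

def fit_drift_model(housing_temps, calib_frames, degree=2):
    """Fit one polynomial per pixel (assumed form of the mapping function).

    housing_temps: (M,) housing temperatures during calibration.
    calib_frames:  (M, H, W) frames of a constant blackbody.
    Returns coefficients of shape (degree + 1, H, W), highest power first.
    """
    m, h, w = calib_frames.shape
    drift = calib_frames - calib_frames[0]        # drift relative to the first frame
    coeffs = np.polyfit(housing_temps, drift.reshape(m, h * w), degree)
    return coeffs.reshape(degree + 1, h, w)

def compensate(raw, coeffs, housing_temp):
    """Subtract the predicted per-pixel drift from a raw frame."""
    powers = housing_temp ** np.arange(coeffs.shape[0] - 1, -1, -1)
    drift = np.tensordot(powers, coeffs, axes=(0, 0))
    return raw - drift

# Synthetic calibration data (illustrative values only).
rng = np.random.default_rng(0)
temps = np.linspace(20.0, 50.0, 7)
lin = rng.normal(0.5, 0.05, (4, 5))       # per-pixel linear drift term
quad = rng.normal(0.01, 0.001, (4, 5))    # per-pixel quadratic drift term
base = rng.normal(8000.0, 20.0, (4, 5))   # drift-free response
frames = np.stack([base + lin * (t - 20.0) + quad * (t - 20.0) ** 2 for t in temps])

coeffs = fit_drift_model(temps, frames)
raw = base + lin * 17.0 + quad * 17.0 ** 2   # frame taken at housing temperature 37
restored = compensate(raw, coeffs, 37.0)
```

Because every pixel has its own coefficients, a pixel that drifts more strongly than its neighbors is corrected individually rather than by a single frame-wide offset.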
[0039] (v) Analog-to-digital conversion and image formatting unit. The thermal-drift-corrected infrared thermal radiation electrical signal is converted into a 14-bit digital signal. The 14-bit quantization precision supports a wide dynamic temperature range while still resolving minute temperature differences.
[0040] Each frame is tagged with a timestamp, GPS coordinates, and the UAV attitude angles, forming a structured sequence of raw infrared thermal radiation images.
In summary, through the aforementioned structure and calculation process, the infrared acquisition module achieves highly sensitive acquisition of 8μm–14μm long-wave infrared radiation, pixel-by-pixel nonlinear compensation for housing temperature drift, spectral suppression of environmental stray radiation, and stable imaging of multi-scale temperature anomaly targets. This module provides highly stable, high dynamic range raw infrared thermal radiation image sequences as input for the subsequent image preprocessing module.
[0041] Specifically, the image preprocessing module operates as follows. In this embodiment, the image preprocessing module performs physical consistency correction and statistical noise suppression on the raw infrared thermal radiation image sequence output by the infrared acquisition module, generating a standard infrared thermal radiation image for the subsequent multi-scale feature decomposition module. Let the single-frame image output by the infrared acquisition module after thermal drift compensation and analog-to-digital conversion be $I_{raw}^{(t)}(x, y)$, where $t$ is the frame number; $(x, y)$ are the pixel coordinates; and $I_{raw}^{(t)}(x, y)$ is a 14-bit digital signal. This module executes in sequence:
1. Non-uniformity correction;
2. Temperature-guided filtering noise reduction;
3. Defective pixel detection and replacement;
4. Image normalization and format conversion.
Final output: $I_{std}$, the standard infrared thermal radiation image.
[0042] (i) Non-uniformity correction unit. Fixed pattern noise arises as follows: in an uncooled infrared focal plane array sensor unit, differences in the pixel manufacturing process give each pixel a different response gain and offset coefficient, namely:
$$I_{raw}(x, y) = g(x, y) \, \Phi(x, y) + o(x, y)$$
where $\Phi(x, y)$ is the true radiation signal; $g(x, y)$ is the pixel gain; and $o(x, y)$ is the pixel offset. Inconsistency of $g$ and $o$ between pixels creates fixed pattern noise.
[0043] The pixel outputs $I_H(x, y)$ and $I_L(x, y)$ are acquired against a known high-temperature blackbody (radiation $\Phi_H$) and a known low-temperature blackbody (radiation $\Phi_L$), respectively. The pixel gain and offset coefficient are then calculated as:
$$g(x, y) = \frac{I_H(x, y) - I_L(x, y)}{\Phi_H - \Phi_L}, \qquad o(x, y) = I_L(x, y) - g(x, y) \, \Phi_L$$
During the inspection process, linear inversion correction is performed on each frame:
$$I_{2pt}(x, y) = \frac{I_{raw}(x, y) - o(x, y)}{g(x, y)}$$
giving the thermal radiation image after two-point correction.
[0044] Since the ambient temperature changes slowly during flight, the two-point correction alone cannot eliminate residual offset drift. Therefore, the mean value $\mu_W$ of the corrected image is calculated within a local window $W$ of the current frame. If the window belongs to the background area, its statistics should be stable. If a systematic shift of the local mean relative to the reference background temperature $T_{ref}$ is detected, the offset is updated dynamically using an adaptive update coefficient $\eta$.
[0045] Final output: $I_{NUC}(x, y)$, the thermal radiation image after cascaded non-uniformity correction.
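The two-point correction followed by a slow residual update can be sketched as below. The two-point formulas follow the standard blackbody calibration described above; the residual update rule (an exponential-style correction toward a reference background level with coefficient `eta`) is an assumption, since the source does not give its exact form, and all names and values are illustrative.

```python
import numpy as np

def two_point_calibration(frame_hot, frame_cold, phi_hot, phi_cold):
    """Per-pixel gain and offset from two blackbody frames."""
    gain = (frame_hot - frame_cold) / (phi_hot - phi_cold)
    offset = frame_cold - gain * phi_cold
    return gain, offset

def correct(raw, gain, offset, residual=0.0):
    """Linear inversion, then removal of the tracked residual offset."""
    return (raw - offset) / gain - residual

def update_residual(residual, corrected, window, ref_level, eta=0.2):
    """Assumed update rule: move the residual toward the observed background shift."""
    shift = corrected[window].mean() - ref_level
    return residual + eta * shift

rng = np.random.default_rng(1)
true_gain = rng.normal(1.0, 0.05, (16, 16))
true_offset = rng.normal(100.0, 10.0, (16, 16))
phi_cold, phi_hot = 1000.0, 3000.0
gain, offset = two_point_calibration(
    true_gain * phi_hot + true_offset, true_gain * phi_cold + true_offset, phi_hot, phi_cold
)

scene = np.full((16, 16), 2000.0)
drift = 15.0                                  # hypothetical slow drift during flight
window = (slice(0, 8), slice(0, 8))           # assumed background window
residual = 0.0
for _ in range(50):                           # one update per frame
    raw = true_gain * (scene + drift) + true_offset
    corrected = correct(raw, gain, offset, residual)
    residual = update_residual(residual, corrected, window, ref_level=2000.0)
final = correct(true_gain * (scene + drift) + true_offset, gain, offset, residual)
```

In this sketch the residual converges to the drift value over successive frames, so the cascaded output returns to the reference level without recalibrating the gain.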
[0046] (ii) Temperature-guided filtering unit. This unit suppresses random noise while preserving the edges of abrupt temperature changes.
[0047] 1. Input definition. Input image: $I_{NUC}$; guide image: the same $I_{NUC}$.
[0048] 2. Joint bilateral filtering model. The output pixel value is a normalized weighted sum over the neighborhood of the current pixel, in which: $p$ is the current pixel; $q$ is a neighboring pixel; $\sigma_s$ is the spatial kernel parameter; $\sigma_g$ is the temperature gradient kernel parameter; $\nabla T$ is the temperature spatial gradient; and $K_p$ is the normalization coefficient.
[0049] This temperature-guided filtering unit smooths random noise (high-frequency, small-amplitude fluctuations), preserves the gradient in areas of abrupt temperature change (the edges of small-scale temperature anomaly targets), and avoids the blurring of thermal boundaries caused by traditional Gaussian filtering. The output is denoted as $I_f$.
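A minimal sketch of edge-preserving joint bilateral filtering is shown below. It assumes the range kernel compares the temperature difference between the current pixel and each neighbor in the guide image, which is one reading of the temperature gradient kernel; the source formula is not available, and the parameter names and values are illustrative.

```python
import numpy as np

def temperature_guided_filter(img, guide=None, radius=2, sigma_s=1.5, sigma_t=5.0):
    """Joint bilateral filter; the guide defaults to the image itself."""
    img = img.astype(np.float64)
    guide = img if guide is None else guide.astype(np.float64)
    h, w = img.shape
    pad_img = np.pad(img, radius, mode="edge")
    pad_guide = np.pad(guide, radius, mode="edge")
    acc = np.zeros((h, w))
    norm = np.zeros((h, w))
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            ys = slice(radius + dy, radius + dy + h)
            xs = slice(radius + dx, radius + dx + w)
            # Spatial kernel: depends only on the pixel distance.
            w_space = np.exp(-(dx * dx + dy * dy) / (2.0 * sigma_s ** 2))
            # Range kernel (assumed form): temperature difference to the neighbor.
            w_temp = np.exp(-((pad_guide[ys, xs] - guide) ** 2) / (2.0 * sigma_t ** 2))
            weight = w_space * w_temp
            acc += weight * pad_img[ys, xs]
            norm += weight
    return acc / norm

rng = np.random.default_rng(2)
clean = np.zeros((24, 24))
clean[:, 12:] = 50.0                         # sharp thermal boundary
noisy = clean + rng.normal(0.0, 1.0, clean.shape)
smoothed = temperature_guided_filter(noisy)
```

Neighbors across the 50-unit step receive a near-zero range weight, so the flat regions are averaged while the boundary stays sharp.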
[0050] (iii) Bad pixel detection and replacement unit. First, bad pixel detection is performed in a sliding window $W$. Within this window, the local mean $\mu_W$ and standard deviation $\sigma_W$ are calculated. If the following condition is met:
$$\left| I_f(x, y) - \mu_W \right| > k \, \sigma_W$$
where $k$ is the threshold coefficient, the pixel is classified as an overheated pixel (positive anomaly) or an undercooled pixel (negative anomaly).
[0051] Then the bad pixel is replaced by the median temperature of the valid neighboring pixels. This method is superior to mean replacement and avoids spreading the temperature gradient.
[0052] The output is denoted as $I_{bp}$.
(iv) Image normalization and format conversion unit. First, dynamic range mapping is performed. The minimum and maximum temperatures of the current frame, $T_{min}$ and $T_{max}$, are calculated, and the image is mapped to a 16-bit unsigned integer:
$$I_{16}(x, y) = \mathrm{round}\left( \frac{I_{bp}(x, y) - T_{min}}{T_{max} - T_{min}} \times 65535 \right)$$
Then information encapsulation is performed, and the following information is appended to the image structure: the timestamp; the Global Positioning System coordinates; and the attitude angles. This generates the standard infrared thermal radiation image $I_{std}$.
The image preprocessing module thus achieves pixel-level fixed pattern noise suppression, random noise smoothing without weakening the edges of temperature changes, adaptive repair of bad pixels, and output of a high dynamic range, structurally encapsulated standard infrared thermal radiation image. This standard infrared thermal radiation image provides a stable, low-noise, and physically consistent input for the subsequent multi-scale feature decomposition module.
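The detection, median replacement, and 16-bit mapping steps can be sketched as follows. Excluding the center pixel from the window statistics is a design choice of this sketch (it keeps a single outlier from inflating its own threshold) and is not stated in the source; the window radius, threshold coefficient, and function names are likewise assumptions.

```python
import numpy as np

def detect_bad_pixels(img, radius=2, k=4.0):
    """Flag pixels that deviate from their neighbors by more than k standard deviations."""
    h, w = img.shape
    bad = np.zeros((h, w), dtype=bool)
    for y in range(h):
        for x in range(w):
            y0, x0 = max(0, y - radius), max(0, x - radius)
            win = img[y0:y + radius + 1, x0:x + radius + 1]
            keep = np.ones(win.shape, dtype=bool)
            keep[y - y0, x - x0] = False              # leave the center pixel out
            neigh = win[keep]
            bad[y, x] = abs(img[y, x] - neigh.mean()) > k * neigh.std() + 1e-9
    return bad

def replace_bad_pixels(img, bad, radius=2):
    """Replace each flagged pixel with the median of its valid neighbors."""
    out = img.copy()
    for y, x in zip(*np.nonzero(bad)):
        y0, x0 = max(0, y - radius), max(0, x - radius)
        win = img[y0:y + radius + 1, x0:x + radius + 1]
        valid = ~bad[y0:y + radius + 1, x0:x + radius + 1]
        if valid.any():
            out[y, x] = np.median(win[valid])
    return out

def to_uint16(img):
    """Min-max mapping of the frame to the full 16-bit range."""
    t_min, t_max = float(img.min()), float(img.max())
    if t_max == t_min:
        return np.zeros(img.shape, dtype=np.uint16)
    return np.round((img - t_min) / (t_max - t_min) * 65535.0).astype(np.uint16)

rng = np.random.default_rng(3)
frame = 300.0 + rng.normal(0.0, 0.2, (20, 20))
frame[5, 5] = 400.0      # overheated pixel
frame[12, 7] = 200.0     # undercooled pixel
bad = detect_bad_pixels(frame)
repaired = replace_bad_pixels(frame, bad)
encoded = to_uint16(repaired)
```

Repairing bad pixels before normalization matters here: a single dead pixel would otherwise set the frame minimum or maximum and compress the useful temperature range.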
[0053] Specifically, the multi-scale feature decomposition module operates as follows. In this embodiment, the multi-scale feature decomposition module processes the standard infrared thermal radiation image output by the image preprocessing module. It performs structured decomposition in the frequency and spatial scale domains, physically decoupling large-scale temperature distribution trends from small-scale abrupt temperature changes within a single path. Based on the Laplacian pyramid structure combined with a dual-branch convolutional feature extraction network, the final outputs are: the base-layer thermal radiation feature map $F_b$ (characterizing the temperature distribution trend of large-scale targets); and the detail-layer thermal radiation feature map $F_d$ (characterizing the details of small-scale abrupt temperature changes).
[0054] (i) Gaussian pyramid construction unit. First, a multi-scale low-frequency structure is constructed, with the standard infrared thermal radiation image $I_{std}$ as input, $G_0 = I_{std}$. A five-layer Gaussian pyramid is constructed using recursive Gaussian blur and downsampling:
$$G_k = \mathrm{Down}_2\left( g_{\sigma_k} * G_{k-1} \right), \qquad k = 1, \dots, 4$$
where the Gaussian kernel standard deviation $\sigma_k$ increases layer by layer and the downsampling factor is 2, so that the resolution of the top layer $G_4$ is 1/16 of the original image.
[0055] The Gaussian pyramid suppresses high-frequency abrupt temperature changes layer by layer, retaining only the low-frequency temperature distribution trend. $G_0$ contains the complete temperature information; $G_4$ approximately characterizes the overall temperature distribution trend of the transmission tower, with high-frequency small-scale temperature anomalies filtered out. This structure matches the scale distribution characteristics of infrared thermal imaging.
[0056] (ii) Laplacian pyramid generation unit. Upsampling and difference operations are performed on adjacent layers of the Gaussian pyramid:
$$L_k = G_k - \mathrm{Up}_2\left( G_{k+1} \right), \qquad k = 0, \dots, 3$$
where $L_k$ is the $k$-th detail layer, which characterizes temperature variation information within one frequency band.
[0057] $L_0$: extremely high-frequency detail (pin-level defect edges); $L_1$: higher-frequency detail (local anomalies in insulator skirts); $L_2$: mid-frequency detail (local wire splicing points); $L_3$: low-frequency detail (uneven local temperature distribution on the tower).
[0058] This structure enables multi-frequency separation of the temperature field.
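The pyramid construction can be sketched in a few lines. This is a simplified illustration that uses a fixed 5-tap binomial kernel as the Gaussian blur at every level (the embodiment states that the kernel standard deviation increases layer by layer) and nearest-neighbor upsampling followed by a blur; the function names are hypothetical.

```python
import numpy as np

KERNEL = np.array([1.0, 4.0, 6.0, 4.0, 1.0]) / 16.0   # binomial approximation of a Gaussian

def blur(img):
    """Separable 5-tap blur with edge padding."""
    h, w = img.shape
    pad = np.pad(img, 2, mode="edge")
    rows = sum(KERNEL[i] * pad[:, i:i + w] for i in range(5))    # horizontal pass
    return sum(KERNEL[i] * rows[i:i + h, :] for i in range(5))   # vertical pass

def upsample(img, shape):
    """Double the resolution by pixel repetition, crop to `shape`, then smooth."""
    up = np.repeat(np.repeat(img, 2, axis=0), 2, axis=1)[:shape[0], :shape[1]]
    return blur(up)

def build_pyramids(img, levels=5):
    """Return the Gaussian layers G_0..G_4 and the Laplacian layers L_0..L_3."""
    gauss = [img.astype(np.float64)]
    for _ in range(levels - 1):
        gauss.append(blur(gauss[-1])[::2, ::2])
    lap = [g - upsample(g_next, g.shape) for g, g_next in zip(gauss[:-1], gauss[1:])]
    return gauss, lap

def reconstruct(top, lap):
    """Invert the decomposition: add each detail layer back, coarse to fine."""
    rec = top
    for layer in reversed(lap):
        rec = layer + upsample(rec, layer.shape)
    return rec

rng = np.random.default_rng(4)
image = rng.normal(300.0, 5.0, (64, 64))
gauss, lap = build_pyramids(image)
restored = reconstruct(gauss[-1], lap)
```

Because each detail layer stores exactly what the upsampled coarser layer misses, the decomposition is lossless: the top Gaussian layer plus the four detail layers recover the input.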
[0059] (iii) Base-layer thermal radiation feature extraction unit. The input is the low-frequency approximation image $G_4$ at the top of the Gaussian pyramid. For an $H \times W$ input image, its spatial resolution is $\frac{H}{16} \times \frac{W}{16}$.
[0060] The first convolutional neural network branch uses three sequential 5×5 convolutional layers, each followed by a ReLU activation function.
[0061] Output: the base-layer thermal radiation feature map $F_b$.
[0062] A 5×5 convolution kernel can expand the receptive field, enhance the modeling of large-scale temperature field structures, and suppress the interference of local noise on the judgment of global trends.
[0063] This feature map is used to express the temperature trend of large-scale targets (such as the entire tower or a long section of conductor).
[0064] (iv) Detail-layer thermal radiation feature extraction unit. The input is the four detail levels of the Laplacian pyramid: $L_0$, $L_1$, $L_2$, $L_3$.
[0065] In the second convolutional neural network branch structure, to avoid scale aliasing, each detailed layer is input to an independent parallel subnetwork: Each subnetwork consists of two 3×3 convolutional layers with ReLU activation function.
[0066] Then channel concatenation and fusion are performed: the outputs of the four sub-networks are concatenated along the channel dimension. Output: the detail-layer thermal radiation feature map $F_d$, whose spatial resolution matches that of the standard infrared thermal radiation image.
[0067] 3×3 convolution preserves high-frequency local temperature gradients, and the multi-level parallel structure prevents information from different frequency bands from interfering with each other, thus enhancing the feature representation of small-scale temperature anomaly targets.
[0068] This feature map serves to express temperature abrupt changes in microscale targets such as pin-level defects and equalizing ring connection points.
[0069] (v) Feature map dimension alignment unit. Because the base-layer and detail-layer thermal radiation feature maps differ in spatial resolution and channel dimension, they need to be aligned.
[0070] First, channel compression is performed using a 1×1 convolution, reducing the number of channels from 256 to 128.
[0071] Then spatial upsampling is performed using bilinear interpolation, yielding the aligned feature map.
[0072] The final output consists of two feature maps: the aligned base-layer thermal radiation feature map and the detail-layer thermal radiation feature map. Both have the same spatial dimensions and can be input in parallel to the subsequent adaptive attention module.
[0073] The multi-scale feature decomposition module achieves frequency domain physical decoupling based on the Laplacian pyramid, and structural separation of large-scale temperature distribution trends and micro-scale temperature abrupt change details. It completes multi-scale information separation within a single-path network structure, providing a structurally clear dual-scale feature input for subsequent dynamic scale selection and gating fusion.
[0074] Specifically, the adaptive attention module operates as follows. After the preceding modules have completed acquisition, normalization, and noise suppression of the standard infrared thermal radiation image, the multi-scale feature decomposition module outputs the base-layer thermal radiation feature map $F_b$ and the detail-layer thermal radiation feature map $F_d$. Building on this, the module constructs a temperature gradient-driven spatial attention control mechanism that accounts for the spatial variation of temperature at different scales, generating the low-resolution spatial attention weight matrix $A_{low}$ and the high-resolution spatial attention weight matrix $A_{high}$ for subsequent scale-adaptive feature enhancement.
[0075] The base-layer features $F_b$ characterize the large-scale temperature distribution trend; the detail-layer features $F_d$ characterize local abrupt temperature changes and small target details. The module executes the following for each of them separately:
1. Temperature gradient saliency modeling;
2. Global temperature statistical feature modeling;
3. Fusion of gradient and statistical features to generate spatial attention weights;
4. Enhancement constraint in temperature-sensitive areas.
This achieves adaptive attention to thermal anomaly regions at different scales.
[0076] (i) Temperature gradient calculation unit. To enhance sensitivity to regions of abrupt temperature change, spatial first-order partial derivatives are calculated for the features of both the base layer and the detail layer. For any feature map $F$, its horizontal and vertical gradients are $G_x = \partial F / \partial x$ and $G_y = \partial F / \partial y$, respectively. The gradient magnitude is synthesized using the Euclidean norm:
$$\left| \nabla F \right| = \sqrt{G_x^{2} + G_y^{2}}$$
This gives the base-layer temperature gradient magnitude map $M_b$ and the detail-layer temperature gradient magnitude map $M_d$.
[0077] The base-layer gradient emphasizes the boundaries of large-scale temperature changes (such as the overall outline of the equipment's heating zone), while the detail-layer gradient emphasizes local hot spots or microcrack areas.
(ii) Global temperature statistical feature extraction unit. To enhance global semantic control, global average pooling and global max pooling are applied to $F_b$ and $F_d$ respectively, giving: the base-layer global average temperature vector $\mu_b$; the base-layer global maximum temperature vector $m_b$; the detail-layer global average temperature vector $\mu_d$; and the detail-layer global maximum temperature vector $m_d$.
[0078] (iii) Spatial attention weight generation unit. First, feature concatenation is performed: at the base scale, the gradient and global statistical features are concatenated into $Z_b$; at the detail scale, they are concatenated into $Z_d$.
[0079] The global vectors are extended to the spatial dimensions through broadcasting.
[0080] Then the concatenated features $Z$ of each scale are input into a two-layer fully connected network (the two scales share the structure but not the parameters):
$$A = \sigma\left( W_2 \, \delta\left( W_1 Z \right) \right)$$
where $W_1$ and $W_2$ are the weights of the fully connected layers; $\delta$ is the ReLU activation function; and $\sigma$ is the Sigmoid function.
[0081] This gives: the low-resolution spatial attention weight matrix $A_{low}$; and the high-resolution spatial attention weight matrix $A_{high}$.
[0082] (iv) Temperature-sensitive area enhancement constraint unit. To suppress the response in background noise areas, a temperature anomaly detection threshold is set and hard threshold truncation is performed, giving the enhanced low-resolution attention matrix $\hat{A}_{low}$ and the enhanced high-resolution attention matrix $\hat{A}_{high}$.
[0083] In summary, this module jointly models temperature spatial gradient information with global thermal statistical semantics to achieve adaptive selective attention to thermal anomaly targets at different scales.
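The attention pipeline of this module (gradient magnitude, global pooling, concatenation, a two-layer perceptron with Sigmoid output, then hard thresholding) can be sketched as below. The weights are random rather than learned, the concatenated input is assumed to be the gradient magnitude plus the broadcast pooled vectors, and the threshold is applied to the attention weight itself; all of these are assumptions, since the source formulas are not available.

```python
import numpy as np

def spatial_attention(feat, w1, w2, tau=0.5):
    """feat: (C, H, W) feature map. Returns an (H, W) attention map.

    w1: (K, 3C) first-layer weights, w2: (K,) second-layer weights (hypothetical).
    """
    c, h, w = feat.shape
    gy, gx = np.gradient(feat, axis=(1, 2))              # first-order spatial derivatives
    grad_mag = np.sqrt(gx ** 2 + gy ** 2)                # Euclidean gradient magnitude
    gap = np.broadcast_to(feat.mean(axis=(1, 2))[:, None, None], (c, h, w))
    gmp = np.broadcast_to(feat.max(axis=(1, 2))[:, None, None], (c, h, w))
    z = np.concatenate([grad_mag, gap, gmp], axis=0)     # (3C, H, W)
    hidden = np.maximum(0.0, np.einsum("kc,chw->khw", w1, z))   # ReLU layer
    logits = np.einsum("k,khw->hw", w2, hidden)
    att = 1.0 / (1.0 + np.exp(-logits))                  # Sigmoid output in (0, 1)
    return np.where(att >= tau, att, 0.0)                # hard threshold truncation

rng = np.random.default_rng(5)
features = rng.normal(0.0, 1.0, (4, 8, 8))
w1 = rng.normal(0.0, 0.1, (6, 12))
w2 = rng.normal(0.0, 0.1, 6)
att = spatial_attention(features, w1, w2)
```

The same function would be called once for the base-layer features and once for the detail-layer features, each with its own weights, matching the "shared structure, separate parameters" arrangement.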
[0084] Specifically, the cross-scale feature fusion module operates as follows. After the preceding modules have completed image preprocessing and multi-scale decomposition, and the adaptive attention module has generated the low-resolution spatial attention weight matrix $A_{low}$ and the high-resolution spatial attention weight matrix $A_{high}$, this module constructs a learnable scale selection gating mechanism that performs cross-scale dynamic regulation and fusion of the base-layer thermal radiation feature map $F_b$ and the detail-layer thermal radiation feature map $F_d$, and outputs the multi-scale optimized thermal radiation feature map. The core objective of this module is to adaptively select a trend-driven or detail-driven feature representation under different scenarios and thermal anomaly scales, thereby fusing cross-scale thermal radiation information effectively.
[0085] (i) Scale selection gating mechanism construction unit. First, attention-weighted pre-modulation is performed, applying the attention matrices output by the adaptive attention module to the features of the corresponding scale:
$$F_b' = A_{low} \odot F_b, \qquad F_d' = A_{high} \odot F_d$$
where $\odot$ denotes channel-wise, element-wise multiplication.
[0086] This step completes the suppression of low-significance background regions and the enhancement of temperature anomaly response regions.
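[0087] Then two parallel gating branches are constructed. The first gating branch (base branch) and the second gating branch (detail branch) are:
$$U_b = \tanh\left( \mathrm{Conv}_{3 \times 3}(F_b') \right), \qquad U_d = \tanh\left( \mathrm{Conv}_{3 \times 3}(F_d') \right)$$
where $F_b'$ and $F_d'$ are the attention-modulated base-layer and detail-layer features. Their technical role is to capture local spatial dependencies through the 3×3 convolution, compress the features to the $[-1, 1]$ interval through the Tanh function, and provide scale comparison information for the subsequent gating weight prediction.
[0088] (ii) Gating coefficient prediction unit. First, channel concatenation is performed: $U = \mathrm{Concat}(U_b, U_d)$.
[0089] Then the gating coefficients are generated by inputting the concatenated result into the gating coefficient prediction network (two serial 1×1 convolutional layers followed by Softmax):
$$[G_b, G_d] = \mathrm{Softmax}\left( \mathrm{Conv}_{1 \times 1}\left( \mathrm{Conv}_{1 \times 1}(U) \right) \right)$$
where $G_b$ is the base-layer gating coefficient matrix and $G_d$ is the detail-layer gating coefficient matrix, satisfying $G_b + G_d = 1$ at every spatial location.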
[0090] Its technical significance lies in constructing a competitive scale selection mechanism, which dynamically determines the proportion of trend information and detailed information at each spatial location, avoiding feature dilution caused by simple weighted averaging.
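[0091] (iii) Adaptive feature weighting fusion unit. A second stage of gating weighting is applied to the attention-modulated features of both scales, using the gating coefficient matrices $G_b$ and $G_d$:
$$\hat{F}_b = G_b \odot F_b', \qquad \hat{F}_d = G_d \odot F_d'$$
giving the scale-adaptive base-layer feature map $\hat{F}_b$ and the scale-adaptive detail-layer feature map $\hat{F}_d$.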
[0092] This stage aims to increase the weight of the basic layer when large-area thermal anomalies exist, and increase the weight of the detail layer when small hotspots dominate, thus forming a spatially differentiated fusion strategy in complex scenarios.
[0093] (iv) Multi-scale optimized feature synthesis unit. First, the two scale-adaptive feature maps are synthesized into a single feature map. Then residual refinement is performed through a 3×3 convolutional residual layer, whose input and output are added by a skip connection. This refines the scale interaction boundary, reduces the pseudo-edge effect generated during fusion, and improves the structural consistency of the thermal anomaly region.
[0094] In summary, this module introduces a competitive Softmax gating mechanism to achieve dynamic scale selection, and combines spatial attention and channel gating to construct a dual control structure. Residual refinement enhances fusion stability, avoiding false positives or false negatives caused by single-scale dominance, and significantly improves the generalization ability of multiple types of thermal anomalies in UAV inspection scenarios. The final output is a multi-scale optimized thermal radiation feature map. It has the same spatial resolution as the original input, a uniform 128-channel dimension, and preserves large-scale temperature trend information and details of small temperature changes, making it suitable for subsequent anomaly detection, target localization, or image enhancement and reconstruction tasks.
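The competitive gating can be illustrated with a simplified sketch. It omits the 3×3 convolutions and the residual refinement, collapses the two 1×1 convolutions into one linear map with random weights, and sums the two gated features as the synthesis step; these simplifications and all names are assumptions made for illustration only.

```python
import numpy as np

def softmax_gate_fusion(f_base, f_detail, a_low, a_high, w_gate):
    """f_*: (C, H, W) features, a_*: (H, W) attention maps, w_gate: (2, 2C)."""
    fb = f_base * a_low                    # attention pre-modulation (broadcast over channels)
    fd = f_detail * a_high
    z = np.concatenate([np.tanh(fb), np.tanh(fd)], axis=0)   # (2C, H, W), values in [-1, 1]
    logits = np.einsum("kc,chw->khw", w_gate, z)             # 1x1 convolution as a linear map
    logits = logits - logits.max(axis=0, keepdims=True)      # numerically stable Softmax
    expo = np.exp(logits)
    gates = expo / expo.sum(axis=0, keepdims=True)           # gates[0] + gates[1] == 1
    fused = gates[0] * fb + gates[1] * fd                    # assumed synthesis: weighted sum
    return fused, gates

rng = np.random.default_rng(6)
f_base = rng.normal(0.0, 1.0, (3, 5, 5))
f_detail = rng.normal(0.0, 1.0, (3, 5, 5))
a_low = rng.uniform(0.0, 1.0, (5, 5))
a_high = rng.uniform(0.0, 1.0, (5, 5))
w_gate = rng.normal(0.0, 1.0, (2, 6))
fused, gates = softmax_gate_fusion(f_base, f_detail, a_low, a_high, w_gate)
```

Because the two gate values compete through Softmax at each pixel, increasing the base-layer share necessarily reduces the detail-layer share at that location, which is what distinguishes this scheme from independent per-scale weights.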
[0095] Specifically, the contrast enhancement and detail reconstruction module includes: This module addresses the input requirements of human-computer collaborative interpretation and subsequent anomaly detection algorithms, performing dynamic temperature range reconstruction and edge enhancement of small targets, ultimately generating enhanced infrared thermal radiation images. This module implements two types of enhancements: adaptive redistribution of temperature contrast and edge gradient enhancement for small thermal anomalies.
[0096] (i) Temperature dynamic range analysis unit. The original temperature dynamic range of the multi-scale optimized thermal radiation feature map is calculated first. The minimum value $T_{min}$ and maximum value $T_{max}$ are computed, and the original temperature dynamic range is $\Delta T = T_{max} - T_{min}$.
[0097] To avoid excessive saturation in high-temperature or low-temperature regions, an S-shaped transfer function is constructed:
$$S(T) = \frac{1}{1 + \exp\left( -k \, (T - T_c) \right)}$$
where $T$ is the original temperature value; $T_c$ is the center temperature value; and $k$ is the contrast adjustment coefficient.
[0098] This unit can compress the contrast between high-temperature and low-temperature zones, expand the contrast in the medium-temperature zone, and improve the distinguishability of weak thermal anomaly areas.
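A small numerical sketch shows the effect of an S-shaped transfer function of the standard logistic form. The center temperature of 40°C and the coefficient 0.15 are illustrative assumptions; the comparison uses two equal 2°C steps, one near the center and one in the high-temperature tail.

```python
import numpy as np

def sigmoid_compress(temps, center, k):
    """Logistic transfer function mapping temperatures to (0, 1)."""
    return 1.0 / (1.0 + np.exp(-k * (temps - center)))

temps = np.array([20.0, 39.0, 41.0, 90.0, 92.0])
mapped = sigmoid_compress(temps, center=40.0, k=0.15)

mid_step = mapped[2] - mapped[1]     # 2 degree step around the center temperature
high_step = mapped[4] - mapped[3]    # 2 degree step in the high-temperature zone
print(f"mid-zone step: {mid_step:.4f}, high-zone step: {high_step:.6f}")
```

The same 2°C difference occupies far more of the output range near the center than in the tail, which is how weak anomalies in the mid-temperature zone gain contrast while hot regions are compressed.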
[0099] (ii) Histogram specification unit. The target probability density distribution is obtained from the sigmoid transfer function. Temperature remapping is achieved by matching cumulative distribution functions, giving the contrast-extended thermal radiation feature map. Its technical significance is to preserve the overall temperature ordering, prevent extreme values from occupying the display dynamic range, and improve the visibility of low-contrast areas.
[0100] (iii) High-frequency detail extraction unit. To enhance the edges of abrupt temperature changes, the Laplacian sharpening operator is employed.
[0101] First, Gaussian blur is applied to the contrast-extended thermal radiation feature map $F_{ce}$: $B = g_{\sigma} * F_{ce}$. Then high-frequency detail is extracted: $D = F_{ce} - B$, where $D$ is the high-frequency detail layer, representing the edges of abrupt temperature changes and the boundaries of small targets. Its technical role is to preserve the optimized scale structure produced by the cross-scale feature fusion module and to extract the edges of small cracks and local hot spots.
[0102] (iv) Unsharp mask enhancement unit. First, the enhancement mask is constructed by setting the detail enhancement gain coefficient $\alpha$; then the detail layer is superimposed:
$$F_{enh} = F_{ce} + \alpha \, D$$
The technical effect is to strengthen the edge gradient of small targets, improve the clarity of thermal anomaly contours, and maintain the structural stability of the main area, while avoiding over-sharpening and artifacts.
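Unsharp masking as described here (blur, subtract to obtain the detail layer, add the scaled detail back) can be sketched as follows. The 5-tap binomial kernel stands in for the Gaussian blur and the gain of 1.5 is an arbitrary illustrative value.

```python
import numpy as np

KERNEL = np.array([1.0, 4.0, 6.0, 4.0, 1.0]) / 16.0   # binomial approximation of a Gaussian

def gaussian_blur(img):
    """Separable 5-tap blur with edge padding."""
    h, w = img.shape
    pad = np.pad(img, 2, mode="edge")
    rows = sum(KERNEL[i] * pad[:, i:i + w] for i in range(5))
    return sum(KERNEL[i] * rows[i:i + h, :] for i in range(5))

def unsharp_mask(img, gain=1.5):
    """Return the enhanced image and the extracted high-frequency detail layer."""
    img = img.astype(np.float64)
    detail = img - gaussian_blur(img)
    return img + gain * detail, detail

image = np.zeros((16, 32))
image[:, 16:] = 1.0                      # unit step standing in for a thermal edge
enhanced, detail = unsharp_mask(image)
edge_before = image[8, 16] - image[8, 15]
edge_after = enhanced[8, 16] - enhanced[8, 15]
```

In flat regions the blurred image equals the input, so the detail layer is zero and nothing changes; only pixels near the edge are pushed apart, which steepens the edge gradient.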
[0103] (v) Temperature quantization and color mapping unit. The floating-point temperature data are mapped to $[0, 255]$ and converted to 8-bit unsigned integer format; color mapping is then applied according to a preset lookup table (such as Ironbow or Rainbow), and the pseudo-color enhanced infrared thermal radiation image is output. Its technical significance lies in improving the efficiency of manual inspection and interpretation, enhancing the visual contrast of abnormal areas, and ensuring consistent input for algorithm analysis.
[0104] Specifically, the multi-scale anomaly analysis module includes: The aforementioned modules perform the following: infrared data standardization and noise suppression, multi-scale thermal radiation feature decomposition, temperature gradient-driven attention enhancement, cross-scale dynamic fusion, contrast enhancement and detail reconstruction, and output an enhanced infrared thermal radiation image. Based on this, this module realizes intelligent analysis of multi-scale temperature anomaly targets for UAV inspection scenarios, simultaneously completing large-scale target semantic segmentation, micro-scale anomaly target detection and localization, temperature measurement calibration, structured inspection result generation, and finally outputting inspection analysis results including temperature measurement values and defect category confidence.
[0105] (i) Shared feature encoding network. The enhanced infrared thermal radiation image is input into a deep convolutional network based on the ResNet-50 architecture. The last three downsampling layers are removed, and the output is a shared feature map. Its technical features include maintaining high spatial resolution (1/8), expanding the receptive field to cover large-scale structures, and providing a unified semantic representation for the segmentation and detection branches.
[0106] (ii) Large-scale target semantic segmentation branch. A first feature pyramid network is constructed with a top-down path aggregation structure that fuses high-level semantic information with low-level spatial details and generates multi-level segmentation feature maps. A 3×3 convolutional layer then outputs the pixel-level category prediction map and the temperature mask. Large-scale temperature anomaly targets include: the transmission tower body, conductor splicing pipes, and the overall structure of insulator strings. The functional significance of this branch lies in accurately extracting the thermal anomaly areas of the entire structure and establishing a local temperature statistical benchmark.
[0107] (III) Small-scale target detection branch A second feature pyramid network is constructed, which adds additional downsampling levels and deformable convolutional kernels to the first feature pyramid network to improve the receptive field coverage for small targets.
[0108] The scale-aware detection head outputs the bounding box coordinates, the defect category confidence level, and the temperature measurement value. Small targets include pin-level connection points, equalizing ring fastening bolts, and individual insulator skirts.
[0109] (iv) Cross-branch feature enhancement unit. The temperature mask is used as an attention weight, which enhances the detection branch's ability to focus on high-temperature areas. In the other direction, the detected small-target bounding box regions are fed back to the segmentation branch to refine the edges of the corresponding regions. This bidirectional information flow improves edge positioning accuracy.
[0110] (v) Temperature measurement calibration unit. To eliminate errors caused by environmental radiation and emissivity, a local temperature reference is established.
[0111] 1. Local reference temperature: calculated from the pixel temperatures within the temperature mask region [formula omitted in source].
2. Normalized temperature difference index [formula omitted in source]: this index removes the influence of the ambient background temperature, provides a temperature difference indicator that is comparable across scenes, and improves the stability of anomaly level determination.
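One possible reading of the two calibration steps is sketched below. The formulas are missing from the text, so the choices here are assumptions: the local reference is taken as the mean pixel temperature inside the mask, and the normalized index is taken as the deviation of the measured value from that reference in units of the in-mask standard deviation. The names `local_reference` and `normalized_anomaly_index` are hypothetical.

```python
from statistics import mean, pstdev


def local_reference(temps, mask):
    """Return (mean, population std-dev) of temperatures where the mask is truthy."""
    inside = [t for t_row, m_row in zip(temps, mask)
              for t, m in zip(t_row, m_row) if m]
    if not inside:
        raise ValueError("temperature mask is empty")
    return mean(inside), pstdev(inside)


def normalized_anomaly_index(t_measured, t_ref, sigma, eps=1e-6):
    """Assumed index: deviation from the local reference in units of sigma."""
    return (t_measured - t_ref) / max(sigma, eps)


# Illustrative temperatures in degrees Celsius; the 80.0 pixel is the small target.
temps = [[30.0, 32.0], [34.0, 80.0]]
mask = [[1, 1], [1, 0]]
t_ref, sigma = local_reference(temps, mask)
index = normalized_anomaly_index(80.0, t_ref, sigma)
```

Because the reference is computed from the same scene, the index does not depend on the absolute background temperature, which is the property the paragraph above attributes to the normalized index.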
[0112] (vi) Inspection analysis result generation unit. The following information is structured and encapsulated:
1. Pixel-level segmentation category map
2. Bounding box coordinates of the small targets
3. Defect category confidence
4. Normalized temperature anomaly index
5. GPS location information
The generated inspection analysis results comprise the defect location, the defect type, the severity level (based on... classification), and the recommended handling measures.
[0113] Supported output formats: JSON structured reports, visual overlay charts, and API calls to operation and maintenance systems.
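A JSON structured report of the kind listed above could be assembled roughly as follows. The text does not define a schema, so every key name, the function `build_inspection_report`, and all sample values are hypothetical. The severity field is passed through as given, since the classification rule is not stated in the text.

```python
import json


def build_inspection_report(segmentation_ref, detections, gps):
    """Bundle the analysis outputs into one JSON-serializable dict.

    All key names are hypothetical; the source does not specify a schema.
    """
    return {
        "segmentation_map": segmentation_ref,   # reference to the class map
        "gps": {"lat": gps[0], "lon": gps[1]},
        "defects": [
            {
                "bbox": list(d["bbox"]),        # [x_min, y_min, x_max, y_max]
                "category": d["category"],
                "confidence": d["confidence"],
                "temperature_anomaly_index": d["index"],
                "severity": d.get("severity"),  # rule not given in the source
            }
            for d in detections
        ],
    }


# Made-up sample values, for illustration only.
report = build_inspection_report(
    "frame_0001_classes.png",
    [{"bbox": (120, 88, 132, 97), "category": "bolt_overheat",
      "confidence": 0.91, "index": 3.2}],
    (39.9042, 116.4074),
)
report_json = json.dumps(report, indent=2)
```

The same dict could be posted to an operation and maintenance system's API or used to draw the visual overlay, so keeping it as plain data before serialization is a convenient design choice.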
[0114] Finally, it should be noted that the above descriptions are merely preferred embodiments of the present invention and are not intended to limit the present invention. Although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art can still modify the technical solutions described in the foregoing embodiments or make equivalent substitutions for some of the technical features. Any modifications, equivalent substitutions, improvements, etc., made within the spirit and principles of the present invention should be included within the scope of protection of the claims of the present invention.
Claims
1. An infrared sensing and image enhancement intelligent analysis system for unmanned aerial vehicle (UAV) inspection, characterized in that it comprises:
an infrared acquisition module, which acquires an original infrared thermal radiation image sequence containing multi-scale temperature anomaly targets through an uncooled infrared focal plane array sensor;
an image preprocessing module, which performs non-uniformity correction and detail-preserving noise reduction on the original infrared thermal radiation image sequence to generate a standard infrared thermal radiation image;
a multi-scale feature decomposition module, which inputs the standard infrared thermal radiation image into a Laplacian pyramid structure network to separate a base layer thermal radiation feature map from a detail layer thermal radiation feature map;
an adaptive attention module, which calculates spatial attention weights for the base layer thermal radiation feature map and the detail layer thermal radiation feature map based on the temperature gradient, generating a low-resolution spatial attention weight matrix and a high-resolution spatial attention weight matrix;
a cross-scale feature fusion module, which, based on the spatial attention weight matrices, performs selective enhancement and suppression on the base layer and detail layer thermal radiation feature maps through a learnable scale selection gating mechanism, and outputs a multi-scale optimized thermal radiation feature map;
a contrast enhancement and detail reconstruction module, which performs temperature dynamic range compression and unsharp mask enhancement on the multi-scale optimized thermal radiation feature map to generate an enhanced infrared thermal radiation image; and
a multi-scale anomaly analysis module, which inputs the enhanced infrared thermal radiation image into a deep neural network equipped with a scale-aware detection head, simultaneously completes semantic segmentation of large-scale temperature anomaly targets and localization and identification of small-scale temperature anomaly targets, and outputs inspection analysis results including temperature measurement values and defect category confidences.
2. The infrared sensing and image enhancement intelligent analysis system for UAV inspection according to claim 1, characterized in that the infrared acquisition module comprises:
an uncooled infrared focal plane array sensor unit, used to convert incident infrared thermal radiation signals into raw electrical signals through a microbolometer array, the microbolometer array having a pixel pitch of 17 micrometers, a thermal sensitivity of less than 50 millikelvin, and a frame rate of 30 Hz;
a multispectral filter switching unit, arranged at the optical front end of the uncooled infrared focal plane array sensor unit and used to switch between narrowband filters with different cutoff wavelengths within the long-wave infrared band of 8 micrometers to 14 micrometers according to the scene temperature distribution range in the original infrared thermal radiation image sequence, so as to suppress environmental stray thermal radiation interference in a specific wavelength range;
a sensor housing temperature monitoring unit, comprising a high-precision thermistor array attached to the surface of the metal housing of the uncooled infrared focal plane array sensor unit and used to collect, in real time, temperature drift data of the metal housing under sunlight;
a thermal drift compensation calculation unit, used to receive the temperature drift data, perform pixel-by-pixel temperature drift compensation on the raw electrical signals, and generate a thermal-drift-corrected infrared thermal radiation electrical signal; and
an analog-to-digital conversion and image formatting unit, used to convert the thermal-drift-corrected infrared thermal radiation electrical signal into a 14-bit digital signal and encapsulate it, together with timestamp and location information, into the original infrared thermal radiation image sequence.
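As an illustration of the pixel-by-pixel drift compensation and 14-bit conversion recited in claim 2, the sketch below assumes a linear drift model in which each pixel has its own coefficient relating signal offset to housing temperature. The claim does not specify the compensation model, so the linear form, the coefficient table, the calibration temperature, and the function names are all assumptions.

```python
def compensate_drift(raw, drift_coeff, housing_temp, calib_temp):
    """Subtract an assumed per-pixel linear drift term from a 2D raw signal."""
    delta = housing_temp - calib_temp
    return [[value - k * delta for value, k in zip(r_row, k_row)]
            for r_row, k_row in zip(raw, drift_coeff)]


def quantize_14bit(signal, full_scale):
    """Map values in [0, full_scale] to integer codes 0..16383, clipping outliers."""
    max_code = (1 << 14) - 1
    return [[min(max_code, max(0, round(v / full_scale * max_code)))
             for v in row]
            for row in signal]


# Illustrative numbers only.
raw = [[1.00, 1.20], [0.80, 2.50]]      # volts
coeff = [[0.01, 0.02], [0.00, 0.01]]    # volts per kelvin (hypothetical)
corrected = compensate_drift(raw, coeff, housing_temp=45.0, calib_temp=25.0)
codes = quantize_14bit(corrected, full_scale=2.0)
```

In this example the first two pixels report different raw values only because of their different drift coefficients; after compensation they map to the same 14-bit code, while the out-of-range pixel is clipped to the maximum code.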
3. The infrared sensing and image enhancement intelligent analysis system for UAV inspection according to claim 1, characterized in that the image preprocessing module comprises:
a non-uniformity correction unit, used to suppress fixed pattern noise in the original infrared thermal radiation image sequence;
a temperature-guided filtering unit, used to perform joint bilateral filtering with the non-uniformity-corrected output of the original infrared thermal radiation image sequence as the guide image and the original infrared thermal radiation image sequence as the input image, the weight kernel of the joint bilateral filtering being jointly determined by the temperature spatial gradient of the guide image and the pixel spatial distance of the input image, so as to preserve the sharpness of temperature change edges while suppressing random noise;
a bad pixel detection and replacement unit, used to identify overheated and undercooled pixels in the uncooled infrared focal plane array sensor unit and to replace each of them by interpolation based on the median temperature of the valid pixels in its neighborhood; and
an image normalization and format conversion unit, used to map the image data processed by the bad pixel detection and replacement unit to a 16-bit unsigned integer dynamic range, and to attach the UAV platform attitude information, GPS coordinate information, and acquisition timestamp information corresponding to the original infrared thermal radiation image sequence to generate the standard infrared thermal radiation image.
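The bad pixel replacement step of claim 3 can be sketched as below. The claim states that replacement uses the median of the valid neighboring pixels but does not state how bad pixels are detected, so the detection rule here (deviation from the 8-neighborhood median above a fixed threshold) is an assumption, as is the function name `replace_bad_pixels`.

```python
from statistics import median


def replace_bad_pixels(img, threshold):
    """Flag pixels far from their neighbourhood median, then replace each
    flagged pixel with the median of its unflagged neighbours."""
    h, w = len(img), len(img[0])

    def neighbours(y, x):
        return [(j, i)
                for j in range(max(0, y - 1), min(h, y + 2))
                for i in range(max(0, x - 1), min(w, x + 2))
                if (j, i) != (y, x)]

    # Detection pass (assumed rule): absolute deviation above the threshold.
    bad = set()
    for y in range(h):
        for x in range(w):
            local = median(img[j][i] for j, i in neighbours(y, x))
            if abs(img[y][x] - local) > threshold:
                bad.add((y, x))

    # Replacement pass: only valid (unflagged) neighbours contribute.
    out = [row[:] for row in img]
    for y, x in bad:
        valid = [img[j][i] for j, i in neighbours(y, x) if (j, i) not in bad]
        if valid:
            out[y][x] = median(valid)
    return out, bad


# Illustrative frame with one overheated (90.0) and one undercooled (-5.0) pixel.
frame = [[20.0, 20.0, 20.0, 20.0],
         [20.0, 90.0, 20.0, 20.0],
         [20.0, 20.0, 20.0, -5.0]]
cleaned, bad_pixels = replace_bad_pixels(frame, threshold=10.0)
```

Using the median rather than the mean keeps a single remaining outlier in the neighborhood from leaking into the replacement value.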
4. The infrared sensing and image enhancement intelligent analysis system for UAV inspection according to claim 1, characterized in that the multi-scale feature decomposition module comprises:
a Gaussian pyramid construction unit, used to perform Gaussian blurring and downsampling operations on the standard infrared thermal radiation image to construct a Gaussian pyramid with five scale levels;
a Laplacian pyramid generation unit, used to generate a Laplacian pyramid with four detail levels by performing upsampling and difference operations on adjacent levels of the Gaussian pyramid, each detail level of the Laplacian pyramid representing the spatial temperature variation information of the standard infrared thermal radiation image in a specific frequency band;
a base layer thermal radiation feature extraction unit, used to input the low-frequency approximation image at the top level of the Gaussian pyramid into a first convolutional neural network branch and to extract the base layer thermal radiation feature map through three serial convolutional layers with 5×5 convolution kernels and ReLU activation functions, the spatial resolution of the base layer thermal radiation feature map being 1/16 of that of the standard infrared thermal radiation image and its channel dimension being 256;
a detail layer thermal radiation feature extraction unit, used to input the detail levels of the Laplacian pyramid into four parallel sub-networks of a second convolutional neural network branch, each parallel sub-network comprising two serial convolutional layers with 3×3 convolution kernels and a ReLU activation function, so as to extract four detail layer thermal radiation feature sub-maps and fuse them into the detail layer thermal radiation feature map through channel concatenation, the spatial resolution of the detail layer thermal radiation feature map being consistent with that of the standard infrared thermal radiation image and its channel dimension being 128; and
a feature map dimension alignment unit, used to compress the channel dimension of the base layer thermal radiation feature map to 128 through a convolution operation with a 1×1 convolution kernel, and to restore the spatial resolution of the base layer thermal radiation feature map to that of the detail layer thermal radiation feature map through bilinear interpolation upsampling, so as to support parallel processing in the subsequent adaptive attention module.
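The pyramid construction of claim 4 (five Gaussian levels, four Laplacian detail levels) can be illustrated with the simplified sketch below. To stay dependency-free it swaps in a 2×2 box average for the Gaussian blur and nearest-neighbour replication for the upsampling; both substitutions, and all function names, are assumptions made for illustration. The convolutional feature extraction branches are not modelled.

```python
def downsample(img):
    """Halve the resolution by averaging 2 x 2 blocks (box-filter stand-in
    for Gaussian blurring followed by decimation)."""
    return [[(img[y][x] + img[y][x + 1] + img[y + 1][x] + img[y + 1][x + 1]) / 4.0
             for x in range(0, len(img[0]), 2)]
            for y in range(0, len(img), 2)]


def upsample(img):
    """Double the resolution by nearest-neighbour replication."""
    out = []
    for row in img:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(wide[:])
    return out


def build_pyramids(img, levels=5):
    """Return (gaussian, laplacian): `levels` Gaussian levels and
    `levels - 1` Laplacian detail levels."""
    gaussian = [img]
    for _ in range(levels - 1):
        gaussian.append(downsample(gaussian[-1]))
    laplacian = []
    for fine, coarse in zip(gaussian[:-1], gaussian[1:]):
        up = upsample(coarse)
        laplacian.append([[f - u for f, u in zip(f_row, u_row)]
                          for f_row, u_row in zip(fine, up)])
    return gaussian, laplacian


def reconstruct(base, laplacian):
    """Invert the decomposition: upsample and add details, coarse to fine."""
    img = base
    for detail in reversed(laplacian):
        up = upsample(img)
        img = [[u + d for u, d in zip(u_row, d_row)]
               for u_row, d_row in zip(up, detail)]
    return img


# Synthetic 16 x 16 test image (side length must be divisible by 16).
image = [[float((x * y) % 7) for x in range(16)] for y in range(16)]
gaussian, laplacian = build_pyramids(image)
restored = reconstruct(gaussian[-1], laplacian)
```

The top Gaussian level has 1/16 of the input side length, matching the base layer resolution stated in the claim, and adding the detail levels back reproduces the input, which shows that the decomposition separates low-frequency and high-frequency content without discarding information.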
5. The infrared sensing and image enhancement intelligent analysis system for UAV inspection according to claim 1, characterized in that the adaptive attention module comprises:
a temperature gradient calculation unit, used to calculate the first-order partial derivatives of the base layer thermal radiation feature map and of the detail layer thermal radiation feature map in the horizontal and vertical directions, and to combine them into gradient magnitudes to generate a base layer temperature gradient magnitude map and a detail layer temperature gradient magnitude map;
a global temperature statistical feature extraction unit, used to perform global average pooling and global max pooling on the base layer thermal radiation feature map to generate a base layer global average temperature vector and a base layer global maximum temperature vector, respectively, and to perform the same operations on the detail layer thermal radiation feature map to generate a detail layer global average temperature vector and a detail layer global maximum temperature vector;
a spatial attention weight generation unit, used to concatenate the base layer temperature gradient magnitude map with the base layer global average temperature vector and the base layer global maximum temperature vector and input the result into a first multilayer perceptron network to output the low-resolution spatial attention weight matrix, and to concatenate the detail layer temperature gradient magnitude map with the detail layer global average temperature vector and the detail layer global maximum temperature vector and input the result into a second multilayer perceptron network to output the high-resolution spatial attention weight matrix, wherein the first and second multilayer perceptron networks each comprise two fully connected layers and a sigmoid activation function, and each element of the low-resolution and high-resolution spatial attention weight matrices lies in the range 0 to 1; and
a temperature-sensitive region enhancement constraint unit, used to perform hard-threshold truncation on the low-resolution spatial attention weight matrix and the high-resolution spatial attention weight matrix according to a preset temperature anomaly determination threshold, setting weight elements below the temperature anomaly determination threshold to zero, so as to suppress the attention response of background regions and enhance the feature saliency of temperature anomaly regions.
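The gradient magnitude calculation and the hard-threshold truncation of claim 5 might look like the sketch below. The learned multilayer perceptron is replaced here by a fixed affine map followed by a sigmoid, and the global pooling vectors are left out; those simplifications, the parameters `scale` and `bias`, and the function names are assumptions for illustration only.

```python
import math


def gradient_magnitude(fmap):
    """Central-difference gradient magnitude of a 2D map; border pixels
    reuse their own value in place of the missing neighbour."""
    h, w = len(fmap), len(fmap[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            gx = (fmap[y][min(x + 1, w - 1)] - fmap[y][max(x - 1, 0)]) / 2.0
            gy = (fmap[min(y + 1, h - 1)][x] - fmap[max(y - 1, 0)][x]) / 2.0
            out[y][x] = math.hypot(gx, gy)
    return out


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def attention_weights(fmap, scale, bias, threshold):
    """Sigmoid of an affine function of the gradient magnitude (stand-in for
    the learned MLP), followed by hard-threshold truncation to zero."""
    grad = gradient_magnitude(fmap)
    soft = [[sigmoid(scale * g + bias) for g in row] for row in grad]
    return [[w if w >= threshold else 0.0 for w in row] for row in soft]


# Illustrative 4 x 4 map with a single hot pixel at row 1, column 2.
fmap = [[20.0, 20.0, 20.0, 20.0],
        [20.0, 20.0, 60.0, 20.0],
        [20.0, 20.0, 20.0, 20.0],
        [20.0, 20.0, 20.0, 20.0]]
weights = attention_weights(fmap, scale=0.5, bias=-5.0, threshold=0.5)
```

With central differences the strongest responses fall on the pixels bordering the hot spot, while flat background pixels are truncated to exactly zero by the threshold.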
6. The infrared sensing and image enhancement intelligent analysis system for UAV inspection according to claim 1, characterized in that the cross-scale feature fusion module comprises:
a scale selection gating mechanism construction unit, used to construct a learnable scale selection gating mechanism comprising a first gating branch and a second gating branch, wherein the first gating branch receives the element-wise product of the base layer thermal radiation feature map and the low-resolution spatial attention weight matrix, the second gating branch receives the element-wise product of the detail layer thermal radiation feature map and the high-resolution spatial attention weight matrix, and each gating branch comprises a convolutional layer with a 3×3 convolution kernel and a Tanh activation function;
a gating coefficient prediction unit, used to concatenate the output feature map of the first gating branch and the output feature map of the second gating branch along the channel dimension and input them into a gating coefficient prediction network, the gating coefficient prediction network comprising two serial convolutional layers with 1×1 convolution kernels and a Softmax activation function and outputting a base layer gating coefficient matrix and a detail layer gating coefficient matrix, wherein corresponding elements of the base layer gating coefficient matrix and the detail layer gating coefficient matrix sum to 1 in each channel;
an adaptive feature weighting fusion unit, used to multiply, channel by channel and element by element, the base layer gating coefficient matrix with the base layer thermal radiation feature map weighted by the low-resolution spatial attention weight matrix to generate a scale-adaptive base layer feature map, and to multiply, channel by channel and element by element, the detail layer gating coefficient matrix with the detail layer thermal radiation feature map weighted by the high-resolution spatial attention weight matrix to generate a scale-adaptive detail layer feature map; and
a multi-scale optimized feature synthesis unit, used to add the scale-adaptive base layer feature map and the scale-adaptive detail layer feature map element by element and to refine the result through a residual convolutional layer with a 3×3 convolution kernel to output the multi-scale optimized thermal radiation feature map, the input and output of the residual convolutional layer being added through a skip connection.
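The gated fusion of claim 6 can be illustrated as below for a single channel. In the claimed system the gating logits come from a learned prediction network; here they are simply passed in as inputs, which is an assumption, as are the function names. The final residual convolutional refinement is omitted.

```python
import math


def softmax_pair(a, b):
    """Numerically stable two-way softmax; the two outputs sum to 1."""
    m = max(a, b)
    ea, eb = math.exp(a - m), math.exp(b - m)
    return ea / (ea + eb), eb / (ea + eb)


def gated_fusion(base, detail, base_logits, detail_logits):
    """Fuse two attention-weighted 2D maps using per-element softmax gates."""
    fused, gates = [], []
    for b_row, d_row, lb_row, ld_row in zip(base, detail, base_logits, detail_logits):
        f_row, g_row = [], []
        for b, d, lb, ld in zip(b_row, d_row, lb_row, ld_row):
            g_base, g_detail = softmax_pair(lb, ld)
            f_row.append(g_base * b + g_detail * d)
            g_row.append((g_base, g_detail))
        fused.append(f_row)
        gates.append(g_row)
    return fused, gates


# Illustrative 1 x 2 maps; the logits are made-up stand-ins for network outputs.
base = [[10.0, 10.0]]
detail = [[2.0, 8.0]]
base_logits = [[0.0, 0.0]]
detail_logits = [[0.0, 2.0]]
fused, gates = gated_fusion(base, detail, base_logits, detail_logits)
```

Because the two gates at each position sum to 1, the fused value is a convex combination of the base and detail responses: equal logits give their average, and a larger detail logit pulls the result toward the detail layer.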
7. The infrared sensing and image enhancement intelligent analysis system for UAV inspection according to claim 1, characterized in that the contrast enhancement and detail reconstruction module comprises:
a temperature dynamic range analysis unit, used to calculate the maximum and minimum pixel temperatures of the multi-scale optimized thermal radiation feature map, determine the original temperature dynamic range, and construct, based on a preset target display temperature dynamic range, a transfer function that nonlinearly maps the original temperature dynamic range to the target display temperature dynamic range, the transfer function using an S-curve to compress the contrast of the high-temperature and low-temperature regions and expand the contrast of the medium-temperature region;
a histogram specification calculation unit, used to calculate a target histogram distribution according to the transfer function and to map the actual temperature histogram of the multi-scale optimized thermal radiation feature map to the target histogram distribution through a histogram specification algorithm to generate a contrast-expanded thermal radiation feature map;
a high-frequency detail extraction unit, used to input the contrast-expanded thermal radiation feature map into a Laplacian sharpening operator and to extract the high-frequency detail layer corresponding to temperature change edges by calculating the difference between the contrast-expanded thermal radiation feature map and its Gaussian-blurred version;
an unsharp mask enhancement unit, used to multiply the high-frequency detail layer element by element with a preset detail enhancement gain coefficient to generate a detail enhancement mask, and to add the detail enhancement mask element by element to the contrast-expanded thermal radiation feature map, so as to strengthen the gradient changes at the edges of small targets and generate the enhanced infrared thermal radiation image; and
a temperature quantization and color mapping unit, used to quantize the floating-point temperature data of the enhanced infrared thermal radiation image into 8-bit unsigned integers and to perform temperature-to-color mapping according to a preset pseudo-color lookup table, outputting a pseudo-color enhanced infrared thermal radiation image suitable for manual interpretation and subsequent algorithm analysis.
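The unsharp mask enhancement and 8-bit quantization steps of claim 7 can be sketched as follows. The sketch uses a 3×3 binomial kernel as a small Gaussian approximation and linear quantization between fixed temperature limits; the kernel size, border handling, gain value, and function names are assumptions, and the S-curve transfer function, histogram specification, and pseudo-color lookup are not modelled.

```python
KERNEL = [[1, 2, 1], [2, 4, 2], [1, 2, 1]]  # 3 x 3 binomial kernel, weights sum to 16


def blur(img):
    """Approximate Gaussian blur with the binomial kernel (borders replicated)."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy = min(max(y + dy, 0), h - 1)
                    xx = min(max(x + dx, 0), w - 1)
                    acc += KERNEL[dy + 1][dx + 1] * img[yy][xx]
            out[y][x] = acc / 16.0
    return out


def unsharp_mask(img, gain):
    """enhanced = img + gain * (img - blur(img))"""
    blurred = blur(img)
    return [[v + gain * (v - b) for v, b in zip(row, b_row)]
            for row, b_row in zip(img, blurred)]


def quantize_8bit(img, t_min, t_max):
    """Linearly map [t_min, t_max] to integer codes 0..255, clipping outliers."""
    span = t_max - t_min
    return [[min(255, max(0, round((v - t_min) / span * 255))) for v in row]
            for row in img]


# Illustrative 3 x 3 patch: a 40-degree small target on a 30-degree background.
patch = [[30.0, 30.0, 30.0],
         [30.0, 40.0, 30.0],
         [30.0, 30.0, 30.0]]
enhanced = unsharp_mask(patch, gain=1.0)
codes = quantize_8bit(enhanced, t_min=20.0, t_max=60.0)
```

The target pixel is raised from 40.0 to 47.5 and its neighbors are pushed slightly below the background level, so the local contrast around the small target increases before quantization.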
8. The infrared sensing and image enhancement intelligent analysis system for UAV inspection according to claim 1, characterized in that the multi-scale anomaly analysis module comprises:
a shared feature encoding network, used to input the enhanced infrared thermal radiation image into a deep convolutional neural network based on the ResNet-50 architecture with the last three downsampling layers removed and to extract a shared feature map with a multi-scale receptive field, the spatial resolution of the shared feature map being 1/8 of that of the enhanced infrared thermal radiation image and its channel dimension being 2048;
a large-scale target semantic segmentation branch, used to input the shared feature map into a first feature pyramid network, the first feature pyramid network generating, through top-down path aggregation and lateral connections, multi-level segmentation feature maps that integrate high-level semantic information and low-level spatial information, and outputting the pixel-level category prediction and the temperature mask of the large-scale temperature anomaly targets through a semantic segmentation head with 3×3 convolution kernels, the large-scale temperature anomaly targets including the transmission tower body, conductor splicing pipes, and the overall structure of insulator strings;
a small-scale target detection branch, used to input the shared feature map into a second feature pyramid network, the second feature pyramid network adding an additional downsampling level with deformable convolution kernels to the first feature pyramid network to expand the receptive field coverage for small targets, and outputting the bounding box coordinates, defect category confidence, and temperature measurement value of the small-scale temperature anomaly targets through a target detection network with a scale-aware detection head, the small-scale temperature anomaly targets including pin-level connection points, equalizing ring fastening bolts, and individual insulator sheds;
a cross-branch feature enhancement unit, used to apply the temperature mask output by the large-scale target semantic segmentation branch as an attention weight for spatial weighting of the shared feature map, thereby strengthening the focus of the small-scale target detection branch on temperature anomaly regions, and to feed back the bounding box coordinates of the small-scale temperature anomaly targets detected by the small-scale target detection branch to the large-scale target semantic segmentation branch, so as to improve the edge localization accuracy of the large-scale temperature anomaly targets;
a temperature measurement calibration unit, used to establish a local temperature reference based on the pixel temperature statistics within the temperature mask output by the large-scale target semantic segmentation branch, to perform a relative temperature difference calculation on the temperature measurement values output by the small-scale target detection branch, and to generate a normalized temperature anomaly index, the normalized temperature anomaly index being used to eliminate systematic measurement errors introduced by environmental radiation and emissivity settings; and
an inspection analysis result generation unit, used to structurally encapsulate the pixel-level category prediction of the large-scale temperature anomaly targets, the bounding box coordinates and defect category confidence of the small-scale temperature anomaly targets, the normalized temperature anomaly index, and the corresponding GPS coordinate information, and to output the inspection analysis results.