A visible light image temperature detection method, system, device and medium

By combining atmospheric scattering models and machine learning, the problem of decreased accuracy in visible light image temperature measurement under complex weather conditions was solved, enabling high-precision temperature detection of transformer equipment.

CN122244450A · Pending · Publication Date: 2026-06-19 · ELECTRIC POWER RES INST OF GUANGDONG POWER GRID CO LTD

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Application (China)
Current Assignee / Owner: ELECTRIC POWER RES INST OF GUANGDONG POWER GRID CO LTD
Filing Date: 2026-04-15
Publication Date: 2026-06-19

Abstract

This application discloses a visible light image temperature detection method, system, device, and medium, belonging to the field of image processing and temperature detection technology. The method includes: acquiring a visible light image; performing image restoration processing on the visible light image to obtain a restored image; inputting the restored image into a pre-trained semantic segmentation model to obtain image regions corresponding to each transformer device; extracting the color distribution statistical features of the red, green, and blue channels within the image region corresponding to each transformer device; and inputting the color distribution statistical features into a pre-trained machine learning model to obtain the predicted temperature of the transformer device. By implementing this application, the technical problem of decreased accuracy of visible light image temperature measurement under complex weather conditions can be solved, effectively improving the accuracy of temperature measurement of multiple transformer devices in harsh environments such as fog, rain, and sandstorms.

Description

Technical Field

[0001] This application relates to the fields of image processing and temperature detection technology, and in particular to a method, system, device and medium for temperature detection of visible light images.

Background Technology

[0002] In the operation and maintenance of power systems, temperature status monitoring of critical equipment such as transformers is an important means of preventing equipment failures and ensuring the safe operation of the power grid. Abnormal temperature rises caused by poor contact, overload, or internal faults are often early signs of impending failures. Therefore, high-frequency, full-coverage real-time monitoring of equipment temperature has significant engineering value.

[0003] Currently, temperature detection methods in industrial settings mainly include manual inspection, infrared thermal imaging, and visible light image analysis. Manual inspection relies on handheld infrared thermometers or thermocouples for contact measurement. This method is inefficient, difficult to implement for large-area continuous monitoring, and cannot be carried out normally in adverse weather conditions such as rain, snow, and fog. Measurement data is also easily affected by ambient temperature and humidity. Although infrared thermal imaging can achieve non-contact temperature measurement, in real-world scenarios such as blast furnaces and substations, dynamic dust can scatter infrared radiation, leading to measurement errors. The high reflectivity of metal surfaces is easily affected by sunlight and raindrop reflections, causing false high-temperature readings. Furthermore, high-precision infrared equipment is expensive and requires regular calibration, making large-scale deployment difficult. The basic principle of visible light image-based temperature measurement methods is to utilize the subtle color or brightness changes exhibited by the device at different temperatures, extract image features, and establish a mapping relationship between these features and temperature to achieve temperature measurement. However, existing visible light image temperature measurement methods have limitations when applied to complex outdoor environments: under adverse weather conditions such as rain, fog, haze, and sandstorms, images exhibit degradation phenomena such as decreased contrast, blurred details, and color distortion, resulting in a reduced image signal-to-noise ratio and the disruption of the stable correspondence between color features and temperature. 
At the same time, substation scenarios typically contain multiple transformer devices, and existing methods struggle to accurately segment multiple targets simultaneously under image degradation conditions, leading to extracted features being mixed with noise from non-target areas, further reducing temperature measurement accuracy.

[0004] The main drawback of existing technology is therefore that image degradation caused by complex weather severely disrupts the stable correspondence between image color features and device temperature, making visible light image temperature measurement inaccurate in harsh environments such as fog, rain, and sandstorms. There is thus an urgent need for a solution that adapts to complex weather changes and accurately extracts the temperature features of multiple targets under image degradation conditions, so as to achieve high-precision temperature measurement.

Summary of the Invention

[0005] This application provides a visible light image temperature detection method, system, device and medium, which aims to solve the technical problem of image degradation and reduced temperature measurement accuracy caused by complex weather such as fog, rain and sandstorms in the prior art, and realize accurate temperature detection of transformer equipment under complex weather conditions.

[0006] In a first aspect, this application provides a visible light image temperature detection method, comprising: acquiring a visible light image; performing image restoration processing based on an atmospheric scattering model on the visible light image to obtain a restored image; inputting the restored image into a pre-trained semantic segmentation model to obtain the image region corresponding to each transformer device in the restored image; for each transformer device, extracting the color distribution statistical features of the red, green, and blue channels within the corresponding image region; and inputting the color distribution statistical features into a pre-trained machine learning model to obtain the predicted temperature of the transformer device.

[0007] This application utilizes an atmospheric scattering model-based image restoration process to effectively remove image blurring and color shifts caused by weather factors such as haze and dust storms, restoring the true color distribution of the image and providing accurate input data for subsequent temperature prediction. A semantic segmentation model is used to perform pixel-level segmentation of the restored image, accurately locating each transformer equipment area, eliminating background interference, and ensuring that subsequent feature extraction targets only the target equipment, enabling simultaneous detection of multiple targets. Statistical features (such as mean and standard deviation) of various color distributions in the red, green, and blue channels are extracted from each equipment area to construct a chromaticity feature vector that reflects temperature changes, quantifying subtle color differences. A machine learning model is used to establish a mapping relationship between color features and temperature, outputting accurate temperature prediction values. Compared to existing technologies that directly measure the temperature of the entire image or rely on simple color mean values, this application can effectively suppress external interference under complex weather conditions, achieving high-precision temperature measurement of multiple transformer equipment and improving the accuracy of temperature measurement in complex environments.

[0008] Further, performing image restoration processing based on an atmospheric scattering model on the visible light image to obtain the restored image specifically comprises: performing dark channel extraction on the visible light image to obtain a dark channel image; determining the atmospheric light value based on the dark channel image; and, based on the atmospheric light value and the atmospheric scattering model, performing a restoration calculation on the visible light image to obtain the restored image.

[0009] This application utilizes the dark channel prior principle to accurately quantify the intensity of ambient light interference, and then restores the true color and details of the image through physical inversion of an atmospheric scattering model, effectively eliminating image degradation caused by weather factors such as haze and dust storms. The physical model-based restoration method can realistically restore the original appearance of the image, avoiding color distortion caused by traditional enhancement methods. This provides high-quality input images for subsequent semantic segmentation, feature extraction, and temperature prediction, ultimately improving the accuracy and reliability of device temperature detection under complex weather conditions.

[0010] Further, for each transformer device, extracting the color distribution statistical features of the red, green, and blue channels within the corresponding image region comprises: for each transformer device, obtaining the grayscale value matrices of its image region in the red, green, and blue channels respectively; calculating, from the grayscale value matrix of each color channel, the grayscale distribution statistics of that channel, wherein the grayscale distribution statistics include the mean, standard deviation, kurtosis, skewness, mode, median, peak value, upper quantile, and lower quantile; and combining the grayscale distribution statistics of the three color channels to obtain the chromaticity feature vector corresponding to the image region.

[0011] This application, when predicting the temperature of each transformer equipment image area, extracts the grayscale value matrices of the red, green, and blue channels, and calculates nine grayscale distribution statistics for each channel: mean, standard deviation, kurtosis, skewness, mode, median, peak value, upper quantile, and lower quantile, constructing a 27-dimensional chromaticity feature vector. This multi-dimensional color feature extraction method can comprehensively quantify the subtle color differences caused by temperature changes: the mean reflects the overall brightness level, the standard deviation characterizes the dispersion of the color distribution, kurtosis and skewness describe the morphological characteristics of the grayscale distribution curve, the mode and median provide robust estimates of central tendency, the peak value reflects the most frequently occurring grayscale level, and the upper and lower quantiles characterize the tail features of the grayscale distribution. By combining these multiple statistics, the subtle color shifts and changes in distribution morphology caused by temperature changes are transformed into quantifiable mathematical features, providing rich and discriminative input data for subsequent machine learning models to establish accurate "color-temperature" mapping relationships, thereby improving the accuracy and stability of temperature prediction.

[0012] Further, inputting the color distribution statistical features into a pre-trained machine learning model to obtain the predicted temperature of the transformer device comprises: for the image region corresponding to each transformer device, standardizing the color distribution statistical features to obtain a standardized feature vector; and inputting each standardized feature vector into the pre-trained machine learning model, which maps the standardized feature vector to the predicted temperature of the transformer device.

[0013] By standardizing the extracted chromaticity features, the imbalance in feature scale caused by differences in units and numerical ranges among different statistics is eliminated. This ensures that features of each dimension have equal contribution weights in the subsequent model, preventing large numerical features from dominating model training while small numerical features are ignored. The standardized feature vectors are then input into a pre-trained machine learning model, which calculates continuous temperature predictions based on the mapping relationship. The combination of standardization and the machine learning model establishes a stable mapping relationship from color features to temperature. The machine learning model then maps points in the multi-dimensional feature space to one-dimensional temperature values based on the learned "color-temperature" correspondence.
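The standardization step can be illustrated with a minimal sketch. The application does not name a specific standardization formula; the z-score form used here (subtract the per-feature mean, divide by the per-feature standard deviation) is an assumption, and the function name `standardize` and the synthetic feature matrix are hypothetical.

```python
import numpy as np

def standardize(features, mean=None, std=None, eps=1e-8):
    """Z-score each feature column (assumed form of 'standardization').

    Returns the scaled matrix plus the mean/std used, so the same
    statistics can be reused on new samples at prediction time.
    """
    features = np.asarray(features, dtype=np.float64)
    if mean is None:
        mean = features.mean(axis=0)
    if std is None:
        std = features.std(axis=0)
    scaled = (features - mean) / (std + eps)  # eps guards constant columns
    return scaled, mean, std

rng = np.random.default_rng(0)
# Hypothetical data: 50 device regions x 27 statistics with mismatched scales
train = rng.normal(size=(50, 27)) * np.linspace(1, 200, 27) + 100.0
scaled, mu, sigma = standardize(train)
```

After scaling, every column has approximately zero mean and unit variance, so no single statistic dominates the regression merely because of its numerical range.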

[0014] Further, the training of the pre-trained machine learning model comprises: obtaining a training sample set containing visible light images of multiple transformer devices; dividing the training sample set into a training subset and a validation subset using k-fold cross-validation; training a preset regression model on the training subset and validating the trained regression model on the validation subset, with the mean absolute error as the evaluation index; and, when the mean absolute error meets a preset threshold, determining the trained regression model as the machine learning model.

[0015] Thus, this application provides an accurate supervised learning foundation for the model by constructing a training sample set with measured temperature labels; it employs k-fold cross-validation to divide the training and validation subsets, effectively eliminating the randomness of a single dataset partition through repeated validation, thereby improving the model's generalization ability and stability under different weather conditions; and it uses mean absolute error as the evaluation metric to directly optimize model accuracy towards the core objective of the regression task. These training methods ensure that the machine learning model can accurately learn and stably output the mapping relationship between color features and temperature, achieving high-precision and high-stability equipment temperature prediction even under complex weather conditions such as fog, rain, and sandstorms.
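The training procedure can be sketched as follows. This is only an illustration under stated assumptions: the application does not specify the "preset regression model", so ordinary least squares stands in for it; the data are synthetic; and `k_fold_mae`, `MAE_THRESHOLD`, and the reading of "meets a preset threshold" as "less than or equal to" are hypothetical choices.

```python
import numpy as np

def k_fold_mae(X, y, k=5, seed=0):
    """Average validation mean absolute error over k folds."""
    X = np.asarray(X, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    idx = np.random.default_rng(seed).permutation(len(y))
    folds = np.array_split(idx, k)
    maes = []
    for i in range(k):
        val = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        # Stand-in regressor (assumption): least squares with a bias term
        A_train = np.column_stack([X[train], np.ones(len(train))])
        w, *_ = np.linalg.lstsq(A_train, y[train], rcond=None)
        A_val = np.column_stack([X[val], np.ones(len(val))])
        maes.append(np.mean(np.abs(A_val @ w - y[val])))
    return float(np.mean(maes))

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 27))             # synthetic standardized features
true_w = rng.normal(size=27)
y = X @ true_w + 60.0 + rng.normal(scale=0.5, size=200)  # synthetic temperatures

mae = k_fold_mae(X, y, k=5)
MAE_THRESHOLD = 1.0                        # hypothetical acceptance threshold
accepted = mae <= MAE_THRESHOLD
```

Each sample is used for validation exactly once, which is what removes the dependence on a single arbitrary train/validation split.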

[0016] Further, inputting the restored image into a pre-trained semantic segmentation model to obtain the image region corresponding to each transformer device in the restored image comprises: inputting the restored image into the pre-trained semantic segmentation model to obtain the device category label of each pixel in the restored image, thereby obtaining a preliminary segmented image; and performing morphological processing on the preliminary segmented image to obtain the image region corresponding to each transformer device.

[0017] By outputting the device category label corresponding to each pixel through a semantic segmentation model, a preliminary segmented image is obtained, achieving pixel-level classification of different transformer devices in the image. This provides a precise spatial localization foundation for subsequent temperature prediction for different devices. Based on this, morphological processing is further performed on the preliminary segmented image. By eliminating isolated noise and internal holes in the segmentation results, the boundary continuity and internal integrity of the device regions are optimized, providing clean and reliable target input for subsequent color feature extraction and temperature prediction.
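A minimal sketch of the morphological clean-up on one device mask is given below. The application does not state which morphological operators are used or in what order; closing followed by opening with a 3×3 square structuring element is an assumption, and the helper names are hypothetical. Pixels outside the image are treated as background, so regions touching the border would be slightly eroded in this simplified version.

```python
import numpy as np

def _shifted_stack(mask, radius):
    """Stack all shifts of the mask covered by a (2r+1)x(2r+1) square."""
    padded = np.pad(mask, radius, mode="constant", constant_values=False)
    h, w = mask.shape
    size = 2 * radius + 1
    return np.stack([padded[dy:dy + h, dx:dx + w]
                     for dy in range(size) for dx in range(size)])

def erode(mask, radius=1):
    return _shifted_stack(mask, radius).all(axis=0)

def dilate(mask, radius=1):
    return _shifted_stack(mask, radius).any(axis=0)

def clean_mask(mask, radius=1):
    """Fill small holes (closing), then drop isolated specks (opening)."""
    mask = np.asarray(mask, dtype=bool)
    closed = erode(dilate(mask, radius), radius)
    return dilate(erode(closed, radius), radius)

# Hypothetical preliminary mask: one device block with a hole and a stray pixel
mask = np.zeros((16, 16), dtype=bool)
mask[4:12, 4:12] = True
mask[7, 7] = False      # internal hole
mask[1, 14] = True      # isolated noise
cleaned = clean_mask(mask)
```

In this example the hole is filled and the stray pixel is removed, leaving only the solid device block for feature extraction.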

[0018] Further, before inputting the restored image into the pre-trained semantic segmentation model, the method further comprises: resizing the restored image to a preset size.

[0019] Resizing the restored image to a preset fixed size reconciles the fixed input size required by the semantic segmentation model with the varying sizes of the acquired images. This ensures that all input images conform to the input specifications of the segmentation model and prevents processing failures or errors caused by size mismatch.
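The size normalization can be sketched with nearest-neighbor resampling. The interpolation method and the 512×512 target are assumptions chosen for illustration; the application only states that a preset size is used, and `resize_nearest` is a hypothetical helper.

```python
import numpy as np

def resize_nearest(image, out_h, out_w):
    """Resize an HxW or HxWxC array by nearest-neighbor sampling."""
    image = np.asarray(image)
    in_h, in_w = image.shape[:2]
    rows = np.minimum((np.arange(out_h) * in_h / out_h).astype(int), in_h - 1)
    cols = np.minimum((np.arange(out_w) * in_w / out_w).astype(int), in_w - 1)
    return image[rows[:, None], cols[None, :]]

restored = np.zeros((108, 192, 3), dtype=np.uint8)  # stand-in restored image
resized = resize_nearest(restored, 512, 512)         # hypothetical preset size
tiny = resize_nearest(np.array([[1, 2], [3, 4]]), 4, 4)
```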

[0020] In a second aspect, this application provides a visible light image temperature detection system, comprising an image acquisition module, a target segmentation module, and a temperature prediction module. The image acquisition module is used to acquire a visible light image. The target segmentation module is used to perform image restoration processing based on an atmospheric scattering model on the visible light image to obtain a restored image, and to input the restored image into a pre-trained semantic segmentation model to obtain the image region corresponding to each transformer device in the restored image. The temperature prediction module is used to extract the color distribution statistical features of the red, green, and blue channels within the image region corresponding to each transformer device, and to input the color distribution statistical features into a pre-trained machine learning model to obtain the predicted temperature of the transformer device.

[0021] In this way, by dividing the entire temperature detection process into three functionally independent modules, each with a clear responsibility and a clear and standardized data flow between modules, the difficulties in debugging and maintenance caused by functional coupling are avoided. This ensures the high efficiency and stability of the data processing process and improves the overall processing efficiency, system reliability, and engineering application feasibility of visible light image temperature detection tasks in scenarios such as substations.

[0022] In a third aspect, this application provides a terminal device including a processor, a memory, and a computer program stored in the memory and configured to be executed by the processor. When the processor executes the computer program, it implements the visible light image temperature detection method described in the first aspect above.

[0023] In a fourth aspect, this application provides a computer-readable storage medium including a stored computer program, wherein, when the computer program is executed, it controls the device where the computer-readable storage medium is located to perform the visible light image temperature detection method described in the first aspect above.

Description of the Drawings

[0024] To more clearly illustrate the technical solution of this application, the drawings used in the embodiments will be briefly introduced below. Obviously, the drawings described below are only some embodiments of this application. For those skilled in the art, other drawings can be obtained from these drawings without creative effort.

[0025] Figure 1 is a flowchart of an embodiment of the visible light image temperature detection method provided in this application;
Figure 2 is a flowchart of an image restoration process according to an embodiment provided in this application;
Figure 3 shows comparison images of image restoration effects according to an embodiment provided in this application;
Figure 4 is a schematic diagram of the segmentation mask colors corresponding to various devices in an embodiment of this application;
Figure 5 is a diagram illustrating the segmentation and extraction effect for different types of transformers according to an embodiment provided in this application;
Figure 6 is a graph of accuracy as a function of iterations during the training of a semantic segmentation model according to an embodiment of this application;
Figure 7 is a schematic diagram of the visible light image temperature detection system in some embodiments of this application.

[0026] Reference numerals: 100, image acquisition module; 200, target segmentation module; 300, temperature prediction module.

Detailed Description

[0027] To make the objectives, technical solutions, and advantages of this application clearer, the technical solutions of this application will be clearly and completely described below with reference to the accompanying drawings of the embodiments. Obviously, the described embodiments are only some embodiments of this application, not all embodiments. Based on the embodiments of this application, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of this application.

[0028] In the description of the embodiments of this application, technical terms such as "first" and "second" are used only to distinguish different objects and should not be construed as indicating or implying relative importance or implicitly specifying the number, specific order, or primary and secondary relationship of the indicated technical features. In the description of the embodiments of this application, "multiple" means two or more, unless otherwise explicitly defined.

[0029] In this document, the term "embodiment" means that a particular feature, structure, or characteristic described in connection with an embodiment may be included in at least one embodiment of this application. The appearance of this phrase in various places throughout the specification does not necessarily refer to the same embodiment, nor is it a separate or alternative embodiment mutually exclusive with other embodiments. It will be explicitly and implicitly understood by those skilled in the art that the embodiments described herein can be combined with other embodiments.

[0030] In the description of the embodiments in this application, the term "and / or" is merely a description of the relationship between related objects, indicating that three relationships can exist. For example, A and / or B can represent: A existing alone, A and B existing simultaneously, and B existing alone. Additionally, the character " / " in this document generally indicates that the preceding and following related objects have an "or" relationship.

[0031] In the daily operation and maintenance of power systems, transformers, as core equipment in the power transmission and transformation process, have their operating temperature as a key indicator reflecting their health status. Abnormal temperatures often indicate internal faults, poor contact, or overload operation. Currently, transformer temperature monitoring mainly relies on manual inspections and various non-contact temperature measurement technologies. These existing methods introduce temperature measurement errors in dusty or smoggy environments. Foggy weather causes images to appear bluish, dusty weather causes images to appear yellowish, and rainy weather causes images to become blurry. At the same time, the high reflectivity of metal surfaces is also easily affected by ambient light, resulting in false high-temperature readings. This image degradation distorts color features, directly leading to a decrease in the accuracy of temperature inversion.

[0032] To address this issue, this application proposes a visible light image temperature detection method. Refer to Figure 1, which is a flowchart of one embodiment of the visible light image temperature detection method provided in this application. To address the problem that, in existing technologies, image degradation under complex weather conditions interferes with color features and reduces the accuracy of device temperature detection from visible light images, an embodiment of this application provides a visible light image temperature detection method comprising steps S1 to S5, as follows:
Step S1: Acquire a visible light image;
Step S2: Perform image restoration processing based on the atmospheric scattering model on the visible light image to obtain the restored image;
Step S3: Input the restored image into the pre-trained semantic segmentation model to obtain the image region corresponding to each transformer device in the restored image;
Step S4: For each transformer device, extract the color distribution statistical features of the red, green, and blue channels within the image region corresponding to the transformer device;
Step S5: Input the color distribution statistical features into the pre-trained machine learning model to obtain the predicted temperature of the transformer device.

[0033] The atmospheric scattering model is a physical model describing the attenuation of light propagating in media such as fog and haze. Its core logic is "degraded image = real scene image × transmittance + atmospheric light value × (1 - transmittance)," which is the theoretical basis for image restoration. By estimating the model parameters and performing inverse calculations, the original sharpness and color distribution of the image can be restored, thereby eliminating the interference of weather factors on image color.
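The inverse calculation mentioned above can be made explicit with a short worked derivation. The symbols are introduced here for illustration: $I$ for the degraded image, $J$ for the real scene image, $t$ for the transmittance, and $A$ for the atmospheric light value.

```latex
I = J\,t + A\,(1 - t)
\quad\Longrightarrow\quad
J = \frac{I - A}{t} + A, \qquad 0 < t \le 1 .

% Numerical check with intensities normalized to [0, 1]:
% J = 0.2,\; t = 0.5,\; A = 0.9
% I = 0.2 \cdot 0.5 + 0.9 \cdot (1 - 0.5) = 0.10 + 0.45 = 0.55
% J = (0.55 - 0.9) / 0.5 + 0.9 = -0.70 + 0.90 = 0.20
```

The check shows that a dark scene value of 0.2 is lifted to 0.55 by haze, and that the inversion recovers 0.2 once $t$ and $A$ are known. Because the division by $t$ amplifies errors when $t$ is small, a lower bound on the transmittance is needed in practice.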

[0034] Color distribution statistical features refer to quantitative indicators extracted from image color channels using statistical methods to describe the characteristics of color distribution. These include mean, standard deviation, kurtosis, and skewness. These statistical features can transform subtle color differences caused by temperature changes into quantifiable mathematical characteristics. In this application, nine statistical measures are extracted for each channel: mean, standard deviation, kurtosis, skewness, mode, median, peak value, upper quantile, and lower quantile. A total of 27 features are extracted across the three channels. These statistical measures comprehensively capture subtle color differences caused by temperature changes from multiple perspectives, including overall level, dispersion, distribution pattern, central tendency, and tail features.
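The 27-dimensional feature vector can be sketched as below. Several details are assumptions rather than statements from the application: "peak value" is read as the relative frequency of the most common gray level, the upper and lower quantiles are taken as the 75th and 25th percentiles, kurtosis is computed as excess kurtosis from population moments, and the function names are hypothetical.

```python
import numpy as np

def channel_stats(values):
    """Nine grayscale distribution statistics for one color channel."""
    v = np.asarray(values, dtype=np.float64).ravel()
    mean, std = v.mean(), v.std()
    centered = v - mean
    # Flat regions have zero spread; report zero shape statistics for them
    skew = np.mean(centered ** 3) / std ** 3 if std > 0 else 0.0
    kurt = np.mean(centered ** 4) / std ** 4 - 3.0 if std > 0 else 0.0
    hist = np.bincount(v.astype(np.int64), minlength=256)
    mode = float(np.argmax(hist))
    peak = hist.max() / v.size               # assumed meaning of "peak value"
    median = np.median(v)
    upper_q, lower_q = np.percentile(v, [75, 25])  # assumed quantile levels
    return np.array([mean, std, kurt, skew, mode, median,
                     peak, upper_q, lower_q])

def chromaticity_vector(image_rgb, mask):
    """Concatenate the statistics of R, G, B inside a boolean region mask."""
    pixels = np.asarray(image_rgb)[mask]     # shape (n_pixels, 3)
    return np.concatenate([channel_stats(pixels[:, c]) for c in range(3)])

rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(32, 32, 3), dtype=np.uint8)  # synthetic
region = np.zeros((32, 32), dtype=bool)
region[8:24, 8:24] = True                    # hypothetical device region
vec = chromaticity_vector(image, region)
small = channel_stats([10, 10, 20, 30])
```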

[0035] Semantic segmentation models are deep learning models capable of pixel-level classification of images, assigning a category label to each pixel to accurately delineate the contours of various targets. Machine learning models are algorithmic models that learn the mapping relationship between inputs and outputs through training data. In this embodiment, the machine learning model learns the correspondence between statistical features of color distribution and measured temperature, enabling it to predict color features and output the corresponding temperature value.

[0036] In actual operation, in step S1, visible light images of the transformer equipment in an outdoor environment can be acquired using a high-definition camera. The camera can be set at different positions and angles to comprehensively acquire image information of the equipment. The acquired images can undergo preliminary cropping and scaling to meet the needs of subsequent processing.

[0037] Specifically, in some embodiments, refer to Figure 2 and Figure 3. Figure 2 is a flowchart of an image restoration process according to an embodiment provided in this application, and Figure 3 shows comparison images of the image restoration effect according to an embodiment provided in this application. In step S2, performing image restoration processing based on an atmospheric scattering model on the visible light image to obtain the restored image specifically includes steps S21 to S23, as follows:
Step S21: Extract the dark channel from the visible light image to obtain the dark channel image;
Step S22: Determine the atmospheric light value based on the dark channel image;
Step S23: Based on the atmospheric light value and the atmospheric scattering model, perform restoration calculations on the visible light image to obtain the restored image.

[0038] The dark channel prior refers to the fact that in most non-sky regions of a clear, fog-free outdoor image, at least one color channel has a very low pixel value, even close to zero. Based on this prior, the dark channel is defined as the minimum value of the three RGB channels in a local region of the image, reflecting the degree of light scattering in suspended particles such as fog and haze.

[0039] Atmospheric light intensity reflects the scattering intensity of ambient light by suspended particles such as fog, haze, and dust, and is a key parameter in atmospheric scattering models. In dark channel images, the brightest pixels typically correspond to the areas with the densest fog or the sky at infinity, which best represent the intensity of atmospheric light.

[0040] The following specific example illustrates the image restoration process. This example takes as its processing object a 1920×1080 visible light image $I$ of a transformer acquired in a foggy substation environment. The image includes target equipment such as transformers, insulators, and disconnect switches. Due to fog interference, distant equipment is blurred, the overall color is bluish, and details are lost. Image restoration processing based on an atmospheric scattering model is performed on the visible light image as shown in Figure 2, and the specific steps for obtaining the restored image are as follows.

In step S21, shown as the "Dark Channel Extraction" step in Figure 2, for the visible light image $I$ to be processed, a local rectangular neighborhood $\Omega(x, y)$ is defined centered on the coordinates $(x, y)$ of each pixel in the image. Minimum value filtering is performed over this neighborhood to compute the minimum pixel values of the red channel $I^{r}$, the green channel $I^{g}$, and the blue channel $I^{b}$ respectively, and the smallest of these three channel minima is taken as the dark channel value of that pixel. The mathematical expression is as follows:

$$I^{dark}(x, y) = \min_{c \in \{r, g, b\}} \left( \min_{(i, j) \in \Omega(x, y)} I^{c}(i, j) \right)$$

where $(x, y)$ denotes the coordinates of the currently processed pixel; $\Omega(x, y)$ denotes the local neighborhood centered on $(x, y)$ (its size in pixels is adjustable depending on image resolution and the degree of weather scattering); $c \in \{r, g, b\}$ indexes the red, green, and blue color channels; $I^{c}(i, j)$ is the grayscale value of the original visible light image in channel $c$ at pixel $(i, j)$; and $I^{dark}(x, y)$ is the dark channel value of the calculated dark channel image at pixel $(x, y)$. By iterating through all pixels of the original image, a dark channel image $I^{dark}$ of the same size as the original image is obtained. In the dark channel image, areas with higher grayscale values typically correspond to areas of denser fog or brighter sky, while areas with lower grayscale values correspond to clear, fog-free ground features. This image is the basis for the subsequent estimation of the atmospheric light value and the transmittance.
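The dark channel extraction of step S21 can be sketched as follows. The default window size of 15 pixels is a placeholder and is not taken from the application; edge replication at the image border is likewise an assumption, and `dark_channel` is a hypothetical name. Because taking minima commutes, the sketch first takes the per-pixel minimum across channels and then the neighborhood minimum, which yields the same result as filtering each channel separately.

```python
import numpy as np

def dark_channel(image_rgb, patch=15):
    """Minimum over RGB and over an odd-sized square neighborhood."""
    img = np.asarray(image_rgb, dtype=np.float64)
    min_rgb = img.min(axis=2)                 # per-pixel minimum over channels
    r = patch // 2
    padded = np.pad(min_rgb, r, mode="edge")  # replicate borders (assumption)
    h, w = min_rgb.shape
    dark = np.full((h, w), np.inf)
    for dy in range(patch):                   # sliding-window minimum filter
        for dx in range(patch):
            dark = np.minimum(dark, padded[dy:dy + h, dx:dx + w])
    return dark

# Tiny synthetic image: bright everywhere except one pixel with a dark channel
demo = np.full((5, 5, 3), 200.0)
demo[2, 2] = (200.0, 30.0, 200.0)
dark = dark_channel(demo, patch=3)
```

In the demo, the low green value at the center pixel propagates to its whole 3×3 neighborhood, while pixels farther away keep the bright value.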

[0041] Specifically, in step S22, shown as the "Atmospheric Light Value Calculation" step in Figure 2, the pixels with the highest brightness values are selected from the dark channel image $I^{dark}$, and the set of coordinates of these pixels is denoted $P$. These coordinates are mapped back to the corresponding positions in the visible light image $I$. Among all pixels of $I$ at the positions in $P$, the maximum grayscale value across the red, green, and blue channels is found, and this maximum is taken as the atmospheric light value $A$. The atmospheric light value $A$ is a scalar constant used for the subsequent transmittance estimation and image restoration calculations.
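Step S22 can be sketched as below. The number of brightest dark-channel pixels to select is not legible in the source text, so the `top_fraction` parameter and its default value are hypothetical; the function name is also an assumption.

```python
import numpy as np

def estimate_atmospheric_light(image_rgb, dark, top_fraction=0.001):
    """Scalar A: max RGB value among the brightest dark-channel pixels."""
    img = np.asarray(image_rgb, dtype=np.float64)
    dark = np.asarray(dark, dtype=np.float64)
    n = max(1, int(round(dark.size * top_fraction)))
    # Indices of the n largest dark-channel values (order not needed)
    flat_idx = np.argpartition(dark.ravel(), -n)[-n:]
    ys, xs = np.unravel_index(flat_idx, dark.shape)
    return float(img[ys, xs].max())           # maximum over pixels and channels

# Synthetic example: uniform scene with one bright, hazy-looking pixel
img = np.full((10, 10, 3), 50.0)
img[0, 0] = (240.0, 250.0, 245.0)
dark_demo = img.min(axis=2)                    # stand-in dark channel image
A = estimate_atmospheric_light(img, dark_demo, top_fraction=0.01)
```

Selecting candidates from the dark channel rather than from raw brightness helps avoid picking small bright objects that are bright in only one or two channels.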

[0042] Transmittance is a key parameter in atmospheric scattering models, representing the proportion of light reflected from objects in a scene that reaches the camera sensor directly without being scattered after propagating through media such as fog and haze. Transmittance ranges from 0 to 1: when transmittance is close to 1, it indicates that the light is almost unaffected, resulting in a clear image; when transmittance is close to 0, it indicates that the light is severely attenuated, resulting in a blurry image. In practical calculations, transmittance is estimated jointly using dark channel priors and atmospheric light values, with a lower limit (e.g., 0.1) set to avoid excessive noise amplification in the restored image due to excessively low transmittance. The transmittance map is a matrix of the same size as the original image, with each pixel corresponding to a transmittance value, used for pixel-by-pixel image restoration calculations.

[0043] Step S23 corresponds to the "Transmittance Calculation" and "Image Restoration Calculation" steps in Figure 2. First, the atmospheric scattering model describes the formation of a foggy image: the observed image $I$ is composed of the portion of scene radiance attenuated by the medium and the portion of atmospheric light scattered into the camera. Its mathematical expression is: $$I(x, y) = J(x, y)\,t(x, y) + A\,\bigl(1 - t(x, y)\bigr),$$ where $I(x, y)$ is the pixel value of the observed foggy image at pixel $(x, y)$; $J(x, y)$ is the pixel value of the clear image to be restored (i.e., the fog-free scene) at pixel $(x, y)$; $t(x, y)$ is the transmittance at pixel $(x, y)$ (the proportion of light that reaches the camera directly without being scattered); and $A$ is the estimated atmospheric light value. To solve for $J$ from the above model, the transmittance must be estimated first. Based on the dark channel prior and the atmospheric light value, the transmittance is calculated as: $$t(x, y) = 1 - \omega \min_{(x', y') \in \Omega(x, y)} \left( \min_{c \in \{R, G, B\}} \frac{I^{c}(x', y')}{A} \right).$$ Here, $\omega$ is a constant between 0 and 1 that controls the degree of dehazing and avoids over-dehazing that could distort the image; it is typically set to 0.95 or 0.99. The double minimum operation in the formula is similar to the dark channel calculation: within the local neighborhood $\Omega(x, y)$, the minimum of the normalized image $I^{c} / A$ is taken over the three RGB channels. To prevent excessively low transmittance from amplifying noise in the restored image, a lower limit $t_0$ can be set (for example, $t_0 = 0.1$), and the transmittance finally used is the larger of the calculated value and the lower limit, i.e., $\max(t(x, y), t_0)$. Substituting the estimated atmospheric light value $A$ and the transmittance into the inverse of the atmospheric scattering model gives the pixel value of the restored image $R$ at each pixel $(x, y)$:
$$R(x, y) = \frac{I(x, y) - A}{\max\bigl(t(x, y),\, t_0\bigr)} + A.$$ By performing this calculation point by point for each pixel of the visible light image $I$, the restored values of all pixels are obtained, and the image composed of these restored values is the final restored image $R$.
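The transmittance estimate and the inversion of the scattering model in step S23 can be illustrated with the sketch below. It is written under several assumptions that are not in the source: the function name `restore`, the nested-list image layout, the `radius` parameter for the neighborhood, and the final clamp of restored values to the 0 to 255 range.

```python
def restore(image, A, omega=0.95, t0=0.1, radius=1):
    """Invert the atmospheric scattering model pixel by pixel.

    image : list of rows of (r, g, b) tuples (the foggy image)
    A     : scalar atmospheric light value
    omega : dehazing strength, between 0 and 1
    t0    : lower limit applied to the transmittance
    """
    height, width = len(image), len(image[0])
    # Minimum over channels of the image normalized by A.
    norm_min = [[min(pixel) / A for pixel in row] for row in image]
    restored = []
    for y in range(height):
        out_row = []
        for x in range(width):
            rows = range(max(0, y - radius), min(height, y + radius + 1))
            cols = range(max(0, x - radius), min(width, x + radius + 1))
            t = 1.0 - omega * min(norm_min[j][i] for j in rows for i in cols)
            t = max(t, t0)  # lower limit on the transmittance
            # (I - A) / t + A, clamped to the 8-bit range (clamping is an assumption).
            out_row.append(tuple(
                min(255.0, max(0.0, (c - A) / t + A)) for c in image[y][x]
            ))
        restored.append(out_row)
    return restored
```

For a single pixel (150, 200, 250) with A = 250 and ω = 0.95, the transmittance is 1 − 0.95 × 0.6 = 0.43, so the red value is restored to (150 − 250) / 0.43 + 250 ≈ 17.4, while the blue value, already equal to A, stays at 250.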

[0044] Specifically, in some embodiments, in step S4, for each image region corresponding to a transformer device, the color distribution statistical features of the red, green, and blue channels within the image region corresponding to the transformer device are extracted, including: For each transformer device, the grayscale value matrix of the image region in the red, green and blue channels is obtained respectively; Based on the grayscale value matrix of each color channel, calculate the grayscale distribution statistics for each color channel; among which, the grayscale distribution statistics include mean, standard deviation, kurtosis, skewness, mode, median, peak value, upper quantile, and lower quantile. By combining the grayscale distribution statistics of the three color channels, the chromaticity feature vector corresponding to the image region is obtained.

[0045] In the specific implementation, for each segmented transformer equipment image region, the grayscale values of all pixels within that region are extracted in the red (R), green (G), and blue (B) channels. The grayscale values of each channel form a two-dimensional matrix, with the number of rows and columns corresponding to the number of rows and columns of pixels in the corresponding image region. The grayscale values range from 0 to 255, a total of 256 grayscale levels. A higher grayscale level indicates a higher brightness of the pixel in that channel.

[0046] Based on the principle of thermally modulated light reflection, the grayscale probability distribution curve of a device surface changes with temperature. To quantify this change, nine statistical measures are calculated for each color channel to comprehensively characterize the grayscale frequency curve. Suppose a channel's grayscale value matrix contains $N$ pixels, with the grayscale value of the $i$-th pixel denoted as $g_i$ ($i = 1, \dots, N$). The definitions and calculation methods for each statistic are as follows: Mean: reflects the overall brightness level of the channel, calculated as $$\mu = \frac{1}{N} \sum_{i=1}^{N} g_i.$$ Standard deviation: reflects the dispersion of the gray-level distribution, calculated as $$\sigma = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (g_i - \mu)^2}.$$ Kurtosis: reflects the steepness of the gray-level distribution curve, calculated as $$K = \frac{1}{N} \sum_{i=1}^{N} \left( \frac{g_i - \mu}{\sigma} \right)^{4}.$$ Skewness: reflects the asymmetry of the gray-level distribution, calculated as $$S = \frac{1}{N} \sum_{i=1}^{N} \left( \frac{g_i - \mu}{\sigma} \right)^{3}.$$ Mode: the value that appears most frequently among the grayscale values, reflecting the most common brightness level.

[0047] Median: the value in the middle after sorting the gray values from smallest to largest, reflecting the central tendency of the gray value distribution.

[0048] Peak value: the highest frequency that occurs in the gray-level distribution, that is, the height of the highest point of the gray-level histogram.

[0049] Upper $\alpha$ quantile: the value located at position $\alpha$ after sorting the gray values from smallest to largest, denoted as $Q_{\alpha}$; generally, $\alpha$ takes 0.75 or 0.9.

[0050] Lower $\alpha$ quantile: the value located at position $1-\alpha$ after sorting the gray values from smallest to largest, denoted as $Q_{1-\alpha}$. Together with the upper $\alpha$ quantile, it characterizes the tail features of the grayscale distribution. These nine statistical measures describe the overall level, dispersion, shape, and tail properties of the channel's grayscale distribution from different perspectives, comprehensively capturing subtle color differences caused by temperature variations. The nine statistics from the red channel, the nine statistics from the green channel, and the nine statistics from the blue channel are concatenated sequentially to form a 27-dimensional feature vector, which serves as the chromaticity feature vector for the image region of the transformer equipment. Each dimension of this feature vector has a clear physical or statistical meaning, reflecting the pattern of color change on the equipment surface with temperature. In this application, the chromaticity feature vector is standardized before being input into the model to eliminate differences in the units and numerical ranges of the features. This vector is subsequently used as input to a machine learning model for temperature prediction.
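The nine per-channel statistics and their concatenation into a 27-dimensional vector can be sketched as follows. Several choices here are assumptions made for illustration and are not fixed by the text: population (divide by N) moments, the peak value taken as the pixel count of the most frequent gray level, ties in the mode resolved toward the smaller gray level, and a simple index-based quantile rule.

```python
from collections import Counter
from statistics import median


def channel_statistics(values, alpha=0.75):
    """Nine statistics of one channel's gray values (a flat list of ints)."""
    n = len(values)
    mean = sum(values) / n
    std = (sum((v - mean) ** 2 for v in values) / n) ** 0.5
    if std > 0:
        kurtosis = sum(((v - mean) / std) ** 4 for v in values) / n
        skewness = sum(((v - mean) / std) ** 3 for v in values) / n
    else:
        kurtosis = skewness = 0.0  # flat region: no shape information
    counts = Counter(values)
    # Most frequent gray level (ties resolved toward the smaller level).
    mode = min(counts, key=lambda g: (-counts[g], g))
    peak = counts[mode]  # height of the histogram at its highest point
    ordered = sorted(values)

    def quantile(q):
        # Value at relative position q in the sorted list (index-based rule).
        return ordered[min(n - 1, int(q * n))]

    return [mean, std, kurtosis, skewness, mode, median(ordered), peak,
            quantile(alpha), quantile(1 - alpha)]


def chromaticity_features(region_pixels, alpha=0.75):
    """Concatenate the R, G, B statistics into one 27-dimensional vector."""
    features = []
    for channel in range(3):
        features += channel_statistics([p[channel] for p in region_pixels], alpha)
    return features
```

Here `region_pixels` is a flat list of (r, g, b) tuples belonging to one device region; the order of the statistics inside each channel block follows the order in which the text lists them.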

[0051] Specifically, in some embodiments, in step S5, the statistical features of each color distribution are input into a pre-trained machine learning model to obtain the predicted temperature of the transformer equipment, including: For each transformer device, the statistical features of each color distribution are standardized to obtain a standardized feature vector. Each standardized feature vector is input into a pre-trained machine learning model. The machine learning model performs relational mapping on the standardized feature vectors and outputs the predicted temperature of the transformer equipment.

[0052] Specifically, in some preferred embodiments, in step S5, the pre-trained machine learning model is obtained as follows: a training sample set containing visible light images of multiple transformer devices is obtained; the k-fold cross-validation method is used to divide the training sample set into training subsets and validation subsets; the training subset is used to train a preset regression model, and the validation subset is used to validate the trained regression model, with the mean absolute error as the evaluation index; when the mean absolute error meets a preset threshold, the trained regression model is determined as the machine learning model.

[0053] Standardization refers to converting features with different units and numerical ranges to a uniform scale. For example, Z-score standardization can be used: for each feature dimension, the mean of that feature on the training set is subtracted, and the result is divided by the standard deviation of that feature on the training set. After standardization, each feature dimension has a mean of 0 and a variance of 1, which eliminates the influence of units and allows the model to treat each feature dimension equally.

[0054] The training sample set refers to the image data collection used to train the machine learning model. It consists of multiple samples, each containing a visible light image of the transformer equipment and its corresponding measured temperature value. The measured temperature value is obtained synchronously using an infrared thermal imager and serves as the true label for model training. The training sample set should cover different equipment types (such as transformers and reactors), different weather conditions (fog, rain, dust storms), and different lighting environments (morning and afternoon) to ensure that the model has good generalization ability.

[0055] k-fold cross-validation involves dividing the training set into k equal subsets. Each time, k-1 subsets are used as the training set, and the remaining subset is used as the validation set. This process is repeated k times, ensuring each subset is used for validation once. The average of the k validation results is then used as the performance metric for the model. This method effectively utilizes limited data, avoids the randomness of a single dataset partition, and improves the reliability of model evaluation.
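The partition described above can be sketched with a small generator. The function name `k_fold_indices` and the `seed` parameter are hypothetical; when the number of samples is not divisible by k, the fold sizes in this sketch differ by at most one.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_indices, validation_indices) for each of the k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index, so fold sizes differ by at most 1.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        validation = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, validation
```

Each sample index appears in exactly one validation fold and in the training part of the other k − 1 folds, which is the property the averaging over k validation results relies on.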

[0056] Mean absolute error (MAE) is the average of the absolute differences between predicted and actual values, and it is a commonly used evaluation metric in regression tasks. MAE directly reflects the average magnitude of the prediction error; the smaller the value, the higher the prediction accuracy.

[0057] Regression models refer to machine learning models used to predict continuous numerical values. The machine learning models in this application involve four types of regression models: k-nearest neighbor regression (predicting based on the k nearest samples in the feature space), decision tree regression (partitioning and predicting the feature space through a tree structure), gradient boosting regression tree (training multiple weak learners iteratively and combining them with weights), and random forest regression (integrating multiple decision trees and averaging the prediction results).
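Of the four regression models, k-nearest neighbor regression is the simplest to write down. The sketch below is only an illustration of the idea (predict the mean target of the k closest training samples); the function name `knn_predict`, the Euclidean distance, and the default `k=3` are assumptions, not settings taken from this application.

```python
def knn_predict(train_x, train_y, query, k=3):
    """Predict a value as the mean target of the k nearest training samples.

    train_x : list of feature vectors (lists of floats)
    train_y : list of measured temperatures, one per training vector
    query   : feature vector to predict for
    """
    def squared_distance(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    # Indices of the training samples, closest first.
    nearest = sorted(
        range(len(train_x)),
        key=lambda i: squared_distance(train_x[i], query),
    )[:k]
    return sum(train_y[i] for i in nearest) / len(nearest)
```

In the setting of this application, `train_x` would hold standardized 27-dimensional feature vectors and `train_y` the temperatures measured by the infrared thermal imager.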

[0058] In practice, before applying the model, it is necessary to first construct a machine learning model that can accurately map the relationship between color features and temperature using a training sample set. This training process includes the following steps: Establish a visible light image database for different targets. For example, at a substation site, use high-definition cameras to collect visible light images of different equipment (such as transformers and reactors), and simultaneously use an infrared thermal imager to record the measured surface temperature of the equipment at each shooting moment, serving as the true temperature label for the image. Randomly divide the collected images into training and testing sets according to a certain ratio; for example, use 80% of the images as the training set and 20% as the testing set. The training set is used for model training, and the testing set is used for final model performance evaluation.

[0059] For each image in the training set, image restoration and semantic segmentation are first performed to obtain the image region corresponding to each transformer device. Then, the color distribution statistical features of the three channels (red (R), green (G), and blue (B)) are extracted for each device region. Nine statistical measures are calculated for each channel: mean, standard deviation, kurtosis, skewness, mode, median, peak value, upper alpha quantile, and lower alpha quantile. The three channels together form a 27-dimensional feature vector, which constitutes the color feature vector of the region.

[0060] Because different statistical measures have different units and numerical ranges, inputting them directly into the model may cause features with large numerical values to dominate model training. Therefore, the extracted 27-dimensional features are standardized using the following formula: $$z_j = \frac{x_j - \mu_j}{\sigma_j},$$ where $x_j$ is the original value of the $j$-th feature dimension, $z_j$ is its standardized value, and $\mu_j$ and $\sigma_j$ are the mean and standard deviation of that feature dimension on the training set, respectively. After standardization, each dimension of the feature vector has a mean of 0 and a variance of 1 on the training set, which eliminates the influence of units and enables the model to better learn each feature dimension.
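A minimal sketch of Z-score standardization is given below. The function names `fit_standardizer` and `standardize` are hypothetical, and mapping constant dimensions (zero standard deviation) to 0 is a choice made here to avoid division by zero. The point of splitting the work into two functions is that the means and standard deviations are computed once on the training set and then reused unchanged for any new feature vector.

```python
def fit_standardizer(train_features):
    """Return per-dimension (means, stds) computed on the training set."""
    n = len(train_features)
    dims = len(train_features[0])
    means = [sum(row[d] for row in train_features) / n for d in range(dims)]
    stds = [
        (sum((row[d] - means[d]) ** 2 for row in train_features) / n) ** 0.5
        for d in range(dims)
    ]
    return means, stds


def standardize(vector, means, stds):
    """Apply z = (x - mean) / std per dimension; constant dimensions map to 0."""
    return [
        (x - m) / s if s > 0 else 0.0
        for x, m, s in zip(vector, means, stds)
    ]
```

For a toy training set `[[1.0, 100.0], [3.0, 300.0]]` the saved parameters are means `[2.0, 200.0]` and standard deviations `[1.0, 100.0]`, so both dimensions end up on the same scale despite their very different raw ranges.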

[0061] This embodiment employs k-fold cross-validation for model training and validation to fully utilize limited data and ensure the model's generalization ability. The training set is divided into k groups (k=10 in this embodiment, i.e., ten-fold cross-validation). Each time, k-1 groups are used as the training subset, and the remaining group as the validation subset. This process is repeated k times, ensuring that each group is used for validation once. On each training subset, four regression models are trained: k-nearest neighbor regression (KNN), decision tree regression (DT), gradient boosting regression tree (GBRT), and random forest regression (RFR). During training, the standardized 27-dimensional features are used as input, and the corresponding measured temperatures are used as output. During each validation, the trained model is used to predict the validation subset, and the mean absolute error (MAE) between the predicted and measured temperatures is calculated: $$\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| \hat{y}_i - y_i \right|,$$ where $\hat{y}_i$ is the predicted temperature of the $i$-th sample, $y_i$ is its measured temperature, and $n$ is the number of samples in the validation subset. For each regression model, 10-fold cross-validation was performed twice, and the average MAE of all validation results was taken as the performance evaluation metric for that model. The model with the smallest average MAE was selected as the final machine learning model; alternatively, any of the models could be chosen if the MAE of all models met a preset threshold (e.g., 0.2°C). Experiments showed that the KNN and RFR models were particularly stable, with MAE controlled within 0.1°C.
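The selection procedure (average the validation MAE over the folds, then keep the model with the smallest average) can be sketched as below. To keep the example self-contained, the candidates are two stand-in models, a mean predictor and a 1-nearest-neighbor predictor on a single feature, rather than the four regression models named in this embodiment; all function names are hypothetical, the folds are simple interleaved folds, and the repetition of the cross-validation is omitted.

```python
def mean_absolute_error(predicted, actual):
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


def cross_validated_mae(fit_predict, features, targets, k):
    """Average validation MAE over k folds.

    fit_predict(train_x, train_y, val_x) must return predictions for val_x.
    """
    n = len(features)
    scores = []
    for fold in range(k):
        val_idx = list(range(fold, n, k))  # simple interleaved folds
        val_set = set(val_idx)
        train_idx = [i for i in range(n) if i not in val_set]
        predictions = fit_predict(
            [features[i] for i in train_idx],
            [targets[i] for i in train_idx],
            [features[i] for i in val_idx],
        )
        scores.append(
            mean_absolute_error(predictions, [targets[i] for i in val_idx])
        )
    return sum(scores) / k


def mean_baseline(train_x, train_y, val_x):
    # Stand-in model: always predicts the mean training temperature.
    return [sum(train_y) / len(train_y)] * len(val_x)


def nearest_neighbor(train_x, train_y, val_x):
    # Stand-in model: 1-nearest-neighbor regression on the first feature.
    return [
        train_y[min(range(len(train_x)), key=lambda i: abs(train_x[i][0] - x[0]))]
        for x in val_x
    ]


def select_model(candidates, features, targets, k):
    """Return (name, average MAE) of the candidate with the smallest MAE."""
    scores = {name: cross_validated_mae(f, features, targets, k)
              for name, f in candidates.items()}
    best = min(scores, key=scores.get)
    return best, scores[best]
```

A call such as `select_model({"baseline": mean_baseline, "1nn": nearest_neighbor}, features, targets, k=10)` returns the name of the better candidate together with its average MAE, which could then be compared against a preset threshold.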

[0062] For a new visible light image to be tested, the same procedure as in the training phase is followed: after image restoration and semantic segmentation, the 27-dimensional color distribution statistical features of each transformer equipment region are extracted. Then, the standardization parameters saved in the training phase (i.e., the mean $\mu_j$ and standard deviation $\sigma_j$ of each feature dimension on the training set) are used to standardize the current features, yielding a standardized feature vector. This standardized feature vector is then input into the trained machine learning model. The model performs regression on the input features based on the learned mapping and outputs the predicted temperature value for the region covered by the device.

[0063] The technical effects of this embodiment will be explained below with reference to specific experimental data. The experiment was carried out in a real substation environment. By simulating image acquisition and processing under complex weather conditions, the feasibility and accuracy of the method were comprehensively evaluated.

[0064] In this embodiment, the experimental site was a 220kV substation, and the data acquisition equipment was deployed at two different locations: Scenario 1 was to photograph the 220kV line reactor from the balcony of the high-voltage room, and the data acquisition time was in the morning; Scenario 2 was to photograph the 220kV line transformer (phase A, phase a, and phase B) from the balcony of the high-voltage room, and the data acquisition time was in the afternoon.

[0065] In practice, image acquisition utilizes an infrared digital dual-lens camera, controlled by the accompanying iVMS-4800 software. It automatically records a visible light image (1920×1080 resolution, 1080P) and the corresponding infrared thermal image every 3 seconds. All data is stored in real-time on a laptop computer on-site. The infrared image is used to obtain the measured temperature of the device surface, serving as a real label for model training and validation; the visible light image serves as the input source for this method.

[0066] To comprehensively evaluate the adaptability of the method under different conditions, the ambient light intensity was recorded every 15 minutes during the experiment, and an infrared rangefinder was used to measure the distance between the camera and the subject, ensuring the traceability of environmental parameters for data acquisition. A total of 1662 images were obtained from the two scenarios, forming a reactor image library and a transformer image library (covering phases A, a, and B).

[0067] The acquired images were processed using a dark channel prior algorithm based on an atmospheric scattering model, combined with automatic color equalization enhancement, to remove fog and noise from the original visible light images and restore image clarity and true colors. The DeepLabv3+ semantic segmentation model was then used to perform pixel-level segmentation of the restored images, extracting different equipment regions such as insulators, transformers, and reactors. The segmentation mask colors corresponding to each type of equipment are shown in Figure 4, which is a schematic diagram of the segmentation mask colors corresponding to various devices in one embodiment of this application. Figure 4 illustrates the mask colors (RGB values) assigned by the semantic segmentation model to the various types of transformer equipment. Each equipment type corresponds to a unique RGB triplet; for example, transformers correspond to (255, 182, 0) (yellow), glass insulators to (154, 32, 121) (magenta), and open-type disconnect switches to (162, 0, 255) (purple). Through this color encoding, the segmentation model outputs an equipment category label for each pixel in the restored image, thereby generating a color mask image that visually distinguishes different equipment regions. This color mask provides accurate spatial positioning data for the subsequent category-based cropping of equipment regions and the extraction of color distribution statistical features.
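How a color mask image might be turned into per-category regions can be sketched as follows. The three RGB triplets are the ones quoted in the text; the dictionary keys, the function names, and the nested-list image layout with pixels stored as tuples are assumptions made for this illustration.

```python
# RGB triplets quoted in the text; the dictionary keys are illustrative names.
MASK_COLORS = {
    "transformer": (255, 182, 0),
    "glass_insulator": (154, 32, 121),
    "disconnect_switch": (162, 0, 255),
}


def binary_mask(mask_image, category):
    """Return a 0/1 mask marking pixels whose color matches the category."""
    color = MASK_COLORS[category]
    return [[1 if pixel == color else 0 for pixel in row] for row in mask_image]


def region_pixels(restored_image, mask):
    """Collect the restored-image pixels that fall inside the mask."""
    return [
        restored_image[y][x]
        for y, row in enumerate(mask)
        for x, inside in enumerate(row)
        if inside
    ]
```

The flat pixel list returned by `region_pixels` is the kind of input from which per-channel grayscale statistics can then be computed for one device region.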

[0068] The segmented image regions were cropped to a uniform size of 120×130 pixels. For each device region, the grayscale distribution statistical features of the red (R), green (G), and blue (B) channels were extracted, including mean, standard deviation, kurtosis, skewness, mode, median, peak value, upper alpha quantile, and lower alpha quantile, resulting in a 27-dimensional chromaticity feature vector. The extracted 27-dimensional features were standardized, and four regression models were trained using ten-fold cross-validation: k-nearest neighbor regression (kNN), decision tree regression (DT), gradient boosting regression tree (GBRT), and random forest regression (RFR). Mean absolute error (MAE) was used as the model performance evaluation metric.

[0069] The temperature ranges for each equipment module and the prediction errors (MAE) of the four models are shown in Table 1. In terms of temperature range: Transformer A phase module: 44.6–48.7°C; Transformer a phase module: 44.5–51°C; Transformer B phase module: 49.1–55.6°C; Transformer A+a+B combination module: 44.5–55.6°C; Reactor module: 26.2–27.7°C.

[0070] Table 1: Comparison of results of four machine learning algorithms. In terms of prediction error, the KNN model performed best: the MAE was 0.029°C for transformer phase A, 0.091°C for transformer phase a, 0.029°C for transformer phase B, 0.077°C for the transformer combination, and 0.053°C for the reactor. The RFR model also performed well: the MAEs for the respective libraries were 0.031°C, 0.085°C, 0.034°C, 0.053°C, and 0.083°C. The DT and GBRT models had slightly higher errors, but all were kept within 0.2°C.

[0071] Experimental results show that the MAE of all models is within 0.2°C, with the errors of the KNN and RFR models basically controlled within 0.1°C, which meets the accuracy requirements for temperature monitoring of industrial equipment. The model prediction performance remains consistent across different equipment types and temperature ranges, without significant fluctuations. Under different lighting conditions in the morning and afternoon, and under complex weather conditions (including fog, haze, and cloudy days during the experiment), this method still maintains stable temperature measurement accuracy.

[0072] Specifically, in some embodiments, please refer to Figure 5 and Figure 6. Figure 5 is a diagram illustrating the segmentation and extraction effect for different types of transformer equipment according to an embodiment provided in this application. Figure 6 is a graph showing the accuracy of the semantic segmentation model during training as a function of the number of iterations, according to an embodiment of this application. In step S3, inputting the restored image into the pre-trained semantic segmentation model to obtain the image regions corresponding to each transformer device in the restored image includes: inputting the restored image into the pre-trained semantic segmentation model to obtain the device category label corresponding to each pixel in the restored image; generating a preliminary segmentation image based on the device category label of each pixel; and performing morphological processing on the preliminary segmentation image to obtain the image regions corresponding to each transformer device.

[0073] Morphological processing refers to mathematical transformation operations based on image morphology, used to process the local structure of the segmented image. This embodiment mainly includes opening and closing operations. Opening involves first performing an erosion operation on the image, followed by a dilation operation on the eroded result. Erosion, by replacing the value of each pixel with the minimum value in its neighborhood, can eliminate isolated, small noise regions in the image; dilation, by replacing the value of each pixel with the maximum value in its neighborhood, can restore the slightly shrunken boundary of the target region due to erosion. Opening is mainly used to remove false-detection noise in the segmentation result. Closing involves first performing a dilation operation on the image, followed by an erosion operation on the dilated result. Dilation can fill small holes and cracks inside the target region; erosion can shrink over-expanded boundaries back to a reasonable position. Closing is mainly used to fill holes inside the device region in the segmentation result, keeping the target region continuous and intact. By combining opening and closing operations in sequence, the boundary smoothness and regional integrity of the segmented image can be improved.

[0074] In an optional embodiment, the semantic segmentation model employs the DeepLab v3+ architecture, and its training process is as follows: First, a sample image set containing various transformer devices is constructed. The LabelMe annotation tool is used to annotate the training set images at the pixel level, generating corresponding mask images. The dataset contains device annotations for 15 categories, including insulators, disconnectors, transformers, reactors, and other common equipment in substation environments. Each device category corresponds to a unique label value, represented by a different RGB color value in the mask image. The annotated training set is input into the DeepLab v3+ model for training. Through iterative optimization, the model learns to accurately identify the device category to which each pixel belongs. The change in accuracy with the number of iterations during training is shown in Figure 6: the model accuracy gradually improves and tends to stabilize as the number of training rounds increases.

[0075] The restored image is input into the trained semantic segmentation model, which performs forward computation on the image and outputs a device category label for each pixel. These labels are consistent with the category system established during the training phase; for example, all pixels in the insulator region are labeled as "insulator" and all pixels in the transformer region are labeled as "transformer".

[0076] Based on the device category label for each pixel, a preliminary segmentation image of the same size as the restored image is generated. In this image, the value of each pixel is the corresponding category label. For ease of visualization, different categories are usually mapped to different colors, forming the color segmentation result shown in Figure 5. The preliminary segmentation image intuitively displays the position and outline of each transformer device in the image. However, because the model prediction is imperfect, two types of problems may occur in the image: one is isolated noise, that is, a small number of pixels that do not belong to any device area are incorrectly assigned to a device category; the other is holes, that is, a small number of pixels inside a device area that should belong to the device are not correctly identified, forming small gaps.

[0077] To eliminate the aforementioned noise and holes, morphological processing is performed on the preliminary segmentation image. First, an opening operation is performed on the preliminary segmentation image. The opening operation consists of two basic operations: erosion and dilation. The erosion operation traverses the image, replacing the value of each pixel with the minimum value in its neighborhood, thus eliminating isolated small noise points. The dilation operation then replaces the value of each pixel with the maximum value in its neighborhood, restoring the slightly shrunken boundaries of the device regions. Through the opening operation, scattered, small mis-segmented regions in the preliminary segmentation image can be effectively removed. Next, a closing operation is performed on the image after the opening operation. The closing operation consists of two basic operations: dilation and erosion. The dilation operation fills in the small holes and gaps inside the device regions. The erosion operation then restores the slightly expanded boundaries to their proper positions. Through the closing operation, the internal continuity of each device region is maintained, eliminating holes caused by inaccurate model predictions. After the above morphological processing, optimized image regions corresponding to each transformer device are obtained. These regions are characterized by clear boundaries, internal integrity, and the absence of isolated noise points, laying a good foundation for the accurate extraction of subsequent color distribution statistical features.
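The opening and closing operations described above can be illustrated on a binary mask for a single device category. This is a sketch under stated assumptions: the preliminary segmentation is treated one category at a time as a 0/1 mask, the structuring element is a square of side 2 × `radius` + 1, and the border handling (outside pixels count as foreground for erosion and as background for dilation) is a choice made here so that regions touching the image edge are preserved.

```python
def _filter(mask, radius, reducer, pad):
    """Apply a min or max filter over a square window; pad fills the outside."""
    height, width = len(mask), len(mask[0])
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            window = [
                mask[j][i] if 0 <= j < height and 0 <= i < width else pad
                for j in range(y - radius, y + radius + 1)
                for i in range(x - radius, x + radius + 1)
            ]
            out[y][x] = reducer(window)
    return out


def erode(mask, radius=1):
    # Minimum filter; pixels outside the image count as foreground (1).
    return _filter(mask, radius, min, pad=1)


def dilate(mask, radius=1):
    # Maximum filter; pixels outside the image count as background (0).
    return _filter(mask, radius, max, pad=0)


def opening(mask, radius=1):
    # Erosion followed by dilation: removes small isolated noise.
    return dilate(erode(mask, radius), radius)


def closing(mask, radius=1):
    # Dilation followed by erosion: fills small holes inside a region.
    return erode(dilate(mask, radius), radius)
```

Applying `closing(opening(mask))` reproduces the order given in the text: the opening first removes isolated mis-segmented pixels, and the closing then fills small holes inside the remaining device region.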

[0078] As shown in Figure 5, the left side is the original image, and the right side is the segmentation result after semantic segmentation and morphological processing. It can be seen from the image that different devices such as insulators and transformers are accurately segmented, each forming an independent image region, with different categories distinguished by different colors. The boundaries of each device region are smooth, and there are no holes inside, indicating high segmentation quality that meets the requirements for subsequent feature extraction and temperature prediction.

[0079] Specifically, in some embodiments, before step S3, that is, before the restored image is input into the pre-trained semantic segmentation model, the method further includes: adjusting the size of the restored image to a preset size.

[0080] In practice, the restored image can be normalized to an m×n size.

[0081] Please refer to Figure 7, which is a schematic diagram of the visible light image temperature detection system in some embodiments of this application. The system includes an image acquisition module 100, a target segmentation module 200, and a temperature prediction module 300. The image acquisition module 100 is used to acquire visible light images. The target segmentation module 200 is used to perform image restoration processing based on the atmospheric scattering model on the visible light image to obtain the restored image, and to input the restored image into the pre-trained semantic segmentation model to obtain the image region corresponding to each transformer device in the restored image. The temperature prediction module 300 is used to extract the color distribution statistical features of the red, green, and blue channels in the image region corresponding to each transformer device, and to input the color distribution statistical features into a pre-trained machine learning model to obtain the predicted temperature of the transformer device.

[0082] It is understood that the above system embodiments correspond to the method embodiments of this application, and can implement the visible light image temperature detection method provided by any of the above method embodiments of this application.

[0083] It should be noted that the system embodiments described above are merely illustrative, and some or all of the modules can be selected to achieve the purpose of this embodiment according to actual needs. Furthermore, in the accompanying drawings of the system embodiments provided in this application, the connection relationships between modules indicate that they have communication connections, which can be specifically implemented as one or more communication buses or signal lines. Those skilled in the art can understand and implement this without any creative effort.

[0084] For ease of description and brevity, the system embodiments of this application include all the implementation methods described in the above-described visible light image temperature detection method embodiments, and will not be repeated here.

[0085] Based on the above embodiments of the visible light image temperature detection method, another embodiment of this application provides a terminal device, including a processor, a memory, and a computer program stored in the memory and configured to be executed by the processor. When the processor executes the computer program, it implements the visible light image temperature detection method provided in any of the above-described method embodiments of this application.

[0086] For example, in this embodiment, the computer program can be divided into one or more modules, which are stored in the memory and executed by the processor to complete this application. The one or more module units may be a series of computer program instruction segments capable of performing a specific function, which describe the execution process of the computer program in the terminal device.

[0087] The terminal device may be a desktop computer, laptop, handheld computer, or cloud server, etc. The terminal device may include, but is not limited to, a processor and a memory.

[0088] The processor can be a Central Processing Unit (CPU), or other general-purpose processors, digital signal processors (DSPs), application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, etc. A general-purpose processor can be a microprocessor or any conventional processor. The processor is the control center of the terminal device, connecting all parts of the terminal device via various interfaces and lines.

[0089] Based on the above-described method embodiments, another embodiment of this application provides a computer-readable storage medium including a stored computer program, wherein, when the computer program is executed, it controls the device where the computer-readable storage medium is located to execute the visible light image temperature detection method provided in any of the above-described method embodiments of this application.

[0090] The modules/units integrated in the device/terminal equipment, if implemented as software functional units and sold or used as independent products, can be stored in a computer-readable storage medium. Based on this understanding, all or part of the processes in the methods of the above embodiments can also be implemented by a computer program instructing the relevant hardware. The computer program can be stored in a computer-readable storage medium and, when executed by a processor, can implement the steps of the various method embodiments described above. The computer program includes computer program code, which can be in the form of source code, object code, an executable file, or certain intermediate forms. The computer-readable medium can include any entity or device capable of carrying the computer program code, such as recording media, USB flash drives, portable hard drives, magnetic disks, optical disks, computer memory, read-only memory (ROM), random access memory (RAM), electrical carrier signals, telecommunication signals, and software distribution media.

[0091] The above describes the preferred embodiments of this application. It should be noted that those skilled in the art can make several improvements and modifications without departing from the principle of this application, and these improvements and modifications are also considered to be within the scope of protection of this application.

Claims

1. A visible light image temperature detection method, characterized in that it comprises: acquiring a visible light image; performing image restoration processing based on an atmospheric scattering model on the visible light image to obtain a restored image; inputting the restored image into a pre-trained semantic segmentation model to obtain the image region corresponding to each transformer device in the restored image; for the image region corresponding to each transformer device, extracting the color distribution statistical features of the red, green, and blue channels within that image region; and inputting the color distribution statistical features into a pre-trained machine learning model to obtain the predicted temperature of the transformer device.

2. The visible light image temperature detection method according to claim 1, characterized in that performing image restoration processing based on an atmospheric scattering model on the visible light image to obtain the restored image specifically comprises: performing dark channel extraction on the visible light image to obtain a dark channel image; determining an atmospheric light value based on the dark channel image; and restoring the visible light image by calculation, based on the atmospheric light value and the atmospheric scattering model, to obtain the restored image.
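
The three steps of claim 2 can be illustrated with a minimal NumPy sketch. It assumes the commonly used atmospheric scattering model I(x) = J(x)·t(x) + A·(1 − t(x)) and the standard dark channel prior estimate of the transmission t(x); the claim itself does not state how the transmission is obtained. The function names and the parameters `patch`, `omega`, `t_min`, and `top_fraction` are hypothetical choices for illustration, not values taken from this application.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def dark_channel(img, patch=15):
    """Minimum over RGB, then a local minimum over a patch x patch window.

    img: float array of shape (H, W, 3) scaled to [0, 1]; patch must be odd.
    """
    min_rgb = img.min(axis=2)
    pad = patch // 2
    padded = np.pad(min_rgb, pad, mode="edge")
    windows = sliding_window_view(padded, (patch, patch))
    return windows.min(axis=(2, 3))


def estimate_atmospheric_light(img, dark, top_fraction=0.001):
    """Average the image colors at the brightest dark-channel pixels (assumed rule)."""
    n = max(1, int(dark.size * top_fraction))
    idx = np.argsort(dark.ravel())[-n:]
    return img.reshape(-1, 3)[idx].mean(axis=0)


def restore(img, patch=15, omega=0.95, t_min=0.1):
    """Invert I = J*t + A*(1 - t) for J, using a dark-channel transmission estimate."""
    dark = dark_channel(img, patch)
    A = np.maximum(estimate_atmospheric_light(img, dark), 1e-6)
    # Assumed transmission estimate from the dark channel prior.
    t = 1.0 - omega * dark_channel(img / A, patch)
    t = np.clip(t, t_min, 1.0)[..., None]
    J = (img - A) / t + A
    return np.clip(J, 0.0, 1.0)
```

The lower bound `t_min` keeps the division stable in dense haze, where the estimated transmission approaches zero.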

3. The visible light image temperature detection method according to claim 1, characterized in that, for the image region corresponding to each transformer device, extracting the color distribution statistical features of the red, green, and blue channels within that image region comprises: for each transformer device, obtaining the grayscale value matrices of the image region in the red, green, and blue channels respectively; calculating the grayscale distribution statistics of each color channel based on the grayscale value matrix of that color channel, wherein the grayscale distribution statistics include mean, standard deviation, kurtosis, skewness, mode, median, peak value, upper quantile, and lower quantile; and combining the grayscale distribution statistics of the three color channels to obtain the chromaticity feature vector corresponding to the image region.
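
As a rough illustration of claim 3, the sketch below computes the nine listed statistics for each color channel inside a region mask and concatenates them into a 27-element vector. Several details are assumptions made only for this example: 8-bit gray levels, excess kurtosis, the "peak value" taken as the maximum gray level, and the upper and lower quantiles taken as the 75th and 25th percentiles. The function names are hypothetical.

```python
import numpy as np


def channel_stats(values):
    """Nine gray-level statistics for one channel of one region (1-D uint8 input)."""
    v = values.astype(np.float64)
    mean = v.mean()
    std = v.std()
    centered = v - mean
    m2 = np.mean(centered ** 2)
    if m2 > 0:
        skew = np.mean(centered ** 3) / m2 ** 1.5
        kurt = np.mean(centered ** 4) / m2 ** 2 - 3.0  # excess kurtosis (assumption)
    else:
        skew = kurt = 0.0  # constant region: define both as zero
    mode = float(np.bincount(values.astype(np.int64), minlength=256).argmax())
    median = float(np.median(v))
    peak = float(v.max())                    # "peak value" read as maximum (assumption)
    upper_q = float(np.percentile(v, 75))    # assumed upper quantile
    lower_q = float(np.percentile(v, 25))    # assumed lower quantile
    return np.array([mean, std, kurt, skew, mode, median, peak, upper_q, lower_q])


def chromaticity_feature_vector(image, mask):
    """Concatenate the statistics of the R, G, and B channels inside a boolean mask.

    image: uint8 array of shape (H, W, 3); mask: bool array of shape (H, W).
    """
    feats = [channel_stats(image[..., c][mask]) for c in range(3)]
    return np.concatenate(feats)
```

Within the returned vector, elements 0 to 8 belong to the red channel, 9 to 17 to the green channel, and 18 to 26 to the blue channel.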

4. The visible light image temperature detection method according to claim 1, characterized in that inputting the color distribution statistical features into a pre-trained machine learning model to obtain the predicted temperature of the transformer device comprises: for the image region corresponding to each transformer device, standardizing the color distribution statistical features to obtain a standardized feature vector; and inputting each standardized feature vector into the pre-trained machine learning model, which performs relational mapping on the standardized feature vector and outputs the predicted temperature of the transformer device.
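
The standardization step of claim 4 could, for example, be a per-feature z-score. The sketch below assumes that the mean and standard deviation are estimated on the training features and reused at prediction time; the claim does not specify the standardization formula, and the function names are hypothetical.

```python
import numpy as np


def fit_standardizer(train_features, eps=1e-12):
    """Per-feature mean and standard deviation estimated from training data."""
    mean = train_features.mean(axis=0)
    std = train_features.std(axis=0)
    # Guard against constant features so the division below stays finite.
    std = np.where(std < eps, 1.0, std)
    return mean, std


def standardize(features, mean, std):
    """Apply the z-score transform with previously fitted statistics."""
    return (features - mean) / std
```

Reusing the training statistics at inference keeps the feature scale seen by the model consistent between training and deployment.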

5. The visible light image temperature detection method according to claim 4, characterized in that the training process of the pre-trained machine learning model comprises: obtaining a training sample set containing visible light images of multiple transformer devices; dividing the training sample set into a training subset and a validation subset using k-fold cross-validation; training a preset regression model with the training subset and validating the trained regression model with the validation subset, using the mean absolute error as the evaluation index; and, when the mean absolute error meets a preset threshold, determining the trained regression model as the machine learning model.
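
The training procedure of claim 5 can be sketched as follows. Ridge regression is used here purely as a stand-in for the "preset regression model", and "meets a preset threshold" is read as the cross-validated mean absolute error being no greater than the threshold; both are assumptions, as are all function names and the parameters `k`, `alpha`, and `seed`.

```python
import numpy as np


def ridge_fit(X, y, alpha=1.0):
    """Closed-form ridge regression with an unpenalized intercept (stand-in model)."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    reg = alpha * np.eye(Xb.shape[1])
    reg[-1, -1] = 0.0  # do not penalize the intercept term
    return np.linalg.solve(Xb.T @ Xb + reg, Xb.T @ y)


def ridge_predict(X, w):
    return np.hstack([X, np.ones((X.shape[0], 1))]) @ w


def kfold_mae(X, y, k=5, alpha=1.0, seed=0):
    """Mean absolute error averaged over k train/validation splits."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    maes = []
    for i in range(k):
        val = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        w = ridge_fit(X[train], y[train], alpha)
        maes.append(np.mean(np.abs(ridge_predict(X[val], w) - y[val])))
    return float(np.mean(maes))


def accept_model(X, y, mae_threshold, k=5, alpha=1.0):
    """Return (weights, mae); weights is None when the MAE exceeds the threshold."""
    mae = kfold_mae(X, y, k, alpha)
    if mae > mae_threshold:
        return None, mae
    return ridge_fit(X, y, alpha), mae
```

In this sketch the accepted model is refit on the full sample set after validation, which is one common convention rather than something the claim requires.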

6. The visible light image temperature detection method according to claim 1, characterized in that inputting the restored image into a pre-trained semantic segmentation model to obtain the image region corresponding to each transformer device in the restored image comprises: inputting the restored image into the pre-trained semantic segmentation model to obtain the device category label corresponding to each pixel in the restored image; generating a preliminary segmentation image based on the device category label of each pixel; and performing morphological processing on the preliminary segmentation image to obtain the image region corresponding to each transformer device.
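
One possible form of the morphological processing in claim 6 is a binary opening (erosion followed by dilation) applied to each device class, which removes isolated mislabeled pixels from the preliminary segmentation. The claim does not name the specific operation, so the opening, the square 3 × 3 structuring element, and the function names below are assumptions for illustration.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def _windows(mask, size):
    """All size x size neighborhoods of a boolean mask, padded with False."""
    pad = size // 2
    padded = np.pad(mask, pad, mode="constant", constant_values=False)
    return sliding_window_view(padded, (size, size))


def erode(mask, size=3):
    """A pixel survives only if its whole neighborhood is set."""
    return _windows(mask, size).all(axis=(2, 3))


def dilate(mask, size=3):
    """A pixel is set if any pixel in its neighborhood is set."""
    return _windows(mask, size).any(axis=(2, 3))


def device_regions(label_map, device_labels, size=3):
    """Map each device label to its cleaned boolean region mask.

    label_map: integer array of shape (H, W) with one category label per pixel.
    """
    regions = {}
    for label in device_labels:
        mask = label_map == label
        regions[label] = dilate(erode(mask, size), size)  # binary opening
    return regions
```

Because the padding is False, regions touching the image border are eroded at the border as well; a different border rule may be preferable for devices cut off by the frame.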

7. The visible light image temperature detection method according to claim 1, characterized in that, before inputting the restored image into the pre-trained semantic segmentation model, the method further comprises: adjusting the size of the restored image to a preset size.

8. A visible light image temperature detection system, characterized in that it comprises an image acquisition module, a target segmentation module, and a temperature prediction module; the image acquisition module is used to acquire a visible light image; the target segmentation module is used to perform image restoration processing based on an atmospheric scattering model on the visible light image to obtain a restored image, and to input the restored image into a pre-trained semantic segmentation model to obtain the image region corresponding to each transformer device in the restored image; and the temperature prediction module is used to extract the color distribution statistical features of the red, green, and blue channels within the image region corresponding to each transformer device, and to input the color distribution statistical features into a pre-trained machine learning model to obtain the predicted temperature of the transformer device.

9. A terminal device, characterized in that it comprises a processor, a memory, and a computer program stored in the memory and configured to be executed by the processor, wherein, when the processor executes the computer program, it implements the visible light image temperature detection method according to any one of claims 1-7.

10. A computer-readable storage medium, characterized in that it comprises a stored computer program, wherein, when the computer program is executed, it controls the device in which the computer-readable storage medium is located to perform the visible light image temperature detection method according to any one of claims 1-7.