A method for predicting cleaning opportunity of photovoltaic dust accumulation for operation and maintenance decision

By constructing counterfactual cleaning reference trajectories at edge computing nodes, the method separates dust accumulation loss in photovoltaic arrays from the effects of weather and equipment status, enabling stable prediction of cleaning timing and improved power generation revenue.

CN122243460A · Pending · Publication Date: 2026-06-19 · JINGNENGSHEN (SUZHOU) ENERGY TECH CO LTD +1

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Applications (China)
Current Assignee / Owner: JINGNENGSHEN (SUZHOU) ENERGY TECH CO LTD
Filing Date: 2026-03-13
Publication Date: 2026-06-19

AI Technical Summary

Technical Problem

Under edge computing conditions, existing technologies cannot effectively distinguish dust accumulation loss in local photovoltaic arrays from disturbances caused by weather, temperature, and equipment status. This leads to unstable cleaning judgments, conclusions that change readily from one time period to the next, and power generation gains after cleaning that are inconsistent with the prior estimate.

Method used

By constructing a counterfactual cleaning reference trajectory at the edge computing node, and using power generation output, environmental perception, and equipment status data, a cleaning benefit trajectory is generated to determine the cleaning trigger period, and timing is matched by combining operational resources and operational constraint data.

Benefits of technology

It improves the stability and consistency of cleaning decisions, reduces the influence of factors other than dust accumulation on the assessment of cleaning benefits, and enhances the power generation efficiency of photovoltaic arrays.

✦ Generated by Eureka AI based on patent content.


Abstract

This invention discloses a method for predicting the timing of dust accumulation cleaning in photovoltaic power plants, in the field of intelligent operation and maintenance of photovoltaic power plants. The method includes acquiring edge monitoring data of a target local array within the current prediction period; the edge monitoring data includes power generation output data, environmental perception data, equipment status data, and historical cleaning records. Time-series alignment and segmentation are performed on the edge monitoring data to generate a current state sequence of the target local array. A set of historical operating segments of the target local array is read, and candidates are selected according to the similarity between the current state sequence and each historical operating segment in terms of environmental change process, output change process, and state change process. By constructing a counterfactual cleaning reference trajectory corresponding to the current state sequence and solving for the output difference time by time, the method provides a cleaning reference for the target local array at the current moment, which mitigates the difficulty of distinguishing dust accumulation loss from weather, temperature, and equipment status disturbances.

Description

Technical Field

[0001] This invention relates to the field of intelligent operation and maintenance technology for photovoltaic power plants, and more specifically, to a method for predicting the timing of cleaning accumulated dust in photovoltaic power plants for operation and maintenance decision-making.

Background Technology

[0002] In the operation and maintenance of photovoltaic power plants, most existing technologies address the question of when cleaning should be scheduled. The common practice is to collect data on module power, irradiance, temperature, wind speed, rainfall, and images or other dust-related data, and then to determine from this data whether the current decline in power generation is related to dust accumulation. A judgment on whether cleaning is needed is then made using preset thresholds, a historically better power generation level, or comparison with adjacent branches. This approach can be effective when regional conditions are relatively stable and there are few influencing factors. In practice, especially in large photovoltaic power plants that adopt an edge IoT architecture, the orientation, tilt angle, shading conditions, surface dust conditions, and equipment status of modules in different areas of the same plant are often inconsistent. On-site factors such as cloud disturbance, module temperature rise, inverter power limiting, branch mismatch, and device aging often occur at the same time. Furthermore, under weak network conditions the data must be processed locally by edge nodes; the system can rely neither on long-term upload of the full data to the cloud nor on frequent manual verification. In this situation, existing practices keep producing the following phenomena. In some areas, a short-term drop in power generation that is clearly caused by changes in weather, temperature, or equipment status is attributed to increased dust accumulation. In other areas, dust accumulation does have an impact, but the final cleaning judgment is unstable because it is based on the better power generation value of some historical day or on the operating results of neighboring branches. Conclusions for different time periods are prone to change, and the power generation benefit actually restored after cleaning is often inconsistent with the prior judgment. Further analysis shows that the problem is not that existing technologies cannot compare data, but that they lack a reliable reference that truly represents the power generation state the local array should have at the current moment if it had been cleaned. They therefore cannot reliably separate the losses caused by dust accumulation from the losses caused by weather, temperature, aging, and control factors. Accordingly, the technical problem to be solved by this application is how to establish, under edge computing conditions, a cleaning reference for a local photovoltaic array that can be used for judgment at the current moment, so as to predict the actual cleaning benefit and cleaning timing of the local array.

Summary of the Invention

[0003] To overcome the aforementioned deficiencies in the prior art, embodiments of the present invention provide a photovoltaic dust accumulation cleaning timing prediction method for operation and maintenance decision-making. At the edge computing node, the method constructs a counterfactual cleaning reference trajectory of the target local array at the current moment from historical cleaning recovery segments, performs difference attribution between the current power generation output and the counterfactual cleaning reference trajectory to generate a cleaning benefit trajectory, and matches that trajectory against the executable time periods to determine the cleaning trigger period, thereby solving the problems mentioned in the background art.

[0004] To achieve the above objectives, the present invention provides the following technical solution: a method for predicting the timing of photovoltaic dust accumulation cleaning for operation and maintenance decision-making, comprising:
S1. Obtain edge monitoring data of the target local array within the current prediction period. The edge monitoring data includes power generation output data, environmental perception data, equipment status data, and historical cleaning records. Perform time-series alignment and segmentation based on the edge monitoring data to generate the current state sequence of the target local array.
S2. Read the set of historical operating segments of the target local array, perform candidate filtering according to the similarity between the current state sequence and each historical operating segment in terms of environmental change process, output change process, and state change process, and generate a set of candidate reference segments corresponding to the target local array.
S3. Perform a post-cleaning power generation characterization rollback calculation on the candidate reference segment set, extract the cleaning recovery features corresponding to the current state sequence in each candidate reference segment, and perform reference fusion based on the matching consistency between each candidate reference segment and the current state sequence to generate the counterfactual cleaning reference trajectory of the target local array in the current prediction period.
S4. Solve the time-by-time difference between the current state sequence and the counterfactual cleaning reference trajectory, separate the effective loss caused by dust accumulation from the disturbance loss caused by factors other than dust accumulation, and generate the cleaning benefit trajectory of the target local array based on the continued change of the effective loss over the subsequent prediction interval.
S5. Perform timing matching calculations between the cleaning benefit trajectory and the preset operation and maintenance execution conditions to determine the target cleaning trigger period of the target local array, and output the cleaning timing prediction result corresponding to the target cleaning trigger period.

[0005] In a preferred embodiment, S1 includes:
S1-1. Perform time mapping and missing data completion on the power generation output data, environmental perception data, equipment status data, and historical cleaning records according to a unified sampling time scale, generating a data alignment set corresponding to each sampling time. The power generation output data includes numerical variables that can be collected directly by metering devices, such as array DC voltage, array DC current, inverter AC active power, inverter AC reactive power, cumulative power generation, and sampling timestamps. The environmental perception data includes numerical variables output by meteorological sensors, such as global horizontal irradiance or module plane-of-array irradiance, ambient temperature, module backsheet temperature, relative humidity, wind speed, wind direction, rainfall, and the corresponding sampling timestamps. The equipment status data includes discrete or numerical variables that characterize equipment operating conditions and availability, such as inverter operating status codes, power limit flags, power limit settings, fault alarm codes, string open circuit status, insulation impedance values, and communication link quality indicators. The historical cleaning records include structured event variables that can be used to trace cleaning behavior, such as cleaning start time, cleaning end time, cleaning object identifier, cleaning method identifier, cleaning water consumption or equipment operation frequency, power generation difference before and after cleaning, and personnel or equipment identifiers.
S1-2. Segment and slice the data alignment set according to the preset continuous time period division rules, extract the power generation output change characteristics, environmental change characteristics, equipment status change characteristics, and cleaning behavior correlation characteristics in each segment, and generate the data feature fragment corresponding to each segment. The power generation output change characteristics include the change amplitudes of array DC voltage, array DC current, and AC active power, the cumulative power generation increment, and the direction and rate of change of each output variable. The environmental change characteristics include the change amplitudes of irradiance, ambient temperature, module temperature, humidity, wind speed, wind direction, and rainfall, and the continuity of change of each environmental variable. The equipment status change characteristics include the number of inverter operating status code switches, power limit flag changes, fault alarm code changes, string on/off status switches, communication quality index fluctuations, and the duration of each equipment operating condition variable. The cleaning behavior correlation characteristics include the time positions of the cleaning start time and end time, the cleaning object identifier, the cleaning method identifier, the power generation change before and after cleaning, the duration of the cleaning action, and the degree of power generation recovery after cleaning.
S1-3. Concatenate and encode the data feature fragments in time order to generate a current state sequence that represents the evolution of the current operating state of the target local array.
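The slicing in S1-2 can be pictured as a sliding window over the aligned records. The sketch below is illustrative only: `slice_segments`, `segment_len`, `step`, and the three features it computes for a single variable (amplitude, direction, mean rate of change) are hypothetical simplifications, not the encoding defined by the patent.

```python
from typing import Dict, List


def slice_segments(values: List[float], segment_len: int, step: int) -> List[Dict[str, float]]:
    """Slide a fixed-length window over one aligned variable and summarise each window."""
    fragments = []
    for start in range(0, len(values) - segment_len + 1, step):
        window = values[start:start + segment_len]
        diffs = [b - a for a, b in zip(window, window[1:])]
        net = window[-1] - window[0]
        fragments.append({
            "start": start,                      # index of the first sampling time
            "end": start + segment_len - 1,      # index of the last sampling time
            "amplitude": max(window) - min(window),
            "direction": (net > 0) - (net < 0),  # +1 rising, -1 falling, 0 flat
            "mean_rate": sum(diffs) / len(diffs) if diffs else 0.0,
        })
    return fragments


# Hypothetical AC active power samples (kW) on the unified time scale.
power = [10.0, 12.0, 15.0, 14.0, 11.0, 9.0]
state_sequence = slice_segments(power, segment_len=3, step=2)
```

Because the fragments are produced in start-index order, appending them in that order already yields a time-ordered sequence of the kind S1-3 describes.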

[0006] In a preferred embodiment, step S2 includes the following steps:
S2-1. Slice the current state sequence and each historical operating segment into segments of the same time length. In each segment, extract the time-series change values of irradiance, ambient temperature, module temperature, wind speed, and rainfall as the environmental change process; extract the time-series change values of array DC voltage, array DC current, AC active power, and cumulative power generation increment as the output change process; and extract the time-series change values of inverter operating status code, power limit flag, fault alarm code, string on/off status, and communication quality indicators as the state change process. Generate the current process feature segments and the historical process feature segments.
S2-2. Compare each current process feature segment with its corresponding historical process feature segment by time position. Calculate the cumulative result of the environmental variable change differences at each time, the cumulative result of the output variable change differences at each time, and the combined result of the state variable change differences and the difference in the number of state switches at each time, generating the environmental process difference value, output process difference value, and state process difference value corresponding to each historical operating segment.
S2-3. Sort the environmental process difference values, output process difference values, and state process difference values of the historical operating segments in the same direction, retain the historical operating segments whose three process difference values all rank at the front of their respective sorts, and generate the set of candidate reference segments corresponding to the target local array.
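One way to read S2-2 and S2-3 is sketched below. It is an assumption-laden illustration: `process_difference` uses a plain sum of absolute differences, the state switching count is omitted, and the cut-off `top_k` for "ranking at the front" is a hypothetical parameter that the text does not specify.

```python
from typing import Dict, List, Sequence


def process_difference(current: Sequence[float], historical: Sequence[float]) -> float:
    """Accumulate the absolute time-by-time difference between two equal-length change series."""
    return sum(abs(c - h) for c, h in zip(current, historical))


def select_candidates(diffs: Dict[str, Dict[str, float]], top_k: int) -> List[str]:
    """Keep segments ranked within the first top_k positions for all three difference types."""
    kept = set(diffs)
    for kind in ("env", "output", "state"):
        ranked = sorted(diffs, key=lambda seg: diffs[seg][kind])  # smallest difference first
        kept &= set(ranked[:top_k])
    return sorted(kept)


# Hypothetical difference values for three historical operating segments.
diffs = {
    "A": {"env": 0.2, "output": 0.5, "state": 1.0},
    "B": {"env": 0.1, "output": 0.4, "state": 0.0},
    "C": {"env": 0.9, "output": 0.1, "state": 2.0},
}
candidates = select_candidates(diffs, top_k=2)
```

Requiring a segment to rank well on all three orderings is what keeps a segment that only matches on power (such as `C` here) out of the candidate set.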

[0007] In a preferred embodiment, S3 includes:
S3-1. For each candidate reference segment in the candidate reference segment set, locate the cleaning start time and cleaning end time. Extract the post-cleaning output sequence that follows the cleaning end time and has the same length as the current prediction period, and the pre-cleaning output sequence that precedes the cleaning start time and occupies the same time positions as the current prediction period. Take the time-by-time difference between the post-cleaning output sequence and the pre-cleaning output sequence to generate the cleaning recovery feature sequence corresponding to each candidate reference segment.
S3-2. Add the environmental process difference value, output process difference value, and state process difference value of each candidate reference segment to generate its total segment difference value. Divide the reciprocal of the total segment difference value of each candidate reference segment by the sum of the reciprocals of the total segment difference values of all candidate reference segments to generate the matching consistency coefficient corresponding to each candidate reference segment.
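The matching consistency coefficient in S3-2 is a normalised inverse of the total segment difference value, so closer segments receive larger weights and all weights sum to 1. A minimal sketch follows; the function names are hypothetical, the recovery feature is assumed to be post-cleaning minus pre-cleaning output, and every total difference is assumed to be strictly positive.

```python
from typing import Dict, List


def recovery_features(pre: List[float], post: List[float]) -> List[float]:
    """Time-by-time output change across a cleaning event (assumed sign: post minus pre)."""
    return [after - before for before, after in zip(pre, post)]


def consistency_coefficients(total_diff: Dict[str, float]) -> Dict[str, float]:
    """Normalised reciprocal of each segment's total difference value."""
    inverse = {seg: 1.0 / d for seg, d in total_diff.items()}  # assumes every d > 0
    norm = sum(inverse.values())
    return {seg: inv / norm for seg, inv in inverse.items()}


# Total difference = environmental + output + state process difference (hypothetical values).
totals = {"A": 0.5 + 1.0 + 0.5, "B": 1.0 + 2.0 + 1.0}
weights = consistency_coefficients(totals)
increments = recovery_features(pre=[10.0, 12.0], post=[13.0, 14.0])
```

A segment whose total difference is exactly zero would need separate handling (for example, a small floor value), which the text does not address.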

[0008] In a preferred embodiment, S3 further includes:
S3-3. Map the cleaning recovery feature sequence of each candidate reference segment onto the sampling times of the current state sequence by time position. Taking the output value of the current power generation output sequence at each sampling time as the base, add the cleaning recovery increment of the corresponding sampling time, time by time, to generate the candidate recovery output sequence corresponding to each candidate reference segment. Then calculate the deviation correction amount of each candidate recovery output sequence with respect to the environmental change process and state change process in the current state sequence at the same sampling time. Fuse the corrected candidate recovery output sequences, time by time, by position-wise weighting with the corresponding matching consistency coefficients, to generate the counterfactual cleaning reference trajectory of the target local array, which unfolds continuously along the time sequence over the current prediction period.
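The fusion in S3-3 can be sketched as a weighted sum at each sampling time. In the illustration below, `fuse_reference` and its argument names are hypothetical, and the deviation correction is assumed to be an additive term supplied by the caller, since the text does not state how it is computed.

```python
from typing import Dict, List


def fuse_reference(current: List[float],
                   recovery: Dict[str, List[float]],
                   correction: Dict[str, List[float]],
                   weights: Dict[str, float]) -> List[float]:
    """Weighted, time-by-time fusion of corrected candidate recovery outputs."""
    trajectory = []
    for t, output in enumerate(current):
        fused = 0.0
        for seg, w in weights.items():
            candidate = output + recovery[seg][t]   # candidate recovered output
            candidate += correction[seg][t]         # assumed additive deviation correction
            fused += w * candidate
        trajectory.append(fused)
    return trajectory


# Hypothetical inputs for two candidate reference segments.
reference = fuse_reference(
    current=[100.0, 110.0],
    recovery={"A": [10.0, 12.0], "B": [6.0, 8.0]},
    correction={"A": [0.0, -2.0], "B": [2.0, 0.0]},
    weights={"A": 0.75, "B": 0.25},
)
```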

[0009] In a preferred embodiment, S4 includes:
S4-1. Match the current power generation output sequence in the current state sequence with the counterfactual cleaning reference trajectory at the same sampling times, calculate the output difference at each sampling time, and synchronously read the irradiance change value, ambient temperature change value, module temperature change value, wind speed change value, rainfall change value, inverter operating status code change value, power limit flag change value, fault alarm code change value, and string on/off status change value corresponding to each sampling time, generating a time-by-time difference data set.
S4-2. For each time-by-time difference data set, calculate the environmental disturbance interpretation value and the state disturbance interpretation value. The environmental disturbance interpretation value is the sum of the products obtained by multiplying the irradiance change value, ambient temperature change value, module temperature change value, wind speed change value, and rainfall change value at each sampling time by the response ratio of the same type of variable to the output difference in the corresponding historical operating segment. The state disturbance interpretation value is the sum of the output offsets corresponding to the inverter operating status code change value, power limit flag change value, fault alarm code change value, and string on/off status change value at each sampling time. Then subtract the environmental disturbance interpretation value and the state disturbance interpretation value from the output difference at each sampling time to generate the residual difference at each sampling time.
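The residual in S4-2 is the output difference minus the parts explained by environmental and state changes. A minimal sketch for one sampling time follows; the function and key names are hypothetical, and the output difference is assumed to be reference minus current output (the text does not fix the sign).

```python
from typing import Dict


def residual_difference(current_output: float, reference_output: float,
                        env_changes: Dict[str, float], response_ratio: Dict[str, float],
                        state_offsets: Dict[str, float]) -> float:
    """Output difference minus environmental and state disturbance interpretation values."""
    output_diff = reference_output - current_output   # assumed sign: shortfall vs. reference
    env_explained = sum(change * response_ratio[name] for name, change in env_changes.items())
    state_explained = sum(state_offsets.values())
    return output_diff - env_explained - state_explained


# Hypothetical values: a 10 kW shortfall, 3 kW explained by environment, 2 kW by a power limit.
residual = residual_difference(
    current_output=90.0,
    reference_output=100.0,
    env_changes={"irradiance": -16.0, "module_temp": 4.0},
    response_ratio={"irradiance": -0.125, "module_temp": 0.25},
    state_offsets={"power_limit": 2.0},
)
```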

[0010] In a preferred embodiment, S4 further includes:
S4-3. Arrange the residual differences at each sampling time in chronological order and calculate the difference between the residual differences at adjacent sampling times. Classify as smooth continuation segments the consecutive sampling periods in which the absolute value of that adjacent difference is less than the absolute value of the residual difference at the previous sampling time, and classify the remaining sampling periods as abrupt disturbance segments. Then record the residual differences in the smooth continuation segments as effective loss candidate values, and record the residual differences in the abrupt disturbance segments as disturbance loss candidate values.
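The rule in S4-3 compares each step change in the residual with the magnitude of the previous residual. A small sketch of that test is shown below; `classify_steps` is a hypothetical name, and leaving the first sampling time unlabeled (it has no predecessor) is an assumption.

```python
from typing import List


def classify_steps(residuals: List[float]) -> List[str]:
    """Label each sampling time after the first as 'smooth' or 'abrupt'."""
    labels = []
    for prev, curr in zip(residuals, residuals[1:]):
        # Smooth when the step change is smaller than the previous residual's magnitude.
        labels.append("smooth" if abs(curr - prev) < abs(prev) else "abrupt")
    return labels


# Hypothetical residual differences (kW); the jump from 6.0 to 13.0 is the abrupt step.
labels = classify_steps([5.0, 5.5, 6.0, 13.0, 12.5])
```

Consecutive `smooth` labels then form a smooth continuation segment, whose residuals become effective loss candidate values.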

[0011] In a preferred embodiment, S4 further includes:
S4-4. Read the power generation recovery sequence following each cleaning event in the historical cleaning records. Compare each smooth continuation segment with each power generation recovery sequence of the same time length, and calculate the cumulative absolute difference over corresponding times. Take the power generation recovery sequence with the smallest cumulative result as the cleaning-corresponding recovery sequence of that smooth continuation segment. Then take the effective loss candidate values whose change direction is consistent with that of the cleaning-corresponding recovery sequence as the effective loss amount, and incorporate the effective loss candidate values whose change direction is inconsistent into the disturbance loss amount.
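The matching in S4-4 amounts to a nearest-sequence search under a sum-of-absolute-differences distance, followed by a direction check. The sketch below is one possible reading: the function names and event ids are hypothetical, and "change direction" is interpreted as the sign of the step between adjacent sampling times.

```python
from typing import Dict, List, Tuple


def sign(x: float) -> int:
    return (x > 0) - (x < 0)


def closest_recovery(segment: List[float], recoveries: Dict[str, List[float]]) -> str:
    """Id of the equal-length recovery sequence with the smallest summed |difference|."""
    same_length = {k: v for k, v in recoveries.items() if len(v) == len(segment)}
    return min(same_length,
               key=lambda k: sum(abs(a - b) for a, b in zip(segment, same_length[k])))


def split_by_direction(segment: List[float],
                       recovery: List[float]) -> Tuple[List[float], List[float]]:
    """Values whose step direction matches the recovery are effective; others are disturbance."""
    effective, disturbance = [], []
    for t in range(1, len(segment)):
        same = sign(segment[t] - segment[t - 1]) == sign(recovery[t] - recovery[t - 1])
        (effective if same else disturbance).append(segment[t])
    return effective, disturbance


segment = [4.0, 5.0, 4.5]
recoveries = {"event_1": [4.5, 5.5, 6.5], "event_2": [1.0, 1.0, 1.0], "event_3": [4.0, 5.0]}
match = closest_recovery(segment, recoveries)
effective, disturbance = split_by_direction(segment, recoveries[match])
```

`closest_recovery` raises `ValueError` when no recovery sequence has the same length as the segment, so a caller would need to handle that case.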

[0012] In a preferred embodiment, S4 further includes:
S4-5. Extend the effective loss amount into the subsequent prediction interval in chronological order. At each future sampling time, take the loss amount of the previous sampling time, subtract the scouring reduction amount produced by the change in rainfall at that future sampling time, and add the dust increase produced by the change in wind speed, generating the continued loss amount for each future sampling time. Then concatenate the effective loss amount in the current prediction period with the continued loss amounts in the subsequent prediction interval in chronological order to generate the cleaning benefit trajectory of the target local array.
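The extension in S4-5 can be read as a simple recursion: next loss = previous loss, minus rain scouring, plus wind-driven dust. The sketch below assumes the scouring reduction and dust increase are already available per future sampling time (how they are derived from rainfall and wind speed is not specified in the text); `extend_loss` is a hypothetical name.

```python
from typing import List


def extend_loss(last_effective_loss: float,
                scour_reduction: List[float],
                dust_increase: List[float]) -> List[float]:
    """Propagate the loss forward: loss[t] = loss[t-1] - scour[t] + dust[t]."""
    losses = []
    loss = last_effective_loss
    for scour, dust in zip(scour_reduction, dust_increase):
        loss = loss - scour + dust
        losses.append(loss)
    return losses


# Hypothetical values: effective loss in the current period, then three future sampling times.
effective_loss = [4.0, 5.0]
continued = extend_loss(effective_loss[-1],
                        scour_reduction=[0.0, 2.0, 0.0],
                        dust_increase=[0.5, 0.0, 0.25])
benefit_trajectory = effective_loss + continued
```

The sketch does not clamp the loss at zero; whether a heavy rain event can drive the modelled loss negative is left open by the text.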

[0013] In a preferred embodiment, step S5 includes the following steps:
S5-1. Obtain the operation resource data and operation constraint data corresponding to the target local array. The operation resource data includes the number of available cleaning devices, the operating capacity of the cleaning equipment per unit time, the arrival time of the cleaning equipment, and the expected operation duration. The operation constraint data includes the allowed downtime periods, the prohibited operation periods, and the weather-related prohibition conditions. Generate a set of executable time periods from the operation resource data and operation constraint data.
S5-2. Enumerate candidate continuous time periods of the cleaning benefit trajectory within the set of executable time periods. For each candidate continuous time period, calculate the total recoverable revenue and the total operating cost. The total recoverable revenue is the difference between the cleaning benefit value at the start time and the cleaning benefit value at the end time of that candidate continuous time period. The total operating cost is the product of the expected operation duration and the operating cost per unit time. Calculate the ratio of the total recoverable revenue to the total operating cost as the timing benefit value.
S5-3. Sort the timing benefit values of the candidate continuous time periods, select the candidate continuous time period with the largest timing benefit value as the target cleaning trigger period, and output the cleaning timing prediction result corresponding to the target cleaning trigger period.
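Steps S5-2 and S5-3 reduce to scoring each candidate window by a revenue-to-cost ratio and taking the maximum. The following sketch is illustrative: `best_window` is a hypothetical name, windows are given as (start index, end index) pairs on the benefit trajectory, and revenue is taken literally as the start value minus the end value.

```python
from typing import List, Tuple


def best_window(benefit: List[float], windows: List[Tuple[int, int]],
                duration_h: float, unit_cost: float) -> Tuple[Tuple[int, int], float]:
    """Return the window with the largest timing benefit value and that value."""
    cost = duration_h * unit_cost                       # total operating cost
    scored = [((s, e), (benefit[s] - benefit[e]) / cost) for s, e in windows]
    return max(scored, key=lambda item: item[1])


# Hypothetical cleaning benefit trajectory and executable windows.
benefit = [5.0, 5.5, 3.5, 3.75, 2.0]
windows = [(0, 2), (1, 4), (2, 3)]
target, value = best_window(benefit, windows, duration_h=2.0, unit_cost=0.5)
```

With a single expected operation duration, the cost is the same for every window and the ranking follows revenue alone; a per-window duration would make the ratio discriminate between long and short windows.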

[0014] The technical effects and advantages of this invention are as follows. By constructing a counterfactual cleaning reference trajectory corresponding to the current state sequence and solving the output difference time by time, a cleaning reference for the target local array at the current moment can be provided, which mitigates the difficulty of distinguishing dust accumulation loss from weather, temperature, and equipment status disturbances. By aligning the power generation output data, environmental perception data, equipment status data, and historical cleaning records to a unified time scale and segmenting them, a current state sequence with a consistent structure can be formed, which reduces the interference of asynchronous multi-source sampling with subsequent similarity screening and difference calculation. By jointly screening historical operating segments according to environmental change process, output change process, and state change process, candidate reference segments closer to the current operating conditions can be obtained, which suppresses the reference offset and misselection caused by relying on power similarity alone. By extracting the cleaning recovery feature sequences of the candidate reference segments and fusing them using the matching consistency coefficients and deviation correction amounts, a counterfactual cleaning reference trajectory closer to the current operating conditions can be reconstructed, which improves the stability of the estimate of post-cleaning recovery. By subtracting the environmental disturbance and state disturbance interpretation values from the output difference and verifying the effective loss amount against historical cleaning recovery sequences, the probability that factors other than dust accumulation are mixed into the cleaning benefit judgment can be reduced, and the attribution of losses can be improved. By timing-matching the cleaning benefit trajectory with the operation resource data and operation constraint data, the target cleaning trigger period can be determined within the executable time periods, thereby improving the consistency between cleaning decisions and on-site operating conditions.

Attached Figure Description

[0015] Figure 1 is a flowchart of the method steps of the present invention.

Detailed Implementation

[0016] The technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of the present invention, and not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.

[0017] Referring to Figure 1 of the accompanying drawings, the present invention provides a method for predicting the timing of photovoltaic dust accumulation cleaning for operation and maintenance decision-making, comprising: S1. Obtain edge monitoring data of the target local array within the current prediction period. The edge monitoring data includes power generation output data, environmental perception data, equipment status data, and historical cleaning records. Perform time-series alignment and segmentation based on the edge monitoring data to generate the current state sequence of the target local array. This implementation process organizes the power generation output data, environmental perception data, equipment status data, and historical cleaning records into a directly computable input set on the same time reference at the edge computing node. It then transforms this input set into a current state sequence that characterizes the short-term operational evolution of the target local array, ensuring that the subsequent candidate selection of historical operating segments and the construction of the counterfactual cleaning reference trajectory are based on the same temporal structure. This implementation process includes the following steps.
The purpose of S1-1 is to eliminate alignment errors caused by inconsistent sampling frequencies and missing data across multiple sources, ensuring that each sampling time has a complete set of fields that can be compared directly. The input includes the power generation output data, environmental perception data, equipment status data, and historical cleaning records from the edge monitoring data, along with their respective sampling timestamps or event timestamps. Processing actions include determining a unified sampling time scale and performing time mapping and missing data completion on the four types of data.
The unified sampling time scale is generated by the system clock of the edge computing node and forms a time scale sequence with a fixed sampling period. Time mapping uses nearest-neighbor time scale matching to map each original sampling point to its corresponding position on the unified sampling time scale. Missing data completion uses a forward hold method: for numerical variables, it fills in the most recent valid sampled value and records a missing flag; for discrete state variables, it fills in the most recent state code and records a state hold flag. For event-type historical cleaning records, the cleaning start time and cleaning end time are mapped by time to event occupancy markers over the corresponding time scale interval, and the cleaning object identifier and cleaning method identifier are written into that interval. The output is a data alignment set that corresponds one-to-one with the unified sampling time scale. Each record in the data alignment set contains at least the array DC voltage, array DC current, inverter AC active power, inverter AC reactive power, cumulative power generation, irradiance, ambient temperature, module backsheet temperature, relative humidity, wind speed, wind direction, rainfall, inverter operating status code, power limit flag, power limit setting, fault alarm code, string open circuit status, insulation impedance value, communication link quality index, cleaning event occupancy marker, and cleaning object identifier fields. This data alignment set is written to the time-series cache of the edge computing node for subsequent segmentation and slicing. Anomaly or missing data handling: when the duration of continuous missing data exceeds the shortest decision cycle of one cleaning operation, the corresponding time scale interval is marked as an unavailable interval and is skipped in subsequent steps.
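The alignment and completion described for S1-1 can be sketched for a single numerical variable as follows. This is a simplified illustration under stated assumptions: `align_to_grid`, `forward_hold`, and `max_gap` (the number of consecutive missing slots standing in for the shortest decision cycle) are hypothetical names, and ties in nearest-neighbor matching follow Python's built-in `round`.

```python
from typing import Dict, List, Optional, Tuple


def align_to_grid(samples: List[Tuple[float, float]], grid_start: float,
                  period: float, n_slots: int) -> List[Optional[float]]:
    """Map each raw (timestamp, value) sample to its nearest unified time scale position."""
    aligned: List[Optional[float]] = [None] * n_slots
    for ts, value in samples:
        idx = round((ts - grid_start) / period)
        if 0 <= idx < n_slots:
            aligned[idx] = value      # a later sample overwrites an earlier one in the same slot
    return aligned


def forward_hold(aligned: List[Optional[float]], max_gap: int) -> List[Dict[str, object]]:
    """Fill gaps with the last valid value, flag them, and mark over-long gaps unavailable."""
    records: List[Dict[str, object]] = []
    last_valid: Optional[float] = None
    for value in aligned:
        missing = value is None
        if not missing:
            last_valid = value
        records.append({"value": last_valid, "missing": missing, "unavailable": False})
    run: List[int] = []               # indices of the current run of missing slots
    for i, rec in enumerate(records + [{"missing": False}]):   # sentinel closes a trailing run
        if rec["missing"]:
            run.append(i)
            continue
        if len(run) > max_gap:
            for j in run:
                records[j]["unavailable"] = True
        run = []
    return records


# Hypothetical 60-second grid with six slots and three raw power samples.
aligned = align_to_grid([(2.0, 10.0), (61.0, 11.0), (299.0, 14.0)],
                        grid_start=0.0, period=60.0, n_slots=6)
records = forward_hold(aligned, max_gap=2)
```

Keeping the `missing` flag alongside the held value lets later steps distinguish measured points from filled ones, which is what allows a segment to be discarded when too much of it is unavailable.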
The purpose of S1-2 is to transform the time-aligned data into segmented feature fragments that support similarity comparison and subsequent attribution calculations, so that each segment simultaneously carries information on output changes, environmental changes, state changes, and cleaning behavior. The inputs are the data alignment set and the continuous time period division rules. The processing actions include slicing the data alignment set according to the continuous time period division rules and extracting four types of features within each segment. The continuous time period division rules are provided by the operation and maintenance configuration in the parameter file of the edge computing node and include two items: segment length and segment step size. The segment length defines the number of unified sampling time scale positions covered by each data feature fragment, and the segment step size defines the offset between the starting positions of adjacent fragments. Within each segment, the power generation output change features are obtained by calculating the change amplitudes of array DC voltage, array DC current, and AC active power and the cumulative power generation increment within the segment, together with the change direction and rate of change derived from the differences between adjacent sampling times of each output variable. The environmental change features are obtained by calculating the change amplitudes of irradiance, ambient temperature, module temperature, humidity, wind speed, wind direction, and rainfall within the segment, together with the continuity of change of each environmental variable, measured from the differences between adjacent sampling times within the segment.
The equipment status change features are obtained by counting, within the segment, the number of inverter operating status code switches, power limit flag changes, fault alarm code changes, string on/off status switches, and communication quality index fluctuations, and by measuring the duration of each equipment operating condition variable. The cleaning behavior correlation features are obtained by identifying the start and end positions of the cleaning event occupancy markers within the segment to derive the duration of the cleaning action, and by combining the power generation changes before and after cleaning to calculate the degree of power generation recovery after cleaning. The output is the set of data feature fragments corresponding to the segments. Each data feature fragment contains at least a segment start time scale, a segment end time scale, and the four types of feature fields. This set of data feature fragments is written into the fragment index table of the edge computing node for subsequent concatenation encoding and historical fragment retrieval. Anomaly or missing data handling: when the proportion of unavailable intervals within a segment exceeds half, the segment is discarded and a discard reason marker is written into the fragment index table.
The purpose of S1-3 is to organize the set of data feature fragments into a unified sequence structure that can be used for subsequent similarity screening and rollback calculation, so that subsequent steps can align, compare, and fuse by time position in a consistent manner. The input is the set of data feature fragments. The processing actions include sorting the data feature fragments by segment start time scale in ascending order and performing concatenation encoding. The concatenation encoding is implemented by mapping each data feature fragment to a sequence element.
Each sequence element is written in a fixed field order comprising power generation output change features, environmental change features, equipment status change features, and cleaning behavior association features. A corresponding time position index is also written for each sequence element to ensure consistency with the positions of the historical running segments used in subsequent steps. The output is the current state sequence, which consists of multiple sequence elements arranged chronologically and characterizes the evolution of the target local array's running state during the current prediction period. The current state sequence is written to the current sequence register of the edge computing node, where it is read during the subsequent filtering of the candidate reference segment set. Anomaly or missing data handling includes outputting an insufficient-length marker and triggering the edge computing node to self-check the sampling period and segment length when the number of valid sequence elements in the current state sequence is insufficient to cover the current prediction period, so that operation and maintenance personnel can trace the cause of the missing data. The technical advantage of this implementation process lies in establishing a synchronous alignment relationship between multi-source data through a unified sampling timescale, and in transforming time-series data, through segmented slicing, into feature segments containing information on output changes, environmental changes, state changes, and cleaning behavior. These segments are then concatenated and encoded to form a structurally consistent current state sequence. This ensures that subsequent similar segment selection, cleaning recovery feature extraction, counterfactual cleaning reference trajectory construction, and difference attribution calculation all share a consistent temporal basis, thereby reducing decision-making bias under weak network conditions and inconsistent multi-source sampling.
In practical applications, edge computing nodes can be deployed on the inverter-side gateway. After the unified sampling timescale is generated according to a fixed sampling period, power metering and weather station data are mapped to the same timescale positions. Missing points caused by short-term disconnections are filled by carrying the previous value forward and are flagged as missing. Data feature segments are then generated using the segment length and step size configured by operations and maintenance, and concatenated in chronological order to form the current state sequence. This current state sequence is used directly as input for subsequent steps, enabling local candidate reference segment retrieval and cleaning timing prediction output without relying on continuous cloud transmission.
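As a rough illustration of the S1-2 slicing rule and of the rule that discards a segment when more than half of it is unusable, the following Python sketch slices a single AC active power series. The names `Segment`, `slice_segments`, `seg_len`, and `seg_step`, and the use of `None` to represent an unusable sample, are assumptions made for illustration only. The method itself does not prescribe a data layout, and only one of the four feature types is computed here.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Segment:
    start: int                         # index of the first sampling time stamp
    end: int                           # index of the last sampling time stamp (inclusive)
    power_amplitude: Optional[float]   # max - min of usable AC active power samples
    discarded: bool                    # True when the segment is dropped
    reason: str = ""                   # discard reason marker (hypothetical wording)


def slice_segments(power: List[Optional[float]], seg_len: int, seg_step: int) -> List[Segment]:
    """Slice an aligned series into segments of seg_len samples, moving by seg_step."""
    segments = []
    for start in range(0, len(power) - seg_len + 1, seg_step):
        window = power[start:start + seg_len]
        usable = [v for v in window if v is not None]
        unusable_ratio = 1 - len(usable) / seg_len
        if unusable_ratio > 0.5:
            # More than half of the segment is unusable: discard and record why.
            segments.append(Segment(start, start + seg_len - 1, None, True,
                                    "unusable_ratio_exceeds_half"))
        else:
            amplitude = max(usable) - min(usable)
            segments.append(Segment(start, start + seg_len - 1, amplitude, False))
    return segments


series = [1.0, 2.0, None, None, None, 4.0, 5.0, 7.0]
result = slice_segments(series, seg_len=4, seg_step=2)
for seg in result:
    print(seg)
```

With a segment length of 4 and a step of 2, the series above yields three segments starting at positions 0, 2, and 4; only the middle one, with three unusable samples out of four, is discarded.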

[0018] S2. Read the set of historical running segments of the target local array, perform candidate filtering according to the similarity between the current state sequence and each historical running segment in terms of the environmental change process, output change process, and state change process, and generate a set of candidate reference segments corresponding to the target local array. This implementation process selects, from the historical operation records of the target local array, the historical running segments that are highly consistent with the current state sequence in their environmental change process, output change process, and state change process, so that subsequent cleaning recovery feature extraction and counterfactual cleaning reference trajectory construction can be based on the selected candidate reference segment set. The basic principle is to first unify the current state sequence and the historical running segments to the same time length and field structure, then calculate the process difference values from the perspectives of environment, output, and state, and finally retain the historical running segments that satisfy all three similarity conditions according to the joint ranking of the three types of process difference values. This prevents candidate reference segments from deviating from the actual cleaning reference because selection relied on power similarity alone or environmental similarity alone. The implementation process includes the following steps: The purpose of S2-1 is to transform the current state sequence and the historical running segments into process feature segments that can be compared directly at the same time positions, so that subsequent difference calculations are based on the same length, the same positions, and the same field definitions.
The input consists of the current state sequence generated by S1, the set of historical running segments of the target local array, and the time length corresponding to the current prediction period. The processing includes first using the number of unified sampling time stamps covered by the current prediction period as the slice length, truncating the current state sequence into a current slice of that length, and then generating historical slices of the same length from each historical running segment in the set using a sliding window. The sliding step size uses the same configuration value as the segment step size in S1 to ensure consistent temporal granularity between the current slice and the historical slices. Subsequently, the environmental change process, output change process, and state change process are extracted within the current slice and within each historical slice. The environmental change process is the group of time-series change values obtained by subtracting the values of irradiance, ambient temperature, component temperature, wind speed, and rainfall at adjacent sampling times. The output change process is the group of time-series change values obtained by subtracting the values of array DC voltage, array DC current, AC active power, and cumulative power generation increment at adjacent sampling times. The state change process is obtained by comparing the inverter operating status code, power limit flag, fault alarm code, string on / off status, and communication quality index between adjacent sampling times. For discrete state variables, the change value is recorded as zero when the values at adjacent sampling times are the same and as one when they differ. For the communication quality index, the change value is the difference between the values at adjacent sampling times. The output consists of the current process feature segment and the historical process feature segments.
Both the current process feature segment and each historical process feature segment contain three sets of fields: environmental change process, output change process, and state change process. These are written to the process comparison buffer for S2-2 to read. Anomaly or missing data handling includes discarding a historical slice directly when its number of valid sampling points is less than half of the slice length, and recording the discard reason in the historical process feature segment index table so that missing data does not dominate subsequent difference calculations. The purpose of S2-2 is to quantify the deviation between the current process feature segment and each historical process feature segment in terms of environmental change, output change, and state change, so that subsequent candidate selection is based on sortable numerical results. The input is the current process feature segment and each historical process feature segment. The processing actions include comparing, time position by time position, the change values of corresponding fields in the current process feature segment and each historical process feature segment, and calculating the environmental process difference value, output process difference value, and state process difference value respectively. The environmental process difference value is obtained by summing the absolute values of the differences in irradiance, ambient temperature, component temperature, wind speed, and rainfall at each time position. The output process difference value is obtained by summing the absolute values of the differences in array DC voltage, array DC current, AC active power, and cumulative power generation increment at each time position.
The state process difference value is obtained by accumulating, at each time position, the absolute values of the differences in inverter operating status code, power limit flag, fault alarm code, string on / off status, and communication quality index, and then adding the number of inconsistent state switches between the current process feature segment and the historical process feature segment at corresponding time positions. The number of inconsistent state switches counts the time positions at which the current process feature segment has a state change while the historical process feature segment does not, or at which the historical process feature segment has a state change while the current process feature segment does not. The output is the environmental process difference value, output process difference value, and state process difference value corresponding to each historical running segment. The three types of process difference values are associated with the historical running segment identifier and written into the difference value index table for S2-3 to read. Anomaly or missing data handling includes excluding a time position from the accumulation for a given field when that field is a missing value in both the current process feature segment and the historical process feature segment, and synchronously updating the effective comparison position count, so that historical running segments with too few effective comparison positions can be removed during subsequent screening.
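The S2-2 accumulation can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the names `process_difference` and `inconsistent_switch_count`, the dictionary layout, and the sample values are hypothetical. The sketch also skips a time position when the value is missing on either side, since no difference can be formed there, whereas the text above only names the case where both sides are missing.

```python
from typing import Dict, List, Optional, Tuple

# Field name -> change values at each time position (None marks a missing value).
Series = Dict[str, List[Optional[float]]]


def process_difference(current: Series, historical: Series) -> Tuple[float, int]:
    """Return (sum of |current - historical|, number of effective comparison positions)."""
    total, valid = 0.0, 0
    for field, cur_values in current.items():
        for cur, hist in zip(cur_values, historical[field]):
            if cur is None or hist is None:
                continue  # assumption: skip positions where no difference can be formed
            total += abs(cur - hist)
            valid += 1
    return total, valid


def inconsistent_switch_count(current: Series, historical: Series) -> int:
    """Count positions where exactly one of the two segments shows a state change."""
    count = 0
    for field, cur_values in current.items():
        for cur, hist in zip(cur_values, historical[field]):
            if cur is None or hist is None:
                continue
            if (cur != 0) != (hist != 0):
                count += 1
    return count


cur_env = {"irradiance": [10.0, -5.0, None], "ambient_temp": [0.5, 0.0, 0.2]}
hist_env = {"irradiance": [8.0, -5.0, 3.0], "ambient_temp": [0.0, 0.5, None]}
env_diff, env_valid = process_difference(cur_env, hist_env)

cur_state = {"power_limit_flag": [0, 1, 0]}
hist_state = {"power_limit_flag": [0, 0, 1]}
state_abs, _ = process_difference(cur_state, hist_state)
state_diff = state_abs + inconsistent_switch_count(cur_state, hist_state)

print(env_diff, env_valid, state_diff)
```

The output process difference value would be computed with the same `process_difference` call applied to the output change fields.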
The purpose of S2-3 is to retain historical operation segments that simultaneously meet the three similar conditions of environment, output, and state from the three types of process difference values, thereby forming a set of candidate reference segments that can be used for subsequent clean recovery feature extraction; the inputs are the environmental process difference value, output process difference value, state process difference value, and the effective comparison position count in the difference value index table for each historical operation segment; The processing steps include first sorting the environmental process difference value, output process difference value, and state process difference value in ascending order to form environmental sorting results, output sorting results, and state sorting results. Then, the retention boundary of the sorting stage is determined based on the upper limit configuration value of the candidate number in the edge computing node. The upper limit configuration value of the candidate number is written into the parameter file by the operation and maintenance side during deployment to constrain the computational scale of the subsequent counterfactual cleaning reference trajectory construction. In this embodiment, the upper limit configuration value of the candidate number is first used as the initial retention number of each sorting result. The historical running segment identifiers that are within the aforementioned retention number in the environmental sorting result, output sorting result, and state sorting result are respectively extracted, and the intersection of the three is calculated. When the number of historical running segments in the intersection is greater than zero, the historical running segments in the intersection are determined as a candidate reference segment set. 
When the number of historical running segments in the intersection is zero, the number of retained segments in each sorting result is increased successively by a fixed expansion step size, and the truncation and intersection operations are repeated until the number of historical running segments in the intersection is greater than zero or the number of retained segments has been expanded to the total number of historical running segments. The fixed expansion step size is also taken from the edge computing node parameter file. When the intersection contains multiple historical running segments, the sum of each segment's sorting positions in the environmental sorting result, output sorting result, and state sorting result is further calculated, the segments are rearranged by this sum from smallest to largest, and historical running segments with smaller sums are written into the candidate reference segment set first. The output is the candidate reference segment set corresponding to the target local array, which is written into the candidate segment register area for S3 to read. Anomaly or missing data handling includes triggering a fallback process when there is still no intersection result after the number of retained segments has been expanded to the total number of historical running segments. In that case, the historical running segment with the smallest sum of sorting positions across the environmental sorting result, output sorting result, and state sorting result is written directly into the candidate reference segment set, and a low consistency mark is written at the same time so that the contribution of this candidate reference segment is reduced when S3 calculates the matching consistency coefficient.
The technical advantage of this implementation process is that, by slicing the current state sequence and the historical running segments to the same length, extracting the same fields, and calculating differences at the same positions, it can stably filter out historical running segments with similar environmental disturbances, similar output changes, and similar equipment state evolution. This avoids mistakenly selecting, as candidate reference segments, historical running segments that are similar in only one dimension but deviate significantly in other key dimensions, thereby improving the pertinence and interpretability of subsequent cleaning recovery feature extraction and counterfactual cleaning reference trajectory construction. In practical applications, the edge computing node can first generate a large number of historical slices from six months of locally stored historical operation records, using the slice length corresponding to the current prediction period. It then calculates the environmental process difference value, output process difference value, and state process difference value between each historical slice and the current process feature segment. Subsequently, it takes the intersection of the three sorting results according to the upper limit configuration value of the candidate number set in the parameter file. If the initial intersection is empty, the retention range is expanded step by step. The resulting set of candidate reference segments avoids excessive relaxation of the screening conditions while still ensuring that usable candidate reference segments are available for subsequent steps.
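The rank-and-intersect screening of S2-3 can be sketched as below. This is a simplified illustration: `select_candidates`, `keep`, and `step` are hypothetical names standing in for the candidate upper limit and fixed expansion step read from the parameter file, each argument maps a historical running segment identifier to one process difference value, and the low consistency fallback and the effective comparison position check are not modeled.

```python
from typing import Dict, List


def select_candidates(env: Dict[str, float], out: Dict[str, float],
                      state: Dict[str, float], keep: int, step: int) -> List[str]:
    """Intersect the top-k of three ascending rankings, widening k until non-empty."""
    assert step >= 1, "expansion step must be positive"
    ids = list(env)
    total = len(ids)
    # One ascending ranking per dimension (smaller difference = more similar).
    rankings = [sorted(ids, key=d.get) for d in (env, out, state)]
    # Sum of sorting positions, used to order segments inside the intersection.
    position_sum = {i: sum(r.index(i) for r in rankings) for i in ids}

    k = max(1, min(keep, total))
    while True:
        common = set(rankings[0][:k]) & set(rankings[1][:k]) & set(rankings[2][:k])
        if common or k >= total:
            return sorted(common, key=position_sum.get)
        k = min(k + step, total)  # widen the retention range and retry


env = {"a": 1.0, "b": 2.0, "c": 3.0, "d": 4.0}
out = {"a": 4.0, "b": 1.0, "c": 2.0, "d": 3.0}
state = {"a": 3.0, "b": 2.0, "c": 1.0, "d": 4.0}

narrow = select_candidates(env, out, state, keep=1, step=1)
wide = select_candidates(env, out, state, keep=3, step=1)
print(narrow, wide)
```

In the first call the initial intersection is empty, so the retention number grows from 1 to 2 before segment `b` is found; in the second call two segments survive and are ordered by their sum of sorting positions.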

[0019] S3. Perform a post-cleaning power generation back-calculation on the candidate reference segment set, extract the cleaning recovery features corresponding to the current state sequence in each candidate reference segment, and perform reference fusion based on the matching consistency between each candidate reference segment and the current state sequence to generate the counterfactual cleaning reference trajectory of the target local array in the current prediction period. This implementation process reconstructs, from the set of candidate reference segments and the power generation recovery response observed after real historical cleaning events, the dust-free reference output that the target local array would show in the current prediction period if a cleaning action were performed. Multiple candidate recovery output sequences are then fused into a continuous counterfactual cleaning reference trajectory using matching consistency coefficients and deviation corrections. The basic principle is as follows. First, the output difference before and after cleaning is extracted from each candidate reference segment to form a cleaning recovery feature sequence. Then, the consistency between each candidate reference segment and the current state sequence is quantified using the environmental process difference value, output process difference value, and state process difference value to generate a matching consistency coefficient. Subsequently, the cleaning recovery feature sequence is mapped to the time positions of the current prediction period and superimposed on the current power generation output sequence to form a candidate recovery output sequence. Next, deviation corrections are applied to the candidate recovery output sequence based on the current environmental change process and state change process.
Finally, the counterfactual cleaning reference trajectory is obtained by fusing the sequences time-by-time according to the matching consistency coefficients. This implementation process includes the following steps: The purpose of S3-1 is to extract, from historical cleaning events, a cleaning recovery feature sequence that characterizes how much the cleaning action restores power generation output, so that the subsequent back-calculation can directly reference the real historical recovery response without relying on subjective estimation. The inputs are the set of candidate reference segments, the historical cleaning record fields corresponding to each candidate reference segment, and the power generation output data sequence contained in each candidate reference segment. The processing actions include, for each candidate reference segment, first reading the cleaning start time and cleaning end time and mapping them to unified sampling time stamp positions of the candidate reference segment. The time stamp corresponding to the cleaning end time is then used as the starting point of the post-cleaning output sequence, and a post-cleaning output sequence with the same length as the current prediction period is extracted forward from this starting point. The post-cleaning output sequence contains at least the AC active power sequence and the cumulative power generation increment sequence, with the same field definitions as the current power generation output sequence. Next, the time stamp corresponding to the cleaning start time is used as the ending point of the pre-cleaning output sequence, and a pre-cleaning output sequence with the same length as the current prediction period is extracted backward from this ending point, so that the pre-cleaning output sequence and the post-cleaning output sequence are consistent in sequence length and time position numbering.
Finally, the pre-cleaning output sequence is subtracted from the post-cleaning output sequence at each time position to obtain the cleaning recovery increment at each time position. The cleaning recovery increments are then arranged in order to generate a cleaning recovery feature sequence. The output is a cleaning recovery feature sequence that corresponds one-to-one with each candidate reference segment. The cleaning recovery feature sequence is written into the cleaning recovery feature register area for S3-2 and S3-3 to read. Anomaly or missing data handling includes skipping a candidate reference segment and writing a cleaning record missing mark when its cleaning start time or cleaning end time is missing. When the pre-cleaning output sequence or the post-cleaning output sequence cannot be extracted to the full length, the available portion is used to generate the cleaning recovery feature sequence and an insufficient length mark is appended so that its weight is reduced during subsequent fusion. The purpose of S3-2 is to quantify the degree of matching between each candidate reference segment and the current state sequence as a matching consistency coefficient that can be used for fusion, so that candidate reference segments closer to the current operating conditions contribute more to the counterfactual cleaning reference trajectory. The inputs are the environmental process difference value, output process difference value, and state process difference value corresponding to each candidate reference segment, as well as the candidate reference segment identifier set.
The processing actions include adding the environmental process difference value, output process difference value, and state process difference value of each candidate reference segment to generate a total segment difference value. When a total segment difference value is zero, it is replaced with the minimum non-zero total segment difference value in the same candidate reference segment set so that the reciprocal operation does not produce infinity. The reciprocal of the total segment difference value of each candidate reference segment is then calculated to obtain the reciprocal difference value. The reciprocal difference values of all candidate reference segments are summed to obtain the reciprocal difference sum value. Finally, the reciprocal difference value of each candidate reference segment is divided by the reciprocal difference sum value to obtain the matching consistency coefficient. The output is the matching consistency coefficient corresponding to each candidate reference segment, which is written to the matching coefficient index table for S3-3 to read. Anomaly or missing data handling includes excluding a candidate reference segment from the total segment difference value calculation, and marking it as a non-fusable segment, when any of its process difference values is missing. When the reciprocal difference sum value is zero, fallback processing is triggered and the matching consistency coefficients of all fusable candidate reference segments are set to the same value. The purpose of S3-3 is to convert the cleaning recovery feature sequence into a dust-free candidate recovery output sequence that can be used directly in the current prediction period.
Based on the current environmental and state change processes, the candidate recovery output sequences are corrected for deviations and then fused according to the matching consistency coefficients, ultimately forming a continuous counterfactual cleaning reference trajectory. The inputs are the current power generation output sequence in the current state sequence, the cleaning recovery feature sequence corresponding to each candidate reference segment, the matching consistency coefficients, and the environmental and state change processes in the current state sequence. The processing steps include first mapping each time position of the cleaning recovery feature sequence to the corresponding sampling time of the current power generation output sequence according to the unified sampling time stamp numbering of the current prediction period, and then adding the cleaning recovery increment, time by time, to the output value of the current power generation output sequence at the corresponding sampling time to obtain the candidate recovery output sequence corresponding to the candidate reference segment. Subsequently, for each candidate recovery output sequence, the environmental deviation correction and the state deviation correction are calculated at each sampling time. For the environmental deviation correction, an environmental deviation difference group is first obtained by subtracting the corresponding change values of the candidate reference segment at the same time position from the irradiance change, ambient temperature change, component temperature change, wind speed change, and rainfall change of the current state sequence at that sampling time. Each difference in this group is then multiplied by the response ratio of output change to the corresponding environmental variable within the candidate reference segment, and the products are summed.
The response ratio is obtained by calculating the ratio of the output change value to the environmental variable change value within the window corresponding to the time position of the candidate reference segment and averaging the ratios within the window. For the state deviation correction, a state deviation difference group is obtained by comparing, item by item, the inverter operating status code, power limit flag, fault alarm code, string on / off status, and communication quality index of the current state sequence at the sampling time with the corresponding state fields of the candidate reference segment at the same time position. These state deviation differences are then mapped to the corresponding output offsets and summed. The output offset for a given state change is obtained by retrieving, from the historical running segments, the average output difference within a fixed-length window before and after a similar state change. The environmental deviation correction and the state deviation correction are then subtracted from the output value of the candidate recovery output sequence at the sampling time to generate the corrected candidate recovery output sequence corresponding to the candidate reference segment. Finally, position-weighted fusion is performed on all corrected candidate recovery output sequences at each sampling time. In position-weighted fusion, the output values of the corrected candidate recovery output sequences at the same sampling time are multiplied by their corresponding matching consistency coefficients and summed. The summation result is used as the counterfactual cleaning reference output value at that sampling time, and the values are arranged chronologically to generate the counterfactual cleaning reference trajectory. The output is the counterfactual cleaning reference trajectory of the target local array, unfolded continuously in chronological order over the current prediction period.
The counterfactual cleaning reference trajectory is written into the counterfactual trajectory register area for S4 to read. Anomaly or missing data handling includes marking a candidate reference segment as a low confidence segment, and reducing its matching consistency coefficient proportionally before it participates in position-weighted fusion, when its response ratio or output offset cannot be calculated. When a non-physical value appears among the counterfactual cleaning reference output values, it is replaced by linear interpolation between adjacent sampling times and a non-physical value correction mark is written for traceability. The technical effect of this implementation process is threefold: by extracting cleaning recovery feature sequences, the effect of cleaning is derived from real historical data; by using matching consistency coefficients, the contribution of each candidate reference segment is tied to its degree of consistency with current operating conditions; and by applying the environmental deviation correction and state deviation correction, the candidate recovery output sequences are adjusted to the current environmental and equipment state conditions. The result is a counterfactual cleaning reference trajectory that can be used directly for difference attribution calculation, reducing misjudgments of cleaning timing caused by distortion of the counterfactual reference under weather disturbances or equipment state differences. In practical applications, edge computing nodes can locally extract the output differences before and after cleaning from multiple cleaning events to form cleaning recovery feature sequences, and combine them with the current power generation output sequence of the current prediction period to generate multiple candidate recovery output sequences.
Then, based on the current irradiance change and equipment power limitation status, environmental deviation correction and state deviation correction are applied to the candidate recovery output sequences. Finally, the counterfactual cleaning reference trajectory is obtained by fusion according to the matching consistency coefficients. This counterfactual cleaning reference trajectory is then input into the subsequent time-by-time difference solution process to output an executable cleaning trigger period.
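The arithmetic of S3-2 and the final fusion of S3-3 can be sketched as follows. This is an illustrative sketch under assumptions: the function names and sample values are hypothetical, the environmental and state deviation corrections are passed in as precomputed per-time-position values rather than derived from response ratios and output offsets, and the case where every total segment difference value is zero is handled with equal coefficients, which the text does not state explicitly.

```python
from typing import List


def matching_coefficients(total_diffs: List[float]) -> List[float]:
    """Normalize reciprocals of the total segment difference values so they sum to 1."""
    nonzero = [d for d in total_diffs if d > 0]
    if not nonzero:
        # Assumption: all candidates match exactly, so weight them equally.
        return [1.0 / len(total_diffs)] * len(total_diffs)
    floor = min(nonzero)  # replaces zero totals to avoid division by zero
    reciprocals = [1.0 / (d if d > 0 else floor) for d in total_diffs]
    reciprocal_sum = sum(reciprocals)
    return [r / reciprocal_sum for r in reciprocals]


def corrected_recovery(current: List[float], increment: List[float],
                       env_corr: List[float], state_corr: List[float]) -> List[float]:
    """Current output plus cleaning recovery increment, minus both deviation corrections."""
    return [c + i - e - s for c, i, e, s in zip(current, increment, env_corr, state_corr)]


def fuse(corrected: List[List[float]], coeffs: List[float]) -> List[float]:
    """Weighted sum of the corrected candidate sequences at each sampling time."""
    length = len(corrected[0])
    return [sum(w * seq[t] for w, seq in zip(coeffs, corrected)) for t in range(length)]


coeffs = matching_coefficients([2.0, 4.0, 0.0])

current_output = [80.0, 90.0]
candidates = [
    corrected_recovery(current_output, [20.0, 20.0], [0.0, 0.0], [0.0, 0.0]),
    corrected_recovery(current_output, [15.0, 30.0], [5.0, 0.0], [0.0, 0.0]),
    corrected_recovery(current_output, [20.0, 15.0], [0.0, 0.0], [0.0, 5.0]),
]
trajectory = fuse(candidates, coeffs)
print(coeffs, trajectory)
```

Because the coefficients sum to one, the fused trajectory stays within the range spanned by the corrected candidate sequences at each sampling time.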

[0020] S4. Solve the time-by-time difference between the current state sequence and the counterfactual cleaning reference trajectory, separate the effective loss caused by dust accumulation from the disturbance loss caused by non-dust-accumulation factors, and generate the cleaning benefit trajectory of the target local array based on the continuous change of the effective loss over the subsequent prediction interval. Building on the counterfactual cleaning reference trajectory already constructed, this implementation process identifies how much of the current power output difference can be attributed to dust accumulation, and extends this attributable loss into subsequent prediction intervals as a cleaning benefit trajectory that can be used directly for operation and maintenance decisions. The basic principle is as follows. First, the output difference between the current power generation output sequence and the counterfactual cleaning reference trajectory is calculated at each sampling time. Then, using the historical response relationship of environmental variables and equipment state variables to the output difference, the portion that can be explained by non-dust-accumulation factors is removed, yielding the residual difference. Subsequently, based on the continuous change pattern of the residual difference over time, smooth continuation segments and abrupt disturbance segments are identified, and the smooth continuation segments are further verified against the power recovery sequences following historical cleaning events, so that the portion consistent with the actual cleaning recovery pattern is determined as the effective loss. Finally, the effective loss is combined with the rainfall and wind speed changes in the subsequent prediction interval to extrapolate the cleaning benefit trajectory of the target local array over the current prediction period and subsequent prediction intervals.
This implementation process includes the following steps: The purpose of S4-1 is to align the current power generation output sequence and the counterfactual cleaning reference trajectory at the same sampling positions for difference calculation, and at the same time to organize the environmental change fields and state change fields required for subsequent disturbance interpretation, so that each sampling time forms a complete attribution input. The inputs are the current power generation output sequence in the current state sequence, the counterfactual cleaning reference trajectory generated by S3, and the environmental change values and state change values in the current state sequence corresponding to each sampling time. The processing actions include first aligning the current power generation output sequence and the counterfactual cleaning reference trajectory time by time according to the unified sampling timescale, then subtracting the current power generation output value from the counterfactual cleaning reference output value at the same sampling time to obtain the output difference value at that sampling time. At the same time, the irradiance change, ambient temperature change, component temperature change, wind speed change, rainfall change, inverter operating status code change, power limit flag change, fault alarm code change, and string on / off status change corresponding to that sampling time are read. The output difference value and these fields at the same sampling time are then encapsulated into a time-by-time difference data group. The output is a set of time-by-time difference data groups arranged in chronological order, which is written into the difference attribution buffer for S4-2 to read.
Anomaly or missing data handling includes: when the current power generation output value or the counterfactual cleaning reference output value is missing at a sampling time, that sampling time is marked as unattributable and excluded from subsequent difference attribution calculations, while its time position is retained in the buffer to avoid later timing misalignment.
The purpose of S4-2 is to use historical environmental disturbance response relationships and state transition offset relationships to remove in advance the portion of the output difference that can be explained by non-dust-accumulation factors, leaving a cleaner residual difference for the subsequent identification of effective loss. The inputs are the time-by-time difference data sets, the candidate reference segment set retained from S2, and the historical records, within the candidate reference segments, of how variables of the same type affected output changes. The processing includes calculating an environmental disturbance interpretation value and a state disturbance interpretation value for each sampling time. The environmental disturbance interpretation value is calculated variable by variable: the irradiance change value, ambient temperature change value, component temperature change value, wind speed change value, and rainfall change value at the position corresponding to the current sampling time are extracted from each candidate reference segment; then, taking that position as the center in each candidate reference segment, the sequence of ratios between the change in the same type of environmental variable and the output difference is calculated within a fixed-length window consisting of the same number of sampling points before and after the center.
The arithmetic mean of this ratio sequence is then taken as the response ratio of the corresponding environmental variable to the output difference within the candidate reference segment. The length of the fixed-length window is set by the window length configuration value in the edge computing node parameter file, which is written in advance by the operation and maintenance side according to the site sampling cycle. Next, the change in each environmental variable at the current sampling time is multiplied by the response ratio from each candidate reference segment to obtain the explanatory component of that environmental variable under that candidate reference segment. The explanatory components of the same variable under the different candidate reference segments are then weighted by the matching consistency coefficients generated by S3 and summed, forming the final explanatory component of each environmental variable at the current sampling time. Finally, the final explanatory components of the five environmental variables are summed to obtain the environmental disturbance interpretation value. The state disturbance interpretation value is calculated item by item for each state field. First, historical state change events of the same state change type as at the current sampling time are retrieved from the historical operating segments. Then, the output difference sequence within a fixed-length window before and after each such state change is taken, and the average of the output differences within the window gives the output offset corresponding to this type of state change. This fixed-length window uses the same window length configuration value as the response ratio calculation.
Then, at the current sampling time, the inverter operating status code change value, power limit flag change value, fault alarm code change value, and string on/off status change value are each mapped to the corresponding output offset, and the output offsets are summed to obtain the state disturbance interpretation value. Finally, the environmental disturbance interpretation value and then the state disturbance interpretation value are subtracted from the output difference at the current sampling time to generate the residual difference at that sampling time. The outputs are the environmental disturbance interpretation value, state disturbance interpretation value, and residual difference for each sampling time, which are written to the residual difference register for S4-3 to read. Anomaly or missing data handling includes: when an environmental variable cannot form a valid response ratio within the corresponding window, the weighted average of the response ratios of the same variable in the other candidate reference segments is used as a substitute; when no events of a given state change type can be retrieved from the historical operating segments, the output offset for that state change type is recorded as zero and a state offset missing flag is written for subsequent tracing.
The purpose of S4-3 is to distinguish, based on how the residual difference changes over time, the continuously changing part that better matches the characteristics of dust accumulation from the abrupt part that better matches the characteristics of instantaneous disturbance, thereby forming the effective loss candidate values and disturbance loss candidate values to be reviewed. The input is the residual difference and its time position index at each sampling time.
The processing actions include first arranging all residual differences consecutively in order of sampling time, then, for each pair of adjacent sampling times, subtracting the residual difference of the earlier sampling time from that of the later one and taking the absolute value of the result as the adjacent residual change. Then, scanning forward from the first valid sampling time, when multiple consecutive adjacent sampling times all satisfy the condition that the adjacent residual change is less than the absolute value of the residual difference at the earlier sampling time, the continuous sampling interval is determined to be a smooth continuation segment, indicating that the residual difference changes gently and continuously within that interval. When a sampling moment that does not meet this condition occurs, the interval containing that sampling moment is determined to be an abrupt disturbance segment, indicating a sudden change or short-term reversal of the residual difference within that interval. The residual differences at the sampling moments in each smooth continuation segment are then written into the effective loss candidate value sequence in their original time order, and the residual differences at the sampling moments in each abrupt disturbance segment are written into the disturbance loss candidate value sequence in their original time order. The outputs are the smooth continuation segments, the abrupt disturbance segments, the effective loss candidate value sequence, and the disturbance loss candidate value sequence. The start and end times of each smooth continuation segment are written into the smooth segment index table for S4-4 to read.
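As an illustration only, the segmentation rule of S4-3 can be sketched in Python as follows. The function names `_runs` and `split_segments` are hypothetical and do not come from the method itself. Two points are assumptions: the first sample, which has no predecessor, inherits the label of the second sample, and a smooth run consisting of a single sample is relabelled as abrupt, in line with the single-sample handling of this step.

```python
def _runs(flags):
    """Group consecutive equal flags into (flag, start, end) runs, indices inclusive."""
    runs, start = [], 0
    for t in range(1, len(flags) + 1):
        if t == len(flags) or flags[t] != flags[start]:
            runs.append((flags[start], start, t - 1))
            start = t
    return runs


def split_segments(residuals):
    """Label residual differences as 'smooth' or 'abrupt' segments.

    A sample t is smooth when |r[t] - r[t-1]| < |r[t-1]|.
    Returns a list of (label, start_index, end_index), indices inclusive.
    """
    n = len(residuals)
    if n == 0:
        return []
    smooth = [False] * n
    for t in range(1, n):
        step = abs(residuals[t] - residuals[t - 1])
        smooth[t] = step < abs(residuals[t - 1])
    # Assumption: the first sample copies the label of the second one.
    smooth[0] = smooth[1] if n > 1 else False
    # A smooth run of one sample is folded into the neighbouring abrupt segment.
    for flag, start, end in _runs(smooth):
        if flag and start == end:
            smooth[start] = False
    return [("smooth" if f else "abrupt", s, e) for f, s, e in _runs(smooth)]


segments = split_segments([1.0, 1.1, 1.2, 3.0, -1.0, 1.3, 1.4])
print(segments)
```

In this example the first three samples drift gently and form a smooth continuation segment, while the jump to 3.0, the reversal to -1.0, and the isolated smooth sample at the end fall into one abrupt disturbance segment.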
Anomaly or missing data handling includes: when a smooth continuation segment contains only one sampling moment, it is not output as an independent smooth continuation segment but is merged into the adjacent abrupt disturbance segment, so that a single accidental fluctuation does not produce a spurious effective loss candidate value.
The purpose of S4-4 is to perform a secondary verification of the effective loss candidate values using the power generation recovery patterns observed after actual historical cleaning events, so that slow fluctuations not caused by dust accumulation are not mistaken for effective loss on the basis of continuity alone. The inputs are the smooth continuation segments, the effective loss candidate value sequence, and the power generation recovery sequence following each cleaning event in the historical cleaning records. The processing actions include reading the time length of each smooth continuation segment and selecting from the historical cleaning records the power generation recovery sequences of the same time length as candidate recovery sequences. If no power generation recovery sequence of exactly the same length exists in the historical cleaning records, an equal-length subsequence is extracted, starting at the same position, from each power generation recovery sequence longer than the smooth continuation segment and used as a candidate recovery sequence. The effective loss candidate values in the smooth continuation segment are then compared with each candidate recovery sequence position by position: the absolute value of the difference at each corresponding time is calculated, and the absolute values at all corresponding times are accumulated to obtain the accumulated difference between the smooth continuation segment and each candidate recovery sequence.
Finally, the candidate recovery sequence with the smallest accumulated difference is selected as the corresponding cleaning recovery sequence of the smooth continuation segment. The direction of change of the effective loss candidate values in the smooth continuation segment is then compared with the direction of change of the corresponding cleaning recovery sequence. At each time position, when the two directions of change are consistent, the effective loss candidate value is confirmed as effective loss; when they are inconsistent, the effective loss candidate value is reassigned to disturbance loss. The outputs are the effective loss and disturbance loss finally determined at each sampling time, and the effective loss is written into the benefit extrapolation register for S4-5 to read. Anomaly or missing data handling includes: when no power generation recovery sequence is available in the historical cleaning records for comparison with a smooth continuation segment, the effective loss candidate values of that segment are kept unchanged, an unverified mark is written, and their credibility weight is reduced in the subsequent benefit extrapolation.
The purpose of S4-5 is to extrapolate the effective loss in the current prediction period to the subsequent prediction intervals according to how dust accumulation may evolve in future periods, either continuing to build up or being cleared naturally, so as to obtain a cleaning benefit trajectory that can be used directly in the cleaning timing matching calculation. The inputs are the effective loss in the current prediction period, the wind speed and rainfall change sequences in the subsequent prediction intervals, and the future extrapolation step size configuration of the edge computing node.
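Before the S4-5 processing is described, the secondary verification of S4-4 set out above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not a definitive implementation: the names `sign`, `directions`, and `verify_segment` are hypothetical, the direction of change at the first position is assumed to copy that of the second position, and a value assigned to one category is represented as 0.0 in the other.

```python
def sign(x):
    """Return 1, 0 or -1 according to the sign of x."""
    return (x > 0) - (x < 0)


def directions(seq):
    """Direction of change at each position; position 0 copies position 1 (assumption)."""
    d = [sign(b - a) for a, b in zip(seq, seq[1:])]
    return [d[0]] + d if d else [0]


def verify_segment(candidates, recoveries):
    """Check one smooth continuation segment against historical recovery sequences.

    Returns (effective_loss, disturbance_loss, verified).
    """
    n = len(candidates)
    # Prefer recovery sequences of exactly the same length; otherwise
    # truncate longer ones from the same starting position.
    exact = [r for r in recoveries if len(r) == n]
    pool = exact or [r[:n] for r in recoveries if len(r) > n]
    if not pool:
        # No comparable sequence: keep the candidates and flag them as unverified.
        return list(candidates), [0.0] * n, False
    # Smallest accumulated absolute difference selects the matching sequence.
    best = min(pool, key=lambda r: sum(abs(c - v) for c, v in zip(candidates, r)))
    effective, disturbance = [], []
    for c, dc, dr in zip(candidates, directions(candidates), directions(best)):
        consistent = dc == dr
        effective.append(c if consistent else 0.0)
        disturbance.append(0.0 if consistent else c)
    return effective, disturbance, True


history = [[0.9, 1.1, 1.3, 1.5], [5.0, 5.0, 5.0, 5.0], [1.0, 1.0]]
print(verify_segment([1.0, 1.2, 1.1, 1.4], history))
```

Here the first historical sequence has the smallest accumulated difference, and the third candidate value, which moves against the direction of that sequence, is reassigned to disturbance loss.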
The processing involves first reading the effective loss at the last sampling moment of the current prediction period as the starting point for extrapolation, and then traversing the subsequent prediction intervals moment by moment according to the future extrapolation step size configuration, calculating the scour reduction and the dust increment at each future sampling time. The scour reduction is obtained by multiplying the change in rainfall at the future sampling time by the natural scour response ratio for rainfall events in the historical cleaning records. The natural scour response ratio is the average, over historical rainfall events, of the ratio of the decrease in output difference across the event to the change in rainfall. The dust increment is obtained by multiplying the change in wind speed at the future sampling time by the dust increment response ratio for wind speed changes under historical no-cleaning conditions. The dust increment response ratio is the average, over historical operating segments, of the ratio of the increase in residual difference caused by a wind speed change to that change in wind speed. The scour reduction is then subtracted from the effective loss of the previous sampling time and the dust increment is added, giving the continued loss at the current future sampling time. If the result is less than zero, the continued loss is set to zero so that the cleaning benefit trajectory never takes a negative value. Finally, the effective loss in the current prediction period and the continued loss generated moment by moment in the subsequent prediction intervals are concatenated in chronological order to form the cleaning benefit trajectory of the target local array.
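The recursion described above can be sketched in Python as follows. The function and parameter names are hypothetical, and the two response ratios are assumed to have been computed beforehand from historical records; the sketch omits the filling of missing forecast values.

```python
def extrapolate_loss(last_loss, rain_changes, wind_changes, scour_ratio, dust_ratio):
    """Continue the effective loss into the future, one sampling time at a time.

    loss[t] = max(loss[t-1] - rain_change[t] * scour_ratio
                  + wind_change[t] * dust_ratio, 0)
    """
    continued = []
    prev = last_loss
    for rain, wind in zip(rain_changes, wind_changes):
        scour_reduction = rain * scour_ratio   # loss removed by rainfall
        dust_increment = wind * dust_ratio     # loss added by wind-driven dust
        prev = max(prev - scour_reduction + dust_increment, 0.0)
        continued.append(prev)
    return continued


def cleaning_benefit_trajectory(effective_losses, rain_changes, wind_changes,
                                scour_ratio, dust_ratio):
    """Concatenate the current-period effective loss with the continued loss."""
    future = extrapolate_loss(effective_losses[-1], rain_changes, wind_changes,
                              scour_ratio, dust_ratio)
    return list(effective_losses) + future


trajectory = cleaning_benefit_trajectory(
    [1.5, 2.0],            # effective loss in the current prediction period
    [0.0, 4.0, 10.0],      # forecast rainfall changes
    [1.0, 0.0, 0.0],       # forecast wind speed changes
    scour_ratio=0.5,
    dust_ratio=0.25,
)
print(trajectory)  # [1.5, 2.0, 2.25, 0.25, 0.0]
```

The last value shows the clamp at work: heavy rainfall would drive the loss to -4.75, so it is set to zero.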
The output is the cleaning benefit trajectory covering the current prediction period and the subsequent prediction intervals, which is written to the operation and maintenance decision storage area for S5 to read. Anomaly or missing data handling includes: when the wind speed change value or rainfall change value is missing at a future sampling time in the subsequent prediction interval, the value of the same variable at the previous future sampling time is used to fill the gap and a filling mark is written; when the length of consecutive missing values exceeds the maximum consecutive missing length specified by the future extrapolation step size configuration, further extrapolation is stopped and the part of the cleaning benefit trajectory up to the stop time is output.
The technical advantage of this implementation process is that it separates the effective loss actually caused by dust accumulation from the total difference by first calculating the output difference, then removing the disturbance components that can be explained by environmental and equipment conditions, and then filtering by temporal continuity and verifying against historical cleaning recovery sequences. It further combines natural scouring and dust accumulation processes to extrapolate a cleaning benefit trajectory, so that subsequent cleaning timing decisions rest on benefits that are explainable, traceable, and consistent with real operation and maintenance scenarios. In practical applications, the edge computing node can first calculate the time-by-time output difference from the current power generation output sequence and the counterfactual cleaning reference trajectory, and use the response ratios obtained from the candidate reference segments to explain the output fluctuations caused by irradiance and temperature changes.
At the same time, the output offsets corresponding to power limiting, fault alarms, and string on/off switching are obtained from historical state switching events, and the residual difference is then obtained. For intervals in which the residual difference increases continuously and gradually, the edge computing node compares the interval time by time with the power generation recovery sequences following historical cleaning events to determine the effective loss. Finally, the effective loss is combined with the rainfall and wind speed changes given by the weather forecast to extrapolate and output the cleaning benefit trajectory, so that the subsequent step can select the cleaning trigger period within the executable periods.
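The disturbance removal of S4-2 summarized above can be sketched in Python as follows. All names are hypothetical. One point is an assumption rather than something the text fixes: the response ratio is taken as output difference divided by variable change, so that multiplying it by a variable change yields a quantity in output units. Samples with a zero variable change are skipped when forming the ratio.

```python
from statistics import mean


def response_ratio(var_changes, out_diffs):
    """Mean of (output difference / variable change) inside one window.

    Returns None when no sample in the window gives a valid ratio.
    """
    ratios = [d / v for v, d in zip(var_changes, out_diffs) if v != 0]
    return mean(ratios) if ratios else None


def env_explanation(current_changes, segment_ratios, weights):
    """Environmental disturbance interpretation value at one sampling time.

    current_changes: {variable: change at the current sampling time}
    segment_ratios:  one {variable: ratio or None} dict per candidate segment
    weights:         matching consistency coefficients from S3
    """
    total = 0.0
    for var, change in current_changes.items():
        valid = [(w, r[var]) for w, r in zip(weights, segment_ratios)
                 if r.get(var) is not None]
        if not valid:
            continue  # no segment can explain this variable
        # A missing ratio is replaced by the weighted average of the others.
        fallback = sum(w * r for w, r in valid) / sum(w for w, _ in valid)
        for w, r in zip(weights, segment_ratios):
            ratio = r.get(var)
            if ratio is None:
                ratio = fallback
            total += w * change * ratio
    return total


def state_explanation(state_changes, offsets):
    """Sum of output offsets; state change types without history count as zero."""
    return sum(offsets.get(s, 0.0) for s in state_changes)


def residual(out_diff, env_value, state_value):
    """Residual difference left after removing both explained parts."""
    return out_diff - env_value - state_value


env = env_explanation(
    {"irradiance": -50.0, "module_temp": 2.0},
    [{"irradiance": -0.02, "module_temp": 0.1},
     {"irradiance": -0.04, "module_temp": None}],
    [0.5, 0.5],
)
state = state_explanation(["power_limit_on"], {"power_limit_on": 0.8})
print(residual(3.0, env, state))
```

In this made-up example, roughly 1.7 of the 3.0 output difference is explained by irradiance and temperature and 0.8 by power limiting, leaving a residual of about 0.5 for the later dust accumulation checks.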

[0021] S5. Perform a timing matching calculation between the cleaning benefit trajectory and the preset operation and maintenance execution conditions to determine the target cleaning trigger period of the target local array, and output the cleaning timing prediction result corresponding to the target cleaning trigger period. This implementation process converts the cleaning benefit trajectory of the target local array into an executable cleaning trigger period, enabling the edge computing node to output an actual operation opportunity that satisfies on-site resources, operational constraints, and weather conditions. Its basic principle is as follows: first, read the operation resource data and operational constraint data to determine the executable time periods in which cleaning can be carried out; then, within the executable time periods, enumerate the candidate continuous time periods that meet the expected operation duration requirement, and use the cleaning benefit trajectory to calculate the additional recoverable benefit and the operation cost of each candidate continuous time period; finally, take the candidate with the largest additional recoverable benefit per unit operation cost as the target cleaning trigger period. This implementation process includes the following steps:
The purpose of S5-1 is to turn on-site operating conditions into explicit time interval constraints, providing an executable boundary for the subsequent enumeration of candidate continuous time periods. The inputs are the operation resource data, the operational constraint data, and the unified sampling timestamp sequence covering the current prediction period and the subsequent prediction intervals of the target local array.
The processing actions include reading the number of available cleaning devices, the per-unit-time operation capacity of the cleaning equipment, the arrival time of the cleaning equipment, and the expected operation duration, as well as the allowed downtime periods, the prohibited operation periods, and the weather-related prohibition conditions. The expected operation duration is obtained by dividing the area to be cleaned in the target local array by the per-unit-time operation capacity of the cleaning equipment; when several cleaning devices are available, their per-unit-time operation capacities are summed and the total is used in the calculation. Then, following the unified sampling timestamp sequence, each sampling moment is checked to determine whether it is later than the arrival time of the cleaning equipment, falls within an allowed downtime period, lies outside every prohibited operation period, and does not meet any weather-related prohibition condition. Sampling moments that satisfy all of these conditions are marked as executable sampling moments, and consecutive adjacent executable sampling moments are merged into executable time periods to form the executable time period set. The output is the executable time period set, which is written into the operation time period register for S5-2 to read. Anomaly or missing data handling includes: when the data for the weather-related prohibition conditions is missing, the most recent valid weather status held by the edge computing node is used to fill the gap and a weather data completion mark is written; when the continuous duration of every executable time period is shorter than the expected operation duration, the output is marked as having no executable time period and the target local array is added to the list for later re-evaluation.
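The marking and merging of executable sampling moments in S5-1 can be sketched in Python as follows. The function names, the use of plain numbers (for example hours) as sampling times, and the representation of constraints as inclusive `(start, end)` windows are all illustrative assumptions. The sketch also assumes that a sampling time equal to the arrival time counts as executable.

```python
def expected_duration(area, capacities):
    """Area to be cleaned divided by the summed per-unit-time capacity."""
    return area / sum(capacities)


def executable_periods(times, arrival, allowed, prohibited, weather_blocked):
    """Merge consecutive executable sampling times into (start, end) periods.

    times:           sorted sampling times
    arrival:         arrival time of the cleaning equipment
    allowed:         list of (start, end) allowed downtime windows, inclusive
    prohibited:      list of (start, end) prohibited operation windows, inclusive
    weather_blocked: set of sampling times ruled out by weather
    """
    def inside(t, windows):
        return any(s <= t <= e for s, e in windows)

    ok = [t >= arrival and inside(t, allowed) and not inside(t, prohibited)
          and t not in weather_blocked for t in times]

    periods, start = [], None
    for i, flag in enumerate(ok):
        if flag and start is None:
            start = i                      # a new executable run begins
        if start is not None and (not flag or i == len(ok) - 1):
            end = i if flag else i - 1     # close the run at the last good sample
            periods.append((times[start], times[end]))
            start = None
    return periods


hours = list(range(8, 18))
print(expected_duration(1200.0, [200.0, 100.0]))  # 4.0
print(executable_periods(hours, arrival=9, allowed=[(8, 16)],
                         prohibited=[(12, 13)], weather_blocked={15}))
# [(9, 11), (14, 14), (16, 16)]
```

In this example only the period from 9 to 11 is long enough to be useful for a multi-hour job; the comparison against the expected operation duration is left to the caller.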
The purpose of S5-2 is to quantify, for the different candidate continuous time periods within the executable time period set, the additional benefit gained by waiting before cleaning and the operation cost, thereby generating comparable timing benefit values. The inputs are the cleaning benefit trajectory, the executable time period set, the expected operation duration, and the operation cost per unit time. The processing actions include enumerating, within each executable time period and with the unified sampling time scale as the step size, the candidate continuous time periods whose length is not less than the expected operation duration. The start sampling time of each candidate continuous time period represents the planned start time of cleaning, and the end sampling time represents the latest time to which waiting is allowed. For each candidate continuous time period, the total recoverable benefit and the total operation cost are then calculated. The total recoverable benefit is the cleaning benefit value at the end of the candidate continuous time period minus the cleaning benefit value at its start, representing the recoverable benefit newly accumulated during the waiting period. The total operation cost is the expected operation duration multiplied by the operation cost per unit time, which is configured in advance by the operation and maintenance side and includes at least the equipment energy consumption cost, labor cost, and downtime impact cost. The total recoverable benefit is then divided by the total operation cost to obtain the timing benefit value of the candidate continuous time period. The candidate continuous time period identifier, total recoverable benefit, total operation cost, and timing benefit value are associated and written into the benefit evaluation table.
The output is the benefit evaluation table covering all candidate continuous time periods, which is written into the sorting cache for S5-3 to read. Anomaly or missing data handling includes: when the cleaning benefit value at the end of a candidate continuous time period is less than the cleaning benefit value at its start, the total recoverable benefit of that period is recorded as zero; when the total operation cost is zero, the candidate continuous time period is marked as a cost-abnormal period and excluded from sorting.
The purpose of S5-3 is to select, from all candidate continuous time periods, the cleaning opportunity with the highest additional recoverable benefit per unit operation cost and to generate a cleaning timing prediction result that can be issued directly. The inputs are the benefit evaluation table, the executable time period set, and the target local array identifier. The processing actions include first sorting all candidate continuous time periods in the benefit evaluation table by timing benefit value from largest to smallest to obtain the ranking of candidate continuous time periods; when several candidate continuous time periods have the same timing benefit value, or differ by less than the tie judgment precision value in the edge computing node parameter file, the one with the earlier start sampling time is ranked first, where the tie judgment precision value is written in advance by the operation and maintenance side according to the measurement precision of the benefit value. The candidate continuous time period ranked first is then read and determined to be the target cleaning trigger period, and its start sampling time, end sampling time, timing benefit value, total recoverable benefit, and total operation cost are extracted to generate the cleaning timing prediction result of the target local array. The output is the cleaning timing prediction result corresponding to the target cleaning trigger period, which is written into the operation and maintenance decision output area to be read by the subsequent cleaning task scheduling module or sent back to the station-level management platform. Anomaly or missing data handling includes: when the benefit evaluation table is empty, a no-valid-candidate-period marker is output and acquisition of the operation resource data is triggered; when the start sampling time of the first-ranked candidate continuous time period is earlier than the current system time, that period is removed and the next candidate continuous time period is taken in order, until an executable target cleaning trigger period is obtained or a marker indicating that no cleaning is triggered in the current cycle is output.
The technical effect of this implementation process is that, by first forming the executable time period set and then enumerating and evaluating candidate continuous time periods within it, the cleaning benefit trajectory is converted into actual cleaning opportunities that match operation resources, operational constraints, and weather conditions, avoiding the unexecutable or inefficient operations that would result from triggering cleaning directly from the benefit curve. In practical applications, the edge computing node can first generate multiple executable time periods from the arrival time of the cleaning equipment, the area of the target local array, the per-unit-time operation capacity of the cleaning equipment, and the downtime plan for the day. Then, within each executable time period, candidate continuous time periods are enumerated on the unified sampling time scale.
The recoverable benefit accumulated from the start sampling time to the end sampling time and the corresponding operation cost are calculated. Finally, the candidate continuous time period with the highest timing benefit value is selected as the target cleaning trigger period, which is output to the operation and maintenance scheduling personnel together with the corresponding benefit and cost information.
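The enumeration of S5-2 and the selection of S5-3 can be sketched in Python as follows. The names are hypothetical; periods are given as inclusive index ranges into the cleaning benefit trajectory, and `min_steps` is assumed to be the expected operation duration expressed in sampling steps. The tie rule is simplified: among candidates whose timing benefit value lies within `tie_eps` of the maximum, the earliest start wins.

```python
def evaluate_candidates(benefit, periods, min_steps, expected_hours, unit_cost):
    """Build the benefit evaluation table for all candidate continuous periods."""
    total_cost = expected_hours * unit_cost
    if total_cost == 0:
        return []  # cost-abnormal: nothing is ranked
    rows = []
    for p_start, p_end in periods:
        for s in range(p_start, p_end + 1):
            for e in range(s + min_steps, p_end + 1):
                # Benefit newly accumulated while waiting, never negative.
                gain = max(benefit[e] - benefit[s], 0.0)
                rows.append({"start": s, "end": e, "gain": gain,
                             "cost": total_cost, "value": gain / total_cost})
    return rows


def pick_trigger_period(rows, now_index, tie_eps):
    """Pick the best candidate that does not start before the current time."""
    rows = [r for r in rows if r["start"] >= now_index]
    if not rows:
        return None  # no valid candidate period in this cycle
    best_value = max(r["value"] for r in rows)
    tied = [r for r in rows if best_value - r["value"] < tie_eps]
    return min(tied, key=lambda r: r["start"])


table = evaluate_candidates([0.0, 1.0, 3.0, 6.0, 6.5, 7.0], [(0, 5)],
                            min_steps=2, expected_hours=2.0, unit_cost=1.0)
print(pick_trigger_period(table, now_index=0, tie_eps=1e-9))
# {'start': 0, 'end': 5, 'gain': 7.0, 'cost': 2.0, 'value': 3.5}
```

If the current time has already passed index 0, calling `pick_trigger_period(table, now_index=1, tie_eps=1e-9)` falls back to the candidate starting at index 1, which mirrors the removal of expired candidates in S5-3.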

[0022] Working principle: This solution does not directly determine whether cleaning is necessary based on the current decline in power generation. Instead, it first organizes the power generation output data, environmental perception data, equipment status data, and historical cleaning records of the target local array into a current state sequence at the edge computing node. Then, it selects the historical operation segments that are closest to the current state from the historical operation records, extracts the power generation recovery characteristics of these segments before and after actual cleaning, and constructs the counterfactual cleaning reference trajectory that the local array should have in the current time period. Subsequently, it compares the reference trajectory with the current actual power generation output moment by moment, eliminates the impact of weather changes and equipment status changes, identifies the effective losses caused by dust accumulation, and deduces the cleaning benefit trajectory by combining wind speed and rainfall changes in subsequent time periods. Finally, it matches the cleaning equipment, operable time, and prohibited operating conditions to output the target cleaning trigger period. For example, in a large photovoltaic power plant, the power generation of a certain local array may be consistently low, but at the same time there are cloud changes, module temperature rise, and inverter power limitation. In this case, it is difficult to determine whether cleaning is really necessary based solely on the current power drop. This solution first reads the multi-source data of the local array in the current time period from the edge computing node, and then finds the historical segment that is closest to its current operating state from the historical records. 
Using the recovery situation before and after cleaning in these segments, the power generation trajectory that the local array should have in a cleaner state in the current time period is reconstructed. Then, the actual loss caused by dust accumulation is separated out, and combined with subsequent weather changes and the arrival of cleaning equipment and the time conditions for allowing shutdown operations, it is determined whether cleaning is suitable at the moment, or whether it should be postponed to a more suitable time period.

[0023] The above description is merely a preferred embodiment of the present invention and is not intended to limit the present invention. Any modifications, equivalent substitutions, improvements, etc., made within the spirit and principles of the present invention should be included within the protection scope of the present invention.

Claims

1. A method for predicting the timing of photovoltaic dust accumulation cleaning for operation and maintenance decision-making, characterized in that it comprises: S1. Obtain edge monitoring data of the target local array within the current prediction period, the edge monitoring data including power generation output data, environmental perception data, equipment status data, and historical cleaning records; perform time-series alignment and segmentation based on the edge monitoring data to generate the current state sequence of the target local array. S2. Read the set of historical operating segments of the target local array, perform candidate screening according to the similarity between the current state sequence and each historical operating segment in terms of the environmental change process, output change process, and state change process, and generate the set of candidate reference segments corresponding to the target local array. S3. Perform a cleaning power generation characterization rollback calculation on the candidate reference segment set, extract the cleaning recovery features corresponding to the current state sequence in each candidate reference segment, and perform reference fusion based on the matching consistency between each candidate reference segment and the current state sequence to generate the counterfactual cleaning reference trajectory of the target local array in the current prediction period. S4. Solve the time-by-time difference between the current state sequence and the counterfactual cleaning reference trajectory, separate the effective loss caused by dust accumulation from the disturbance loss caused by non-dust-accumulation factors, and generate the cleaning benefit trajectory of the target local array based on the continuous change of the effective loss in the subsequent prediction interval. S5. Perform a timing matching calculation between the cleaning benefit trajectory and the preset operation and maintenance execution conditions to determine the target cleaning trigger period of the target local array, and output the cleaning timing prediction result corresponding to the target cleaning trigger period.

2. The photovoltaic dust accumulation cleaning timing prediction method for operation and maintenance decision-making according to claim 1, characterized in that: S1 includes: S1-1. Perform time mapping and missing data completion processing on the power generation output data, environmental perception data, equipment status data and historical cleaning records according to the unified sampling time scale to generate a data alignment set corresponding to each sampling time. S1-2. Perform segmentation and slicing on the data alignment set according to the preset continuous time period division rules, extract the power generation output change characteristics, environmental change characteristics, equipment status change characteristics and cleaning behavior correlation characteristics in each segment, and generate data feature fragments corresponding to each segment. S1-3. Perform concatenation encoding according to the time sequence of each data feature segment to generate a current state sequence that represents the evolution process of the current operating state of the target local array.

3. The photovoltaic dust accumulation cleaning timing prediction method for operation and maintenance decision-making according to claim 2, characterized in that: S2 includes the following steps: S2-1. Slice the current state sequence and each historical operating segment into segments of the same time length, and extract the time-series change values of irradiance, ambient temperature, component temperature, wind speed, and rainfall in each segment as the environmental change process. Extract the time-series change values of array DC voltage, array DC current, AC active power, and cumulative power generation increment as the output change process. Extract the time-series change values of inverter operating status code, power limit flag, fault alarm code, string on/off status, and communication quality indicators as the state change process. Generate the current process feature segment and the historical process feature segment. S2-2. Compare each current process feature segment with its corresponding historical process feature segment according to time position, calculate the cumulative result of the environmental variable change difference at each time, the cumulative result of the output variable change difference at each time, and the combined result of the state variable change difference and the state switching difference number at each time, and generate the environmental process difference value, output process difference value and state process difference value corresponding to each historical operating segment. S2-3. Sort the environmental process difference value, output process difference value and state process difference value of each historical operating segment in the same direction, retain the historical operating segments in which the three types of process difference values are simultaneously in the first position of the sort, and generate a set of candidate reference segments corresponding to the target local array.

4. The photovoltaic dust accumulation cleaning timing prediction method for operation and maintenance decision-making according to claim 3, characterized in that: S3 includes: S3-1. Locate the cleaning start time and cleaning end time of each candidate reference segment in the candidate reference segment set. Extract the post-cleaning output sequence that follows the cleaning end time and has the same length as the current prediction period. Take, time by time, the difference between the post-cleaning output sequence and the pre-cleaning output sequence of the candidate reference segment that precedes the cleaning start time and occupies the same position as the current prediction period, and generate the cleaning recovery feature sequence corresponding to each candidate reference segment. S3-2. Add the environmental process difference value, output process difference value and state process difference value of each candidate reference segment to generate the total segment difference value, and divide the reciprocal of the total segment difference value of each candidate reference segment by the sum of the reciprocals of the total segment difference values of all candidate reference segments to generate the matching consistency coefficient corresponding to each candidate reference segment.
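
As a rough sketch of S3-1 and S3-2: the recovery sequence is a time-by-time difference, and the matching consistency coefficient is a normalized inverse of the total segment difference. The function names are hypothetical. The sign convention (post-cleaning minus pre-cleaning, so that a positive value means recovered output) and the handling of a zero total difference are assumptions.

```python
def cleaning_recovery(pre_output, post_output):
    """S3-1 sketch. Assumption: recovery = post-cleaning minus pre-cleaning."""
    return [post - pre for pre, post in zip(pre_output, post_output)]

def matching_coefficients(diffs):
    """S3-2 sketch: inverse total difference, normalized to sum to 1.

    diffs maps segment id -> (env_diff, output_diff, state_diff).
    Assumption: a non-positive total is rejected, because the claim
    does not say how a perfect match (total of zero) is handled.
    """
    totals = {seg: sum(d) for seg, d in diffs.items()}
    if any(t <= 0 for t in totals.values()):
        raise ValueError("total segment difference must be positive")
    inverse = {seg: 1.0 / t for seg, t in totals.items()}
    norm = sum(inverse.values())
    return {seg: inv / norm for seg, inv in inverse.items()}

# Hypothetical values: segment "a" matches three times better than "b".
recovery = cleaning_recovery([8.0, 9.0], [10.0, 10.5])
coefficients = matching_coefficients({"a": (0.5, 0.25, 0.25),
                                      "b": (1.0, 1.0, 1.0)})
```

Because the coefficient is inversely proportional to the total difference, a segment with one third of the mismatch receives three times the weight.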

5. The photovoltaic dust accumulation cleaning timing prediction method for operation and maintenance decision-making according to claim 4, characterized in that: S3 further includes: S3-3. Map the cleaning recovery feature sequence of each candidate reference segment onto the sampling times of the current state sequence by time position. Starting from the output value of the current power generation output sequence at each sampling time, add the cleaning recovery increment of the corresponding sampling time, time by time, to generate the candidate recovery output sequence corresponding to each candidate reference segment. Then, calculate the deviation correction amount of each candidate recovery output sequence relative to the environmental change process and state change process in the current state sequence at the same sampling time. Fuse the corrected candidate recovery output sequences position by position, weighting each by its matching consistency coefficient, to generate the counterfactual cleaning reference trajectory of the target local array, which unfolds continuously in chronological order over the current prediction period.
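
The fusion in S3-3 can be sketched as a weighted sum per sampling time. The names below are hypothetical, and the deviation correction is treated as a precomputed input that is subtracted from the candidate output; the claim does not give a formula for the correction, so both its value and its sign are assumptions here.

```python
def counterfactual_trajectory(current_output, recoveries, coefficients,
                              corrections=None):
    """Weighted fusion of corrected candidate recovery output sequences.

    recoveries:   segment id -> cleaning recovery increments per time
    coefficients: segment id -> matching consistency coefficient
    corrections:  segment id -> deviation correction per time (optional;
                  assumption: subtracted from the candidate output)
    """
    n = len(current_output)
    corrections = corrections or {}
    trajectory = [0.0] * n
    for seg, recovery in recoveries.items():
        corr = corrections.get(seg, [0.0] * n)
        for t in range(n):
            candidate = current_output[t] + recovery[t] - corr[t]
            trajectory[t] += coefficients[seg] * candidate
    return trajectory

# Hypothetical inputs for two candidate reference segments.
current = [10.0, 12.0]
recoveries = {"a": [2.0, 2.0], "b": [4.0, 0.0]}
weights = {"a": 0.75, "b": 0.25}
reference = counterfactual_trajectory(current, recoveries, weights)
corrected = counterfactual_trajectory(current, recoveries, weights,
                                      corrections={"a": [1.0, 0.0]})
```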

6. The photovoltaic dust accumulation cleaning timing prediction method for operation and maintenance decision-making according to claim 5, characterized in that: S4 includes: S4-1. Match the current power generation output sequence in the current state sequence with the counterfactual cleaning reference trajectory at the same sampling times, calculate the output difference at each sampling time, and synchronously read the irradiance change value, ambient temperature change value, component temperature change value, wind speed change value, rainfall change value, inverter operating status code change value, power limit flag change value, fault alarm code change value, and string on / off status change value corresponding to each sampling time to generate a time-by-time difference data set. S4-2. For each time-by-time difference data set, calculate the environmental disturbance interpretation value and the state disturbance interpretation value respectively. The environmental disturbance interpretation value is the sum of the results obtained by multiplying the irradiance change value, ambient temperature change value, component temperature change value, wind speed change value, and rainfall change value at each sampling time by the response ratio of the same type of variable to the output difference in the corresponding historical operating segment. The state disturbance interpretation value is the sum of the output offsets corresponding to the inverter operating status code change value, power limit flag change value, fault alarm code change value, and string on / off status change value at each sampling time. Then, subtract the environmental disturbance interpretation value and the state disturbance interpretation value from the output difference at each sampling time to generate the residual difference at each sampling time.
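
A minimal sketch of the residual computation in S4-1 and S4-2 follows. The function and parameter names are hypothetical. Two assumptions are made: the output difference is taken as reference minus current output (so a positive value is a shortfall), and the output offsets caused by state changes are supplied already converted to output units.

```python
def residual_differences(current_output, reference, env_changes,
                         response_ratios, state_offsets):
    """Residual = output difference - environmental part - state part.

    env_changes:     variable name -> change value per sampling time
    response_ratios: variable name -> response ratio to the output difference
    state_offsets:   state name -> output offset per sampling time
    """
    residuals = []
    for t, (cur, ref) in enumerate(zip(current_output, reference)):
        output_diff = ref - cur  # assumption: positive means lost output
        env_part = sum(response_ratios[name] * series[t]
                       for name, series in env_changes.items())
        state_part = sum(series[t] for series in state_offsets.values())
        residuals.append(output_diff - env_part - state_part)
    return residuals

# Hypothetical two-sample example with one variable of each kind.
residuals = residual_differences(
    current_output=[10.0, 9.0],
    reference=[12.0, 12.0],
    env_changes={"irradiance": [1.0, 2.0]},
    response_ratios={"irradiance": 0.5},
    state_offsets={"power_limit": [0.0, 1.0]},
)
```

What remains after both explained parts are removed is the portion of the shortfall that weather and equipment state do not account for.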

7. The photovoltaic dust accumulation cleaning timing prediction method for operation and maintenance decision-making according to claim 6, characterized in that: S4 further includes: S4-3. Arrange the residual differences at each sampling time in chronological order, calculate the difference between the residual differences at adjacent sampling times, and determine the consecutive sampling periods where the absolute value of the difference between the residual differences at adjacent sampling times is less than the absolute value of the residual difference at the previous sampling time as smooth continuation segments, and determine the remaining sampling periods as abrupt disturbance segments. Then, record the residual differences in the smooth continuation segments as effective loss candidate values, and record the residual differences in the abrupt disturbance segments as disturbance loss candidate values.
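
The rule in S4-3 compares the jump between adjacent residual differences with the magnitude of the earlier one. A small sketch, with hypothetical names, and with the assumption that the first sample stays unlabeled because it has no predecessor:

```python
def label_residuals(residuals):
    """Label each sample 'smooth' or 'abrupt' by the S4-3 rule."""
    labels = [None]  # assumption: first sample has no predecessor
    for prev, cur in zip(residuals, residuals[1:]):
        is_smooth = abs(cur - prev) < abs(prev)
        labels.append("smooth" if is_smooth else "abrupt")
    return labels

def split_candidates(residuals):
    """Split residuals into effective and disturbance loss candidates."""
    labels = label_residuals(residuals)
    effective = [r for r, lab in zip(residuals, labels) if lab == "smooth"]
    disturbance = [r for r, lab in zip(residuals, labels) if lab == "abrupt"]
    return effective, disturbance

# Hypothetical residual differences with two sudden jumps.
residuals = [2.0, 2.5, 6.0, 6.5, 0.0]
labels = label_residuals(residuals)
effective, disturbance = split_candidates(residuals)
```

The jump from 2.5 to 6.0 exceeds the earlier magnitude and is labeled abrupt, while the drift from 6.0 to 6.5 is labeled smooth.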

8. The photovoltaic dust accumulation cleaning timing prediction method for operation and maintenance decision-making according to claim 7, characterized in that: S4 further includes: S4-4. Read the power generation recovery sequence following each cleaning event in the historical cleaning records. Compare each smooth continuation segment with each power generation recovery sequence over the same time length, and calculate the cumulative absolute difference at corresponding times. Determine the power generation recovery sequence with the smallest cumulative result as the corresponding cleaning recovery sequence of that smooth continuation segment. Then, determine the effective loss candidate values whose change direction is consistent with that of the corresponding cleaning recovery sequence as the effective loss amount, and incorporate the effective loss candidate values whose change direction is inconsistent into the disturbance loss amount.
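
S4-4 combines a nearest-sequence search with a direction check. The sketch below uses hypothetical names and two assumptions: longer recovery sequences are truncated to the segment length to satisfy the consistent-length requirement, and "change direction" is read as the sign of the step between adjacent samples.

```python
def best_recovery_match(segment, recovery_sequences):
    """Pick the recovery sequence with the smallest cumulative |difference|."""
    n = len(segment)
    # Assumption: truncate longer sequences, skip shorter ones.
    candidates = [r[:n] for r in recovery_sequences if len(r) >= n]
    if not candidates:
        raise ValueError("no recovery sequence is long enough")
    return min(candidates,
               key=lambda r: sum(abs(a - b) for a, b in zip(segment, r)))

def sign(x):
    return (x > 0) - (x < 0)

def classify_losses(segment, recovery):
    """Split candidates by whether their step direction matches recovery."""
    effective, disturbance = [], []
    for t in range(1, len(segment)):
        same = (sign(segment[t] - segment[t - 1])
                == sign(recovery[t] - recovery[t - 1]))
        (effective if same else disturbance).append(segment[t])
    return effective, disturbance

# Hypothetical smooth continuation segment and two recovery sequences.
segment = [1.0, 2.0, 1.5]
matched = best_recovery_match(segment, [[5.0, 6.0, 7.0],
                                        [1.0, 2.5, 3.0, 9.0]])
effective, disturbance = classify_losses(segment, matched)
```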

9. The photovoltaic dust accumulation cleaning timing prediction method for operation and maintenance decision-making according to claim 8, characterized in that: S4 further includes: S4-5. Extend the effective loss amount to subsequent prediction intervals in chronological order. At each future sampling time, take the effective loss amount of the previous sampling time, subtract the scouring reduction amount caused by the change in rainfall at that future sampling time, and add the dust increase caused by the change in wind speed, to generate the continuous loss amount for each future sampling time. Then, combine the effective loss amount in the current prediction period with the continuous loss amount in the subsequent prediction interval in chronological order to generate the cleaning benefit trajectory of the target local array.
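
The recursion in S4-5 can be written as loss(t) = loss(t-1) - scouring(t) + dust(t). The sketch below uses hypothetical names and takes the rainfall and wind terms as already converted to loss units. Clamping the loss at zero is an added assumption that is not stated in the claim.

```python
def extend_loss(last_effective_loss, rain_reduction, wind_increase):
    """Roll the loss forward over future sampling times.

    Assumption (not in the claim): the loss is clamped at zero, so heavy
    rain cannot produce a negative dust accumulation loss.
    """
    losses = []
    prev = last_effective_loss
    for wash, dust in zip(rain_reduction, wind_increase):
        prev = max(0.0, prev - wash + dust)
        losses.append(prev)
    return losses

def cleaning_benefit_trajectory(effective_losses, rain_reduction,
                                wind_increase):
    """Current-period losses followed by the extended future losses."""
    future = extend_loss(effective_losses[-1], rain_reduction, wind_increase)
    return list(effective_losses) + future

# Hypothetical forecast: light rain, then heavy rain, then a dusty wind.
trajectory = cleaning_benefit_trajectory(
    effective_losses=[1.5, 2.0],
    rain_reduction=[0.5, 3.0, 0.0],
    wind_increase=[0.25, 0.0, 0.5],
)
```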

10. The photovoltaic dust accumulation cleaning timing prediction method for operation and maintenance decision-making according to claim 9, characterized in that: S5 includes the following steps: S5-1. Obtain the operation resource data and operation constraint data corresponding to the target local array. The operation resource data includes the number of available cleaning equipment, the operating capacity of the cleaning equipment per unit time, the arrival time of the cleaning equipment and the expected operation duration. The operation constraint data includes the allowed downtime periods, the prohibited operation periods and the weather-related prohibition conditions. Generate a set of executable time periods based on the operation resource data and operation constraint data. S5-2. Enumerate candidate continuous time periods of the cleaning benefit trajectory within the set of executable time periods. For each candidate continuous time period, calculate its total recoverable revenue and total operating cost. The total recoverable revenue is the difference between the cleaning benefit value at the start time and the cleaning benefit value at the end time of that candidate continuous time period. The total operating cost is the product of the expected operation duration and the operating cost per unit time. Calculate the ratio of the total recoverable revenue to the total operating cost as the timing benefit value. S5-3. Sort the timing benefit values of the candidate continuous time periods, select the candidate continuous time period with the largest timing benefit value as the target cleaning trigger time period, and output the cleaning timing prediction result corresponding to the target cleaning trigger time period.
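
The enumeration in S5-2 and S5-3 can be sketched as a sliding window over the cleaning benefit trajectory. The names are hypothetical, and the set of executable time periods is assumed to be already reduced to a per-sample boolean mask; deriving that mask from equipment availability and prohibition rules is outside this sketch.

```python
def best_cleaning_window(benefit, executable, duration, unit_cost):
    """Return (timing benefit value, start index, end index) or None.

    benefit:    cleaning benefit value per sampling time
    executable: True where cleaning is allowed (assumed precomputed)
    duration:   expected operation duration, in sampling steps
    unit_cost:  operating cost per sampling step
    """
    best = None
    total_cost = duration * unit_cost
    for start in range(len(benefit) - duration + 1):
        end = start + duration - 1
        if not all(executable[start:end + 1]):
            continue  # window overlaps a prohibited period
        revenue = benefit[start] - benefit[end]  # start minus end, per S5-2
        score = revenue / total_cost
        if best is None or score > best[0]:
            best = (score, start, end)
    return best

# Hypothetical trajectory where the first sample is a prohibited period.
result = best_cleaning_window(
    benefit=[1.0, 4.0, 3.0, 1.0, 0.5],
    executable=[False, True, True, True, True],
    duration=2,
    unit_cost=1.0,
)
```

Taking the maximum directly is equivalent to sorting the timing benefit values and selecting the first entry, as S5-3 describes.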