A power station fault intelligent diagnosis method and system based on adaptive reference matching
By using adaptive benchmark matching technology, a personalized health status model is constructed, which solves the problems of threshold limitations and environmental adaptability in photovoltaic power plant fault diagnosis, and realizes high-precision, robust fault diagnosis and refined root cause localization.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- ZHONGLAI ZHILIAN ENERGY ENG CO LTD
- Filing Date
- 2026-03-11
- Publication Date
- 2026-06-26
Description
Technical Field
[0001] This invention relates to the field of power technology, and in particular to photovoltaic power plant fault detection technology, specifically a power plant fault intelligent diagnosis method and system based on adaptive benchmark matching.
Background Technology
[0002] Currently, fault diagnosis for distributed photovoltaic power stations mainly faces the following technical bottlenecks:
[0003] Limitations of state thresholds: Existing technologies mostly use fixed thresholds or thresholds based on simple statistics (such as global mean and standard deviation) for anomaly detection. However, these methods are usually unable to adapt to the complex and ever-changing environmental conditions in power plant operation (such as continuous fluctuations in irradiance and temperature) and the slow natural degradation of equipment performance, resulting in both high false alarm rates and high missed alarm rates.
[0004] Lack of adaptability in baseline data: Most existing diagnostic methods rely on a single preset theoretical model or on empirical values as the performance benchmark. When the actual operating environment of the power plant (such as geographical location and climate type) or the equipment model does not match the preset benchmark, diagnostic accuracy drops significantly, because such methods cannot learn the operating characteristics of a specific power plant.
[0005] Diagnostic logic is singular and isolated: Most common diagnostic methods either only detect "missing strings" (complete failure) or only analyze "dispersion rate" (performance inconsistency), and their diagnostic dimensions are singular. Among them, the few methods that combine multiple indicators have failed to solve the problem of conflict and integration of analysis results of different indicators in complex scenarios, and have not made full use of the physical topology of the power plant for root cause localization.
[0006] Weak ability to cope with rare operating conditions: Under extreme or rare weather conditions, the diagnostic system is prone to failure or giving unreliable results due to the lack of corresponding historical health data as a reference.
[0007] Therefore, a fault diagnosis solution is needed that can autonomously construct personalized operating baselines, intelligently adapt to environmental changes, and integrate multi-dimensional evidence for accurate reasoning, in order to overcome the inherent defects of static models.
Summary of the Invention
[0008] This invention aims to provide a method and system for intelligent fault diagnosis of power plants based on adaptive benchmark matching. Its core objective is to address the problems of static and rigid benchmark data and poor environmental adaptability in existing technologies. By introducing an adaptive architecture that combines offline historical learning with online dynamic benchmark matching, the system can establish personalized health status models for specific power plants and intelligently match or generate the most suitable benchmark data under different environmental conditions, thereby achieving high-precision and robust fault diagnosis and location.
[0009] To achieve the above-mentioned technical objectives, the present invention provides the following technical solution.
[0010] Option 1: This invention provides a power plant fault intelligent diagnosis method based on adaptive benchmark matching, characterized by the following steps:
[0011] Step S1. Construction of offline benchmark model library:
[0012] Collect long-term time-series string-level operating data and synchronous environmental data of the target photovoltaic power station during its historical healthy operation cycle. The operating data and the environmental data each include one or more types.
[0013] Based on any one or more types of environmental data collected, a one-dimensional or multi-dimensional feature space is constructed. Each dimension axis of the feature space is divided into multiple continuous intervals, and the feature space is thereby divided into several cells.
[0014] Based on the environmental data range of each cell and the correspondence between the string-level operating data and the synchronous environmental data, the collected operating data is assigned to the corresponding cells, and for every cell that contains operating data, the statistical characteristics of each type of operating data assigned to it are calculated independently.
[0015] All obtained statistical characteristics are associated with their corresponding cells and stored in a unified database to build an offline benchmark model library;
[0016] Step S2. Online adaptive benchmark acquisition:
[0017] Before online diagnosis, the current operating parameters and environmental parameters of the target photovoltaic power station are obtained, and the target cell in the feature space to which the current environmental parameters belong is determined;
[0018] If the target cell has a record in the benchmark model library, the statistical characteristics of the target cell are directly retrieved as benchmark data output. The benchmark data is the comparison standard for analyzing the current running data in the diagnostic phase.
[0019] If the target cell has no record in the benchmark model library, search the library for the one or more other cells closest to the target cell, and obtain either the statistical characteristics of the single closest cell or the weighted average of the statistical characteristics of the multiple closest cells.
[0020] Based on preset standards, determine whether the statistical characteristics of the closest cell, or the weighted-average values, are within a reliable range. If so, output them as the benchmark data; otherwise, retrieve global data derived from statistics of the entire station's historical data or from preset empirical values, and output it as the comparison standard for analyzing the current operating data in the diagnostic stage.
[0021] Based on the above solutions, further improvements or preferred solutions include:
[0022] Furthermore, the diagnostic method of the present invention includes a process of periodically updating the offline benchmark model library, in which the library is rebuilt from a historical healthy operating cycle that is closer to the current time.
[0023] Furthermore, after determining the target cell, step S2 specifically performs the following sub-steps:
[0024] Step S21. Determine whether there is a relevant record for the target cell in the benchmark model library. If there is, directly retrieve the statistical characteristics of the target cell as the benchmark data output. If there is no record, proceed to step S22.
[0025] Step S22. Calculate the distance parameter from the parameter point corresponding to the current environmental parameter to the center point of each environmental condition cell in the benchmark model library, determine the cell with the smallest distance parameter, and determine whether its distance parameter to the parameter point exceeds the preset confidence threshold. If it does not exceed the threshold, output the statistical characteristics of the cell as the benchmark data; if it exceeds the threshold, proceed to step S23.
[0026] Step S23. Using the center point of each environmental condition cell as a reference point, select the multiple reference points in the benchmark model library that have the smallest distance to the parameter point of the current environmental parameters. Using the reciprocal of the distance between each reference point and that parameter point as the weight, perform weighted average interpolation on the corresponding statistical characteristics of the cells to which the reference points belong, and output the generated data as the benchmark data.
[0027] Furthermore, before outputting the generated data as reference data in step S23, the following steps are performed:
[0028] Determine whether the generated data is within a preset confidence range. If it is within the confidence range, output the generated data as the baseline data; otherwise, proceed to step S24.
[0029] Step S24. Retrieve the global baseline data, derived either from statistics of the entire station's historical data or from preset empirical values, and output it as the comparison standard for analyzing the current operating data in the diagnostic stage.
[0030] Furthermore, the environmental parameters include irradiance and temperature, and in step S1, a two-dimensional feature space is preferably constructed based on irradiance and temperature.
[0031] Furthermore, the statistical characteristics include at least one or more of the following: average power, power standard deviation, coefficient of variation, and reasonable upper and lower bounds of power based on a normal distribution.
[0032] Furthermore, the statistical characteristics include reasonable upper and lower bound parameters for power based on a normal distribution, and the diagnostic phase includes the following steps:
[0033] Step S31. Adaptive missing string diagnosis: Using the reasonable lower bound parameter of power in the baseline data output in step S2 as a dynamic threshold, strings whose power is continuously lower than this threshold within a preset time are marked as "suspected missing strings". A topology confidence check is introduced. If the adjacent strings of a string marked as "suspected missing strings" are running normally, its missing string confidence parameter is increased. If the adjacent strings are also abnormal, its missing string confidence parameter is decreased.
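The check in step S31 can be sketched as follows. This is a minimal illustration rather than the claimed implementation: the function name, the starting confidence of 0.5, the adjustment step of 0.3, the three-sample persistence window, and the treatment of a mixed neighborhood (any abnormal neighbor lowers the confidence) are all assumptions made for the example.

```python
def diagnose_missing_strings(power_history, lower_bound, neighbors,
                             min_consecutive=3, base_conf=0.5, step=0.3):
    """Return {string_id: confidence} for suspected missing strings.

    power_history: {string_id: [power samples, oldest first]}
    lower_bound:   reasonable lower power bound from the benchmark data
    neighbors:     {string_id: [adjacent string ids]} (hypothetical topology input)
    """
    def persistently_low(sid):
        recent = power_history[sid][-min_consecutive:]
        return len(recent) == min_consecutive and all(p < lower_bound for p in recent)

    suspects = {sid for sid in power_history if persistently_low(sid)}
    result = {}
    for sid in suspects:
        adjacent = [n for n in neighbors.get(sid, []) if n in power_history]
        conf = base_conf
        if adjacent:
            if all(n not in suspects for n in adjacent):
                conf += step   # neighbors healthy: the fault is likely local to this string
            else:
                conf -= step   # neighbors also low: possibly environmental interference
        result[sid] = min(1.0, max(0.0, conf))
    return result


history = {
    "S1": [5, 4, 3], "S2": [100, 101, 99], "S3": [98, 100, 102],
    "S4": [2, 3, 1], "S5": [1, 2, 2],
}
topology = {"S1": ["S2"], "S4": ["S5"], "S5": ["S4"]}
print(diagnose_missing_strings(history, lower_bound=75.0, neighbors=topology))
```

In this toy data, S1 sits next to a healthy string and gains confidence, while S4 and S5 are adjacent and both low, so each loses confidence.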
[0034] Furthermore, the statistical characteristics include the coefficient of variation (CV_baseline), and the diagnostic phase includes the following steps:
[0035] Step S32. Dynamic Discreteness Analysis: Among the strings marked as "suspected missing strings", strings with a missing string confidence level exceeding the preset standard are excluded. The coefficient of variation (CV_curr) and dispersion deviation (ΔCV) of the current power of the remaining normal strings are calculated, where ΔCV = CV_curr - CV_baseline, and CV_baseline is the power coefficient of variation in the baseline data output in step S2. If the dispersion deviation (ΔCV) exceeds the set threshold, it is determined that the power plant has performance discrepancies, providing evidence for root cause inference.
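A small sketch of the ΔCV calculation in step S32 is given below. The function name, the 0.05 threshold, and the use of the population standard deviation are assumptions; the text only fixes the relation ΔCV = CV_curr - CV_baseline.

```python
import statistics

def dispersion_deviation(string_power, cv_baseline, excluded=(), threshold=0.05):
    """Return (cv_curr, delta_cv, flagged) for the strings not in `excluded`.

    string_power: {string_id: current power}
    excluded:     ids of high-confidence missing strings to leave out
    """
    powers = [p for sid, p in string_power.items() if sid not in excluded]
    mu = statistics.fmean(powers)
    cv_curr = statistics.pstdev(powers) / mu   # population std is an assumption
    delta_cv = cv_curr - cv_baseline
    return cv_curr, delta_cv, delta_cv > threshold


current = {"S1": 0.0, "S2": 100.0, "S3": 110.0, "S4": 90.0}
print(dispersion_deviation(current, cv_baseline=0.05, excluded={"S1"}))
print(dispersion_deviation(current, cv_baseline=0.05))
```

The second call shows why the exclusion matters: leaving the dead string S1 in the sample inflates CV_curr and would report a dispersion problem that is really a missing string.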
[0036] Furthermore, the diagnostic phase includes the following steps:
[0037] Step S33. Spatial Correlation Analysis: Based on the real-time collected string power and the average power in the baseline data output in step S2, identify inefficient strings whose power is lower than the baseline data; for the inefficient strings, analyze their spatial distribution clustering by combining the electrical topology and physical layout diagram of the power plant; if inefficient strings are concentrated in the same electrical branch or physical area, it is determined that the corresponding strings have clustered anomalies, providing evidence for root cause inference.
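One possible way to test whether inefficient strings concentrate in the same electrical branch is sketched below. The 0.8 ratio used to call a string inefficient, the minimum share and count that define a cluster, and the `branch_of` mapping are illustrative assumptions; the text does not specify them.

```python
from collections import defaultdict

def cluster_anomalies(string_power, mean_baseline, branch_of,
                      ratio=0.8, min_share=0.5, min_count=2):
    """Return (inefficient_ids, {branch: [inefficient ids]}) for clustered branches.

    branch_of: {string_id: branch or area id}, a stand-in for the plant's
               electrical topology or physical layout.
    """
    low = {sid for sid, p in string_power.items() if p < ratio * mean_baseline}
    members = defaultdict(list)
    for sid in string_power:
        members[branch_of[sid]].append(sid)

    clusters = {}
    for branch, sids in members.items():
        hits = sorted(s for s in sids if s in low)
        # A branch counts as clustered when enough of its strings are inefficient.
        if len(hits) >= min_count and len(hits) / len(sids) >= min_share:
            clusters[branch] = hits
    return low, clusters


power = {"A1": 60, "A2": 65, "A3": 98, "B1": 100, "B2": 70, "B3": 102}
branches = {"A1": "CB-A", "A2": "CB-A", "A3": "CB-A",
            "B1": "CB-B", "B2": "CB-B", "B3": "CB-B"}
print(cluster_anomalies(power, mean_baseline=100.0, branch_of=branches))
```

Here two of the three strings on branch CB-A are inefficient, so that branch is reported as a cluster, while the single low string on CB-B is not.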
[0038] Furthermore, the present invention also includes the following steps:
[0039] Step S4. Integrating Decision Making and Root Cause Inference:
[0040] Evidence fusion: Taking into account all the diagnostic results in step S3, a rule engine is applied to make a fusion decision on the diagnostic results. The rule engine includes conflict resolution rules.
[0041] Root cause inference: Based on the fused diagnostic results and the spatial distribution clustering information obtained from the analysis, fault type subdivision and root cause inference are performed;
[0042] Step S5. Diagnostic Output and Alarms:
[0043] Generate a structured diagnostic report, which includes fault information and targeted operation and maintenance suggestions generated based on root cause inference, and trigger tiered alarms.
[0044] Option 2: This invention provides a power plant fault intelligent diagnosis system based on adaptive benchmark matching, used to implement the diagnosis method described in any of the preceding claims, characterized in that it includes:
[0045] The data acquisition and preprocessing module is used to connect to the target photovoltaic power station monitoring system, acquire string-level operating data and synchronous environmental data in real time, and preprocess them.
[0046] The offline benchmark model library construction module includes an environment interval division unit, a statistical feature calculation unit, and a data storage unit. The environment interval division unit constructs a one-dimensional or multi-dimensional feature space from any one or more types of collected environmental data, divides each dimension axis of the feature space into multiple continuous intervals, and thereby divides the feature space into several cells. The statistical feature calculation unit assigns the collected operating data to the corresponding cells, based on the environmental data range of each cell and the correspondence between the string-level operating data and the synchronous environmental data, and independently calculates the statistical features of each type of operating data for every cell that contains operating data. The data storage unit associates all obtained statistical features with the corresponding cells and stores them in a unified database to build the offline benchmark model library.
[0047] The online adaptive benchmark acquisition engine includes an exact matching queryer, a weighted distance calculator, a dynamic interpolator, and a global benchmark generator. It executes step S2 based on a multi-level benchmark matching strategy to obtain suitable benchmark data for diagnostics.
[0048] The parallel fault analysis engine uses the benchmark data output by the online adaptive benchmark acquisition engine, together with the spatial correlation between strings, to perform fault analysis on each string of the power plant from multiple aspects.
[0049] The fusion decision and knowledge base module applies a rule engine to make fusion decisions on the various diagnostic results output by the parallel fault analysis engine, and performs fault type subdivision and root cause inference based on the fused diagnostic results and spatial correlation information between strings.
[0050] The alarm and report generation module is used to generate structured diagnostic reports, which include fault information and targeted operation and maintenance suggestions generated based on root cause inference, and trigger hierarchical alarms.
[0051] The beneficial effects of this invention are:
[0052] 1) Strong baseline adaptability and high diagnostic accuracy: This invention combines "offline learning of personalized baselines" with "online multi-level dynamic matching" to completely eliminate static thresholds, enabling the diagnostic baseline to be intelligently adjusted according to environmental conditions, greatly reducing false alarms caused by fluctuations in environmental factors and significantly improving diagnostic accuracy.
[0053] 2) Excellent system robustness: In a further technical solution, the multi-level adaptive strategy designed in this invention can ensure that the diagnostic system can obtain usable benchmark data under various environmental conditions, from common to rare, thereby improving the continuity and reliability of the diagnostic system's functions.
[0054] 3) Achieving refined root cause localization: In a further technical solution, this invention deeply integrates adaptive benchmark analysis results with the electrical topology and spatial layout information of the power plant, which not only enables the discovery of faults, but also effectively distinguishes between equipment-level, branch-level, and regional-level faults, helping to infer potential causes and providing clear action guidelines for operation and maintenance.
[0055] 4) Possesses self-evolution potential: The diagnostic system architecture of this invention supports the continuous injection of new healthy operation data into the offline learning module and the regular updating of the benchmark model library, enabling the system to adjust the benchmark according to the natural aging of power plant equipment, and possessing long-term adaptability and self-optimization capabilities.
Attached Figure Description
[0056] Figure 1 is a flowchart of the overall architecture and adaptive benchmark acquisition strategy of the diagnostic system of the present invention in a specific embodiment.
[0057] Figure 2 is a schematic diagram of the gridded division of environmental condition intervals in a two-dimensional feature space in a specific embodiment.
[0058] Figure 3 is a schematic diagram of the combined online matching and interpolation strategies in a specific embodiment.
[0059] Figure 4 is a schematic diagram of user interface area one of an application example of the system of the present invention.
[0060] Figure 5 is a schematic diagram of user interface area two of an application example of the system of the present invention.
Detailed Implementation
[0061] The core of this invention lies in constructing a "data-driven, hierarchical, and dynamically adaptive" intelligent diagnostic system framework. In the offline phase, the diagnostic system uses historical healthy operation data of the power plant, especially recent data, to learn the normal operating characteristics of the power plant within different environmental condition ranges through machine learning methods, constructing a benchmark model library over one or more dimensions. During online diagnosis, the diagnostic system senses the current environmental parameters in real time and initiates a multi-level, degradable adaptive benchmark acquisition strategy, which matches or dynamically generates from the benchmark model library the benchmark data that best fits the current conditions. Based on this adaptive benchmark, it then performs missing string diagnosis, power dispersion analysis, and spatial correlation analysis in parallel. Finally, it outputs diagnostic conclusions through a fusion decision engine.
[0062] Embodiment 1:
[0063] A power plant fault intelligent diagnosis method based on adaptive benchmark matching, the specific implementation process of which includes the following stages:
[0064] (I) Construction of an offline personalized benchmark model library
[0065] 1.1 Data Preparation
[0066] The system collects long-term string-level operational data (such as current and power) and synchronous environmental data (such as irradiance and temperature) from the target photovoltaic power plant during its historical healthy operating cycle (such as the initial fault-free period after commissioning). "Long-term" refers to a period in which sufficient historical data can be obtained; the length of this period can be adjusted during the testing phase of the diagnostic system based on system operating efficiency and the accuracy of the diagnostic results.
[0067] 1.2 Environmental Condition Interval Gridding
[0068] Based on the collected irradiance and temperature data, a two-dimensional irradiance-temperature feature space is constructed. The irradiance parameter axis and temperature parameter axis of the two-dimensional feature space are each divided into several continuous intervals at equal intervals. For example, irradiance is divided into [0,200), [200,400), ... (W / m²), and temperature is divided into [-10,0), [0,10), ... (°C). Combining the division of the irradiance parameter axis and temperature parameter axis, the two-dimensional feature space is planned into multiple closely arranged but non-overlapping rectangular intervals, forming discrete environmental condition cells.
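The gridding described above amounts to mapping each (irradiance, temperature) pair to the index of a half-open rectangular cell. The following is a minimal sketch that takes the bin widths and origins from the example intervals in the text (200 W/m² starting at 0, 10 °C starting at -10); the function names are hypothetical.

```python
import math

# Bin widths and axis origins taken from the example intervals in the text;
# a real deployment would tune these per plant.
IRR_STEP, IRR_MIN = 200.0, 0.0       # W/m^2
TEMP_STEP, TEMP_MIN = 10.0, -10.0    # degrees C

def cell_index(irradiance, temperature):
    """Return the (i, j) index of the half-open cell [low, high) containing the point."""
    i = math.floor((irradiance - IRR_MIN) / IRR_STEP)
    j = math.floor((temperature - TEMP_MIN) / TEMP_STEP)
    return i, j

def cell_center(index):
    """Return the (irradiance, temperature) center point of a cell."""
    i, j = index
    return (IRR_MIN + (i + 0.5) * IRR_STEP, TEMP_MIN + (j + 0.5) * TEMP_STEP)


print(cell_index(250.0, 5.0))    # cell [200,400) x [0,10)
print(cell_center((1, 1)))
```

Because the intervals are half-open, a value that falls exactly on a boundary such as 200 W/m² belongs to the upper cell, so every point maps to exactly one cell.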
[0069] This embodiment uses irradiance and temperature data as examples to illustrate this stage. However, in the practice of the method of the present invention, other one or more key environmental data can be selected according to the actual situation to construct a feature space in one or more dimensions. Generally, constructing a two-dimensional or three-dimensional feature space is preferred to improve the fit between the reference data and the current power plant.
[0070] 1.3 Learning Cell Baseline Features
[0071] Based on the correspondence between the collected historical operating data and the synchronous environmental data, for each environmental condition cell with sufficient historical operating data, the statistical characteristics of the power of all strings in the cell are calculated.
[0072] This embodiment takes power as an example. The statistical features include, but are not limited to: average power μ_cell, power standard deviation σ_cell, coefficient of variation CV_cell, reasonable upper and lower bounds of power based on a normal distribution (such as μ_cell ± 3σ_cell), and the average power baseline value of each string.
[0073] Each "(environmental condition cell, statistical feature)" is treated as a data record, where the "environmental condition cell" is represented by its position parameter in the two-dimensional feature space, and the "statistical feature" refers to the statistical feature associated with that cell. All the calculated "(environmental condition cell, statistical feature)" data records are then stored in a unified database to construct a benchmark model library.
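The record-building step can be illustrated with a short sketch that groups historical samples by cell and computes the listed statistics. The dictionary layout, the `min_samples` cutoff standing in for "sufficient historical operating data", and the use of the population standard deviation are assumptions made for the example, and an in-memory dict stands in for the unified database.

```python
import math
import statistics
from collections import defaultdict

def build_benchmark_library(samples, irr_step=200.0, temp_step=10.0,
                            temp_min=-10.0, min_samples=3):
    """Build {cell_index: statistics} from (irradiance, temperature, power) samples."""
    by_cell = defaultdict(list)
    for irr, temp, power in samples:
        key = (math.floor(irr / irr_step), math.floor((temp - temp_min) / temp_step))
        by_cell[key].append(power)

    library = {}
    for key, powers in by_cell.items():
        if len(powers) < min_samples:
            continue  # too little history for a trustworthy cell baseline
        mu = statistics.fmean(powers)
        sigma = statistics.pstdev(powers)
        library[key] = {
            "mean": mu,
            "std": sigma,
            "cv": sigma / mu if mu else float("nan"),
            "lower": mu - 3 * sigma,   # mu - 3*sigma bound from the text
            "upper": mu + 3 * sigma,
            "n": len(powers),
        }
    return library


history = [(250, 5, 100.0), (260, 6, 110.0), (270, 7, 90.0), (900, 25, 400.0)]
library = build_benchmark_library(history)
print(library)
```

The lone sample at 900 W/m² does not produce a record, which is the situation the online matching levels are designed to handle.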
[0074] This phase addresses the "one-size-fits-all" static benchmark problem in traditional diagnostic methods, establishing a fine-grained, personalized, multi-dimensional health status map for the power plant. Furthermore, in practice, considering factors such as equipment aging and natural changes in environmental factors, a more recent historical health operation cycle can be selected based on the power plant's operating conditions, and the data in the offline benchmark model library can be updated periodically to improve the reliability of the data in the benchmark model library as a diagnostic reference.
[0075] (II) Acquisition of online adaptive benchmark
[0076] During online diagnostics, the current real-time environmental parameters (I_curr, T_curr) are obtained, where I_curr and T_curr represent the current real-time irradiance and temperature values, respectively. The following multi-level strategy is executed to obtain the most suitable baseline data.
[0077] First-level exact match query:
[0078] Determine the environmental condition cell in the irradiance-temperature two-dimensional feature space corresponding to the current environmental parameter (I_curr, T_curr), and directly query the benchmark model library to see if the data record of the environmental condition cell is stored. If it exists, it is considered a hit, which is the optimal case. Then, return the statistical features of the hit cell as the output. If it does not exist, i.e., it is a miss, the second-level strategy is activated.
[0079] Second-level weighted nearest neighbor matching:
[0080] Calculate the weighted Euclidean distance from the parameter point corresponding to the current environmental parameters (I_curr, T_curr) to the center point (I_center, T_center) of each environmental condition cell in the benchmark model library. The expression is: distance = sqrt(w_I*((I_curr - I_center) / I_range)^2 + w_T*((T_curr - T_center) / T_range)^2), where distance is the weighted Euclidean distance, w_I and w_T are weight coefficients with w_I > w_T (reflecting the physical characteristic that irradiance has a greater impact on photovoltaic output), and I_range and T_range are normalization factors.
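The distance expression translates directly into code. In the sketch below, the weights 0.7 and 0.3 and the normalization factors 1000 W/m² and 50 °C are illustrative assumptions; the text only requires w_I > w_T.

```python
import math

def weighted_distance(curr, center, w_i=0.7, w_t=0.3,
                      i_range=1000.0, t_range=50.0):
    """Weighted Euclidean distance between (I, T) points, per the expression above.

    The default weights and ranges are placeholders, not values from the source.
    """
    d_i = (curr[0] - center[0]) / i_range
    d_t = (curr[1] - center[1]) / t_range
    return math.sqrt(w_i * d_i ** 2 + w_t * d_t ** 2)


print(weighted_distance((600.0, 20.0), (500.0, 20.0)))   # irradiance offset only
print(weighted_distance((500.0, 30.0), (500.0, 20.0)))   # temperature offset only
```

Normalizing each axis before weighting keeps the two quantities comparable even though irradiance and temperature have very different numeric scales.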
[0081] The environmental condition cell with the smallest weighted Euclidean distance is determined, and it is checked whether the weighted Euclidean distance between that cell and the parameter point exceeds a preset confidence threshold. If it does not exceed the threshold, the statistical features of that cell are output. Otherwise, the statistical features of the single nearest neighbor cell are considered unreliable as baseline features, and the third-level strategy is activated.
[0082] The second-level strategy achieves intelligent similarity retrieval of benchmark data through distance metrics and weight allocation.
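A minimal sketch of the second-level lookup follows, assuming the library is keyed by cell center point and that the confidence threshold is 0.1 in normalized distance units; both choices, like the weights and ranges, are hypothetical.

```python
import math

def nearest_cell(curr, library, max_distance=0.1,
                 weights=(0.7, 0.3), ranges=(1000.0, 50.0)):
    """Return (cell_center, distance) of the closest recorded cell, or None.

    library: {(I_center, T_center): statistics}. None means the nearest cell
    is farther than `max_distance`, so the next level should be tried.
    """
    def dist(center):
        return math.sqrt(sum(w * ((c - m) / r) ** 2
                             for w, c, m, r in zip(weights, curr, center, ranges)))

    if not library:
        return None
    best = min(library, key=dist)
    d = dist(best)
    return (best, d) if d <= max_distance else None


lib = {(300.0, 5.0): {"mean": 100.0}, (700.0, 25.0): {"mean": 300.0}}
print(nearest_cell((380.0, 8.0), lib))    # close enough to the (300, 5) cell
print(nearest_cell((500.0, 15.0), lib))   # equally far from both cells: rejected
```

Returning None rather than a distant cell's statistics is what hands control to the interpolation level.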
[0083] Third-level dynamic multi-point interpolation:
[0084] Using the center point of each environmental condition cell as a reference point, sort the reference points by the previously calculated weighted Euclidean distance from the parameter point of the current environmental parameters (I_curr, T_curr), and select the K (e.g., K=4) reference points closest to that parameter point.
[0085] Using the reciprocal of the distance between each of the K reference points and the parameter point of the current environmental parameters (I_curr, T_curr) as its weight, a weighted average interpolation is performed on the reference features (such as average power and coefficient of variation) of the cells to which the reference points belong, generating a new set of dynamic reference data that best fits the current environment.
[0086] The formula for interpolating to generate a dynamic benchmark is as follows:
[0087] μ_new = ∑(ω_i·μ_i) / ∑ω_i, CV_new = ∑(ω_i·CV_i) / ∑ω_i;
[0088] Where μ_new and CV_new are the interpolated average power and coefficient of variation, respectively; ω_i is the weight coefficient of the i-th reference point, i.e., the reciprocal of its distance to the parameter point of the current environmental parameters; and μ_i and CV_i are the average power and coefficient of variation of the cell to which the i-th reference point belongs. The sums run over the K selected reference points.
[0089] If the obtained dynamic benchmark data is within a reliable range according to the preset standards, it will be used as the output.
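The interpolation formula is an inverse-distance-weighted average over the K nearest cells, which can be sketched as below. The function name, the `eps` guard against a zero distance, and the dictionary field names are assumptions; the reliability check on the result is left out because the text does not specify its criterion.

```python
def idw_benchmark(candidates, k=4, eps=1e-9):
    """Inverse-distance-weighted average of the statistics of the k nearest cells.

    candidates: list of (distance, statistics_dict) for recorded cells.
    eps guards against division by zero and is an implementation assumption.
    """
    nearest = sorted(candidates, key=lambda c: c[0])[:k]
    weights = [1.0 / max(d, eps) for d, _ in nearest]
    total = sum(weights)
    fields = nearest[0][1].keys()
    return {f: sum(w * stats[f] for w, (_, stats) in zip(weights, nearest)) / total
            for f in fields}


cells = [
    (0.1, {"mean": 100.0, "cv": 0.05}),
    (0.2, {"mean": 200.0, "cv": 0.10}),
    (0.5, {"mean": 1000.0, "cv": 0.50}),
]
print(idw_benchmark(cells, k=2))
```

With K=2 the weights are 10 and 5, so the result is (10·100 + 5·200) / 15 for the mean and (10·0.05 + 5·0.10) / 15 for the coefficient of variation; the third, more distant cell does not contribute.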
[0090] Fourth-level global baseline degradation:
[0091] If, according to the preset standard, the dynamic benchmark data obtained by the third-level strategy is still not within the reliable range, that is, when the first three levels cannot obtain effective benchmark data (such as when the system is initialized or when extremely rare working conditions are encountered), the global benchmark data derived from statistics of the entire station's historical data or from preset empirical values is retrieved and used as the output, which ensures that the diagnostic system's function is never interrupted.
[0092] The aforementioned multi-level strategy forms a complete, adaptive, and degradable benchmark acquisition chain, significantly improving the robustness of the diagnostic system to environmental fluctuations.
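The degradable chain can be viewed as an ordered list of lookups in which the first level that returns data wins and the global baseline is the guaranteed last resort. The sketch below shows only that control flow; the level functions are stubbed with lambdas and all names are hypothetical.

```python
def acquire_benchmark(levels, global_baseline):
    """Try each (name, lookup) in order and return the first non-None result.

    levels:          ordered list of (name, zero-argument callable) pairs, each
                     returning benchmark statistics or None when it cannot help.
    global_baseline: statistics returned when every level fails.
    """
    for name, lookup in levels:
        stats = lookup()
        if stats is not None:
            return name, stats
    return "global", global_baseline


library = {(1, 1): {"mean": 100.0, "cv": 0.05}}
levels = [
    ("exact", lambda: library.get((4, 3))),          # miss: no record for this cell
    ("nearest", lambda: None),                       # nearest cell too far away
    ("interpolated", lambda: {"mean": 310.0, "cv": 0.07}),
]
source, benchmark = acquire_benchmark(levels, {"mean": 250.0, "cv": 0.10})
print(source, benchmark)
```

Returning the name of the level that produced the data lets later stages record how trustworthy the benchmark is, for example when setting the confidence reported to operators.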
[0093] (III) Parallel Fault Diagnosis Based on Adaptive Benchmark
[0094] Using the output of stage (II) as the baseline data, the real-time acquired string-level operation data is analyzed as follows:
[0095] (1) Adaptive missing string diagnosis: The lower power bound in the baseline data is used as a dynamic threshold. Strings whose power stays continuously below this threshold are marked as "suspected missing strings". A topology confidence check is then applied: if the strings adjacent to a suspected missing string (determined from the electrical wiring diagram) are operating normally, its missing string confidence is increased; if the adjacent strings are also abnormal, the confidence is reduced, since the cause may be local environmental interference.
[0096] (2) Dynamic dispersion analysis: Among the strings marked as "suspected missing strings", those whose missing string confidence exceeds the preset standard are excluded. The coefficient of variation CV_curr of the current power of the remaining normal strings is calculated, together with the dispersion deviation ΔCV = CV_curr - CV_baseline, where CV_baseline is the coefficient of variation in the baseline data. If ΔCV exceeds the set threshold, the power station is determined to have "performance dispersion".
[0097] (3) Spatial correlation analysis: Based on the real-time collected string power and the average power μ_cell in the baseline data, inefficient strings are identified; for the identified inefficient strings, the spatial distribution clustering is analyzed in combination with the electrical topology of the power station (such as the relationship between string-combiner box-inverter) and physical layout diagram; if the inefficient strings are concentrated in the same electrical branch or physical area, they are marked as "cluster anomaly", providing key evidence for root cause inference.
[0098] (IV) Integrated decision-making and root cause inference
[0099] (1) Evidence fusion: Based on the three diagnostic results of step S3, the rule engine is used to make a fusion decision on the diagnostic results. For example, if a string marked as "suspected missing string" is also identified as a member of an inefficient cluster by "spatial correlation analysis", it will be reclassified as a "severe manifestation individual" of the cluster failure, rather than an isolated failure, through the preset conflict resolution rules.
[0100] (2) Root cause inference: Based on the fused diagnostic results and spatial correlation information, the fault type is subdivided and the root cause is inferred. For example, if the "cluster anomaly" is concentrated in the same combiner box branch, it can be inferred as "branch contact fault"; if it is concentrated in the same orientation of the array, it can be inferred as "local shadow occlusion".
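The two examples above can be expressed as a very small rule set. The sketch below is illustrative only: the rule labels, the 0.6 confidence cutoff, and the `cluster_kind` input describing what each cluster's members share are assumptions, and a production rule engine would hold many more rules.

```python
def fuse_and_infer(missing_conf, clusters, cluster_kind, conf_threshold=0.6):
    """Fuse missing-string and cluster evidence, then map clusters to root causes.

    missing_conf: {string_id: missing string confidence}
    clusters:     {group_id: [string_ids]} from spatial correlation analysis
    cluster_kind: {group_id: "combiner_branch" | "array_orientation"} (hypothetical)
    """
    group_of = {sid: g for g, sids in clusters.items() for sid in sids}
    findings = []
    for sid, conf in missing_conf.items():
        if sid in group_of:
            # Conflict resolution: cluster membership overrides the isolated verdict.
            findings.append((sid, "severe member of cluster fault", group_of[sid]))
        elif conf >= conf_threshold:
            findings.append((sid, "isolated missing string", None))

    causes = {"combiner_branch": "branch contact fault",
              "array_orientation": "local shadow occlusion"}
    root_causes = {g: causes.get(cluster_kind.get(g), "undetermined") for g in clusters}
    return findings, root_causes


findings, root_causes = fuse_and_infer(
    missing_conf={"A1": 0.8, "C1": 0.9, "C2": 0.3},
    clusters={"CB-A": ["A1", "A2"]},
    cluster_kind={"CB-A": "combiner_branch"},
)
print(findings)
print(root_causes)
```

String A1 is both a suspected missing string and a cluster member, so it is reported as part of the cluster fault, whereas C1 remains an isolated finding and C2 falls below the cutoff.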
[0101] (V) Diagnostic output and alarms
[0102] Generate a structured diagnostic report, including fault location (down to the string or branch), fault type, severity, confidence level, and other fault information, as well as targeted operation and maintenance suggestions based on root cause inference, and trigger tiered alarms.
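A structured report of this kind could be represented as one record per fault, serialized to JSON. The field names follow the items listed above, while the severity vocabulary and the mapping to three alarm tiers are assumptions made for the sketch.

```python
import json
from dataclasses import dataclass, asdict

@dataclass
class Diagnosis:
    location: str      # string or branch identifier
    fault_type: str
    severity: str      # "high" | "medium" | "low" (assumed vocabulary)
    confidence: float
    suggestion: str    # operation and maintenance advice

def alarm_level(d):
    """Map a diagnosis to an alarm tier (1 is most urgent); thresholds are assumed."""
    if d.severity == "high" and d.confidence >= 0.7:
        return 1
    if d.severity in ("high", "medium"):
        return 2
    return 3

def build_report(diagnoses):
    """Serialize diagnoses, each annotated with its alarm tier, as a JSON document."""
    faults = [dict(asdict(d), alarm_level=alarm_level(d)) for d in diagnoses]
    return json.dumps({"faults": faults}, indent=2)


report = build_report([
    Diagnosis("CB-A", "branch contact fault", "high", 0.85,
              "Inspect connectors on combiner box branch CB-A"),
    Diagnosis("C2", "performance dispersion", "low", 0.40,
              "Schedule cleaning and re-check at next inspection"),
])
print(report)
```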
[0103] Embodiment 2:
[0104] Based on the same design concept as Embodiment 1, this embodiment provides a power plant fault intelligent diagnosis system based on adaptive benchmark matching.
[0105] The intelligent fault diagnosis system for power plants based on adaptive benchmark matching includes the following components:
[0106] (I) Data acquisition and preprocessing module
[0107] The data acquisition and preprocessing module is used to connect to the target photovoltaic power station monitoring system, acquire string-level operating data and synchronous environmental data in real time, and clean and preprocess them to remove data that has no reference value based on preset standards.
[0108] (II) Offline benchmark model library construction module
[0109] The offline benchmark model library construction module includes an environment interval division unit, a statistical feature calculation unit, and a data storage unit, wherein:
[0110] The environmental interval division unit is used to construct an irradiance-temperature two-dimensional feature space based on the collected environmental data, such as irradiance and temperature data, and to divide each dimension axis of the feature space into multiple continuous interval segments. Based on the division of each dimension axis, the feature space is divided into multiple cells.
[0111] The statistical feature calculation unit is used to assign the collected operating data to the corresponding cells, according to the environmental data range defined by each cell and the correspondence between the string-level operating data and the synchronous environmental data, and to independently calculate, for each cell containing sufficient operating data, the statistical features of the various operating data assigned to it.
[0112] The data storage unit is used to associate all the obtained statistical features with the corresponding cells and store them in a unified database to build an offline benchmark model library.
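As a rough illustration of how the three units could cooperate, the following sketch bins historical samples into irradiance-temperature cells and stores per-cell statistics in a dictionary. The bin widths, the minimum sample count, the 3-sigma bounds, and all names (`cell_of`, `build_benchmark_library`) are assumptions chosen for the example; the description does not fix any of them.

```python
import math
from collections import defaultdict
from statistics import mean, stdev

# Hypothetical parameters; the source does not specify interval sizes.
IRR_STEP = 100.0   # W/m^2 per irradiance interval
TEMP_STEP = 5.0    # degrees C per temperature interval
MIN_SAMPLES = 3    # assumed meaning of "sufficient" data (must be >= 2)


def cell_of(irradiance, temperature):
    """Map an (irradiance, temperature) point to its cell index."""
    return (math.floor(irradiance / IRR_STEP),
            math.floor(temperature / TEMP_STEP))


def build_benchmark_library(samples):
    """samples: iterable of (irradiance, temperature, string_power)."""
    buckets = defaultdict(list)
    for irradiance, temperature, power in samples:
        buckets[cell_of(irradiance, temperature)].append(power)

    library = {}
    for cell, powers in buckets.items():
        if len(powers) < MIN_SAMPLES:
            continue  # too little data for a reliable benchmark
        mu = mean(powers)
        sigma = stdev(powers)
        library[cell] = {
            "mean": mu,
            "std": sigma,
            "cv": sigma / mu if mu else float("inf"),
            # Assumed 3-sigma bounds under a normal distribution.
            "lower": mu - 3 * sigma,
            "upper": mu + 3 * sigma,
            "n": len(powers),
        }
    return library
```

A plain dictionary keyed by cell index stands in for the unified database here; a real deployment would persist the same records in whatever store the monitoring system uses.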
[0113] (iii) Online adaptive benchmark acquisition engine
[0114] The online adaptive benchmark acquisition engine includes an exact matching queryer, a weighted distance calculator, a dynamic interpolator, and a global benchmark generator. It executes a multi-level benchmark matching strategy to achieve the online adaptive benchmark acquisition in the second stage of Embodiment 1, thereby obtaining suitable benchmark data for diagnosis.
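The multi-level matching strategy can be sketched as a chain of fallbacks. The sketch below is a simplified illustration, not the engine itself: it returns only the mean power, measures distance in units of cell widths, and uses hypothetical parameters (`max_distance`, `k`, `reliable_range`) and a hypothetical `library` dictionary keyed by cell index.

```python
import math

IRR_STEP, TEMP_STEP = 100.0, 5.0  # hypothetical cell widths


def cell_center(cell):
    """Center point of a cell in (irradiance, temperature) units."""
    return ((cell[0] + 0.5) * IRR_STEP, (cell[1] + 0.5) * TEMP_STEP)


def distance(point, cell):
    """Distance from a point to a cell center, scaled by the cell widths."""
    cx, cy = cell_center(cell)
    return math.hypot((point[0] - cx) / IRR_STEP, (point[1] - cy) / TEMP_STEP)


def acquire_benchmark(library, global_mean, irradiance, temperature,
                      max_distance=1.5, k=3, reliable_range=(0.0, 1000.0)):
    """Return (benchmark_mean, level_used) for the current conditions."""
    point = (irradiance, temperature)
    target = (math.floor(irradiance / IRR_STEP),
              math.floor(temperature / TEMP_STEP))

    # Level 1: exact match on the target cell.
    if target in library:
        return library[target]["mean"], "exact"

    ranked = sorted((distance(point, cell), cell) for cell in library)

    # Level 2: nearest recorded cell, if it is close enough to be trusted.
    if ranked and ranked[0][0] <= max_distance:
        return library[ranked[0][1]]["mean"], "nearest"

    # Level 3: inverse-distance weighted interpolation over the k nearest cells.
    if ranked:
        nearest = ranked[:k]
        weights = [1.0 / d for d, _ in nearest]  # every d > max_distance > 0
        value = sum(w * library[c]["mean"]
                    for w, (_, c) in zip(weights, nearest)) / sum(weights)
        if reliable_range[0] <= value <= reliable_range[1]:
            return value, "interpolated"

    # Level 4: global benchmark as the final fallback.
    return global_mean, "global"
```

Returning the level alongside the value lets downstream units lower their confidence when the benchmark came from a coarser fallback.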
[0115] (iv) Parallel fault analysis engine
[0116] The parallel fault analysis engine includes an adaptive missing string diagnosis unit, a dynamic dispersion rate analysis unit, and a spatial correlation analysis unit. It acquires the benchmark data output by the online adaptive benchmark acquisition engine, together with the spatial correlation information between strings, and performs fault analysis on each string of the power plant from multiple aspects.
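To make the first two analysis units concrete, the following sketch marks suspected missing strings against a dynamic lower bound, adjusts their confidence using neighbouring strings, and computes the dispersion rate deviation ΔCV = CV_curr - CV_baseline. The starting confidence, the adjustment step, the use of the population standard deviation, and all function names are assumptions made for illustration.

```python
from statistics import mean, pstdev


def suspected_missing(power_history, lower_bound):
    """Ids of strings whose power stays below the bound for the whole window."""
    return {sid for sid, window in power_history.items()
            if window and all(p < lower_bound for p in window)}


def missing_confidence(sid, suspects, neighbours, base=0.5, step=0.25):
    """Raise confidence for each normal neighbour, lower it for each abnormal one.

    base and step are hypothetical values, not taken from the source.
    """
    confidence = base
    for other in neighbours.get(sid, []):
        confidence += -step if other in suspects else step
    return max(0.0, min(1.0, confidence))


def dispersion_deviation(current_power, excluded, cv_baseline):
    """Delta CV = CV_curr - CV_baseline over the strings that are not excluded."""
    remaining = [p for sid, p in current_power.items() if sid not in excluded]
    cv_curr = pstdev(remaining) / mean(remaining)
    return cv_curr - cv_baseline
```

For example, with a baseline coefficient of variation of 0.02, two remaining strings at 80 W and 120 W give CV_curr = 0.2 and ΔCV = 0.18, which would exceed a threshold such as 0.05 and indicate performance dispersion.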
[0117] (v) Fusion decision and knowledge base module
[0118] The fusion decision and knowledge base module includes a rule engine, an electrical topology database, and a fault case library. The rule engine fuses the various diagnostic results output by the parallel fault analysis engine. Based on the fused diagnostic results and the spatial correlation information between strings, fault type subdivision and root cause inference are performed. This process may also draw on historical fault cases in the fault case library.
[0119] (vi) Alarm and report generation module
[0120] The alarm and report generation module is used to generate a structured diagnostic report, which includes fault location (specific to the string or branch), fault type (such as complete failure, performance degradation, partial obstruction, connection failure, MPPT anomaly, etc.), severity, confidence level and other fault information, as well as targeted operation and maintenance suggestions generated based on root cause inference, and triggers hierarchical alarms.
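One possible shape for the structured report and the hierarchical alarm is sketched below. The field names, the 0-to-1 scales, the scoring rule (severity multiplied by confidence), and the tier thresholds are all hypothetical; the description names the report contents but not their encoding.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class DiagnosticReport:
    location: str      # e.g. a string or branch identifier
    fault_type: str    # e.g. "connection failure", "performance degradation"
    severity: float    # assumed scale: 0.0 (minor) to 1.0 (severe)
    confidence: float  # assumed scale: 0.0 to 1.0
    suggestion: str    # operation and maintenance advice from root cause inference


def alarm_tier(report):
    """Map a report to an alarm tier using assumed thresholds."""
    score = report.severity * report.confidence
    if score >= 0.6:
        return "critical"
    if score >= 0.3:
        return "warning"
    return "info"


def to_json(report):
    """Serialize the report together with its alarm tier."""
    payload = asdict(report)
    payload["alarm_tier"] = alarm_tier(report)
    return json.dumps(payload, ensure_ascii=False)
```

Weighting severity by confidence is one simple way to keep low-confidence findings from triggering the highest alarm tier.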
[0121] This embodiment shares the overall design concept of Embodiment 1. For the specific operating principle and working process of the diagnostic system, reference may therefore be made to Embodiment 1; they are not repeated here.
[0122] The core advantages of the diagnostic methods and systems described in the above embodiments can be seen in the following aspects:
[0123] 1. The qualitative change of the benchmark from "static rigidity" to "dynamic adaptation"
[0124] Problems with traditional diagnostic techniques: They rely on fixed thresholds or static benchmarks calculated based on limited data, which cannot adapt to continuous fluctuations in environmental parameters such as irradiance and temperature, as well as the slow aging of the equipment itself. This can easily lead to problems such as "accurate diagnosis on sunny days, but many false alarms on cloudy days" or "good for new power plants, but malfunctioning for old power plants".
[0125] The breakthrough of this solution lies in its introduction of an architecture that combines "offline learning of historical health data with online multi-layer intelligent matching." The system establishes a dedicated "health memory bank" (benchmark model library) for a specific power plant. During operation, the system "recalls" or "calculates" (through exact matching, weighted nearest neighbor analysis, and dynamic interpolation) the most suitable "health standard" (the benchmark data of the matching cell) in real time based on the current weather conditions. This enables personalized, scenario-based dynamic generation of diagnostic benchmarks and overcomes environmental interference at its source.
[0126] 2. The Evolution of Diagnostic Logic from "Single Point Alarm" to "Integrated Reasoning"
[0127] Limitations of traditional diagnostic techniques: they rely on a single diagnostic dimension (looking only at whether a string has dropped out, or only at power deviation), which is like "the blind men and the elephant." They can report that "there is an abnormality" but cannot answer "what is abnormal" or "why is there an abnormality."
[0128] The breakthrough of this solution lies in its construction of a three-dimensional parallel analysis engine integrating "string loss diagnosis + dispersion rate analysis + spatial topology correlation". In particular, through its fusion decision and conflict resolution mechanisms, it can weigh the various pieces of evidence together. For example, it can determine whether a low-power string is an isolated "old, weak, or disabled" component or a member of a "sick family" (such as the strings on one combiner box branch), thus elevating simple alarms to precise diagnoses with root cause inference.
[0129] 3. Enhanced system robustness from "fragile" to "tough"
[0130] Problems with traditional diagnostic techniques: When faced with extreme or rare operating conditions that have not appeared in historical data, the system is prone to failure or misjudgment because it cannot find a reference benchmark.
[0131] The breakthrough of this solution lies in its design of a four-level progressive, safely degradable adaptive strategy chain (exact matching → weighted nearest neighbor → dynamic interpolation → global benchmark). Like a navigation system that automatically plans an alternative route when the preferred route is blocked, the chain ensures that a usable benchmark is always obtained. This mechanism keeps the system operating reliably in all weather conditions, from common sunny days to rare extreme weather.
[0132] The technical effects that the diagnostic methods and systems described in the above embodiments may actually achieve include:
[0133] 1. Improved operational efficiency: From "finding a needle in a haystack" to "precise navigation"
[0134] The solution refines fault location from the "power station level" or "inverter level" to the "string level" and even the "electrical branch level," and clearly distinguishes between component problems, connection problems, and local shading. Maintenance personnel can act directly on the report, significantly reducing the average troubleshooting time. For example, the report can state "poor contact in the second branch of combiner box A" rather than a general "low power station efficiency."
[0135] 2. Guaranteeing Power Generation Revenue: From "Post-Event Repair" to "Pre-Event Early Warning"
[0136] High-precision dispersion analysis can detect early, minor degradation in component performance (such as initial hot spots, dust accumulation, and slight oxidation at connection points), providing early warnings before they cause significant power generation losses or trigger serious failures (such as fires). This shifts the maintenance model from "repair after a failure" to "preventative maintenance," maximizing the available generating hours of the power plant and directly protecting power generation revenue.
[0137] 3. Reduction in management costs: From "relying on experts" to "artificial intelligence"
[0138] The diagnostic system encapsulates diagnostic knowledge through algorithms, reducing over-reliance on the personal experience of on-site maintenance personnel. New employees and managers can also understand the power plant's status through clear diagnostic reports. Simultaneously, it significantly reduces false alarms caused by environmental interference, avoids unnecessary on-site inspections, and saves on manpower and travel costs.
[0139] 4. Driving Industry Paradigms: From "Monolithic Tools" to "Evolutionary Platforms"
[0140] This invention's diagnostic system is not merely a diagnostic tool, but also a platform with continuous learning capabilities. As the power plant operates, new health data can be continuously used to optimize the baseline model, enabling the diagnostic system to "grow" and "evolve" along with the power plant's lifecycle, adapting to the aging process of the equipment. This provides a solid technical foundation for the full lifecycle digital and intelligent asset management of photovoltaic power plants.
[0141] In summary, the above-described embodiments of the present invention rest on three core innovations: "dynamic adaptive benchmark," "multi-dimensional fusion diagnosis," and "hierarchical robust strategy." Together they upgrade photovoltaic power plant fault diagnosis from an auxiliary tool that relies on fixed rules, responds late, and locates faults only vaguely to an intelligent operation and maintenance core system with autonomous learning, real-time accuracy, and root cause analysis capabilities.
[0142] The above description is only a preferred embodiment of the present invention. For those skilled in the art, certain adjustments can be made to some method steps or system modules. It should be noted that any improvements and modifications made without departing from the core principles of the present invention should also be considered within the scope of protection of the present invention.
Claims
1. A power plant fault intelligent diagnosis method based on adaptive benchmark matching, characterized in that it includes the following steps:
Step S1. Construction of offline benchmark model library: Collect long-term time-series string-level operating data and synchronous environmental data of the target photovoltaic power station during its historical healthy operation cycle, the operating data and the environmental data each including one or more types. Based on any one or more types of the collected environmental data, construct a one-dimensional or multi-dimensional feature space. Divide each dimension axis of the feature space into multiple continuous intervals and, based on this division, divide the feature space into several cells. Based on the environmental data range defined by each cell and the correspondence between the string-level operating data and the synchronous environmental data, assign the collected operating data to the corresponding cells, and for every cell containing operating data, independently calculate the statistical characteristics of the various operating data assigned to it. Associate all obtained statistical characteristics with the corresponding cell records and store them in a unified database to build the offline benchmark model library;
Step S2. Online adaptive benchmark acquisition: Before online diagnosis, obtain the current operating parameters and environmental parameters of the target photovoltaic power station, and determine the target cell in the feature space to which the current environmental parameters belong;
If the target cell has a record in the benchmark model library, directly retrieve the statistical characteristics of the target cell and output them as benchmark data, the benchmark data being the comparison standard for analyzing the current operating data in the diagnostic phase;
If the target cell has no record in the benchmark model library, search the benchmark model library for one or more other cells closest to the target cell, and obtain the statistical characteristics of the closest cell or the weighted average of the statistical characteristics of the multiple other cells. Based on a preset standard, determine whether the statistical characteristics of the closest cell or the weighted average is within a reliable range. If so, output it as the benchmark data. If not, call global benchmark data, derived from statistics of the entire station's historical data or from preset empirical values, and output it as the comparison standard for analyzing the current operating data in the diagnostic phase.
2. The intelligent fault diagnosis method for power plants based on adaptive benchmark matching as described in claim 1, characterized in that the method further includes regularly updating the offline benchmark model library, the updated library being built from historical healthy operating cycles that are closer to the present time.
3. The intelligent fault diagnosis method for power plants based on adaptive benchmark matching as described in claim 1, characterized in that, after the target cell is determined, step S2 performs the following sub-steps:
Step S21. Determine whether the benchmark model library contains a record for the target cell. If it does, directly retrieve the statistical characteristics of the target cell and output them as the benchmark data. If it does not, proceed to step S22;
Step S22. Calculate the distance parameter from the parameter point corresponding to the current environmental parameters to the center point of each environmental condition cell in the benchmark model library, determine the cell with the smallest distance parameter, and determine whether its distance parameter from the parameter point exceeds the preset confidence threshold. If it does not exceed the threshold, output the statistical characteristics of that cell as the benchmark data. If it exceeds the threshold, proceed to step S23;
Step S23. Taking the center point of each environmental condition cell as a reference point, select the multiple reference points in the benchmark model library that have the smallest distances from the parameter point of the current environmental parameters. Using the reciprocal of the distance between each reference point and the parameter point as the weight, perform weighted average interpolation on the corresponding statistical characteristics of the cells to which the reference points belong, and output the generated data as the benchmark data.
4. The intelligent fault diagnosis method for power plants based on adaptive benchmark matching as described in claim 3, characterized in that, before the generated data is output as the benchmark data in step S23, the following steps are performed: Determine whether the generated data is within the preset confidence range. If it is, output the generated data as the benchmark data. If not, proceed to step S24;
Step S24. Call global benchmark data, derived from statistics of the entire station's historical data or from preset empirical values, and output it as the comparison standard for analyzing the current operating data in the diagnostic phase.
5. The intelligent fault diagnosis method for power plants based on adaptive benchmark matching as described in claim 1, characterized in that the environmental parameters include irradiance and temperature; in step S1, a two-dimensional feature space is constructed based on irradiance and temperature; and the statistical characteristics include one or more of the following: average power, power standard deviation, coefficient of variation, and reasonable upper and lower bounds for power based on a normal distribution.
6. The intelligent fault diagnosis method for power plants based on adaptive benchmark matching as described in claim 1, characterized in that the statistical characteristics include reasonable upper and lower bound parameters for power based on a normal distribution, and the diagnostic phase includes the following step:
Step S31. Adaptive missing string diagnosis: Using the reasonable lower bound parameter of power in the benchmark data output in step S2 as a dynamic threshold, mark strings whose power remains below this threshold throughout a preset time as "suspected missing string". Introduce a topology confidence check: if the adjacent strings of a string marked as "suspected missing string" are running normally, increase its missing string confidence parameter; if the adjacent strings are also abnormal, decrease its missing string confidence parameter.
7. The intelligent fault diagnosis method for power plants based on adaptive benchmark matching as described in claim 6, characterized in that the statistical characteristics include the coefficient of variation CV_baseline, and the diagnostic phase includes the following step:
Step S32. Dynamic dispersion rate analysis: Exclude the strings marked as "suspected missing string" whose missing string confidence parameter exceeds the preset standard. For the remaining normal strings, calculate the coefficient of variation of power CV_curr and the dispersion rate deviation ΔCV = CV_curr - CV_baseline, where CV_baseline is the power coefficient of variation in the benchmark data output in step S2. If the dispersion rate deviation ΔCV exceeds the set threshold, determine that the power plant has performance dispersion.
8. The intelligent fault diagnosis method for power plants based on adaptive benchmark matching as described in claim 7, characterized in that the diagnostic phase includes the following step:
Step S33. Spatial correlation analysis: Based on the string power collected in real time and the average power in the benchmark data output in step S2, identify inefficient strings whose power is lower than the benchmark data. For the inefficient strings, analyze their spatial distribution clustering by combining the electrical topology and the physical layout diagram of the power plant; if the inefficient strings are concentrated in the same electrical branch or physical area, determine that the corresponding strings have a clustering anomaly.
9. The intelligent fault diagnosis method for power plants based on adaptive benchmark matching as described in claim 8, characterized in that it further includes the following steps:
Step S4. Fusion decision and root cause inference: Evidence fusion: taking into account all the diagnostic results in step S3, apply a rule engine to make a fusion decision on the diagnostic results, the rule engine including conflict resolution rules. Root cause inference: based on the fused diagnostic results and the spatial distribution clustering information obtained from the analysis, perform fault type subdivision and root cause inference;
Step S5. Diagnostic output and alarms: Generate a structured diagnostic report, which includes fault information and targeted operation and maintenance suggestions generated based on root cause inference, and trigger tiered alarms.
10. A power plant fault intelligent diagnosis system based on adaptive benchmark matching, used to implement the diagnosis method as described in any one of claims 1-9, characterized in that it includes:
A data acquisition and preprocessing module, used to connect to the target photovoltaic power station monitoring system, acquire string-level operating data and synchronous environmental data in real time, and preprocess them;
An offline benchmark model library construction module, including an environment interval division unit, a statistical feature calculation unit, and a data storage unit. The environment interval division unit is used to construct a one-dimensional or multi-dimensional feature space based on any one or more types of the collected environmental data, divide each dimension axis of the feature space into multiple continuous interval segments, and further divide the feature space into several cells based on the division of each dimension axis. The statistical feature calculation unit is used to assign the collected operating data to the corresponding cells based on the environmental data range defined by each cell and the correspondence between the string-level operating data and the synchronous environmental data, and to independently calculate, for each cell containing operating data, the statistical features of the various operating data assigned to it. The data storage unit is used to associate all obtained statistical features with the corresponding cells, record them in a unified database, and construct the offline benchmark model library;
An online adaptive benchmark acquisition engine, including an exact matching queryer, a weighted distance calculator, a dynamic interpolator, and a global benchmark generator, which executes step S2 based on a multi-level benchmark matching strategy to obtain suitable benchmark data for diagnosis;
A parallel fault analysis engine, which acquires the benchmark data output by the online adaptive benchmark acquisition engine, together with the spatial correlation information between strings, and performs fault analysis on each string of the power plant from multiple aspects;
A fusion decision and knowledge base module, which applies a rule engine to make fusion decisions on the various diagnostic results output by the parallel fault analysis engine, and performs fault type subdivision and root cause inference based on the fused diagnostic results and the spatial correlation information between strings;
An alarm and report generation module, used to generate structured diagnostic reports, which include fault information and targeted operation and maintenance suggestions generated based on root cause inference, and to trigger hierarchical alarms.