A method for analyzing energy consumption data of a thermal power plant
By using hierarchical data acquisition and an LSTM neural network model, combined with operating condition labels and coal quality correction, the data processing and anomaly identification problems in energy consumption analysis of thermal power plants were solved, achieving efficient energy consumption monitoring and intelligent control, and improving the energy consumption management level of power plants.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- DATANG YANGLING THERMAL POWER CO LTD
- Filing Date
- 2026-05-29
- Publication Date
- 2026-06-26
AI Technical Summary
Energy consumption analysis in thermal power plants suffers from problems such as inconsistent processing of multi-source data, high noise interference, high misjudgment rate of anomaly identification, and low level of intelligence. Existing machine learning models cannot adapt to complex dynamic operating scenarios, resulting in low efficiency of energy consumption management.
By employing hierarchical data acquisition, missing value repair, outlier removal, and data normalization, and combining LSTM neural network to construct an energy consumption benchmark prediction model, an energy consumption anomaly is identified through adaptive deviation threshold, and the causes of the anomaly are located through correlation analysis and grey relational algorithm, generating an intelligent energy-saving optimization scheme.
It has achieved standardized fusion of multi-source data and accurate anomaly identification, improved the accuracy of energy consumption anomaly identification and intelligent management efficiency, reduced power generation coal consumption and plant power consumption rate, and improved the economic benefits of power plants.
Figure CN122292508A_ABST
Abstract
Description
Technical Field
[0001] This invention relates to the fields of energy consumption data analysis, machine learning, and industrial intelligent control technology for thermal power plants, and specifically to a method for analyzing energy consumption data of thermal power plants.
Background Technology
[0002] Thermal power generation is the core of China's power supply, playing a crucial role in ensuring power supply, peak shaving, and stabilizing grid operation within the new power system. With the continued implementation of dual-carbon policies, market-oriented reforms of coal-fired power pricing, and increased volatility in coal prices, energy conservation, cost reduction, and efficiency improvement have become core objectives for thermal power plants. The energy consumption level of generating units directly determines the economic benefits and environmental indicators of a power plant; accurate and efficient energy consumption data analysis is the foundation for energy optimization, fault diagnosis, and refined operation and maintenance.
[0003] Currently, energy consumption analysis in most domestic thermal power plants still relies on traditional manual statistics and fixed-threshold comparison, which has several technical shortcomings. First, the energy consumption data sources of thermal power plants are complex, covering multi-dimensional heterogeneous data from units, auxiliary equipment, fuel, environment, and environmental protection systems. Collection frequencies are inconsistent, noise interference is high, and missing and abnormal data occur frequently. Traditional preprocessing methods cannot effectively integrate and standardize multi-source data, so the data underlying the analysis is unreliable. Second, existing energy consumption benchmark models mostly use fixed empirical formulas or static statistical models, which cannot adapt to dynamic scenarios such as variable-load operation, large fluctuations in coal quality, seasonal environmental changes, and equipment aging and degradation. As a result, the calculated theoretical energy consumption benchmark deviates substantially and does not accurately reflect the actual energy consumption level of the unit. Third, energy consumption anomaly identification uses fixed-threshold judgment, which adapts poorly to operating conditions and is prone to missed or false judgments. Moreover, once an anomaly occurs, only a result alarm is raised; the cause of the energy consumption deviation cannot be traced, so targeted energy-saving optimization cannot be supported. Fourth, the overall analysis has a low level of intelligence: it relies on manual experience and manual statistical analysis, which is inefficient and lags behind operation, and it cannot provide real-time monitoring, early warning, and intelligent optimization across the whole energy consumption management process.
[0004] In existing technologies, some solutions attempt to use machine learning for energy consumption prediction and analysis. However, these solutions generally suffer from weak model feature extraction capabilities, poor adaptability to operating conditions, lack of attribution analysis, and insufficiently targeted optimization strategies. They are unable to adapt to the complex, dynamic, and multi-interference operating scenarios of thermal power plants, making them difficult to apply in actual energy consumption management. Therefore, there is an urgent need for an intelligent energy consumption data analysis method that is adaptable to all operating conditions of thermal power plants, provides accurate data processing and clear anomaly tracing, and can output quantitative optimization solutions.
Summary of the Invention
[0005] To address the aforementioned technical problems, this invention provides a method for analyzing energy consumption data in thermal power plants. Specifically, the technical solution of this invention is a method for analyzing energy consumption data from thermal power plants that includes the following steps:
S1. Build a full-link data acquisition architecture for thermal power plants, collect multi-source data on unit operation in real time, and store the collected multi-source data in a hierarchical and classified manner according to the unit layer, auxiliary machine layer, and system layer.
S2. Process the collected multi-source data sequentially by missing value repair, outlier removal, data normalization, and time-series alignment to eliminate data noise and dimensional differences, forming a time-series energy consumption dataset.
S3. Based on the time-series energy consumption dataset, construct an energy consumption benchmark prediction model using an LSTM neural network. Introduce the unit operating condition label and coal quality fluctuation correction coefficient to fit the theoretical energy consumption benchmark value of the unit under different load ranges, different coal qualities, and different environmental conditions.
S4. Compare the actual energy consumption data of the unit with the theoretical energy consumption benchmark value output by the energy consumption benchmark prediction model, set an adaptive deviation threshold, determine the energy consumption operation status, and identify abnormal energy consumption operation data.
S5. Based on the identified abnormal energy consumption operation data, use correlation analysis and the grey relational algorithm to quantify the influence weight of multiple factors on the energy consumption deviation and locate the core abnormal causes.
S6. Based on the located core abnormal causes, and combined with the operating constraints of thermal power plant units, generate intelligent energy-saving methods for unit parameter control, auxiliary equipment load matching, coal blending optimization, and equipment operation and maintenance rectification, and form energy consumption analysis results and optimization methods.
[0006] In the aforementioned method for analyzing energy consumption data of thermal power plants, the multi-source data in step S1 specifically includes: core unit parameters, fuel parameters, auxiliary machine parameters, energy consumption parameters, and environmental parameters.
[0007] The specific process of step S2 in the aforementioned method for analyzing energy consumption data of thermal power plants includes: S21. For short-term single-point missing data, use the adjacent time series mean interpolation method for repair; for continuous long-term missing data, use historical similar data under the same working conditions for fitting and repair, and mark the data repair label. S22. Using the 3σ criterion combined with the unit operation logic rules, invalid abnormal data caused by sensor failures and instantaneous disturbances are eliminated, while valid abnormal data generated by unit operating condition switching and fault fluctuations are retained. S23. Use the min-max normalization algorithm to uniformly map all dimensional data to the [0,1] interval to eliminate dimensional differences; S24. Based on the unit load time series, perform time series interpolation and alignment on multi-source data with different acquisition frequencies to ensure that all data timestamps are consistent, forming a standardized dataset with continuous time series and consistent dimensions.
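The outlier removal in S22 and the normalization in S23 can be illustrated with a minimal Python sketch. The function names, the `keep_flags` argument (standing in for the "unit operation logic rules" that protect valid anomalies during operating condition switching), and the sample values are assumptions for illustration, not part of the disclosed method.

```python
from statistics import mean, pstdev


def three_sigma_mask(values, keep_flags=None, k=3.0):
    """Return True for points to keep.

    A point is kept if it lies within k standard deviations of the mean,
    or if keep_flags marks it as a valid anomaly (hypothetical stand-in
    for the operation logic rules, e.g. operating condition switching).
    """
    if keep_flags is None:
        keep_flags = [False] * len(values)
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [True] * len(values)
    return [abs(v - mu) <= k * sigma or flag
            for v, flag in zip(values, keep_flags)]


def min_max_normalize(values):
    """Map values linearly onto the [0, 1] interval."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant series carries no scale information; map it to 0.
        return [0.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]


# Illustrative data: 20 steady readings and one spike.
readings = [10.0] * 20 + [100.0]
mask = three_sigma_mask(readings)
cleaned = [v for v, keep in zip(readings, mask) if keep]
normalized = min_max_normalize([2.0, 4.0, 6.0])
```

Note that the 3σ test needs a reasonably long window: with very few samples a single spike inflates the standard deviation enough to hide itself.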
[0008] In the aforementioned method for analyzing energy consumption data in thermal power plants, step S3 constructs an energy consumption benchmark prediction model from the time-series energy consumption dataset using an LSTM neural network, introduces unit operating condition labels and coal quality fluctuation correction coefficients, and fits theoretical energy consumption benchmark values for the unit under different load ranges, different coal qualities, and different environmental conditions. Specifically, this includes:
S31. Based on the preprocessed, [0,1]-normalized multidimensional time-series dataset, generate a multidimensional time-series feature input matrix to complete the structured definition of the model input data.
S32. Calculate the operating condition time-series fluctuation from the parameter difference between adjacent time points and perform disturbance discrimination, obtaining an operating condition fluctuation and disturbance attribute label for each time-series node to complete the refined discrimination of the unit's operating status.
S33. Based on the disturbance discrimination results, weight and strengthen negative deterioration disturbances while fitting positive and steady-state fluctuations in the conventional manner, so that the learning weights highlight high-energy-consumption risk characteristics. This yields the differentiated asymmetric attention raw score for each time-series node.
S34. Introduce a time-series decay factor to weaken the weight of instantaneous invalid disturbances and strengthen the effective time-series characteristics of long-term steady-state operation and continuous operating condition changes. Obtain the time-series decay weight factor for each time-series node to complete the filtering of invalid instantaneous disturbances and the strengthening of effective long-term characteristics.
S35. Fuse the asymmetric attention score with the temporal decay factor and normalize the weights through the Softmax mechanism to obtain the optimized temporal feature set.
S36. Input the weighted features into the LSTM memory layer, mine the time-series correlation patterns through the three-gate structure, capture the energy consumption lag characteristics and long-term time-series dependencies, extract the deep time-series correlation features and lag features of unit energy consumption, and complete the deep modeling of the time-series dimension.
S37. Classify and encode typical operating conditions of the unit to realize differentiated operating condition modeling, generate multi-dimensional deep features that fuse time-series features and operating condition features, and give the model differentiated perception of different operating states.
S38. Input the fused multi-dimensional operating condition time-series features into the operating condition feature embedding layer to obtain the preliminary predicted energy consumption value under the benchmark operating condition, complete the basic energy consumption fitting of the model, and introduce two types of dynamic correction coefficients to apply a secondary, precise correction to the preliminary predicted value.
S39. Calculate the training error using the mean squared error loss function. Iteratively update all weights and bias parameters of the network through backpropagation to complete the iterative optimization and parameter solidification of the model, thus obtaining the operating-condition-adaptive energy consumption benchmark prediction model.
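Steps S32 to S35 can be sketched as follows. The text does not reproduce the scoring or decay formulas, so everything concrete here is an assumption: a rising value is treated as "negative deterioration", its score is amplified by a hypothetical gain `neg_gain`, and a fluctuation that reverses at the very next step is treated as an instantaneous disturbance and damped by `exp(-lam)`.

```python
import math


def attention_weights(series, steady_eps=0.01, neg_gain=2.0, lam=1.0):
    """Softmax-normalized weights per time step (illustrative only).

    series: normalized energy-consumption indicator, higher = worse.
    """
    n = len(series)
    # S32: adjacent differences as the fluctuation measure.
    diffs = [0.0] + [series[t] - series[t - 1] for t in range(1, n)]
    fused = []
    for t, d in enumerate(diffs):
        if abs(d) <= steady_eps:
            raw = 1.0                        # steady state: plain weight
        elif d > 0:
            raw = 1.0 + neg_gain * abs(d)    # S33: deterioration amplified
        else:
            raw = 1.0 + abs(d)               # improvement: conventional
        # S34: damp a fluctuation that is immediately reversed.
        reversed_next = (abs(d) > steady_eps and t + 1 < n
                         and diffs[t + 1] * d < 0)
        decay = math.exp(-lam) if reversed_next else 1.0
        fused.append(raw * decay)
    # S35: numerically stable softmax over the fused scores.
    peak = max(fused)
    exps = [math.exp(s - peak) for s in fused]
    total = sum(exps)
    return [e / total for e in exps]


sustained = attention_weights([0.5, 0.5, 0.7, 0.7, 0.5, 0.5])
spike = attention_weights([0.5, 0.5, 0.9, 0.5, 0.5])
```

In the sustained case the deteriorating step receives more weight than an equally large improving step; in the spike case the one-step excursion is weighted below a steady point.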
[0009] In the aforementioned method for analyzing energy consumption data of thermal power plants, the LSTM three-gate structure in step S36 includes a forget gate that controls the proportion of historical invalid time-series information that is discarded, an input gate that controls the proportion of valid features retained at the current moment, and an output gate that filters the final valid time-series hidden features.
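The role of the three gates can be made concrete with a single-step, scalar LSTM cell written in plain Python. This is the standard textbook LSTM update, shown only to illustrate the gate mechanics; the parameter dictionary and its key names are hypothetical, and a real model would use vector-valued states and a deep learning framework.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h_prev, c_prev, p):
    """One scalar LSTM step; p holds input, recurrent and bias weights."""
    # Forget gate: share of the previous cell state that is kept.
    f = sigmoid(p["wf"] * x + p["uf"] * h_prev + p["bf"])
    # Input gate: share of the new candidate that is written.
    i = sigmoid(p["wi"] * x + p["ui"] * h_prev + p["bi"])
    # Candidate cell content derived from the current input.
    g = math.tanh(p["wg"] * x + p["ug"] * h_prev + p["bg"])
    # Output gate: share of the cell state exposed as hidden state.
    o = sigmoid(p["wo"] * x + p["uo"] * h_prev + p["bo"])
    c = f * c_prev + i * g
    h = o * math.tanh(c)
    return h, c


# With all-zero parameters every gate is 0.5 and the candidate is 0,
# so the cell state is simply halved.
zero_params = {k: 0.0 for k in
               ("wf", "uf", "bf", "wi", "ui", "bi",
                "wg", "ug", "bg", "wo", "uo", "bo")}
h, c = lstm_step(x=1.0, h_prev=0.0, c_prev=2.0, p=zero_params)
```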
[0010] In the aforementioned method for analyzing energy consumption data in thermal power plants, step S4 compares the actual operating energy consumption data of the unit with the theoretical energy consumption benchmark value output by the energy consumption benchmark prediction model, sets an adaptive deviation threshold, determines the energy consumption operating status, and identifies abnormal energy consumption operating data. Specifically, this includes:
S41. Retrieve the complete historical operating dataset of the unit and strictly classify and divide the data samples according to the five preset operating conditions, obtaining historical valid sample datasets for five independent operating condition partitions.
S42. For each category of operating condition sample set, match the historical measured standard coal consumption for power supply with the theoretical benchmark coal consumption output by the LSTM model under the same operating condition, calculate the relative energy consumption deviation rate of each historical sample, and generate the sequence of individual relative deviation rates for all valid samples under each category of operating condition.
S43. Using the deviation rates of all valid samples within a single operating condition partition as the calculation object, calculate the average energy consumption deviation under that operating condition, which gives the unit's normal average deviation from the benchmark under that operating condition.
S44. Combining the actual service time of the unit with the design life cycle, introduce an aging sensitivity coefficient to quantify the shift in the energy consumption judgment threshold caused by the aging of equipment components and the degradation of mechanical performance. Set a basic anomaly threshold and construct a two-dimensional correction coefficient.
S45. Multiply the basic anomaly threshold, the equipment aging correction coefficient, and the seasonal environmental correction coefficient together to dynamically scale the fixed basic threshold, and solve for the dynamic adaptive energy consumption anomaly judgment threshold.
S46. Using the deviation of standard coal consumption for power supply as the core judgment basis, and the deviations of plant power consumption rate and auxiliary equipment unit consumption as auxiliary verification bases, match the measured values of the three types of core energy consumption indicators with the theoretical benchmark values to obtain the full-dimensional energy consumption indicator deviation calculation standard and real-time deviation values.
S47. Using the relative deviation rate of standard coal consumption for power supply as the core criterion, define three distinct energy consumption operation levels to determine the real-time energy consumption operation level of the unit.
[0011] The energy consumption data analysis method for thermal power plants, specifically step S44 of setting a basic anomaly threshold and constructing a two-dimensional correction coefficient includes: (1) Set a global fixed initial basic anomaly threshold that is not affected by external factors as the benchmark critical value for energy consumption anomaly judgment; (2) Combining the actual service time of the unit with the design full life cycle time, an aging sensitivity coefficient is introduced to quantify the energy consumption judgment threshold shift caused by the aging of equipment components and the degradation of mechanical performance, and to construct an equipment aging correction coefficient. (3) Based on the differences in ambient temperature, atmospheric humidity and external ventilation conditions throughout the four seasons, the four seasons of spring, summer, autumn and winter are divided, and corresponding fixed seasonal environmental correction coefficients are matched for different seasons.
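The threshold construction in S44 and S45 and the level judgment in S47 can be sketched as below. The multiplicative structure follows the text; the linear form of the aging coefficient, the sensitivity value, the seasonal coefficients, the level names, and the "twice the threshold" boundary are all hypothetical placeholders, since the text does not state them.

```python
# Hypothetical fixed seasonal coefficients (values not given in the text).
SEASON_COEFF = {"spring": 1.00, "summer": 1.10, "autumn": 1.00, "winter": 1.05}


def aging_coefficient(service_years, design_years, sensitivity=0.3):
    """Assumed linear form: grows with the used share of design life."""
    return 1.0 + sensitivity * (service_years / design_years)


def adaptive_threshold(base, service_years, design_years, season,
                       sensitivity=0.3):
    """S45: base threshold x aging coefficient x seasonal coefficient."""
    return (base
            * aging_coefficient(service_years, design_years, sensitivity)
            * SEASON_COEFF[season])


def relative_deviation(actual, benchmark):
    """Relative deviation of measured coal consumption from the benchmark."""
    return (actual - benchmark) / benchmark


def classify(deviation, threshold):
    """S47: three levels; names and the 2x boundary are assumptions."""
    if deviation <= threshold:
        return "normal"
    if deviation <= 2 * threshold:
        return "mild_anomaly"
    return "severe_anomaly"


thr = adaptive_threshold(base=0.02, service_years=10, design_years=30,
                         season="summer")
levels = [classify(relative_deviation(a, 300.0), thr)
          for a in (306.0, 310.0, 320.0)]
```

With these placeholder values an older unit in summer tolerates a slightly larger deviation before an anomaly is raised than a new unit in spring would.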
[0012] In the aforementioned method for analyzing energy consumption data in thermal power plants, step S5, based on the identified abnormal energy consumption operation data, uses correlation analysis and the grey relational algorithm to quantify the influence weights of multiple factors on energy consumption deviations and locates the core abnormal causes. Specifically, this includes:
S51. Based on the original operating parameters collected from all dimensions of the unit, combined with the thermal production mechanism of the thermal power unit, divide the sources of influence on energy consumption into four dimensions and determine specific attribution evaluation indicators for each dimension, forming a comprehensive index library of energy consumption influencing factors.
S52. Set the time series of the relative energy consumption deviation rate as the parent sequence for attribution analysis, serving as the benchmark for analyzing abnormal energy consumption changes. Uniformly set the measured time-series data of all influencing factors in the four dimensions as the attribution factor sequences, forming the parent-sequence and sub-sequence attribution analysis dataset.
S53. Quantify the linear correlation between each candidate factor and the energy consumption anomaly deviation using the Pearson correlation coefficient, eliminate redundant factors with weak correlation and no actual impact, reduce the number of attribution indicators, and obtain a simplified core attribution factor set that is highly correlated with energy consumption anomalies.
S54. Use the grey relational algorithm to quantify the influence weight of the factors, obtain the percentage abnormal contribution weight of each core influencing factor, and complete the quantitative ranking of the influence of data-level causes. The initial conclusion is that coal quality fluctuation is the primary cause, steam parameter deviation is the secondary cause, and auxiliary equipment mismatch is the tertiary cause.
S55. Perform a secondary verification against the thermal mechanism to eliminate pseudo-correlated factors, obtain the list of effective abnormal causes and their contribution weights after this elimination, and complete the accurate location of energy consumption anomalies.
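The Pearson screening in S53 can be sketched as follows. The cutoff `min_abs_r`, the factor names, and the sample values are illustrative assumptions; the text does not state the correlation threshold used.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient; 0.0 if either series is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)


def screen_factors(deviation, factors, min_abs_r=0.3):
    """Keep factors whose |r| with the deviation series reaches min_abs_r."""
    kept = {}
    for name, series in factors.items():
        r = pearson(deviation, series)
        if abs(r) >= min_abs_r:
            kept[name] = r
    return kept


deviation = [1.0, 2.0, 3.0, 4.0, 5.0]
factors = {
    "coal_heating_value": [10.0, 8.0, 6.0, 4.0, 2.0],  # strongly negative
    "sensor_noise": [1.0, -1.0, 1.0, -1.0, 1.0],       # uncorrelated
}
core = screen_factors(deviation, factors)
```

The absolute value is used so that strongly negative relationships, such as falling coal heating value against rising deviation, survive the screening.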
[0013] The energy consumption data analysis method for thermal power plants, wherein step S54, quantifying the factor influence weights, specifically includes: (1) The initialization method is used to uniformly eliminate the difference in dimensions. The value of the first time step of the sequence is used as the benchmark to complete the standardization of all factors and energy consumption deviation data, and the dimensionless standardized factor sequence and energy consumption deviation sequence are obtained. (2) Solve for the absolute value of the difference between the standardized energy consumption deviation sequence and each core factor sequence to generate the absolute difference sequence between the factor and the energy consumption deviation at each time node; (3) Traverse all absolute difference data, extract the global maximum and global minimum values, and determine the global two-pole difference values required for grey relational analysis; (4) Introduce the resolution coefficient and combine the two extreme differences to calculate the correlation coefficient between the factor at each time point and the energy consumption anomaly, and obtain the gray correlation coefficient of each time point at each time series node. (5) Perform time-series averaging on the correlation coefficients at a single moment to obtain the overall correlation degree of a single factor and obtain the overall grey correlation degree of each core factor with energy consumption anomaly; (6) Normalize the correlation of all core factors, output the percentage abnormal contribution weight of each core influencing factor, complete the quantitative ranking of the influence of data-level causes, and initially conclude that: coal quality fluctuation is the primary cause, steam parameter deviation is the secondary cause, and auxiliary equipment mismatch is the tertiary cause.
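Sub-steps (1) to (6) correspond to the standard grey relational analysis procedure and can be sketched as below. The resolution coefficient defaults to 0.5, a common choice that the text does not confirm; the factor names and sample series are illustrative. Initial-value normalization requires that each sequence start with a nonzero value.

```python
def grey_relational_weights(reference, factors, rho=0.5):
    """Return normalized contribution weights for each factor sequence."""
    if reference[0] == 0 or any(seq[0] == 0 for seq in factors.values()):
        raise ValueError("initial-value normalization needs nonzero first values")
    # (1) Initial-value normalization: divide each sequence by its first value.
    ref = [v / reference[0] for v in reference]
    norm = {k: [v / seq[0] for v in seq] for k, seq in factors.items()}
    # (2) Absolute differences from the reference at each time node.
    deltas = {k: [abs(r - v) for r, v in zip(ref, seq)]
              for k, seq in norm.items()}
    # (3) Global minimum and maximum over all factors and time nodes.
    flat = [d for seq in deltas.values() for d in seq]
    d_min, d_max = min(flat), max(flat)
    if d_max == 0:
        # Every factor coincides with the reference: equal weights.
        return {k: 1.0 / len(factors) for k in factors}
    # (4) and (5) Relational coefficient per node, averaged over time.
    grades = {
        k: sum((d_min + rho * d_max) / (d + rho * d_max) for d in seq) / len(seq)
        for k, seq in deltas.items()
    }
    # (6) Normalize the grades into percentage-style contribution weights.
    total = sum(grades.values())
    return {k: g / total for k, g in grades.items()}


weights = grey_relational_weights(
    reference=[1.0, 2.0, 3.0],
    factors={"coal_quality": [2.0, 4.0, 6.0], "steam_params": [1.0, 1.0, 1.0]},
)
```

Here `coal_quality` tracks the reference exactly after normalization, so it receives the larger share of the weight.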
[0014] In the aforementioned method for analyzing energy consumption data of thermal power plants, step S6, based on the located core abnormal causes and combined with the operating constraints of the thermal power plant units, generates intelligent energy-saving methods for unit parameter control, auxiliary equipment load matching, coal blending optimization, and equipment operation and maintenance rectification, and forms energy consumption analysis results and optimization methods. Specifically, this includes:
S61. Construct an adaptive intelligent energy-saving optimization framework. Adopt a steady-state optimal operating condition matching model that searches massive historical data for the optimal, most efficient operating pattern under the same conditions; for dynamic energy consumption deviations caused by load fluctuations, coal quality changes, and environmental changes, adopt a dynamic real-time fine-tuning model to achieve precise parameter correction.
S62. Select three key features, namely unit power generation load, lower heating value of the coal fed into the furnace, and ambient temperature, and construct the real-time current operating condition feature vector and the historical sample operating condition feature vectors to prepare the data structures for operating condition similarity calculation.
S63. Use the Euclidean distance algorithm to quantify the similarity between the current operating condition and all historical sample operating conditions, match historical operating conditions that are highly consistent with the current operating condition, and eliminate optimization deviations caused by differences in operating conditions.
S64. To address the dynamic instantaneous energy consumption deviation caused by unit load changes, coal quality fluctuations, and sudden environmental changes, construct a negative feedback fine-tuning model that combines the real-time energy consumption deviation with the contribution weights of the abnormal causes, and automatically calculate the parameter optimization adjustment amount.
S65. Build a pre-evaluation model for the optimization effect, calculate the predicted coal consumption for power supply after regulation, and calculate the energy-saving benefits, obtaining the predicted coal consumption value, energy consumption reduction rate, and quantified energy-saving benefits after optimization.
S66. Integrate the output results of all the aforementioned models, and combine the anomaly tracing conclusions, optimal steady-state operating parameters, dynamic fine-tuning schemes, and pre-assessed energy-saving benefits to form a systematic energy consumption optimization analysis conclusion.
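The operating condition matching in S62 and S63 can be sketched as a nearest-neighbour search. This sketch assumes the three features (load, lower heating value, ambient temperature) have already been min-max normalized so that Euclidean distance is meaningful across units; the record layout, the `max_distance` cutoff, and the rule of picking the lowest-consumption record among similar ones are illustrative assumptions.

```python
import math


def match_best_condition(current, history, max_distance=0.1):
    """Return the most efficient historical record similar to `current`.

    current: (load, heating_value, ambient_temp), normalized to [0, 1].
    history: list of dicts with "features" and "coal_consumption" keys.
    """
    similar = [rec for rec in history
               if math.dist(current, rec["features"]) <= max_distance]
    if not similar:
        return None  # no comparable operating condition on record
    # Among similar conditions, prefer the lowest supply coal consumption.
    return min(similar, key=lambda rec: rec["coal_consumption"])


history = [
    {"id": "A", "features": (0.50, 0.50, 0.55), "coal_consumption": 305.0},
    {"id": "B", "features": (0.52, 0.50, 0.50), "coal_consumption": 300.0},
    {"id": "C", "features": (0.90, 0.10, 0.50), "coal_consumption": 290.0},
]
best = match_best_condition((0.5, 0.5, 0.5), history)
```

Record C has the lowest coal consumption overall but is excluded because its operating condition is too far from the current one, which is the point of matching on similarity before comparing efficiency.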
[0015] Compared with the prior art, the present invention has the following beneficial effects: 1. This invention achieves standardized fusion processing of multi-source heterogeneous energy consumption data. Through hierarchical acquisition, graded preprocessing, and time-series alignment processing, it effectively eliminates data noise, dimensional differences, and time-series misalignment problems, preserves the true energy consumption characteristics of the unit's operation, and provides a high-quality data foundation for accurate analysis.
[0016] 2. This invention uses an LSTM neural network to construct an energy consumption benchmark model. By combining operating condition labels and multi-dimensional correction coefficients, it accurately fits the theoretical energy consumption value of the unit under all operating conditions. It adopts an adaptive dynamic deviation threshold for operating conditions to identify anomalies. It can adapt to complex scenarios such as unit start-up and shutdown, load changes, seasonal changes, and coal quality fluctuations. The accuracy of energy consumption anomaly identification is greatly improved, effectively avoiding the problems of missed judgments and misjudgments.
[0017] 3. This invention uses a multi-dimensional quantitative attribution analysis method to accurately locate the core causes of abnormal energy consumption, quantify the influence weight of each factor, and generate two types of intelligent energy-saving strategies: steady-state optimization and dynamic control. This achieves a fully intelligent closed loop of energy consumption monitoring, abnormal early warning, source analysis, and optimization control without the need for extensive manual intervention, significantly improving the efficiency of power plant energy consumption management, effectively reducing coal consumption for power generation and plant power consumption rate, and improving the economic benefits of power plants.
Attached Figure Description
[0018] The present invention will be further explained below with reference to the accompanying drawings and embodiments: Figure 1 is a flowchart of the method of the present invention.
Detailed Implementation
[0019] To further understand the structure, features, and other objectives of the present invention, a detailed description is provided below with reference to the accompanying drawings. The embodiments illustrated in these drawings are for illustrative purposes only and are not intended to limit the scope of the invention.
[0020] The present invention will be further described in detail below with reference to specific embodiments.
[0021] This invention discloses a method for analyzing energy consumption data in thermal power plants, the specific implementation steps of which are as follows:
Step 1: Establish a full-link data acquisition architecture for thermal power plants, collect multi-source data on unit operation in real time, and store the collected multi-source data in a hierarchical and classified manner according to the unit layer, auxiliary equipment layer, and system layer, as follows:
Multi-source data collection requires the acquisition of core unit parameters, fuel parameters, auxiliary equipment parameters, energy consumption parameters, and environmental parameters. Specific data collection points and parameter definitions are as follows:
(1) Unit core layer (unit operating reference parameters): real-time acquisition of the unit's actual power generation load N, main steam pressure, main steam temperature, reheat steam temperature, exhaust gas temperature, and unit absolute internal efficiency;
(2) Fuel parameter layer (energy consumption input parameters): real-time collection of the as-received lower heating value of the coal fed into the furnace, ash content A, moisture content M, sulfur content S, and instantaneous coal combustion flow rate B;
(3) Auxiliary equipment and energy consumption layer (core energy consumption output parameters): real-time acquisition of the operating current I, speed n, and real-time output F of the blower, induced draft fan, feed water pump, and coal mill; synchronous acquisition of the instantaneous power generation, plant power consumption, process water consumption, and power consumption of the desulfurization and denitrification environmental protection system;
(4) Environmental parameter layer (operating condition correction parameters): real-time acquisition of local ambient temperature, ambient humidity H, and atmospheric pressure.
Using the unit load fluctuation rate as the core indicator for operating condition determination, the system switches the data acquisition frequency adaptively to suit both steady-state and dynamic operating scenarios. The load fluctuation rate is calculated from the unit's current generating load, the generating load at the previous sampling time, and the basic sampling interval.
[0022] Acquisition frequency switching criteria: ① Steady-state operating condition: when the load fluctuation rate meets the steady-state criterion, the unit is determined to be in steady-state operation. ② Dynamic operating condition: when the load fluctuation rate meets the dynamic criterion, the condition is identified as variable load, start-up/shutdown, coal quality switching, or fault fluctuation, and the system automatically switches to high-frequency data acquisition.
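The adaptive switching of the acquisition frequency can be sketched as follows. The exact fluctuation-rate formula and the switching thresholds are not reproduced in the text, so this sketch assumes a relative load change per unit time, a hypothetical threshold of 2% per second, and hypothetical sampling intervals of 5 s (steady) and 1 s (dynamic).

```python
def load_fluctuation_rate(load_now, load_prev, dt):
    """Assumed form: relative load change per unit of sampling time."""
    return abs(load_now - load_prev) / (load_prev * dt)


def choose_sampling_interval(rate, threshold=0.02,
                             steady_interval=5.0, dynamic_interval=1.0):
    """Return the sampling interval in seconds for the next acquisition.

    At or below the threshold the unit is treated as steady-state and the
    base interval is kept; above it, high-frequency acquisition is used.
    """
    return steady_interval if rate <= threshold else dynamic_interval


# 300 MW -> 303 MW over 1 s is a 1 %/s change: steady under this threshold.
steady = choose_sampling_interval(load_fluctuation_rate(303.0, 300.0, 1.0))
# 300 MW -> 315 MW over 1 s is a 5 %/s change: switch to fast sampling.
dynamic = choose_sampling_interval(load_fluctuation_rate(315.0, 300.0, 1.0))
```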
[0023] Based on the collected raw parameters, the standard coal consumption and plant power consumption rate of the thermal power plant are calculated in real time, using the following formulas: (1) Instantaneous standard coal consumption for power generation: $b_f = \dfrac{B \times 10^{6} \times Q_{net,ar}}{Q_{std} \times N}$. In the formula: $b_f$ is the instantaneous standard coal consumption for power generation (g/kWh); $Q_{std}$ is the lower heating value of standard coal; $B$ is the instantaneous coal flow rate (t/h); $Q_{net,ar}$ is the lower heating value of the coal fed into the furnace (kJ/kg); $N$ is the real-time power generation load of the unit (kW). The factor $10^{6}$ converts tonnes to grams.
[0024] (2) Instantaneous plant power consumption rate: $\xi = \dfrac{W_{aux}}{W_{gen}} \times 100\%$. In the formula: $\xi$ is the plant power consumption rate; $W_{aux}$ is the plant power consumption of the generating unit during the statistical period (kWh); $W_{gen}$ is the total power generation of the unit during the statistical period (kWh).
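The two indicators above can be computed as in the following sketch. The formula form for coal consumption is inferred from the stated units (t/h, kJ/kg, kW, g/kWh); the default lower heating value of standard coal, 29,307.6 kJ/kg (7,000 kcal/kg), is a commonly used convention and is an assumption here, as are the function and parameter names.

```python
Q_STD_KJ_PER_KG = 29307.6  # assumed convention: 7,000 kcal/kg standard coal


def generation_coal_consumption(coal_flow_tph, q_net_kj_per_kg, load_kw,
                                q_std_kj_per_kg=Q_STD_KJ_PER_KG):
    """Standard coal consumption for power generation in g/kWh.

    coal_flow_tph * 1e6 converts t/h to g/h; the heating-value ratio
    converts raw coal to standard coal equivalent; dividing by the load
    in kW gives grams of standard coal per kWh generated.
    """
    standard_coal_g_per_h = coal_flow_tph * 1e6 * q_net_kj_per_kg / q_std_kj_per_kg
    return standard_coal_g_per_h / load_kw


def plant_power_rate(aux_energy_kwh, generated_energy_kwh):
    """Plant power consumption rate as a fraction (multiply by 100 for %)."""
    return aux_energy_kwh / generated_energy_kwh


# Illustrative numbers with a rounded standard heating value for easy checking.
b_f = generation_coal_consumption(coal_flow_tph=120.0, q_net_kj_per_kg=15000.0,
                                  load_kw=200000.0, q_std_kj_per_kg=30000.0)
xi = plant_power_rate(aux_energy_kwh=5000.0, generated_energy_kwh=100000.0)
```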
[0025] (3) Calculation of the instantaneous standard coal consumption for power supply, which is derived from the standard coal consumption for power generation and the plant power consumption rate.
All the above raw collected parameters and real-time calculated energy consumption indicators are uniformly classified and stored in the InfluxDB industrial time-series database. A three-level storage directory is established according to the unit level, auxiliary machine level, and system level. Each data point is bound to a unique high-precision timestamp (accurate to ms), laying the foundation for time-series alignment of multi-source heterogeneous data.
(1) Unit-level storage: load, steam parameters, flue gas parameters, coal consumption for power generation, coal consumption for power supply;
(2) Auxiliary machine layer storage: current, speed, output, and unit consumption of the four core auxiliary machines, and auxiliary machine power consumption;
(3) System-level storage: fuel parameters, environmental parameters, total plant power consumption, water consumption, and environmental protection energy consumption.
Step 2: Perform missing value repair, outlier removal, data normalization, and time-series alignment on the collected multi-source data in sequence to eliminate data noise and dimensional differences, forming a time-series energy consumption dataset. This includes the following steps:
This step processes the raw data collected in Step 1, cleaning the data while retaining effective characteristics such as unit load changes, coal quality fluctuations, and operating condition switching. The final result is a structured time-series dataset that is time-consistent, dimensionally standardized, and of controllable quality. Based on the sampling data from Step 1, the number of consecutive missing data points K within the time window is used as the core criterion to distinguish short-term minor gaps from long-term large-area gaps. A differentiated algorithm is used for each case to complete the data, and source tags are added to all repaired data to distinguish original real data from repaired data, preventing false data from interfering with subsequent model training at the source.
[0026] Taking the unit time-series sampling window as the statistical unit, the number of consecutive time-series nodes within the window for which no valid value was collected is defined as the number of consecutive missing points K, and a two-level judgment rule is formulated. When K satisfies the short-gap condition [condition not reproduced], the gap is classified as short-term missing data. This case is mostly caused by instantaneous transmission jitter: the gap is extremely short, the data before and after it are highly continuous, and the values can be restored accurately from adjacent real data. When K satisfies the long-gap condition [condition not reproduced], the gap is classified as long-term missing data. This case is usually caused by sensor failure, equipment downtime, or network interruption: the gap spans a long period, adjacent data have no reference value, and simple interpolation cannot restore the values.
[0027] For short-term missing data, mean interpolation over the real, valid time-series data immediately before and after the missing node is used to complete the data. This exploits the stable operating conditions and small data fluctuations over short periods, so the repaired data closely match the actual operating state of the unit. The repair formula is: [formula not reproduced]; the terms in the formula are: the repaired data value; the most recent valid time-series value before the gap; the most recent valid time-series value after the gap; and the effective time interval between adjacent samples.
[0028] This method accurately completes all short-term missing time-series data, initially restoring the continuity of data time sequence, and relies entirely on the original real data for calculation, without introducing any external false data.
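The gap classification and short-gap repair described above can be sketched in Python as follows. This is a minimal illustration, not the patented formula: the threshold `k_short`, the tag names, and the use of a plain two-neighbour mean are assumptions, because the conditions on K and the repair formula are not reproduced in this text.

```python
import math

def repair_short_gaps(values, k_short=3):
    """Fill NaN runs of length K <= k_short with the mean of the nearest
    valid neighbours and tag every sample by origin.
    k_short and the tag names are hypothetical."""
    out = list(values)
    tags = ["original"] * len(values)
    i, n = 0, len(values)
    while i < n:
        if not math.isnan(values[i]):
            i += 1
            continue
        j = i
        while j < n and math.isnan(values[j]):
            j += 1
        k = j - i  # K: number of consecutive missing points
        has_neighbours = i > 0 and j < n
        if k <= k_short and has_neighbours:
            fill = (values[i - 1] + values[j]) / 2.0
            for m in range(i, j):
                out[m] = fill
                tags[m] = "short_interp"
        else:
            # Left for the operating-condition fitting repair of long gaps.
            for m in range(i, j):
                tags[m] = "long_gap_pending"
        i = j
    return out, tags

nan = float("nan")
series = [10.0, nan, 12.0, nan, nan, nan, nan, 20.0]
repaired, tags = repair_short_gaps(series, k_short=3)
```

Keeping the tags in a list parallel to the data lets later training code drop or down-weight repaired samples without a second pass over the series.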
[0029] For long-term gaps, the interval between adjacent valid data points is too large, and operating conditions such as unit load and coal quality may have changed in the meantime, so interpolating between adjacent points would severely distort the data. Instead, a historical standard dataset from the same source and the same operating conditions is retrieved for the unit, and the data are repaired by weighted fitting that combines the historical values with the deviation from the current real-time operating conditions. This matches the dynamic operating characteristics of the unit. The fitting and repair formula is: [formula not reproduced]; the terms in the formula are: the historical standard parameter value under the same operating conditions; the deviation between the current load and the historical load; the deviation between the current coal calorific value and the historical calorific value; and the fitting weighting coefficients for thermal power operating conditions, which represent the influence weights of the historical operating-condition benchmark, load fluctuation, and coal quality fluctuation on the parameter, respectively.
[0030] Once long-term gaps have been fitted and repaired against the same operating conditions, every long-term missing node is filled. This resolves large-area data loss while keeping the repaired data consistent with the unit's current actual operating conditions.
[0031] For all data nodes that have been repaired through interpolation and fitting, a unique traceability label is added, while the original labels of the original collected data are retained. The labels clearly distinguish between the original real collected data, short-term interpolated repair data, and long-term operating condition fitting repair data, enabling traceability of data sources.
[0032] To address invalid anomalies such as sensor jumps, electromagnetic interference, and transient pulses in the data collected in step 1, a dual-layer filtering mechanism is employed to distinguish invalid noise from actual operating condition fluctuations.
[0033] A time-series sliding window is set with a single stable operating condition as the unit. All time-series parameter data within the window are analyzed statistically to calculate the mean and standard deviation, define the normal fluctuation range, and perform the initial screening of abnormal data nodes. The mean is calculated as: $\mu = \frac{1}{n}\sum_{i=1}^{n} x_i$; the standard deviation is calculated as: $\sigma = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(x_i - \mu)^2}$; in the formulas: $\mu$ is the mean of the time-series data within the sliding window, representing the steady-state baseline value under the operating condition; $\sigma$ is the standard deviation of the data, representing its normal fluctuation range; $n$ is the number of valid data samples within the sliding window; $x_i$ is the value of a single time-series node within the window.
[0034] Statistical anomaly criterion: when the value of a single node satisfies [criterion not reproduced] or [criterion not reproduced], it is provisionally classified as statistically abnormal data and placed in the set awaiting secondary identification.
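The first-layer statistical screening can be sketched as below. The multiplier `k_sigma` (set to 3 here) and the use of the population standard deviation are assumptions, since the exact criterion is not reproduced in this text; the function name is illustrative only.

```python
import statistics

def flag_statistical_outliers(window, k_sigma=3.0):
    """Return indices of samples outside mean +/- k_sigma * std for one
    sliding window. k_sigma is a hypothetical multiplier."""
    mu = statistics.fmean(window)
    sigma = statistics.pstdev(window)  # population form (1/n), assumed
    lower, upper = mu - k_sigma * sigma, mu + k_sigma * sigma
    return [i for i, x in enumerate(window) if x < lower or x > upper]

# 19 steady main-steam temperature samples and one sensor spike.
window = [540.0] * 19 + [600.0]
suspects = flag_statistical_outliers(window)
```

Flagged indices are only candidates: the secondary screening against operating-condition logs decides whether each one is removed or kept.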
[0035] The abnormal data identified in the initial screening are further verified by combining the unit's operating condition log, load fluctuation data, and coal quality change records to eliminate false noise and retain the true operating condition characteristics.
[0036] Secondary screening by operating-condition characteristics distinguishes valid anomalies from invalid ones: ① Invalid abnormal data, removed directly: the unit is under a steady-state load condition (load fluctuation rate [threshold not reproduced]), with no change of operating conditions, no unit start-up or shutdown, and no change recorded in the coal quality ledger, yet parameters such as steam temperature, coal calorific value, or auxiliary machine current show an instantaneous sudden change (fluctuation amplitude exceeding the normal range by 20%). Such data are judged to be invalid noise caused by sensor failure, electromagnetic interference, or instantaneous pulses, and the abnormal nodes are deleted. ② Valid abnormal data, fully preserved: the data mutation is accompanied by a real operational action, such as a unit load fluctuation, a change in coal quality parameters, a unit start-up or shutdown, or a switch of operating conditions. Such data are judged to be changes caused by fluctuations in the unit's actual operating conditions or by minor equipment abnormalities. They are core effective features, and the original data and fluctuation characteristics are fully preserved.
[0037] This data filtering process thoroughly removes interfering and invalid noise data, retaining 100% of the core operating characteristics such as unit load changes, coal quality fluctuations, and operating condition switching, resulting in a purified dataset free of false anomalies and with truly valid features.
[0038] The preprocessed dataset contains parameters such as pressure, temperature, current, coal calorific value, power generation load, and energy consumption. These parameters have different dimensions and very different numerical magnitudes, and feeding them directly into the LSTM model would bias the model weights and reduce training accuracy. This step uses Min-Max linear normalization to map all parameter data to the standard interval [0,1]. This removes the differences in dimension and magnitude while preserving the relative trends and operating-condition fluctuations of the data, matching the input requirements of deep learning models. The normalization formula is: $x' = \frac{x - x_{\min}}{x_{\max} - x_{\min}}$; in the formula: $x'$ is the normalized data; $x$ is the original data after preprocessing; $x_{\max}$ and $x_{\min}$ are the historical extreme values (maximum and minimum) of the parameter within the current operating condition range.
[0039] Applying Min-Max linear normalization to every parameter removes the differences in scale and magnitude and maps all values to the range [0,1]. The result is a dimensionally standardized, feature-normalized time-series dataset that meets the input requirements of the LSTM model.
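A minimal Python sketch of the Min-Max mapping follows. Clipping values that fall outside the historical extremes and returning zeros for a constant parameter are assumptions added for robustness; the text does not say how those cases are handled.

```python
def min_max_normalize(values, x_min, x_max):
    """Map values to [0, 1] using the historical extremes of the parameter.
    Clipping and the zero-span fallback are assumptions."""
    span = x_max - x_min
    if span == 0:
        return [0.0 for _ in values]
    return [min(1.0, max(0.0, (x - x_min) / span)) for x in values]

# Unit load in MW with assumed historical extremes of 300 and 600.
normalized_load = min_max_normalize([300.0, 450.0, 600.0], 300.0, 600.0)
```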
[0040] The multi-source data collected in Step 1 comes from different acquisition terminals, including the unit level, auxiliary equipment level, fuel level, environmental level, and energy consumption derivative indicators. These terminals have different sampling frequencies and inconsistent acquisition timestamps, resulting in time series misalignment and hindering multi-parameter correlation analysis and model time series modeling. This step uses the unit's power generation load time series timestamp as a globally unified benchmark and completes full-dimensional data time series alignment through a linear interpolation algorithm.
[0041] (1) Taking the load reference timestamp as the base, heterogeneous parameters such as fuel, auxiliary equipment, environment, and energy consumption are completed by time-series interpolation so that all data share one time dimension. The time-series alignment interpolation formula is: $x(t) = x_1 + \frac{x_2 - x_1}{t_2 - t_1}(t - t_1)$; in the formula: $t$ is the baseline load time-series timestamp; $t_1$ and $t_2$ are the acquisition timestamps of the adjacent samples of the heterogeneous parameter; $x_1$ and $x_2$ are the parameter values at those timestamps; $x(t)$ is the interpolated standard parameter value at the base timestamp $t$.
[0042] Based on the above interpolation algorithm, the time-series adaptation and alignment of all non-baseline parameters are completed, and all dimensions of data such as unit load, pressure, temperature, current, coal quality, environment, and energy consumption are uniformly matched to a millisecond-level base timestamp. This completely solves the problems of asynchronous time series and inconsistent sampling frequencies of multi-source data, and achieves accurate spatiotemporal matching of data at different levels.
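The alignment to the load timestamps can be sketched as follows. This is an illustrative implementation under assumptions: timestamps are integer milliseconds sorted in ascending order, and base timestamps outside the source range hold the nearest edge value, which the text does not specify.

```python
from bisect import bisect_left

def align_to_base(base_ts, src_ts, src_vals):
    """Linearly interpolate one parameter onto the base (load) timestamps.
    Edge holding outside the source range is an assumption."""
    aligned = []
    for t in base_ts:
        if t <= src_ts[0]:
            aligned.append(src_vals[0])
            continue
        if t >= src_ts[-1]:
            aligned.append(src_vals[-1])
            continue
        j = bisect_left(src_ts, t)  # src_ts[j-1] < t <= src_ts[j]
        t1, t2 = src_ts[j - 1], src_ts[j]
        x1, x2 = src_vals[j - 1], src_vals[j]
        aligned.append(x1 + (x2 - x1) * (t - t1) / (t2 - t1))
    return aligned

# Load sampled every 500 ms; a fuel parameter sampled every 1000 ms.
base_ts = [0, 500, 1000, 1500]
fuel_aligned = align_to_base(base_ts, [0, 1000, 2000], [10.0, 20.0, 40.0])
```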
[0043] By aligning all data at the millisecond level, all data at the unit, auxiliary, system, fuel, and environmental levels share the same millisecond-level timestamp, forming a standardized, structured time-series energy consumption dataset that is dimensionally unified, time-synchronized, quality-controllable, and feature-complete. This dataset can be directly used for subsequent model training and data analysis.
[0044] Step 3: Based on the time-series energy consumption dataset, an energy consumption benchmark prediction model is constructed using an LSTM neural network. The model incorporates unit operating condition labels and coal quality fluctuation correction coefficients to fit the theoretical energy consumption benchmark values for different load ranges, coal qualities, and environmental conditions, as detailed below: This step relies on the preprocessed [0,1] normalized multidimensional time series dataset to construct a unified standard input matrix for the model, standardize the input dimensions and time series structure, and provide a standardized data foundation for subsequent feature calculation and model training.
[0045] Combining multi-dimensional operating parameters such as unit load, temperature, pressure, coal quality, environment, and auxiliary equipment operation, a time-series feature matrix is defined as the model input. The overall input takes the form of a set of time-series sequences: $X = \{x_1, x_2, \ldots, x_T\}$, $x_t \in \mathbb{R}^{d}$; in the formula: $t$ is the time step index, corresponding to a millisecond-level timestamp; $x_t$ is the normalized multi-dimensional operating-condition feature vector at time step $t$, which integrates all operating parameters of the unit; $d$ is the fixed feature dimension, that is, the total number of operating-condition parameters involved in the modeling; $T$ is the time window length of a single model input, which limits the span of historical data seen in one learning iteration.
[0046] The core cause of energy consumption changes in thermal power units lies in the dynamic fluctuations of operating parameters. Negative disturbances (such as deteriorating coal quality, sudden load drops, and abnormally low parameters) are the key triggers for increased energy consumption, while positive fluctuations or steady-state operating conditions have minimal negative impact on energy consumption. To achieve differentiated feature learning, this step calculates the temporal fluctuation of operating conditions by using the parameter differences between adjacent time points, accurately distinguishing between positive and negative disturbance states.
[0047] The operating-condition time-series fluctuation is calculated as: $\Delta x_t = x_t - x_{t-1}$; in the formula: $\Delta x_t$ is the time-series fluctuation of the operating parameters at time step $t$; $x_t$ is the feature vector at the current time step; $x_{t-1}$ is the feature vector at the previous time step. Disturbance discrimination criteria: when $\Delta x_t < 0$, the change is judged to be a negative deterioration disturbance, corresponding to adverse operating conditions such as a decrease in coal calorific value, a sudden drop in unit load, or low operating parameters, which lead to increased energy consumption. When $\Delta x_t \ge 0$, the change is judged to be a positive optimization disturbance or a steady-state condition, corresponding to a benign operating state with stable conditions, optimized parameters, reduced energy consumption, or no significant fluctuation.
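The fluctuation calculation and disturbance labelling can be sketched for a single normalized parameter as below. Treating one scalar series at a time and the label strings are assumptions made for illustration; the text applies the difference to the full feature vector.

```python
def label_disturbances(series):
    """Return (delta, label) for each step after the first, where delta is
    the difference from the previous sample. Label names are hypothetical."""
    labels = []
    for prev, cur in zip(series, series[1:]):
        delta = cur - prev
        kind = "negative" if delta < 0 else "positive_or_steady"
        labels.append((delta, kind))
    return labels

# Normalized coal calorific value over four time steps.
steps = label_disturbances([0.8, 0.6, 0.6, 0.9])
kinds = [kind for _, kind in steps]
```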
[0048] Calculating the time-series fluctuation and identifying the disturbance type yields the fluctuation value and disturbance attribute label for each time-series node, completing the fine-grained identification of the unit's operating status and providing the basis for the subsequent asymmetric attention weight assignment. Conventional attention mechanisms weight all temporal fluctuations uniformly and therefore cannot reflect the asymmetric energy consumption characteristics of thermal power plants, in which negative disturbances cause energy loss while positive fluctuations are harmless. Based on the disturbance discrimination results, this step strengthens the weights of negative deterioration disturbances and applies conventional fitting to positive and steady-state fluctuations, so that learning emphasizes the features associated with high energy consumption risk. The asymmetric attention score is calculated as: [formula not reproduced]; the terms in the formula are: the asymmetric attention score at time step t; the negative-disturbance enhancement coefficient specific to thermal power plants, with a value greater than 1, used to amplify the feature weights of adverse operating conditions; and the trainable weight matrix and bias parameters of the attention layer.
[0049] During unit operation there are instantaneous, non-continuous random disturbances. These disturbances have no long-term pattern of influence on energy consumption, and fitting them directly would reduce the accuracy of the model's baseline prediction. This step introduces a time-series decay factor to weaken the weight of instantaneous ineffective disturbances and strengthen the effective time-series features of long-term steady-state operation and sustained operating-condition changes. The time decay factor is calculated as: [formula not reproduced]; the terms in the formula are: the fixed time-series decay coefficient, chosen to match the decay characteristics of thermal power units; the total length of the time window; and the current time step.
[0050] Introducing the time-series decay factor yields a decay weight for each time-series node, which filters out invalid instantaneous disturbances and enhances effective long-term features. To keep the weight distribution across time steps reasonable and the overall weight bounded, the asymmetric attention score and the time-series decay factor are fused, and the weights are normalized with the Softmax mechanism so that the total weight over the whole time window equals 1, giving a standardized allocation of feature weights. The normalized attention weight is calculated as: [formula not reproduced]; in the formula, $\alpha_t$ is the final normalized asymmetric attention weight at time step $t$, satisfying $\sum_{t=1}^{T} \alpha_t = 1$.
[0051] The original time-series features are reweighted with the optimized adaptive weights, which strengthens the key features that drive energy consumption and weakens invalid interference features. This produces high-quality features matched to the thermal power mechanism, which are then used as input to the LSTM layer. The feature weighting formula is: $\tilde{x}_t = \alpha_t \cdot x_t$; in the formula: $\tilde{x}_t$ is the temporal feature vector enhanced by the asymmetric mechanism; $\alpha_t$ is the normalized attention weight at time step $t$; $x_t$ is the original feature vector at time step $t$.
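The fusion, Softmax normalization, and feature weighting steps can be sketched as below. The score and decay forms are not reproduced in this text, so everything specific here is hypothetical: the raw scores are taken as given, negative steps are scaled by `beta > 1`, the decay is `exp(-lam * (T - t))`, and fusion is a product.

```python
import math

def attention_weights(raw_scores, is_negative, beta=1.5, lam=0.1):
    """Softmax over fused (score * decay) values. beta, lam, and both
    functional forms are assumptions, not the patented formulas."""
    T = len(raw_scores)
    fused = []
    for t, (score, neg) in enumerate(zip(raw_scores, is_negative), start=1):
        boosted = beta * score if neg else score
        decay = math.exp(-lam * (T - t))
        fused.append(boosted * decay)
    peak = max(fused)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in fused]
    total = sum(exps)
    return [e / total for e in exps]

def weight_features(features, weights):
    """Scale each time step's feature vector by its attention weight."""
    return [[w * x for x in row] for row, w in zip(features, weights)]

weights = attention_weights([1.0, 1.0, 1.0], [False, True, False])
weighted = weight_features([[0.2, 0.4], [0.6, 0.8], [1.0, 0.0]], weights)
```

With equal raw scores, the step flagged as a negative disturbance receives the largest weight, which is the asymmetry the text describes.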
[0052] Attention weight normalization and temporal feature weighting yield an optimized temporal feature set that is adapted to the physical mechanism, resistant to interference, and focused on key features, completing the refinement and enhancement of the model's shallow features. The energy consumption of a thermal power unit shows a marked time lag: the current energy consumption level is influenced by the operating conditions at several previous moments. This step feeds the weighted features into the LSTM memory layer and uses the three-gate structure (forget gate, input gate, output gate) to mine temporal correlation patterns and capture the lag characteristics and long-term temporal dependencies of energy consumption. The core update formulas are as follows. Forget gate, which controls the proportion of invalid time-series information discarded: $f_t = \sigma(W_f \cdot [h_{t-1}, \tilde{x}_t] + b_f)$; input gate, which controls the proportion of valid features retained at the current time step: $i_t = \sigma(W_i \cdot [h_{t-1}, \tilde{x}_t] + b_i)$; unit state update, which integrates the historical state with the current new features to update the temporal memory unit: $C_t = f_t \odot C_{t-1} + i_t \odot \tanh(W_c \cdot [h_{t-1}, \tilde{x}_t] + b_c)$; output gate, which filters the final effective temporal hidden features: $o_t = \sigma(W_o \cdot [h_{t-1}, \tilde{x}_t] + b_o)$, $h_t = o_t \odot \tanh(C_t)$; in the formulas: $\sigma$ is the sigmoid activation function, which normalizes the gate values to the range 0 to 1; $W_f$, $W_i$, $W_c$, $W_o$ are the trainable weight matrices of each gate; $b_f$, $b_i$, $b_c$, $b_o$ are the corresponding bias parameters; $\tilde{x}_t$ is the weighted feature input at time step $t$; $h_t$ is the deep temporal hidden feature output by the LSTM layer.
[0053] By using LSTM time-series memory layer deep time-series feature analysis, we can extract deep time-series correlation features and hysteresis characteristics of unit energy consumption, and complete deep modeling of the time-series dimension.
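One step of a standard LSTM cell can be written out in plain Python to make the gate arithmetic concrete. This is the textbook cell, offered as a sketch; the parameter layout (a dict keyed `W_f`, `b_f`, and so on) is an assumption, and a real model would use a deep learning framework rather than nested lists.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def matvec(W, v):
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def lstm_step(x, h_prev, c_prev, params):
    """Standard LSTM update for one time step.
    params holds W_f, W_i, W_c, W_o and b_f, b_i, b_c, b_o (assumed layout)."""
    z = h_prev + x  # list concatenation: [h_{t-1}, x_t]

    def gate(name, act):
        pre = matvec(params["W_" + name], z)
        return [act(p + b) for p, b in zip(pre, params["b_" + name])]

    f = gate("f", sigmoid)    # forget gate
    i = gate("i", sigmoid)    # input gate
    g = gate("c", math.tanh)  # candidate state
    o = gate("o", sigmoid)    # output gate
    c = [ft * cp + it * gt for ft, cp, it, gt in zip(f, c_prev, i, g)]
    h = [ot * math.tanh(ct) for ot, ct in zip(o, c)]
    return h, c

# Hidden size 2, input size 1, all-zero parameters: every gate is 0.5.
zeros = {"W_" + k: [[0.0] * 3 for _ in range(2)] for k in "fico"}
zeros.update({"b_" + k: [0.0, 0.0] for k in "fico"})
h, c = lstm_step([0.7], [0.0, 0.0], [1.0, -2.0], zeros)
```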
[0054] The unit operating conditions are classified and encoded so that each operating condition can be modeled separately. Five standard operating conditions are defined: low-load steady state, medium-load steady state, high-load steady state, normal variable load, and rapid operating-condition switching / start-stop. These five conditions are one-hot encoded in five dimensions, generating a standardized five-element operating-condition label vector $c$. The label vector is concatenated along the feature dimension with the deep temporal features $h_t$ output by the LSTM to obtain the operating-condition fusion features: $F_t = \mathrm{Concat}(h_t, c)$.
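The one-hot encoding and concatenation can be sketched as follows. The condition identifiers and their order are illustrative names chosen here, not labels from the patent.

```python
# Hypothetical identifiers for the five operating conditions.
CONDITIONS = [
    "low_load_steady",
    "medium_load_steady",
    "high_load_steady",
    "normal_variable_load",
    "rapid_switch_or_start_stop",
]

def one_hot(condition):
    """Five-element one-hot label vector for an operating condition."""
    if condition not in CONDITIONS:
        raise ValueError(f"unknown operating condition: {condition}")
    return [1.0 if name == condition else 0.0 for name in CONDITIONS]

def fuse(hidden, condition):
    """Concatenate LSTM hidden features with the condition label vector."""
    return list(hidden) + one_hot(condition)

fused = fuse([0.2, -0.1], "high_load_steady")
```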
[0055] The fused multi-dimensional operating-condition time-series features are input to the operating-condition feature embedding layer, which performs feature dimensionality reduction and mapping through a fully connected network, fits the basic relationship between the operating condition and the energy consumption benchmark, and outputs a preliminary energy consumption prediction that does not yet account for real-time disturbances. This preliminary predicted value under the benchmark operating condition completes the basic energy consumption fitting of the model. In actual unit operation, fluctuations in the quality of the coal fed into the furnace and changes in ambient temperature are the main external disturbances affecting the energy consumption baseline. To further improve model accuracy, this step introduces two dynamic correction coefficients that apply a second, finer correction to the preliminary predicted value: (1) Based on the deviation between the rated coal calorific value and the real-time calorific value of the coal fed into the furnace, the energy consumption deviation caused by coal quality fluctuation is corrected using the coal calorific value correction coefficient: [formula not reproduced]; the terms in the formula are: the design rated calorific value of coal for the unit (baseline value); the real-time lower calorific value of the coal fed into the furnace; and the coal quality correction sensitivity coefficient, which adapts the correction to the degree of influence of coal quality on energy consumption. (2) Based on the deviation between the design ambient temperature and the real-time ambient temperature, the impact of environmental disturbances on the unit's energy consumption is corrected using the ambient temperature correction coefficient: [formula not reproduced]; the terms in the formula are: the rated design ambient temperature of the unit; the real-time ambient temperature; and the ambient temperature sensitivity coefficient.
[0056] (3) The two correction coefficients are combined to complete the dynamic correction of the preliminary predicted value, and the accurate, operating-condition-adaptive energy consumption benchmark value is output: [formula not reproduced]; in the formula, the result is the operating-condition-adaptive baseline value of standard coal consumption for power supply, which is the final output of the model.
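The two-coefficient correction can be illustrated with the sketch below. The correction formulas are not reproduced in this text, so the linear forms, the multiplicative combination, the sensitivity defaults, and all function names are assumptions chosen only to show the data flow from preliminary prediction to corrected baseline.

```python
def coal_quality_coefficient(q_rated, q_real, sensitivity=0.5):
    # Hypothetical linear form: a lower real-time calorific value
    # raises the baseline. Not the patented formula.
    return 1.0 + sensitivity * (q_rated - q_real) / q_rated

def ambient_temperature_coefficient(t_design, t_real, sensitivity=0.001):
    # Hypothetical linear form in the temperature deviation (deg C).
    return 1.0 + sensitivity * (t_real - t_design)

def corrected_baseline(preliminary, q_rated, q_real, t_design, t_real):
    """Apply both corrections to the preliminary prediction (g/kWh).
    The product form is an assumption."""
    k_coal = coal_quality_coefficient(q_rated, q_real)
    k_temp = ambient_temperature_coefficient(t_design, t_real)
    return preliminary * k_coal * k_temp

at_design = corrected_baseline(300.0, 21000.0, 21000.0, 20.0, 20.0)
poor_coal = corrected_baseline(300.0, 21000.0, 19000.0, 20.0, 20.0)
```

At design conditions both coefficients equal 1, so the preliminary prediction passes through unchanged, which is the property any concrete form of the correction should preserve.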
[0057] Multi-factor dynamic correction yields accurate energy consumption benchmark predictions that adapt to real-time coal quality, environment, and operating conditions, completing the forward inference process of the model. To minimize the deviation between the model's predicted values and the actual energy consumption values, the mean squared error loss function is used to calculate the training error. The loss function is: $L = \frac{1}{N}\sum_{i=1}^{N}(y_i - \hat{y}_i)^2$; in the formula: $y_i$ is the measured actual coal consumption for power supply; $\hat{y}_i$ is the baseline coal consumption predicted by the model; $N$ is the number of training samples. All network weights and bias parameters are iteratively updated through backpropagation to continuously improve model accuracy.
[0058] If the validation set loss does not decrease for 20 consecutive rounds and the prediction accuracy is ≥98%, the iteration is stopped, the model parameters are fixed, and the final adaptive energy consumption benchmark model is obtained.
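The loss and the stopping rule can be sketched as below. How "prediction accuracy" is measured is not defined in this text, so it is passed in as a number; reading "does not decrease for 20 consecutive rounds" as "no new minimum in the last 20 validation losses" is also an assumption.

```python
def mse(actual, predicted):
    """Mean squared error between measured and predicted coal consumption."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

def should_stop(val_losses, accuracy, patience=20, min_accuracy=0.98):
    """Stop when the last `patience` validation losses set no new minimum
    and accuracy meets the target. The accuracy metric is assumed given."""
    if len(val_losses) <= patience:
        return False
    best_before = min(val_losses[:-patience])
    no_improvement = min(val_losses[-patience:]) >= best_before
    return no_improvement and accuracy >= min_accuracy

loss = mse([300.0, 310.0], [302.0, 307.0])
stalled = should_stop([1.0] * 21, accuracy=0.985)
```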
[0059] The trained model outputs, within milliseconds, the optimal theoretical energy consumption benchmark value of the unit under the corresponding operating condition, based on the real-time operating parameters at the current moment.
Step 4: Compare the actual energy consumption data of the unit with the theoretical energy consumption benchmark value output by the energy consumption benchmark prediction model, set an adaptive deviation threshold, determine the energy consumption operation status, and identify abnormal energy consumption operation status data, as follows:
Retrieve the complete historical operation dataset of the unit that has been fully preprocessed in step 2, has unified timing, and meets quality standards. Divide the data samples strictly according to the five preset operating conditions: low-load steady state, medium-load steady state, high-load steady state, normal variable load, and rapid operating-condition switching / start-stop.
[0060] Five independent operating condition partitions of historical valid sample datasets were obtained, realizing the isolation of data from different operating states and avoiding statistical distortion caused by mixed operating conditions.
[0061] For each operating-condition sample set, the historical measured standard coal consumption for power supply is matched one to one with the theoretical baseline coal consumption output by the LSTM model under the same operating condition, and the relative deviation rate of energy consumption for each historical sample is calculated: $\delta_i = \frac{b_i - \hat{b}_i}{\hat{b}_i} \times 100\%$; in the formula: $b_i$ is the i-th historical measured standard coal consumption for power supply; $\hat{b}_i$ is the corresponding theoretical baseline coal consumption; $\delta_i$ is the relative deviation rate of energy consumption. This generates a sequence of relative deviation rates for all valid samples under each operating condition and shows how far the measured energy consumption of each historical data point deviates from the ideal baseline.
Taking the deviation rates of all valid samples within a single operating-condition partition as the calculation object, the overall average level of energy consumption deviation under that condition is obtained, reflecting the long-term average deviation of the unit from the benchmark under that condition. The mean operating-condition deviation is calculated as: $\bar{\delta} = \frac{1}{n}\sum_{i=1}^{n}\delta_i$; in the formula, $n$ is the total number of valid standardized samples in the current operating condition.
Based on the mean deviation, the standard deviation of the deviation rates of all samples under the same operating condition is calculated. This standard deviation quantifies the range and intensity of the unit's normal energy consumption fluctuation under that condition and characterizes the unit's inherent fluctuation range. The standard deviation of operating-condition deviation is calculated as: $\sigma_{\delta} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(\delta_i - \bar{\delta})^2}$; in the formula, $n$ is the number of valid standardized samples within the corresponding operating-condition partition. The standard deviation obtained for each of the five operating conditions defines the allowable natural fluctuation range of energy consumption for that condition.
Based on the energy consumption control standards of the thermal power industry, the unit design and operation specifications, and the historical deviation distribution statistics, a unified, fixed global initial basic anomaly threshold that is unaffected by external factors is set as the benchmark threshold for judging abnormal energy consumption: [value not reproduced].
Combining the actual service life of the unit with its design lifespan, an aging sensitivity coefficient is introduced to quantify the shift in the energy consumption judgment threshold caused by the aging of equipment components and the degradation of mechanical performance. The equipment aging correction coefficient is calculated as: [formula not reproduced]; the terms in the formula are: the actual cumulative operating time of the unit (h); the design full-lifecycle operating time of the unit (h); and the aging sensitivity coefficient. A seasonal environmental correction coefficient is also defined, with values assigned according to the season.
[0062] Based on differences in ambient temperature, atmospheric humidity, and external ventilation conditions across the year, the year is divided into four seasonal zones: spring, summer, autumn, and winter. A fixed seasonal environmental correction coefficient is assigned to each season to reflect the impact of external environmental changes on unit heat dissipation, auxiliary equipment operation, and coal combustion efficiency. The assigned value is retrieved according to the season in which the unit is currently operating. Multiplying the basic anomaly threshold by the equipment aging correction coefficient and the seasonal environmental correction coefficient dynamically scales the fixed basic threshold, yielding an adaptive judgment threshold that follows the unit's service status and the external environment.
[0063] Calculation formula: $\theta = \theta_0 \cdot k_a \cdot k_s$; in the formula: $\theta_0$ is the basic anomaly threshold; $k_a$ is the equipment aging correction coefficient; $k_s$ is the seasonal environmental correction coefficient. This generates the real-time dynamic adaptive energy consumption anomaly detection threshold $\theta$, which is applicable across the whole life cycle of the unit and in all seasons. The system integrates three categories of core energy consumption monitoring indicators for the generating unit: real-time coal consumption for power supply, total plant power consumption rate, and unit consumption of the major auxiliary machines. The three categories of measured indicators are matched one to one with the theoretical optimal benchmark indicators output by the LSTM model for the same operating conditions.
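The per-condition deviation statistics and the multiplicative threshold can be sketched as follows. The numeric inputs (a 2.0% base threshold, aging coefficient 1.1, seasonal coefficient 1.05) are made-up illustration values, and the population form of the standard deviation is an assumption.

```python
import statistics

def deviation_stats(measured, baseline):
    """Mean and standard deviation of relative deviation rates (%)
    for one operating-condition partition."""
    rates = [(m - b) / b * 100.0 for m, b in zip(measured, baseline)]
    return statistics.fmean(rates), statistics.pstdev(rates)

def adaptive_threshold(base_threshold, k_aging, k_season):
    """Scale the fixed base threshold by the aging and seasonal
    correction coefficients."""
    return base_threshold * k_aging * k_season

mean_dev, std_dev = deviation_stats([303.0, 297.0], [300.0, 300.0])
threshold = adaptive_threshold(2.0, k_aging=1.1, k_season=1.05)
```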
[0064] Unified energy consumption deviation formula: $\delta = \frac{E_{act} - E_{base}}{E_{base}} \times 100\%$; in the formula: $E_{act}$ is the energy consumption indicator measured on site in real time, including measured coal consumption for power supply, measured plant power consumption rate, and measured unit consumption of auxiliary equipment; $E_{base}$ is the corresponding optimal theoretical baseline energy consumption indicator output by the model.
[0065] The deviation of standard coal consumption for power supply serves as the core judgment basis, and the deviations of plant power consumption rate and auxiliary equipment unit consumption serve as auxiliary verification, which avoids misjudgment from a single indicator. This completes the matching of measured values against theoretical benchmark values for the three core energy consumption indicators and gives the calculation standard and real-time deviation values for all indicators. Taking the relative deviation rate of standard coal consumption for power supply as the core judgment criterion, and combining industry operating experience with the actual operating characteristics of the unit, three energy consumption operation levels are defined, each with matching handling mechanisms and early-warning rules. The core judgment deviation formula is: [formula not reproduced]. The energy consumption operation status is divided into three levels: (1) Normal energy consumption state [condition not reproduced]: the unit's energy consumption fluctuations are within the normal tolerance range of the model fit, with no additional energy loss and no need for optimization or control. (2) Minor energy consumption anomaly [condition not reproduced]: the unit shows a controllable, sustained energy consumption deviation, mostly caused by minor parameter deviations, auxiliary equipment mismatches, or minor coal quality fluctuations, which triggers a routine warning in the background log. (3) Severe energy consumption anomaly [condition not reproduced]: the unit shows significant, sustained energy waste, with problems such as equipment performance degradation, severe parameter deviations, or deteriorated combustion conditions. This triggers audible and visual alarms on the platform and push notifications to mobile devices, and the full-dimensional dataset for the abnormal period is automatically locked.
[0066] Complete the accurate determination of the real-time energy consumption operation level of the unit, output the unit energy consumption operation status results and anomaly locking dataset, and complete the comparison and intelligent determination of energy consumption deviations throughout the entire process.
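The three-level grading can be sketched as below. The boundaries between levels are not reproduced in this text, so the rule used here (normal up to the adaptive threshold, minor up to `severe_factor` times the threshold, severe beyond that) and the requirement that at least one auxiliary indicator also exceeds the threshold before raising any anomaly are both assumptions.

```python
def classify(core_dev, threshold, aux_devs=(), severe_factor=2.0):
    """Grade energy consumption status from the core deviation (%).
    Level boundaries and the auxiliary check are hypothetical."""
    if core_dev <= threshold:
        return "normal"
    # Auxiliary verification: if auxiliary deviations are supplied and
    # none exceeds the threshold, treat the core reading as unconfirmed.
    if aux_devs and not any(d > threshold for d in aux_devs):
        return "normal"
    if core_dev <= severe_factor * threshold:
        return "minor"
    return "severe"

level_a = classify(1.0, 2.31)
level_b = classify(3.0, 2.31, aux_devs=[2.5, 0.4])
level_c = classify(5.0, 2.31, aux_devs=[4.0])
```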
[0067] Step 5: Based on the identified abnormal energy consumption operation data, correlation analysis and grey relational analysis are used to quantify the influence weights of multiple factors on the energy consumption deviation and to locate the core abnormal causes. Starting from the energy consumption anomaly time-series dataset determined in step 4, a complete system of energy consumption influencing factors is constructed along four dimensions: fuel, equipment, operation, and environment. Effective influencing factors are screened layer by layer, the anomaly contribution weight of each factor is quantified, and primary, secondary, and tertiary anomaly causes are classified automatically. This removes the need for judgment based on manual experience and achieves data-driven, precise tracing of abnormal energy consumption to its source. The specific steps are as follows: Based on the raw operating parameters collected from all dimensions of the generating unit, and combined with the thermal production mechanism of thermal power units, the influencing factors are divided into four dimensions according to their source of influence on energy consumption. Specific attribution evaluation indicators are determined for each dimension, forming a comprehensive index library of energy consumption influencing factors.
(1) Fuel dimension (core influencing factors of combustion energy consumption): lower heating value of coal Q, ash content of coal A, moisture content of coal M, and blending ratio of high and low coal R;
(2) Equipment dimension (core influencing factors of equipment loss): operating efficiency of the four auxiliary machines η, aging coefficient of unit equipment K, pipeline loss of the flue / steam-water system ΔP, and output matching degree of the coal mill;
(3) Operational dimension (core influencing factors of manual control): unit load fluctuation rate δ, main steam pressure deviation ΔP, main steam temperature deviation ΔT, and auxiliary machine speed matching coefficient;
(4) Environmental dimension (core influencing factors of external disturbances): ambient temperature T, ambient humidity H, and atmospheric pressure P.
[0068] The relative deviation rate of energy consumption $\delta(t)$ calculated in step 4 is set as the parent sequence for attribution analysis and serves as the benchmark for analyzing abnormal energy consumption changes. The measured time series of all influencing factors in the four dimensions established in the previous step are uniformly set as the attribution factor sequences (subsequences). The timestamps of the abnormal periods are strictly aligned so that the energy consumption deviation data of the parent sequence corresponds one-to-one with the influencing factor data of each subsequence. Abnormal samples with misaligned timestamps or severe missing data are removed, forming a time-aligned, completely paired parent sequence-subsequence attribution analysis dataset. The linear correlation between each candidate factor and the energy consumption anomaly is then quantified with the Pearson correlation coefficient, and redundant factors with weak correlation and no actual impact are eliminated to reduce the number of attribution indicators. The correlation coefficient of each influencing factor is calculated with the Pearson formula: $$r_i = \frac{\sum_{t=1}^{n} \left( x_i(t) - \bar{x}_i \right) \left( \delta(t) - \bar{\delta} \right)}{\sqrt{\sum_{t=1}^{n} \left( x_i(t) - \bar{x}_i \right)^2} \, \sqrt{\sum_{t=1}^{n} \left( \delta(t) - \bar{\delta} \right)^2}}$$ In the formula: $r_i$ is the Pearson correlation coefficient between the i-th influencing indicator and the energy consumption deviation; $x_i(t)$ is the measured value of the i-th indicator at time t; $\bar{x}_i$ is the time-series mean of the i-th indicator; $\delta(t)$ is the energy consumption deviation rate at time t; $\bar{\delta}$ is the average energy consumption deviation during the abnormal period; $n$ is the number of abnormal time-series samples.
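As a hedged illustration of this screening step, the sketch below computes the Pearson correlation coefficient between each candidate factor sequence and the deviation (parent) sequence and keeps only the factors whose absolute correlation reaches a cutoff. The cutoff `min_abs_r = 0.5`, the factor names, and the sample values are assumptions for illustration; the source does not state a screening threshold.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("sequences must have equal length >= 2")
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0.0 or sy == 0.0:
        return 0.0  # a constant sequence carries no linear information
    return cov / (sx * sy)


def screen_factors(factors, deviation, min_abs_r=0.5):
    """Keep factors whose |r| against the deviation sequence >= min_abs_r.

    min_abs_r is a hypothetical cutoff chosen for this sketch.
    """
    kept = {}
    for name, series in factors.items():
        r = pearson(series, deviation)
        if abs(r) >= min_abs_r:
            kept[name] = r
    return kept


if __name__ == "__main__":
    deviation = [1.0, 2.0, 3.0, 4.0]            # parent sequence (deviation rate, %)
    factors = {
        "heating_value": [4.0, 3.0, 2.0, 1.0],  # falls as the deviation rises
        "humidity": [1.0, -1.0, -1.0, 1.0],     # unrelated to the deviation
    }
    print(screen_factors(factors, deviation))
```

Strongly negative correlations are kept as well as strongly positive ones, because a factor that moves opposite to the deviation (such as heating value) is just as informative for attribution.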
[0069] This yields a simplified set of core attribution factors that are highly correlated with energy consumption anomalies; invalid interference factors are removed, which reduces the complexity of the subsequent weight calculation. The grey relational analysis algorithm is then used to quantify the weights of the selected core indicators, representing the contribution of each factor to the energy consumption anomaly. Because the factors differ significantly in dimension and numerical magnitude, they cannot be compared directly. Therefore, the initial-value method is used to eliminate these dimensional differences: $$x_i'(t) = \frac{x_i(t)}{x_i(1)}, \qquad \delta'(t) = \frac{\delta(t)}{\delta(1)}$$ where $x_i(t)$ is the measured value of the i-th factor at time t and $\delta(t)$ is the energy consumption deviation rate at time t. Using the value at the first time step of each sequence as the benchmark, all factor data and energy consumption deviation data are standardized to obtain dimensionless standardized factor sequences and a dimensionless energy consumption deviation sequence.
[0070] The absolute difference between the standardized energy consumption deviation sequence $\delta'(t)$ and each standardized core factor sequence $x_i'(t)$ is computed, generating the absolute difference sequence between each factor and the energy consumption deviation at each time point: $$\Delta_i(t) = \left| \delta'(t) - x_i'(t) \right|$$ All absolute differences are traversed to extract the global maximum and global minimum, which are the two extreme differences required for grey relational analysis: $$\Delta_{\max} = \max_i \max_t \Delta_i(t), \qquad \Delta_{\min} = \min_i \min_t \Delta_i(t)$$ A resolution coefficient is introduced and combined with the two extreme differences to calculate the grey relational coefficient between each factor and the energy consumption anomaly at each time point: $$\xi_i(t) = \frac{\Delta_{\min} + \rho \, \Delta_{\max}}{\Delta_i(t) + \rho \, \Delta_{\max}}$$ In the formula: $\rho$ is the resolution coefficient. Averaging the single-time-point coefficients over the time series gives the overall grey relational degree of each core factor with the energy consumption anomaly: $$\gamma_i = \frac{1}{n} \sum_{t=1}^{n} \xi_i(t)$$ In the formula: $\gamma_i$ is the relational degree between the i-th core indicator and the energy consumption deviation, and $n$ is the number of abnormal time-series samples. The relational degrees of all core factors are normalized and converted into percentage weights of anomaly contribution: $$w_i = \frac{\gamma_i}{\sum_{j=1}^{m} \gamma_j} \times 100\%$$ In the formula: $w_i$ is the anomaly contribution weight of a single indicator and $m$ is the number of core indicators; all weights sum to 100%. The percentage anomaly contribution weight of each core influencing factor is output, completing the quantitative ranking of data-level causes, with the preliminary conclusion that coal quality fluctuation is the primary cause, steam parameter deviation is the secondary cause, and auxiliary equipment mismatch is the tertiary cause. Weighted ranking result: coal quality fluctuation (primary cause) > steam parameter deviation (secondary cause) > auxiliary equipment mismatch (tertiary cause).
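The grey relational weighting steps can be sketched as follows, using the standard formulation of grey relational analysis: initial-value normalization, absolute differences against the parent sequence, the relational coefficient $(\Delta_{\min} + \rho\Delta_{\max}) / (\Delta_i(t) + \rho\Delta_{\max})$, time averaging, and normalization to percentages. The resolution coefficient `rho = 0.5` is the value commonly used in the literature and is an assumption here, as are the factor names and sample data.

```python
def grey_relational_weights(parent, factors, rho=0.5):
    """Return percentage contribution weights from grey relational analysis.

    parent  : deviation-rate sequence (parent sequence)
    factors : dict mapping factor name -> sequence aligned with parent
    rho     : resolution coefficient (0.5 is an assumed, conventional value)
    """
    def initialise(seq):
        # Initial-value method: divide every entry by the first entry.
        if seq[0] == 0:
            raise ValueError("first value of a sequence must be non-zero")
        return [v / seq[0] for v in seq]

    p = initialise(parent)
    diffs = {
        name: [abs(a - b) for a, b in zip(p, initialise(seq))]
        for name, seq in factors.items()
    }

    all_diffs = [d for ds in diffs.values() for d in ds]
    d_min, d_max = min(all_diffs), max(all_diffs)

    if d_max == 0.0:
        # Every factor coincides with the parent: all grades are equal.
        grades = {name: 1.0 for name in diffs}
    else:
        grades = {
            name: sum((d_min + rho * d_max) / (d + rho * d_max) for d in ds) / len(ds)
            for name, ds in diffs.items()
        }

    total = sum(grades.values())
    return {name: 100.0 * g / total for name, g in grades.items()}


if __name__ == "__main__":
    deviation = [2.0, 4.0, 6.0]
    factors = {
        "coal_quality": [1.0, 2.0, 3.0],     # tracks the deviation exactly
        "steam_deviation": [1.0, 1.0, 1.0],  # does not follow the deviation
    }
    for name, w in grey_relational_weights(deviation, factors).items():
        print(f"{name}: {w:.1f}%")
```

Sorting the returned weights in descending order gives the primary, secondary, and tertiary cause ranking that the text describes.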
[0071] To avoid misattribution caused by purely statistical correlation, a secondary verification against the thermal mechanism is performed to eliminate spurious correlation factors. The judgment rules are as follows:
(1) Fuel factor verification: a decrease in calorific value and an increase in ash content of the coal fed into the furnace inevitably reduce the effective heat absorption of the boiler and increase the unit coal consumption. This is consistent with the boiler combustion mechanism, so the correlation result is valid.
(2) Operating parameter verification: a main steam temperature below the design value reduces the cycle thermal efficiency and increases the unit energy consumption. This is consistent with the Rankine cycle mechanism, so the correlation result is valid.
(3) Equipment factor verification: a decrease in induced draft fan efficiency increases the flue gas exhaust resistance, the exhaust heat loss, and the plant power consumption rate. This is consistent with the auxiliary machine energy consumption coupling mechanism, so the correlation result is valid.
(4) False association elimination rule: a factor with high data correlation that violates the thermodynamic mechanism (for example, a rising ambient temperature, which should theoretically lower coal consumption, appearing in the data alongside rising coal consumption) is directly judged to be a false data correlation, and its attribution result is eliminated.
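One simple way to express these verification rules is as a sign check: each factor is given the direction in which the mechanism says it should move relative to the energy consumption deviation, and a statistically strong correlation with the opposite sign is rejected as spurious. The sketch below is only an illustration under that assumption; the factor names and the `EXPECTED_SIGN` table are hypothetical encodings of the four rules above, not part of the source.

```python
# Expected sign of the correlation between each factor and the energy
# consumption deviation, encoded from the mechanism rules in the text.
# Factor names are hypothetical labels chosen for this sketch.
EXPECTED_SIGN = {
    "coal_heating_value": -1,            # lower heating value -> higher consumption
    "coal_ash_content": +1,              # more ash -> higher consumption
    "main_steam_temperature": -1,        # lower steam temperature -> higher consumption
    "induced_draft_fan_efficiency": -1,  # lower fan efficiency -> higher consumption
    "ambient_temperature": -1,           # text: higher temperature should lower consumption
}


def verify_mechanism(correlations, expected_sign=EXPECTED_SIGN):
    """Split factors into mechanism-consistent and spurious groups.

    correlations : dict mapping factor name -> correlation coefficient
    Factors with no rule in expected_sign are passed through as valid,
    since there is no mechanism statement to contradict them.
    """
    valid, spurious = {}, {}
    for name, r in correlations.items():
        sign = expected_sign.get(name)
        if sign is None or r * sign > 0:
            valid[name] = r
        else:
            spurious[name] = r
    return valid, spurious


if __name__ == "__main__":
    observed = {
        "coal_heating_value": -0.82,   # agrees with the combustion mechanism
        "ambient_temperature": 0.71,   # strong, but the sign contradicts the mechanism
    }
    valid, spurious = verify_mechanism(observed)
    print("valid:", valid)
    print("spurious:", spurious)
```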
[0072] Step 6: Based on the identified core anomaly causes and combined with the operating constraints of the thermal power plant units, intelligent energy-saving methods are generated for unit parameter control, auxiliary equipment load matching, coal blending optimization, and equipment operation and maintenance rectification, and the energy consumption analysis results and optimization methods are presented. Starting from the multi-dimensional energy consumption anomaly attribution results, the quantified contribution weight of each contributing factor, and the core contributing factors output in step 5, an adaptive intelligent optimization system is constructed. The system comprises three core modules: steady-state optimal operating condition matching, real-time fine-tuning of dynamic parameters, and pre-assessment of energy-saving effects. Relying on the unit's comprehensive historical operating condition database, the optimal benchmark operating condition is matched through operating condition similarity retrieval based on Euclidean distance. A negative feedback fine-tuning mechanism is constructed by combining the energy consumption deviation with the contributing factor weights to achieve precise reverse correction of abnormal parameters. The energy-saving and consumption-reduction benefits are then quantified through a pre-assessment model. Together these complete a closed-loop intelligent energy consumption optimization analysis covering anomaly location, intelligent optimization, parameter control, and benefit assessment, as detailed below: Based on the operating characteristics of thermal power units, the method distinguishes between two operating scenarios, steady-state operation and dynamic fluctuation under varying operating conditions, and establishes a dual-model collaborative adaptive intelligent optimization framework.
For long-term steady-state operation, a steady-state optimal operating condition matching model is employed, retrieving massive historical data to match the optimal and efficient operating paradigm under the same conditions. For dynamic energy consumption deviations caused by load fluctuations, coal quality changes, and environmental variations, a dynamic real-time fine-tuning model is used to achieve precise parameter correction. The two models work together to adapt to the optimization needs of the unit's operation across all scenarios. The weights of the abnormal causes quantified in step 5 are used as the optimization priority; parameters with higher weights have higher optimization and control priority.
[0073] A dual-mode adaptive intelligent energy-saving optimization framework for steady-state matching and dynamic fine-tuning has been established, and the overall technical route for subsequent operating condition matching, parameter fine-tuning, and effect evaluation has been clarified.
[0074] The energy consumption level of a generating unit is mainly determined by three core conditions: unit output, fuel quality, and external environment. To ensure the accuracy of operating condition matching and eliminate redundant interference parameters, three key features are selected as the sole evaluation indicators for operating condition similarity matching: unit power generation load, lower heating value of the coal fed into the furnace, and ambient temperature. A feature vector for the current real-time operating condition and feature vectors for the historical sample operating conditions are constructed to give a digital, standardized representation of the operating condition. The current operating condition feature vector is: $$X_{\text{now}} = \left[ P_{\text{now}}, \; Q_{\text{now}}, \; T_{\text{now}} \right]$$ In the formula: $P_{\text{now}}$ is the current real-time power generation load of the generating unit; $Q_{\text{now}}$ is the lower heating value of the coal currently being fed into the furnace; $T_{\text{now}}$ is the current real-time ambient temperature.
[0075] The operating condition feature vector of the j-th sample in the historical database is constructed as: $$X_j = \left[ P_j, \; Q_j, \; T_j \right]$$ In the formula: $P_j$ is the unit load of the historical sample; $Q_j$ is the lower heating value of the coal fed into the furnace for the historical sample; $T_j$ is the ambient temperature of the historical sample. The standardized three-dimensional feature vectors of the current operating condition and of the historical sample operating conditions are generated, completing the data structure preparation for the similarity calculation. To match historical operating conditions that are highly consistent with the current operating condition and eliminate optimization bias caused by differences in operating conditions, the Euclidean distance algorithm is used to quantify the similarity between the current operating condition and every historical sample. Euclidean distance reflects the overall degree of difference in the multi-dimensional feature space: the smaller the distance, the closer the two operating conditions are and the greater the reference value of the historical sample.
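[0076] Formula for calculating the Euclidean distance of operating condition similarity: $$D_j = \sqrt{\left( P_{\text{now}} - P_j \right)^2 + \left( Q_{\text{now}} - Q_j \right)^2 + \left( T_{\text{now}} - T_j \right)^2}$$ In the formula: $D_j$ is the Euclidean similarity distance between the current operating condition and the j-th historical sample, where $P$, $Q$, and $T$ denote the standardized unit load, lower heating value of the coal, and ambient temperature, with subscript "now" for the current condition and $j$ for the historical sample. Taking the square root of the sum of the squared differences of the three core features of load, coal quality, and environment quantifies the overall difference between the multi-dimensional operating conditions.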
[0077] After the distances to all historical samples have been calculated, the samples are sorted by D and the set of highly similar operating conditions with the smallest D values is selected. Among these similar samples, the one with the lowest standard coal consumption for power supply is then selected, and its set of operating parameters is defined as the benchmark optimal operating condition for the current operating condition. This matching yields the historical optimal energy-saving benchmark operating condition and the corresponding set of optimal operating parameters, which serve as the standard reference for optimizing the unit's steady-state operation.
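The steady-state matching procedure (Euclidean distance over the three features, selection of the nearest samples, then choice of the sample with the lowest coal consumption) might be sketched as below. The features are assumed to be already normalized to [0, 1]; the type names, the neighbourhood size `k`, and the sample values are assumptions for illustration.

```python
import math
from typing import NamedTuple


class Condition(NamedTuple):
    load: float           # normalized unit power generation load
    heating_value: float  # normalized lower heating value of the coal
    ambient_temp: float   # normalized ambient temperature


class HistoricalSample(NamedTuple):
    condition: Condition
    coal_consumption: float  # standard coal consumption for power supply, g/kWh
    parameters: dict         # operating parameter set recorded with the sample


def distance(a: Condition, b: Condition) -> float:
    """Euclidean distance between two operating condition feature vectors."""
    return math.sqrt(sum((u - v) ** 2 for u, v in zip(a, b)))


def match_benchmark(current: Condition, history, k: int = 3) -> HistoricalSample:
    """Pick the lowest-consumption sample among the k nearest conditions.

    k is a hypothetical neighbourhood size; the source only says that the
    set of samples with the smallest distances is selected.
    """
    if not history:
        raise ValueError("history must not be empty")
    nearest = sorted(history, key=lambda s: distance(current, s.condition))[:k]
    return min(nearest, key=lambda s: s.coal_consumption)


if __name__ == "__main__":
    history = [
        HistoricalSample(Condition(0.50, 0.50, 0.50), 305.0, {"main_steam_temp": 538.0}),
        HistoricalSample(Condition(0.52, 0.50, 0.50), 298.0, {"main_steam_temp": 541.0}),
        HistoricalSample(Condition(0.90, 0.90, 0.90), 280.0, {"main_steam_temp": 545.0}),
    ]
    best = match_benchmark(Condition(0.5, 0.5, 0.5), history, k=2)
    print(best.coal_consumption, best.parameters)
```

The third sample has the lowest coal consumption overall but is excluded because its operating condition is far from the current one, which is the point of restricting the search to similar conditions first.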
[0078] The steady-state optimal operating condition is only applicable to stable operation scenarios. For dynamic instantaneous energy consumption deviations caused by unit load changes, coal quality fluctuations, and sudden environmental changes, dynamic correction is required through real-time parameter fine-tuning. This step combines the degree of real-time energy consumption deviation and the contribution weight of abnormal causes to construct a negative feedback fine-tuning model, automatically calculate the parameter optimization adjustment amount, and specifically offset abnormal energy consumption deviations.
[0079] Formula for calculating the dynamic parameter fine-tuning amount: $$\Delta x_i = -\alpha \, w_i \, \delta \, x_i$$
[0080] In the formula: $\Delta x_i$ is the optimized fine-tuning amount of the parameter; $\alpha$ is the dynamic adjustment learning coefficient; $w_i$ is the weight of the core contributing factor; $\delta$ is the real-time energy consumption deviation rate; $x_i$ is the measured value of the current abnormal parameter; the negative sign indicates a reverse correction that offsets the energy consumption deviation. The model outputs precise real-time fine-tuning values for each core energy consumption parameter, forming parameter optimization and control commands under dynamic operating conditions that quickly suppress dynamic energy consumption anomalies. To predict the energy-saving effect of parameter optimization in advance and quantify the improvement in energy consumption, a pre-evaluation model for the optimization effect is built from the current measured coal consumption, the real-time deviation rate, and the attribution weight of each factor. This model calculates the predicted coal consumption for power supply after regulation and thereby measures the energy-saving benefit. The formula for the optimized predicted coal consumption is: $$b_{\text{pred}} = b_{\text{now}} - \sum_{i} \Delta b_i$$ In the formula: $b_{\text{pred}}$ is the predicted standard coal consumption for power supply after parameter optimization and control; $b_{\text{now}}$ is the unit's current real-time measured coal consumption for power supply; $\sum_{i} \Delta b_i$ is the total coal consumption that can be reduced across the abnormal causes, representing the ineffective energy consumption that this optimization can eliminate.
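A hedged sketch of the negative feedback fine-tuning and the pre-evaluation follows. It assumes a multiplicative form for the adjustment (learning coefficient times factor weight times deviation rate times measured value, with a negative sign), which is consistent with the quantities listed in the text but is an assumption, and it treats the reducible coal consumption per cause as an input, since the source does not show how each term is derived. All numeric values are illustrative.

```python
def fine_tune_delta(measured_value: float, weight: float,
                    deviation_rate: float, learning_coeff: float = 0.1) -> float:
    """Negative-feedback adjustment for one abnormal parameter.

    The multiplicative form is an assumption of this sketch.
    weight and deviation_rate are fractions (0.5 = 50 %, 0.04 = 4 %).
    """
    return -learning_coeff * weight * deviation_rate * measured_value


def predicted_coal_consumption(current: float, reducible_by_cause) -> float:
    """Predicted consumption = current measured value minus total reducible amount."""
    return current - sum(reducible_by_cause)


if __name__ == "__main__":
    # Hypothetical abnormal parameter with measured value 100.0 (arbitrary units).
    delta = fine_tune_delta(measured_value=100.0, weight=0.5, deviation_rate=0.04)
    print(f"adjustment: {delta:+.3f}")

    current = 305.0                # g/kWh, measured before optimization
    reducible = [2.0, 1.0, 0.5]    # g/kWh per abnormal cause (illustrative)
    predicted = predicted_coal_consumption(current, reducible)
    print(f"predicted: {predicted:.1f} g/kWh, saving: {current - predicted:.1f} g/kWh")
```

Because the adjustment scales with both the deviation rate and the factor weight, parameters attributed a larger share of the anomaly receive proportionally larger corrections, and the correction vanishes as the deviation approaches zero.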
[0081] The difference between the measured coal consumption before optimization and the predicted coal consumption after optimization gives the energy consumption reduction and the coal-saving and electricity-saving benefits of this intelligent optimization. This completes the quantitative evaluation of the energy-saving effect, yielding the predicted coal consumption after optimization, the energy consumption reduction rate, and the quantified energy-saving benefit, so that the energy-saving effect is known in advance.
[0082] By integrating the outputs of all the aforementioned models, and combining the anomaly tracing conclusions, optimal steady-state operating parameters, dynamic fine-tuning schemes, and pre-assessed energy-saving benefits, a systematic energy consumption optimization analysis conclusion is formed. Distinguishing between steady-state long-term optimization strategies and dynamic instantaneous control strategies, the energy-saving improvement directions in different dimensions such as coal quality, operating parameters, auxiliary equipment matching, and environmental adaptation are clarified. This results in a feasible long-term energy consumption management plan for the unit, and a complete intelligent energy consumption optimization analysis report is output, including optimal operating condition standards, parameter fine-tuning schemes, energy consumption improvement potential, and conclusions on energy-saving economic benefits.
[0083] It should be stated that the above-described invention content and specific embodiments are intended to demonstrate the practical application of the technical solution provided by this invention and should not be construed as limiting the scope of protection of this invention. Those skilled in the art can make various modifications, equivalent substitutions, or improvements within the spirit and principles of this invention. The scope of protection of this invention is defined by the appended claims.
Claims
1. A method for analyzing energy consumption data of a thermal power plant, characterized in that it includes the following steps:
S1. Build a full-link data acquisition architecture for the thermal power plant, collect multi-source data on unit operation in real time, and store the collected multi-source data in a hierarchical and classified manner according to the unit layer, auxiliary machine layer, and system layer.
S2. Sequentially process the collected multi-source data by missing value repair, outlier removal, data normalization, and time-series alignment to eliminate data noise and dimensional differences, forming a time-series energy consumption dataset.
S3. Based on the time-series energy consumption dataset, construct an energy consumption benchmark prediction model using an LSTM neural network, introducing the unit operating condition label and the coal quality fluctuation correction coefficient to fit the theoretical energy consumption benchmark value of the unit under different load ranges, different coal qualities, and different environmental conditions.
S4. Compare the actual energy consumption data of the unit with the theoretical energy consumption benchmark value output by the energy consumption benchmark prediction model, set an adaptive deviation threshold, determine the energy consumption operation status, and identify abnormal energy consumption operation data.
S5. Based on the identified abnormal energy consumption operation data, quantify the influence weights of multiple factors on the energy consumption deviation through correlation analysis and the grey relational algorithm, and locate the core abnormal causes.
S6. Based on the located core anomaly causes, and combined with the operating constraints of the thermal power plant units, generate intelligent energy-saving methods for unit parameter control, auxiliary equipment load matching, coal blending optimization, and equipment operation and maintenance rectification, and form the energy consumption analysis results and optimization methods.
2. The method of claim 1, wherein the multi-source data in step S1 specifically includes: core unit parameters, fuel parameters, auxiliary machine parameters, energy consumption parameters, and environmental parameters.
3. The method of claim 1, wherein step S2 specifically includes:
S21. For short-term single-point missing data, repair by mean interpolation over adjacent time points; for continuous long-term missing data, fit and repair using historical data from similar operating conditions, and attach a data repair label.
S22. Using the 3σ criterion combined with the unit operation logic rules, eliminate invalid abnormal data caused by sensor failures and instantaneous disturbances, while retaining valid abnormal data generated by unit operating condition switching and fault fluctuations.
S23. Use the min-max normalization algorithm to map the data of all dimensions to the [0,1] interval, eliminating dimensional differences.
S24. Taking the unit load time series as the reference, perform time-series interpolation and alignment on multi-source data with different acquisition frequencies so that all data timestamps are consistent, forming a standardized dataset with continuous time series and consistent dimensions.
4. The method of claim 1, wherein, Step S3 involves constructing an energy consumption benchmark prediction model based on the time-series energy consumption dataset using an LSTM neural network. This model incorporates unit operating condition labels and coal quality fluctuation correction coefficients to fit the theoretical energy consumption benchmark values for different load ranges, coal qualities, and environmental conditions. Specifically, this includes: S31. Based on the preprocessed [0,1] normalized multidimensional time series dataset, generate a multidimensional time series feature input matrix to complete the structured definition of the model input data; S32. Calculate the operating condition time-series fluctuation and perform disturbance discrimination by using the parameter difference between adjacent time points, and obtain the operating condition fluctuation and disturbance attribute label for each time-series node to complete the refined discrimination of the unit's operating status. S33. Based on the disturbance discrimination results, the negative deterioration disturbance is weighted and strengthened, while the positive and steady-state fluctuations are fitted in a conventional manner. The learning weights that highlight the high energy consumption risk characteristics are obtained to obtain the differentiated asymmetric attention original scores for each time series node. S34. Introduce a time-series decay factor to weaken the weight of instantaneous invalid disturbances and strengthen the effective time-series characteristics of long-term steady-state and continuous operating condition changes. Obtain the time-series decay weight factor for each time-series node to complete the filtering of invalid instantaneous disturbances and the strengthening of effective long-term characteristics. S35. 
The asymmetric attention score is fused with the temporal decay factor, and the weights are normalized through the Softmax mechanism to obtain the optimized temporal feature set. S36. Input the weighted features into the LSTM memory layer, mine the time series correlation patterns through the three-gate structure, capture the energy consumption lag characteristics and long-term time series dependencies, extract the deep time series correlation features and lag characteristic features of unit energy consumption, and complete the deep modeling of the time series dimension. S37. Classify and encode typical operating conditions of the unit to realize differentiated operating condition modeling, generate multi-dimensional deep features that fuse time-series features and operating condition features, and realize the model's differentiated perception of different operating states. S38. Input the fused multi-dimensional working condition time series features into the working condition feature embedding layer to obtain the preliminary predicted energy consumption value under the benchmark working condition, complete the basic energy consumption fitting of the model, and introduce two types of dynamic correction coefficients to perform a second precise correction on the preliminary predicted value. S39. The training error is calculated using the mean squared error loss function. All weights and bias parameters of the network are iteratively updated through the backpropagation mechanism to complete the model iterative optimization and parameter solidification, thus obtaining the working condition adaptive energy consumption benchmark prediction model.
5. The method of claim 4, wherein the LSTM three-gate structure in step S36 includes a forget gate that controls the proportion of invalid historical time-series information to discard, an input gate that controls the proportion of valid features to retain at the current time, and an output gate that filters the final valid hidden time-series features.
6. The method of claim 1, wherein step S4, in which the actual energy consumption data of the unit is compared with the theoretical energy consumption benchmark value output by the energy consumption benchmark prediction model, an adaptive deviation threshold is set, the energy consumption operation status is determined, and abnormal energy consumption operation data is identified, specifically includes:
S41. Retrieve the complete historical operating dataset of the unit and strictly classify the data samples according to the five preset operating conditions, obtaining historical valid sample datasets for five independent operating condition partitions.
S42. For each operating condition sample set, match the historical measured standard coal consumption for power supply with the theoretical benchmark coal consumption output by the LSTM model under the same operating condition, calculate the relative energy consumption deviation rate of each historical sample, and generate the sequence of individual relative deviation rates for all valid samples under each operating condition.
S43. Taking the deviation rates of all valid samples within a single operating condition partition as the calculation object, calculate the mean energy consumption deviation under that operating condition, which determines the average level of the unit's normal energy consumption deviation from the benchmark under that operating condition.
S44. Combining the actual service time of the unit with its design life cycle, introduce an aging sensitivity coefficient to quantify the shift in the energy consumption judgment threshold caused by the aging of equipment components and the degradation of mechanical performance; set a basic anomaly threshold and construct a two-dimensional correction coefficient.
S45. Multiply the basic anomaly threshold, the equipment aging correction coefficient, and the seasonal environmental correction coefficient together to dynamically scale the fixed basic threshold, and solve for the dynamic adaptive energy consumption anomaly judgment threshold.
S46. Using the deviation of standard coal consumption for power supply as the core judgment basis, and the deviations of the plant power consumption rate and auxiliary equipment unit consumption as auxiliary verification, match the measured values of the three core energy consumption indicators with their theoretical benchmark values to obtain the full-dimensional energy consumption indicator deviation calculation standard and the real-time deviation values.
S47. Taking the relative deviation rate of standard coal consumption for power supply as the core criterion, define three distinct energy consumption operation levels to accurately determine the real-time energy consumption operation level of the unit.
7. The method of claim 6, wherein setting the basic anomaly threshold and constructing the two-dimensional correction coefficient in step S44 specifically includes: (1) Set a globally fixed initial basic anomaly threshold that is not affected by external factors as the benchmark critical value for energy consumption anomaly judgment; (2) Combining the actual service time of the unit with its design full life cycle time, introduce an aging sensitivity coefficient to quantify the shift in the energy consumption judgment threshold caused by the aging of equipment components and the degradation of mechanical performance, and construct an equipment aging correction coefficient; (3) Based on the differences in ambient temperature, atmospheric humidity, and external ventilation conditions across the year, divide the year into the four seasons of spring, summer, autumn, and winter, and assign a corresponding fixed seasonal environmental correction coefficient to each season.
8. The method for analyzing energy consumption data in a thermal power plant according to claim 1, characterized in that step S5, in which correlation analysis and grey relational analysis are used on the identified abnormal energy consumption operation data to quantify the influence weights of multiple factors on the energy consumption deviation and to locate the core abnormal causes, specifically includes:
S51. Based on the raw operating parameters collected from all dimensions of the unit, combined with the thermal production mechanism of the thermal power unit, divide the influencing factors into four dimensions according to their source of influence on energy consumption, and determine specific attribution evaluation indicators for each dimension to form a comprehensive index library of energy consumption influencing factors.
S52. Set the time-series relative deviation rate of energy consumption as the parent sequence for attribution analysis and use it as the benchmark for analyzing abnormal energy consumption changes; uniformly set the measured time series of all influencing factors in the four dimensions as the attribution factor sequences, forming the parent sequence and subsequence attribution analysis dataset.
S53. Quantify the linear correlation between each candidate factor and the energy consumption anomaly deviation using the Pearson correlation coefficient, eliminate redundant factors with weak correlation and no actual impact, reduce the number of attribution indicators, and obtain a simplified set of core attribution factors that are highly correlated with energy consumption anomalies.
S54. Use the grey relational algorithm to quantify the influence weight of each factor, obtain the percentage anomaly contribution weight of each core influencing factor, and complete the quantitative ranking of data-level causes, with the preliminary conclusion that coal quality fluctuation is the primary cause, steam parameter deviation is the secondary cause, and auxiliary equipment mismatch is the tertiary cause.
S55. Perform a secondary verification against the thermal mechanism to eliminate spurious correlation factors, obtain the list of valid abnormal causes and their contribution weights after spurious correlations have been removed, and complete the accurate location of energy consumption anomalies.
9. The method for analyzing energy consumption data in a thermal power plant according to claim 8, characterized in that the quantification of the factor influence weights in step S54 specifically includes:
(1) Use the initial-value method to eliminate differences in dimension: taking the value at the first time step of each sequence as the benchmark, standardize all factor data and the energy consumption deviation data to obtain the dimensionless standardized factor sequences and energy consumption deviation sequence.
(2) Compute the absolute value of the difference between the standardized energy consumption deviation sequence and each core factor sequence to generate the absolute difference sequence between each factor and the energy consumption deviation at every time node.
(3) Traverse all absolute difference data, extract the global maximum and the global minimum, and thereby determine the two global extreme differences required for grey relational analysis.
(4) Introduce the resolution coefficient and, together with the two extreme differences, calculate the grey relational coefficient between each factor and the energy consumption anomaly at every time node.
(5) Average the per-node relational coefficients over the time series to obtain the overall grey relational degree of each core factor with the energy consumption anomaly.
(6) Normalize the relational degrees of all core factors, output the percentage abnormal contribution weight of each core influencing factor, and complete the quantitative ranking of the data-level causes, with the preliminary conclusion that coal quality fluctuation is the primary cause, steam parameter deviation is the secondary cause, and auxiliary equipment mismatch is the tertiary cause.
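Sub-steps (1) to (6) of claim 9 follow the standard form of grey relational analysis and can be sketched as below. The function name, the sample data, and the resolution coefficient value of 0.5 are assumptions made for illustration (the claim introduces a resolution coefficient without fixing its value). The sketch also assumes that the first value of every sequence is non-zero, which the initial-value method requires.

```python
def grey_relational_weights(parent, factors, rho=0.5):
    """Return the normalized grey relational weight of each factor.

    parent  : parent (reference) sequence, e.g. the energy consumption deviation
    factors : dict mapping a factor name to its sequence
    rho     : resolution coefficient (0.5 is an assumed, commonly used value)
    """
    # (1) Initial-value method: divide each sequence by its first element.
    ref = [v / parent[0] for v in parent]
    norm = {k: [v / s[0] for v in s] for k, s in factors.items()}

    # (2) Absolute difference sequences against the parent sequence.
    diffs = {k: [abs(a - b) for a, b in zip(ref, s)] for k, s in norm.items()}

    # (3) Global minimum and maximum over all factors and all time nodes.
    all_diffs = [d for seq in diffs.values() for d in seq]
    d_min, d_max = min(all_diffs), max(all_diffs)

    degrees = {}
    for name, seq in diffs.items():
        if d_max == 0:
            # Every sequence coincides with the parent: full relation.
            degrees[name] = 1.0
            continue
        # (4) Grey relational coefficient at each time node.
        coeffs = [(d_min + rho * d_max) / (d + rho * d_max) for d in seq]
        # (5) Time-series average gives the grey relational degree.
        degrees[name] = sum(coeffs) / len(coeffs)

    # (6) Normalize the degrees so that the weights sum to 1.
    total = sum(degrees.values())
    return {name: g / total for name, g in degrees.items()}


# Illustrative data only.
parent = [1.0, 2.0, 3.0]
factors = {
    "coal_quality": [2.0, 4.0, 6.0],  # same shape as the parent sequence
    "aux_load": [1.0, 1.0, 1.0],      # flat, diverges from the parent
}
weights = grey_relational_weights(parent, factors)
print(weights)
```

Multiplying each weight by 100 gives the percentage contribution used for the ranking in sub-step (6).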
10. The method for analyzing energy consumption data in a thermal power plant according to claim 1, characterized in that step S6, based on the located core anomaly causes and in combination with the operating constraints of the thermal power plant units, generates intelligent energy-saving methods for unit parameter control, auxiliary equipment load matching, coal blending optimization, and equipment operation and maintenance rectification, and forms the energy consumption analysis results and optimization methods, specifically including:
S61. Construct an adaptive intelligent energy-saving optimization framework: adopt a steady-state optimal operating condition matching model that searches massive historical data to match the most efficient operating paradigm under the same conditions; for dynamic energy consumption deviations caused by load fluctuations, coal quality changes, and environmental changes, adopt a dynamic real-time fine-tuning model to achieve precise parameter correction.
S62. Select three key features, namely the unit power generation load, the lower calorific value of the as-fired coal, and the ambient temperature, and construct the feature vector of the current real-time operating condition and the feature vectors of the historical sample operating conditions, completing the data structure preparation for the operating condition similarity calculation.
S63. Use the Euclidean distance algorithm to quantify the similarity between the current operating condition and all historical sample operating conditions, match the historical operating conditions that are highly consistent with the current operating condition, and eliminate optimization deviations caused by differences in operating conditions.
S64. To address the dynamic instantaneous energy consumption deviation caused by unit load changes, coal quality fluctuations, and sudden environmental changes, construct a negative feedback fine-tuning model that combines the real-time energy consumption deviation degree with the contribution weights of the abnormal causes, and automatically calculate the parameter optimization adjustment amount.
S65. Build a pre-evaluation model for the optimization effect, calculate the predicted power supply coal consumption after regulation, and calculate the energy-saving benefits, obtaining the predicted coal consumption value after optimization, the energy consumption reduction rate, and the quantified energy-saving benefits.
S66. Integrate the output results of all the aforementioned models, and combine the anomaly tracing conclusions, the optimal steady-state operating parameters, the dynamic fine-tuning schemes, and the pre-evaluated energy-saving benefits to form a systematic energy consumption optimization analysis conclusion.
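The operating condition matching of steps S62 and S63 can be sketched as a nearest-neighbour search with Euclidean distance. This is an illustrative sketch under stated assumptions: the function names, the sample values, and the per-feature `scales` used to make load, calorific value, and temperature comparable are hypothetical; the claims do not specify how the three features are scaled before the distance is computed.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def match_conditions(current, history, scales, k=1):
    """Return the indices of the k historical samples nearest to `current`.

    Each vector is (load, lower calorific value, ambient temperature).
    `scales` holds an assumed per-feature divisor so that no single
    feature dominates the distance purely because of its unit.
    """
    def scaled(vec):
        return [x / s for x, s in zip(vec, scales)]

    cur = scaled(current)
    ranked = sorted(
        range(len(history)),
        key=lambda i: euclidean(cur, scaled(history[i])),
    )
    return ranked[:k]


# Illustrative data only: (load in MW, calorific value in MJ/kg, temperature in deg C).
current = (300.0, 20.0, 15.0)
history = [
    (600.0, 22.0, 30.0),
    (310.0, 19.5, 14.0),
    (300.0, 16.0, -5.0),
]
scales = (600.0, 25.0, 40.0)  # hypothetical feature ranges
nearest = match_conditions(current, history, scales, k=2)
print(nearest)
```

The operating parameters recorded with the matched historical samples would then serve as the steady-state reference described in step S61.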