A risk management and control system and method for an energy storage power station
By constructing a multi-scale risk trend prediction engine and a multi-objective optimization model, the invention addresses the early-warning lag and rigid decision-making mechanism of existing risk management systems for energy storage power stations. It enables advance prediction and proactive control of thermal runaway risk, thereby improving the operational economy and safety of energy storage systems.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- IANGSU COLLEGE OF ENG & TECH
- Filing Date
- 2026-03-26
- Publication Date
- 2026-06-19
Abstract
Description
Technical Field
[0001] This invention relates to the field of energy storage power station technology, and in particular to a risk management system and method for energy storage power stations.
Background Technology
[0002] While existing risk management systems for energy storage power stations integrate multiple algorithms for intelligent diagnosis, their core logic still relies on current or historical operational data fed back by sensors. Their technical goal is to improve the accuracy of abnormal state identification and the reliability of alarms, triggering fixed emergency response plans at preset levels.
[0003] This technical approach has inherent methodological flaws. First, early warning is delayed: a response is initiated only after abnormal characteristics are clearly manifested, by which time the thermal runaway chain reaction may already have entered the acceleration stage, leaving an extremely short emergency response window. Second, the decision-making mechanism is static and rigid: it cannot distinguish the trend of risk evolution and may trigger the same response for risks of different urgency. Third, resource allocation is coarse: a one-size-fits-all response is prone to excessive intervention and wasted resources. Fourth, functionality is limited: the system can only judge the current dangerous state and cannot support predictive maintenance or operation optimization.
[0004] The information disclosed in this background section is intended only to enhance the understanding of the overall background of the invention and should not be construed as an admission or in any way implying that the information constitutes prior art known to those skilled in the art.
Summary of the Invention
[0005] The purpose of this invention is to provide a risk management system and method for energy storage power stations, which can not only significantly extend the risk warning time window, but also realize the management paradigm innovation from passive emergency response to active prediction and control. While strengthening the safety defense line of the energy storage system, it effectively improves the economic efficiency of the entire life cycle operation.
[0006] To achieve the above objectives, embodiments of the present invention provide a risk management system for energy storage power stations, comprising: a multi-source synchronous monitoring module, configured to acquire multi-dimensional monitoring data of key battery parameters in the energy storage power station in real time through a multi-dimensional sensing unit; a multi-scale risk trend prediction module, connected to the multi-source synchronous monitoring module and configured to predict the risk trend of the energy storage power station based on the monitoring data, and to generate a prediction sequence identifying the change trend of key battery parameters and the overall risk level within a future preset time window, as well as a confidence index corresponding to the prediction sequence; a dynamic risk assessment and fusion decision-making module, connected to the multi-source synchronous monitoring module and the multi-scale risk trend prediction module, respectively, and configured to generate a risk index based on the prediction sequence, the corresponding confidence index, and the real-time monitoring data collected by the multi-source synchronous monitoring module; a resource scheduling and strategy optimization module, connected to the dynamic risk assessment and fusion decision-making module and configured to perform multi-objective optimization based on the risk index and generate scheduling instructions adapted to the current and future risk states; and an execution and closed-loop feedback module, connected to the resource scheduling and strategy optimization module, the multi-scale risk trend prediction module, and the dynamic risk assessment and fusion decision-making module, respectively, and configured to execute the scheduling instructions, collect the system response feedback data of the energy storage power station after execution, and feed the system response feedback data back to the multi-scale risk trend prediction module and/or the dynamic risk assessment and fusion decision-making module in real time.
[0007] In one or more embodiments of the present invention, the multi-scale risk trend prediction module includes: an instantaneous anomaly detection module, configured as a gated recurrent unit network with a fused attention mechanism, which takes the multi-dimensional monitoring data of the energy storage power station over the past M1 seconds as input and outputs a continuous prediction sequence of key operating parameters for the next preset M2 seconds, as well as a confidence interval corresponding to each predicted value in the prediction sequence; a short-term trend prediction module, connected to the instantaneous anomaly detection module and configured to extract local fluctuation patterns of the monitoring data within the past preset N1 minutes using a CNN, and then learn sequence dependencies using SBiGRU-AM to output a continuous prediction sequence for the future preset N2 minutes, including a prediction sequence of key parameters and a prediction sequence of the comprehensive risk score; and a long-term trend prediction engine, connected to the short-term trend prediction module and configured to take multi-dimensional monitoring data of a preset time period as input, select or calculate the optimal weight allocation scheme from a predefined strategy library through a reinforcement learning module, dynamically adjust the weight coefficients of the basic prediction model, construct a combined prediction model adapted to the current operating state, and output a continuous prediction sequence of key operating parameters for the next N3 minutes. The prediction results for the next preset N3 minutes are compared with the actual monitoring data, and the Q-value table or neural network parameters of the reinforcement learning module are updated according to the comparison results.
[0008] In one or more embodiments of the present invention, the basic prediction model includes a convolutional neural network model, a bidirectional long short-term memory network model, and a structured bidirectional gated recurrent unit model. The convolutional neural network model is configured to extract spatial features from the monitoring data and output intermediate results for spatial dimension risk prediction. The bidirectional long short-term memory network model is configured to extract deep temporal dependency features from the monitoring data and output intermediate results for temporal dimension risk prediction. The structured bidirectional gated recurrent unit is configured to extract bidirectional contextual features from the monitoring data and output intermediate results for contextual dimension risk prediction.
[0009] In one or more embodiments of the present invention, the reinforcement learning module includes: a status awareness module, configured to analyze the spatiotemporal characteristics of the multi-dimensional monitoring data to determine the fault mode; an action selection module, configured to dynamically allocate the weights of each model in the basic prediction model based on the identified fault mode; and a reward feedback module, configured to determine the prediction accuracy based on the deviation between the prediction results and the actual monitoring data, and to adjust the weight allocation of each model in the basic prediction model based on that accuracy.
[0010] In one or more embodiments of the present invention, the dynamic risk assessment and fusion decision module includes: a real-time risk calculation module, configured to generate a current risk score $R_{\mathrm{current}}$ based on the monitoring data; a predictive feature extraction module, configured to extract key quantitative features from the prediction sequence, the key quantitative features including the risk trend intensity T, the risk change acceleration A, and the future risk peak P; and a decision-level fusion module, configured to output an $R_{\mathrm{fusion}}$ index carrying a time dimension label and a spatial location label, based on the risk trend intensity T, the risk change acceleration A, and the future risk peak P. The time dimension label marks the current time t corresponding to the index and the future time window corresponding to the prediction sequence, and the spatial location label associates the index with the sensor location of the monitoring data.
[0011] In one or more embodiments of the present invention, the real-time risk calculation module includes: a data preprocessing module, configured to normalize the monitoring data, which includes at least temperature data corresponding to temperature parameters, voltage data corresponding to voltage parameters, and gas data corresponding to gas parameters; and a weighted fusion calculation module, configured to assign a weight to each parameter based on its contribution to the risk of thermal runaway and to generate the current risk score $R_{\mathrm{current}}$ through linear weighting.
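The normalization and linear weighting described above can be sketched as follows. This is only an illustrative sketch: the parameter bounds, the weights, and the names `BOUNDS`, `WEIGHTS`, `normalize`, and `current_risk` are assumptions introduced here, since the source does not give concrete values or a normalization formula (min-max scaling is assumed).

```python
# Hypothetical (low, high) bounds per parameter; the source gives no values.
BOUNDS = {
    "temperature_c": (25.0, 80.0),  # assumed baseline and alarm temperature
    "voltage_dev_v": (0.0, 0.5),    # assumed deviation from nominal cell voltage
    "gas_ppm": (0.0, 1000.0),       # assumed gas concentration range
}

# Hypothetical weights for each parameter's contribution to thermal runaway
# risk. They sum to 1 so the score stays in the 0-1 interval.
WEIGHTS = {"temperature_c": 0.5, "voltage_dev_v": 0.2, "gas_ppm": 0.3}


def normalize(value, low, high):
    """Min-max normalize a reading to [0, 1], clipping out-of-range values."""
    if high <= low:
        raise ValueError("high must be greater than low")
    return min(1.0, max(0.0, (value - low) / (high - low)))


def current_risk(sample):
    """Linearly weight the normalized readings into a single risk score."""
    return sum(
        WEIGHTS[name] * normalize(sample[name], *BOUNDS[name])
        for name in WEIGHTS
    )


sample = {"temperature_c": 52.5, "voltage_dev_v": 0.05, "gas_ppm": 100.0}
r_current = current_risk(sample)
print(round(r_current, 2))
```

With these assumed bounds the temperature normalizes to 0.5 and the other two readings to 0.1 each, so the weighted sum is 0.30. Clipping keeps the score inside the 0-1 interval even when a sensor reads outside its assumed range.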
[0012] In one or more embodiments of the present invention, when the prediction confidence level is greater than 90% and the risk trend characteristics (T, A) are in the same direction, the decision-level fusion module increases the weight of risk trend intensity and risk change acceleration.
[0013] In one or more embodiments of the present invention, the decision-level fusion module includes: a normalization module, configured to normalize the risk trend intensity T and the risk change acceleration A, mapping them to the same 0-1 interval as the risk score $R_{\mathrm{current}}$ to obtain the standardized features $T_{\mathrm{norm}}$ and $A_{\mathrm{norm}}$; a dynamic weight fusion module, configured to dynamically adjust, based on the prediction confidence and trend consistency, the weighting coefficients α(t), β(t), and γ(t) in the calculation formula $R_{\mathrm{fusion}}(t) = \alpha(t) \cdot R_{\mathrm{current}}(t) + \beta(t) \cdot T_{\mathrm{norm}} + \gamma(t) \cdot A_{\mathrm{norm}}$; and an output module, configured to output, based on $R_{\mathrm{fusion}}$, the $R_{\mathrm{fusion}}$ index with time dimension labels and spatial location labels.
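A minimal sketch of the dynamic weight fusion follows, combining the fusion formula with the rule that the trend weights increase when the prediction confidence exceeds 90% and T and A point in the same direction. The base weights (0.6, 0.2, 0.2), the boost of 0.1, and the function names are assumptions made for illustration; the source does not specify how α(t), β(t), and γ(t) are computed.

```python
def fusion_weights(confidence, t_raw, a_raw, base=(0.6, 0.2, 0.2), boost=0.1):
    """Return (alpha, beta, gamma), which always sum to 1.

    Assumed rule: when confidence > 0.90 and the raw trend intensity and
    acceleration share a sign, move `boost` of weight from alpha to each
    of beta and gamma.
    """
    alpha, beta, gamma = base
    same_direction = t_raw * a_raw > 0
    if confidence > 0.90 and same_direction:
        alpha -= 2 * boost
        beta += boost
        gamma += boost
    return alpha, beta, gamma


def fused_risk(r_current, t_norm, a_norm, weights):
    """R_fusion = alpha * R_current + beta * T_norm + gamma * A_norm."""
    alpha, beta, gamma = weights
    return alpha * r_current + beta * t_norm + gamma * a_norm


# High confidence, rising and accelerating risk: trend terms weigh more.
w_high = fusion_weights(confidence=0.95, t_raw=0.12, a_raw=0.03)
r_high = fused_risk(0.30, t_norm=0.8, a_norm=0.5, weights=w_high)

# Lower confidence: the base weights are kept.
w_low = fusion_weights(confidence=0.80, t_raw=0.12, a_raw=0.03)
r_low = fused_risk(0.30, t_norm=0.8, a_norm=0.5, weights=w_low)

print(round(r_high, 2), round(r_low, 2))
```

The direction check uses the raw (signed) T and A, because the normalized features lie in the 0-1 interval and carry no sign. Keeping the three coefficients summing to 1 means the fused index stays on the same 0-1 scale as its inputs.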
[0014] In one or more embodiments of the present invention, the resource scheduling and strategy optimization module includes: an optimization trigger module, configured to match the real-time $R_{\mathrm{fusion}}$ index, risk positioning, and time characteristics against a policy mapping knowledge base, and to call the multi-objective optimization function corresponding to the matched policy; and a multi-objective optimization solution module, configured to take the $R_{\mathrm{fusion}}$ index and the risk as inputs and determine the optimal values of the control variables under the constraints of three priority objectives (safety, economy, and system stability), thus forming a dynamic control instruction set.
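One simple way to picture the multi-objective step is a weighted-sum scalarization over a small set of candidate control actions. The sketch below is not the patented solver: the candidate actions, their scores, the objective weights, and the name `choose_action` are all hypothetical, and the source does not state which optimization algorithm is used.

```python
# Hypothetical candidate actions with assumed scores in [0, 1]:
# (risk_reduction, economic_cost, stability_impact)
CANDIDATES = {
    "monitor_only": (0.0, 0.0, 0.0),
    "boost_cooling": (0.3, 0.2, 0.0),
    "derate_power": (0.6, 0.4, 0.2),
    "isolate_cluster": (0.9, 0.8, 0.5),
}


def choose_action(r_fusion, w_economy=0.3, w_stability=0.2):
    """Pick the action minimizing residual risk plus weighted penalties.

    Safety enters with an implicit weight of 1 (the residual risk), which is
    larger than the assumed economy and stability weights.
    """

    def total_cost(item):
        _, (reduction, econ_cost, stab_impact) = item
        residual_risk = r_fusion * (1.0 - reduction)
        return residual_risk + w_economy * econ_cost + w_stability * stab_impact

    name, _ = min(CANDIDATES.items(), key=total_cost)
    return name


for risk in (0.1, 0.3, 0.5, 0.9):
    print(risk, choose_action(risk))
```

Under these assumed scores the selected action escalates as the fused index grows, which avoids triggering the same response for risks of different urgency.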
[0015] Embodiments of the present invention also provide a risk management method for energy storage power stations, comprising the following steps: The multi-source synchronous monitoring module acquires multi-dimensional monitoring data of key battery parameters in the energy storage power station in real time. The multi-scale risk trend prediction module predicts the risk trend of the energy storage power station based on the monitoring data, and generates a prediction sequence to identify the change trend of key battery parameters and the overall risk level within a future preset time window, as well as a confidence index corresponding to the prediction sequence. The dynamic risk assessment and fusion decision-making module generates a risk index based on the predicted sequence, the corresponding confidence index, and the real-time monitoring data collected by the multi-source synchronous monitoring module. The resource scheduling and strategy optimization module performs multi-objective optimization based on the risk index to generate scheduling instructions that are adapted to the current and future risk states; The execution and closed-loop feedback module executes the scheduling instructions, collects the system response feedback data of the energy storage power station after execution, and feeds back the system response feedback data to the multi-scale risk trend prediction module and / or dynamic risk assessment and fusion decision module in real time.
[0016] Compared with existing technologies, this invention constructs a multi-scale risk trend prediction engine (integrating at least three dimensions: instantaneous anomaly detection, short-term trend prediction, and long-term evolution simulation) to achieve advanced prediction of thermal runaway risks. Furthermore, relying on a decision-level fusion algorithm, it deeply couples the prediction results of the multi-scale risk trend prediction engine with real-time diagnostic results to generate a comprehensive, forward-looking risk index covering the entire domain. Based on this risk index, and using a multi-objective optimization model that considers safety, economy, and system stability, it dynamically generates and executes precisely adapted resource scheduling strategies. This system not only significantly expands the risk warning time window but also realizes a paradigm shift from passive emergency response to proactive predictive control, effectively improving the economic efficiency of the entire lifecycle while strengthening the safety defenses of energy storage systems.
Attached Figure Description
[0017] Figure 1 is a structural block diagram of a risk management system for an energy storage power station according to an embodiment of the present invention; Figure 2 is a flowchart of the dynamic risk assessment and fusion decision-making module according to an embodiment of the present invention; Figure 3 is a flowchart of the resource scheduling and strategy optimization module according to an embodiment of the present invention.
Detailed Implementation
[0018] The specific embodiments of the present invention will now be described in detail with reference to the accompanying drawings, but it should be understood that the scope of protection of the present invention is not limited to the specific embodiments.
[0019] Unless otherwise expressly stated, throughout the specification and claims, the term "comprising" or its variations such as "including" or "comprises" shall be understood to include the stated elements or components without excluding other elements or other components.
[0020] As shown in Figure 1, this invention discloses a risk management and control system for energy storage power stations. By constructing a multi-scale risk trend prediction engine (integrating at least three dimensions: instantaneous anomaly detection, short-term trend prediction, and long-term evolution simulation), it achieves advanced prediction of thermal runaway risks. Furthermore, relying on a decision-level fusion algorithm, it deeply couples the prediction results of the multi-scale risk trend prediction engine with real-time diagnostic results to generate a comprehensive forward-looking risk index covering the entire domain. Based on this risk index, and using a multi-objective optimization model that considers safety, economy, and system stability, it dynamically generates and executes precisely adapted resource scheduling strategies. This system not only significantly expands the risk warning time window but also realizes a paradigm shift from passive emergency response to proactive predictive control, effectively improving the economic efficiency of the entire lifecycle operation while strengthening the safety defenses of the energy storage system.
[0021] Specifically, the energy storage power station risk management system includes a multi-source synchronous monitoring module, a multi-scale risk trend prediction module, a dynamic risk assessment and fusion decision-making module, a resource scheduling and strategy optimization module, and an execution and closed-loop feedback module. The multi-source synchronous monitoring module is configured to acquire multi-dimensional monitoring data of key battery parameters within the energy storage power station in real time through a multi-dimensional sensing unit. The multi-scale risk trend prediction module, connected to the multi-source synchronous monitoring module, is configured to predict the risk trend of the energy storage power station based on the monitoring data, generating a prediction sequence that identifies the changing trends of key battery parameters and the overall risk level within a future prediction time window, as well as a confidence index corresponding to the prediction sequence. The dynamic risk assessment and fusion decision-making module, connected to both the multi-source synchronous monitoring module and the multi-scale risk trend prediction module, is configured to generate a comprehensive forward-looking risk index based on the prediction sequence, the corresponding confidence index, and the real-time monitoring data collected by the multi-source synchronous monitoring module. The resource scheduling and strategy optimization module, connected to the dynamic risk assessment and fusion decision-making module, is configured to perform multi-objective optimization based on the risk index, generating scheduling instructions adapted to the current and future risk states. The execution and closed-loop feedback module is connected to the resource scheduling and strategy optimization module, the multi-scale risk trend prediction module, and the dynamic risk assessment and fusion decision-making module, respectively. It is configured to execute the scheduling instructions, collect the system response feedback data of the energy storage power station after execution, and feed the system response feedback data back to the multi-scale risk trend prediction module and/or the dynamic risk assessment and fusion decision-making module in real time.
[0022] The multi-source synchronous monitoring module includes, but is not limited to, a temperature sensing unit for detecting temperature, an electrical parameter sensing unit for detecting voltage and / or current, a gas detection unit for real-time detection of specific gas components and concentrations, and a deformation sensing unit for detecting battery expansion or deformation. The temperature sensing unit may employ, but is not limited to, distributed fiber optic temperature sensors to achieve wide-range, long-distance temperature monitoring; the electrical parameter sensing unit may include, but is not limited to, a BMS (Battery Management System); the gas detection unit may include, but is not limited to, a laser aerosol detection array to improve detection sensitivity in complex environments; and the deformation sensing unit may include, but is not limited to, a displacement sensor.
[0023] The multi-scale risk trend prediction module includes an instantaneous anomaly detection module, a short-term trend prediction module, and a long-term evolution simulation module. The instantaneous anomaly detection module is configured as a gated recurrent unit network with a fused attention mechanism (SBiGRU-AM). It takes multi-dimensional monitoring data from the energy storage power station over the past M1 seconds as input and outputs a continuous prediction sequence of key operating parameters for the next preset M2 seconds, along with a confidence interval corresponding to each predicted value in the prediction sequence. The confidence interval quantifies the reliability of the prediction results and is expressed as a numerical fluctuation range. This range is calculated through uncertainty analysis of the prediction model and directly reflects the dispersion of the predicted values, providing a quantifiable basis for confidence judgment in subsequent risk decisions. Examples are as follows. Temperature parameter scenario: predicted temperature = 45℃ ± 2℃ (95% confidence interval), meaning that there is a 95% probability that the future temperature at the monitoring point of the energy storage power station will lie in the range of 43℃ to 47℃. Voltage parameter scenario: predicted voltage = 3.65V ± 0.05V, indicating that the predicted fluctuation range of the future battery cell voltage is 3.60V to 3.70V. Risk score scenario: if the output is a normalized risk score between 0 and 1, the confidence interval can be expressed as 0.35 ± 0.10, corresponding to a risk score fluctuation range of 0.25 to 0.45.
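As an illustration of how a "value ± range" interval of this kind might be derived, the sketch below computes a 95% interval from a set of repeated predictions for one future time point, assuming approximately normal prediction errors. The source does not specify its uncertainty analysis, so the sampling approach, the sample values, and the function name `confidence_interval` are assumptions.

```python
from statistics import NormalDist, mean, stdev


def confidence_interval(samples, level=0.95):
    """Return (center, half_width) assuming roughly normal prediction errors."""
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    # Two-sided z-score, approximately 1.96 for a 95% level.
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    return mean(samples), z * stdev(samples)


# Hypothetical repeated predictions (degrees Celsius) for one future time
# point, e.g. from several stochastic forward passes of the model.
samples = [44.0, 46.0, 45.0, 44.0, 46.0]
center, half_width = confidence_interval(samples)
low, high = center - half_width, center + half_width
print(f"{center:.1f} +/- {half_width:.1f} (range {low:.1f} to {high:.1f})")
```

For these assumed samples the interval comes out close to the 45℃ ± 2℃ case described above. A wider spread in the samples yields a wider interval, which is what later allows the fusion step to discount low-confidence predictions.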
[0024] The instantaneous anomaly detection engine of this invention employs a gated recurrent unit network with fused attention mechanism (SBiGRU-AM) as its core prediction model, possessing the ability to capture sudden anomalies at the millisecond to second level and exhibiting high-precision short-term prediction performance. This engine uses multi-dimensional monitoring data from the past M1 seconds (e.g., 60-100 seconds) of the energy storage power station as input, covering real-time acquisition data of key operating parameters such as voltage, temperature, and current. Its built-in attention mechanism, through dynamic weight allocation of time-series data, can accurately focus on sudden anomalies at the millisecond to second level, such as instantaneous voltage drops and abnormal temperature spikes, effectively filtering out invalid noise data and improving the sensitivity and response speed of anomaly feature identification.
[0025] The short-term trend prediction module is configured with a CNN-SBiGRU-AM fusion network architecture. For sequential data such as energy storage power station monitoring data, which exhibits both local fluctuations and strong temporal correlations, a hybrid prediction scheme of "local feature extraction - temporal dependency modeling" is constructed. This module takes multi-dimensional monitoring data from the past N1 minutes as input. First, it extracts local fluctuation patterns from the data using a convolutional neural network (CNN) to capture short-term anomalies. Then, it uses a gated recurrent unit network (SBiGRU-AM) with an attention mechanism to deeply learn the long-term temporal dependencies of the data and uncover trend evolution patterns. Through the complementary functions and collaborative computation of the two models, the module ultimately outputs a continuous temporal prediction sequence of key operating parameters for the next N2 minutes, along with a confidence interval corresponding to each predicted value in the sequence. This provides high-precision, quantifiable decision-making basis for the operational trend analysis and risk prediction of energy storage power stations. The continuous prediction sequence here is a set of predicted data points generated by the short-term trend prediction module and arranged in chronological order, covering every time point (e.g., one point per second or every 5 seconds) for the next N2 minutes from the current moment. This sequence is not a single value, but rather a continuous trajectory depicting the changes of key parameters or comprehensive risk indicators over time.
[0026] The continuous prediction sequences here include, but are not limited to, key parameter prediction sequences and comprehensive risk score prediction sequences. The key parameter prediction sequence is a direct trend prediction output of the raw physical quantities collected by the monitoring module. Each data point in the sequence corresponds to a predicted parameter value at a future time, directly related to the raw physical quantity, and serves as the fundamental quantitative basis for judging the equipment's operating status. For example, the temperature prediction sequence is: [T(t+1min)=48.2℃, T(t+2min)=50.1℃, T(t+3min)=52.5℃, ..., T(t+5min)=58.0℃]. The temperature prediction sequence is the most direct basis for judging the heat accumulation trend. Through continuous time-dimension numerical changes, the rate of temperature rise can be accurately perceived, providing core data support for early warning of thermal runaway risks.
[0027] The comprehensive risk score prediction sequence is a quantitative index sequence characterizing the comprehensive risk level of equipment, generated through multi-parameter fusion calculation within the model based on key parameter predictions. It serves as the direct input to subsequent decision-level fusion modules. Risk scores are typically normalized (range 0-1), with higher values representing higher comprehensive risk levels. For example, consider the normalized risk score sequence: $[R_{\mathrm{pred}}(t+1\,\mathrm{min}) = 0.35,\ R_{\mathrm{pred}}(t+2\,\mathrm{min}) = 0.45,\ R_{\mathrm{pred}}(t+3\,\mathrm{min}) = 0.60,\ \ldots,\ R_{\mathrm{pred}}(t+5\,\mathrm{min}) = 0.85]$. In this sequence, the risk score increases from 0.35 to 0.85 over time, intuitively reflecting the trend of a continuous increase in the overall risk level, providing quantifiable and directly applicable risk assessment results for operation and maintenance decisions.
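The trend features used by the decision-level fusion module (risk trend intensity T, risk change acceleration A, and future risk peak P) could be extracted from such a sequence along the following lines. The definitions here are assumptions chosen for illustration: T as the least-squares slope, A as the change in segment slope over time, and P as the maximum. The source names the features but does not give formulas. Only the four points stated in the example are used.

```python
def trend_features(times, scores):
    """Return (T, A, P) for a predicted risk sequence.

    T: least-squares slope of score versus time (per minute).
    A: change between the first and last segment slopes, divided by the
       time between those segments' midpoints (per minute squared).
    P: maximum predicted score.
    """
    n = len(times)
    if n < 3 or n != len(scores):
        raise ValueError("need at least three matching (time, score) points")

    t_mean = sum(times) / n
    s_mean = sum(scores) / n
    sxx = sum((t - t_mean) ** 2 for t in times)
    sxy = sum((t - t_mean) * (s - s_mean) for t, s in zip(times, scores))
    trend = sxy / sxx

    slopes = [
        (scores[i + 1] - scores[i]) / (times[i + 1] - times[i])
        for i in range(n - 1)
    ]
    mids = [(times[i] + times[i + 1]) / 2 for i in range(n - 1)]
    accel = (slopes[-1] - slopes[0]) / (mids[-1] - mids[0])

    return trend, accel, max(scores)


# The points given in the example (minutes ahead, normalized risk score).
times = [1.0, 2.0, 3.0, 5.0]
scores = [0.35, 0.45, 0.60, 0.85]
T, A, P = trend_features(times, scores)
print(round(T, 4), round(A, 4), P)
```

For this sequence both T and A are positive, meaning the risk is rising and the rise is not slowing, which is the "same direction" situation in which the fusion module gives the trend terms more weight.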
[0028] In practical implementation, a CNN (Convolutional Neural Network) is used to extract local fluctuation patterns from the past N1 minutes of monitoring data. These patterns include short-term abrupt changes, periodic small oscillations, local peaks / troughs, and the shape characteristics of abnormal fluctuations. Short-term abrupt changes include instantaneous voltage drops and abnormal temperature spikes; periodic small oscillations include regular voltage fluctuations of ±0.05V every 30 seconds during charging and discharging; local peaks / troughs include the highest temperature and lowest voltage points within a 5-minute period; and the shape characteristics of abnormal fluctuations include waveforms such as a sudden rise followed by a rapid fall, or a slow rise followed by stabilization. Traditional recurrent neural networks (such as GRUs) often suffer from reduced efficiency in capturing key features due to redundant information in the sequence when processing long-term raw data. They may even experience vanishing gradients or information forgetting, making it difficult to focus accurately on the details of local fluctuations. In contrast, CNNs, through their sliding kernel scanning mechanism, can quickly pinpoint core features within local time windows, much like taking close-up shots of localized areas in time-series data. They slide across the data with a fixed stride, automatically filtering redundant information through convolution operations to extract key fluctuation patterns such as voltage spikes and temperature surges. These scattered local features are then transformed into more representative, compact feature sequences (e.g., a set of high-dimensional vectors representing core features such as fluctuation intensity, frequency, and trend within each time window). This ability to focus on local features and encode them compactly compensates for the shortcomings of recurrent neural networks in processing localized abrupt signals, providing more efficient and targeted input data for subsequent time-series modeling.
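The sliding-kernel idea can be illustrated with a single hand-written one-dimensional filter. This is a simplified sketch, not the network in the invention: a trained CNN learns many kernels from data, whereas the two-tap difference kernel, the sample temperatures, and the function name `conv1d` below are assumptions chosen to show how a kernel responds strongly to a sudden rise and weakly to small oscillations.

```python
def conv1d(signal, kernel, stride=1):
    """Slide `kernel` across `signal` (no padding) and return the responses.

    As in most deep learning libraries, this is cross-correlation: the
    kernel is not flipped before being applied.
    """
    k = len(kernel)
    return [
        sum(signal[start + j] * kernel[j] for j in range(k))
        for start in range(0, len(signal) - k + 1, stride)
    ]


# Hypothetical temperature samples with small oscillations and one sudden
# rise of about 2 degrees between index 3 and index 4.
temps = [40.0, 40.1, 40.0, 40.1, 42.1, 42.0, 42.1, 42.0]

# Assumed difference kernel: responds to step changes between neighbors.
edge_kernel = [-1.0, 1.0]

response = conv1d(temps, edge_kernel)
spike_index = max(range(len(response)), key=lambda i: response[i])
print(spike_index, round(response[spike_index], 2))
```

The response sequence is one element shorter than the input and is near zero everywhere except at the step, so the abrupt change is localized in a compact feature sequence that a recurrent layer can then consume.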
[0029] The SBiGRU-AM fusion network, based on local fluctuation features extracted by the CNN, achieves accurate prediction of future trends by modeling the temporal causal dependencies between fluctuations at multiple time points. Specifically, the CNN transforms raw monitoring data into a structured sequence of local features (e.g., a set of feature vectors representing the fluctuation pattern for each minute within the past N1 minutes); the SBiGRU (structured bidirectional gated recurrent unit) learns the correlation patterns of historical fluctuations, capturing temporal causal relationships such as "a sudden temperature rise of 2°C in the 2nd minute usually leads to a slow drop of 1°C after 5 minutes." Its bidirectional structure can simultaneously construct a forward evolution model from "past to future" and a reverse verification logic from "future to past" (i.e., using subsequent data to backtrack and verify the actual impact of previous fluctuations), further improving the accuracy of trend judgment.
[0030] The attention mechanism (AM) focuses on key features through dynamic weight allocation: different local fluctuations have significantly different predictive value for future trends. For example, a sudden temperature rise has a much greater impact on the risk of thermal runaway than small voltage oscillations. AM assigns differentiated weights to each local feature, guiding the model to prioritize key fluctuations that play a decisive role in trend evolution, such as sudden temperature rises and abnormal peaks, while filtering out irrelevant noise such as regular voltage oscillations, significantly improving prediction accuracy. Finally, the model outputs a continuous prediction sequence for the next N2 minutes (e.g., the predicted temperature and voltage values for each minute within the next 10 minutes, forming a smooth trend curve), along with confidence intervals to quantify the reliability of the results. A specific example follows. CNN feature extraction stage: feature extraction is performed on the temperature data of the past 10 minutes (1 sampling point per second) to identify three key local fluctuation features: "a sudden increase of 2℃ in the 2nd minute", "a peak of 43℃ in the 5th minute", and "an oscillation of 0.5℃ per minute from the 7th to the 9th minute". SBiGRU-AM modeling stage: the following temporal patterns are learned: after a sudden temperature rise of 2℃, the temperature usually drops slowly by 1℃ within 5 minutes; the oscillation pattern disappears after 3 minutes; and after a peak of 43℃, the temperature remains around 42℃ if there are no abnormalities. At the same time, weights are assigned to the features: sudden temperature rise (weight 0.6) > peak value (weight 0.3) > normal oscillation (weight 0.1). Prediction output stage: the predicted temperature sequence for the next 5 minutes is output: [42.8℃, 42.5℃, 42.2℃, 42.0℃, 41.9℃], with a corresponding confidence interval of ±0.3℃ (95% confidence level).
The short-term trend prediction module, employing a fusion architecture, offers significant technical advantages compared to a single model: Improved prediction accuracy: The collaborative mechanism of CNN filtering noise, SBiGRU capturing long-term temporal dependencies, and AM focusing on key features is perfectly adapted to the operating data characteristics of energy storage power stations, which are characterized by "many local fluctuations and strong temporal correlations". Computational efficiency optimization: CNN pre-compresses the data dimension, effectively reducing the computational load of SBiGRU and significantly improving the prediction response speed; Enhanced interpretability: By using attention weights, the core fluctuation characteristics that affect the prediction results can be traced back (such as the temperature rise in this prediction mainly originating from the sudden increase in the second minute), providing a clear technical basis for subsequent risk tracing and root cause analysis.
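The attention weighting and its use for tracing which fluctuation drove a prediction can be sketched as a softmax over per-feature relevance scores. The scores and feature vectors below are hypothetical values chosen so that the resulting weights land near the 0.6 / 0.3 / 0.1 split in the example above; in a trained SBiGRU-AM network the scores would be computed from the hidden states, which the source does not detail.

```python
import math


def softmax(scores):
    """Convert raw relevance scores into weights that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(features, scores):
    """Return attention weights and the weighted sum of feature vectors."""
    weights = softmax(scores)
    dim = len(features[0])
    context = [
        sum(w * vec[d] for w, vec in zip(weights, features))
        for d in range(dim)
    ]
    return weights, context


labels = ["sudden rise", "peak", "oscillation"]
# Hypothetical two-dimensional feature vectors (intensity, duration).
features = [[2.0, 1.0], [1.0, 1.0], [0.5, 3.0]]
# Hypothetical relevance scores for the three local fluctuations.
scores = [2.0, 1.3, 0.2]

weights, context = attend(features, scores)
top = labels[max(range(len(weights)), key=lambda i: weights[i])]
print([round(w, 2) for w in weights], top)
```

Reading off the largest weight identifies the fluctuation that contributed most to the context vector, which is the basis of the interpretability advantage described above.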
[0031] The long-term evolution simulation module is configured to dynamically assess and predict long-term chronic risks based on historical monitoring data and reinforcement learning mechanisms. This module takes multi-dimensional monitoring data from a preset time period as input, selects or calculates the optimal weight allocation scheme from a predefined strategy library through a built-in reinforcement learning module, dynamically adjusts the weight coefficients of each basic prediction model, constructs a combined prediction model adapted to the current operating state, and outputs a continuous predicted value sequence of key operating parameters for the next N3 minutes. This provides a quantitative data foundation for risk analysis and allows for judgment of risk evolution trends based on the predicted sequence. By analyzing the direction, rate, and correlation of parameter changes, it outputs risk level assessment conclusions with direct decision-making value, such as the battery pack consistency risk level rising from low to medium within the next 20 minutes.
[0032] At the same time, the system will continuously compare the predicted sequence of key operating parameters for the next N3 minutes with the subsequent actual monitoring data in real time, and update the Q-value table or neural network parameters of the reinforcement learning module based on the deviation results to achieve continuous iterative optimization of model performance.
[0033] This long-term evolution model module enables a value upgrade from numerical prediction to risk interpretation: the predicted sequence is the foundational data for risk analysis, while the judgment of risk evolution trends and chronic risk assessment are the core products serving operation and maintenance decisions. Through dynamic participation and closed-loop optimization using reinforcement learning, the module can not only accurately predict future parameter changes but also deeply interpret the risk implications behind these changes. This provides a full-cycle, quantifiable management approach for long-term chronic risks in energy storage power stations (such as battery consistency degradation and performance degradation), truly achieving intelligent decision support for preventative operation and maintenance.
[0034] In practical implementation, the core architecture of the long-term evolution simulation module is built on a reinforcement learning module (which runs reinforcement learning algorithms). Through a three-step process of dynamic weight allocation, closed-loop verification feedback, and continuous model optimization, it achieves adaptive prediction of different failure modes and precise management of long-term chronic risks. The module incorporates three basic models: CNN (Convolutional Neural Network), BiNLSTM (Bidirectional Long Short-Term Memory Network), and SBiGRU (Structured Bidirectional Gated Recurrent Unit). It dynamically allocates model weights using the Q-Learning (reinforcement learning) algorithm, enabling the system to adapt to different operating scenarios and failure modes. Specifically, first, dynamic weight allocation is performed: the system takes multi-dimensional monitoring data from a past preset duration as input, selects or calculates the optimal weight scheme from the predefined strategy library through the Q-Learning algorithm, adjusts the weight coefficients of the three basic models in real time, and builds a combined prediction model adapted to the current operating state. Secondly, closed-loop verification feedback is carried out: the deviation between the predicted sequence of key operating parameters for the next N3 minutes and the subsequent actual monitoring data is analyzed to quantify the accuracy of the prediction results. Finally, continuous model optimization is performed: the Q-value table of the Q-Learning algorithm or the parameters of the neural network are updated based on the verification results, iteratively improving the accuracy and adaptability of long-term predictions.
[0035] Furthermore, Q-Learning is a reinforcement learning algorithm that dynamically assigns weights to three base models through interactive learning with the operating environment. For example, it assigns 40% weight to CNN, 35% to BiNLSTM, and 25% to SBiGRU, allowing the entire system to adapt to different fault modes. Specifically, CNN is used to extract spatial features and analyze spatial differences between individual cells within the battery pack, such as generally higher temperatures in a certain row of cells or poor voltage consistency among several cells; BiNLSTM (Bidirectional Long Short-Term Memory Network) is used to capture deep temporal dependencies, learning long-term evolutionary patterns, such as the decay trend of battery capacity with charge-discharge cycles and temperature variation patterns with the seasons; and SBiGRU (Structured Bidirectional Gated Recurrent Unit) is used to model bidirectional contextual relationships, simultaneously analyzing the influence of past data and feedback on future trends, such as the impact of charging strategy adjustments on subsequent battery consistency.
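The weighted combination of the three base models can be sketched as follows. This is a minimal illustration that assumes each base model has already produced a prediction sequence of the same length; the function name `combine_predictions` and the sample voltage values are hypothetical, and only the 40% / 35% / 25% weights come from the example above.

```python
def combine_predictions(predictions, weights):
    """Weighted sum of per-model prediction sequences, step by step.

    predictions: dict mapping model name -> list of predicted values
    weights:     dict mapping model name -> weight (expected to sum to 1)
    """
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("model weights must sum to 1")
    lengths = {len(seq) for seq in predictions.values()}
    if len(lengths) != 1:
        raise ValueError("all prediction sequences must have the same length")
    horizon = lengths.pop()
    return [
        sum(weights[name] * predictions[name][k] for name in weights)
        for k in range(horizon)
    ]

# Example weights from the text: 40% CNN, 35% BiNLSTM, 25% SBiGRU.
weights = {"CNN": 0.40, "BiNLSTM": 0.35, "SBiGRU": 0.25}
# Hypothetical per-model voltage predictions (V) for the next three steps.
predictions = {
    "CNN": [3.15, 3.14, 3.13],
    "BiNLSTM": [3.15, 3.13, 3.12],
    "SBiGRU": [3.14, 3.13, 3.11],
}
combined = combine_predictions(predictions, weights)
print([round(v, 4) for v in combined])
```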
[0036] The reinforcement learning module includes a state awareness module, an action selection module, and a reward feedback module. The specific working process of the reinforcement learning module is as follows: First, the system uses a state awareness module to detect the current operating status and fault mode characteristics in real time. The system analyzes the spatiotemporal characteristics of the input data to determine the type of potential risk. For example, when the voltage difference between multiple battery cells continues to widen, it is marked as "battery inconsistency mode"; when the capacity continues to decrease with charge and discharge cycles, it is marked as "capacity decay mode"; and when the local temperature continues to rise and spreads, it is marked as "heat accumulation mode".
[0037] Secondly, action selection is performed through the action selection module: the model weights are dynamically allocated based on the identified fault mode to maximize the predictive performance of the combined model. For example, for the battery inconsistency mode, Q-Learning increases the CNN weight (e.g., to 50%) and decreases the BiNLSTM weight (e.g., to 25%), because CNN is best at capturing spatial distribution differences; for the capacity decay mode, Q-Learning increases the BiNLSTM weight (e.g., to 50%) and decreases the CNN weight (e.g., to 20%), because BiNLSTM is best at analyzing long-term time-series trends; for the heat accumulation mode, Q-Learning increases the SBiGRU weight (e.g., to 45%) while retaining CNN (30%) to monitor local temperature differences, because SBiGRU can combine preceding and subsequent data to analyze thermal evolution.
[0038] Finally, the reward feedback module provides reward feedback, that is, the deviation between the prediction result and the actual data is used as the reward signal. If the prediction accuracy improves, the current weight strategy is strengthened; if the deviation is large, the weight allocation scheme is adjusted to achieve continuous self-optimization of the model.
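The state / action / reward loop described in the three steps above can be sketched with a tabular Q-Learning agent. This is a minimal illustration under stated assumptions, not the patented implementation: the class name, the learning rate, discount factor, and exploration rate are hypothetical, the reward is assumed to be the negative mean absolute prediction deviation, and the weights not given in the text (the third weight of each scheme) are filled in only so that each scheme sums to 1.

```python
import random
from collections import defaultdict

# Candidate weight schemes (the actions). Values marked "assumed" are not in the text.
SCHEMES = [
    {"CNN": 0.50, "BiNLSTM": 0.25, "SBiGRU": 0.25},  # SBiGRU share assumed
    {"CNN": 0.20, "BiNLSTM": 0.50, "SBiGRU": 0.30},  # SBiGRU share assumed
    {"CNN": 0.30, "BiNLSTM": 0.25, "SBiGRU": 0.45},  # BiNLSTM share assumed
]

class WeightSchemeAgent:
    """Tabular Q-Learning over (fault mode, weight scheme index)."""

    def __init__(self, n_actions, lr=0.1, discount=0.9, epsilon=0.1, seed=0):
        self.n_actions = n_actions
        self.lr = lr
        self.discount = discount
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.q = defaultdict(lambda: [0.0] * n_actions)

    def select(self, state):
        """Epsilon-greedy action selection for the detected fault mode."""
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(self.n_actions)
        row = self.q[state]
        return max(range(self.n_actions), key=row.__getitem__)

    def update(self, state, action, reward, next_state):
        """Standard Q-Learning temporal-difference update."""
        td_target = reward + self.discount * max(self.q[next_state])
        self.q[state][action] += self.lr * (td_target - self.q[state][action])

def reward_from_deviation(predicted, actual):
    """Assumed reward: negative mean absolute deviation (smaller error, larger reward)."""
    return -sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)

agent = WeightSchemeAgent(n_actions=len(SCHEMES), epsilon=0.0)
state = "battery_inconsistency"
action = agent.select(state)
reward = reward_from_deviation(predicted=[3.12], actual=[3.13])
agent.update(state, action, reward, next_state=state)
print(action, agent.q[state])
```

A scheme that produces a large deviation receives a lower Q-value and is chosen less often for that fault mode, which corresponds to the "adjust the weight allocation scheme" branch of the reward feedback.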
[0039] The following detailed explanation uses monitoring data from an energy storage power station as an example: over the past 30 minutes, the voltage of module A in the battery pack slowly decreased from 3.2V to 3.15V, while the voltages of the other modules remained stable at around 3.2V. First, state awareness is performed: the data features are analyzed and the situation is marked as "battery inconsistency mode". Secondly, action selection is performed: the Q-Learning algorithm sets the weights to 50% for CNN, 35% for BiNLSTM, and 15% for SBiGRU to enhance the ability to capture spatial differences and analyze short-term temporal trends. Specifically, CNN extracts spatial features, identifying the voltage difference between module A and the other modules and locating the abnormal cells; BiNLSTM captures temporal trends, finding that the voltage of module A decreased continuously within 30 minutes and judging this to be chronic degradation rather than an instantaneous fluctuation; SBiGRU models bidirectional correlation, combining historical charge and discharge data to verify that this downward trend matches the typical characteristics of battery consistency degradation.
[0040] Finally, a predictive output is made, indicating that the voltage of module A will continue to drop to 3.12V within the next 20 minutes, while the battery inconsistency risk level is assessed as medium, providing accurate risk warnings and decision-making basis for the future.
[0041] As shown in Figure 2, the dynamic risk assessment and fusion decision-making module is the core decision-making unit of the intelligent operation and maintenance system of the energy storage power station. It establishes data connections with the multi-source synchronous monitoring module and the multi-scale risk trend prediction module, respectively. Its core function is to deeply fuse multi-source real-time monitoring data with multi-scale prediction sequences and output a comprehensive forward-looking risk index R_fusion (Risk Fusion Index) that combines temporal continuity and spatial positioning accuracy. Specifically, the module transforms real-time monitoring data streams and multi-scale prediction sequences into R_fusion through three layers of logic: real-time status diagnosis, future trend prediction, and dynamic weight fusion. In the real-time status diagnosis layer, real-time health status scanning is performed based on multi-source synchronous monitoring data to accurately identify current operational anomalies and potential risk points; in the future trend prediction layer, the output of the multi-scale risk trend prediction module is combined to analyze the evolution direction, rate, and scope of impact of risks; in the dynamic weight fusion layer, the fusion weights of real-time data and predicted data are dynamically allocated according to the current risk level and evolution trend to generate an R_fusion index that is both timely and forward-looking. In practical implementation, the dynamic risk assessment and fusion decision-making module receives or acquires the monitoring data and the multi-scale prediction output sequences.
The multi-dimensional real-time monitoring data originates from the full-dimensional data collected by the multi-source synchronous monitoring module, including thermal management (temperature, rate of temperature rise), electrical management (voltage difference, current), gas detection (CO concentration, etc.), and mechanical sensing (strain). The multi-scale risk prediction sequence, generated by the multi-scale risk trend prediction module, is a forward-looking risk assessment result covering continuous prediction values across different time dimensions. It includes short-term warning sequences (1-5 minutes) for real-time risk early warning and medium-term trend sequences (10-30 minutes) for risk evolution analysis, uniformly denoted as the sequence R_pred(t+1), R_pred(t+2), ..., R_pred(t+n), where t is the current time, n is the time step corresponding to the prediction duration, and each R_pred value represents the comprehensive risk prediction score at the corresponding time.
[0042] The dynamic risk assessment and fusion decision module includes a real-time risk calculation module, a predictive feature extraction module, and a decision-level fusion module.
[0043] The real-time risk calculation module is used to generate the current instantaneous risk score R_current. In practical implementation, the real-time risk calculation module first normalizes the monitoring data, for example by standardizing different types of sensor readings to the 0-1 range. The normalized indices include, but are not limited to, the temperature anomaly index (IT), the voltage consistency anomaly index (IV), and the gas concentration anomaly index (IG). For example, the temperature anomaly index IT combines the ratio of the current highest temperature to its threshold and the ratio of the temperature rise rate to its threshold, taking the maximum of the two as the comprehensive index (IT = max(current temperature / temperature threshold, temperature rise rate / rate threshold)).
[0044] The voltage consistency anomaly index IV is based on the ratio of the maximum voltage difference between individual cells to a threshold (e.g., IV = current maximum voltage difference / voltage difference threshold). The gas concentration anomaly index IG is based on the ratio of the critical gas concentration to a threshold (e.g., IG = current CO concentration / CO threshold). Of course, in other embodiments, normalization can be applied to other parameter data, such as adding the current mutation index II, mechanical strain index IS, etc.
[0045] Finally, a weighted fusion calculation is performed: weights are assigned to each parameter according to its contribution to the risk of thermal runaway (e.g., temperature weight WT=0.6, voltage weight WV=0.25, gas weight WG=0.15), and the current risk score is generated through linear weighting: R_current = W_T×I_T + W_V×I_V + W_G×I_G + W_I×I_I + W_S×I_S. For example, when IT=0.92, IV=0.75, and IG=0.6, substituting the weights (with the optional current and strain terms omitted) gives R_current = 0.6×0.92 + 0.25×0.75 + 0.15×0.6 ≈ 0.83 (0 represents no risk, 1 represents extremely high risk), providing a real-time risk benchmark for fusion decision-making.
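The linear weighting above can be checked with a short Python sketch. The function names, the threshold values in the usage example, and the clamping of each index to the 0-1 range are assumptions for illustration; the weights and index values are those of the worked example in the text.

```python
def clamp01(x):
    """Assumed normalization guard: keep an index inside the 0-1 range."""
    return max(0.0, min(1.0, x))

def temperature_index(temp, temp_threshold, rise_rate, rate_threshold):
    """IT = max(temperature / threshold, rise rate / rate threshold)."""
    return clamp01(max(temp / temp_threshold, rise_rate / rate_threshold))

def current_risk(indices, weights):
    """R_current as the weighted sum of anomaly indices; missing indices count as 0."""
    return sum(w * indices.get(name, 0.0) for name, w in weights.items())

# Weights and indices from the worked example (current and strain terms omitted).
weights = {"T": 0.60, "V": 0.25, "G": 0.15}
indices = {"T": 0.92, "V": 0.75, "G": 0.60}
r_current = current_risk(indices, weights)
print(round(r_current, 2))  # 0.83

# Hypothetical thresholds: 55.2 against a 60 limit gives IT of about 0.92.
print(round(temperature_index(55.2, 60.0, 0.5, 1.0), 2))
```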
[0046] The core task of the predictive feature extraction module is to mine key evolutionary patterns from the multi-scale risk prediction sequences. By extracting three core features (risk trend intensity T, risk change acceleration A, and future risk peak P), it quantifies the direction of future risk evolution, the rate of deterioration, and the extreme scenario. The risk trend intensity T reflects the overall trend of risk change within the prediction window, and is calculated as follows:
$$T = \frac{R_{\mathrm{pred}}(t+n) - R_{\mathrm{pred}}(t+1)}{(t+n) - (t+1)}$$
[0047] where R_pred(t+1) is the predicted risk value at the start of the prediction window, R_pred(t+n) is the predicted risk value at the end of the prediction window, and (t+n)-(t+1) is the prediction time span (in minutes). If T > 0, the risk is increasing; if T < 0, the risk is decreasing. The larger |T| is, the faster the risk changes and the more significant the trend.
[0048] The risk change acceleration A is used to capture the nonlinear characteristics of risk trends and determine whether the risk is accelerating, developing at a constant rate, or slowing down. It is calculated using the second-order difference method. First-order difference (rate of risk change between adjacent time points):
$$\Delta R(t+k) = R_{\mathrm{pred}}(t+k+1) - R_{\mathrm{pred}}(t+k), \quad k = 1, \dots, n-1$$
[0049] Second-order difference (rate of change of the speed of risk change):
$$\Delta^{2} R(t+k) = \Delta R(t+k+1) - \Delta R(t+k), \quad k = 1, \dots, n-2$$
[0050] Finally, the average or maximum value of the second-order differences is taken as the acceleration characteristic A. If A > 0, it indicates that the rate of risk deterioration is accelerating; if A ≈ 0, it indicates that the risk is changing at an approximately uniform rate; if A < 0, it indicates that the rate of risk deterioration is slowing down.
[0051] The future risk peak P is used to capture the worst-case risk situation that may occur within the forecast window, and is obtained directly as the maximum value of the forecast sequence:
$$P = \max_{1 \le k \le n} R_{\mathrm{pred}}(t+k)$$
[0052] This feature provides a quantitative basis for extreme risk warnings, ensuring that operation and maintenance decisions fully cover potential highest-risk scenarios and avoid missing extreme events.
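The three features can be computed from a prediction sequence as in the sketch below. It assumes a uniformly sampled sequence R_pred(t+1), ..., R_pred(t+n) with a step of one minute by default; the function names and the sample sequence are hypothetical.

```python
def trend_intensity(seq, step_minutes=1.0):
    """T: slope between the first and last predicted risk values."""
    if len(seq) < 2:
        raise ValueError("need at least two predictions")
    span = (len(seq) - 1) * step_minutes
    return (seq[-1] - seq[0]) / span

def acceleration(seq, mode="mean"):
    """A: mean (or max) of the second-order differences of the sequence."""
    if len(seq) < 3:
        raise ValueError("need at least three predictions")
    first = [b - a for a, b in zip(seq, seq[1:])]
    second = [b - a for a, b in zip(first, first[1:])]
    return max(second) if mode == "max" else sum(second) / len(second)

def future_peak(seq):
    """P: worst-case predicted risk inside the window."""
    return max(seq)

# Hypothetical 5-minute prediction sequence with an accelerating risk.
r_pred = [0.40, 0.45, 0.52, 0.61, 0.72]
T = trend_intensity(r_pred)
A = acceleration(r_pred)
P = future_peak(r_pred)
print(round(T, 3), round(A, 3), P)
```

For this sequence T and A are both positive, so the risk is rising and the rise is speeding up.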
[0053] The decision-level fusion module is used to generate the comprehensive forward-looking risk index R_fusion. The decision-level fusion module is the core decision-making hub of the dynamic risk assessment module. Through a dynamic adaptive fusion algorithm, it deeply couples the current real-time risk status with future trend characteristics, generating a comprehensive risk index that is both timely and forward-looking and providing an accurate quantitative basis for operational decisions. The decision-level fusion module includes a normalization module, a dynamic weight fusion module, and an output module. First, the normalization module normalizes the risk trend intensity T and the risk change acceleration A, mapping them to the same 0-1 interval as R_current to obtain the standardized features T_norm and A_norm. This ensures that multi-source features can be fused and computed in the same dimension. Simultaneously, the future risk peak P is used as an auxiliary constraint to ensure that the fusion result fully covers the worst-case risk and that extreme risks are not omitted.
[0054] Secondly, the dynamic weight fusion module performs dynamic weight fusion using adaptive weight allocation logic: R_fusion(t) = α(t)·R_current(t) + β(t)·T_norm + γ(t)·A_norm.
[0055] The weighting coefficients α(t), β(t), and γ(t) are dynamically adjusted based on the prediction confidence and trend consistency. When the prediction confidence is high (e.g., greater than 90%) and the multi-scale trend features (T, A) are consistent in direction, the weights of β and γ are increased, and the system prioritizes making decisions based on future trends. When the prediction results are ambiguous or contradictory, the weight of α is increased, and the system regresses to rely on the current real-time state diagnosis.
[0056] Finally, based on the above R_fusion value, the output module outputs an R_fusion index carrying a time dimension label and a spatial location label. The time dimension label marks the current time t corresponding to the index and the future time window (e.g., t+1 to t+n) corresponding to the predicted sequence, clearly distinguishing the time boundary between current risks and future trends. The spatial location label associates the sensor locations of the monitoring data (e.g., battery module number, regional coordinates) to achieve precise risk location. This R_fusion index simultaneously quantifies the urgency of the current danger and the possibility of future deterioration, addressing the shortcomings of traditional early warning systems, such as delayed warnings and static, rigid decision-making mechanisms, and providing an accurate basis for subsequent protective actions.
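The normalization and dynamic weight fusion steps can be sketched as follows. The text only states the direction in which α, β, and γ move, so the two concrete weight triples, the min-max normalization bounds, and the function names are assumptions chosen for illustration; the auxiliary constraint based on the peak P is not modelled here.

```python
def normalize(value, lo, hi):
    """Assumed min-max normalization of a feature to the 0-1 range."""
    return max(0.0, min(1.0, (value - lo) / (hi - lo)))

def fusion_weights(confidence, t, a, threshold=0.9):
    """Return (alpha, beta, gamma).

    High confidence and same-sign T and A: lean on the predicted trend.
    Otherwise: fall back to the real-time diagnosis. Triples are hypothetical.
    """
    consistent = (t > 0 and a > 0) or (t < 0 and a < 0)
    if confidence > threshold and consistent:
        return 0.40, 0.35, 0.25
    return 0.80, 0.10, 0.10

def fused_risk(r_current, t_norm, a_norm, weights):
    """R_fusion(t) = alpha*R_current + beta*T_norm + gamma*A_norm."""
    alpha, beta, gamma = weights
    return alpha * r_current + beta * t_norm + gamma * a_norm

# Hypothetical inputs: rising, accelerating risk predicted with 95% confidence.
T, A = 0.08, 0.02
t_norm = normalize(T, lo=-0.10, hi=0.10)
a_norm = normalize(A, lo=-0.05, hi=0.05)
w = fusion_weights(confidence=0.95, t=T, a=A)
r_fusion = fused_risk(r_current=0.50, t_norm=t_norm, a_norm=a_norm, weights=w)
print(w, round(r_fusion, 2))
```

Keeping the three weights summing to 1 keeps R_fusion in the same 0-1 range as its inputs.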
[0057] As shown in Figure 3, the resource scheduling and strategy optimization module transforms the comprehensive forward-looking risk index and precise risk location information output by the dynamic risk assessment module into actionable and efficient resource scheduling and operation and maintenance intervention instructions. The module is supported by a strategy mapping knowledge base and a multi-objective optimization scheduler, and achieves continuous self-optimization through execution and closed-loop feedback units, constructing a complete operation and maintenance closed loop of "perception-decision-execution-optimization".
[0058] The policy mapping knowledge base abandons the traditional, simple threshold-action storage model and instead stores conditional policy tuples containing multi-objective optimization functions. The triggering condition of each policy is defined by three dimensions that must be matched simultaneously: the R_fusion risk interval, the risk positioning, and the time characteristics.
[0059] Taking the "pre-cooling-cluster C02" strategy for early-stage heat accumulation at the cluster level as an example, the associated R_fusion interval is 0.4 ≤ R_fusion < 0.7, corresponding to a medium-to-high-risk but non-emergency scenario; the risk is located in the core area of a single battery cluster, C02, ensuring a precise and focused response; and the time characteristic is identified as "short-term trend acceleration type", meaning that the output of the short-term trend prediction module shows a significantly positive risk change acceleration, with the rate of deterioration continuing to increase. When the real-time situation meets all three conditions simultaneously, the system triggers a set of dynamic control instructions, for example:
Precise cooling: the valve opening of the dedicated cooling circuit of cluster C02 is increased to 70% and the fan speed is increased at the same time, strengthening local thermal management in a targeted manner.
Preventative isolation and load reduction: the upper limit of the charging current of cluster C02 is reduced by 20% to reduce internal heat generation and curb the trend of accelerated deterioration.
Neighborhood preventive alert: the cooling power of adjacent clusters C01 and C03 is increased to 40%, applying mild preventive cooling to the neighboring area based on the risk of heat diffusion while avoiding over-response.
Emergency resource pre-positioning: the solenoid valve of the fire foam pipeline for cluster C02 is pre-activated to the "standby" state, shortening the emergency response delay without triggering active spraying.
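A conditional policy tuple and its three-dimensional matching can be sketched as below. The `Policy` dataclass, its field names, and the string labels are hypothetical; the interval bounds and the four actions are those of the cluster C02 pre-cooling example, and the multi-objective optimization function attached to each policy is left out.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple

@dataclass(frozen=True)
class Policy:
    name: str
    r_min: float            # inclusive lower bound of the R_fusion interval
    r_max: float            # exclusive upper bound of the R_fusion interval
    location_level: str     # e.g. "cluster" or "module"
    time_feature: str       # e.g. "short_term_acceleration"
    actions: Tuple[str, ...]

def match_policy(policies: Sequence[Policy], r_fusion: float,
                 location_level: str, time_feature: str) -> Optional[Policy]:
    """Return the first policy whose three trigger conditions all hold."""
    for p in policies:
        if (p.r_min <= r_fusion < p.r_max
                and p.location_level == location_level
                and p.time_feature == time_feature):
            return p
    return None

PRE_COOLING = Policy(
    name="pre-cooling-cluster",
    r_min=0.4,
    r_max=0.7,
    location_level="cluster",
    time_feature="short_term_acceleration",
    actions=(
        "target cluster: cooling valve opening 70%, raise fan speed",
        "target cluster: charging current limit -20%",
        "adjacent clusters: cooling power 40%",
        "target cluster: fire foam solenoid valve to standby",
    ),
)

hit = match_policy([PRE_COOLING], 0.65, "cluster", "short_term_acceleration")
print(hit.name if hit else "no policy matched")
```

The same R_fusion of 0.65 with a different location level or time feature returns no match here, which illustrates why identical risk values can lead to different responses.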
[0060] Compared to traditional systems, this strategy mechanism achieves three key improvements. First, it addresses the problem of static and rigid decision-making mechanisms in traditional systems: traditional systems rely solely on the current risk value R_current and may implement homogeneous interventions across all risk areas, whereas in this system, even for the same R_fusion of 0.6, differentiated strategies are implemented based on risk positioning and time characteristics. For example, for the "accelerated risk of cluster C02", precise intervention is proactively implemented, while for the "slow risk of module M05", only the monitoring frequency is increased.
[0061] Secondly, it solves the problem of inefficient resource scheduling. Traditional systems often trigger global scheduling, such as full-speed operation of all fans in the cabin, while this system achieves differentiated gradient control, such as 70% cooling power for cluster C02, 40% for neighboring clusters, and normal operation for other areas. Thirdly, it provides predictive maintenance guidance by directly converting the trend characteristics output by the prediction engine into decision factors, so that the system not only handles current risks but also proactively manages the future trends indicated by the predictions.
[0062] The core inputs of the multi-objective optimization scheduler are the R_fusion index and the risk positioning information, with the risk positioning information obtained through a dual-path approach: hardware-level physical positioning and logical-level fusion confirmation. Hardware-level positioning relies on the distributed sensor network of the multi-source synchronous monitoring module. For example, distributed fiber optic temperature sensors are laid closely along the battery system and achieve temperature positioning with centimeter-level spatial resolution by detecting Raman or Brillouin scattered light signals, which can accurately identify physical locations such as "the second layer module of column A in battery cabinet No. 3". Another example is array-type laser aerosol/gas monitoring, which uses multiple laser emitting and receiving units to form a monitoring grid; using triangulation or tomographic scanning algorithms, it calculates the three-dimensional spatial coordinates of gas leaks or smoke from the spatiotemporal changes in laser beam signal attenuation. Yet another example is the battery management system (BMS), which manages electrical parameters at the individual cell, module, and cluster levels; its voltage and current data naturally carry hierarchical labels, which allows risks to be quickly located to electrical management units such as "cluster C02".
[0063] Logical-level positioning: logical-level fusion confirmation enhances and verifies the hardware positioning information, using the following methods. Cross-validation of multi-source data: when the distributed optical fiber detects a temperature rise at a certain location, the laser grid detects an increase in gas concentration in a nearby area, and BMS data shows a divergence in the corresponding cluster voltage, the spatiotemporal consistency of the multi-source information narrows the positioning granularity from "area" to "near a specific module or battery cell". Prediction-engine-assisted localization: prediction models such as CNN, which are good at capturing spatial features, can identify the spatial propagation path of abnormal patterns, predict the direction of risk spread, and provide a basis for decisions on preventive alerts in the neighborhood.
[0064] The final risk location information input to the scheduler is a structured description, including logical units (such as "cluster C02"), physical coordinates (such as "coordinates (X, Y, Z)"), and impact range assessment (such as "the core area is located in the middle of cluster C02 and has a tendency to spread to the adjacent cluster C01"), ensuring that the system knows both how high the risk is and where the risk is located.
[0065] Based on the R_fusion index and the risk positioning information, the multi-objective optimization scheduler completes variable optimization in three steps. First, optimization triggering: the system matches the real-time R_fusion index, risk positioning, and time characteristics against the conditional policy tuples in the policy mapping knowledge base; when the real-time situation fully matches a policy, the multi-objective optimization function of that policy is called. Secondly, multi-objective optimization solving: after a policy is matched, the scheduler takes R_fusion and the risk positioning as input and solves for the optimal control variable values under three prioritized objectives: safety, economy, and system stability. The safety objective has the highest priority and requires maximizing the reduction of the predicted maximum temperature in the target area; this is achieved by adjusting the opening of the cooling loop valves and the fan speed to enhance local cooling capacity, and by reducing the PCS charging and discharging power limit to reduce heat generation. The economic objective requires minimizing the total intervention energy consumption and the battery life loss cost; during optimization, the energy consumption of different control combinations is evaluated, and a mild, gradual control scheme with less impact on battery SOH is selected based on the battery life loss cost model. The system stability objective requires minimizing the impact on the power supply of non-faulty areas; when adjusting the PCS charging and discharging power limit, the power of the faulty area should be limited as far as possible, rather than applying a global load reduction.
[0066] Furthermore, the solution process employs a fast algorithm combining heuristic rules and linear programming. First, the initial scope of intervention is quickly determined from the magnitude of R_fusion and the risk positioning, narrowing the search space. Then, the safety and economic objectives are transformed into linear or linearly approximated objective functions, and the system stability objective is used as a constraint to establish a mathematical model. Finally, the specific values of a set of optimization variables are obtained, forming a dynamic control instruction set. For example, when R_fusion is 0.65, the location is "cluster C02", and the time characteristic is "accelerated", the optimization result is: cooling valve opening of cluster C02 at 70% with a fan speed of 3000 rpm; valve opening of adjacent clusters C01 and C03 at 40% with a fan speed of 2000 rpm; charging current limit of cluster C02 reduced by 20%; and fire compartment 2 pre-started to the "standby" state. Thirdly, command output: the scheduler outputs the solution results in a structured form that specifies the operation object, action parameters, and execution logic.
[0067] After the control command is generated, the execution and closed-loop feedback module is responsible for the implementation of the command and collecting response effect data, such as the actual cooling curve and energy consumption data. This data is then fed back to the prediction model and strategy mapping knowledge base to achieve continuous self-optimization of the system and continuously improve the accuracy and economy of risk management.
[0068] Furthermore, assuming the system contains multiple battery clusters, with cluster C02 being the faulty region and the remaining clusters being non-faulty regions, the constraints of the mathematical model established with the system stability objective as a constraint are explained in detail below. There are three constraints: Constraint 1, Constraint 2, and Constraint 3. Constraint 1: the total power output of the non-faulty region shall not be lower than a threshold, namely:
$$\sum_{i \neq 2} P_i \geq P_{\min}$$
[0069] where i is the index of the battery cluster (for example, i=1 represents C01, i=2 represents C02, i=3 represents C03); P_i is the actual output power of the i-th battery cluster; and P_min is the minimum power supply threshold required by the system. This constraint ensures that even if the power of the faulty cluster C02 is reduced, the combined output power of all other normally functioning battery clusters will not fall below the minimum power supply requirement that the power station must maintain.
[0070] Constraint 2: individual operational stability constraint for the non-faulty region, namely, for each non-faulty cluster i:
$$\frac{\left| P_i - P_{i,\mathrm{rated}} \right|}{P_{i,\mathrm{rated}}} \leq \varepsilon$$
[0071] where P_i is the actual output power of the i-th battery cluster; P_i,rated is the rated output power of the i-th battery cluster under normal operating conditions; and ε is the upper limit of the allowable relative power fluctuation, a preset constant between 0 and 1 (e.g., 5%).
[0072] This constraint limits the power adjustment range for each non-faulty battery cluster. It requires that the power variation of each normal cluster, as a percentage of its rated value P_i,rated, not exceed ε. This prevents excessive and frequent power increases or decreases in other normal clusters in order to compensate for the power drop of the faulty cluster C02. In this way, the power supply impact on non-faulty areas is minimized, and unnecessary operational interference and losses are avoided for normal equipment. This reflects a precise control process that avoids a one-size-fits-all approach.
[0073] Constraint 3: system total power safe operation constraint, namely:
$$P_{\mathrm{total,min}} \leq \sum_{i} P_i \leq P_{\mathrm{total,max}}$$
[0074] where the sum of the cluster output powers P_i is taken over all battery clusters, and P_total,min is the minimum total power allowed by the system, a lower limit set based on grid connection protocols, the minimum operating load of equipment, and safety considerations.
[0075] P_total,max is the maximum total power allowed by the system. This is usually limited by transformer capacity, line current carrying capacity, or the contracted capacity of the power station.
[0076] This constraint ensures that the total output power of the entire power plant, after optimized scheduling, is within a safe and permissible range. It prevents both system instability problems that may be caused by excessively low total power and equipment overload risks caused by excessively high total power.
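The three stability constraints can be checked for a candidate power allocation with the sketch below. The function name, the parameter names, and all numeric values (rated powers, thresholds, and the candidate allocation in kW) are hypothetical and serve only to illustrate how the constraints interact.

```python
def check_constraints(power, rated, faulty, p_min, eps, p_total_min, p_total_max):
    """Evaluate Constraints 1-3 for a candidate allocation.

    power, rated: dicts mapping cluster id -> actual / rated output power
    faulty:       set of faulty cluster ids (excluded from Constraints 1 and 2)
    """
    healthy = [c for c in power if c not in faulty]
    # Constraint 1: non-faulty clusters together meet the minimum supply.
    supply_ok = sum(power[c] for c in healthy) >= p_min
    # Constraint 2: each non-faulty cluster stays within eps of its rated power.
    stability_ok = all(abs(power[c] - rated[c]) / rated[c] <= eps for c in healthy)
    # Constraint 3: total station power stays inside the permitted band.
    total_ok = p_total_min <= sum(power.values()) <= p_total_max
    return {"supply": supply_ok, "stability": stability_ok, "total": total_ok}

rated = {"C01": 100.0, "C02": 100.0, "C03": 100.0}
# Candidate: faulty C02 is limited, neighbours compensate slightly (+3%).
candidate = {"C01": 103.0, "C02": 80.0, "C03": 103.0}
result = check_constraints(candidate, rated, faulty={"C02"}, p_min=200.0,
                           eps=0.05, p_total_min=250.0, p_total_max=320.0)
print(result)
```

If a neighbour were pushed to 110 kW to cover more of the shortfall, the stability check would fail, which is the over-compensation that Constraint 2 is meant to prevent.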
[0077] Those skilled in the art will understand that embodiments of the present invention can be provided as methods, systems, or computer program products. Therefore, the present invention can take the form of a completely hardware embodiment, a completely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present invention can take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) containing computer-usable program code.
[0078] This invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each process and/or block of the flowchart illustrations and/or block diagrams, and combinations of processes and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions can be provided to a processor of a general-purpose computer, special-purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in one or more processes of the flowchart and/or one or more blocks of the block diagram.
[0079] These computer program instructions may also be stored in a computer-readable storage medium that can direct a computer or other programmable data processing device to function in a particular manner, such that the instructions stored in the computer-readable storage medium produce an article of manufacture including instruction means that implement the functions specified in one or more processes of the flowchart and/or one or more blocks of the block diagram.
[0080] These computer program instructions may also be loaded onto a computer or other programmable data processing equipment to cause a series of operational steps to be performed on the computer or other programmable equipment to produce a computer-implemented process, such that the instructions that execute on the computer or other programmable equipment provide steps for implementing the functions specified in one or more processes of the flowchart and/or one or more blocks of the block diagram.
[0081] The foregoing description of specific exemplary embodiments of the invention is for illustrative and explanatory purposes. These descriptions are not intended to limit the invention to the precise forms disclosed, and it will be apparent that many changes and variations can be made in accordance with the foregoing teachings. The exemplary embodiments were chosen and described in order to explain the specific principles of the invention and its practical application, thereby enabling those skilled in the art to implement and utilize various different exemplary embodiments of the invention, as well as various different choices and variations. The scope of the invention is intended to be defined by the claims and their equivalents.
Claims
1. A risk management and control system for an energy storage power station, characterized in that it comprises: a multi-source synchronous monitoring module, configured to acquire multi-dimensional monitoring data of key battery parameters in the energy storage power station in real time through a multi-dimensional sensing unit; a multi-scale risk trend prediction module, connected to the multi-source synchronous monitoring module and configured to predict the risk trend of the energy storage power station based on the monitoring data, and to generate a prediction sequence identifying the change trend of key battery parameters and the overall risk level within a future preset time window, as well as a confidence index corresponding to the prediction sequence; a dynamic risk assessment and fusion decision-making module, connected to the multi-source synchronous monitoring module and the multi-scale risk trend prediction module, respectively, and configured to generate a risk index based on the prediction sequence, the corresponding confidence index, and the real-time monitoring data collected by the multi-source synchronous monitoring module; a resource scheduling and strategy optimization module, connected to the dynamic risk assessment and fusion decision-making module and configured to perform multi-objective optimization based on the risk index and to generate scheduling instructions adapted to the current and future risk states; and an execution and closed-loop feedback module, connected to the resource scheduling and strategy optimization module, the multi-scale risk trend prediction module, and the dynamic risk assessment and fusion decision-making module, respectively, and configured to execute the scheduling instructions, collect the system response feedback data of the energy storage power station after execution, and feed the system response feedback data back to the multi-scale risk trend prediction module and/or the dynamic risk assessment and fusion decision-making module in real time.
2. The system as described in claim 1, characterized in that the multi-scale risk trend prediction module comprises: an instantaneous anomaly detection module, configured as a gated recurrent unit (GRU) network with a fused attention mechanism, which takes the multi-dimensional monitoring data of the energy storage power station over the past M1 seconds as input and outputs a continuous prediction sequence of key operating parameters over the next preset M2 seconds, as well as a confidence interval corresponding to each prediction value in the prediction sequence; a short-term trend prediction module, connected to the instantaneous anomaly detection module and configured to extract local fluctuation patterns of the monitoring data within the past preset N1 minutes using a CNN network, and then to learn sequence dependencies using SBiGRU-AM so as to output a continuous prediction sequence within the future preset N2 minutes, including a prediction sequence of key parameters and a prediction sequence of the comprehensive risk score; and a long-term trend prediction engine, connected to the short-term trend prediction module and configured to take multi-dimensional monitoring data of a preset time period as input, select or calculate the optimal weight allocation scheme from a predefined strategy library through a reinforcement learning module, dynamically adjust the weight coefficients of the basic prediction model, construct a combined prediction model adapted to the current operating state, and output a continuous prediction sequence of key operating parameters within the next preset N3 minutes; wherein the prediction results for the next preset N3 minutes are compared with the actual monitoring data, and the Q-value table or neural network parameters of the reinforcement learning module are updated according to the comparison results.
3. The system as described in claim 2, characterized in that the basic prediction model comprises a convolutional neural network model, a bidirectional long short-term memory network model, and a structured bidirectional gated recurrent unit model; the convolutional neural network model is configured to extract spatial features from the monitoring data and output intermediate results for spatial-dimension risk prediction; the bidirectional long short-term memory network model is configured to extract deep temporal dependency features from the monitoring data and output intermediate results for temporal-dimension risk prediction; and the structured bidirectional gated recurrent unit model is configured to extract bidirectional contextual features from the monitoring data and output intermediate results for contextual-dimension risk prediction.
4. The system as described in claim 2, characterized in that the reinforcement learning module comprises: a status awareness module, configured to analyze the spatiotemporal characteristics of the multi-dimensional monitoring data to determine the fault mode; an action selection module, configured to dynamically allocate the weights of each model in the basic prediction model based on the identified fault mode; and a reward feedback module, configured to determine the prediction accuracy based on the deviation between the prediction results and the actual monitoring data, and to adjust the weight allocation of each model in the basic prediction model based on that accuracy.
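By way of non-limiting illustration, the interplay of fault-mode state, weight selection from a strategy library, and reward feedback can be sketched with a small Q-value table. This is a simplified one-step (bandit-style) update rather than a full reinforcement learning algorithm; the strategy library contents, the class and method names, and the use of negative absolute error as the reward are all assumptions made for this example.

```python
import random

# Hypothetical strategy library: weights for (CNN, BiLSTM, SBiGRU) outputs.
STRATEGY_LIBRARY = [
    (0.6, 0.2, 0.2),
    (0.2, 0.6, 0.2),
    (0.2, 0.2, 0.6),
]


class WeightSelector:
    """Tabular selector: one row of Q-values per fault mode."""

    def __init__(self, n_modes, alpha=0.5, epsilon=0.0, seed=0):
        self.q = [[0.0] * len(STRATEGY_LIBRARY) for _ in range(n_modes)]
        self.alpha = alpha      # learning rate
        self.epsilon = epsilon  # exploration probability
        self.rng = random.Random(seed)

    def select(self, mode):
        """Pick a strategy index for the given fault mode (epsilon-greedy)."""
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(STRATEGY_LIBRARY))
        row = self.q[mode]
        return row.index(max(row))

    def update(self, mode, action, predicted, actual):
        """Move the Q-value toward the reward (negative absolute error)."""
        reward = -abs(predicted - actual)
        self.q[mode][action] += self.alpha * (reward - self.q[mode][action])


def combine(weights, base_predictions):
    """Weighted combination of the base models' predictions."""
    return sum(w * p for w, p in zip(weights, base_predictions))
```

In use, the selector picks a weight scheme for the identified fault mode, the combined prediction is later compared with the measured value, and the update step lowers the Q-value of schemes that predicted poorly so that a different scheme is preferred next time.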
5. The system as described in claim 1, characterized in that the dynamic risk assessment and fusion decision-making module comprises: a real-time risk calculation module, configured to generate a risk score $R_{\text{current}}$ based on the monitoring data; a predictive feature extraction module, configured to extract key quantitative features from the prediction sequence, the key quantitative features including the risk trend intensity T, the risk change acceleration A, and the future risk peak P; and a decision-level fusion module, configured to output, based on the risk trend intensity T, the risk change acceleration A, and the future risk peak P, an $R_{\text{fusion}}$ index carrying a time dimension label and a spatial location label, wherein the time dimension label marks the current time t corresponding to the index and the future time window corresponding to the prediction sequence, and the spatial location label associates the index with the sensor location of the monitoring data.
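By way of non-limiting illustration, the three features could be computed from a predicted risk sequence using finite differences. The claim does not define T, A, or P numerically, so the definitions below (mean first difference, mean second difference, and maximum) are assumptions, as are the function and parameter names.

```python
def extract_trend_features(risk_sequence, dt=1.0):
    """Return (T, A, P) for a predicted risk sequence sampled every dt.

    T: mean first difference (assumed measure of trend intensity).
    A: mean second difference (assumed measure of change acceleration).
    P: maximum predicted value (future risk peak).
    """
    n = len(risk_sequence)
    if n < 3:
        raise ValueError("at least three predicted points are required")
    first = [(risk_sequence[k + 1] - risk_sequence[k]) / dt for k in range(n - 1)]
    second = [(first[k + 1] - first[k]) / dt for k in range(n - 2)]
    trend = sum(first) / len(first)
    accel = sum(second) / len(second)
    peak = max(risk_sequence)
    return trend, accel, peak
```

For the sequence 0.1, 0.2, 0.4, 0.7 with unit spacing, the first differences are 0.1, 0.2, and 0.3, so the sketch yields a trend of about 0.2, an acceleration of about 0.1, and a peak of 0.7.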
6. The system as described in claim 5, characterized in that the real-time risk calculation module comprises: a data preprocessing module, configured to normalize the monitoring data, which includes at least temperature data corresponding to temperature parameters, voltage data corresponding to voltage parameters, and gas data corresponding to gas parameters; and a weighted fusion calculation module, configured to assign a weight to each parameter based on its contribution to the risk of thermal runaway, and to generate the current risk score $R_{\text{current}}$ through linear weighting.
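By way of non-limiting illustration, normalization followed by linear weighting can be sketched as follows. Min-max scaling is an assumed choice of normalization, and the parameter ranges, the weights, and the use of a voltage deviation as the voltage reading are hypothetical values chosen for the example.

```python
def min_max_normalize(value, low, high):
    """Scale value into [0, 1] using an assumed min-max range, with clipping."""
    if high <= low:
        raise ValueError("high must exceed low")
    scaled = (value - low) / (high - low)
    return min(1.0, max(0.0, scaled))


def current_risk_score(readings, ranges, weights):
    """Linear weighted sum of the normalized parameters (R_current)."""
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return sum(
        weights[name] * min_max_normalize(readings[name], *ranges[name])
        for name in weights
    )


# Hypothetical readings, ranges, and weights for illustration only.
readings = {"temperature": 45.0, "voltage": 0.05, "gas": 100.0}
ranges = {"temperature": (25.0, 65.0), "voltage": (0.0, 0.5), "gas": (0.0, 1000.0)}
weights = {"temperature": 0.5, "voltage": 0.2, "gas": 0.3}
score = current_risk_score(readings, ranges, weights)
```

With these values the normalized parameters are 0.5, 0.1, and 0.1, so the weighted score is approximately 0.30.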
7. The system as described in claim 5, characterized in that, when the prediction confidence level is greater than 90% and the risk trend features (T, A) are in the same direction, the decision-level fusion module increases the weights of the risk trend intensity and the risk change acceleration.
8. The system as described in claim 5, characterized in that the decision-level fusion module comprises: a normalization module, configured to normalize the risk trend intensity T and the risk change acceleration A, mapping them to the same 0-1 interval as the risk score $R_{\text{current}}$ to obtain the standardized features $T_{\text{norm}}$ and $A_{\text{norm}}$; a dynamic weight fusion module, configured to dynamically adjust the weighting coefficients $\alpha(t)$, $\beta(t)$, and $\gamma(t)$ based on the prediction confidence and trend consistency, the calculation formula being $R_{\text{fusion}}(t) = \alpha(t)\,R_{\text{current}}(t) + \beta(t)\,T_{\text{norm}} + \gamma(t)\,A_{\text{norm}}$; and an output module, configured to output, based on the $R_{\text{fusion}}$ data, an $R_{\text{fusion}}$ index carrying a time dimension label and a spatial location label.
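By way of non-limiting illustration, the fusion formula and the weight adjustment rule (confidence above 90% with T and A in the same direction) can be sketched as follows. The base weights, the size of the boost, and the renormalization of the weights so that they sum to 1 are assumptions made for this example; the claims do not specify them.

```python
def fusion_weights(confidence, t_raw, a_raw, base=(0.6, 0.2, 0.2), boost=0.1):
    """Return (alpha, beta, gamma) for the fusion formula.

    confidence: prediction confidence in [0, 1].
    t_raw, a_raw: un-normalized trend intensity and acceleration; their signs
        decide whether the two features point in the same direction.
    base, boost: assumed starting weights and assumed increment.
    """
    alpha, beta, gamma = base
    same_direction = t_raw * a_raw > 0
    if confidence > 0.9 and same_direction:
        beta += boost   # raise the weight of trend intensity
        gamma += boost  # raise the weight of change acceleration
    total = alpha + beta + gamma
    return alpha / total, beta / total, gamma / total


def fused_risk(r_current, t_norm, a_norm, weights):
    """R_fusion = alpha * R_current + beta * T_norm + gamma * A_norm."""
    alpha, beta, gamma = weights
    return alpha * r_current + beta * t_norm + gamma * a_norm
```

With the assumed defaults, a confident and consistent trend shifts the weights from (0.6, 0.2, 0.2) to (0.5, 0.25, 0.25), so for $R_{\text{current}} = 0.4$, $T_{\text{norm}} = 0.8$, and $A_{\text{norm}} = 0.6$ the fused value rises from about 0.52 to about 0.55.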
9. The system as described in claim 1, characterized in that the resource scheduling and strategy optimization module comprises: an optimization trigger module, configured to match the real-time $R_{\text{fusion}}$ index, the risk location, and the time characteristics against a strategy mapping knowledge base, and to call the multi-objective optimization function corresponding to the matched strategy; and a multi-objective optimization solution module, configured to take the $R_{\text{fusion}}$ index and the associated risk information as inputs and to determine the optimal values of the control variables under the constraints of three prioritized objectives (safety, economy, and system stability), thereby forming a dynamic control instruction set.
10. A risk management method for an energy storage power station, characterized in that it comprises the following steps: the multi-source synchronous monitoring module acquires multi-dimensional monitoring data of key battery parameters in the energy storage power station in real time; the multi-scale risk trend prediction module predicts the risk trend of the energy storage power station based on the monitoring data, and generates a prediction sequence identifying the change trend of key battery parameters and the overall risk level within a future preset time window, as well as a confidence index corresponding to the prediction sequence; the dynamic risk assessment and fusion decision-making module generates a risk index based on the prediction sequence, the corresponding confidence index, and the real-time monitoring data collected by the multi-source synchronous monitoring module; the resource scheduling and strategy optimization module performs multi-objective optimization based on the risk index to generate scheduling instructions adapted to the current and future risk states; and the execution and closed-loop feedback module executes the scheduling instructions, collects the system response feedback data of the energy storage power station after execution, and feeds the system response feedback data back to the multi-scale risk trend prediction module and/or the dynamic risk assessment and fusion decision-making module in real time.