Intelligent dispatching control method and system for hybrid energy storage and renewable energy in data center
By combining a two-layer temporal memory network and a multi-head attention mechanism with a hybrid prediction model, the method addresses the problem of predicting data center power load and renewable energy generation, enabling stable operation and efficient utilization of the energy storage system, improving system energy efficiency, and reducing operating costs.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- CHANGZHOU RUIWU TECH CO LTD
- Filing Date
- 2025-05-29
- Publication Date
- 2026-06-19
AI Technical Summary
Existing technologies cannot effectively capture the complex time-series characteristics of data center power load and renewable energy generation, resulting in low prediction accuracy, difficulty in balancing the characteristic differences of different types of energy storage devices in the energy storage system, and lack of effective mechanisms to deal with system operational uncertainties, which affects the system's economy and reliability.
A two-layer temporal memory network combined with a multi-head attention mechanism is used to establish an electricity load prediction model. The power generation of renewable energy is predicted through a hybrid prediction model, and a hybrid energy storage optimization scheduling model is constructed. By combining the synergistic effect of distributed optimization and robust controller, accurate charging and discharging power commands are output.
It improves the accuracy of data center power load and renewable energy generation forecasts, enables stable operation and efficient utilization of energy storage systems, reduces operating costs, enhances system energy efficiency, and reduces dependence on the power grid.
Abstract
Description
Technical Field
[0001] This invention relates to data energy storage and resource scheduling technology, and more particularly to a method and system for intelligent scheduling and control of hybrid energy storage and renewable energy in data centers.
Background Technology
[0002] With the rapid development of cloud computing and internet technologies, data centers have become a crucial infrastructure in modern information society. However, the high energy consumption of data centers is becoming increasingly prominent, with their electricity demand continuing to grow and putting enormous pressure on the power grid. To reduce traditional energy consumption and carbon emissions, more and more data centers are introducing renewable energy and hybrid energy storage systems to improve energy efficiency and system stability.
[0003] Data center power load exhibits significant volatility and uncertainty, while the power output of renewable energy sources such as photovoltaics and wind power is also subject to randomness due to weather conditions. Traditional data center energy management technologies mainly rely on simple historical statistical data or basic linear forecasting methods to predict power load and renewable energy generation. These methods cannot effectively capture complex time-series characteristics and the influence of external factors, resulting in low forecast accuracy and affecting the scientific nature of subsequent scheduling decisions.
[0004] Existing hybrid energy storage control methods often employ a single optimization algorithm, which makes it difficult to balance the characteristic differences of different types of energy storage devices in the energy storage system, such as the high power density characteristics of supercapacitors and the high energy density characteristics of lithium batteries. This makes it impossible to achieve the optimal allocation of energy storage resources, reducing the economy and reliability of the system.
[0005] Furthermore, most existing energy storage dispatch and control strategies lack effective mechanisms to cope with system operational uncertainties. When there is a large deviation between actual operating data and predicted data, the control effect decreases significantly, making it difficult to ensure stable system operation and optimal energy efficiency. This is especially true under conditions of large-scale renewable energy integration and sudden load changes.
Summary of the Invention
[0006] The present invention provides a method and system for intelligent scheduling and control of hybrid energy storage and renewable energy in data centers, which can solve the problems in the prior art.
[0007] A first aspect of the present invention provides a method for intelligent scheduling and control of hybrid energy storage and renewable energy in data centers, comprising:
[0008] Collect data on data center power load, hybrid energy storage system operation status, and renewable energy power generation.
[0009] A two-layer temporal memory network is used to extract the temporal features of the power load data. The temporal features are then processed through a multi-head attention mechanism to establish a data center power load prediction model. Based on the power load prediction model, the data center power load in the future time period is predicted to obtain the predicted power load data.
[0010] A renewable energy power generation prediction model is established based on the power generation data of the renewable energy source. The power generation of renewable energy in the future time period is predicted based on the power generation prediction model to obtain the predicted power generation data.
[0011] After feature enhancement based on the predicted electricity load data and the predicted power generation data, a hybrid energy storage optimization scheduling model is constructed, and an initial charging and discharging power command is output based on the hybrid energy storage optimization scheduling model.
[0012] Based on the operating status data of the hybrid energy storage system and the initial charge and discharge power command, a two-layer model predictive control framework is used for optimization. The corrected charge and discharge power command is output through the synergistic effect of distributed optimization and robust controller.
[0013] The charging and discharging operation of the hybrid energy storage system is controlled according to the corrected charging and discharging power command.
[0014] A two-layer temporal memory network is used to extract the temporal features of the power load data. A multi-head attention mechanism is then used to process these temporal features to establish a data center power load prediction model. Based on this model, the data center power load for future time periods is predicted, resulting in predicted power load data, including:
[0015] The electricity load data is preprocessed, and the maximum and minimum value normalization method is used to convert the electricity load data into standardized load data.
[0016] A two-layer temporal memory network is used to extract the temporal features of the standardized load data. A random deactivation layer is set between the two layers of the two-layer temporal memory network. The temporal features are fused with time features, temperature features, and load features to obtain fused features.
[0017] The fused features are processed using a multi-head attention mechanism to calculate multi-head attention weights. These weights are determined by attention scores calculated by multiple attention heads, which are obtained from query vectors, key vectors, and value vectors.
[0018] The fused features are weighted by the multi-head attention weights to obtain attention features, and a data center power load prediction model is established based on the attention features.
[0019] A loss function is constructed that includes a weighted combination of the root mean square error term and the mean absolute percentage error term. The data center power load prediction model is then optimized and trained based on the loss function.
[0020] Input the feature data corresponding to the time period to be predicted into the trained data center power load prediction model to obtain the initial prediction result;
[0021] The confidence interval of the initial prediction result is calculated based on the historical prediction error distribution. When the initial prediction result exceeds the confidence interval, it is corrected to obtain the predicted electricity load data.
[0022] A renewable energy power generation prediction model is established based on the power generation data of the aforementioned renewable energy sources. The power generation of renewable energy sources within a future time period is predicted using this model, resulting in predicted power generation data, including:
[0023] The power generation data of the renewable energy source is preprocessed, and the power generation data is denoised by wavelet transform. The denoised power generation data is then converted into standardized power generation data.
[0024] A hybrid prediction model is constructed. The periodic decomposition unit of the hybrid prediction model uses the variational mode decomposition method to decompose the standardized power generation data into trend components and fluctuation components. The trend prediction unit of the hybrid prediction model uses a long short-term memory network structure to model the trend components, and the fluctuation prediction unit of the hybrid prediction model uses a gated recurrent neural network structure to model the fluctuation components.
[0025] Meteorological forecast data is input into the hybrid forecast model, and the hybrid forecast model is optimized and trained based on the combined loss function of root mean square error and mean absolute error. The meteorological forecast data corresponding to the time period to be predicted is input into the trained hybrid forecast model to obtain the predicted power generation data.
[0026] A hybrid energy storage optimization scheduling model is constructed after feature enhancement based on the predicted electricity load data and the predicted power generation data. An initial charge / discharge power command is output based on the hybrid energy storage optimization scheduling model, including:
[0027] A multidimensional spatiotemporal state space is constructed based on the predicted electricity load data and the predicted power generation data. A spatiotemporal attention mechanism is used to enhance the features of the multidimensional spatiotemporal state space to obtain an enhanced state space.
[0028] A hierarchical deep reinforcement learning network is constructed. The state encoder of the hierarchical deep reinforcement learning network uses a dual-flow graph convolutional neural network to extract the temporal and spatial features of the augmented state space. The action generator of the hierarchical deep reinforcement learning network uses a gated recurrent unit to fuse the temporal and spatial features to generate a charge-discharge power decision. The value evaluator of the hierarchical deep reinforcement learning network uses a dual evaluation network structure to evaluate the value function of the charge-discharge power decision.
[0029] Based on historical scheduling data, an immediate reward item containing multiple costs is constructed. A long-term reward item is obtained by modeling the charging and discharging operation sequence using a recurrent neural network. The immediate reward item is dynamically adjusted through adaptive weight optimization, and the combination yields an adaptive hybrid reward function.
[0030] The adaptive hybrid reward function is combined with the hierarchical deep reinforcement learning network to construct a hybrid energy storage optimization scheduling model, and the hybrid energy storage optimization scheduling model is trained.
[0031] The current state information is input into the trained hybrid energy storage optimization scheduling model, and the initial charging and discharging power command is output through the action generator.
[0032] Based on historical scheduling data, an immediate reward term incorporating multiple costs is constructed. A long-term reward term is obtained by modeling the charge-discharge operation sequence using a recurrent neural network. The immediate reward term is dynamically adjusted through adaptive weight optimization, resulting in an adaptive hybrid reward function, including:
[0033] Collect historical scheduling data of the hybrid energy storage system, and construct an immediate reward term including multiple costs based on the historical scheduling data.
[0034] A recurrent neural network is used to perform time-series modeling on the charging and discharging operation sequences in the historical scheduling data, extract the long-term dependency features of the charging and discharging operation sequences, and construct a long-term reward item that reflects the long-term operating benefits of the energy storage system based on the long-term dependency features.
[0035] An evaluation index is constructed based on the system operation status sequence and cost-benefit sequence. The scheduling effect score of each group of charging and discharging operations in the historical scheduling data is calculated based on the evaluation index. The scheduling effect score is input into an adaptive weight optimizer. The adaptive weight optimizer dynamically adjusts the weight coefficient of each cost sub-item in the immediate reward item according to the changing trend of the scheduling effect score. When the scheduling effect score corresponding to a specific cost sub-item decreases, the weight coefficient of that cost sub-item is increased.
[0036] The immediate reward and the long-term reward are weighted and combined to obtain an adaptive hybrid reward function.
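The weight adjustment rule in [0035] can be illustrated with a minimal sketch. The source only states that a cost sub-item's weight is increased when its scheduling effect score decreases; the multiplicative step size, the renormalization to a sum of 1, and the sub-item names used here are all assumptions made for illustration.

```python
def adjust_weights(weights, prev_scores, curr_scores, step=0.1):
    """Raise the weight of every cost sub-item whose score dropped.

    `step` (relative increase) and the final renormalization are
    assumptions; the source does not specify either.
    """
    adjusted = {}
    for name, w in weights.items():
        if curr_scores[name] < prev_scores[name]:
            adjusted[name] = w * (1.0 + step)  # score fell: emphasize this cost
        else:
            adjusted[name] = w                 # score held or rose: keep weight
    total = sum(adjusted.values())
    return {name: w / total for name, w in adjusted.items()}


# Hypothetical sub-item names; the source only says "multiple costs".
weights = {"electricity": 0.5, "degradation": 0.5}
prev = {"electricity": 0.80, "degradation": 0.80}
curr = {"electricity": 0.70, "degradation": 0.85}
new_weights = adjust_weights(weights, prev, curr)
print(new_weights)
```

Renormalizing keeps the weights comparable from one update to the next, at the price of slightly lowering the weights of sub-items whose scores did not fall.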
[0037] Based on the operating status data of the hybrid energy storage system and the initial charge / discharge power command, a two-layer model predictive control framework is used for optimization. Through the synergistic effect of distributed optimization and a robust controller, a corrected charge / discharge power command is output, including:
[0038] Based on the operating status data of the hybrid energy storage system, the energy storage units are grouped to construct a hierarchical system state space. State feature vectors are extracted based on variational autoencoders. A hybrid structure of gated recurrent neural networks and residual convolutional networks is used to extract spatiotemporal features from the state feature vectors to construct a multimodal state prediction model.
[0039] A system dynamic response model is established, the output of the multimodal state prediction model is used to estimate the state through a particle filter algorithm, and the prediction results of the system dynamic response model are fitted with a probability distribution. An adaptive constraint estimator is constructed based on the information entropy criterion.
[0040] A two-layer model predictive control framework is constructed, wherein the long-time domain optimization layer performs prediction optimization based on the multimodal state prediction model, and the short-time domain execution layer dynamically corrects the initial charge and discharge power command according to the prediction error of the long-time domain optimization layer to obtain the initial correction result.
[0041] Based on the initial correction results, a multi-timescale optimization objective function is constructed. The multi-timescale optimization objective function is decomposed using the distributed alternating direction multiplier method, and the sub-problems are solved. The solution results are then input into the robust controller.
[0042] The initial charge / discharge power command is co-optimized with the robust controller through the two-layer model predictive control framework, and the corrected charge / discharge power command is output.
[0043] A two-layer model predictive control framework is constructed, wherein the long-time domain optimization layer performs prediction optimization based on the multimodal state prediction model, and the short-time domain execution layer dynamically corrects the initial charge and discharge power command based on the prediction error of the long-time domain optimization layer to obtain the initial correction result, including:
[0044] Construct a multimodal state space and map the multimodal state space into a long-time domain prediction state set and a short-time domain execution state set;
[0045] Based on the long-term predicted state set, a gated jump connection recurrent neural network structure is used to extract multi-scale state temporal features. The multi-scale state temporal features are input into a prediction optimizer constructed by a conditional generative adversarial network, and the predicted state sequence is output.
[0046] A short-term execution layer is established based on the short-term execution state set. The short-term execution layer uses a bidirectional long short-term memory network to calculate the temporal correlation between the predicted state sequence and the actual state sequence, and generates a prediction error feature matrix.
[0047] An adaptive corrector is constructed, which performs multi-dimensional feature extraction on the prediction error feature matrix to obtain multi-dimensional features, and constructs a multi-head cross-attention mechanism based on the multi-dimensional features to generate adaptive correction weights.
[0048] A deep neural network with residual connections is used to perform nonlinear feature fusion of the adaptive correction weights, the multi-scale state time-series features and the initial charge-discharge power command to obtain a nonlinear feature fusion result.
[0049] The nonlinear feature fusion result is input into a probability distribution generation network. The probability distribution generation network samples and generates a set of candidate correction coefficients in the latent variable space based on the distribution characteristics of the prediction error feature matrix. The set of candidate correction coefficients is evaluated, and the optimal correction coefficient is selected based on the Bayesian optimization criterion. The optimal correction coefficient is used to dynamically adjust the initial charge and discharge power command to obtain the initial correction result.
[0050] A second aspect of the present invention provides a data center hybrid energy storage and renewable energy intelligent scheduling and control system, comprising:
[0051] The first unit is used to collect data on the power load of data centers, the operating status of hybrid energy storage systems, and the power generation of renewable energy.
[0052] The second unit is used to extract the temporal features of the power load data using a two-layer temporal memory network, process the temporal features through a multi-head attention mechanism to establish a data center power load prediction model, and predict the data center power load in the future time period based on the power load prediction model to obtain predicted power load data.
[0053] The third unit is used to establish a renewable energy power generation prediction model based on the power generation data of the renewable energy source, and to predict the power generation of renewable energy source in the future time period according to the renewable energy power generation prediction model to obtain the predicted power generation data.
[0054] The fourth unit is used to construct a hybrid energy storage optimization scheduling model after feature enhancement based on the predicted electricity load data and the predicted power generation data, and to output an initial charging and discharging power command based on the hybrid energy storage optimization scheduling model.
[0055] The fifth unit is used to optimize the hybrid energy storage system based on the operating status data and the initial charge and discharge power command using a two-layer model predictive control framework, and output the corrected charge and discharge power command through the synergistic effect of distributed optimization and robust controller.
[0056] The sixth unit is used to control the charging and discharging operation of the hybrid energy storage system according to the corrected charging and discharging power command.
[0057] A third aspect of the present invention provides an electronic device, comprising:
[0058] A processor;
[0059] A memory for storing processor-executable instructions;
[0060] Wherein the processor is configured to invoke the instructions stored in the memory to execute the aforementioned method.
[0061] A fourth aspect of the present invention provides a computer-readable storage medium having stored thereon computer program instructions that, when executed by a processor, implement the aforementioned method.
[0062] The beneficial effects of this application are as follows:
[0063] The intelligent scheduling and control method for hybrid energy storage and renewable energy in data centers provided by this invention extracts time-series features through a two-layer time-series memory network and constructs a load prediction model by combining a multi-head attention mechanism. This method can accurately predict the power load of data centers, improve prediction accuracy, and lay the foundation for subsequent energy storage scheduling.
[0064] The hybrid energy storage optimization scheduling model established in this invention performs feature enhancement processing on the prediction data, comprehensively considers the characteristics of renewable energy power generation and the power demand of data centers, and proposes a distributed optimization and robust control scheme based on a two-layer model predictive control framework, which effectively solves the system uncertainty problem and improves the operational stability of the energy storage system.
[0065] This invention achieves efficient utilization of renewable energy and stable supply of electricity to data centers through precise charge and discharge control of hybrid energy storage systems, reducing data center operating costs, improving system energy efficiency, and reducing dependence on the power grid, resulting in significant economic and environmental benefits.
Attached Figure Description
[0066] Figure 1 is a flowchart illustrating the intelligent scheduling and control method for hybrid energy storage and renewable energy in data centers according to an embodiment of the present invention.
[0067] Figure 2 is a bar chart comparing the performance of data center power load prediction models according to embodiments of the present invention.
[0068] Figure 3 is a schematic diagram comparing the root mean square error of different prediction models in an embodiment of the present invention.
[0069] Figure 4 is a flowchart illustrating the construction of the adaptive hybrid reward function according to an embodiment of the present invention.
[0070] Figure 5 is a bar chart comparing the performance of the two-layer model predictive control framework in this invention.
Detailed Implementation
[0071] To make the objectives, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of the present invention, and not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.
[0072] The technical solution of the present invention will be described in detail below with reference to specific embodiments. These specific embodiments can be combined with each other, and the same or similar concepts or processes may not be described again in some embodiments.
[0073] Figure 1 is a flowchart illustrating the intelligent scheduling and control method for hybrid energy storage and renewable energy in data centers according to an embodiment of the present invention. As shown in Figure 1, the method includes:
[0074] Collect data on data center power load, hybrid energy storage system operation status, and renewable energy power generation.
[0075] A two-layer temporal memory network is used to extract the temporal features of the power load data. The temporal features are then processed through a multi-head attention mechanism to establish a data center power load prediction model. Based on the power load prediction model, the data center power load in the future time period is predicted to obtain the predicted power load data.
[0076] A renewable energy power generation prediction model is established based on the power generation data of the renewable energy source. The power generation of renewable energy in the future time period is predicted based on the power generation prediction model to obtain the predicted power generation data.
[0077] After feature enhancement based on the predicted electricity load data and the predicted power generation data, a hybrid energy storage optimization scheduling model is constructed, and an initial charging and discharging power command is output based on the hybrid energy storage optimization scheduling model.
[0078] Based on the operating status data of the hybrid energy storage system and the initial charge and discharge power command, a two-layer model predictive control framework is used for optimization. The corrected charge and discharge power command is output through the synergistic effect of distributed optimization and robust controller.
[0079] The charging and discharging operation of the hybrid energy storage system is controlled according to the corrected charging and discharging power command.
[0080] In one optional implementation, a two-layer temporal memory network is used to extract the temporal features of the power load data. A multi-head attention mechanism is then used to process these temporal features to establish a data center power load prediction model. Based on this model, the data center power load for future time periods is predicted to obtain predicted power load data, including:
[0081] The electricity load data is preprocessed, and the maximum and minimum value normalization method is used to convert the electricity load data into standardized load data.
[0082] A two-layer temporal memory network is used to extract the temporal features of the standardized load data. A random deactivation layer is set between the two layers of the two-layer temporal memory network. The temporal features are fused with time features, temperature features, and load features to obtain fused features.
[0083] The fused features are processed using a multi-head attention mechanism to calculate multi-head attention weights. These weights are determined by attention scores calculated by multiple attention heads, which are obtained from query vectors, key vectors, and value vectors.
[0084] The fused features are weighted by the multi-head attention weights to obtain attention features, and a data center power load prediction model is established based on the attention features.
[0085] A loss function is constructed that includes a weighted combination of the root mean square error term and the mean absolute percentage error term. The data center power load prediction model is then optimized and trained based on the loss function.
[0086] Input the feature data corresponding to the time period to be predicted into the trained data center power load prediction model to obtain the initial prediction result;
[0087] The confidence interval of the initial prediction result is calculated based on the historical prediction error distribution. When the initial prediction result exceeds the confidence interval, it is corrected to obtain the predicted electricity load data.
[0088] Acquire historical power load data from the data center, including hourly power consumption data over the past 30 days. Preprocess the acquired power load data using the min-max normalization method to convert it into standardized load data. Specifically, the original load data x is converted into standardized data x' using the formula x' = (x - x_min) / (x_max - x_min), where x_min and x_max are the minimum and maximum loads in the dataset, so that all data is mapped to the interval [0,1]. For example, if the maximum power load in the original dataset is 10000 kW and the minimum power load is 2000 kW, then the normalized value for an 8000 kW power load is (8000 - 2000) / (10000 - 2000) = 0.75.
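The min-max normalization step can be sketched as follows, using the 2000 kW / 8000 kW / 10000 kW figures from the example. The function name and the handling of a constant series are assumptions for illustration only.

```python
def min_max_normalize(values):
    """Map each value to [0, 1] via x' = (x - x_min) / (x_max - x_min)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Constant series: returning zeros is an assumption, not from the source.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


loads_kw = [2000.0, 8000.0, 10000.0]  # illustrative loads in kW
normalized = min_max_normalize(loads_kw)
print(normalized)  # [0.0, 0.75, 1.0]
```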
[0089] For the standardized load data, a two-layer temporal memory network was used for temporal feature extraction. This network consists of two recurrent neural networks, each containing 128 hidden units. The output of the first layer serves as the input to the second layer. A random deactivation layer with a deactivation rate of 0.2 is placed between the two layers to prevent overfitting. The input data is the electricity load sequence over the past 168 hours (7 days), and training samples are constructed using a sliding window mechanism. After network processing, a 128-dimensional temporal feature vector is obtained.
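The sliding-window construction of training samples might look like the sketch below. The 168-hour window comes from this embodiment; the one-step prediction horizon, the stride of one hour, and the function name are assumptions, since the source does not state them.

```python
def sliding_windows(series, window=168, horizon=1):
    """Build (input, target) pairs from an hourly series.

    Each input holds `window` consecutive hours; each target holds the
    next `horizon` hours. The stride of 1 and horizon of 1 are assumptions.
    """
    samples = []
    for start in range(len(series) - window - horizon + 1):
        x = series[start:start + window]
        y = series[start + window:start + window + horizon]
        samples.append((x, y))
    return samples


# 30 days of hourly placeholder values (not real load data).
hourly = [float(i) for i in range(30 * 24)]
samples = sliding_windows(hourly)
print(len(samples), len(samples[0][0]))  # 552 168
```

With 720 hourly points, a 168-hour window and a one-hour target, the series yields 720 - 168 - 1 + 1 = 552 samples.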
[0090] Simultaneously, auxiliary features related to electricity load are extracted, including: time features (containing time information such as hour, date, day of the week, month, and season, encoded as a 24-dimensional vector); temperature features (containing current temperature, highest temperature, and lowest temperature, a total of 3 dimensions); and load features (containing server CPU utilization, memory usage, network traffic, etc., a total of 12 dimensions). These auxiliary features are concatenated with the time-series features to form a 167-dimensional fused feature vector.
[0091] A multi-head attention mechanism is used to process the fused features and calculate attention weights. In this embodiment, eight attention heads are used, each with a hidden dimension of 64. For each attention head, the fused features are linearly transformed into a query vector, a key vector, and a value vector. Attention scores are obtained by taking the dot product of the query vector and the key vectors and scaling the result; softmax normalization then converts the scores into attention weights. The value vectors are weighted by these attention weights and summed to obtain the output of that attention head. Finally, the outputs of the eight attention heads are concatenated (8 × 64 = 512 dimensions) and linearly transformed to obtain the multi-head attention output with a dimension of 512. This mechanism can capture dependencies between different time steps, improving the model's ability to model long-term dependencies.
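The computation inside a single attention head (dot product, scaling, softmax, weighted sum of values) can be sketched in plain Python. This is an illustrative sketch only: the learned linear projections that produce the query, key, and value vectors, and the concatenation of the eight heads, are omitted, and the small matrices are made-up inputs.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(queries, keys, values):
    """One attention head: softmax(Q K^T / sqrt(d_k)) V, row by row."""
    d_k = len(keys[0])
    outputs = []
    for q in queries:
        # Scaled dot-product score of this query against every key.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
                  for k in keys]
        weights = softmax(scores)
        # Weighted sum of the value vectors.
        out = [sum(w * v[j] for w, v in zip(weights, values))
               for j in range(len(values[0]))]
        outputs.append(out)
    return outputs


# Made-up toy inputs: two time steps, two-dimensional vectors.
q = [[1.0, 0.0], [0.0, 1.0]]
k = [[1.0, 0.0], [0.0, 1.0]]
v = [[1.0, 2.0], [3.0, 4.0]]
head_output = scaled_dot_product_attention(q, k, v)
print(head_output)
```

Each output row is a convex combination of the value vectors, so it always lies between the smallest and largest value in each dimension.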
[0092] The output of the multi-head attention is processed through a feedforward neural network. This network contains two fully connected layers: the first layer has 256 neurons and uses the ReLU activation function, while the second layer outputs a dimension of 1, corresponding to the predicted power load value. This completes the construction of the data center power load prediction model.
[0093] To optimize model performance, a loss function is constructed as a weighted combination of the root mean square error (RMSE) term and the mean absolute percentage error (MAPE) term. In this embodiment, the weight of the RMSE term is set to 0.6, and the weight of the MAPE term is set to 0.4. This combined loss function considers both the absolute error and the relative error between the predicted and actual values, enabling the model to maintain good prediction accuracy under both high and low load conditions.
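The weighted loss can be written out as a short sketch. The 0.6 and 0.4 weights come from this embodiment; expressing the percentage error as a fraction rather than a percent, and requiring nonzero actual values, are assumptions (the source does not say whether the loss is computed on raw or normalized loads).

```python
import math


def rmse(actual, predicted):
    """Root mean square error."""
    return math.sqrt(
        sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    )


def mape(actual, predicted):
    """Mean absolute percentage error as a fraction; actual values must be nonzero."""
    return sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def combined_loss(actual, predicted, w_rmse=0.6, w_mape=0.4):
    """Weighted sum of the RMSE term and the percentage-error term."""
    return w_rmse * rmse(actual, predicted) + w_mape * mape(actual, predicted)


# Made-up normalized loads for illustration.
loss = combined_loss([0.5, 0.8], [0.4, 0.8])
print(round(loss, 4))  # 0.0824
```

Because the two terms have different scales (RMSE carries the unit of the load, the percentage error is dimensionless), the effective balance between them depends on whether the inputs are normalized.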
[0094] Model training employs mini-batch gradient descent with a batch size of 64. The initial learning rate is set to 0.001, and a learning rate decay strategy is used, multiplying the learning rate by 0.9 every 50 training epochs. Training continues for 300 epochs, or is stopped early when the loss on the validation set fails to improve for 20 consecutive epochs.
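The learning rate schedule and early-stopping rule can be sketched as two small functions. The numeric settings (0.001, 0.9, 50, 20, 300) are from this embodiment; the zero-based epoch index in the schedule and the strict "must improve on the best loss" criterion are assumptions.

```python
def learning_rate(epoch, initial=0.001, decay=0.9, step=50):
    """Step decay: multiply by `decay` once every `step` epochs (epoch is 0-based)."""
    return initial * decay ** (epoch // step)


def stopping_epoch(val_losses, patience=20, max_epochs=300):
    """Return the 1-based epoch at which training stops.

    Stops once the validation loss has not improved on its best value
    for `patience` consecutive epochs, or at `max_epochs`.
    """
    best = float("inf")
    stale = 0
    for epoch, loss in enumerate(val_losses[:max_epochs], start=1):
        if loss < best:
            best = loss
            stale = 0
        else:
            stale += 1
            if stale >= patience:
                return epoch
    return min(len(val_losses), max_epochs)


# Made-up validation curve: improves twice, then plateaus.
val_curve = [1.0, 0.9] + [0.95] * 30
print(learning_rate(0), stopping_epoch(val_curve))  # 0.001 22
```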
[0095] After the model is trained, for the electricity load to be predicted in the next 24 hours, the characteristic data of the corresponding time period (including time characteristics, predicted temperature characteristics, estimated load characteristics, etc.) are input into the trained model to obtain the initial prediction results.
[0096] The confidence interval for the initial forecast is calculated based on the historical forecast error distribution. Specifically, the forecast errors for the same period over the past 30 days are statistically analyzed, and their mean μ and standard deviation σ are calculated. μ ± 2σ is used as the 95% confidence interval. When the initial forecast exceeds this confidence interval, the forecast value is corrected to the boundary value of the confidence interval. For example, if the forecast load at a certain moment is 9500 kW, but the upper limit of the confidence interval is 9200 kW, then the forecast result is corrected to 9200 kW.
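The correction step can be illustrated with a short sketch. The source does not specify what the error interval is centered on, so the code assumes a hypothetical `baseline` load for the period, to which the error band μ ± 2σ is added; the sample numbers are chosen only to reproduce the 9500 kW to 9200 kW example.

```python
import statistics

def error_band(errors):
    """Return (mu - 2*sigma, mu + 2*sigma) for historical forecast errors."""
    mu = statistics.mean(errors)
    sigma = statistics.pstdev(errors)
    return mu - 2 * sigma, mu + 2 * sigma

def clip_forecast(forecast, baseline, errors):
    """Clip a forecast to the interval baseline + (mu +/- 2*sigma).

    `baseline` is an assumed reference load for the period (hypothetical).
    """
    low, high = error_band(errors)
    lower, upper = baseline + low, baseline + high
    return min(max(forecast, lower), upper)

# 30 days of errors with mean 0 kW and standard deviation 100 kW
history = [100.0, -100.0] * 15
corrected = clip_forecast(9500.0, 9000.0, history)
```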
[0097] After the above steps, the predicted power load data for the future time period is finally obtained. In practical applications, this method achieved good performance on the test set, with a mean absolute percentage error of 2.3% and a root mean square error of 185 kW. Compared with traditional methods, it improves the prediction accuracy by about 25%, effectively supporting the energy planning and optimization of data centers.
[0098] Figure 2 is a bar chart comparing the performance of the data center power load prediction models in this invention:
[0099] This chart compares the performance of different models across multiple evaluation metrics, including three model types: a two-layer temporal memory network model (blank bars), a multi-head attention model (slanted bars), and a fusion model (lattice bars). Looking at four key metrics: In terms of prediction accuracy, the fusion model performs best at 96.8%, slightly higher than the multi-head attention model's 94.2% and the two-layer temporal model's 92.7%; in peak load prediction, the fusion model also leads at 94.1%, while the multi-head attention model and the two-layer temporal model are at 89.5% and 87.2%, respectively; in terms of 24-hour prediction error, all three models maintain low levels, with the fusion model at the lowest at 2.5%, the multi-head attention model at 4.3%, and the two-layer temporal model at 5.2%; for abnormal load response capability, the fusion model still performs best at 92.7%, the multi-head attention model at 87.2%, and the two-layer temporal model at 83.1%. Overall, the fusion model demonstrates the best performance across all evaluation metrics.
[0100] In one optional implementation, a renewable energy power generation prediction model is established based on the renewable energy power generation data. The renewable energy power generation data for a future time period is then predicted using this model, including:
[0101] The power generation data of the renewable energy source is preprocessed, and the power generation data is denoised by wavelet transform. The denoised power generation data is then converted into standardized power generation data.
[0102] A hybrid prediction model is constructed. The periodic decomposition unit of the hybrid prediction model uses the variational mode decomposition method to decompose the standardized power generation data into trend components and fluctuation components. The trend prediction unit of the hybrid prediction model uses a long short-term memory network structure to model the trend components, and the fluctuation prediction unit of the hybrid prediction model uses a gated recurrent neural network structure to model the fluctuation components.
[0103] Meteorological forecast data is input into the hybrid forecast model, and the hybrid forecast model is optimized and trained based on the combined loss function of root mean square error and mean absolute error. The meteorological forecast data corresponding to the time period to be predicted is input into the trained hybrid forecast model to obtain the predicted power generation data.
[0104] In the data preprocessing stage, the system first collects historical power generation data from renewable energy sources, which typically contains noise and outliers. To improve prediction accuracy, wavelet transform is used to denoise the power generation data. Specifically, the db4 wavelet basis function is selected to perform a four-level decomposition on the original power generation data, obtaining approximation coefficients and detail coefficients. A soft thresholding process is applied to the detail coefficients by setting a threshold function, with the threshold value being 0.6 times the standard deviation. Then, wavelet reconstruction is performed using the processed coefficients to obtain the denoised power generation data.
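The soft thresholding step can be sketched on its own. The db4 four-level decomposition and the reconstruction are omitted here because they would normally come from a wavelet library. The sketch also assumes the standard deviation in question is that of the detail coefficients, which the source does not state explicitly.

```python
import statistics

def soft_threshold(coeffs, threshold):
    """Shrink each coefficient toward zero by `threshold`."""
    shrunk = []
    for c in coeffs:
        magnitude = abs(c) - threshold
        if magnitude <= 0:
            shrunk.append(0.0)          # small coefficients are treated as noise
        else:
            shrunk.append(magnitude if c > 0 else -magnitude)
    return shrunk

def threshold_details(detail_coeffs, factor=0.6):
    """Soft-threshold detail coefficients at factor * standard deviation.

    Assumes the standard deviation is taken over the detail coefficients.
    """
    threshold = factor * statistics.pstdev(detail_coeffs)
    return soft_threshold(detail_coeffs, threshold)
```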
[0105] For 96 power data points collected at 10-minute intervals from a wind farm, the original data ranged from 0 to 50 MW and contained random noise. After wavelet denoising, the data curves became smoother, removing the influence of short-term random fluctuations and preserving the main trend of power generation. After denoising, the data was standardized, mapping the data values to the [0,1] interval. Specifically, the minimum power value was subtracted from the current power value, and the result was divided by the difference between the maximum and minimum power values to obtain the standardized power generation data.
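The mapping to the [0,1] interval is ordinary min-max scaling: each value minus the minimum, divided by the range. A minimal sketch follows, together with the inverse mapping that is later needed to return predictions to the original power range. The function names are hypothetical, and the code assumes the maximum is strictly greater than the minimum.

```python
def normalize(values):
    """Min-max scale values to [0, 1]; returns (scaled, minimum, maximum)."""
    lo, hi = min(values), max(values)
    span = hi - lo  # assumes hi > lo
    return [(v - lo) / span for v in values], lo, hi

def denormalize(scaled, lo, hi):
    """Map values in [0, 1] back to the original range."""
    return [s * (hi - lo) + lo for s in scaled]

scaled, p_min, p_max = normalize([0.0, 25.0, 50.0])   # power in MW
restored = denormalize(scaled, p_min, p_max)
```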
[0106] In the model building phase, a hybrid prediction model was designed, comprising three core components: a periodic decomposition unit, a trend prediction unit, and a fluctuation prediction unit. The periodic decomposition unit employs variational mode decomposition to decompose standardized power generation data into trend and fluctuation components. During implementation, the number of decomposition modes K was set to 3, the penalty factor α to 2000, and the tolerance ε to 1e-7, to decompose the standardized power data. After decomposition, the first mode function was used as the fluctuation component, and the sum of the remaining mode functions and the residual term was used as the trend component.
[0107] The standardized 96-point power data is input into the variational mode decomposition algorithm to obtain three mode functions that reflect the power variation law of different frequency characteristics. The first mode function contains high-frequency fluctuation information, while the other two mode functions and the residual term reflect the main trend of power variation.
[0108] The trend prediction unit uses a Long Short-Term Memory (LSTM) network structure to model the trend components. This network consists of an input layer, two LSTM hidden layers, and an output layer. The first LSTM layer has 64 units, and the second LSTM layer has 32 units. Each LSTM layer is followed by a Dropout layer with a dropout rate of 0.2 to prevent overfitting. Input features include historical trend component data and relevant meteorological forecast data, such as wind speed, wind direction, temperature, and air pressure. A sliding window method is used to construct the samples, with a window size of 24, meaning that data from the previous 24 time points is used to predict the value at the next time point.
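The sliding window construction can be sketched as below. It is a simplified, single-feature illustration: in the embodiment each time point would also carry meteorological features, and `make_windows` is a hypothetical helper name.

```python
def make_windows(series, window=24):
    """Build (inputs, target) pairs: `window` past points predict the next one."""
    samples = []
    for start in range(len(series) - window):
        inputs = series[start:start + window]
        target = series[start + window]
        samples.append((inputs, target))
    return samples

# A 96-point series yields 96 - 24 = 72 training samples
samples = make_windows(list(range(96)))
```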
[0109] The fluctuation prediction unit uses a gated recurrent unit (GRU) network structure to model the fluctuation component. This network consists of an input layer, two GRU hidden layers, and an output layer. The first GRU layer has 48 units, and the second GRU layer has 24 units. Each layer is followed by a Dropout layer with a dropout rate of 0.2. The input features are the same as those of the trend prediction unit, but the focus is on capturing short-term fluctuations.
[0110] During the model training phase, a combined loss function was used to optimize the hybrid prediction model. This loss function considers both the root mean square error and the mean absolute error, and is defined as a weighted sum of the two with a weight ratio of 0.6:0.4. During training, the Adam optimizer was used with an initial learning rate of 0.001. A learning rate decay strategy was employed, reducing the learning rate to 90% of its original value every 50 epochs. The batch size was set to 32, and the training duration was 200 epochs. To prevent overfitting, an early stopping strategy was adopted, stopping training when the loss function on the validation set showed no improvement for 10 consecutive epochs.
[0111] In the prediction implementation phase, meteorological forecast data corresponding to the time period to be predicted is input into the trained hybrid prediction model. First, the meteorological forecast data undergoes the same standardization process as the training data. Then, it is input into the trend prediction unit and the fluctuation prediction unit respectively to obtain the predicted values of the trend component and the fluctuation component for the future time period. The two predicted values are added together to obtain the power generation prediction result under the standardized scale. Finally, the prediction result is destandardized to map the predicted values in the [0,1] interval back to the original power value range, obtaining the final predicted power generation data.
[0112] Experimental results show that for 24-hour-ahead prediction of a 100 MW photovoltaic power plant, the mean absolute percentage error of this method is 3.2% and the root mean square error is 2.8 MW, an improvement in prediction accuracy of more than 15% over traditional single-model methods. The advantage is even more pronounced under weather conditions with large power fluctuations, where the method accurately captures abrupt power changes and fluctuation trends, providing a reliable basis for grid dispatch and energy market transactions.
[0113] Figure 3 is a schematic diagram comparing the root mean square error of different prediction models in an embodiment of the present invention:
[0114] This figure compares the root mean square error (RMSE) of three different forecasting schemes over eight forecast days. As can be seen, the RMSE of the proposed scheme remains consistently low, fluctuating between 2.2% and 2.6% throughout the forecast period, demonstrating the most stable and superior performance. The traditional VMD-LSTM scheme has a moderate RMSE, fluctuating between 4.5% and 4.9%, nearly twice that of the proposed scheme. The single LSTM scheme performs the worst, with an RMSE fluctuating between 5.7% and 6.2%, and its fluctuation range is relatively large. The differences among the three schemes are most pronounced on the fourth forecast day: the proposed scheme has an RMSE of 2.6%, the traditional VMD-LSTM scheme 4.9%, and the single LSTM scheme 6.0%. Overall, the proposed scheme significantly outperforms the other two schemes in both forecast accuracy and stability, demonstrating its advantages in renewable energy power generation forecasting.
[0115] In one optional implementation, a hybrid energy storage optimal scheduling model is constructed after feature enhancement based on the predicted electricity load data and the predicted power generation data. An initial charge / discharge power command is output based on the hybrid energy storage optimal scheduling model, including:
[0116] A multidimensional spatiotemporal state space is constructed based on the predicted electricity load data and the predicted power generation data. A spatiotemporal attention mechanism is used to enhance the features of the multidimensional spatiotemporal state space to obtain an enhanced state space.
[0117] A hierarchical deep reinforcement learning network is constructed. The state encoder of the hierarchical deep reinforcement learning network uses a dual-flow graph convolutional neural network to extract the temporal and spatial features of the enhanced state space. The action generator of the hierarchical deep reinforcement learning network uses a gated recurrent unit to fuse the temporal and spatial features to generate a charge-discharge power decision. The value evaluator of the hierarchical deep reinforcement learning network uses a dual evaluation network structure to evaluate the value function of the charge-discharge power decision.
[0118] Based on historical scheduling data, an immediate reward item containing multiple costs is constructed. A long-term reward item is obtained by modeling the charging and discharging operation sequence using a recurrent neural network. The immediate reward item is dynamically adjusted through adaptive weight optimization, and the combination yields an adaptive hybrid reward function.
[0119] The adaptive hybrid reward function is combined with the hierarchical deep reinforcement learning network to construct a hybrid energy storage optimization scheduling model, and the hybrid energy storage optimization scheduling model is trained.
[0120] The current state information is input into the trained hybrid energy storage optimization scheduling model, and the initial charging and discharging power command is output through the action generator.
[0121] The predicted electricity load data and the predicted power generation data covering a 24-hour period are obtained. This data typically comes from power grid monitoring systems or energy management platforms. Taking an industrial park as an example, the predicted electricity load peaks at 15 MW between 8:00 AM and 6:00 PM on weekdays, while photovoltaic power generation reaches its highest value of 10 MW between 11:00 AM and 2:00 PM. Wind power is relatively high at night and in the early morning, averaging 3 MW.
[0122] Based on these prediction data, the system constructs a multi-dimensional spatiotemporal state space. This state space includes a temporal dimension (24 hours) and a spatial dimension (including the geographical distribution of load nodes, generation nodes, and energy storage nodes). For example, a power grid may have 10 load nodes, 5 generation nodes, and 3 hybrid energy storage nodes, forming a complex network topology. The system employs a spatiotemporal attention mechanism to enhance the features of this multi-dimensional state space. Specifically, for the temporal dimension, self-attention is used to calculate the correlation weights between different times; for the spatial dimension, a graph attention network is used to capture the topological relationships between different nodes, ultimately resulting in an enhanced state space representation. In experiments, this mechanism improved the model's prediction accuracy for peak and valley periods from 85% to 93%.
[0123] In the construction of the hierarchical deep reinforcement learning network, the state encoder adopts a dual-flow graph convolutional neural network architecture. The temporal feature extraction stream uses a 10-layer temporal graph convolutional network with a kernel size of 3 and a hidden layer dimension of 128 to capture the load and power generation change trends over 24 hours; the spatial feature extraction stream uses an 8-layer spatial graph convolutional network with a node embedding dimension of 64 and an edge feature dimension of 32 to extract the mutual influence between nodes in the power grid topology.
[0124] The action generator, based on a two-layer gated recurrent unit (GRU) network with 128 hidden units, receives fused spatiotemporal features as input to generate charge / discharge power decisions for the hybrid energy storage system. These decisions are made at one-hour intervals, with the output power value for each energy storage device ranging from -5 MW to 5 MW; positive values represent discharging, and negative values represent charging. The value evaluator employs a dual-evaluation network structure comprising a main network and a target network, each consisting of three fully connected layers with 256, 128, and 64 hidden nodes respectively, to evaluate the long-term value of the current charge / discharge decisions. This dual structure effectively reduces overestimation bias in Q-value estimation, making the training process more stable.
[0125] To construct a reasonable reward function, the system designs various real-time cost items based on historical dispatch data, including electricity purchase cost (peak-hour price 0.9 yuan / kWh, valley-hour price 0.4 yuan / kWh), energy storage loss cost (lithium battery cycle efficiency 90%, lead-acid battery cycle efficiency 85%), energy storage lifetime cost (different loss coefficients corresponding to the number of deep cycles), and load fluctuation penalty (a penalty is added when the load change rate exceeds 10% / hour). The system uses a recurrent neural network with a 3-layer LSTM structure and a hidden layer size of 64 to model the charging and discharging operation sequence over a continuous 24 hours, obtaining long-term reward items that reflect the quality of the long-term strategy. Through an adaptive weight optimization mechanism, the system dynamically adjusts the weights of each reward item according to the current grid load situation. For example, it increases the peak shaving and valley filling reward weight during peak load periods and increases the smoothing output reward weight when renewable energy fluctuations are large, thus combining them to obtain an adaptive hybrid reward function.
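Some of the cost items listed above can be written out as simple functions. This is a sketch under assumptions: the prices and cycle efficiencies are the example values from this embodiment, while the penalty coefficient `coeff` and the linear form of the penalty are hypothetical, since the source does not give the penalty formula.

```python
PEAK_PRICE = 0.9     # yuan / kWh, example value from this embodiment
VALLEY_PRICE = 0.4   # yuan / kWh, example value from this embodiment

def purchase_cost(energy_kwh, peak):
    """Cost of electricity bought from the grid in one period."""
    return energy_kwh * (PEAK_PRICE if peak else VALLEY_PRICE)

def cycle_loss_kwh(energy_in_kwh, efficiency):
    """Energy lost in one charge / discharge cycle (0.90 lithium, 0.85 lead-acid)."""
    return energy_in_kwh * (1.0 - efficiency)

def fluctuation_penalty(prev_load, load, coeff=1.0, limit=0.10):
    """Penalty applied only when the hourly load change rate exceeds `limit`.

    `coeff` and the linear penalty shape are assumptions for illustration.
    """
    rate = abs(load - prev_load) / prev_load
    return coeff * max(0.0, rate - limit)
```

For example, buying 1000 kWh at the peak price costs about 900 yuan, against about 400 yuan at the valley price, which is the spread the scheduling model can exploit.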
[0126] The adaptive hybrid reward function was combined with the hierarchical deep reinforcement learning network to construct a complete hybrid energy storage optimization scheduling model. The model was trained using an experience replay mechanism with a batch size of 256, a learning rate of 0.001, and a discount factor of 0.95. Training was conducted on a training set containing 12 months of historical data, requiring approximately 50,000 iterations to converge. After training, the model achieved a 93% scheduling optimization target achievement rate on the test set, representing a 15% improvement compared to traditional methods.
[0127] In practical applications, the system inputs the current state information into the trained hybrid energy storage optimization scheduling model every 15 minutes. This state information includes: the current grid load level (e.g., 12MW), renewable energy generation power (e.g., 6MW photovoltaic, 2MW wind power), energy storage device status (e.g., lithium battery SOC at 65%, lead-acid battery SOC at 70%), and electricity price information (current electricity price is 0.75 yuan / kWh). The model's action generator, through state encoding and decision generation, outputs initial charge / discharge power commands, such as discharging the lithium battery at 2.5MW and charging the lead-acid battery at 1.2MW, to optimize the overall economic efficiency and stability of the system.
[0128] In one optional implementation, an immediate reward term incorporating multiple costs is constructed based on historical scheduling data. A long-term reward term is obtained by modeling the charge-discharge operation sequence using a recurrent neural network. The immediate reward term is dynamically adjusted through adaptive weight optimization, and the resulting combination yields an adaptive hybrid reward function, including:
[0129] Historical scheduling data of the hybrid energy storage system is collected, and an immediate reward item including multiple costs is constructed based on the historical scheduling data;
[0130] A recurrent neural network is used to perform time-series modeling on the charging and discharging operation sequences in the historical scheduling data, extract the long-term dependency features of the charging and discharging operation sequences, and construct a long-term reward item that reflects the long-term operating benefits of the energy storage system based on the long-term dependency features.
[0131] An evaluation index is constructed based on the system operation status sequence and cost-benefit sequence. The scheduling effect score of each group of charging and discharging operations in the historical scheduling data is calculated based on the evaluation index. The scheduling effect score is input into an adaptive weight optimizer. The adaptive weight optimizer dynamically adjusts the weight coefficient of each cost sub-item in the immediate reward item according to the changing trend of the scheduling effect score. When the scheduling effect score corresponding to a specific cost sub-item decreases, the weight coefficient of that cost sub-item is increased.
[0132] The immediate reward and the long-term reward are weighted and combined to obtain an adaptive hybrid reward function.
[0133] As shown in Figure 4, the method includes:
[0134] In the intelligent scheduling process of the hybrid energy storage system, historical scheduling data is first collected. This data includes, but is not limited to: grid load fluctuation data, electricity price change data, charging and discharging status of energy storage devices, state of charge (SOC) changes of energy storage devices, state of health (SOH) data of various types of energy storage devices, and corresponding scheduling decisions and system revenue data. In practical applications, the system collects data every 5 minutes for 3 consecutive months, collecting approximately 26,000 data samples in total.
[0135] Based on collected historical scheduling data, an immediate reward item is constructed. This immediate reward item comprehensively considers multiple cost factors, including: battery cycle degradation cost, electricity purchase cost, peak-valley electricity price difference revenue, demand-based electricity cost savings, and frequency regulation ancillary service revenue.
[0136] Taking battery cycle degradation cost as an example, this invention establishes a degradation cost model by analyzing the relationship between the number of cycles and the capacity degradation of energy storage devices at different depths of discharge (DOD). In practical applications, when a lithium battery operates at 80% DOD, its single-cycle degradation cost is approximately 0.02% of the total investment. By converting the depth of discharge of each operation into an equivalent number of complete cycles, the degradation cost of a single dispatch can be calculated.
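The conversion to equivalent cycles can be illustrated with a short sketch. It assumes a linear relationship between depth of discharge and equivalent cycles relative to the 80% reference point. That linearity is a simplifying assumption made here for illustration; the source does not give the conversion formula, and real degradation is generally nonlinear in depth of discharge.

```python
def equivalent_cycles(dod, reference_dod=0.8):
    """Equivalent number of reference cycles for one operation (linear assumption)."""
    return dod / reference_dod

def degradation_cost(dod, investment, cost_fraction=0.0002):
    """Degradation cost of one dispatch.

    cost_fraction = 0.02% of total investment per cycle at the reference DOD.
    """
    return investment * cost_fraction * equivalent_cycles(dod)

# Hypothetical 10 million yuan investment, one cycle at 80% DOD
cost_full = degradation_cost(0.8, 10_000_000)
cost_half = degradation_cost(0.4, 10_000_000)
```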
[0137] To capture the long-term dependency characteristics in energy storage system scheduling, this invention employs a recurrent neural network to model historical charge-discharge operation sequences. Specifically, a Long Short-Term Memory (LSTM) network structure is used, with the input being a continuous 24-hour charge-discharge operation sequence (one operation point every 5 minutes, for a total of 288 time steps). The network contains two LSTM layers, each with 128 hidden units. This structure captures the temporal patterns and long-term dependencies of energy storage system charge-discharge operations.
[0138] The network was trained using operation sequences from a continuous 90-day period of historical data, approximately 7,776 time-series samples. After training, for a new operation sequence, the LSTM network can predict the impact of that sequence on the future performance of the system, thus constructing a reward term that reflects long-term benefits.
[0139] The long-term reward term is constructed based on the prediction output of the LSTM network, which includes the predicted cumulative revenue and device state changes for the next 7 days. By comparing the differences in predicted revenue under different operation sequences, the contribution of the current operation to long-term benefits is quantified. In actual testing, for peak-valley electricity price arbitrage scenarios, the scheduling strategy using the long-term reward term, compared to a strategy that only considers immediate revenue, improves the average monthly revenue by 12.3%, while reducing battery capacity degradation by 8.7%.
[0140] The core innovation of this invention lies in its adaptive weight optimization mechanism. To achieve dynamic adjustment of weights, a feedback mechanism based on scheduling performance evaluation is designed. First, evaluation metrics are defined, including average daily return on investment, battery capacity degradation rate, and peak load reduction ratio. These metrics are calculated for each group of charge / discharge operations in historical scheduling data to obtain a comprehensive scheduling performance score.
[0141] Taking a specific experimental scenario as an example, the initial weights for battery cycle degradation cost, electricity purchase cost, and peak-valley electricity price difference revenue are set at 0.3. During operation, when the system's battery capacity degradation rate exceeds the expected target (e.g., monthly degradation rate exceeds 0.5%), the adaptive optimizer adjusts the weight of battery degradation cost from 0.3 to 0.45, while appropriately reducing the weights of other cost items, making the system focus more on battery life protection.
[0142] The adaptive weight optimizer is implemented based on the gradient ascent method. Daily, the system calculates the overall performance score under the current scheduling strategy, analyzes the sensitivity relationship between the score and the weights of each cost item, and determines the direction and magnitude of weight adjustments. To avoid overly drastic weight adjustments that could lead to system instability, an upper limit is set on the step size of weight changes, limiting each adjustment to no more than 15% of the original weight, and the sum of all weights remains 1. In practical applications, after 30 days of adaptive adjustments, the system's overall scheduling performance score improved by 17.5%, validating the effectiveness of the adaptive weight optimization.
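One adjustment step of such an optimizer might look like the sketch below. The gradient values are assumed to be supplied by the sensitivity analysis, and the step size `step` is hypothetical. Note that renormalizing the weights so they sum to 1 can move an individual weight slightly beyond the 15% cap; the source does not say how the two constraints are reconciled.

```python
def update_weights(weights, gradients, step=0.05, max_rel_change=0.15):
    """One gradient-ascent step with a capped change per weight.

    weights:   current weight of each cost item (sum to 1)
    gradients: sensitivity of the performance score to each weight (assumed given)
    """
    updated = []
    for w, g in zip(weights, gradients):
        delta = step * g
        cap = max_rel_change * w              # at most 15% of the current weight
        delta = max(-cap, min(cap, delta))
        updated.append(w + delta)
    total = sum(updated)
    return [w / total for w in updated]       # renormalize so weights sum to 1

new_weights = update_weights([0.3, 0.3, 0.4], [10.0, -1.0, 0.0])
```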
[0143] A complete adaptive hybrid reward function is constructed by weighting and combining immediate and long-term rewards. The initial combined weights are set at 0.6 for immediate reward and 0.4 for long-term reward. As the system runs and learns, this ratio is dynamically adjusted through the aforementioned adaptive mechanism. In a comprehensive application scenario, the scheduling strategy guided by the hybrid reward function achieves a combined effect of 15.8% higher annualized returns and approximately 22.3% longer battery life compared to the traditional fixed-weight method.
[0144] The adaptive hybrid reward function construction method provided by this invention enables hybrid energy storage systems to balance short-term economic benefits with long-term operational efficiency, continuously optimize scheduling strategies in a dynamically changing operating environment, and achieve a balance between the economy, stability, and sustainability of energy storage systems.
[0145] In one optional implementation, based on the operating status data of the hybrid energy storage system and the initial charge / discharge power command, a two-layer model predictive control framework is used for optimization. The corrected charge / discharge power command is output through the synergistic effect of distributed optimization and a robust controller, including:
[0146] Based on the operating status data of the hybrid energy storage system, the energy storage units are grouped to construct a hierarchical system state space. State feature vectors are extracted based on variational autoencoders. A hybrid structure of gated recurrent neural networks and residual convolutional networks is used to extract spatiotemporal features from the state feature vectors to construct a multimodal state prediction model.
[0147] A system dynamic response model is established, the output of the multimodal state prediction model is used to estimate the state through a particle filter algorithm, and the prediction results of the system dynamic response model are fitted with a probability distribution. An adaptive constraint estimator is constructed based on the information entropy criterion.
[0148] A two-layer model predictive control framework is constructed, wherein the long-time domain optimization layer performs prediction optimization based on the multimodal state prediction model, and the short-time domain execution layer dynamically corrects the initial charge and discharge power command according to the prediction error of the long-time domain optimization layer to obtain the initial correction result.
[0149] Based on the initial correction results, a multi-timescale optimization objective function is constructed. The multi-timescale optimization objective function is decomposed using the distributed alternating direction method of multipliers (ADMM), and the sub-problems are solved. The solution results are then input into the robust controller.
[0150] The initial charge / discharge power command is co-optimized with the robust controller through the two-layer model predictive control framework, and the corrected charge / discharge power command is output.
[0151] In the actual operation of a hybrid energy storage system, operational status data is acquired, including the state of charge, temperature, health status, and grid frequency fluctuations of the energy storage units. The system is grouped according to the type and characteristics of the energy storage units to construct a hierarchical system state space. For example, different types of energy storage units, such as lithium batteries, supercapacitors, and flywheel energy storage, are divided into fast-response and large-capacity groups based on their response speed and energy density characteristics.
[0152] The state data of each group is used to extract features through a variational autoencoder. The encoder part of the variational autoencoder contains three layers of neural network with 128, 64 and 32 hidden neurons, respectively. The ReLU activation function is used, and the state feature vector is obtained by training through a loss function that minimizes the reconstruction error and KL divergence.
[0153] For the extracted state feature vectors, a hybrid structure of a gated recurrent unit (GRU) network and a residual convolutional network (ResNet) is used for spatiotemporal feature extraction. The GRU network contains two hidden layers, each with 128 neurons, to capture temporal dependencies; the residual convolutional network uses three residual blocks, each containing two convolutional layers and one skip connection, with a 3×3 kernel size, to extract spatial features. The output of this hybrid structure is fused through fully connected layers to construct a multimodal state prediction model. This model can simultaneously consider the historical state evolution and spatial correlation of energy storage units to predict the system state in future periods. In practical applications, this model achieves a state of charge prediction accuracy of 95.2% for lithium battery energy storage units and a power fluctuation prediction accuracy of 93.8% for supercapacitor banks.
[0154] Based on the physical characteristics of energy storage systems, a dynamic response model incorporating both electrical and thermal characteristics is established. This model considers factors such as the internal resistance, capacity decay, and temperature effects of the energy storage units, describing their dynamic response characteristics under different operating conditions. The output of the multimodal state prediction model is used for state estimation via a particle filtering algorithm. The particle filtering algorithm employs 1000 particle samples, and the system state is probabilistically represented through importance sampling and resampling processes.
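The importance sampling and resampling steps can be sketched for a single scalar state such as the state of charge. This is a simplified bootstrap particle filter under assumptions: the state evolves as a random walk in place of the dynamic response model, the measurement noise is Gaussian, and the noise parameters are hypothetical.

```python
import math
import random

def particle_filter_step(particles, measurement, process_std, meas_std, rng):
    """One predict / weight / resample cycle for a scalar state."""
    # Predict: random-walk stand-in for the dynamic response model (assumption)
    predicted = [p + rng.gauss(0.0, process_std) for p in particles]
    # Importance weights from a Gaussian measurement likelihood
    weights = [math.exp(-0.5 * ((measurement - p) / meas_std) ** 2)
               for p in predicted]
    total = sum(weights)
    weights = [w / total for w in weights]
    # Systematic resampling
    n = len(predicted)
    start = rng.random() / n
    resampled, cumulative, i = [], weights[0], 0
    for k in range(n):
        u = start + k / n
        while u > cumulative and i < n - 1:
            i += 1
            cumulative += weights[i]
        resampled.append(predicted[i])
    return resampled

rng = random.Random(0)
particles = [rng.uniform(0.4, 0.9) for _ in range(1000)]   # 1000 particles
posterior = particle_filter_step(particles, 0.65, 0.01, 0.02, rng)
soc_estimate = sum(posterior) / len(posterior)
```

The resampled particle set serves as the probabilistic representation of the state; its mean gives a point estimate and its spread indicates the uncertainty.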
[0155] The prediction results of the system's dynamic response model are fitted with a probability distribution, and the uncertainty is represented by a Gaussian mixture distribution. An adaptive constraint estimator is constructed by calculating the information entropy. This estimator dynamically adjusts the system's operational constraints according to the degree of uncertainty in the state prediction; the greater the uncertainty, the more conservative the constraints are applied, ensuring the safety of system operation.
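The idea of tightening constraints as uncertainty grows can be illustrated as follows. The sketch uses the Shannon entropy of a discretized state distribution rather than a Gaussian mixture, and the linear mapping from entropy to power limit, including the `max_tightening` parameter, is a hypothetical choice; the source does not specify the mapping.

```python
import math

def shannon_entropy(probabilities):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(p * math.log(p) for p in probabilities if p > 0)

def tightened_limit(base_limit_mw, probabilities, max_tightening=0.3):
    """Reduce a power limit in proportion to normalized entropy.

    Zero entropy keeps the base limit; maximum entropy reduces it by
    `max_tightening` (a hypothetical parameter).
    """
    h_max = math.log(len(probabilities))
    ratio = shannon_entropy(probabilities) / h_max if h_max > 0 else 0.0
    return base_limit_mw * (1.0 - max_tightening * ratio)

certain = tightened_limit(5.0, [1.0, 0.0, 0.0, 0.0])      # no uncertainty
uncertain = tightened_limit(5.0, [0.25, 0.25, 0.25, 0.25])  # maximum uncertainty
```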
[0156] A two-layer model predictive control framework is constructed, consisting of a long-term optimization layer and a short-term execution layer. The long-term optimization layer uses a 15-minute time step and a 24-hour prediction horizon, and performs long-term optimization based on the multimodal state prediction model. Its objective function includes three parts: system operating cost, energy storage lifetime loss cost, and grid frequency regulation revenue. The optimized trajectory is obtained by solving the problem with a quadratic programming algorithm.
[0157] The short-term execution layer operates with a time step of 10 seconds and a prediction horizon of 5 minutes. Based on the real-time system state and the prediction error of the long-term optimization layer, it dynamically corrects the initial charge and discharge power commands. The short-term execution layer employs a receding-horizon (rolling) optimization strategy: in each control cycle it executes only the control commands for the current time period, then updates and re-solves the optimization problem in the next control cycle to obtain the initial correction results.
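The rolling strategy of the short-term layer can be sketched as below. The 10-second step and 5-minute horizon come from the description; the quadratic tracking-plus-smoothing objective, the unconstrained closed-form solve, the step-change reference, and all function names are assumptions chosen to keep the example small.

```python
import numpy as np

STEP_S = 10                       # control step from the description
HORIZON_S = 5 * 60                # prediction horizon from the description
N_STEPS = HORIZON_S // STEP_S     # 30 steps per horizon

def plan_horizon(reference, p_prev, smooth_weight=1.0):
    """Minimise sum (p_k - ref_k)^2 + w * sum (p_k - p_{k-1})^2, p_{-1} = p_prev."""
    n = len(reference)
    diff = np.eye(n) - np.eye(n, k=-1)     # (diff @ p)[k] = p_k - p_{k-1}
    d0 = np.zeros(n)
    d0[0] = p_prev                         # carries the previously applied command
    lhs = np.eye(n) + smooth_weight * diff.T @ diff
    rhs = np.asarray(reference, dtype=float) + smooth_weight * diff.T @ d0
    return np.linalg.solve(lhs, rhs)       # normal equations of the quadratic cost

def run_receding_horizon(reference_trajectory, p_initial=0.0):
    applied = []
    p_prev = p_initial
    for t in range(len(reference_trajectory) - N_STEPS + 1):
        plan = plan_horizon(reference_trajectory[t:t + N_STEPS], p_prev)
        p_prev = float(plan[0])            # execute only the first command
        applied.append(p_prev)             # the rest is re-planned next cycle
    return np.array(applied)

# Hypothetical long-term reference: a step from 0 kW to 50 kW.
reference = np.concatenate([np.zeros(30), np.full(60, 50.0)])
commands = run_receding_horizon(reference)
```

Each cycle plans all 30 steps but applies only the first, so later commands always reflect the most recent state. A practical controller would add power and state-of-charge limits, which this sketch omits.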
[0158] Based on the initial correction results, a multi-timescale optimization objective function is constructed. This objective function balances short-term system response speed against long-term optimization objectives, and is solved using the distributed alternating direction method of multipliers (ADMM). The ADMM algorithm decomposes the optimization problem into multiple subproblems, each corresponding to a different energy storage unit group or a different timescale objective. In practical applications, ADMM converges to a near-optimal solution in 15 iterations, improving computational efficiency by 65% compared to centralized optimization. The ADMM solution is then input into a robust controller, which is designed based on H∞ control theory and can cope with system parameter uncertainties and external disturbances.
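To show how ADMM splits a coupled problem into per-unit subproblems, the sketch below distributes a total power request among energy storage groups using the standard exchange form of ADMM. The quadratic cost `a_i * p_i**2` per group, the coefficient values, the penalty parameter `rho`, and the function name are assumptions for illustration; the actual multi-timescale objective is not specified at this level of detail.

```python
import numpy as np

def admm_power_split(cost_coeffs, total_power, rho=2.0, max_iter=300, tol=1e-8):
    """Minimise sum a_i * p_i^2 subject to sum p_i = total_power (exchange ADMM)."""
    a = np.asarray(cost_coeffs, dtype=float)
    n = len(a)
    x = np.zeros(n)          # power assigned to each storage group
    u = 0.0                  # scaled dual variable shared by all groups
    target_mean = total_power / n
    for iteration in range(1, max_iter + 1):
        x_old = x
        residual = x.mean() - target_mean
        # Local subproblems: each group solves its own small quadratic
        # argmin a_i*p^2 + (rho/2)*(p - v_i)^2, which has a closed form.
        v = x - residual - u
        x = rho * v / (2.0 * a + rho)
        # Coordination step: update the shared dual with the new mismatch.
        residual = x.mean() - target_mean
        u += residual
        if abs(residual) < tol and np.max(np.abs(x - x_old)) < tol:
            break
    return x, iteration

# Hypothetical groups with cost coefficients 1, 2 and 4 sharing 70 kW.
allocation, n_iter = admm_power_split([1.0, 2.0, 4.0], 70.0)
```

For this toy problem the optimum equalizes marginal cost, giving allocations inversely proportional to the cost coefficients (40, 20 and 10 kW). Only the mean mismatch and the dual variable need to be exchanged between groups, which is what makes the scheme distributed.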
[0159] The initial charge / discharge power command is co-optimized using a two-layer model predictive control framework and a robust controller. The two-layer model predictive control framework generates the optimal trajectory, while the robust controller handles uncertainties and disturbances. Together, they output the corrected charge / discharge power command. In practical applications, this method reduces power fluctuations by 15.3%, improves energy storage efficiency by 8.6%, and extends the lifespan of the energy storage system by approximately 12.2% compared to traditional single-layer model predictive control, while ensuring the system's rapid response to grid frequency regulation.
[0160] In one optional implementation, a two-layer model predictive control framework is constructed, wherein the long-time domain optimization layer performs predictive optimization based on the multimodal state prediction model, and the short-time domain execution layer dynamically corrects the initial charge / discharge power command based on the prediction error of the long-time domain optimization layer to obtain an initial correction result, including:
[0161] Construct a multimodal state space and map the multimodal state space into a long-time domain prediction state set and a short-time domain execution state set;
[0162] Based on the long-term predicted state set, a gated skip-connection recurrent neural network structure is used to extract multi-scale state temporal features. The multi-scale state temporal features are input into a prediction optimizer constructed by a conditional generative adversarial network, and the predicted state sequence is output.
[0163] A short-term execution layer is established based on the short-term execution state set. The short-term execution layer uses a bidirectional long short-term memory network to calculate the temporal correlation between the predicted state sequence and the actual state sequence, and generates a prediction error feature matrix.
[0164] An adaptive corrector is constructed, which performs multi-dimensional feature extraction on the prediction error feature matrix to obtain multi-dimensional features, and constructs a multi-head cross-attention mechanism based on the multi-dimensional features to generate adaptive correction weights.
[0165] A deep neural network with residual connections is used to perform nonlinear feature fusion of the adaptive correction weights, the multi-scale state time-series features and the initial charge-discharge power command to obtain a nonlinear feature fusion result.
[0166] The nonlinear feature fusion result is input into a probability distribution generation network. The probability distribution generation network samples and generates a set of candidate correction coefficients in the latent variable space based on the distribution characteristics of the prediction error feature matrix. The set of candidate correction coefficients is evaluated, and the optimal correction coefficient is selected based on the Bayesian optimization criterion. The optimal correction coefficient is used to dynamically adjust the initial charge and discharge power command to obtain the initial correction result.
[0167] A two-layer model predictive control framework is constructed, in which the long-term optimization layer performs predictive optimization based on a multimodal state prediction model, and the short-term execution layer dynamically corrects the initial charge and discharge power commands based on the prediction error of the long-term optimization layer. This framework characterizes the operating state of a complex power grid system by constructing a multimodal state space, mapping this state space to a long-term predictive state set and a short-term execution state set. The long-term state set includes hourly variables such as grid load forecasting, renewable energy generation forecasting, and market electricity price forecasting; the short-term state set includes minute- or second-level variables such as battery state of charge, real-time charge and discharge power, and system frequency.
[0168] Based on a long-term predicted state set, the system employs a gated skip connection recurrent neural network structure to extract multi-scale temporal features of states. This network structure contains three layers of recurrent units, each with 64 units, and gating units are set between adjacent layers to control the flow of information. Through the skip connection mechanism, the system achieves the fusion of features from different time scales, effectively capturing long-term temporal patterns.
[0169] For the 24-hour power load forecasting task, the system uses a sliding window of 30 minutes, extracting eight key temporal features at each step to form a feature vector. These multi-scale temporal features are then input into a prediction optimizer constructed using a conditional generative adversarial network. This optimizer consists of a generator and a discriminator. The generator employs a five-layer fully connected network structure with 128, 256, 512, 256, and 128 neurons per layer, respectively; the discriminator uses a three-layer convolutional network with a kernel size of 3 and 16, 32, and 64 channels, respectively. Through adversarial training, the system can output predicted state sequences, such as the predicted grid load and renewable energy generation values for the next 24 hours.
[0170] Based on a short-term execution state set, the system establishes a short-term execution layer. This layer uses a bidirectional long short-term memory network to calculate the temporal correlation between the predicted state sequence and the actual state sequence. The network contains two LSTM layers with a hidden layer dimension of 128. It analyzes the temporal data in both forward and backward directions to generate a prediction error feature matrix. Taking a battery energy management system as an example, the system collects the actual grid state every 5 minutes, compares it with the predicted state, and calculates the relative error rate. These error data are organized into a feature matrix of dimension [24×12, 10], where 24×12 represents the number of 5-minute granular time points within a day, and 10 represents the number of monitored state variables.
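The shape of the error data in the battery energy management example can be illustrated as follows. This sketch covers only the raw relative-error calculation and its layout; it does not model the bidirectional LSTM. The synthetic data, the `eps` guard against division by zero, and the function name are assumptions.

```python
import numpy as np

POINTS_PER_DAY = 24 * 12   # 5-minute samples in one day
N_STATE_VARS = 10          # monitored state variables

def relative_error_matrix(predicted, actual, eps=1e-6):
    """Element-wise relative error, one row per time point, one column per variable."""
    predicted = np.asarray(predicted, dtype=float)
    actual = np.asarray(actual, dtype=float)
    return (predicted - actual) / np.maximum(np.abs(actual), eps)

rng = np.random.default_rng(2)
actual = rng.uniform(50.0, 100.0, size=(POINTS_PER_DAY, N_STATE_VARS))    # synthetic
predicted = actual * (1.0 + rng.normal(0.0, 0.05, size=actual.shape))     # synthetic
error_matrix = relative_error_matrix(predicted, actual)
```

The result has 288 rows and 10 columns, matching the [24×12, 10] dimension given above.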
[0171] An adaptive corrector is further constructed to extract multi-dimensional features from the prediction error feature matrix. A three-way parallel feature extraction network is employed: the first path uses a one-dimensional convolutional network with a kernel size of 3 to extract local temporal features; the second path uses global average pooling to capture the global error distribution; and the third path uses a densely connected network to learn the correlations between errors. After dimensionality unification, the feature results from the three paths are input into a multi-head cross-attention mechanism with four attention heads, each with a dimension of 32. Through this mechanism, the system generates adaptive correction weights, which reflect the importance of different time points and different state variables.
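A minimal NumPy sketch of multi-head cross-attention with four heads of dimension 32 is given below. The projection matrices are random and untrained, and the choice of which feature path supplies queries and which supplies keys and values, as well as the sequence lengths, are assumptions; the description fixes only the head count and head dimension.

```python
import numpy as np

N_HEADS, HEAD_DIM = 4, 32           # values stated in the description
D_MODEL = N_HEADS * HEAD_DIM        # 128 after dimensionality unification (assumed)

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)   # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def split_heads(x):
    # (seq, D_MODEL) -> (N_HEADS, seq, HEAD_DIM)
    return x.reshape(x.shape[0], N_HEADS, HEAD_DIM).transpose(1, 0, 2)

def multi_head_cross_attention(query_feats, context_feats, w_q, w_k, w_v, w_o):
    q = split_heads(query_feats @ w_q)        # queries from one feature path
    k = split_heads(context_feats @ w_k)      # keys and values from another path
    v = split_heads(context_feats @ w_v)
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(HEAD_DIM)   # (heads, Lq, Lk)
    attn = softmax(scores, axis=-1)
    heads = attn @ v                                        # (heads, Lq, HEAD_DIM)
    merged = heads.transpose(1, 0, 2).reshape(query_feats.shape[0], D_MODEL)
    return merged @ w_o, attn

rng = np.random.default_rng(3)
w_q, w_k, w_v, w_o = (rng.normal(0.0, D_MODEL ** -0.5, (D_MODEL, D_MODEL))
                      for _ in range(4))
local_feats = rng.normal(size=(288, D_MODEL))   # hypothetical local temporal features
other_feats = rng.normal(size=(12, D_MODEL))    # hypothetical features from another path
output, attn_weights = multi_head_cross_attention(local_feats, other_feats,
                                                  w_q, w_k, w_v, w_o)
```

Each row of `attn_weights` sums to one, so it can be read as the relative importance assigned to each context position for a given time point.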
[0172] A deep neural network with residual connections is used to nonlinearly fuse the adaptive correction weights, the multi-scale state temporal features, and the initial charge / discharge power command. The network consists of five residual blocks, each containing two convolutional layers and one skip connection. During the fusion process, the correction weights are applied to the state features, and the weighted features are concatenated with the initial power command. Network processing then yields a fused feature vector of dimension 256.
[0173] The nonlinear feature fusion result is input into a probability distribution generation network. This network is based on a variational autoencoder architecture, with both the encoder and decoder employing three fully connected layers. The encoder maps the fused features to a 64-dimensional latent variable space and estimates the mean and variance parameters of the latent variables. Based on the statistical properties of the prediction error feature matrix, such as a mean of -0.023 and a standard deviation of 0.158, the system performs 100 sampling iterations in the latent variable space to generate a set of candidate correction coefficients.
[0174] For each candidate correction coefficient, the system evaluates its performance through simulation, calculating indicators such as energy efficiency and peak-to-valley smoothing effect. Based on the Bayesian optimization criterion, the system selects the correction coefficient with the highest comprehensive score from the candidate set as the optimal correction coefficient. For example, in a practical application, the system selects an optimal correction coefficient of 0.87, which means reducing the initial charge / discharge power command by 13%. The system uses this optimal correction coefficient to dynamically adjust the initial charge / discharge power command, obtaining the initial correction result.
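The final selection and scaling step can be sketched as follows. The scoring function here is a toy stand-in for the simulation-based evaluation and the Bayesian optimization criterion, deliberately peaked at 0.87 to mirror the example above; the candidate grid, the 100 kW command, and the function names are hypothetical.

```python
def select_correction(candidates, score_fn):
    """Return the candidate coefficient with the highest comprehensive score."""
    return max(candidates, key=score_fn)

def toy_score(coefficient):
    # Stand-in for simulated energy efficiency and peak-valley smoothing scores.
    return -(coefficient - 0.87) ** 2

candidates = [round(0.70 + 0.01 * i, 2) for i in range(41)]   # 0.70 ... 1.10
best_coefficient = select_correction(candidates, toy_score)

initial_command_kw = 100.0                                    # hypothetical command
corrected_command_kw = best_coefficient * initial_command_kw
reduction_pct = (1.0 - best_coefficient) * 100.0
```

A coefficient of 0.87 scales a 100 kW command to 87 kW, that is, a 13% reduction, consistent with the example in the description.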
[0175] This technical solution, through multimodal information fusion and deep learning methods, achieves accurate prediction of grid conditions and precise control of battery charging and discharging strategies, significantly improving the adaptability and stability of the energy management system. Practical application results show that, compared to traditional single-layer control methods, this approach improves peak-valley smoothing efficiency by approximately 18.5% and extends battery life by approximately 12.3%.
[0176] Figure 5 is a bar chart showing the performance comparison of the two-layer model predictive control framework of this invention:
[0177] This figure compares the performance of three control models (traditional predictive control, single-layer predictive control, and the two-layer predictive control framework) across five key performance indicators. In terms of prediction accuracy, the two-layer framework achieves 94.2%, compared to 88.7% for the single-layer model and 83.5% for the traditional model. In terms of response speed, the two-layer framework scores 92.8%, compared to 85.4% for the single-layer model and 78.2% for the traditional model. In terms of anti-interference capability, the two-layer framework achieves 89.5%, the single-layer model 79.3%, and the traditional model 71.6%. In terms of energy utilization efficiency, the two-layer framework reaches 93.1%, compared to 84.6% for the single-layer model and 80.3% for the traditional model. In terms of regulation stability, the two-layer framework scores 91.7%, compared to 82.1% for the single-layer model and 75.8% for the traditional model. The data show that the two-layer predictive control framework outperforms the other two models on all five indicators.
[0178] A second aspect of the present invention provides a data center hybrid energy storage and renewable energy intelligent scheduling and control system, comprising:
[0179] The first unit is used to collect data on the power load of data centers, the operating status of hybrid energy storage systems, and the power generation of renewable energy.
[0180] The second unit is used to extract the temporal features of the power load data using a two-layer temporal memory network, process the temporal features through a multi-head attention mechanism to establish a data center power load prediction model, and predict the data center power load in the future time period based on the power load prediction model to obtain predicted power load data.
[0181] The third unit is used to establish a renewable energy power generation prediction model based on the power generation data of the renewable energy source, and to predict the power generation of renewable energy source in the future time period according to the renewable energy power generation prediction model to obtain the predicted power generation data.
[0182] The fourth unit is used to construct a hybrid energy storage optimization scheduling model after feature enhancement based on the predicted electricity load data and the predicted power generation data, and to output an initial charging and discharging power command based on the hybrid energy storage optimization scheduling model.
[0183] The fifth unit is used to optimize the hybrid energy storage system based on the operating status data and the initial charge and discharge power command using a two-layer model predictive control framework, and output the corrected charge and discharge power command through the synergistic effect of distributed optimization and robust controller.
[0184] The sixth unit is used to control the charging and discharging operation of the hybrid energy storage system according to the corrected charging and discharging power command.
[0185] A third aspect of the present invention provides an electronic device, comprising:
[0186] A processor;
[0187] A memory for storing processor-executable instructions;
[0188] wherein the processor is configured to invoke the instructions stored in the memory to execute the aforementioned method.
[0189] A fourth aspect of the present invention provides a computer-readable storage medium having stored thereon computer program instructions that, when executed by a processor, implement the aforementioned method.
[0190] This invention can be a method, apparatus, system, and / or computer program product. The computer program product may include a computer-readable storage medium having computer-readable program instructions loaded thereon for performing various aspects of the invention.
[0191] Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of the present invention, and not to limit them; although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that modifications can still be made to the technical solutions described in the foregoing embodiments, or equivalent substitutions can be made to some or all of the technical features; and these modifications or substitutions do not cause the essence of the corresponding technical solutions to deviate from the scope of the technical solutions of the embodiments of the present invention.
Claims
1. A method for intelligent scheduling and control of hybrid energy storage and renewable energy in data centers, characterized in that the method comprises: Data on data center power load, hybrid energy storage system operating status, and renewable energy power generation are collected. A two-layer temporal memory network is used to extract the temporal features of the power load data. The temporal features are then processed through a multi-head attention mechanism to establish a data center power load prediction model. Based on the power load prediction model, the data center power load in the future time period is predicted to obtain the predicted power load data. A renewable energy power generation prediction model is established based on the power generation data of the renewable energy source. The power generation of renewable energy in the future time period is predicted based on the power generation prediction model to obtain the predicted power generation data. After feature enhancement based on the predicted electricity load data and the predicted power generation data, a hybrid energy storage optimization scheduling model is constructed, and an initial charging and discharging power command is output based on the hybrid energy storage optimization scheduling model. Based on the operating status data of the hybrid energy storage system and the initial charge and discharge power command, a two-layer model predictive control framework is used for optimization. The corrected charge and discharge power command is output through the synergistic effect of distributed optimization and a robust controller. The charging and discharging operation of the hybrid energy storage system is controlled according to the corrected charging and discharging power command.
The process of constructing a hybrid energy storage optimal scheduling model after feature enhancement based on the predicted electricity load data and the predicted power generation data, and outputting initial charge and discharge power commands based on the hybrid energy storage optimal scheduling model, includes: A multidimensional spatiotemporal state space is constructed based on the predicted electricity load data and the predicted power generation data. A spatiotemporal attention mechanism is used to enhance the features of the multidimensional spatiotemporal state space to obtain an enhanced state space. A hierarchical deep reinforcement learning network is constructed. The state encoder of the hierarchical deep reinforcement learning network uses a dual-flow graph convolutional neural network to extract the temporal and spatial features of the augmented state space. The action generator of the hierarchical deep reinforcement learning network uses a gated recurrent unit to fuse the temporal and spatial features to generate a charge-discharge power decision. The value evaluator of the hierarchical deep reinforcement learning network uses a dual evaluation network structure to evaluate the value function of the charge-discharge power decision. Based on historical scheduling data, an immediate reward item containing multiple costs is constructed. A long-term reward item is obtained by modeling the charging and discharging operation sequence using a recurrent neural network. The immediate reward item is dynamically adjusted through adaptive weight optimization, and the combination yields an adaptive hybrid reward function. The adaptive hybrid reward function is combined with the hierarchical deep reinforcement learning network to construct a hybrid energy storage optimization scheduling model, and the hybrid energy storage optimization scheduling model is trained. 
The current state information is input into the trained hybrid energy storage optimization scheduling model, and the initial charging and discharging power command is output through the action generator. Constructing an immediate reward item containing multiple costs based on historical scheduling data, using a recurrent neural network to model the charge / discharge operation sequence to obtain a long-term reward item, and dynamically adjusting the immediate reward item through adaptive weight optimization to obtain an adaptive hybrid reward function, includes: Historical scheduling data of the hybrid energy storage system are collected, and an immediate reward item including multiple costs is constructed based on the historical scheduling data. A recurrent neural network is used to perform time-series modeling on the charging and discharging operation sequences in the historical scheduling data, extract the long-term dependency features of the charging and discharging operation sequences, and construct a long-term reward item that reflects the long-term operating benefits of the energy storage system based on the long-term dependency features. An evaluation index is constructed based on the system operation status sequence and cost-benefit sequence. The scheduling effect score of each group of charging and discharging operations in the historical scheduling data is calculated based on the evaluation index. The scheduling effect score is input into an adaptive weight optimizer. The adaptive weight optimizer dynamically adjusts the weight coefficient of each cost sub-item in the immediate reward item according to the changing trend of the scheduling effect score. When the scheduling effect score corresponding to a specific cost sub-item decreases, the weight coefficient of that cost sub-item is increased. The immediate reward item and the long-term reward item are weighted and combined to obtain an adaptive hybrid reward function.
2. The method according to claim 1, characterized in that using a two-layer temporal memory network to extract the temporal features of the power load data, processing the temporal features through a multi-head attention mechanism to establish a data center power load prediction model, and predicting the data center power load in the future time period based on the power load prediction model to obtain predicted power load data, includes: The electricity load data is preprocessed, and the min-max normalization method is used to convert the electricity load data into standardized load data. A two-layer temporal memory network is used to extract the temporal features of the standardized load data. A dropout layer is set between the two layers of the two-layer temporal memory network. The temporal features are fused with time features, temperature features, and load features to obtain fused features. The fused features are processed using a multi-head attention mechanism to calculate multi-head attention weights. These weights are determined by attention scores calculated by multiple attention heads, which are obtained from query vectors, key vectors, and value vectors. The fused features are weighted by the multi-head attention weights to obtain attention features, and a data center power load prediction model is established based on the attention features. A loss function is constructed that includes a weighted combination of the root mean square error term and the mean absolute percentage error term. The data center power load prediction model is then optimized and trained based on the loss function. The feature data corresponding to the time period to be predicted is input into the trained data center power load prediction model to obtain the initial prediction result. The confidence interval of the initial prediction result is calculated based on the historical prediction error distribution.
When the initial prediction result falls outside the confidence interval, it is corrected to obtain the predicted electricity load data.
3. The method according to claim 1, characterized in that, A renewable energy power generation prediction model is established based on the power generation data of the aforementioned renewable energy sources. The power generation of renewable energy sources within a future time period is predicted using this model, resulting in predicted power generation data, including: The power generation data of the renewable energy source is preprocessed, and the power generation data is denoised by wavelet transform. The denoised power generation data is then converted into standardized power generation data. A hybrid prediction model is constructed. The periodic decomposition unit of the hybrid prediction model uses the variational mode decomposition method to decompose the standardized power generation data into trend components and fluctuation components. The trend prediction unit of the hybrid prediction model uses a long short-term memory network structure to model the trend components, and the fluctuation prediction unit of the hybrid prediction model uses a gated recurrent neural network structure to model the fluctuation components. Meteorological forecast data is input into the hybrid forecast model, and the hybrid forecast model is optimized and trained based on the combined loss function of root mean square error and mean absolute error. The meteorological forecast data corresponding to the time period to be predicted is input into the trained hybrid forecast model to obtain the predicted power generation data.
4. The method according to claim 1, characterized in that using a two-layer model predictive control framework for optimization based on the operating status data of the hybrid energy storage system and the initial charge / discharge power command, and outputting a corrected charge / discharge power command through the synergistic effect of distributed optimization and a robust controller, includes: Based on the operating status data of the hybrid energy storage system, the energy storage units are grouped to construct a hierarchical system state space. State feature vectors are extracted based on variational autoencoders. A hybrid structure of gated recurrent neural networks and residual convolutional networks is used to extract spatiotemporal features from the state feature vectors to construct a multimodal state prediction model. A system dynamic response model is established, the output of the multimodal state prediction model is used to estimate the state through a particle filter algorithm, and the prediction results of the system dynamic response model are fitted with a probability distribution. An adaptive constraint estimator is constructed based on the information entropy criterion. A two-layer model predictive control framework is constructed, wherein the long-time domain optimization layer performs prediction optimization based on the multimodal state prediction model, and the short-time domain execution layer dynamically corrects the initial charge and discharge power command according to the prediction error of the long-time domain optimization layer to obtain the initial correction result. Based on the initial correction results, a multi-timescale optimization objective function is constructed. The multi-timescale optimization objective function is decomposed using the distributed alternating direction method of multipliers, and the sub-problems are solved. The solution results are then input into the robust controller.
The initial charge / discharge power command is co-optimized with the robust controller through the two-layer model predictive control framework, and the corrected charge / discharge power command is output.
5. The method according to claim 4, characterized in that constructing a two-layer model predictive control framework, wherein the long-time domain optimization layer performs prediction optimization based on the multimodal state prediction model, and the short-time domain execution layer dynamically corrects the initial charge and discharge power command based on the prediction error of the long-time domain optimization layer to obtain the initial correction result, includes: A multimodal state space is constructed and mapped into a long-time domain prediction state set and a short-time domain execution state set. Based on the long-term predicted state set, a gated skip-connection recurrent neural network structure is used to extract multi-scale state temporal features. The multi-scale state temporal features are input into a prediction optimizer constructed by a conditional generative adversarial network, and the predicted state sequence is output. A short-term execution layer is established based on the short-term execution state set. The short-term execution layer uses a bidirectional long short-term memory network to calculate the temporal correlation between the predicted state sequence and the actual state sequence, and generates a prediction error feature matrix. An adaptive corrector is constructed, which performs multi-dimensional feature extraction on the prediction error feature matrix to obtain multi-dimensional features, and constructs a multi-head cross-attention mechanism based on the multi-dimensional features to generate adaptive correction weights. A deep neural network with residual connections is used to perform nonlinear feature fusion of the adaptive correction weights, the multi-scale state time-series features and the initial charge-discharge power command to obtain a nonlinear feature fusion result. The nonlinear feature fusion result is input into a probability distribution generation network.
The probability distribution generation network samples and generates a set of candidate correction coefficients in the latent variable space based on the distribution characteristics of the prediction error feature matrix. The set of candidate correction coefficients is evaluated, and the optimal correction coefficient is selected based on the Bayesian optimization criterion. The optimal correction coefficient is used to dynamically adjust the initial charge and discharge power command to obtain the initial correction result.
6. A data center hybrid energy storage and renewable energy intelligent scheduling and control system, used to implement the method of any one of claims 1-5, characterized in that the system comprises: The first unit is used to collect data on the power load of data centers, the operating status of hybrid energy storage systems, and the power generation of renewable energy. The second unit is used to extract the temporal features of the power load data using a two-layer temporal memory network, process the temporal features through a multi-head attention mechanism to establish a data center power load prediction model, and predict the data center power load in the future time period based on the power load prediction model to obtain predicted power load data. The third unit is used to establish a renewable energy power generation prediction model based on the power generation data of the renewable energy source, and to predict the power generation of the renewable energy source in the future time period according to the renewable energy power generation prediction model to obtain the predicted power generation data. The fourth unit is used to construct a hybrid energy storage optimization scheduling model after feature enhancement based on the predicted electricity load data and the predicted power generation data, and to output an initial charging and discharging power command based on the hybrid energy storage optimization scheduling model. The fifth unit is used to optimize the hybrid energy storage system based on the operating status data and the initial charge and discharge power command using a two-layer model predictive control framework, and output the corrected charge and discharge power command through the synergistic effect of distributed optimization and a robust controller. The sixth unit is used to control the charging and discharging operation of the hybrid energy storage system according to the corrected charging and discharging power command.
7. An electronic device, characterized in that the device comprises: a processor; and a memory for storing processor-executable instructions; wherein the processor is configured to invoke the instructions stored in the memory to execute the method according to any one of claims 1 to 5.
8. A computer-readable storage medium having computer program instructions stored thereon, characterized in that, when the computer program instructions are executed by a processor, they implement the method described in any one of claims 1 to 5.