Urban regional integrated energy system multi-agent state prediction method and device and computer storage medium
By using a multi-level agent state feature library, incremental PCA, and multi-agent collaborative reinforcement learning, combined with the CNN-BiGRU-AT model, the problems of low parameter identification accuracy and insufficient real-time performance in urban integrated energy systems are solved, achieving efficient state prediction and scheduling optimization.
Patent Information
- Authority / Receiving Office: CN · China
- Patent Type: Applications (China)
- Current Assignee / Owner: STATE GRID SHANGHAI MUNICIPAL ELECTRIC POWER CO
- Filing Date: 2026-05-14
- Publication Date: 2026-06-12
Description
Technical Field
[0001] This application relates to the fields of power system security defense and integrated energy system dispatching technology, and in particular to a multi-agent collaborative optimization dispatching method, device, and computer storage medium for an urban area integrated energy system.
Background Technology
[0002] As urban energy systems develop towards multi-energy complementarity and flexible interaction, the collaborative operation of multiple intelligent agents has become the key to improving system resilience and energy efficiency.
[0003] However, existing technologies have the following problems in multi-agent state prediction: (1) Low parameter identification accuracy: traditional methods struggle to accurately identify key operational parameters of multi-agent systems in dynamic environments, leading to inaccurate prediction model inputs. (2) Poor adaptability of prediction models: existing prediction models are mostly based on static or single data sources, making it difficult to handle multi-source heterogeneous time series data. (3) Insufficient real-time performance: the prediction process is computationally complex and cannot meet the requirements of online scheduling and real-time control. (4) Lack of hierarchical collaboration: there is no mechanism for parameter transfer and collaborative state prediction from the device layer to the region layer.
Summary of the Invention
[0004] This application provides a method, apparatus, device, and computer storage medium for multi-agent state prediction in urban integrated energy systems, which addresses the problems of low accuracy in identifying multi-agent state prediction parameters, insufficient adaptability and real-time performance of prediction models, and lack of hierarchical coordination in current energy systems.
[0005] This application provides the following solution: Firstly, this application provides a multi-agent state prediction method for urban regional integrated energy systems. Based on a multi-level agent operating state feature library, a global sensitivity analysis is performed on the original parameter set to screen a subset of sensitive parameters. Incremental principal component analysis is used to reduce the dimensionality of the subset of sensitive parameters, and a multi-agent collaborative reinforcement learning algorithm is used to dynamically optimize the parameters in the reduced principal component space to obtain an identified set of key parameters. Based on the time-series data of the key parameters in the set of key parameters, a CNN-BiGRU-AT combined prediction model is used to make predictions to obtain the state prediction results. In the CNN-BiGRU-AT combined prediction model, the first convolutional layer of the convolutional neural network uses the SoftSign activation function, and the second convolutional layer uses the Swish activation function.
[0006] Secondly, this application proposes a multi-agent state prediction device for an integrated energy system in an urban area, comprising: an acquisition module, used to perform global sensitivity analysis on the original parameter set based on a multi-level agent operating state feature library to screen a subset of sensitive parameters, use incremental principal component analysis to reduce the dimensionality of the subset of sensitive parameters, and use a multi-agent collaborative reinforcement learning algorithm to dynamically optimize the parameters in the reduced principal component space to obtain an identified set of key parameters; and an execution module, used to perform prediction based on the time-series data of the key parameters in the set of key parameters using a CNN-BiGRU-AT combined prediction model to obtain the state prediction result, wherein the first convolutional layer of the convolutional neural network in the CNN-BiGRU-AT combined prediction model uses the SoftSign activation function, and the second convolutional layer uses the Swish activation function.
[0007] Thirdly, this application provides an electronic device, including: one or more processors; and a memory associated with the one or more processors, the memory being used to store program instructions that, when read and executed by the one or more processors, perform the steps of the method described above.
[0008] Fourthly, this application provides a computer program product, including a computer program that, when executed by a processor, implements the steps of the method.
[0009] According to the technical solution provided in this application, a multi-level agent operating state feature library is used to perform global sensitivity analysis on the original parameter set to screen a subset of sensitive parameters. Incremental principal component analysis (PCA) is then used to reduce the dimensionality of this subset of sensitive parameters. A multi-agent collaborative reinforcement learning algorithm is employed to dynamically optimize the parameters in the reduced principal component space, obtaining a set of identified key parameters. By integrating global sensitivity analysis, incremental PCA dimensionality reduction, and multi-agent collaborative reinforcement learning, key parameters can be screened and optimized online from high-dimensional, time-varying data, making the input of the prediction model closer to the actual dynamics of the system. Based on the time-series data of the key parameters in the key parameter set, a CNN-BiGRU-AT combined prediction model is used for prediction to obtain the state prediction result. In this model, the first convolutional layer of the convolutional neural network uses the SoftSign activation function, and the second convolutional layer uses the Swish activation function. This combines the local feature extraction capability of CNN, the long-term temporal dependency learning capability of BiGRU, and the key information focusing capability of the attention mechanism, and optimizes the activation function, effectively improving the prediction accuracy for the state of complex, non-stationary energy systems.
[0010] Of course, a product implementing this application does not necessarily need to achieve all of the advantages described above at the same time.
Attached Figure Description
[0011] To more clearly illustrate the technical solutions in the embodiments of this application or the prior art, the drawings used in the embodiments will be briefly introduced below. Obviously, the drawings described below are only some embodiments of this application. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.
[0012] Figure 1 is a flowchart of the method applicable to the embodiments of this application;
Figure 2 is a schematic diagram of the key parameter tree proposed in the embodiments of this application;
Figure 3 is a schematic diagram of the parameter identification process integrating sensitivity analysis, MACSARL, and PCA proposed in the embodiments of this application;
Figure 4 is a schematic diagram of the CNN-BiGRU-AT model structure proposed in the embodiments of this application;
Figure 5 shows the prediction results of the various models in the embodiments of this application;
Figure 6 is a schematic block diagram of an electronic device provided in an embodiment of this application.
Detailed Implementation
[0013] The technical solutions of the embodiments of this application will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of this application, and not all embodiments. All other embodiments obtained by those skilled in the art based on the embodiments of this application are within the scope of protection of this application.
[0014] The terminology used in the embodiments of this invention is for the purpose of describing particular embodiments only and is not intended to limit the invention. The singular forms “a,” “an,” and “the” as used in the embodiments of this invention and the appended claims are also intended to include the plural forms unless the context clearly indicates otherwise.
[0015] It should be understood that the term "and / or" used in this article is merely a description of the relationship between related objects, indicating that three relationships can exist. For example, A and / or B can represent: A existing alone, A and B existing simultaneously, and B existing alone. Additionally, the character " / " in this article generally indicates that the preceding and following related objects have an "or" relationship.
[0016] Depending on the context, the word "if" as used here can be interpreted as "when," "upon," "in response to determining," or "in response to detecting." Similarly, depending on the context, the phrase "if it is determined" or "if (the stated condition or event) is detected" can be interpreted as "when it is determined," "in response to determining," "when (the stated condition or event) is detected," or "in response to detecting (the stated condition or event)."
[0017] This application proposes a multi-agent state prediction method for an integrated energy system in an urban area. As shown in Figure 1, the specific processing flow is as follows: S101: based on the multi-level agent operating state feature library, perform global sensitivity analysis on the original parameter set to screen a subset of sensitive parameters, use incremental principal component analysis to reduce the dimensionality of the subset of sensitive parameters, and use a multi-agent collaborative reinforcement learning algorithm to dynamically optimize the parameters in the reduced principal component space to obtain the identified key parameter set.
[0018] Accurate online identification of multi-agent parameters in urban integrated energy systems is crucial for the efficient operation and optimization of the system. To address the online identification requirements of this system, this application proposes an online identification method that integrates sensitivity analysis, multi-agent collaborative reinforcement learning (MACSARL), and principal component analysis (PCA), based on a multi-level agent operational state feature library.
[0019] By screening key parameters through sensitivity analysis and using multi-agent collaborative reinforcement learning to achieve dynamic optimization and adjustment of parameters, combined with principal component analysis to reduce data dimensionality and noise interference, the accuracy and efficiency of parameter identification are effectively improved, providing strong support for the intelligent operation of urban integrated energy systems.
[0020] First, identify the key parameters that need to be identified.
[0021] As shown in Figure 2, the key parameters may include, but are not limited to, key parameters at the device level, key parameters at the system level, and key parameters at the regional level.
[0022] (1) Key parameters of the device layer. The device layer, as the foundation of the system, encompasses a variety of key parameters. Energy conversion parameters reflect the equipment's ability to convert one form of energy into another, such as the electrical energy conversion efficiency of a generator and the thermal energy conversion efficiency of a heat pump; their accuracy determines the energy conversion utilization rate. Loss characteristic parameters reflect the energy loss during equipment operation, such as the copper and iron losses of a transformer, which directly affect the overall energy consumption of the system. Dynamic response parameters relate to the speed and extent of the equipment's response to external stimuli or changes in internal state, such as the start-up response time of a motor and load regulation response characteristics; these are of great significance to the stability and reliability of the system.
[0023] (2) Key parameters of the system layer. System-layer parameters focus on energy transmission and coupling relationships. Network topology parameters describe the structure and connection methods of the energy transmission network in the system, including node connections and line impedance, and affect energy transmission paths and losses. Power flow coupling parameters involve the interaction and conversion relationships between different energy flows, such as the conversion power and efficiency of coupling equipment in an electric-thermal coupling system, which determine the overall energy utilization efficiency. Operating boundary parameters define the range of safe and stable system operation, such as the upper and lower voltage limits and the allowable frequency fluctuation range of a power system, and the temperature and pressure limits of a thermal system, which are crucial for ensuring safe system operation.
[0024] (3) Key parameters of the regional layer. The regional layer controls the system from a macro perspective. Supply and demand balance parameters measure the degree of matching between energy supply and demand within a region, such as the electricity supply and demand balance coefficient and the deviation between heat supply and demand; these are key indicators for stable system operation. Environmental coupling parameters reflect the interaction between the system and the external environment, such as parameters describing how environmental factors like sunlight and wind affect renewable energy power generation, and parameters describing the environmental impact of system emissions. Market response parameters reflect the system's ability to respond to fluctuations in energy market prices and policy changes, such as the energy price elasticity coefficient and demand-side participation rate; these are related to the system's economics and market adaptability.
Secondly, determine the parameter transfer relationships between key parameters. The device layer transmits device status parameter confidence levels to the system layer, providing a basis for the system layer to understand device operational reliability and assisting the system layer in making energy allocation and scheduling decisions. The system layer transmits network coupling parameter gradients to the regional layer, helping the regional layer grasp the changing trends of energy transmission and coupling, so as to optimize energy layout from a macro perspective. The regional layer feeds back market incentive signals to the device layer, guiding the device layer to adjust its operating strategies according to market conditions, achieving overall economical and efficient operation of the system.
[0025] Next, the key parameters are screened using sensitivity analysis.
[0026] The sensitivity analysis employs Monte Carlo sampling, simulating parameter variations through extensive random sampling to assess their impact on the system output. The Sobol index $S_i$ is used as the measurement indicator, and is calculated as shown in Formula 1 below:
Formula 1: $S_i = \dfrac{\mathrm{Var}_{X_i}\left(E_{X_{\sim i}}(Y \mid X_i)\right)}{\mathrm{Var}(Y)}$
In Formula 1 above, $\mathrm{Var}_{X_i}\left(E_{X_{\sim i}}(Y \mid X_i)\right)$ is the variance, over parameter $X_i$, of the conditional expectation of the system output $Y$ given $X_i$, where the expectation is taken over all other parameters $X_{\sim i}$; $\mathrm{Var}(Y)$ is the total variance of the system output $Y$. The Sobol index reflects the share of the output variance attributable to parameter $X_i$: the greater the index, the more significant the parameter's influence on the system output $Y$.
[0027] A global sensitivity analysis is performed on the original parameter set, calculating the Sobol index for each parameter. A threshold (e.g., 0.1) is set, and parameters with Sobol indices greater than this threshold are grouped into a sensitive parameter subset, while those with indices below the threshold form a non-sensitive parameter subset. Sensitive parameters have a significant impact on system output and are the focus of subsequent parameter identification; non-sensitive parameters have a relatively small impact and can be simplified or temporarily disregarded. Through this screening process, key parameters are highlighted, reducing the computational load of parameter identification and improving efficiency.
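The screening step can be illustrated with a minimal sketch. It estimates first-order Sobol indices with a pick-freeze Monte Carlo scheme and then applies the 0.1 example threshold. The toy output function, the uniform [0, 1) parameter ranges, and all names below are assumptions made for illustration; the application does not specify a particular estimator.

```python
import random
import statistics


def first_order_sobol(model, n_params, n_samples=20000, seed=0):
    """Estimate first-order Sobol indices with a pick-freeze Monte Carlo scheme.

    Assumes independent parameters, each scaled to the uniform range [0, 1).
    """
    rng = random.Random(seed)
    # Two independent sample matrices (rows = samples, columns = parameters).
    A = [[rng.random() for _ in range(n_params)] for _ in range(n_samples)]
    B = [[rng.random() for _ in range(n_params)] for _ in range(n_samples)]
    y_a = [model(row) for row in A]
    y_b = [model(row) for row in B]
    mean_y = statistics.fmean(y_a)
    var_y = statistics.pvariance(y_a)

    indices = []
    for i in range(n_params):
        acc = 0.0
        for a_row, b_row, ya, yb in zip(A, B, y_a, y_b):
            # Row of A with only parameter i replaced by the value from B.
            mixed = a_row[:i] + [b_row[i]] + a_row[i + 1:]
            acc += (yb - mean_y) * (model(mixed) - ya)
        indices.append(acc / n_samples / var_y)
    return indices


def toy_output(x):
    # Hypothetical system output: driven strongly by x[0], weakly by x[1], not by x[2].
    return 4.0 * x[0] + 1.0 * x[1] + 0.0 * x[2]


SOBOL_THRESHOLD = 0.1  # example threshold mentioned in the text
sobol = first_order_sobol(toy_output, n_params=3)
sensitive = [i for i, s in enumerate(sobol) if s > SOBOL_THRESHOLD]
insensitive = [i for i, s in enumerate(sobol) if s <= SOBOL_THRESHOLD]
print("Sobol estimates:", [round(s, 3) for s in sobol])
print("sensitive:", sensitive, "insensitive:", insensitive)
```

For this toy function the analytic first-order indices are 16/17, 1/17, and 0, so only the first parameter should pass the threshold.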
[0028] Then, the dimensionality of the key parameters is reduced using principal component analysis.
[0029] Principal component analysis (PCA), implemented here as incremental PCA, aims to map high-dimensional data to a low-dimensional space while preserving the main information. Its core is to identify the main directions of data variation, the principal components, by analyzing the eigenvalues and eigenvectors of the data's covariance matrix. These principal components are orthogonal to each other and ordered according to their contribution to the data variance; the first few principal components often encompass most of the data's information. The incremental update is given by Formula 2 below.
[0030] Formula 2
In Formula 2 above, $x_t$ is the input data vector at time $t$, that is, the high-dimensional sensitive parameter subset data that the system collects and filters in real time.
[0031] $W_t$ and $W_{t+1}$ are the principal component weight matrices at times $t$ and $t+1$. Physically, they are the transformation basis that the system uses to compress and map the high-dimensional sensitive parameter data into the low-dimensional principal component space.
[0032] $\eta$ is the update step size. It controls the magnitude of the update to the principal component matrix when the system receives new parameter data.
[0033] The remaining term in Formula 2 is the instantaneous covariance matrix of the input data.
[0034] Specifically, Formula 2 continuously updates and corrects this mapping matrix as new data $x_t$ arrives.
[0035] In online parameter identification for multi-agent systems, the collected parameter data is high-dimensional and contains noise and redundant information. Incremental PCA continuously updates the principal component space based on new data, achieving high-dimensional data compression. The sensitive parameter subset data is processed by incremental PCA to reduce the dimensionality to the principal component space PC1-PC3. The reduced dimensionality significantly decreases the computational load, while removing noise and redundancy, improving data quality, and making subsequent parameter optimization more accurate and efficient.
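As a rough illustration of how a principal component basis can be updated one sample at a time, the sketch below uses an Oja-type rule followed by Gram-Schmidt re-orthonormalization. This is a stand-in chosen for illustration, since the exact form of Formula 2 is not legible in this text. The synthetic zero-mean data, the step size, and all function names are assumptions.

```python
import math
import random


def orthonormalize(vectors):
    """Gram-Schmidt: return orthonormal vectors spanning the same space."""
    basis = []
    for v in vectors:
        w = list(v)
        for b in basis:
            dot = sum(wi * bi for wi, bi in zip(w, b))
            w = [wi - dot * bi for wi, bi in zip(w, b)]
        norm = math.sqrt(sum(wi * wi for wi in w))
        basis.append([wi / norm for wi in w])
    return basis


def oja_update(components, x, eta):
    """One incremental step W <- orth(W + eta * x x^T W) for a new sample x."""
    updated = []
    for w in components:
        proj = sum(wi * xi for wi, xi in zip(w, x))  # x^T w
        updated.append([wi + eta * proj * xi for wi, xi in zip(w, x)])
    return orthonormalize(updated)


def project(components, x):
    """Map a sample onto the retained components (PC1-PC3 here)."""
    return [sum(wi * xi for wi, xi in zip(w, x)) for w in components]


rng = random.Random(0)
DIM, N_COMPONENTS, ETA = 5, 3, 0.005  # illustrative values


def sample():
    # Synthetic zero-mean data: most variance lies along (1, 1, 0, 0, 0) / sqrt(2).
    s1, s2 = rng.gauss(0, 3.0), rng.gauss(0, 1.5)
    x = [rng.gauss(0, 0.1) for _ in range(DIM)]
    x[0] += s1 / math.sqrt(2)
    x[1] += s1 / math.sqrt(2)
    x[2] += s2
    return x


components = orthonormalize(
    [[rng.gauss(0, 1) for _ in range(DIM)] for _ in range(N_COMPONENTS)]
)
for _ in range(3000):
    components = oja_update(components, sample(), ETA)

scores = project(components, sample())
alignment = abs(components[0][0] + components[0][1]) / math.sqrt(2)
print("PC1 alignment with dominant direction:", round(alignment, 3))
```

The step size plays the role of $\eta$: a larger value tracks changes in the data faster but makes the basis noisier.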
[0036] Finally, the key parameters after data dimensionality reduction are optimized.
[0037] In Multi-Agent Collaborative Reinforcement Learning (MACSARL), each agent perceives environmental information in its own state space and selects actions from the action space to execute. The effectiveness of actions is evaluated through a two-layer reward function, incentivizing agents to learn optimal strategies. Agents communicate and collaborate, sharing information and adjusting strategies based on environmental feedback to optimize overall system performance. For example, device-level agents adjust operating parameters based on real-time device status, system-level agents coordinate inter-device cooperation based on energy transmission conditions, and region-level agents optimize energy allocation strategies globally. Agents select actions based on their current state; after executing an action, the system state changes and the agent receives a reward. The reward function is shown in Equation 3.
[0038] Formula 3
In Formula 3 above, $R_t$ is the reward function of the multi-agent collaborative reinforcement learning algorithm; $p_t$ and $p_t^{\text{true}}$ are the parameter estimates and the actual values, respectively; $J_F$ is the Jacobian matrix of the system; $\theta_t$ and $\theta_{t-1}$ are the policy parameters at the current and previous time steps, respectively; and $\omega_1$, $\omega_2$, and $\omega_3$ weight the three terms. The term $\omega_1 \cdot \mathrm{MAE}(p_t, p_t^{\text{true}})$ measures the mean absolute error between the estimated and true parameter values, focusing on parameter accuracy; $\omega_2 \cdot \mathrm{Tr}(J_F^{\mathsf{T}} J_F)$ reflects system stability and is evaluated from the Jacobian matrix; and $\omega_3 \cdot \lVert \theta_t - \theta_{t-1} \rVert_F$ penalizes policy oscillations to prevent the agent from changing its policy too frequently. Based on reward feedback, the agent updates its policy using reinforcement learning algorithms, continuously iterating and optimizing parameters to better suit the system's actual operational needs.
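The three terms described for the reward can be computed as in the sketch below. The way they are combined is an assumption: the sketch treats all three as penalties and returns their negated weighted sum, because the signs in Formula 3 are not recoverable from this text. The weights and sample values are likewise hypothetical.

```python
import math


def mean_absolute_error(estimate, truth):
    """MAE between estimated and true parameter vectors."""
    return sum(abs(e - t) for e, t in zip(estimate, truth)) / len(truth)


def trace_jtj(jacobian):
    """Tr(J^T J), which equals the sum of squared entries of J."""
    return sum(v * v for row in jacobian for v in row)


def frobenius_norm_diff(theta_now, theta_prev):
    """Frobenius norm of the change in policy parameters."""
    return math.sqrt(
        sum(
            (a - b) ** 2
            for row_now, row_prev in zip(theta_now, theta_prev)
            for a, b in zip(row_now, row_prev)
        )
    )


def reward(p_est, p_true, jacobian, theta_now, theta_prev, weights=(1.0, 0.1, 0.01)):
    """Assumed form: negated weighted sum of accuracy, stability, and oscillation terms."""
    w1, w2, w3 = weights
    return -(
        w1 * mean_absolute_error(p_est, p_true)
        + w2 * trace_jtj(jacobian)
        + w3 * frobenius_norm_diff(theta_now, theta_prev)
    )


r = reward(
    p_est=[1.0, 2.0],
    p_true=[1.5, 2.5],
    jacobian=[[1.0, 0.0], [0.0, 2.0]],
    theta_now=[[3.0, 0.0]],
    theta_prev=[[0.0, 4.0]],
)
print("reward:", r)
```

With these sample values the three terms are 0.5, 5, and 5 before weighting, so the reward is approximately -1.05.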
[0039] Specifically, as shown in Figure 3, the parameter identification process integrating sensitivity analysis, MACSARL, and PCA includes the following steps:
(1) Initialization phase: Construct and refine a multi-level agent operating state feature library, comprehensively collect historical system operation data, and extract and store various features. Determine the initial values and ranges of the multi-agent parameters, and initialize the relevant parameters of the MACSARL algorithm (such as learning rate and discount factor) and the PCA dimensionality reduction parameters (such as the number of principal components to retain), laying the foundation for subsequent parameter identification.
(2) Parameter screening stage: A global sensitivity analysis is performed on the original parameter set to calculate the Sobol index for each parameter, thus identifying the sensitive and insensitive parameter subsets. In the embodiments proposed in this application, the focus is on the sensitive parameter subset, which is the set of parameters whose Sobol index is greater than a preset threshold. This is because the sensitive parameter subset has a significant impact on the system output and is the core object of parameter identification; the insensitive parameter subset can be processed or simplified in subsequent analyses as appropriate.
[0040] (3) Data dimensionality reduction stage: The selected subset of sensitive parameters is input into an incremental PCA model for high-dimensional data compression, reducing it to the principal component space PC1-PC3. This reduces data dimensionality and computational complexity and removes noise and redundant information, improving data quality and providing a high-quality data foundation for parameter optimization.
[0041] (4) Parameter optimization stage: The parameters are optimized using MACSARL in the principal component space. The agent selects an action based on the current state, receives a reward according to a two-layer reward function, updates the policy, and iteratively optimizes the parameters. Through continuous interaction and learning with the environment, the parameters gradually converge to the optimal value, adapting to the dynamic changes of the system.
[0042] (5) Evaluation and feedback phase: The identified parameters are applied to the system model to evaluate system performance indicators (such as energy efficiency, operating costs, and stability). If the performance indicators do not meet expectations, the relevant parameters or learning strategies are adjusted, and the process returns to the parameter screening or optimization stage for further iteration. If the performance indicators meet expectations, the current parameters are used as system operation control parameters, and the system's operating status is continuously monitored. The parameter identification process is executed iteratively to ensure that the system is always in a good operating state. Finally, the identified set of key parameters is obtained.
[0043] S102, based on the time series data of key parameters in the key parameter set, the state prediction result is obtained by using the CNN-BiGRU-AT combined prediction model.
[0044] In the CNN-BiGRU-AT combined prediction model, the first convolutional layer of the convolutional neural network uses the SoftSign activation function, and the second convolutional layer uses the Swish activation function.
[0045] This application proposes a deep reinforcement learning algorithm incorporating an attention mechanism to address the real-time prediction problem of multi-agent operating parameters and states in urban integrated energy systems, and constructs a CNN-BiGRU-AT combined prediction model. This model takes multi-agent operating parameters as input and leverages deep neural networks to uncover operating patterns, achieving accurate real-time prediction. Theoretical analysis and simulation verification demonstrate the model's effectiveness and advantages in this field, providing strong support for the efficient operation and management of urban integrated energy systems. The CNN-BiGRU-AT combined prediction model sequentially includes: a one-dimensional convolutional neural network layer for extracting local temporal features, a bidirectional gated recurrent unit layer for learning long-term temporal dependencies, and an attention mechanism layer for weighting the output of the bidirectional gated recurrent unit layer.
[0046] For BiGRU, manually extracting feature values from historical data is not conducive to its analysis of the temporal sequence and implicit patterns of load data. In contrast, the unique structure of CNNs can fully exploit the correlations between data points, extract valuable features, and capture the periodicity in historical load data. When load data fluctuates significantly, modeling the features extracted by CNNs with BiGRU helps to better learn the periodic changes and patterns in load demand.
[0047] Based on this, the technical solution proposed in this application uses the AT mechanism to assign different probability weights to the hidden states of BiGRU, further strengthening the influence of important information in the input feature sequence, that is, constructing a short-term power load forecasting model based on the CNN-BiGRU-AT combined model. The structure of the CNN-BiGRU-AT combined model for short-term power load forecasting is shown in Figure 4: historical information is input, and the CNN performs initial feature extraction. The BiGRU and AT mechanisms learn the internal variation law of the load from the features extracted by the CNN, and the prediction result is finally output.
[0048] The CNN is designed as a two-layer 1D CNN for extracting local features from the load data. Two layers are used for temporal feature extraction because, for time series data, a single convolutional layer may not be sufficient to capture the temporal information; a second convolutional layer increases the CNN's ability to perceive temporal features. The BiGRU is also designed with two layers, which further learn the internal dynamics of the feature sequences extracted by the CNN in order to extract more complex global features. The main purpose of using two layers is to increase the model's representational and learning capabilities. Stacking two BiGRU layers increases the model's complexity and provides more non-linear transformations and abstract representations, allowing the model to better capture the patterns in the input data. The AT mechanism calculates the probabilities corresponding to different feature vectors based on the weight allocation principle for the feature sequence vectors processed by BiGRU, continuously updating and iterating to obtain a better weight parameter matrix. The output of the AT mechanism is mapped using a fully connected layer to output the model's load prediction results.
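The weighting performed by the AT mechanism can be sketched as follows: each hidden state receives a score, the scores are normalized into probability weights with a softmax, and the hidden states are summed with those weights. The dot-product scoring against a learned vector is an assumption for illustration, since the text does not specify the scoring function; the hidden states and score vector below are made-up values.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden_states, score_vector):
    """Weight each time step's hidden state and return (weights, context vector)."""
    # Assumed scoring: dot product of each hidden state with a learned vector.
    scores = [
        sum(h_i * v_i for h_i, v_i in zip(h, score_vector)) for h in hidden_states
    ]
    weights = softmax(scores)
    dim = len(hidden_states[0])
    context = [
        sum(w * h[d] for w, h in zip(weights, hidden_states)) for d in range(dim)
    ]
    return weights, context


# Hypothetical BiGRU outputs for three time steps, two features each.
hidden_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
score_vector = [1.0, 1.0]  # stands in for trained attention parameters
weights, context = attention_pool(hidden_states, score_vector)
print("weights:", [round(w, 3) for w in weights])
print("context:", [round(c, 3) for c in context])
```

In the combined model the context vector would then pass through the fully connected layer to produce the load prediction.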
[0049] The technical solution proposed in this application improves upon the existing model.
[0050] The loss function of the model is the mean square error (MSE) function, as shown in Formula 4.
[0051] Formula 4: $\mathrm{Loss} = \dfrac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2$
In Formula 4 above, Loss is the overall error value of the model's prediction. It reflects the overall deviation between the CNN-BiGRU-AT model's prediction of the integrated energy system's state and the actual situation. The smaller this value, the higher the model's prediction accuracy and the better the model parameters fit.
[0052] $n$ is the total number of time series data samples.
[0053] $y_i$ is the true value of the system state at time $i$.
[0054] $\hat{y}_i$ is the predicted system state value at time $i$, that is, the load prediction result at the corresponding time, inferred by the CNN-BiGRU-AT combined model from the time series data of the key parameters identified earlier.
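The mean square error loss described for Formula 4 can be written directly from these definitions. The sketch below is a plain-Python version with made-up sample values; the function name is an illustrative choice.

```python
def mse_loss(y_true, y_pred):
    """Mean square error: average of squared differences over n samples."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    n = len(y_true)
    return sum((y - y_hat) ** 2 for y, y_hat in zip(y_true, y_pred)) / n


# Three time steps; only the last prediction is off, by 2.0.
loss = mse_loss([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
print("MSE:", loss)
```

Because the errors are squared, one large miss raises the loss more than several small ones of the same total size.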
[0055] Activation functions, as a crucial part of neural networks, perform nonlinear transformations on input vectors, thereby enhancing the network's ability to solve complex problems. In CNNs, the widely used activation function is ReLU, as shown in Equation 5.
[0056] Formula 5: $\mathrm{ReLU}(x) = \max(0, x)$
Formula 6: $\mathrm{SoftSign}(x) = \dfrac{x}{1 + |x|}$
Formula 7: $\mathrm{Swish}(x) = x \cdot \sigma(x) = \dfrac{x}{1 + e^{-x}}$
In Formulas 5-7 above, $x$ (the input variable) represents the feature input value received by the current neuron. Specifically, it is the local feature value obtained by the preceding one-dimensional convolutional kernel after extracting features (weighted summation with bias) from the system's temporal operating parameters (such as load, temperature, and state).
[0057] ReLU(x): Represents the output feature value of the ReLU function. Its physical mechanism is: when the extracted feature x is positively beneficial to the prediction (greater than 0), the information is fully retained; when it is negative, it is forcibly set to 0 (filtering out the information). `max` represents the maximum value.
[0058] Swish(x): Represents the output feature value of the Swish function. Unlike ReLU, it provides a smooth mapping and does not fully truncate the input when it is less than 0. This lets a small portion of negative (weaker) energy feature information flow through the network. In the model, it is applied to the second convolutional layer, where the input x already consists of more abstract features, allowing Swish to better activate and learn these nonlinear relationships.
[0059] SoftSign(x): Represents the output feature value of the SoftSign function. Its physical meaning is to smoothly compress and normalize the unbounded input feature x to a valid range of (-1, 1). As the activation function of the first convolutional layer, it effectively avoids the initial input data of the integrated energy system from taking values that are too large or too small under extreme conditions, thus preventing gradient explosion and enabling the feature extraction process to converge faster and remain stable.
[0060] ReLU, as a non-linear activation function, helps CNNs learn non-linear features, improving the model's expressive power. Compared with traditional activation functions such as Sigmoid and Tanh, it avoids the problem of the function's derivative approaching zero when the input is too large or too small, which leads to vanishing gradients during backpropagation and prevents the model from updating. However, ReLU has an unavoidable issue: after a relatively large gradient update, a neuron using the ReLU activation function may receive only negative inputs and become a "dead neuron", meaning it is no longer activated when data passes through it and its gradient remains at zero.
[0061] Therefore, the SoftSign and Swish activation functions are introduced into the CNN model, as shown in Equations 6 and 7. The SoftSign function is continuously differentiable, whereas ReLU is not differentiable at 0. During backpropagation, SoftSign therefore provides smoother gradient information, so the neural network trains more stably. Furthermore, the output range of SoftSign is (-1, 1), compared with [0, ∞) for ReLU, so its output cannot grow without bound, which avoids the gradient explosion problem in some cases. SoftSign also approaches its saturation values when the input is a large positive or negative number, which helps training settle more quickly and improves computational efficiency. The Swish activation function has a smooth gradient, unlike ReLU, whose gradient is discontinuous at 0, so it can also converge faster. Swish also makes fuller use of gradient information because it is not completely cut off when the input is less than 0, allowing more information to flow through the neural network and improving the model's performance.
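The gradient behavior discussed above can be checked numerically. The sketch below evaluates the derivatives of the three activation functions at a few sample inputs, using the standard definitions ReLU(x) = max(0, x), SoftSign(x) = x / (1 + |x|), and Swish(x) = x · sigmoid(x). The choice of a zero gradient for ReLU at exactly x = 0 is a convention assumed here.

```python
import math


def sigmoid(x):
    """Logistic function, written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def relu_grad(x):
    """Derivative of max(0, x); 0 is used at x == 0 by convention."""
    return 1.0 if x > 0 else 0.0


def softsign_grad(x):
    """Derivative of x / (1 + |x|), continuous everywhere."""
    return 1.0 / (1.0 + abs(x)) ** 2


def swish_grad(x):
    """Derivative of x * sigmoid(x)."""
    s = sigmoid(x)
    return s + x * s * (1.0 - s)


for x in (-2.0, -0.5, 0.0, 0.5, 2.0):
    print(
        f"x={x:+.1f}  relu'={relu_grad(x):.3f}  "
        f"softsign'={softsign_grad(x):.3f}  swish'={swish_grad(x):.3f}"
    )
```

At the negative sample inputs the ReLU gradient is exactly zero, while the SoftSign and Swish gradients are not, which is the property the text relies on to keep neurons trainable.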
[0062] To verify the effectiveness of different activation functions in this model, simulation experiments were conducted using data from the January 2014 Electrical Engineering Cup competition. With the other model parameters fixed, the activation functions of the CNN convolutional layers were varied, and the prediction accuracy under each choice was assessed by the mean square error (MSE). As shown in Table 1, when the first layer used the SoftSign activation function, the models using the ReLU or Swish activation function in the second layer had lower error than the other combinations, with the Swish activation function giving the better result. Therefore, in the CNN-BiGRU-AT combined model, the first CNN convolutional layer uses the SoftSign activation function, and the second uses the Swish activation function.
[0063] Table 1 Comparison of errors of models with different activation functions
[0064] Based on a comprehensive analysis of the above experimental results, the SoftSign activation function effectively normalizes the output for most inputs, so it can be used as the activation function of the first CNN convolutional layer to reduce the risk of oversaturation and improve feature extraction performance. The Swish activation function introduces non-linear behavior when activating neurons, which helps in learning non-linear relationships. Using the Swish activation function in the second CNN convolutional layer can further improve the model's representational power, because the input to the second convolutional layer is the output features of the first convolutional layer. These features are at a higher level of abstraction, and Swish can better activate them and further improve the model's accuracy.
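The two-layer arrangement can be illustrated with a small single-channel example. This is a sketch under stated assumptions rather than the model used in the experiments: the kernels, the input values and the helper names `conv1d` and `two_layer_features` are hypothetical, the convolution is an unpadded sliding dot product, and SoftSign and Swish take their standard forms.

```python
import math


def softsign(x: float) -> float:
    return x / (1.0 + abs(x))


def swish(x: float) -> float:
    # x * sigmoid(x), written to avoid overflow for large |x|
    if x >= 0:
        return x / (1.0 + math.exp(-x))
    e = math.exp(x)
    return x * e / (1.0 + e)


def conv1d(signal, kernel, bias=0.0):
    """Unpadded 1-D sliding dot product (cross-correlation), stride 1."""
    k = len(kernel)
    return [
        sum(kernel[j] * signal[i + j] for j in range(k)) + bias
        for i in range(len(signal) - k + 1)
    ]


def two_layer_features(signal, kernel1, kernel2):
    """First layer: convolution + SoftSign. Second layer: convolution + Swish."""
    h1 = [softsign(v) for v in conv1d(signal, kernel1)]
    h2 = [swish(v) for v in conv1d(h1, kernel2)]
    return h1, h2


if __name__ == "__main__":
    # Hypothetical raw series with extreme swings, e.g. unscaled load values.
    raw = [0.0, 1000.0, -1000.0, 500.0, 0.0, 250.0]
    h1, h2 = two_layer_features(raw, kernel1=[0.5, 0.5], kernel2=[1.0, -1.0])
    print(h1)  # every value lies strictly inside (-1, 1)
    print(h2)
```

Even with inputs in the hundreds, the first-layer outputs stay inside (-1, 1), so the second layer receives bounded features, which is the role the paragraph assigns to SoftSign.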
[0065] To verify the effectiveness of the constructed CNN-BiGRU-AT combined prediction model, historical data from 2007 in the Global Energy Prediction Competition dataset was selected as the data sample, totaling 8760 data records. The last week's data was used as the test data. For each model, the iteration count was set to 50, the learning rate to 0.01, the number of BiGRU and BiLSTM nodes to 16, the batch size to 64, and m to 24. As shown in Figure 5, which compares the prediction results of the models on December 26, 2007, none of the models predicts very well, and none fully reflects the load trend on that day. However, compared with the other models, the fitted curve of the CNN-BiGRU-AT combined model coincides with the actual-value curve (labeled "real" in the figure) at some times and is closer to it at other times, so its prediction performance is better.
[0066] Table 2 compares the prediction errors of each model on the test set. On this dataset, the CNN-BiGRU-AT combined model shows better prediction results than the other models. Its MAPE is 2.00%, its RMSE is 2778.89 MW, its MAE is 2052.69 MW, and its R² is 94.18%. Compared with BiGRU, the worst-performing model in the table, its MAPE decreased by 0.26%, RMSE decreased by 229.4 MW, MAE decreased by 282.14 MW, and R² increased by 1%.
[0067] Table 2. Prediction Errors of Each Model
[0068] Compared with the combined models CNN-BiGRU and CNN-BiLSTM, the BiGRU model has higher MAPE, RMSE, and MAE values and a lower R², indicating that using a CNN for feature extraction is effective. Compared with BiGRU, the BiGRU-AT combined model performs better on the MAPE and MAE metrics, while BiGRU performs slightly better on the RMSE and R² metrics; overall, the two do not differ significantly.
[0069] Specifically, compared with the single-model BiGRU, the combined models CNN-BiGRU and CNN-BiLSTM reduced MAPE by 0.19% and 0.01%, respectively; RMSE by 172.68 MW and 15.69 MW, respectively; and MAE by 199.58 MW and 11.91 MW, respectively; and increased R² by 0.76% and 0.07%, respectively.
[0070] Comparing the CNN-BiGRU model and the BiGRU-AT model, the CNN-BiGRU model performs slightly better in prediction. Specifically, MAPE is reduced by 0.1%, RMSE by 190.11 MW, and MAE by 112.67 MW, and R² is improved by 0.84%. Comparing the CNN-BiLSTM model and the BiLSTM-AT model, the BiLSTM-AT model performs slightly better in prediction. Specifically, MAPE is reduced by 0.2%, RMSE by 105.48 MW, and MAE by 200.45 MW, and R² is increased by 0.47%.
[0071] Comparing the prediction metrics of the CNN-BiGRU, CNN-BiLSTM, BiGRU-AT, and BiLSTM-AT models, the CNN-BiGRU and BiLSTM-AT combined models perform slightly better. Specifically, compared with the CNN-BiLSTM combined model, the CNN-BiGRU combined model reduces MAPE by 0.18%, RMSE by 156.99 MW, and MAE by 187.67 MW, and improves R² by 0.69%. Compared with the BiGRU-AT combined model, the BiLSTM-AT model reduces MAPE by 0.12%, RMSE by 138.6 MW, and MAE by 125.45 MW, and improves R² by 0.62%. Comparing the prediction errors of the CNN-BiGRU-AT combined model with those of the other models, the CNN-BiGRU-AT combined model has lower MAPE, RMSE, and MAE values and a higher R² value, which verifies that the CNN-BiGRU-AT combined model has higher prediction accuracy and better prediction performance.
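The four error metrics used in the comparisons above follow standard definitions. The sketch below shows one way to compute them; it assumes those standard definitions, since the text does not restate the formulas, and the sample values are made up for illustration and are not taken from Table 2.

```python
import math


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be non-zero)."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean square error, in the unit of the data (e.g. MW)."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mae(actual, predicted):
    """Mean absolute error, in the unit of the data (e.g. MW)."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def r2(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_a = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_a) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Illustrative values only.
    actual = [100.0, 200.0, 300.0, 400.0]
    predicted = [110.0, 190.0, 310.0, 390.0]
    print(mape(actual, predicted), rmse(actual, predicted),
          mae(actual, predicted), r2(actual, predicted))
```

Lower MAPE, RMSE and MAE and a higher R² all indicate a closer fit, which is how the model rankings in the text are read.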
[0072] S103, determine the state of the multi-agent system of the urban area integrated energy system based on the state prediction results.
[0073] The above describes the establishment of a CNN-BiGRU-AT combined model for short-term power load forecasting, consisting of CNN layers and a BiGRU with an embedded attention mechanism. The selection of activation functions is improved by replacing the ReLU activation functions of the first and second CNN layers with the SoftSign and Swish activation functions, respectively. Simulation examples are then used to test the proposed CNN-BiGRU-AT combined model and verify its predictive performance.
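The attention step that weights the BiGRU outputs can be sketched as a softmax-weighted sum over time steps. The scoring function is not specified in this section, so the sketch assumes a simple dot product between each hidden state and a score vector; `attention_pool`, `score_weights` and the sample hidden states are all hypothetical.

```python
import math


def attention_pool(hidden_states, score_weights):
    """Weight per-time-step hidden states with softmax attention.

    hidden_states: list of equal-length vectors, one per time step
                   (standing in for BiGRU outputs).
    score_weights: vector of the same length, used here as an assumed
                   dot-product scoring function.
    Returns (attention weights, context vector).
    """
    scores = [sum(w * h for w, h in zip(score_weights, h_t)) for h_t in hidden_states]
    # Subtract the maximum score before exponentiating for numerical stability.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    alphas = [e / total for e in exps]
    dim = len(hidden_states[0])
    context = [
        sum(a * h_t[d] for a, h_t in zip(alphas, hidden_states))
        for d in range(dim)
    ]
    return alphas, context


if __name__ == "__main__":
    states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    alphas, context = attention_pool(states, score_weights=[10.0, 0.0])
    print(alphas)   # time steps 0 and 2 receive almost all of the weight
    print(context)
```

In a trained model the score vector would be learned together with the other parameters; here it is fixed by hand only to show how the weights redistribute emphasis across time steps.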
[0074] Accordingly, this application also proposes a multi-agent state prediction device for an integrated energy system in an urban area, comprising: an acquisition module, used to perform global sensitivity analysis on the original parameter set based on a multi-level agent operating state feature library to screen a subset of sensitive parameters, use incremental principal component analysis to reduce the dimensionality of the subset of sensitive parameters, and use a multi-agent collaborative reinforcement learning algorithm to dynamically optimize the parameters in the reduced principal component space to obtain an identified set of key parameters; and an execution module, used to perform prediction based on the time series data of the key parameters in the set of key parameters using a CNN-BiGRU-AT combined prediction model to obtain the state prediction result, wherein the first convolutional layer of the convolutional neural network in the CNN-BiGRU-AT combined prediction model adopts the SoftSign activation function, and the second convolutional layer adopts the Swish activation function.
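The sensitivity screening performed by the acquisition module can be illustrated with a Monte Carlo estimate of first-order Sobol indices followed by a threshold test. This is a minimal sketch under stated assumptions, not the application's procedure: it uses a pick-freeze estimator of the form usually attributed to Jansen, treats every parameter as independent and uniform on [0, 1), and uses a toy linear model in place of the energy system. The names `first_order_sobol` and `select_sensitive` are hypothetical.

```python
import random


def first_order_sobol(model, n_params, n_samples=20000, seed=0):
    """Estimate first-order Sobol indices S_i = Var(E[Y | X_i]) / Var(Y).

    Assumes independent inputs, each uniform on [0, 1). Uses two sample
    matrices A and B and, for each parameter i, a copy of A whose i-th
    column is taken from B (pick-freeze scheme).
    """
    rng = random.Random(seed)
    a = [[rng.random() for _ in range(n_params)] for _ in range(n_samples)]
    b = [[rng.random() for _ in range(n_params)] for _ in range(n_samples)]
    y_b = [model(row) for row in b]
    y_all = [model(row) for row in a] + y_b
    mean = sum(y_all) / len(y_all)
    var_y = sum((y - mean) ** 2 for y in y_all) / (len(y_all) - 1)

    indices = []
    for i in range(n_params):
        sq_diff = 0.0
        for a_row, b_row, yb in zip(a, b, y_b):
            mixed = list(a_row)
            mixed[i] = b_row[i]  # only X_i is shared with the B sample
            sq_diff += (yb - model(mixed)) ** 2
        v_i = var_y - sq_diff / (2 * n_samples)
        indices.append(v_i / var_y)
    return indices


def select_sensitive(indices, threshold):
    """Return positions of parameters whose index exceeds the preset threshold."""
    return [i for i, s in enumerate(indices) if s > threshold]


if __name__ == "__main__":
    # Toy model: the third parameter has no influence on the output.
    toy = lambda x: 2.0 * x[0] + 1.0 * x[1] + 0.0 * x[2]
    s = first_order_sobol(toy, n_params=3)
    print(s)
    print(select_sensitive(s, threshold=0.1))
```

For this toy model the exact indices are 0.8, 0.2 and 0, so the estimate can be checked by hand. In a real system each parameter would be sampled over its own operating range and the model call would be a simulation or surrogate of the system output.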
[0075] The various embodiments in this specification are described in a progressive manner. Similar or identical parts between embodiments can be referred to mutually, and each embodiment focuses on its differences from the other embodiments. In particular, the device or system embodiments are described relatively briefly because they are basically similar to the method embodiments; for the relevant parts, refer to the descriptions of the method embodiments. The device and system embodiments described above are merely illustrative. The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the modules can be selected according to actual needs to achieve the purpose of this embodiment. Those skilled in the art can understand and implement this without creative effort.
[0076] It should be noted that the user information (including but not limited to user device information, user personal information, etc.) and data (including but not limited to data used for analysis, data stored, data displayed, etc.) involved in this application are all information and data authorized by the user or fully authorized by all parties. Furthermore, the collection, use and processing of the relevant data must comply with the relevant laws, regulations and standards of the relevant countries and regions, and corresponding operation entry points are provided for users to choose to authorize or refuse.
[0077] In addition, embodiments of this application also provide a computer-readable storage medium storing a computer program thereon, which, when executed by a processor, implements the steps of the method described in any of the foregoing method embodiments.
[0078] Embodiments of this application also provide an electronic device, comprising: one or more processors; and a memory associated with the one or more processors, the memory being used to store program instructions that, when read and executed by the one or more processors, perform the steps of the method described in any of the foregoing method embodiments.
[0079] This application also provides a computer program product, including a computer program that, when executed by a processor, implements the steps of the method described in any of the foregoing method embodiments.
[0080] Figure 6 shows an exemplary architecture of the electronic device, which may include a processor X10, a video display adapter X11, a disk drive X12, an input / output interface X13, a network interface X14, and a memory X20. The processor X10, video display adapter X11, disk drive X12, input / output interface X13, network interface X14, and memory X20 can communicate with each other via a communication bus X30.
[0081] The processor X10 can be implemented using a general-purpose CPU, microprocessor, application-specific integrated circuit (ASIC), or one or more integrated circuits to execute relevant programs and implement the technical solution provided in this application.
[0082] The memory X20 can be implemented in the form of ROM (Read Only Memory), RAM (Random Access Memory), static storage device, dynamic storage device, etc. The memory X20 can store the operating system X21 for controlling the operation of the electronic device X00, and the basic input / output system (BIOS) X22 for controlling the low-level operations of the electronic device X00. Additionally, it can store a web browser X23, a data storage management system X24, and an icon font processing system X25, etc. The aforementioned icon font processing system X25 can be the application program that specifically implements the aforementioned steps in this embodiment. In summary, when implementing the technical solution provided in this application through software or firmware, the relevant program code is stored in the memory X20 and is called and executed by the processor X10.
[0083] The input / output interface X13 is used to connect input / output modules to enable information input and output. Input / output modules can be configured as components within the device (not shown in the figure) or externally connected to the device to provide corresponding functions. Input devices may include keyboards, mice, touchscreens, microphones, various sensors, etc., while output devices may include displays, speakers, vibrators, indicator lights, etc.
[0084] The network interface X14 is used to connect the communication module (not shown in the figure) to enable communication between this device and other devices. The communication module can communicate via wired means (such as USB, Ethernet cable, etc.) or wireless means (such as mobile network, WIFI, Bluetooth, etc.).
[0085] Bus X30 includes a pathway for transmitting information between various components of the device, such as processor X10, video display adapter X11, disk drive X12, input / output interface X13, network interface X14, and memory X20.
[0086] It should be noted that although the above-described device only shows the processor X10, video display adapter X11, disk drive X12, input / output interface X13, network interface X14, memory X20, bus X30, etc., in specific implementations, the device may also include other components necessary for normal operation. Furthermore, those skilled in the art will understand that the above-described device may only include the components necessary for implementing the solution of this application, and does not necessarily need to include all the components shown in the figures.
[0087] As can be seen from the above description of the embodiments, those skilled in the art can clearly understand that this application can be implemented by means of software plus necessary general-purpose hardware platforms. Based on this understanding, the technical solution of this application, in essence, or the part that contributes to the prior art, can be embodied in the form of a computer program product. This computer program product can be stored in a storage medium, such as ROM / RAM, magnetic disk, optical disk, etc., and includes several instructions to cause a computer device (which may be a personal computer, server, or network device, etc.) to execute the methods described in various embodiments or some parts of the embodiments of this application.
[0088] The technical solutions provided in this application have been described in detail above. Specific examples have been used to illustrate the principles and implementation methods of this application. The descriptions of the above embodiments are only for the purpose of helping to understand the methods and core ideas of this application. Furthermore, those skilled in the art will recognize that, based on the ideas of this application, there will be changes in the specific implementation methods and application scope. Therefore, the content of this specification should not be construed as a limitation of this application.
Claims
1. A multi-agent state prediction method for an integrated energy system in an urban area, characterized in that it comprises: based on a multi-level agent operating state feature library, performing global sensitivity analysis on the original parameter set to screen a sensitive parameter subset, reducing the dimensionality of the sensitive parameter subset by incremental principal component analysis, and using a multi-agent collaborative reinforcement learning algorithm to dynamically optimize the parameters in the dimensionality-reduced principal component space to obtain an identified key parameter set; and, based on the key parameter time series data in the key parameter set, performing prediction using the CNN-BiGRU-AT combined prediction model to obtain a state prediction result, wherein, in the CNN-BiGRU-AT combined prediction model, the first convolutional layer of the convolutional neural network adopts the SoftSign activation function, and the second convolutional layer adopts the Swish activation function.
2. The method according to claim 1, characterized in that the reward function $R_t$ of the multi-agent collaborative reinforcement learning algorithm is: [formula not reproduced in the source text], where $R_t$ is the reward function of the multi-agent collaborative reinforcement learning algorithm; $p_t$ and $p_t^{\mathrm{true}}$ are the parameter estimate and the true value, respectively; $J_F$ is the Jacobian matrix of the system; $W_t$ and $W_{t-1}$ are the policy parameters at the current and previous time steps, respectively; and $\omega_1$, $\omega_2$, and $\omega_3$ are weight coefficients. The term $\omega_1 \cdot \mathrm{MAE}(p_t, p_t^{\mathrm{true}})$ measures the mean absolute error between the estimated and true parameter values and focuses on parameter accuracy; the term $\omega_2 \cdot \mathrm{Tr}(J_F^{T} J_F)$ reflects the stability of the system, evaluated through the Jacobian matrix; and the term $\omega_3 \cdot \lVert W_t - W_{t-1} \rVert_F$ penalizes policy oscillation.
3. The method according to claim 1, characterized in that the global sensitivity analysis uses the Sobol index method, and the subset of sensitive parameters is the set of parameters whose Sobol index is greater than a preset threshold.
4. The method according to any one of claims 1-3, characterized in that the CNN-BiGRU-AT combined prediction model sequentially includes: a one-dimensional convolutional neural network layer for extracting local temporal features, a bidirectional gated recurrent unit layer for learning long-term temporal dependencies, and an attention mechanism layer for weighting the output of the bidirectional gated recurrent unit layer.
5. The method according to claim 4, characterized in that the first convolutional layer of the CNN-BiGRU-AT combined prediction model uses the SoftSign activation function, and the second convolutional layer uses the Swish activation function.
6. The method according to claim 1, characterized in that reducing the dimensionality of the sensitive parameter subset using incremental principal component analysis comprises: inputting the screened subset of sensitive parameters into the incremental PCA model for high-dimensional data compression, reducing it to the principal component space PC1-PC3.
7. The method according to claim 1, characterized in that the sensitivity analysis employs the Monte Carlo sampling method, simulating parameter variations through random sampling to assess their impact on the system output, with the Sobol index $S_i$ as the metric, calculated as $S_i = \dfrac{\mathrm{Var}_{X_i}\left(E_{\sim X_i}(Y \mid X_i)\right)}{\mathrm{Var}(Y)}$, where $\mathrm{Var}_{X_i}\left(E_{\sim X_i}(Y \mid X_i)\right)$ is the variance of the conditional expectation of the system output $Y$ caused by changes in parameter $X_i$ when the other parameters are fixed, and $\mathrm{Var}(Y)$ is the total variance of the system output $Y$.
8. A multi-agent state prediction device for an integrated energy system in an urban area, characterized in that it comprises: an acquisition module, used to perform global sensitivity analysis on the original parameter set based on the multi-level agent operating state feature library to screen a subset of sensitive parameters, use incremental principal component analysis to reduce the dimensionality of the subset of sensitive parameters, and use a multi-agent collaborative reinforcement learning algorithm to dynamically optimize the parameters in the dimensionality-reduced principal component space to obtain the identified key parameter set; and an execution module, used to make predictions based on the key parameter time series data in the key parameter set, using the CNN-BiGRU-AT combined prediction model to obtain the state prediction result, wherein, in the CNN-BiGRU-AT combined prediction model, the first convolutional layer of the convolutional neural network adopts the SoftSign activation function, and the second convolutional layer adopts the Swish activation function.
9. An electronic device, characterized in that it comprises: one or more processors; and a memory associated with the one or more processors, the memory being used to store program instructions that, when read and executed by the one or more processors, perform the steps of the method according to any one of claims 1 to 7.
10. A computer program product, comprising a computer program, characterized in that, when executed by a processor, the computer program implements the steps of the method according to any one of claims 1 to 7.