Virtual power plant dynamic aggregation method and system based on Transformer and AdaptMLP

By constructing a multi-agent system for a virtual power plant using Transformer and AdaptMLP, the problem of aggregation and scheduling of virtual power plants under changing equipment conditions is solved, achieving efficient adaptive equipment scheduling and generalization capabilities.

CN121457746B · Active · Publication Date: 2026-06-19 · SHANGHAI JIAOTONG UNIV

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
SHANGHAI JIAOTONG UNIV
Filing Date
2025-12-19
Publication Date
2026-06-19

AI Technical Summary

Technical Problem

Existing virtual power plant methods based on deep reinforcement learning handle variable-length inputs poorly and extract features from multiple coupled devices inadequately. They struggle to adapt to changes in device type and scale and lack an adaptive network architecture, so their generalization ability is poor.

Method used

A Transformer encoder is used to extract global coupling features between devices, and an AdaptMLP module is used to achieve adaptive parameter adjustment, thereby constructing a virtual power plant multi-agent system that dynamically aggregates various distributed energy resources.

Benefits of technology

It achieves efficient aggregation and scheduling under heterogeneous equipment and dynamically changing scale, automatically adapts to newly connected prosumers without retraining the model, and improves generalization ability.

✦ Generated by Eureka AI based on patent content.

Abstract

This invention provides a method and system for dynamic aggregation of virtual power plants based on Transformer and AdaptMLP, relating to the field of power system optimization and scheduling. The method includes: using a Transformer encoder to extract global coupling features between devices through a multi-head self-attention mechanism; using an AdaptMLP module to achieve adaptive parameter adjustment through a gating mechanism and a multi-expert network; and combining the MAPPO algorithm to solve a multi-agent reinforcement learning problem. This invention can adapt to heterogeneous equipment and dynamic changes in the number of prosumers, supports the immediate access of new prosumers without retraining, and is applicable to the aggregation and optimized scheduling of virtual power plant resources, including various types of equipment such as energy storage, electric boilers, electric chillers, and interruptible loads.

Description

Technical Field

[0001] This invention belongs to the field of power system optimization and dispatching, and specifically relates to a method and system for dynamic aggregation of virtual power plants based on Transformer and AdaptMLP.

Background Technology

[0002] Virtual power plants (VPPs) provide ancillary services in the electricity market by aggregating distributed energy resources. With the large-scale integration of resources such as distributed photovoltaics, energy storage systems, and controllable loads, the number of prosumers and the equipment configurations in VPPs change dynamically. New prosumers may join or leave the VPP at any time, leading to continuous adjustments in the system's scale and structure. Existing deep reinforcement learning-based methods typically employ fully connected neural networks or recurrent neural networks, which are insufficient in handling variable-length inputs (changing prosumer numbers) and extracting features from multiple coupled devices. They also struggle to adapt to new device integration scenarios, requiring retraining. Furthermore, there is a lack of adaptive network architectures for the dynamic aggregation scenarios of VPPs, resulting in poor generalization ability when facing changes in device type, scale, and operating environment.

Summary of the Invention

[0003] The purpose of this invention is to overcome the shortcomings of the prior art and provide a method and system for dynamic aggregation of virtual power plants based on Transformer and AdaptMLP. The method extracts global coupling features between devices through a Transformer encoder and realizes adaptive parameter adjustment through an AdaptMLP module, thereby solving the problem of efficient aggregation and scheduling of virtual power plants under heterogeneous equipment and dynamically changing scale.

[0004] Technical solution

[0005] To achieve the above objectives, the present invention adopts the following technical solution:

[0006] A method for dynamic aggregation of virtual power plants based on Transformer and AdaptMLP includes the following steps:

[0007] Step 1: Construct a virtual power plant multi-agent system, including a VPP operator agent and multiple prosumer agents, with each prosumer agent configured with various distributed energy devices;

[0008] Step 2: Collect physical observations of the equipment of each prosumer to form an observation matrix. The physical observations of the equipment include the energy storage state of charge, the electric boiler temperature, the chiller temperature, and the interruptible load status;

[0009] Step 3: Input the observation matrix into a Transformer encoder and extract inter-device coupling features using a multi-head self-attention mechanism: compute the query matrix $Q$, the key matrix $K$, and the value matrix $V$, and obtain the global interaction feature vector by calculating attention weights;

[0010] Step 4: Construct a context vector that includes information on equipment type, capacity, ambient temperature, and electricity price;

[0011] Step 5: Input the output features of the Transformer together with the context vector into the AdaptMLP module, generate four sets of adaptive weights through a gating mechanism, apply the weights to the outputs of the four expert networks, and obtain the adaptively adjusted decision features through weighted fusion;

[0012] Step 6: Output the scheduling actions of each device based on the decision features, including energy storage charging and discharging power, electric boiler power, chiller power, and interruptible load reduction amount;

[0013] Step 7: Interact with the environment to obtain reward feedback and update the network parameters of Transformer and AdaptMLP;

[0014] Step 8: Repeat steps S2-S7 until the model converges.

[0015] As a further aspect of the present invention, the method for constructing the physical observations of the equipment in step S2 is as follows:

[0016] For the $i$-th prosumer, which includes energy storage systems, electric boilers, electric chillers, and interruptible loads, the observation matrix at time $t$ includes:

[0017] Energy storage system: state of charge;

[0018] Electric boiler: normalized water tank temperature;

[0019] Electric chiller: normalized chilled water temperature;

[0020] Interruptible load: current state;

[0021] The dimension of the observation matrix is the number of devices by the feature dimension of each device.

[0022] As a further aspect of the present invention, the multi-head self-attention mechanism of the Transformer encoder in step S3 is specifically as follows:

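[0023] $\mathrm{MultiHead}(Q, K, V) = \mathrm{Concat}(\mathrm{head}_1, \ldots, \mathrm{head}_h)\, W^O$

[0024] where:

[0025] $\mathrm{head}_i = \mathrm{Attention}(Q W_i^Q, K W_i^K, V W_i^V)$

[0026] $\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{Q K^{\top}}{\sqrt{d_k}}\right) V$

[0027] where $d_k$ is the dimension of the key vector, $h$ is the number of attention heads, and $W_i^Q$, $W_i^K$, $W_i^V$, and $W^O$ are learnable weight matrices.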

[0028] As a further aspect of the present invention, the construction of the context vector in step S4 includes:

[0029] Equipment type vector: One-hot encoding is used to represent energy storage, electric boilers, chillers, and interruptible loads;

[0030] Equipment capacity: Rated power or capacity after normalization;

[0031] Environmental information: outdoor temperature, solar radiation intensity;

[0032] Market information: real-time electricity prices, ancillary service prices;

[0033] The above information is concatenated to form the context vector of the corresponding dimension.

[0034] As a further aspect of the present invention, the adaptive adjustment process of the AdaptMLP module in step S5 is specifically as follows:

[0035] The feature transformation formula is:

[0036]

[0037] The formula for calculating the gating weight is:

[0038]

[0039] The adaptive output formula is:

[0040]

[0041] where the formulas use the weights and biases of the $i$-th expert network, the learnable parameters of the gating network, and the context vector.

[0042] As a further aspect of the present invention, the output constraint of the device scheduling action in step S6 is:

[0043] Energy storage charging and discharging power: ;

[0044] Electric boiler power: ;

[0045] Electric chiller power: ;

[0046] Interruptible load reduction: ;

[0047] The action is mapped to the corresponding constraint range through the sigmoid activation function.

[0048] As a further aspect of the present invention, the network parameter update in step S7 employs the Multi-Agent Proximal Policy Optimization (MAPPO) algorithm, which includes:

[0049] Policy network: outputs the probability distribution of device scheduling actions;

[0050] Critic network: estimates the value function of the current state-action pair;

[0051] Advantage function: ;

[0052] Policy gradient update: maximizes the clipped objective function.

[0053] As a further aspect of the present invention, the method further includes a dynamic expansion step:

[0054] When a new prosumer connects to the virtual power plant:

[0055] The Transformer encoder automatically adapts to changes in the length of the input sequence through a positional encoding mechanism;

[0056] The AdaptMLP module dynamically adjusts the weights of the expert networks according to the context vector of the new prosumer;

[0057] There is no need to retrain the entire model; only the gating network parameters need to be fine-tuned.

[0058] Furthermore, the present invention also provides a virtual power plant dynamic aggregation system based on Transformer and AdaptMLP, comprising:

[0059] Data acquisition module: Collects real-time operating status and physical parameters of prosumer equipment;

[0060] Feature extraction module: Employs a Transformer encoder to extract coupling features between devices, including a multi-head self-attention calculation unit, a residual connection unit, and a layer normalization unit;

[0061] Adaptive adjustment module: Adopts the AdaptMLP structure, including 4 parallel expert networks, a gating mechanism unit and a feature fusion unit;

[0062] Decision output module: Generates device scheduling actions based on fusion features;

[0063] Parameter update module: Updates network parameters based on environmental feedback.

[0064] As a further embodiment of the present invention, the feature extraction module adopts a multi-layer Transformer encoder stacked structure, each layer including a multi-head self-attention sub-layer, a feedforward neural network sub-layer, and residual connections.

[0065] Beneficial effects

[0066] Compared with the prior art, the present invention has the following beneficial effects:

[0067] 1) It can compute global dependencies between devices in parallel and effectively capture the complex coupling characteristics of multiple energy flows such as electricity, heat and cold;

[0068] 2) It can dynamically adjust network parameters according to changes in device type, capacity, and environment, and can adapt to new prosumer access and device configuration changes without retraining, demonstrating strong generalization ability;

[0069] 3) It can be applied to virtual power plants that include various types of equipment such as energy storage, electric boilers, electric chillers, and interruptible loads, meeting the real-time scheduling needs of the ancillary services market.

Attached Figure Description

[0070] Figure 1 is an overall flowchart of the method of the present invention.

[0071] Figure 2 is a diagram showing the ancillary service bidding results when aggregating different numbers of prosumers in an embodiment of the present invention.

Detailed Implementation

[0072] The present invention will now be described in detail with reference to the accompanying drawings and embodiments. It should be noted that, unless otherwise specified, the embodiments and features described in this application can be combined with each other.

[0073] The following detailed description is exemplary and intended to provide further detailed explanation of the invention. Unless otherwise specified, all technical terms used in this invention have the same meaning as commonly understood by one of ordinary skill in the art to which this application pertains. The terminology used in this invention is for the purpose of describing particular embodiments only and is not intended to limit the scope of exemplary embodiments according to the invention.

[0074] As a preferred embodiment of the present invention, a method for dynamic aggregation of virtual power plants based on Transformer and AdaptMLP includes the following steps:

[0075] Step 1: Construct a virtual power plant multi-agent system, including a VPP operator agent and multiple prosumer agents, with each prosumer agent configured with various distributed energy devices;

[0076] Step 2: Collect physical observations of the equipment of each prosumer to form an observation matrix. The physical observations of the equipment include the energy storage state of charge, the electric boiler temperature, the chiller temperature, and the interruptible load status;

[0077] Step 3: Input the observation matrix into a Transformer encoder and extract inter-device coupling features using a multi-head self-attention mechanism: compute the query matrix $Q$, the key matrix $K$, and the value matrix $V$, and obtain the global interaction feature vector by calculating attention weights;

[0078] Step 4: Construct a context vector that includes information on equipment type, capacity, ambient temperature, and electricity price;

[0079] Step 5: Input the output features of the Transformer together with the context vector into the AdaptMLP module, generate four sets of adaptive weights through a gating mechanism, apply the weights to the outputs of the four expert networks, and obtain the adaptively adjusted decision features through weighted fusion;

[0080] Step 6: Output the scheduling actions of each device based on the decision features, including energy storage charging and discharging power, electric boiler power, chiller power, and interruptible load reduction amount;

[0081] Step 7: Interact with the environment to obtain reward feedback and update the network parameters of Transformer and AdaptMLP;

[0082] Step 8: Repeat steps S2-S7 until the model converges.

[0083] The method for constructing the physical observations of the equipment in step S2 is as follows:

[0084] For the $i$-th prosumer, which includes energy storage systems, electric boilers, electric chillers, and interruptible loads, the observation matrix at time $t$ includes:

[0085] Energy storage system: state of charge;

[0086] Electric boiler: normalized water tank temperature;

[0087] Electric chiller: normalized chilled water temperature;

[0088] Interruptible load: current state;

[0089] The dimension of the observation matrix is the number of devices by the feature dimension of each device.
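The observation construction above can be illustrated with a minimal Python sketch that stacks one feature row per device. The dictionary layout, the min-max normalization bounds (`t_min`, `t_max`), the single-feature rows, and the function names are assumptions made for this example; the patent text does not specify them.

```python
import numpy as np

def normalize(value, lower, upper):
    """Min-max normalize a physical reading to [0, 1] (assumed scheme)."""
    return (value - lower) / (upper - lower)

def build_observation_matrix(devices):
    """Stack one feature row per device; the result has shape (N, d)."""
    rows = []
    for dev in devices:
        kind = dev["kind"]
        if kind == "storage":
            state = dev["soc"]                       # state of charge in [0, 1]
        elif kind in ("boiler", "chiller"):
            state = normalize(dev["temp"], dev["t_min"], dev["t_max"])
        elif kind == "interruptible":
            state = float(dev["on"])                 # 1.0 = supplied, 0.0 = interrupted
        else:
            raise ValueError(f"unknown device kind: {kind}")
        rows.append([state])                         # d = 1 feature per device here
    return np.asarray(rows, dtype=float)

# Hypothetical prosumer with one device of each type.
devices = [
    {"kind": "storage", "soc": 0.6},
    {"kind": "boiler", "temp": 70.0, "t_min": 40.0, "t_max": 90.0},
    {"kind": "chiller", "temp": 9.0, "t_min": 5.0, "t_max": 15.0},
    {"kind": "interruptible", "on": True},
]
X = build_observation_matrix(devices)
```

Adding or removing a device only changes the number of rows, which is the variable-length input the Transformer encoder is meant to handle.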

[0090] The multi-head self-attention mechanism of the Transformer encoder in step S3 is as follows:

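[0091] $\mathrm{MultiHead}(Q, K, V) = \mathrm{Concat}(\mathrm{head}_1, \ldots, \mathrm{head}_h)\, W^O$

[0092] where:

[0093] $\mathrm{head}_i = \mathrm{Attention}(Q W_i^Q, K W_i^K, V W_i^V)$

[0094] $\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{Q K^{\top}}{\sqrt{d_k}}\right) V$

[0095] where $d_k$ is the dimension of the key vector, $h$ is the number of attention heads, and $W_i^Q$, $W_i^K$, $W_i^V$, and $W^O$ are learnable weight matrices.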
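As a rough illustration of the multi-head self-attention step, the following NumPy sketch implements standard scaled dot-product attention over a device sequence. It assumes the observation rows have already been embedded to a model width `d_model`, and it omits positional encoding, residual connections, layer normalization, and the feedforward sub-layer. All names and sizes are chosen for the example only.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)          # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_self_attention(X, Wq, Wk, Wv, Wo):
    """X: (N, d_model); Wq/Wk/Wv: (h, d_model, d_k); Wo: (h * d_k, d_model)."""
    heads = []
    for wq, wk, wv in zip(Wq, Wk, Wv):
        Q, K, V = X @ wq, X @ wk, X @ wv             # each (N, d_k)
        scores = Q @ K.T / np.sqrt(K.shape[-1])      # (N, N) scaled dot products
        heads.append(softmax(scores) @ V)            # (N, d_k) attended values
    return np.concatenate(heads, axis=-1) @ Wo       # (N, d_model)

rng = np.random.default_rng(0)
d_model, d_k, h = 8, 4, 2
Wq, Wk, Wv = (rng.normal(size=(h, d_model, d_k)) for _ in range(3))
Wo = rng.normal(size=(h * d_k, d_model))

# The same weights process 5 devices or 12 devices without modification.
H_small = multi_head_self_attention(rng.normal(size=(5, d_model)), Wq, Wk, Wv, Wo)
H_large = multi_head_self_attention(rng.normal(size=(12, d_model)), Wq, Wk, Wv, Wo)
```

Because the weight matrices act on the feature axis only, the number of devices `N` never appears in any parameter shape.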

[0096] The construction of the context vector in step S4 includes:

[0097] Equipment type vector: One-hot encoding is used to represent energy storage, electric boilers, chillers, and interruptible loads;

[0098] Equipment capacity: Rated power or capacity after normalization;

[0099] Environmental information: outdoor temperature, solar radiation intensity;

[0100] Market information: real-time electricity prices, ancillary service prices;

[0101] The above information is concatenated to form the context vector of the corresponding dimension.
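A possible way to assemble such a context vector is sketched below. The ordering of the fields, the assumption that every scalar has already been normalized, and the names `DEVICE_TYPES` and `build_context_vector` are illustrative choices, not details taken from the patent.

```python
import numpy as np

DEVICE_TYPES = ("storage", "boiler", "chiller", "interruptible")

def build_context_vector(device_type, capacity_norm, outdoor_temp, solar_irradiance,
                         energy_price, ancillary_price):
    """Concatenate type one-hot, capacity, environment, and market fields."""
    one_hot = np.zeros(len(DEVICE_TYPES))
    one_hot[DEVICE_TYPES.index(device_type)] = 1.0
    return np.concatenate([
        one_hot,                                   # equipment type (4 entries)
        [capacity_norm],                           # normalized rated power or capacity
        [outdoor_temp, solar_irradiance],          # environmental information
        [energy_price, ancillary_price],           # market information
    ])

# Hypothetical, already-normalized inputs for an electric boiler.
c = build_context_vector("boiler", 0.5, 0.3, 0.8, 0.42, 0.10)
```

With four device types and five scalar fields, the context vector in this sketch has nine entries.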

[0102] The adaptive adjustment process of the AdaptMLP module in step S5 is as follows:

[0103] The feature transformation formula is:

[0104]

[0105] The formula for calculating the gating weight is:

[0106]

[0107] The adaptive output formula is:

[0108]

[0109] where the formulas use the weights and biases of the $i$-th expert network, the learnable parameters of the gating network, and the context vector.
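Since the three formulas are not legible in this text, the following NumPy sketch shows only one plausible reading of the description: four expert networks transform the Transformer feature, a gating network maps the context vector to four softmax weights, and the output is the weighted sum. The ReLU activation, the single-layer experts, the softmax gate, and all parameter names are assumptions.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def adapt_mlp(h, c, expert_W, expert_b, gate_W, gate_b):
    """h: (d_in,) Transformer feature; c: (d_c,) context vector.
    expert_W: (4, d_out, d_in); expert_b: (4, d_out); gate_W: (4, d_c); gate_b: (4,)."""
    expert_out = np.maximum(0.0, expert_W @ h + expert_b)   # (4, d_out), ReLU assumed
    gate = softmax(gate_W @ c + gate_b)                      # (4,), sums to 1
    return gate @ expert_out, gate                           # weighted fusion, (d_out,)

rng = np.random.default_rng(1)
d_in, d_c, d_out = 8, 9, 6
y, gate = adapt_mlp(
    rng.normal(size=d_in), rng.normal(size=d_c),
    rng.normal(size=(4, d_out, d_in)), rng.normal(size=(4, d_out)),
    rng.normal(size=(4, d_c)), rng.normal(size=4),
)
```

In this reading, a different context vector changes only the mixing weights `gate`, while the expert parameters stay shared across devices.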

[0110] The output constraint for the equipment scheduling action in step S6 is:

[0111] Energy storage charging and discharging power: ;

[0112] Electric boiler power: ;

[0113] Electric chiller power: ;

[0114] Interruptible load reduction: ;

[0115] The action is mapped to the corresponding constraint range through the sigmoid activation function.
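The sigmoid mapping to a constraint range can be sketched as an affine rescaling of the sigmoid output. The bound values, their units, and the sign convention for storage power below are hypothetical, since the constraint expressions are not legible in this text.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def scale_action(raw, lower, upper):
    """Map unbounded network outputs to [lower, upper] elementwise via a sigmoid."""
    return lower + (upper - lower) * sigmoid(raw)

# Hypothetical bounds in kW: storage power (sign convention assumed), boiler,
# chiller, and interruptible-load reduction.
lower = np.array([-50.0, 0.0, 0.0, 0.0])
upper = np.array([50.0, 100.0, 80.0, 30.0])
raw = np.array([0.0, -20.0, 20.0, 0.0])
actions = scale_action(raw, lower, upper)
```

A raw output of zero lands at the midpoint of the range, and large negative or positive outputs approach the lower or upper bound without ever leaving the range.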

[0116] The network parameter update in step S7 uses the Multi-Agent Proximal Policy Optimization (MAPPO) algorithm, which includes:

[0117] Policy network: outputs the probability distribution of device scheduling actions;

[0118] Critic network: estimates the value function of the current state-action pair;

[0119] Advantage function: ;

[0120] Policy gradient update: maximizes the clipped objective function.
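The clipped objective used by PPO-style methods such as MAPPO can be sketched as follows. Because the advantage formula is not legible in this text, the example simply uses discounted returns minus a value baseline, which is an assumption; the clipping parameter `eps` and the toy numbers are also illustrative.

```python
import numpy as np

def discounted_returns(rewards, gamma=0.99):
    """Backward recursion G_t = r_t + gamma * G_{t+1}."""
    out = np.zeros(len(rewards))
    running = 0.0
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        out[t] = running
    return out

def clipped_surrogate(logp_new, logp_old, advantages, eps=0.2):
    """PPO clipped surrogate objective (to be maximized)."""
    ratio = np.exp(logp_new - logp_old)              # probability ratio new / old
    unclipped = ratio * advantages
    clipped = np.clip(ratio, 1.0 - eps, 1.0 + eps) * advantages
    return np.mean(np.minimum(unclipped, clipped))

# Toy trajectory of three steps.
returns = discounted_returns(np.array([1.0, 0.0, 2.0]), gamma=0.5)
values = np.array([1.0, 1.0, 1.0])                   # critic estimates (made up)
advantages = returns - values                        # assumed advantage form
logp_old = np.log(np.array([0.5, 0.5, 0.5]))
logp_new = np.log(np.array([0.75, 0.5, 0.5]))
objective = clipped_surrogate(logp_new, logp_old, advantages)
```

In the first step the ratio of 1.5 is clipped to 1.2, so the policy gains nothing from moving further from the old policy in a single update.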

[0121] The method also includes a dynamic expansion step:

[0122] When a new prosumer connects to the virtual power plant:

[0123] The Transformer encoder automatically adapts to changes in the length of the input sequence through a positional encoding mechanism;

[0124] The AdaptMLP module dynamically adjusts the weights of the expert networks according to the context vector of the new prosumer;

[0125] There is no need to retrain the entire model; only the gating network parameters need to be fine-tuned.
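To make the idea of fine-tuning only the gating network concrete, the sketch below takes gradient steps on the gating parameters while the expert outputs stay fixed. The squared-error loss toward a target feature is a stand-in chosen for simplicity; in the method described here the update signal would come from the reinforcement learning objective. The function name, learning rate, and toy data are assumptions.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def gate_only_step(expert_out, c, gate_W, gate_b, target, lr=0.5):
    """One gradient step on the gating parameters; expert outputs are frozen.
    expert_out: (K, d_out); c: (d_c,); gate_W: (K, d_c); gate_b: (K,)."""
    g = softmax(gate_W @ c + gate_b)
    y = g @ expert_out                               # fused output, (d_out,)
    err = y - target                                 # dL/dy for L = 0.5 * ||y - target||^2
    dz = g * ((expert_out - y) @ err)                # softmax chain rule, (K,)
    loss = 0.5 * err @ err                           # loss before the update
    return gate_W - lr * np.outer(dz, c), gate_b - lr * dz, loss

experts = np.eye(4)                                  # frozen expert outputs (toy values)
c_new = np.array([1.0, 0.0, 0.5])                    # context of a hypothetical new prosumer
target = experts[2]                                  # desired fused feature (stand-in)
W, b = np.zeros((4, 3)), np.zeros(4)
losses = []
for _ in range(100):
    W, b, loss = gate_only_step(experts, c_new, W, b, target)
    losses.append(loss)
```

Only `W` and `b` change during the loop, so the number of parameters touched is small compared with the Transformer and expert networks.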

[0126] Furthermore, the present invention also provides a virtual power plant dynamic aggregation system based on Transformer and AdaptMLP, comprising:

[0127] Data acquisition module: Collects real-time operating status and physical parameters of prosumer equipment;

[0128] Feature extraction module: Employs a Transformer encoder to extract coupling features between devices, including a multi-head self-attention calculation unit, a residual connection unit, and a layer normalization unit;

[0129] Adaptive adjustment module: Adopts the AdaptMLP structure, including 4 parallel expert networks, a gating mechanism unit and a feature fusion unit;

[0130] Decision output module: Generates device scheduling actions based on fusion features;

[0131] Parameter update module: Updates network parameters based on environmental feedback.

[0132] The feature extraction module adopts a multi-layer Transformer encoder stacked structure, with each layer including a multi-head self-attention sub-layer, a feedforward neural network sub-layer, and residual connections.

[0133] The present invention will be further illustrated by an embodiment below.

[0134] Multiple prosumers were selected as dynamic aggregation objects for VPP for case analysis, assuming that the prosumers contain various flexible resources and multiple energy loads. In actual operation, VPP needs to adapt to the scale expansion of prosumers and the characteristics of heterogeneous equipment upgrades. Table 1 shows the dynamic changes in the number of prosumers and equipment configurations under different time periods.

[0135]

[0136] Figure 2 shows the bidding results of the VPP participating in the grid peak-shaving ancillary services market under different prosumer aggregation scales, with three typical days corresponding to dynamic aggregation scenarios of 5, 10, and 20 prosumers, respectively. Positive values represent peak-shaving ancillary services, and negative values represent valley-filling ancillary services. Comparing the bidding results of the three typical days shows that, despite significant changes in the number of prosumers and equipment configuration, the VPP is always able to meet the grid's minimum bidding requirement of ≥1000 kWh.

[0137] From the temporal distribution of bidding capacity, the bidding strategy of the VPP exhibits a clear peak-valley characteristic. During valley-filling periods (1:00-8:00 and 20:00-24:00), the VPP responds to the grid's valley-filling demand by coordinating prosumers to increase electricity consumption or decrease generation. During peak-shaving periods (8:00-20:00), the VPP organizes prosumers to reduce electricity consumption or increase generation, providing peak-shaving services to the grid. As the number of prosumers increases from 5 to 20, the bidding capacity of the VPP shows a significant growth trend. This is because the complementary effect between heterogeneous resources is enhanced; for example, the CHP unit newly added on Day 2 forms a good synergy with the absorption chiller. Furthermore, the energy mutual assistance among prosumers under the P2P trading mechanism makes the overall peak-shaving capacity exceed the simple sum of individual capacities. As the aggregation scale expands, the impact of the uncertainty of individual prosumers on the overall bidding capacity is effectively dispersed, enabling the VPP to provide more stable ancillary services.

Claims

1. A method for dynamic aggregation of virtual power plants based on Transformer and AdaptMLP, characterized in that it includes the following steps:
S1: Construct a virtual power plant multi-agent system, including a VPP operator agent and multiple prosumer agents, with each prosumer agent configured with a variety of distributed energy devices;
S2: Collect physical observations of the equipment of each prosumer to form an observation matrix, the physical observations of the equipment including the energy storage state of charge, the electric boiler temperature, the chiller temperature, and the interruptible load status;
S3: Input the observation matrix into a Transformer encoder and extract inter-device coupling features using a multi-head self-attention mechanism: compute the query matrix $Q$, the key matrix $K$, and the value matrix $V$; obtain the global interaction feature vector by calculating attention weights;
S4: Construct a context vector that includes information on equipment type, capacity, ambient temperature, and electricity price;
S5: Input the Transformer output features together with the context vector into the AdaptMLP module: generate four sets of adaptive weights using a gating mechanism; apply the weights to the outputs of the four expert networks; obtain the adaptively adjusted decision features through weighted fusion;
S6: Output the scheduling actions of each device based on the decision features, including energy storage charging and discharging power, electric boiler power, chiller power, and interruptible load reduction amount;
S7: Interact with the environment to obtain reward feedback and update the network parameters of Transformer and AdaptMLP;
S8: Repeat steps S2-S7 until the model converges.

2. The method for dynamic aggregation of virtual power plants based on Transformer and AdaptMLP according to claim 1, characterized in that the method for constructing the physical observations of the equipment in step S2 is as follows:
For the $i$-th prosumer, which includes energy storage systems, electric boilers, electric chillers, and interruptible loads, the observation matrix at time $t$ includes:
Energy storage system: state of charge;
Electric boiler: normalized water tank temperature;
Electric chiller: normalized chilled water temperature;
Interruptible load: current state;
The dimension of the observation matrix is the number of devices by the feature dimension of each device.

3. The method for dynamic aggregation of virtual power plants based on Transformer and AdaptMLP according to claim 1, characterized in that the construction of the context vector in step S4 includes:
Equipment type vector: one-hot encoding is used to represent energy storage, electric boilers, chillers, and interruptible loads;
Equipment capacity: rated power or capacity after normalization;
Environmental information: outdoor temperature, solar radiation intensity;
Market information: real-time electricity prices, ancillary service prices;
The above information is concatenated to form the context vector of the corresponding dimension.

4. The method for dynamic aggregation of virtual power plants based on Transformer and AdaptMLP according to claim 1, characterized in that the adaptive adjustment process of the AdaptMLP module in step S5 is as follows:
The feature transformation formula is:
The formula for calculating the gating weight is:
The adaptive output formula is:
where the formulas use the weights and biases of the $i$-th expert network, the learnable parameters of the gating network, and the context vector.

5. The method for dynamic aggregation of virtual power plants based on Transformer and AdaptMLP according to claim 1, characterized in that the output constraint for the equipment scheduling action in step S6 is:
Energy storage charging and discharging power: ;
Electric boiler power: ;
Electric chiller power: ;
Interruptible load reduction: ;
The action is mapped to the corresponding constraint range through the sigmoid activation function.

6. The method for dynamic aggregation of virtual power plants based on Transformer and AdaptMLP according to claim 1, characterized in that the network parameter update in step S7 uses the Multi-Agent Proximal Policy Optimization (MAPPO) algorithm, which includes:
Policy network: outputs the probability distribution of device scheduling actions;
Critic network: estimates the value function of the current state-action pair;
Advantage function: ;
Policy gradient update: maximizes the clipped objective function.

7. The method for dynamic aggregation of virtual power plants based on Transformer and AdaptMLP according to claim 1, characterized in that the method also includes a dynamic expansion step:
When a new prosumer connects to the virtual power plant:
The Transformer encoder automatically adapts to changes in the length of the input sequence through a positional encoding mechanism;
The AdaptMLP module dynamically adjusts the weights of the expert networks according to the context vector of the new prosumer;
There is no need to retrain the entire model; only the gating network parameters need to be fine-tuned.

8. A system for implementing the virtual power plant dynamic aggregation method based on Transformer and AdaptMLP as described in any one of claims 1-7, characterized in that it comprises:
Data acquisition module: collects real-time operating status and physical parameters of prosumer equipment;
Feature extraction module: employs a Transformer encoder to extract coupling features between devices, including a multi-head self-attention calculation unit, a residual connection unit, and a layer normalization unit;
Adaptive adjustment module: adopts the AdaptMLP structure, including 4 parallel expert networks, a gating mechanism unit, and a feature fusion unit;
Decision output module: generates device scheduling actions based on fusion features;
Parameter update module: updates network parameters based on environmental feedback.

9. The system according to claim 8, characterized in that the feature extraction module adopts a multi-layer Transformer encoder stacked structure, with each layer including a multi-head self-attention sub-layer, a feedforward neural network sub-layer, and residual connections.