Hierarchical coordinated control method and device for active and reactive power in a new energy power station
By employing a two-layer control strategy that combines deep reinforcement learning and model predictive control, the method addresses the slow response and low regulation accuracy of coordinated active and reactive power control in renewable energy power plants, achieving efficient, precise power scheduling and improved stability.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- BEIFANG UNIV OF NATITIES
- Filing Date
- 2026-04-09
- Publication Date
- 2026-06-19
AI Technical Summary
Existing active and reactive power coordination control methods for new energy power plants suffer from slow response, poor adaptability, and low regulation accuracy, especially under complex disturbances, making it difficult to meet real-time and multi-objective requirements.
A two-layer control strategy of deep reinforcement learning (DRL) and model predictive control (MPC) is adopted. The system state variables are obtained through the state awareness layer, the pre-trained deep reinforcement learning controller is used for active and reactive power allocation at the plant level, and the equipment-level model predictive controller is combined for real-time power regulation to achieve precise power scheduling.
The method improves the operational stability and power dispatch accuracy of new energy power plants, shortens response time, enhances the system's adaptability and its utilization of wind and solar power, and reduces wind and solar curtailment.
Figure CN122246908A_ABST
Description
Technical Field
[0001] This application relates to the field of power control technology for new energy power plants, and in particular to a hierarchical coordinated control method and equipment for active and reactive power in new energy power plants.
Background Technology
[0002] With the increasing penetration rate of new energy sources, traditional power systems dominated by synchronous generators are transforming into systems primarily based on power electronic devices such as photovoltaics, wind power, and energy storage. The number of various types of converters within power plants has increased significantly, and their mutual coupling and complementary effects make voltage / frequency stability more complex, requiring new collaborative control strategies to maintain system stability. Existing active and reactive power coordination control in new energy power plants mainly relies on traditional methods such as droop control, proportional allocation, or centralized optimization, but all have shortcomings: droop control and fixed allocation have simple structures and fast responses but generate steady-state deviations and cannot be dynamically adjusted according to actual power generation conditions; centralized optimization, while theoretically able to find the global optimum, requires establishing accurate models and collecting large amounts of data, resulting in high computational complexity and response lag, making it difficult to meet the real-time requirements of actual systems; furthermore, traditional control often ignores the randomness of wind and solar power output and the multi-objective needs of the system, leading to low adjustment accuracy under complex disturbances.
Summary of the Invention
[0003] The purpose of this application is to provide a hierarchical coordinated control method and equipment for active and reactive power in new energy power plants, which can improve the operational stability of new energy power plants.
[0004] To achieve the above objective, this application provides the following solutions. In a first aspect, this application provides a hierarchical coordinated control method for active and reactive power in a new energy power plant. The method is applied to a hierarchical coordinated control architecture for active and reactive power in a new energy power plant, the architecture comprising a state awareness layer, a station-level control layer, and an equipment-level control layer. The state awareness layer is used to periodically acquire the system state variables of the new energy power station. The station-level control layer is equipped with a pre-trained deep reinforcement learning controller. The equipment-level control layer is equipped with an equipment-level model predictive controller, which includes multiple model predictive control modules; each model predictive control module corresponds one-to-one to a power generation unit in the new energy power station. The method includes: obtaining the system state variables of the new energy power station; inputting the system state variables into the pre-trained deep reinforcement learning controller to obtain the target active / reactive power of each power generation unit; and performing, by each model predictive control module, power regulation based on the target active / reactive power of the corresponding power generation unit.
[0005] Optionally, the state awareness layer includes multiple measurement units.
[0006] Optionally, the system state variables include: the real-time output power of each power generation unit in the new energy power station, the grid connection point voltage, the grid load status, and meteorological data.
[0007] Optionally, the power generation unit is a photovoltaic unit, an energy storage unit, or a dynamic reactive power compensation device.
[0008] Optionally, the reward function for training the deep reinforcement learning controller combines four penalty terms, each with its own weighting coefficient (the formulas are not reproduced in this text): a voltage deviation penalty term; a penalty term for the tracking deviation of total active and reactive power; a wind and solar curtailment penalty term; and an energy storage lifespan protection penalty term. The quantities appearing in the formulas are: the measured voltage at the grid connection point; the rated voltage at the grid connection point; the actual total active power output of the station; the station-level reference command for total active power; the actual total reactive power output of the station; the station-level reference command for total reactive power; the active power outputs of wind power, photovoltaic power, and energy storage; the reactive power outputs of wind power, photovoltaic power, energy storage, and the SVG; the current maximum power output of the wind power unit; the current maximum power output of the photovoltaic unit; the real-time state of charge of the energy storage; the center value of the optimal operating range of the energy storage; and the rated state of charge of the energy storage.
[0009] Optionally, the objective function J of the model predictive control module (the formula is not reproduced in this text) is expressed in terms of: the target active power value given by the station-level controller; the actual active power output of the equipment; the target reactive power value given by the station-level controller; and the actual reactive power output of the equipment.
[0010] Optionally, the method further includes: The hierarchical coordinated control method for active and reactive power of the new energy power plant is repeatedly executed in each control cycle.
[0011] In a second aspect, this application provides a computer device, including: a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor executes the computer program to implement the steps of the hierarchical coordinated control method for active and reactive power of new energy power plants as described above.
[0012] In a third aspect, this application provides a computer-readable storage medium storing a computer program thereon, which, when executed by a processor, implements the steps of the hierarchical coordinated control method for active and reactive power of new energy power plants described above.
[0013] In a fourth aspect, this application provides a computer program product, including a computer program that, when executed by a processor, implements the steps of the hierarchical coordinated control method for active and reactive power in new energy power plants as described above.
[0014] According to the specific embodiments provided in this application, the following technical effects are disclosed. This application provides a hierarchical coordinated control method and equipment for active and reactive power in renewable energy power plants. It primarily optimizes the scheduling and allocation of active and reactive power through a two-layer control structure. The upper layer uses a deep reinforcement learning (DRL) algorithm for power plant-level active and reactive power allocation, outputting target active and reactive power adjustment amounts based on power plant information and scheduling instructions. The lower layer uses a model predictive control (MPC) algorithm, using the target active and reactive power output from the power plant-level DRL as reference data for closed-loop control. This achieves power regulation of the static var generator (SVG) and renewable energy generation unit converters, ensuring stable operation of the grid connection point and the power plant.
Power plant-level DRL control: The deep reinforcement learning algorithm schedules the power of the entire power plant, calculating the required active and reactive power allocation based on grid scheduling instructions and real-time data.
Equipment-level MPC control: Each generation unit adjusts its output power in real time using the MPC algorithm based on the active and reactive power reference values issued by the power plant-level DRL, ensuring the stability of grid connection point voltage and frequency, and maximizing the utilization of renewable energy.
[0015] This application proposes a two-layer control strategy integrating deep reinforcement learning and model predictive control. Deep reinforcement learning, as a data-driven method that does not rely on an accurate system model, can progressively optimize the control strategy through online interactive learning, making it suitable for highly nonlinear and time-varying scenarios such as renewable energy power plants. Model predictive control, on the other hand, utilizes predictive information and equipment constraints for rolling optimization, ensuring the accuracy of local control. The combination of these two approaches, implemented in engineering through layered deployment of controller software and hardware integration, fully leverages their respective advantages while effectively addressing the problems of slow response, poor adaptability, and low regulation accuracy in existing technologies, significantly improving the performance of active and reactive power coordinated control in renewable energy power plants.
Attached Figure Description
[0016] To more clearly illustrate the technical solutions in the embodiments of this application or the prior art, the drawings used in the embodiments will be briefly introduced below. Obviously, the drawings described below are only some embodiments of this application. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.
[0017] Figure 1 is a flowchart of a hierarchical coordinated control method for active and reactive power in a new energy power station according to one embodiment of this application.
Detailed Implementation
[0018] The technical solutions of the embodiments of this application will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of this application, and not all embodiments. Based on the embodiments of this application, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of this application.
[0019] To make the above-mentioned objectives, features and advantages of this application more apparent and understandable, the application will be further described in detail below with reference to the accompanying drawings and specific embodiments.
[0020] In one exemplary embodiment, as shown in Figure 1, a hierarchical coordinated control method for active and reactive power in a renewable energy power plant is provided. This method is applied to a hierarchical coordinated control architecture for active and reactive power in a renewable energy power plant, which includes a state awareness layer, a station-level control layer, and an equipment-level control layer. The state awareness layer, which includes multiple measurement units, periodically acquires the system state variables of the renewable energy power plant. The station-level control layer is equipped with a pre-trained deep reinforcement learning controller. The equipment-level control layer is equipped with an equipment-level model predictive controller, which includes multiple model predictive control modules; each model predictive control module corresponds one-to-one with a power generation unit in the renewable energy power plant.
Station-level control layer: Composed of a deep reinforcement learning (DRL) controller, it outputs the active / reactive power dispatch targets for the entire plant based on real-time measurement data and grid dispatch instructions. Its inputs include the real-time output power of each unit (wind power, photovoltaic, energy storage, and SVG), the grid load status, and meteorological information. Its outputs are the active and reactive power reference commands for each device.
Equipment-level control layer: Composed of equipment-level MPC controllers. Each device (wind power generation unit, photovoltaic power generation unit, energy storage system, SVG) contains an MPC algorithm module, which receives the target active / reactive power from the station-level DRL controller, calculates the specific power adjustment amount within the device accordingly, generates inverter control signals, and performs real-time power regulation.
[0021] As shown in Figure 1, the hierarchical coordinated control method for active and reactive power in new energy power plants includes: Step 101: Obtain the system state variables of the new energy power station.
[0022] Step 102: Input the system state variables into the pre-trained deep reinforcement learning controller to obtain the target active / reactive power of each power generation unit.
[0023] Step 103: The model predictive control module completes power regulation based on the target active / reactive power of the corresponding power generation unit.
[0024] The hierarchical coordinated control method for active and reactive power of new energy power plants is repeatedly executed in each control cycle.
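Steps 101 to 103 and their repetition in each control cycle can be sketched as a simple loop. This is a minimal illustration under stated assumptions, not the application's implementation: `UnitMpcStub`, `placeholder_policy`, and `run_cycles` are hypothetical names, the equal-split policy merely stands in for the pre-trained DRL controller, and the first-order tracking rule stands in for a model predictive control module.

```python
class UnitMpcStub:
    """Stand-in for one equipment-level MPC module (assumed first-order tracking)."""

    def __init__(self, gain=0.5):
        self.p = 0.0      # current active power output of the unit
        self.q = 0.0      # current reactive power output of the unit
        self.gain = gain  # fraction of the tracking error removed per cycle

    def regulate(self, p_ref, q_ref):
        # Step 103: move the unit output toward the target it was given.
        self.p += self.gain * (p_ref - self.p)
        self.q += self.gain * (q_ref - self.q)
        return self.p, self.q


def placeholder_policy(state, n_units):
    """Equal split of the dispatch command; NOT the trained DRL policy."""
    share_p = state["p_cmd"] / n_units
    share_q = state["q_cmd"] / n_units
    return [(share_p, share_q) for _ in range(n_units)]


def run_cycles(units, p_cmd, q_cmd, n_cycles):
    """Repeat steps 101 to 103 once per control cycle and log station totals."""
    history = []
    for _ in range(n_cycles):
        # Step 101: obtain the system state variables.
        state = {"p_cmd": p_cmd, "q_cmd": q_cmd,
                 "unit_pq": [(u.p, u.q) for u in units]}
        # Step 102: station-level controller outputs per-unit targets.
        targets = placeholder_policy(state, len(units))
        # Step 103: each module regulates its own unit.
        outputs = [u.regulate(p, q) for u, (p, q) in zip(units, targets)]
        history.append((sum(p for p, _ in outputs),
                        sum(q for _, q in outputs)))
    return history


units = [UnitMpcStub() for _ in range(3)]
history = run_cycles(units, p_cmd=30.0, q_cmd=6.0, n_cycles=20)
```

In this toy setup the station totals in `history` approach the dispatch command geometrically; a real deployment would replace both stubs with the trained policy and the per-unit predictive controllers.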
[0025] Design of the station-level DRL controller: State: the state variables include the real-time output power of each power generation unit in the power station (photovoltaic, energy storage, and the dynamic reactive power compensation device, SVG), the grid connection point voltage, the grid load status, meteorological data, etc.
[0026] Action: Outputs the active and reactive power distribution of each power generation unit, i.e., the power value that each power generation unit needs to adjust.
[0027] Reward function: includes objectives such as voltage stability, power tracking accuracy, and reduction of wind and solar curtailment.
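One possible way to represent the state and action described above in software is sketched below. The class names, field names, units, and scaling constants are illustrative assumptions; the application does not specify a data layout.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class StationState:
    """Snapshot of the station state (hypothetical field names and units)."""
    unit_p_mw: List[float]    # real-time active power of each unit
    unit_q_mvar: List[float]  # real-time reactive power of each unit
    pcc_voltage_kv: float     # grid connection point voltage
    grid_load_mw: float       # grid load status
    irradiance_w_m2: float    # meteorological data (assumed choice)
    wind_speed_m_s: float     # meteorological data (assumed choice)

    def to_vector(self, s_base_mva: float, v_base_kv: float) -> List[float]:
        """Flatten into a scaled observation vector; the scaling is assumed."""
        vec = [p / s_base_mva for p in self.unit_p_mw]
        vec += [q / s_base_mva for q in self.unit_q_mvar]
        vec.append(self.pcc_voltage_kv / v_base_kv)
        vec.append(self.grid_load_mw / s_base_mva)
        vec.append(self.irradiance_w_m2 / 1000.0)  # assumed 1000 W/m^2 reference
        vec.append(self.wind_speed_m_s / 25.0)     # assumed 25 m/s reference
        return vec


@dataclass
class UnitSetpoint:
    """Action for one power generation unit: its target active / reactive power."""
    unit_id: str
    p_ref_mw: float
    q_ref_mvar: float


state = StationState([10.0, -5.0, 0.0], [1.0, 0.0, 2.0], 35.0, 50.0, 800.0, 5.0)
observation = state.to_vector(s_base_mva=100.0, v_base_kv=35.0)
action = [UnitSetpoint("pv", 12.0, 1.0), UnitSetpoint("ess", -4.0, 0.0)]
```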
[0028] The reward function is multi-objective, covering voltage stability, accurate power tracking, minimal wind and solar curtailment, and extended energy storage lifespan. It adopts a normalized full-penalty structure that combines four weighted sub-penalty terms (the overall formula is not reproduced in this text).
[0029] The weight coefficients of the sub-penalty terms satisfy a condition whose formula is likewise not reproduced, and can be adaptively adjusted according to the power grid dispatching needs and the operating conditions of the station. The four terms are the voltage deviation penalty term, the penalty term for total active and reactive power tracking deviation, the wind and solar curtailment penalty term, and the energy storage lifespan protection penalty term.
[0030] 1. Voltage deviation penalty term, which balances voltage stability against the hard constraints of grid security (formula not reproduced).
[0031] In the formula, the quantities are the measured voltage at the grid connection point and the rated voltage at the grid connection point.
[0032] 2. Power tracking deviation penalty term, which improves power tracking accuracy (formula not reproduced).
[0033] In the formula, the quantities are: the station-level reference commands for total active and reactive power; the actual total active power output of the station; the actual total reactive power output of the station; the active power outputs of wind power, photovoltaic power, and energy storage, respectively; and the reactive power outputs of wind power, photovoltaic power, energy storage, and the SVG, respectively.
[0034] 3. Wind and solar curtailment penalty term, which strengthens the constraint on the utilization rate of new energy sources (formula not reproduced). In the formula, the quantities are the current maximum power output of the wind power unit and the current maximum power output of the photovoltaic unit, calculated from real-time meteorological data.
[0035] 4. Energy storage lifespan protection penalty term, which constrains the state of charge (SOC) of the energy storage to its optimal interval (formula not reproduced).
[0036] In the formula, the quantities are the real-time state of charge of the energy storage, the center value of the optimal operating range of the energy storage (0.4 to 0.6), and the rated state of charge of the energy storage.
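Because the formulas themselves are not available in this text, the following is only a sketch of how a four-term normalized penalty reward of this kind might be computed. The absolute-value forms, the normalization by an assumed base power `s_base`, the equal default weights, and all function and parameter names are assumptions made for illustration; they are not the application's formulas.

```python
def station_reward(v_pcc, v_rated,
                   p_wind, p_pv, p_ess,
                   q_wind, q_pv, q_ess, q_svg,
                   p_ref, q_ref,
                   p_wind_max, p_pv_max,
                   soc, soc_opt=0.5, soc_rated=1.0,
                   s_base=100.0,
                   weights=(0.25, 0.25, 0.25, 0.25)):
    """Negative weighted sum of four normalized penalties (assumed forms)."""
    w_u, w_pq, w_cur, w_soc = weights

    # 1. Voltage deviation at the grid connection point, relative to rated.
    r_u = abs(v_pcc - v_rated) / v_rated

    # 2. Tracking deviation of total active and reactive power.
    p_total = p_wind + p_pv + p_ess
    q_total = q_wind + q_pv + q_ess + q_svg
    r_pq = (abs(p_total - p_ref) + abs(q_total - q_ref)) / s_base

    # 3. Curtailment: available wind / solar power that is not used.
    available = p_wind_max + p_pv_max
    curtailed = max(p_wind_max - p_wind, 0.0) + max(p_pv_max - p_pv, 0.0)
    r_cur = curtailed / available if available > 0.0 else 0.0

    # 4. Energy storage lifespan: distance of SOC from the optimal center.
    r_soc = abs(soc - soc_opt) / soc_rated

    return -(w_u * r_u + w_pq * r_pq + w_cur * r_cur + w_soc * r_soc)


# Perfect tracking, no curtailment, SOC at the assumed center of 0.4 to 0.6.
ideal = station_reward(1.0, 1.0, 20.0, 10.0, 0.0, 1.0, 1.0, 0.0, 3.0,
                       30.0, 5.0, 20.0, 10.0, soc=0.5)
# Same operating point with a 5 % overvoltage at the grid connection point.
overvoltage = station_reward(1.05, 1.0, 20.0, 10.0, 0.0, 1.0, 1.0, 0.0, 3.0,
                             30.0, 5.0, 20.0, 10.0, soc=0.5)
```

With a full-penalty structure like this the reward is never positive, so the best achievable value is zero; changing `weights` shifts priority between the four objectives.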
[0037] Device-level MPC controller design: Each device-level controller (PV, energy storage, SVG) uses MPC for power regulation. The MPC control algorithm takes the active and reactive power reference values issued by the station-level DRL controller as input, and aims to minimize the difference between the actual power at the grid connection point and the reference value. The objective function J (the formula is not reproduced in this text) is expressed in terms of the target active power value given by the station-level controller, the actual active power output of the equipment, the target reactive power value given by the station-level controller, and the actual reactive power output of the equipment.
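A common choice for a tracking objective built from these four quantities is a sum of squared errors. The sketch below assumes that quadratic form and a multi-step prediction horizon; both are assumptions, and `mpc_cost` is a hypothetical name, since the application's formula for J is not available here.

```python
def mpc_cost(p_ref, q_ref, p_pred, q_pred):
    """Assumed quadratic objective J: squared P and Q tracking errors,
    summed over the predicted outputs in the horizon."""
    return sum((p_ref - p) ** 2 + (q_ref - q) ** 2
               for p, q in zip(p_pred, q_pred))


# Two-step horizon: the prediction reaches the P target on the second step
# but overshoots the Q target there.
j_example = mpc_cost(p_ref=1.0, q_ref=0.0,
                     p_pred=[0.5, 1.0], q_pred=[0.0, 0.5])
```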
[0038] Control process: 1. State acquisition: The system periodically acquires the state variables from the measurement units, including the grid connection point voltage and current; the outputs of the wind turbine, photovoltaic, and energy storage units; the SOC; the SVG output; and weather and dispatch command information.
[0039] 2. Station-level DRL control: The collected system state is input into the DRL controller. Based on a pre-trained deep policy network, the DRL controller outputs the target active and reactive power adjustment amounts for the station (such as the power values by which each device needs to increase or decrease its output). This target can be the total adjustment amount for the entire station or an allocation to each device. Online, the DRL controller makes single-step decisions; because only a forward computation is needed, commands are issued quickly.
[0040] 3. Equipment-level MPC control: After receiving the target power command from the DRL controller, the equipment-level controller uses it as the input for MPC optimization. The model predictive control strategy establishes a mathematical model in the dq rotating coordinate system, samples the instantaneous values of grid voltage and grid-connected current, calculates the active and reactive power values, and predicts the active and reactive power at the next time step. Finally, the candidate control state that minimizes the cost function is selected and applied.
[0041] 4. Execution Control: The equipment-level controller generates specific control signals based on the MPC calculation results, adjusting the active and reactive power outputs of each device in real time to ensure that the actual output closely matches the target given by the DRL. After the control signals are processed by the power electronic converter, the power station achieves the required target power distribution.
[0042] 5. Iterative Cycle: The above steps are repeated within each control cycle (e.g., every second or less) to achieve closed-loop adaptive control. As the input state changes, the DRL output and MPC optimization results are updated in real time, ensuring the station's rapid response to dynamic disturbances and scheduling commands.
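The equipment-level step of the process above (sample in the dq frame, compute power, predict one step ahead, pick the candidate with the lowest cost) can be sketched as follows. The sketch assumes an amplitude-invariant dq transform, a simple series R-L grid filter discretized with forward Euler, a quadratic cost, and a caller-supplied list of candidate converter voltages; none of these details, nor the function names or default parameter values, come from the application.

```python
import math


def dq_power(v_d, v_q, i_d, i_q):
    """Instantaneous P and Q in the dq frame (amplitude-invariant convention)."""
    p = 1.5 * (v_d * i_d + v_q * i_q)
    q = 1.5 * (v_q * i_d - v_d * i_q)
    return p, q


def predict_current(i_d, i_q, u_d, u_q, v_d, v_q, r, l, omega, ts):
    """One forward-Euler step of an assumed R-L filter model.
    u is the converter voltage, v the grid voltage, i flows toward the grid."""
    di_d = (u_d - r * i_d + omega * l * i_q - v_d) / l
    di_q = (u_q - r * i_q - omega * l * i_d - v_q) / l
    return i_d + ts * di_d, i_q + ts * di_q


def select_voltage(candidates, i_d, i_q, v_d, v_q, p_ref, q_ref,
                   r=0.1, l=5e-3, omega=2.0 * math.pi * 50.0, ts=1e-4):
    """Return (cost, (u_d, u_q), (p_next, q_next)) for the best candidate."""
    best = None
    for u_d, u_q in candidates:
        i_d1, i_q1 = predict_current(i_d, i_q, u_d, u_q, v_d, v_q,
                                     r, l, omega, ts)
        p1, q1 = dq_power(v_d, v_q, i_d1, i_q1)
        cost = (p_ref - p1) ** 2 + (q_ref - q1) ** 2  # assumed quadratic cost
        if best is None or cost < best[0]:
            best = (cost, (u_d, u_q), (p1, q1))
    return best


# Hypothetical candidate converter voltages (V) around a 311 V grid d-axis voltage.
candidates = [(311.0, 0.0), (331.0, 0.0), (291.0, 0.0)]
best = select_voltage(candidates, i_d=0.0, i_q=0.0, v_d=311.0, v_q=0.0,
                      p_ref=1000.0, q_ref=0.0)
```

Starting from zero current with a positive active power target, the candidate that raises the d-axis converter voltage above the grid voltage is the one selected, because it is the only one that pushes the predicted active power toward the reference.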
[0043] Traditional droop control uses fixed coefficients, leading to steady-state voltage / frequency deviations, and the proportional allocation method often struggles to track the target power accurately. This application calculates power adjustment amounts through DRL and MPC and changes the power allocation strategy dynamically, achieving fine-grained control of the target power, avoiding the deviations caused by fixed allocation, and improving control accuracy and controllability. The lower-level MPC controller solves its problem locally and quickly, so it can track DRL commands in real time; because no lengthy centralized optimization scheduling calculation is required, response latency is significantly reduced. The DRL controller produces control actions through a neural network forward computation and therefore has millisecond-level response capability. It can also learn the dynamic characteristics of the power plant and environmental changes online, making it robust to variations in equipment parameters and to external disturbances. Compared with traditional setpoint strategies, this solution automatically optimizes the control strategy as the power plant's operating mode and meteorological conditions change, enhancing the system's adaptability. The solution jointly considers power allocation and wind / solar curtailment penalties: while ensuring compliance with dispatch commands, it prioritizes renewable energy generation and actively charges the energy storage, thus reducing wind / solar curtailment. Unlike simple proportional allocation or fixed output, this application can maximize the use of renewable energy for power generation and improve the overall output efficiency of the power plant.
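The point that an online decision needs only a forward computation can be illustrated with a small fully connected policy network. In this sketch the weights are random placeholders rather than a trained policy, and the layer sizes, the tanh output scaling to per-unit limits, and every name are assumptions made for illustration.

```python
import math
import random


def make_layer(n_in, n_out, rng):
    """Random weights standing in for the parameters of a trained network."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
               for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def dense(x, layer, activation):
    """One fully connected layer followed by an activation function."""
    weights, biases = layer
    return [activation(sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(weights, biases)]


def relu(z):
    return max(z, 0.0)


def policy_forward(state, layers, p_limits, q_limits):
    """Map a state vector to per-unit P and Q targets with one forward pass."""
    h = state
    for layer in layers[:-1]:
        h = dense(h, layer, relu)
    out = dense(h, layers[-1], math.tanh)  # each output lies in [-1, 1]
    n = len(p_limits)
    p_cmd = [out[i] * p_limits[i] for i in range(n)]      # scale to P limits
    q_cmd = [out[n + i] * q_limits[i] for i in range(n)]  # scale to Q limits
    return p_cmd, q_cmd


rng = random.Random(0)
layers = [make_layer(10, 16, rng), make_layer(16, 8, rng)]
# Four hypothetical units; the last has no active power capability (SVG-like).
p_limits = [50.0, 30.0, 20.0, 0.0]
q_limits = [10.0, 10.0, 5.0, 15.0]
p_cmd, q_cmd = policy_forward([0.1] * 10, layers, p_limits, q_limits)
```

Bounding the outputs with tanh and scaling by each unit's limit keeps every command inside that unit's range by construction, which is one simple way to make the station-level targets feasible before the equipment-level controllers act on them.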
[0044] In one exemplary embodiment, a computer device is provided, which may be a server or a terminal. The computer device includes a processor, memory, input / output interfaces (I / O), and a communication interface. The processor, memory, and I / O interfaces are connected via a system bus, and the communication interface is connected to the system bus via the I / O interfaces. The processor of the computer device provides computing and control capabilities. The memory of the computer device includes non-volatile storage media and internal memory. The non-volatile storage media stores an operating system, computer programs, and a database. The internal memory provides an environment for the operation of the operating system and computer programs stored in the non-volatile storage media. The I / O interfaces of the computer device are used for exchanging information between the processor and external devices. The communication interface of the computer device is used for communicating with external terminals via a network connection.
[0045] In one exemplary embodiment, a computer device is also provided, including a memory and a processor, wherein the memory stores a computer program, and the processor executes the computer program to implement the steps in the above-described method embodiments.
[0046] In one exemplary embodiment, a computer-readable storage medium is provided storing a computer program that, when executed by a processor, implements the steps in the above-described method embodiments.
[0047] In one exemplary embodiment, a computer program product is provided, including a computer program that, when executed by a processor, implements the steps in the above-described method embodiments.
[0048] It should be noted that the user information (including but not limited to user device information, user personal information, etc.) and data (including but not limited to data used for analysis, data stored, data displayed, etc.) involved in this application are all information and data authorized by the user or fully authorized by all parties, and the collection, use and processing of the relevant data must comply with relevant regulations.
[0049] Those skilled in the art will understand that all or part of the processes in the above embodiments can be implemented by a computer program instructing related hardware. The computer program can be stored in a non-volatile computer-readable storage medium. When executed, the computer program can include the processes of the embodiments described above. Any references to memory, databases, or other media used in the embodiments provided in this application can include at least one of non-volatile and volatile memory. Non-volatile memory can include read-only memory (ROM), magnetic tape, floppy disk, flash memory, optical memory, high-density embedded non-volatile memory, resistive random access memory (ReRAM), magnetic random access memory (MRAM), ferroelectric random access memory (FRAM), phase change memory (PCM), graphene memory, etc. Volatile memory can include random access memory (RAM) or external cache memory, etc. By way of illustration and not limitation, RAM can take many forms, such as Static Random Access Memory (SRAM) or Dynamic Random Access Memory (DRAM).
[0050] The databases involved in the embodiments provided in this application may include at least one type of relational database and non-relational database. Non-relational databases may include, but are not limited to, blockchain-based distributed databases. The processors involved in the embodiments provided in this application may be general-purpose processors, central processing units, graphics processing units, digital signal processors, programmable logic devices, quantum computing-based data processing logic devices, etc., and are not limited to these.
[0051] The technical features of the above embodiments can be combined in any way. For the sake of brevity, not all possible combinations of the technical features in the above embodiments are described. However, as long as there is no contradiction in the combination of these technical features, they should be considered to be within the scope of this specification.
[0052] This document uses specific examples to illustrate the principles and implementation methods of this application. The descriptions of the above embodiments are only for the purpose of helping to understand the methods and core ideas of this application. Furthermore, those skilled in the art will recognize that, based on the ideas of this application, there will be changes in the specific implementation methods and application scope. Therefore, the content of this specification should not be construed as a limitation of this application.
Claims
1. A hierarchical coordinated control method for active and reactive power in a new energy power plant, characterized in that the method is applied to a hierarchical coordinated control architecture for active and reactive power in a new energy power plant, the architecture comprising: a state awareness layer, a station-level control layer, and an equipment-level control layer; the state awareness layer is used to periodically acquire the system state variables of the new energy power station; the station-level control layer is equipped with a pre-trained deep reinforcement learning controller; the equipment-level control layer is equipped with an equipment-level model predictive controller; the equipment-level model predictive controller includes multiple model predictive control modules; each model predictive control module is configured in a one-to-one correspondence with a power generation unit in the new energy power station; the method includes: obtaining the system state variables of the new energy power station; inputting the system state variables into the pre-trained deep reinforcement learning controller to obtain the target active / reactive power of each power generation unit; and performing, by the model predictive control module, power regulation based on the target active / reactive power of the corresponding power generation unit.
2. The hierarchical coordinated control method for active and reactive power in new energy power plants according to claim 1, characterized in that, The state awareness layer includes multiple measurement units.
3. The hierarchical coordinated control method for active and reactive power in new energy power plants according to claim 1, characterized in that, The system state variables include: the real-time output power of each power generation unit in the new energy power station, the grid connection point voltage, the grid load status, and meteorological data.
4. The hierarchical coordinated control method for active and reactive power in new energy power plants according to claim 1, characterized in that, The power generation unit is a photovoltaic unit, an energy storage unit, or a dynamic reactive power compensation device.
5. The hierarchical coordinated control method for active and reactive power in new energy power plants according to claim 1, characterized in that, The reward function for training the deep reinforcement learning controller combines four penalty terms, each with its own weighting coefficient (the formulas are not reproduced in this text): a voltage deviation penalty term; a penalty term for the tracking deviation of total active and reactive power; a wind and solar curtailment penalty term; and an energy storage lifespan protection penalty term; the quantities appearing in the formulas are: the measured voltage at the grid connection point; the rated voltage at the grid connection point; the actual total active power output of the station; the station-level reference command for total active power; the actual total reactive power output of the station; the station-level reference command for total reactive power; the active power outputs of wind power, photovoltaic power, and energy storage; the reactive power outputs of wind power, photovoltaic power, energy storage, and the SVG; the current maximum power output of the wind power unit; the current maximum power output of the photovoltaic unit; the real-time state of charge of the energy storage; the center value of the optimal operating range of the energy storage; and the rated state of charge of the energy storage.
6. The hierarchical coordinated control method for active and reactive power in new energy power plants according to claim 1, characterized in that, The objective function J of the model predictive control module (the formula is not reproduced in this text) is expressed in terms of: the target active power value given by the station-level controller; the actual active power output of the equipment; the target reactive power value given by the station-level controller; and the actual reactive power output of the equipment.
7. The hierarchical coordinated control method for active and reactive power in new energy power plants according to claim 1, characterized in that, The method further includes: The hierarchical coordinated control method for active and reactive power of the new energy power plant is repeatedly executed in each control cycle.
8. A computer device, comprising: A memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that the processor executes the computer program to implement the hierarchical coordinated control method for active and reactive power in new energy power plants as described in any one of claims 1-7.
9. A computer-readable storage medium having a computer program stored thereon, characterized in that, When the computer program is executed by the processor, it implements the hierarchical coordinated control method for active and reactive power in new energy power plants as described in any one of claims 1-7.
10. A computer program product, comprising a computer program, characterized in that, When the computer program is executed by the processor, it implements the hierarchical coordinated control method for active and reactive power in new energy power plants as described in any one of claims 1-7.