A virtual boundary node driven decoupling method for boundary interaction

By introducing virtual boundary nodes and a deep reconfiguration model into the distribution network, and combining deep reinforcement learning with particle swarm optimization, the problem of real-time interaction dependency of cross-cluster boundary states in the distribution network is solved, achieving efficient local control optimization and real-time performance improvement.

CN122136880B · Active · Publication Date: 2026-06-30 · HEFEI UNIV OF TECH +2

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
HEFEI UNIV OF TECH
Filing Date
2026-05-06
Publication Date
2026-06-30

AI Technical Summary

Technical Problem

Existing distributed control methods in power distribution networks suffer from strong real-time interaction dependence across cluster boundary states, long communication links, queuing congestion, and large closed-loop delays, making it difficult to balance control accuracy, robustness, and real-time performance.

Method used

A boundary interaction decoupling method driven by virtual boundary nodes is adopted. Virtual boundary nodes and a local dynamic discrete state space model are constructed, and boundary states are estimated using a state observer and a deep reconstruction model. Local optimization is then performed by combining deep reinforcement learning with particle swarm optimization, achieving communication blanking and boundary state reconstruction.

Benefits of technology

It reduces reliance on cross-domain communication, improves the real-time performance and robustness of control, and achieves rapid response and high-quality local control optimization.

✦ Generated by Eureka AI based on patent content.

Abstract

This invention discloses a virtual boundary node-driven boundary interaction decoupling method, comprising the following steps: dividing the entire distribution network into M control clusters and identifying physical boundary nodes between adjacent clusters; constructing a corresponding virtual boundary node for each physical boundary node; establishing a local dynamic discrete state space model for each control cluster; calculating the boundary reconstruction error, model uncertainty index, and boundary consistency deviation of the virtual boundary state; constructing a communication blanking criterion based on preset boundary reconstruction error thresholds, consistency deviation thresholds, and minimum confidence thresholds; constructing a local optimization objective function for distribution network voltage control in the rolling time domain; and distributing the optimal control solution to controllable equipment in the distribution network for execution. This invention achieves rapid response and high-quality solution for local control optimization through a joint solution method of "deep reinforcement learning for rapid initial values of actions + particle swarm optimization for fine-grained search."

Description

Technical Field

[0001] This invention relates to the field of power communication technology, and more specifically to a boundary interaction decoupling method driven by a virtual boundary node.

Background Technology

[0002] With the large-scale integration of distributed photovoltaic, energy storage, flexible loads, and power electronic equipment into the distribution network, the system operation exhibits characteristics such as rapid state changes, strong boundary coupling, complex communication environment, and strict control time limits. Traditional centralized control requires continuous feedback of measurements from the entire network to the master station, resulting in long communication links, significant queuing congestion, and large closed-loop delays.

[0003] To improve response speed, existing methods typically divide the distribution network into multiple control clusters and configure edge controllers within each cluster. While this can shorten local solution links, adjacent clusters still need to frequently exchange voltage, power, current, or power flow information at boundary nodes. In other words, existing distributed control methods primarily transform "global communication" into "boundary communication," without fundamentally eliminating cross-domain real-time interaction dependencies.

[0004] Therefore, the core problem of this technical solution can be summarized as: how to transform the "real-time acquisition problem of cross-cluster boundary state" into the "online reconstruction problem of boundary state based on local information", and achieve a balance between control accuracy, robustness and real-time performance.

Summary of the Invention

[0005] The present invention proposes a virtual boundary node-driven boundary interaction decoupling method, which can solve at least one of the technical problems in the background art.

[0006] To achieve the above objectives, the present invention adopts the following technical solution:

[0007] A virtual boundary node-driven boundary interaction decoupling method, applied to a distributed control scenario of a distribution network cluster containing distributed power sources, includes the following steps:

[0008] S1. Based on the electrical coupling strength, geographical proximity, communication coverage and controller computing power of the distribution network nodes, the nodes of the entire distribution network are divided into M control clusters, and the physical boundary nodes between adjacent clusters are identified.

[0009] S2. Construct a corresponding virtual boundary node for each physical boundary node. The virtual boundary node is a digital substitute in the control model, used to replace the direct call to the real-time boundary data of adjacent clusters during the online control process.

[0010] S3. Establish a local dynamic discrete state space model for each control cluster, construct a state observer based on the local dynamic discrete state space model, and output the prior estimates of the local state and boundary state of the corresponding cluster through the state observer.

[0011] S4. Construct an observation history window of length H. Based on the observation data in the observation history window, the local state prior estimate output by the state observer, the control input at the previous control time, and the topological embedding features of the corresponding cluster, obtain the virtual boundary state estimate by mapping through a pre-trained deep reconstruction model.

[0012] S5. Calculate the boundary reconstruction error, model uncertainty index, and boundary consistency deviation of the virtual boundary state, and calculate the boundary reconstruction confidence of the corresponding cluster at the current time based on the boundary reconstruction error, model uncertainty index, and boundary consistency deviation.

[0013] S6. Construct communication blanking criteria based on preset boundary reconstruction error threshold, consistency deviation threshold, and minimum confidence threshold; when the communication blanking criteria are met, enter the communication blanking mode, directly use the virtual boundary state estimate to participate in local control solution, and do not initiate real-time boundary data interaction with adjacent clusters; when the communication blanking criteria are not met, trigger supplementary communication, obtain the real boundary data of adjacent clusters, and correct the deep reconstruction model;

[0014] S7. Based on the virtual boundary state estimate, a local optimization objective function for distribution network voltage control in the rolling time domain is constructed. The local optimization objective function is solved using a solution framework combining deep reinforcement learning and particle swarm optimization to obtain the optimal control solution.

[0015] S8. The optimal control solution is sent to the controllable equipment in the distribution network for execution. Steps S3 to S8 are executed again in the next control cycle to complete the control closed-loop iteration.

[0016] As a preferred embodiment of the virtual boundary node-driven boundary interaction decoupling method of the present invention, wherein: the state observer constructed in step S3 is a Luenberger state observer, and its state update equation is constructed based on the cluster local dynamic discrete state space model; in the communication blanking mode, when the system matrix of the state observer is stable, the reconstruction error of the deep reconstruction model is bounded, and the event-triggered supplementary communication satisfies the maximum allowable step loss interval constraint, both the boundary state estimation error and the local state estimation error remain bounded, satisfying:

[0017] $$\left\|\tilde{b}_{c,t}\right\| \le \bar{\varepsilon}_{b}, \qquad \left\|\tilde{x}_{c,t}\right\| \le \bar{\varepsilon}_{x}$$

[0018] In the formula, $\tilde{b}_{c,t}$ denotes the boundary state estimation error; $\tilde{x}_{c,t}$ denotes the local state estimation error; $\bar{\varepsilon}_{b}$ denotes the upper bound of the boundary state estimation error; and $\bar{\varepsilon}_{x}$ denotes the upper bound of the local state estimation error.

[0019] As a preferred embodiment of the virtual boundary node-driven boundary interaction decoupling method of the present invention, wherein: the deep reconstruction model in step S4 adopts a combination architecture of graph neural network and gated recurrent unit, or a combination architecture of graph attention network and temporal convolutional network; the deep reconstruction model simultaneously extracts the topological correlation features of the distribution network and the temporal evolution features of the boundary state, and the mapping relationship expression is:

[0020] $$\hat{b}_{c,t} = f_{\theta}\left(Y_{c,t}^{H},\ \hat{x}_{c,t},\ u_{c,t-1},\ z_{c}\right)$$

[0021] In the formula, $\hat{b}_{c,t}$ is the estimated virtual boundary state of the $c$-th cluster at time $t$; $f_{\theta}$ is the deep reconstruction model with parameters $\theta$; $Y_{c,t}^{H}$ is the historical observation window of length $H$; $\hat{x}_{c,t}$ is the local state estimate output by the observer; $u_{c,t-1}$ is the control input at the previous control time step; and $z_{c}$ is the topological embedding feature of the $c$-th cluster.

[0022] As a preferred embodiment of the boundary interaction decoupling method driven by virtual boundary nodes described in this invention, in step S5, the model uncertainty index is calculated based on the output variance of the integrated deep reconstruction model, and the boundary consistency deviation is calculated based on the residual function of local physical constraints.

[0023] The expression for the model uncertainty index is:

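[0024] $$\bar{b}_{c,t} = \frac{1}{K}\sum_{k=1}^{K}\hat{b}_{c,t}^{(k)}, \qquad \sigma_{c,t} = \frac{1}{K}\sum_{k=1}^{K}\left\|\hat{b}_{c,t}^{(k)} - \bar{b}_{c,t}\right\|_{2}^{2}$$

[0025] In the formula, $K$ denotes the number of parallel reconstruction sub-models; $\hat{b}_{c,t}^{(k)}$ denotes the boundary estimate output by the $k$-th sub-model; $\bar{b}_{c,t}$ denotes the mean of the outputs of the $K$ sub-models; and $\sigma_{c,t}$ denotes the uncertainty index defined by the dispersion of the ensemble model output.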

[0026] The expression for the boundary consistency deviation is:

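[0027] $$d_{c,t} = \left\| r\left(\hat{x}_{c,t},\ \hat{b}_{c,t},\ y_{c,t}\right) \right\|_{2}$$

[0028] In the formula, $r(\cdot)$ denotes the residual function composed of power flow balance, nodal power conservation, and boundary coupling constraints, evaluated on the local state estimate $\hat{x}_{c,t}$, the virtual boundary state $\hat{b}_{c,t}$, and the local observation $y_{c,t}$; $d_{c,t}$ denotes the consistency deviation between the virtual boundary state and the local physical model.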

[0029] As a preferred embodiment of the virtual boundary node-driven boundary interaction decoupling method of the present invention, wherein: in step S5, the expression for the reconstruction confidence is:

[0030]

[0031] In the formula, $\alpha$, $\beta$, and $\gamma$ are non-negative scaling factors that control how strongly the reconstruction error $e_{c,t}$, the consistency deviation $d_{c,t}$, and the uncertainty $\sigma_{c,t}$, respectively, affect the confidence; $\rho_{c,t}$ denotes the confidence of the boundary estimate of the $c$-th cluster at time $t$, which lies between 0 and 1.

[0032] As a preferred embodiment of the virtual boundary node-driven boundary interaction decoupling method of the present invention, wherein: in step S6, the expression for the communication blanking criterion is:

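[0033] $$\kappa_{c,t} = \begin{cases} 1, & e_{c,t} \le \varepsilon_{e} \ \text{and}\ d_{c,t} \le \varepsilon_{d} \ \text{and}\ \rho_{c,t} \ge \rho_{\min} \\ 0, & \text{otherwise} \end{cases}$$

[0034] In the formula, $e_{c,t}$, $d_{c,t}$, and $\rho_{c,t}$ are the boundary reconstruction error, the boundary consistency deviation, and the reconstruction confidence; $\varepsilon_{e}$ denotes the boundary reconstruction error threshold; $\varepsilon_{d}$ denotes the consistency deviation threshold; $\rho_{\min}$ denotes the minimum confidence threshold; and $\kappa_{c,t}$ denotes the communication blanking criterion;

[0035] when $\kappa_{c,t} = 1$, the controller directly adopts $\hat{b}_{c,t}$ to participate in the local control solution and no longer requests real-time boundary data from adjacent clusters; when $\kappa_{c,t} = 0$, the system triggers supplementary communication to obtain the true boundary information and correct the model;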

[0036] The trigger function expression for the supplementary communication is:

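[0037] $$\tau_{c,t} = \begin{cases} 1, & e_{c,t} > \varepsilon_{e} \ \text{or}\ d_{c,t} > \varepsilon_{d} \ \text{or}\ \rho_{c,t} < \rho_{\min} \ \text{or}\ \left|\Delta P_{c,t}\right| > \varepsilon_{P} \ \text{or}\ \xi_{c,t} = 1 \\ 0, & \text{otherwise} \end{cases}$$

[0038] In the formula, $\tau_{c,t}$ denotes the supplementary communication trigger flag; $e_{c,t}$, $d_{c,t}$, and $\rho_{c,t}$ are the boundary reconstruction error, the boundary consistency deviation, and the reconstruction confidence, with thresholds $\varepsilon_{e}$, $\varepsilon_{d}$, and $\rho_{\min}$; $\Delta P_{c,t}$ denotes the abrupt change in renewable generation output or load injection; $\varepsilon_{P}$ denotes the power mutation threshold; and $\xi_{c,t} = 1$ denotes a topology switching event.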

[0039] As a preferred embodiment of the boundary interaction decoupling method driven by virtual boundary nodes described in this invention, in step S6, after supplementary communication is triggered and the real boundary data are obtained, the parameters of the deep reconstruction model are fine-tuned in a supervised manner, using the real boundary data as labels and the current observation history window, local state prior estimates, and topological embedding features as inputs; at the same time, the prior estimates of the state observer are corrected based on the real boundary data to ensure the accuracy of the prior estimates in the next control cycle.

[0040] As a preferred embodiment of the virtual boundary node-driven boundary interaction decoupling method of the present invention, wherein: in step S7, the local optimization objective function is defined over a rolling time domain of length $N$, with the optimization objectives of minimizing node voltage deviation, keeping control actions smooth, and maintaining boundary coupling consistency. Its expression is:

[0041] $$J_{c} = \sum_{k=1}^{N}\left(\lambda_{1}\left\|V_{c,t+k} - V^{\mathrm{ref}}\right\|_{Q}^{2} + \lambda_{2}\left\|\Delta u_{c,t+k}\right\|_{R}^{2} + \lambda_{3}\left\|r^{b}_{c,t+k}\right\|_{S}^{2}\right)$$

[0042] In the formula, $J_{c}$ denotes the local control objective of the $c$-th cluster in the rolling time domain; $k$ is the time index within the rolling optimization; $V_{c,t+k}$ denotes the node voltage vector at time $t+k$; $V^{\mathrm{ref}}$ denotes the reference voltage vector; $\Delta u_{c,t+k}$ denotes the control increment between adjacent control cycles; $r^{b}_{c,t+k}$ denotes the boundary coupling residual; $Q$, $R$, and $S$ are the weighting matrices for voltage deviation, control increment, and boundary residual, respectively; and $\lambda_{1}$, $\lambda_{2}$, and $\lambda_{3}$ are the weights of the three types of objective terms, respectively.

[0043] Local control satisfies the following constraints:

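[0044] $$V^{\min} \le V_{i,t} \le V^{\max}$$

[0045] $$u^{\min} \le u_{c,t} \le u^{\max}, \qquad \left|\Delta u_{c,t}\right| \le \Delta u^{\max}$$

[0046] $$x_{c,t+1} = F\left(x_{c,t},\ u_{c,t},\ \hat{b}_{c,t}\right)$$

[0047] In the formula, $V^{\min}$ and $V^{\max}$ are the lower and upper limits of the allowable voltage, respectively; $V_{i,t}$ denotes the voltage magnitude of node $i$ at time $t$; $u^{\min}$ and $u^{\max}$ are the lower and upper bounds of the control input, respectively; $\Delta u^{\max}$ denotes the upper limit of the single-step action variation, used to limit the ramp rate of reactive power compensation, energy storage regulation, or tap changer actions; and $F(\cdot)$ denotes the local system dynamic constraint after the virtual boundary state $\hat{b}_{c,t}$ is introduced.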

[0048] As a preferred embodiment of the virtual boundary node-driven boundary interaction decoupling method of the present invention, wherein: in step S7, the solution framework jointly implemented by deep reinforcement learning and particle swarm optimization includes an offline training phase and an online solution phase, the specific steps of which are as follows:

[0049] S71. Model the cluster voltage control problem as a Markov decision process. Define the system state at time t as the local state of the cluster, the observation vector, and the estimated value of the virtual boundary state. The action space is the adjustment amount of the controllable equipment. The reward function is the negative mapping of the local optimization objective function. Maximizing the cumulative reward is equivalent to minimizing the local optimization objective function.

[0050] S72. The policy network and value network are trained using a deep reinforcement learning algorithm based on the actor-critic architecture. An experience replay pool is constructed to store the state, action, reward, and next state data of the control process. The parameters of the value network and policy network are updated based on mini-batch sampled data to complete offline training.

[0051] S73. Input the current system state into the trained policy network and output the action prior value; construct the initial position of the particle swarm in its neighborhood with the action prior value as the center, and perform a fine search in the local feasible region through the particle swarm algorithm to obtain the optimal control solution that satisfies the constraints.

[0052] S74. During the low-frequency online update phase, new running data is sampled from the experience replay pool to incrementally update the parameters of the policy network and the value network, thereby optimizing the output accuracy of the action prior values.

[0053] As a preferred embodiment of the virtual boundary node-driven boundary interaction decoupling method of the present invention, wherein: in step S73, the construction expression for the initial position of the particle swarm is:

[0054] $$p_{i}^{(0)} = a_{t}^{\mathrm{prior}} + \delta_{i}, \qquad i = 1, \dots, N_{p}$$

[0055] In the formula, $p_{i}^{(0)}$ denotes the initial position of the $i$-th particle; $a_{t}^{\mathrm{prior}}$ denotes the action prior value output by the policy network; $\delta_{i}$ denotes a small perturbation vector that satisfies the action constraints; and $N_{p}$ denotes the number of particles.

[0056] The beneficial effects of this invention are:

[0057] This invention transforms the "real-time communication acquisition" of cross-cluster boundary states into "local observation + deep model reconstruction", thereby reducing the rigid dependence on cross-domain communication from the problem definition level.

[0058] This invention uses a three-element joint criterion of "error-consistency-uncertainty" to determine whether to enter the communication blanking mode, thus avoiding the risk of misjudgment caused by judging based on a single error threshold.

[0059] This invention achieves rapid response and high-quality solution for local control optimization by combining "deep reinforcement learning to quickly provide initial values for actions + particle swarm optimization for fine-grained search".

Attached Figure Description

[0060] Figure 1 is a flowchart illustrating the steps of the boundary interaction decoupling method driven by virtual boundary nodes according to the present invention.

Detailed Implementation

[0061] To make the objectives, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are some embodiments of the present invention, but not all embodiments.

[0062] As shown in Figure 1, a virtual boundary node-driven boundary interaction decoupling method is applied to a distributed control scenario of a distribution network cluster containing distributed power sources, and includes the following steps:

[0063] S1. Based on the electrical coupling strength, geographical proximity, communication coverage and controller computing power of the distribution network nodes, the nodes of the entire distribution network are divided into M control clusters, and the physical boundary nodes between adjacent clusters are identified.

[0064] S2. Construct a corresponding virtual boundary node for each physical boundary node. The virtual boundary node is a digital substitute in the control model, used to replace the direct call to the real-time boundary data of adjacent clusters during the online control process.

[0065] S3. Establish a local dynamic discrete state space model for each control cluster, construct a state observer based on the local dynamic discrete state space model, and output the prior estimates of the local state and boundary state of the corresponding cluster through the state observer.

[0066] S4. Construct an observation history window of length H. Based on the observation data within the observation history window, the local state prior estimate output by the state observer, the control input at the previous control time, and the topological embedding features of the corresponding cluster, obtain the virtual boundary state estimate through mapping using a pre-trained deep reconstruction model.

[0067] S5. Calculate the boundary reconstruction error, model uncertainty index, and boundary consistency deviation of the virtual boundary state. Based on the boundary reconstruction error, model uncertainty index, and boundary consistency deviation, calculate the boundary reconstruction confidence of the corresponding cluster at the current moment.

[0068] S6. Based on preset boundary reconstruction error threshold, consistency deviation threshold, and minimum confidence threshold, a communication blanking criterion is constructed. When the communication blanking criterion is met, the communication blanking mode is entered, and the virtual boundary state estimate is directly used to participate in the local control solution without initiating real-time boundary data interaction with adjacent clusters. When the communication blanking criterion is not met, supplementary communication is triggered to obtain the real boundary data of adjacent clusters and to correct the deep reconstruction model.

[0069] S7. Based on the virtual boundary state estimate, a local optimization objective function for distribution network voltage control in the rolling time domain is constructed. The local optimization objective function is solved using a solution framework combining deep reinforcement learning and particle swarm optimization to obtain the optimal control solution.

[0070] S8. Send the optimal control solution to the controllable equipment in the distribution network for execution. In the next control cycle, repeat steps S3 to S8 to complete the control closed-loop iteration.

[0071] In step S1, to reduce the dimensionality of large-scale distribution network control problems, the nodes of the entire network are first divided into $M$ control clusters based on electrical coupling strength, geographical proximity, communication coverage, and controller computing power. Let the cluster number of node $i$ be $c_{i}$; the cluster partitioning can then be expressed in the following optimization form:

[0072] $$\min_{\{c_{i}\}} \ \eta_{1} \sum_{(i,j)\in\mathcal{E}} w_{ij}\,\mathbb{1}\left(c_{i} \ne c_{j}\right) + \eta_{2} \sum_{c=1}^{M}\left(L_{c} - \bar{L}\right)^{2}$$

[0073] In the formula, $c_{i}$ denotes the number of the control cluster to which node $i$ belongs; $\mathcal{E}$ denotes the set of branches in the distribution network topology $\mathcal{G}$; $(i,j)$ denotes a candidate coupling edge formed by node $i$ and node $j$; $w_{ij}$ denotes the electrical coupling weight or communication-related weight of that edge; $\mathbb{1}(\cdot)$ is the indicator function, equal to 1 when $c_{i} \ne c_{j}$, i.e., when node $i$ and node $j$ are assigned to different clusters, and 0 otherwise; $L_{c}$ denotes the overall task load of the $c$-th cluster; $\bar{L}$ denotes the expected average load; and $\eta_{1}$ and $\eta_{2}$ are weighting coefficients, where $\eta_{1}$ is used to prevent strongly coupled edges from being split into different clusters and $\eta_{2}$ is used to constrain load balancing across clusters.
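As a minimal sketch of how a candidate partition could be scored under the two criteria described for step S1 (keeping strongly coupled edges together and balancing cluster load), the following Python function sums the weights of cut edges and adds a load-imbalance penalty. The squared form of the load penalty, the names `partition_cost`, `eta_cut`, and `eta_balance`, and the edge-list and dictionary data layout are illustrative assumptions and are not taken from the patent text.

```python
from collections import defaultdict


def partition_cost(edges, assignment, node_load, eta_cut=1.0, eta_balance=1.0):
    """Score a partition: weighted cut plus squared load imbalance (assumed form).

    edges      -- iterable of (i, j, weight) candidate coupling edges
    assignment -- dict mapping node -> cluster number
    node_load  -- dict mapping node -> task load contributed by that node
    """
    # Coupling term: sum the weights of edges whose endpoints are in different clusters.
    cut = sum(w for i, j, w in edges if assignment[i] != assignment[j])

    # Load term: squared deviation of each cluster's load from the average load.
    load = defaultdict(float)
    for node, cluster in assignment.items():
        load[cluster] += node_load[node]
    mean_load = sum(load.values()) / len(load)
    imbalance = sum((value - mean_load) ** 2 for value in load.values())

    return eta_cut * cut + eta_balance * imbalance


# Hypothetical 4-node feeder: strong coupling on 1-2 and 3-4, weak on 2-3.
edges = [(1, 2, 5.0), (2, 3, 1.0), (3, 4, 5.0)]
loads = {1: 1.0, 2: 1.0, 3: 1.0, 4: 1.0}
good = partition_cost(edges, {1: 0, 2: 0, 3: 1, 4: 1}, loads)
bad = partition_cost(edges, {1: 0, 2: 1, 3: 1, 4: 1}, loads)
```

In this toy case the partition that cuts only the weak edge and keeps loads equal scores lower than the one that splits a strongly coupled pair, which is the behavior the two weighting coefficients are meant to encourage.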

[0074] Furthermore, in step S1, the physical boundary node set is defined as follows:

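[0075] $$\mathcal{B}_{c} = \left\{\, i \in \mathcal{N}_{c} \ \middle|\ \exists\, j \in \mathcal{N}\setminus\mathcal{N}_{c},\ (i,j) \in \mathcal{E} \,\right\}$$

[0076] In the formula, $\mathcal{B}_{c}$ denotes the set of physical boundary nodes of the $c$-th cluster; $\mathcal{N}_{c}$ denotes the set of nodes within the cluster; $\mathcal{N}$ denotes the set of all network nodes; $\mathcal{N}\setminus\mathcal{N}_{c}$ denotes the set of external nodes that do not belong to the $c$-th cluster; and the condition $\exists\, j \in \mathcal{N}\setminus\mathcal{N}_{c},\ (i,j) \in \mathcal{E}$ indicates that there exists a node $j$ outside the cluster that is connected to the cluster node $i$.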
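The boundary node definition above can be illustrated with a short Python sketch: a node is a physical boundary node of its cluster if at least one of its branches leads to a node in a different cluster. The function name `boundary_nodes` and the edge-list input format are assumptions made for illustration.

```python
def boundary_nodes(edges, assignment):
    """Return {cluster: set of its nodes connected to a node outside the cluster}.

    edges      -- iterable of (i, j) branches of the network
    assignment -- dict mapping node -> cluster number
    """
    boundary = {cluster: set() for cluster in set(assignment.values())}
    for i, j in edges:
        if assignment[i] != assignment[j]:
            # Both endpoints of a cross-cluster branch are boundary nodes
            # of their respective clusters.
            boundary[assignment[i]].add(i)
            boundary[assignment[j]].add(j)
    return boundary


# Hypothetical 4-node chain split into two clusters between nodes 2 and 3.
result = boundary_nodes([(1, 2), (2, 3), (3, 4)], {1: 0, 2: 0, 3: 1, 4: 1})
```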

[0077] In step S2, for each physical boundary node, a corresponding virtual boundary node is constructed to replace the direct call to real-time boundary data of adjacent clusters in online control. For the $c$-th cluster, its boundary state vector $b_{c,t}$ is defined as:

[0078] $$b_{c,t} = \left[\left(b_{c,t}^{1}\right)^{\top},\ \dots,\ \left(b_{c,t}^{n_{c}}\right)^{\top}\right]^{\top}, \qquad b_{c,t}^{i} = \left[V_{c,t}^{i},\ P_{c,t}^{i},\ Q_{c,t}^{i},\ I_{c,t}^{i}\right]^{\top}$$

[0079] In the formula, $n_{c}$ denotes the number of boundary nodes in the $c$-th cluster; $b_{c,t}^{1}$ denotes the state subvector of the first boundary node at time $t$; $b_{c,t}^{n_{c}}$ denotes the state subvector of the $n_{c}$-th boundary node at time $t$; within each state subvector, $V_{c,t}^{i}$, $P_{c,t}^{i}$, $Q_{c,t}^{i}$, and $I_{c,t}^{i}$ represent the voltage magnitude, active power, reactive power, and equivalent injected current at the boundary node, respectively; and $b_{c,t}$ is the true boundary state vector formed by stacking all boundary node state subvectors column-wise.

[0080] $$\hat{b}_{c,t} = \left[\left(\hat{b}_{c,t}^{1}\right)^{\top},\ \dots,\ \left(\hat{b}_{c,t}^{n_{c}}\right)^{\top}\right]^{\top}$$

[0081] In the formula, $\hat{b}_{c,t}$ denotes the estimate of the true boundary state vector, and $\hat{b}_{c,t}^{i}$ denotes the state estimate subvector of the $i$-th boundary node at time $t$. The virtual boundary node does not correspond to an additional physical device, but is a digital substitute in the control model, whose role is to replace the online cross-cluster boundary data with local estimates.

[0082] In step S3, to achieve boundary state reconstruction, it is necessary to establish a local dynamic model of the $c$-th cluster. Let its discrete state-space expression be:

[0083] $$x_{c,t+1} = A_{c}x_{c,t} + B_{c}u_{c,t} + E_{c}b_{c,t} + w_{c,t}$$

[0084] $$y_{c,t} = C_{c}x_{c,t} + D_{c}b_{c,t} + v_{c,t}$$

[0085] In the formula, $x_{c,t}$ denotes the local state vector of the $c$-th cluster at time $t$, which typically contains internal states such as node voltage, phase angle, power injection, and energy storage state of charge; $A_{c}$ is the local state transition matrix; $B_{c}$ is the control input matrix; $u_{c,t}$ is the control input vector; $E_{c}$ is the boundary coupling matrix, used to characterize the influence of the boundary state on the local dynamics; and $w_{c,t}$ is the process disturbance term. In the observation equation, $y_{c,t}$ denotes the local observation vector; $C_{c}$ is the mapping matrix from the state to the observation; $D_{c}$ is the mapping matrix from the boundary to the observation; and $v_{c,t}$ is the measurement noise.

[0086] To ensure physical interpretability, the local observation vector $y_{c,t}$ can be further written as:

[0087] $$y_{c,t} = \left[V_{c,t}^{m},\ I_{c,t}^{m},\ P_{c,t}^{m},\ Q_{c,t}^{m},\ \mathrm{SOC}_{c,t},\ s_{c,t},\ T_{c,t}^{d}\right]^{\top}$$

[0088] In the formula, $V_{c,t}^{m}$, $I_{c,t}^{m}$, $P_{c,t}^{m}$, and $Q_{c,t}^{m}$ represent the voltage, current, active power, and reactive power that can be directly measured in this cluster, respectively; $\mathrm{SOC}_{c,t}$ denotes the state of charge of the energy storage unit at time $t$; $s_{c,t}$ denotes the switch position or network topology status; and $T_{c,t}^{d}$ denotes the real-time latency characteristics measured locally on the control execution link.

[0089] Furthermore, in S4, without directly obtaining the real-time boundary states of neighboring clusters, a state observer is first constructed using the local dynamic model to provide prior estimates of the local and boundary states. Here, an extended observer form with a gain injection term is adopted:

[0090] $$\hat{x}_{c,t+1} = A_{c}\hat{x}_{c,t} + B_{c}u_{c,t} + E_{c}\hat{b}_{c,t} + L_{c}\left(y_{c,t} - \hat{y}_{c,t}\right)$$

[0091] $$\hat{y}_{c,t} = C_{c}\hat{x}_{c,t} + D_{c}\hat{b}_{c,t}$$

[0092] In the formula, $\hat{x}_{c,t}$ denotes the local state estimate; $\hat{y}_{c,t}$ denotes the observed output calculated from the estimated state; $A_{c}$, $B_{c}$, $E_{c}$, $C_{c}$, and $D_{c}$ are the matrices of the local dynamic model; $L_{c}$ is the observer gain matrix, whose function is to inject the observation error $y_{c,t} - \hat{y}_{c,t}$ into the state update process to continuously correct the local state estimate; and $\hat{b}_{c,t}$ enters the local dynamic update as a boundary proxy, providing a prior trajectory for subsequent data-driven reconstruction.
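The observer update described above can be sketched in a few lines of Python. This is a minimal illustration assuming a standard Luenberger-type update in which the gain multiplies the difference between the measured and predicted outputs; the helper names `matvec`, `add`, and `observer_step`, and the plain list-of-lists matrix representation, are hypothetical choices made to keep the example dependency-free.

```python
def matvec(matrix, vector):
    """Multiply a list-of-rows matrix by a vector."""
    return [sum(m * x for m, x in zip(row, vector)) for row in matrix]


def add(*vectors):
    """Element-wise sum of equally sized vectors."""
    return [sum(components) for components in zip(*vectors)]


def observer_step(A, B, E, C, D, L, x_hat, u, b_hat, y):
    """One observer update using the virtual boundary state b_hat as a proxy.

    Returns the next local state estimate and the predicted output.
    """
    # Output predicted from the current state estimate and the boundary proxy.
    y_hat = add(matvec(C, x_hat), matvec(D, b_hat))
    # Observation error injected through the gain matrix L.
    innovation = [measured - predicted for measured, predicted in zip(y, y_hat)]
    x_next = add(matvec(A, x_hat), matvec(B, u), matvec(E, b_hat),
                 matvec(L, innovation))
    return x_next, y_hat
```

For a scalar system with transition coefficient 0.9, output coefficient 1.0, and gain 0.5, the estimation error contracts by the factor 0.9 - 0.5 × 1.0 = 0.4 per step when the boundary proxy is exact, which is the stability condition on the observer system matrix referred to in the text.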

[0093] Furthermore, based on the observer's priors, a boundary reconstruction model implemented by a deep spatiotemporal network is constructed. Let the observation history window over the most recent $H$ time steps be $Y_{c,t}^{H} = \{y_{c,t-H+1}, \dots, y_{c,t}\}$. The virtual boundary state can then be given by the following mapping:

[0094] $$\hat{b}_{c,t} = f_{\theta}\left(Y_{c,t}^{H},\ \hat{x}_{c,t},\ u_{c,t-1},\ z_{c}\right)$$

[0095] In the formula, $f_{\theta}$ is the deep reconstruction model with parameters $\theta$; $Y_{c,t}^{H}$ is the historical observation window of length $H$; $\hat{x}_{c,t}$ is the local state estimate output by the observer; $u_{c,t-1}$ is the control input at the previous control time step; and $z_{c}$ is the topological embedding feature of the $c$-th cluster. The model can employ a combination of graph neural networks and gated recurrent units, or graph attention networks and temporal convolutional networks, to simultaneously extract network topological relationships and temporal evolution features.

[0096] In S5, to avoid blindly entering the communication blanking mode when the model confidence is insufficient, it is necessary to jointly evaluate the uncertainty of the boundary estimation and its consistency with the local physical model. First, an uncertainty index based on the variance of the ensemble model output is defined:

[0097] $$\bar{b}_{c,t} = \frac{1}{K}\sum_{k=1}^{K}\hat{b}_{c,t}^{(k)}, \qquad \sigma_{c,t} = \frac{1}{K}\sum_{k=1}^{K}\left\|\hat{b}_{c,t}^{(k)} - \bar{b}_{c,t}\right\|_{2}^{2}$$

[0098] In the formula, $K$ denotes the number of parallel reconstruction sub-models; $\hat{b}_{c,t}^{(k)}$ denotes the boundary estimate output by the $k$-th sub-model; $\bar{b}_{c,t}$ denotes the mean of the outputs of the $K$ sub-models; and $\sigma_{c,t}$ denotes the uncertainty index defined by the dispersion of the ensemble model output. The larger the value, the higher the dispersion of the model output and the greater the uncertainty.
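A minimal Python sketch of the ensemble-based uncertainty index follows. It assumes the index is the mean squared distance of the sub-model outputs from their mean (a population variance summed over vector components); the exact normalization used in the patent is not recoverable from the text, and the function name `ensemble_uncertainty` is hypothetical.

```python
def ensemble_uncertainty(estimates):
    """Return (mean estimate, dispersion) for a list of sub-model outputs.

    estimates -- list of K boundary estimates, each a list of equal length
    """
    k = len(estimates)
    # Component-wise mean of the K sub-model outputs.
    mean = [sum(column) / k for column in zip(*estimates)]
    # Average squared Euclidean distance of each output from the mean.
    dispersion = sum(
        sum((value - m) ** 2 for value, m in zip(estimate, mean))
        for estimate in estimates
    ) / k
    return mean, dispersion
```

When all sub-models agree, the dispersion is zero; the further their outputs spread apart, the larger the index becomes.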

[0099] Further define boundary consistency deviation:

[0100] $$d_{c,t} = \left\| r\left(\hat{x}_{c,t},\ \hat{b}_{c,t},\ y_{c,t}\right) \right\|_{2}$$

[0101] In the formula, $r(\cdot)$ denotes the residual function composed of power flow balance, nodal power conservation, and boundary coupling constraints; $d_{c,t}$ denotes the consistency deviation between the virtual boundary state and the local physical model. A smaller $d_{c,t}$ indicates that $\hat{x}_{c,t}$, $\hat{b}_{c,t}$, and $y_{c,t}$ match well under the physical constraints.

[0102] Based on uncertainty and consistency, the reconstruction confidence level is defined as follows:

[0103]

[0104] In the formula, $\alpha$, $\beta$, and $\gamma$ are non-negative scaling factors that control how strongly the reconstruction error $e_{c,t}$, the consistency deviation $d_{c,t}$, and the uncertainty $\sigma_{c,t}$, respectively, affect the confidence; $\rho_{c,t}$ denotes the confidence of the boundary estimate of the $c$-th cluster at time $t$, which lies between 0 and 1. The closer the value is to 1, the more reliable the virtual boundary state is.
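The text does not preserve the confidence expression itself, only its properties: three non-negative scaling factors, a value that approaches 1 for a reliable estimate, and a decrease as error, deviation, or uncertainty grows. One functional form with these properties is a negative exponential of the weighted sum, sketched below purely as an assumption; the name `reconstruction_confidence` and the default factor values are hypothetical.

```python
import math


def reconstruction_confidence(error, deviation, uncertainty,
                              alpha=1.0, beta=1.0, gamma=1.0):
    """Map three non-negative indicators to a confidence in (0, 1].

    ASSUMED form: exp(-(alpha*error + beta*deviation + gamma*uncertainty)).
    The patent's actual expression may differ.
    """
    penalty = alpha * error + beta * deviation + gamma * uncertainty
    return math.exp(-penalty)
```

With this choice the confidence equals 1 only when all three indicators are zero, and raising any scaling factor makes the confidence more sensitive to the corresponding indicator.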

[0105] Furthermore, in step S6, within each control cycle, a decision is made on whether to enter the communication blanking mode based on the reconstruction error, model consistency, and estimation confidence. The communication blanking criterion is defined as:

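[0106] $$\kappa_{c,t} = \begin{cases} 1, & e_{c,t} \le \varepsilon_{e} \ \text{and}\ d_{c,t} \le \varepsilon_{d} \ \text{and}\ \rho_{c,t} \ge \rho_{\min} \\ 0, & \text{otherwise} \end{cases}$$

[0107] In the formula, $e_{c,t}$, $d_{c,t}$, and $\rho_{c,t}$ are the boundary reconstruction error, the boundary consistency deviation, and the reconstruction confidence; $\varepsilon_{e}$ denotes the boundary reconstruction error threshold; $\varepsilon_{d}$ denotes the consistency deviation threshold; $\rho_{\min}$ denotes the minimum confidence threshold; and $\kappa_{c,t}$ denotes the communication blanking criterion. When $\kappa_{c,t} = 1$, the controller directly adopts $\hat{b}_{c,t}$ to participate in the local control solution and no longer requests real-time boundary data from adjacent clusters; when $\kappa_{c,t} = 0$, the system triggers supplementary communication to obtain the true boundary information and correct the model.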

[0108] The trigger function for supplementary communication can also be written as:

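[0109] $$\tau_{c,t} = \begin{cases} 1, & e_{c,t} > \varepsilon_{e} \ \text{or}\ d_{c,t} > \varepsilon_{d} \ \text{or}\ \rho_{c,t} < \rho_{\min} \ \text{or}\ \left|\Delta P_{c,t}\right| > \varepsilon_{P} \ \text{or}\ \xi_{c,t} = 1 \\ 0, & \text{otherwise} \end{cases}$$

[0110] In the formula, $\tau_{c,t}$ denotes the supplementary communication trigger flag; $e_{c,t}$, $d_{c,t}$, and $\rho_{c,t}$ are the boundary reconstruction error, the boundary consistency deviation, and the reconstruction confidence, with thresholds $\varepsilon_{e}$, $\varepsilon_{d}$, and $\rho_{\min}$; $\Delta P_{c,t}$ denotes the abrupt change in renewable generation output or load injection; $\varepsilon_{P}$ denotes the power mutation threshold; and $\xi_{c,t} = 1$ denotes a topology switching event. A low-frequency boundary synchronization is triggered whenever any of the following occurs: the reconstruction error exceeds its threshold, physical consistency deteriorates, confidence decreases, power changes abruptly, or the topology changes.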
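The blanking decision and the supplementary communication trigger can be sketched as two small predicates. This is an illustrative Python sketch: the `Thresholds` container, the function names, and the choice of non-strict inequalities at the threshold values are assumptions, since the text does not state how equality is handled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    """Preset limits; field names are hypothetical."""
    err_max: float    # boundary reconstruction error threshold
    dev_max: float    # consistency deviation threshold
    conf_min: float   # minimum confidence threshold
    jump_max: float   # power mutation threshold


def blanking_allowed(error, deviation, confidence, th):
    """True when all three conditions of the joint criterion hold."""
    return (error <= th.err_max
            and deviation <= th.dev_max
            and confidence >= th.conf_min)


def needs_sync(error, deviation, confidence, power_jump, topology_changed, th):
    """True when any trigger condition for supplementary communication holds."""
    return (not blanking_allowed(error, deviation, confidence, th)
            or abs(power_jump) > th.jump_max
            or topology_changed)


th = Thresholds(err_max=0.05, dev_max=0.1, conf_min=0.8, jump_max=0.2)
use_virtual_boundary = not needs_sync(0.01, 0.02, 0.9, 0.05, False, th)
```

Requiring all three indicators to pass before blanking, while letting any single event force a synchronization, reflects the conservative intent stated in the text: a single good error reading is not enough to suppress communication.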

[0111] Furthermore, in step S7, after obtaining the virtual boundary state, a local optimization objective for a single control cluster is constructed. Considering that the main requirements for voltage control are small voltage deviation, smooth control actions, and consistent boundary coupling, the optimization objective function within a rolling time domain of length $N$ is defined as:

[0112] $$J_{c} = \sum_{k=1}^{N}\left(\lambda_{1}\left\|V_{c,t+k} - V^{\mathrm{ref}}\right\|_{Q}^{2} + \lambda_{2}\left\|\Delta u_{c,t+k}\right\|_{R}^{2} + \lambda_{3}\left\|r^{b}_{c,t+k}\right\|_{S}^{2}\right)$$

[0113] In the formula, $J_{c}$ denotes the local control objective of the $c$-th cluster in the rolling time domain; $k$ is the time index within the rolling optimization; $V_{c,t+k}$ denotes the node voltage vector at time $t+k$; $V^{\mathrm{ref}}$ denotes the reference voltage vector; $\Delta u_{c,t+k}$ denotes the control increment between adjacent control cycles; $r^{b}_{c,t+k}$ denotes the boundary coupling residual; $Q$, $R$, and $S$ are the weighting matrices for voltage deviation, control increment, and boundary residual, respectively; and $\lambda_{1}$, $\lambda_{2}$, and $\lambda_{3}$ are the weights of the three types of objective terms, respectively.

[0114] Local control also needs to satisfy the following constraints:

[0115]

[0116]

[0117]

[0118] In the formula, the voltage magnitude of each node at each time must lie between the lower and upper limits of the allowable voltage; the control input must lie between its lower and upper bounds; the single-step change of the control action must not exceed its upper limit, which restricts the ramp rate of reactive power compensation, energy storage regulation, or tap changer actions; and the last constraint is the local system dynamic constraint after the virtual boundary state is introduced.
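The input and ramp constraints can be enforced per component by clipping, and the voltage constraint can be measured as a violation amount. The sketch below assumes element-wise bounds and a previous input that is itself within bounds; the function names are hypothetical.

```python
def project_control(u, u_prev, u_min, u_max, du_max):
    """Clip each control component to the intersection of its box bounds
    and its ramp window around the previous input.
    Assumption: u_prev already lies within [u_min, u_max]."""
    projected = []
    for ui, prev, lo, hi, step in zip(u, u_prev, u_min, u_max, du_max):
        lo_eff = max(lo, prev - step)   # tighter of box and ramp lower limit
        hi_eff = min(hi, prev + step)   # tighter of box and ramp upper limit
        projected.append(min(max(ui, lo_eff), hi_eff))
    return projected


def voltage_violation(v, v_min, v_max):
    """Total amount by which node voltage magnitudes leave [v_min, v_max]."""
    return sum(max(0.0, v_min - vi) + max(0.0, vi - v_max) for vi in v)
```

Because both constraints are intervals in each component, the clipping is an exact projection onto their intersection. The voltage violation can serve as a penalty term when the voltages depend on the input only through the system model.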

[0119] Specifically, in step S7, the joint solution framework of deep reinforcement learning and particle swarm optimization includes an offline training phase and an online solution phase. The specific steps are as follows:

[0120] S71. Model the cluster voltage control problem as a Markov decision process. Define the system state at time t as the local state of the cluster, the observation vector, and the estimated value of the virtual boundary state. The action space is the adjustment amount of the controllable equipment. The reward function is the negative mapping of the local optimization objective function. Maximizing the cumulative reward is equivalent to minimizing the local optimization objective function.

[0121] S72. The policy network and value network are trained using a deep reinforcement learning algorithm based on the actor-critic architecture. An experience replay pool is constructed to store the state, action, reward, and next state data of the control process. The parameters of the value network and policy network are updated based on mini-batch sampled data to complete offline training.

[0122] S73. Input the current system state into the trained policy network and output the action prior value; construct the initial position of the particle swarm in its neighborhood with the action prior value as the center, and perform a fine search in the local feasible region through the particle swarm algorithm to obtain the optimal control solution that satisfies the constraints.

[0123] S74. During the low-frequency online update phase, new operating data is sampled from the experience replay pool to incrementally update the parameters of the policy network and the value network, thereby improving the accuracy of the action prior values.

[0124] Specifically, the cluster control problem is modeled as a Markov decision process, and the state at time t is defined as:

[0125]

[0126] In the formula, the left-hand side is the system state observed by the reinforcement learning agent at time t. It is composed of the local system state, the local observation, the virtual boundary state, the boundary estimation confidence, the event trigger flag, and the control quantity at the previous time. The action is defined as the control input vector to be applied at the current control time.

[0127] To align the reinforcement learning objective with the optimization objective in step S7, the immediate reward is defined as:

[0128]

[0129] In the formula, the left-hand side is the immediate reward at time t; the constraint penalty term takes a positive value when a voltage limit is exceeded, an action limit is exceeded, or the dynamic constraint is mismatched; and the penalty factor adjusts the strength of the penalty for constraint violations. With this definition, maximizing the cumulative reward is equivalent to minimizing the local control objective in step S7.
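The relation between reward and objective can be sketched as follows, assuming the immediate reward is the negative of the per-step objective term plus a weighted penalty. The function names and the per-step decomposition of the objective are assumptions.

```python
def immediate_reward(stage_cost: float, constraint_penalty: float,
                     penalty_factor: float) -> float:
    """Negative of (per-step objective term + weighted constraint penalty).
    Assumption: the rolling objective decomposes into per-step terms."""
    if constraint_penalty < 0 or penalty_factor < 0:
        raise ValueError("penalty and penalty factor must be non-negative")
    return -(stage_cost + penalty_factor * constraint_penalty)


def discounted_return(rewards, gamma: float) -> float:
    """Cumulative discounted reward over one rolling horizon."""
    return sum((gamma ** k) * r for k, r in enumerate(rewards))
```

With a discount factor of 1 and no violations, the return equals the negative of the summed per-step cost, which is the sense in which maximizing the return minimizes the objective.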

[0130] Furthermore, the offline training phase of deep reinforcement learning proceeds as follows:

[0131] The control policy is trained using an actor-critic architecture, with a policy network that outputs the action distribution and a value network that outputs the action value. Taking the soft actor-critic approach as an example, its Bellman target and the corresponding loss functions can be written as:

[0132]

[0133]

[0134]

[0135] In the formula, the left-hand side is the Bellman target; the remaining quantities are the discount factor, the expectation operator, the target value network, the policy entropy temperature coefficient, and the action distribution given by the policy network in the next state. Equation (24) defines the value network loss function in terms of the value network parameters, and equation (25) defines the policy network loss function in terms of the policy network parameters.
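A single-sample version of the soft Bellman target and a squared-error value loss can be sketched as below. This follows the commonly used soft actor-critic form and is not taken from the patent's equations; the function names, the single-sample estimate of the expectation, and the plain mean-squared loss are assumptions.

```python
def soft_bellman_target(reward: float, next_q: float, next_log_prob: float,
                        gamma: float, alpha: float) -> float:
    """y = r + gamma * (Q_target(s', a') - alpha * log pi(a' | s')),
    with a' sampled from the policy in the next state."""
    soft_value = next_q - alpha * next_log_prob
    return reward + gamma * soft_value


def critic_loss(q_values, targets) -> float:
    """Mean squared error between value estimates and Bellman targets
    over a mini-batch (scaling convention is an assumption)."""
    errors = [(q - y) ** 2 for q, y in zip(q_values, targets)]
    return sum(errors) / len(errors)
```

The entropy term raises the target for actions with low log-probability, which keeps the learned policy exploratory during offline training.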

[0136] Furthermore, the online fine-grained search phase of particle swarm optimization proceeds as follows:

[0137] When running online, the policy network is first used to output the initial action prior. The initial positions of the particle swarm are then constructed using this action and its neighborhood samples:

[0138]

[0139] In the formula, the left-hand side is the initial position of each particle, obtained by adding to the action prior a small perturbation vector that satisfies the action constraints, and the index runs up to the number of particles. Generating the initial swarm around the action prior lets the particle search start directly from the high-quality action neighborhood provided by deep reinforcement learning.
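The warm-started initialization can be sketched as follows. Gaussian perturbations, clipping to box bounds, and keeping the prior itself as the first particle are assumptions; the source only requires small perturbations that satisfy the action constraints. The function name and parameters are hypothetical.

```python
import random


def init_swarm(action_prior, u_min, u_max, n_particles, sigma, rng):
    """Place particles around the policy's action prior.
    Particle 0 is the (clipped) prior; the rest are perturbed copies."""
    if n_particles < 1:
        raise ValueError("n_particles must be at least 1")

    def clip(x):
        return [min(max(xi, lo), hi) for xi, lo, hi in zip(x, u_min, u_max)]

    swarm = [clip(action_prior)]
    while len(swarm) < n_particles:
        perturbed = [a + rng.gauss(0.0, sigma) for a in action_prior]
        swarm.append(clip(perturbed))
    return swarm
```

For example, `init_swarm([0.2, -0.1], [-1, -1], [1, 1], 5, 0.05, random.Random(0))` returns five positions near the prior. A smaller `sigma` places more trust in the policy network, while a larger one widens the search.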

[0140]

[0141]

[0142] In the formula, the two updated quantities are the velocity and position of each particle at the next iteration; the coefficients are the inertia weight and the two learning factors; the two random numbers are uniformly distributed; and the two attractors are the historical best position of the individual particle and the global best position of the swarm. In equation (28), the projection operator maps the updated action back onto the feasible region of the control input, that is, the set that satisfies the constraints.

[0143] For each particle position, the fitness is calculated using the objective function from step S7:
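One iteration of the velocity and position update can be sketched as below. It uses the standard particle swarm update with random numbers drawn per dimension from [0, 1), and it replaces the projection onto the feasible region with simple clipping to box bounds. Both choices, and all names, are assumptions.

```python
import random


def pso_step(positions, velocities, personal_best, global_best,
             u_min, u_max, w, c1, c2, rng):
    """Update every particle once and clip positions to [u_min, u_max]."""
    new_positions, new_velocities = [], []
    for x, v, p in zip(positions, velocities, personal_best):
        v_new, x_new = [], []
        for d in range(len(x)):
            r1, r2 = rng.random(), rng.random()
            vd = (w * v[d]
                  + c1 * r1 * (p[d] - x[d])              # pull to own best
                  + c2 * r2 * (global_best[d] - x[d]))   # pull to swarm best
            xd = min(max(x[d] + vd, u_min[d]), u_max[d])  # box projection
            v_new.append(vd)
            x_new.append(xd)
        new_velocities.append(v_new)
        new_positions.append(x_new)
    return new_positions, new_velocities
```

If ramp-rate or dynamic constraints are also part of the feasible region, the clipping line would need to be replaced by a projection that accounts for them.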

[0144]

[0145] In the formula, the left-hand side is the fitness of each particle at the current iteration; the first term is the rolling time-domain objective value obtained when the particle position is taken as a candidate control input sequence; the second term is the constraint penalty; and its coefficient adjusts the influence of the penalty on the fitness. The particle swarm algorithm continuously updates the individual best positions and the global best position, and the final global best position is output as the online refined control solution.
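The fitness evaluation and the bookkeeping of best positions can be sketched as follows, assuming lower fitness is better. `fitness_fn` is a hypothetical callable that would evaluate the rolling objective plus the weighted penalty for a candidate input; the function names are likewise assumptions.

```python
def penalized_fitness(objective_value: float, penalty: float,
                      penalty_weight: float) -> float:
    """Fitness = rolling objective value + weighted constraint penalty."""
    return objective_value + penalty_weight * penalty


def update_bests(positions, fitness_fn, personal_best, personal_best_fit):
    """Refresh each particle's best position in place and return the
    swarm-wide best position together with its fitness."""
    for i, x in enumerate(positions):
        f = fitness_fn(x)
        if f < personal_best_fit[i]:
            personal_best[i] = list(x)
            personal_best_fit[i] = f
    g = min(range(len(personal_best)), key=personal_best_fit.__getitem__)
    return personal_best[g], personal_best_fit[g]
```

After the last iteration, the returned global best position is the candidate that would be passed on as the refined control solution.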

[0146] In step S8, after obtaining the optimal control solution, the control quantities such as reactive power regulation, energy storage power regulation and tap changer action are sent to the field equipment for execution, and the system re-enters the "observation-reconstruction-decision-optimization" closed loop in the next control cycle.

[0147] In the communication blanking mode, define the boundary estimation error and the local state estimation error. When the system matrix of the state observer is stable, the error of the deep reconstruction model is bounded, and event-triggered correction satisfies the maximum allowable step-out interval constraint, there exist positive constants such that:

[0148]

[0149] This indicates that the boundary estimation error and the local state estimation error both remain bounded, where the two constants are the upper bound of the boundary state estimation error and the upper bound of the local state estimation error, respectively. This result further shows that, provided the reconstruction accuracy and confidence thresholds are met, communication blanking does not compromise the closed-loop stability of the system.
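The reasoning behind such a bound can be sketched with a standard argument, under assumptions that go beyond the text. Suppose the estimation error obeys $e_{k+1} = (A - LC)\,e_k + d_k$, where $A - LC$ is the observer system matrix and $d_k$ lumps the bounded reconstruction error accumulated between corrections. All symbols here are illustrative.

```latex
\text{If } \lVert (A - LC)^{k} \rVert \le c\,\rho^{k} \ \text{with } c \ge 1,\ 0 < \rho < 1,
\ \text{and } \lVert d_k \rVert \le \bar d \ \text{for all } k, \ \text{then}
\qquad
\lVert e_k \rVert
\;\le\; c\,\rho^{k}\,\lVert e_0 \rVert + c\,\bar d \sum_{j=0}^{k-1} \rho^{j}
\;\le\; c\,\lVert e_0 \rVert + \frac{c\,\bar d}{1 - \rho}.
```

Under these assumptions the error stays within a fixed bound that grows with the disturbance bound $\bar d$, which is why a limit on the interval between synchronizations is needed.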

[0150] This invention transforms the "real-time communication acquisition" of cross-cluster boundary states into "local observation + deep model reconstruction", thereby reducing the rigid dependence on cross-domain communication from the problem definition level.

[0151] This invention uses a three-element joint criterion of "error-consistency-uncertainty" to determine whether to enter the communication blanking mode, thus avoiding the risk of misjudgment caused by judging based on a single error threshold.

[0152] This invention achieves rapid response and high-quality solutions for local control optimization by combining "deep reinforcement learning to quickly provide initial values for actions + particle swarm optimization for fine-grained search".

[0153] It should be noted that, in this document, the terms "comprising," "including," or any other variations thereof are intended to cover non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or apparatus. Unless otherwise specified, an element defined by the phrase "comprising one..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that includes said element.

[0154] The various embodiments in this specification are described in a related manner. Similar or identical parts between embodiments can be referred to mutually. Each embodiment focuses on describing the differences from other embodiments. In particular, the system embodiments are basically similar to the method embodiments, so the description is relatively simple; relevant parts can be referred to the descriptions of the method embodiments.

[0155] The above embodiments are only used to illustrate the technical solutions of the present invention, and are not intended to limit it. Although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that modifications can still be made to the technical solutions described in the foregoing embodiments, or equivalent substitutions can be made to some of the technical features. Such modifications or substitutions do not cause the essence of the corresponding technical solutions to deviate from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims

1. A virtual boundary node-driven boundary interaction decoupling method, applied to a distributed control scenario of a distribution network cluster containing distributed power sources, characterized in that it includes the following steps:
S1. Based on the electrical coupling strength, geographical proximity, communication coverage and controller computing power of the distribution network nodes, the nodes of the entire distribution network are divided into M control clusters, and the physical boundary nodes between adjacent clusters are identified.
S2. Construct a corresponding virtual boundary node for each physical boundary node. The virtual boundary node is a digital substitute in the control model, used to replace the direct call to the real-time boundary data of adjacent clusters during the online control process.
S3. Establish a local dynamic discrete state space model for each control cluster, construct a state observer based on the local dynamic discrete state space model, and output the prior estimates of the local state and boundary state of the corresponding cluster through the state observer.
S4. Construct an observation history window of length H. Based on the observation data in the observation history window, the local state prior estimate output by the state observer, the control input at the previous control time, and the topological embedding features of the corresponding cluster, obtain the virtual boundary state estimate by mapping through a pre-trained deep reconstruction model.
S5. Calculate the boundary reconstruction error, model uncertainty index, and boundary consistency deviation of the virtual boundary state, and calculate the boundary reconstruction confidence of the corresponding cluster at the current time based on the boundary reconstruction error, model uncertainty index, and boundary consistency deviation.
S6. Construct a communication blanking criterion based on the preset boundary reconstruction error threshold, consistency deviation threshold, and minimum confidence threshold; when the communication blanking criterion is met, enter the communication blanking mode, directly use the virtual boundary state estimate to participate in the local control solution, and do not initiate real-time boundary data interaction with adjacent clusters; when the communication blanking criterion is not met, trigger supplementary communication, obtain the real boundary data of adjacent clusters, and correct the deep reconstruction model.
S7. Based on the virtual boundary state estimate, a local optimization objective function for distribution network voltage control in the rolling time domain is constructed. The local optimization objective function is solved using a solution framework combining deep reinforcement learning and particle swarm optimization to obtain the optimal control solution.
S8. The optimal control solution is sent to the controllable equipment in the distribution network for execution. Steps S3 to S8 are executed again in the next control cycle to complete the control closed-loop iteration.

2. The boundary interaction decoupling method driven by virtual boundary nodes according to claim 1, characterized in that: the state observer constructed in step S3 is a Luenberger state observer, whose state update equation is constructed based on the local dynamic discrete state space model of the cluster. In the communication blanking mode, when the system matrix of the state observer is stable, the reconstruction error of the deep reconstruction model is bounded, and the event-triggered supplementary communication satisfies the maximum allowable step-out interval constraint, both the boundary state estimation error and the local state estimation error remain bounded; that is, the boundary estimation error does not exceed the upper bound of the boundary state estimation error, and the local state estimation error does not exceed the upper bound of the local state estimation error.

3. The boundary interaction decoupling method driven by virtual boundary nodes according to claim 1, characterized in that: the deep reconstruction model in step S4 adopts a combined architecture of a graph neural network and a gated recurrent unit, or a combined architecture of a graph attention network and a temporal convolutional network; the deep reconstruction model simultaneously extracts the topological correlation features of the distribution network and the temporal evolution features of the boundary states. In the mapping relationship, the output is the estimated virtual boundary state of the c-th cluster at time t, produced by the deep reconstruction model with its trainable parameters; the inputs are the historical observation window of length H, the local state estimate output by the observer, the control input at the previous control time, and the topological embedding features of the c-th cluster.

4. The boundary interaction decoupling method driven by virtual boundary nodes according to claim 3, characterized in that: in step S5, the model uncertainty index is calculated based on the output variance of an ensemble of deep reconstruction models, and the boundary consistency deviation is calculated based on the residual function of the local physical constraints. In the expression for the model uncertainty index, the quantities are the number of parallel reconstruction sub-models, the boundary estimate output by each sub-model, and the mean of the sub-model outputs; the resulting uncertainty index is defined by the dispersion of the ensemble outputs. In the expression for the boundary consistency deviation, the residual function is composed of the power flow balance, nodal power conservation, and boundary coupling constraints, and the result is the consistency deviation between the virtual boundary state and the local physical model.

5. The boundary interaction decoupling method driven by virtual boundary nodes according to claim 1, characterized in that: in step S5, in the expression for the reconstruction confidence, three non-negative scaling factors control the strength of the influence of the reconstruction error, the consistency deviation, and the uncertainty on the confidence, respectively; the result is the boundary estimation confidence of the c-th cluster at time t, which satisfies the stated range constraint.

6. The boundary interaction decoupling method driven by virtual boundary nodes according to claim 5, characterized in that: in step S6, in the expression for the communication blanking criterion, the three thresholds are the boundary reconstruction error threshold, the consistency deviation threshold, and the minimum confidence threshold; when the criterion is satisfied, the controller directly uses the virtual boundary state estimate in the local control solution and no longer requests real-time boundary data from adjacent clusters; when the criterion is not satisfied, the system triggers supplementary communication to obtain the true boundary information and correct the model. In the trigger function expression for the supplementary communication, the left-hand side is the supplementary communication trigger flag; the power term is the abrupt change in renewable generation output or injected load, which is compared against the power mutation threshold; and the last term indicates a topology switching event.

7. The boundary interaction decoupling method driven by virtual boundary nodes according to claim 1, characterized in that: in step S6, after supplementary communication is triggered to obtain real boundary data, the parameters of the deep reconstruction model are fine-tuned by supervised learning, using the real boundary data as labels and the current observation history window, local state prior estimate, and topological embedding features as inputs. At the same time, the prior estimate of the state observer is corrected based on the real boundary data to ensure the accuracy of the prior estimate in the next control cycle.

8. The boundary interaction decoupling method driven by virtual boundary nodes according to claim 1, characterized in that: in step S7, the local optimization objective function is defined over the rolling time domain, with the objectives of minimizing node voltage deviation, optimizing control smoothness, and optimizing boundary coupling consistency. In the expression for the objective, the left-hand side is the local control objective of the c-th cluster over the rolling time domain, summed over the time index of the rolling optimization; the terms involve the node voltage vector at each time, the reference voltage vector, the control increment between adjacent control cycles, and the boundary coupling residual; three weighting matrices apply to the voltage deviation, the control increment, and the boundary residual, respectively; and three coefficients are the weights of the corresponding objective terms. Local control satisfies the following constraints: the voltage magnitude of each node in the set of nodes within the cluster lies, at each time, between the lower and upper limits of the allowable voltage; the control input lies between its lower and upper bounds; the single-step change of the control action does not exceed its upper limit, which restricts the ramp rate of reactive power compensation, energy storage regulation, or tap changer actions; and the local system dynamic constraint after the virtual boundary state is introduced holds, in which the local state transition matrix, the control input matrix acting on the control input vector, and the boundary coupling matrix define the dynamics.

9. The boundary interaction decoupling method driven by virtual boundary nodes according to claim 8, characterized in that: In step S7, the joint solution framework of deep reinforcement learning and particle swarm optimization includes an offline training phase and an online solution phase, and the specific steps are as follows: S71. Model the cluster voltage control problem as a Markov decision process. Define the system state at time t as the local state of the cluster, the observation vector, and the estimated value of the virtual boundary state. The action space is the adjustment amount of the controllable equipment. The reward function is the negative mapping of the local optimization objective function. Maximizing the cumulative reward is equivalent to minimizing the local optimization objective function. S72. The policy network and value network are trained using a deep reinforcement learning algorithm based on the actor-critic architecture. An experience replay pool is constructed to store the state, action, reward, and next state data of the control process. The parameters of the value network and policy network are updated based on mini-batch sampled data to complete offline training. S73. Input the current system state into the trained policy network and output the action prior value; construct the initial position of the particle swarm in its neighborhood with the action prior value as the center, and perform a fine search in the local feasible region through the particle swarm algorithm to obtain the optimal control solution that satisfies the constraints. S74. During the low-frequency online update phase, new running data is sampled from the experience replay pool to incrementally update the parameters of the policy network and the value network, thereby optimizing the output accuracy of the action prior values.

10. The virtual boundary node-driven boundary interaction decoupling method according to claim 9, characterized in that: in step S73, in the construction expression for the initial position of the particle swarm, the initial position of each particle is obtained by adding to the action prior value output by the policy network a small perturbation vector that satisfies the action constraints, with the particle index running up to the number of particles.