A power distribution network voltage prediction method based on a space-time graph neural network
By constructing a spatiotemporal graph neural network model and combining the power grid topology and electrical coupling relationship, the spatiotemporal correlation problem in distribution network voltage prediction is solved, achieving high-precision and robust voltage prediction and risk identification, which is suitable for complex distribution network environments with a high proportion of distributed energy access.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- STATE GRID JIANGSU ELECTRIC POWER CO LTD NANTONG POWER SUPPLY BRANCH
- Filing Date
- 2026-05-26
- Publication Date
- 2026-06-23
AI Technical Summary
Existing voltage prediction methods for distribution networks are unable to effectively handle spatiotemporal correlations, cannot accurately obtain the complex evolution of voltage state over time, and lack high-precision, robust, and interpretable advanced sensing decision support in distribution networks with a high proportion of renewable energy.
A spatiotemporal graph neural network-based approach is adopted. By constructing a weighted directed graph model and combining it with the relationships between power grid nodes and electrical connections, the improved graph neural network and time series model are used to capture the topology and electrical coupling relationships of the power grid. Deep learning is then performed to predict voltage, and physical consistency regularization terms are incorporated for training to output future voltage predictions and risk identification.
It significantly improves the accuracy and robustness of power grid dynamic forecasting, provides high-precision voltage forecasting and risk warning, adapts to topology changes, has strong generalization ability and engineering credibility, and supports power grid optimization decision-making in environments with a high proportion of distributed energy resources.
Figure CN122267751A_ABST
Description
Technical Field
[0001] This invention relates to the field of intelligent analysis and artificial intelligence technology for power systems, specifically to a method for predicting distribution network voltage based on spatiotemporal graph neural networks.
Background Technology
[0002] With the deepening of energy transition, the penetration rate of distributed energy sources, represented by photovoltaics and wind power, in distribution networks continues to rise. The intermittent, random, and volatile output of these distributed power sources leads to a significant increase in voltage fluctuations in the distribution network, especially at the end nodes of the lines, and a sharp increase in the risk of voltage exceeding limits, posing a severe challenge to the safe, high-quality, and economical operation of the power grid.
[0003] Currently, distribution network voltage prediction methods fall into three main categories:
1. Physical model-based methods: These rely on accurate physical models of the power grid (such as power flow equations) and predict voltage through state estimation and power flow calculation. However, they are computationally complex, require highly accurate model parameters, and are difficult to adapt to the strong uncertainty and real-time requirements brought about by a high proportion of distributed energy access.
2. Traditional data-driven methods: These include time series analysis (such as the Autoregressive Integrated Moving Average (ARIMA) model and Kalman filtering) and classical machine learning methods (such as Support Vector Regression (SVR) and gradient boosting decision trees (XGBoost)). These methods typically treat each node as an independent time series and fail to model the complex spatial correlations and electrical coupling relationships between power grid nodes, resulting in limited prediction accuracy and weak generalization ability.
3. Deep learning-based single-point / sequence prediction methods: Examples are Long Short-Term Memory (LSTM) networks and temporal convolutional networks. These models can effectively capture the temporal dependencies of single-point voltages, but their network structures (such as fully connected layers and one-dimensional convolution) essentially treat the power grid as a regular Euclidean space or as a set of independent node sequences, without explicitly using the inherent graph topology of the power grid. Their ability to model spatial correlations, dynamic topology changes, and deep electrical coupling patterns is therefore insufficient, which limits prediction accuracy and robustness under complex power grid dynamics.
[0004] To address the above issues, Chinese Patent Publication No. CN117709526A discloses a method for predicting distribution network voltage based on a Long Short-Term Memory (LSTM) network. This method includes: acquiring a distribution network voltage dataset; dividing the dataset into training, validation, and test sets; training the LSTM using the training set and validating the model results using the validation set; optimizing on the validation set using a Bayesian optimization algorithm and saving the trained LSTM model; inputting the test set to test the LSTM; optimizing on the test set using a Bayesian optimization algorithm; outputting a distribution network voltage prediction model; and predicting the distribution network voltage to obtain a voltage prediction map. This method can accurately predict distribution network voltage. Furthermore, LSTM can better capture long-term dependencies in time series data, providing a more comprehensive understanding of the relationships between series, thereby improving model performance.
[0005] As another example, Chinese patent CN112564098B discloses a voltage prediction method for distribution networks with a high proportion of photovoltaics, based on a temporal convolutional neural network, including: Step 1, data preprocessing of the original load data: based on multiple time scales, the voltage time series data is normalized using min-max scaling to obtain a complete voltage series; Step 2, constructing an input feature vector set: feature selection is performed with the decision-tree-based extreme gradient boosting algorithm, a training sample set is constructed, the weight of each feature is output, and different feature subsets are selected by combining the weight magnitudes with the voltage prediction model; Step 3, establishing a voltage prediction framework for the high-proportion photovoltaic distribution network, training a temporal convolutional network prediction model, and obtaining the voltage prediction result. That invention improves the accuracy of distribution network voltage prediction by combining the extracted features with time and feeding them into different channels of the temporal convolutional neural network model.
[0006] Existing voltage prediction technologies for distribution networks still have shortcomings: while both patents improve the reliability of grid operation to some extent, they do not directly address the spatiotemporally correlated voltage prediction problem. They cannot accurately capture the complex evolution of the voltage state over time, and therefore cannot provide decision support for high-precision, robust, and interpretable advanced sensing of the voltage state in distribution networks with a high proportion of renewable energy. Existing technologies still require improvement.
Summary of the Invention
[0007] To address the shortcomings of existing technologies, this invention provides a distribution network voltage prediction method based on spatiotemporal graph neural networks to solve the aforementioned problems.
[0008] To achieve the above objectives, the present invention provides the following technical solution: a distribution network voltage prediction method based on a spatiotemporal graph neural network, mainly comprising the following steps: S1. Distribution network graph structure modeling: The target distribution network is abstracted into a weighted directed graph G=(V, E, A, X).
[0009] Where V is the set of nodes, corresponding to the bus or key monitoring point in the distribution network; E is the set of edges, corresponding to the electrical connection relationships such as feeders and transformer branches in the distribution network; A is the adjacency matrix of the graph, and its element a_ij is used to characterize the electrical connection strength between node i and node j.
[0010] Furthermore, a_ij can take the per-unit value of the corresponding branch admittance, the reciprocal of the impedance, or a similarity function calculated based on electrical distance (such as impedance magnitude).
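As an illustration of the options for a_ij listed above, the following minimal sketch builds a weighted directed adjacency matrix from per-unit branch data. The function name `build_adjacency`, the branch tuple layout, and the exponential similarity kernel `exp(-|z|)` are assumptions made for this example and are not taken from the patent text; the admittance magnitude is taken as the reciprocal of the impedance magnitude.

```python
import math


def build_adjacency(n_nodes, branches, mode="admittance"):
    """Build a weighted directed adjacency matrix A (list of lists).

    branches: iterable of (i, j, r_pu, x_pu, closed), where i -> j is the
    edge direction and `closed` is the switch state (hypothetical layout).
    """
    A = [[0.0] * n_nodes for _ in range(n_nodes)]
    for i, j, r, x, closed in branches:
        if not closed:
            continue  # disconnected line: a_ij stays 0
        z = math.hypot(r, x)  # impedance magnitude |z_ij|
        if mode == "admittance":
            weight = 1.0 / z  # admittance magnitude = reciprocal of |z_ij|
        elif mode == "similarity":
            weight = math.exp(-z)  # assumed electrical-distance similarity
        else:
            raise ValueError(f"unknown mode: {mode}")
        A[i][j] = weight
    return A


branches = [(0, 1, 0.03, 0.04, True), (1, 2, 0.06, 0.08, False)]
A = build_adjacency(3, branches)
```

Rebuilding A whenever a switch state changes is one simple way to obtain the time-varying matrix A_t described in step S2.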
[0011] X is a node feature matrix, where each row corresponds to a feature vector of a node at a specific time. The feature vector should contain at least the following four types of information: voltage electrical quantity, node injected power, node attributes, and environmental and temporal context.
[0012] S2. Historical Operation Data Acquisition and Preprocessing: Collect the node feature matrix sequence [X_{t-T+1},..., X_t] for T consecutive historical moments of the distribution network from the distribution network monitoring system, advanced measurement system, simulation platform, or historical database. Simultaneously, obtain the adjacency matrix A_t corresponding to the network operation state at each historical moment. If the network topology is fixed, A_t is a constant matrix; if the topology is variable (e.g., due to network reconfiguration), A_t is a dynamic matrix that changes over time.
[0013] S3. Voltage Prediction Based on Spatiotemporal Graph Neural Network: The sequence of historical node feature matrices and the corresponding sequence of adjacency matrices obtained in step S2 are input into a pre-trained spatiotemporal graph neural network model. Having learned from historical spatiotemporal graph data, the model outputs the sequence of predicted voltage amplitudes for all nodes over the next H time steps, [Ŷ_{t+1}, ..., Ŷ_{t+H}]. The spatiotemporal graph neural network model includes the following four cascaded modules:
S3-1. Spatial Feature Encoding Module: An improved graph neural network layer is used to aggregate information about the target node and its multi-order neighbor nodes at each individual time step, thereby extracting spatial correlation features containing the power grid topology and electrical coupling relationships. The improvement lies in the use of a hybrid layer that integrates a multi-relationship graph convolutional network and an electrically guided graph attention network to enhance the accuracy and physical rationality of spatial information extraction.
[0014] S3-2. Temporal Feature Encoding Module: Employs sequence models such as gated recurrent units and temporal convolutional networks to capture the dynamic evolution of each node's features over time and extract time-dependent features.
[0015] S3-3. Spatiotemporal Fusion Module: Employing a spatiotemporal cross-attention mechanism, it dynamically and adaptively fuses "spatial features" from the spatial feature encoding module and "temporal features" from the temporal feature encoding module to generate a spatiotemporal joint embedding representation for each node that can simultaneously characterize the spatiotemporal evolution law.
[0016] S3-4. Decoding and Prediction Module: Based on the aforementioned spatiotemporal joint embedding representation, a fully connected neural network is used for decoding and mapping, ultimately outputting the predicted node voltage values for multiple future steps. Furthermore, this module can also be designed to output an uncertainty quantification index for the predicted values.
[0017] S4. Voltage Over-Limit Risk Identification and Early Warning: Based on the node voltage prediction values for the next H time points output by the model, compare them with the preset upper and lower voltage safety operation thresholds. Automatically identify nodes with voltage over-limit (too high or too low) risks, the predicted over-limit time, the over-limit magnitude, and other information, and generate a structured voltage risk early warning report for operators to make decisions or trigger automatic control strategies.
[0018] Furthermore, in step S1, the node feature vector specifically includes, but is not limited to: voltage amplitude, voltage phase angle, injected active power, injected reactive power, load type code, rated capacity of distributed power source, ambient temperature, and light intensity.
[0019] Furthermore, the spatiotemporal graph neural network model is trained by minimizing the composite loss function L_total:
L_total = α_1 L_mse + α_2 L_phy + α_3 L_smooth + α_4 L_nll
where L_mse is the mean square error loss between the predicted voltage and the actual voltage; L_phy is the physical consistency regularization loss based on power flow equation constraints; L_smooth is the regularization loss that ensures the temporal smoothness of the prediction results; L_nll is the negative log-likelihood loss used to quantify the prediction uncertainty; and α_1, α_2, α_3, α_4 are weighting coefficients.
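The weighted sum above can be sketched for a single node as follows. The text does not give the exact forms of L_smooth and L_nll, so this example assumes a mean squared first difference for smoothness and a Gaussian negative log-likelihood with a predicted variance; the function name `composite_loss`, the default weights, and the precomputed list `phy_residuals` are likewise hypothetical.

```python
import math


def composite_loss(y_true, y_pred, var_pred, phy_residuals,
                   alphas=(1.0, 0.1, 0.01, 0.1)):
    """Return (L_total, parts) for one node's H-step prediction (H >= 2).

    alphas are placeholder weights (alpha_1..alpha_4), not patent values.
    """
    n = len(y_true)
    # L_mse: mean squared error between prediction and truth
    l_mse = sum((p - t) ** 2 for p, t in zip(y_pred, y_true)) / n
    # L_phy: mean squared power-flow residual (residuals computed elsewhere)
    l_phy = sum(r * r for r in phy_residuals) / len(phy_residuals)
    # L_smooth (assumed form): mean squared first difference over time
    diffs = [y_pred[k + 1] - y_pred[k] for k in range(n - 1)]
    l_smooth = sum(d * d for d in diffs) / len(diffs)
    # L_nll (assumed form): Gaussian negative log-likelihood
    l_nll = sum(
        0.5 * math.log(2.0 * math.pi * v) + (t - p) ** 2 / (2.0 * v)
        for t, p, v in zip(y_true, y_pred, var_pred)
    ) / n
    a1, a2, a3, a4 = alphas
    total = a1 * l_mse + a2 * l_phy + a3 * l_smooth + a4 * l_nll
    parts = {"mse": l_mse, "phy": l_phy, "smooth": l_smooth, "nll": l_nll}
    return total, parts


total, parts = composite_loss(
    y_true=[1.00, 1.00], y_pred=[1.00, 1.02],
    var_pred=[1e-4, 1e-4], phy_residuals=[0.0, 0.001],
)
```

In an actual training loop each term would be computed on tensors in a differentiable framework so that gradients flow through all four terms.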
[0020] Compared to existing technologies, the advantages of this invention are as follows:
The distribution network voltage prediction method based on a Spatio-Temporal Graph Neural Network (ST-GNN) introduces a new ST-GNN architecture that models both the topological spatial correlation and the temporal evolution of the distribution network's operating state. It integrates node relationships from multiple dimensions, including physical connections, electrical coupling, and data statistics, which strengthens its ability to characterize complex power grid dynamics and improves prediction accuracy.
Using a dynamic graph structure as the model's basic input and inductive bias, the method naturally adapts to different power grid topologies. The model can adjust the input adjacency matrix according to topological changes such as switching operations and network reconfiguration, avoiding the failure of traditional fixed models when the topology changes. In practical scenarios with missing measurement data and noise interference, the model, with its graph structure and spatiotemporal co-modeling, exhibits good generalization ability and robustness.
A physical consistency regularization term is introduced into the loss function, embedding the fundamental power flow equations of the power system into the data-driven model as prior physical knowledge. The learning process therefore not only fits the historical data closely but also requires the output to conform to fundamental physical laws, which avoids the "ill-conditioned" or physically unreliable predictions that may arise from pure black-box models and enhances the rationality and engineering credibility of the prediction results.
This invention not only provides high-precision voltage point prediction but also quantifies prediction uncertainty, offering risk probability information to operators and supporting risk-based early warning and decision-making. It can identify voltage exceedance risks minutes to hours in advance, providing a forward-looking, quantifiable, and reliable decision-making basis for advanced applications such as active voltage control, reactive power optimization, and active / reactive power dispatching of distributed energy resources. It is suitable for complex distribution network environments with a high proportion of distributed energy resources, and addresses the problems of fragmented modeling of spatiotemporal coupling characteristics, insufficient use of the grid graph structure and electrical coupling, and weak modeling of dynamic evolution and spatiotemporal interaction.
Description of the Drawings
[0021] Figure 1 is an overall flowchart of an embodiment of the distribution network voltage prediction method based on a spatiotemporal graph neural network according to the present invention; Figure 2 is a schematic diagram of the core structure of the ST-GNN model in that embodiment.
[0022] Figure 3 is a simulation comparison of MAE at different prediction step sizes.
[0023] Figure 4 is a simulation comparison of RMSE at different prediction step sizes.
[0024] Figure 5 is a simulation diagram of the robustness test with 30% of measurements missing.
Detailed Implementation
[0025] The technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings of the embodiments of the present invention. Obviously, the described embodiments are only some embodiments of the present invention, and not all embodiments.
[0026] This invention provides a technical solution: a distribution network voltage prediction method based on a spatiotemporal graph neural network. This embodiment uses the IEEE 33-bus distribution test system as an example to illustrate the implementation steps of this invention, as shown in Figure 1.
[0027] S1: Graph structure modeling.
[0028] In the IEEE 33-bus system, the 33 buses are considered as graph nodes, and the 32 lines are considered as directed edges, constructing a directed graph of the distribution network. The element a_ij of the adjacency matrix A takes the per-unit admittance value of branch ij. If a line is disconnected, the corresponding element is 0.
[0029] The feature vector of each node is designed as a 12-dimensional vector x_i = [V_i, θ_i, P_i, Q_i, L_type_i, DG_cap_i, T_i, R_i, t_hour, t_week, sin(t_hour), cos(t_hour)].
[0030] where: V_i, θ_i: Node voltage magnitude (pu) and phase angle (rad).
[0031] P_i, Q_i: Net active / reactive power (pu) injected into the node, negative for load and positive for generation.
[0032] L_type_i: Load type (e.g., residential, commercial, industrial), using 3-dimensional one-hot encoding.
[0033] DG_cap_i: Rated capacity of distributed photovoltaic generation (pu), or 0 if none is present.
[0034] T_i, R_i: Ambient temperature (°C) and light intensity (W / m²) of the region where the node is located.
[0035] t_hour, t_week: The hour of the day (0-23) and the day of the week (0-6) corresponding to the timestamp.
[0036] sin(t_hour), cos(t_hour): Periodic encoding of the hour to better capture daily cyclical patterns.
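The feature layout described above can be sketched as follows. The text writes sin(t_hour) and cos(t_hour) without a scaling factor; this example assumes the usual 24-hour period, i.e. an angle of 2π·t_hour/24. The function name `node_features` and the ordering of the load-type categories are hypothetical. Note that when the 3-dimensional one-hot load type is expanded, the 12 listed quantities occupy 14 numeric entries in this sketch.

```python
import math

LOAD_TYPES = ("residential", "commercial", "industrial")  # assumed order


def node_features(v, theta, p, q, load_type, dg_cap,
                  temp_c, irradiance, hour, weekday):
    """Assemble one node's feature vector for a single timestamp."""
    one_hot = [1.0 if load_type == name else 0.0 for name in LOAD_TYPES]
    angle = 2.0 * math.pi * hour / 24.0  # assumed 24-hour period
    return [
        v, theta, p, q,
        *one_hot,
        dg_cap, temp_c, irradiance,
        float(hour), float(weekday),
        math.sin(angle), math.cos(angle),
    ]


x = node_features(v=1.0, theta=0.0, p=-0.2, q=-0.1, load_type="commercial",
                  dg_cap=0.0, temp_c=25.0, irradiance=800.0, hour=6, weekday=2)
```

The sine/cosine pair places 23:00 and 00:00 next to each other on the unit circle, which the raw hour value alone does not.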
[0037] S2: Data generation and preprocessing.
[0038] A detailed simulation model was built in MATLAB / Simulink to simulate system operation under different seasons and date types throughout the year. A large-scale, diverse historical operation dataset was generated by setting varying load curves (daily load curves superimposed with random fluctuations), photovoltaic output curves (based on actual meteorological data), and random network topology changes (simulating switching operations). After data standardization, sample pairs were constructed: the input was the graph sequence {G_{t-59}, ..., G_t} for the past 60 minutes (T=60), and the output was the true voltage sequence {Y_{t+1}, ..., Y_{t+60}} labeled for the next 60 minutes (H=60). Ultimately, over 300,000 valid samples were obtained, which were divided into training, validation, and test sets in an 8:1:1 ratio.
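The sample construction described above (T=60 history steps, H=60 future steps, 8:1:1 split) can be sketched with a sliding window. The helper names `make_samples` and `split_8_1_1` are hypothetical, and the split shown here is chronological, which is an assumption: the text gives only the ratio.

```python
def make_samples(series, T=60, H=60):
    """Slide over a time-ordered list and pair T past items with H future items.

    Each element of `series` stands for one time step (e.g. a graph snapshot).
    """
    samples = []
    for t in range(T - 1, len(series) - H):
        history = series[t - T + 1: t + 1]   # steps t-T+1 .. t
        future = series[t + 1: t + 1 + H]    # steps t+1 .. t+H
        samples.append((history, future))
    return samples


def split_8_1_1(samples):
    """Split into training, validation, and test sets in an 8:1:1 ratio."""
    n = len(samples)
    n_train = n * 8 // 10
    n_val = n // 10
    return (samples[:n_train],
            samples[n_train:n_train + n_val],
            samples[n_train + n_val:])


series = list(range(200))  # stand-in for 200 one-minute snapshots
samples = make_samples(series)
train, val, test = split_8_1_1(samples)
```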
[0039] S3: ST-GNN Model Construction and Training.
[0040] The ST-GNN model shown in Figure 2 is built with the following specific parameters:
S3-1. Spatial Feature Encoding Module: Employs a two-layer electrically guided multi-relationship graph attention network. The first-order neighborhood N_1(i) is defined as the set of nodes directly connected to node i by physical branches; the second-order neighborhood N_2(i) is defined as the set of nodes that are directly connected to a first-order neighbor of i and are not node i itself. In the dynamic graph, N_1(i) and N_2(i) are updated in real time according to the topology at the current time. The first layer aggregates first-order neighbors, and the second layer aggregates second-order neighbors. Each layer contains 8 independent attention heads. The quantities entering the attention calculation formula are the physical adjacency matrix (1 for connected node pairs, 0 otherwise) and the magnitude of the branch impedance, with normalization to [0, 1]. The module finally outputs 128-dimensional node features.
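The neighborhood definitions in S3-1 can be sketched directly from the adjacency matrix. The function name `neighbor_sets` is hypothetical, and the example assumes that "directly connected by a physical branch" ignores edge direction; recomputing the sets after a switching operation mirrors the real-time update described for the dynamic graph.

```python
def neighbor_sets(A, i):
    """Return (N1, N2) for node i from adjacency matrix A (list of lists).

    N1: nodes sharing a physical branch with i.
    N2: nodes sharing a branch with some member of N1, excluding i itself.
    """
    n = len(A)

    def linked(u, v):
        # Assumption: a branch connects u and v regardless of edge direction.
        return A[u][v] != 0 or A[v][u] != 0

    n1 = {j for j in range(n) if j != i and linked(i, j)}
    n2 = {k for j in n1 for k in range(n)
          if k != i and k != j and linked(j, k)}
    return n1, n2


# Four-node feeder 0 -> 1 -> 2 -> 3
A = [[0, 1, 0, 0],
     [0, 0, 1, 0],
     [0, 0, 0, 1],
     [0, 0, 0, 0]]
n1, n2 = neighbor_sets(A, 1)

# Open the branch 2 -> 3 and recompute for the new topology
A[2][3] = 0
n1_open, n2_open = neighbor_sets(A, 1)
```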
[0041] S3-2. Temporal Feature Encoding Module: Employs a two-layer bidirectional gated recurrent unit, with each layer having a hidden state dimension of 64. This module processes the spatially encoded feature sequence of each node along the temporal dimension, outputting node temporal features with a dimension of 128.
[0042] S3-3. Spatiotemporal Fusion Module: Employs a 4-head cross-attention layer. Temporal features are used as queries, and spatial features as keys and values. Spatiotemporal association weights are dynamically calculated and fused to generate a 256-dimensional spatiotemporal joint embedding.
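A heavily simplified sketch of the fusion step is given below: temporal features act as queries and spatial features as keys and values, as stated above. Everything else is an assumption made for illustration: a single head instead of four, no learned projection matrices, attention taken across nodes, and the 256-dimensional output obtained by concatenating each 128-dimensional query with its attended context. The function names are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention_fuse(temporal, spatial):
    """Fuse per-node temporal (queries) and spatial (keys/values) features.

    temporal, spatial: lists of equal-length vectors, one per node.
    Returns one vector per node of length 2 * d (query + attended context).
    """
    d = len(temporal[0])
    fused = []
    for q in temporal:
        # Scaled dot-product score between this query and every key
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d)
                  for k in spatial]
        weights = softmax(scores)
        # Weighted sum of the value vectors
        context = [sum(w * v[c] for w, v in zip(weights, spatial))
                   for c in range(d)]
        fused.append(list(q) + context)
    return fused


temporal = [[1.0, 0.0], [0.0, 1.0]]
spatial = [[1.0, 0.0], [0.0, 1.0]]
fused = cross_attention_fuse(temporal, spatial)
```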
[0043] S3-4. Decoding and Prediction Module: Consists of two fully connected layers, with a ReLU activation function in between. The final output layer is a linear layer, outputting the voltage prediction values for all 33 nodes over the next 60 time steps. Simultaneously, a parallel variance prediction head is designed to output the uncertainty (variance) of each prediction value.
[0044] Model training: The Adam optimizer is used with an initial learning rate of 1e-3 and a learning rate decay strategy. The batch size is 64. The composite loss function L_total is used. Within it, the physical residual loss L_phy is based on the simplified DistFlow equations; the residual is computed over all nodes, and the whole computation is differentiable, so it supports backpropagation. An early stopping strategy is applied on the validation set, and training lasts for approximately 300 epochs.
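The exact residual used in the embodiment is not reproduced in this text. As a hedged illustration, the widely used linearized (lossless) DistFlow voltage equation for a branch i → j is V_j² = V_i² − 2(r_ij P_ij + x_ij Q_ij), and a physics residual can be defined as the amount by which predicted voltages violate it. The function names and the branch tuple layout below are hypothetical, and the branch flows P_ij, Q_ij are assumed to be available from measurements or a separate power balance step. In a radial feeder each non-root node has exactly one upstream branch, so a per-branch residual covers all such nodes.

```python
import math


def lindistflow_residuals(v_mag, branches):
    """Voltage-drop residual per branch under the linearized DistFlow model.

    v_mag: list of voltage magnitudes (pu), indexed by node.
    branches: iterable of (i, j, r_pu, x_pu, p_ij_pu, q_ij_pu).
    """
    return [
        v_mag[j] ** 2 - v_mag[i] ** 2 + 2.0 * (r * p + x * q)
        for i, j, r, x, p, q in branches
    ]


def physics_loss(v_mag, branches):
    """Mean squared residual, usable as a physics penalty term."""
    res = lindistflow_residuals(v_mag, branches)
    return sum(e * e for e in res) / len(res)


branches = [(0, 1, 0.01, 0.02, 0.5, 0.2)]
consistent = [1.0, math.sqrt(1.0 - 2.0 * (0.01 * 0.5 + 0.02 * 0.2))]
flat = [1.0, 1.0]  # ignores the voltage drop, so it is penalized

loss_consistent = physics_loss(consistent, branches)
loss_flat = physics_loss(flat, branches)
```

Because the residual is built from additions and multiplications only, the same expression remains differentiable when written with tensors in a deep learning framework.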
[0045] S4: Online forecasting and risk warning.
[0046] In practical applications, power grid measurement data (SCADA / AMI) for the past 60 minutes (T=60) is acquired every minute through a data acquisition interface, and a real-time adjacency matrix A_t is generated based on the current switching state. This data is then constructed into a graph sequence and input into a pre-trained ST-GNN model. The model can complete inference within seconds on a GPU, outputting predicted voltage values for all nodes for the next 5 to 60 minutes (H=60).
[0047] The safe voltage range is set to [0.95 pu, 1.05 pu]. The prediction results are scanned automatically: if the predicted voltage of a node falls outside the safe range at any future time, the node is immediately marked as a "risk node".
[0048] An early warning report is generated, which includes: the risk node number, the predicted start / end time of the limit violation, the maximum exceedance magnitude, and the prediction confidence level (based on uncertainty quantification).
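The scan and report described above can be sketched as follows, using the [0.95 pu, 1.05 pu] range from the text. The function name `find_violations`, the report field names, and the dictionary input format are hypothetical; the confidence level derived from uncertainty quantification is left out of this sketch.

```python
def find_violations(pred, v_min=0.95, v_max=1.05, step_minutes=1):
    """Scan predicted voltages and build a simple structured warning report.

    pred: dict mapping node id -> list of predicted voltages (pu) for
    steps t+1 .. t+H.
    """
    report = []
    for node, series in pred.items():
        bad = [(k, v) for k, v in enumerate(series, start=1)
               if v < v_min or v > v_max]
        if not bad:
            continue  # node stays inside the safe range

        def exceedance(item):
            return max(item[1] - v_max, v_min - item[1])

        worst = max(bad, key=exceedance)
        report.append({
            "node": node,
            # first and last violating steps (the span may contain gaps)
            "start_min": bad[0][0] * step_minutes,
            "end_min": bad[-1][0] * step_minutes,
            "max_exceedance_pu": round(exceedance(worst), 6),
            "kind": "over" if worst[1] > v_max else "under",
        })
    return report


pred = {
    5: [1.00, 1.04, 1.06, 1.07, 1.03],
    18: [0.99, 0.99, 0.99, 0.99, 0.99],
}
report = find_violations(pred)
```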
[0049] On the visualization interface, the risk node is highlighted and flashing on the geographic map, and the portion of its voltage prediction curve that exceeds the limit is marked in red, while a warning information box pops up.
[0050] Performance verification and analysis: To verify the effectiveness of the method of the present invention, a comprehensive comparative experiment was conducted on the IEEE 33-node and IEEE 123-node distribution network topologies. The simulation generated time series data of load, photovoltaic output, and switch states with a sampling interval of 5 minutes, divided into training, validation, and test sets in a ratio of 8:1:1. Test scenarios: distributed power penetration rates of 10%, 20%, and 30%; prediction step sizes of 5 min, 15 min, 30 min, and 60 min. The comparison methods and their parameters are:
(1) Linear extrapolation: two-point linear fitting, no hyperparameters;
(2) Kalman filtering: the state vector is the voltage amplitude, observation noise R=0.01, process noise Q=0.001;
(3) Support vector machine: RBF kernel, penalty coefficient C=1.0, gamma=0.1, 5-fold cross-validation;
(4) XGBoost: 100 decision trees, maximum depth 6, learning rate 0.1, subsample=0.8;
(5) LSTM: 2-layer bidirectional LSTM, hidden dimension 64, learning rate 1e-3, batch size=64, dropout=0.2;
(6) TCN: 2-layer temporal convolution, kernel size 3, hidden dimension 128, learning rate 1e-3;
(7) Standard GCN: 2-layer graph convolution, output dimension 128, learning rate 1e-3, no physical constraints.
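[0051] The evaluation metrics are the mean absolute error (MAE), the root mean square error (RMSE), and the mean absolute percentage error (MAPE):
$$\mathrm{MAE} = \frac{1}{N}\sum_{i=1}^{N}\left|\hat{y}_i - y_i\right|$$
$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(\hat{y}_i - y_i\right)^2}$$
$$\mathrm{MAPE} = \frac{100\%}{N}\sum_{i=1}^{N}\left|\frac{\hat{y}_i - y_i}{y_i}\right|$$
where \hat{y}_i and y_i are the predicted and actual voltage values, respectively, and N is the total number of test samples.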
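The three metrics can be computed with a few lines of standard-library Python. This is a minimal sketch using the standard definitions of MAE, RMSE, and MAPE; the function names are hypothetical.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(p - t) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(
        sum((p - t) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    )


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent (y_true must be nonzero)."""
    return 100.0 * sum(
        abs((p - t) / t) for t, p in zip(y_true, y_pred)
    ) / len(y_true)


y_true = [1.00, 1.00]
y_pred = [1.01, 0.99]
scores = (mae(y_true, y_pred), rmse(y_true, y_pred), mape(y_true, y_pred))
```

Per-unit voltages stay close to 1.0, so MAPE in percent is numerically close to 100 times the MAE in pu.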
[0052] A robustness testing scheme is implemented to verify the model's robustness to missing measurements. It consists of the following standardized steps: randomly mask 30% of the node measurement data (voltage, power) to simulate a data loss scenario; keep the model structure and parameters unchanged and rerun inference on the incomplete data; independently repeat the experiment 10 times to reduce the effect of random error; and calculate the performance degradation rate: Performance degradation rate (%) = [(complete data index − missing data index) / complete data index] × 100%. The method of this invention exhibits a performance degradation of less than 13% under 30% data loss, demonstrating strong robustness and fault tolerance.
[0053] To ensure the reliability of the results, a paired t-test was used for statistical verification: the ST-GNN of this invention was used as the experimental group, and LSTM, TCN, and standard GCN were used as the control groups; the paired t-test was performed on the RMSE sequences of the same test set and the same step length.
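The paired t-test described above compares two RMSE sequences measured on the same test cases. A minimal sketch of the test statistic is shown below, using the standard formula t = mean(d) / (s_d / sqrt(n)) with n − 1 degrees of freedom, where d are the paired differences. The function name `paired_t` and the sample values are hypothetical and are not results from the patent. Converting t to a p-value requires the Student t distribution, which the Python standard library does not provide.

```python
import math
import statistics


def paired_t(a, b):
    """Return (t statistic, degrees of freedom) for paired samples a and b."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    mean_d = statistics.mean(diffs)
    sd_d = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    return mean_d / (sd_d / math.sqrt(n)), n - 1


# Illustrative RMSE values for a baseline and for the proposed model
rmse_baseline = [0.010, 0.012, 0.011, 0.013]
rmse_proposed = [0.006, 0.007, 0.008, 0.007]
t_stat, dof = paired_t(rmse_baseline, rmse_proposed)
```

A large positive t here means the baseline's RMSE is consistently higher than the proposed model's on the same cases.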
[0054] Test results show that, under different distributed power source penetration rates (10%, 20%, 30%) and different prediction step sizes (5 min, 15 min, 30 min, 60 min), the ST-GNN model proposed in this invention achieves the best performance on all evaluation metrics. As shown in Figures 3 and 4, with a 30% photovoltaic penetration rate, the MAE of the voltage prediction for the next 15 minutes is as low as 0.0038 pu, clearly better than the best comparative method, LSTM (MAE = 0.0065 pu). As the prediction horizon increases, the accuracy of the proposed method declines more slowly than that of all comparative methods, demonstrating strong long-horizon prediction capability. In robustness tests simulating the random loss of 30% of measurement data, the performance degradation of the proposed method is less than 13%, as shown in Figure 5, which demonstrates good fault tolerance for incomplete data. Paired t-tests show that this method significantly outperforms traditional time series models and standard graph neural networks in terms of MAE, RMSE, and MAPE, and that the experimental results are stable and reliable.
[0055] The above description is only a preferred embodiment of the present invention, but the scope of protection of the present invention is not limited thereto. Any equivalent substitutions or modifications made by those skilled in the art within the scope of the technology disclosed in the present invention, based on the technical solution and inventive concept of the present invention, should be covered within the scope of protection of the present invention.
Claims
1. A distribution network voltage prediction method based on a spatiotemporal graph neural network, characterized in that it includes the following steps:
S1. Distribution network graph structure modeling: The target distribution network is abstracted as a weighted directed graph G=(V, E, A, X), where V is the set of nodes, E is the set of edges, A is the adjacency matrix representing the electrical connection strength between nodes, and X is the node feature matrix;
S2. Historical operation data acquisition and preprocessing: Acquire the node feature matrix sequence [X_{t-T+1}, ..., X_t] and the corresponding adjacency matrix A_t for T consecutive historical moments of the distribution network;
S3. Voltage prediction based on spatiotemporal graph neural network: Input the node feature matrix sequence and adjacency matrices of the historical time points from S2 into the pre-trained spatiotemporal graph neural network model, and output the node voltage prediction values [Ŷ_{t+1}, ..., Ŷ_{t+H}] for the next H time points; the spatiotemporal graph neural network model includes:
S3-1. Spatial Feature Encoding Module: An improved graph neural network layer is used to aggregate information of the target node and its multi-order neighbor nodes at each individual time step, and to extract spatial correlation features containing the power grid topology and electrical coupling relationship. The improved graph neural network layer adopts a hybrid layer that integrates a multi-relation graph convolutional network and an electrically guided graph attention network.
S3-2. Temporal Feature Encoding Module: Employing a gated recurrent unit and a temporal convolutional network sequence model, this module captures the dynamic evolution of each node's features over time along the temporal dimension, extracting time-dependent features.
S3-3. Spatiotemporal Fusion Module: Employing a spatiotemporal cross-attention mechanism, this module dynamically and adaptively fuses "spatial features" from the spatial feature encoding module and "temporal features" from the temporal feature encoding module to generate a spatiotemporal joint embedding representation for each node that can simultaneously characterize the spatiotemporal evolution pattern.
S3-4. Decoding and Prediction Module: Based on the spatiotemporal joint embedding representation, decoding and mapping are performed through a fully connected neural network to finally output the predicted node voltage values for multiple future steps. This module can also be designed to output the uncertainty quantification index of the predicted values.
S4. Voltage Over-Limit Risk Identification and Early Warning: Based on the predicted node voltage values for the next H time points, compare them with the preset upper and lower voltage safety operation thresholds to automatically identify nodes with voltage over-limit risks, the predicted over-limit time, and the over-limit magnitude information, and generate a structured voltage risk early warning report for operators to make decisions or to trigger automatic control strategies.
2. The method according to claim 1, characterized in that, in step S1, the feature vector of each node in the node feature matrix X includes at least four of the following: voltage amplitude, voltage phase angle, injected active power, injected reactive power, load type code, distributed power supply capacity, ambient temperature, light intensity, hourly time features within a day, and weekly time features within a week.
3. The method according to claim 1 or 2, characterized in that, in step S1, the value of element a_ij in the adjacency matrix A is the per-unit value of the corresponding line admittance, the reciprocal of the impedance, or a similarity function calculated based on electrical distance.
4. The method according to claim 1, characterized in that the spatial feature encoding module uses a multi-relation graph convolutional network layer, which simultaneously aggregates information based on the physical connection adjacency matrix, the electrical distance adjacency matrix, and the historical data correlation adjacency matrix.
5. The method according to claim 1, characterized in that the spatial feature encoding module uses an electrically guided graph attention network layer, which introduces a priori weighting factors based on the electrical distance or impedance magnitude between nodes when calculating the attention coefficients between nodes.
6. The method according to claim 1, characterized in that the spatiotemporal fusion module employs a spatiotemporal cross-attention mechanism to dynamically fuse features from the spatial feature encoding module and the temporal feature encoding module.
7. The method according to claim 1, characterized in that the spatiotemporal graph neural network model is trained by minimizing the composite loss function L_total:
L_total = α_1 L_mse + α_2 L_phy + α_3 L_smooth + α_4 L_nll
where L_mse is the mean square error loss between the predicted voltage and the actual voltage; L_phy is the physical consistency regularization loss based on power flow equation constraints; L_smooth is the regularization loss that ensures the temporal smoothness of the prediction results; L_nll is the negative log-likelihood loss used to quantify the prediction uncertainty; and α_1, α_2, α_3, α_4 are weighting coefficients.
8. A computer-readable storage medium having a computer program stored thereon, characterized in that, when the program is executed by a processor, it implements the distribution network voltage prediction method based on a spatiotemporal graph neural network as described in any one of claims 1-7.