A power information network sensitive data flow path completion method and system
By combining a dynamic-perception graph attention mechanism with topology enhancement technology, the method completes missing data flow paths in power information networks with higher accuracy and better awareness of the global topology structure, thereby improving the security operation and maintenance capabilities of power information networks.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- NARI INFORMATION & COMM TECH
- Filing Date
- 2026-03-13
- Publication Date
- 2026-06-26
AI Technical Summary
Existing technologies struggle to accurately fill in missing data flow paths in power information networks, leading to network model distortion and impacting state estimation and fault location. Furthermore, existing methods are deficient in terms of accuracy and global topology perception.
Employing a dynamic perception graph attention mechanism and topology enhancement technology, the system state is estimated using the weighted least squares method to construct a power information network graph. A trained topology-enhanced dynamic perception graph neural network model is then used for path completion, capturing the dynamic correlation characteristics and macroscopic topological features between network nodes.
It enables accurate completion of missing paths in the power information network, improves the accuracy and topology rationality of path completion, and provides more reliable security operation and maintenance support.
Figure CN122287698A_ABST
Description
Technical Field
[0001] This invention relates to the field of power information network technology, and in particular to a method and system for completing sensitive data flow paths in power information networks.
Background Technology
[0002] As a typical cyber-physical system, the power information network's functions, such as operational status monitoring, energy dispatch management, and fault safety analysis, heavily rely on the integrity of data flow and the accuracy of topological connections. However, in actual operation, factors such as sensor failures, communication interference, data archiving errors, or network attacks often lead to partial loss or damage of topological information recording data flow paths. Such incomplete data distorts the digital twin model of the power information network, triggering a series of chain reactions, including biased state estimation, difficulties in fault location, and even malfunctioning protection systems, seriously threatening the secure and stable operation of the power information network. Therefore, automatically and accurately completing the missing connections in the power information network and restoring the complete topological structure of the data flow paths has become a key technical challenge for improving the intelligent operation and maintenance level of the power information network.
[0003] Current network data completion methods follow two main technical approaches. The first is the traditional approach based on graph theory and statistical learning. It relies on structural heuristics such as the number of common neighbors and the Adamic-Adar index to evaluate the connection probability between nodes. Although computationally efficient, it uses only local structural statistics and cannot readily integrate node attribute information or capture complex high-order nonlinear relationships, so its completion accuracy is limited in power information networks, which have strong structural regularity. The second is the deep learning approach based on graph neural networks (GNNs). It learns low-dimensional node embeddings through a neighborhood information aggregation mechanism and then performs link prediction from node representation similarity or a decoder. Although this approach can use both structural and attribute information, it still has clear limitations in power information network completion tasks: the standard GNN aggregation mechanism is rigid; it adapts poorly to the dynamic feature distributions of nodes in subgraphs at different levels; its local operation limits its ability to represent the global topological skeleton; and the objective of node representation learning is inconsistent with that of the link prediction task.
[0004] Regarding development trends in related technologies, graph attention mechanisms introduce learnable attention weights and thereby provide a technical foundation for dynamic, soft neighborhood selection, enabling models to focus on key neighbor nodes based on contextual information. Meanwhile, persistent homology techniques from topological data analysis capture persistent topological features (such as ring structures) in network evolution through multi-scale filtration and transform them into vectorized global topology descriptors, giving models robust representations of macroscopic network connectivity patterns. These advances indicate that deeply integrating graph neural networks that have dynamic neighborhood awareness with persistent homology techniques that characterize global topological invariants, so as to construct an end-to-end link completion framework, is a feasible path to overcoming existing technological bottlenecks. However, there is currently no complete solution designed specifically for the data flow characteristics of power information networks.
Summary of the Invention
[0005] The purpose of this invention is to overcome the shortcomings of the prior art and provide a method and system for completing the sensitive data flow path in a power information network. By capturing the dynamic correlation characteristics between network nodes through a dynamically perceived graph attention mechanism and extracting the macroscopic topological features of the network using topology enhancement technology, it can accurately complete the missing data flow path.
[0006] To solve the above-mentioned technical problems, the present invention is implemented using the following technical solution:
[0007] In a first aspect, the present invention provides a method for completing the sensitive data flow path in a power information network, comprising:
[0008] Based on the obtained power information network operation data, the power system state is estimated using the weighted least squares method to obtain a standardized system state vector;
[0009] Based on the standardized system state vector and the obtained power information network topology data, a power information network diagram is constructed;
[0010] Based on the power information network diagram, a trained topology-enhanced dynamic perception graph neural network model is used to complete the path, resulting in a power information network diagram with completed paths.
[0011] Optionally, the step of estimating the power system state using the weighted least squares method based on the acquired power information network operation data to obtain a standardized system state vector includes:
[0012] The acquired power information network operation data is preprocessed, and a measurement vector $\mathbf{z}$ is constructed from the preprocessed data;
[0013] The system state vector $\mathbf{x}^{(0)}$ and the measurement error covariance matrix $\mathbf{R}$ are initialized, and the following steps are performed iteratively until the iteration converges:
[0014] The Jacobian matrix $\mathbf{H}^{(k)}$ of the $k$-th iteration is constructed from the partial derivative of the measurement function $h(\cdot)$ with respect to the system state vector $\mathbf{x}^{(k)}$ of the $k$-th iteration;
[0015] Based on the measurement error covariance matrix $\mathbf{R}$ and the Jacobian matrix $\mathbf{H}^{(k)}$ of the $k$-th iteration, the residual covariance matrix $\boldsymbol{\Omega}^{(k)}$ of the $k$-th iteration is calculated; where $\boldsymbol{\Omega}^{(k)} = \mathbf{R} - \mathbf{H}^{(k)}\left(\mathbf{H}^{(k)\top}\mathbf{R}^{-1}\mathbf{H}^{(k)}\right)^{-1}\mathbf{H}^{(k)\top}$;
[0016] Based on the measurement vector $\mathbf{z}$ and the system state vector $\mathbf{x}^{(k)}$ of the $k$-th iteration, the measurement residual vector $\mathbf{r}^{(k)}$ of the $k$-th iteration is calculated using the nonlinear measurement equation;
[0017] Based on the residual covariance matrix $\boldsymbol{\Omega}^{(k)}$ and the measurement residual vector $\mathbf{r}^{(k)}$ of the $k$-th iteration, the standardized measurement residual vector $\mathbf{r}_N^{(k)}$ is calculated;
[0018] If the standardized measurement residual vector $\mathbf{r}_N^{(k)}$ contains an element $r_{N,i}^{(k)} > \tau$, the measurement $z_i$ is rejected; where $i$ is the measurement index, $z_i$ denotes the $i$-th measurement in the measurement vector $\mathbf{z}$, $r_{N,i}^{(k)}$ denotes the standardized residual of the $i$-th measurement in $\mathbf{r}_N^{(k)}$, $\tau$ is the preset residual detection threshold, $r_{N,i}^{(k)} = |r_i^{(k)}| / \sqrt{\Omega_{ii}^{(k)}}$, $r_i^{(k)}$ denotes the $i$-th measurement residual in $\mathbf{r}^{(k)}$, and $\Omega_{ii}^{(k)}$ denotes the $i$-th residual covariance (diagonal element) in $\boldsymbol{\Omega}^{(k)}$;
[0019] Based on the measurement vector $\mathbf{z}$, the measurement error covariance matrix $\mathbf{R}$ and the Jacobian matrix $\mathbf{H}^{(k)}$ of the $k$-th iteration, the increment $\Delta\mathbf{x}^{(k)}$ of the system state vector of the $k$-th iteration is calculated using the correction equation;
[0020] Based on the system state vector $\mathbf{x}^{(k)}$ of the $k$-th iteration and the increment $\Delta\mathbf{x}^{(k)}$, the system state vector $\mathbf{x}^{(k+1)}$ of the $(k+1)$-th iteration is calculated; where $\mathbf{x}^{(k+1)} = \mathbf{x}^{(k)} + \Delta\mathbf{x}^{(k)}$;
[0021] Using the objective function, the target value $J(\mathbf{x}^{(k)})$ of the system state vector of the $k$-th iteration and the target value $J(\mathbf{x}^{(k+1)})$ of the system state vector of the $(k+1)$-th iteration are calculated;
[0022] If $\lVert \Delta\mathbf{x}^{(k)} \rVert < \varepsilon_x$ and $|J(\mathbf{x}^{(k+1)}) - J(\mathbf{x}^{(k)})| < \varepsilon_J$, the iteration is judged to have converged, and the system state vector $\mathbf{x}^{(k+1)}$ of the $(k+1)$-th iteration is standardized to obtain the standardized system state vector; where $\lVert \cdot \rVert$ denotes the vector norm, $\varepsilon_x$ is the preset state convergence threshold, and $\varepsilon_J$ is the preset convergence threshold for the objective function;
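[0023] The Jacobian matrix $\mathbf{H}^{(k)}$ of the $k$-th iteration is obtained through the following formula:
[0024] $\mathbf{H}^{(k)} = \left.\dfrac{\partial h(\mathbf{x})}{\partial \mathbf{x}}\right|_{\mathbf{x} = \mathbf{x}^{(k)}}$;
[0025] where $\partial$ denotes the partial derivative;
[0026] The expression for the nonlinear measurement equation is:
[0027] $\mathbf{r}^{(k)} = \mathbf{z} - h(\mathbf{x}^{(k)})$,
[0028] where $h(\mathbf{x}^{(k)})$ denotes the observations corresponding to the system state $\mathbf{x}^{(k)}$;
[0029] The expression for the correction equation is:
[0030] $\Delta\mathbf{x}^{(k)} = \left(\mathbf{H}^{(k)\top}\mathbf{R}^{-1}\mathbf{H}^{(k)}\right)^{-1}\mathbf{H}^{(k)\top}\mathbf{R}^{-1}\left(\mathbf{z} - h(\mathbf{x}^{(k)})\right)$;
[0031] The expression for the objective function is:
[0032] $J(\mathbf{x}) = \left(\mathbf{z} - h(\mathbf{x})\right)^{\top}\mathbf{R}^{-1}\left(\mathbf{z} - h(\mathbf{x})\right)$,
[0033] where $J(\mathbf{x})$ denotes the target value of the system state vector $\mathbf{x}$;
[0034] The standardized system state vector is obtained through the following formula:
[0035] $\tilde{\mathbf{x}} = \dfrac{\mathbf{x}^{(k+1)} - \mu}{\sigma}$,
[0036] where $\tilde{\mathbf{x}}$ denotes the standardized system state vector, $\mu$ denotes the mean of the system state vector $\mathbf{x}^{(k+1)}$ of the $(k+1)$-th iteration, and $\sigma$ denotes its standard deviation.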
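The iteration described above can be illustrated with a minimal sketch in plain Python. It assumes a diagonal measurement error covariance matrix, a hypothetical two-state measurement function `h` with an analytic `jacobian`, and population standard deviation for the final standardization; none of these choices come from the source, and bad data rejection is omitted for brevity.

```python
import math

def solve(A, b):
    """Solve the small linear system A x = b by Gaussian elimination."""
    n = len(b)
    M = [A[i][:] + [b[i]] for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))  # partial pivoting
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def h(x):
    """Hypothetical measurement function: two direct readings and one product."""
    return [x[0], x[1], x[0] * x[1]]

def jacobian(x):
    """Analytic Jacobian of the hypothetical h at x."""
    return [[1.0, 0.0], [0.0, 1.0], [x[1], x[0]]]

def objective(x, z, var):
    """Weighted sum of squared residuals (diagonal covariance assumed)."""
    return sum((zi - hi) ** 2 / v for zi, hi, v in zip(z, h(x), var))

def wls(z, var, x0, eps_x=1e-8, eps_j=1e-10, max_iter=50):
    x, n, m = x0[:], len(x0), len(z)
    for _ in range(max_iter):
        H = jacobian(x)
        r = [zi - hi for zi, hi in zip(z, h(x))]
        # Gain matrix H^T R^-1 H and right-hand side H^T R^-1 r.
        G = [[sum(H[i][a] * H[i][b] / var[i] for i in range(m))
              for b in range(n)] for a in range(n)]
        rhs = [sum(H[i][a] * r[i] / var[i] for i in range(m)) for a in range(n)]
        dx = solve(G, rhs)
        x_new = [xi + di for xi, di in zip(x, dx)]
        step = math.sqrt(sum(d * d for d in dx))
        dj = abs(objective(x_new, z, var) - objective(x, z, var))
        x = x_new
        if step < eps_x and dj < eps_j:  # both convergence tests must pass
            break
    return x

def standardize(x):
    """Z-score standardization using the population standard deviation."""
    mu = sum(x) / len(x)
    sigma = math.sqrt(sum((xi - mu) ** 2 for xi in x) / len(x))
    return [(xi - mu) / sigma for xi in x]

z = [2.0, 3.0, 6.0]  # noise-free readings consistent with the state [2, 3]
x_hat = wls(z, var=[0.01, 0.01, 0.04], x0=[1.0, 1.0])
x_std = standardize(x_hat)
```

Because the example measurements are noise-free, the estimate settles on the state that generated them; with real data the residuals would not vanish and the rejection step would also apply.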
[0037] Optionally, the expression for the power information network diagram is as follows:
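[0038] $G = (V, E, \mathbf{F})$,
[0039] where $G$ denotes the power information network graph; $V$ denotes the node set of $G$, $V = \{v_1, v_2, \dots, v_N\}$, $N$ denotes the total number of communication nodes in the power information network, and $v_i$ denotes the $i$-th node; $E$ denotes the edge set of $G$, $E \subseteq V \times V$, $i$ and $j$ are node indices, $v_i$ denotes the $i$-th node, $v_j$ denotes the $j$-th node, and $(v_i, v_j) \in E$ indicates that there is a direct communication connection between $v_i$ and $v_j$; $\mathbf{F}$ denotes the node feature matrix of $G$, $\mathbf{F} = [\mathbf{f}_1, \mathbf{f}_2, \dots, \mathbf{f}_N]^{\top}$, $\mathbf{f}_i$ denotes the feature vector of node $v_i$, $\mathbf{f}_i = [f_{i,1}, f_{i,2}, \dots, f_{i,d}]$, and $f_{i,d}$ denotes the feature of node $v_i$ in the $d$-th dimension.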
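As a rough illustration of the graph definition, the sketch below stores the node set, edge set and per-node feature vectors in a plain dictionary. The function name `build_graph`, the device names and the way state values are assigned to nodes are all hypothetical; the source does not specify how the standardized state vector is mapped onto node features.

```python
def build_graph(links, node_states):
    """Build a graph record with node set V, edge set E and features F.

    links:       iterable of (u, v) communication links (treated as undirected)
    node_states: dict mapping each node to its feature vector (assumed layout)
    """
    nodes = sorted(node_states)
    edges = {frozenset(link) for link in links}
    features = {v: list(node_states[v]) for v in nodes}
    adjacency = {v: set() for v in nodes}
    for u, v in links:
        adjacency[u].add(v)
        adjacency[v].add(u)
    return {"V": nodes, "E": edges, "F": features, "adj": adjacency}

# Hypothetical three-device network with two-dimensional node features.
links = [("router1", "switch1"), ("switch1", "switch2")]
node_states = {
    "router1": [0.3, -1.2],
    "switch1": [1.1, 0.4],
    "switch2": [-0.8, 0.9],
}
graph = build_graph(links, node_states)
```

Storing each edge as a `frozenset` makes the membership test independent of the order in which the two endpoints are given.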
[0040] Optionally, the step of performing path completion using a trained topology-enhanced dynamic perceptual graph neural network model based on the power information network diagram to obtain a path-completed power information network diagram includes:
[0041] The L-hop neighborhood subgraph of each node in the power information network graph is extracted using a subgraph sampling layer;
[0042] Based on the L-hop neighborhood subgraph of each node, the aggregation neighborhood features of each node are obtained using the neighborhood feature extraction module;
[0043] The topology filtering module is used to filter the L-hop neighborhood subgraph of each node to obtain the topological features of each node.
[0044] The topological features and aggregated neighborhood features of each node are fused using a feature fusion layer to obtain the fused features of each node.
[0045] Based on the fusion characteristics of each node, the path completion layer is used to obtain the predicted probability value of a direct communication connection between any two nodes, and the power information network graph is edge-completed according to the predicted probability value to obtain the power information network graph after path completion.
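[0046] Optionally, the expression for the L-hop neighborhood subgraph $G_i^{L}$ of any node $v_i$ is:
[0047] $G_i^{L} = (V_i^{L}, E_i^{L}, \mathbf{F}_i^{L})$,
[0048] where $i$ is the node index, $v_i$ denotes the $i$-th node, $L$ denotes the total number of neighborhood levels, $V_i^{L}$ denotes the node set of $G_i^{L}$, $E_i^{L}$ denotes the edge set of $G_i^{L}$, and $\mathbf{F}_i^{L}$ denotes the node feature matrix of $G_i^{L}$.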
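One common way to obtain such a subgraph is a breadth-first search that stops at depth L. The sketch below is an illustration under that assumption, not the patent's subgraph sampling layer; `l_hop_subgraph` and the adjacency dictionary are hypothetical names.

```python
from collections import deque

def l_hop_subgraph(adj, center, L):
    """Return (levels, edges) for the L-hop neighborhood of `center`.

    levels[l] lists the nodes at hop distance l from the center;
    edges are the edges of the original graph induced on the reached nodes.
    """
    dist = {center: 0}
    queue = deque([center])
    while queue:
        u = queue.popleft()
        if dist[u] == L:          # do not expand beyond L hops
            continue
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    levels = [sorted(n for n, d in dist.items() if d == l) for l in range(L + 1)]
    edges = sorted({tuple(sorted((u, v))) for u in dist for v in adj[u] if v in dist})
    return levels, edges

# Hypothetical five-node chain with one branch.
adj = {
    "r1": ["r2", "r3"],
    "r2": ["r1", "r4"],
    "r3": ["r1"],
    "r4": ["r2", "r5"],
    "r5": ["r4"],
}
levels, edges = l_hop_subgraph(adj, "r1", 2)
```

Level 0 holds only the center node, which matches the convention that the node at level 0 is the node whose subgraph is being extracted.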
[0049] Optionally, the step of obtaining the aggregated neighborhood features of each node using the neighborhood feature extraction module based on the L-hop neighborhood subgraph of each node includes:
[0050] For the L-hop neighborhood subgraph $G_i^{L}$ of any node $v_i$, the following information aggregation operation is performed:
[0051] For any node $u$ in the $l$-th layer of $G_i^{L}$, $K$ structurally unified attention heads are applied in turn to aggregate the neighborhood features of the nodes $v$ in the adjacent layer that share an edge with $u$, yielding $K$ neighborhood features of node $u$; where $l$ is the neighborhood level index, $u$ denotes a node index within the $l$-th layer of $G_i^{L}$, $v$ denotes the index of a node in the adjacent layer of $G_i^{L}$ that shares an edge with node $u$, and the node at level 0 is node $v_i$;
[0052] The $K$ neighborhood features of node $v_i$ are concatenated to obtain the aggregated neighborhood features of node $v_i$;
[0053] The expression for the attention head is as follows:
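[0054] $\mathbf{h}_u = \mathrm{ReLU}\left(\sum_{v \in \mathcal{N}^{(l)}(u)} \alpha_{uv}\, \mathbf{W}^{(l)} \mathbf{h}_v\right)$,
[0055] where $\mathbf{h}_u$ denotes the neighborhood features of node $u$, $\mathrm{ReLU}(\cdot)$ denotes the ReLU activation function, $\mathcal{N}^{(l)}(u)$ denotes the set of nodes in the adjacent layer of $G_i^{L}$ that share an edge with node $u$, $v \in \mathcal{N}^{(l)}(u)$, $\mathbf{h}_v$ denotes the neighborhood features of node $v$, $\alpha_{uv}$ denotes the attention score between nodes $u$ and $v$, and $\mathbf{W}^{(l)}$ denotes the learnable linear transformation matrix of the $l$-th layer of $G_i^{L}$;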
[0056] The attention score $\alpha_{uv}$ between nodes $u$ and $v$ is obtained through the following formula:
[0057] ,
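[0058] where $\Vert$ denotes vertical concatenation, $\mathbf{W}_a^{(l)}$ denotes the learnable weight matrix of the $l$-th layer, $\mathbf{a}^{(l)}$ denotes the learnable attention matrix of the $l$-th layer of $G_i^{L}$, $w$ denotes a node index in $\mathcal{N}^{(l)}(u)$ and stands for any node in $\mathcal{N}^{(l)}(u)$, $\mathbf{f}_u$ denotes the feature vector of node $u$, and $\mathbf{h}_v$ denotes the neighborhood features of node $v$.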
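To make the multi-head aggregation concrete, the sketch below uses a generic graph-attention-style score: a learnable vector applied to the concatenation of the transformed center feature and the transformed neighbor feature, normalized with a softmax over the neighbors. This scoring rule is an assumption standing in for the patent's own attention score formula, and `attention_head`, `multi_head` and all weights are hypothetical.

```python
import math

def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def attention_head(f_u, neighbor_feats, W, a):
    """One head: softmax-weighted sum of transformed neighbor features, then ReLU."""
    q = matvec(W, f_u)
    keys = [matvec(W, h_v) for h_v in neighbor_feats]
    # Raw score: a . [W f_u || W h_v]  (assumed generic form)
    scores = [sum(ai * ci for ai, ci in zip(a, q + k)) for k in keys]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]   # numerically stable softmax
    total = sum(exps)
    alphas = [e / total for e in exps]
    out = [max(0.0, sum(al * k[d] for al, k in zip(alphas, keys)))
           for d in range(len(q))]
    return out, alphas

def multi_head(f_u, neighbor_feats, heads):
    """Concatenate the outputs of all heads into one aggregated feature."""
    out = []
    for W, a in heads:
        out += attention_head(f_u, neighbor_feats, W, a)[0]
    return out

identity = [[1.0, 0.0], [0.0, 1.0]]
heads = [(identity, [0.0, 0.0, 1.0, 0.0]),   # head 1 attends to feature 0
         (identity, [0.0, 0.0, 0.0, 1.0])]   # head 2 attends to feature 1
f_u = [1.0, 0.0]
neighbors = [[1.0, 0.0], [0.0, 1.0]]
_, alphas = attention_head(f_u, neighbors, *heads[0])
aggregated = multi_head(f_u, neighbors, heads)
```

With two heads and two-dimensional features, the concatenated result has four entries; each head weights the two neighbors differently because its scoring vector emphasizes a different feature.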
[0059] Optionally, the data processing flow of the topology filtering module includes:
[0060] For the L-hop neighborhood subgraph $G_i^{L}$ of any node $v_i$, all nodes in $G_i^{L}$ are input into $C$ filter channels to obtain the filter value of every node in $G_i^{L}$ in each filter channel;
[0061] The persistence difference of each filter channel is determined from the filter values of all nodes in that channel;
[0062] Based on the persistence difference and the maximum filter value of each filter channel, a linear transformation, a triangular transformation, a Gaussian transformation and a top-hat transformation are applied to each filter channel, and the four transformation results are concatenated to obtain the embedding result of that filter channel;
[0063] The embedding results of the $C$ filter channels are concatenated along the channel dimension to obtain the topological features $\mathbf{t}_i$ of node $v_i$;
[0064] The filter value is calculated using the following formula:
[0065] ,
[0066] where $u$ denotes a node index in $G_i^{L}$ and stands for any node in $G_i^{L}$; $\phi_c(u)$ denotes the filter value of node $u$ in the $c$-th filter channel; $\mathrm{Tanh}(\cdot)$ denotes the Tanh activation function; $\mathrm{ReLU}(\cdot)$ denotes the ReLU activation function; $\mathbf{W}_1$ denotes the first weight matrix; $\mathbf{W}_2$ denotes the second weight matrix; $\mathbf{b}_1$ denotes the first bias; $\mathbf{b}_2$ denotes the second bias; and $\mathbf{f}_u$ denotes the feature vector of node $u$;
[0067] The persistence difference of the filtering channels is obtained by the following formula:
[0068] ,
[0069] where $c$ is the filter channel index, $c = 1, 2, \dots, C$; $p_c$ denotes the persistence difference of the $c$-th filter channel; $m_c$ denotes the maximum filter value of the $c$-th filter channel; and $\phi_c(u)$ denotes the filter value of node $u$ in the $c$-th filter channel;
[0070] The linear transformation is achieved by the following formula:
[0071] ,
[0072] where $\psi_c^{\mathrm{lin}}$ denotes the linear transformation result of the $c$-th filter channel, $\mathbf{w}$ denotes the linear transformation weight, $b$ denotes the linear transformation bias, $\top$ denotes transpose, and $\Vert$ denotes vertical concatenation;
[0073] The triangular transformation is achieved through the following formula:
[0074] ,
[0075] where $\psi_c^{\mathrm{tri}}$ denotes the triangular transformation result of the $c$-th filter channel, and $\eta$ denotes the learnable peak location parameter;
[0076] The Gaussian transform is achieved by the following formula:
[0077] ,
[0078] where $\psi_c^{\mathrm{gau}}$ denotes the Gaussian transformation result of the $c$-th filter channel, $\gamma_1$ denotes the center position of the Gaussian kernel along the maximum filter value dimension, $\gamma_2$ denotes the center position of the Gaussian kernel along the persistence difference dimension, and $\sigma_g$ is the bandwidth;
[0079] The top-hat transformation is achieved through the following formula:
[0080] ,
[0081] where $\psi_c^{\mathrm{hat}}$ denotes the top-hat transformation result of the $c$-th filter channel, $\kappa$ denotes the center of the top-hat transformation, and $\rho$ denotes the radius of the top-hat transformation.
[0082] Optionally, the fusion features of each node are obtained using the following formula:
[0083]
[0084] where $\mathbf{t}_i$ denotes the topological features of node $v_i$, $\mathbf{t}'_i$ denotes the topological features of node $v_i$ after linear transformation, $\mathbf{U}_1$ denotes the learnable first fusion weight matrix, $\mathbf{U}_2$ denotes the learnable second fusion weight matrix, $\mathbf{c}_1$ denotes the first fusion bias vector, $\mathbf{c}_2$ denotes the second fusion bias vector, $\mathbf{h}_i$ denotes the aggregated neighborhood features of node $v_i$, and $\mathbf{s}_i$ denotes the fused features of node $v_i$.
[0085] Optionally, the data processing flow of the path completion layer includes:
[0086] A candidate node pair set $P$ is constructed from the power information network graph; where $P = \{(v_i, v_j) \mid v_i, v_j \in V,\ i \neq j,\ (v_i, v_j) \notin E\}$, and $(v_i, v_j)$ denotes any two nodes in the power information network graph that are not connected by an edge;
[0087] The candidate node pair set $P$ is traversed; for each candidate node pair $(v_i, v_j)$, the fused features of node $v_i$ and node $v_j$ are joined by a concatenation operation to obtain the joint feature vector $\mathbf{q}_{ij}$ of the candidate node pair; where $\mathbf{q}_{ij} = [\mathbf{s}_i \,\Vert\, \mathbf{s}_j]$, $\mathbf{s}_i$ denotes the fused features of node $v_i$, and $\mathbf{s}_j$ denotes the fused features of node $v_j$;
[0088] The joint feature vector $\mathbf{q}_{ij}$ of the candidate node pair is input into an MLP classifier for classification, yielding the predicted probability value $\hat{y}_{ij}$ of the candidate node pair;
[0089] The predicted probability values of all candidate node pairs are traversed; if $\hat{y}_{ij} > \theta$, a connecting edge is established between node $v_i$ and node $v_j$ in the power information network graph; where $\theta$ is a preset probability threshold.
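The candidate enumeration and thresholding can be sketched as follows. The tiny one-hidden-layer classifier `mlp` and its hand-picked weights are purely illustrative stand-ins for a trained MLP, and the strict "greater than" comparison against the threshold is an assumption.

```python
import math
from itertools import combinations

def mlp(joint, W1, b1, w2, b2):
    """Toy MLP: one ReLU hidden layer followed by a sigmoid output."""
    hidden = [max(0.0, sum(w * x for w, x in zip(row, joint)) + b)
              for row, b in zip(W1, b1)]
    logit = sum(w * h for w, h in zip(w2, hidden)) + b2
    return 1.0 / (1.0 + math.exp(-logit))

def complete_edges(nodes, edges, fused, classifier, threshold=0.5):
    """Return the (u, v, probability) triples whose score exceeds the threshold."""
    existing = {frozenset(e) for e in edges}
    added = []
    for u, v in combinations(nodes, 2):
        if frozenset((u, v)) in existing:
            continue                      # only unconnected pairs are candidates
        joint = fused[u] + fused[v]       # concatenate the two fused features
        prob = classifier(joint)
        if prob > threshold:
            added.append((u, v, prob))
    return added

# Hypothetical one-dimensional fused features and hand-picked weights.
nodes = ["a", "b", "c", "d"]
edges = [("a", "b")]
fused = {"a": [1.0], "b": [0.9], "c": [-1.0], "d": [1.0]}
classifier = lambda q: mlp(q, W1=[[1.0, 1.0]], b1=[0.0], w2=[4.0], b2=-2.0)
added = complete_edges(nodes, edges, fused, classifier)
new_pairs = [(u, v) for u, v, _ in added]
```

Enumerating all unconnected pairs grows quadratically with the number of nodes, so a practical system would probably restrict the candidates, for example to pairs within a few hops of each other.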
[0090] In a second aspect, the present invention provides a system for completing the sensitive data flow path of a power information network, used to implement the method for completing the sensitive data flow path of a power information network as described in any one of the first aspects, including:
[0091] The power system state estimation module is used to: estimate the power system state using the weighted least squares method based on the acquired power information network operation data, and obtain a standardized system state vector;
[0092] The graph construction module is used to: construct a power information network graph based on the standardized system state vector and the acquired power information network topology data;
[0093] The path completion module is used to: perform path completion based on the power information network diagram using a trained topology-enhanced dynamic perception graph neural network model, and obtain a path-completed power information network diagram.
[0094] Compared with the prior art, the beneficial effects achieved by the present invention are as follows:
[0095] 1. By capturing the dynamic correlation characteristics between network nodes through a dynamically perceived graph attention mechanism and extracting the macroscopic topological features of the network using topology enhancement technology, it can accurately complete the missing data flow paths.
[0096] 2. It breaks through the fixed neighborhood aggregation mode of traditional graph neural networks, solves the problem of insufficient perception of global topology in existing methods, and significantly improves the accuracy of path completion and topology rationality through end-to-end link prediction task design, providing more reliable technical support for the safe operation and maintenance of power information networks.
Attached Figure Description
[0097] Figure 1 is a flowchart of a method for completing sensitive data flow paths in a power information network according to an embodiment of the present invention.
[0098] Figure 2 is a schematic diagram of neighborhood subgraph information aggregation according to an embodiment of the present invention.
Detailed Implementation
[0099] The technical solution of the present invention will be described in detail below with reference to the accompanying drawings and specific embodiments. It should be understood that the embodiments and specific features in the embodiments are detailed descriptions of the technical solution of the present application, rather than limitations thereof. In the absence of conflict, the embodiments and technical features in the embodiments can be combined with each other.
[0100] It should be noted that the term "and / or" in this article is merely a description of the relationship between related objects, indicating that three relationships can exist. For example, A and / or B can represent: A existing alone, A and B existing simultaneously, or B existing alone. Additionally, the character " / " in this article generally indicates that the preceding and following related objects have an "or" relationship.
[0101] Example 1
[0102] This embodiment discloses a method for completing sensitive data flow paths in a power information network. As shown in Figure 1, the specific steps are as follows:
[0103] S1. Based on the obtained power information network operation data, the power system state is estimated using the weighted least squares method to obtain the standardized system state vector;
[0104] S2, construct a power information network diagram based on the standardized system state vector and the obtained power information network topology data;
[0105] S3. Based on the power information network diagram, the trained topology-enhanced dynamic perception graph neural network model is used to complete the path, resulting in a power information network diagram with completed path.
[0106] Specifically, in step S1, the operation data of the power information network during a specific time period is first obtained. This data mainly includes three parts: first, network topology data, including all network nodes such as routers and switches and the communication links connecting them, which constitute the backbone of data transmission paths; second, network performance data, including dynamic indicators such as latency, packet loss rate, and throughput of each link, which reflect the quality of data flow; and third, equipment status data, including operating parameters such as CPU utilization and memory usage of network devices.
[0107] Based on the acquired power information network operation data, the power system state is estimated using the weighted least squares method, resulting in a standardized system state vector, including:
[0108] S1.1, preprocess the acquired power information network operation data, and construct the measurement vector $\mathbf{z}$ from the preprocessed data. The preprocessing includes format standardization, initial screening of outliers, and time alignment; all data are aligned to a unified timestamp so that they originate from the same sampling instant and remain consistent over time. Here $\mathbf{z} = [z_1, z_2, \dots, z_m]^{\top}$, where $\top$ denotes transpose, $m$ denotes the total number of measurement types, and $z_i$ denotes the $i$-th measurement;
[0109] S1.2, initialize the system state vector $\mathbf{x}^{(0)}$ and the measurement error covariance matrix $\mathbf{R}$; where $\mathbf{x}^{(0)} = [x_1, x_2, \dots, x_n]^{\top}$, $n$ denotes the total number of system state attributes, and $x_j$ denotes the $j$-th system state value, which characterizes state quantities within the system that are not directly observable, such as the overall health of the network, link reachability, node load levels, or device operating performance; the measurement error covariance matrix $\mathbf{R}$ is constructed based on the statistical characteristics of the different measurement sources;
[0110] S1.3, iteratively execute the following steps until the iteration converges:
[0111] S1.3.1, construct the Jacobian matrix $\mathbf{H}^{(k)}$ of the $k$-th iteration from the partial derivative of the measurement function $h(\cdot)$ with respect to the system state vector $\mathbf{x}^{(k)}$ of the $k$-th iteration:
[0112] $\mathbf{H}^{(k)} = \left.\dfrac{\partial h(\mathbf{x})}{\partial \mathbf{x}}\right|_{\mathbf{x} = \mathbf{x}^{(k)}}$;
[0113] where $\partial$ denotes the partial derivative;
[0114] S1.3.2, based on the measurement error covariance matrix $\mathbf{R}$ and the Jacobian matrix $\mathbf{H}^{(k)}$ of the $k$-th iteration, calculate the residual covariance matrix $\boldsymbol{\Omega}^{(k)}$ of the $k$-th iteration; where $\boldsymbol{\Omega}^{(k)} = \mathbf{R} - \mathbf{H}^{(k)}\left(\mathbf{H}^{(k)\top}\mathbf{R}^{-1}\mathbf{H}^{(k)}\right)^{-1}\mathbf{H}^{(k)\top}$;
[0115] S1.3.3, based on the measurement vector $\mathbf{z}$ and the system state vector $\mathbf{x}^{(k)}$ of the $k$-th iteration, calculate the measurement residual vector $\mathbf{r}^{(k)}$ of the $k$-th iteration using the nonlinear measurement equation:
[0116] $\mathbf{r}^{(k)} = \mathbf{z} - h(\mathbf{x}^{(k)})$,
[0117] where $h(\mathbf{x}^{(k)})$ denotes the observations corresponding to the system state $\mathbf{x}^{(k)}$;
[0118] S1.3.4, to improve data quality and avoid interference from abnormal measurements on the estimation results, this embodiment uses the standardized residual method to detect bad data: the standardized measurement residual vector $\mathbf{r}_N^{(k)}$ is calculated from the residual covariance matrix $\boldsymbol{\Omega}^{(k)}$ and the measurement residual vector $\mathbf{r}^{(k)}$ of the $k$-th iteration; where $\mathbf{r}_N^{(k)} = [r_{N,1}^{(k)}, r_{N,2}^{(k)}, \dots, r_{N,m}^{(k)}]^{\top}$, and $r_{N,i}^{(k)}$ denotes the standardized residual of the $i$-th measurement;
[0119] S1.3.5, once abnormal data has been identified, it can be handled either by increasing its corresponding covariance (i.e., decreasing its weight) or by removing the measurement entirely, thereby avoiding the adverse effects of abnormal measurements on the convergence direction and estimation accuracy of subsequent iterations; if the standardized measurement residual vector $\mathbf{r}_N^{(k)}$ contains an element $r_{N,i}^{(k)} > \tau$, the measurement $z_i$ is rejected; where $i$ is the measurement index, $z_i$ denotes the $i$-th measurement in the measurement vector $\mathbf{z}$, $r_{N,i}^{(k)}$ denotes the standardized residual of the $i$-th measurement in $\mathbf{r}_N^{(k)}$, $\tau$ is the preset residual detection threshold, $r_{N,i}^{(k)} = |r_i^{(k)}| / \sqrt{\Omega_{ii}^{(k)}}$, $r_i^{(k)}$ denotes the $i$-th measurement residual in $\mathbf{r}^{(k)}$, and $\Omega_{ii}^{(k)}$ denotes the $i$-th residual covariance (diagonal element) in $\boldsymbol{\Omega}^{(k)}$;
[0120] S1.3.6, based on the measurement vector $\mathbf{z}$, the measurement error covariance matrix $\mathbf{R}$ and the Jacobian matrix $\mathbf{H}^{(k)}$ of the $k$-th iteration, calculate the increment $\Delta\mathbf{x}^{(k)}$ of the system state vector of the $k$-th iteration using the correction equation: $\Delta\mathbf{x}^{(k)} = \left(\mathbf{H}^{(k)\top}\mathbf{R}^{-1}\mathbf{H}^{(k)}\right)^{-1}\mathbf{H}^{(k)\top}\mathbf{R}^{-1}\left(\mathbf{z} - h(\mathbf{x}^{(k)})\right)$;
[0121] S1.3.7, based on the system state vector $\mathbf{x}^{(k)}$ of the $k$-th iteration and the increment $\Delta\mathbf{x}^{(k)}$, calculate the system state vector $\mathbf{x}^{(k+1)}$ of the $(k+1)$-th iteration;
[0122] S1.3.8, calculate the target value $J(\mathbf{x}^{(k)})$ of the system state vector of the $k$-th iteration and the target value $J(\mathbf{x}^{(k+1)})$ of the system state vector of the $(k+1)$-th iteration;
[0123] S1.3.9, if $\lVert \Delta\mathbf{x}^{(k)} \rVert < \varepsilon_x$ and $|J(\mathbf{x}^{(k+1)}) - J(\mathbf{x}^{(k)})| < \varepsilon_J$, the iteration is judged to have converged, and the system state vector $\mathbf{x}^{(k+1)}$ of the $(k+1)$-th iteration is standardized to obtain the standardized system state vector; where $\lVert \cdot \rVert$ denotes the vector norm, $\varepsilon_x$ is the preset state convergence threshold, and $\varepsilon_J$ is the preset convergence threshold for the objective function.
[0124] The expression for the objective function is:
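[0125] $J(\mathbf{x}) = \left(\mathbf{z} - h(\mathbf{x})\right)^{\top}\mathbf{R}^{-1}\left(\mathbf{z} - h(\mathbf{x})\right)$,
[0126] where $J(\mathbf{x})$ denotes the target value of the system state vector $\mathbf{x}$;
[0127] The standardized system state vector is obtained through the following formula:
[0128] $\tilde{\mathbf{x}} = \dfrac{\mathbf{x}^{(k+1)} - \mu}{\sigma}$,
[0129] where $\tilde{\mathbf{x}}$ denotes the standardized system state vector, $\mu$ denotes the mean of the system state vector $\mathbf{x}^{(k+1)}$ of the $(k+1)$-th iteration, and $\sigma$ denotes its standard deviation.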
[0130] Through the data processing flow of S1.3.3 to S1.3.5, not only can random errors caused by measurement noise be corrected, but serious erroneous data caused by equipment failure or communication anomalies can also be effectively identified and eliminated; the quality and reliability of input data are significantly improved, providing an accurate data foundation for subsequent graph structure modeling, while also ensuring the rationality and credibility of the final completion result;
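The standardized residual test can be illustrated on the simplest possible case: several redundant sensors measuring one scalar state, so that the Jacobian is a column of ones and the diagonal of the residual covariance has a closed form. The sketch removes only the single worst measurement per pass and then re-estimates, which is a common variant and an assumption here; the function name, readings and threshold are hypothetical.

```python
import math

def reject_bad_data(z, var, tau=3.0):
    """Estimate one scalar state from redundant readings, dropping bad data.

    z:   list of readings of the same quantity
    var: measurement error variance of each reading (diagonal of R)
    tau: threshold on the standardized residual
    Returns the estimate and the indices of the readings that were kept.
    """
    kept = list(range(len(z)))
    while True:
        gain = sum(1.0 / var[i] for i in kept)        # H^T R^-1 H with H = ones
        x_hat = sum(z[i] / var[i] for i in kept) / gain
        if len(kept) < 2:                             # no redundancy left to test
            return x_hat, kept
        # Standardized residual |r_i| / sqrt(Omega_ii), Omega_ii = R_ii - 1/gain
        r_norm = {i: abs(z[i] - x_hat) / math.sqrt(var[i] - 1.0 / gain)
                  for i in kept}
        worst = max(r_norm, key=r_norm.get)
        if r_norm[worst] <= tau:
            return x_hat, kept
        kept.remove(worst)                            # drop the worst, re-estimate

# Three consistent readings and one gross error (hypothetical values).
readings = [10.0, 10.1, 9.9, 15.0]
variances = [0.01, 0.01, 0.01, 0.01]
estimate, kept = reject_bad_data(readings, variances)
```

In this example the gross error initially inflates the standardized residuals of all four readings, which is why only the largest one is removed before the test is repeated.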
[0131] This embodiment can accurately depict the operating status of the power information network and identify potential abnormal measurements or faulty links through a bad data detection mechanism, thereby providing a reliable data foundation for network monitoring, performance analysis, risk warning and subsequent scheduling optimization.
[0132] In step S2, the expression for the power information network diagram is as follows:
[0133] $G = (V, E, \mathbf{F})$,
[0134] where $G$ denotes the power information network graph; $V$ denotes the node set of $G$, which comprehensively covers all node devices in the power information network, $V = \{v_1, v_2, \dots, v_N\}$, $N$ denotes the total number of communication nodes in the power information network, and $v_i$ denotes the $i$-th node; $E$ denotes the edge set of $G$, in which each edge corresponds exactly to an actual connection in the power information network, $E \subseteq V \times V$, $i$ and $j$ are node indices, $v_i$ denotes the $i$-th node, $v_j$ denotes the $j$-th node, and $(v_i, v_j) \in E$ indicates that there is a direct communication connection between $v_i$ and $v_j$; $\mathbf{F}$ denotes the node feature matrix of $G$, $\mathbf{F} = [\mathbf{f}_1, \mathbf{f}_2, \dots, \mathbf{f}_N]^{\top}$, $\mathbf{f}_i$ denotes the feature vector of node $v_i$, $\mathbf{f}_i = [f_{i,1}, f_{i,2}, \dots, f_{i,d}]$, and $f_{i,d}$ denotes the feature of node $v_i$ in the $d$-th dimension.
[0135] In step S3, the step of performing path completion using a trained topology-enhanced dynamic perceptual graph neural network model based on the power information network diagram to obtain a path-completed power information network diagram includes:
[0136] S3.1, use the subgraph sampling layer to extract the L-hop neighborhood subgraph of each node in the power information network graph;
[0137] S3.2, Based on the L-hop neighborhood subgraph of each node, use the neighborhood feature extraction module to obtain the aggregated neighborhood features of each node;
[0138] S3.3, use the topology filtering module to filter the L-hop neighborhood subgraph of each node to obtain the topology features of each node;
[0139] S3.4, The topological features and aggregated neighborhood features of each node are fused using the feature fusion layer to obtain the fused features of each node;
[0140] S3.5, based on the fusion characteristics of each node, the path completion layer is used to obtain the predicted probability value of a direct communication connection between any two nodes, and the power information network graph is edge-completed according to the predicted probability value to obtain the power information network graph after path completion.
[0141] In step S3.1, to systematically capture multi-scale topological dependencies from local direct connections to global indirect associations, this embodiment extracts the L-hop neighborhood subgraph for all nodes. The expression for the L-hop neighborhood subgraph $G_i^{L}$ of any node $v_i$ is:
[0142] $G_i^{L} = (V_i^{L}, E_i^{L}, \mathbf{F}_i^{L})$,
[0143] where $i$ is the node index, $v_i$ denotes the $i$-th node, $L$ denotes the total number of neighborhood levels, $V_i^{L}$ denotes the node set of $G_i^{L}$, $E_i^{L}$ denotes the edge set of $G_i^{L}$, and $\mathbf{F}_i^{L}$ denotes the node feature matrix of $G_i^{L}$.
[0144] This design is based on the hierarchical nature of power information network topology. In shallow neighborhoods, the model focuses on direct physical connections, such as communication lines between adjacent nodes. As the neighborhood hierarchy deepens, the model gradually perceives a broader range of network structures, including regional interconnection patterns, ring network structures, and transmission paths. Through this hierarchical perception mechanism, the final representation of each node contains both local connection details and global topological context, providing a comprehensive information foundation for accurately identifying potential connections between nodes.
[0145] In step S3.2, this embodiment employs a hierarchical dynamic multi-head attention mechanism specifically designed to extract multi-level feature information from L-hop neighborhood subgraphs. Its core lies in using a structurally unified dynamic multi-head attention unit to perform information aggregation operations within different levels of a node's neighborhood. Each level is based on the same attention computation framework but independently learns attention weight parameters unique to that level, thereby achieving differentiated modeling of neighborhood influences at different topological scales.
[0146] As shown in Figure 2, where "Layer" denotes a neighborhood layer, the step of obtaining the aggregated neighborhood features of each node using the neighborhood feature extraction module, based on the L-hop neighborhood subgraph of each node, includes:
[0147] For the L-hop neighborhood subgraph $G_i^{L}$ of any node $v_i$, perform the following information aggregation operation:
[0148] For any node $u$ in the $l$-th layer of $G_i^{L}$, apply each of the structurally unified attention heads in turn to aggregate the neighborhood features of the nodes $w$ that have an edge with it, obtaining one neighborhood feature of node $u$ per attention head; here $l$ denotes the neighborhood-level index, $u$ denotes a node index within the $l$-th layer, and $w$ denotes the index of a node that has an edge with node $u$. The node at level 0 is node $v_i$;
[0149] The neighborhood features of node $v_i$ produced by the attention heads are concatenated to obtain the aggregated neighborhood feature of node $v_i$.
[0150] The expression for the attention head is as follows:
[0151] $h_u^{(l)} = \sigma\left(\sum_{w \in \mathcal{N}^{(l)}(u)} \alpha_{uw}^{(l)} W^{(l)} h_w\right)$,
[0152] where $h_u^{(l)}$ denotes the neighborhood feature of node $u$, $\sigma$ denotes the ReLU activation function, $\mathcal{N}^{(l)}(u)$ denotes the set of nodes in the $l$-th layer that have an edge with node $u$, $h_w$ denotes the neighborhood feature of node $w$, $\alpha_{uw}^{(l)}$ denotes the attention score between nodes $u$ and $w$, and $W^{(l)}$ denotes the learnable linear transformation matrix of the $l$-th layer;
[0153] The node and Attention scores between It can be obtained through the following formula:
[0154] ,
[0155] in, Indicates vertical series connection. Indicates the first Layer learnable weight matrix, express The Middle Layer-learnable attention matrix, express Node index in express Any node in, Represents a node eigenvectors, Represents a node The neighborhood characteristics.
[0156] Through this hierarchical attention mechanism, the model can dynamically adjust the information aggregation strategy of each neighborhood layer, forming a fine-grained representation of the power information network neighborhood. This provides strong feature support for subsequent data flow path completion tasks.
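The aggregation performed by one attention head, and the concatenation over heads, can be sketched as follows. The attention-score formula of this embodiment did not survive in the text, so the scoring below (a softmax over a learnable vector applied to concatenated, linearly transformed features, in the style of a graph attention network) is an assumption, as are the names `attention_head`, `W`, and `a`, and all dimensions.

```python
import numpy as np

def attention_head(x_u, h_nbrs, W, a):
    """One attention head: aggregate neighbor features into node u's feature.

    x_u:    (d,)    feature vector of the target node u
    h_nbrs: (n, d)  features of the n nodes that share an edge with u
    W:      (d_out, d)    learnable linear transformation of this layer/head
    a:      (2 * d_out,)  learnable attention vector (ASSUMED scoring form)
    """
    z_u = W @ x_u                       # transformed target feature, (d_out,)
    z_n = h_nbrs @ W.T                  # transformed neighbor features, (n, d_out)
    pairs = np.concatenate([np.tile(z_u, (len(z_n), 1)), z_n], axis=1)
    logits = pairs @ a                  # one raw score per neighbor
    logits = logits - logits.max()      # shift for numerical stability
    alpha = np.exp(logits) / np.exp(logits).sum()   # softmax attention scores
    h_u = np.maximum(0.0, alpha @ z_n)  # ReLU of the attention-weighted sum
    return h_u, alpha

rng = np.random.default_rng(0)
d, d_out, n, heads = 10, 4, 3, 4        # 4 heads, as in the experiments
x_u = rng.normal(size=d)
h_nbrs = rng.normal(size=(n, d))

outputs = []
for _ in range(heads):
    W = rng.normal(size=(d_out, d))     # each head has its own parameters
    a = rng.normal(size=2 * d_out)
    h_u, alpha = attention_head(x_u, h_nbrs, W, a)
    outputs.append(h_u)

aggregated = np.concatenate(outputs)    # concatenation over heads
print(aggregated.shape)                 # (16,)
```

In a trained model `W` and `a` would be learned separately for each neighborhood layer, which is what lets each layer weight its neighbors differently.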
[0157] In step S3.3, learning topological features is crucial for extracting deep structural information beyond local connectivity patterns from power information network data. Topological features reveal the ring structures, connected components, and hierarchical organization patterns in the network. This information reflects the overall connectivity robustness of the network, its potential data flow paths, and the distribution of key hub nodes, and provides an important basis for the model to complete the network.
[0158] Persistent homology is a traditional method for learning topological features and can capture a variety of them, but the predefined filtering functions it relies on have clear limitations: they cannot adapt to the semantic differences between relational subgraphs, and a single filtering function can hardly capture diverse topological structures in full.
[0159] To address these problems, this embodiment constructs a learnable filtering function, implemented as a multilayer perceptron (MLP), to filter the neighborhood subgraph. The MLP consists of multiple fully connected layers and progressively extracts the deep interaction features between node pairs through nonlinear transformations.
[0160] The data processing flow of the topology filtering module includes:
[0161] For the L-hop neighborhood subgraph of any node, all nodes of the subgraph are input into each filter channel to obtain the filter value of every node in each filter channel:
[0162] ,
[0163] in, express Node index in express Any node in, Indicates the first Nodes in each filtering channel The filter value, This represents the Tanh activation function. This represents the ReLU activation function. This represents the first weight matrix. This represents the second weight matrix. This indicates the first bias value. This indicates the second bias. Represents a node eigenvectors;
[0164] From the filter values of all nodes in each filter channel, the persistence difference of each filter channel is calculated:
[0165] ,
[0166] in, For filtering channel indexes, , Indicates the first The persistence difference of each filter channel Indicates the first The maximum filter value for each filter channel Represents a node In the The filter value for each filter channel;
[0167] Based on the persistence difference and maximum filter value of each filter channel, a linear transformation, a triangular transformation, a Gaussian transformation, and a top-hat transformation are applied to each filter channel. The four transformation results are then concatenated to obtain the embedding result of each filter channel.
[0168] The linear transformation is achieved by the following formula:
[0169] ,
[0170] in, Indicates the first The linear transformation results of each filtering channel Indicates linearly changing weights. This indicates a linearly varying bias. Indicates transpose. Indicates vertical series connection;
[0171] The triangular transformation is achieved through the following formula:
[0172] ,
[0173] in, Indicates the first The triangular transformation results of each filtering channel Indicates the learnable peak location parameter;
[0174] The Gaussian transform is achieved by the following formula:
[0175] ,
[0176] in, Indicates the first Gaussian transform results of each filtered channel This indicates the center position of the Gaussian kernel in the dimension of the filtered maximum value. This indicates the central location of the Gaussian kernel on the dimension of persistent difference. For bandwidth;
[0177] The top-hat transformation is achieved through the following formula:
[0178] ,
[0179] in, Indicates the first The top-cap transformation results of each filter channel. Indicates the center of the top-hat transformation. The radius of the top-hat transformation;
[0180] The embedding results of all filter channels are concatenated along the channel dimension to obtain the topological features of the node.
[0181] This embodiment replaces the traditional non-differentiable algorithm with a parameterized filtering function, so that topological feature computation can be embedded in an end-to-end training framework. While retaining the sensitivity of persistent homology to abrupt changes in graph structure, it reduces the computational complexity from ... to a linear level, which significantly improves the model's applicability to large-scale graph data. The combination of four types of embedding functions gives the model a multi-level topological feature representation. The triangular transformation captures salient band features in persistence diagrams through learnable peak position parameters, making it particularly suitable for identifying key topological structures; the Gaussian transformation uses the smoothing properties of radial basis functions to model multi-scale topological density robustly; the linear transformation, as a basic operator, ensures efficient transfer of simple topological features; and the top-hat transformation enhances sensitivity to key points in complex topologies through a combination of rational functions.
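The learnable filtering step can be sketched as a small two-layer MLP applied to every node of a neighborhood subgraph. The description names a ReLU activation, a Tanh activation, two weight matrices, and two biases; the ordering used here (ReLU hidden layer, Tanh output), the packing of all channels into one output layer, and the `channel_summary` helper that takes the persistence difference as maximum minus minimum are assumptions for illustration, since the original formulas are not legible in the text.

```python
import numpy as np

def filter_values(X, W1, b1, W2, b2):
    """Learnable filtration values for every node and every filter channel.

    X:  (n, d) node feature matrix of one neighborhood subgraph
    W1: (hidden, d), b1: (hidden,)  first weight matrix and bias
    W2: (K, hidden), b2: (K,)       second weight matrix and bias
    Returns an (n, K) array: one filter value per node and channel.
    """
    hidden = np.maximum(0.0, X @ W1.T + b1)   # ReLU hidden layer
    return np.tanh(hidden @ W2.T + b2)        # Tanh keeps values in [-1, 1]

def channel_summary(F):
    """Per-channel maximum filter value and an ASSUMED persistence difference."""
    f_max = F.max(axis=0)
    persistence = f_max - F.min(axis=0)       # assumption: max minus min
    return f_max, persistence

rng = np.random.default_rng(0)
n, d, hidden, K = 5, 10, 8, 16                # 16 filter functions, as in the experiments
X = rng.normal(size=(n, d))
W1, b1 = rng.normal(size=(hidden, d)), np.zeros(hidden)
W2, b2 = rng.normal(size=(K, hidden)), np.zeros(K)

F = filter_values(X, W1, b1, W2, b2)
f_max, persistence = channel_summary(F)
print(F.shape, f_max.shape, persistence.shape)   # (5, 16) (16,) (16,)
```

Because every operation is differentiable almost everywhere, the filter parameters can be trained together with the rest of the network, which is the property the paragraph above relies on.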
[0182] In step S3.4, the fusion features of each node are obtained using the following formula:
[0183]
[0184] in, Represents a node Topological features, Represents a node The topological features after linear transformation This represents the first learnable fusion weight matrix. This represents the learnable second fusion weight matrix. This represents the first fusion bias vector. This represents the second fusion bias vector. Represents a node The aggregation neighborhood features, Represents a node The fusion characteristics.
[0185] In step S3.5, the data processing flow of the path completion layer includes:
[0186] Construct a candidate set of node pairs from the power information network graph; each candidate pair $(v_i, v_j)$ in the set consists of two nodes of the power information network graph that are not connected by an edge.
[0187] Traverse the candidate set. For each candidate node pair $(v_i, v_j)$, concatenate the fused features of node $v_i$ and node $v_j$ to obtain the joint feature vector of the pair, $z_{ij} = [z_i \,\|\, z_j]$, where $z_i$ denotes the fused feature of node $v_i$ and $z_j$ denotes the fused feature of node $v_j$;
[0188] Input the joint feature vector $z_{ij}$ of each candidate node pair into an MLP classifier for a classification decision, obtaining a predicted probability value $p_{ij}$ between 0 and 1 for the pair;
[0189] To obtain the final binary prediction, an adjustable probability threshold is set. The predicted probability values of all candidate node pairs are traversed, and for each pair whose predicted probability meets the threshold condition, a connecting edge is established between node $v_i$ and node $v_j$ in the power information network graph. This link prediction mechanism, which scores every unconnected node pair with an MLP classifier, identifies potential missing connections across the power information network and thereby completes the network topology.
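The candidate enumeration, feature concatenation, and thresholding steps above can be sketched in a few lines. The function `toy_classifier` is a hypothetical stand-in for the trained MLP classifier, the treatment of pairs as unordered and the use of `>=` for the threshold comparison are assumptions, and the threshold value 0.6 is chosen only for the example.

```python
import itertools
import math

def toy_classifier(z_ij):
    """Hypothetical stand-in for the trained MLP classifier.

    Scores a pair by the similarity of its two halves, squashed to (0, 1).
    """
    d = len(z_ij) // 2
    dot = sum(a * b for a, b in zip(z_ij[:d], z_ij[d:]))
    return 1.0 / (1.0 + math.exp(-4.0 * dot))

def complete_edges(fused, edges, classifier, threshold):
    """Return the unconnected node pairs whose predicted probability passes the threshold."""
    existing = {tuple(sorted(e)) for e in edges}
    added = []
    for i, j in itertools.combinations(range(len(fused)), 2):
        if (i, j) in existing:
            continue                            # only pairs without an edge are candidates
        z_ij = list(fused[i]) + list(fused[j])  # concatenate the fused features
        p_ij = classifier(z_ij)                 # predicted probability in (0, 1)
        if p_ij >= threshold:                   # assumed comparison direction
            added.append((i, j))
    return added

fused = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]  # toy fused features
edges = [(0, 2)]                                           # known edges
new_edges = complete_edges(fused, edges, toy_classifier, threshold=0.6)
print(new_edges)  # [(0, 1)]
```

Enumerating all unconnected pairs grows quadratically with the number of nodes, so a practical implementation on a network with thousands of nodes would likely score pairs in batches.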
[0190] The training process for a topology-enhanced dynamic perceptual graph neural network model includes:
[0191] Based on the complete topology connection information of the existing power information network, all connection edges are randomly divided into three mutually exclusive subsets according to a preset ratio: training set (70%), validation set (15%), and test set (15%). During the partitioning process, a completely random sampling strategy is adopted to ensure that each edge is assigned to only one subset, while maintaining the consistency of the distribution of each subset in terms of connection type and topology characteristics.
[0192] The training set samples are input into the topology-enhanced dynamic perceptual graph neural network model for forward propagation to obtain the predicted probability values of each candidate node pair in each sample.
[0193] The model loss is calculated from the predicted probability values of the candidate node pairs in each sample using the cross-entropy loss function. Backpropagation is then performed on the model loss, and the parameters of the topology-enhanced dynamic perception graph neural network model are updated with an optimizer (such as Adam) to minimize the loss.
[0194] During the iteration process, the model training effect is verified using a validation set. When the validation set loss tends to stabilize over several training cycles, the model is deemed to have converged and the final parameters are saved.
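The data preparation and loss computation in the training process can be sketched as below. This is an illustration under stated assumptions: the function names `split_edges` and `binary_cross_entropy` are hypothetical, the split is a plain shuffle (no stratification by connection type), and the loss is the standard binary cross-entropy averaged over node pairs.

```python
import math
import random

def split_edges(edges, ratios=(0.70, 0.15, 0.15), seed=0):
    """Randomly split edges into disjoint training, validation, and test subsets."""
    edges = list(edges)
    random.Random(seed).shuffle(edges)
    n_train = int(ratios[0] * len(edges))
    n_val = int(ratios[1] * len(edges))
    train = edges[:n_train]
    val = edges[n_train:n_train + n_val]
    test = edges[n_train + n_val:]          # remainder goes to the test set
    return train, val, test

def binary_cross_entropy(probs, labels, eps=1e-12):
    """Mean binary cross-entropy between predicted probabilities and 0/1 labels."""
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)     # clamp to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(probs)

# 878 edges, matching the size of the municipal-level example network.
edges = [(i, i + 1) for i in range(878)]
train, val, test = split_edges(edges)
print(len(train), len(val), len(test))      # 614 131 133

loss = binary_cross_entropy([0.9, 0.2, 0.7], [1, 0, 1])
print(round(loss, 4))
```

The validation subset is the one monitored for convergence: training stops once its loss stops improving over several consecutive epochs.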
[0195] To verify the effectiveness and applicability of the sensitive data flow path completion method for power information networks proposed in this embodiment, the method was evaluated on multiple network datasets, using precision ($P$), recall ($R$), and the $F1$ value as evaluation metrics, calculated as follows:
[0196] $P = \dfrac{TP}{TP + FP}$,
[0197] $R = \dfrac{TP}{TP + FN}$,
[0198] $F1 = \dfrac{2 \times P \times R}{P + R}$,
[0199] where $TP$ is the number of missing edges that were correctly predicted, $FP$ is the number of edges that were predicted although they are not missing edges, and $FN$ is the number of missing edges that the model failed to predict.
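As a worked example of these metrics, the sketch below counts true positives, false positives, and false negatives from two sets of edges. The function name and the toy edge sets are hypothetical and are not taken from the experiments.

```python
def precision_recall_f1(predicted, missing):
    """Compute precision, recall, and F1 for edge completion.

    predicted: edges the model added; missing: edges that are truly missing.
    """
    predicted, missing = set(predicted), set(missing)
    tp = len(predicted & missing)    # missing edges correctly predicted
    fp = len(predicted - missing)    # predicted edges that are not missing edges
    fn = len(missing - predicted)    # missing edges the model failed to predict
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

predicted = [(0, 1), (2, 3), (4, 5)]
missing = [(0, 1), (2, 3), (6, 7), (8, 9)]
p, r, f1 = precision_recall_f1(predicted, missing)
print(round(p, 4), round(r, 4), round(f1, 4))  # 0.6667 0.5 0.5714
```

Here two of the three predicted edges are correct (precision 2/3) and two of the four missing edges are recovered (recall 1/2), giving F1 = 4/7.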
[0200] Embodiment 2:
[0201] Based on the same inventive concept as Embodiment 1, this embodiment provides a method for completing the sensitive data flow path in a power information network, including:
[0202] S1. Based on the obtained power information network operation data, the power system state is estimated using the weighted least squares method to obtain the standardized system state vector;
[0203] S2, construct a power information network diagram based on the standardized system state vector and the obtained power information network topology data;
[0204] S3. Based on the power information network diagram, the trained topology-enhanced dynamic perception graph neural network model is used to complete the path, resulting in a power information network diagram with completed path.
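Step S1 can be sketched as an iterative weighted least squares (Gauss-Newton) loop followed by a mean and standard deviation normalization of the estimated state. This is a minimal illustration: the toy measurement function, its Jacobian, the error covariance values, and the function names are assumptions, and the rejection of bad measurements by standardized residuals and the objective-function convergence check are omitted for brevity.

```python
import numpy as np

def wls_estimate(z, h, jac, R, x0, tol=1e-8, max_iter=50):
    """Iterative weighted least squares state estimation.

    z: measurement vector; h(x): measurement function; jac(x): its Jacobian;
    R: measurement error covariance matrix; x0: initial state vector.
    """
    x = np.asarray(x0, dtype=float)
    R_inv = np.linalg.inv(R)
    for _ in range(max_iter):
        H = jac(x)                                  # Jacobian at the current state
        r = z - h(x)                                # measurement residual vector
        G = H.T @ R_inv @ H                         # gain matrix
        dx = np.linalg.solve(G, H.T @ R_inv @ r)    # state increment
        x = x + dx
        if np.linalg.norm(dx) < tol:                # state convergence check
            break
    return x

def standardize(x):
    """Z-score normalization of the estimated state vector."""
    return (x - x.mean()) / x.std()

# Toy nonlinear measurement model (hypothetical): two direct measurements and one product.
h = lambda x: np.array([x[0], x[1], x[0] * x[1]])
jac = lambda x: np.array([[1.0, 0.0], [0.0, 1.0], [x[1], x[0]]])
z = np.array([1.0, 2.0, 2.0])                       # noise-free measurements of x = [1, 2]
R = np.diag([0.01, 0.01, 0.04])

x_hat = wls_estimate(z, h, jac, R, x0=[0.5, 1.5])
print(np.round(x_hat, 6))
print(np.round(standardize(x_hat), 6))
```

Measurements with smaller error variance in `R` receive larger weight in the update, which is what distinguishes the weighted form from ordinary least squares.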
[0205] Taking a municipal-level power information network as an example, this network contains 300 network nodes and 878 links. The method for completing the sensitive data flow path in the power information network according to this embodiment is as follows:
[0206] (1) Construct a power information network diagram. Each node has 10 features, including node degree, latency, packet loss rate, throughput, and CPU utilization, which form the node feature matrix. The edges are randomly divided into a training set (70%), a validation set (15%), and a test set (15%).
[0207] (2) Construct a topology-enhanced dynamic perceptual graph neural network model with 2 sampling layers in the neighborhood subgraph, 4 heads in the dynamic multi-head attention mechanism, and 16 filtering functions to obtain the representation of each node in the graph and predict whether there is an edge relationship between all pairs of nodes.
[0208] (3) Using the Adam optimizer, set the learning rate to 0.001, train for 3000 epochs, and conduct experiments on the test set.
[0209] The comparative experimental results of the topology-enhanced dynamic perception graph neural network model proposed in this embodiment with the Dynamic-Aware Graph Neural Network (D-GNN) and Graph Neural Network (GNN) are shown in Table 1 below. The power information network sensitive data flow path completion method proposed in this embodiment effectively improves the performance of the graph neural network, raising its F1 score for completion task to 0.400.
[0210] Table 1. Experimental Results of Municipal Power Information Network
[0211]
[0212] The topology-enhanced dynamic perceptual graph neural network model proposed in this embodiment outperforms both the dynamic perceptual graph neural network and the graph neural network. This improvement mainly stems from the model's use of the inherent physical structure of the power information network: the topology enhancement mechanism enables the model to explicitly learn the connection rules inherent in the network, so that its inferences remain consistent with physical reality even when information is missing. This leads directly to higher precision, that is, the completed paths are more reliable and the risk of false alarms is reduced. The same mechanism also uncovers associations that are not directly reflected in the data but are structurally well supported, thereby recovering more true paths. This translates directly into higher recall, that is, the model discovers more missed connections and reconstructs the network more completely. Therefore, by integrating topological prior knowledge into the dynamic learning process, this invention improves not only the quantitative indicators but also the practical credibility and completeness of the completed results.
[0213] Embodiment 3:
[0214] Based on the same inventive concept as Embodiment 1, this embodiment provides a method for completing the sensitive data flow path in a power information network, including:
[0215] S1. Based on the obtained power information network operation data, the power system state is estimated using the weighted least squares method to obtain the standardized system state vector;
[0216] S2, construct a power information network diagram based on the standardized system state vector and the obtained power information network topology data;
[0217] S3. Based on the power information network diagram, the trained topology-enhanced dynamic perception graph neural network model is used to complete the path, resulting in a power information network diagram with completed path.
[0218] Taking a provincial-level power information network as an example, this network contains 5715 network nodes and 19794 links. The method for completing the sensitive data flow path in the power information network according to this embodiment is as follows:
[0219] (1) Construct a power information network diagram. Each node has 10 features, including node degree, latency, packet loss rate, throughput, and CPU utilization, which form the node feature matrix. The edges are randomly divided into a training set (70%), a validation set (15%), and a test set (15%).
[0220] (2) Construct a topology-enhanced dynamic perceptual graph neural network model with 2 sampling layers in the neighborhood subgraph, 4 heads in the dynamic multi-head attention mechanism, and 16 filtering functions to obtain the representation of each node in the graph and predict whether there are edge relationships between all pairs of nodes.
[0221] (3) Using the Adam optimizer, set the learning rate to 0.001, train for 3000 epochs, and conduct experiments on the test set.
[0222] The comparative experimental results of the topology-enhanced dynamic perception graph neural network model proposed in this embodiment with the Dynamic-Aware Graph Neural Network (D-GNN) and Graph Neural Network (GNN) are shown in Table 2. The power information network sensitive data flow path completion method proposed in this embodiment effectively improves the performance of graph neural networks, increasing its F1 score for completion tasks to 0.362.
[0223] Table 2 Experimental Results of Provincial Power Communication Backbone Network
[0224]
[0225] This method maintains its core advantages in larger-scale power information networks, achieving an F1 score of 0.362 and continuing to significantly outperform the dynamic perceptual graph neural network and the graph neural network. This result confirms the effectiveness and robustness of the topology-enhanced dynamic perceptual graph neural network model: although a larger network brings an exponential increase in topological complexity and therefore a reasonable decline in the absolute performance of all models, this method, by explicitly modeling the network's physical structure, still constrains the search space most effectively and identifies the most reliable paths among complex connections. Its completion results therefore retain high practical value in highly complex scenarios, providing a crucial basis for its application to large-scale power information networks in the real world.
[0226] Embodiment 4:
[0227] Based on the same inventive concept as Embodiment 1, this embodiment provides a power information network sensitive data flow path completion system, including:
[0228] The power system state estimation module is used to: estimate the power system state using the weighted least squares method based on the acquired power information network operation data, and obtain a standardized system state vector;
[0229] The graph construction module is used to: construct a power information network graph based on the standardized system state vector and the acquired power information network topology data;
[0230] The path completion module is used to: perform path completion based on the power information network diagram using a trained topology-enhanced dynamic perception graph neural network model, and obtain a path-completed power information network diagram.
[0231] The specific functions of each module described above are explained in the relevant content of the method in Embodiment 1, and will not be repeated here.
[0232] Those skilled in the art will understand that embodiments of the present invention can be provided as methods, systems, or computer program products. Therefore, the present invention can take the form of a completely hardware embodiment, a completely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present invention can take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) containing computer-usable program code.
[0233] This invention is described with reference to flowchart illustrations and / or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each block of the flowchart illustrations and / or block diagrams, and combinations of blocks in the flowchart illustrations and / or block diagrams, can be implemented by computer program instructions. These computer program instructions can be provided to a processor of a general-purpose computer, special-purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, produce means for implementing the functions specified in one or more flows of the flowchart and / or one or more blocks of the block diagram.
[0234] These computer program instructions may also be stored in a computer-readable storage medium that can direct a computer or other programmable data processing device to function in a particular manner, such that the instructions stored in the computer-readable storage medium produce an article of manufacture including instruction means that implement the functions specified in one or more flows of the flowchart and / or one or more blocks of the block diagram.
[0235] These computer program instructions may also be loaded onto a computer or other programmable data processing equipment to cause a series of operational steps to be performed on the computer or other programmable equipment to produce a computer-implemented process, such that the instructions that execute on the computer or other programmable equipment provide steps for implementing the functions specified in one or more flows of the flowchart and / or one or more blocks of the block diagram.
[0236] The embodiments of the present invention have been described above with reference to the accompanying drawings. However, the present invention is not limited to the specific embodiments described above. The specific embodiments described above are merely illustrative and not restrictive. Those skilled in the art can make many other forms under the guidance of the present invention without departing from the spirit and scope of the claims. All of these forms are within the protection scope of the present invention.
Claims
1. A method for completing the sensitive data flow path in a power information network, characterized in that it comprises: estimating the power system state using the weighted least squares method, based on the obtained power information network operation data, to obtain a standardized system state vector; constructing a power information network diagram based on the standardized system state vector and the obtained power information network topology data; and performing path completion on the power information network diagram using a trained topology-enhanced dynamic perception graph neural network model to obtain a path-completed power information network diagram.
2. The method for completing the sensitive data flow path in a power information network according to claim 1, characterized in that, The process of estimating the power system state using weighted least squares based on the acquired power information network operation data to obtain a standardized system state vector includes: The acquired power information network operation data is preprocessed, and measurement vectors are constructed using the preprocessed power information network operation data. ; Initialize system state vector and measurement error covariance matrix The following steps are performed iteratively until the iteration converges: According to the measurement function For the The system state vector of the next iteration The partial derivative of is used to construct the first Jacobian matrix of the next iteration ; Based on the measurement error covariance matrix and the Jacobian matrix of the next iteration Calculate the first The residual covariance matrix of the next iteration ;in, ; According to the measurement vector and the The system state vector of the next iteration The first nonlinear measurement equation is used to calculate the second... 
Measurement residual vector of the next iteration ; According to the The residual covariance matrix of the next iteration and measurement residual vector Calculate the standardized residual vector of the measurement ; If the standardized residual vector is measured There exists Rejection quantity measurement ;in, For measurement index, Represents measurement vector The Middle Individual measurement, Represents the standardized residual vector of the measurement In The standardized residuals of the measurement The preset residual detection threshold, , Represents the measurement residual vector middle Measurement residuals Represents the residual covariance matrix middle The residual covariance; According to the measurement vector Measurement error covariance matrix and the Jacobian matrix of the next iteration Calculate the first using the corrected equation The system state vector of the next iteration Increment ; According to the The system state vector of the next iteration and increment Calculate the first The system state vector of the next iteration ;in, ; Calculate the first using the objective function The system state vector of the next iteration target value and the The system state vector of the next iteration target value ; like and To determine if the iteration converges, for the ... The system state vector of the next iteration Standardization is performed to obtain the standardized system state vector; where, This represents the vector modulo operation. 
The preset state convergence threshold, The preset convergence threshold for the objective function; The first Jacobian matrix of the next iteration It can be obtained through the following formula: ; in, Indicates partial derivative; The expression for the nonlinear measurement equation is: , in, Indicates the system state is The actual observations corresponding to the time; The expression for the corrected equation is: ; The expression for the objective function is: , in, Represents the system state vector Target value; The standardized system state vector is obtained through the following formula: , in, Represents the standardized system state vector. Indicates the first The system state vector of the next iteration The mean, Indicates the first The system state vector of the next iteration The standard deviation.
3. The method for completing the sensitive data flow path in a power information network according to claim 1, characterized in that, The expression for the power information network diagram is as follows: , in, Represents a power information network diagram. express The set of nodes, , This represents the total number of communication nodes in the power information network. Indicates the first 1 node express The set of edges, , For node indexing, Indicates the first 1 node Indicates the first 1 node express and There is a direct communication connection between them. express The node feature matrix, , Represents a node eigenvectors, , Represents a node eigenvectors, Represents a node The Features in 100 dimensions.
4. The method for completing the sensitive data flow path in a power information network according to claim 1, characterized in that the step of performing path completion based on the power information network diagram using a trained topology-enhanced dynamic perception graph neural network model to obtain a path-completed power information network diagram includes: extracting the L-hop neighborhood subgraph of each node in the power information network graph using a subgraph sampling layer; obtaining the aggregated neighborhood features of each node using the neighborhood feature extraction module, based on the L-hop neighborhood subgraph of each node; filtering the L-hop neighborhood subgraph of each node using the topology filtering module to obtain the topological features of each node; fusing the topological features and aggregated neighborhood features of each node using a feature fusion layer to obtain the fused features of each node; and, based on the fused features of each node, using the path completion layer to obtain the predicted probability value of a direct communication connection between any two nodes, and completing the edges of the power information network graph according to the predicted probability values to obtain the path-completed power information network graph.
5. The method for completing the sensitive data flow path in a power information network according to claim 4, characterized in that the L-hop neighborhood subgraph $G_i^{L}$ of any node $v_i$ is expressed as: $G_i^{L} = \left(V_i^{L}, E_i^{L}, X_i^{L}\right)$, where $i$ is the node index, $v_i$ denotes the $i$-th node, $L$ denotes the total number of neighborhood levels, $V_i^{L}$ denotes the node set of $G_i^{L}$, $E_i^{L}$ denotes the edge set of $G_i^{L}$, and $X_i^{L}$ denotes the node feature matrix of $G_i^{L}$.
6. The method for completing the sensitive data flow path in a power information network according to claim 5, characterized in that, The step of obtaining the aggregated neighborhood features of each node based on its L-hop neighborhood subgraph using a neighborhood feature extraction module includes: For any node L-hop neighborhood subgraph Perform the following information aggregation operation: for The Middle Any node within the layer , in turn A unified attention head aggregation and its existing edges. Nodes within the layer The neighborhood features are used to obtain the nodes. of Each neighborhood feature; among them Indicates a domain-level index. , express The Middle Node index within a layer express The Middle In-layer and node The node index of the edge exists. The node at level 0 is a node ; Node of By concatenating the features of each neighborhood, a node is obtained. Aggregation neighborhood features; The expression for the attention head is as follows: , in, Represents a node neighborhood characteristics, This represents the ReLU activation function. express The Middle In-layer and node A set of nodes with edges. , Represents a node neighborhood characteristics, Represents a node and Attention scores between express The Middle The layer can learn linear transformation matrices; The node and Attention scores between It can be obtained through the following formula: , in, Indicates vertical series connection. Indicates the first Layer learnable weight matrix, express The Middle Layer-learnable attention matrix, express Node index in express Any node in, Represents a node eigenvectors, Represents a node The neighborhood characteristics.
7. The method for completing the sensitive data flow path in a power information network according to claim 5, characterized in that the data processing flow of the topology filtering module includes:
for the L-hop neighborhood subgraph of any node, inputting all nodes of the subgraph into each filter channel to obtain the filter value of every node in each filter channel;
determining the persistence difference of each filter channel from the filter values of all nodes in that filter channel;
based on the persistence difference and the maximum filter value of each filter channel, performing a linear transformation, a triangular transformation, a Gaussian transformation, and a top-hat transformation on each filter channel, and concatenating the four transformation results to obtain the embedding result of each filter channel;
concatenating the embedding results of all filter channels along the channel dimension to obtain the topological features of the node;
wherein the filter value of a node in a filter channel is calculated from the feature vector of the node using a Tanh activation function, a ReLU activation function, a first weight matrix, a second weight matrix, a first bias, and a second bias, the node index ranging over any node in the subgraph;
the persistence difference of a filter channel, identified by a filter channel index, is obtained from the maximum filter value of that filter channel and the filter values of the nodes in that filter channel;
the linear transformation result of a filter channel is obtained using a linear transformation weight, a linear transformation bias, a transpose operation, and vertical concatenation;
the triangular transformation result of a filter channel is obtained using a learnable peak position parameter;
the Gaussian transformation result of a filter channel is obtained using the center position of the Gaussian kernel along the maximum-filter-value dimension, the center position of the Gaussian kernel along the persistence-difference dimension, and a bandwidth;
and the top-hat transformation result of a filter channel is obtained using the center of the top-hat transformation and the radius of the top-hat transformation.
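The topology filtering flow of claim 7 can be sketched as follows. The structure (a two-layer filter with ReLU and Tanh, a per-channel maximum and persistence difference, four transformations concatenated per channel) follows the claim. The concrete formulas are assumptions, because the original expressions are not available in this text: the persistence difference is taken as maximum minus minimum filter value, and the triangular, Gaussian, and top-hat forms are common choices for embedding such pairs. All names and default parameter values are hypothetical.

```python
import math

def filter_values(x, W1, b1, W2, b2):
    """Filter values of one node, one per channel: tanh(W2 relu(W1 x + b1) + b2)."""
    hidden = [max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(W1, b1)]
    return [math.tanh(sum(w * h for w, h in zip(row, hidden)) + b)
            for row, b in zip(W2, b2)]

def channel_embedding(values, w=(1.0, 1.0), b=0.0, t=0.5,
                      mu=(0.5, 0.5), sigma=1.0, c=(0.5, 0.5), r=1.0):
    """Embed one filter channel as [linear, triangular, Gaussian, top-hat]."""
    m = max(values)        # maximum filter value of the channel
    p = m - min(values)    # persistence difference (assumed: max minus min)
    linear = w[0] * m + w[1] * p + b                 # w^T [m || p] + b
    triangular = max(0.0, p - abs(m - t))            # peak located at t (assumed form)
    gaussian = math.exp(-((m - mu[0]) ** 2 + (p - mu[1]) ** 2) / (2 * sigma ** 2))
    top_hat = 1.0 if math.hypot(m - c[0], p - c[1]) <= r else 0.0
    return [linear, triangular, gaussian, top_hat]

def topological_features(node_features, W1, b1, W2, b2):
    """Concatenate the embeddings of all filter channels of one subgraph."""
    per_node = [filter_values(x, W1, b1, W2, b2) for x in node_features]
    feats = []
    for k in range(len(b2)):  # one filter channel per row of W2
        feats.extend(channel_embedding([fv[k] for fv in per_node]))
    return feats
```

For a subgraph with K filter channels, `topological_features` returns a vector of length 4K, one four-element embedding per channel.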
8. The method for completing the sensitive data flow path in a power information network according to claim 4, characterized in that the fusion features of each node are obtained from the following quantities: the topological features of the node; the topological features of the node after linear transformation; a learnable first fusion weight matrix; a learnable second fusion weight matrix; a first fusion bias vector; a second fusion bias vector; and the aggregated neighborhood features of the node.
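One plausible reading of the fusion in claim 8 is sketched below: the topological features are first passed through a linear layer (first fusion weight matrix and bias), then concatenated with the aggregated neighborhood features and passed through a second linear layer. The claim lists these quantities, but how they are combined is an assumption here, and the names `affine` and `fuse` are hypothetical.

```python
def affine(W, x, b):
    """Compute W x + b for a matrix W (list of rows) and vectors x, b."""
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]

def fuse(h_agg, t_topo, W1, b1, W2, b2):
    """Fuse aggregated neighborhood features with topological features."""
    # Linearly transformed topological features.
    t_lin = affine(W1, t_topo, b1)
    # Assumed combination: affine map of the concatenation [h_agg || t_lin].
    return affine(W2, h_agg + t_lin, b2)
```

For example, `fuse([1.0, 2.0], [3.0], [[2.0]], [1.0], [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]], [0.0, 0.5])` maps the topological feature 3.0 to 7.0 and then selects and shifts components of the concatenated vector.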
9. The method for completing the sensitive data flow path in a power information network according to claim 4, characterized in that the data processing flow of the path completion layer includes:
constructing a candidate set of node pairs based on the power information network graph, each candidate node pair consisting of two nodes of the power information network graph between which no edge exists;
traversing each candidate node pair in the candidate set, and concatenating the fusion features of its two nodes to obtain the joint feature vector of the candidate node pair;
inputting the joint feature vector of the candidate node pair into an MLP classifier for a classification decision, to obtain the predicted probability value of the candidate node pair;
traversing the predicted probability values of all candidate node pairs and, for each candidate node pair whose predicted probability value satisfies a preset probability threshold, establishing a connecting edge between its two nodes in the power information network graph.
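The path completion layer of claim 9 can be sketched as a loop over non-adjacent node pairs. This is a minimal illustration under stated assumptions: edges are treated as undirected, the threshold test is written as `>=`, and the one-hidden-layer classifier with a sigmoid output stands in for the MLP, whose actual architecture is not given in the claim. The names `mlp_probability` and `complete_paths` are hypothetical.

```python
import math
from itertools import combinations

def mlp_probability(x, W1, b1, w2, b2):
    """Tiny MLP classifier: one ReLU hidden layer, sigmoid output."""
    hidden = [max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(W1, b1)]
    logit = sum(w * h for w, h in zip(w2, hidden)) + b2
    return 1.0 / (1.0 + math.exp(-logit))

def complete_paths(nodes, edges, fused, classify, threshold=0.5):
    """Return the node pairs for which a connecting edge should be added.

    nodes:    list of node identifiers
    edges:    iterable of existing (u, v) pairs, treated as undirected
    fused:    dict mapping node identifier -> fusion feature list
    classify: callable mapping a joint feature vector to a probability
    """
    existing = {frozenset(e) for e in edges}
    added = []
    for u, v in combinations(nodes, 2):
        if frozenset((u, v)) in existing:
            continue                      # only pairs without an edge are candidates
        joint = fused[u] + fused[v]       # concatenation of the two fusion features
        if classify(joint) >= threshold:  # assumed form of the threshold test
            added.append((u, v))
    return added
```

Enumerating all non-adjacent pairs is quadratic in the number of nodes, so a practical system would likely restrict the candidate set, which the claim leaves open.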
10. A system for completing the sensitive data flow path in a power information network, characterized in that it comprises:
a power system state estimation module, used to estimate the power system state using the weighted least squares method based on acquired power information network operation data, and to obtain a standardized system state vector;
a graph construction module, used to construct a power information network graph based on the standardized system state vector and acquired power information network topology data;
a path completion module, used to perform path completion on the power information network graph using a trained topology-enhanced dynamic perception graph neural network model, and to obtain a path-completed power information network graph.
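The state estimation module of claim 10 relies on weighted least squares. The sketch below shows the textbook closed form for a linear measurement model z = Hx + e, where the estimate solves (HᵀWH)x = HᵀWz with a diagonal weight matrix W. The linear model, the z-score standardization, and the function names are assumptions for illustration. Practical power system state estimation is usually nonlinear and solved iteratively, and the claim does not specify how the state vector is standardized.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def wls_estimate(H, z, weights):
    """Weighted least squares estimate for z = H x + e with diagonal weights."""
    n = len(H[0])
    # Gain matrix G = H^T W H and right-hand side H^T W z.
    G = [[sum(w * row[i] * row[j] for w, row in zip(weights, H))
          for j in range(n)] for i in range(n)]
    rhs = [sum(w * row[i] * zi for w, row, zi in zip(weights, H, z))
           for i in range(n)]
    return solve(G, rhs)

def standardize(x):
    """Z-score standardization of a state vector (assumed normalization)."""
    mean = sum(x) / len(x)
    std = (sum((v - mean) ** 2 for v in x) / len(x)) ** 0.5
    return [(v - mean) / std for v in x] if std > 0 else [0.0] * len(x)
```

Weights are typically chosen as the inverse measurement variances, so that more accurate measurements influence the estimate more strongly.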