A substation equipment fault association rule mining and risk early warning method based on a federated temporal knowledge graph

By using federated temporal knowledge graph technology, the problems of data privacy and security and cross-site collaborative modeling in substation equipment fault early warning are solved. It achieves high accuracy and interpretability of fault association rule mining and early warning, supporting intelligent operation and maintenance of large-scale substation clusters and power grid safety and stability.

CN122221202A · Pending · Publication Date: 2026-06-16 · TIANYU SPACE (BEIJING) TECHNOLOGY CO LTD

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
TIANYU SPACE (BEIJING) TECHNOLOGY CO LTD
Filing Date
2026-03-24
Publication Date
2026-06-16

AI Technical Summary

Technical Problem

Existing substation equipment fault early warning technologies cannot simultaneously ensure data privacy and security while maintaining cross-site collaborative modeling. They cannot accurately capture the temporal evolution patterns of equipment faults, have insufficient accuracy in fault association rule mining, and exhibit poor interpretability and generalization ability in early warning. Consequently, they are unable to support the intelligent operation and maintenance of large-scale substation clusters and the safe and stable operation of the power grid.

Method used

A federated temporal knowledge graph-based approach is adopted, which uses a federated learning architecture that coordinates multiple substation edge nodes and power grid dispatch center cloud nodes to achieve cross-site temporal knowledge graph collaborative construction and distributed mining of fault association rules. Vertical federated learning, homomorphic encryption and secure multi-party computation technologies are used to ensure data privacy, and improved temporal constraint algorithms are combined to mine equipment fault association rules.

Benefits of technology

It achieves advanced graded early warning of equipment failures with high accuracy and strong interpretability, accurately identifies the propagation patterns of cascading failures, supports intelligent operation and maintenance of substations, and ensures the safe and stable operation of the power grid.

✦ Generated by Eureka AI based on patent content.

Smart Images

  • Figure CN122221202A_ABST
Patent Text Reader

Abstract

This invention discloses a method for mining fault association rules and providing risk early warning for substation equipment based on a federated temporal knowledge graph, belonging to the interdisciplinary field of intelligent power operation and maintenance and AI. This method constructs a vertical federated architecture with edge nodes and cloud collaboration, encompassing six core processes: data preprocessing, local temporal knowledge graph construction, federated graph collaborative construction, fault rule federated mining, risk classification and early warning, and incremental model updates. Nodes communicate via homomorphic encryption and secure multi-party computation encryption, with raw data stored locally throughout the process to ensure privacy and security. This method can achieve standardization of multi-source heterogeneous data, privacy-preserving collaborative computation, fault cascading pattern mining, and fault propagation deduction. It features strong privacy protection, high mining accuracy, comprehensive fault evolution capture, and precise and efficient early warning, making it suitable for intelligent operation and maintenance of multi-regional substation clusters, hidden danger investigation, and cascading risk prevention and control, ensuring the safe and stable operation of the power grid.

Description

Technical Field

[0001] This invention belongs to the field of power system automation and intelligent operation and maintenance technology. It specifically relates to substation equipment fault diagnosis, risk warning, and fault association rule mining, and also draws on interdisciplinary artificial intelligence technologies such as temporal knowledge graph construction, federated learning, privacy-preserving computing, and distributed time-series data mining.

Background Technology

[0002] With the continuous advancement of the "dual carbon" target, the new power system is characterized by a high proportion of new energy integration and a high proportion of power electronic equipment. Substations are the core hubs of power grid energy flow, voltage transformation, and fault isolation, and the operational reliability of their primary and secondary equipment directly determines the power supply security and stable operation level of the regional power grid. Currently, China's substations have fully completed digital and intelligent transformation. Core equipment such as transformers, circuit breakers, instrument transformers, and surge arresters is equipped with large numbers of online monitoring terminals, SCADA data acquisition systems, and fault recording devices. These continuously generate massive amounts of multi-source heterogeneous data, including equipment operation time-series parameters, fault event ledgers, operation and maintenance records, power grid topology data, and environmental parameter data, which provide a complete data foundation for accurate perception of substation equipment status, mining of fault evolution patterns, and advanced risk warning. At the same time, the electrical coupling among substation clusters within the regional power grid continues to deepen. A single equipment failure in a single substation can easily trigger a chain of failures across equipment and substations through the power grid topology, posing a significant threat to the safe and stable operation of the large power grid. This places higher requirements on substation equipment fault early warning technology, namely "advanced early warning, accurate fault tracing, cross-domain collaborative prevention and control, and data privacy and security".

[0003] Traditional substation equipment fault early warning and diagnosis technologies fall into three categories: alarm methods based on fixed thresholds, methods based on physical mechanism modeling, and centralized machine learning methods. Alarm methods based on fixed thresholds can only raise post-event alarms after equipment parameters exceed their limits, and fail to capture the gradual temporal evolution of equipment from the onset of an early anomaly to fault triggering. This results in serious false alarms and missed alarms and leaves these methods without any ability to provide early fault warnings. Methods based on physical mechanism modeling rely heavily on the fault evolution mechanism and electrical physical characteristics of the equipment. However, substation equipment types are complex and operating conditions vary greatly, so fault mechanisms differ significantly across equipment and scenarios, making mechanism modeling for all scenarios and all equipment types extremely difficult. These models also generalize poorly and cannot adapt to the standardized operation and maintenance requirements of large-scale substation clusters. Centralized machine learning methods can mine the mapping between equipment status and fault events in a data-driven manner, but they are limited by the industry-wide scarcity of fault samples in a single substation: model training is insufficient, generalization is weak, and only feature correlations within a single device can be mined. They cannot characterize the electrical connections between devices, fault causal relationships, or cross-domain propagation characteristics, making it difficult to identify the risk of cascading faults. At the same time, such models are black boxes: the fault reasoning process is not transparent, and they cannot provide operation and maintenance personnel with clear fault causes or a basis for handling them, which makes them difficult to apply in actual engineering operation and maintenance scenarios.

[0004] In recent years, knowledge graph technology, with its powerful entity relationship representation and semantic reasoning capabilities, has gradually been applied to substation fault diagnosis, providing a feasible path to address the poor interpretability of fault reasoning. However, existing knowledge graph-based substation fault diagnosis solutions still have significant technical shortcomings. First, most existing solutions construct static knowledge graphs, which can only represent the static associations between equipment, faults, and maintenance events, and cannot depict how equipment state parameters and entity associations change over time. The initiation, evolution, and triggering of substation equipment faults have strict temporal dependencies and time window constraints, so static knowledge graphs cannot capture the temporal causal relationships and evolutionary patterns of faults. Furthermore, fault association rules are often mined with algorithms that have no temporal constraints; these can only mine static co-occurrence relationships between entities, easily generate a large number of pseudo-association rules, and fail to support early warning of faults. Second, existing methods often employ a centralized graph construction model, which requires the raw operating data and fault data of the various substations to be aggregated at a central node for modeling. However, substation operating data involves core power grid operational safety, corporate trade secrets, and user electricity privacy. Since substations belong to different operation and maintenance entities, there is a serious data silo problem: raw data cannot be shared directly across entities, the centralized construction model carries a very high risk of privacy leakage, and cross-domain knowledge fusion for multi-substation clusters cannot be achieved. Third, because fault samples at a single site are scarce and the centralized solution cannot effectively use fault samples from multiple sites, it is difficult to discover common equipment fault association rules with industry-wide applicability, and the model's cross-scenario adaptability is severely insufficient.

[0005] Federated learning, as a distributed collaborative modeling technology under a privacy protection framework, can complete cross-node collaborative model training without disclosing local original data among data holders, providing a technical direction for resolving the core contradiction between data silos and privacy protection in power systems. While some federated learning solutions have been applied to scenarios such as power system load forecasting and equipment status identification, existing solutions still have significant technical shortcomings: First, most existing federated learning solutions only train models on structured time-series data and do not deeply integrate with knowledge graph technology. They cannot utilize the prior knowledge of substations' inherent power grid topology, electrical connections, and fault causality, resulting in black-box models with opaque fault reasoning processes that cannot provide interpretable evidence for operation and maintenance decisions. Second, a few knowledge graph solutions based on federated learning still focus on the federated construction and entity alignment of static knowledge graphs, without introducing temporal constraints and dynamic representations. They cannot achieve distributed collaborative construction of temporal knowledge graphs, nor can they complete the distributed mining of fault association rules with temporal constraints. They cannot adapt to the temporal evolution characteristics of substation equipment faults, and the lead time, accuracy, and generalization ability of early warnings cannot meet the needs of actual engineering. Third, existing solutions cannot realize the deduction of fault cascading propagation paths and collaborative risk early warning across equipment and substations, making it difficult to meet the needs of comprehensive prevention and control of cascading faults under new power systems.

[0006] Existing technologies cannot simultaneously solve the three core technical bottlenecks in the field of substation equipment fault early warning: First, they cannot break down data silos between multiple substations and achieve knowledge fusion and collaborative modeling across operation and maintenance entities while ensuring data privacy and security; second, they cannot effectively characterize the dynamic temporal evolution characteristics of equipment status and fault correlation, making it difficult to accurately mine equipment fault correlation rules with time constraints and cross-domain cascading fault propagation patterns; and third, they cannot achieve advanced hierarchical early warning of equipment faults with high accuracy, strong interpretability, and wide adaptability, making it difficult to support intelligent operation and maintenance of large-scale substation clusters and ensure the safe and stable operation of large power grids.

[0007] Therefore, developing a privacy-secure, cross-domain collaborative technology that can accurately capture the temporal evolution of faults in substation equipment for fault association rule mining and risk warning has become an urgent technical problem to be solved in this field.

Summary of the Invention

[0008] The purpose of this invention is to overcome the core technical bottlenecks of existing substation equipment fault early warning technologies, such as the inability to simultaneously ensure data privacy and security while maintaining cross-site collaborative modeling, the inability to accurately capture the temporal evolution patterns of equipment faults, insufficient accuracy in fault association rule mining, and poor interpretability and generalization of early warnings. This invention provides a method and system for substation equipment fault association rule mining and risk early warning based on a federated temporal knowledge graph. While ensuring that the original data of each substation remains locally and that data privacy is absolutely secure, this invention breaks down data silos between multiple substations, achieving collaborative construction of temporal knowledge graphs and fusion of fault knowledge across operation and maintenance entities. Through a dynamic temporal knowledge graph, it fully represents the temporal evolution characteristics of equipment faults, accurately mining time-constrained equipment fault association rules and cascading fault propagation patterns. Ultimately, it achieves advanced hierarchical early warning of equipment faults with high accuracy, strong interpretability, and wide adaptability, providing core technical support for intelligent operation and maintenance of large-scale substation clusters and prevention and control of power grid cascading faults.

[0009] To achieve the above objectives, the technical solution adopted by the present invention is as follows:

[0010] A method for substation equipment fault association rule mining and risk early warning based on federated temporal knowledge graph is implemented based on a federated learning architecture that coordinates multiple substation edge nodes and power grid dispatch center cloud nodes. The method is characterized by the following steps:

[0011] S1 Local Data Preprocessing: Each substation edge node collects multi-source heterogeneous time-series operation data, fault event data and operation and maintenance data of the equipment in the station, and completes data cleaning, time-series alignment, feature normalization and entity relationship extraction to obtain a standardized local dataset.

[0012] S2 Local Temporal Knowledge Graph Construction: Based on a standardized local dataset, each substation edge node constructs a local temporal knowledge graph for substation equipment, defining the graph ontology layer, entity layer, relation layer, and temporal attribute layer, and generating a dynamic local graph with time dimension labels;

[0013] S3 Federated Temporal Knowledge Graph Collaborative Construction: Based on the vertical federated learning framework, each edge node, without leaking local original data and graph details, collaborates with cloud nodes to complete cross-node entity alignment, relationship fusion, and federated aggregation of temporal features, generating a global federated temporal knowledge graph covering multiple substations.

[0014] S4 Fault Temporal Association Rule Federated Mining: Based on a global federated temporal knowledge graph, a distributed association rule mining algorithm with temporal constraints is used to mine temporal association rules between equipment status features and fault events, cross-equipment fault events, and cross-site fault events. After filtering by support, confidence, and lift thresholds, a substation equipment fault association rule library is constructed.

[0015] S5 Equipment Fault Risk Classification and Early Warning: Each edge node collects the operation time-series data of the equipment in the station in real time, maps it to the local temporal knowledge graph to complete the real-time status update, and combines it with the fault association rule library to complete local real-time fault risk inference. At the same time, the risk inference features are encrypted and uploaded to the cloud node, and cross-node collaborative verification is completed through the global federated temporal knowledge graph to output the graded fault risk early warning result.

[0016] S6 Model and Graph Incremental Iterative Update: Based on the newly added equipment operation data, fault event records and early warning result feedback, each edge node completes the incremental update of the local time-series knowledge graph, and completes the iterative optimization of the global federated time-series knowledge graph and the dynamic update of the fault association rule base through the federated learning framework.

[0017] The preferred method is characterized in that, in step S1, the multi-source heterogeneous time-series data includes SCADA real-time operation data, online monitoring status data, fault waveform data, environmental parameter data, historical fault event ledgers, and operation and maintenance record data of transformers, circuit breakers, disconnect switches, instrument transformers, and surge arresters in the substation; the data cleaning includes missing value filling, outlier removal, and data deduplication; the time-series alignment uses a unified timestamp granularity to synchronize the time dimension of the multi-source data; and the entity relationship extraction uses a pre-trained language model to extract equipment entities, status feature entities, fault event entities, and operation and maintenance event entities, as well as extract causal relationships, time-series relationships, subordinate relationships, and electrical connection relationships between entities.
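
The cleaning and normalization operations named above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names, the median-absolute-deviation outlier rule, the forward-fill strategy, and the threshold `k` are all assumptions introduced here, since the text only names the operations (missing value filling, outlier removal, deduplication, feature normalization).

```python
from statistics import median


def clean_series(samples, k=5.0):
    """Deduplicate, drop outliers, and fill gaps in one measurement series.

    samples: list of (timestamp_seconds, value_or_None) tuples.
    k: assumed outlier threshold in units of median absolute deviation.
    """
    # Deduplication: keep the first reading seen for each timestamp.
    seen = {}
    for ts, value in samples:
        seen.setdefault(ts, value)
    ordered = sorted(seen.items())

    # Outlier removal (assumed rule): blank out values far from the median.
    present = [v for _, v in ordered if v is not None]
    med = median(present)
    mad = median(abs(v - med) for v in present) or 1e-9
    cleaned = [
        (ts, None if v is not None and abs(v - med) / mad > k else v)
        for ts, v in ordered
    ]

    # Missing-value filling (assumed rule): forward fill, seeded with the median.
    filled, last = [], med
    for ts, v in cleaned:
        last = v if v is not None else last
        filled.append((ts, last))
    return filled


def min_max_normalize(series):
    """Scale the values of a cleaned series into the range [0, 1]."""
    values = [v for _, v in series]
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0
    return [(ts, (v - lo) / span) for ts, v in series]
```

For example, a transformer oil temperature series with one duplicated reading, one gap, and one spike would come out with the duplicate dropped and both the gap and the spike replaced by the last valid reading.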

[0018] The preferred method is characterized in that, in step S2, the ontology layer of the local time-series knowledge graph defines the substation equipment domain ontology, including ontology concepts of equipment type, state parameters, fault type, and operation and maintenance events, and the hierarchical relationships between these concepts; the entity layer consists of instance entities corresponding to the ontology concepts, including equipment entities, measurement point entities, fault entities, operation and maintenance entities, and environment entities; the relationship layer defines the static and dynamic relationships between entities, with static relationships including electrical connection relationships, subordinate relationships, and assembly relationships, and dynamic relationships including time-series causal relationships, fault evolution relationships, and state association relationships; the time-series attribute layer adds timestamp tags and time-series feature attributes to entities and relationships, records the dynamic changes in entity states and relationships, and forms a dynamic time-series knowledge graph updated by time slices.
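
One way to picture the four layers is as a small in-memory data structure in which every relation carries a validity interval. The sketch below is illustrative only: the class names, field names, and the `snapshot` query are assumptions, and a production graph store would look different.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class TemporalFact:
    """Relation layer entry plus its temporal attributes."""
    head: str
    relation: str
    tail: str
    start: int                 # time from which the relation holds
    end: Optional[int] = None  # None means the relation still holds


@dataclass
class TemporalKG:
    ontology: dict = field(default_factory=dict)  # ontology layer: concept -> parent
    entities: dict = field(default_factory=dict)  # entity layer: entity id -> concept
    facts: list = field(default_factory=list)     # relation + temporal attribute layers

    def add_entity(self, entity_id, concept):
        if concept not in self.ontology:
            raise KeyError(f"unknown ontology concept: {concept}")
        self.entities[entity_id] = concept

    def add_fact(self, head, relation, tail, start, end=None):
        self.facts.append(TemporalFact(head, relation, tail, start, end))

    def snapshot(self, t):
        """Return the facts valid at time t (one time slice of the graph)."""
        return [f for f in self.facts
                if f.start <= t and (f.end is None or t < f.end)]
```

Static relations such as electrical connections can be stored with an open-ended interval, while dynamic relations such as state associations get a bounded one, so a time-slice query returns both kinds together.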

[0019] The preferred method is characterized in that, in step S3, the vertical federated learning framework employs homomorphic encryption and secure multi-party computation techniques to achieve privacy protection, and the specific steps of cross-node entity alignment, relationship fusion, and temporal feature federated aggregation include:

[0020] S31 Local feature encryption and upload: Each edge node encrypts the entity, relationship, and temporal features of its local graph and generates encrypted feature vectors, which are uploaded to the cloud node. The local original data and graph details do not leave the local node.

[0021] S32 encrypted entity alignment and relationship fusion: Based on encrypted feature vectors, cloud nodes complete the alignment and matching of entities with the same name and synonyms across nodes, as well as the consistency verification and fusion of relationships between entities, to generate the ontology and entity relationship framework of the global graph.

[0022] S33 Temporal Feature Federated Aggregation: The cloud nodes use a federated averaging algorithm to securely aggregate the encrypted temporal feature vectors uploaded by each edge node, generate global temporal feature parameters, and distribute them to each edge node.

[0023] S34 distributed graph synchronous update: each edge node updates the temporal attribute layer of its local temporal knowledge graph based on global temporal feature parameters, and works with cloud nodes to maintain the distributed storage and synchronous update of the global federated temporal knowledge graph.
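
The aggregation in step S33 can be illustrated with a plain federated averaging sketch. It operates on unencrypted vectors for readability, whereas the method described here aggregates encrypted vectors. Weighting each node by its sample count follows the usual federated averaging formulation and is an assumption, as the text does not state the weighting.

```python
def federated_average(updates):
    """Weighted average of per-node temporal feature vectors.

    updates: list of (num_local_samples, feature_vector) pairs, one per edge node.
    Returns the global feature vector that would be sent back to every node.
    """
    total = sum(n for n, _ in updates)
    dim = len(updates[0][1])
    return [sum(n * vec[i] for n, vec in updates) / total for i in range(dim)]
```

A node with three times as many local samples pulls the global parameters three times as strongly, which is one reason sites with few fault samples can benefit from the aggregate.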

[0024] The preferred method is characterized in that the homomorphic encryption adopts the Paillier partially (additively) homomorphic encryption algorithm, and the secure multi-party computation adopts secret sharing technology, ensuring that neither cloud nodes nor edge nodes can reverse-engineer the original data and graph details of other nodes at any point in the collaboration process.
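
The property that makes Paillier suitable for this kind of aggregation is that multiplying two ciphertexts yields an encryption of the sum of the plaintexts. The toy sketch below demonstrates only that property. The tiny fixed primes are an assumption chosen for readability and offer no security, and who holds the private key is not specified in the text; a real deployment would use a vetted cryptographic library with keys of at least 2048 bits.

```python
import math
import secrets


def keygen(p=1009, q=1013):
    """Toy key generation with small fixed primes (illustration only)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)  # modular inverse, valid because g = n + 1
    return n, (n, lam, mu)


def encrypt(n, m):
    """Encrypt integer m (0 <= m < n) under public key n."""
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2


def decrypt(private_key, c):
    n, lam, mu = private_key
    u = pow(c, lam, n * n)
    return ((u - 1) // n * mu) % n


def add_encrypted(n, c1, c2):
    """Homomorphic addition: the product of ciphertexts encrypts the sum."""
    return (c1 * c2) % (n * n)
```

In this picture an aggregator can combine encrypted itemset counts or integer-scaled features from several edge nodes without ever seeing an individual node's value; only the holder of the private key can read the total.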

[0025] The preferred method is characterized in that, in step S4, the distributed association rule mining algorithm with time-series constraints is an improved time-series Apriori algorithm, specifically including:

[0026] S41 Time-series transaction itemset construction: Based on the global federated time-series knowledge graph, the device status time-series features and fault events in the graph are transformed into transaction itemsets with time windows, and the time-series constraint window and the temporal order constraint of fault events are defined.

[0027] S42 Local frequent itemset parallel mining: A distributed computing framework is adopted to mine frequent itemsets from the local transaction itemsets in parallel on each edge node, and the encrypted frequent itemset supports are uploaded to the cloud node.

[0028] S43 global frequent itemset aggregation and filtering: cloud nodes complete the secure aggregation and filtering of global frequent itemsets, filter out global frequent itemsets that meet the minimum support threshold, and distribute them to each edge node;

[0029] S44 Strong Association Rule Filtering and Rule Base Construction: Each edge node generates association rules with time-series constraints based on global frequent itemsets, calculates the confidence and lift of each rule, filters out strong association rules that meet the minimum confidence and minimum lift thresholds, uploads them to cloud nodes to complete global aggregation and deduplication, and constructs a substation equipment fault association rule base.
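
The effect of the time window and temporal order constraints on support, confidence, and lift can be seen in a simplified single-node sketch. It mines only pairwise rules of the form "A is followed by B within the window", not the full level-wise Apriori search, and it omits the encrypted federated aggregation of steps S42 and S43. The function name, the episode format, and the rule dictionary layout are assumptions.

```python
from itertools import permutations


def mine_temporal_rules(episodes, window, min_support, min_confidence, min_lift):
    """Mine rules A -> B where B occurs after A and within `window` seconds.

    episodes: list of transactions, each a list of (timestamp, item) pairs.
    """
    n = len(episodes)
    item_count, pair_count = {}, {}
    for events in episodes:
        times = {}
        for ts, item in events:
            times.setdefault(item, []).append(ts)
        for item in times:
            item_count[item] = item_count.get(item, 0) + 1
        # Order constraint: A must strictly precede B; window constraint: gap <= window.
        for a, b in permutations(times, 2):
            if any(0 < tb - ta <= window for ta in times[a] for tb in times[b]):
                pair_count[(a, b)] = pair_count.get((a, b), 0) + 1

    rules = []
    for (a, b), count in pair_count.items():
        support = count / n
        confidence = count / item_count[a]
        lift = confidence / (item_count[b] / n)
        if support >= min_support and confidence >= min_confidence and lift >= min_lift:
            rules.append({"if": a, "then": b, "support": support,
                          "confidence": confidence, "lift": lift})
    return sorted(rules, key=lambda r: (-r["lift"], r["if"], r["then"]))
```

Because the reverse ordering (fault before symptom) never counts toward a rule, co-occurrences that run in the wrong temporal direction are excluded, which is how the ordering constraint suppresses pseudo-association rules.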

[0030] The preferred method is characterized in that the timing constraint window is set according to the evolution characteristics of substation equipment fault types and is divided into instantaneous fault window, short-term fault window and long-term fault evolution window; the strong correlation rules include timing correlation rules between internal state parameters of a single device and fault events, timing correlation rules for fault cascading across devices within the same substation, and timing correlation rules for power grid fault propagation across substations.

[0031] The preferred method is characterized in that, in step S5, the specific steps for graded fault risk early warning include:

[0032] S51 real-time status update and subgraph generation: Each edge node maps the real-time collected device runtime sequence data to the local time-series knowledge graph to complete the real-time update of the attributes of the corresponding device entity and status entity.

[0033] S52 Local fault risk reasoning: Based on the fault association rule base, subgraph matching and link prediction algorithms are used to perform real-time reasoning of local fault risks, calculate the probability of fault occurrence and the risk level, and generate local risk reasoning results.

[0034] S53 global collaborative verification and chain failure simulation: Each edge node encrypts the feature vector of local risk reasoning and uploads it to the cloud node. The cloud node, based on the global federated temporal knowledge graph and combined with the cross-site and cross-device relationships, completes the global collaborative verification of risk reasoning results and the simulation of failure propagation paths.

[0035] S54 Graded early warning result output: The cloud node sends the globally verified early warning results to the corresponding edge node. The edge node outputs the early warning level, fault type, early warning time window, and handling suggestions for the corresponding device. The early warning level is divided into Level 1, Level 2, Level 3, and Level 4.

[0036] The preferred method is characterized in that the first-level warning is an emergency warning of an impending fault, the second-level warning is a high-probability fault risk warning, the third-level warning is an abnormal state trend warning, and the fourth-level warning is a normal state indication.
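
A simple way to turn a rule match into one of the four levels is sketched below. The text does not give numeric thresholds, so the probability cut-offs and the "imminent" horizon are hypothetical values. Using a rule's confidence as the fault probability is also an assumption standing in for the subgraph matching and link prediction described in step S52.

```python
from dataclasses import dataclass


@dataclass
class EarlyWarning:
    device: str
    fault_type: str
    level: int        # 1 = imminent fault ... 4 = normal state
    window_s: float   # remaining early-warning time window in seconds


def grade(probability, lead_time_s, imminent_s=600, high=0.7, trend=0.3):
    """Map a fault probability and remaining lead time to a warning level.

    All three thresholds are hypothetical defaults, not values from the source.
    """
    if probability >= high and lead_time_s <= imminent_s:
        return 1  # emergency warning of an impending fault
    if probability >= high:
        return 2  # high-probability fault risk warning
    if probability >= trend:
        return 3  # abnormal state trend warning
    return 4      # normal state indication


def infer_warnings(device, observed, rules, now):
    """observed: {state_item: last_seen_ts}; rules: dicts with if/then/confidence/window_s."""
    warnings = []
    for rule in rules:
        seen_at = observed.get(rule["if"])
        if seen_at is None:
            continue
        remaining = rule["window_s"] - (now - seen_at)
        if remaining <= 0:
            continue  # antecedent is older than the rule's time window
        level = grade(rule["confidence"], remaining)
        warnings.append(EarlyWarning(device, rule["then"], level, remaining))
    return sorted(warnings, key=lambda w: w.level)
```

The matched rule itself doubles as the explanation shown to operators: the antecedent is the observed cause, the consequent is the predicted fault type, and the remaining window is the early-warning horizon.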

[0037] The preferred method is characterized in that, in step S6, the incremental iterative update cycle is divided into real-time incremental update and periodic full update. The real-time incremental update completes the real-time update of entities, relationships and attributes of the local graph based on the newly added equipment operation data and maintenance records. The periodic full update is executed at a fixed cycle. Based on the newly added fault events and early warning feedback results within the cycle, the global iterative optimization of the federated time-series knowledge graph, the addition, correction and removal of association rule bases are completed, and the adaptability of association rules and the accuracy of risk warning are continuously improved.
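
The periodic addition, correction, and removal of rules can be sketched as a feedback loop over per-rule hit counts. The bookkeeping format (`hits` and `fires` per rule), the function name, and the removal threshold are assumptions for illustration; the text only states that rules are added, corrected, and removed based on new fault events and warning feedback.

```python
def update_rule_base(rule_base, feedback, new_rules, min_confidence=0.5):
    """Return an updated copy of the rule base after one periodic update.

    rule_base / new_rules: {(antecedent, consequent): {"hits": int, "fires": int}}
    feedback: list of ((antecedent, consequent), confirmed) pairs, where
              confirmed is True if the warned fault actually occurred.
    """
    updated = {key: dict(stats) for key, stats in rule_base.items()}

    # Correction: fold warning feedback into each rule's running counts.
    for key, confirmed in feedback:
        stats = updated.get(key)
        if stats is None:
            continue
        stats["fires"] += 1
        stats["hits"] += int(confirmed)

    # Addition: newly mined rules enter the base if not already present.
    for key, stats in new_rules.items():
        updated.setdefault(key, dict(stats))

    # Removal: drop rules whose observed confidence fell below the threshold.
    return {k: s for k, s in updated.items()
            if s["hits"] / s["fires"] >= min_confidence}
```

Returning a new dictionary rather than mutating the input keeps the previous rule base available for comparison or rollback between update cycles.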

[0038] Preferably, a substation equipment fault association rule mining and risk early warning system based on federated temporal knowledge graph is characterized by comprising edge computing nodes deployed in each substation and a cloud central node deployed in the power grid dispatch center. The system is used to execute the substation equipment fault association rule mining and risk early warning method based on federated temporal knowledge graph as described in any one of claims 1-10. The edge computing nodes include a data acquisition module, a local graph construction module, a local inference early warning module, and a federated communication encryption module. The cloud central node includes a federated collaborative computing module, a global graph management module, an association rule management module, and a global early warning verification module.

[0039] The technical effects achieved by this invention are as follows:

[0040] First, this invention resolves the core contradiction between data silos and privacy protection, enabling cross-site collaborative modeling. Based on vertical federated learning combined with homomorphic encryption and secure multi-party computation, it completes the collaborative construction of cross-site temporal knowledge graphs and the distributed mining of fault association rules without raw data leaving the local site. This avoids privacy leaks, breaks down data barriers, and uses multi-site fault samples to address the insufficient model generalization caused by the scarcity of single-site samples. Second, it overcomes the limitations of static knowledge graphs and accurately represents the temporal evolution characteristics of equipment faults. By adding time dimension labels to a four-layer dynamic temporal knowledge graph, it depicts the dynamic changes in equipment states and relationships, providing a semantic and temporal foundation for fault association rule mining and early warning. Third, it improves the accuracy and effectiveness of fault association rule mining and accurately identifies cascading fault propagation patterns. By employing a temporally constrained distributed association rule mining algorithm with temporal constraint windows and temporal order constraints, it eliminates pseudo-association rules and mines the temporal causal relationships of faults within single devices, across devices, and across substations, reconstructing the power grid fault propagation path. Fourth, it achieves high-accuracy and highly interpretable advanced hierarchical early warning, supporting intelligent operation and maintenance of substations. Fault risk reasoning is performed based on subgraph matching and link prediction over the temporal knowledge graph, and the output includes an interpretable fault cause, evolution path, early warning time, and handling suggestions. By combining local real-time reasoning with global collaborative verification, the real-time performance and accuracy of early warning are ensured, the risk of equipment failure and cascading failure is reduced, and support is provided for the safe and stable operation of the power grid.

Attached Figure Description

[0041] To more clearly illustrate the technical solutions of the embodiments of the present invention, the accompanying drawings used in the embodiments will be briefly described below. It should be understood that the following drawings only show some embodiments of the present invention and should not be regarded as a limitation of the scope. For those skilled in the art, other related drawings can be obtained based on these drawings without creative effort.

[0042] Figure 1 is a flowchart of the substation equipment fault association rule mining and risk early warning method based on a federated temporal knowledge graph provided in an embodiment of the present invention.

[0043] Figure 2 is a flowchart of the federated temporal knowledge graph collaborative construction in the method provided in an embodiment of the present invention.

[0044] Figure 3 is a flowchart of the federated mining of fault temporal association rules in the method provided in an embodiment of the present invention.

[0045] Figure 4 is a flowchart of the equipment fault risk classification and early warning in the method provided in an embodiment of the present invention.

Detailed Implementation

[0046] The technical solution of the present invention will be clearly and completely described in detail below with reference to specific embodiments. The described embodiments are only preferred embodiments of the present invention, and not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.

[0047] This specific implementation discloses a method for mining and risk warning of substation equipment fault association rules based on federated temporal knowledge graphs. It is implemented on a vertical federated learning architecture comprising multiple substation edge nodes and power grid dispatch center cloud nodes. Paillier partially homomorphic (semi-homomorphic) encryption and secure multi-party computation are used throughout the process to protect privacy. The local original data and graph details of each substation edge node never leave that node. Only the encrypted feature vectors and intermediate calculation results are uploaded, avoiding the risk of original data leakage.

[0048] In this embodiment, the substation equipment fault association rule mining and risk warning method based on federated time-series knowledge graph includes:

[0049] As shown in Figure 1, in step S1, local data preprocessing, each substation's edge computing node collects multi-source heterogeneous data from the core equipment within the substation via the IEC 61850 protocol and completes standardized preprocessing. The specific implementation details are as follows:

[0050] Data acquisition scope: the collected data is divided into six categories, specifically:
Equipment operation time-series data: transformer oil temperature, winding temperature, dissolved gas content in oil, load current, and winding DC resistance; circuit breaker opening and closing times, energy storage motor current, contact temperature, and opening and closing coil current; and the core operating parameters of equipment such as disconnecting switches, instrument transformers, and surge arresters, collected once per minute;
Fault event data: historical fault logs, fault waveform data, and protection action event records within the station, including fields such as fault occurrence time, faulty equipment, fault type, fault cause, and handling result;
Operation and maintenance data: equipment inspection records, maintenance records, test reports, and defect records, including fields such as operation and maintenance time, operation and maintenance object, operation and maintenance content, and equipment status assessment results;
Power grid topology data: electrical wiring diagrams of primary equipment in the station, electrical connection relationships between equipment, and power grid topology relationships between upper-level and lower-level substations;
Environmental parameter data: ambient temperature, humidity, rainfall, lightning strike records, and SF6 gas leakage monitoring data in the station;
Protection setting data: operating settings and alarm thresholds of relay protection devices in the station.

[0051] Data cleaning: For the collected raw data, linear interpolation + Kalman filtering algorithm is used to fill missing values, 3σ criterion + box plot method is used to remove outliers, and duplicate data is removed based on device unique identifier and timestamp to obtain a cleaned and effective dataset.
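The cleaning steps above can be illustrated with a minimal sketch. It covers only linear interpolation, the 3σ criterion, and deduplication by device identifier and timestamp; the Kalman filtering and box plot steps named in the text are omitted. The function names and the record field names `device_id` and `timestamp` are hypothetical, and copying the nearest known value into leading or trailing gaps is an assumption of this sketch, not something the text specifies.

```python
from statistics import mean, pstdev


def fill_missing_linear(values):
    """Fill None gaps by linear interpolation between the nearest known samples."""
    filled = list(values)
    known = [i for i, v in enumerate(filled) if v is not None]
    if not known:
        return filled
    # Assumption: leading and trailing gaps copy the nearest known value.
    for i in range(known[0]):
        filled[i] = filled[known[0]]
    for i in range(known[-1] + 1, len(filled)):
        filled[i] = filled[known[-1]]
    for left, right in zip(known, known[1:]):
        step = (filled[right] - filled[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = filled[left] + step * (i - left)
    return filled


def remove_outliers_3sigma(values):
    """Drop samples lying more than three standard deviations from the mean."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return list(values)
    return [v for v in values if abs(v - mu) <= 3 * sigma]


def deduplicate(records):
    """Keep the first record for each (device_id, timestamp) pair."""
    seen = set()
    unique = []
    for rec in records:
        key = (rec["device_id"], rec["timestamp"])
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique
```

Note that the 3σ criterion only becomes meaningful with enough samples; on very short series a single extreme value inflates the standard deviation and is never flagged.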

[0052] Time alignment: A unified Unix timestamp is used to complete the time alignment of multi-source data with a granularity of 1 minute; for high-frequency sampled data such as fault recordings, an equal-interval downsampling method is used to achieve granularity unification; for discrete event data such as fault events and maintenance events, the timestamp of the event is marked to achieve complete synchronization of the time dimension of equipment status data, fault events, and maintenance events.
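As a rough illustration of the equal-interval downsampling described above, the sketch below reduces a high-frequency series to one sample per 1-minute bucket. The function name is hypothetical, and keeping the first sample of each bucket is an assumption; the text does not say which sample in an interval is retained.

```python
def downsample_equal_interval(samples, step_s=60):
    """Reduce (unix_ts, value) samples, sorted by time, to one per step_s bucket.

    Each kept sample is stamped with the start of its bucket, so that series
    from different sources share the same 1-minute time grid.
    """
    out = []
    last_bucket = None
    for ts, value in samples:
        bucket = ts - ts % step_s  # floor the timestamp to the bucket start
        if bucket != last_bucket:
            out.append((bucket, value))
            last_bucket = bucket
    return out
```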

[0053] Feature normalization: The Min-Max normalization method is used to normalize all numerical state parameters to the [0,1] interval, eliminating the influence of data with different dimensions on subsequent modeling. The formula is as follows:

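[0054] $$x' = \frac{x - x_{\min}}{x_{\max} - x_{\min}}$$

[0055] Where $x$ is the original feature value, $x'$ is the normalized value, and $x_{\max}$ and $x_{\min}$ are the maximum and minimum values of this feature in historical data, respectively.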
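A minimal sketch of Min-Max normalization as described above follows. The function name is hypothetical, and returning 0.0 for a constant feature (maximum equal to minimum) is an assumption made here to avoid division by zero; the text does not cover that case.

```python
def min_max_normalize(x, x_min, x_max):
    """Map x into [0, 1] using the historical minimum and maximum of the feature."""
    if x_max == x_min:
        return 0.0  # assumption: a constant feature carries no information
    return (x - x_min) / (x_max - x_min)
```

For example, an oil temperature of 75 with a historical range of 50 to 100 maps to 0.5. A live value outside the historical range falls outside [0, 1]; whether to clip it is left open here because the text does not say.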

[0056] Entity and Relation Extraction: A BERT-wwm pre-trained language model fine-tuned based on power domain corpus is used to automatically extract entities and relations. The extracted entity types include: equipment entities, status feature entities, fault event entities, operation and maintenance event entities, environmental entities, and protection device entities. The extracted relation types include: subordinate relations, electrical connection relations, assembly relations, causal relations, temporal relations, fault evolution relations, and action triggering relations.

[0057] As shown in Figure 1, step S2 involves constructing a local time-series knowledge graph. Each substation edge node builds a local time-series knowledge graph for its equipment based on the standardized local dataset. The graph employs a four-layer architecture: ontology layer, entity layer, relationship layer, and time-series attribute layer. The specific implementation details are as follows:

[0058] Ontology Layer Construction: The Protégé ontology construction tool is used to build a standardized ontology for the substation equipment domain, defining ontology concepts, hierarchical relationships between concepts, semantic constraints, and attribute definitions. The core ontology classes include: equipment, state parameter, fault, operation and maintenance event, environment, and protection device. Each ontology class has corresponding subclasses. For example, the equipment class is divided into primary equipment subclasses and secondary equipment subclasses. The primary equipment subclass is further divided into sub-equipment classes such as transformers, circuit breakers, disconnect switches, instrument transformers, and surge arresters. The ontology layer is a unified standardized ontology for all substation nodes, ensuring consistency in cross-node knowledge fusion.

[0059] Entity layer construction: The entity instances extracted in the preprocessing stage are mapped to the corresponding ontology concepts in the ontology layer to form the entity layer; each entity is assigned a unique global entity identifier, and entity attributes include static attributes such as entity name, equipment number, equipment model, manufacturer, commissioning time, and installation location.

[0060] Relationship layer construction: Define static and dynamic relationships between entities to form a relationship layer; where:

[0061] Static relationships are fixed relationships that do not change over time, including: subordinate relationships between equipment and components, electrical connection relationships between equipment, assembly relationships between components and equipment, and correspondence relationships between protection devices and protected equipment;

[0062] Dynamic relationships are the temporal relationships that change with the operating state of the equipment, including: the temporal causal relationship between state characteristics and fault events, the evolutionary relationship between preceding and subsequent faults, the correlation between different equipment state characteristics, and the triggering relationship between protection actions and equipment faults.

[0063] Temporal attribute layer construction: Time dimension labels are attached to the dynamic attributes of all entities and all dynamic relationships to construct a temporal triplet structure, with the format: (head entity, relationship, tail entity, start timestamp, end timestamp, feature value sequence); where the timestamp precision is 1 minute, and the feature value sequence is the temporal data of the entity status within the corresponding time window.
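The temporal record format described above can be sketched as a small immutable data class. The class name, field names, and the `overlaps` helper are hypothetical; the validation only encodes the 1-minute timestamp precision stated in the text and the assumption that the end timestamp is not earlier than the start timestamp.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class TemporalFact:
    """(head entity, relationship, tail entity, start ts, end ts, feature value sequence)."""
    head: str
    relation: str
    tail: str
    start_ts: int  # Unix seconds, aligned to a whole minute
    end_ts: int    # Unix seconds, aligned to a whole minute
    values: Tuple[float, ...]

    def __post_init__(self):
        if self.start_ts % 60 or self.end_ts % 60:
            raise ValueError("timestamps must have 1-minute precision")
        if self.end_ts < self.start_ts:
            raise ValueError("end_ts must not precede start_ts")

    def overlaps(self, t0: int, t1: int) -> bool:
        """True if the fact's validity interval intersects [t0, t1]."""
        return self.start_ts <= t1 and t0 <= self.end_ts
```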

[0064] For normal operation, a 15-minute time slice is used to update the graph periodically. For abnormal equipment status, the time slice is automatically switched to 1 minute to achieve real-time updates of the graph. Finally, a dynamic time-series knowledge graph that can fully represent the entire lifecycle evolution of equipment status from normal to abnormal to fault triggering is formed and stored in the Neo4j time-series graph database at the edge nodes.
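The switch between the two time slices can be expressed in a few lines. This is only a sketch: the names are hypothetical, and the `abnormal` flag stands in for whatever abnormal-state detection an implementation uses, which the text does not specify.

```python
NORMAL_SLICE_S = 15 * 60   # 15-minute slice during normal operation
ABNORMAL_SLICE_S = 60      # 1-minute slice while the equipment state is abnormal


def slice_seconds(abnormal: bool) -> int:
    """Return the graph update interval for the current equipment state."""
    return ABNORMAL_SLICE_S if abnormal else NORMAL_SLICE_S


def next_update_ts(last_update_ts: int, abnormal: bool) -> int:
    """Return the Unix time at which the next graph update is due."""
    return last_update_ts + slice_seconds(abnormal)
```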

[0065] As shown in Figure 1, step S3 involves the collaborative construction of a federated temporal knowledge graph. Based on a vertical federated learning framework, edge nodes and cloud nodes collaborate to complete cross-node entity alignment, relationship fusion, and federated aggregation of temporal features, generating a global federated temporal knowledge graph. Privacy-preserving computation technology is used throughout the process to ensure data security. The specific implementation steps are as follows:

[0066] As shown in Figure 2, S31, encrypted upload of local features: the cloud node pre-generates a public key and a private key for Paillier semi-homomorphic encryption, distributes the public key to all edge nodes, and securely stores the private key itself. Each edge node converts the entity features, relation features, and temporal features of its local graph into 128-dimensional dense vectors, encrypts them with the public key, and uploads the resulting encrypted feature vectors to the cloud node. The original local data and graph details do not leave the edge node at any point.

As shown in Figure 2, S32, encrypted entity alignment and relationship fusion: based on the received encrypted feature vectors, the cloud node uses secret sharing, a secure multi-party computation technique, to split the feature vectors into multiple secret shares. The cloud and edge nodes collaboratively calculate the cosine similarity of the entity features without decrypting the original feature vectors. The similarity threshold is set to 0.9: entities with a cosine similarity ≥ 0.9 are judged to be the same entity or synonymous entities across nodes, which completes global entity alignment. Based on the aligned entities, the consistency of cross-node entity relationships is verified, conflicting relationships are eliminated, complementary relationships are added, and the ontology framework and entity relationship topology of the global graph are generated.

As shown in Figure 2, S33, federated aggregation of temporal features: the cloud node adopts the federated averaging (FedAvg) algorithm to compute a weighted average of the encrypted temporal feature vectors uploaded by the edge nodes, weighted by the effective sample size of each node. The aggregation is performed securely in the ciphertext state to generate the global temporal feature parameters, which are then encrypted and sent to each edge node.

As shown in Figure 2, S34, synchronous update of the distributed graph: after receiving the encrypted global temporal feature parameters, each edge node decrypts them using the locally stored key and updates the temporal attribute layer of its local temporal knowledge graph based on the global parameters, ensuring consistency between the local graph and the global ontology framework. The result is a global federated temporal knowledge graph in distributed storage, in which the cloud node stores the global ontology framework, entity alignment results, and global feature parameters, and each edge node stores its complete local temporal knowledge graph details, realizing cross-site knowledge fusion while preserving privacy and security.
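The ciphertext-state weighted averaging in S33 relies on the additive homomorphism of Paillier encryption: multiplying ciphertexts adds the plaintexts, and raising a ciphertext to an integer power multiplies the plaintext by that integer. The sketch below is a toy illustration of that property, not the embodiment's implementation. It uses two small Mersenne primes that are far too short for real use, a hypothetical fixed-point scale of 10^6 to encode feature values as integers, 2-dimensional vectors instead of 128-dimensional ones, and hypothetical function names throughout. Only the aggregate is decrypted, and the division by the total sample count happens after decryption.

```python
import secrets
from math import gcd

# Toy key material: Mersenne primes 2^31-1 and 2^61-1 (insecure, illustration only).
P, Q = 2**31 - 1, 2**61 - 1
N = P * Q
N2 = N * N
LAM = (P - 1) * (Q - 1) // gcd(P - 1, Q - 1)  # lcm(P-1, Q-1)
MU = pow(LAM, -1, N)                          # valid because g = N + 1
SCALE = 10**6                                 # assumed fixed-point scale


def encode(x: float) -> int:
    return round(x * SCALE)


def encrypt(m: int) -> int:
    """Paillier encryption with g = N + 1: c = (1 + m*N) * r^N mod N^2."""
    while True:
        r = secrets.randbelow(N - 1) + 1
        if gcd(r, N) == 1:
            break
    return (1 + m * N) % N2 * pow(r, N, N2) % N2


def decrypt(c: int) -> int:
    """Paillier decryption: m = L(c^lambda mod N^2) * mu mod N, L(u) = (u-1)//N."""
    return (pow(c, LAM, N2) - 1) // N * MU % N


def fedavg_encrypted(enc_vectors, sample_counts):
    """Return ciphertexts of sum_k n_k * x_k per dimension, without decrypting."""
    dim = len(enc_vectors[0])
    aggregated = []
    for j in range(dim):
        acc = 1  # multiplicative identity, equivalent to an encryption of 0
        for vec, n_k in zip(enc_vectors, sample_counts):
            acc = acc * pow(vec[j], n_k, N2) % N2
        aggregated.append(acc)
    return aggregated


def decode_average(aggregated, total_samples):
    """Decrypt the aggregate and finish the weighted average in plaintext."""
    return [decrypt(c) / SCALE / total_samples for c in aggregated]
```

With two nodes holding `[0.2, 0.8]` (100 samples) and `[0.6, 0.4]` (300 samples), the decrypted weighted average is `[0.5, 0.5]` even though the aggregator only ever multiplies ciphertexts.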

[0067] As shown in Figure 1, step S4 involves federated mining of fault time-series association rules. Based on the global federated time-series knowledge graph, an improved distributed Apriori algorithm with time-series constraints is used to complete the federated mining of fault time-series association rules and construct a substation equipment fault association rule library. The specific implementation steps are as follows:

[0068] As shown in Figure 3, S41, construction of time-series transaction itemsets: based on the global federated time-series knowledge graph, the device status time-series characteristics, fault events, and maintenance events in the graph are transformed into transaction itemsets with time windows. Each transaction item corresponds to an entity status or event within a time window. According to the evolutionary characteristics of substation equipment fault types, a three-level time-series constraint window and a time sequence constraint for fault events are set, specifically:

[0069] Transient fault window: 0-10s, suitable for transient faults such as equipment short circuits, lightning strikes, and protection malfunctions;

[0070] Short-term fault window: 10min-72h, suitable for short-term evolving faults such as equipment insulation deterioration, mechanism jamming, and contact overheating;

[0071] Long-term fault evolution window: 7d-180d, suitable for long-term progressive faults such as equipment aging, oil deterioration, and mechanical performance degradation.
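The three windows above can be written down as a lookup that classifies the lag between an antecedent item and a fault event. This is a sketch under assumptions: the names are hypothetical, lags that fall between the listed windows (for example 2 minutes or 5 days) return `None` because the text assigns them to no window, and the sequence constraint is taken to require the antecedent to be strictly earlier than the fault event.

```python
HOUR, DAY = 3600, 86400

# (name, lower bound, upper bound) in seconds, as listed in the text.
WINDOWS = (
    ("transient", 0, 10),
    ("short_term", 10 * 60, 72 * HOUR),
    ("long_term", 7 * DAY, 180 * DAY),
)


def classify_lag(lag_seconds):
    """Return the name of the constraint window containing the lag, or None."""
    for name, lower, upper in WINDOWS:
        if lower <= lag_seconds <= upper:
            return name
    return None


def satisfies_constraint(antecedent_ts, consequent_ts, window):
    """Check both the time sequence constraint and the window constraint."""
    lag = consequent_ts - antecedent_ts
    return lag > 0 and classify_lag(lag) == window
```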

[0072] As shown in Figure 3, S42, parallel mining of local frequent itemsets: using the Hadoop distributed computing framework, frequent itemset mining of the local transaction itemsets is performed in parallel on each edge node. The local support of each candidate frequent itemset is calculated, and the encrypted local support data is uploaded to the cloud node. The support is calculated as follows:

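[0073] $$\mathrm{Support}(X) = \frac{\lvert \{\, T \in D : X \subseteq T \,\} \rvert}{\lvert D \rvert}$$

Where $X$ is a candidate itemset and $D$ is the set of transaction itemsets over which the support is computed.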

[0074] As shown in Figure 3, S43, aggregation and filtering of global frequent itemsets: the cloud node securely aggregates the local supports of all nodes in the encrypted state, calculates the global support of each candidate frequent itemset, sets the minimum support threshold to 0.05, selects the global frequent itemsets with a global support ≥ 0.05, encrypts them, and sends them to each edge node.
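The aggregation in S43 can be sketched in plaintext as follows. Encryption is deliberately left out, so this only shows the arithmetic. It assumes each node reports occurrence counts together with its transaction count, so that the global support is the total count divided by the total number of transactions; the text itself only says that local supports are aggregated. All function names are hypothetical.

```python
def local_counts(transactions, candidates):
    """Count, on one node, how many transactions contain each candidate itemset."""
    return {c: sum(1 for t in transactions if c <= t) for c in candidates}


def global_support(per_node_counts, per_node_sizes):
    """Merge per-node counts and divide by the total number of transactions."""
    total = sum(per_node_sizes)
    merged = {}
    for counts in per_node_counts:
        for itemset, count in counts.items():
            merged[itemset] = merged.get(itemset, 0) + count
    return {itemset: count / total for itemset, count in merged.items()}


def filter_frequent(supports, min_support=0.05):
    """Keep itemsets whose global support reaches the minimum support threshold."""
    return {k: s for k, s in supports.items() if s >= min_support}
```

Itemsets are represented as `frozenset` objects so they can serve as dictionary keys and be tested for containment with `<=`.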

[0075] As shown in Figure 3, S44, strong association rule filtering and rule base construction: each edge node generates association rules with time-series constraints based on the global frequent itemsets. The rule format is $X \xrightarrow{\Delta t} Y$, where X is the antecedent of the rule (abnormal state characteristics or preceding events), Y is the consequent of the rule (fault events), and Δt is the time constraint window; the time window of X must be earlier than that of Y to satisfy the time sequence constraint. The confidence and lift of each rule are calculated using the following formulas:

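[0076] $$\mathrm{Confidence}(X \rightarrow Y) = \frac{\mathrm{Support}(X \cup Y)}{\mathrm{Support}(X)}$$

[0077] $$\mathrm{Lift}(X \rightarrow Y) = \frac{\mathrm{Confidence}(X \rightarrow Y)}{\mathrm{Support}(Y)}$$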

[0078] The minimum confidence threshold is set to 0.8, and the minimum lift threshold is set to 1.2. Strong association rules that simultaneously satisfy confidence ≥ 0.8 and lift ≥ 1.2 are selected and uploaded to the cloud node. The cloud node performs global rule aggregation, deduplication, and consistency verification, ultimately constructing the substation equipment fault association rule library. The rule library contains three core rule categories:
Internal rules for a single piece of equipment: time-series association rules between the status parameters of a single piece of equipment and the corresponding fault events, for example: transformer oil temperature exceeds the threshold for three consecutive time slices → transformer winding overheating fault within 72 hours, confidence 0.92, lift 3.5;
Intra-station cross-equipment rules: time-series association rules for fault cascading between different equipment within the same station, for example: bus voltage distortion → transformer insulation breakdown fault within 30 minutes, confidence 0.86, lift 2.8;
Cross-site propagation rules: time-series association rules for power grid fault propagation across substations, for example: 220 kV bus fault at the upstream substation → incoming line protection action at the downstream substation within 10 seconds, confidence 0.98, lift 4.2.
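The strong-rule filter described above can be sketched using the standard definitions of confidence and lift, with the thresholds 0.8 and 1.2 from the text as defaults. The function names are hypothetical, and the inputs are the supports of the antecedent X, the consequent Y, and the combined itemset X ∪ Y.

```python
def confidence(sup_xy, sup_x):
    """Confidence(X -> Y) = Support(X u Y) / Support(X)."""
    return sup_xy / sup_x


def lift(sup_xy, sup_x, sup_y):
    """Lift(X -> Y) = Confidence(X -> Y) / Support(Y)."""
    return sup_xy / (sup_x * sup_y)


def is_strong(sup_xy, sup_x, sup_y, min_conf=0.8, min_lift=1.2):
    """A rule is strong only if it meets both the confidence and lift thresholds."""
    return (confidence(sup_xy, sup_x) >= min_conf
            and lift(sup_xy, sup_x, sup_y) >= min_lift)
```

A rule with high confidence can still fail the lift test when the consequent is common anyway: with Support(X) = 0.5, Support(X ∪ Y) = 0.45, and Support(Y) = 0.9, the confidence is 0.9 but the lift is only 1.0, so the rule is rejected.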

[0079] As shown in Figure 1, in step S5, equipment fault risk classification and early warning, each edge node collects real-time equipment operation data within the station and combines it with the local time-series knowledge graph and the fault association rule base to complete local real-time risk inference. The cloud node performs global collaborative verification, and the graded early warning results are finally output. The specific implementation steps are as follows:

[0080] As shown in Figure 4, S51, real-time status update and subgraph generation: each edge node collects the operation time-series data of the equipment in the station in real time, maps the data to the local time-series knowledge graph, updates the attributes of the corresponding equipment entities and status entities in real time, extracts the equipment status association subgraph within the current time window, and generates the real-time status subgraph.

[0081] As shown in Figure 4, S52, local fault risk reasoning: based on the fault association rule base, a subgraph isomorphism matching algorithm is used to match the real-time status subgraph against the antecedent subgraph of each rule in the rule base and to calculate the subgraph matching degree; a temporal GraphSAGE link prediction algorithm is used to calculate the probability of occurrence of the corresponding fault, generating local risk reasoning results that identify the fault cause, associated equipment, fault evolution path, and warning time window.

[0082] As shown in Figure 4, S53, global collaborative verification and cascading fault simulation: each edge node encrypts the feature vector of its local risk inference and uploads it to the cloud node. The cloud node, based on the global federated temporal knowledge graph and the electrical topology relationships across sites and devices, completes the global collaborative verification of the risk inference results and simulates the cross-site propagation path of the fault to identify potential cascading fault risks.

[0083] As shown in Figure 4, S54, graded early warning output: the cloud node encrypts the globally verified early warning results and sends them to the corresponding edge nodes. The edge nodes output four-level graded early warning results according to the probability and severity of the fault. The specific grading standards are as follows:
Level 1 warning: fault occurrence probability ≥ 90%, warning time window ≤ 24 h. This is an emergency warning of an impending fault, and the corresponding handling suggestion is: "Immediately shut down the power for maintenance, and carry out special tests and defect investigations on the corresponding equipment";
Level 2 warning: 70% ≤ fault occurrence probability < 90%, warning time window ≤ 72 h. This is a high-probability fault risk warning, and the corresponding handling suggestion is: "Conduct a special on-site inspection within 24 hours, perform live testing on the equipment, and track abnormal trend changes";
Level 3 warning: 40% ≤ fault occurrence probability < 70%. This is an abnormal state trend warning, and the corresponding handling suggestion is: "Increase the frequency of daily inspections, pay attention to the changing trends of corresponding parameters, and include them in the recent maintenance plan";
Level 4 warning: fault occurrence probability < 40%. This is a normal state indication, and there are no special handling requirements.
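The grading standards above map naturally onto a small decision function. The sketch below is one possible reading, with a hypothetical function name. The text does not say what happens when the probability meets a level's threshold but the warning time window exceeds that level's limit; this sketch assumes such a case falls through to the next lower level.

```python
def warning_level(probability: float, window_hours: float) -> int:
    """Return the warning level (1 = most urgent, 4 = normal state).

    Assumption: if the probability qualifies for a level but the warning time
    window is too long for it, the case falls through to the next level.
    """
    if probability >= 0.9 and window_hours <= 24:
        return 1
    if probability >= 0.7 and window_hours <= 72:
        return 2
    if probability >= 0.4:
        return 3
    return 4
```

Under this reading, a fault predicted with probability 0.95 but with a 48-hour warning window is reported as Level 2 rather than Level 1.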

[0084] As shown in Figure 1, step S6 involves incremental iterative updates of the model and graph. Based on newly added equipment operation data, fault event records, and early warning result feedback, incremental iterative updates of the graph and model are completed in two update modes:

[0085] Real-time incremental update: triggered when an edge node collects new device operation data or maintenance records, upon which the entities, relationships, and attributes of the local time-series knowledge graph are updated in real time; when a device is in an abnormal state, the graph update frequency is automatically increased from once every 15 minutes to once every minute to keep local state awareness real-time.

[0086] Periodic full update: The update cycle is once a month. Based on the newly added fault event records, operation and maintenance results, and early warning accuracy feedback data of the month, three core updates are completed: First, each edge node completes the full optimization of its local time series knowledge graph; second, iterative optimization of the global federated time series knowledge graph and global feature parameter updates are completed through the federated learning framework; third, the fault association rule base is dynamically maintained, adding new rules that have been verified to be effective, correcting rules with decreased confidence, and removing invalid rules with high false alarm rates, so as to continuously improve the adaptability of association rules and the accuracy of risk warning.

[0087] The above description is merely a specific embodiment of this application, enabling those skilled in the art to understand or implement this application. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the general principles defined herein may be implemented in other embodiments without departing from the spirit or scope of this application. Therefore, this application is not to be limited to the embodiments shown herein, but is to be accorded the widest scope consistent with the principles and novel features claimed herein.

Claims

1. A method for mining and risk warning of substation equipment fault association rules based on federated temporal knowledge graphs, implemented based on a federated learning architecture that coordinates multiple substation edge nodes and power grid dispatch center cloud nodes, characterized in that it includes the following steps:
S1, local data preprocessing: each substation edge node collects multi-source heterogeneous time-series operation data, fault event data, and operation and maintenance data of the equipment in the station, and completes data cleaning, time-series alignment, feature normalization, and entity relationship extraction to obtain a standardized local dataset;
S2, local temporal knowledge graph construction: based on the standardized local dataset, each substation edge node constructs a local temporal knowledge graph for substation equipment, defining the graph ontology layer, entity layer, relation layer, and temporal attribute layer, and generating a dynamic local graph with time dimension labels;
S3, federated temporal knowledge graph collaborative construction: based on the vertical federated learning framework, each edge node, without leaking local original data and graph details, collaborates with cloud nodes to complete cross-node entity alignment, relationship fusion, and federated aggregation of temporal features, generating a global federated temporal knowledge graph covering multiple substations;
S4, federated mining of fault temporal association rules: based on the global federated temporal knowledge graph, a distributed association rule mining algorithm with temporal constraints is used to mine temporal association rules between equipment status features and fault events, between cross-equipment fault events, and between cross-site fault events; after filtering by support, confidence, and lift thresholds, a substation equipment fault association rule library is constructed;
S5, equipment fault risk classification and early warning: each edge node collects the operation time-series data of the equipment in the station in real time, maps it to the local temporal knowledge graph to complete the real-time status update, and combines the fault association rule library to complete the local real-time fault risk inference; at the same time, the risk inference features are encrypted and uploaded to the cloud node, and cross-node collaborative verification is completed through the global federated temporal knowledge graph to output the graded fault risk early warning result;
S6, incremental iterative update of the model and graph: based on the newly added equipment operation data, fault event records, and early warning result feedback, each edge node completes the incremental update of the local temporal knowledge graph, and completes the iterative optimization of the global federated temporal knowledge graph and the dynamic update of the fault association rule base through the federated learning framework.

2. The method according to claim 1, characterized in that, in step S1, the multi-source heterogeneous time-series data includes SCADA real-time operation data, online monitoring status data, fault waveform data, environmental parameter data, historical fault event ledgers, and operation and maintenance records of transformers, circuit breakers, disconnect switches, instrument transformers, and surge arresters in the substation; the data cleaning includes missing value filling, outlier removal, and data deduplication; the time-series alignment uses a unified timestamp granularity to synchronize the time dimension of the multi-source data; and the entity relationship extraction uses a pre-trained language model to extract equipment entities, status feature entities, fault event entities, and operation and maintenance event entities, as well as the causal relationships, time-series relationships, subordinate relationships, and electrical connection relationships between entities.

3. The method according to claim 1, characterized in that, in step S2, the ontology layer of the local time-series knowledge graph defines the substation equipment domain ontology, including ontology concepts of equipment type, state parameters, fault type, and operation and maintenance events, as well as the hierarchical relationships between these concepts; the entity layer consists of instance entities corresponding to the ontology concepts, including equipment entities, measurement point entities, fault entities, operation and maintenance entities, and environment entities; the relationship layer defines the static and dynamic relationships between entities, where static relationships include electrical connection relationships, subordinate relationships, and assembly relationships, and dynamic relationships include time-series causal relationships, fault evolution relationships, and state association relationships; the time-series attribute layer adds timestamp tags and time-series feature attributes to entities and relationships, recording the dynamic changes in entity states and relationships, forming a dynamic time-series knowledge graph updated by time slices.

4. The method according to claim 1, characterized in that, in step S3, the vertical federated learning framework employs homomorphic encryption and secure multi-party computation techniques to protect privacy, and the specific steps for cross-node entity alignment, relationship fusion, and temporal feature federated aggregation include:
S31, encrypted upload of local features: each edge node encrypts the entity, relationship, and temporal features of the local graph and generates encrypted feature vectors to be uploaded to the cloud node; the local original data and graph details do not leave the local node;
S32, encrypted entity alignment and relationship fusion: based on the encrypted feature vectors, cloud nodes complete the alignment and matching of same-name and synonymous entities across nodes, as well as the consistency verification and fusion of relationships between entities, to generate the ontology and entity relationship framework of the global graph;
S33, federated aggregation of temporal features: the cloud nodes use a federated averaging algorithm to securely aggregate the encrypted temporal feature vectors uploaded by each edge node, generate global temporal feature parameters, and distribute them to each edge node;
S34, synchronous update of the distributed graph: each edge node updates the temporal attribute layer of its local temporal knowledge graph based on the global temporal feature parameters, and works with cloud nodes to maintain the distributed storage and synchronous update of the global federated temporal knowledge graph.

5. The method according to claim 4, characterized in that the homomorphic encryption uses the Paillier semi-homomorphic encryption algorithm, and the secure multi-party computation uses secret sharing technology to ensure that cloud nodes and edge nodes cannot reverse engineer the original data and graph details of other nodes throughout the entire collaboration process.

6. The method according to claim 1, characterized in that, in step S4, the distributed association rule mining algorithm with time-series constraints is an improved time-series Apriori algorithm, specifically including:
S41, construction of time-series transaction itemsets: based on the global federated time-series knowledge graph, the device status time-series characteristics and fault events in the graph are transformed into transaction itemsets with time windows, and the time-series constraint window and the time sequence constraint of fault events are defined;
S42, parallel mining of local frequent itemsets: a distributed computing framework is adopted to complete the frequent itemset mining of local transaction itemsets in parallel on each edge node, and the encrypted frequent itemset support is uploaded to the cloud node;
S43, aggregation and filtering of global frequent itemsets: cloud nodes complete the secure aggregation and filtering of global frequent itemsets, select the global frequent itemsets that meet the minimum support threshold, and distribute them to each edge node;
S44, strong association rule filtering and rule base construction: each edge node generates association rules with time-series constraints based on the global frequent itemsets, calculates the confidence and lift of each rule, selects the strong association rules that meet the minimum confidence and minimum lift thresholds, and uploads them to cloud nodes to complete global aggregation and deduplication and construct a substation equipment fault association rule base.

7. The method according to claim 6, characterized in that the time-series constraint window is set according to the evolution characteristics of substation equipment fault types and is divided into a transient fault window, a short-term fault window, and a long-term fault evolution window; the strong association rules include time-series association rules between the internal state parameters of a single device and fault events, time-series association rules for fault cascading across devices within the same substation, and time-series association rules for power grid fault propagation across substations.

8. The method according to claim 1, characterized in that, in step S5, the specific steps for graded fault risk early warning include:
S51, real-time status update and subgraph generation: each edge node maps the equipment operation time-series data collected in real time to the local time-series knowledge graph to complete the real-time update of the attributes of the corresponding device entities and status entities;
S52, local fault risk reasoning: based on the fault association rule base, subgraph matching and link prediction algorithms are used to complete real-time reasoning of local fault risks, calculate the probability of fault occurrence and the risk level, and generate local risk reasoning results;
S53, global collaborative verification and cascading fault simulation: each edge node encrypts the feature vector of local risk reasoning and uploads it to the cloud node; the cloud node, based on the global federated temporal knowledge graph and combined with the cross-site and cross-device relationships, completes the global collaborative verification of risk reasoning results and the simulation of fault propagation paths;
S54, graded early warning output: the cloud node sends the globally verified early warning results to the corresponding edge node; the edge node outputs the early warning level, fault type, early warning time window, and handling suggestions for the corresponding device; the early warning level is divided into Level 1, Level 2, Level 3, and Level 4.

9. The method according to claim 8, characterized in that the Level 1 warning is an emergency warning that a fault is about to occur; the Level 2 warning is a high-probability fault risk warning; the Level 3 warning is an abnormal state trend warning; and the Level 4 warning is a normal state indication.

10. The method according to claim 1, characterized in that, in step S6, the incremental iterative update is divided into real-time incremental update and periodic full update. Real-time incremental update completes the real-time update of entities, relationships, and attributes of the local graph based on newly added equipment operation data and maintenance records. Periodic full update is executed at a fixed period; based on the newly added fault events and early warning feedback results within the period, it completes the global iterative optimization of the federated time-series knowledge graph and the addition, correction, and removal of rules in the association rule base, continuously improving the adaptability of association rules and the accuracy of risk warning.

11. A substation equipment fault association rule mining and risk early warning system based on federated temporal knowledge graph, characterized in that the system includes edge computing nodes deployed in various substations and cloud central nodes deployed in the power grid dispatch center. The system is used to execute the substation equipment fault association rule mining and risk early warning method based on federated time-series knowledge graph as described in any one of claims 1-10. The edge computing nodes include a data acquisition module, a local graph construction module, a local inference early warning module, and a federated communication encryption module. The cloud central nodes include a federated collaborative computing module, a global graph management module, an association rule management module, and a global early warning verification module.