Intelligent fault diagnosis and maintenance method for complex network

By standardizing and preprocessing multi-source heterogeneous data from complex networks and fusing dynamic graph topology models, combined with multi-layer GAT networks and fault knowledge graphs, the problems of missing global modeling, insufficient topology utilization, and poor dynamic adaptation capabilities in fault diagnosis of complex networks are solved. This enables accurate fault tracing and intelligent maintenance, and improves the comprehensiveness and interpretability of diagnosis.

CN122247841A · Pending · Publication Date: 2026-06-19 · SHAANXI BOZHI HONGLIN INFORMATION TECHNOLOGY CO LTD

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
SHAANXI BOZHI HONGLIN INFORMATION TECHNOLOGY CO LTD
Filing Date
2026-04-29
Publication Date
2026-06-19

AI Technical Summary

Technical Problem

Existing complex network fault diagnosis technologies suffer from problems such as lack of global modeling, insufficient topology utilization, poor dynamic adaptability, weak interpretability, and disconnect between diagnosis and maintenance, making it difficult to adapt to the operation and maintenance needs of large-scale, dynamic, and highly coupled complex networks.

Method used

By standardizing and preprocessing multi-source heterogeneous data from complex networks, a dynamic graph topology model is constructed and deeply integrated with it. A multi-layer GAT network is used for feature learning. Combined with online incremental learning and fault knowledge graph, the root cause of faults is located and the scope of impact is predicted. Differentiated maintenance strategies are generated, and a closed-loop feedback mechanism is established.

Benefits of technology

It enables accurate source tracing, propagation prediction, and intelligent maintenance of complex network faults, improves the comprehensiveness, accuracy, and interpretability of fault diagnosis, reduces maintenance costs, and adapts to large-scale, dynamic, and highly coupled operation and maintenance needs.

✦ Generated by Eureka AI based on patent content.

Patent Text Reader

Abstract

This invention relates to the field of intelligent fault diagnosis technology for complex networks. It discloses an intelligent fault diagnosis and maintenance method for complex networks, comprising the following steps: S1. Collecting multi-source heterogeneous data and standardizing it to form a standardized data matrix; constructing a dynamic graph topology model with an embedded real-time update algorithm to obtain a fused feature matrix; S2. Constructing a multi-layer GAT network; pre-training a basic feature learning model; introducing an online incremental learning algorithm to match the real-time update algorithm; and combining a temporal attention mechanism to output the final fused feature vector; S3. Constructing a fault knowledge graph and association rules; combining the fused feature vector to locate the root cause of the fault and output interpretable diagnostic results; S4. Generating and executing differentiated maintenance strategies based on the diagnostic results; updating the fault knowledge graph and multi-layer GAT network through a closed-loop feedback mechanism. This invention addresses the shortcomings of existing technologies, improves diagnostic accuracy and maintenance intelligence, and adapts to the operational needs of complex networks.

Description

Technical Field

[0001] This application relates to the field of intelligent fault diagnosis technology for complex networks, and specifically discloses an intelligent fault diagnosis and maintenance method for complex networks.

Background Technology

[0002] Complex networks are mesh systems composed of a large number of different individuals and complex relationships. They differ from simple linear and regular structures. They are characterized by uneven structure, strong correlation, dynamic changes, and easy chain propagation of faults. They can fully reflect the coupling, dependence and interaction relationships between the units of the system.

[0003] For example, in a power grid, the nodes are substations, generator sets, and electrical loads, while the edges are transmission lines. A few core hub substations are crucial, and a fault in one line can easily cause rapid cascading tripping, making the power grid a typical complex network.

[0004] For example, in industrial production, the nodes include machine tools, robotic arms, conveying equipment, and detection sensors, while the edges include material transmission, signal communication, and power linkage. A failure in a single piece of equipment can propagate along the production line, causing batch production anomalies.

[0005] For example, in a communications Internet of Things (IoT) network, the nodes are terminal devices, routers, and servers, and the edges are data transmission links. The failure of a core gateway causes a large number of devices to lose connection, and the structure is complex and highly interconnected.

[0006] Referring to Figure 2 of the accompanying drawings, which shows a simulated visualization of a complex network structure: Complex networks exhibit small-world characteristics, meaning that although the network is extremely large, the average path between any two nodes is very short and local clustering is high. This allows faults to propagate rapidly across regions; for example, a short circuit in a power grid can affect multiple provinces within seconds, and a fault at one node of an industrial production line can quickly propagate upstream and downstream. Complex networks also exhibit scale-free characteristics, meaning that the number of connections per node follows a power-law distribution. When the system contains critical hub components, such as power grid substations, aircraft engine spindles, or data center core switches, the failure of a hub can paralyze the system, whereas the impact of an ordinary node failure is limited. Complex networks further exhibit high clustering: local nodes are densely interconnected, forming tight communities or modules, while connections between modules are relatively sparse. The system therefore has a modular, hierarchical structure, such as factory, workshop, production line, equipment, and component. A fault spreads easily within a module, while cross-module propagation faces bottlenecks.

[0007] With the increasing prevalence of complex networks in various key fields, their topological complexity and node coupling have significantly increased. The suddenness and propagation of faults place higher demands on the global and real-time nature of diagnostic technologies. Currently, complex network fault diagnosis technologies have formed four major categories.

[0008] The first category is traditional basic diagnostic technology, which relies heavily on human experience and is suited to small, static networks. It mainly includes three types: threshold alarms and hierarchical troubleshooting (single-point monitoring with fixed thresholds; simple to deploy, but with a high false alarm rate and low efficiency); rule-based expert systems (which codify operation and maintenance experience and suit stable fault modes, but have incomplete rule coverage and lagging updates); and fault tree and cause-effect graph analysis (which characterize fault causes and suit fixed topologies, but cannot adapt to dynamic scenarios and handle nonlinear propagation poorly).

[0009] The second category is knowledge-driven diagnostic technology, which builds on structured knowledge and overcomes some limitations of the traditional technologies. It includes knowledge graph diagnosis (which models node associations and improves diagnostic efficiency, but is costly to construct and difficult to update dynamically) and case-based reasoning (which reuses historical cases without requiring complex rules, but depends on case quality and cannot handle new fault types).

[0010] The third category is data-driven intelligent diagnostic technology, the mainstream direction of intelligent diagnostics, which does not rely on human experience. It is divided into traditional machine learning (manual feature extraction; suited to medium-sized networks, but with weak generalization and no use of topological information) and deep learning (automatic feature extraction; suited to high-dimensional time series data, but a black-box model that performs poorly with small samples and does not integrate topology).

[0011] The fourth category is diagnostic technology specific to complex networks, which targets the core characteristics of complex networks and explicitly uses topological information. It includes graph neural networks, propagation dynamics models, and centrality analysis.

[0012] Based on comprehensive technical analysis, current complex network fault diagnosis technologies still suffer from five core problems, making it difficult to adapt to the operational needs of large-scale, dynamic, and highly coupled complex networks. Specifically:
1. Lack of global modeling: Most technologies adopt a single-point or local perspective, severing the coupling relationship between nodes and links. This makes it impossible to track cascading faults across nodes and modules, leading to false alarms and missed alarms. Even some intelligent technologies do not fully integrate topology and node status data.
2. Insufficient topology utilization: Mainstream data-driven technologies do not explicitly utilize the network topology structure, making it impossible to accurately depict fault propagation paths and impact ranges, which hinders fault diagnosis.
3. Poor dynamic adaptability: Most models are statically constructed and cannot adapt to dynamic scenarios such as topology reconstruction, operating condition fluctuations, and equipment aging. Frequent manual parameter adjustments are required, resulting in high maintenance costs.
4. Weak interpretability: Intelligent models such as deep learning and GNN are mostly black-box structures, making the fault reasoning process difficult to explain. This reduces the trust of maintenance personnel and affects practical application.
5. Disconnect between diagnosis and maintenance: The technology focuses on fault identification and location but is not deeply integrated with maintenance strategies. It cannot output a closed-loop solution of location, assessment, maintenance, and repair, making it difficult to meet the needs of intelligent maintenance.

[0013] Based on the above, this invention provides an intelligent fault diagnosis and maintenance method for complex networks to solve the aforementioned problems.

Summary of the Invention

[0014] This invention discloses an intelligent fault diagnosis and maintenance method for complex networks, aiming to solve the technical problems of lack of global modeling, insufficient topology utilization, poor dynamic adaptation capability, weak interpretability, and disconnect between diagnosis and maintenance in existing complex network fault diagnosis technologies. It realizes accurate source tracing, propagation prediction and intelligent maintenance of complex network faults, and adapts to the operation and maintenance needs of large-scale, dynamic and highly coupled complex networks.

[0015] To achieve the above objectives, the present invention provides the following basic solution: A method for intelligent fault diagnosis and maintenance of complex networks includes the following steps:
Step S1: Collect multi-source heterogeneous data in the complex network and perform standardized preprocessing to form a standardized data matrix. Based on the actual physical structure and information flow dependencies of the complex network, construct a dynamic graph topology model and embed a real-time update algorithm into it. Deeply integrate the standardized data matrix with the dynamic graph topology model to obtain a fused feature matrix of the core nodes and key links of the complex network.
Step S2: Construct a new multi-layer GAT network. The first layer is the input layer, which receives the fused feature matrix. The second layer is the attention hidden layer, which uses a multi-head attention mechanism to automatically learn the association features between nodes and topology in the fused feature matrix. The third layer is the output layer, which outputs the preliminary feature vectors of the nodes and links in the fused feature matrix. Residual connections are added between the layers of the GAT network, and a dropout regularization layer is added between the attention hidden layer and the output layer. The new multi-layer GAT network is then pre-trained on historical fault data and normal operation data to obtain a basic feature learning model. An online incremental learning algorithm, matched to the real-time update algorithm, is introduced into the basic feature learning model. A temporal attention mechanism is then introduced to capture the temporal dependencies of node operating status data, optimize feature extraction, and output the final fused feature vector.
Step S3: Construct a fault knowledge graph that is updated by the online incremental learning algorithm, and form fault association rules. By combining the final fused feature vector with the fault association rules from the fault knowledge graph, accurately locate the root cause of the fault, predict the scope of the fault's impact, and output interpretable diagnostic results.
Step S4: Based on the interpretable diagnostic results of Step S3, generate differentiated maintenance strategies, execute maintenance operations, and establish a closed-loop feedback mechanism. The closed-loop feedback mechanism feeds the maintenance results back to the fault knowledge graph of Step S3 and the multi-layer GAT network of Step S2.
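To make the attention hidden layer of Step S2 more concrete, the following is a minimal sketch of one attention head. It assumes the widely used graph attention formulation (score = LeakyReLU(a · [W h_i || W h_j]), normalized by softmax over each node's neighborhood) plus a residual connection. The names `gat_layer`, `W`, `a`, and the toy graph are hypothetical; the source does not specify dimensions, the number of heads, or the dropout rate, and dropout is omitted here.

```python
import math

def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x

def matvec(W, h):
    return [sum(w * x for w, x in zip(row, h)) for row in W]

def gat_layer(features, neighbors, W, a):
    """One single-head graph attention layer with a residual connection.

    features:  {node: feature vector}
    neighbors: {node: list of neighbor nodes, self included}
    W:         square projection matrix (assumed square so the residual add is shape-safe)
    a:         attention vector of length 2 * len(W)
    """
    projected = {n: matvec(W, h) for n, h in features.items()}
    out = {}
    for i, nbrs in neighbors.items():
        # Unnormalized attention score for every neighbor j of node i.
        scores = [
            leaky_relu(sum(ak * v for ak, v in zip(a, projected[i] + projected[j])))
            for j in nbrs
        ]
        # Softmax over the neighborhood (shifted by the max for stability).
        m = max(scores)
        exps = [math.exp(s - m) for s in scores]
        total = sum(exps)
        alphas = [e / total for e in exps]
        # Attention-weighted aggregation of the projected neighbor features.
        agg = [
            sum(al * projected[j][k] for al, j in zip(alphas, nbrs))
            for k in range(len(W))
        ]
        # Residual connection: add the node's own input features.
        out[i] = [x + r for x, r in zip(agg, features[i])]
    return out

# Toy example (hypothetical values).
features = {"A": [1.0, 0.0], "B": [0.0, 1.0], "C": [1.0, 1.0]}
neighbors = {"A": ["A", "B"], "B": ["B", "A", "C"], "C": ["C", "B"]}
W = [[1.0, 0.0], [0.0, 1.0]]
a = [0.5, 0.5, 0.5, 0.5]
embeddings = gat_layer(features, neighbors, W, a)
```

A multi-head variant would run several such heads with separate `W` and `a` and concatenate or average their outputs; in practice a deep learning framework would supply the trainable parameters and dropout.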

[0016] Furthermore, in step S1, multi-source heterogeneous data is synchronously collected through sensors, operation and maintenance monitoring systems, and log collection tools deployed in the complex network. This multi-source heterogeneous data includes node operating status data, link connection data, and fault alarm data. The collected multi-source heterogeneous data is then cleaned and normalized to generate a standardized data matrix. The specific processing steps are as follows:
Step S11: Use an outlier detection algorithm to identify and remove outliers in the multi-source heterogeneous data, use linear interpolation to fill in missing data, convert data with inconsistent formats to unified units, and delete invalid data.
Step S12: Using the Min-Max normalization algorithm, map all cleaned data to the [0,1] interval to obtain normalized node operating status data, link connection data, and fault alarm data.
Step S13: Associate the normalized node operating status data, link connection data, and fault alarm data by node and link to construct a standardized data matrix. The rows of the standardized data matrix correspond to the nodes and links in the network, the columns correspond to the collected indicators, and the elements are the standardized indicator values of the corresponding nodes and links. In addition, add a timestamp column to the standardized data matrix for subsequent time series feature analysis.

[0017] Furthermore, in step S1, the specific method for constructing the dynamic graph topology model is as follows:
Step S14: Define the nodes in the complex network directly as nodes of the graph topology. The attributes of the nodes include node ID, node type, node location, and core level. Define the physical connections, information flow interactions, and functional dependencies between nodes as edges of the graph topology. The attributes of the edges include link, link type, and connection strength.
Step S15: Assign weights to the edges in the graph topology. Calculate the weight of each edge by weighted summation over the link's transmission rate, transmission delay, load conditions, and the coupling between nodes. The weight values lie in the interval [0,1].
Step S16: Establish a real-time update algorithm. The topology structure is automatically updated when any of the following three situations is detected: a node or link is added or deleted; the connection status of a link changes; or the weight of an edge in the graph topology changes by more than a threshold. The update frequency of the topology structure is synchronized with the collection frequency of the multi-source heterogeneous data, and a new dynamic graph topology model is regenerated after each update.
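The three update triggers of Step S16 can be sketched as a comparison between two topology snapshots. This is only an illustration under assumed data structures: the snapshot layout, the function name `topology_changed`, and the default threshold of 0.1 are hypothetical and not taken from the source.

```python
def topology_changed(old, new, weight_threshold=0.1):
    """Return the list of triggers that fired between two topology snapshots.

    Each snapshot is {"nodes": set of node IDs,
                      "edges": {(u, v): {"up": bool, "weight": float}}}.
    """
    triggers = []
    # Trigger 1: a node or link was added or deleted.
    if old["nodes"] != new["nodes"] or set(old["edges"]) != set(new["edges"]):
        triggers.append("node_or_link_added_or_deleted")
    shared = set(old["edges"]) & set(new["edges"])
    # Trigger 2: the connection status of an existing link changed.
    if any(old["edges"][e]["up"] != new["edges"][e]["up"] for e in shared):
        triggers.append("link_status_changed")
    # Trigger 3: an edge weight moved by more than the threshold.
    if any(
        abs(old["edges"][e]["weight"] - new["edges"][e]["weight"]) > weight_threshold
        for e in shared
    ):
        triggers.append("edge_weight_changed")
    return triggers

before = {"nodes": {"A", "B"}, "edges": {("A", "B"): {"up": True, "weight": 0.8}}}
after = {"nodes": {"A", "B"}, "edges": {("A", "B"): {"up": True, "weight": 0.5}}}
fired = topology_changed(before, after)
```

A non-empty result would prompt regeneration of the dynamic graph topology model at the next collection cycle.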

[0018] Furthermore, the steps for deep integration of the standardized data matrix and the dynamic graph topology model are as follows:
Step S17: Concatenate the standardized data matrix and the dynamic graph topology model along the feature dimension to form an initial fused feature matrix. The concatenation method is as follows: append the dynamic graph topology model as additional columns at the end of the standardized data matrix, ensuring that the features of each node and link include both its own operating status data and its topology association data.
Step S18: Introduce an attention mechanism, calculate the attention weight of each node and link, and use the attention weights to weight the initial fused feature matrix of step S17.
Step S19: Manually set an ideal variance contribution rate, and then calculate the variance contribution rate of the weighted initial fused feature matrix. If the variance contribution rate is greater than or equal to the ideal variance contribution rate, the fusion is deemed effective and the final fused feature matrix is output. If the variance contribution rate is less than the ideal variance contribution rate, the attention weight coefficients of the attention mechanism are readjusted until the fusion is effective.
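Steps S17 and S18 can be illustrated with a small sketch. It assumes that the topology contribution for each entity is its row of edge weights, and it uses a purely hypothetical attention score (softmax over each row's mean value), since the source does not define how the attention weights are computed. The variance contribution check of Step S19 is not shown.

```python
import math

def concatenate(data_matrix, adjacency):
    """Step S17 (sketch): append each entity's topology row to its indicator row."""
    return [row + adj for row, adj in zip(data_matrix, adjacency)]

def attention_weights(matrix):
    """Step S18 (sketch): hypothetical score, softmax over each row's mean value."""
    scores = [sum(row) / len(row) for row in matrix]
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def apply_weights(matrix, weights):
    """Scale every row of the matrix by its attention weight."""
    return [[w * x for x in row] for row, w in zip(matrix, weights)]

# Two entities, two normalized indicators each (hypothetical values).
data_matrix = [[0.2, 0.4], [0.6, 0.8]]
# Edge weights between the two entities (hypothetical values).
adjacency = [[0.0, 0.5], [0.5, 0.0]]

initial = concatenate(data_matrix, adjacency)
weights = attention_weights(initial)
weighted = apply_weights(initial, weights)
```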

[0019] Furthermore, in step S2, the specific steps for constructing the basic feature learning model are as follows:
Step S21: Training dataset construction: Collect historical normal operation data and historical fault data of the complex network, and divide them into a training set and a validation set in a ratio of either 7:3 or 8:2. The historical fault data includes sudden faults, slow degradation faults, and cascading faults.
Step S22: Input the training set into the new multi-layer GAT network, and update the network parameters through the backpropagation algorithm until the network meets the accuracy requirements.
Step S23: Use the trained multi-layer GAT network as the basic feature learning model, save the model parameters and network structure, and embed the online incremental learning algorithm directly in the basic feature learning model so that the model is updated locally in response to new data and topology changes.
Step S24: After the output layer of the new multi-layer GAT network, add a temporal attention layer to calculate the attention weights of node features at different timestamps. The weight calculation is based on the rate of change of the node features: the greater the rate of change, the higher the attention weight. The temporal features output by the temporal attention layer are fused with the preliminary feature vector output by the basic feature learning model to generate the final fused feature vector of each node and link.
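The rate-of-change weighting of Step S24 can be sketched as follows. This assumes the rate of change at a timestamp is the L1 distance to the previous feature vector, that weights are obtained by softmax, and that fusion with the preliminary feature vector is done by concatenation. None of these choices is stated in the source, and the name `temporal_attention` is hypothetical.

```python
import math

def temporal_attention(history):
    """history: feature vectors of one node, ordered by timestamp.

    Returns (attention weights per timestamp, attention-weighted summary vector).
    """
    # Rate of change: L1 distance to the previous vector (0 for the first step).
    rates = [0.0] + [
        sum(abs(c - p) for c, p in zip(cur, prev))
        for prev, cur in zip(history, history[1:])
    ]
    # Softmax so that a larger rate of change gets a larger weight.
    m = max(rates)
    exps = [math.exp(r - m) for r in rates]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(history[0])
    summary = [sum(w * vec[k] for w, vec in zip(weights, history)) for k in range(dim)]
    return weights, summary

history = [[0.1, 0.1], [0.1, 0.2], [0.9, 0.8]]   # hypothetical node features
weights, temporal = temporal_attention(history)
preliminary = [0.3, 0.7]                          # hypothetical GAT output
final_vector = preliminary + temporal             # fusion by concatenation (assumed)
```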

[0020] Furthermore, in step S3, nodes, links, fault types, alarm information, and maintenance plans in the complex network are defined as entities in the knowledge graph. The connection relationship between nodes and links, the association relationship between faults and alarms, and the correspondence relationship between faults and maintenance plans are defined as relationships in the fault knowledge graph. The fault knowledge graph is constructed, and the fault diagnosis experience of domain experts and the fault association relationship in historical fault cases are entered into the fault knowledge graph to form fault association rules. Based on the online incremental learning algorithm, the entities, relationships, and fault association rules of the fault knowledge graph are updated in real time.
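One simple way to picture the entities and relationships described above is as a set of (head, relation, tail) triples. The sketch below is an assumed in-memory representation, not the storage scheme of the source; the class name, relation names, and identifiers are all hypothetical.

```python
from collections import defaultdict

class FaultKnowledgeGraph:
    """Minimal triple store: entities are strings, relations are labeled edges."""

    def __init__(self):
        self.triples = set()
        self.by_head = defaultdict(set)

    def add(self, head, relation, tail):
        """Insert one relationship; adding the same triple twice has no effect."""
        self.triples.add((head, relation, tail))
        self.by_head[head].add((relation, tail))

    def tails(self, head, relation):
        """All entities reachable from `head` over `relation`, sorted for stability."""
        return sorted(t for r, t in self.by_head[head] if r == relation)

kg = FaultKnowledgeGraph()
kg.add("node:gateway-1", "connected_by", "link:L7")               # node-link connection
kg.add("fault:link_down", "raises", "alarm:packet_loss_high")     # fault-alarm association
kg.add("fault:link_down", "repaired_by", "plan:reroute_traffic")  # fault-plan correspondence
plans = kg.tails("fault:link_down", "repaired_by")
```

Incremental updates then amount to further `add` calls as new cases and expert rules arrive.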

[0021] Furthermore, the fused feature vectors are input into a softmax classifier to calculate the failure probability of each node and link. Suspected faulty nodes and links are input into a fault knowledge graph, matched against fault association rules in the fault knowledge graph, and the matching degree of each rule is calculated. Fault propagation confidence is introduced, and based on a dynamic graph topology model, the propagation path corresponding to the initial fault root cause is traced back, the propagation confidence is calculated, and finally the final fault root cause is output. The final fault root cause includes the fault root cause node ID, fault root cause link, fault type, fault occurrence time, and fault level.
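The classification and path-scoring ideas above can be sketched briefly. The softmax step follows the standard definition. The suspect threshold of 0.5 and the choice to score a propagation path as the product of its edge weights are assumptions for illustration only, since the source does not define how propagation confidence is computed.

```python
import math

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def suspects(logits_by_entity, threshold=0.5):
    """logits_by_entity: {entity: [normal_logit, fault_logit]}.

    Returns {entity: failure probability} for entities at or above the threshold.
    """
    probs = {e: softmax(l)[1] for e, l in logits_by_entity.items()}
    return {e: p for e, p in probs.items() if p >= threshold}

def path_confidence(path, edge_weights):
    """Assumed definition: product of the edge weights along the path."""
    conf = 1.0
    for u, v in zip(path, path[1:]):
        conf *= edge_weights[(u, v)]
    return conf

logits = {"node:A": [0.0, 2.0], "node:B": [2.0, 0.0]}   # hypothetical classifier outputs
flagged = suspects(logits)
confidence = path_confidence(
    ["node:A", "node:B", "node:C"],
    {("node:A", "node:B"): 0.5, ("node:B", "node:C"): 0.8},
)
```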

[0022] Furthermore, the specific method for predicting the impact range of a fault is as follows:
Step S41: Divide the nodes in the complex network into susceptible state, infected state, exposed state, and recovered state.
Step S42: Starting from the root cause node of the fault, track the fault propagation path in real time based on the edge weights of the dynamic graph topology model, mark the exposed nodes and infected nodes in the propagation process, record the propagation time and propagation steps, and generate a visualized fault propagation path graph.
Step S43: Calculate the propagation range of the fault at different time points, predict the number of infected nodes and the number of affected links when the fault reaches a steady state, and mark high-risk nodes and high-risk links.
Step S44: Integrate the root cause of the fault, propagation path, scope of impact, and information on high-risk nodes and links, and output complete and interpretable diagnostic results.
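The four node states of Step S41 resemble an SEIR-style compartment model, and Steps S42 and S43 can be sketched as a simple discrete-time simulation over the weighted graph. The version below is deterministic and heavily simplified: a susceptible node becomes exposed when an infected neighbor reaches it over an edge whose weight meets a threshold, an exposed node becomes infected one step later, and an infected node recovers after one step. These transition rules, the threshold, and the function name are assumptions, not the method of the source.

```python
def simulate_propagation(edges, root, threshold=0.5, max_steps=10):
    """edges: {(u, v): weight}, directed from u to v.

    Returns (step at which each node became infected, final state of every node).
    States: "S" susceptible, "E" exposed, "I" infected, "R" recovered.
    """
    state = {n: "S" for edge in edges for n in edge}
    state[root] = "I"
    infected_at = {root: 0}
    for step in range(1, max_steps + 1):
        nxt = dict(state)
        for (u, v), w in edges.items():
            # Exposure: infected source, susceptible target, strong enough edge.
            if state[u] == "I" and state[v] == "S" and w >= threshold:
                nxt[v] = "E"
        for n, s in state.items():
            if s == "E":
                nxt[n] = "I"
                infected_at[n] = step
            elif s == "I":
                nxt[n] = "R"
        if nxt == state:   # steady state reached
            break
        state = nxt
    return infected_at, state

edges = {("A", "B"): 0.9, ("B", "C"): 0.7, ("A", "D"): 0.2}   # hypothetical topology
infected_at, final_state = simulate_propagation(edges, "A")
```

Here `infected_at` gives the propagation steps along the path and its size gives the predicted number of infected nodes at steady state; a stochastic variant would treat edge weights as transmission probabilities instead of comparing them with a threshold.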

[0023] Furthermore, based on centrality analysis, fault impact range, and fault level, combined with equipment aging status and maintenance costs, faulty nodes and links are prioritized. Differentiated maintenance strategies are generated for faults of different priorities and types, and maintenance operations are performed based on these differentiated maintenance strategies.
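As a rough illustration of the prioritization described above, the sketch below combines the listed factors into a single weighted score and maps score bands to strategies. The factor weights, the negative weight on cost, the score thresholds, and the strategy labels are all hypothetical; the source names the factors but not how they are combined.

```python
# Hypothetical factor weights; every factor is assumed to be normalized to [0, 1].
FACTOR_WEIGHTS = {
    "centrality": 0.3,
    "impact": 0.3,
    "level": 0.2,
    "aging": 0.1,
    "cost": -0.1,   # assumption: higher maintenance cost lowers the priority slightly
}

def priority_score(fault):
    return sum(w * fault[k] for k, w in FACTOR_WEIGHTS.items())

def strategy_for(score):
    """Map a priority score to a differentiated strategy (assumed thresholds)."""
    if score >= 0.6:
        return "immediate repair"
    if score >= 0.3:
        return "scheduled maintenance"
    return "monitor"

faults = [
    {"id": "sensor-12", "centrality": 0.1, "impact": 0.1, "level": 0.3,
     "aging": 0.9, "cost": 0.1},
    {"id": "substation-3", "centrality": 0.9, "impact": 0.8, "level": 1.0,
     "aging": 0.4, "cost": 0.7},
]
ranked = sorted(faults, key=priority_score, reverse=True)
plan = [(f["id"], strategy_for(priority_score(f))) for f in ranked]
```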

[0024] Furthermore, it also includes repeating steps S1-S4 to achieve continuous closed-loop operation of intelligent fault diagnosis and maintenance of complex networks.

[0025] The principle and effect of this solution are as follows: 1. Compared with existing technologies, this invention forms a standardized data matrix by standardizing and preprocessing multi-source heterogeneous data of complex networks. It then constructs a dynamic graph topology model by combining the actual physical structure of the network and the information flow dependencies, and embeds a real-time update algorithm. This deep integration of the standardized data matrix and the dynamic graph topology model enables the accurate capture of the characteristics of core nodes and key links, achieving feature representation from a global perspective of complex networks. This avoids false alarms and missed alarms caused by single-point monitoring and the separation of topology and data, thereby improving the comprehensiveness and accuracy of fault diagnosis.

[0026] 2. Compared with existing technologies, by constructing a new multi-layer GAT network, adding residual connections between layers and a dropout regularization layer between the attention hidden layer and the output layer, the vanishing gradient and overfitting problems of the model are effectively alleviated. The basic feature learning model is obtained by pre-training with historical data, and an online incremental learning algorithm is introduced and accurately matched with the real-time update algorithm of the dynamic graph topology model. It can adapt to network topology changes and operating condition fluctuations in real time without retraining the entire model, thus reducing the model maintenance cost. At the same time, the introduction of a temporal attention mechanism can accurately capture the temporal dependencies of node operating status data, further optimize the feature extraction effect, and improve the accuracy of fault feature identification.

[0027] 3. Compared with existing technologies, this invention constructs a fault knowledge graph that is updated by the online incremental learning algorithm and forms standardized fault association rules. Combined with the final fused feature vector, it can accurately locate the root cause of a fault, predict the scope of the fault's impact, and output interpretable diagnostic results that make the fault reasoning logic explicit. This addresses the "black box" defect of existing intelligent diagnostic models, which are difficult to interpret, improves the trust of operation and maintenance personnel in the diagnostic results, and supports practical application of the method.

[0028] 4. Compared with existing technologies, this method generates differentiated maintenance strategies and executes maintenance operations based on interpretable diagnostic results. At the same time, it establishes a closed-loop feedback mechanism to feed the maintenance results back to the fault knowledge graph and the multi-layer GAT network in real time. This enables the updating of fault knowledge graph rules and the optimization of multi-layer GAT network parameters, continuously improving the accuracy of fault diagnosis and the adaptability of maintenance strategies. It forms a complete closed loop of "feature learning - fault diagnosis - maintenance execution - feedback optimization", which is suitable for the operation and maintenance needs of large-scale, dynamic, and highly coupled complex networks.

Attached Figure Description

[0029] To more clearly illustrate the technical solutions in the embodiments of this application, the accompanying drawings used in the description of the embodiments will be briefly introduced below. Obviously, the accompanying drawings described below are only some embodiments of this application. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.

[0030] Figure 1 is an overall flowchart of the intelligent fault diagnosis and maintenance method for complex networks proposed in an embodiment of this application.
Figure 2 shows a simulated and visualized complex network structure used in the method.
Figure 3 is a detailed flowchart of step S1 of the method.
Figure 4 is a detailed flowchart of step S2 of the method.
Figure 5 is a detailed flowchart of step S4 of the method.

Detailed Implementation

[0031] To further illustrate the technical means and effects of the present invention in achieving its intended purpose, the specific implementation methods, structures, features, and effects of the present invention are described in detail below in conjunction with the accompanying drawings and preferred embodiments.

[0032] Embodiment, as shown in Figures 1-5: A method for intelligent fault diagnosis and maintenance of complex networks. The simulated and visualized complex network structure is shown in Figure 2, and the method is detailed in Figure 1, Figure 3, Figure 4, and Figure 5. The method includes the following steps: Step S1: Collect multi-source heterogeneous data in the complex network and perform standardized preprocessing to form a standardized data matrix. Based on the actual physical structure and information flow dependencies of the complex network, construct a dynamic graph topology model and embed a real-time update algorithm into it, then deeply integrate the standardized data matrix with the dynamic graph topology model to obtain the fused feature matrix of the core nodes and key links of the complex network. Here, the actual physical structure refers to the physical connection form, which is the objective carrier of the physical layer; the information flow dependencies refer to the transmission logic and coupling constraints of data, signals, instructions, and resources in the network, which are the interaction rules of the logic layer and the business layer.

[0033] Specifically: Regarding the collection of multi-source heterogeneous data, in step S1, multi-source heterogeneous data is collected synchronously through sensors, operation and maintenance monitoring systems and log collection tools deployed in the complex network. The multi-source heterogeneous data includes node operating status data, link connection data and fault alarm data.

[0034] Collection metrics for node operating status data: The metrics include, but are not limited to, CPU utilization, memory usage, network bandwidth usage, device operating temperature, operating voltage, and current value. Data is collected once every 1-5 minutes, with the interval adjusted dynamically according to the node type. The collected data is stored uniformly in JSON format, recording the data collection timestamp, node ID, metric name, and corresponding value.
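A single collected sample in the JSON format described above might look like the record produced by the following sketch. The field names (`timestamp`, `node_id`, `metric`, `value`) and the ISO 8601 timestamp encoding are assumptions; the source lists the recorded items but not their exact keys.

```python
import json
from datetime import datetime, timezone

def make_record(node_id, metric, value, ts=None):
    """Serialize one node-status sample as a JSON string (hypothetical field names)."""
    ts = ts or datetime.now(timezone.utc)
    return json.dumps({
        "timestamp": ts.isoformat(),
        "node_id": node_id,
        "metric": metric,
        "value": value,
    })

record = make_record(
    "node-001", "cpu_utilization", 0.73,
    datetime(2026, 1, 1, 0, 0, tzinfo=timezone.utc),
)
parsed = json.loads(record)
```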

[0035] Collection metrics for link connection data: Connection status data is collected for all links in the complex network. The metrics include, but are not limited to, link connection status, transmission rate, transmission delay, packet loss rate, and link load. The collection frequency matches that of the core node to which the link connects. The link, the IDs of the two end nodes, the link type, and the corresponding metric values are recorded synchronously.

[0036] Collection metrics for fault alarm data: Fault alarm information is collected from all nodes and links in the network. The collected information includes alarm ID, alarm type, alarm occurrence time, alarm level, the nodes and links associated with the alarm, and the alarm description. Alarm data is collected and stored immediately upon detection, and the real-time operating data of the corresponding nodes and links is recorded at the same time.

[0037] The collected multi-source heterogeneous data is cleaned and normalized to generate a standardized data matrix. The specific processing steps are as follows: Step S11: An outlier detection algorithm is used to identify and remove outliers in the multi-source heterogeneous data. Missing data in the multi-source heterogeneous data is supplemented by linear interpolation. Units of multi-source heterogeneous data with inconsistent formats are uniformly converted and invalid multi-source heterogeneous data is deleted.

[0038] An outlier detection algorithm based on the 3σ principle is used to identify and remove outliers in the data of each indicator. Missing data is supplemented by linear interpolation to make the missing rate less than 5%. Data with inconsistent formats is converted to a unified unit and invalid data is deleted.
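The 3σ screening and linear interpolation described above can be sketched for a single indicator series as follows. This is a minimal illustration: removed outliers are marked as `None` and then filled in the same way as other missing values, the population standard deviation is used, and leading or trailing gaps are left unfilled. These handling choices and the function names are assumptions.

```python
import statistics

def remove_outliers_3sigma(values):
    """Replace values more than 3 standard deviations from the mean with None."""
    present = [v for v in values if v is not None]
    mean = statistics.fmean(present)
    std = statistics.pstdev(present)
    return [
        None if v is not None and std > 0 and abs(v - mean) > 3 * std else v
        for v in values
    ]

def interpolate_linear(values):
    """Fill interior None gaps by linear interpolation between known neighbors."""
    out = list(values)
    known = [i for i, v in enumerate(out) if v is not None]
    for left, right in zip(known, known[1:]):
        for i in range(left + 1, right):
            frac = (i - left) / (right - left)
            out[i] = out[left] + frac * (out[right] - out[left])
    return out

series = [1.0] * 20 + [100.0]          # hypothetical readings with one spike
cleaned = remove_outliers_3sigma(series)
filled = interpolate_linear([1.0, None, None, 4.0])
```

Note that the 3σ rule needs a reasonably long series: with only a handful of samples, no single value can lie more than three standard deviations from the mean.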

[0039] Step S12: The Min-Max normalization algorithm is used to map all cleaned data to the [0,1] interval, yielding the normalized node operation status data, link connection data, and fault alarm data. Min-Max normalization is a common normalization algorithm and is not described in detail in this application.
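By way of non-limiting illustration, steps S11 and S12 can be sketched for a single metric series as follows. The function names are hypothetical, and two details are assumptions not stated in this application: removed outliers are treated as missing values and then filled by the same linear interpolation, and gaps at either end of the series are filled with the nearest known value. The series is assumed to contain at least one known value.

```python
from statistics import mean, pstdev

def remove_outliers_3sigma(values):
    """Mark values farther than 3 standard deviations from the mean as missing (None)."""
    present = [v for v in values if v is not None]
    mu, sigma = mean(present), pstdev(present)
    return [v if v is not None and abs(v - mu) <= 3 * sigma else None
            for v in values]

def interpolate_linear(values):
    """Fill None entries by linear interpolation between the nearest known neighbours."""
    out = list(values)
    known = [i for i, v in enumerate(values) if v is not None]
    for i, v in enumerate(values):
        if v is not None:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:          # gap at the start: copy nearest known value (assumption)
            out[i] = values[right]
        elif right is None:       # gap at the end: copy nearest known value (assumption)
            out[i] = values[left]
        else:
            t = (i - left) / (right - left)
            out[i] = values[left] + t * (values[right] - values[left])
    return out

def min_max(values):
    """Map values linearly onto the [0, 1] interval."""
    lo, hi = min(values), max(values)
    if hi == lo:                  # constant series: avoid division by zero
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]
```

Note that with very short series a single extreme value cannot exceed the 3σ bound, so the sketch is only meaningful for series of reasonable length.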

[0040] Step S13: Associate the normalized node operation status data, link connection data, and fault alarm data with their nodes and links to construct a standardized data matrix. The rows of the standardized data matrix correspond to the nodes and links in the network, the columns correspond to the collected indicators, and the elements are the standardized indicator values of the corresponding nodes and links. A timestamp column is also added to the standardized data matrix for subsequent time series feature analysis.

[0041] The dynamic graph topology model of step S1 is constructed as follows. Step S14: Nodes in the complex network are directly defined as nodes of the graph topology. The attributes of the nodes include node ID, node type, node location, and core level. The physical connections, information flow interactions, and functional dependencies between nodes are defined as edges of the graph topology. The attributes of the edges include link, link type, and connection strength.

[0042] Step S15: Reassign the weights of the edges in the graph topology. The weight of each edge is calculated by a weighted summation method based on the link's transmission rate, transmission delay, load conditions, and the coupling between nodes. The weight values lie in the interval [0,1].

[0043] Specifically, the weighted summation method is: w = α × (1 - transmission delay / maximum delay) + β × (transmission rate / maximum rate) + γ × (1 - load rate), where w is the edge weight and α, β, and γ are weight coefficients satisfying α + β + γ = 1. The coefficients can be dynamically adjusted according to the network type (e.g., for power grids, α = 0.4, β = 0.3, γ = 0.3; for communication networks, α = 0.3, β = 0.4, γ = 0.3).
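As a minimal sketch of the weighted summation, the following function evaluates the edge weight for one link. The function and parameter names are hypothetical; the default coefficients are the power grid example values given above, and the final clamp to [0,1] is an added safeguard for out-of-range inputs rather than part of the stated formula.

```python
def edge_weight(delay, max_delay, rate, max_rate, load_rate,
                alpha=0.4, beta=0.3, gamma=0.3):
    """w = alpha*(1 - delay/max_delay) + beta*(rate/max_rate) + gamma*(1 - load_rate)."""
    if abs(alpha + beta + gamma - 1.0) > 1e-9:
        raise ValueError("alpha + beta + gamma must equal 1")
    w = (alpha * (1 - delay / max_delay)
         + beta * (rate / max_rate)
         + gamma * (1 - load_rate))
    # Clamp in case a measurement exceeds its stated maximum (assumption).
    return min(1.0, max(0.0, w))

# Same link measured under the two example coefficient sets.
power_grid_w = edge_weight(50, 100, 1000, 1000, 0.2)
comm_net_w = edge_weight(50, 100, 1000, 1000, 0.2, alpha=0.3, beta=0.4, gamma=0.3)
```

For this link (half the maximum delay, full rate, 20% load), the power grid coefficients give 0.4 × 0.5 + 0.3 × 1 + 0.3 × 0.8 = 0.74 and the communication network coefficients give 0.3 × 0.5 + 0.4 × 1 + 0.3 × 0.8 = 0.79.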

[0044] Step S16: Establish a real-time update algorithm. The topology structure is automatically updated when any of the following three situations is detected: a node or link is added or deleted; the connection status of a link changes; or the weight of an edge in the graph topology changes by more than a threshold, which is set to 0.1. The update frequency of the topology structure is synchronized with the collection frequency of the multi-source heterogeneous data. After each update, a new dynamic graph topology model is generated.
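The three update triggers of step S16 can be sketched as a comparison between two consecutive topology snapshots. The snapshot layout (a set of node IDs and a dictionary of edges carrying an `up` flag and a `weight`) is a hypothetical representation chosen for illustration only.

```python
def topology_changed(old_nodes, new_nodes, old_edges, new_edges, threshold=0.1):
    """Return True if any of the three update triggers fires.

    old_edges / new_edges: dict mapping (u, v) -> {"up": bool, "weight": float}.
    """
    # Trigger 1: a node or link was added or deleted.
    if set(old_nodes) != set(new_nodes) or set(old_edges) != set(new_edges):
        return True
    for key, new in new_edges.items():
        old = old_edges[key]
        # Trigger 2: the link connection status changed.
        if old["up"] != new["up"]:
            return True
        # Trigger 3: the edge weight changed by more than the threshold.
        if abs(new["weight"] - old["weight"]) > threshold:
            return True
    return False
```

In such a sketch the check would run once per data collection cycle, matching the stated synchronization between the topology update frequency and the collection frequency.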

[0045] The standardized data matrix and the dynamic graph topology model are deeply integrated to obtain a fused feature matrix of the core nodes and key links of the complex network. Specifically, step S17: The standardized data matrix and the dynamic graph topology model are dimensionally concatenated to form an initial fused feature matrix. The concatenation method is as follows: the dynamic graph topology model is added as an additional column to the end of the standardized data matrix to form the initial fused feature matrix, ensuring that the features of each node and link simultaneously include its own operating status data and topology association data.

[0046] Step S18: Introduce an attention mechanism, calculate the attention weight of each node and link, and perform weighted processing on the initial fusion feature matrix of step S17 using the attention weight.

[0047] Specifically, the attention weight of core nodes and key links is set to 0.7-0.9, and the attention weight of ordinary nodes and links is set to 0.1-0.3. The initial fusion feature matrix is weighted by the attention weights according to the formula F = W × F0, where F is the final fusion feature matrix, W is the attention weight matrix, and F0 is the initial fusion feature matrix.
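One way to read F = W × F0 is to take W as a diagonal matrix holding one attention weight per row (that is, per node or link), so that the product scales each row of F0 by its weight. This reading is an assumption, since the shape of W is not specified here; the function name and the specific values 0.8 and 0.2 are likewise illustrative choices within the stated ranges.

```python
def apply_attention(f0, row_weights):
    """Compute F = W x F0 for diagonal W: scale each row of f0 by its attention weight."""
    if len(f0) != len(row_weights):
        raise ValueError("one attention weight is required per row")
    return [[w * x for x in row] for row, w in zip(f0, row_weights)]

# Row 0 is a core node (weight 0.8); row 1 is an ordinary node (weight 0.2).
f0 = [[1.0, 0.5], [0.5, 1.0]]
f = apply_attention(f0, [0.8, 0.2])
```

With these inputs the core node row is scaled to roughly [0.8, 0.4] and the ordinary node row to roughly [0.1, 0.2], so core elements dominate the fused features.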

[0048] Step S19: Manually set the ideal variance contribution rate; in this embodiment it is set to 86%. Then calculate the variance contribution rate of the weighted initial fusion feature matrix. If the variance contribution rate is greater than or equal to the ideal variance contribution rate, the fusion is deemed effective and the final fusion feature matrix is output. If the variance contribution rate is less than the ideal variance contribution rate, the attention weight coefficients in the attention mechanism are readjusted until the fusion is effective.

[0049] Step S1 thus yields a fused feature matrix containing the features of the core nodes and key links of the complex network. The method then proceeds to step S2, the essence of which is to achieve dynamic learning of the fused features based on a graph attention network (GAT), combined with online incremental learning and a temporal attention mechanism, so as to improve the accuracy of feature extraction and the dynamic adaptation capability of the model.

[0050] Specifically, in step S2 a new multi-layer GAT network is constructed. The first layer is the input layer, which receives the fused feature matrix from step S1. The second layer is an attention hidden layer that employs a multi-head attention mechanism with 4-8 attention heads and automatically learns the association features between nodes and topology in the fused feature matrix. The third layer is the output layer, which outputs preliminary feature vectors of the nodes and links in the fused feature matrix. Residual connections are added between the layers of the GAT network to alleviate the vanishing gradient problem, and a dropout regularization layer with a dropout probability of 0.3-0.5 is added between the attention hidden layer and the output layer to alleviate overfitting. The weight matrix and bias vector of the network are then initialized: the weight matrix uses the Xavier initialization method and the bias vector is initialized to 0. The learning rate is set to 0.001-0.01 and the number of iterations is set to 100-200. After the set number of iterations is completed, construction of the new multi-layer GAT network is finished.
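For orientation, a single attention head of the hidden layer can be sketched in plain Python using the standard graph attention formulation, in which the score for a node pair is LeakyReLU(aᵀ[W·h_i ‖ W·h_j]) normalized by softmax over each node's neighbours. This is a generic illustration and not necessarily the exact layer of this application: the function names, the negative slope of 0.2, and the inclusion of self-loops are assumptions, and residual connections, dropout, and the 4-8 parallel heads are omitted for brevity.

```python
import math

def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x

def gat_head(h, neighbors, w, a):
    """One graph attention head.

    h:         list of N input feature vectors (length F_in each)
    neighbors: dict node index -> list of neighbour indices (self-loop included)
    w:         weight matrix with F_out rows of length F_in
    a:         attention vector of length 2 * F_out
    Returns (output features, attention) where attention[i][j] is alpha_ij.
    """
    # Linear projection z_i = W h_i for every node.
    z = [[sum(w_rk * h_k for w_rk, h_k in zip(row, h_i)) for row in w] for h_i in h]
    out, attention = [], []
    for i in range(len(h)):
        nbrs = neighbors[i]
        # Unnormalized score e_ij = LeakyReLU(a . [z_i || z_j]).
        scores = [leaky_relu(sum(a_k * v for a_k, v in zip(a, z[i] + z[j])))
                  for j in nbrs]
        # Softmax over the neighbourhood (shifted by the max for stability).
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        alphas = [e / total for e in exps]
        attention.append(dict(zip(nbrs, alphas)))
        # Aggregate neighbour projections weighted by attention.
        out.append([sum(al * z[j][d] for al, j in zip(alphas, nbrs))
                    for d in range(len(w))])
    return out, attention
```

A multi-head layer would run several such heads with independent `w` and `a` and concatenate or average their outputs.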

[0051] Next, the new multi-layer GAT network is pre-trained using historical fault data and normal operation data to obtain a basic feature learning model. The basic feature learning model introduces an online incremental learning algorithm, which is matched with the real-time update algorithm. Then, the basic feature learning model introduces a temporal attention mechanism to capture the temporal dependencies of node operation status data, optimize the feature extraction effect, and output the final fused feature vector.

[0052] Specifically: In step S2, the construction of the basic feature learning model is carried out in the following steps: Step S21: Construction of training dataset: Collect historical normal operation data and historical failure data of complex network, and divide them into training set and validation set in a ratio of 7:3 or 8:2. Among them, historical failure data covers sudden failure, slow deterioration failure and cascading failure.

[0053] Step S22: Input the training set into the new multi-layer GAT network, and update the parameters of the new multi-layer GAT network through the backpropagation algorithm until the network meets the accuracy requirements. Specifically, the gradient of the loss function with respect to each parameter is calculated, and the parameters are iteratively updated using the gradient descent method to minimize the loss function. Gradient calculation: for the weight matrix W and bias vector b of the new multi-layer GAT network, the gradients are the partial derivatives of the loss function L with respect to W and b, respectively:

$$\frac{\partial L}{\partial W} = \frac{1}{N}\sum_{i=1}^{N}\frac{\partial L_i}{\partial W}, \qquad \frac{\partial L}{\partial b} = \frac{1}{N}\sum_{i=1}^{N}\frac{\partial L_i}{\partial b}$$

where L is the overall loss function, N is the number of training set samples, $L_i$ is the loss value for the i-th sample, $\partial L_i / \partial W$ is the partial derivative of the loss of the i-th sample with respect to the weight matrix W, and $\partial L_i / \partial b$ is the partial derivative of the loss of the i-th sample with respect to the bias vector b.

[0054] Based on the gradients calculated above, the weight matrix W and the bias vector b are updated using the gradient descent method:

$$W_{t+1} = W_t - \eta\,\frac{\partial L}{\partial W_t}, \qquad b_{t+1} = b_t - \eta\,\frac{\partial L}{\partial b_t}$$

where $W_t$ and $b_t$ are the weight matrix and bias vector at the t-th iteration, $W_{t+1}$ and $b_{t+1}$ are the weight matrix and bias vector after the (t+1)-th iteration, and $\eta$ is the learning rate (ranging from 0.001 to 0.01), which controls the step size of parameter updates and helps avoid gradient explosion or gradient vanishing.
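The gradient descent update can be illustrated on a deliberately small stand-in model. The sketch below uses a scalar linear model with a squared-error loss per sample, both of which are assumptions made purely for illustration (the loss function of the GAT network is not specified here), and it assumes the overall loss is the mean of the per-sample losses.

```python
def mse(w, b, xs, ys):
    """Overall loss: mean of per-sample squared errors."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def gradient_step(w, b, xs, ys, lr=0.01):
    """One gradient descent update of w and b with learning rate lr."""
    n = len(xs)
    # Per-sample gradients of (w*x + b - y)^2, averaged over the training set.
    grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
    grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
    return w - lr * grad_w, b - lr * grad_b

xs, ys = [0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0]
loss_before = mse(0.0, 0.0, xs, ys)
w1, b1 = gradient_step(0.0, 0.0, xs, ys, lr=0.01)
loss_after = mse(w1, b1, xs, ys)
```

Starting from w = 0 and b = 0, the averaged gradients are -17 and -8, so one step with a learning rate of 0.01 moves the parameters to w = 0.17 and b = 0.08 and lowers the loss.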

[0055] For example, every 20 iterations, the model performance is validated using the validation set. If the validation set accuracy is ≥90%, pre-training is stopped; if the validation set accuracy is <90%, the learning rate and dropout probability are adjusted and iterative training continues until the accuracy requirement is met.

[0056] Once the accuracy requirement is met, training is complete and the method proceeds to step S23. Step S23: The pre-trained new multi-layer GAT network is used as the basic feature learning model, and its model parameters and network structure are saved for subsequent online incremental learning. The online incremental learning algorithm is inserted directly into the basic feature learning model to locally update the parts affected by new data and topological changes.

[0057] The online incremental learning algorithm performs local updates for new data and topology changes as follows. First, new operational data, new fault data, and topology change data of the complex network are collected in real time and preprocessed with the same standardization methods as in step S1 to generate incremental feature data. Then, the incremental feature data is input into the basic feature learning model, updating only the parameters of the attention hidden layer and the output layer while keeping the input layer parameters fixed to reduce model training costs. During the update process, gradient descent is still used to accelerate parameter convergence, and the update frequency is synchronized with the topology update frequency. After each update, the model performance is verified using new fault data. If the fault feature extraction accuracy is ≥97%, the update is considered valid and the updated model is saved. If the accuracy is <97%, the model is rolled back to the previous version, the update parameters are readjusted, and the update is performed again.
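The accept-or-roll-back logic of the incremental update can be sketched as follows. The model layout (a dictionary with `input`, `hidden`, and `output` parameter groups), the callback signatures, and the cap on retry attempts are all hypothetical; the actual training and evaluation routines are left to the caller.

```python
import copy

def incremental_update(model, update_fn, evaluate_fn,
                       min_accuracy=0.97, max_attempts=3):
    """Try a local update; keep it only if accuracy reaches min_accuracy.

    update_fn(hidden, output, attempt) mutates the hidden and output parameter
    groups in place; the input layer is never passed in, so it stays frozen.
    evaluate_fn(model) returns the accuracy measured on new fault data.
    Returns (model, accepted).
    """
    for attempt in range(max_attempts):
        # Work on a copy so the saved version remains available for rollback.
        candidate = copy.deepcopy(model)
        update_fn(candidate["hidden"], candidate["output"], attempt)
        if evaluate_fn(candidate) >= min_accuracy:
            return candidate, True
        # Otherwise discard the candidate (rollback) and retry with
        # readjusted update parameters, signalled here by the attempt index.
    return model, False
```

Passing the attempt index to `update_fn` is one simple way to let the caller readjust the update parameters between retries.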

[0058] Step S24: After the output layer of the new multi-layer GAT network, add a temporal attention layer to calculate the attention weights of node features at different timestamps. The weight calculation is based on the rate of change of node features: the greater the rate of change, the higher the attention weight. The temporal features output by the temporal attention layer are fused with the preliminary feature vector output by the basic feature learning model to generate the final fused feature vector of the nodes and links. The fusion method is vector concatenation, i.e., the final fused feature vector is the concatenation of the preliminary feature vector and the temporal features. The final fused feature vector is output to the next step for fault diagnosis inference and is also saved for subsequent model updates and maintenance feedback. The method then proceeds to step S3.
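A minimal sketch of the temporal attention layer for a single node is given below. The exact weighting function is not specified, so the sketch makes two assumptions: the rate of change at each timestamp is the Euclidean distance to the previous feature vector (zero for the first timestamp), and the rates are turned into weights with a softmax so that larger rates receive larger weights. Function names are hypothetical.

```python
import math

def temporal_attention(history):
    """history: feature vectors of one node, oldest first.

    Returns (weights, temporal_feature) where weights sum to 1.
    """
    # Rate of change per timestamp (assumed: distance to the previous vector).
    rates = [0.0] + [math.dist(history[t], history[t - 1])
                     for t in range(1, len(history))]
    # Softmax over the rates: a higher rate of change gives a higher weight.
    top = max(rates)
    exps = [math.exp(r - top) for r in rates]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(history[0])
    temporal = [sum(w * vec[d] for w, vec in zip(weights, history))
                for d in range(dim)]
    return weights, temporal

def fuse(preliminary, temporal):
    """Vector concatenation of the preliminary and temporal features."""
    return list(preliminary) + list(temporal)
```

The fused vector therefore has the combined length of its two parts, with the preliminary features first.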

[0059] Step S3: Construct a fault knowledge graph for the online incremental learning algorithm and form fault association rules. By fusing feature vectors and fault association rules from the fault knowledge graph, accurately locate the root cause of the fault, predict the scope of the fault's impact, and output interpretable diagnostic results.

[0060] The key point of step S3 is to construct the fault knowledge graph of the online incremental learning algorithm and form fault association rules. Then, the fault association rules are combined with the fusion feature vector obtained in step S2 to accurately locate the root cause of the fault, and the scope of the fault impact is predicted based on the root cause of the fault, and interpretable diagnostic results are output.

[0061] In step S3, nodes, links, fault types, alarm information, and maintenance plans in the complex network are defined as entities in the fault knowledge graph. The connection relationships between nodes and links, the association relationships between faults and alarms, and the correspondence between faults and maintenance plans are defined as relationships in the fault knowledge graph. The fault knowledge graph is constructed by entering the fault diagnosis experience of domain experts and the fault association relationships found in historical fault cases, thereby forming fault association rules, such as: core switch alarm -> multiple downstream devices not responding -> core switch port fault. Each rule is assigned a confidence value of 0.8-1.0. As described above, an online incremental learning algorithm is used to locally update the parts affected by new data and topology changes. Similarly, based on the online incremental learning algorithm, the entities, relationships, and fault association rules of the fault knowledge graph are updated in real time. For example, the knowledge graph is updated once for every 10 new fault cases to ensure the accuracy and completeness of the fault association rules.

[0062] The fault association rules are combined with the fused feature vector obtained in step S2 to accurately locate the root cause of the fault as follows: The fused feature vector is input into the softmax classifier to calculate the fault probability of each node and link. If the fault probability is ≥0.8, it is judged as a suspected fault node and link, and if the fault probability is <0.8, it is judged as a normal node and link.
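The thresholding step can be sketched as follows, assuming the classifier produces two logits per node or link (one for "normal", one for "fault"). The two-class layout, the logit values, and the function names are hypothetical; the layer that maps the fused feature vector to logits is omitted.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    top = max(logits)
    exps = [math.exp(x - top) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def flag_suspects(logits_by_element, threshold=0.8):
    """Return {element_id: fault probability} for elements at or above the threshold."""
    suspects = {}
    for element_id, (normal_logit, fault_logit) in logits_by_element.items():
        p_fault = softmax([normal_logit, fault_logit])[1]
        if p_fault >= threshold:
            suspects[element_id] = p_fault
    return suspects

suspects = flag_suspects({"node_A": (0.0, 2.0),
                          "node_B": (1.0, 0.0),
                          "link_L1": (0.0, 1.0)})
```

In this example only `node_A` is flagged: its fault probability is about 0.88, while `link_L1` reaches about 0.73 and `node_B` about 0.27, both below the 0.8 threshold.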

[0063] Suspected faulty nodes and links are input into the fault knowledge graph, and the fault association rules in the fault knowledge graph are matched. The fault cause corresponding to the rule with the highest matching degree is taken as the preliminary root cause of the fault.

[0064] The matching degree of each rule is calculated, and a fault propagation confidence is introduced. Based on the dynamic graph topology model, the propagation path corresponding to the preliminary root cause is traced backward; because the dynamic graph topology model exposes the paths along which a fault can propagate, this tracing can be performed by reverse reasoning. The propagation confidence is calculated as the sum of the edge weights on the propagation path. If the propagation confidence is ≥0.75, the preliminary root cause is determined to be the final root cause of the fault; if the propagation confidence is <0.75, the fault association rules are rematched until the final root cause is determined.
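The propagation confidence check can be sketched directly from its definition as the sum of edge weights along the traced path. The path and edge weight representations and the function names are hypothetical.

```python
def propagation_confidence(path, edge_weights):
    """Sum of the edge weights along a path given as a list of node IDs."""
    return sum(edge_weights[(u, v)] for u, v in zip(path, path[1:]))

def confirm_root_cause(path, edge_weights, threshold=0.75):
    """True if the preliminary root cause is accepted as the final root cause."""
    return propagation_confidence(path, edge_weights) >= threshold

weights = {("A", "B"): 0.5, ("B", "C"): 0.4}
two_hop = propagation_confidence(["A", "B", "C"], weights)
```

Here the two-hop path A -> B -> C sums to about 0.9 and passes the 0.75 threshold, while the single hop A -> B (0.5) would not. Because a plain sum grows with path length, it can exceed 1 on long paths; whether the value is normalized is not stated here.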

[0065] Finally, the final root cause of the fault is output, which includes the root cause node ID, root cause link, fault type, fault occurrence time, and fault level.

[0066] Once the final root cause of the fault is determined, the scope of the fault's impact is predicted from it. Specifically, an SEIR propagation dynamics model is introduced, which tracks the fault propagation path in real time and predicts the scope of the fault's impact based on the dynamic graph topology model and the root cause. The specific steps are as follows. First, the SEIR propagation dynamics model divides the nodes in the complex network into four states: susceptible (S, normal nodes), exposed (E, nodes affected by the fault but not yet faulty), infected (I, faulty nodes), and recovered (R, repaired nodes). The parameters of the SEIR propagation dynamics model are initialized as follows: propagation rate β (0.3-0.5), latency rate σ (0.2-0.4), and recovery rate γ (0.1-0.3). The parameters are dynamically adjusted according to the network type.

[0067] Then, starting from the final root cause node of the failure, based on the edge weights of the dynamic graph topology model and combined with the SEIR model, the failure propagation path is tracked in real time, the exposed nodes and infected nodes in the propagation process are marked, the propagation time and propagation steps are recorded, and a visualized failure propagation path graph is generated.

[0068] Next, based on the SEIR propagation dynamics model, the propagation range of the fault at different time points is calculated, the number of infected nodes and the number of affected links when the fault reaches a steady state are predicted, and high-risk nodes (such as core hub nodes and exposed-state nodes) and high-risk links are marked.
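One possible way to combine the SEIR model with the weighted topology is a discrete-time, per-node probabilistic update, sketched below. This particular formulation is an assumption and not the method of this application: each node carries probabilities of being in states S, E, I, and R; an infected neighbour transmits the fault over an edge with probability β × edge weight per step; and σ and γ act as per-step transition probabilities. All names are hypothetical.

```python
def seir_step(state, in_edges, beta=0.4, sigma=0.3, gamma=0.2):
    """Advance every node's S/E/I/R probabilities by one time step.

    state:    node -> {"S": p, "E": p, "I": p, "R": p}
    in_edges: node -> list of (neighbour, edge_weight) pairs feeding that node
    """
    new_state = {}
    for node, p in state.items():
        # Probability that no infected neighbour transmits the fault this step.
        escape = 1.0
        for nbr, w in in_edges.get(node, []):
            escape *= 1.0 - beta * w * state[nbr]["I"]
        infect = 1.0 - escape
        new_state[node] = {
            "S": p["S"] * (1.0 - infect),
            "E": p["E"] * (1.0 - sigma) + p["S"] * infect,
            "I": p["I"] * (1.0 - gamma) + p["E"] * sigma,
            "R": p["R"] + p["I"] * gamma,
        }
    return new_state

def simulate(root, nodes, in_edges, steps, **rates):
    """Start with the root cause node infected and return the state history."""
    state = {n: {"S": 1.0, "E": 0.0, "I": 0.0, "R": 0.0} for n in nodes}
    state[root] = {"S": 0.0, "E": 0.0, "I": 1.0, "R": 0.0}
    history = [state]
    for _ in range(steps):
        state = seir_step(state, in_edges, **rates)
        history.append(state)
    return history
```

The four probabilities of each node sum to 1 at every step, and nodes whose exposed or infected probability rises above a chosen level could be marked as high-risk in the propagation path graph.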

[0069] Finally, integrate the final root cause of the failure, propagation path, scope of impact, high-risk nodes and link information, and output complete and interpretable diagnostic results, clarifying the diagnostic reasoning logic. For example, the final root cause of the failure is a port failure at node A, which propagates to node B through link L1. It is expected to affect nodes C and D within 1 hour, with node C being the high-risk node, ensuring that maintenance personnel can clearly understand the diagnostic process.

[0070] Finally, the maintenance strategy and maintenance operations are implemented. Therefore, step S4 is set up. Step S4: Based on the interpretable diagnostic results in step S3, a differentiated maintenance strategy and maintenance operations are generated, and a closed-loop feedback mechanism is established. The closed-loop feedback mechanism feeds back the maintenance results to the fault knowledge graph in step S3 and the multi-layer GAT network in step S2.

[0071] The differentiated maintenance strategy and maintenance operations in this application include maintenance priority ranking, differentiated maintenance strategy generation, and maintenance operation execution and monitoring.

[0072] Regarding maintenance priority ranking: Based on centrality analysis, fault impact range, and fault level, combined with equipment aging status and maintenance costs, fault nodes and links are prioritized. The specific steps are as follows. Centrality calculation: Calculate the degree centrality and betweenness centrality of fault nodes. Nodes with degree centrality ≥ 0.8 and betweenness centrality ≥ 0.7 are identified as core hub nodes and prioritized. Priority scoring: A weighted scoring method is used to score the priority of each fault node and link. The scoring formula is: S = 0.4 × fault level score + 0.3 × centrality score + 0.2 × impact range score + 0.1 × maintenance cost score. The component scores are assigned as follows: fault level score (urgent = 10 points, normal = 6 points, warning = 3 points), centrality score (core node = 10 points, ordinary node = 5 points), impact range score (affected nodes ≥ 10 = 10 points, 5-9 = 6 points, < 5 = 3 points), and maintenance cost score (low cost = 10 points, medium cost = 6 points, high cost = 3 points). Priority classification: Based on the scoring results, faulty nodes and links are divided into three priorities: Priority 1 faults (S ≥ 8 points): core hub node faults, emergency faults, and faults with a wide impact range; Priority 2 faults (5 ≤ S < 8 points): ordinary node faults and general faults; Priority 3 faults (S < 5 points): warning level faults and faults with a small impact range. The faults are sorted from highest to lowest priority, with priority 1 faults being handled first.
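The weighted scoring and three-way classification can be sketched as follows. Function and constant names are hypothetical. Two readings are assumptions: the warning-level fault score is taken as 3 points, in line with the 10/6/3 scale used for the other components, and a "core node" for the centrality score is taken to be a node meeting the core hub thresholds above.

```python
LEVEL_SCORE = {"urgent": 10, "normal": 6, "warning": 3}   # warning = 3 is assumed
COST_SCORE = {"low": 10, "medium": 6, "high": 3}

def is_core_hub(degree_centrality, betweenness_centrality):
    return degree_centrality >= 0.8 and betweenness_centrality >= 0.7

def impact_score(affected_nodes):
    if affected_nodes >= 10:
        return 10
    return 6 if affected_nodes >= 5 else 3

def priority_score(level, degree_c, betweenness_c, affected_nodes, cost):
    """S = 0.4*level + 0.3*centrality + 0.2*impact + 0.1*cost."""
    centrality = 10 if is_core_hub(degree_c, betweenness_c) else 5
    return (0.4 * LEVEL_SCORE[level] + 0.3 * centrality
            + 0.2 * impact_score(affected_nodes) + 0.1 * COST_SCORE[cost])

def priority_class(score):
    """Priority 1 for S >= 8, priority 2 for 5 <= S < 8, priority 3 otherwise."""
    if score >= 8:
        return 1
    return 2 if score >= 5 else 3

s_hub = priority_score("urgent", 0.9, 0.8, 12, "high")
s_mid = priority_score("normal", 0.3, 0.2, 6, "medium")
s_low = priority_score("warning", 0.3, 0.2, 2, "high")
```

For example, an urgent fault on a core hub affecting 12 nodes with high repair cost scores 4 + 3 + 2 + 0.3 = 9.3 (priority 1), a normal fault on an ordinary node affecting 6 nodes at medium cost scores 2.4 + 1.5 + 1.2 + 0.6 = 5.7 (priority 2), and a warning on an ordinary node affecting 2 nodes at high cost scores 1.2 + 1.5 + 0.6 + 0.3 = 3.6 (priority 3).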

[0073] Regarding the generation of differentiated maintenance strategies: Priority 1 faults: adopt an emergency repair strategy, with a maintenance time limit of 0.5-2 hours. The maintenance steps include: immediately isolating the faulty node and link to prevent further spread of the fault; dispatching maintenance personnel with dedicated maintenance tools to the site to investigate the specific cause of the fault; performing repair operations; and after repair, checking the operating status of the node and link to ensure that the fault is completely resolved.

[0074] Priority 2 faults: adopt conventional repair strategies, with a maintenance time limit of 2-8 hours. The maintenance steps include: remotely monitoring the operating status of the faulty node and link to make a preliminary judgment on the cause of the fault; arranging maintenance personnel to carry out on-site repair or remote repair; after repair, conducting 1-2 hours of operation monitoring to confirm that the fault has not recurred.

[0075] Priority 3 faults: The maintenance time limit is set to 8-24 hours. The maintenance steps include: continuously monitoring the operating status of the faulty node / link and recording the fault change trend; developing a maintenance plan based on the aging status of the equipment; and performing maintenance operations according to the plan to avoid fault escalation.

[0076] According to the maintenance strategy generated above, perform maintenance operations and monitor the maintenance process and results in real time. The specific steps are as follows: Assign maintenance tasks to corresponding maintenance personnel based on maintenance priority and maintenance strategy, and clarify maintenance responsibilities, maintenance time limits and operating procedures; Maintenance personnel perform maintenance operations according to the maintenance steps, record key data in real time during the maintenance process, and collect the operating status data of faulty nodes / links in real time after maintenance is completed. The monitoring time is set to 1-24 hours (priority 1 monitors for 24 hours, priority 2 monitors for 8 hours, and priority 3 monitors for 1 hour). If no fault recurrence occurs during the monitoring period, the maintenance is considered successful; if a fault recurrence occurs, immediately re-investigate the root cause of the fault and perform secondary maintenance.

[0077] Because maintenance inevitably changes the outcomes observed in the network, a closed-loop feedback mechanism is established to feed the maintenance results back to the fault knowledge graph in step S3 and the multi-layer GAT network in step S2, updating the multi-layer GAT network and the rules of the fault knowledge graph to achieve closed-loop optimization of diagnosis and maintenance. The specific steps are as follows: collect maintenance result data, including maintenance success rate, fault recurrence rate, maintenance time, maintenance cost, and any new fault types and new fault associations discovered during the maintenance process; input the feedback data into the multi-layer GAT network and adjust its attention weights and learning rate; record the new fault types and new fault associations in the fault knowledge graph, and update the fault association rules and their confidence levels to improve the accuracy of fault root cause localization; and, based on the maintenance success rate and maintenance cost data, adjust the maintenance priority scoring model, maintenance time limits, and maintenance steps to optimize the differentiated maintenance strategies, improve maintenance efficiency, and reduce maintenance costs.

[0078] Finally, steps S1-S4 are repeated to achieve continuous closed-loop operation of intelligent fault diagnosis and maintenance for complex networks: real-time data acquisition and updating of the dynamic graph topology model -> feature learning by the new multi-layer GAT network -> interpretable diagnostic results -> differentiated maintenance and feedback optimization. This ensures that the model always adapts to the dynamic changes of complex networks and continuously improves the accuracy of fault diagnosis and the level of intelligent maintenance.

[0079] This invention discloses an intelligent fault diagnosis and maintenance method for complex networks, aiming to solve the technical problems of lack of global modeling, insufficient topology utilization, poor dynamic adaptation capability, weak interpretability, and disconnect between diagnosis and maintenance in existing complex network fault diagnosis technologies. It realizes accurate source tracing, propagation prediction and intelligent maintenance of complex network faults, and adapts to the operation and maintenance needs of large-scale, dynamic and highly coupled complex networks.

[0080] The above description is merely a preferred embodiment of the present invention and is not intended to limit the present invention in any way. Although the present invention has been disclosed above with reference to preferred embodiments, it is not intended to limit the present invention. Any person skilled in the art can make some modifications or alterations to the above-disclosed technical content to create equivalent embodiments without departing from the scope of the present invention. Any simple modifications, equivalent changes and alterations made to the above embodiments based on the technical essence of the present invention without departing from the scope of the present invention shall still fall within the scope of the present invention.

Claims

1. A method for intelligent fault diagnosis and maintenance of complex networks, characterized in that, Includes the following steps: Step S1: Collect multi-source heterogeneous data in complex networks and perform standardized preprocessing to form a standardized data matrix; based on the actual physical structure and information flow dependencies of complex networks, construct a dynamic graph topology model and embed a real-time update algorithm into the dynamic graph topology model, deeply integrate the standardized data matrix with the dynamic graph topology model, and obtain a fusion feature matrix of the core nodes and key links of the complex network. Step S2: Construct a new multi-layer GAT network, with the first layer being the input layer, which receives the fused feature matrix; The second layer is the attention hidden layer, which adopts a multi-head attention mechanism to automatically learn the association features between nodes and topology in the fused feature matrix; The third layer is the output layer, which outputs the preliminary feature vectors of nodes and links in the fused feature matrix. Residual connections are added between each layer of the GAT network, and a dropout regularization layer is added between the attention hidden layer and the output layer. Then, the new multi-layer GAT network is pre-trained using historical fault data and normal operation data to obtain the basic feature learning model. The basic feature learning model introduces an online incremental learning algorithm, which is matched with the real-time update algorithm. Then, the basic feature learning model introduces a temporal attention mechanism to capture the temporal dependencies of node running state data, optimize the feature extraction effect, and output the final fused feature vector. Step S3: Construct a fault knowledge graph for the online incremental learning algorithm and form fault association rules. 
By fusing feature vectors and fault association rules from the fault knowledge graph, accurately locate the root cause of the fault, predict the scope of the fault's impact, and output interpretable diagnostic results. Step S4: Based on the interpretable diagnostic results in Step S3, generate differentiated maintenance strategies and execute maintenance operations, and establish a closed-loop feedback mechanism. The closed-loop feedback mechanism feeds back the maintenance results to the fault knowledge graph in Step S3 and the multi-layer GAT network in Step S2.

2. The intelligent fault diagnosis and maintenance method for complex networks according to claim 1, characterized in that, In step S1, multi-source heterogeneous data is synchronously collected using sensors, operation and maintenance monitoring systems, and log collection tools deployed in the complex network. This multi-source heterogeneous data includes node operating status data, link connection data, and fault alarm data. The collected multi-source heterogeneous data is then cleaned and normalized to generate a standardized data matrix. The specific processing steps are as follows: Step S11: Use an outlier detection algorithm to identify and remove outliers in multi-source heterogeneous data, use linear interpolation to supplement missing data in multi-source heterogeneous data, and perform unit conversion on multi-source heterogeneous data with inconsistent formats and delete invalid multi-source heterogeneous data. Step S12: Using the Min-Max normalization algorithm, all cleaned data are mapped to the [0,1] interval to obtain normalized node running status data, link connection data and fault alarm data; Step S13: Associate the normalized node operation status data, link connection data, and fault alarm data according to the form of nodes and links to construct a standardized data matrix. The rows of the standardized data matrix correspond to the nodes and links in the network, the columns of the standardized data matrix correspond to the various collected indicators, and the elements of the standardized data matrix are the standardized indicator values ​​of the corresponding nodes and links. At the same time, add a timestamp column to the standardized data matrix for subsequent time series feature analysis.

3. The intelligent fault diagnosis and maintenance method for complex networks according to claim 2, characterized in that, In step S1, the specific method for constructing the dynamic graph topology model is as follows: Step S14: Define the nodes in the complex network directly as nodes of the graph topology. The attributes of the nodes include node ID, node type, node location, and core level. Define the physical connections, information flow interactions, and functional dependencies between nodes as edges of the graph topology. The attributes of the edges of the graph topology include link, link type, and connection strength. Step S15: Reassign the weights of the edges in the graph topology. Calculate the weight of each edge using a weighted summation method based on the link's transmission rate, transmission delay, load conditions, and coupling between nodes. The weight values ​​range from [0,1]. Step S16: Establish a real-time update algorithm. When the following three situations are detected, the topology structure is automatically updated: adding or deleting any node or link; the link connection status changes; the weight of the edge in the graph topology changes beyond a threshold; the update frequency of the topology structure is synchronized with the collection frequency of multi-source heterogeneous data, and a new dynamic graph topology model is regenerated after the update.

4. The intelligent fault diagnosis and maintenance method for complex networks according to claim 3, characterized in that the steps for deep integration of the standardized data matrix and the dynamic graph topology model are as follows: Step S17: Concatenate the standardized data matrix and the dynamic graph topology model dimensionally to form an initial fusion feature matrix. The concatenation method is as follows: add the dynamic graph topology model as an additional column to the end of the standardized data matrix to form the initial fusion feature matrix, ensuring that the features of each node and link simultaneously include its own operating status data and topology association data. Step S18: Introduce an attention mechanism, calculate the attention weight of each node and link, and perform weighted processing on the initial fusion feature matrix of step S17 using the attention weight; Step S19: Manually set the ideal variance contribution rate, and then calculate the variance contribution rate of the initial fusion feature matrix after weighting. If the variance contribution rate is greater than or equal to the ideal variance contribution rate, the fusion is deemed effective, and the final fusion feature matrix is output. If the variance contribution rate is less than the ideal variance contribution rate, then the attention weight coefficients in the attention mechanism are readjusted until fusion is effective.

5. The intelligent fault diagnosis and maintenance method for complex networks according to claim 4, characterized in that, in step S2, the basic feature learning model is constructed as follows: Step S21: Training dataset construction: Collect historical normal operation data and historical failure data of the complex network, and divide them into a training set and a validation set in a ratio of either 7:3 or 8:2. The historical failure data includes sudden failures, slow degradation failures, and cascading failures. Step S22: Input the training set into the new multi-layer GAT network, and update the parameters of the new multi-layer GAT network through the backpropagation algorithm until the network meets the accuracy requirements. Step S23: Use the new multi-layer GAT network as the basic feature learning model, save the model parameters and network structure, and embed the online incremental learning algorithm directly into the basic feature learning model so that only new data and the parts affected by topological changes are updated locally. Step S24: After the output layer of the new multi-layer GAT network, add a temporal attention layer to calculate the attention weights of node features at different timestamps. The weight calculation is based on the rate of change of the node features: the greater the rate of change, the higher the attention weight. The temporal features output by the temporal attention layer are fused with the preliminary feature vector output by the basic feature learning model to generate the final fused feature vector of the nodes and links.
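The temporal attention of step S24 can be sketched for a single node as follows. This is an illustration under stated assumptions: the rate of change at a timestamp is taken as the L1 distance to the previous timestamp (zero for the first one), the weights are a softmax over those rates, and fusion with the preliminary feature vector is done by concatenation. None of these choices is fixed by the claim, and the function names are hypothetical.

```python
import math


def temporal_attention(history):
    """history: feature vectors of one node, oldest first.

    Returns (weights, pooled), where a timestamp with a larger rate of
    change receives a larger attention weight.
    """
    rates = [0.0]  # the first timestamp has no predecessor
    for prev, cur in zip(history, history[1:]):
        rates.append(sum(abs(c - p) for p, c in zip(prev, cur)))
    peak = max(rates)
    exps = [math.exp(r - peak) for r in rates]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(history[0])
    pooled = [sum(w * vec[i] for w, vec in zip(weights, history)) for i in range(dim)]
    return weights, pooled


def fuse_features(preliminary, temporal):
    """Assumed fusion: concatenate the GAT output with the temporal features."""
    return list(preliminary) + list(temporal)
```

Softmax is monotonic in its input, so the ordering required by the claim (greater rate of change, higher weight) holds by construction.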

6. The intelligent fault diagnosis and maintenance method for complex networks according to claim 5, characterized in that, in step S3, the nodes, links, fault types, alarm information, and maintenance plans of the complex network are defined as entities of the fault knowledge graph. The connection relationships between nodes and links, the association relationships between faults and alarms, and the correspondence relationships between faults and maintenance plans are defined as relationships of the fault knowledge graph. The fault knowledge graph is constructed, and the fault diagnosis experience of domain experts and the fault association relationships found in historical fault cases are entered into it to form fault association rules. Based on the online incremental learning algorithm, the entities, relationships, and fault association rules of the fault knowledge graph are updated in real time.
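One simple way to picture the entities and relationships described above is as a store of (head, relation, tail) triples that can be extended incrementally. The sketch below is only an illustration. The class name, the relation labels, and the example entities are hypothetical, and a production system would more likely use a graph database.

```python
from collections import defaultdict


class FaultKnowledgeGraph:
    """Minimal triple store: (head, relation) -> set of tail entities."""

    def __init__(self):
        self._tails = defaultdict(set)

    def add(self, head, relation, tail):
        """Insert one relationship; repeated inserts are idempotent."""
        self._tails[(head, relation)].add(tail)

    def query(self, head, relation):
        """Return the tails linked to `head` by `relation`, sorted for stability."""
        return sorted(self._tails.get((head, relation), ()))


kg = FaultKnowledgeGraph()
kg.add("link_down", "raises_alarm", "loss_of_signal")      # fault -> alarm
kg.add("link_down", "resolved_by", "replace_transceiver")  # fault -> maintenance plan
kg.add("router_1", "connected_via", "link_7")              # node -> link
```

New entities or relationships learned online are added with further `add` calls, which mirrors the real-time update described in the claim at a very small scale.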

7. The intelligent fault diagnosis and maintenance method for complex networks according to claim 6, characterized in that the fused feature vectors are input into a softmax classifier to calculate the failure probability of each node and link. Suspected faulty nodes and links are input into the fault knowledge graph and matched against its fault association rules; the matching degree of each rule is calculated, and a fault propagation confidence is introduced. Based on the dynamic graph topology model, the propagation path corresponding to the initial fault root cause is traced back, the propagation confidence is calculated, and the final fault root cause is output. The final fault root cause includes the fault root cause node ID, fault root cause link, fault type, fault occurrence time, and fault level.
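The three quantities named in this claim can be sketched as below. The softmax is standard. The other two definitions are assumptions made for illustration, since the claim does not define them: the matching degree is taken as the fraction of a rule's alarms that were actually observed, and the propagation confidence of a path as the product of its edge weights. The function names are hypothetical.

```python
import math


def softmax(logits):
    """Turn classifier logits into probabilities (shifted by the max for stability)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def matching_degree(rule_alarms, observed_alarms):
    """Assumed definition: share of the rule's alarms that were observed."""
    rule_alarms = set(rule_alarms)
    if not rule_alarms:
        return 0.0
    return len(rule_alarms & set(observed_alarms)) / len(rule_alarms)


def propagation_confidence(path, edge_weights):
    """Assumed definition: product of the edge weights along a traced path."""
    confidence = 1.0
    for u, v in zip(path, path[1:]):
        confidence *= edge_weights[(u, v)]
    return confidence
```

With these definitions, a candidate root cause whose rule matches fully and whose traced path runs over strongly coupled edges ranks above one supported only by weak edges.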

8. The intelligent fault diagnosis and maintenance method for complex networks according to claim 7, characterized in that the impact range of a fault is predicted as follows: Step S41: Divide the nodes in the complex network into a susceptible state, an exposed state, an infected state, and a recovered state. Step S42: Starting from the root cause node of the fault, track the fault propagation path in real time based on the edge weights of the dynamic graph topology model, mark the exposed nodes and infected nodes along the propagation process, record the propagation time and the number of propagation steps, and generate a visualized fault propagation path graph. Step S43: Calculate the propagation range of the fault at different time points, predict the number of infected nodes and the number of affected links when the fault reaches a steady state, and mark high-risk nodes and high-risk links. Step S44: Integrate the root cause of the fault, the propagation path, the impact range, and the information on high-risk nodes and links, and output complete and interpretable diagnostic results.
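Steps S41 to S43 resemble an epidemic-style (SEIR) state model on the weighted graph. The sketch below is a deterministic simplification under stated assumptions: an infected node exposes a susceptible neighbor only when the edge weight reaches a threshold, an exposed node becomes infected one step later, and the recovered state (set after maintenance) is not simulated. The function name, the threshold, and the single-letter state codes are hypothetical.

```python
def simulate_propagation(adj, root, threshold=0.5, max_steps=10):
    """adj: {node: {neighbor: edge_weight}}.

    Returns (state, infected_at): the final state code of each node
    ("S", "E" or "I") and the step at which each node became infected.
    """
    state = {node: "S" for node in adj}
    state[root] = "I"
    infected_at = {root: 0}
    for step in range(1, max_steps + 1):
        # Nodes exposed in the previous step become infected now.
        newly_infected = [n for n, s in state.items() if s == "E"]
        # Currently infected nodes expose susceptible neighbors over strong edges.
        newly_exposed = set()
        for node, s in state.items():
            if s == "I":
                for neighbor, weight in adj[node].items():
                    if state.get(neighbor) == "S" and weight >= threshold:
                        newly_exposed.add(neighbor)
        if not newly_infected and not newly_exposed:
            break  # steady state: nothing changes any more
        for n in newly_infected:
            state[n] = "I"
            infected_at[n] = step
        for n in newly_exposed:
            state[n] = "E"
    return state, infected_at
```

The number of infected nodes at steady state is then `sum(1 for s in state.values() if s == "I")`, and `infected_at` records the propagation steps needed to reach each node.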

9. The intelligent fault diagnosis and maintenance method for complex networks according to claim 8, characterized in that faulty nodes and links are prioritized based on centrality analysis, fault impact range, and fault level, combined with equipment aging status and maintenance costs. Differentiated maintenance strategies are generated for faults of different priorities and types, and maintenance operations are performed according to these differentiated maintenance strategies.
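A prioritization of this kind could be sketched as a weighted score over the five factors. The claim does not specify how the factors are combined, so everything here is an assumption: all inputs are normalized to [0, 1], degree centrality stands in for the unspecified centrality analysis, maintenance cost lowers the score, and the coefficients are arbitrary. The function names are hypothetical.

```python
def degree_centrality(adj):
    """adj: {node: collection of neighbors}. Degree divided by (n - 1)."""
    n = len(adj)
    if n < 2:
        return {node: 0.0 for node in adj}
    return {node: len(neighbors) / (n - 1) for node, neighbors in adj.items()}


def priority_score(centrality, impact, level, aging, cost,
                   coeffs=(0.3, 0.3, 0.2, 0.1, 0.1)):
    """Higher score means the fault should be handled earlier (assumed formula)."""
    a, b, c, d, e = coeffs
    return a * centrality + b * impact + c * level + d * aging - e * cost


def rank_faults(faults, coeffs=(0.3, 0.3, 0.2, 0.1, 0.1)):
    """faults: {name: dict(centrality=, impact=, level=, aging=, cost=)}."""
    scored = {name: priority_score(**f, coeffs=coeffs) for name, f in faults.items()}
    return sorted(scored, key=scored.get, reverse=True)
```

The ranked list can then be mapped to differentiated strategies, for example immediate repair for the top entries and scheduled maintenance for the rest.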

10. The intelligent fault diagnosis and maintenance method for complex networks according to claim 9, characterized in that the method further comprises repeating steps S1-S4 to achieve continuous closed-loop operation of the intelligent fault diagnosis and maintenance of complex networks.