A Smart Fault Location System for Distribution Networks Based on Large Model Analysis
By constructing an intelligent fault location system based on large model analysis, integrating topology structure and unstructured alarm text, inferring fault propagation paths and calculating upstream node states in reverse, the system solves the problem of insufficient accuracy in distribution network fault location in existing technologies, and achieves accurate and quantitative presentation of fault location.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- SICHUAN TIANLING HI-TECH ELECTRIC CO LTD
- Filing Date
- 2026-03-25
- Publication Date
- 2026-06-30
AI Technical Summary
Existing technologies fail to fully integrate topological relationships and unstructured alarm text information in distribution network fault location, making it impossible to accurately capture the deep correlations between fault elements. This results in insufficient accuracy and anti-interference capability of fault location results, making it difficult to cope with complex fault scenarios.
An intelligent fault location system based on large model analysis is constructed. The system forms a structured original dataset of fault scenarios through the data processing module. The pre-trained large language model encodes the graph structure data and unstructured alarm text into a unified semantic embedding vector. The system combines the causal reasoning module to infer the fault propagation path and the state inversion module to calculate the theoretical operating state of the upstream nodes. Finally, the fault location module identifies suspicious fault sections and calculates probability scores.
It enables precise location of faults in the distribution network and outputs structured fault diagnosis reports, making up for the shortcomings of conventional technologies that cannot cover the status of all nodes, and can perform precise priority sorting in complex scenarios.
Figure CN122307241A_ABST
Abstract
Description
Technical Field
[0001] This invention relates to the field of power system automation technology, and in particular to an intelligent fault location system for distribution networks based on large model analysis.
Background Technology
[0002] Fault location in distribution networks is a core aspect of ensuring the safe and stable operation of distribution networks. Existing technical solutions mostly focus on analyzing single-dimensional electrical quantity characteristics, using traditional numerical calculations or simple logical rules to initially determine faulty sections. These technical solutions rely solely on measured electrical data during distribution network operation, failing to fully integrate the topological relationships and unstructured alarm text information at the time of fault occurrence, and lacking sufficient capability for fusing and processing multi-source heterogeneous data.
[0003] The actual fault scenarios in distribution networks are complex and heterogeneous. Conventional technologies cannot accurately capture the deep correlations between fault elements and trace the complete propagation logic of faults when faced with multiple fault points and intersecting fault propagation paths. At the same time, existing technologies lack a computational mechanism to deduce the operating status of upstream nodes from the load side at the end of the fault. They can only perform fault feature matching based on forward data and cannot achieve refined probabilistic determination of suspected fault sections. As a result, the accuracy and anti-interference capability of fault location results are insufficient to meet the actual operation and maintenance needs of distribution networks.
[0004] Existing technologies have not solved the problem of encoding the graph structure data of the distribution network and the unstructured alarm text in multi-source operational data streams into a unified representation. They cannot construct a complete fault feature system that covers both topological and semantic associations, nor have they established a fault probability determination mechanism that combines the direction of the fault propagation path with the trend of electrical quantity changes. As a result, they cannot cope with the diverse and complex characteristics of distribution network fault scenarios.
Summary of the Invention
[0005] The purpose of this invention is to address the shortcomings of existing technologies by proposing a smart fault location system for distribution networks based on large model analysis.
[0006] To achieve the above objective, the present invention adopts the following technical solution: an intelligent fault location system for distribution networks based on large model analysis, comprising: a data processing module, which receives multi-source operational data streams from the distribution network during the fault occurrence period and forms a structured original dataset of the fault scenario; a semantic encoding module, which constructs graph structure data of the distribution network based on the structured original dataset of the fault scenario, calls a pre-trained large language model, and encodes the graph structure data and the unstructured alarm text in the multi-source operational data streams into a unified semantic embedding vector; a causal reasoning module, which inputs the semantic embedding vector into a causal reasoning engine to infer several potential fault propagation paths; a state inversion module, which, based on the inferred fault propagation paths, initiates a state inversion calculation process that starts from the known state of the end load side and works backward to calculate the theoretical operating state of each upstream node; and a fault location module, which compares the theoretical operating state obtained from the state inversion calculation with the measured data in the structured original dataset of the fault scenario to identify suspected fault sections, extracts for each suspected fault section the electrical quantity change trend features before and after the fault, calculates and ranks the probability score of each suspected fault section in combination with the fault propagation path direction output by the causal reasoning engine, and outputs the final fault location result and a structured fault diagnosis report.
[0007] As a further aspect of the present invention, forming the structured original dataset of the fault scenario includes: the multi-source operational data streams include three-phase current and voltage waveforms collected by distribution transformer terminals, position signals of feeder switches, and the operating status of fault indicators; the multi-source operational data streams are synchronized in time and mapped to spatial coordinates, so that data streams from different devices are unified under the same time reference and geographic topological coordinate system to form the structured original dataset of the fault scenario. Specifically: nodes in the graph structure data represent power equipment, edges represent line connections, and real-time electrical measurement attributes are assigned to the nodes and edges; the three-phase current and voltage waveforms collected by the distribution transformer terminals are parsed, the amplitude and phase angle of the fundamental component are extracted, and missing data points are completed by linear interpolation; the position change signals of the feeder switches and the operating status of the fault indicators are converted into standardized, timestamped event logs; the static topology wiring diagram of the distribution network is read to obtain the physical installation coordinates of each power device and its logical relationship to its feeder; based on the physical installation coordinates and the feeder relationship, each timestamped event log is associated with the corresponding node or edge in the graph structure data; and, following a unified timeline, the completed three-phase current and voltage waveform data, the standardized event logs, and the static equipment attributes are integrated into the structured original dataset of the fault scenario in the form of a wide table.
[0008] As a further aspect of the present invention, based on the original dataset of the fault scenarios, graph structure data of the distribution network is constructed, including: Traverse the static topology wiring diagram of the distribution network and create a node set containing all substation outgoing switches, sectionalizing switches, tie switches, and distribution transformers; Based on the physical connection relationship of the lines, edge connections are established between the node sets, and each edge is assigned the line length, type and impedance parameters; Read the real-time current and voltage values of each node from the original structured fault scenario dataset, and attach the real-time current and voltage values as attributes to the corresponding node objects. For the feeder switch that has undergone displacement and the fault indicator that has been activated, add a fault feature label to the corresponding node or edge attribute; The completed graph structure data, which includes node attributes, edge attributes, and topological connections, is serialized and stored for use in subsequent model encoding.
[0009] As a further aspect of the present invention, the step of calling a pre-trained large language model to encode the graph structure data and the unstructured alarm text in the multi-source running data stream into a unified semantic embedding vector includes: The semantic embedding vector integrates the physical topology information and real-time operating status of the power grid; A graph neural network encoder is used to perform a depth traversal of the graph structure data to extract the global topological features and local electrical features of the distribution network, and generate a graph embedding vector. The unstructured alarm text is word-embedded and semantically understood using a natural language processing encoder to capture the description of fault phenomena and the correlation between the alarm information and the equipment, and to generate a text embedding vector. The graph embedding vector and the text embedding vector are concatenated to form a hybrid feature vector; The hybrid feature vector is fed into the attention mechanism layer of the pre-trained large language model, where the attention mechanism layer dynamically adjusts the fusion weights of topological features and text features to eliminate redundant information. The semantic embedding vector, which is weighted and fused by the attention mechanism, is output. The semantic embedding vector fully represents the fault status of the current distribution network.
[0010] As a further aspect of the present invention, the semantic embedding vector is input into a causal reasoning engine to infer several potential fault propagation paths, including: The causal reasoning engine is based on the topological constraints and electromagnetic transient mechanisms of the power distribution network; The semantic embedding vector is decoded into a set of candidate root cause hypotheses, which correspond to the device node in the distribution network that first fails. Starting with each of the root cause hypotheses, the flow of fault current is simulated on the topology of the graph structure data, following the physical law that current flows from the power source side to the load side, to generate a positive propagation tree. By combining the fault mechanism knowledge base built into the causal reasoning engine, false propagation trees that do not conform to electrical logic are filtered out, while physically feasible propagation paths are retained; For each preserved propagation path, record its node sequence, edge sequence, and electrical quantity attenuation characteristics during propagation. All propagation paths that meet the conditions are aggregated to form the set of potential fault propagation paths.
[0011] As a further aspect of the present invention, the step of initiating a state inversion calculation process based on the inferred fault propagation path, starting from the known end-load side state, and backwards calculating the theoretical operating state of each upstream node, includes: Determine the endpoint of each path in the set of fault propagation paths, where the endpoint is typically a disconnected tie switch or a terminal distribution transformer. The theoretical operating state at the end point is set as a known boundary condition, which includes zero current and voltage drop to the residual voltage level; Along the reverse direction of the fault propagation path, the voltage drop equation and power balance equation of the distribution network are applied to calculate the theoretical voltage and theoretical current values of the next higher level node level by level. In the calculation process, the actual impedance parameters of the line and the load rate correction factor are introduced to refine the calculation results. The calculation process is repeated until the starting point of the fault propagation path is reached, thus completing the state inversion of a single path.
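The level-by-level backward calculation described in this aspect can be sketched as follows. This is a minimal single-phase illustration, not the patented procedure: the function name `invert_states`, the data layout, and the use of a single scalar `load_factor` as the load rate correction factor are assumptions made for the example. It applies a voltage drop equation (V_up = V_down + Z·I) and a power balance relation (S = V·conj(I)) while walking the path from the end point toward its starting point, and it assumes the residual voltage at the end point is non-zero.

```python
def invert_states(path, z_ohm, load_va, v_end, i_end=0j, load_factor=1.0):
    """Backward sweep along one fault propagation path.

    path        : node ids ordered from source side to load side
    z_ohm       : {(upstream, downstream): complex series impedance in ohms}
    load_va     : {node: complex power drawn at that node in VA} (hypothetical input)
    v_end, i_end: boundary condition at the end point (residual voltage, zero current)
    load_factor : assumed scalar load rate correction factor
    Returns {node: (theoretical voltage, theoretical current)} as complex phasors.
    """
    states = {path[-1]: (v_end, i_end)}
    v, i = v_end, i_end
    # Walk the edges in reverse: (second-to-last, last), ..., (first, second).
    for up, down in zip(reversed(path[:-1]), reversed(path[1:])):
        v = v + z_ohm[(up, down)] * i           # voltage drop equation
        s = load_factor * load_va.get(up, 0j)   # corrected load at the upstream node
        i = i + (s / v).conjugate()             # power balance: S = V * conj(I)
        states[up] = (v, i)
    return states


if __name__ == "__main__":
    result = invert_states(
        path=["A", "B", "C"],
        z_ohm={("A", "B"): 1 + 0j, ("B", "C"): 2 + 0j},
        load_va={"B": 100 + 0j},
        v_end=100 + 0j,
    )
    for node, (voltage, current) in result.items():
        print(node, abs(voltage), abs(current))
```

A real feeder would need three-phase quantities and measured impedances; the sketch only shows how the boundary condition at the end point propagates upstream.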
[0012] As a further aspect of the present invention, the theoretical operating state obtained by the state inversion calculation is compared with the measured data in the original dataset of the structured fault scenario to identify suspicious fault sections, including: Retrieve the theoretical voltage and theoretical current values of each node output by the state inversion calculation process; Extract the measured voltage and measured current values of the corresponding nodes from the original structured fault scenario dataset; Calculate the absolute value of the difference between the measured value and the theoretical value for each node, and calculate the percentage deviation of the difference; A deviation threshold is set. When the deviation percentage of a node exceeds the deviation threshold, the node is determined to have an abnormal state. Mark all nodes with abnormal status as the suspected fault segments and record the specific numerical value of their deviation.
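The comparison step in this aspect can be illustrated with a short sketch. The text does not state the reference value for the percentage deviation or the threshold itself, so the example assumes the deviation is taken relative to the theoretical value and uses a hypothetical 10 % threshold; `flag_abnormal` is an illustrative name.

```python
def flag_abnormal(theoretical, measured, threshold_pct=10.0):
    """Return {node: deviation percentage} for nodes whose deviation exceeds the threshold.

    theoretical, measured : {node: magnitude} (voltage or current)
    threshold_pct         : assumed deviation threshold in percent
    """
    flagged = {}
    for node, theo in theoretical.items():
        if node not in measured:
            continue  # no measurement available for this node
        diff = abs(measured[node] - theo)
        if theo == 0:
            pct = 0.0 if diff == 0 else float("inf")
        else:
            pct = diff / abs(theo) * 100.0  # assumption: relative to the theoretical value
        if pct > threshold_pct:
            flagged[node] = pct
    return flagged


if __name__ == "__main__":
    print(flag_abnormal({"A": 100.0, "B": 200.0}, {"A": 104.0, "B": 150.0}))
```

Nodes returned by the function correspond to the suspected fault sections, with the recorded deviation value kept for later ranking.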
[0013] As a further aspect of the present invention, for each suspected fault section, the trend characteristics of electrical quantity changes before and after the fault are extracted, and combined with the fault propagation path direction output by the causal inference engine, the probability score of the suspected fault section occurring is calculated, including: Electrical quantity sampling points of the suspected fault section in the three cycles before and after the fault are extracted from the original dataset of the structured fault scenario. Calculate the current surge and voltage dip depth of the suspected fault section as characteristic indicators of the suspected fault section; Analyze the direction of the fault propagation path to determine whether the suspected faulty section is located upstream, midstream, or downstream of the propagation path; Based on the pre-set feature index weights and path position weights, the current surge, voltage dip depth, and path position are weighted and summed. The weighted summation result is mapped to a numerical range of zero to one to obtain the probability score of the suspected faulty section.
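One possible reading of the weighted scoring in this aspect is sketched below. The weights, the position weights in `POSITION_WEIGHT`, the normalisation constant `surge_ref`, and the choice of clamping as the mapping to the range zero to one are all assumptions for illustration; the text only states that preset weights are used and that the result is mapped to [0, 1]. Inputs are taken to be RMS magnitudes sampled over the cycles before and after the fault, with non-zero pre-fault values.

```python
from statistics import mean

# Hypothetical path position weights; the source does not give numeric values.
POSITION_WEIGHT = {"upstream": 1.0, "midstream": 0.6, "downstream": 0.3}


def probability_score(pre_i, post_i, pre_v, post_v, position,
                      w_surge=0.4, w_dip=0.4, w_pos=0.2, surge_ref=10.0):
    """Weighted sum of current surge, voltage dip depth and path position, clamped to [0, 1]."""
    i_pre, i_post = mean(pre_i), mean(post_i)
    v_pre, v_post = mean(pre_v), mean(post_v)
    surge = max(0.0, (i_post - i_pre) / i_pre)   # relative current surge
    dip = max(0.0, (v_pre - v_post) / v_pre)     # relative voltage dip depth
    surge_norm = min(surge / surge_ref, 1.0)     # scale the surge into [0, 1]
    raw = (w_surge * surge_norm
           + w_dip * min(dip, 1.0)
           + w_pos * POSITION_WEIGHT[position])
    return max(0.0, min(1.0, raw))


if __name__ == "__main__":
    print(probability_score([100.0] * 3, [600.0] * 3, [10.0] * 3, [4.0] * 3, "upstream"))
```

Other mappings, such as a logistic function, would satisfy the same description; clamping is used here only because it keeps the arithmetic easy to follow.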
[0014] As a further aspect of the present invention, the output of the final fault location result and structured fault diagnosis report includes: All the suspected fault segments are sorted according to the probability scores, and the segment with the highest score is selected as the final fault location result. Generate a structured fault diagnosis report that includes details of the fault location, fault type, and fault propagation path; All the suspected fault segments are sorted according to their probability scores, and the segment with the highest ranking is selected as the final fault location result, including: Create a candidate list that includes the segment identifier, probability score, deviation percentage, and path number; The items in the candidate list are sorted in descending order according to their probability scores. If the probability scores are the same, the deviation percentages are compared. The suspected faulty section at the top of the list is selected as the primary fault location result; Meanwhile, the two segments with the second-highest probability scores are used as backup auxiliary judgment results to deal with cross-validation in complex fault scenarios.
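The candidate list and its ordering can be sketched as follows. The field names are illustrative, and the direction of the tie-break is an assumption: the text says deviation percentages are compared when probability scores are equal, and the sketch ranks the larger deviation first.

```python
def rank_candidates(candidates):
    """Sort candidates and split them into a primary result and up to two backups.

    candidates: list of dicts with keys "section", "probability",
                "deviation_pct" and "path_id" (hypothetical field names)
    """
    ordered = sorted(
        candidates,
        key=lambda c: (-c["probability"], -c["deviation_pct"]),
    )
    primary = ordered[0] if ordered else None
    backups = ordered[1:3]  # the next two sections serve as auxiliary results
    return primary, backups


if __name__ == "__main__":
    demo = [
        {"section": "S1", "probability": 0.8, "deviation_pct": 12.0, "path_id": 1},
        {"section": "S2", "probability": 0.9, "deviation_pct": 20.0, "path_id": 1},
        {"section": "S3", "probability": 0.9, "deviation_pct": 35.0, "path_id": 2},
        {"section": "S4", "probability": 0.5, "deviation_pct": 50.0, "path_id": 2},
    ]
    primary, backups = rank_candidates(demo)
    print(primary["section"], [b["section"] for b in backups])
```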
[0015] As a further aspect of the present invention, generating a structured fault diagnosis report containing details of the fault location, fault type, and fault propagation path includes: Organize the final fault location results and obtain its geographical location information and the name of the feeder to which it belongs; Based on the electrical quantity change trend characteristics of the suspected fault section before and after the fault, the specific fault type is determined by matching the fault type discrimination rule base. The key node data involved in the state inversion calculation process, the comparison table of theoretical and measured values, and the complete path information inferred by the causal reasoning engine are integrated into the appendix; According to the preset reporting protocol, fill in the corresponding fields of the report with the fault location, fault type, confidence level, and attachment data; Output a structured fault diagnosis report that can be reviewed by dispatchers.
[0016] Compared with the prior art, the advantages and positive effects of the present invention are as follows: A graph-structured data set of the distribution network is constructed, and a pre-trained large language model is invoked to encode the graph-structured data and unstructured alarm text from multi-source operational data streams into a unified semantic embedding vector. This unified semantic embedding vector simultaneously integrates the topological relationship information of the distribution network with the semantic relationship information of the unstructured alarm text, and can fully present the deep internal connections between various elements under fault scenarios. Based on this vector, the causal inference engine can directly identify potential fault propagation paths, avoiding the fault propagation path identification bias caused by relying solely on single electrical data.
[0017] Based on the inferred fault propagation path, starting from the known state of the end load side, the theoretical operating state of each upstream node is calculated in reverse. The theoretical operating state is compared with the measured data in the original dataset of the structured fault scenario to identify suspicious fault sections. The electrical quantity change trend characteristics before and after the fault in the suspicious fault section are extracted. Combined with the fault propagation path direction, the probability score of the suspicious fault section is calculated and ranked. This reverse calculation mechanism can reconstruct the actual operating state of upstream nodes from the fault end result, overcoming the deficiency of conventional forward calculations that cannot cover the state of all nodes. Combined with the probability score calculation of the fault propagation path direction, multiple suspicious fault sections can be accurately prioritized, directly outputting a structured fault diagnosis report, achieving accurate and quantitative presentation of fault location results.
Attached Figure Description
[0018] Figure 1 is a timing diagram of the distribution network intelligent fault location system based on large model analysis described in this invention; Figure 2 is a flowchart for constructing the graph structure data of the distribution network; Figure 3 is a diagram showing the attenuation characteristics of electrical quantities along a fault propagation path; Figure 4 is a diagram showing the three-phase current timing characteristics of a single-phase ground fault in a distribution network; Figure 5 is a comparison diagram of electrical quantities before and after a distribution network fault.
Detailed Implementation
[0019] To make the objectives, technical solutions, and advantages of this invention clearer, the invention will be further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative and not intended to limit the invention.
[0020] In the description of this invention, it should be understood that the terms "length," "width," "upper," "lower," "front," "rear," "left," "right," "vertical," "horizontal," "top," "bottom," "inner," and "outer," etc., indicating orientation or positional relationships, are based on the orientation or positional relationships shown in the accompanying drawings and are only for the convenience of describing the invention and simplifying the description, and do not indicate or imply that the device or element referred to must have a specific orientation, or be constructed and operated in a specific orientation, and therefore should not be construed as a limitation of the invention. Furthermore, in the description of this invention, "a plurality of" means two or more, unless otherwise explicitly specified.
[0021] Referring to Figure 1, the data processing module receives multi-source operational data streams from the distribution network during the fault occurrence period, forming a structured original dataset of the fault scenario. The semantic encoding module, based on this dataset, constructs graph-structured data of the distribution network and uses a pre-trained large language model to encode the graph-structured data and unstructured alarm text from the multi-source operational data streams into a unified semantic embedding vector. The causal reasoning module inputs this semantic embedding vector into the causal reasoning engine to infer several potential fault propagation paths. The state inversion module, based on the inferred fault propagation paths, initiates the state inversion calculation process, starting from the known end-load side state and working backward to calculate the theoretical operating state of each upstream node. The fault location module compares the theoretical operating state obtained from the state inversion calculation with the measured data in the structured original dataset of the fault scenario to identify suspicious fault sections. For each suspicious fault section, it extracts the electrical quantity change trend features before and after the fault, and combines this with the fault propagation path direction output by the causal reasoning engine to calculate and rank the probability score of the suspected fault section, outputting the final fault location result and a structured fault diagnosis report.
[0022] In one embodiment of the present invention, the data processing module receives multi-source operational data streams of the distribution network during the fault occurrence period. These multi-source operational data streams include three-phase current and voltage waveforms collected by the distribution transformer terminal, feeder switch position signals, and the operational status of fault indicators. The data processing module performs time synchronization and spatial coordinate mapping on the multi-source operational data streams, unifying the data streams from different devices under the same time reference and geographical topological coordinate system, forming a structured original dataset of the fault scenario. Specifically, this process includes: nodes in the graph structure data representing power equipment, edges representing line connection relationships, and assigning real-time electrical measurement attributes to nodes and edges; parsing the three-phase current and voltage waveforms collected by the distribution transformer terminal, extracting the fundamental component amplitude and phase angle, and performing linear interpolation to complete missing data points; converting the feeder switch position signals and the operational status of fault indicators into standardized event logs with timestamps; and reading the static topology wiring diagram of the distribution network to obtain the physical installation coordinates of each power device and the logical relationship with its corresponding feeder. Based on the logical relationship between physical installation coordinates and their respective feeders, each timestamped event log is associated with a corresponding node or edge in the graph structure data. 
Following a unified timeline, the completed three-phase current and voltage waveform data, standardized event logs, and equipment static attributes are integrated into a structured original dataset of fault scenarios in the form of a wide table.
[0023] In practical implementation, the data processing module receives multi-source operational data streams from the distribution network during the fault occurrence period. These multi-source data streams include three-phase current and voltage waveforms collected by the distribution transformer terminal, position signals of feeder switches, and the operational status of fault indicators. The data processing module performs time synchronization and spatial coordinate mapping operations on the multi-source operational data streams, unifying the data streams from different devices under the same time reference and geographical topological coordinate system, forming a structured original dataset of the fault scenario. In practical implementation, nodes in the graph structure data represent power equipment, and edges represent line connections. Real-time electrical measurement attributes are assigned to nodes and edges. The module analyzes the three-phase current and voltage waveforms collected by the distribution transformer terminal, extracts the amplitude and phase angle of the fundamental component, and performs linear interpolation completion for missing data points. In some embodiments, the linear interpolation completion operation uses a formula to calculate the voltage value of the missing data points:
$$V(t) = V_1 + \frac{V_2 - V_1}{t_2 - t_1}\,(t - t_1)$$
[0024] where $V(t)$ is the interpolated voltage value, $t$ is the timestamp of the missing data point, $t_1$ and $t_2$ are the timestamps of the adjacent known data points before and after the missing data point, and $V_1$ and $V_2$ are the known voltage values at those timestamps. Current values can be interpolated and completed in the same way. The position change signals of feeder switches and the action states of fault indicators are converted into standardized event logs with timestamps. The static topology diagram of the distribution network is read to obtain the physical installation coordinates of each power device and the logical relationship with its corresponding feeder. In some embodiments, based on the logical relationship between the physical installation coordinates and the corresponding feeder, each event log with timestamp is associated with the corresponding node or edge in the graph structure data. Following a unified timeline, the completed three-phase current and voltage waveform data, standardized event logs, and equipment static attributes are integrated into a structured fault scenario raw dataset in a wide table format. Optionally, time synchronization utilizes GPS timestamps to align data points in the multi-source operational data stream. Optionally, spatial coordinate mapping is based on the latitude and longitude coordinates of the power devices to achieve geographic topological coordinate system transformation. The structured fault scenario raw dataset in wide table format uses rows to represent time-series sampling points and columns to represent attribute fields of different power devices. The integration process of multi-source operational data streams ensures data consistency in both time and space, providing a foundation for subsequent module processing.
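As an illustration of the linear interpolation completion step, the following sketch fills missing samples in a time series from their nearest known neighbours on each side. The function names and the convention of marking missing samples with `None` are assumptions made for the example; gaps at the start or end of the series, which have no neighbour on one side, are left unfilled.

```python
from bisect import bisect_left


def interpolate_value(t, t1, v1, t2, v2):
    """Linear interpolation between the known points (t1, v1) and (t2, v2)."""
    if t2 == t1:
        raise ValueError("adjacent timestamps must differ")
    return v1 + (v2 - v1) * (t - t1) / (t2 - t1)


def fill_missing(timestamps, values):
    """Return a copy of values in which None entries are filled by linear interpolation."""
    filled = list(values)
    known = [idx for idx, v in enumerate(values) if v is not None]
    for idx, v in enumerate(values):
        if v is not None:
            continue
        pos = bisect_left(known, idx)
        if pos == 0 or pos == len(known):
            continue  # no known neighbour on one side; leave the gap as is
        lo, hi = known[pos - 1], known[pos]
        filled[idx] = interpolate_value(
            timestamps[idx], timestamps[lo], values[lo], timestamps[hi], values[hi]
        )
    return filled


if __name__ == "__main__":
    print(fill_missing([0.0, 1.0, 2.0, 3.0], [10.0, None, None, 13.0]))
```

The same routine applies unchanged to current samples, since only the timestamps and the neighbouring known values enter the calculation.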
[0025] In one embodiment of the present invention, graph structure data of the distribution network is constructed based on the structured original dataset of the fault scenario. Referring to Figure 2, this process includes the following steps. The static topology wiring diagram of the distribution network is traversed to create a node set containing all substation outgoing switches, sectionalizing switches, tie switches, and distribution transformers. Based on the physical connections of the lines, edge connections are established between the nodes in the node set, and each edge is assigned the line length, type, and impedance parameters. Real-time current and voltage values of each node are read from the structured fault scenario dataset, and these values are attached as attributes to the corresponding node objects. For feeder switches that have changed position and fault indicators that have been activated, fault feature tags are appended to their corresponding node or edge attributes. The completed graph structure data, containing node attributes, edge attributes, and topology connections, is serialized and stored.
[0026] A pre-trained large language model is invoked to encode graph-structured data and unstructured alarm text from multi-source operational data streams into a unified semantic embedding vector. This semantic embedding vector integrates the physical topology information and real-time operational status of the power grid. A graph neural network encoder performs a depth-first traversal of the graph-structured data to extract global topological features and local electrical features of the distribution network, generating a graph embedding vector. A natural language processing encoder performs word embedding and semantic understanding on the unstructured alarm text, capturing the fault phenomenon description and equipment correlation in the alarm information, generating a text embedding vector. The graph embedding vector and the text embedding vector are concatenated to form a hybrid feature vector. This hybrid feature vector is fed into the attention mechanism layer of the pre-trained large language model, where the attention mechanism dynamically adjusts the fusion weights of topological and text features to eliminate redundant information. The output is a weighted and fused semantic embedding vector that fully represents the current fault status of the distribution network.
[0027] In practical implementation, based on the structured fault scenario dataset, a graph structure data of the distribution network is constructed. This involves traversing the static topology diagram of the distribution network to create a node set containing all substation outgoing switches, sectionalizing switches, tie switches, and distribution transformers. According to the physical connection relationships of the lines, edge connections are established between the node sets, and each edge is assigned the line length, type, and impedance parameters. Real-time current and voltage values of each node are read from the structured fault scenario dataset and attached as attributes to the corresponding node objects. For feeder switches that have undergone displacement and fault indicators that have been activated, fault feature tags are added to the node or edge attributes corresponding to these switches. The completed graph structure data, containing node attributes, edge attributes, and topology connections, is then serialized and stored.
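The graph construction steps can be sketched with plain data classes and JSON serialization. The class names, field names, units, and input layout below are hypothetical; the source does not specify a storage format, and JSON is used here only as one convenient choice.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Node:
    node_id: str
    kind: str  # e.g. "outgoing_switch", "tie_switch", "distribution_transformer"
    current_a: Optional[float] = None
    voltage_kv: Optional[float] = None
    fault_tags: List[str] = field(default_factory=list)


@dataclass
class Edge:
    src: str
    dst: str
    length_km: float
    line_type: str
    impedance_ohm: Tuple[float, float]  # (resistance, reactance)
    fault_tags: List[str] = field(default_factory=list)


def build_graph(devices, lines, measurements, fault_events):
    """Assemble nodes and edges, attach measurements, and append fault feature tags."""
    nodes = {d["id"]: Node(d["id"], d["kind"]) for d in devices}
    edges = [Edge(**line) for line in lines]
    for node_id, (current_a, voltage_kv) in measurements.items():
        nodes[node_id].current_a = current_a
        nodes[node_id].voltage_kv = voltage_kv
    for target, tag in fault_events:  # target is a node id or a (src, dst) pair
        if target in nodes:
            nodes[target].fault_tags.append(tag)
        else:
            for edge in edges:
                if (edge.src, edge.dst) == target:
                    edge.fault_tags.append(tag)
    return {"nodes": [asdict(n) for n in nodes.values()],
            "edges": [asdict(e) for e in edges]}


def serialize_graph(graph):
    """Serialize the graph for later use by the encoding stage."""
    return json.dumps(graph, sort_keys=True)
```

A tripped switch would be passed as `("CB1", "switch_tripped")` and an activated line indicator as `(("CB1", "T1"), "indicator_operated")`, so that the tag lands on the node or the edge respectively.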
[0028] In specific implementation, a pre-trained large language model is invoked to encode the graph-structured data and unstructured alarm text from multi-source operational data streams into a unified semantic embedding vector. This semantic embedding vector integrates the physical topology information and real-time operational status of the power grid. In some embodiments, a graph neural network encoder performs a depth-first traversal of the graph-structured data to extract global topological features and local electrical features of the distribution network, generating a graph embedding vector. A natural language processing encoder performs word embedding and semantic understanding on the unstructured alarm text, capturing the fault phenomenon description and equipment correlation in the alarm information, generating a text embedding vector. In some embodiments, the graph embedding vector and the text embedding vector are concatenated to form a hybrid feature vector. This hybrid feature vector is then fed into the attention mechanism layer of the pre-trained large language model, where the attention mechanism layer dynamically adjusts the fusion weights of the topological and text features. Optionally, the process of the attention mechanism layer calculating the fusion weights can be described as follows:
[0029]
$$A = \mathrm{softmax}\left(\frac{QK^{T}}{\sqrt{d_k}}\right)$$
where $A$ represents the attention weight matrix; $Q$ represents the query matrix generated from the hybrid feature vector; $K$ represents the key matrix generated from the hybrid feature vector; and $d_k$ represents the dimension of the key vectors. The output is a semantic embedding vector, weighted and fused by the attention mechanism. This semantic embedding vector fully represents the current fault status of the distribution network. Optionally, the graph neural network encoder uses a graph convolutional network structure. It is understood that the unstructured alarm text originates from the monitoring system or from fault descriptions entered by dispatchers. It is also understood that the pre-trained large language model has been pre-trained on general corpora and fine-tuned on power domain texts.
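As a small numerical illustration of how such fusion weights might be computed, the sketch below assumes the common scaled dot-product form, in which the query-key scores are divided by the square root of the key dimension and normalized row by row with softmax. The query and key matrices are supplied directly as toy values; in the described system they would be generated from the hybrid feature vector. The function names are hypothetical.

```python
import math

def softmax(row):
    """Numerically stable softmax over one row of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def attention_weights(Q, K):
    """Return the weight matrix softmax(Q K^T / sqrt(d_k)), row by row."""
    d_k = len(K[0])            # dimension of the key vectors
    scale = math.sqrt(d_k)
    weights = []
    for q in Q:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / scale for k in K]
        weights.append(softmax(scores))
    return weights

# Toy input: row 0 stands for a topological feature, row 1 for a text feature
Q = [[1.0, 0.0], [0.0, 1.0]]
K = [[1.0, 0.0], [0.0, 1.0]]
A = attention_weights(Q, K)
```

Each row of `A` sums to one, so a row can be read as the relative weight given to the topological and text features for that query.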
[0030] In one embodiment of the present invention, the causal reasoning module inputs a semantic embedding vector into the causal reasoning engine. The causal reasoning engine infers several potential fault propagation paths based on the topological constraints and electromagnetic transient mechanisms of the distribution network. The causal reasoning engine decodes the semantic embedding vector into a set of candidate root cause hypotheses, each corresponding to the device node in the distribution network where the fault first occurs. Starting with each root cause hypothesis, the flow of fault current is simulated on the topological graph of the graph structure data, following the physical law that current flows from the power source side to the load side, generating a forward propagation tree. Combining the fault mechanism knowledge base built into the causal reasoning engine, false propagation trees that do not conform to electrical logic are filtered out, retaining physically feasible propagation paths. For each retained propagation path, its node sequence, edge sequence, and electrical quantity attenuation characteristics during propagation are recorded. All propagation paths that meet the conditions are summarized to form a set of potential fault propagation paths.
[0031] The state inversion module initiates the state inversion calculation process based on the inferred fault propagation path. Starting from the known state of the end load side, it calculates backward to determine the theoretical operating state of each upstream node. It identifies the endpoint of each path in the fault propagation path set, typically a disconnected tie switch or an end distribution transformer. The theoretical operating state of the endpoint is set as the known boundary condition, including zero current and a voltage that has dropped to the residual voltage level. Along the reverse direction of the fault propagation path, the voltage drop equation and power balance equation of the distribution network are applied to calculate the theoretical voltage and current values of each upstream node, level by level. During the calculation, the actual impedance parameters of the line and the load rate correction factor are introduced to refine the calculation results. The calculation process is repeated until the starting point of the fault propagation path is reached, completing the state inversion of a single path.
[0032] In practical implementation, the causal reasoning module inputs semantic embedding vectors into the causal reasoning engine. Based on the topological constraints and electromagnetic transient mechanisms of the distribution network, the engine infers several potential fault propagation paths. Specifically, the engine decodes the semantic embedding vectors into a set of candidate root cause hypotheses, each corresponding to the device node in the distribution network where the fault first occurs. Starting with each root cause hypothesis, the flow of fault current is simulated on the topological graph of the graph data, following the physical law that current flows from the power source side to the load side, generating a forward propagation tree. Combining this with the fault mechanism knowledge base built into the causal reasoning engine, false propagation trees that do not conform to electrical logic are filtered out, retaining physically feasible propagation paths. For each retained propagation path, its node sequence, edge sequence, and electrical quantity attenuation characteristics during propagation are recorded. All propagation paths that meet the conditions are summarized to form a set of potential fault propagation paths.
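The forward propagation step can be sketched as a traversal of the topology in the source-to-load direction. The example below is a simplified stand-in, not the engine described here: it assumes a radial feeder without loops, represents the topology as a hypothetical `downstream` adjacency mapping, and replaces the fault mechanism knowledge base with a single illustrative rule (current does not pass an open switch).

```python
def propagation_paths(downstream, root, open_switches):
    """Enumerate paths from a root cause hypothesis to each reachable endpoint.

    downstream: dict mapping a node to its load-side neighbours (assumed radial).
    open_switches: set of nodes that current cannot pass (illustrative rule only).
    """
    paths = []
    stack = [[root]]
    while stack:
        path = stack.pop()
        node = path[-1]
        # An open switch is treated as a path endpoint
        children = [] if node in open_switches else downstream.get(node, [])
        if not children:
            paths.append(path)      # reached an endpoint: record the node sequence
            continue
        for child in children:
            stack.append(path + [child])
    return paths

downstream = {"S1": ["SW1"], "SW1": ["T1", "TIE1"], "TIE1": ["T9"]}
paths = propagation_paths(downstream, "SW1", open_switches={"TIE1"})
```

With the open tie switch `TIE1`, the path toward `T9` is cut off, so only the paths ending at `T1` and `TIE1` are retained for the root hypothesis `SW1`.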
[0033] In practical implementation, the state inversion module initiates the state inversion calculation process based on the inferred fault propagation path, starting from the known end-load side state and working backward to calculate the theoretical operating state of each upstream node. In some embodiments, the terminal point of each path in the fault propagation path set is determined, which is typically a disconnected tie switch or a terminal distribution transformer. The theoretical operating state of the terminal point is set as known boundary conditions, including zero current and voltage drop to the residual voltage level. Along the reverse direction of the fault propagation path, the voltage drop equation and power balance equation of the distribution network are applied to calculate the theoretical voltage and current values of the upstream node level by level. During the calculation process, the actual impedance parameters of the line and the load rate correction factor are introduced to refine the calculation results. Optionally, the voltage drop equation can be expressed as:
[0034]
$$U_{\mathrm{up}} = U_{\mathrm{down}} + I\left(R\cos\varphi + X\sin\varphi\right)$$
where $U_{\mathrm{up}}$ represents the theoretical voltage value of the upstream node; $U_{\mathrm{down}}$ represents the known voltage value of the downstream node; $I$ represents the current flowing through the line; $R$ represents the resistance of the line; $X$ represents the reactance of the line; and $\varphi$ represents the power factor angle. The calculation process is repeated until the starting point of the fault propagation path is reached, completing the state inversion for a single path. In some embodiments, for complex paths with branches, the state inversion calculation starts synchronously from the endpoints of each branch, and power or current is combined at the junction nodes according to Kirchhoff's current law. It is understood that the load rate correction factor adjusts the impedance calculation to the actual load level of the line before the fault. It is understood that the residual voltage level is preset according to the fault type and the system grounding method. Optionally, the power balance equation is constructed using the forward-backward sweep method.
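A single-path state inversion can be sketched as below. The sketch assumes the voltage drop relation takes the common approximate form U_up = U_down + I (R cos φ + X sin φ) on a single-phase-equivalent basis, assumes the current of each line segment is already known, and applies the load rate correction factor as a simple multiplier on the drop. All names and numbers are hypothetical, and branch merging by Kirchhoff's current law is not covered.

```python
import math

def invert_path(path, segments, end_voltage_kv, angle_rad, load_factor=1.0):
    """Walk a path from its endpoint back to its start.

    path: node sequence from the starting point to the endpoint.
    segments: dict (upstream, downstream) -> (current_a, r_ohm, x_ohm).
    Returns a dict of theoretical node voltages in kV.
    """
    voltages = {path[-1]: end_voltage_kv}   # boundary condition at the endpoint
    pairs = zip(reversed(path[:-1]), reversed(path[1:]))
    for upstream, downstream in pairs:
        current_a, r_ohm, x_ohm = segments[(upstream, downstream)]
        drop_v = current_a * load_factor * (
            r_ohm * math.cos(angle_rad) + x_ohm * math.sin(angle_rad)
        )
        voltages[upstream] = voltages[downstream] + drop_v / 1000.0  # V -> kV
    return voltages

path = ["S1", "SW1", "T1"]
segments = {("S1", "SW1"): (100.0, 0.2, 0.35), ("SW1", "T1"): (50.0, 0.1, 0.08)}
theoretical = invert_path(path, segments, end_voltage_kv=9.80, angle_rad=0.0)
```

With a power factor angle of zero only the resistive term contributes, so the upstream voltages rise by 5 V and then 20 V over the endpoint value in this toy case.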
[0035] See Figure 3, which shows the attenuation characteristics of electrical quantities along a fault propagation path. It illustrates how fault current and voltage attenuate along the propagation path after a fault occurs in a distribution network, and it is a core visualization result of the causal reasoning stage. At the fault origin, both current and voltage are at 100% of their initial values. As the fault propagates toward the load side, both decrease monotonically, consistent with the physical law that fault current flows from the power source side to the load side in a distribution network. When the fault propagates to terminal distribution transformer 2, the fault current has attenuated to 30% and the fault voltage to 50%, showing that current is more sensitive than voltage to fault propagation. This provides data verification for the operating logic of feeder protection devices and fault indicators, helping to prevent false tripping or failure to trip. The deviation between the measured and simulated attenuation curves can be used to optimize the weight parameters of the large model encoding and causal reasoning, improving location accuracy.
[0036] In one embodiment of the present invention, the fault location module compares the theoretical operating state obtained from the state inversion calculation with the measured data in the structured fault scenario original dataset to identify suspicious fault sections. The module retrieves the theoretical voltage and current values of each node output by the state inversion calculation and extracts the measured voltage and current values of the corresponding nodes from the structured fault scenario original dataset. For each node, it calculates the absolute value of the difference between the measured and theoretical values and the corresponding percentage deviation. A deviation threshold is set; when the deviation percentage of a node exceeds the threshold, the node is determined to have a state anomaly. All nodes with state anomalies are marked as suspicious fault sections, and their specific deviation values are recorded. For each suspicious fault section, the module extracts the electrical quantity change trend characteristics before and after the fault and combines them with the fault propagation path direction output by the causal reasoning engine to calculate the probability score that the fault occurred in that section. Specifically, it extracts the electrical quantity sampling points of the section for the three cycles before and after the fault from the structured fault scenario original dataset, and calculates the current surge and voltage dip depth of the section as characteristic indicators. The direction of the fault propagation path is analyzed to determine whether the section is located upstream, midstream, or downstream on the propagation path. Based on preset weights for the characteristic indicators and the path location, the current surge, voltage dip depth, and path location are weighted and summed, and the weighted sum is mapped to a numerical range of zero to one to obtain the probability score of a fault occurring in the suspected fault section.
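The deviation comparison described in this paragraph can be sketched as follows. The sketch assumes the percentage deviation is taken relative to the theoretical value and that a node is flagged when either its voltage or its current deviation exceeds the threshold; the 5% threshold and the function name are illustrative assumptions. The sample values are the N101 and N305 rows of Table 1.

```python
def find_suspects(theoretical, measured, threshold_pct):
    """Flag nodes whose voltage or current deviation exceeds the threshold.

    theoretical / measured: dict node -> (voltage_kv, current_a).
    Returns dict node -> (voltage deviation %, current deviation %), rounded.
    """
    suspects = {}
    for node, (u_theo, i_theo) in theoretical.items():
        u_meas, i_meas = measured[node]
        # Deviation relative to the theoretical value (assumed convention)
        u_dev = abs(u_meas - u_theo) / abs(u_theo) * 100.0
        i_dev = abs(i_meas - i_theo) / abs(i_theo) * 100.0
        if max(u_dev, i_dev) > threshold_pct:
            suspects[node] = (round(u_dev, 2), round(i_dev, 2))
    return suspects

theoretical = {"N101": (10.12, 125.6), "N305": (9.80, 98.7)}
measured = {"N101": (10.15, 126.1), "N305": (9.78, 0.5)}
suspects = find_suspects(theoretical, measured, threshold_pct=5.0)
```

Here N101 stays below the threshold on both quantities, while N305 is flagged because its measured current has collapsed relative to the theoretical value.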
[0037] In specific implementation, the fault location module compares the theoretical operating state obtained from the state inversion calculation with the measured data in the structured fault scenario original dataset to identify suspicious fault sections. Specifically, it retrieves the theoretical voltage and current values of each node output from the state inversion calculation process. In the same implementation, it extracts the measured voltage and current values of the corresponding nodes from the structured fault scenario original dataset, calculates the absolute value of the difference between the measured and theoretical values for each node, and calculates the percentage deviation of the difference. A deviation threshold is set; when the deviation percentage of a node exceeds the threshold, the node is determined to have an abnormal state. All nodes with abnormal states are marked as suspicious fault sections, and the specific numerical value of their deviation degree is recorded. In some embodiments, the fault location module generates a node state comparison table to organize the comparison results; see Table 1 for node state comparison.
[0038] Table 1: Node Status Comparison Table

| Node number | Theoretical voltage (kV) | Measured voltage (kV) | Voltage deviation (%) | Theoretical current (A) | Measured current (A) | Current deviation (%) |
| N101 | 10.12 | 10.15 | 0.30 | 125.6 | 126.1 | 0.40 |
| N203 | 9.85 | 0.05 | 99.49 | 203.4 | 1500.2 | 637.66 |
| N305 | 9.80 | 9.78 | 0.20 | 98.7 | 0.5 | 99.49 |

In practical implementation, for each suspected fault section, the electrical quantity change trend characteristics before and after the fault are extracted and combined with the fault propagation path direction output by the causal reasoning engine to calculate the probability score that the fault occurred in that section. In some embodiments, the electrical quantity sampling points of the suspected fault section in the three cycles before and after the fault are extracted from the original structured fault scenario dataset, and the current surge and voltage dip depth of the section are calculated as its feature indicators. The direction of the fault propagation path is analyzed to determine whether the suspected fault section is located upstream, midstream, or downstream on the propagation path. According to the preset feature indicator weights and path position weights, the current surge, voltage dip depth, and path position are weighted and summed, and the weighted sum is mapped to a numerical range of zero to one to obtain the probability score of the suspected fault section. The probability score is calculated using the following formula:
[0039]
$$P = \sigma\left(w_1 x_I + w_2 x_U + w_3 x_L\right)$$
where $P$ represents the final probability score; $\sigma$ represents the Sigmoid activation function; $x_I$ represents the normalized current surge feature value; $x_U$ represents the normalized voltage sag depth feature value; $x_L$ represents the numerical location feature obtained by converting the path location; and $w_1$, $w_2$, and $w_3$ represent the weight coefficients of the corresponding features. It can be understood that the feature indicator weights and path location weights are obtained through training on historical fault cases. Optionally, the current surge is calculated as the difference between the effective (RMS) current value of the first cycle after the fault and the average effective current value of the three cycles before the fault. It can be understood that the voltage sag depth is the ratio of the drop in effective voltage at its minimum during the fault period to the rated voltage.
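The scoring step can be sketched as a weighted sum passed through a sigmoid, together with the two feature definitions given in this paragraph. The weights, the normalized feature values, and the numerical encoding of the path position are illustrative assumptions; in the described system the weights come from training on historical fault cases.

```python
import math

def current_surge(pre_fault_rms, first_post_fault_rms):
    """First post-fault cycle RMS minus the mean RMS of the pre-fault cycles."""
    return first_post_fault_rms - sum(pre_fault_rms) / len(pre_fault_rms)

def sag_depth(min_fault_rms_voltage, rated_voltage):
    """Voltage drop at the minimum during the fault, as a ratio of rated voltage."""
    return (rated_voltage - min_fault_rms_voltage) / rated_voltage

def probability_score(x_current, x_voltage, x_position, weights):
    """Sigmoid of the weighted sum of the three normalized features."""
    w_i, w_u, w_p = weights
    z = w_i * x_current + w_u * x_voltage + w_p * x_position
    return 1.0 / (1.0 + math.exp(-z))

surge = current_surge([100.0, 100.0, 100.0], 490.0)   # three pre-fault cycles
depth = sag_depth(50.0, 220.0)
# Hypothetical normalized features and weights
score = probability_score(0.8, depth, 1.0, weights=(2.0, 1.5, 0.5))
```

Because the sigmoid is monotonic, a larger surge, a deeper sag, or a more heavily weighted path position always raises the score when the corresponding weight is positive.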
[0040] See Figure 4, a time-series characteristic diagram of the three-phase currents in a distribution network during a single-phase ground fault, which shows the changes in the three-phase currents before and after the fault. The yellow dashed line marks the moment of fault occurrence, the dividing point of the waveform change. Before the fault, the amplitudes of the three-phase currents are stable within ±600 A and the waveforms are symmetrical sinusoids, consistent with normal operating conditions. After the fault, the A-phase current changes drastically: its amplitude jumps to ±1600 A and it exhibits high-frequency transient oscillation. The amplitudes and frequencies of the B-phase and C-phase currents remain basically stable, with only minor fluctuations due to system coupling. This pattern, a large transient in one phase with no obvious abnormality in the other two, is consistent with the typical electrical characteristics of a single-phase ground fault. The high-frequency oscillation of the A-phase current reflects the transient energy release of the distribution network's capacitance and inductance to ground. The oscillation lasts for 0.1 s, providing a sufficient characteristic time window for transient protection algorithms.
[0041] In one embodiment of the present invention, the fault location module outputs the final fault location result and a structured fault diagnosis report. All suspected fault segments are sorted according to their probability scores, and the highest-ranked segment is selected as the final fault location result. A structured fault diagnosis report containing details of the fault location, fault type, and fault propagation path is generated. The sorting and selection process includes: establishing a candidate list containing segment identifiers, probability scores, deviation percentages, and path numbers; sorting the items in the candidate list in descending order of probability score; if probability scores are the same, comparing the deviation percentages; selecting the top-ranked suspected fault segment as the primary fault location result; and using the two segments with the next-highest probability scores as backup auxiliary judgment results. The report generation process includes: compiling the final fault location result and obtaining its geographical location information and the name of the feeder to which it belongs; determining the specific fault type by matching the electrical quantity change trends of the suspected fault section before and after the fault against the fault type discrimination rule base; integrating into an appendix the key node data involved in the state inversion calculation, the comparison table of theoretical and measured values, and the complete path information inferred by the causal reasoning engine; and, following a preset reporting protocol, filling the fault location, fault type, confidence level, and appendix data into the corresponding fields of the report. A structured fault diagnosis report is then output for dispatchers to review.
[0042] In practical implementation, the fault location module outputs the final fault location result and a structured fault diagnosis report. The module sorts all suspected fault segments according to their probability scores and selects the segment with the highest ranking as the final fault location result. In this implementation, a candidate list is established, containing the segment identifier, probability score, deviation percentage, and path number. Items in the candidate list are sorted in descending order of probability score. In some embodiments, if probability scores are the same, the deviation percentage is compared, and the top-ranked suspected fault segment is selected as the primary fault location result. Simultaneously, the two segments with the next highest probability scores are used as backup auxiliary judgment results. Optionally, the backup auxiliary judgment results are used to handle cross-validation in complex fault scenarios. It can be understood that the sorting process ensures that the primary fault location result has the highest probability score confidence.
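The ranking rule can be sketched with a two-key sort. The text states only that deviation percentages are compared when probability scores are equal; the sketch assumes that the larger deviation ranks first. The field names and sample values are hypothetical.

```python
def rank_candidates(candidates):
    """Sort by probability score (descending), then deviation (descending).

    Returns the primary result and up to two backup auxiliary results.
    """
    ordered = sorted(
        candidates,
        key=lambda c: (-c["score"], -c["deviation_pct"]),
    )
    return ordered[0], ordered[1:3]

candidates = [
    {"segment": "A", "score": 0.90, "deviation_pct": 50.0, "path": 1},
    {"segment": "B", "score": 0.90, "deviation_pct": 99.0, "path": 1},
    {"segment": "C", "score": 0.40, "deviation_pct": 10.0, "path": 2},
    {"segment": "D", "score": 0.70, "deviation_pct": 60.0, "path": 2},
]
primary, backups = rank_candidates(candidates)
```

Segments A and B tie on score, so B is ranked first on its larger deviation; A and D become the backup auxiliary results.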
[0043] In practical implementation, a structured fault diagnosis report containing details of the fault location, fault type, and fault propagation path is generated. The final fault location result is compiled, and its geographical location information and the name of the feeder to which it belongs are obtained. Based on the electrical quantity change trend characteristics of the suspected fault section before and after the fault, the specific fault type is determined by matching against the fault type discrimination rule base. The key node data involved in the state inversion calculation process, the comparison table of theoretical and measured values, and the complete path information inferred by the causal reasoning engine are integrated into an appendix. In some embodiments, according to a preset reporting protocol, the fault location, fault type, confidence level, and appendix data are filled into the corresponding fields of the report, and a structured fault diagnosis report that can be viewed by dispatchers is output. Optionally, the confidence level is calculated using the following formula:
[0044]
$$C = \frac{P_{\max}}{\sum_{i=1}^{N} P_i} \times 100\%$$
where $C$ indicates the percentage confidence level of the primary fault location result; $P_{\max}$ represents the probability score of the primary fault location result; $N$ indicates the total number of suspected fault segments in the candidate list; $P_i$ represents the probability score of the $i$-th suspected fault segment in the candidate list; and the summation accumulates the probability scores of all suspected fault segments. It can be understood that the preset reporting protocol defines the data structure and field order of the structured fault diagnosis report.
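Read from the definitions in this paragraph, the confidence level is the primary segment's probability score as a share of the summed scores of all candidates. A minimal sketch, with a hypothetical function name and sample scores:

```python
def confidence_pct(scores):
    """Primary (highest) score as a percentage of the sum of all candidate scores."""
    total = sum(scores)
    if total <= 0:
        raise ValueError("candidate scores must sum to a positive value")
    return max(scores) / total * 100.0

confidence = confidence_pct([0.9, 0.6, 0.5])   # about 45 percent
```

With a single candidate the confidence is 100%, and it falls as more candidates with comparable scores enter the list, which matches the intent of expressing how clearly the primary result stands out.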
[0045] See Figure 5, a comparison of electrical quantities before and after a distribution network fault. It shows the typical changes in current and voltage before and after a fault and serves as a core visualization basis for fault diagnosis and state inversion. The black dashed line marks the moment of fault occurrence, the dividing point for the abrupt changes in electrical quantities. Before the fault, the normal current is stable at 100 and the normal voltage is stable at 220, indicating stable system operation; the fault-case current and voltage curves completely overlap the normal-condition curves and show no abnormal characteristics. After the fault, the fault current surges to 490 and then oscillates at high frequency between 400 and 600, an amplitude approximately 4 to 6 times that of the normal current. The fault voltage drops instantly to below 100 and then oscillates between 50 and 150, a significant voltage dip. The normal current and voltage curves remain stable and unaffected, indicating that the fault affects only a specific phase or section. The current surge and the voltage dip depth are core characteristics for fault initiation identification and can be used to quickly trigger the fault location process.
[0046] The above are merely preferred embodiments of the present invention and are not intended to limit the present invention in any other way. Any person skilled in the art may make changes or modifications to the above-disclosed technical content to create equivalent embodiments that can be applied to other fields. However, any simple modifications, equivalent changes, and modifications made to the above embodiments based on the technical essence of the present invention without departing from the scope of the present invention shall still fall within the protection scope of the present invention.
Claims
1. A smart fault location system for distribution networks based on large model analysis, characterized in that it comprises: a data processing module, which receives multi-source operational data streams from the distribution network during the fault occurrence period and forms a structured original dataset of the fault scenario; a semantic encoding module, which constructs graph structure data of the distribution network based on the original dataset of the fault scenario, calls a pre-trained large language model, and encodes the graph structure data and the unstructured alarm text in the multi-source operational data streams into a unified semantic embedding vector; a causal reasoning module, which inputs the semantic embedding vector into the causal reasoning engine to infer several potential fault propagation paths; a state inversion module, which, based on the inferred fault propagation paths, initiates the state inversion calculation process, starting from the known end load side state and calculating backward the theoretical operating state of each upstream node; and a fault location module, which compares the theoretical operating state obtained from the state inversion calculation with the measured data in the structured original dataset of the fault scenario to identify suspicious fault sections, extracts for each suspicious fault section the electrical quantity change trend features before and after the fault, calculates and sorts the probability scores of the suspected fault sections in combination with the fault propagation path direction output by the causal reasoning engine, and finally outputs the fault location result and the structured fault diagnosis report.
2. The intelligent fault location system for distribution networks based on large model analysis as described in claim 1, characterized in that forming the structured original dataset of the fault scenario includes: the multi-source operational data streams include three-phase current and voltage waveforms collected by the distribution transformer terminal, position signals of feeder switches, and the operating status of fault indicators; the multi-source operational data streams are synchronized in time and mapped to spatial coordinates, unifying the data streams from different devices to the same time reference and geographic topological coordinate system to form the structured original dataset of the fault scenario, specifically including: the nodes in the graph structure data represent power equipment, the edges represent line connections, and real-time electrical measurement attributes are assigned to the nodes and edges; the three-phase current and voltage waveforms collected by the distribution transformer terminal are analyzed, the amplitude and phase angle of the fundamental component are extracted, and missing data points are filled by linear interpolation; the position change signals of the feeder switches and the operating status of the fault indicators are converted into standardized event logs with timestamps; the static topology diagram of the distribution network is read to obtain the physical installation coordinates of each power device and its logical relationship with the feeder to which it belongs; based on the physical installation coordinates and the logical relationship with the feeder, each timestamped event log is associated with the corresponding node or edge in the graph structure data; and, following a unified timeline, the completed three-phase current and voltage waveform data, the standardized event logs, and the static equipment attributes are integrated in wide-table form into the structured original dataset of the fault scenario.
3. The intelligent fault location system for distribution networks based on large model analysis as described in claim 2, characterized in that, Based on the original dataset of the aforementioned fault scenarios, graph-structured data of the distribution network is constructed, including: Traverse the static topology wiring diagram of the distribution network and create a node set containing all substation outgoing switches, sectionalizing switches, tie switches, and distribution transformers; Based on the physical connection relationship of the lines, edge connections are established between the node sets, and each edge is assigned the line length, type and impedance parameters; Read the real-time current and voltage values of each node from the original structured fault scenario dataset, and attach the real-time current and voltage values as attributes to the corresponding node objects. For the feeder switch that has undergone displacement and the fault indicator that has been activated, add a fault feature label to the corresponding node or edge attribute; The completed graph structure data, which includes node attributes, edge attributes, and topological connections, is serialized and stored for use in subsequent model encoding.
4. The intelligent fault location system for distribution networks based on large model analysis as described in claim 3, characterized in that calling the pre-trained large language model to encode the graph structure data and the unstructured alarm text in the multi-source operational data streams into a unified semantic embedding vector includes: the semantic embedding vector integrates the physical topology information and real-time operating status of the power grid; a graph neural network encoder performs a depth-first traversal of the graph structure data to extract the global topological features and local electrical features of the distribution network and generate a graph embedding vector; a natural language processing encoder performs word embedding and semantic understanding on the unstructured alarm text to capture the fault phenomenon descriptions and equipment correlations in the alarm information and generate a text embedding vector; the graph embedding vector and the text embedding vector are concatenated to form a hybrid feature vector; the hybrid feature vector is fed into the attention mechanism layer of the pre-trained large language model, where the attention mechanism layer dynamically adjusts the fusion weights of the topological features and text features to eliminate redundant information; and the semantic embedding vector, weighted and fused by the attention mechanism, is output, the semantic embedding vector fully representing the current fault status of the distribution network.
5. The intelligent fault location system for distribution networks based on large model analysis as described in claim 4, characterized in that inputting the semantic embedding vector into the causal reasoning engine to infer several potential fault propagation paths includes: the causal reasoning engine operates on the basis of the topological constraints and electromagnetic transient mechanisms of the distribution network; the semantic embedding vector is decoded into a set of candidate root cause hypotheses, each corresponding to the device node in the distribution network where the fault first occurs; starting from each root cause hypothesis, the flow of fault current is simulated on the topology of the graph structure data, following the physical law that current flows from the power source side to the load side, to generate a forward propagation tree; in combination with the fault mechanism knowledge base built into the causal reasoning engine, false propagation trees that do not conform to electrical logic are filtered out and physically feasible propagation paths are retained; for each retained propagation path, its node sequence, edge sequence, and electrical quantity attenuation characteristics during propagation are recorded; and all propagation paths that meet the conditions are aggregated to form the set of potential fault propagation paths.
6. The intelligent fault location system for distribution networks based on large model analysis as described in claim 5, characterized in that initiating the state inversion calculation process based on the inferred fault propagation path, starting from the known end load side state and calculating backward the theoretical operating state of each upstream node, includes: the endpoint of each path in the set of fault propagation paths is determined, the endpoint typically being a disconnected tie switch or a terminal distribution transformer; the theoretical operating state at the endpoint is set as a known boundary condition, which includes zero current and a voltage that has dropped to the residual voltage level; along the reverse direction of the fault propagation path, the voltage drop equation and power balance equation of the distribution network are applied to calculate the theoretical voltage and theoretical current values of each upstream node, level by level; during the calculation, the actual impedance parameters of the line and the load rate correction factor are introduced to refine the calculation results; and the calculation process is repeated until the starting point of the fault propagation path is reached, thus completing the state inversion of a single path.
7. The intelligent fault location system for distribution networks based on large model analysis as described in claim 6, characterized in that, The theoretical operating state obtained from the state inversion calculation is compared with the measured data in the original structured fault scenario dataset to identify suspicious fault sections, including: Retrieve the theoretical voltage and theoretical current values of each node output by the state inversion calculation process; Extract the measured voltage and measured current values of the corresponding nodes from the original structured fault scenario dataset; Calculate the absolute value of the difference between the measured value and the theoretical value for each node, and calculate the percentage deviation of the difference; A deviation threshold is set. When the deviation percentage of a node exceeds the deviation threshold, the node is determined to have an abnormal state. Mark all nodes with abnormal status as the suspected fault segments and record the specific numerical value of their deviation.
8. The intelligent fault location system for distribution networks based on large model analysis as described in claim 7, characterized in that extracting, for each suspected fault section, the electrical quantity change trend characteristics before and after the fault, and calculating, in combination with the fault propagation path direction output by the causal reasoning engine, the probability score that the fault occurred in the suspected fault section, includes: extracting from the structured original dataset of fault scenarios the electrical quantity sampling points of the suspected fault section in the three cycles before and after the fault; calculating the current surge and voltage dip depth of the suspected fault section as its characteristic indicators; analyzing the direction of the fault propagation path to determine whether the suspected fault section is located upstream, midstream, or downstream on the propagation path; computing a weighted sum of the current surge, voltage dip depth, and path position based on preset feature index weights and path position weights; and mapping the weighted sum to a numerical range of zero to one to obtain the probability score of the suspected fault section.
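The scoring in claim 8 can be sketched under several stated assumptions, since the claim leaves the weights and the mapping function open. Here the current surge is the relative rise in RMS current, the voltage dip depth is the relative fall in RMS voltage, each feature is clipped to the range zero to one, and the weights are normalized so the weighted sum already lies between zero and one. `POSITION_VALUE`, `surge_cap`, and the default weights are hypothetical placeholders.

```python
import math

# Hypothetical numeric values for the path position; not taken from the claim.
POSITION_VALUE = {"upstream": 1.0, "midstream": 0.5, "downstream": 0.0}


def rms(samples):
    """Root mean square of a list of samples."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))


def clip01(x):
    return max(0.0, min(1.0, x))


def probability_score(pre_i, post_i, pre_v, post_v, position,
                      weights=(0.4, 0.4, 0.2), surge_cap=4.0):
    """Weighted sum of current surge, voltage dip depth, and path position.

    pre_* / post_* hold the samples from the three cycles before and after
    the fault. The result is in [0, 1].
    """
    i_pre, i_post = rms(pre_i), rms(post_i)
    v_pre, v_post = rms(pre_v), rms(post_v)
    # Relative current rise, scaled by an assumed cap and clipped to [0, 1].
    if i_pre > 0:
        surge = clip01((i_post / i_pre - 1.0) / surge_cap)
    else:
        surge = 1.0 if i_post > 0 else 0.0
    # Relative voltage fall, clipped to [0, 1].
    dip = clip01(1.0 - v_post / v_pre) if v_pre > 0 else 0.0
    pos = POSITION_VALUE[position]
    w_i, w_v, w_p = weights
    total = w_i + w_v + w_p
    return (w_i * surge + w_v * dip + w_p * pos) / total
```

Clipping and normalizing is only one way to reach the zero-to-one range; a logistic function applied to the raw weighted sum would be another reasonable choice.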
9. The intelligent fault location system for distribution networks based on large model analysis as described in claim 8, characterized in that outputting the final fault location result and the structured fault diagnosis report includes: sorting all suspected fault sections according to their probability scores and selecting the section with the highest score as the final fault location result; and generating a structured fault diagnosis report that includes details of the fault location, fault type, and fault propagation path; wherein sorting all suspected fault sections according to their probability scores and selecting the highest-ranked section as the final fault location result includes: creating a candidate list that includes the section identifier, probability score, deviation percentage, and path number; sorting the items in the candidate list in descending order of probability score and, where probability scores are equal, comparing their deviation percentages; selecting the suspected fault section at the top of the list as the primary fault location result; and retaining the two sections with the next-highest probability scores as backup auxiliary results for cross-validation in complex fault scenarios.
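The ranking step in claim 9 can be sketched as below. The claim says that deviation percentages are compared when probability scores are equal but does not state the direction, so this example assumes the larger deviation ranks first. The `Candidate` class and `rank_candidates` function are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """One entry of the candidate list described in the claim."""
    section_id: str
    probability: float
    deviation_pct: float
    path_no: int


def rank_candidates(candidates):
    """Return (primary, backups) from a list of Candidate entries.

    Sort key: probability score descending, then deviation percentage
    descending as the assumed tie-breaker.
    """
    ordered = sorted(candidates,
                     key=lambda c: (-c.probability, -c.deviation_pct))
    primary = ordered[0] if ordered else None
    backups = ordered[1:3]  # the two next-highest entries, if present
    return primary, backups
```

With fewer than three candidates the backup list simply comes back shorter, which keeps the caller free of special cases.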
10. The intelligent fault location system for distribution networks based on large model analysis as described in claim 9, characterized in that generating a structured fault diagnosis report that includes details of the fault location, fault type, and fault propagation path includes: collating the final fault location result and obtaining its geographical location information and the name of the feeder to which it belongs; determining the specific fault type by matching the electrical quantity change trend characteristics of the suspected fault section before and after the fault against the fault type discrimination rule base; integrating into the appendix the key node data involved in the state inversion calculation process, the comparison table of theoretical and measured values, and the complete path information inferred by the causal reasoning engine; filling in the corresponding fields of the report with the fault location, fault type, confidence level, and attachment data according to the preset reporting protocol; and outputting a structured fault diagnosis report that can be reviewed by dispatchers.
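The report assembly in claim 10 can be sketched as follows. The claim does not disclose the contents of the fault type discrimination rule base or the reporting protocol, so the rules are passed in by the caller and every field name, along with the choice of JSON as output format, is a hypothetical placeholder.

```python
import json


def classify_fault(features, rules):
    """Return the name of the first rule whose predicate matches.

    `rules` is a list of (fault_type_name, predicate) pairs standing in for
    the rule base, whose real contents the claim does not disclose.
    """
    for name, predicate in rules:
        if predicate(features):
            return name
    return "unclassified"


def build_report(section_id, geo_location, feeder_name, confidence,
                 features, rules, appendix):
    """Fill the report fields and serialize them for dispatcher review."""
    report = {
        "fault_location": {
            "section_id": section_id,
            "geo_location": geo_location,
            "feeder_name": feeder_name,
        },
        "fault_type": classify_fault(features, rules),
        "confidence": confidence,
        # Key node data, theoretical-vs-measured table, and path information.
        "attachments": appendix,
    }
    return json.dumps(report, ensure_ascii=False, indent=2)
```

Keeping the rule base as plain data passed by the caller means it can be replaced without touching the report builder.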