A power distribution network single-end traveling wave fault location method and system based on a PINN-Transformer model

By introducing the PINN-Transformer model and combining it with physical consistency constraints, the accuracy and interpretability issues of fault location in distribution networks under high-resistance grounding and strong noise scenarios are solved, achieving high-precision and low-cost fault location.

CN122017474B · Active · Publication Date: 2026-06-19 · CHANGSHA UNIVERSITY OF SCIENCE AND TECHNOLOGY

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Patents(China)
Current Assignee / Owner: CHANGSHA UNIVERSITY OF SCIENCE AND TECHNOLOGY
Filing Date: 2026-04-15
Publication Date: 2026-06-19

AI Technical Summary

Technical Problem

Existing fault location methods for power distribution networks lack sufficient accuracy in scenarios with high-resistance grounding and strong noise interference. Traditional deep learning models lack physical interpretability and are prone to overfitting, leading to physically infeasible solutions and location errors.

Method used

A fault location method based on the PINN-Transformer model is adopted. By acquiring the transient time-frequency map of the fault voltage in real time, global time-frequency features are extracted using a multi-head attention layer and a feedforward neural network. A joint loss function and physical consistency constraints are introduced to train the model to output the fault feeder results and distance.

Benefits of technology

It enhances the interpretability and positioning accuracy of the model, reduces non-physically feasible solutions, lowers hardware costs, and achieves high-precision fault location under complex working conditions.

✦ Generated by Eureka AI based on patent content.

Abstract

This invention discloses a method and system for locating single-ended traveling wave faults in distribution networks based on the PINN-Transformer model. The method includes: real-time acquisition of transient time-frequency diagrams of fault voltage and inputting them into a trained PINN-Transformer model, and synchronously outputting the fault feeder results and precise location distance. This invention combines the advantages of deep learning in fault time-frequency feature extraction with the mechanistic constraints of physical information neural networks. By constructing differentiable physical residual terms, it ensures that the location results meet physical consistency requirements, thereby improving the ranging accuracy under complex operating conditions to a certain extent.

Description

Technical Field

[0001] This invention relates to the field of distribution network fault location technology, specifically to a method and system for locating single-ended traveling wave faults in distribution networks based on the PINN-Transformer model.

Background Technology

[0002] Existing methods for fault location in distribution networks mainly include impedance methods, traveling wave methods, signal injection methods, distribution automation methods, and artificial intelligence algorithms. The impedance method is constrained by frequent changes in distribution network topology, transformer saturation, and the high proportion of distributed power sources, so its accuracy and reliability are insufficient in high-resistance grounding scenarios. The traveling wave method responds quickly, but the short lines and numerous branches of distribution networks cause severe reflection and aliasing of the traveling wave signal; traditional models based on the wavefront arrival time difference struggle to capture the effective high-frequency wavefront under high-resistance faults, which easily causes location errors. The signal injection method actively injects a signal of a specific frequency into the grid through the transformer neutral point or dedicated injection equipment after a single-phase grounding fault occurs. Although this method offers high location accuracy and is not affected by the neutral point grounding method, its hardware investment cost is high, and its positioning performance is easily affected by parameters such as the conductor-to-ground distributed capacitance and the grounding resistance. Traditional artificial intelligence algorithms use deep learning to build a feature-to-location mapping model. Although such a model has the advantage of nonlinear representation, as a purely data-driven model it is essentially a black box. When processing transient signals, it tends to over-rely on the statistical correlations of the samples, and it is prone to overfitting when the fault conditions change or the sample size is limited. In addition, the internal logic of the model lacks transparency and cannot guarantee that the output location result conforms to the basic physical mechanisms of the power system, so the model suffers from insufficient generalization ability and poor interpretability.

[0003] In traveling wave fault location scenarios, purely data-driven models such as Transformer and CNN-LSTM are prone to outputting physically infeasible solutions under conditions such as high-resistance grounding and strong noise interference. This leads to physical inconsistencies, such as a location result that exceeds the total line length, or voltage and current traveling wave increments that do not satisfy the transient mechanism.

Summary of the Invention

[0004] In view of this, the present invention provides a method and system for locating single-ended traveling wave faults in distribution networks based on the PINN-Transformer model, which at least solves the problems of strong parameter dependence of traveling wave physical models and limited generalization performance of traditional deep learning methods in high impedance and high noise scenarios in the prior art.

[0005] To achieve the above objectives, the present invention adopts the following technical solution:

[0006] A method for locating single-ended traveling wave faults in a distribution network based on the PINN-Transformer model includes the following steps:

[0007] The transient time-frequency graph of the fault voltage is acquired in real time and input into the trained PINN-Transformer model, which simultaneously outputs the fault feeder results and precise location distance. The training process of the PINN-Transformer model includes:

[0008] S1: Collect fault traveling wave samples within a set time window after the fault, and obtain the input features corresponding to the fault voltage transient time-frequency diagram of each sample;

[0009] S2: Feed the input features into N encoder modules in parallel to capture the global time-frequency features of the input signal. Each encoder module includes a multi-head attention layer and a feedforward neural network. The output feature vectors of the N encoder modules are passed in parallel to the fault line selection branch and the fault distance measurement branch to obtain the fault probability distribution and the predicted fault distance, respectively;

[0010] S3: Use the backpropagation algorithm to convert the joint loss function into a gradient flow that drives the network parameters toward the physically consistent region of the solution space until the PINN-Transformer model converges. The joint loss function includes a data loss term, a transient physical consistency loss term, and a boundary constraint term.

[0011] Preferably, the specific content of S1 includes:

[0012] S11. Construct an electromagnetic transient model of the distribution network, set up multiple feeder branches and simulate the fault conditions formed based on key fault parameters, including single-phase grounding fault type, transition resistance and fault initial phase angle.

[0013] S12. Record the transient voltage of the bus and the transient current of the outgoing line after the fault occurs, and extract the transient short-window voltage sequence and current sequence after the fault triggering time;

[0014] S13. Establish a sample index table and associate the fault location with the branch number label;

[0015] S14. Perform continuous wavelet transform on the voltage sequence and then normalize it to obtain the transient time-frequency diagram of the fault voltage;

[0016] S15. Map the transient time-frequency diagram of the fault voltage to the input matrix X, and then superimpose positional encoding to obtain the input features of the deep learning model.

[0017] Preferably, the specific content of S15 includes:

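[0018] The transient time-frequency diagram of the fault voltage is mapped to the input matrix $X \in \mathbb{R}^{T \times d}$, where $\mathbb{R}$ indicates that X is a real matrix, T is the number of time steps, and d is the feature dimension;

[0019] Positional encodings are generated using sine and cosine functions:

[0020] $PE_{(pos,\,2i)} = \sin\left(pos / 10000^{2i/d}\right)$;

[0021] $PE_{(pos,\,2i+1)} = \cos\left(pos / 10000^{2i/d}\right)$;

[0022] In the formula, $pos$ indicates the position index of the time step, and $i$ indicates the feature dimension index;

[0023] The positional encoding is superimposed on the input matrix, and the result is used as the input features $X_{in}$ of the deep learning model, represented as:

[0024] $X_{in} = X + PE$;

[0025] Where PE represents the positional encoding.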

[0026] Preferably, the specific content of S2 includes:

[0027] S21. The input features are fed into N encoder modules in parallel, and the following steps are performed in each encoder module:

[0028] ① The input features are fed into each of the h attention heads;

[0029] ② In each attention head, the input features are mapped by the weight matrices $W^Q$, $W^K$ and $W^V$ to obtain the query matrix $Q$, the key matrix $K$ and the value matrix $V$; the corresponding single-head attention output is then obtained from $Q$, $K$ and $V$;

[0030] ③ Multi-head attention executes step ② in parallel across all h heads, concatenates the outputs of all attention heads, and then applies a linear transformation to obtain the final output of the multi-head attention layer, which is passed through a residual connection and layer normalization;

[0031] ④ The result is fed into a feedforward neural network (FFN). The FFN output features are obtained after two fully connected layers, and the final output feature vector is generated after a residual connection and layer normalization;

[0032] S22. The final output feature vector is fed in parallel into the fault line selection branch and the fault distance measurement branch, which output the fault probability distribution over the feeder branches and the predicted fault distance, respectively.

[0033] Preferably, step ② includes the following:

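[0034] The input features $X_{in}$ are mapped by the weight matrices $W^Q$, $W^K$ and $W^V$ to obtain the query matrix $Q$, the key matrix $K$ and the value matrix $V$:

[0035] $Q = X_{in}W^Q,\quad K = X_{in}W^K,\quad V = X_{in}W^V$;

[0036] The attention scores are normalized with the Softmax function to obtain weights, which are then applied to $V$ to obtain the corresponding single-head attention output:

[0037] $\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\left(QK^{T} / \sqrt{d_k}\right)V$;

[0038] In the formula, $d_k$ is the number of columns of $Q$ and $K$, and $\sqrt{d_k}$ is the scaling factor.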

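[0039] Preferably, the final output of the multi-head attention layer in step ③ is:

[0040] $\mathrm{MultiHead}(Q, K, V) = \mathrm{Concat}(head_1, \dots, head_h)\,W^O$;

[0041] $head_i = \mathrm{Attention}\left(X_{in}W_i^Q,\ X_{in}W_i^K,\ X_{in}W_i^V\right)$;

[0042] In the formula, $head_i$ is the output of the i-th self-attention head; $W_i^Q$, $W_i^K$ and $W_i^V$ are the weight matrices of the i-th attention head; and $W^O$ is the weight matrix of the linear transformation applied after concatenation.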

[0043] Preferably, step ④ includes the following:

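[0044] The output of the multi-head attention sublayer, denoted $x$, is fed into a feedforward neural network (FFN) with two fully connected layers, where the intermediate layer uses the ReLU activation function, to obtain the FFN output features:

[0045] $\mathrm{FFN}(x) = \max(0,\ xW_1 + b_1)\,W_2 + b_2$;

[0046] In the formula, $W_1$ and $W_2$ are the weight matrices, and $b_1$ and $b_2$ are the bias terms;

[0047] After a residual connection and layer normalization, the final output feature vector $Z$ is generated:

[0048] $Z = \mathrm{LayerNorm}\left(x + \mathrm{FFN}(x)\right)$;

[0049] $\mathrm{LayerNorm}(x) = \gamma \odot \dfrac{x - \mu}{\sqrt{\sigma^2 + \epsilon}} + \beta$;

[0050] In the formula, $\mu$ and $\sigma$ are the mean and standard deviation of the input, respectively; $\epsilon$ is a tiny quantity that prevents division by zero; and $\gamma$ and $\beta$ are the learnable affine transformation parameters.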

[0051] Preferably, the specific content of S3 includes:

[0052] Within each training epoch, the training set samples are shuffled and divided into batches, and the following is performed on each batch:

[0053] Perform forward propagation on the fault transient time-frequency diagrams of the batch, and compute the joint loss value from the joint loss function;

[0054] Perform backpropagation: using the chain rule, compute the gradient of the joint loss value with respect to all learnable parameters of the network;

[0055] Compute the first-order and second-order moment estimates of the gradient, and update the parameters at the current time step by combining these estimates with the current learning rate;

[0056] After each training epoch, the model's generalization performance and physical residual level are evaluated using the validation set:

[0057] When the joint loss of the validation set no longer decreases, the physical residual converges to the preset physical tolerance threshold, or the maximum number of iterations is reached, the model is determined to have converged, the early stopping mechanism is triggered to end the training, and the current optimal parameters are solidified to obtain the final PINN-Transformer fault location model.

[0058] Otherwise, continue to the next training round until convergence.

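[0059] Preferably, the gradient $g_t$ is:

[0060] $g_t = \nabla_{\theta} L(\theta_t)$;

[0061] Update the first-order moment estimate $m_t$ and the second-order moment estimate $v_t$:

[0062] $m_t = \beta_1 m_{t-1} + (1 - \beta_1)\, g_t$;

[0063] $v_t = \beta_2 v_{t-1} + (1 - \beta_2)\, g_t^2$;

[0064] In the formula, $\beta_1$ and $\beta_2$ are both exponential decay rates;

[0065] Apply bias correction to the first-order moment estimate $m_t$ and the second-order moment estimate $v_t$ to eliminate the bias toward zero caused by initialization:

[0066] $\hat{m}_t = m_t / (1 - \beta_1^t)$;

[0067] $\hat{v}_t = v_t / (1 - \beta_2^t)$;

[0068] In the formula, $\hat{m}_t$ and $\hat{v}_t$ are the corrected values of $m_t$ and $v_t$, respectively;

[0069] The parameters at the current time step are updated by combining the corrected values with the current learning rate as follows:

[0070] $\theta_{t+1} = \theta_t - \eta\, \hat{m}_t / \left(\sqrt{\hat{v}_t} + \epsilon\right)$;

[0071] In the formula, $\theta_{t+1}$ denotes the updated parameters at the next time step, $\theta_t$ denotes the parameters at the current time step, $\eta$ is the learning rate, and $\epsilon$ is a small constant that prevents division by zero.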
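
The moment estimation, bias correction and parameter update described above correspond to the standard Adam rule. The following is a minimal pure-Python sketch of one update step. The function name `adam_step`, the default hyperparameters and the toy quadratic objective are illustrative assumptions, not values taken from the patent.

```python
import math

def adam_step(theta, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a flat list of parameters (t starts at 1)."""
    # Exponential moving averages of the gradient and the squared gradient
    m = [beta1 * mi + (1 - beta1) * g for mi, g in zip(m, grad)]
    v = [beta2 * vi + (1 - beta2) * g * g for vi, g in zip(v, grad)]
    # Bias correction removes the pull toward zero caused by zero initialisation
    m_hat = [mi / (1 - beta1 ** t) for mi in m]
    v_hat = [vi / (1 - beta2 ** t) for vi in v]
    theta = [p - lr * mh / (math.sqrt(vh) + eps)
             for p, mh, vh in zip(theta, m_hat, v_hat)]
    return theta, m, v

# Toy objective (assumption for illustration): f(theta) = sum(theta_i ** 2)
theta, m, v = [1.0, -2.0], [0.0, 0.0], [0.0, 0.0]
f_initial = sum(p * p for p in theta)
for t in range(1, 2001):
    grad = [2.0 * p for p in theta]
    theta, m, v = adam_step(theta, grad, m, v, t, lr=0.01)
f_final = sum(p * p for p in theta)
```

On the first step the corrected moments reduce to the raw gradient and its square, so each parameter moves by roughly one learning rate toward the minimum regardless of the gradient scale.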

[0072] Preferably, the joint loss function is:

[0073] $L = L_{data} + \lambda_1 L_{phy} + \lambda_2 L_{bc}$;

[0074] In the formula, $L_{data}$ is the data loss term; $L_{phy}$ is the transient physical consistency loss term; $L_{bc}$ is the boundary constraint term; $\lambda_1$ and $\lambda_2$ are the regularization weight coefficients of the physical constraint terms;

[0075] (1) Data loss term $L_{data}$:

[0076] ;

[0077] ;

[0078] ;

[0079] In the formula, the two weighting coefficients balance the regression loss and the cross-entropy loss. The regression loss is computed from the actual fault distance and the model-predicted distance of the i-th sample, where n is the number of samples in the current training batch. In the cross-entropy loss, K is the total number of feeder branches, the label indicator gives the actual branch of each sample, and the predicted probability is the model's probability that the i-th sample belongs to the k-th branch;

[0080] (2) Transient physical consistency loss term $L_{phy}$:

[0081] Select a short time window near the fault wavefront, and take the baseline values before the short window to construct the incremental signals:

[0082] $\Delta u(t) = u(t) - u_0$;

[0083] $\Delta i(t) = i(t) - i_0$;

[0084] In the formula, $\Delta u(t)$ is the transient voltage increment, $u(t)$ is the transient bus voltage, $\Delta i(t)$ is the transient current increment, $i(t)$ is the corresponding transient outgoing line current, and $u_0$ and $i_0$ are the baseline values of $u(t)$ and $i(t)$, respectively. According to traveling wave theory, the short-time traveling wave components satisfy an approximate consistency relation:

[0085] $\Delta u(t) \approx Z_c\, \Delta i(t)$;

[0086] $Z_c = \sqrt{L_0 / C_0}$;

[0087] In the formula, $Z_c$ is the characteristic impedance of the line, $L_0$ is the line inductance, and $C_0$ is the line capacitance;

[0088] Define the transient voltage-current consistency residual as:

[0089] $r(t) = \Delta u(t) - Z_c\, \Delta i(t)$;

[0090] To eliminate amplitude scale differences caused by different voltage levels and load fluctuations, a normalization factor (den) is introduced to construct a dimensionless relative physical residual, and the physical loss is defined in the form of a mean absolute error (MAE):

[0091] ;

[0092] ;

[0093] Then:

[0094] ;

[0095] where the result is a dimensionless index obtained by normalizing the amplitude of the physical consistency residual. It is used both as a training constraint and for evaluating the confidence of the result;

[0096] (3) Boundary constraint term:

[0097] ;

[0098] In the formula, the upper bound is the maximum distance from the measuring end to the far end of the line. When the predicted distance lies within the feasible range, the term is 0; when the predicted distance exceeds the boundary, the penalty increases rapidly, so that gradient descent forces the model to keep its predictions within the physically feasible region.
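
The three loss terms described above can be sketched in pure Python as follows. Several specifics are not recoverable from the text and are assumptions made only for illustration: the regression loss is taken as a mean squared error, the normalization factor den as the peak voltage increment plus a small constant, the sign convention of the residual as Δu − Z_c·Δi, and the boundary penalty as a squared hinge. All function names and weights are hypothetical.

```python
import math

def data_loss(d_true, d_pred, labels, probs, w_reg=1.0, w_cls=1.0):
    """Weighted sum of a regression loss (MSE assumed) and cross-entropy."""
    n = len(d_true)
    reg = sum((a - b) ** 2 for a, b in zip(d_true, d_pred)) / n
    ce = -sum(math.log(max(p[k], 1e-12)) for k, p in zip(labels, probs)) / n
    return w_reg * reg + w_cls * ce

def physical_loss(u, i, u0, i0, z_c, eps=1e-9):
    """Normalised MAE of the residual du - z_c * di over a short window."""
    du = [x - u0 for x in u]          # transient voltage increment
    di = [x - i0 for x in i]          # transient current increment
    den = max(abs(x) for x in du) + eps   # assumed normalisation factor
    return sum(abs(a - z_c * b) for a, b in zip(du, di)) / (len(du) * den)

def boundary_loss(d_pred, l_max):
    """Zero inside [0, l_max]; grows quadratically outside (assumed form)."""
    return sum(max(0.0, d - l_max) ** 2 + max(0.0, -d) ** 2
               for d in d_pred) / len(d_pred)

def joint_loss(l_data, l_phys, l_bound, lam1=0.1, lam2=1.0):
    """Data loss plus weighted physical and boundary penalties."""
    return l_data + lam1 * l_phys + lam2 * l_bound

# Illustrative numbers: z_c = sqrt(L0 / C0) is taken as 300 ohms here
l_data = data_loss([1.0, 2.0], [1.5, 2.0], [0, 1], [[0.5, 0.5], [0.25, 0.75]])
l_phys_consistent = physical_loss([0.0, 30.0, 60.0], [0.0, 0.1, 0.2], 0.0, 0.0, 300.0)
l_phys_violated = physical_loss([0.0, 30.0, 90.0], [0.0, 0.1, 0.2], 0.0, 0.0, 300.0)
l_bound = boundary_loss([5.0, 12.0, -1.0], l_max=10.0)
total = joint_loss(l_data, l_phys_violated, l_bound)
```

With these definitions a window whose voltage and current increments follow the characteristic impedance gives a physical loss near zero, while a predicted distance outside the line length is penalized only by the boundary term.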

[0099] A single-ended traveling wave fault location system for a distribution network based on the PINN-Transformer model includes a processor, a memory, and a computer program stored in the memory and executable by the processor. When the processor runs the computer program, it implements the single-ended traveling wave fault location method for a distribution network based on the PINN-Transformer model as described above.

[0100] As can be seen from the above technical solution, compared with the prior art, the present invention discloses a method and system for locating single-ended traveling wave faults in distribution networks based on the PINN-Transformer model, which has the following beneficial effects:

[0101] 1. This invention enhances the interpretability of pure data-driven models: By introducing voltage-current traveling wave transient consistency constraints, this invention embeds physical prior knowledge into the deep learning training process, effectively suppressing the non-physically feasible solutions generated by the model under high-resistance grounding and strong noise interference. Furthermore, it can output physical residuals as confidence indicators for the location results, achieving synchronous output of location results and confidence levels. To a certain extent, this enhances the interpretability of achieving accurate fault location using pure data-driven models.

[0102] 2. This invention improves the model's positioning accuracy: by reducing the model's solution space through physical regularization constraints, it suppresses non-physically feasible solutions to a certain extent, thereby enabling the model to maintain higher accuracy in line selection and positioning output even under conditions such as high resistance and strong noise.

[0103] 3. This invention has stronger implementability: It only requires single-ended bus voltage and same-ended outgoing current signals, eliminating the need for double-ended synchronization or additional injection equipment, thus greatly reducing hardware investment and maintenance costs. The system architecture has good versatility, making it easy to migrate algorithms and scale up deployment in existing distribution network automation terminals and monitoring and control devices.

Description of the Drawings

[0104] To more clearly illustrate the technical solutions in the embodiments of the present invention or the prior art, the drawings used in the description of the embodiments or the prior art will be briefly introduced below. Obviously, the drawings described below are only some embodiments of the present invention. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.

[0105] Figure 1 is a flowchart of the single-ended traveling wave fault location method for distribution networks based on the PINN-Transformer model provided by the present invention;

[0106] Figure 2 is a structural diagram of the PINN-Transformer fault location model used in the method provided by the present invention;

[0107] Figure 3 is a diagram of the IEEE 14-node distribution network topology provided in an embodiment of the present invention;

[0108] Figure 4 is a comparison chart of the positioning effects of four models provided in an embodiment of the present invention;

[0109] Figure 5 is a comparison chart of the absolute positioning errors of Transformer and PINN-Transformer under different transition resistances provided in an embodiment of the present invention;

[0110] Figure 6 is a graph of feeder identification accuracy versus the number of model iterations provided in an embodiment of the present invention;

[0111] Figure 7 is a comparison chart of the positioning error of Transformer and PINN-Transformer as a function of the physical consistency residual provided in an embodiment of the present invention.

Detailed Implementation

[0112] The technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of the present invention, and not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.

[0113] This invention provides a method for locating single-ended traveling wave faults in distribution networks based on the PINN-Transformer model. As shown in Figure 1, it includes the following steps:

[0114] The transient time-frequency graph of the fault voltage is acquired in real time and input into the trained PINN-Transformer model, which simultaneously outputs the fault feeder results and precise location distance. The training process of the PINN-Transformer model includes:

[0115] S1: Collect fault traveling wave samples within a set time window after the fault, and obtain the input features corresponding to the fault voltage transient time-frequency diagram of each sample;

[0116] S2: Feed the input features into N encoder modules in parallel to capture the global time-frequency features of the input signal. Each encoder module includes a multi-head attention layer and a feedforward neural network. The output feature vectors of the N encoder modules are passed in parallel to the fault line selection branch and the fault distance measurement branch to obtain the fault probability distribution and the predicted fault distance, respectively;

[0117] S3: Use the backpropagation algorithm to convert the joint loss function into a gradient flow that drives the network parameters toward the physically consistent region of the solution space until the PINN-Transformer model converges. The joint loss function includes a data loss term, a transient physical consistency loss term, and a boundary constraint term.

[0118] It should be noted that:

[0119] The specific application process of the PINN-Transformer model in the online monitoring scenario of the power distribution network after training is as follows:

[0120] 1. Real-time data acquisition and feature consistency processing: The distribution network monitoring terminal collects the bus voltage and the current of each feeder in real time. When the transient voltage amplitude exceeds the trigger threshold, the system immediately captures the transient short-window voltage and current sequences after the fault trigger moment. To keep the online inference data consistent with the training data distribution, the preprocessing procedure of S1 must be strictly followed: continuous wavelet transform (CWT) and amplitude normalization are applied to the real-time voltage signal to generate a real-time fault transient time-frequency diagram.

[0121] 2. End-to-end fault inference: The processed real-time fault transient time-frequency map is input into the trained and converged model. The model encoder extracts the global time-frequency features of the real-time signal and outputs the fault feeder number k and the fault ranging result simultaneously through the parallel output layer. This process requires no iterative solution and meets the millisecond-level real-time requirements of online monitoring.

[0122] 3. Confidence assessment based on physical residuals: To evaluate the reliability of the positioning results, the system uses the real-time voltage increment and the corresponding feeder current increment within the current short window to calculate the transient physical consistency loss with the physical consistency formula, and compares it with a preset physical tolerance threshold L (which can be set according to actual needs):

[0123] If the loss is within the threshold L, the prediction result conforms to the electromagnetic transient mechanism, and the system outputs a high-confidence positioning command, which can be directly used to guide fault line inspection.

[0124] If the loss exceeds the threshold L, the current operating condition is judged to be subject to complex interference or abnormal parameters, and the system adds a low-confidence alarm while outputting the location result, prompting maintenance personnel to conduct manual analysis.
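
The confidence check in step 3 can be sketched as a small gating function. This is only an illustration: the function name `assess_confidence`, the result fields, the kilometre unit and the treatment of the equality case (a loss equal to the threshold counts as high confidence) are assumptions.

```python
def assess_confidence(feeder, distance_km, phys_loss, tolerance):
    """Attach a confidence level to a location result using the physical loss."""
    high = phys_loss <= tolerance   # equality handling is an assumption
    return {
        "feeder": feeder,
        "distance_km": distance_km,
        "confidence": "high" if high else "low",
        "alarm": not high,          # low confidence prompts manual analysis
    }

# Hypothetical results for two events with a tolerance of 0.05
ok = assess_confidence(feeder=3, distance_km=4.2, phys_loss=0.02, tolerance=0.05)
suspect = assess_confidence(feeder=1, distance_km=7.9, phys_loss=0.31, tolerance=0.05)
```

The location result is returned in both cases; only the confidence label and the alarm flag differ.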

[0125] To further implement the above technical solution, the specific content of S1 includes:

[0126] S11. Construct an electromagnetic transient model of the distribution network, set up multiple feeder branches and simulate the fault conditions formed based on key fault parameters, including single-phase grounding fault type, transition resistance and fault initial phase angle.

[0127] S12. Record the transient voltage of the bus and the transient current of the outgoing line after the fault occurs, and extract the transient short-window voltage sequence and current sequence after the fault triggering time;

[0128] S13. Establish a sample index table and associate the fault location with the branch number label;

[0129] S14. Perform continuous wavelet transform on the voltage sequence and then normalize it to obtain the transient time-frequency diagram of the fault voltage;

[0130] S15. Map the transient time-frequency diagram of the fault voltage to the input matrix X, and then superimpose positional encoding to obtain the input features of the deep learning model.
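
Step S14 above (continuous wavelet transform followed by normalization) can be sketched with a direct, unoptimized implementation. The text does not name the mother wavelet or the normalization scheme, so the complex Morlet wavelet, the scale set, the global-maximum normalization and the synthetic test signal below are all assumptions chosen for illustration.

```python
import cmath
import math

def morlet(t, w0=6.0):
    """Complex Morlet mother wavelet (normalisation constant omitted)."""
    return cmath.exp(1j * w0 * t) * math.exp(-0.5 * t * t)

def cwt_magnitude(signal, scales, dt=1.0):
    """Return |CWT| as a list of rows, one row per scale."""
    n = len(signal)
    rows = []
    for s in scales:
        row = []
        for b in range(n):                      # translation index
            acc = 0j
            for k in range(n):
                acc += signal[k] * morlet((k - b) * dt / s).conjugate()
            row.append(abs(acc) * dt / math.sqrt(s))
        rows.append(row)
    return rows

def normalize(tf_map):
    """Scale the map into [0, 1] by its global maximum (assumed scheme)."""
    peak = max(max(row) for row in tf_map)
    if peak == 0.0:
        return tf_map
    return [[v / peak for v in row] for row in tf_map]

# Synthetic stand-in for a fault transient: a damped oscillation from sample 20
sig = [0.0] * 20 + [math.exp(-0.05 * k) * math.sin(0.8 * k) for k in range(44)]
tf = normalize(cwt_magnitude(sig, scales=[2.0, 4.0, 8.0, 16.0]))
```

The resulting map has one row per scale and one column per time sample. A production pipeline would more likely use an FFT-based transform from a signal-processing library instead of this O(n²) loop per scale.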

[0131] To further implement the above technical solution, the specific content of S15 includes:

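[0132] The transient time-frequency diagram of the fault voltage is mapped to the input matrix $X \in \mathbb{R}^{T \times d}$, where $\mathbb{R}$ indicates that X is a real matrix, T is the number of time steps, and d is the feature dimension;

[0133] Positional encodings are generated using sine and cosine functions:

[0134] $PE_{(pos,\,2i)} = \sin\left(pos / 10000^{2i/d}\right)$;

[0135] $PE_{(pos,\,2i+1)} = \cos\left(pos / 10000^{2i/d}\right)$;

[0136] In the formula, $pos$ indicates the position index of the time step, and $i$ indicates the feature dimension index;

[0137] The positional encoding is superimposed on the input matrix, and the result is used as the input features $X_{in}$ of the deep learning model, represented as:

[0138] $X_{in} = X + PE$;

[0139] Where PE represents the positional encoding.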
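
A minimal sketch of the sine and cosine positional encoding and its superposition on the input matrix is given below. It follows the standard Transformer form, in which even feature indices use sine and odd indices use cosine. The base constant 10000 and the function names are assumptions, since the original formula is not recoverable from the text.

```python
import math

def positional_encoding(T, d):
    """Return a T x d table: sin on even feature indices, cos on odd ones."""
    pe = [[0.0] * d for _ in range(T)]
    for pos in range(T):
        for j in range(0, d, 2):                 # j = 2i
            angle = pos / (10000 ** (j / d))
            pe[pos][j] = math.sin(angle)
            if j + 1 < d:
                pe[pos][j + 1] = math.cos(angle)
    return pe

def add_positional_encoding(X):
    """Superimpose the encoding on an input matrix X given as a list of rows."""
    T, d = len(X), len(X[0])
    pe = positional_encoding(T, d)
    return [[X[t][j] + pe[t][j] for j in range(d)] for t in range(T)]

pe = positional_encoding(T=4, d=6)
x_in = add_positional_encoding([[0.0] * 6 for _ in range(4)])
```

Because the encoding depends only on the time-step index and the feature index, it can be computed once and reused for every sample of the same shape.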

[0140] It should be noted that:

[0141] In this embodiment, the transition resistance is set to 0-3000Ω. During the establishment of the sample index table, each sample file is specifically associated with the fault feeder number and labels such as actual fault distance, fault type, transition resistance, initial phase angle, and target distance. This embodiment uses sine and cosine functions to generate the position code, which can be adjusted as needed in actual applications.

[0142] During training, global initialization is performed first: the Xavier initialization method is used to initialize the weight matrices $W^Q$, $W^K$ and $W^V$ in the Transformer encoder, and the weights and bias terms of the feedforward network are randomly assigned. The maximum number of iterations (Max_Epochs), the batch size (Batch_Size) and the initial learning rate (Learning Rate) are then set, and the Adam optimizer is selected as the core algorithm for parameter updates.

[0143] Sample selection and enhancement strategy: During dataset construction, this invention removes only invalid, non-physical inputs caused by missing or truncated data, and deliberately retains transient samples recorded under high-impedance grounding and strong noise interference. The rationale is that, although the transient features under such conditions are unstable because of the reduced signal-to-noise ratio, they significantly violate the voltage-current consistency law at the physical level, which causes the magnitude of the physical residual term to surge. This invention introduces such samples into the training process and uses the physical penalty term in the joint loss function to strengthen the constraint on the model output. This mechanism suppresses the model's tendency to overfit complex features and forces the network to learn deeper electromagnetic transient consistency features, thereby constraining the solution space in unstable feature regions and improving the ranging accuracy of the system under complex conditions.

[0144] To further implement the above technical solution, as shown in Figure 2, the specific content of S2 includes:

[0145] S21. The input features are fed into N encoder modules in parallel, and the following steps are performed in each encoder module:

[0146] ① The input features are fed into each of the h attention heads;

[0147] ② In each attention head, the input features are mapped by the weight matrices $W^Q$, $W^K$ and $W^V$ to obtain the query matrix $Q$, the key matrix $K$ and the value matrix $V$; the corresponding single-head attention output is then obtained from $Q$, $K$ and $V$;

[0148] ③ Multi-head attention executes step ② in parallel across all h heads, concatenates the outputs of all attention heads, and then applies a linear transformation to obtain the final output of the multi-head attention layer, which is passed through a residual connection and layer normalization;

[0149] ④ The result is fed into a feedforward neural network (FFN). The FFN output features are obtained after two fully connected layers, and the final output feature vector is generated after a residual connection and layer normalization;

[0150] S22. The final output feature vector is fed in parallel into the fault line selection branch and the fault distance measurement branch, which output the fault probability distribution over the feeder branches and the predicted fault distance, respectively.
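
The two parallel output branches of S22 can be sketched as a softmax classifier over the feeder branches and a scalar regressor for the distance. The text does not describe the internal structure of the branches, so the mean pooling over time, the single linear layer per branch and all weight values below are assumptions used only to show the data flow.

```python
import math

def softmax(v):
    m = max(v)
    e = [math.exp(x - m) for x in v]
    s = sum(e)
    return [x / s for x in e]

def mean_pool(Z):
    """Average the encoder output over the time axis (assumed pooling)."""
    T = len(Z)
    return [sum(row[j] for row in Z) / T for j in range(len(Z[0]))]

def output_heads(Z, W_cls, b_cls, w_reg, b_reg):
    """Return (feeder probabilities, predicted distance) from encoder output Z."""
    z = mean_pool(Z)
    logits = [sum(zi * w for zi, w in zip(z, col)) + b
              for col, b in zip(zip(*W_cls), b_cls)]
    distance = sum(zi * w for zi, w in zip(z, w_reg)) + b_reg
    return softmax(logits), distance

# Hypothetical encoder output (T = 2, d = 2) and weights for K = 3 feeders
Z = [[1.0, 0.0], [3.0, 2.0]]
W_cls = [[1.0, 0.0, -1.0], [0.0, 1.0, 0.0]]
probs, distance = output_heads(Z, W_cls, [0.0, 0.0, 0.0], [0.5, 1.0], 0.25)
feeder = max(range(len(probs)), key=probs.__getitem__)
```

Both heads read the same pooled feature vector, which is what allows the feeder selection and the distance estimate to be produced in a single forward pass.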

[0151] To further implement the above technical solution, step ② includes the following specific contents:

[0152] To uncover the global correlations between features at different times and in different frequency bands of the time-frequency diagram, a multi-head attention mechanism is first used to process the input features. The input features $X_{in}$ are mapped by the weight matrices $W^Q$, $W^K$ and $W^V$ to obtain the query matrix $Q$, the key matrix $K$ and the value matrix $V$:

[0153] $Q = X_{in}W^Q,\quad K = X_{in}W^K,\quad V = X_{in}W^V$;

[0154] The attention scores of each head are computed, normalized with the Softmax function to obtain weights, and then applied to $V$ to obtain the corresponding single-head attention output:

[0155] $\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\left(QK^{T} / \sqrt{d_k}\right)V$;

[0156] In the formula, $d_k$ is the number of columns of $Q$ and $K$, and $\sqrt{d_k}$ is a scaling factor used to prevent the gradient from vanishing due to excessively large dot product values.
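
The single-head computation of step ② can be sketched as standard scaled dot-product attention. The helper names and the tiny example matrices are illustrative assumptions.

```python
import math

def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def softmax(row):
    m = max(row)                       # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def attention(X, Wq, Wk, Wv):
    """Return (output, weights) of one scaled dot-product attention head."""
    Q, K, V = matmul(X, Wq), matmul(X, Wk), matmul(X, Wv)
    scale = math.sqrt(len(Q[0]))       # square root of the number of columns of Q and K
    scores = [[sum(q * k for q, k in zip(q_row, k_row)) / scale for k_row in K]
              for q_row in Q]
    weights = [softmax(r) for r in scores]
    return matmul(weights, V), weights

# Three time steps with two features each; identity projections for clarity
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
I2 = [[1.0, 0.0], [0.0, 1.0]]
out, weights = attention(X, I2, I2, I2)
```

Each row of `weights` sums to one, so every output row is a convex combination of the value rows, weighted by how strongly that time step attends to the others.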

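[0157] To further implement the above technical solution, the final output of the multi-head attention layer in step ③ is:

[0158] $\mathrm{MultiHead}(Q, K, V) = \mathrm{Concat}(head_1, \dots, head_h)\,W^O$;

[0159] $head_i = \mathrm{Attention}\left(X_{in}W_i^Q,\ X_{in}W_i^K,\ X_{in}W_i^V\right)$;

[0160] In the formula, $head_i$ is the output of the i-th self-attention head; $W_i^Q$, $W_i^K$ and $W_i^V$ are the weight matrices of the i-th attention head; and $W^O$ is the weight matrix of the linear transformation applied after concatenation.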
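
The concatenation and linear transformation of step ③ can be sketched as follows. The dimensions (five time steps, model width 4, two heads) and the random weights are assumptions for illustration, and the residual connection and layer normalization that follow in the text are left out of this block.

```python
import math
import random

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def softmax(row):
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def attention_head(X, Wq, Wk, Wv):
    Q, K, V = matmul(X, Wq), matmul(X, Wk), matmul(X, Wv)
    scale = math.sqrt(len(Q[0]))
    scores = [[sum(q * k for q, k in zip(qr, kr)) / scale for kr in K] for qr in Q]
    return matmul([softmax(r) for r in scores], V)

def multi_head_attention(X, head_weights, Wo):
    """Run every head, concatenate along the feature axis, then project with Wo."""
    heads = [attention_head(X, Wq, Wk, Wv) for Wq, Wk, Wv in head_weights]
    concat = [[v for head in heads for v in head[t]] for t in range(len(X))]
    return matmul(concat, Wo)

rng = random.Random(0)

def rand_matrix(rows, cols):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

T, d_model, h = 5, 4, 2
d_k = d_model // h
X = rand_matrix(T, d_model)
head_weights = [tuple(rand_matrix(d_model, d_k) for _ in range(3)) for _ in range(h)]
Wo = rand_matrix(h * d_k, d_model)
out = multi_head_attention(X, head_weights, Wo)
```

Splitting the model width across heads keeps the total cost close to that of one full-width head while letting each head focus on a different time-frequency pattern.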

[0161] To further implement the above technical solution, step ④ includes the following specific contents:

[0162] The output of the multi-head attention sublayer, denoted $x$, is fed into a feedforward neural network (FFN) with two fully connected layers. The intermediate layer uses the ReLU activation function to enhance the model's nonlinear representation capability, giving the FFN output features:

[0163] $\mathrm{FFN}(x) = \max(0,\ xW_1 + b_1)\,W_2 + b_2$;

[0164] In the formula, $W_1$ and $W_2$ are the weight matrices, and $b_1$ and $b_2$ are the bias terms;

[0165] To address the degradation problem in deep network training and accelerate convergence, the output of each sublayer is passed through a residual connection and layer normalization, generating the final output feature vector $Z$:

[0166] $Z = \mathrm{LayerNorm}\left(x + \mathrm{FFN}(x)\right)$;

[0167] $\mathrm{LayerNorm}(x) = \gamma \odot \dfrac{x - \mu}{\sqrt{\sigma^2 + \epsilon}} + \beta$;

[0168] In the formula, $\mu$ and $\sigma$ are the mean and standard deviation of the input, respectively; $\epsilon$ is a tiny quantity that prevents division by zero; and $\gamma$ and $\beta$ are the learnable affine transformation parameters.
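
Step ④ (two fully connected layers with ReLU, followed by a residual connection and layer normalization) can be sketched for a single feature vector as below. The weight values are arbitrary illustrative numbers, and placing the small constant inside the square root of the variance is an assumed detail of the normalization.

```python
import math

def ffn(x, W1, b1, W2, b2):
    """Two fully connected layers with ReLU in between."""
    hidden = [max(0.0, sum(xi * w for xi, w in zip(x, col)) + b)
              for col, b in zip(zip(*W1), b1)]
    return [sum(hi * w for hi, w in zip(hidden, col)) + b
            for col, b in zip(zip(*W2), b2)]

def layer_norm(x, gamma, beta, eps=1e-5):
    """Normalise to zero mean and unit variance, then apply the affine map."""
    mu = sum(x) / len(x)
    var = sum((v - mu) ** 2 for v in x) / len(x)
    return [g * (v - mu) / math.sqrt(var + eps) + b
            for v, g, b in zip(x, gamma, beta)]

def ffn_sublayer(x, W1, b1, W2, b2, gamma, beta):
    """Residual connection around the FFN, followed by layer normalisation."""
    y = ffn(x, W1, b1, W2, b2)
    return layer_norm([xi + yi for xi, yi in zip(x, y)], gamma, beta)

x = [1.0, -2.0, 0.5]
W1 = [[0.5, -0.5, 0.1, 0.0], [0.2, 0.3, -0.1, 0.4], [-0.3, 0.1, 0.2, 0.1]]
W2 = [[0.1, 0.2, 0.0], [0.0, -0.1, 0.3], [0.2, 0.0, 0.1], [-0.1, 0.1, 0.0]]
y = ffn(x, W1, [0.0] * 4, W2, [0.0] * 3)
z = ffn_sublayer(x, W1, [0.0] * 4, W2, [0.0] * 3, [1.0] * 3, [0.0] * 3)
```

The residual path lets the input pass through unchanged when the FFN contributes little, which is the property the text relies on to counter degradation in deeper stacks.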

[0169] It should be noted that:

[0170] The fault localization model constructed in this invention uses a Transformer encoder as the core feature extraction unit, aiming to capture long-range dependencies in the transient time-frequency features of faults through its global attention mechanism. Since the Transformer architecture dispenses with the recurrent structure of recurrent neural networks, positional encoding must be superimposed on the input matrix so that the model can perceive the evolution order of the fault traveling wave along the time axis. In this embodiment, the number of encoder modules N is set to 6.
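As an illustration of the positional encoding mentioned above, the sketch below generates sine/cosine position codes and superimposes them on an input matrix. It assumes the conventional Transformer formulation with base 10000 and an even feature dimension; the base, the function name and the sizes are assumptions, not values stated in this section.

```python
import numpy as np

def sinusoidal_positional_encoding(num_steps, d_model):
    """Return a (num_steps, d_model) matrix of sine/cosine position codes.

    Assumes d_model is even. Even columns hold sines, odd columns hold cosines.
    """
    pos = np.arange(num_steps)[:, None]               # time-step index
    even_dims = np.arange(0, d_model, 2)[None, :]     # 0, 2, 4, ...
    angle = pos / np.power(10000.0, even_dims / d_model)
    pe = np.zeros((num_steps, d_model))
    pe[:, 0::2] = np.sin(angle)
    pe[:, 1::2] = np.cos(angle)
    return pe

# Hypothetical input matrix with T = 50 time steps and 16 features.
x = np.zeros((50, 16))
pe = sinusoidal_positional_encoding(*x.shape)
x_in = x + pe          # input features with positional information superimposed
```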

[0171] To further implement the above technical solution, the specific content of S3 includes:

[0172] Within each training epoch, the training set samples are shuffled and divided into batches, and the following is performed on each batch:

[0173] Perform forward propagation: from the fault transient time-frequency diagrams of the batch, obtain the fault feeder probability distribution $\hat{p}$ and the predicted fault distance $\hat{d}$, and calculate the joint loss value $\mathcal{L}$ from the joint loss function;

[0174] Perform backpropagation and gradient calculation: using the chain rule, calculate the gradient $g_t$ of the joint loss value $\mathcal{L}$ with respect to all learnable parameters $\theta$ of the network;

[0175] Calculate the first-order moment estimate $m_t$ and the second-order moment estimate $v_t$ of the gradient $g_t$, and then update the parameters at the current time step by combining these estimates with the current learning rate;

[0176] After each training epoch, the model's generalization performance and physical residual level are evaluated using the validation set:

[0177] When the joint loss on the validation set no longer decreases, the physical residual converges to the preset physical tolerance threshold, or the maximum number of iterations is reached, the model is determined to have converged, the early stopping mechanism is triggered to end the training, and the current optimal parameters are fixed to obtain the final PINN-Transformer fault location model.

[0178] Otherwise, training continues with the next epoch until convergence.
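The per-epoch shuffling and the stopping rule above can be sketched in plain Python. This sketch treats the three conditions as alternatives (any one of them ends training) and interprets "no longer decreases" as no improvement over a patience window; both readings, as well as the default values of `patience`, `phys_tol` and `max_epochs`, are assumptions.

```python
import random

def shuffled_batches(indices, batch_size, seed):
    """Shuffle sample indices and split them into consecutive mini-batches."""
    order = list(indices)
    random.Random(seed).shuffle(order)
    return [order[i:i + batch_size] for i in range(0, len(order), batch_size)]

def stopping_reason(val_losses, phys_residual, *, patience=5,
                    phys_tol=0.05, max_epochs=200):
    """Return why training should stop after the latest epoch, or None to continue.

    val_losses: validation joint loss of every finished epoch, oldest first.
    phys_residual: validation physical residual of the latest epoch.
    """
    epoch = len(val_losses)
    if epoch >= max_epochs:
        return "max_epochs"
    if phys_residual <= phys_tol:
        return "physical_tolerance"
    # No improvement: the best loss in the last `patience` epochs is not
    # better than the best loss seen before that window.
    if epoch > patience and min(val_losses[-patience:]) >= min(val_losses[:-patience]):
        return "no_improvement"
    return None

batches = shuffled_batches(range(10), batch_size=3, seed=0)
```

In a full training loop, the parameters that achieved the lowest validation loss would be saved at each improvement and restored when `stopping_reason` returns a value.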

[0179] It should be noted that:

[0180] The system enters the iterative training phase. Within each training epoch, the training set samples are shuffled and divided into batches. For each batch, forward propagation is performed: the fault transient time-frequency graphs of the batch are input into the model and, after feature extraction by the multi-layer self-attention mechanism and the feedforward network, the fault selection probability distribution $\hat{p}$ and the fault ranging scalar $\hat{d}$ are output in parallel. Physical consistency verification and loss calculation follow: the system synchronously retrieves the original bus transient voltage $u(t)$ and outgoing-line transient current $i(t)$ corresponding to the batch, calculates the corresponding increments $\Delta u$ and $\Delta i$, and verifies the physical plausibility of the predicted distance $\hat{d}$ according to the single-ended traveling wave propagation equation. In conjunction with the data labels, the joint loss value is then calculated from the data-driven error $\mathcal{L}_{\mathrm{data}}$, the physical consistency residual $\mathcal{L}_{\mathrm{phy}}$ and the boundary constraint penalty $\mathcal{L}_{\mathrm{bc}}$.

[0181] Based on this, backpropagation and gradient calculation are performed, and the gradient $g_t$ of the joint loss $\mathcal{L}$ with respect to all learnable parameters $\theta$ of the network is calculated using the chain rule.

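[0182] To further implement the above technical solution, the gradient $g_t$ of the joint loss $\mathcal{L}$ with respect to the learnable parameters $\theta$ is:

[0183] $g_t = \nabla_{\theta}\, \mathcal{L}(\theta_t)$;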

[0184] At this point, the gradient flow generated by the physical residual term will be fed back to the encoder layer as a correction signal, forcing the model to suppress the weights that focus on non-physical features.

[0185] The parameter update process uses the Adam optimizer, which estimates the first-order moment $m_t$ and the second-order moment $v_t$ of the gradient $g_t$ and thereby dynamically adjusts the learning rate of each parameter. The specific process is as follows:

[0186] Update the first-order moment estimate $m_t$ and the second-order moment estimate $v_t$:

[0187] $m_t = \beta_1 m_{t-1} + (1 - \beta_1)\, g_t$;

[0188] $v_t = \beta_2 v_{t-1} + (1 - \beta_2)\, g_t^{2}$;

[0189] In the formula, $\beta_1$ and $\beta_2$ are exponential decay rates; in this embodiment $\beta_1 = 0.9$ and $\beta_2 = 0.999$;

[0190] Apply bias correction to the first-order moment estimate $m_t$ and the second-order moment estimate $v_t$ to eliminate the bias toward zero caused by initialization:

[0191] $\hat{m}_t = \dfrac{m_t}{1 - \beta_1^{\,t}}$;

[0192] $\hat{v}_t = \dfrac{v_t}{1 - \beta_2^{\,t}}$;

[0193] In the formula, $\hat{m}_t$ and $\hat{v}_t$ are the corrected values of $m_t$ and $v_t$, respectively;

[0194] The parameters at the current time step are updated by combining the corrected values with the current learning rate as follows:

[0195] $\theta_{t+1} = \theta_t - \eta\, \dfrac{\hat{m}_t}{\sqrt{\hat{v}_t} + \epsilon}$;

[0196] In the formula, $\theta_{t+1}$ is the parameter value after the update, $\theta_t$ is the parameter value at the current time step, $\eta$ is the learning rate, and $\epsilon$ is a small constant that prevents division by zero.
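One Adam update, as described above, can be written compactly in NumPy. The decay rates follow the embodiment (0.9 and 0.999); the learning rate, `eps` and the toy objective $f(\theta) = \theta^2$ are assumptions used only to make the sketch runnable.

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """Apply one Adam update and return the new parameters and moment estimates.

    t is the 1-based step counter used for bias correction.
    """
    m = beta1 * m + (1 - beta1) * grad            # first-order moment estimate
    v = beta2 * v + (1 - beta2) * grad ** 2       # second-order moment estimate
    m_hat = m / (1 - beta1 ** t)                  # bias-corrected first moment
    v_hat = v / (1 - beta2 ** t)                  # bias-corrected second moment
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

# Toy problem: minimize f(theta) = sum(theta ** 2), whose gradient is 2 * theta.
theta = np.array([1.0, -2.0])
m = np.zeros_like(theta)
v = np.zeros_like(theta)
theta_1, m, v = adam_step(theta, 2 * theta, m, v, t=1, lr=0.1)
```

On the first step the bias-corrected ratio reduces to the sign of the gradient, so each parameter moves by almost exactly the learning rate regardless of the gradient's magnitude.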

[0197] It should be noted that:

[0198] In the specific training process, the above process is repeated on all batches, and the generalization performance and physical residual level of the model are evaluated using the validation set after each epoch. When the joint loss of the validation set no longer decreases, the physical residual converges to the preset physical tolerance threshold, or the maximum number of iterations is reached, the model is determined to have converged, the early stopping mechanism is triggered to end the training, and the current optimal parameters are fixed to obtain the final PINN-Transformer fault location model.

[0199] To further implement the above technical solution, a joint loss function is used to balance the contributions of data fitting and physical constraints to the total loss. This joint loss function is the only scalar quantity on which parameter updates are based during subsequent model training; the joint loss function is:

[0200] $\mathcal{L} = \mathcal{L}_{\mathrm{data}} + \lambda_1 \mathcal{L}_{\mathrm{phy}} + \lambda_2 \mathcal{L}_{\mathrm{bc}}$;

[0201] In the formula, $\mathcal{L}_{\mathrm{data}}$ is the data loss term; $\mathcal{L}_{\mathrm{phy}}$ is the transient physical consistency loss term; $\mathcal{L}_{\mathrm{bc}}$ is the boundary constraint term; $\lambda_1$ and $\lambda_2$ are the regularization weight coefficients of the physical constraint terms;

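[0202] (1) Data loss term $\mathcal{L}_{\mathrm{data}}$:

[0203] $\mathcal{L}_{\mathrm{data}} = w_1 \mathcal{L}_{\mathrm{reg}} + w_2 \mathcal{L}_{\mathrm{ce}}$;

[0204] $\mathcal{L}_{\mathrm{reg}} = \dfrac{1}{n} \sum_{i=1}^{n} \big(d_i - \hat{d}_i\big)^{2}$;

[0205] $\mathcal{L}_{\mathrm{ce}} = -\dfrac{1}{n} \sum_{i=1}^{n} \sum_{k=1}^{K} y_{i,k} \log \hat{p}_{i,k}$;

[0206] In the formula, $w_1$ and $w_2$ are the weighting coefficients; $\mathcal{L}_{\mathrm{reg}}$ is the regression loss; $d_i$ and $\hat{d}_i$ are the actual fault distance and the model-predicted distance of the $i$-th sample, respectively; $n$ is the number of samples in the current training batch; $\mathcal{L}_{\mathrm{ce}}$ is the cross-entropy loss; $K$ is the total number of feeder branches; $y_{i,k}$ is the actual label indicator value; $\hat{p}_{i,k}$ is the model's predicted probability that the $i$-th sample belongs to the $k$-th branch;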
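The weighted combination of a regression loss and a cross-entropy loss can be sketched as follows. The squared-error form of the regression term, the default weights and the `tiny` guard inside the logarithm are assumptions for illustration; the text only states that a regression loss and a cross-entropy loss are combined with weighting coefficients.

```python
import numpy as np

def data_loss(d_true, d_pred, y_onehot, p_pred, w_reg=1.0, w_cls=1.0, tiny=1e-12):
    """Return (total, regression, cross_entropy) losses for one batch.

    d_true, d_pred: (n,) actual and predicted fault distances.
    y_onehot, p_pred: (n, K) one-hot feeder labels and predicted probabilities.
    """
    l_reg = np.mean((d_true - d_pred) ** 2)                       # assumed MSE form
    l_ce = -np.mean(np.sum(y_onehot * np.log(p_pred + tiny), axis=1))
    return w_reg * l_reg + w_cls * l_ce, l_reg, l_ce

# Two hypothetical samples on a two-feeder network.
d_true = np.array([1.0, 2.0])
d_pred = np.array([1.5, 2.0])
y_onehot = np.array([[1.0, 0.0], [0.0, 1.0]])
p_pred = np.array([[0.5, 0.5], [0.25, 0.75]])
total, l_reg, l_ce = data_loss(d_true, d_pred, y_onehot, p_pred)
```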

[0207] (2) Transient physical consistency loss term $\mathcal{L}_{\mathrm{phy}}$:

[0208] Select a short time window $W$ near the fault wavefront, and take the baseline value before the short window (e.g., the average of several sampling points before the window) to construct the incremental signals:

[0209] $\Delta u(t) = u(t) - u_0$;

[0210] $\Delta i(t) = i(t) - i_0$;

[0211] In the formula, $\Delta u(t)$ is the transient voltage increment, $u(t)$ is the transient bus voltage, $\Delta i(t)$ is the transient current increment, $i(t)$ is the corresponding transient outgoing-line current, and $u_0$ and $i_0$ are the baseline values of $u(t)$ and $i(t)$, respectively. According to traveling wave theory, the short-time traveling wave components satisfy an approximate consistency relation:

[0212] $\Delta u(t) \approx Z_c\, \Delta i(t)$;

[0213] $Z_c = \sqrt{L / C}$;

[0214] In the formula, $Z_c$ is the characteristic impedance of the line, $L$ is the line inductance, and $C$ is the line capacitance.

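[0215] Define the transient voltage-current consistency residual as:

[0216] $r(t) = \Delta u(t) - Z_c\, \Delta i(t)$;

[0217] To eliminate amplitude scale differences caused by different voltage levels and load fluctuations, a normalization factor $\mathrm{den}$ is introduced to construct a dimensionless relative physical residual, and the physical loss is defined in the form of a mean absolute error (MAE):

[0218] $\mathrm{den}(t) = |\Delta u(t)| + |Z_c\, \Delta i(t)| + \varepsilon$;

[0219] $\rho(t) = \dfrac{|r(t)|}{\mathrm{den}(t)}$;

[0220] then:

[0221] $\mathcal{L}_{\mathrm{phy}} = \dfrac{1}{|W|} \sum_{t \in W} \rho(t)$;

[0222] where $\Delta u(t)$ and $\Delta i(t)$ are the transient voltage and current increments within the short window $W$, $Z_c$ is the characteristic impedance of the line, and $\varepsilon$ is a small constant that prevents division by zero. $\rho(t)$ is essentially a dimensionless index obtained by normalizing the magnitude of the physical consistency residual, and is also referred to herein as the relative physical residual. It is used both as a training constraint and for evaluating the credibility of the result (the smaller the residual, the higher the physical consistency).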
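The relative physical residual can be illustrated with the NumPy sketch below. The exact normalization factor is not recoverable from this section, so the sketch assumes a denominator equal to the sum of the two increment magnitudes, which keeps the residual within [0, 1]; the baseline length, the sign convention of the consistency relation and all names are likewise assumptions.

```python
import numpy as np

def relative_physical_residual(u, i, z_c, window, n_baseline=5, eps=1e-9):
    """Return the per-sample relative residual and its mean over the short window.

    u, i: 1-D arrays of bus voltage and outgoing-line current samples.
    window: slice selecting the short window; it must start at or after n_baseline.
    """
    start = window.start
    u0 = u[start - n_baseline:start].mean()       # pre-window voltage baseline
    i0 = i[start - n_baseline:start].mean()       # pre-window current baseline
    du = u[window] - u0
    di = i[window] - i0
    r = du - z_c * di                             # voltage-current consistency residual
    den = np.abs(du) + z_c * np.abs(di) + eps     # assumed normalization factor
    rho = np.abs(r) / den
    return rho, rho.mean()

# Synthetic step signals with a hypothetical characteristic impedance of 300 ohms.
z_c = 300.0
i_sig = np.concatenate([np.full(10, 1.0), np.full(10, 1.5)])
u_ok = np.concatenate([np.full(10, 100.0), np.full(10, 250.0)])    # du = z_c * di
u_bad = np.concatenate([np.full(10, 100.0), np.full(10, -50.0)])   # du = -z_c * di
win = slice(10, 20)
rho_ok, loss_ok = relative_physical_residual(u_ok, i_sig, z_c, win)
rho_bad, loss_bad = relative_physical_residual(u_bad, i_sig, z_c, win)
```

The consistent signal yields a loss near zero and the sign-reversed signal yields a loss near one, matching the interpretation that smaller residuals indicate higher physical consistency.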

[0223] (3) Boundary constraint term $\mathcal{L}_{\mathrm{bc}}$:

[0224] To prevent the model from outputting invalid solutions with negative distances or distances exceeding the actual length of the line, a maximum feasible distance $d_{\max}$ is set for each line (this can be given by the line topology parameters, or obtained from the maximum labeled distance of that line in the training data with a 5% safety margin added). A constraint penalty is applied to the predicted distance:

[0225] $\mathcal{L}_{\mathrm{bc}} = \dfrac{1}{n} \sum_{i=1}^{n} \Big[ \max\big(0, -\hat{d}_i\big)^{2} + \max\big(0, \hat{d}_i - d_{\max}\big)^{2} \Big]$;

[0226] In the formula, $d_{\max}$ is the maximum distance from the measuring end to the end of the corresponding line; when $0 \le \hat{d}_i \le d_{\max}$, the penalty is 0; when $\hat{d}_i$ exceeds the boundary, the penalty increases rapidly, and gradient descent forces the model to constrain its predictions to the physically feasible region.
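A minimal sketch of such a boundary penalty is shown below. The squared-hinge form is an assumption chosen because it is zero inside the feasible interval and grows quickly outside it; the helper that derives the maximum feasible distance from training labels uses the 5% margin mentioned in the text, and both function names are hypothetical.

```python
import numpy as np

def feasible_distance(labeled_distances, margin=0.05):
    """Maximum feasible distance: largest labeled distance plus a safety margin."""
    return (1.0 + margin) * float(np.max(labeled_distances))

def boundary_penalty(d_pred, d_max):
    """Mean squared violation of the feasible interval [0, d_max]."""
    below = np.maximum(0.0, -d_pred)              # negative distances
    above = np.maximum(0.0, d_pred - d_max)       # distances beyond the line end
    return float(np.mean(below ** 2 + above ** 2))

d_max = feasible_distance([2.0, 8.0])
inside = boundary_penalty(np.array([0.0, 5.0, 10.0]), 10.0)
outside = boundary_penalty(np.array([-1.0, 12.0]), 10.0)
```

Because the penalty is differentiable almost everywhere and has zero gradient inside the interval, it only influences training for predictions that leave the feasible region.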

[0227] A single-ended traveling wave fault location system for a distribution network based on the PINN-Transformer model includes a processor, a memory, and a computer program stored in the memory and executable by the processor. When the processor runs the computer program, it implements the single-ended traveling wave fault location method for a distribution network based on the PINN-Transformer model as described above.

[0228] This invention addresses the technical bottlenecks of single-ended traveling wave ranging under high-resistance grounding and high-noise conditions, where data-driven models are prone to generating physically infeasible solutions and traditional analytical models depend heavily on line parameters and threshold tuning. It proposes a PINN-Transformer joint inference framework that integrates transient time-frequency representation and physical consistency constraints. The framework uses the continuous wavelet transform to extract the time-frequency graph of the transient bus voltage, and uses the Transformer architecture to model the global dependencies across time and frequency bands in the time-frequency domain, enabling automatic fault feeder identification and fault distance prediction. At the same time, it embeds the voltage-current consistency residual within the short window around the transient wavefront and the feasible-region constraint on distance into the joint loss function, so that a physical penalty mechanism suppresses prediction results that violate physical laws under complex conditions. This improves the model's ranging accuracy and engineering interpretability. The physical residual is also output as an engineering confidence index for the location result, so that the fault location and its confidence are produced together.

[0229] The invention will be further illustrated below through specific simulation experiments:

[0230] Based on the IEEE 14-node distribution network topology shown in Figure 3, the following comparison models are used:

[0231] (1) Model 1: CWT time-frequency graph + Transformer (noise added);

[0232] (2) Model 2: CWT time-frequency graph + CNN-LSTM (noise added);

[0233] (3) Model 3: CWT time-frequency graph + PINN-Transformer (without noise);

[0234] (4) Model 4 (this invention): CWT time-frequency graph + PINN-Transformer (noise added).

[0235] The positioning MAE and MSE were computed under different transition resistances, initial phase angles, and added-noise conditions, and the positioning error distributions of the four models are presented in Figures 4-5. Figures 4-5 show that, under the same conditions, the localization error of the models without physical constraints, such as the Transformer, is significantly higher than that of the PINN-Transformer model. This indicates that adding physical constraints through PINN yields higher fault localization accuracy and lower sensitivity to noise. The convergence curve of feeder identification accuracy over training epochs, shown in Figure 6, illustrates the contribution of the physical constraints to feeder identification accuracy.

[0236] Feeder identification accuracy and distance positioning error (MAE / MSE) are used as the main evaluation indicators. Feeder identification accuracy is computed over the full sample set, and distance positioning error is computed over the subset of samples whose feeder was identified correctly. Under the same dataset and partitioning method, the branch identification accuracy and distance positioning error of the different models under different transition resistances were compared, as shown in Tables 1 and 2:

[0237] Table 1. Positioning results under different transition resistances

[0238] ;

[0239] Table 2. Localization results under different models

[0240] ;

[0241] As shown in Table 3, this embodiment constructs 5760 transient fault samples by combining 30 fault locations, 3 types of single-phase grounding faults, 8 sets of transition resistances, and 8 sets of fault initial phase angles. For each sample, a transient short-window sequence after fault triggering is extracted and mapped to a time-frequency feature map. All samples are randomly divided into training, validation, and test sets in an 8:1:1 ratio. Furthermore, full-band Gaussian white noise with a signal-to-noise ratio of 20 dB to 40 dB (preferably 30 dB) is superimposed on the test set inputs to verify the model's adaptability to operating conditions. This invention also analyzes the correlation between the physical consistency residual and the ranging error. The relative physical residual, which lies within the interval [0, 1], is defined as an evaluation index characterizing the degree of physical inconsistency. The 576 valid samples in the test set are grouped into 12 bins of unequal width (n = 35, 26, 36, 32, 60, 104, 77, 31, 12, 23, 30, 110 samples per bin), and the mean absolute ranging error and its statistical trend are calculated within each bin for both the conventional deep learning model and the model of this invention, as shown in Figure 7. The experimental results show that as the relative physical residual increases (i.e., the degree of physical inconsistency increases), the ranging stability of the conventional model deteriorates significantly. In the model of this invention, however, the physical constraint terms penalize physically infeasible solutions during training, so the ranging error grows much more slowly under complex working conditions than in the comparative model. This demonstrates that the model guided by physical mechanisms is more reliable in engineering terms under high-resistance grounding and strong noise interference.
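The sample counts stated above can be checked with a few lines of Python. The sketch enumerates the parameter grid by index only; the concrete resistance and phase-angle values are listed in Table 3 and are not reproduced here.

```python
from itertools import product

# Sizes of each fault-parameter axis, as stated in the text.
n_locations, n_fault_types, n_resistances, n_phase_angles = 30, 3, 8, 8

# Every combination of the four parameter indices is one simulated fault sample.
grid = list(product(range(n_locations), range(n_fault_types),
                    range(n_resistances), range(n_phase_angles)))
total = len(grid)

# 8:1:1 split into training, validation and test sets.
n_train = total * 8 // 10
n_val = total // 10
n_test = total - n_train - n_val

# Per-bin sample counts of the 12 relative-physical-residual bins on the test set.
bin_counts = [35, 26, 36, 32, 60, 104, 77, 31, 12, 23, 30, 110]

print(total, n_train, n_val, n_test, sum(bin_counts))
```

The bin counts add up to the size of the test set, which confirms that the binned analysis covers every test sample exactly once.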

[0242] Table 3 Fault Parameter Traversal Table for Total Sample Set

[0243] ;

[0244] The above embodiments are only used to illustrate the technical solutions of this application, and are not intended to limit them. Although this application has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that modifications can still be made to the technical solutions described in the foregoing embodiments, or equivalent substitutions can be made to some of the technical features. Such modifications or substitutions do not cause the essence of the corresponding technical solutions to deviate from the spirit and scope of the technical solutions of the embodiments of this application, and should all be included within the protection scope of this application.

Claims

1. A method for locating single-ended traveling wave faults in a distribution network based on the PINN-Transformer model, characterized in that it includes the following steps: a fault voltage transient time-frequency diagram acquired in real time is input into the trained PINN-Transformer model, and the fault feeder result and the precise fault distance are output synchronously. The training process of the PINN-Transformer model includes: S1: Collect fault traveling wave samples within a set time window after the fault, and obtain the input features $X_{\mathrm{in}}$ corresponding to the fault voltage transient time-frequency diagram of each sample; S2: Input $X_{\mathrm{in}}$ into N encoder modules in parallel to capture the global time-frequency features of the input signal, where each encoder module includes a multi-head attention layer and a feedforward neural network; the output feature vectors of the N encoder modules are fed in parallel to the fault selection branch and the fault ranging branch to obtain the probability distribution and the predicted fault distance, respectively. The specific content of S2 includes: S21. The input features $X_{\mathrm{in}}$ are fed into the N encoder modules in parallel, and the following steps are performed in each encoder module: ① the input features $X_{\mathrm{in}}$ are fed into each of the h attention heads; ② in each attention head, the input features $X_{\mathrm{in}}$ are mapped through the weight matrices $W^{Q}$, $W^{K}$ and $W^{V}$ to obtain the query matrix $Q$, the key matrix $K$ and the value matrix $V$, and the corresponding single-head attention output $\mathrm{head}$ is obtained based on $Q$, $K$ and $V$; ③ the multi-head attention executes step ② in parallel h times, concatenates the outputs of all attention heads, and then performs a linear transformation to obtain the final output $\mathrm{MultiHead}(X_{\mathrm{in}})$ of the multi-head attention layer, which becomes $Z$ after residual connection and layer normalization; ④ $Z$ is fed into a feedforward neural network (FFN), and the FFN output features $\mathrm{FFN}(Z)$ are obtained after two fully connected layers; after residual connection and layer normalization, the final output feature vector $Z_{\mathrm{out}}$ is generated; S22. $Z_{\mathrm{out}}$ is input in parallel into the fault selection branch and the fault ranging branch, which output the fault probability distribution $\hat{p}$ of each feeder branch and the predicted fault distance $\hat{d}$, respectively; S3: The backpropagation algorithm transforms the joint loss function into a gradient flow, driving the network parameters toward the physically consistent region of the solution space until the PINN-Transformer model converges. The joint loss function includes the data loss term $\mathcal{L}_{\mathrm{data}}$, the transient physical consistency loss term $\mathcal{L}_{\mathrm{phy}}$ and the boundary constraint term $\mathcal{L}_{\mathrm{bc}}$. The specific content of S3 includes: within each training epoch, the training set samples are shuffled and divided into batches, and the following is performed on each batch: perform forward propagation, obtaining $\hat{p}$ and $\hat{d}$ from the fault transient time-frequency diagrams of the batch, and calculate the joint loss value $\mathcal{L}$ from the joint loss function; perform backpropagation and gradient calculation, computing the gradient $g_t$ of the joint loss value with respect to all learnable parameters of the network by the chain rule; calculate the first-order moment estimate $m_t$ and the second-order moment estimate $v_t$ of the gradient $g_t$, and update the parameters at the current time step by combining these estimates with the current learning rate; after each training epoch, the model's generalization performance and physical residual level are evaluated using the validation set: when the joint loss on the validation set no longer decreases, the physical residual converges to the preset physical tolerance threshold, or the maximum number of iterations is reached, the model is determined to have converged, the early stopping mechanism is triggered to end the training, and the current optimal parameters are fixed to obtain the final PINN-Transformer fault location model; otherwise, training continues with the next epoch until convergence.

2. The method for locating single-ended traveling wave faults in a distribution network based on the PINN-Transformer model according to claim 1, characterized in that the specific content of S1 includes: S11. Construct an electromagnetic transient model of the distribution network, set up multiple feeder branches, and simulate the fault conditions formed from the key fault parameters, which include the single-phase grounding fault type, the transition resistance and the fault initial phase angle; S12. Record the transient voltage of the bus and the transient current of the outgoing line after the fault occurs, and extract the transient short-window voltage sequence and current sequence after the fault triggering time; S13. Establish a sample index table and associate the fault location with the branch number label; S14. Perform continuous wavelet transform on the voltage sequence and then normalize it to obtain the fault voltage transient time-frequency diagram $S$; S15. Map $S$ to the input matrix $X$, and then superimpose positional encoding to obtain the input features $X_{\mathrm{in}}$ of the deep learning model.

3. The method for locating single-ended traveling wave faults in a distribution network based on the PINN-Transformer model according to claim 2, characterized in that the specific content of S15 includes: the fault voltage transient time-frequency diagram $S$ is mapped to the input matrix $X \in \mathbb{R}^{T \times d}$, where $\mathbb{R}$ indicates that $X$ is a real matrix, $T$ is the number of time steps, and $d$ is the feature dimension; position codes are generated using sine and cosine functions: $PE(pos, 2i) = \sin\!\big(pos / 10000^{2i/d}\big)$; $PE(pos, 2i+1) = \cos\!\big(pos / 10000^{2i/d}\big)$; in the formula, $pos$ indicates the position index of the time step and $i$ indicates the feature dimension index; after the positional encoding is superimposed, the input features of the deep learning model are represented as: $X_{\mathrm{in}} = X + PE$; where $PE$ represents the positional encoding.

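4. The method for locating single-ended traveling wave faults in a distribution network based on the PINN-Transformer model according to claim 1, characterized in that the specific details of step ② include: the input features $X_{\mathrm{in}}$ are mapped through the weight matrices $W^{Q}$, $W^{K}$ and $W^{V}$ to obtain the query matrix $Q$, the key matrix $K$ and the value matrix $V$: $Q = X_{\mathrm{in}} W^{Q},\ K = X_{\mathrm{in}} W^{K},\ V = X_{\mathrm{in}} W^{V}$; the weights are obtained by normalization using the Softmax function and then applied to $V$ to obtain the corresponding single-head attention output $\mathrm{head}$: $\mathrm{head} = \mathrm{Softmax}\!\left(\dfrac{Q K^{\mathsf{T}}}{\sqrt{d_k}}\right) V$; in the formula, $d_k$ is the number of columns of $Q$ and $K$, and $\sqrt{d_k}$ is the scaling factor.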

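5. The method for locating single-ended traveling wave faults in a distribution network based on the PINN-Transformer model according to claim 1, characterized in that the final output $\mathrm{MultiHead}(X_{\mathrm{in}})$ of the multi-head attention layer in step ③ is: $\mathrm{MultiHead}(X_{\mathrm{in}}) = \mathrm{Concat}(\mathrm{head}_1, \ldots, \mathrm{head}_h)\, W^{O}$; $\mathrm{head}_i = \mathrm{Softmax}\!\left(\dfrac{(X_{\mathrm{in}} W_i^{Q})(X_{\mathrm{in}} W_i^{K})^{\mathsf{T}}}{\sqrt{d_k}}\right) X_{\mathrm{in}} W_i^{V}$; in the formula, $\mathrm{head}_i$ is the output of the $i$-th self-attention head; $W_i^{Q}$, $W_i^{K}$ and $W_i^{V}$ are the weight matrices of the $i$-th attention head; $W^{O}$ is the weight matrix of the linear transformation applied after concatenation.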

6. The method for locating single-ended traveling wave faults in a distribution network based on the PINN-Transformer model according to claim 1, characterized in that step ④ includes the following: the input $Z$ is fed into a feedforward neural network (FFN) and passes through two fully connected layers, with the intermediate layer employing the ReLU activation function, to obtain the FFN output features $\mathrm{FFN}(Z)$: $\mathrm{FFN}(Z) = \max(0,\; Z W_1 + b_1)\, W_2 + b_2$; in the formula, $W_1$ and $W_2$ are the weight matrices, and $b_1$ and $b_2$ are the bias terms; after residual connection and layer normalization, the final output feature vector $Z_{\mathrm{out}}$ is generated: $Z_{\mathrm{out}} = \mathrm{LayerNorm}\big(Z + \mathrm{FFN}(Z)\big)$; $\mathrm{LayerNorm}(x) = \gamma \odot \dfrac{x - \mu}{\sqrt{\sigma^{2} + \varepsilon}} + \beta$; in the formula, $\mu$ and $\sigma$ are the mean and standard deviation of the input, respectively; $\varepsilon$ is a tiny constant that prevents division by zero; $\gamma$ and $\beta$ are the learnable affine transformation parameters.

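7. The method for locating single-ended traveling wave faults in a distribution network based on the PINN-Transformer model according to claim 1, characterized in that the gradient $g_t$ is: $g_t = \nabla_{\theta}\, \mathcal{L}(\theta_t)$; the first-order moment estimate $m_t$ and the second-order moment estimate $v_t$ are updated: $m_t = \beta_1 m_{t-1} + (1 - \beta_1)\, g_t$; $v_t = \beta_2 v_{t-1} + (1 - \beta_2)\, g_t^{2}$; in the formula, $\beta_1$ and $\beta_2$ are exponential decay rates; bias correction is applied to the first-order moment estimate $m_t$ and the second-order moment estimate $v_t$ to eliminate the bias toward zero caused by initialization: $\hat{m}_t = \dfrac{m_t}{1 - \beta_1^{\,t}}$; $\hat{v}_t = \dfrac{v_t}{1 - \beta_2^{\,t}}$; in the formula, $\hat{m}_t$ and $\hat{v}_t$ are the corrected values of $m_t$ and $v_t$, respectively; the parameters at the current time step are updated by combining the corrected values with the current learning rate as follows: $\theta_{t+1} = \theta_t - \eta\, \dfrac{\hat{m}_t}{\sqrt{\hat{v}_t} + \epsilon}$; in the formula, $\theta_{t+1}$ is the parameter value after the update, $\theta_t$ is the parameter value at the current time step, $\eta$ is the learning rate, and $\epsilon$ is a small constant that prevents division by zero.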

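8. The method for locating single-ended traveling wave faults in a distribution network based on the PINN-Transformer model according to claim 1, characterized in that the joint loss function is: $\mathcal{L} = \mathcal{L}_{\mathrm{data}} + \lambda_1 \mathcal{L}_{\mathrm{phy}} + \lambda_2 \mathcal{L}_{\mathrm{bc}}$; in the formula, $\mathcal{L}_{\mathrm{data}}$ is the data loss term; $\mathcal{L}_{\mathrm{phy}}$ is the transient physical consistency loss term; $\mathcal{L}_{\mathrm{bc}}$ is the boundary constraint term; $\lambda_1$ and $\lambda_2$ are the regularization weight coefficients of the physical constraint terms; (1) Data loss term $\mathcal{L}_{\mathrm{data}}$: $\mathcal{L}_{\mathrm{data}} = w_1 \mathcal{L}_{\mathrm{reg}} + w_2 \mathcal{L}_{\mathrm{ce}}$; $\mathcal{L}_{\mathrm{reg}} = \dfrac{1}{n} \sum_{i=1}^{n} \big(d_i - \hat{d}_i\big)^{2}$; $\mathcal{L}_{\mathrm{ce}} = -\dfrac{1}{n} \sum_{i=1}^{n} \sum_{k=1}^{K} y_{i,k} \log \hat{p}_{i,k}$; in the formula, $w_1$ and $w_2$ are the weighting coefficients; $\mathcal{L}_{\mathrm{reg}}$ is the regression loss; $d_i$ and $\hat{d}_i$ are the actual fault distance and the model-predicted distance of the $i$-th sample, respectively; $n$ is the number of samples in the current training batch; $\mathcal{L}_{\mathrm{ce}}$ is the cross-entropy loss; $K$ is the total number of feeder branches; $y_{i,k}$ is the actual label indicator value; $\hat{p}_{i,k}$ is the model's predicted probability that the $i$-th sample belongs to the $k$-th branch; (2) Transient physical consistency loss term $\mathcal{L}_{\mathrm{phy}}$: select a short time window $W$ near the fault wavefront, and take the baseline value before the short window to construct the incremental signals: $\Delta u(t) = u(t) - u_0$; $\Delta i(t) = i(t) - i_0$; in the formula, $\Delta u(t)$ is the transient voltage increment, $u(t)$ is the transient bus voltage, $\Delta i(t)$ is the transient current increment, $i(t)$ is the corresponding transient outgoing-line current, and $u_0$ and $i_0$ are the baseline values of $u(t)$ and $i(t)$, respectively; according to traveling wave theory, the short-time traveling wave components satisfy an approximate consistency relation: $\Delta u(t) \approx Z_c\, \Delta i(t)$; $Z_c = \sqrt{L / C}$; in the formula, $Z_c$ is the characteristic impedance of the line, $L$ is the line inductance, and $C$ is the line capacitance; define the transient voltage-current consistency residual as: $r(t) = \Delta u(t) - Z_c\, \Delta i(t)$; to eliminate amplitude scale differences caused by different voltage levels and load fluctuations, a normalization factor $\mathrm{den}$ is introduced to construct a dimensionless relative physical residual, and the physical loss is defined in the form of a mean absolute error (MAE): $\mathrm{den}(t) = |\Delta u(t)| + |Z_c\, \Delta i(t)| + \varepsilon$; $\rho(t) = \dfrac{|r(t)|}{\mathrm{den}(t)}$; then: $\mathcal{L}_{\mathrm{phy}} = \dfrac{1}{|W|} \sum_{t \in W} \rho(t)$; where $\rho(t)$ is a dimensionless index obtained by normalizing the magnitude of the physical consistency residual, and is used for training constraints and result confidence evaluation; (3) Boundary constraint term $\mathcal{L}_{\mathrm{bc}}$: $\mathcal{L}_{\mathrm{bc}} = \dfrac{1}{n} \sum_{i=1}^{n} \Big[ \max\big(0, -\hat{d}_i\big)^{2} + \max\big(0, \hat{d}_i - d_{\max}\big)^{2} \Big]$; in the formula, $d_{\max}$ is the maximum distance from the measuring end to the end of the corresponding line; when $0 \le \hat{d}_i \le d_{\max}$, the penalty is 0; when $\hat{d}_i$ exceeds the boundary, the penalty increases rapidly, and gradient descent forces the model to constrain its predictions to the physically feasible region.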

9. A single-ended traveling wave fault location system for a distribution network based on the PINN-Transformer model, comprising a processor, a memory, and a computer program stored in the memory and executable by the processor, characterized in that, When the processor runs the computer program, it implements a single-end traveling wave fault location method for distribution networks based on the PINN-Transformer model as described in any one of claims 1-8.