Transformer substation and phase identification method based on waveform attention and LSTM
By combining waveform attention and LSTM-based methods with a Shapelet generator and a dual-gated LSTM model, the problems of high false positive rate and low phase identification accuracy in traditional methods are solved. This enables accurate identification and phase determination of the relationship between low-voltage transformer substations and households, and is suitable for lean management and anomaly analysis of smart distribution networks.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- HUNAN UNIV
- Filing Date
- 2026-05-18
- Publication Date
- 2026-06-12
AI Technical Summary
Traditional distribution network topology and phase identification methods are easily affected by distributed photovoltaic access, environmental noise, and three-phase load imbalance in low-voltage distribution areas when faced with highly complex and dynamic low-voltage distribution area operation data. This results in high false positive and false negative rates, and the inability to accurately capture minute synchronous fluctuation characteristics, leading to disordered household-transformer relationships and low phase identification accuracy.
A transformer substation and phase identification method based on waveform attention and LSTM is adopted. Data is collected from smart meters on the user side and monitoring terminals on the transformer side of the low-voltage transformer substation. Positive and negative correlation sample pairs are constructed, physical difference sequences are calculated, temporal attention weights are extracted using a Shapelet generator, and a dual-gated LSTM model is trained. The forget gate and input gate are dynamically intervened, and a multi-task loss function is constructed for identification.
It effectively shields redundant noise, improves the accuracy and robustness of capturing synchronous fluctuation characteristics, enhances the accuracy of topology and phase identification, and is suitable for lean management and anomaly analysis of smart distribution networks, reducing the false positive and false negative rates.
Figure CN122196755A_ABST
Description
Technical Field
[0001] This invention belongs to the field of smart grid and distribution network operation monitoring technology, and relates to a method for identifying transformer substations and phases based on waveform attention and LSTM. Specifically, it is a method for determining the relationship and phase of low-voltage transformer substations by integrating the same-phase voltage homomorphic fluctuation mechanism and deep time series learning.
Background Technology
[0002] With the rapid development of new power systems and smart distribution networks, the dynamic identification of low-voltage distribution area topology and the accurate determination of user phases play a crucial role in lean line loss management, three-phase imbalance mitigation, abnormal power consumption analysis, and distributed energy access. In these applications, effectively identifying the affiliation and phase information of user transformers from massive and noisy measurement data has become a core technical challenge in improving the quality of distribution network operation and maintenance data and management efficiency.
[0003] Traditional distribution network topology and phase identification methods have many shortcomings when dealing with highly complex and dynamic low-voltage transformer substation operating data. These methods typically rely on manual on-site verification, simple voltage threshold judgments, or global similarity measurements of the entire voltage sequence. These methods are easily affected by abnormal voltage fluctuations caused by distributed photovoltaic (PV) installations, environmental noise, and three-phase load imbalances in the substation, leading to extremely high false positive and false negative rates. Furthermore, these methods lack adaptability to complex voltage fluctuation patterns. Especially when the substation structure is complex and similar load patterns are superimposed, traditional global sequence comparison methods often fail to accurately capture minute synchronous fluctuation characteristics and are insensitive to details of local voltage fluctuations, resulting in incorrect transformer-household relationships and low phase identification accuracy.
[0004] Existing distribution network topology identification technologies still face numerous challenges. In actual transformer substation operating environments, the voltage difference characteristics between the user end and the transformer end are often highly nonlinear and time-varying. Accurately extracting local waveform factors that reflect the tightness of the actual physical connection, and using a deep time-series memory mechanism to shield redundant interference signals, is a significant challenge in current distribution network topology identification technologies. Therefore, a robust and accurate topology and phase identification method that deeply integrates the same-phase voltage homomorphic fluctuation mechanism with data-driven local feature extraction and attention mechanisms is an effective way to solve the current problems of chaotic user-transformer relationships and phase identification in low-voltage distribution areas.
Summary of the Invention
[0005] The purpose of this invention is to propose a method for identifying transformer substations and phases based on waveform attention and LSTM, which can efficiently and accurately perform topology and phase identification.
[0006] A method for identifying transformer substations and phases based on waveform attention and LSTM includes the following steps:
Step 1: Data acquisition and establishment of a distribution network voltage sample database. Smart meters on the user side and monitoring terminals on the transformer side of the low-voltage distribution area synchronously collect the user voltage time series and the transformer three-phase voltage time series. The collected raw data is parsed into structured numerical sequences, cleaned by removing abnormal transition points, and converted into continuous, equal-frequency sampled standard time-series voltage data to form the basic voltage sample database.
Step 2: Data preprocessing and construction of positive and negative correlation sample pairs. Data preprocessing means repairing missing data with an improved forward-filling data repair method, then aligning the basic voltage data in time, and applying Z-Score normalization to the basic voltage data to eliminate differences in amplitude scale. Constructing positive and negative sample pairs means building the training dataset: user and transformer data belonging to the same transformer area are labeled as positive samples with a label of 1, user and transformer data randomly paired from different transformer areas are labeled as negative samples with a label of 0, and the phase labels of the users are retained.
Step 3: Physical difference sequence calculation and Shapelet waveform factor extraction. Based on the principle that in-phase voltages in a physical circuit change in the same way, the difference between the user voltage and each of the three transformer phase voltages is calculated to construct a physical differential voltage sequence. This differential voltage sequence is input into a time series segment generator, namely a Shapelet generator, which consists of one-dimensional convolutional layers. Its self-learning (learnable) convolutional kernels capture the synchronous morphological features in the voltage fluctuations. After the activation function is applied, a temporal attention weight sequence reflecting the tightness of the physical connection is generated.
Step 4: Construction and training of the dual-gated LSTM model. The dual-gated LSTM model is a Long Short-Term Memory (LSTM) network containing dual-gated guiding units. The temporal attention weight sequence generated in step 3 is decoupled and mapped to a forgetting modulation signal and an input modulation signal through two independent linear projection layers. These two signals dynamically intervene in the forget gate and the input gate of the LSTM unit, respectively. The output of the last LSTM time step is extracted as a global sequence feature. Shared high-level features are extracted through a shared fully connected layer and then fed into parallel fully connected layers for topology classification and phase classification. The dual-gated LSTM model is trained iteratively by jointly optimizing the topology binary classification loss and the phase multi-class classification loss. Training is monitored on a validation set and stops when the validation loss has converged over a preset number of iterations or when the maximum number of iterations is reached; the model parameters with the minimum validation loss are saved.
Step 5: Online identification and determination of the household-transformer relationship. The voltage time series of the user to be classified and the three-phase voltage time series of the target transformer are input into the trained dual-gated LSTM model. The model first generates the physical difference features internally and performs gated inference, and finally outputs the topology recognition and phase recognition results through the fully connected layers.
[0007] In step 1, the data collected by the smart meters on the user side and the monitoring terminal on the transformer side in the low-voltage distribution area is voltage data, with a sampling interval of 5 minutes, that is, a total of 288 sampling points per day.
[0008] In step 2, the improved forward imputation data repair method works as follows: a missing value is filled with a weighted average of the three nearest valid observations preceding it, and observations closer to the missing value receive larger weights. The calculation formula is as follows:
$$\hat{U}_t = w_1 U_1 + w_2 U_2 + w_3 U_3 \quad \text{(Formula 1)}$$
where $\hat{U}_t$ is the filled voltage value at the missing time $t$; $U_1$, $U_2$, $U_3$ are the three most recent valid observations preceding the missing value; and $w_1$, $w_2$, $w_3$ are the weights corresponding to $U_1$, $U_2$, $U_3$, typically taken as ... .
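The weighted forward fill described above can be sketched in a few lines of Python. This is only an illustration: the function name `weighted_forward_fill` is hypothetical, and the default weights `(0.5, 0.3, 0.2)` are an assumption chosen so that nearer observations weigh more, since the text does not preserve the values used in the filing. Renormalising the weights when fewer than three earlier observations exist is also an assumption of the sketch.

```python
from typing import List, Optional, Sequence


def weighted_forward_fill(
    series: Sequence[Optional[float]],
    weights: Sequence[float] = (0.5, 0.3, 0.2),  # ASSUMED values, nearest observation first
) -> List[Optional[float]]:
    """Fill None gaps with a weighted average of the preceding valid observations."""
    filled: List[Optional[float]] = []
    history: List[float] = []  # genuinely observed values seen so far, oldest first
    for value in series:
        if value is not None:
            history.append(value)
            filled.append(value)
        elif history:
            recent = history[-len(weights):][::-1]  # nearest observation first
            used = weights[: len(recent)]           # fewer weights if history is short
            filled.append(sum(w * v for w, v in zip(used, recent)) / sum(used))
        else:
            filled.append(None)  # no earlier observation to draw on
    return filled


if __name__ == "__main__":
    print(weighted_forward_fill([220.0, 221.0, 222.0, None, 223.0]))
```

Only measured values enter `history`, so a run of consecutive gaps is filled from the same three observations rather than from earlier fills.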
[0009] The cleaned (outlier-removed) user voltage sequences are merged and time-aligned with the transformer three-phase voltage sequences. To meet the model's fixed-length input requirement, a target sequence length threshold $T$ is set: if a sequence is longer than $T$, only its first $T$ time steps are kept; if it is shorter than $T$ time steps, zeros are padded at the end of the sequence. The sequence features are then normalized with the Z-Score method to eliminate the absolute amplitude differences between measurement nodes. The calculation formula is as follows:
$$x' = \frac{x - \mu}{\sigma} \quad \text{(Formula 2)}$$
where $x$ is the original voltage observation; $\mu$ is the mean of the sequence; $\sigma$ is the standard deviation of the sequence, calculated statistically from the actually collected voltage observation data; and $x'$ is the standardized voltage value.
The voltage sequence of a single user is concatenated with the three-phase voltage sequences of the target transformer to form a feature tensor with four channels. User and transformer data combinations that belong to the same distribution network area and feeder are marked as positive samples and assigned a topology binary classification label of 1, and the user's actual access phase label is extracted and retained: phase A, phase B, or phase C, corresponding to multi-class labels 0, 1, and 2, respectively. Randomly selected user and transformer data combinations that do not belong to the same distribution network area are marked as negative samples and assigned a topology binary classification label of 0, and their phase label is set to -1 so that it is ignored in the loss calculation. Finally, all constructed positive and negative samples are randomly shuffled to generate the final training dataset.
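As a rough illustration of the preprocessing and labelling just described, the following Python sketch builds one four-channel sample. The helper names (`zscore`, `fix_length`, `make_sample`) and the dictionary layout are hypothetical. Two points are assumptions rather than statements from the text: the population standard deviation is used, and each channel is normalised before padding so that the padded zeros do not skew the statistics.

```python
from statistics import fmean, pstdev
from typing import Dict, List, Optional, Sequence

PHASE_INDEX = {"A": 0, "B": 1, "C": 2}  # multi-class phase labels from the text


def zscore(seq: Sequence[float]) -> List[float]:
    """Z-Score normalisation: (x - mean) / std, using the population std (assumption)."""
    mean, std = fmean(seq), pstdev(seq)
    if std == 0.0:
        return [0.0 for _ in seq]  # a constant sequence carries no fluctuation
    return [(v - mean) / std for v in seq]


def fix_length(seq: Sequence[float], target_len: int) -> List[float]:
    """Keep the first target_len steps, or zero-pad at the end if the sequence is short."""
    kept = list(seq[:target_len])
    return kept + [0.0] * (target_len - len(kept))


def make_sample(
    user: Sequence[float],
    transformer: Dict[str, Sequence[float]],
    target_len: int,
    same_area: bool,
    phase: Optional[str] = None,
) -> dict:
    """Build one 4-channel sample: user voltage plus transformer phases A, B, C."""
    channels = [fix_length(zscore(user), target_len)]
    for name in ("A", "B", "C"):
        channels.append(fix_length(zscore(transformer[name]), target_len))
    return {
        "x": channels,
        "topology": 1 if same_area else 0,
        # negative pairs get phase -1 so the phase loss can skip them
        "phase": PHASE_INDEX[phase] if same_area else -1,
    }
```

A full dataset would call `make_sample` for every positive pair and for randomly drawn cross-area pairs, then shuffle the result, for example with `random.shuffle`.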
[0010] In step 3, the time series data output by step 2 is taken as the input feature matrix, which contains the user voltage sequence $U_u$ and the transformer three-phase voltage sequences $U_A$, $U_B$, $U_C$. Based on the principle of in-phase voltage fluctuation in physical connection relationships, the time series difference between the user-side voltage and the transformer three-phase voltages is calculated. This eliminates the absolute amplitude information of the voltage and extracts only the synchronous fluctuation differential sequence that reflects the line topology connectivity. The calculation formula is as follows:
$$\Delta U_t = \left[\, U_u(t) - U_{\varphi}(t) \,\right]_{\varphi \in \{A, B, C\}} \quad \text{(Formula 3)}$$
where $t$ is the current time step; $\varphi \in \{A, B, C\}$ indexes the three-phase ports of the transformer; $\Delta U_t$ is the vector of voltage differences between the user voltage and the three phase voltages on the low-voltage side of the transformer; $U_u(t)$ is the voltage of user $u$ at time $t$; and $U_{\varphi}(t)$ is the phase-$\varphi$ voltage of the transformer at time $t$.
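A minimal sketch of the difference calculation, assuming the user and transformer sequences are already aligned and of equal length. The function name `difference_sequence` and the dictionary keyed by phase letter are illustrative choices, not part of the filing.

```python
from typing import Dict, List, Sequence


def difference_sequence(
    user: Sequence[float],
    transformer: Dict[str, Sequence[float]],
) -> Dict[str, List[float]]:
    """Per-phase difference between the user voltage and each transformer phase voltage."""
    return {
        phase: [u - v for u, v in zip(user, transformer[phase])]
        for phase in ("A", "B", "C")
    }


if __name__ == "__main__":
    # Toy data: the user tracks phase B with a constant 2 V drop.
    user = [228.0, 229.5, 227.0]
    transformer = {
        "A": [230.0, 230.0, 230.0],
        "B": [230.0, 231.5, 229.0],
        "C": [231.0, 229.0, 230.5],
    }
    for phase, diff in difference_sequence(user, transformer).items():
        print(phase, diff)
```

In the toy data the phase B difference is flat while the other two channels fluctuate, which is the kind of synchronous behaviour the later Shapelet stage is meant to pick up.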
[0011] To maintain the causality of the time series and prevent information from future time steps leaking into the subsequent waveform feature extraction, the difference sequence $\Delta U$ is zero-padded at the very beginning of the time dimension with a padding length of $L-1$, where $L$ is the set Shapelet waveform factor length, and time steps run up to the target sequence length threshold $T$ mentioned earlier. The causally padded three-channel difference sequence is input into a time series Shapelet generator. This generator is essentially a one-dimensional convolutional network that uses $K$ one-dimensional self-learning convolutional kernels of size $L$ as Shapelet waveform factors to perform sliding feature matching on the difference sequence, capturing the local synchronous morphological features in minute voltage fluctuations. The output feature $s_{k,t}$ extracted by the $k$-th Shapelet at time step $t$ is calculated as follows:
$$s_{k,t} = \sum_{\varphi \in \{A, B, C\}} \sum_{j=0}^{L-1} w_{k,\varphi,j}\, \Delta U_{\varphi}(t-j) + b_k \quad \text{(Formula 4)}$$
where $w_{k,\varphi,j}$ is the weight of the $k$-th Shapelet for phase channel $\varphi$ at offset position $j$, and $b_k$ is the bias term. When the model is first built, the weights are initialized with the Kaiming method: the initial values are random samples from a normal distribution with mean 0 and variance $2/n_{in}$, where $n_{in}$ is the number of input nodes of the one-dimensional convolutional layer. During subsequent iterative training, the backpropagation algorithm combined with the Adam optimizer updates and adjusts them according to the loss gradient. $K$ is the set number of Shapelets; its value is determined by grid search and cross-validation on the validation set, balancing the richness of feature extraction against the computational complexity of the model. In this example ... .
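The causal sliding match can be illustrated with a plain-Python convolution for a single Shapelet. This is a sketch under stated assumptions: the function name `causal_shapelet_response` is hypothetical, the front padding of kernel length minus one is the usual way to keep a causal convolution the same length as its input, and the kernel values in the usage example are arbitrary rather than learned.

```python
from typing import List, Sequence


def causal_shapelet_response(
    diff: Sequence[Sequence[float]],    # one sequence per phase channel, each of length T
    kernel: Sequence[Sequence[float]],  # one weight row per channel, each of length L
    bias: float = 0.0,
) -> List[float]:
    """Response of one shapelet kernel; output step t only sees inputs at steps <= t."""
    length = len(kernel[0])
    steps = len(diff[0])
    # Zero-pad L-1 steps at the FRONT only, so no future value is ever read.
    padded = [[0.0] * (length - 1) + list(channel) for channel in diff]
    out: List[float] = []
    for t in range(steps):
        total = bias
        for channel, weights in zip(padded, kernel):
            for j in range(length):
                # offset j looks j steps into the past; original step t sits at t + L - 1
                total += weights[j] * channel[t + length - 1 - j]
        out.append(total)
    return out


if __name__ == "__main__":
    print(causal_shapelet_response([[1.0, 2.0, 3.0]], [[1.0, 1.0]]))  # [1.0, 3.0, 5.0]
```

Because only the front is padded, changing the last input value leaves every earlier output unchanged, which is the leakage property the paragraph asks for.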
[0012] The convolutional features extracted in the above steps are mapped non-linearly by the Sigmoid activation function and scaled to the probability interval $(0, 1)$, generating a temporal attention weight sequence that reflects the strength of the physical connection and the degree of feature matching. The calculation formula is as follows:
$$a_t = \sigma(s_t) \quad \text{(Formula 5)}$$
where $s_t$ is the vector of Shapelet output features at time step $t$ and $\sigma$ is the Sigmoid function.
In step 4, at the current time step $t$, the temporal attention weights $a_t$ generated in step 3 are input into two independent linear projection layers without bias terms, which decouple them into a forgetting modulation signal $m_t^{f}$ and an input modulation signal $m_t^{i}$. The calculation formulas are as follows:
$$m_t^{f} = W_{sf}\, a_t \quad \text{(Formula 6)}$$
$$m_t^{i} = W_{si}\, a_t \quad \text{(Formula 7)}$$
where $W_{sf}$ is the Shapelet projection weight matrix specific to the forget gate and $W_{si}$ is the Shapelet projection weight matrix specific to the input gate. $W_{sf}$ and $W_{si}$ are both learnable parameter matrices within the dual-gated LSTM model. They are randomly initialized from the Kaiming uniform distribution and are updated iteratively during model training by backpropagation of the error gradient of the joint loss function until the optimal weights are obtained.
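The attention and projection stage can be sketched as follows. The names `attention_weights` and `project` are hypothetical, and the matrices in the usage example are arbitrary illustrative values, not trained parameters.

```python
import math
from typing import List, Sequence


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def attention_weights(responses: Sequence[float]) -> List[float]:
    """Squash shapelet responses for one time step into (0, 1)."""
    return [sigmoid(r) for r in responses]


def project(matrix: Sequence[Sequence[float]], vector: Sequence[float]) -> List[float]:
    """Bias-free linear projection: one output per matrix row."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


if __name__ == "__main__":
    a_t = attention_weights([0.0, 2.0])   # two shapelets at one time step
    w_forget = [[1.0, 0.0], [0.0, 1.0]]   # illustrative projection for the forget gate
    w_input = [[0.5, 0.5], [1.0, -1.0]]   # illustrative projection for the input gate
    print(project(w_forget, a_t), project(w_input, a_t))
```

Using two separate matrices lets the same attention weights push the forget gate and the input gate in different directions, which is the decoupling the paragraph refers to.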
[0013] The original voltage features $x_t$ of the current time step are concatenated with the hidden state $h_{t-1}$ of the previous time step and combined with the forgetting modulation signal $m_t^{f}$ and the input modulation signal $m_t^{i}$ to calculate the forget gate $f_t$ and the input gate $i_t$ under attention intervention. The calculation formulas are as follows:
$$f_t = \sigma\left(W_f z_t + b_f + \alpha_f\, m_t^{f}\right) \quad \text{(Formula 8)}$$
$$i_t = \sigma\left(W_i z_t + b_i + \alpha_i\, m_t^{i}\right) \quad \text{(Formula 9)}$$
where $z_t = [h_{t-1}, x_t]$ is the concatenated vector of the input and the hidden state of the previous time step; $h_{t-1}$ is the feature vector that the recurrent unit calculated and output at the previous time step, carrying historical time-series information; $W_f$ and $b_f$ are the weights and biases of the native forget gate; $W_i$ and $b_i$ are the weights and biases of the native input gate; $\sigma$ is the Sigmoid activation function; and $\alpha_f$ and $\alpha_i$ are the learnable modulation coefficients of the forget gate and the input gate, respectively, used to adaptively adjust during training how strongly the physical difference features intervene in the network memory. The initial value of $\alpha_f$ is 0.05 and the initial value of $\alpha_i$ is 0.05.
[0014] Following the computational rules of the standard Long Short-Term Memory (LSTM) network, and using the calculated forget gate $f_t$ and input gate $i_t$, the cell state $c_t$ and the hidden state $h_t$ of the current time step are updated:
$$\tilde{c}_t = \tanh\left(W_c z_t + b_c\right) \quad \text{(Formula 10)}$$
$$c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t \quad \text{(Formula 11)}$$
$$o_t = \sigma\left(W_o z_t + b_o\right) \quad \text{(Formula 12)}$$
$$h_t = o_t \odot \tanh(c_t) \quad \text{(Formula 13)}$$
where $\tilde{c}_t$ is the candidate cell state; $W_c$ and $b_c$ are the candidate state weights and biases; $o_t$ is the output gate; $W_o$ and $b_o$ are the output gate weights and biases; $\odot$ is the Hadamard product; and $\tanh$ is the hyperbolic tangent activation function. $W_c$ uses Xavier uniform initialization with $b_c = 0$, and $W_o$ uses Xavier uniform initialization with $b_o = 0$. All learnable parameters are updated automatically by the Adam optimizer during training, without manually preset values.
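One time step of the gated update can be sketched in plain Python. The exact way the modulation signals enter the gates is not recoverable from this text, so the sketch makes a loud assumption: each modulation signal is scaled by its coefficient (0.05 initially, per the description) and added to the gate pre-activation before the Sigmoid. The function name `dual_gated_step` and the parameter dictionary keys are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def affine(weight: Sequence[Sequence[float]], vec: Sequence[float], bias: Sequence[float]) -> List[float]:
    return [sum(w * v for w, v in zip(row, vec)) + b for row, b in zip(weight, bias)]


def dual_gated_step(
    x: Sequence[float], h_prev: Sequence[float], c_prev: Sequence[float],
    mod_f: Sequence[float], mod_i: Sequence[float],
    params: Dict[str, list], alpha_f: float = 0.05, alpha_i: float = 0.05,
) -> Tuple[List[float], List[float]]:
    """One LSTM step whose forget and input gates are nudged by attention signals."""
    z = list(h_prev) + list(x)  # concatenate previous hidden state and current input
    # ASSUMPTION: the modulation is added to the pre-activation, scaled by alpha.
    f = [sigmoid(p + alpha_f * m) for p, m in zip(affine(params["W_f"], z, params["b_f"]), mod_f)]
    i = [sigmoid(p + alpha_i * m) for p, m in zip(affine(params["W_i"], z, params["b_i"]), mod_i)]
    g = [math.tanh(p) for p in affine(params["W_c"], z, params["b_c"])]  # candidate state
    o = [sigmoid(p) for p in affine(params["W_o"], z, params["b_o"])]    # output gate
    c = [fv * cp + iv * gv for fv, cp, iv, gv in zip(f, c_prev, i, g)]
    h = [ov * math.tanh(cv) for ov, cv in zip(o, c)]
    return h, c


if __name__ == "__main__":
    zero = {k: [[0.0, 0.0]] for k in ("W_f", "W_i", "W_c", "W_o")}
    zero.update({k: [0.0] for k in ("b_f", "b_i", "b_c", "b_o")})
    print(dual_gated_step([1.0], [0.0], [2.0], [0.0], [0.0], zero))
    print(dual_gated_step([1.0], [0.0], [2.0], [10.0], [0.0], zero))  # stronger retention
```

With all weights at zero, a positive forgetting modulation raises the forget gate above 0.5, so more of the previous cell state survives the step.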
[0015] The output features of the last time step in the hidden state sequence are mapped through fully connected layers to output logits for topology recognition and phase recognition, respectively. A joint multi-task loss function is constructed from the topology binary classification loss $L_{topo}$ and the phase multi-class classification loss $L_{phase}$.
[0016] For determining household-transformer affiliation, the binary cross-entropy loss $L_{topo}$ for topology binary classification is calculated as follows:
$$L_{topo} = -\frac{1}{N} \sum_{i=1}^{N} \left[ y_i \log \sigma(z_i) + (1 - y_i) \log\left(1 - \sigma(z_i)\right) \right] \quad \text{(Formula 14)}$$
where $N$ is the total number of samples in a single training batch; $y_i$ is the true topology label of the $i$-th sample, with 1 for a positive sample from the same transformer area and 0 for a negative sample from a different transformer area; $z_i$ is the topology logit output by the model for the $i$-th sample; and $\sigma$ is the Sigmoid activation function.
[0017] For specific phase classification, the multi-class cross-entropy loss is calculated. To prevent the invalid phase labels of negative samples from interfering with the network gradient, an indicator mask mechanism is introduced so that the phase loss is calculated only for positive samples. The phase multi-class cross-entropy loss $L_{phase}$ is calculated as follows:
$$L_{phase} = -\frac{1}{N_{pos}} \sum_{i=1}^{N} \mathbb{1}(y_i = 1) \sum_{c=0}^{C-1} \mathbb{1}(p_i = c)\, \log \frac{\exp(z_{i,c})}{\sum_{c'=0}^{C-1} \exp(z_{i,c'})} \quad \text{(Formula 15)}$$
where $N_{pos}$ is the number of topologically positive samples in the current batch; $\mathbb{1}(\cdot)$ is an indicator function that takes the value 1 when the condition inside the parentheses is true and 0 otherwise, which ensures that only samples with genuine topological relationships participate in the phase error calculation; $C$ is the total number of phase categories, with a value of 3, corresponding to phases A, B, and C; $p_i$ is the true phase label of a positive sample; and $z_{i,c}$ is the raw logit with which the model predicts that the $i$-th sample belongs to the $c$-th phase.
[0018] Combining the two losses above with dynamic task weight coefficients gives the total loss function. Its role is to quantify the model's combined prediction error on topology and phase and to provide the optimization objective for gradient descent in the backpropagation algorithm. The calculation formula is as follows:
$$L_{total} = \lambda_{topo} L_{topo} + \lambda_{phase} L_{phase} \quad \text{(Formula 16)}$$
where $\lambda_{topo}$ is the weighting coefficient for the topology recognition task and $\lambda_{phase}$ is the weighting coefficient for the phase recognition task, satisfying the condition ... . Adjusting the ratio of $\lambda_{topo}$ to $\lambda_{phase}$ balances the gradient contributions of topology determination and phase classification, preventing a single task from dominating the network parameter updates during backpropagation (generally taken as ...). The normalized four-channel voltage data is then fed into the constructed dual-gated LSTM model, and the gradient is calculated by the backpropagation algorithm to iteratively update the parameters of the dual-gated LSTM model and each weight matrix.
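The masked multi-task loss can be sketched in plain Python as below. The function names are hypothetical, and the default task weights of 0.5 each are an assumption for illustration only, because the values used in the filing are not preserved in this text. Negative samples carry phase label -1 and are skipped by the phase term.

```python
import math
from typing import Sequence


def bce_with_logits(logits: Sequence[float], labels: Sequence[int]) -> float:
    """Mean binary cross-entropy on raw logits, written in a numerically stable form."""
    total = 0.0
    for z, y in zip(logits, labels):
        # -[y*log(s) + (1-y)*log(1-s)] with s = sigmoid(z) equals log(1 + e^z) - y*z
        total += max(z, 0.0) - y * z + math.log1p(math.exp(-abs(z)))
    return total / len(logits)


def masked_phase_ce(
    phase_logits: Sequence[Sequence[float]],
    phase_labels: Sequence[int],
    topo_labels: Sequence[int],
) -> float:
    """Softmax cross-entropy averaged over topologically positive samples only."""
    losses = []
    for scores, phase, topo in zip(phase_logits, phase_labels, topo_labels):
        if topo != 1:
            continue  # indicator mask: negative pairs have no valid phase label
        peak = max(scores)
        log_sum_exp = peak + math.log(sum(math.exp(s - peak) for s in scores))
        losses.append(log_sum_exp - scores[phase])
    return sum(losses) / len(losses) if losses else 0.0


def joint_loss(topo_logits, topo_labels, phase_logits, phase_labels,
               lam_topo: float = 0.5, lam_phase: float = 0.5) -> float:
    """Weighted sum of both task losses; the 0.5/0.5 weights are ASSUMED values."""
    return (lam_topo * bce_with_logits(topo_logits, topo_labels)
            + lam_phase * masked_phase_ce(phase_logits, phase_labels, topo_labels))
```

Averaging the phase term over positives only keeps its scale independent of how many negative pairs happen to fall in a batch.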
[0019] In step 5, the actual voltage time series of the user to be classified and the three-phase voltage time series of the target transformer are extracted simultaneously and concatenated in the feature dimension to form an original input matrix containing 4 channels; the global normalization parameters saved during model training in step 2 are loaded and the original input matrix is normalized; then, according to the preset target time step, the normalized sequence is truncated to the same length or zero-padded at the end to construct a standardized test tensor.
[0020] The test tensor is input into the dual-gated LSTM model for distribution network topology and phase identification that was trained to convergence in steps 3 and 4. Inside the model, the test tensor drives the calculation of the physical difference features, the generation of the Shapelet temporal attention weights, and the inference of the dual-gated LSTM memory units. The hidden layer features at the final time step of the forward and reverse bidirectional sequences are extracted and concatenated. Finally, the fully connected layers at the end of the network output the topology identification logit (log odds) and a three-dimensional phase classification score vector.
[0021] The topology identification log odds are passed through the Sigmoid function to obtain the topology affiliation probability. The optimal classification threshold $\tau^{*}$ is found dynamically by maximizing the F1 score on the validation set. The calculation formula is as follows:
$$\tau^{*} = \arg\max_{\tau} \frac{2\, P(\tau)\, R(\tau)}{P(\tau) + R(\tau)} \quad \text{(Formula 17)}$$
where $P(\tau)$ is the precision at affiliation probability threshold $\tau$, calculated as $P = \frac{TP}{TP + FP}$, and $R(\tau)$ is the recall at affiliation probability threshold $\tau$, calculated as $R = \frac{TP}{TP + FN}$. Here TP (True Positives) are samples predicted positive that are actually positive; FP (False Positives) are samples predicted positive that are actually negative; FN (False Negatives) are samples predicted negative that are actually positive; and TN (True Negatives) are samples predicted negative that are actually negative.
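The threshold search can be illustrated with a simple scan over candidate thresholds. The function names `f1_at` and `best_threshold` are hypothetical, and the candidate grid of 0.01 to 0.99 in steps of 0.01 is an assumption of this sketch, since the text does not say how the search is carried out.

```python
from typing import Optional, Sequence


def f1_at(probs: Sequence[float], labels: Sequence[int], threshold: float) -> float:
    """F1 score when a pair is predicted positive if its probability exceeds the threshold."""
    tp = sum(1 for p, y in zip(probs, labels) if p > threshold and y == 1)
    fp = sum(1 for p, y in zip(probs, labels) if p > threshold and y == 0)
    fn = sum(1 for p, y in zip(probs, labels) if p <= threshold and y == 1)
    if tp == 0:
        return 0.0  # precision and recall are both zero or undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def best_threshold(
    probs: Sequence[float],
    labels: Sequence[int],
    candidates: Optional[Sequence[float]] = None,
) -> float:
    """Return the candidate threshold with the highest validation F1 (first one on ties)."""
    if candidates is None:
        candidates = [k / 100 for k in range(1, 100)]  # ASSUMED search grid
    return max(candidates, key=lambda t: f1_at(probs, labels, t))


if __name__ == "__main__":
    val_probs = [0.9, 0.8, 0.3, 0.2]
    val_labels = [1, 1, 0, 0]
    tau = best_threshold(val_probs, val_labels)
    print(tau, f1_at(val_probs, val_labels, tau))
```

The comparison uses a strict "greater than", matching the decision rule in which a probability equal to the threshold counts as a negative.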
[0022] If the topology affiliation probability is greater than the set classification threshold $\tau$, the topological relationship of the user to be classified is established and the user is determined to belong to the target transformer area. In that case, the argmax function is applied to the phase classification score vector to extract the category index with the maximum score, and the user's specific access phase is output according to the index mapping.
[0023] If the topology affiliation probability is less than or equal to the classification threshold $\tau$, the topological relationship of the user to be classified is not established; the user is identified as a non-local or abnormal user, and no phase classification result is output.
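The two-branch decision rule above can be written as a short function. The name `identify` and the choice of returning `None` for a non-local or abnormal user are illustrative assumptions; the phase order A, B, C for indices 0, 1, 2 follows the label mapping given in the description.

```python
import math
from typing import Optional, Sequence

PHASES = ("A", "B", "C")  # index 0, 1, 2 as in the training labels


def identify(topo_logit: float, phase_scores: Sequence[float], threshold: float) -> Optional[str]:
    """Return the access phase if the user belongs to the area, otherwise None."""
    probability = 1.0 / (1.0 + math.exp(-topo_logit))  # Sigmoid of the log odds
    if probability <= threshold:
        return None  # non-local or abnormal user: no phase is reported
    best = max(range(len(phase_scores)), key=lambda k: phase_scores[k])  # argmax
    return PHASES[best]


if __name__ == "__main__":
    print(identify(2.0, [0.1, 1.5, -0.3], threshold=0.5))   # B
    print(identify(-2.0, [0.1, 1.5, -0.3], threshold=0.5))  # None
```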
[0024] Beneficial Effects: The transformer substation and phase identification method based on waveform attention and LSTM of this invention has a solid theoretical basis and scientific rationality in each step. The differential sequence is calculated based on the in-phase voltage fluctuation mechanism in physical circuit principles, effectively eliminating absolute amplitude information and accurately highlighting the synchronous fluctuation characteristics reflecting the line topology connectivity. A one-dimensional convolutional network is used to construct a Shapelet generator, which keenly captures the local synchronous morphological characteristics in minute voltage fluctuations. A dual-gated LSTM model is introduced, based on sequence memory network theory, decoupling the temporal attention weights into forgetting and input modulation signals and dynamically intervening in the input and forget gates of the LSTM unit, ensuring accurate deduction of long-term temporal dependence characteristics from both data-driven and physical structure perspectives. Simultaneously, a multi-task joint loss function with an indicator mask mechanism is applied, effectively avoiding gradient interference from invalid labels of negative samples when jointly optimizing topology and phase classification, thus improving the system's generalization ability and convergence efficiency. These steps are based on the theoretical foundation of multiple disciplines such as physical electrical engineering, deep learning, and time-series signal processing. They are scientific methods that have been rigorously mathematically derived and verified by engineering logic. This invention integrates the physical voltage fluctuation mechanism into deep learning and uses the Shapelet attention mechanism to accurately capture minute synchronous fluctuation characteristics. 
It can effectively solve the problems of chaotic relationships between low-voltage transformer substations and low phase identification accuracy, and has high system robustness and practical engineering application value.
[0025] The method of this invention is used for determining the ownership of transformers and identifying the phase of users in low-voltage distribution networks. It can accurately capture the small synchronous fluctuation characteristics of voltage through the Shapelet attention mechanism, and efficiently and accurately output the topology classification and phase category results. It is applicable to fields such as lean management of smart distribution networks, power line loss anomaly analysis, dynamic identification of transformer area topology, and operation and maintenance data governance.
[0026] This invention integrates the homomorphic fluctuation mechanism of in-phase voltage into deep learning, and combines the Shapelet temporal attention mechanism with a dual-gated LSTM model to effectively capture minute synchronous fluctuation features in low-voltage distribution area voltage data, and accurately determine the user's transformer affiliation and phase category. The technical solution of this invention solves the technical problems of high false positive and false negative rates, insensitivity to minute voltage features, chaotic user-transformer relationships, and low phase identification accuracy in traditional methods during global sequence comparison. It can achieve more accurate and efficient topology and phase identification in complex distribution network environments including distributed photovoltaic access and three-phase load imbalance, and is widely applicable to fields such as lean management of smart distribution networks, power line loss anomaly analysis, and operational and distribution data governance.
[0027] This invention provides a method for identifying transformer substations and phases in low-voltage distribution networks based on a dual-gated LSTM model with waveform attention. This method offers significant advantages, particularly in determining the relationships between transformer substations and phases in low-voltage distribution networks. Compared to traditional methods, this invention effectively reduces the high false positive and false negative rates caused by distributed photovoltaic (PV) integration or complex environmental noise, and solves the problem that traditional algorithms rely on global similarity and cannot accurately capture subtle local synchronous changes. By introducing a Shapelet attention mechanism combined with a dual-gated LSTM model, efficient and accurate local waveform feature extraction and temporal reasoning can be achieved, thereby improving the accuracy and robustness of distribution network topology and phase identification. Its efficient and accurate identification mechanism not only enhances the real-time performance and reliability of the distribution data management system but also provides strong technical support for three-phase imbalance management, abnormal line loss analysis, and lean operation of smart distribution networks, demonstrating significant theoretical value and broad application prospects.
[0028] As can be seen from the above technical solution of the present invention, the present invention has the following technical effects: First, this invention proposes an identification method that combines physical differential mechanisms with data-driven approaches, enabling accurate topology identification and phase determination in complex low-voltage distribution network data. By extracting shapelet waveform factors and generating temporal attention weights, redundant voltage fluctuation noise can be effectively shielded, reducing false positive and false negative rates and improving the accuracy of capturing synchronous fluctuation characteristics. This method exhibits higher sensitivity to the evolution of small local in-phase voltage fluctuations in transformer substation operation monitoring, making it particularly suitable for complex electrical environments with distributed energy access and widespread nonlinear loads. Second, this invention constructs a novel dual-gated LSTM network architecture, which can deeply decouple and utilize the extracted shapelet temporal attention, effectively improving the joint recognition accuracy of multi-task models. By dynamically adjusting the forget gate and input gate of the LSTM through two independent linear projection layers, the model can accurately evaluate and retain the most discriminative topological and phase features, thereby improving the inference efficiency and generalization ability of the transformer substation topology recognition system and ensuring accurate reflection of the real physical connection status of household transformers in a highly dynamic distribution network environment.
Attached Figure Description
[0029] Figure 1 is a schematic diagram illustrating the implementation process of the present invention.
[0030] Figure 2 is a schematic diagram of the dual-gated LSTM model structure of the present invention.
[0031] Figure 3 is a comparison of the recognition accuracy of the dual-gated Shapelet LSTM, the Pearson algorithm, the CNN algorithm, and the basic LSTM model in an embodiment of the present invention, wherein (a) is a comparison of the topology recognition task performance and (b) is a comparison of the phase recognition task performance.
[0032] Figure 4 is a comparison of the recognition accuracy of the physical difference-free model, the single-gated model, and the complete model of the present invention in an embodiment of the present invention, wherein (a) is a comparison of the topology recognition task performance and (b) is a comparison of the phase recognition task performance.
[0033] Figure 5 is the loss curve during the training process.
[0034] Figure 6 is the accuracy curve for topology recognition.
[0035] Figure 7 is the accuracy curve for phase recognition.
Detailed Implementation
[0036] The present invention will be further described in detail below with reference to the accompanying drawings and specific embodiments.
Example 1
[0037] This invention addresses the confusion of household-transformer relationships and the difficulty of phase identification in low-voltage distribution networks. Traditional methods often suffer from high false positive and false negative rates when voltage anomalies caused by distributed photovoltaic (PV) integration combine with complex environmental noise. By introducing physical differential sequence calculation and Shapelet waveform factor extraction, the invention captures subtle synchronous voltage fluctuations, improving the sensitivity and accuracy of topology and phase feature extraction. Using a dual-gated LSTM network architecture trained with multi-task learning, the invention decouples the temporal attention weights to dynamically adjust the forget gate and input gate of the memory unit, which shields redundant interference and improves the overall recognition accuracy and robustness of the model. The method is applicable to lean management of smart distribution networks, anomaly management of transformer line losses, and dynamic maintenance of operational and distribution data. To this end, the invention proposes a transformer substation and phase identification method based on waveform attention dual-gated LSTM.
[0038] The method for identifying transformer substations and phases based on waveform attention dual-gated LSTM is shown in Figure 1. First, smart meters on the user side and monitoring terminals on the transformer side of the low-voltage distribution area synchronously collect the user voltage time series and the transformer three-phase voltage time series, respectively. The collected data is parsed and cleaned to build a basic voltage sample database. An improved forward imputation data repair method is applied to missing data, followed by standard normalization, and a training dataset containing positive and negative correlation sample pairs and phase labels is constructed. The difference between the user voltage and the transformer three-phase voltage is calculated according to physical circuit principles to build a physical differential voltage sequence. A one-dimensional convolutional network then extracts Shapelet waveform factors and generates a temporal attention weight sequence reflecting the tightness of the physical connection. An LSTM model containing a dual-gated guiding unit is constructed, in which the attention weights are decoupled into a forget modulation signal and an input modulation signal that dynamically intervene in the cell state, and the model is trained jointly with a multi-task loss function. For online identification, the voltage sequence of the user under test and the three-phase voltage sequence of the target transformer are input; the model generates the differential features internally and performs gated inference to select the most discriminative topology and phase features while avoiding interference from redundant fluctuation information. Finally, the household-transformer affiliation and phase category of the user are output.
[0039] The specific implementation steps of the transformer substation and phase identification method based on waveform attention dual-gated LSTM are as follows:
Step 1: Using the smart meters on the user side and the monitoring terminals on the transformer side in the low-voltage distribution area, the user voltage time series and the transformer three-phase voltage time series are collected synchronously, respectively. The raw data messages are parsed into structured numerical sequences, abnormal jump points are removed, and the data is converted into continuous, equal-frequency sampled standard time series voltage data to form a basic voltage sample database.
Step 2: Missing data is repaired with an improved forward imputation data repair method, and the basic voltage data is then time-aligned and Z-Score normalized to eliminate amplitude differences. A training dataset is then constructed in which user and transformer data belonging to the same transformer area are labeled as positive samples (label 1) and randomly paired user and transformer data from different transformer areas are labeled as negative samples (label 0), while the phase labels of the users are retained.
Step 3: Based on the principle that in-phase voltages in a physical circuit vary in the same pattern, the difference between the user voltage and the three-phase voltage of the transformer is calculated to construct a physical differential voltage sequence. The differential sequence is input into a time series Shapelet generator composed of one-dimensional convolutional layers, where learnable convolutional kernels capture the synchronous morphological features in the voltage fluctuations. After activation function processing, a temporal attention weight sequence reflecting the tightness of the physical connection is generated.
Step 4: A Long Short-Term Memory (LSTM) network with dual-gated guidance units is constructed. The attention weight sequence generated in Step 3 is decoupled into a forget modulation signal and an input modulation signal through two independent linear projection layers, and these two signals dynamically intervene in the forget gate and input gate of the LSTM unit. The model is trained iteratively by jointly optimizing the topology binary classification loss and the phase multi-class classification loss, and the training process is monitored on a validation set. Training stops when the validation loss has converged over a preset number of iterations or the maximum number of iterations is reached, and the model parameters with the minimum validation loss are saved.
Step 5: The voltage time series of the user to be classified and the three-phase voltage time series of the target transformer are input into the trained model. The model first generates physical difference features internally and performs gated inference, and finally outputs the topology identification and phase identification results through the fully connected layer.
[0040] In step 1, the data collected by the smart meters on the user side and the monitoring terminals on the transformer side in the low-voltage distribution area is voltage data. The sampling interval is 5 minutes, giving a total of 288 sampling points per day. The raw voltage acquisition data structure and variable definitions are shown in Table 1.
[0041] In step 2, each missing value is imputed with a weighted average of the three nearest valid observations preceding it. Each observation is assigned a different weight, and observations closer to the missing value receive larger weights. The calculation is as follows:
$$\hat{v}_t = \frac{w_1 v_{t_1} + w_2 v_{t_2} + w_3 v_{t_3}}{w_1 + w_2 + w_3}$$ (Formula 1)
where $\hat{v}_t$ is the filled voltage value at the missing time $t$; $v_{t_1}$, $v_{t_2}$, $v_{t_3}$ are the three most recent valid observations preceding the missing value; and $w_1$, $w_2$, $w_3$ are the corresponding weights, usually set to [values missing].
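The weighted forward imputation of Formula 1 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function name `weighted_forward_impute` and the example weights `(0.5, 0.3, 0.2)` are hypothetical, only originally observed values (not earlier fills) are treated as valid, and the weights are renormalized when fewer than three valid observations are available.

```python
def weighted_forward_impute(series, weights=(0.5, 0.3, 0.2)):
    """Fill None entries with a weighted average of up to three preceding valid values.

    weights[0] applies to the nearest valid observation (hypothetical values).
    """
    filled = []
    history = []  # originally observed values seen so far, oldest first
    for value in series:
        if value is not None:
            filled.append(float(value))
            history.append(float(value))
            continue
        recent = history[-3:][::-1]  # nearest observation first
        if not recent:
            filled.append(None)  # nothing before the gap: leave it unfilled
            continue
        used = weights[:len(recent)]
        # Assumption: renormalize so the result stays a weighted average.
        filled.append(sum(w * v for w, v in zip(used, recent)) / sum(used))
    return filled


voltages = [220.0, 221.0, 222.0, None, 223.0]
print(weighted_forward_impute(voltages))
```

Normalizing by the sum of the weights keeps the result inside the range of the neighbouring observations even when the configured weights do not sum to one.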
[0042] The cleaned user voltage sequence is merged and aligned with the transformer three-phase voltage sequence. To meet the model's fixed-length input requirement, a target sequence length threshold $T$ is set. If the sequence length exceeds $T$, only the data of the first $T$ time steps are kept; if the sequence length is less than $T$ time steps, zero values are padded at the end of the sequence. The sequence features are then normalized with the Z-Score normalization method to eliminate absolute amplitude differences between different measurement nodes, as calculated below:
$$x' = \frac{x - \mu}{\sigma}$$ (Formula 2)
where $x$ is the original voltage observation, $\mu$ is the mean of the sequence, $\sigma$ is the standard deviation of the sequence, both calculated statistically from the actually collected voltage observation data, and $x'$ is the standardized voltage value.
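A minimal sketch of the fixed-length step and the Z-Score normalization of Formula 2 is given below. The helper names `fix_length` and `z_score` are hypothetical, the use of the population standard deviation is an assumption (the text does not say which estimator is used), and returning zeros for a constant sequence is an added guard against division by zero.

```python
import statistics


def fix_length(sequence, target_len):
    """Keep the first target_len steps, or zero-pad at the end if too short."""
    if len(sequence) >= target_len:
        return list(sequence[:target_len])
    return list(sequence) + [0.0] * (target_len - len(sequence))


def z_score(sequence):
    """Standardize a sequence to zero mean and unit standard deviation."""
    mean = statistics.fmean(sequence)
    std = statistics.pstdev(sequence)  # assumption: population standard deviation
    if std == 0:
        return [0.0 for _ in sequence]  # guard: constant sequence
    return [(x - mean) / std for x in sequence]


raw = [229.8, 230.4, 231.1, 230.0, 229.5]
print(fix_length(z_score(raw), 8))
```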
[0043] The voltage sequence of each individual user is concatenated with the three-phase voltage sequences of the target transformer to form a feature tensor with four channels. User and transformer data combinations belonging to the same distribution network area and feeder are marked as positive samples and assigned a topology binary classification label of 1, and the actual access phase label of the user (phase A, phase B, or phase C, corresponding to multi-class labels 0, 1, and 2, respectively) is extracted and retained. Randomly selected user and transformer data combinations that do not belong to the same distribution network area are marked as negative samples and assigned a topology binary classification label of 0, and their phase label is set to -1 so that it is ignored in the loss calculation. Finally, all constructed positive and negative samples are randomly shuffled to generate the final training dataset. The specific label definitions are shown in Table 2.
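The sample construction described above can be sketched as follows. The data layout (dictionaries with `voltage`, `area`, and `phase` keys), the function name `build_samples`, and the choice of one negative pair per user are illustrative assumptions; only the labels (1/0 for topology, 0/1/2 for phases A/B/C, -1 for ignored phases) come from the text.

```python
import random

PHASE_TO_LABEL = {"A": 0, "B": 1, "C": 2}
IGNORED_PHASE = -1  # phase label of negative samples, skipped by the phase loss


def build_samples(users, transformers, seed=0):
    """Build shuffled positive and negative four-channel sample pairs.

    users: list of dicts with keys "voltage", "area", "phase" (assumed layout).
    transformers: dict mapping area id -> (phase A, phase B, phase C) voltages.
    """
    rng = random.Random(seed)
    samples = []
    for user in users:
        # Positive pair: the user with the transformer of its own area.
        own = transformers[user["area"]]
        samples.append({
            "x": [user["voltage"], *own],
            "topology": 1,
            "phase": PHASE_TO_LABEL[user["phase"]],
        })
        # Negative pair: the same user with a randomly chosen foreign transformer.
        foreign_areas = [area for area in transformers if area != user["area"]]
        if foreign_areas:
            foreign = transformers[rng.choice(foreign_areas)]
            samples.append({
                "x": [user["voltage"], *foreign],
                "topology": 0,
                "phase": IGNORED_PHASE,
            })
    rng.shuffle(samples)
    return samples
```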
[0044] In step 3, the time series data output by step 2 is taken as the input feature matrix, which contains the user voltage sequence and the transformer three-phase voltage sequence. Based on the mechanism that voltages physically connected to the same phase fluctuate together, the time-aligned difference between the user-side voltage and each of the three phase voltages of the transformer is calculated. This removes the absolute amplitude information of the voltage and yields a synchronous-fluctuation difference sequence that reflects only the line topology connectivity. The calculation is as follows:
$$\Delta\mathbf{v}_t = \left[\Delta v_{A,t},\ \Delta v_{B,t},\ \Delta v_{C,t}\right]^{\top},\qquad \Delta v_{p,t} = u_{i,t} - v_{p,t},\quad p \in \{A, B, C\}$$ (Formula 3)
where $t$ is the current time step; $p$ indexes the three-phase ports of the transformer; $\Delta\mathbf{v}_t$ is the vector of voltage differences between the user voltage and the three phase voltages on the low-voltage side of the transformer; $u_{i,t}$ is the voltage of user $i$ at time $t$; and $v_{p,t}$ is the phase-$p$ voltage of the transformer at time $t$.
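As a small illustration of Formula 3, the sketch below subtracts each transformer phase voltage from the user voltage at every time step. The function name `physical_difference` and the toy values are hypothetical.

```python
def physical_difference(user_v, phase_a, phase_b, phase_c):
    """Return one [u - vA, u - vB, u - vC] row per time step."""
    if not (len(user_v) == len(phase_a) == len(phase_b) == len(phase_c)):
        raise ValueError("all sequences must be time-aligned and equally long")
    return [
        [u - a, u - b, u - c]
        for u, a, b, c in zip(user_v, phase_a, phase_b, phase_c)
    ]


# Toy example: the user follows phase B with a constant offset of 0.5.
user = [1.0, 2.0, 1.5]
diff = physical_difference(user, [0.0, 0.0, 0.0], [0.5, 1.5, 1.0], [1.0, 1.0, 1.0])
print([row[1] for row in diff])  # the phase B channel is flat
```

In this toy case the channel of the phase the user actually follows is constant, while the other two channels still carry the user's fluctuations, which is the kind of contrast the difference sequence is meant to expose.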
[0045] To maintain the causality of the time series and prevent information from future time steps leaking into the subsequent waveform feature extraction, the difference sequence is zero-padded at the very beginning of the time dimension, with a padding length of [value missing], where $K$ is the set length of the Shapelet waveform factor.
[0046] The causally padded three-channel difference sequence is input into a time series Shapelet generator. This generator is essentially a one-dimensional convolutional network that uses $M$ learnable one-dimensional convolutional kernels of size $K$ as Shapelet waveform factors and slides them over the difference sequence for feature matching, thereby capturing the local synchronous morphological features in minute voltage fluctuations. The feature output $s_{k,t}$ extracted by the $k$-th Shapelet at time step $t$ is calculated as follows:
$$s_{k,t} = \sum_{p \in \{A,B,C\}} \sum_{j=0}^{K-1} w_{k,p,j}\, \Delta v_{p,\,t-j} + b_k$$ (Formula 4)
where $\Delta v_{p,\,t-j}$ is the channel-$p$ value of the zero-padded difference sequence at time step $t-j$, $w_{k,p,j}$ is the weight of the $k$-th Shapelet for phase sequence channel $p$ at offset position $j$, and $b_k$ is the bias term. When the model is first constructed, initial random values are assigned using the Kaiming initialization method: the initial weights are drawn at random from a normal distribution with mean 0 and variance $2/n$, where $n$ is the number of input nodes of the one-dimensional convolutional layer. During subsequent iterative training, the parameters are updated by the backpropagation algorithm combined with the Adam optimizer according to the loss gradient. $M$ is the number of Shapelets; its value is determined by grid search and cross-validation on the validation set, balancing the richness of feature extraction against the computational complexity of the model. In this example, $M$ = [value missing].
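[0047] The convolutional features extracted in the above steps are mapped non-linearly with the Sigmoid activation function and scaled to the probability interval $(0, 1)$, finally generating a temporal attention weight sequence that reflects the strength of the physical connection and the degree of feature matching. The calculation is as follows:
$$\alpha_{k,t} = \sigma(s_{k,t}) = \frac{1}{1 + e^{-s_{k,t}}}$$ (Formula 5)
where $s_{k,t}$ is the feature output of the $k$-th Shapelet at time step $t$ and $\alpha_{k,t}$ is the corresponding temporal attention weight.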
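The Shapelet generator of Formulas 4 and 5 can be illustrated with a plain-Python causal one-dimensional convolution followed by a Sigmoid. This is a sketch under assumptions: the function name `shapelet_attention` is hypothetical, the left padding of `K - 1` zeros and the convention that offset `j` looks `j` steps into the past are the usual choices for a causal convolution rather than values confirmed by the text, and the kernels are passed in instead of being learned.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def shapelet_attention(diff_seq, kernels, biases):
    """Causal 1-D convolution plus Sigmoid.

    diff_seq: T rows of 3 channel values (difference sequence).
    kernels:  M kernels, each indexed as kernel[channel][offset], length K.
    biases:   M bias terms.
    Returns T rows of M attention weights in (0, 1).
    """
    k_len = len(kernels[0][0])
    channels = len(diff_seq[0])
    # Assumption: pad K - 1 zero rows at the start so step t sees no future step.
    padded = [[0.0] * channels for _ in range(k_len - 1)] + [list(r) for r in diff_seq]
    attention = []
    for t in range(len(diff_seq)):
        row = []
        for kernel, bias in zip(kernels, biases):
            s = bias
            for p in range(channels):
                for j in range(k_len):
                    # Offset j reads the value j steps before time step t.
                    s += kernel[p][j] * padded[t + k_len - 1 - j][p]
            row.append(sigmoid(s))
        attention.append(row)
    return attention
```

Because only past and present values enter each output, changing a later sample of the difference sequence leaves all earlier attention weights unchanged.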
[0048] In step 4, at the current time step $t$, the temporal attention weights $\boldsymbol{\alpha}_t$ generated in step 3 are obtained and input into two independent linear projection layers that contain no bias terms. The projections decouple the attention into a forget modulation signal $\mathbf{m}^{f}_t$ and an input modulation signal $\mathbf{m}^{i}_t$. The calculation is as follows:
$$\mathbf{m}^{f}_t = W_{sf}\,\boldsymbol{\alpha}_t$$ (Formula 6)
$$\mathbf{m}^{i}_t = W_{si}\,\boldsymbol{\alpha}_t$$ (Formula 7)
where $W_{sf}$ is the Shapelet projection weight matrix dedicated to the forget gate and $W_{si}$ is the Shapelet projection weight matrix dedicated to the input gate. $W_{sf}$ and $W_{si}$ are both learnable parameter matrices within the dual-gated LSTM model. They are randomly initialized with the Kaiming uniform distribution and are iteratively updated during the subsequent training phase by backpropagating the error gradient of the joint loss function.
[0049] The original voltage features $\mathbf{x}_t$ of the current time step are concatenated with the hidden state $\mathbf{h}_{t-1}$ of the previous time step, where the hidden state is the feature vector carrying historical time-series information that the recurrent unit computed and output at the previous time step. Together with the forget modulation signal $\mathbf{m}^{f}_t$ and the input modulation signal $\mathbf{m}^{i}_t$, the concatenated vector is used to calculate the forget gate $\mathbf{f}_t$ and the input gate $\mathbf{i}_t$ under attention intervention. The calculation is as follows: [formula missing] (Formula 8); [formula missing] (Formula 9), where $\mathbf{z}_t = [\mathbf{x}_t;\ \mathbf{h}_{t-1}]$ is the concatenated vector of the input and the hidden state from the previous time step; $W_f$ and $b_f$ are the weights and biases of the native forget gate; $W_i$ and $b_i$ are the weights and biases of the native input gate; $\sigma$ is the Sigmoid activation function; and $\lambda_f$ and $\lambda_i$ are the learnable modulation coefficients of the forget gate and the input gate, respectively, used to adaptively adjust during training how strongly the physical difference features intervene in the network memory.
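[0050] Following the computational rules of the standard Long Short-Term Memory (LSTM) network, and using the forget gate $\mathbf{f}_t$ and input gate $\mathbf{i}_t$ calculated above, the cell state $\mathbf{c}_t$ and hidden state $\mathbf{h}_t$ of the current time step are updated:
$$\tilde{\mathbf{c}}_t = \tanh(W_c \mathbf{z}_t + b_c)$$ (Formula 10)
$$\mathbf{c}_t = \mathbf{f}_t \odot \mathbf{c}_{t-1} + \mathbf{i}_t \odot \tilde{\mathbf{c}}_t$$ (Formula 11)
$$\mathbf{o}_t = \sigma(W_o \mathbf{z}_t + b_o)$$ (Formula 12)
$$\mathbf{h}_t = \mathbf{o}_t \odot \tanh(\mathbf{c}_t)$$ (Formula 13)
where $\mathbf{z}_t$ is the concatenation of the current input and the previous hidden state; $\tilde{\mathbf{c}}_t$ is the candidate cell state; $W_c$ and $b_c$ are the candidate state weights and biases; $\mathbf{o}_t$ is the output gate; $W_o$ and $b_o$ are the output gate weights and biases; $\odot$ denotes the Hadamard product; and $\tanh$ is the hyperbolic tangent activation function.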
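One time step of the dual-gated unit described in paragraphs [0048] to [0050] might look like the sketch below. It is not the patented formula: the exact way the modulation signals enter the gates is not recoverable from the text, so this example assumes they are scaled by the learnable coefficients and added to the gate pre-activations before the Sigmoid. The parameter dictionary keys (`W_sf`, `W_si`, `lam_f`, `lam_i`, and so on) and the function name `dual_gated_lstm_step` are hypothetical.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def matvec(matrix, vector):
    """Multiply a row-major matrix by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def dual_gated_lstm_step(x_t, h_prev, c_prev, alpha_t, p):
    """One recurrent step with attention-modulated forget and input gates."""
    z = list(x_t) + list(h_prev)  # concatenated input and previous hidden state

    # Bias-free projections of the attention weights (Formulas 6 and 7).
    m_f = matvec(p["W_sf"], alpha_t)
    m_i = matvec(p["W_si"], alpha_t)

    def affine(name):
        return [a + b for a, b in zip(matvec(p["W_" + name], z), p["b_" + name])]

    # ASSUMPTION: modulation is added to the gate pre-activation.
    f = [sigmoid(a + p["lam_f"] * m) for a, m in zip(affine("f"), m_f)]
    i = [sigmoid(a + p["lam_i"] * m) for a, m in zip(affine("i"), m_i)]

    # Standard LSTM candidate state, output gate, and state updates.
    c_tilde = [math.tanh(a) for a in affine("c")]
    o = [sigmoid(a) for a in affine("o")]
    c = [ft * cp + it * ct for ft, cp, it, ct in zip(f, c_prev, i, c_tilde)]
    h = [ot * math.tanh(ct) for ot, ct in zip(o, c)]
    return h, c
```

Under this assumption, a large positive forget modulation pushes the forget gate towards one, so the cell state is preserved at time steps where the Shapelet attention signals a strong physical connection.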
[0051] To clarify how the physical difference features flow between layers in the dual-gated LSTM model described above, this embodiment specifies a concrete network architecture and tensor dimension transformations. The specific hyperparameter configurations and dimension mappings are shown in Table 3. The time step is set to $T = 288$ (corresponding to the 5-minute sampling points within 24 hours), and the batch size is [value missing].
[0052] To ensure the convergence and generalization ability of the model in practical applications, this embodiment adopts the training hyperparameter configuration shown in Table 4.
[0053] The output features of the last time step in the hidden state sequence are mapped through fully connected layers to output logits for topology recognition and phase recognition, respectively. A joint multi-task loss function is constructed that incorporates both the topology binary classification loss and the phase multi-class classification loss.
[0054] For determining household-transformer affiliation, the binary cross-entropy loss of the topology binary classification is calculated as follows:
$$L_{\text{topo}} = -\frac{1}{N}\sum_{i=1}^{N}\left[y_i \log \sigma(\hat{z}_i) + (1 - y_i)\log\bigl(1 - \sigma(\hat{z}_i)\bigr)\right]$$ (Formula 14)
where $N$ is the total number of samples in a single training batch; $y_i$ is the true topology label of the $i$-th sample, with 1 denoting a positive sample from the same transformer area and 0 denoting a negative sample from a different transformer area; $\hat{z}_i$ is the topology logit output by the model for the $i$-th sample; and $\sigma$ is the Sigmoid activation function.
[0055] For phase classification, the cross-entropy loss for multi-class classification is calculated. To prevent the invalid phase labels of negative samples from interfering with the network gradient, an indicator mask mechanism is introduced so that the phase loss is calculated only for positive samples, as follows:
$$L_{\text{phase}} = -\frac{1}{N_{\text{pos}}}\sum_{i=1}^{N}\mathbb{1}(y_i = 1)\,\log\frac{\exp(\hat{q}_{i,p_i})}{\sum_{c=0}^{C-1}\exp(\hat{q}_{i,c})}$$ (Formula 15)
where $N$ is the number of samples in the batch and $y_i$ is the topology label of the $i$-th sample; $N_{\text{pos}}$ is the number of topologically positive samples in the current batch; $\mathbb{1}(\cdot)$ is an indicator function that takes the value 1 when the condition inside the parentheses is true and 0 otherwise, which ensures that only samples with genuine topological relationships participate in the phase error calculation; $C$ is the total number of phase categories (with a value of 3, corresponding to phases A, B, and C); $p_i$ is the true phase label of a positive sample; and $\hat{q}_{i,c}$ is the raw logit predicted by the model for the $i$-th sample belonging to the $c$-th phase.
[0056] The two losses above are combined with dynamic task weight coefficients into a total loss function, which quantifies the model's combined prediction error on topology and phase and provides the optimization objective for gradient descent in the backpropagation algorithm. The calculation is as follows:
$$L = \beta_{\text{topo}}\, L_{\text{topo}} + \beta_{\text{phase}}\, L_{\text{phase}}$$ (Formula 16)
where $L_{\text{topo}}$ is the topology binary classification loss and $L_{\text{phase}}$ is the phase multi-class classification loss; $\beta_{\text{topo}}$ is the weight coefficient of the topology recognition task and $\beta_{\text{phase}}$ is the weight coefficient of the phase recognition task, satisfying [condition missing]. Adjusting the ratio of $\beta_{\text{topo}}$ to $\beta_{\text{phase}}$ balances the gradient contributions of topology determination and phase classification and prevents a single task from dominating the network parameter updates during backpropagation. The coefficients are generally taken as [value missing]. The normalized four-channel voltage data is then fed into the constructed LSTM model, the gradient is calculated with the backpropagation algorithm, and the parameters of the dual-gated LSTM model and each weight matrix are iteratively updated. The loss during training and the accuracy on the validation and test sets are shown in Table 5, and the corresponding curves are shown in Figures 5-7.
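The joint loss of Formulas 14 to 16 can be sketched in plain Python as below. The function names and the default task weights of 0.5 each are hypothetical placeholders (the values used in the embodiment are not given in the text), and returning 0 for the phase loss when a batch has no positive sample is an added guard.

```python
import math


def bce_with_logits(logits, labels):
    """Mean binary cross-entropy computed directly from logits."""
    total = 0.0
    for z, y in zip(logits, labels):
        # Stable form of -[y*log(s(z)) + (1-y)*log(1-s(z))].
        total += max(z, 0.0) - z * y + math.log1p(math.exp(-abs(z)))
    return total / len(logits)


def masked_phase_ce(phase_logits, phase_labels, topo_labels):
    """Mean softmax cross-entropy over topologically positive samples only."""
    losses = []
    for scores, phase, topo in zip(phase_logits, phase_labels, topo_labels):
        if topo != 1:
            continue  # negative samples carry phase label -1 and are masked out
        peak = max(scores)
        log_norm = peak + math.log(sum(math.exp(s - peak) for s in scores))
        losses.append(log_norm - scores[phase])
    return sum(losses) / len(losses) if losses else 0.0


def joint_loss(topo_logits, topo_labels, phase_logits, phase_labels,
               w_topo=0.5, w_phase=0.5):
    """Weighted sum of both task losses; the default weights are placeholders."""
    return (w_topo * bce_with_logits(topo_logits, topo_labels)
            + w_phase * masked_phase_ce(phase_logits, phase_labels, topo_labels))
```

The mask means a negative pair contributes to the topology loss but produces no phase gradient, so its placeholder phase label never reaches the softmax.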
[0057] In step 5, the actual voltage time series of the user to be classified and the three-phase voltage time series of the target transformer are extracted simultaneously and concatenated in the feature dimension to form an original input matrix containing 4 channels; the global normalization parameters saved during model training in step 2 are loaded and the original input matrix is normalized; then, according to the preset target time step, the normalized sequence is truncated to the same length or zero-padded at the end to construct a standardized test tensor.
[0058] The test tensor is input into the distribution network topology and phase identification model trained to convergence in steps 3 and 4. Inside the model, the test tensor drives the calculation of physical difference features, the generation of Shapelet temporal attention weights, and the recurrent updates of the dual-gated LSTM memory units. The hidden-layer features at the final time step of the forward and reverse directions of the bidirectional sequence are extracted and concatenated. Finally, the fully connected layers at the end output the topology identification logit (log-odds) and a three-dimensional phase classification score vector.
[0059] The topology identification log-odds are passed through the Sigmoid function to obtain the topology affiliation probability. The optimal classification threshold can be found dynamically by maximizing the F1 score on the validation set. The calculation is as follows:
$$\tau^{*} = \arg\max_{\tau}\ \frac{2\,P(\tau)\,R(\tau)}{P(\tau) + R(\tau)}$$ (Formula 17)
where $P(\tau) = \frac{TP}{TP + FP}$ is the precision obtained when the affiliation probability is thresholded at $\tau$, and $R(\tau) = \frac{TP}{TP + FN}$ is the corresponding recall. TP (True Positives) are samples predicted to be positive that are actually positive; FP (False Positives) are samples predicted to be positive that are actually negative; FN (False Negatives) are samples predicted to be negative that are actually positive; and TN (True Negatives) are samples predicted to be negative that are actually negative.
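The threshold search of Formula 17 can be illustrated with a simple grid scan over candidate thresholds. The grid of 0.01 to 0.99 in steps of 0.01, the tie-breaking rule (the first threshold that reaches the best F1 wins), and the function names are assumptions made for this sketch.

```python
def precision_recall_f1(probs, labels, tau):
    """Precision, recall and F1 when predicting positive for prob > tau."""
    tp = sum(1 for p, y in zip(probs, labels) if p > tau and y == 1)
    fp = sum(1 for p, y in zip(probs, labels) if p > tau and y == 0)
    fn = sum(1 for p, y in zip(probs, labels) if p <= tau and y == 1)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def best_threshold(probs, labels, grid=None):
    """Return the grid threshold with the highest validation F1 and that F1."""
    grid = grid or [k / 100 for k in range(1, 100)]
    best_tau, best_f1 = grid[0], -1.0
    for tau in grid:
        f1 = precision_recall_f1(probs, labels, tau)[2]
        if f1 > best_f1:  # strict comparison keeps the first best threshold
            best_tau, best_f1 = tau, f1
    return best_tau, best_f1


val_probs = [0.1, 0.4, 0.6, 0.9]
val_labels = [0, 0, 1, 1]
print(best_threshold(val_probs, val_labels))
```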
[0060] If the topology affiliation probability is greater than the set classification threshold, the topological relationship of the user to be classified is established and the user is determined to belong to the target transformer substation area. In that case, the argmax function is applied to the phase classification score vector to extract the category index corresponding to the maximum score, and the user's specific access phase is output according to the index mapping.
[0061] If the topology affiliation probability is less than or equal to the classification threshold, the topological relationship of the user to be classified is not established. The user is identified as a non-local or abnormal user, and no phase classification result is output.
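The decision rule of paragraphs [0060] and [0061] reduces to a threshold test followed by an argmax. In the sketch below, the function name `identify`, the returned dictionary layout, and the default threshold of 0.5 are illustrative assumptions; in practice the threshold would come from the validation-set search.

```python
import math

PHASE_NAMES = ("A", "B", "C")  # index mapping for phase labels 0, 1, 2


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def identify(topo_logit, phase_scores, threshold=0.5):
    """Map model outputs to an affiliation decision and, if positive, a phase."""
    probability = sigmoid(topo_logit)
    if probability > threshold:
        # Topology established: pick the phase with the highest score.
        index = max(range(len(phase_scores)), key=lambda k: phase_scores[k])
        return {"belongs": True, "phase": PHASE_NAMES[index], "probability": probability}
    # Topology not established: no phase result is reported.
    return {"belongs": False, "phase": None, "probability": probability}


print(identify(2.0, [0.1, 3.0, -1.0]))
```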
[0062] The performance of the proposed method for identifying transformer substations and phases based on waveform attention dual-gated LSTM is evaluated below.
Database and Experiment Setup
[0063] The experimental hardware consisted of an Intel(R) Core(TM) i9-7920X CPU @ 2.90 GHz, an NVIDIA GeForce RTX 2080 Ti graphics card, and 64 GB of RAM. The program was written in Python 3.9, and the deep learning framework used in this embodiment was PyTorch 2.8.0. The data used in this experiment came from the European low-voltage test feeder data published by the IEEE PES AMPS DSAS Test Feeder Working Group. The feeder topology is radial, with a fundamental frequency of 50 Hz. The head end of the feeder connects to the medium-voltage distribution system through a 400 kVA, 11/0.416 kV distribution transformer at a substation. The user load curves were derived from the 24-hour load curves with 5-minute granularity in the European low-voltage test feeder data, assuming that 30% of users have grid-connected photovoltaic (PV) installations. In the power flow time series, the original power factor was constant at 0.95; because this value is overly idealized, the power factor was instead drawn at random from the range 0.90 to 0.95.
[0064] To evaluate objectively and comprehensively the effectiveness of the method described in this invention in the dual tasks of distribution network topology identification and phase identification, this embodiment uniformly adopts accuracy, precision, recall, and F1 score as quantitative evaluation indicators. Accuracy characterizes the reliability of the model's global classification, i.e., the empirical probability that the model makes correct inferences across the entire data distribution; precision characterizes the confidence of the model's positive predictions; recall characterizes the model's sensitivity in identifying true positive samples; and the F1 score is a comprehensive metric that combines the confidence of positive predictions with global sensitivity. The calculation formulas are as follows:
$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN};\quad \text{Precision} = \frac{TP}{TP + FP};\quad \text{Recall} = \frac{TP}{TP + FN};\quad F1 = \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}$$
[0065] Here, TP (True Positives) are samples predicted to be positive that are actually positive; FP (False Positives) are samples predicted to be positive that are actually negative; FN (False Negatives) are samples predicted to be negative that are actually positive; and TN (True Negatives) are samples predicted to be negative that are actually negative.
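As a small worked example of the four indicators, the sketch below computes them from confusion-matrix counts. The function name `classification_metrics` and the example counts are hypothetical and are not results from the experiments.

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall and F1 from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical counts chosen only to show the arithmetic.
print(classification_metrics(tp=8, fp=2, fn=2, tn=8))
```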
[0066] Given the limited sample size of the original standard topology, a Python script was first used to randomly attach user nodes from the original topology to different locations on the network structure, thereby expanding the topology and deriving variants on a large scale. Real user load time-series data from the dataset was then loaded onto these low-voltage user nodes, and 30% of the users were randomly selected as photovoltaic grid-connected users. Load consumption and photovoltaic generation output models were configured for them simultaneously to simulate the interleaved bidirectional power flow that arises under a high proportion of distributed power source access. Next, the derived topologies, together with the source-load configuration data, were input into the OpenDSS simulation engine to perform quasi-dynamic time-series power flow simulation, and the single-phase voltage time series of each low-voltage user and the three-phase voltage time series of the corresponding transformer low-voltage side were exported. To further simulate the "cross-regional user" errors common in actual distribution areas, 20% of external user data from other independent distribution areas was randomly mixed into the user voltage data of the target distribution area as negative samples for topology identification. All data underwent multi-task joint annotation to generate the final dataset containing topology affiliation labels and phase labels. Finally, the complete dataset was divided into training, testing, and validation sets in a 7:2:1 ratio and input into the model based on the physical difference waveform factor and dual-gated LSTM for iterative training and performance testing, in order to evaluate the model's recognition accuracy and robustness under complex power grid environments and noise interference.
Experimental performance evaluation
[0067] As shown in Figure 3, the dual-gated Shapelet LSTM model proposed in this invention improves accuracy, precision, recall, and F1 score in both the distribution network topology identification task and the phase identification task. This indicates a comprehensive enhancement in the accuracy and robustness of identifying the affiliation of low-voltage transformer substations and their phases, and shows that combining the physical differential waveform factor with the dual-gated mechanism limits the interference caused by similar voltage fluctuations in complex distribution networks. In both the binary topology identification task and the multi-class phase identification task, the overall performance of the dual-gated Shapelet LSTM exceeds that of the traditional Pearson algorithm, the CNN algorithm, and the basic LSTM model. This is because traditional temporal similarity methods and conventional deep learning methods struggle to decouple weak temporal dependencies from physical connectivity features. In the topology recognition task, compared with the basic LSTM model, the Shapelet LSTM of this invention improves accuracy, precision, recall, and F1 by 0.06, 0.13, 0.10, and 0.11, respectively. In the phase recognition task, which is relatively difficult, the basic LSTM model has a recall of only 0.70, resulting in an F1 score of 0.80. This is because conventional networks struggle to capture subtle differences in synchronous fluctuations between the three phases, leading to serious missed detections. The model of this invention overcomes this deficiency. Taking the phase recognition results as an example, the Shapelet LSTM achieves the best performance across all metrics. Compared with the CNN algorithm, its accuracy is improved by 0.14 to 0.99, precision by 0.13 to 0.99, recall by 0.10 to 0.98, and F1 score by 0.11 to 0.98.
[0068] As shown in Figure 4, the model incorporating gated modulation and physical differential features outperforms the model without physical difference (No-Diff LSTM) overall. This is because the physical correlation features in the original voltage sequence are not obvious, making it more difficult for the network to extract key synchronous fluctuation information. In the phase identification task, compared with the model without physical difference, the dual-gated Shapelet LSTM improves accuracy, precision, recall, and F1 by 0.08, 0.06, 0.28, and 0.18, respectively; in the topology identification task, these four metrics improve by 0.06, 0.13, 0.10, and 0.11, respectively. The model without physical difference performs poorly in phase identification because it lacks the prior knowledge carried by the physical difference, which makes it difficult to distinguish small voltage changes between the three phases and results in a high false negative rate. Furthermore, the single-gated modulation model performs better than the model without physical difference but worse than the dual-gated combination, because a single gate cannot simultaneously balance the updating of the temporal state and the accurate preservation of historical information. Taking single forget gate modulation in topology recognition as an example, its precision is high but its recall is a weakness. Compared with it, the dual-gated Shapelet LSTM improves accuracy by 0.03, precision by 0.02, recall by 0.18, and F1 by 0.10.
[0069] The above results demonstrate that the proposed method for identifying transformer substations and phases based on waveform attention dual-gated LSTM in this embodiment offers two approaches to improve the performance of transformer substation relationships and phase identification: introducing physical differential features and constructing a dual-gated network model. Experimental results verify the rationality and effectiveness of these two methods. Compared with other methods and variant models, the method in this embodiment achieves improvements in all four measurement indicators. It can overcome, to some extent, the shortcomings of ordinary methods in incomplete extraction of small synchronous fluctuation features in low-voltage transformer substations and high false negative rates, and has significant reference value in practical applications of transformer substation topology management.
[0070] The foregoing has shown and described the basic principles, main features, and advantages of this embodiment. Those skilled in the art should understand that this embodiment is not limited to the specific embodiments described above. The specific embodiments and descriptions in the specification are merely for further illustrating the principles of this embodiment. Various changes and modifications can be made to this embodiment without departing from the spirit and scope of this embodiment, and all such changes and modifications fall within the scope of this embodiment as claimed. The scope of protection of this embodiment is defined by the claims and their equivalents.
Claims
1. A method for identifying transformer substations and phases based on waveform attention and LSTM, characterized in that, Includes the following steps: Step 1: Data acquisition and establishment of a distribution network voltage sample database By utilizing smart meters on the user side and monitoring terminals on the transformer side in the low-voltage distribution area, the user voltage time series and the transformer three-phase voltage time series are collected synchronously to form a basic voltage sample database. Step 2: Data preprocessing and construction of positive and negative correlation sample pairs Constructing positive and negative correlation sample pairs refers to building a training dataset where user and transformer data belonging to the same transformer substation are labeled as positive samples with a label of 1, and randomly paired user and transformer data from different substations are labeled as negative samples with a label of 0. This process is then simultaneously retained. Step 3: Physical difference sequence calculation and Shapelet waveform factor extraction Based on the principle of the same-state change of the in-phase voltage in the physical circuit, the difference between the user voltage and the three-phase voltage of the transformer is calculated to construct a physical differential voltage sequence. This differential voltage sequence is then input into a time series segment generator, namely a Shapelet generator, which consists of one-dimensional convolutional layers. The self-learning convolutional kernel is used to capture the synchronous morphological features in the voltage fluctuations. After processing by the activation function, a time-series attention weight sequence reflecting the tightness of the physical connection is generated. Step 4, Model Construction and Training of Dual-Gated LSTM A dual-gated LSTM model is constructed, which refers to a long short-term memory network containing dual-gated guiding units. 
The temporal attention weight sequence generated in step 3 is decoupled and mapped into a forget modulation signal and an input modulation signal through two independent linear projection layers, and these two signals are used to dynamically intervene in the forget gate and the input gate of the LSTM unit, respectively. The output of the last time step of the LSTM is extracted as the global sequence feature. Shared high-level features are extracted through a shared fully connected layer and then fed into parallel fully connected layers for topology classification and phase classification. Finally, the dual-gated LSTM model is trained iteratively by jointly optimizing the topology binary classification loss and the phase multi-class classification loss.
Step 5: Online identification and determination of the household-transformer relationship. The voltage time series of the user to be classified and the three-phase voltage time series of the target transformer are input into the trained dual-gated LSTM model. The model first generates the physical difference features internally and performs gated inference, and finally outputs the topology recognition and phase recognition results through the fully connected layers.
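As an illustration of step 3 of claim 1, the following minimal Python sketch computes the physical difference sequence and passes it through one causal one-dimensional convolution followed by a Sigmoid. The function names, the restriction to a single Shapelet, and the padding length of kernel size minus one are assumptions made for the example and are not taken from the claims.

```python
import math


def difference_sequence(user_v, trafo_v):
    """Per-time-step differences between the user voltage and phases A, B, C.

    user_v:  list of T floats (user voltage)
    trafo_v: list of T (A, B, C) tuples (transformer phase voltages)
    """
    return [[u - p for p in phases] for u, phases in zip(user_v, trafo_v)]


def shapelet_attention(diff, kernel, bias=0.0):
    """Causal 1-D convolution over a 3-channel sequence, then Sigmoid.

    kernel: K rows of 3 weights, ordered oldest offset first.
    Assumption: K - 1 zero rows are padded at the start only, so the
    weight at time t depends on steps t-K+1 .. t and never on the future.
    """
    k = len(kernel)
    padded = [[0.0, 0.0, 0.0] for _ in range(k - 1)] + diff
    weights = []
    for t in range(len(diff)):
        window = padded[t:t + k]
        s = bias + sum(
            w * x
            for row_w, row_x in zip(kernel, window)
            for w, x in zip(row_w, row_x)
        )
        weights.append(1.0 / (1.0 + math.exp(-s)))  # value in (0, 1)
    return weights
```

Because the padding is applied only at the start, changing a later sample of the difference sequence leaves all earlier attention weights unchanged.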
2. The method for identifying transformer substations and phases based on waveform attention and LSTM according to claim 1, characterized in that, in step 1, the data collected by the smart meters on the user side and the monitoring terminals on the transformer side in the low-voltage distribution area are voltage data with a sampling interval of 5 minutes, that is, 288 sampling points per day.
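The daily sample count in claim 2 follows directly from the sampling interval. As a worked check:

```latex
\frac{24\ \text{h} \times 60\ \text{min/h}}{5\ \text{min}} = 288
```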
3. The method for identifying transformer substations and phases based on waveform attention and LSTM according to claim 1, characterized in that, in step 2, the improved forward imputation data repair method means: for each missing value, the three most recent valid observations preceding the missing value are assigned different weights and combined by weighted averaging, calculated as follows: $\hat{V}_t = w_1 V_{t-1} + w_2 V_{t-2} + w_3 V_{t-3}$; where $\hat{V}_t$ is the fill value for the missing voltage at the corresponding time; $V_{t-1}$, $V_{t-2}$, $V_{t-3}$ are the three most recent valid observations preceding the missing value; and $w_1$, $w_2$, $w_3$ are the weights corresponding to $V_{t-1}$, $V_{t-2}$, $V_{t-3}$; the cleaned user voltage sequence is merged and aligned with the transformer three-phase voltage sequence; to meet the fixed-length input requirement of the model, a target sequence length threshold $T$ is set: if the sequence length exceeds $T$, the data of the first $T$ time steps are retained; if the sequence length is less than $T$ time steps, zeros are padded at the end of the sequence; the sequence features are then normalized using the Z-Score method to eliminate the absolute amplitude differences between different measurement nodes, calculated as follows: $x' = \frac{x - \mu}{\sigma}$; where $x$ is the original voltage observation, $\mu$ is the mean of the sequence, $\sigma$ is the standard deviation of the sequence, calculated statistically from the actually collected voltage observation data, and $x'$ is the standardized voltage value.
The voltage sequence of a single user is concatenated with the three-phase voltage sequence of the target transformer to form a feature tensor with four channels; user and transformer data combinations belonging to the same distribution network area and feeder are marked as positive samples and assigned a topology binary classification label of 1, and the actual access phase label of the user is extracted and retained: phase A, phase B, or phase C, corresponding to multi-class labels 0, 1, and 2, respectively; randomly selected user and transformer data combinations that do not belong to the same distribution network area are marked as negative samples and assigned a topology binary classification label of 0, and their phase label is set to -1 so that it is ignored in the loss calculation. Finally, all the constructed positive and negative samples are randomly shuffled to generate the final training dataset.
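The preprocessing in claim 3 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the weight values (0.5, 0.3, 0.2), the renormalization when fewer than three earlier observations exist, the use of the population standard deviation, and all function names are hypothetical choices for the example, not values given in the claims.

```python
import statistics

PHASE_INDEX = {"A": 0, "B": 1, "C": 2}


def weighted_forward_fill(values, weights=(0.5, 0.3, 0.2)):
    """Replace None with a weighted average of up to three earlier
    valid observations (newest gets the first weight)."""
    filled, recent = [], []  # recent holds valid observations, newest first
    for v in values:
        if v is None and recent:
            used = weights[:len(recent)]
            filled.append(sum(w * x for w, x in zip(used, recent)) / sum(used))
        else:
            filled.append(v)  # stays None if nothing precedes it
            if v is not None:
                recent = [v] + recent[:2]
    return filled


def fit_length(seq, target_len):
    """Keep the first target_len steps, or zero-pad at the end."""
    return seq[:target_len] + [0.0] * max(0, target_len - len(seq))


def z_score(seq):
    """Standardize a sequence; a constant sequence maps to zeros."""
    mu = statistics.fmean(seq)
    sigma = statistics.pstdev(seq) or 1.0
    return [(x - mu) / sigma for x in seq]


def make_labels(same_area, phase=None):
    """Return (topology label, phase label); phase is -1 for negatives."""
    if same_area:
        return 1, PHASE_INDEX[phase]
    return 0, -1
```

For example, `weighted_forward_fill([220.0, 222.0, 224.0, None])` fills the last entry with 0.5 × 224 + 0.3 × 222 + 0.2 × 220 under the assumed weights.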
4. The method for identifying transformer substations and phases based on waveform attention and LSTM according to claim 1, characterized in that, in step 3, the time series data output by step 2 is taken as the input feature matrix, which includes the user voltage sequence $V_u$ and the transformer three-phase voltage sequences $V_p$, $p \in \{A, B, C\}$; based on the principle that in-phase voltages fluctuate together across a physical connection, the time series difference between the user-side voltage and the transformer three-phase voltage is calculated. This eliminates the absolute amplitude information of the voltage and extracts only the synchronous-fluctuation difference sequence that reflects line topology connectivity, calculated as follows: $\Delta V(t) = [\,V_u(t) - V_A(t),\ V_u(t) - V_B(t),\ V_u(t) - V_C(t)\,]$; where $t$ indicates the current time step; $A$, $B$, $C$ indicate the three phase ports of the transformer; $\Delta V(t)$ is the vector of voltage differences between the user voltage and the three phase voltages on the low-voltage side of the transformer; $V_u(t)$ is the voltage of user $u$ at time $t$; and $V_p(t)$ is the phase-$p$ voltage of the transformer at time $t$; to maintain the causality of the time series and prevent leakage of information from future time steps during subsequent waveform feature extraction, the difference sequence $\Delta V$ is zero-padded at the very beginning of the time dimension, with a padding length of [value missing]; the causally padded three-channel difference sequence is input into a time series segment generator; the generator is essentially a one-dimensional convolutional network that uses one-dimensional self-learning convolution kernels of size $K$ as Shapelet waveform factors to perform sliding feature matching on the difference sequence, thereby capturing the local synchronous morphological features in minute voltage fluctuations; the output feature $s^{(j)}_t$ extracted by the $j$-th Shapelet at time step $t$ is calculated as follows: $s^{(j)}_t = \sum_{p \in \{A,B,C\}} \sum_{k=0}^{K-1} w^{(j)}_{p,k}\, \Delta V_p(t-k) + b^{(j)},\ j = 1, \dots, M$, with $\Delta V_p(\tau) = 0$ at the zero-padded positions before the start of the sequence; where $w^{(j)}_{p,k}$ is the weight of the $j$-th Shapelet for phase channel $p$ at offset position $k$, $b^{(j)}$ is the bias term, and $M$ is the set number of Shapelets; the convolutional features extracted above are mapped non-linearly by the Sigmoid activation function and scaled to the probability interval $(0, 1)$, finally generating a temporal attention weight sequence that reflects the strength of the physical connection and the degree of feature matching, calculated as follows: $a^{(j)}_t = \sigma\big(s^{(j)}_t\big)$.
5. The method for identifying transformer substations and phases based on waveform attention and LSTM according to claim 1, characterized in that, in step 4, at the current time step $t$, the temporal attention weight $a_t$ generated in step 3 is obtained and input into two independent linear projection layers that contain no bias terms, which decouple it into a forget modulation signal $m^{f}_t$ and an input modulation signal $m^{i}_t$, calculated as follows: $m^{f}_t = W_{sf}\, a_t$; $m^{i}_t = W_{si}\, a_t$; where $W_{sf}$ is the Shapelet projection weight matrix specific to the forget gate and $W_{si}$ is the Shapelet projection weight matrix specific to the input gate; $W_{sf}$ and $W_{si}$ are learnable parameter matrices within the dual-gated LSTM model. They are randomly initialized using the Kaiming uniform distribution and, in the subsequent model training phase, are iteratively updated by the backpropagation algorithm according to the error gradient of the joint loss function until the optimal weights are obtained.
The original voltage feature $x_t$ input at the current time step is concatenated with the hidden state $h_{t-1}$ of the previous time step and combined with the forget modulation signal $m^{f}_t$ and the input modulation signal $m^{i}_t$ to calculate, separately, the forget gate $f_t$ and the input gate $i_t$ under attention intervention. The calculation formula is as follows: [formula missing]; where $z_t = [h_{t-1}, x_t]$ is the concatenated vector of the input and the hidden state from the previous time step; $h_{t-1}$ is the feature vector that the recurrent unit calculated and output at the previous time step, carrying historical time-series information; $W_f$ and $b_f$ are the weights and biases of the native forget gate; $W_i$ and $b_i$ are the weights and biases of the native input gate; $\sigma$ is the Sigmoid activation function; $\alpha_f$ and $\alpha_i$ are the learnable modulation coefficients of the forget gate and the input gate, respectively, used to adaptively adjust the degree of intervention of the physical difference features on network memory during training, each with an initial value of 0.05; based on the computational rules of the standard Long Short-Term Memory (LSTM) network, and combined with the calculated forget gate $f_t$ and input gate $i_t$, the cell state $c_t$ and hidden state $h_t$ at the current time step are updated: $\tilde{c}_t = \tanh(W_c z_t + b_c)$; $o_t = \sigma(W_o z_t + b_o)$; $c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t$; $h_t = o_t \odot \tanh(c_t)$; where $\tilde{c}_t$ is the candidate cell state, $W_c$ and $b_c$ are the candidate state weights and biases, $o_t$ is the output gate, $W_o$ and $b_o$ are the output gate weights and biases, $\odot$ represents the Hadamard product, and $\tanh$ is the hyperbolic tangent activation function; the output features of the last time step in the hidden state sequence are mapped through fully connected layers to output the logits for topology recognition and phase recognition, respectively; a joint multi-task loss function is constructed, which includes the topology binary classification loss $L_{topo}$ and the phase multi-class classification loss $L_{phase}$; for determining household-transformer ownership, the binary cross-entropy loss for topology binary classification is calculated as follows: $L_{topo} = -\frac{1}{N} \sum_{i=1}^{N} \big[\, y_i \log \sigma(z_i) + (1 - y_i) \log\big(1 - \sigma(z_i)\big) \big]$; where $N$ is the total number of samples in a single training batch; $y_i$ is the true topology label of the $i$-th sample, with 1 representing a positive sample from the same substation area and 0 representing a negative sample from a different substation area; $z_i$ is the topology logit output by the model for the $i$-th sample; and $\sigma$ is the Sigmoid activation function; for specific phase classification, the multi-class cross-entropy loss is calculated. To prevent the invalid phase labels of negative samples from interfering with the network gradient, an indicator mask mechanism is introduced so that the phase loss is calculated only for positive samples. The cross-entropy loss for phase multi-class classification is therefore calculated as follows: $L_{phase} = -\frac{1}{N_{pos}} \sum_{i=1}^{N} \mathbb{1}(y_i = 1)\, \log \frac{\exp(z_{i,c_i})}{\sum_{k=0}^{C-1} \exp(z_{i,k})}$; where $N_{pos}$ is the number of topologically positive samples in the current batch; $\mathbb{1}(\cdot)$ is an indicator function that takes the value 1 when the condition inside the parentheses is true and 0 otherwise, which ensures that only samples with genuine topological relationships participate in the phase error calculation; $C$ is the total number of phase categories, with a value of 3, corresponding to phases A, B, and C; $c_i$ is the true phase label of a positive sample; and $z_{i,k}$ is the raw logit with which the model predicts that the $i$-th sample belongs to the $k$-th phase; by combining the two losses above and incorporating dynamic task weight coefficients, a total loss function is obtained. Its role is to quantify the model's combined prediction error on topology and phase and to provide the optimization objective for gradient descent in the backpropagation algorithm. It is calculated as follows: $L = \lambda_{topo} L_{topo} + \lambda_{phase} L_{phase}$; where $\lambda_{topo}$ is the weight coefficient for the topology recognition task and $\lambda_{phase}$ is the weight coefficient for the phase recognition task, which satisfy the following condition: [condition missing]; by adjusting $\lambda_{topo}$ and $\lambda_{phase}$, the ratio of gradient contributions between topology determination and phase classification is balanced, so that no single task dominates the update of the network parameters during backpropagation. Subsequently, the normalized four-channel voltage data is fed into the constructed dual-gated LSTM model, and the gradient is calculated through the backpropagation algorithm to iteratively update the parameters of the dual-gated LSTM model and each weight matrix.
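The joint loss described in claim 5 can be illustrated with a small pure-Python sketch. It assumes equal task weights of 0.5 as placeholder values, uses a numerically stable log-softmax, and returns a zero phase loss for a batch with no positive samples; these choices and the function names are assumptions for the example rather than details stated in the claims.

```python
import math


def topology_bce(labels, logits, eps=1e-12):
    """Mean binary cross-entropy over all samples in the batch."""
    total = 0.0
    for y, z in zip(labels, logits):
        p = 1.0 / (1.0 + math.exp(-z))
        total -= y * math.log(p + eps) + (1 - y) * math.log(1.0 - p + eps)
    return total / len(labels)


def masked_phase_ce(topo_labels, phase_labels, phase_logits):
    """Mean cross-entropy over positive samples only (topology label 1).

    Negative samples carry phase label -1 and are skipped entirely.
    """
    losses = []
    for y, c, z in zip(topo_labels, phase_labels, phase_logits):
        if y != 1:
            continue
        m = max(z)
        log_sum_exp = m + math.log(sum(math.exp(v - m) for v in z))
        losses.append(log_sum_exp - z[c])
    return sum(losses) / len(losses) if losses else 0.0


def total_loss(topo_labels, topo_logits, phase_labels, phase_logits,
               w_topo=0.5, w_phase=0.5):
    """Weighted sum of the two task losses (weights are placeholders)."""
    return (w_topo * topology_bce(topo_labels, topo_logits)
            + w_phase * masked_phase_ce(topo_labels, phase_labels, phase_logits))
```

With the mask in place, the phase logits of a negative sample have no effect on the loss value, which matches the stated purpose of the indicator function.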
6. The method for identifying transformer substations and phases based on waveform attention and LSTM according to any one of claims 1-5, characterized in that, in step 5, the actual voltage time series of the user to be classified and the three-phase voltage time series of the target transformer are extracted synchronously and concatenated along the feature dimension to form an original input matrix containing 4 channels; the global normalization parameters saved during model training in step 2 are loaded and used to normalize the original input matrix; then, according to the preset target time step, the normalized sequence is truncated to the target length or zero-padded at the end to construct a standardized test tensor. The test tensor is input into the dual-gated LSTM model for distribution network topology and phase identification that was trained to convergence in steps 3 and 4. Inside the model, the test tensor drives the calculation of the physical difference features, the generation of the Shapelet temporal attention weights, and the recurrent update of the memory cells of the dual-gated LSTM; the hidden-layer features at the final time step of the forward and reverse sequences are extracted and concatenated; finally, the fully connected layers at the end of the network output the topology identification log-odds and a three-dimensional phase classification score vector. The topology identification log-odds are passed through the Sigmoid function to obtain the topology belonging probability; the optimal threshold is found dynamically by maximizing the F1 score on the validation set, calculated as follows: $\theta^{*} = \arg\max_{\theta} \frac{2\,P(\theta)\,R(\theta)}{P(\theta) + R(\theta)}$; where $P(\theta)$ is the precision when the belonging probability threshold is $\theta$, calculated as $P(\theta) = \frac{TP}{TP + FP}$; $R(\theta)$ is the recall when the belonging probability threshold is $\theta$, calculated as $R(\theta) = \frac{TP}{TP + FN}$; TP is a true positive, which is predicted to be positive and is actually positive.
FP is a false positive, which is predicted to be positive but is actually negative; FN is a false negative, which is predicted to be negative but is actually positive; TN is a true negative, which is predicted to be negative and is actually negative; if the topology belonging probability is greater than the set classification threshold $\theta^{*}$, the topological relationship of the user to be classified is deemed to hold, and the user is determined to belong to the target transformer substation area; when the topological relationship holds, the argmax function is applied to the phase classification score vector to extract the category index corresponding to the maximum score, and the user's specific access phase is output according to the index mapping; if the topology belonging probability is less than or equal to the classification threshold $\theta^{*}$, the topological relationship of the user to be classified is deemed not to hold, the user is identified as a non-local or abnormal user, and no phase classification result is output.
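The threshold search and decision rule of claim 6 can be sketched as follows. The candidate grid of thresholds from 0.01 to 0.99, the tie-breaking toward the smallest threshold, and the function names are assumptions for the example; the claim only specifies that the threshold maximizes F1 on the validation set.

```python
def f1_at(threshold, probs, labels):
    """F1 score when samples with probability > threshold are positive."""
    tp = sum(1 for p, y in zip(probs, labels) if p > threshold and y == 1)
    fp = sum(1 for p, y in zip(probs, labels) if p > threshold and y == 0)
    fn = sum(1 for p, y in zip(probs, labels) if p <= threshold and y == 1)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def best_threshold(probs, labels, candidates=None):
    """Return the candidate threshold with the highest validation F1."""
    if candidates is None:
        candidates = [i / 100 for i in range(1, 100)]
    return max(candidates, key=lambda t: f1_at(t, probs, labels))


def decide(topo_prob, phase_scores, threshold):
    """Return 'A', 'B' or 'C' for a local user, or None otherwise."""
    if topo_prob > threshold:
        idx = max(range(len(phase_scores)), key=lambda i: phase_scores[i])
        return "ABC"[idx]
    return None  # non-local or abnormal user: no phase result
```

A user whose belonging probability does not exceed the threshold gets no phase output, regardless of the phase scores.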