A neural network-based power distribution network abnormal data detection and repair method
By combining convolutional neural networks and fuzzy self-organizing neural networks, efficient anomaly detection and repair of distribution network data is achieved, solving the problem of low efficiency in traditional methods and improving data quality and operational efficiency.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- STATE GRID ECONOMIC TECH RES INST CO LTD
- Filing Date
- 2024-08-13
- Publication Date
- 2026-06-26
AI Technical Summary
Traditional methods for diagnosing and repairing data anomalies in power distribution networks rely on expert experience, which is inefficient and fails to meet the demands of modern industry for speed and accuracy.
Anomaly detection is performed using a convolutional neural network, combined with a fuzzy self-organizing neural network for data repair. Anomaly data is generated by a variational autoencoder, the detection threshold is adjusted using a particle swarm optimization algorithm, and the time and location of anomalies are determined by K-means clustering analysis.
It improves the efficiency of data processing in the distribution network and the accuracy of anomaly detection and repair, enhances data quality, saves construction costs, and supports the efficient operation of the distribution network.
Figure CN119179940B_ABST
Abstract
Description
Technical Field
[0001] This application relates to the field of distribution network data detection and repair, and in particular to a method for detecting and repairing abnormal distribution network data based on neural networks.
Background Technology
[0002] Smart distribution networks are an important component of the national power grid structure and play a driving role in national economic development. Traditional distribution systems have relatively low growth rates in source and load data, limited equipment types, and simple structures.
[0003] With the large-scale integration of new energy equipment, electricity data has undergone significant changes. Not only has its structure and types become more diverse, but the scale of electricity data has also grown rapidly, exceeding hundreds of millions of records. Accurate and reliable source-load data enables future predictions and provides strong data support for power production, distribution, and dispatch decisions, which is crucial for the efficient operation of the distribution network. Traditional methods for diagnosing and repairing data anomalies mainly rely on expert experience and the subjective judgment of technicians. These methods are limited by human experience and knowledge, and are inefficient, failing to meet the demands of modern industry for speed and accuracy.
[0004] Therefore, how to efficiently identify and repair abnormal data in the distribution network in order to improve the data quality of the distribution network is an urgent problem to be solved.
Summary of the Invention
[0005] This application provides a method for detecting and repairing abnormal data in distribution networks based on neural networks. It can detect anomalies in distribution network data based on convolutional neural networks and repair data based on fuzzy self-organizing neural networks, thereby improving data quality, providing a data foundation for the construction and operation of distribution networks, improving operational efficiency and saving construction costs.
[0006] To achieve the above objectives, the present invention adopts the following technical solution:
[0007] In a first aspect, the present invention provides a method for detecting and repairing abnormal data in a power distribution network based on a neural network, the method comprising:
[0008] Collect abnormal day data of the distribution network, train a variational autoencoder using the abnormal day data, and then generate multiple sets of abnormal data using the trained encoder;
[0009] Based on the daily load curves of the multiple sets of abnormal data, the multiple sets of abnormal data are normalized to obtain load data.
[0010] For the load data on the same day, a convolutional neural network is used to determine the probability that the load data is abnormal, and the probability is classified by a preset adaptive detection threshold. Based on the classification result, it is determined whether the load data is abnormal.
[0011] For the load data that is judged to be abnormal, cluster analysis is used to determine the time when the load data became abnormal, and the abnormal time is obtained.
[0012] Based on the abnormal time, the hyperparameters of the fuzzy self-organizing neural network are determined. Then, all normal values in the load data of the day are input into the fuzzy self-organizing neural network to obtain the repair results of the abnormal data.
[0013] In a preferred embodiment of this application, the convolutional neural network may be further configured to include a first convolutional layer and a second convolutional layer, both of which are one-dimensional convolutional operation structures, and a pooling layer that performs one-dimensional max pooling operation.
[0014] In a preferred embodiment of this application, the step of classifying probabilities using a preset adaptive detection threshold includes:
[0015] The particle swarm optimization algorithm is used to adaptively adjust and optimize multiple detection thresholds.
[0016] In a preferred embodiment of this application, the method may be further configured to determine the time when the load data anomalies occur through cluster analysis, including:
[0017] The time when the load data became abnormal is determined using the k-means clustering algorithm.
[0018] In a preferred embodiment of this application, the step of determining the time when the load data judged to be abnormal occurs through cluster analysis to obtain the abnormal time includes:
[0019] The power curve of the abnormal load data is used as the input to the model, with n sampling points per day:
[0020] ;
[0021] Randomly select k points as the initial cluster centers, and calculate the Euclidean distance from each point to each cluster center:
[0022] ;
[0023] Take the index of the nearest cluster center as the category to which the sample belongs. The cluster centers are then updated according to the current clustering results so that the average distance from each center to all points in its cluster is minimized, satisfying the following formula:
[0024] ;
[0025] In the formula, the ave function calculates the average of all values in a set, the result is the updated cluster center, and a total of m update steps are performed.
[0026] Repeat the Euclidean distance calculation and the cluster center update until the iteration limit is reached or the result converges. Then output the clustering results for the abnormal days of the load data, and determine the time of the abnormality from the category to obtain the abnormal time.
[0027] In a preferred example of this application, the process can be further configured such that, based on the abnormal time, the hyperparameters of the fuzzy self-organizing neural network are determined, and then all normal values from the load data of that day are input into the fuzzy self-organizing neural network to obtain the abnormal data repair result, including:
[0028] Based on the identification result of the abnormal time, the n-dimensional normal data are selected from the day's load data as the input data for the input layer;
[0029] A Gaussian membership function is used as the fuzzy membership function layer, and h membership rules are set. The center and width of the Gaussian membership function are both learnable parameters, and the output of this layer is given by the following equation:
[0030] ;
[0031] In the rule layer, all inputs are integrated into fuzzy rules: each neuron represents one fuzzy rule, and the number of neurons equals the number of fuzzy rules. The output of this layer is computed as follows:
[0032] ;
[0033] The selection layer uses the Sigmoid function and learnable parameters to select fuzzy rules and reduce their dimensionality. The output of this layer is computed as follows:
[0034] ;
[0035] The defuzzification layer classifies the membership functions and achieves defuzzification through the mean. The output of this layer is computed as follows:
[0036] ;
[0037] A fully connected layer is used as the output layer, which forms the m-dimensional output as a weighted sum of the fuzzy rules. The output of this layer is taken as the repair result and is computed as follows:
[0038] ;
[0039] In the formula, the coefficients of the output layer are learnable parameters, and m is the dimension of the repair result.
[0040] In a second aspect, this application provides a distribution network anomaly data detection and repair device based on neural networks, the device comprising:
[0041] The data acquisition module is used to collect abnormal day data of the distribution network, train the variational autoencoder with the abnormal day data, and then generate multiple sets of abnormal data with the trained encoder.
[0042] The data processing module is used to normalize the multiple sets of abnormal data based on the daily load curves of the multiple sets of abnormal data to obtain load data.
[0043] The anomaly determination module is used to determine the probability that the load data is abnormal for the same day's load data through a convolutional neural network, and classify the probability through a preset adaptive detection threshold. Based on the classification result, it determines whether the load data is abnormal. For the load data that is determined to be abnormal, cluster analysis is used to determine the time when the load data is abnormal, and the abnormal time is obtained.
[0044] The anomaly repair module is used to determine the hyperparameters of the fuzzy self-organizing neural network based on the anomaly time, and then input all the normal values in the load data of the day into the fuzzy self-organizing neural network to obtain the repair result of the anomaly data.
[0045] In a preferred embodiment of this application, the device may be further configured such that the convolutional neural network includes a first convolutional layer and a second convolutional layer, both of which are one-dimensional convolutional operation structures, and a pooling layer that performs one-dimensional max pooling operation.
[0046] In a preferred embodiment of this application, the device may be further configured such that classifying probabilities using a preset adaptive detection threshold includes:
[0047] The particle swarm optimization algorithm is used to adaptively adjust and optimize multiple detection thresholds.
[0048] In a preferred embodiment of this application, the apparatus may be further configured such that, for the load data determined to be abnormal, the time when the abnormality occurred is determined through cluster analysis to obtain the abnormality time, including:
[0049] The power curve of the abnormal load data is used as the input to the model, with n sampling points per day:
[0050] ;
[0051] Randomly select k points as the initial cluster centers, and calculate the Euclidean distance from each point to each cluster center:
[0052] ;
[0053] Take the index of the nearest cluster center as the category to which the sample belongs. The cluster centers are then updated according to the current clustering results so that the average distance from each center to all points in its cluster is minimized, satisfying the following formula:
[0054] ;
[0055] In the formula, the ave function calculates the average of all values in a set, the result is the updated cluster center, and a total of m update steps are performed.
[0056] Repeat the Euclidean distance calculation and the cluster center update until the iteration limit is reached or the result converges. Then output the clustering results for the abnormal days of the load data, and determine the time of the abnormality from the category to obtain the abnormal time.
[0057] In a preferred embodiment of this application, the device may be further configured to determine the hyperparameters of the fuzzy self-organizing neural network based on the abnormal time, and then input all normal values from the load data of the day into the fuzzy self-organizing neural network to obtain the repair result of the abnormal data, including:
[0058] Based on the identification result of the abnormal time, the n-dimensional normal data are selected from the day's load data as the input data for the input layer;
[0059] A Gaussian membership function is used as the fuzzy membership function layer, and h membership rules are set. The center and width of the Gaussian membership function are both learnable parameters, and the output of this layer is given by the following equation:
[0060] ;
[0061] In the rule layer, all inputs are integrated into fuzzy rules: each neuron represents one fuzzy rule, and the number of neurons equals the number of fuzzy rules. The output of this layer is computed as follows:
[0062] ;
[0063] The selection layer uses the Sigmoid function and learnable parameters to select fuzzy rules and reduce their dimensionality. The output of this layer is computed as follows:
[0064] ;
[0065] The defuzzification layer classifies the membership functions and achieves defuzzification through the mean. The output of this layer is computed as follows:
[0066] ;
[0067] A fully connected layer is used as the output layer, which forms the m-dimensional output as a weighted sum of the fuzzy rules. The output of this layer is taken as the repair result and is computed as follows:
[0068] ;
[0069] In the formula, the coefficients of the output layer are learnable parameters, and m is the dimension of the repair result.
[0070] In a third aspect, this application provides a computer device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor executes the computer program to implement the steps of the neural network-based distribution network abnormal data detection and repair method as described in any of the preceding claims.
[0071] In a fourth aspect, this application provides a computer-readable storage medium storing a program, wherein when the program is executed by a processor, it implements the neural network-based distribution network anomaly data detection and repair method as described in any of the preceding claims.
[0072] In a fifth aspect, this application provides a computer program product, including computer instructions that, when executed by a processor, implement the steps of the neural network-based distribution network anomaly data detection and repair method as described in any of the preceding claims.
[0073] In summary, compared with the prior art, the beneficial effects of the technical solution provided in this application include at least the following:
[0074] This application provides a method for detecting and repairing abnormal data in power distribution networks based on neural networks. It can detect anomalies in power distribution network data based on convolutional neural networks and repair data based on fuzzy self-organizing neural networks. This improves the efficiency of processing power distribution network data and the accuracy of anomaly detection and repair, enhances data quality, provides a data foundation for the construction and operation of power distribution networks, improves operational efficiency, and saves construction costs.
Description of the Drawings
[0075] Figure 1 is a flowchart of a method for detecting and repairing abnormal data in a power distribution network based on a neural network, provided in one embodiment of this application.
[0076] Figure 2 is a structure diagram of the convolutional neural network used in the method provided in one embodiment of this application.
[0077] Figure 3 is a structure diagram of the fuzzy self-organizing neural network used in the method provided in one embodiment of this application.
[0078] Figure 4 shows load curves with a clear winter variation trend, for the method provided in one embodiment of this application.
[0079] Figure 5 shows load curves with a clear summer variation trend, for the method provided in one embodiment of this application.
[0080] Figure 6 shows the descent curve of the loss function during encoder training, for the method provided in one embodiment of this application.
[0081] Figure 7 compares generated data with original data, taking the abnormal load curve of one day as an example, for the method provided in one embodiment of this application.
[0082] Figure 8 is a structure diagram of the convolutional neural network used in the method provided in one embodiment of this application.
[0083] Figure 9 is a load curve diagram for the method provided in one embodiment of this application.
[0084] Figure 10 is a load curve diagram for the method provided in one embodiment of this application.
[0085] Figure 11 is a load curve diagram for the method provided in one embodiment of this application.
[0086] Figure 12 shows the descent curve of the loss function value of the fuzzy self-organizing neural network, for the method provided in one embodiment of this application.
[0087] Figure 13 shows the repair result of the method provided in one embodiment of this application.
[0088] Figure 14 is a block diagram of a distribution network anomaly data detection and repair device based on a neural network, provided in one embodiment of this application.
Detailed Implementation
[0089] The technical solutions of the embodiments of this application will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of this application, and not all embodiments. Based on the embodiments of this application, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of this application.
[0090] In one embodiment of this application, a method for detecting and repairing abnormal data in a power distribution network based on a neural network is provided. As shown in Figure 1, the method includes:
[0091] S100: Collect abnormal daily data of the distribution network, train a variational autoencoder using the abnormal daily data, and then generate multiple sets of abnormal data using the trained encoder;
[0092] S200: Based on the daily load curves of the multiple sets of abnormal data, the multiple sets of abnormal data are normalized to obtain load data;
[0093] S300: For the load data on the same day, a convolutional neural network is used to determine the probability that the load data is abnormal, and a preset adaptive detection threshold is used to classify the probability. Based on the classification result, it is determined whether the load data is abnormal.
[0094] S400: For the load data that is judged to be abnormal, the time when the load data is abnormal is determined by cluster analysis to obtain the abnormal time;
[0095] S500: Based on the abnormal time, determine the hyperparameters of the fuzzy self-organizing neural network, and then input all the normal values in the load data of the day into the fuzzy self-organizing neural network to obtain the repair result of the abnormal data.
[0096] In practice, the method includes the following steps, which are executed sequentially:
[0097] Step S11, data collection, including:
[0098] As a generative model, variational autoencoders require abnormal day data as input and generate a set of similar data to expand the amount of abnormal data and improve the training accuracy of the model.
[0099] The input is n-dimensional data. The encoder maps it to k sets of normally distributed data, and the encoder structure is as follows:
[0100] (1)
[0101] Here, the two encoder outputs represent the mean and variance of each normal distribution, the layer coefficients are learnable parameters, the formula also specifies the number of hidden layers, and ReLU is the activation function.
[0102] Let Z be the randomly generated normally distributed data, which has k dimensions. For the normally distributed data Z, a decoder is used to generate data, and the result is required to be as similar as possible to the input result. The decoder structure is as follows:
[0103] (2)
[0104] Here, the decoder output has the same size as the input. As in the encoder, the layer coefficients are learnable parameters, the formula specifies the number of hidden layers, and ReLU is the activation function.
[0105] Since Z is randomly generated during data generation, any number of data sets can be generated after model training and used as the multiple sets of abnormal data. The generator thus produces additional abnormal data, providing a foundation for abnormal data analysis.
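The generation step in S11 can be illustrated with a forward pass through a small encoder and decoder. This is only a sketch, not formulas (1) and (2), which are not reproduced in this text: the single hidden layer, the log-variance parameterization, the layer sizes, and all function and variable names are assumptions, and the weights are random rather than trained.

```python
import math
import random

def relu(v):
    return [max(0.0, a) for a in v]

def dense(v, weights, bias):
    """Affine map; weights holds one row per output unit."""
    return [sum(w * a for w, a in zip(row, v)) + b for row, b in zip(weights, bias)]

def init_layer(n_in, n_out, rng):
    """Small random weights; a trained model would supply these instead."""
    weights = [[rng.gauss(0.0, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

def encode(x, params):
    """Map an n-dimensional day curve to the mean and log-variance of k normals."""
    h = relu(dense(x, *params["enc_hidden"]))
    return dense(h, *params["enc_mean"]), dense(h, *params["enc_log_var"])

def decode(z, params):
    """Map a k-dimensional latent vector back to an n-dimensional curve."""
    h = relu(dense(z, *params["dec_hidden"]))
    return dense(h, *params["dec_out"])

def generate(params, k, count, rng):
    """Draw Z ~ N(0, I) and decode each draw into a synthetic curve."""
    return [decode([rng.gauss(0.0, 1.0) for _ in range(k)], params) for _ in range(count)]

rng = random.Random(0)
n, hidden, k = 24, 16, 4          # 24 hourly samples; hidden and k are assumed sizes
params = {
    "enc_hidden": init_layer(n, hidden, rng),
    "enc_mean": init_layer(hidden, k, rng),
    "enc_log_var": init_layer(hidden, k, rng),
    "dec_hidden": init_layer(k, hidden, rng),
    "dec_out": init_layer(hidden, n, rng),
}
x = [rng.random() for _ in range(n)]          # placeholder abnormal-day curve
mean, log_var = encode(x, params)
# Reparameterization: z = mean + std * eps, with eps ~ N(0, 1).
z = [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0) for m, lv in zip(mean, log_var)]
reconstruction = decode(z, params)
synthetic = generate(params, k, count=10, rng=rng)
```

Training would fit the weights so that `reconstruction` stays close to `x`; after that, only `generate` is needed to enlarge the abnormal data set.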
[0106] Step S12, data preprocessing, including:
[0107] First, refer to the collected data, analyze the power change curves (i.e., load curves) when the nodes are operating normally, summarize the variation patterns of load curves in different seasons, and extract and analyze the daily load curves.
[0108] Because power varies significantly over time within a single day, common normalization methods become insensitive to outliers due to the large range of variation. Therefore, based on the m known daily variation patterns of the nodes, the power coefficients at the n sampling times within a day are summarized so that, for each variation pattern, the average power at each sampling time satisfies the following formula, in which one value can be chosen arbitrarily without affecting the normalization result:
[0109] (3)
[0110] Based on the power coefficients and the following formula, the power values under all m variation patterns are modified, and all subsequent data in this step are the modified values. The formula relates the original value for each day to the modified value:
[0111] (4)
[0112] Based on existing data and in conjunction with the safe operation rules of the distribution network, the maximum and minimum power values during normal operation are determined, and the following normalization method is defined for all power values:
[0113] (5)
[0114] Let P represent the normalized power value. Unless otherwise specified, the normalized value, i.e., the load data, is used in S13 to S15.
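The normalization in S12 can be illustrated as follows. Formula (5) is not reproduced in this text, so the standard min-max form is assumed here; the function name, the readings, and the normal-operation limits are hypothetical, and the power-coefficient correction of formulas (3) and (4) is omitted. Because the limits describe normal operation only, abnormal readings can fall outside [0, 1].

```python
def normalize(power, p_min, p_max):
    """Min-max scale using the normal-operation range (assumed form of the normalization)."""
    if p_max <= p_min:
        raise ValueError("p_max must be greater than p_min")
    span = p_max - p_min
    return [(p - p_min) / span for p in power]

# Hypothetical readings and a hypothetical normal-operation range.
day = [120.0, 150.0, 300.0, 80.0, 420.0]
load = normalize(day, p_min=100.0, p_max=400.0)
# The readings 80.0 and 420.0 lie outside the normal range,
# so their normalized values fall below 0 and above 1.
```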
[0115] Step S13, Data anomaly identification based on convolutional neural network:
[0116] The normalized data for each day is input into a convolutional neural network, which outputs a value in [0,1] representing the probability that the data for that day contains outliers.
[0117] The input is the power data at the n sampling times of the same day:
[0118] (6)
[0119] The designed convolutional neural network ultimately outputs the anomaly probability. The specific network structure is shown in Figure 2.
[0120] Convolutional layers I and II have the same structure, both involving one-dimensional convolution operations:
[0121] (7)
[0122] For each convolutional layer, the formula involves the input dimension, the output dimension, and the kernel size; * denotes the discrete convolution operation, and the kernel is a learnable matrix.
[0123] The pooling layer employs one-dimensional max pooling:
[0124] (8)
[0125] The formula involves the pooling kernel size, the pooling stride, and the output dimension.
[0126] The fully connected layer reduces the data to a single value for subsequent probability calculations:
[0127] (9)
[0128] The formula involves the input and output dimensions of the layer; the output dimension is generally 1, and the layer coefficients are learnable parameters.
[0129] Finally, the model output is obtained through the activation function:
[0130] (10)
[0131] The final output indicates the probability that the data for the day is abnormal, and is used in conjunction with S14 to implement the overall detection process.
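The layer sequence in S13 can be sketched as a plain forward pass: two one-dimensional convolutions, one-dimensional max pooling, a fully connected layer, and an activation that yields a probability. Formulas (7) to (10) are not reproduced in this text, so several details below are assumptions: the ReLU after each convolution, the sigmoid as the final activation, the channel counts, the kernel and pooling sizes, and all names. The convolution is written in the sliding dot-product form used by common deep learning libraries, and the weights are random rather than trained.

```python
import math
import random

def conv1d(x, kernels, bias):
    """Valid 1-D convolution. x: [channels][length]; kernels: [out][in][k]."""
    k = len(kernels[0][0])
    length = len(x[0]) - k + 1
    out = []
    for kern, b in zip(kernels, bias):
        row = []
        for t in range(length):
            s = b
            for c, channel in enumerate(x):
                s += sum(kern[c][j] * channel[t + j] for j in range(k))
            row.append(max(0.0, s))  # ReLU (assumed nonlinearity)
        out.append(row)
    return out

def max_pool1d(x, size, stride):
    """1-D max pooling applied to each channel independently."""
    return [[max(ch[t:t + size]) for t in range(0, len(ch) - size + 1, stride)] for ch in x]

def fully_connected(x, weights, bias):
    """Flatten all channels and reduce them to a single value."""
    flat = [v for ch in x for v in ch]
    return sum(w * v for w, v in zip(weights, flat)) + bias

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def rand_kernels(n_out, n_in, k, rng):
    return [[[rng.uniform(-0.5, 0.5) for _ in range(k)] for _ in range(n_in)]
            for _ in range(n_out)]

def anomaly_probability(day, params):
    h = conv1d([day], params["k1"], params["b1"])     # convolutional layer I
    h = conv1d(h, params["k2"], params["b2"])         # convolutional layer II
    h = max_pool1d(h, size=2, stride=2)               # pooling layer
    return sigmoid(fully_connected(h, params["w"], params["b"]))

rng = random.Random(0)
params = {
    "k1": rand_kernels(4, 1, 3, rng), "b1": [0.0] * 4,
    "k2": rand_kernels(8, 4, 3, rng), "b2": [0.0] * 8,
}
# Length 24 -> 22 -> 20 after two valid convolutions, -> 10 after pooling; 8 * 10 = 80.
params["w"] = [rng.uniform(-0.1, 0.1) for _ in range(80)]
params["b"] = 0.0
day = [rng.random() for _ in range(24)]               # placeholder normalized day
p = anomaly_probability(day, params)
```

With trained weights, `p` would be the per-day anomaly probability that S14 compares against the detection threshold.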
[0132] Step S14, perform adaptive detection threshold adjustment, including:
[0133] The output of S13 is the probability that the current data is abnormal, a value within the range [0,1]. An adaptively adjusted detection threshold is required to perform binary classification of this value:
[0134] (11)
[0135] The adjustment method employs a particle swarm optimization algorithm, and the detection threshold is optimized using a validation set. The model trained in S13 is applied to the validation set to obtain the anomaly probability for each day, and the model accuracy at a given detection threshold is calculated. The mapping from detection threshold to model accuracy is denoted by the following formula:
[0136] (12)
[0137] Randomly select n thresholds to form the initial particle swarm. Each threshold is updated and optimized within m steps using the following formula:
[0138] (13)
[0139] In the formula, the update velocity of each particle has upper and lower limits; two learning factors are used; rand is a random number within the range [0,1]; and the two best-position terms are the best threshold found by each individual particle and the best threshold found by all particles, respectively. The optimality criterion is maximum accuracy. S13 and S14 together enable the analysis of whether the daily load data is abnormal.
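The threshold search in S14 can be sketched with a one-dimensional particle swarm. Formula (13) is not reproduced in this text, so the textbook velocity update is assumed; the inertia weight `w`, the learning factors `c1` and `c2`, the velocity bound `v_max`, the "probability at or above threshold means abnormal" rule, and the validation data are all hypothetical.

```python
import random

def accuracy(threshold, probs, labels):
    """Share of days whose thresholded probability matches the label (1 = abnormal)."""
    hits = sum((p >= threshold) == bool(y) for p, y in zip(probs, labels))
    return hits / len(labels)

def pso_threshold(probs, labels, n_particles=10, steps=30,
                  w=0.6, c1=1.5, c2=1.5, v_max=0.2, seed=0):
    rng = random.Random(seed)
    pos = [rng.random() for _ in range(n_particles)]    # candidate thresholds
    vel = [0.0] * n_particles
    best_pos = pos[:]                                   # best threshold per particle
    best_acc = [accuracy(t, probs, labels) for t in pos]
    g = max(range(n_particles), key=lambda i: best_acc[i])
    g_pos, g_acc = best_pos[g], best_acc[g]             # best threshold of the swarm
    for _ in range(steps):
        for i in range(n_particles):
            v = (w * vel[i]
                 + c1 * rng.random() * (best_pos[i] - pos[i])
                 + c2 * rng.random() * (g_pos - pos[i]))
            vel[i] = max(-v_max, min(v_max, v))         # bounded update velocity
            pos[i] = max(0.0, min(1.0, pos[i] + vel[i]))
            acc = accuracy(pos[i], probs, labels)
            if acc > best_acc[i]:
                best_pos[i], best_acc[i] = pos[i], acc
                if acc > g_acc:
                    g_pos, g_acc = pos[i], acc
    return g_pos, g_acc

# Hypothetical validation outputs: detector probabilities and true labels.
probs = [0.05, 0.10, 0.15, 0.20, 0.80, 0.85, 0.90, 0.95]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
threshold, acc = pso_threshold(probs, labels)
```

Accuracy is piecewise constant in the threshold, so many thresholds tie; this sketch keeps the first best value it encounters.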
[0140] Step S15, identify abnormal areas, including:
[0141] When repairing data, the anomaly type is rarely used as the basis for method design. Instead, the method design is based more on the location of the anomaly. Different locations of anomalies will affect the network parameters during repair.
[0142] The abnormal parts of the data are identified by clustering the daily load data marked as abnormal, using the K-means clustering algorithm.
[0143] First, based on the analysis results of the abnormal patterns, the power curve of the abnormal day is used as the input to the model, with n sampling points per day:
[0144] (14)
[0145] Randomly select k points as the initial cluster centers, and calculate the Euclidean distance from each point to each cluster center:
[0146] (15)
[0147] Take the index of the nearest cluster center as the category to which the sample belongs. Then update the cluster center positions based on the current clustering results so that the average distance from each center to all points in its cluster is minimized, which satisfies the following formula:
[0148] (16)
[0149] In the formula, the ave function calculates the average of all values in a set, the result is the updated cluster center, and a total of m steps are performed. The Euclidean distance calculation and the cluster center update are repeated until the iteration limit is reached or the result converges. This step outputs the clustering results for the abnormal days and determines the time of the anomaly from the category, which serves as the basis for data repair in S16.
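The clustering loop in S15 follows the standard K-means procedure and can be sketched as below. Each sample is taken to be one abnormal day's curve, which is an interpretation of the text; the function names, the empty-cluster handling, and the toy curves (4 samples per day instead of 24) are assumptions for illustration.

```python
import math
import random

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def k_means(curves, k, max_steps=100, seed=0):
    rng = random.Random(seed)
    centers = [list(c) for c in rng.sample(curves, k)]   # random initial centers
    labels = [0] * len(curves)
    for _ in range(max_steps):
        # Assignment: the index of the nearest center is the sample's category.
        labels = [min(range(k), key=lambda j: euclidean(c, centers[j])) for c in curves]
        # Update: each center becomes the average of the curves assigned to it.
        new_centers = []
        for j in range(k):
            members = [c for c, lab in zip(curves, labels) if lab == j]
            if members:
                new_centers.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centers.append(centers[j])           # keep the center of an empty cluster
        if new_centers == centers:                       # converged
            break
        centers = new_centers
    return labels, centers

# Hypothetical abnormal days: a spike at sampling index 1 or at sampling index 3.
curves = [
    [0.2, 1.6, 0.2, 0.2], [0.3, 1.5, 0.2, 0.3],
    [0.2, 0.2, 0.3, 1.7], [0.3, 0.2, 0.2, 1.6],
]
labels, centers = k_means(curves, k=2)
```

Days with a spike at the same sampling time end up in the same category, which is what allows a category to be mapped to an abnormal time.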
[0150] Step S16, perform abnormal data repair, including:
[0151] Abnormal data repair primarily references the normal values from all sampling times within the same day, without using specific outliers or outlier types as the basis for repair. That is, based on the recognition results in S15, the day's sampled data is divided into n-dimensional normal data, used as input, and the m-dimensional repair result for the abnormal data, produced as output. The repair method uses a fuzzy self-organizing neural network, and the specific process is shown in Figure 3.
[0152] The input layer takes an n-dimensional vector as input.
[0153] The fuzzy membership function layer uses a Gaussian membership function and sets h membership rules. The center and width of the Gaussian membership function are both learnable parameters, and the output of this layer is:
[0154] (17)
[0155] The rule layer integrates all inputs into fuzzy rules: each neuron represents one fuzzy rule, and the number of neurons equals the number of fuzzy rules. The output of this layer is:
[0156] (18)
[0157] The selection layer enables the selection and dimensionality reduction of fuzzy rules through the sigmoid function and learnable parameters. The output of this layer is:
[0158] (19)
[0159] The defuzzification layer classifies the membership functions and achieves defuzzification through the mean. The output of this layer is:
[0160] (20)
[0161] The output layer is a fully connected layer that performs a weighted summation of the fuzzy rules to produce the m-dimensional output. Its coefficients are learnable parameters, and the output of this layer is:
[0162] (21)
[0163] The final output of the model is taken as the repair result.
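The layer sequence in S16 can be sketched as a forward pass. Formulas (17) to (21) are not reproduced in this text, so this is a generic Gaussian-membership fuzzy network rather than the patented structure. The product form of the rule layer, the linear map inside the selection layer, the normalization used for defuzzification, the sizes `h` and `g`, and all names are assumptions, and the parameters are random rather than trained.

```python
import math
import random

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def fuzzy_forward(x, p):
    h = len(p["center"])
    # Membership and rule layers: each rule multiplies the Gaussian memberships
    # of all n inputs, written here as the exponential of a sum.
    rule = [
        math.exp(-sum(((xi - c) / s) ** 2
                      for xi, c, s in zip(x, p["center"][j], p["width"][j])))
        for j in range(h)
    ]
    # Selection layer: sigmoid gate that reduces h rules to g selected rules.
    selected = [
        sigmoid(sum(w * r for w, r in zip(row, rule)) + b)
        for row, b in zip(p["sel_w"], p["sel_b"])
    ]
    # Defuzzification (assumed form): scale the selected strengths to sum to 1.
    total = sum(selected)
    defuzz = [s / total for s in selected]
    # Output layer: fully connected weighted sum giving the m repaired values.
    return [sum(w * d for w, d in zip(row, defuzz)) + b
            for row, b in zip(p["out_w"], p["out_b"])]

def init_params(n, h, g, m, rng):
    """Random parameters for illustration; training would learn all of them."""
    def u():
        return rng.uniform(-1.0, 1.0)
    return {
        "center": [[rng.random() for _ in range(n)] for _ in range(h)],
        "width": [[1.0] * n for _ in range(h)],
        "sel_w": [[u() for _ in range(h)] for _ in range(g)],
        "sel_b": [0.0] * g,
        "out_w": [[u() for _ in range(g)] for _ in range(m)],
        "out_b": [0.0] * m,
    }

rng = random.Random(0)
n, m = 22, 2              # e.g. 24 hourly samples of which 2 were flagged abnormal
params = init_params(n, h=6, g=3, m=m, rng=rng)
normal_values = [rng.random() for _ in range(n)]      # placeholder normal samples
repaired = fuzzy_forward(normal_values, params)
```

The input and output sizes depend on how many sampling times S15 flags, which is why the abnormal time determines the network's hyperparameters.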
[0164] In this embodiment, anomaly detection of distribution network data can be performed based on convolutional neural networks, and data repair can be performed based on fuzzy self-organizing neural networks. This improves the efficiency of processing distribution network data and the accuracy of anomaly detection and repair, enhances data quality, provides a data foundation for the construction and operation of the distribution network, improves operational efficiency, and saves construction costs.
[0165] In some embodiments, the power data of a node in a local low-carbon power distribution network is used as a benchmark, with a sampling interval of 1 hour and a total of 360 sampling cycles, to verify the feasibility of the abnormal data detection and repair method described in Embodiment 1, as detailed below:
[0166] Step S21, generate abnormal data, including:
[0167] The collected datasets show two main power change trends: winter trends and summer trends. Figure 4 shows load curves with a clear winter variation trend, and Figure 5 shows load curves with a clear summer variation trend.
[0168] Using the 24-hour change curve of abnormal data in the collected data as input, 10 sets of similar data are generated for each day's abnormal data to increase the amount of abnormal data. Figures 6 and 7 show, respectively, the loss function descent curve during encoder training and a comparison between generated and original data, using the abnormal load curve of a certain day as an example. The specific model parameters are as follows:
[0169] (22)
[0170] In Figure 7, the points represent the original data and the line represents the generated data. Figure 7 shows that the two follow the same trend and that the model has a certain generalization ability.
[0171] Step S22, perform abnormal data detection, including:
[0172] Based on the results of S21, the feasibility of the abnormal data detection method described in S13 and S14 of Embodiment 1 is verified as follows.
[0173] First, all load data are normalized according to the formula described in S12. Because the extreme values used in that formula are taken from normal operation, the normalized results can contain values greater than 1 or less than 0. The normalized data is then divided into a training set, a validation set, and a test set.
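A short example shows why normalized values can leave the range from 0 to 1. It assumes that the formula of S12 is ordinary min-max scaling, which is not confirmed by this text, and uses made-up load values; `min_max_normalize` is a hypothetical name.

```python
def min_max_normalize(values, lo, hi):
    """Scale values so that lo maps to 0 and hi maps to 1."""
    return [(v - lo) / (hi - lo) for v in values]

# Bounds are taken from normal operation only.
normal_history = [30.0, 42.0, 55.0, 61.0, 48.0]
lo, hi = min(normal_history), max(normal_history)

# A day containing one abnormally high and one abnormally low reading.
day = [35.0, 58.0, 75.0, 12.0]
scaled = min_max_normalize(day, lo, hi)
```

The readings 75.0 and 12.0 lie outside the normal bounds, so their scaled values fall above 1 and below 0 respectively, while the normal readings stay inside the unit interval.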
[0174] Second, the convolutional neural network of S13, whose structure is shown in Figure 8, performs classification and recognition, and its parameters are trained on the training set. In this step, the parameters of each network layer are selected as follows:
[0175] (23)
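The structure of the detection network, two one-dimensional convolutional layers followed by one-dimensional max pooling, can be sketched as a forward pass. The layer parameters of (23) are not available in this text, so the filter counts, kernel width, ReLU activations, final dense layer, and random untrained weights below are all assumptions chosen only to make the shapes concrete.

```python
import math
import random

def conv1d(x, kernels, bias):
    """Valid 1-D convolution. x: channels x length; kernels: out x in x width."""
    width = len(kernels[0][0])
    length = len(x[0]) - width + 1
    return [[b + sum(k[c][j] * x[c][t + j]
                     for c in range(len(x)) for j in range(width))
             for t in range(length)]
            for k, b in zip(kernels, bias)]

def relu(x):
    return [[max(0.0, v) for v in row] for row in x]

def max_pool1d(x, size):
    """Non-overlapping 1-D max pooling along the time axis."""
    return [[max(row[i:i + size]) for i in range(0, len(row) - size + 1, size)]
            for row in x]

def cnn_probability(curve, p):
    """Two conv layers, one max-pooling layer, then a dense layer with sigmoid."""
    h = relu(conv1d([curve], p["k1"], p["b1"]))
    h = relu(conv1d(h, p["k2"], p["b2"]))
    h = max_pool1d(h, 2)
    flat = [v for row in h for v in row]
    logit = p["b_out"] + sum(w * v for w, v in zip(p["w_out"], flat))
    return 1.0 / (1.0 + math.exp(-logit))

rng = random.Random(0)

def rand(*shape):
    """Nested list of small random weights with the given shape."""
    if len(shape) == 1:
        return [rng.uniform(-0.5, 0.5) for _ in range(shape[0])]
    return [rand(*shape[1:]) for _ in range(shape[0])]

# Hypothetical sizes: 4 then 8 filters of width 3 on a 24-point daily curve.
# Lengths go 24 -> 22 -> 20 -> 10 after pooling, so the dense layer sees 8 * 10 values.
params = {"k1": rand(4, 1, 3), "b1": rand(4),
          "k2": rand(8, 4, 3), "b2": rand(8),
          "w_out": rand(8 * 10), "b_out": 0.0}
prob = cnn_probability([0.5] * 24, params)   # probability that the day is abnormal
```

With trained weights, `prob` would be the per-day abnormality probability that is later compared with the detection threshold.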
[0176] Finally, the particle swarm optimization algorithm of S14 is combined with the output of the convolutional neural network, and the detection threshold is selected adaptively on the validation set. The method achieves an anomaly detection accuracy of 99.805%: in the test set, only one set of abnormal data went undetected, and all others were detected correctly. The resulting detection threshold is 0.4729, and the threshold of every convolutional neural network output takes this value:
[0177] (24)
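The adaptive threshold selection can be sketched as a one-dimensional particle swarm search that maximizes validation accuracy. This is a simplified illustration rather than the procedure of S14: it optimizes a single threshold, the inertia and acceleration coefficients are common textbook defaults, and the validation probabilities and labels are made up. The names `accuracy` and `pso_threshold` are hypothetical.

```python
import random

def accuracy(threshold, probs, labels):
    """Share of samples whose thresholded probability matches the label."""
    hits = sum((p >= threshold) == bool(y) for p, y in zip(probs, labels))
    return hits / len(labels)

def pso_threshold(probs, labels, n_particles=20, n_iter=50,
                  inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """One-dimensional particle swarm search for a threshold in [0, 1]."""
    rng = random.Random(seed)
    pos = [rng.random() for _ in range(n_particles)]
    vel = [0.0] * n_particles
    best_pos = pos[:]                                    # personal bests
    best_fit = [accuracy(p, probs, labels) for p in pos]
    g = max(range(n_particles), key=best_fit.__getitem__)
    g_pos, g_fit = best_pos[g], best_fit[g]              # global best
    for _ in range(n_iter):
        for i in range(n_particles):
            r1, r2 = rng.random(), rng.random()
            vel[i] = (inertia * vel[i]
                      + c1 * r1 * (best_pos[i] - pos[i])
                      + c2 * r2 * (g_pos - pos[i]))
            pos[i] = min(1.0, max(0.0, pos[i] + vel[i]))  # keep inside [0, 1]
            fit = accuracy(pos[i], probs, labels)
            if fit > best_fit[i]:
                best_pos[i], best_fit[i] = pos[i], fit
            if fit > g_fit:
                g_pos, g_fit = pos[i], fit
    return g_pos, g_fit

# Hypothetical validation set: network probabilities and true labels (1 = abnormal).
val_probs = [0.05, 0.12, 0.21, 0.30, 0.60, 0.68, 0.75, 0.83, 0.90, 0.97]
val_labels = [0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
threshold, val_accuracy = pso_threshold(val_probs, val_labels)
```

Because accuracy is piecewise constant in the threshold, any value between the highest normal probability and the lowest abnormal probability is equally good here. Optimizing several thresholds at once would only change the particle positions from scalars to vectors.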
[0178] Step S23, perform abnormal time location, including:
[0179] The feasibility of the abnormal time location method described in S15 of Embodiment 1 is verified below, using the normalized data from S22.
[0180] The K-means clustering algorithm is used to cluster all data marked as abnormal. The abnormal days are clustered to determine the time periods of the anomalies, which facilitates the selection of hyperparameters during data repair. In this step, k = 15 is selected. Figures 9 to 11 show load curves of different categories. The category in Figure 9 has anomalies at 15:00 and 21:00, the category in Figure 10 has anomalies at 18:00 and 19:00, and the category in Figure 11 has anomalies at 3:00, 12:00, and 13:00.
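The clustering step can be sketched with a plain k-means loop: random initial centers, assignment by Euclidean distance, and an update of each center to the mean of its members. The synthetic curves, the choice k = 2 (the embodiment uses k = 15), and the helper `peak_hour`, which reads an anomaly time off a cluster center, are assumptions made for illustration; the text does not state how the time is derived from a category.

```python
import math
import random

def kmeans(curves, k, n_iter=100, seed=0):
    """Plain k-means on daily curves; returns (labels, centers)."""
    rng = random.Random(seed)
    centers = [list(c) for c in rng.sample(curves, k)]   # k random curves as centers
    labels = [-1] * len(curves)
    for _ in range(n_iter):
        # Assign every curve to its nearest center (Euclidean distance).
        new_labels = [min(range(k), key=lambda j: math.dist(x, centers[j]))
                      for x in curves]
        if new_labels == labels:          # assignments stable: converged
            break
        labels = new_labels
        # Move each center to the mean of the curves assigned to it.
        for j in range(k):
            members = [x for x, lab in zip(curves, labels) if lab == j]
            if members:                   # keep the old center if a cluster is empty
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers

def peak_hour(center):
    """Hour at which a cluster center deviates most from its own median level."""
    median = sorted(center)[len(center) // 2]
    return max(range(len(center)), key=lambda h: abs(center[h] - median))

def day_with_anomaly(hour, size):
    """Hypothetical normalized 24-point curve with a spike at one hour."""
    return [0.5 + (size if h == hour else 0.0) for h in range(24)]

curves = ([day_with_anomaly(13, 1.0 + 0.01 * i) for i in range(5)]
          + [day_with_anomaly(18, 1.0 + 0.01 * i) for i in range(5)])
labels, centers = kmeans(curves, k=2)
anomaly_hours = [peak_hour(c) for c in centers]
```

Days with the same anomaly time end up in the same cluster, so one repair model per cluster can be configured with the matching input and output dimensions.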
[0181] Step S24, perform abnormal data repair, including:
[0182] Based on the abnormal time location results of S23, the feasibility of the abnormal data repair method described in S16 of Embodiment 1 is verified as follows.
[0183] Based on the fuzzy self-organizing neural network designed in S16 and the anomaly time localization results in S23, the abnormal data is repaired. Since the repair results need to be compared with normal values, both the training set and the test set are composed of normal datasets, and the repair results are verified by simulating abnormal times. In different embodiments, except for the different input and output dimensions, other model structures and hyperparameters are the same, and the initial number of fuzzy rules is set to 200.
[0184] This step simulates anomalies in the node power data at 13:00 on some days. Figure 12 and Figure 13 show the descent of the loss function during neural network training and the repair results. In this step, the input is defined as the 23 sets of data for all nodes on the current day, excluding the power value at the abnormal time, and the output is the repaired data at that time. 73 fuzzy rules were actually used, and the repair error rate was 4.404%. The other indicators, a mean absolute error of 5.2537 and a root mean square error of 6.8597, indicate a certain degree of repair effectiveness.
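The three indicators can be computed as in the following sketch. The text does not define the repair error rate, so it is assumed here to be the mean absolute percentage error; the true and repaired values are made up and do not reproduce the reported figures. `repair_metrics` is a hypothetical name.

```python
import math

def repair_metrics(true_values, repaired_values):
    """Return (MAE, RMSE, mean absolute percentage error in percent)."""
    n = len(true_values)
    errors = [r - t for t, r in zip(true_values, repaired_values)]
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mape = 100.0 * sum(abs(e) / abs(t) for e, t in zip(errors, true_values)) / n
    return mae, rmse, mape

# Hypothetical true and repaired power values at 13:00 on four days.
true_1300 = [100.0, 120.0, 80.0, 200.0]
repaired_1300 = [104.0, 114.0, 82.0, 190.0]
mae, rmse, mape = repair_metrics(true_1300, repaired_1300)
```

The root mean square error is never smaller than the mean absolute error, and a large gap between the two points to a few large repair errors rather than uniformly small ones.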
[0185] This application also provides a distribution network anomaly data detection and repair device based on neural networks. As shown in Figure 14, the device includes:
[0186] The data acquisition module 100 is used to collect abnormal day data of the distribution network, train a variational autoencoder with the abnormal day data, and then generate multiple sets of abnormal data with the trained encoder.
[0187] The data processing module 200 is used to normalize the multiple sets of abnormal data based on the daily load curves of the multiple sets of abnormal data to obtain load data;
[0188] The anomaly determination module 300 is used to determine, through a convolutional neural network, the probability that the load data of the same day is abnormal, to classify the probability using a preset adaptive detection threshold, and to determine from the classification result whether the load data is abnormal. For load data determined to be abnormal, cluster analysis is used to determine the time at which the load data became abnormal, yielding the abnormal time.
[0189] The anomaly repair module 400 is used to determine the hyperparameters of the fuzzy self-organizing neural network based on the anomaly time, and then input all the normal values in the load data of the day into the fuzzy self-organizing neural network to obtain the repair result of the anomaly data.
[0190] The functions of each module in the above-mentioned distribution network anomaly data detection and repair device based on neural networks correspond to the steps in the above-mentioned distribution network anomaly data detection and repair method embodiment based on neural networks. Their functions and implementation processes will not be described in detail here.
[0191] This application also provides a computer device, including a memory, a processor, and a computer program stored in the memory and executable on the processor. When the processor executes the computer program, it implements the steps of the neural network-based distribution network abnormal data detection and repair method as described in any of the above embodiments.
[0192] This application also provides a computer-readable storage medium storing a program. The computer-readable storage medium refers to a data storage medium, which may include, but is not limited to, floppy disks, optical disks, hard disks, flash memory, USB flash drives, and / or Memory Sticks. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or other programmable devices. The working process, details, and technical effects of the computer-readable storage medium provided in this embodiment can be found in the above embodiment regarding a method for detecting and repairing abnormal data in a power distribution network based on a neural network, and will not be repeated here.
[0193] The application also provides a computer program product, including computer instructions that, when executed by a processor, implement the steps of the neural network-based distribution network anomaly data detection and repair method as described in any of the above embodiments.
[0194] Those skilled in the art will understand that all or part of the processes in the methods of the above embodiments can be implemented by a computer program instructing related hardware. The computer program can be stored in a non-volatile computer-readable storage medium, and when executed, it can include the processes of the embodiments of the above methods. Any references to memory, storage, databases, or other media used in the embodiments provided in this application can include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in various forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
[0195] The technical features of the above embodiments can be combined arbitrarily. For the sake of brevity, not all possible combinations of the technical features in the above embodiments are described. However, as long as the combination of these technical features does not contradict each other, it should be considered within the scope of this specification. The above embodiments only illustrate several implementation methods of this application, and their descriptions are relatively specific and detailed, but they should not be construed as limiting the scope of the invention patent. It should be noted that for those skilled in the art, several modifications and improvements can be made without departing from the concept of this application, and these all fall within the protection scope of this application. Therefore, the protection scope of this patent application should be determined by the appended claims.
Claims
1. A method for detecting and repairing abnormal data in a power distribution network based on a neural network, characterized in that the method comprises:
collecting abnormal day data of the distribution network, training a variational autoencoder with the abnormal day data, and generating multiple sets of abnormal data with the trained encoder;
normalizing the multiple sets of abnormal data based on their daily load curves to obtain load data;
for the load data of the same day, determining through a convolutional neural network the probability that the load data is abnormal, classifying the probability using a preset adaptive detection threshold, and determining from the classification result whether the load data is abnormal;
for the load data judged to be abnormal, determining through cluster analysis the time at which the load data became abnormal, to obtain the abnormal time;
determining the hyperparameters of a fuzzy self-organizing neural network based on the abnormal time, and inputting all normal values of the day's load data into the fuzzy self-organizing neural network to obtain the abnormal data repair result, including:
based on the identification result of the abnormal time, determining n-dimensional normal data from the total load data as the input data of the input layer;
using the Gaussian membership function as the fuzzy membership function layer and setting h membership rules, where the center and the width of the Gaussian membership function are both learnable parameters;
in the rule layer, integrating all inputs into fuzzy rules, where each neuron represents one fuzzy rule and the number of neurons is the number of fuzzy rules;
in the selection layer, using the Sigmoid function and learnable parameters to perform selection and dimensionality reduction of the fuzzy rules;
in the fuzzification layer, classifying the membership functions and achieving defuzzification through the mean;
using a fully connected layer as the output layer, applying a weighted summation of the fuzzy rules to produce the m-dimensional output, and taking the output of this layer as the repair result, where the weight and bias terms of the layer are learnable parameters and m is the dimension of the repair result.
2. The method for detecting and repairing abnormal data in a distribution network based on a neural network according to claim 1, characterized in that the convolutional neural network includes a first convolutional layer and a second convolutional layer, both of which are one-dimensional convolution structures, and a pooling layer that performs one-dimensional max pooling.
3. The method for detecting and repairing abnormal data in a distribution network based on a neural network according to claim 1, characterized in that classifying the probability using a preset adaptive detection threshold includes: using the particle swarm optimization algorithm to adaptively adjust and optimize multiple detection thresholds.
4. The method for detecting and repairing abnormal data in a distribution network based on a neural network according to claim 1, characterized in that determining through cluster analysis the time at which the load data became abnormal includes: determining the time at which the load data became abnormal using the k-means clustering algorithm.
5. The method for detecting and repairing abnormal data in a distribution network based on a neural network according to claim 1, characterized in that, for the load data judged to be abnormal, determining through cluster analysis the time at which the load data became abnormal, to obtain the abnormal time, includes:
taking the power curve of the abnormal load data as the input of the model, where the number of sampling points per day is n;
randomly selecting k center points as the cluster centers;
calculating the Euclidean distance from each point to each cluster center;
taking the index of the nearest cluster center as the category to which the sample belongs;
updating the clustering result so as to minimize the average distance from the center point to all points in the current cluster, where the ave function calculates the average of all values in a set to give the updated cluster center, and the total number of updates is m steps;
repeating the Euclidean distance calculation and the cluster center update until the iterations end or the result converges; and
outputting the clustering result of the abnormal days of the load data, and determining the time of the abnormality from the category, to obtain the abnormal time.
6. A distribution network anomaly data detection and repair device based on neural networks, characterized in that the device comprises:
a data acquisition module, used to collect abnormal day data of the distribution network, train a variational autoencoder with the abnormal day data, and generate multiple sets of abnormal data with the trained encoder;
a data processing module, used to normalize the multiple sets of abnormal data based on their daily load curves to obtain load data;
an anomaly determination module, used to determine, through a convolutional neural network, the probability that the load data of the same day is abnormal, to classify the probability using a preset adaptive detection threshold, and to determine from the classification result whether the load data is abnormal, and, for the load data determined to be abnormal, to determine through cluster analysis the time at which the load data became abnormal, to obtain the abnormal time;
an anomaly repair module, used to determine the hyperparameters of a fuzzy self-organizing neural network based on the abnormal time, and to input all normal values of the day's load data into the fuzzy self-organizing neural network to obtain the abnormal data repair result, including:
based on the identification result of the abnormal time, determining n-dimensional normal data from the total load data as the input data of the input layer;
using the Gaussian membership function as the fuzzy membership function layer and setting h membership rules, where the center and the width of the Gaussian membership function are both learnable parameters;
in the rule layer, integrating all inputs into fuzzy rules, where each neuron represents one fuzzy rule and the number of neurons is the number of fuzzy rules;
in the selection layer, using the Sigmoid function and learnable parameters to perform selection and dimensionality reduction of the fuzzy rules;
in the fuzzification layer, classifying the membership functions and achieving defuzzification through the mean;
using a fully connected layer as the output layer, applying a weighted summation of the fuzzy rules to produce the m-dimensional output, and taking the output of this layer as the repair result, where the weight and bias terms of the layer are learnable parameters and m is the dimension of the repair result.
7. The distribution network anomaly data detection and repair device based on neural networks according to claim 6, characterized in that the convolutional neural network includes a first convolutional layer and a second convolutional layer, both of which are one-dimensional convolution structures, and a pooling layer that performs one-dimensional max pooling.
8. The distribution network anomaly data detection and repair device based on neural networks according to claim 6, characterized in that classifying the probability using a preset adaptive detection threshold includes: using the particle swarm optimization algorithm to adaptively adjust and optimize multiple detection thresholds.
9. The distribution network anomaly data detection and repair device based on neural networks according to claim 6, characterized in that, for the load data judged to be abnormal, determining through cluster analysis the time at which the load data became abnormal, to obtain the abnormal time, includes:
taking the power curve of the abnormal load data as the input of the model, where the number of sampling points per day is n;
randomly selecting k center points as the cluster centers;
calculating the Euclidean distance from each point to each cluster center;
taking the index of the nearest cluster center as the category to which the sample belongs;
updating the clustering result so as to minimize the average distance from the center point to all points in the current cluster, where the ave function calculates the average of all values in a set to give the updated cluster center, and the total number of updates is m steps;
repeating the Euclidean distance calculation and the cluster center update until the iterations end or the result converges; and
outputting the clustering result of the abnormal days of the load data, and determining the time of the abnormality from the category, to obtain the abnormal time.
10. A computer device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that, when the processor executes the computer program, it implements the steps of the distribution network anomaly data detection and repair method based on neural networks as described in any one of claims 1 to 5.
11. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a program which, when executed by a processor, implements the method for detecting and repairing abnormal data in a power distribution network based on a neural network as described in any one of claims 1 to 5.
12. A computer program product comprising computer instructions, characterized in that, when executed by a processor, the computer instructions implement the steps of the method described in any one of claims 1 to 5.