A data anomaly detection method and system for a power distribution network voltage monitoring point
By employing a two-layer progressive detection method, including over-limit detection and median deviation method, the problem of inaccurate detection results of abnormal data at distribution network voltage monitoring points is solved, enabling efficient identification and accurate screening of different types of abnormal data.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- GUANGDONG POWER GRID CO LTD
- Filing Date
- 2023-04-28
- Publication Date
- 2026-06-30
AI Technical Summary
Existing methods for detecting data anomalies at voltage monitoring points in power distribution networks suffer from inaccurate results, particularly in the difficulty of effectively detecting smaller anomalies, leading to significant errors.
A two-layer progressive detection method is adopted. First, zero and null values are removed and over-limit detection is performed. Then, the interquartile range criterion is used to screen out abnormal data with large dispersion. Finally, the median deviation method is used to identify low-to-medium degree abnormal data. Combining identification methods for different outlier types improves detection accuracy.
The method effectively eliminates interference items, improves the accuracy of data anomaly detection, identifies different types of outliers, reduces errors, and provides an accurate basis for judging anomalies.
Figure CN116298698B_ABST
Description
Technical Field
[0001] This invention relates to the field of data anomaly detection technology, and in particular to data anomaly detection at voltage monitoring points in power distribution networks.
Background Technology
[0002] Distribution network voltage monitoring points refer to nodes that monitor distribution network voltage values and assess voltage quality. They typically represent the general voltage level of the distribution network. With the continuous improvement of the digitalization and intelligence of the power grid, the data on substation busbars, feeders, distribution transformers, and user voltages monitored at distribution network voltage monitoring points are becoming increasingly abundant.
[0003] However, in reality, when voltage monitoring terminals transmit data to automated measurement and back-end management systems, the voltage data collected may contain missing values, null values, outliers, or other anomalies due to equipment or network failures, abnormal events, and human factors. Therefore, anomaly detection at distribution network voltage monitoring points has become an essential step.
[0004] Currently, although there are various methods for detecting abnormal values at voltage monitoring points in power distribution networks, the types of abnormal values are numerous, and when detecting different types of abnormal values, the presence of larger abnormal values often prevents the effective detection of smaller abnormal values. This results in generally large errors in the detection results, making it difficult to provide accurate and effective judgments on the occurrence of abnormal situations at voltage monitoring points.
Summary of the Invention
[0005] This invention provides a method and system for detecting data anomalies at voltage monitoring points in power distribution networks, which solves the problem of inaccurate results from current methods for detecting data anomalies at voltage monitoring points in power distribution networks.
[0006] The first aspect of this invention provides a method for detecting data anomalies at voltage monitoring points in a power distribution network, characterized by comprising the following steps:
[0007] Voltage data from distribution network voltage monitoring points are acquired according to a preset sampling frequency to obtain a voltage dataset.
[0008] The first dataset to be detected is obtained by removing zero and null values from the voltage dataset.
[0009] Over-limit detection is performed on the first dataset to be detected to obtain an over-limit value dataset.
[0010] The over-limit value dataset is subjected to first-level detection based on the interquartile range criterion, and data with abnormal detection results in the first-level detection are removed to obtain the second dataset to be detected.
[0011] The second-level detection is performed on the second dataset to be detected using the median deviation method to obtain the first abnormal dataset.
[0012] Specifically, the step of performing over-limit detection on the first dataset to be detected to obtain the over-limit value dataset is as follows:
[0013] Obtain the preset voltage management upper limit percentage, the preset voltage management lower limit percentage, and the distribution network voltage level;
[0014] The voltage management upper limit is calculated based on the voltage management upper limit percentage and the distribution network voltage level.
[0015] The lower voltage management limit is calculated based on the voltage management lower limit percentage and the distribution network voltage level.
[0016] Data in the first dataset to be detected that are greater than the voltage management upper limit or less than the voltage management lower limit are identified as over-limit data, and all over-limit data together form the over-limit value dataset.
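The over-limit step above can be sketched in a few lines of Python. This is only an illustrative sketch: the description states that the limits are calculated from the percentages and the voltage level but does not give the expression, so the form nominal × (1 ± percentage / 100) is an assumption, as are the function names `voltage_limits` and `over_limit` and the ±7 % figures used in the example.

```python
def voltage_limits(nominal, upper_pct, lower_pct):
    """Return (lower, upper) voltage management limits.

    Assumption: limits are nominal * (1 -/+ pct / 100); the source only says
    they are derived from the percentages and the voltage level.
    """
    upper = nominal * (1 + upper_pct / 100.0)
    lower = nominal * (1 - lower_pct / 100.0)
    return lower, upper


def over_limit(data, lower, upper):
    """Collect the readings that lie above the upper or below the lower limit."""
    return [v for v in data if v > upper or v < lower]


# Hypothetical 10 kV feeder with illustrative +/-7 % management limits.
lower, upper = voltage_limits(10.0, 7.0, 7.0)
readings = [10.1, 9.2, 10.8, 10.0, 11.5]
print(over_limit(readings, lower, upper))
```

Readings inside the band are left out of the over-limit value dataset, which is the input to the first-level detection.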
[0017] Specifically, the first-level detection of the over-limit value dataset based on the interquartile range criterion involves:
[0018] Obtain the preset upper limit abnormality coefficient and the preset lower limit abnormality coefficient;
[0019] Divide the data in the over-limit value dataset into four equal parts, and select the first quartile data and the third quartile data;
[0020] The abnormal data exceeding the upper limit are determined based on the first quartile data, the third quartile data, and the upper limit abnormality coefficient;
[0021] The abnormal data that exceeds the lower limit is determined based on the first quartile data, the third quartile data, and the lower limit abnormality coefficient.
[0022] The abnormal data exceeding the upper limit and the abnormal data exceeding the lower limit are marked as data with abnormal detection results.
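A minimal Python sketch of the first-level detection follows. It assumes the conventional interquartile fences, Q3 + k_upper × IQR and Q1 − k_lower × IQR, which the description does not spell out; the function name `iqr_outliers`, the default coefficients of 1.5, and the sample data are likewise hypothetical.

```python
import statistics


def iqr_outliers(data, k_upper=1.5, k_lower=1.5):
    """Split data into (abnormal, remaining) using interquartile fences.

    Assumption: fences are Q3 + k_upper * IQR and Q1 - k_lower * IQR.
    """
    q1, _, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    high = q3 + k_upper * iqr   # values above this exceed the upper fence
    low = q1 - k_lower * iqr    # values below this exceed the lower fence
    abnormal = [v for v in data if v > high or v < low]
    remaining = [v for v in data if low <= v <= high]
    return abnormal, remaining


# Hypothetical over-limit value dataset with one widely dispersed reading.
over_limit_values = [10.8, 10.9, 10.85, 10.95, 11.0, 10.9, 10.88, 14.0]
abnormal, second_dataset = iqr_outliers(over_limit_values)
print(abnormal, len(second_dataset))
```

The `remaining` list plays the role of the second dataset to be detected: the widely dispersed values have been removed, so smaller deviations stand out in the next stage.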
[0023] Specifically, the second-level detection of the second dataset to be detected based on the median deviation method includes the following steps:
[0024] S1: Obtain the preset first anomaly judgment coefficient, the preset second anomaly judgment coefficient, and the preset scaling factor constant;
[0025] S2: Starting from the first data point of the data to be detected, set a sliding window with a preset data length;
[0026] S3: Arrange the data in the sliding window in ascending order, and determine the median of the data in the sliding window after the arrangement;
[0027] S4: Calculate the absolute median difference data based on the median of the data within the sliding window and the data within the sliding window;
[0028] S5: Arrange the absolute median difference data from smallest to largest, and determine the median of the arranged absolute median difference data;
[0029] S6: Determine the abnormal data in the sliding window based on the median of the data in the sliding window, the first anomaly judgment coefficient, the second anomaly judgment coefficient, the scale factor constant, and the median of the absolute median difference data;
[0030] S7: Slide the sliding window by the preset data length and repeat steps S3-S6 until all data in the second dataset to be detected have been checked.
[0031] Specifically, determining the window abnormal data within the sliding window based on the median of the data within the sliding window, the first anomaly judgment coefficient, the second anomaly judgment coefficient, the scaling factor constant, and the median of the absolute median difference data involves:
[0032] The first window abnormal data within the sliding window are determined based on the median of the data within the sliding window, the first anomaly judgment coefficient, the scaling factor constant, and the median of the absolute median difference data.
[0033] The second window abnormal data within the sliding window are determined based on the median of the data within the sliding window, the second anomaly judgment coefficient, the scaling factor constant, and the median of the absolute median difference data.
[0034] The first window abnormal data and the second window abnormal data are identified as the window abnormal data.
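Steps S1-S7 and the two-coefficient judgment can be sketched in Python as below. Several points are assumptions rather than statements from the description: the decision rule |x − median| > k × c × MAD (the usual median absolute deviation test), the scaling constant c = 1.4826, non-overlapping windows that advance by exactly one window length, and the choice to report as second window abnormal data only the points that pass the second coefficient but not the first. The function names are hypothetical, and the default coefficients 3 and 2 are illustrative.

```python
import statistics


def mad_window_outliers(window, k1=3.0, k2=2.0, scale=1.4826):
    """Return (first, second) abnormal data for one sliding window (S3-S6)."""
    med = statistics.median(window)                 # S3: median of the window
    abs_dev = [abs(v - med) for v in window]        # S4: absolute median differences
    mad = statistics.median(abs_dev)                # S5: median of those differences
    spread = scale * mad                            # scaled robust spread estimate
    # S6: compare each deviation against the two thresholds.
    first = [v for v, d in zip(window, abs_dev) if d > k1 * spread]
    second = [v for v, d in zip(window, abs_dev) if k2 * spread < d <= k1 * spread]
    return first, second


def sliding_mad(data, length, k1=3.0, k2=2.0, scale=1.4826):
    """Apply the window test over the whole dataset (S2 and S7)."""
    first_all, second_all = [], []
    for start in range(0, len(data), length):       # the last window may be shorter
        first, second = mad_window_outliers(data[start:start + length], k1, k2, scale)
        first_all.extend(first)
        second_all.extend(second)
    return first_all, second_all


# Hypothetical second dataset to be detected, processed as a single window.
values = [10.0, 10.1, 9.9, 10.0, 10.05, 9.95, 10.0, 10.5, 10.0, 10.2]
print(sliding_mad(values, length=10))
```

If every value in a window is identical, the spread is zero and any differing point would be flagged; a production implementation would likely need a guard for that case.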
[0035] Another aspect of the present invention provides a data anomaly detection system for voltage monitoring points in a power distribution network, comprising: a sampling module, a first dataset acquisition module, an over-limit detection module, a first-level detection module, and a second-level detection module, wherein:
[0036] The sampling module is used to acquire voltage data from voltage monitoring points in the power distribution network according to a preset sampling frequency, and obtain a voltage dataset.
[0037] The first dataset acquisition module is connected to the sampling module and is used to remove zero-value data and null-value data from the voltage dataset to obtain the first dataset to be detected.
[0038] The over-limit detection module is connected to the first dataset acquisition module, and performs over-limit detection on the first dataset to be detected to obtain the over-limit value dataset;
[0039] The first-level detection module is connected to the over-limit detection module and is used to perform first-level detection on the over-limit value dataset according to the interquartile range criterion, and remove data with abnormal detection results in the first-level detection to obtain the second dataset to be detected;
[0040] The second-level detection module is connected to the first-level detection module and is used to perform second-level detection on the second dataset to be detected according to the median deviation method to obtain the first abnormal dataset.
[0041] Specifically, the over-limit detection module includes: a first acquisition unit, an upper limit calculation unit, a lower limit calculation unit, and a first judgment unit;
[0042] The first acquisition unit is used to acquire a preset voltage management upper limit percentage, a preset voltage management lower limit percentage, and a distribution network voltage level;
[0043] The upper limit calculation unit is connected to the first acquisition unit and is used to calculate the voltage management upper limit based on the voltage management upper limit percentage and the distribution network voltage level.
[0044] The lower limit calculation unit is connected to the first acquisition unit and is used to calculate the lower limit of voltage management based on the voltage management lower limit percentage and the distribution network voltage level.
[0045] The first judgment unit is connected to the upper limit calculation unit and the lower limit calculation unit respectively, and is used to identify the data in the first dataset to be detected that are greater than the voltage management upper limit or less than the voltage management lower limit as over-limit data, and to collect all over-limit data to obtain the over-limit value dataset.
[0046] Specifically, the first-level detection module includes: an anomaly coefficient acquisition unit, a quartile selection unit, an out-of-limit anomaly data determination unit, and a marking unit;
[0047] The anomaly coefficient acquisition unit is used to acquire the upper limit anomaly coefficient and the lower limit anomaly coefficient;
[0048] The quartile selection unit is used to divide the data in the over-limit value dataset into four equal parts and select the first quartile data and the third quartile data;
[0049] The out-of-limit abnormal data determination unit is connected to the anomaly coefficient acquisition unit and the quartile selection unit, and is used to determine the abnormal data exceeding the upper limit based on the first quartile data, the third quartile data, and the upper limit anomaly coefficient, and is also used to determine the abnormal data exceeding the lower limit based on the first quartile data, the third quartile data, and the lower limit anomaly coefficient;
[0050] The marking unit is connected to the out-of-limit abnormal data determination unit and is used to mark the abnormal data exceeding the upper limit and the abnormal data exceeding the lower limit as data with abnormal detection results.
[0051] Specifically, the second-level detection module includes: a second acquisition unit, a sliding window setting unit, a first median determination unit, an absolute median difference calculation unit, a second median determination unit, an abnormal data determination unit, and a sliding unit;
[0052] The second acquisition unit is used to acquire a preset first anomaly judgment coefficient, a preset second anomaly judgment coefficient, and a preset scaling factor constant;
[0053] The sliding window setting unit is used to set a sliding window of a preset data length, starting from the first data of the data to be detected;
[0054] The first median determination unit is connected to the sliding window setting unit and is used to arrange the data in the sliding window from smallest to largest and determine the median of the data in the arranged sliding window.
[0055] The absolute median difference calculation unit is connected to the first median determination unit and is used to calculate the absolute median difference data based on the median of the data in the sliding window and the data in the sliding window.
[0056] The second median determination unit is connected to the absolute median difference calculation unit and is used to arrange the absolute median difference data from smallest to largest and determine the median of the arranged absolute median difference data.
[0057] The abnormal data determination unit is connected to the first median determination unit, the second acquisition unit, and the second median determination unit, respectively, and is used to determine the abnormal data in the sliding window based on the median of the data in the sliding window, the first anomaly judgment coefficient, the second anomaly judgment coefficient, the scaling factor constant, and the median of the absolute median difference data.
[0058] The sliding unit is connected to the abnormal data determination unit and is used to slide the sliding window for a preset data length and repeat steps S3-S6 until the detection of all data in the second dataset to be detected is completed.
[0059] Specifically, the abnormal data determination unit includes: a first abnormal data determination subunit, a second abnormal data determination subunit, and a summarization subunit, wherein:
[0060] The first abnormal data determination subunit is used to determine the first window abnormal data in the sliding window based on the median of the data in the sliding window, the first abnormal judgment coefficient, the scale factor constant, and the median of the absolute median difference data.
[0061] The second abnormal data determination subunit is used to determine the second window abnormal data in the sliding window based on the median of the data in the sliding window, the second abnormal judgment coefficient, the scale factor constant, and the median of the absolute median difference data.
[0062] The summarization subunit is used to identify the first window abnormal data and the second window abnormal data as the window abnormal data.
[0063] The beneficial effects of the present invention are as follows: the present invention provides a method for detecting data anomalies at voltage monitoring points in a distribution network, comprising the following steps: acquiring voltage data from voltage monitoring points in a distribution network according to a preset sampling frequency to obtain a voltage dataset; removing zero-value data and null-value data from the voltage dataset to obtain a first dataset to be detected; performing over-limit detection on the first dataset to be detected to obtain an over-limit value dataset; performing a first-level detection on the over-limit value dataset according to the quartile data spacing criterion, and removing data with abnormal detection results in the first-level detection to obtain a second dataset to be detected; performing a second-level detection on the second dataset to be detected according to the median deviation method to obtain a first abnormal dataset.
[0064] The present invention provides a data anomaly detection method for distribution network voltage monitoring points. This method removes interference items such as zero and null values to obtain an out-of-limit dataset. The out-of-limit dataset undergoes a first-level detection. During this first-level detection, the interquartile range criterion is used to easily distinguish discrete values, filtering out anomalies with large dispersion. This increases the contrast between the second-level anomaly data (which has smaller dispersion and is more difficult to distinguish from normal values) and normal data, further reducing interference items. Then, the median deviation method is used to detect data with deviations from the normal data one by one, resulting in the first anomaly dataset. By employing a two-layer identification approach, different identification methods are used for different anomaly types, effectively detecting smaller anomalies and further improving detection accuracy. This effectively solves the problem of inaccurate detection results in current distribution network voltage monitoring point data anomaly detection methods.
Attached Figure Description
[0065] To more clearly illustrate the technical solutions in the embodiments of the present invention or the prior art, the drawings used in the description of the embodiments or the prior art will be briefly introduced below. Obviously, the drawings described below are only some embodiments of the present invention. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.
[0066] Figure 1 is a flowchart of a data anomaly detection method for voltage monitoring points in a power distribution network.
Detailed Implementation
[0067] The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only a part of the embodiments of the present invention, and not all of them. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.
[0068] This invention provides a method for detecting data anomalies at voltage monitoring points in a power distribution network. Referring to Figure 1, which is a flowchart of the method, the method includes the following steps:
[0069] Voltage data from distribution network voltage monitoring points are acquired according to a preset sampling frequency to obtain the voltage dataset;
[0070] The first dataset to be detected is obtained by removing zero and null values from the voltage dataset;
[0071] Over-limit detection is performed on the first dataset to be detected to obtain the over-limit value dataset;
[0072] First-level detection is performed on the over-limit value dataset based on the interquartile range (IQR) criterion, and data with abnormal detection results in the first-level detection are removed to obtain the second dataset to be detected;
[0073] Second-level detection is performed on the second dataset to be detected using the median deviation method to obtain the first abnormal dataset.
[0074] In this embodiment, data with abnormal detection results in the first-level detection are regarded as high-degree abnormal data, and the first abnormal dataset consists of low- to medium-degree abnormal data. Because high-degree abnormal data deviate further from normal values than low- to medium-degree abnormal data, the interquartile range criterion, which readily separates widely dispersed values, is applied first. This filters the high-degree anomalies out of the over-limit data, so that the differences between the remaining low- to medium-degree abnormal data and normal data become more apparent and the two are easier to tell apart. The median deviation method, which is based on data statistics, is then used to separate low- to medium-degree abnormal data from normal data, effectively solving the problem of inaccurate results from current methods for detecting data anomalies at voltage monitoring points in distribution networks.
[0075] In a specific embodiment of the present invention, based on the foregoing embodiments, an over-limit detection is performed on the first dataset to be detected to obtain an over-limit value dataset. The formula is as follows:
[0076]
[0077] where the two limits are the preset voltage management upper limit and the preset voltage management lower limit;
[0078]
[0079] In a more specific embodiment:
[0080] where the voltage level is the standard rated voltage level of the power distribution network, including 110 kV, 35 kV, 10 kV, and 380 V / 220 V, and the lower limit percentage and the upper limit percentage are the coefficients corresponding to that voltage level.
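One plausible form of the relations described in paragraphs [0075] to [0080] is sketched below. All symbols are introduced here for illustration and are not the patent's own notation: U_N is the rated voltage level, α_up and α_low are the upper and lower limit percentages expressed as fractions, and x_i is a value in the first dataset to be detected. The sign convention for the limits is an assumption.

```latex
U_{\max} = U_N \,(1 + \alpha_{\mathrm{up}}), \qquad
U_{\min} = U_N \,(1 - \alpha_{\mathrm{low}}), \qquad
x_i \ \text{is over-limit} \iff x_i > U_{\max} \ \text{or} \ x_i < U_{\min}
```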
[0081] In a specific embodiment of the present invention, it is determined whether each voltage data point in the voltage dataset satisfies the zero-value condition; if so, the data point is recorded as zero-value data;
[0082] It is determined whether each voltage data point in the voltage dataset satisfies the null-value condition; if so, the data point is recorded as null-value data.
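The zero-value and null-value screening can be illustrated with a short Python sketch. How a null reading is represented is not stated in the description, so treating `None` and floating-point NaN as null is an assumption, and the function name `split_zero_null` is hypothetical.

```python
import math


def split_zero_null(voltage_data):
    """Separate zero-value data, null-value data, and the remaining readings.

    Assumption: a null reading is represented as None or as a float NaN.
    """
    zeros, nulls, kept = [], [], []
    for v in voltage_data:
        if v is None or (isinstance(v, float) and math.isnan(v)):
            nulls.append(v)       # recorded as null-value data
        elif v == 0:
            zeros.append(v)       # recorded as zero-value data
        else:
            kept.append(v)        # forms the first dataset to be detected
    return zeros, nulls, kept


zeros, nulls, first_dataset = split_zero_null([10.1, 0.0, None, 9.8, float("nan"), 0])
print(len(zeros), len(nulls), first_dataset)
```

Keeping the removed records in separate lists, rather than discarding them, makes it possible to report how many readings of each kind were dropped.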
[0083] In a more specific embodiment of the present invention, based on the foregoing embodiments, the first-level detection is performed on the over-limit value dataset according to the interquartile range criterion, specifically as follows:
[0084] Obtain the preset upper-limit high-degree anomaly coefficient and the preset lower-limit high-degree anomaly coefficient;
[0085] Divide the data in the over-limit value dataset into four equal parts, and select the first quartile data and the third quartile data;
[0086] Based on the first quartile data, the third quartile data, the upper-limit high-degree anomaly coefficient, and the first preset threshold, identify the high-degree abnormal data exceeding the upper limit;
[0087] Based on the first quartile data, the third quartile data, the lower-limit high-degree anomaly coefficient, and the second preset threshold, identify the high-degree abnormal data exceeding the lower limit. The formula is as follows:
[0088]
[0089] The high-degree abnormal data exceeding the upper limit and the high-degree abnormal data exceeding the lower limit are marked as data with abnormal detection results.
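For orientation, the conventional interquartile range fences are written out below. This is the standard textbook form and only an assumption about the intended criterion; the symbols Q_1, Q_3, k_up, and k_low are introduced here for the first quartile, the third quartile, and the upper and lower anomaly coefficients, and x_i is a value in the over-limit value dataset.

```latex
\mathrm{IQR} = Q_3 - Q_1, \qquad
x_i > Q_3 + k_{\mathrm{up}}\,\mathrm{IQR} \ \ (\text{exceeds the upper limit}), \qquad
x_i < Q_1 - k_{\mathrm{low}}\,\mathrm{IQR} \ \ (\text{exceeds the lower limit})
```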
[0090] In a more specific embodiment of the present invention, based on the foregoing embodiments, a second-level detection is performed on the second dataset to be detected according to the median deviation method, specifically including the following steps:
[0091] S1: Obtain the preset first anomaly judgment coefficient, the preset second anomaly judgment coefficient, and the preset scaling factor constant;
[0092]
[0093] S2: Starting from the first data point of the data to be detected, set a sliding window of the preset data length, so that:
[0094]
[0095] where the two symbols denote the first data point and the last data point of the data;
[0096] S3: Arrange the data in the sliding window in ascending order to obtain the sorted data;
[0097] Determine the median of the data within the sliding window;
[0098] where the operator denotes taking the median;
[0099] S4: Based on the median of the data within the sliding window and the data within the sliding window, calculate the absolute median difference data. The formula is as follows:
[0100] ;
[0101] where the absolute median difference data run from the first element to the last element, corresponding to the first data point through the last data point within the sliding window;
[0102] and the index indicates the position of the window;
[0103] S5: Arrange the absolute median difference data in ascending order to obtain the sorted absolute median difference dataset;
[0104] Determine the median of the sorted absolute median difference dataset;
[0105] S6: Based on the median of the data within the sliding window, the first anomaly judgment coefficient, the second anomaly judgment coefficient, the scaling factor constant, and the median of the absolute median difference data, determine the abnormal data within the sliding window, that is, when the following condition holds:
[0106]
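[0107] where the anomaly judgment coefficient used for outlier detection is either the first anomaly judgment coefficient or the second anomaly judgment coefficient;
[0108] S7: Slide the sliding window by the preset data length and repeat steps S3-S6 until all data in the data to be detected have been checked.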
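Steps S3 to S6 correspond closely to the usual median absolute deviation test, which can be written as follows. The notation is introduced here for illustration and is an assumption about the intended formulas: W is the set of data in the current window, x_j a data point in it, c the scaling factor constant, and k the anomaly judgment coefficient, taken as either the first coefficient k_1 or the second coefficient k_2.

```latex
\tilde{x} = \operatorname{median}(W), \qquad
d_j = \lvert x_j - \tilde{x} \rvert, \qquad
\mathrm{MAD} = \operatorname{median}_j (d_j), \qquad
x_j \ \text{is abnormal} \iff d_j > k \, c \, \mathrm{MAD}, \quad k \in \{k_1, k_2\}
```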
[0109] In a more specific embodiment of the present invention, determining the window abnormal data within the sliding window based on the median of the data within the sliding window, the first anomaly judgment coefficient, the second anomaly judgment coefficient, the scaling factor constant, and the median of the absolute median difference data specifically involves:
[0110] Based on the median of the data within the sliding window, the first anomaly judgment coefficient, the scaling factor constant, and the median of the absolute median difference data, determine the first window abnormal data within the sliding window;
[0111] Based on the median of the data within the sliding window, the second anomaly judgment coefficient, the scaling factor constant, and the median of the absolute median difference data, determine the second window abnormal data within the sliding window;
[0112] The first window abnormal data and the second window abnormal data are identified as the window abnormal data.
[0113] In another specific embodiment of the present invention, there is a preset interval between the first anomaly judgment coefficient and the second anomaly judgment coefficient.
[0114] In the specific implementation process, different anomaly judgment coefficients are set so that the first abnormal dataset is divided into a moderate-degree abnormal dataset and a low-degree abnormal dataset. When the first anomaly judgment coefficient is greater than the second anomaly judgment coefficient, the collection of first window abnormal data obtained from all windows based on the first anomaly judgment coefficient is the moderate-degree abnormal dataset, and the collection of second window abnormal data obtained from all windows based on the second anomaly judgment coefficient is the low-degree abnormal dataset.
[0115] In a more specific embodiment of the present invention, based on the foregoing embodiments, the first anomaly judgment coefficient is 3 and the second anomaly judgment coefficient is 2.
[0116] In another aspect, the present invention provides a two-layer progressive data anomaly detection system for distribution network voltage monitoring points, comprising: a sampling module, a first dataset acquisition module, an over-limit detection module, a first-level detection module, and a second-level detection module, wherein:
[0117] The sampling module is used to acquire voltage data from voltage monitoring points in the power distribution network according to a preset sampling frequency, and obtain a voltage dataset.
[0118] The first dataset acquisition module is connected to the sampling module and is used to remove zero and null data from the voltage dataset to obtain the first dataset to be detected.
[0119] The over-limit detection module is connected to the first dataset acquisition module and is used to perform over-limit detection on the first dataset to be detected to obtain the over-limit value dataset;
[0120] The first-level detection module is connected to the over-limit detection module. It is used to perform first-level detection on the over-limit value dataset according to the interquartile range criterion, and remove data with abnormal detection results in the first-level detection to obtain the second dataset to be detected.
[0121] The second-level detection module is connected to the first-level detection module and is used to perform second-level detection on the second dataset to be detected based on the median deviation method to obtain the first abnormal dataset.
[0122] In a more specific embodiment of the present invention, based on the foregoing embodiments, the over-limit detection module includes: a first acquisition unit, an upper limit calculation unit, a lower limit calculation unit, and a first judgment unit;
[0123] The first acquisition unit is used to acquire a preset voltage management upper limit percentage, a preset voltage management lower limit percentage, and a distribution network voltage level;
[0124] The upper limit calculation unit is connected to the first acquisition unit and is used to calculate the voltage management upper limit based on the voltage management upper limit percentage and the distribution network voltage level.
[0125] The lower limit calculation unit is connected to the first acquisition unit and is used to calculate the lower limit of voltage management based on the voltage management lower limit percentage and the distribution network voltage level.
[0126] The first judgment unit is connected to the upper limit calculation unit and the lower limit calculation unit respectively, and is used to identify the data in the first dataset to be detected that are greater than the voltage management upper limit or less than the voltage management lower limit as over-limit data, and to collect all over-limit data to obtain the over-limit value dataset.
[0127] In a more specific embodiment of the present invention, based on the foregoing embodiments, the first-level detection module includes: an anomaly coefficient acquisition unit, a quartile selection unit, an out-of-limit anomaly data determination unit, and a marking unit;
[0128] The anomaly coefficient acquisition unit is used to acquire the upper limit anomaly coefficient and the lower limit anomaly coefficient;
[0129] The quartile selection unit is used to divide the data in the over-limit value dataset into four equal parts and select the first quartile data and the third quartile data;
[0130] The out-of-limit abnormal data determination unit is connected to the anomaly coefficient acquisition unit and the quartile selection unit, and is used to determine the abnormal data exceeding the upper limit based on the first quartile data, the third quartile data, and the upper limit anomaly coefficient, and is also used to determine the abnormal data exceeding the lower limit based on the first quartile data, the third quartile data, and the lower limit anomaly coefficient;
[0131] The marking unit is connected to the out-of-limit abnormal data determination unit and is used to mark the abnormal data exceeding the upper limit and the abnormal data exceeding the lower limit as data with abnormal detection results.
[0132] In a more specific embodiment of the present invention, based on the foregoing embodiments, the second-level detection module includes: a second acquisition unit, a sliding window setting unit, a first median determination unit, an absolute median difference calculation unit, a second median determination unit, an abnormal data determination unit, and a sliding unit;
[0133] The second acquisition unit is used to acquire a preset first anomaly judgment coefficient, a preset second anomaly judgment coefficient, and a preset scaling factor constant;
[0134] The sliding window setting unit is used to set a sliding window of a preset data length, starting from the first data of the data to be detected;
[0135] The first median determination unit is connected to the sliding window setting unit and is used to arrange the data in the sliding window from smallest to largest and determine the median of the data in the arranged sliding window.
[0136] The absolute median difference calculation unit is connected to the first median determination unit and is used to calculate the absolute median difference data based on the median of the data in the sliding window and the data in the sliding window.
[0137] The second median determination unit is connected to the absolute median difference calculation unit and is used to arrange the absolute median difference data from smallest to largest and determine the median of the arranged absolute median difference data.
[0138] The abnormal data determination unit is connected to the first median determination unit, the second acquisition unit, and the second median determination unit, respectively, and is used to determine the window abnormal data in the sliding window based on the median of the data in the sliding window, the first anomaly judgment coefficient, the second anomaly judgment coefficient, the scale factor constant, and the median of the absolute median difference data.
[0139] The sliding unit is connected to the abnormal data determination unit and is used to slide the sliding window by a preset data length and repeat steps S3-S6 until the detection of all data in the second dataset to be detected is completed.
[0140] In a more specific embodiment of the present invention, based on the foregoing embodiments, the abnormal data determination unit includes: a first abnormal data determination subunit, a second abnormal data determination subunit, and an aggregation subunit, wherein:
[0141] The first abnormal data determination subunit is used to determine the first window abnormal data in the sliding window based on the median of the data in the sliding window, the first abnormal judgment coefficient, the scale factor constant, and the median of the absolute median difference data.
[0142] The second abnormal data determination subunit is used to determine the second window abnormal data in the sliding window based on the median of the data in the sliding window, the second abnormal judgment coefficient, the scale factor constant, and the median of the absolute median difference data.
[0143] The aggregation subunit is used to identify the first window abnormal data and the second window abnormal data as the window abnormal data.
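The second-level detection carried out by these units can be sketched as below. The exact comparison is not spelled out in the text, so this sketch assumes that a data point is abnormal when its absolute deviation from the window median exceeds coefficient × scale factor × median of the absolute median difference data, and that the window advances by its own length. The names `window_anomalies` and `sliding_mad_detect`, the default coefficients 3.0 and 5.0, and the scale factor 1.4826 (the usual constant for normally distributed data) are illustrative assumptions.

```python
import statistics


def window_anomalies(window, k1, k2, scale):
    """Return the indices of the abnormal data inside one window."""
    med = statistics.median(window)               # median of the window data
    abs_dev = [abs(x - med) for x in window]      # absolute median difference data
    mad = statistics.median(abs_dev)              # median of those differences
    # Assumed rule: deviation above k * scale * mad marks a point as abnormal.
    first = {i for i, d in enumerate(abs_dev) if d > k1 * scale * mad}
    second = {i for i, d in enumerate(abs_dev) if d > k2 * scale * mad}
    # Aggregation: both sets together form the window abnormal data.
    return sorted(first | second)


def sliding_mad_detect(data, window_len, k1=3.0, k2=5.0, scale=1.4826):
    """Apply window_anomalies to consecutive windows of window_len points."""
    anomalies = []
    for start in range(0, len(data), window_len):
        window = data[start:start + window_len]
        for i in window_anomalies(window, k1, k2, scale):
            anomalies.append(data[start + i])
    return anomalies


# Hypothetical voltage readings (V) with one obvious spike.
readings = [237.0, 237.2, 236.9, 237.1, 268.82, 237.3, 237.0, 236.8]
print(sliding_mad_detect(readings, window_len=4))
```

Because medians are used for both the centre and the spread, a single large spike inside a window barely moves the threshold, which is what allows smaller deviations to be screened after the first-level detection has removed the extreme values.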
[0144] The present invention also provides a specific embodiment:
[0145] In this embodiment, the voltage data of the power grid voltage monitoring point is shown in Table 1;
[0146] Table 1: Voltage data for 24 hours (unit: V)
[0147]
[0148] Table 1 shows that the voltage data contain zero values and null values;
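Removing such zero and null values before detection can be sketched as follows. The helper name `remove_zero_and_null` and the choice to treat both `None` and NaN as null values are assumptions made for illustration; the text does not say how null values are encoded.

```python
import math


def remove_zero_and_null(values):
    """Drop zero readings and null readings (None or NaN) from a voltage series."""
    cleaned = []
    for v in values:
        if v is None:
            continue  # missing sample
        if isinstance(v, float) and math.isnan(v):
            continue  # NaN used as a null marker (assumption)
        if v == 0:
            continue  # zero reading
        cleaned.append(v)
    return cleaned


# Hypothetical raw samples containing a zero, a None and a NaN.
print(remove_zero_and_null([236.55, 0.0, None, float("nan"), 237.37]))
```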
[0149] In the first-level detection based on the interquartile range criterion, the out-of-limit data larger than ... are, after sorting, {236.55, 236.92, 237.37, 238.3, 246.46, 526.33};
[0150] Among these, the IQR detection identifies data with a high degree of abnormality;
[0151] Thus, the second dataset to be detected is obtained, as shown in Table 2:
[0152] Table 2:
[0153]
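As an illustrative check of the first-level step, the sorted out-of-limit data listed above can be run through an interquartile-range test. The anomaly coefficient of 1.5 and the inclusive quartile convention used here are assumptions, because the coefficient and quartile values of the embodiment are not reproduced in the text. Under these assumptions, only 526.33 lies above the upper fence.

```python
import statistics

# Sorted out-of-limit data from the embodiment.
out_of_limit = [236.55, 236.92, 237.37, 238.3, 246.46, 526.33]

# Assumed quartile convention: linear interpolation, endpoints included.
q1, _, q3 = statistics.quantiles(out_of_limit, n=4, method="inclusive")

# Assumed upper limit anomaly coefficient of 1.5.
upper_fence = q3 + 1.5 * (q3 - q1)

flagged = [v for v in out_of_limit if v > upper_fence]
print(flagged)
```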
[0154] For simplicity, the size of the sliding window is set to [size to be specified]; the half window size is then determined accordingly;
[0155] The data are therefore checked for anomalies using two windows of size W; the data in the first window...
[0156] The elements of the first window are sorted in ascending order, and the sorted data are shown in Table 3:
[0157] Table 3:
[0158]
[0159] The median of the data in the first window is calculated as follows:
[0160]
[0161] Calculate the absolute median difference data:
[0162]
[0163] The absolute median difference data are arranged in ascending order to obtain the sorted absolute median difference data;
[0164] Next, the median of the absolute median difference data is calculated as follows:
[0165] ;
[0166] Taking the first anomaly judgment coefficient as 3, the threshold for judging abnormal voltage data within the first window is:
[0167]
[0168] It can be seen that all the absolute median difference data within the first window are less than 3.4692; that is, there are no abnormal voltage values within the first window.
[0169] Similarly, anomaly detection was performed on the voltage data within window W2. The process and results are as follows:
[0170] The voltage data within window W2 is shown in Table 4:
[0171] Table 4:
[0172]
[0173] The voltage data within window W2 are sorted from smallest to largest, as shown in Table 5:
[0174] Table 5:
[0175]
[0176] The median of the data within window W2: 237.145V;
[0177] The absolute median difference data within window W2 are shown in Table 6:
[0178] Table 6:
[0179]
[0180] Median of the absolute median difference data within window W2:
[0181]
[0182] Voltage data anomaly detection threshold within window W2:
[0183]
[0184] Therefore, 268.82 in window W2 is abnormal data;
[0185] The anomaly detection results for the voltage data of the voltage monitoring point are output as follows:
[0186]
[0187] The anomaly detection is now complete.
[0188] The terms “first,” “second,” “third,” “fourth,” etc. (if present) in the specification and accompanying drawings of this application are used to distinguish similar objects and are not necessarily used to describe a specific order or sequence. It should be understood that such data can be interchanged where appropriate so that the embodiments of this application described herein can be implemented, for example, in orders other than those illustrated or described herein. Furthermore, the terms “comprising” and “having,” and any variations thereof, are intended to cover non-exclusive inclusion; for example, a process, method, system, product, or apparatus that comprises a series of steps or units is not necessarily limited to those steps or units explicitly listed, but may include other steps or units not explicitly listed or inherent to such processes, methods, products, or apparatus.
[0189] It should be understood that in this application, "and / or" is used to describe the relationship between related objects, indicating that there can be three relationships. For example, "A and / or B" can represent three cases: only A exists, only B exists, and both A and B exist simultaneously, where A and B can be singular or plural. The character " / " generally indicates that the related objects before and after it are in an "or" relationship.
[0190] In the several embodiments provided in this application, it should be understood that the disclosed apparatus and methods can be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative; for instance, the division of units is only a logical functional division, and in actual implementation, there may be other division methods. For example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. Furthermore, the coupling or direct coupling or communication connection shown or discussed may be through some interfaces; the indirect coupling or communication connection between apparatuses or units may be electrical, mechanical, or other forms.
[0191] The units described as separate components may or may not be physically separate. The components shown as units may or may not be physical units; they may be located in one place or distributed across multiple network units. Some or all of the units can be selected to achieve the purpose of this embodiment, depending on actual needs.
[0192] Furthermore, the functional units in the various embodiments of this application can be integrated into one processing unit, or each unit can exist physically separately, or two or more units can be integrated into one unit. The integrated unit can be implemented in hardware or as a software functional unit.
[0193] The above-described embodiments are only used to illustrate the technical solutions of this application, and are not intended to limit them. Although this application has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that modifications can still be made to the technical solutions described in the foregoing embodiments, or equivalent substitutions can be made to some of the technical features. Such modifications or substitutions do not cause the essence of the corresponding technical solutions to deviate from the spirit and scope of the technical solutions of the embodiments of this application.
Claims
1. A method for detecting data anomalies at voltage monitoring points in a power distribution network, characterized in that it includes the following steps:
Voltage data from distribution network voltage monitoring points are acquired according to a preset sampling frequency to obtain a voltage dataset;
Zero values and null values are removed from the voltage dataset to obtain the first dataset to be detected;
Over-limit detection is performed on the first dataset to be detected to obtain an over-limit value dataset;
The over-limit value dataset is subjected to first-level detection based on the interquartile range criterion, and data with abnormal detection results in the first-level detection are removed to obtain the second dataset to be detected;
Second-level detection is performed on the second dataset to be detected using the median deviation method to obtain the first abnormal dataset, specifically including the following steps:
S1: Obtain the preset first anomaly judgment coefficient, the preset second anomaly judgment coefficient, and the preset scale factor constant;
S2: Starting from the first data point of the data to be detected, set a sliding window with a preset data length;
S3: Arrange the data in the sliding window in ascending order, and determine the median of the arranged data in the sliding window;
S4: Calculate the absolute median difference data based on the median of the data within the sliding window and the data within the sliding window;
S5: Arrange the absolute median difference data in ascending order, and determine the median of the arranged absolute median difference data;
S6: Based on the median of the data within the sliding window, the first anomaly judgment coefficient, the second anomaly judgment coefficient, the scale factor constant, and the median of the absolute median difference data, determine the window abnormal data within the sliding window, specifically as follows: the first window abnormal data within the sliding window are determined based on the median of the data within the sliding window, the first anomaly judgment coefficient, the scale factor constant, and the median of the absolute median difference data; the second window abnormal data within the sliding window are determined based on the median of the data within the sliding window, the second anomaly judgment coefficient, the scale factor constant, and the median of the absolute median difference data; the first window abnormal data and the second window abnormal data are identified as the window abnormal data;
S7: Slide the sliding window by a preset data length and repeat steps S3-S6 until all data in the second dataset to be detected have been detected.
2. The method for detecting data anomalies at voltage monitoring points in a power distribution network according to claim 1, characterized in that the step of performing over-limit detection on the first dataset to be detected to obtain the over-limit value dataset is specifically as follows: obtain the preset voltage management upper limit percentage, the preset voltage management lower limit percentage, and the distribution network voltage level; calculate the voltage management upper limit based on the voltage management upper limit percentage and the distribution network voltage level; calculate the voltage management lower limit based on the voltage management lower limit percentage and the distribution network voltage level; identify the data in the first dataset to be detected that are greater than the voltage management upper limit or less than the voltage management lower limit as over-limit data, and collect all the over-limit data to obtain the over-limit value dataset.
3. The method for detecting data anomalies at voltage monitoring points in a power distribution network according to claim 1, characterized in that the first-level detection of the over-limit value dataset based on the interquartile range criterion is specifically as follows: obtain the preset upper limit anomaly coefficient and the preset lower limit anomaly coefficient; divide the data in the over-limit value dataset into four equal parts, and select the first quartile data and the third quartile data; determine the upper-limit abnormal data based on the first quartile data, the third quartile data, and the upper limit anomaly coefficient; determine the lower-limit abnormal data based on the first quartile data, the third quartile data, and the lower limit anomaly coefficient; mark the upper-limit abnormal data and the lower-limit abnormal data as data with abnormal detection results.
4. A data anomaly detection system for voltage monitoring points in a power distribution network, characterized in that it comprises: a sampling module, a first dataset acquisition module, an over-limit detection module, a first-level detection module, and a second-level detection module, wherein:
The sampling module is used to acquire voltage data from voltage monitoring points in the power distribution network according to a preset sampling frequency to obtain a voltage dataset;
The first dataset acquisition module is connected to the sampling module and is used to remove zero-value data and null-value data from the voltage dataset to obtain the first dataset to be detected;
The over-limit detection module is connected to the first dataset acquisition module and is used to perform over-limit detection on the first dataset to be detected to obtain the over-limit value dataset;
The first-level detection module is connected to the over-limit detection module and is used to perform first-level detection on the over-limit value dataset according to the interquartile range criterion, and to remove data with abnormal detection results in the first-level detection to obtain the second dataset to be detected;
The second-level detection module is connected to the first-level detection module and is used to perform second-level detection on the second dataset to be detected according to the median deviation method to obtain the first abnormal dataset;
The second-level detection module includes: a second acquisition unit, a sliding window setting unit, a first median determination unit, an absolute median difference calculation unit, a second median determination unit, an abnormal data determination unit, and a sliding unit;
The second acquisition unit is used to acquire a preset first anomaly judgment coefficient, a preset second anomaly judgment coefficient, and a preset scale factor constant;
The sliding window setting unit is used to set a sliding window of a preset data length, starting from the first data point of the data to be detected;
The first median determination unit is connected to the sliding window setting unit and is used to arrange the data in the sliding window in ascending order and determine the median of the arranged data in the sliding window;
The absolute median difference calculation unit is connected to the first median determination unit and is used to calculate the absolute median difference data based on the median of the data in the sliding window and the data in the sliding window;
The second median determination unit is connected to the absolute median difference calculation unit and is used to arrange the absolute median difference data in ascending order and determine the median of the arranged absolute median difference data;
The abnormal data determination unit is connected to the first median determination unit, the second acquisition unit, and the second median determination unit respectively, and is used to determine the window abnormal data in the sliding window based on the median of the data in the sliding window, the first anomaly judgment coefficient, the second anomaly judgment coefficient, the scale factor constant, and the median of the absolute median difference data;
The sliding unit is connected to the abnormal data determination unit and is used to slide the sliding window by a preset data length and repeat steps S3-S6 until the detection of all data in the second dataset to be detected is completed;
The abnormal data determination unit includes: a first abnormal data determination subunit, a second abnormal data determination subunit, and an aggregation subunit, wherein:
The first abnormal data determination subunit is used to determine the first window abnormal data in the sliding window based on the median of the data in the sliding window, the first anomaly judgment coefficient, the scale factor constant, and the median of the absolute median difference data;
The second abnormal data determination subunit is used to determine the second window abnormal data in the sliding window based on the median of the data in the sliding window, the second anomaly judgment coefficient, the scale factor constant, and the median of the absolute median difference data;
The aggregation subunit is used to identify the first window abnormal data and the second window abnormal data as the window abnormal data.
5. The data anomaly detection system for a distribution network voltage monitoring point according to claim 4, characterized in that the over-limit detection module includes: a first acquisition unit, an upper limit calculation unit, a lower limit calculation unit, and a first judgment unit; the first acquisition unit is used to acquire a preset voltage management upper limit percentage, a preset voltage management lower limit percentage, and a distribution network voltage level; the upper limit calculation unit is connected to the first acquisition unit and is used to calculate the voltage management upper limit based on the voltage management upper limit percentage and the distribution network voltage level; the lower limit calculation unit is connected to the first acquisition unit and is used to calculate the voltage management lower limit based on the voltage management lower limit percentage and the distribution network voltage level; the first judgment unit is connected to the upper limit calculation unit and the lower limit calculation unit respectively, and is used to identify the data in the first dataset to be detected that are greater than the voltage management upper limit or less than the voltage management lower limit as over-limit data, and to collect all the over-limit data to obtain the over-limit value dataset.
6. The data anomaly detection system for a distribution network voltage monitoring point according to claim 4, characterized in that the first-level detection module includes: an anomaly coefficient acquisition unit, a quartile selection unit, an out-of-limit abnormal data determination unit, and a marking unit; the anomaly coefficient acquisition unit is used to acquire the upper limit anomaly coefficient and the lower limit anomaly coefficient; the quartile selection unit is used to divide the data in the over-limit value dataset into four equal parts and select the first quartile data and the third quartile data; the out-of-limit abnormal data determination unit is connected to the anomaly coefficient acquisition unit and the quartile selection unit, and is used to determine the upper-limit abnormal data based on the first quartile data, the third quartile data, and the upper limit anomaly coefficient, and is also used to determine the lower-limit abnormal data based on the first quartile data, the third quartile data, and the lower limit anomaly coefficient; the marking unit is connected to the out-of-limit abnormal data determination unit and is used to mark the upper-limit abnormal data and the lower-limit abnormal data as data with abnormal detection results.