An equipment outlier elimination method, system, device and medium
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- HANGZHOU ANMAISHENG INTELLIGENT TECH CO LTD
- Filing Date
- 2023-02-27
- Publication Date
- 2026-06-26
AI Technical Summary
In existing technologies for monitoring equipment operating status, manually removing outliers is time-consuming and labor-intensive, while setting fixed thresholds can easily lead to misidentification, especially in long-period data where it is difficult to select appropriate thresholds, resulting in the problem of normal values being removed and outliers being retained.
By acquiring historical operating data of the equipment from multiple measuring points, sliding windowing is performed to calculate the feature frequency of the feature values. Based on the feature frequency, the data to be removed is determined. This process is repeated until there is no abnormal data in the state matrix, thus employing an unsupervised anomaly detection method.
It requires little or no human intervention, greatly freeing up labor, improving the efficiency of outlier identification and removal, and taking into full account the distribution of equipment operation data, thus avoiding the problem of false rejection caused by fixed thresholds.
Abstract
Description
Technical Field
[0001] This application relates to the field of equipment operation status monitoring, and in particular to a method, system, device, and medium for removing abnormal values from equipment.
Background Technology
[0002] With the development of big data and intelligent sensing technology, data-driven data reconstruction methods are increasingly used in equipment operation status monitoring. However, data reconstruction methods are heavily dependent on the state matrix, and the quality of the state matrix construction directly affects the estimation accuracy of the data reconstruction method. During the construction of the equipment state matrix, abnormal parameters must be removed, and only the equipment operation parameters under normal operating conditions must be retained.
[0003] Currently, the main methods used are manual removal or setting fixed thresholds. However, manual removal requires engineers to analyze each of the equipment's operating parameters and remove outliers based on their experience, which is quite time-consuming and labor-intensive. On the other hand, setting threshold limits for each feature data in the state matrix and removing data that does not meet the threshold limits can easily lead to misidentification, especially for long-period data. Selecting a suitable threshold is difficult and may result in too many normal values being removed and outliers being retained.
[0004] In view of the above, a solution to these technical problems is urgently needed by those skilled in the art.
Summary of the Invention
[0005] The purpose of this application is to provide a method, system, apparatus, and medium for removing outlier data from equipment. This method involves acquiring historical operating data from multiple measurement points, performing sliding windowing on the historical operating data, calculating the feature frequencies of each measurement point within each sliding window, determining the data to be deleted based on the feature frequencies, and repeating this process until no outlier data remains in the state matrix. This method enables unsupervised anomaly detection, requiring little or no manual intervention, significantly reducing labor costs and greatly improving identification and removal efficiency. Furthermore, this method comprehensively considers the distribution of equipment operating data, effectively avoiding the problem of excessive normal values being removed and outliers being retained due to setting fixed removal thresholds.
[0006] To address the aforementioned technical problems, this application provides a method for eliminating outlier values in equipment, comprising:
[0007] Acquire historical operating data of equipment at multiple measuring points;
[0008] Perform sliding windowing processing on the historical operating data and calculate the feature values of each measuring point within each sliding window;
[0009] Obtain the feature frequency corresponding to each feature value;
[0010] Determine the historical operating data to be removed based on the feature frequency.
[0011] Preferably, determining the historical operational data to be removed based on the characteristic frequency includes:
[0012] Calculate the feature frequency score based on the feature frequency;
[0013] Sort the feature frequency scores in ascending order and mark the window interval of the first β feature frequency scores as the elimination interval;
[0014] Delete the historical operating data in the elimination interval.
[0015] Preferably, after acquiring historical operating data of the equipment at multiple measuring points, before performing sliding windowing processing on the historical operating data and calculating the characteristic values of each measuring point within each window, the process further includes:
[0016] Normalize the historical operating data of each measuring point.
[0017] Preferably, after acquiring historical operating data of the device at multiple measuring points, the process further includes:
[0018] The historical operational data of each measuring point are quantitatively described using a correlation analysis algorithm;
[0019] Align the historical operating data of each measuring point on the time coordinate and form the original state matrix of the equipment;
[0020] The historical operation data includes the actual measurement point data, test data, and model calculation data of the equipment. In the original state matrix of the equipment, the horizontal axis is the time axis and the vertical axis is the corresponding selected measurement point.
[0021] Preferably, the historical operating data is processed by sliding windowing, and the characteristic values of each measuring point within each sliding window are calculated, including:
[0022] Obtain the historical operation data corresponding to the measurement points within each sliding window;
[0023] Determine the corresponding feature data based on the historical operation data within each sliding window;
[0024] Confidence parameters corresponding to each of the aforementioned feature data are obtained in advance;
[0025] Determine the confidence interval corresponding to each feature data based on the initial feature value and confidence parameter corresponding to each feature data;
[0026] The overlapping intervals between the confidence intervals in each sliding window are determined based on the feature data in each sliding window and the corresponding confidence intervals.
[0027] The initial feature values within each sliding window corresponding to the overlapping interval are set to be the same, and are the feature values of each corresponding measurement point.
[0028] Preferably, calculating the feature frequency score based on the feature frequency includes:
[0029] Calculate the feature frequency score according to the preset formula;
[0030] The preset formula is:
[0031]
[0032] Where R is the feature frequency score, n is the number of selected features, $M_i$ is the feature value of a given feature, and $f(M_i)$ is the frequency of occurrence of the feature value $M_i$.
[0033] Preferably, after determining the historical operational data to be removed based on the characteristic frequency, the process further includes:
[0034] Determine if there is any abnormal data in the original state matrix of the device;
[0035] If it exists, return to the step of performing sliding window processing on the historical running data and calculating the characteristic value of each measuring point within each sliding window.
[0036] To address the aforementioned technical problems, this application also provides a system for eliminating outlier equipment values, comprising:
[0037] The acquisition module is used to acquire historical operating data of the equipment at multiple measurement points;
[0038] The processing module is used to perform sliding window processing on historical operation data and calculate the characteristic values of each measurement point within each sliding window;
[0039] The calculation module is used to obtain the feature frequency corresponding to each feature value;
[0040] The determination module is used to determine the historical running data that needs to be removed based on the feature frequency.
[0041] To address the aforementioned technical problems, this application also provides an outlier removal device, including a memory for storing a computer program;
[0042] A processor is used to implement the steps of the device outlier removal method described above when executing a computer program.
[0043] To address the aforementioned technical problems, this application also provides a computer-readable storage medium storing a computer program, which, when executed by a processor, implements the steps of the device outlier removal method described above.
[0044] The outlier removal method provided in this application acquires historical operating data from multiple measuring points, performs sliding windowing on the historical operating data, calculates the feature frequency of each measuring point within each sliding window, determines the data to be deleted based on the feature frequency, and repeats the operation until there is no outlier data in the state matrix. This method enables unsupervised anomaly detection, requiring little or no manual intervention, significantly reducing labor costs and greatly improving identification and removal efficiency. Furthermore, this method comprehensively considers the distribution of equipment operating data, effectively avoiding the problem of excessive normal values being removed and outliers being retained due to setting fixed removal thresholds.
[0045] The device outlier removal system, apparatus, and storage medium provided in this application have the same beneficial effects as the aforementioned device outlier removal method.
Attached Figure Description
[0046] To more clearly illustrate the embodiments of this application, the accompanying drawings used in the embodiments will be briefly introduced below. Obviously, the drawings described below are only some embodiments of this application. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.
[0047] Figure 1 is a flowchart of a method for removing outlier values in equipment provided in this application;
[0048] Figure 2 is a structural diagram of an equipment outlier removal system provided in this application;
[0049] Figure 3 is a structural diagram of an outlier removal device provided in another embodiment of this application.
Detailed Implementation
[0050] The technical solutions of the embodiments of this application will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of this application, and not all embodiments. Based on the embodiments of this application, all other embodiments obtained by those of ordinary skill in the art without creative effort are within the protection scope of this application.
[0051] The core of this application is to provide a method, system, apparatus, and medium for removing outlier data from equipment. This method involves acquiring historical operating data from multiple measurement points, performing sliding windowing on the historical operating data, calculating the feature frequencies of each measurement point within each sliding window, determining the data to be deleted based on the feature frequencies, and repeating this process until no outlier data remains in the state matrix. This method enables unsupervised anomaly detection, requiring little or no manual intervention, significantly reducing labor costs and greatly improving identification and removal efficiency. Furthermore, this method comprehensively considers the distribution of equipment operating data, effectively avoiding the problem of excessive normal values being removed and outliers being retained due to setting fixed removal thresholds.
[0052] To enable those skilled in the art to better understand the present application, the present application will be further described in detail below with reference to the accompanying drawings and specific embodiments.
[0053] Figure 1 is a flowchart of the equipment outlier removal method provided in this application. As shown in Figure 1, the method includes:
[0054] S10: Acquire historical operating data of the equipment at multiple measuring points;
[0055] It should be noted that multiple measuring points are set for each piece of equipment. This embodiment does not limit the type of equipment under test or its application scenario, the number or type of measuring points, the type or amount of historical operating data, or the method used to process the historical operating data. The historical operating data should be strongly correlated with the equipment.
[0056] S11: Perform sliding window processing on historical operation data and calculate the characteristic values of each measuring point within each sliding window;
[0057] It should be noted that the method provided in this application embodiment performs sliding windowing processing on historical operating data. It does not directly calculate outliers within the sliding window, but instead calculates the feature values of the measurement points within the sliding window and removes outliers based on their feature values. This application embodiment does not make specific limitations on the number of sliding window divisions, the width of the sliding window, or the step size of the sliding window. This application embodiment does not make specific limitations on the method of determining outliers based on feature values, nor does it make specific limitations on the types and number of feature values. The feature values of the measurement point data within the sliding window can be determined according to the characteristics of the actual operating data of the equipment, including but not limited to data mean, variance, energy value, effective value, peak value, peak-to-peak value, kurtosis, etc.
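As an illustration of step S11, the following minimal sketch splits one measuring point's series into sliding windows and computes the seven example features listed above. The function names, the use of population variance, and the definition of kurtosis as the fourth standardized moment are assumptions made for this sketch; the application does not prescribe them, nor the window width or step.

```python
import math
from statistics import fmean, pvariance


def sliding_windows(n_samples, width, step):
    """Return (start, end) index pairs for each window; `end` is exclusive."""
    return [(s, s + width) for s in range(0, n_samples - width + 1, step)]


def window_features(samples):
    """Compute the seven example features for one window of one measuring point."""
    mean = fmean(samples)
    variance = pvariance(samples, mu=mean)  # population variance (assumed)
    energy = sum(x * x for x in samples)
    rms = math.sqrt(energy / len(samples))  # "effective value"
    peak = max(abs(x) for x in samples)
    peak_to_peak = max(samples) - min(samples)
    # Kurtosis as the fourth standardized moment; undefined for a constant window.
    fourth_moment = fmean([(x - mean) ** 4 for x in samples])
    kurtosis = fourth_moment / variance ** 2 if variance > 0 else float("nan")
    return {
        "mean": mean,
        "variance": variance,
        "energy": energy,
        "rms": rms,
        "peak": peak,
        "peak_to_peak": peak_to_peak,
        "kurtosis": kurtosis,
    }


series = [1, 2, 3, 4, 5, 6]
windows = sliding_windows(len(series), width=4, step=2)
features = [window_features(series[start:end]) for start, end in windows]
```

Each window yields one feature value per feature, and those values, not the raw samples, are what the later steps count and score.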
[0058] S12: Obtain the feature frequency corresponding to each feature value;
[0059] S13: Determine the historical operational data that needs to be removed based on the characteristic frequency.
[0060] It should be noted that the frequency of each feature value is determined based on the feature value, i.e., the feature frequency. The corresponding historical running data to be removed is determined based on the feature frequency. This application embodiment does not limit the method of removing the historical running data to be removed based on the feature frequency. It can be calculated and the removal interval can be determined by a preset formula, etc. This application embodiment only provides a preferred implementation method and is not limited to the above-mentioned methods.
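A minimal sketch of step S12, under the assumption that the feature frequency of a window is the number of windows sharing its feature value. The helper name `feature_frequencies` is hypothetical, and the sketch presumes the feature values have already been made comparable (for example, near-equal values set to the same value), since raw real-valued features rarely repeat exactly.

```python
from collections import Counter


def feature_frequencies(values):
    """For each window's feature value, return how many windows share that value."""
    counts = Counter(values)
    return [counts[v] for v in values]


# One feature (e.g. the window mean) over five windows.
window_means = [5, 5, 6, 5, 9]
frequencies = feature_frequencies(window_means)
```

Windows whose feature value is rare receive a low frequency, which is the signal step S13 uses to select data for removal. A relative frequency (count divided by the number of windows) would serve equally well here.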
[0061] As can be seen, the method provided in this application acquires historical operating data of the equipment at multiple measuring points, performs sliding windowing processing on the historical operating data, calculates the corresponding feature frequencies of the feature values of each measuring point within each sliding window, determines the data to be deleted based on the feature frequencies, and repeats the operation until there is no abnormal data in the state matrix. This method enables unsupervised anomaly detection, requiring no or only minimal manual intervention, greatly reducing labor costs and significantly improving identification and removal efficiency. Furthermore, this method comprehensively considers the distribution of equipment operating data, better avoiding the problem of excessive normal values being removed and abnormal values being retained due to setting fixed removal thresholds.
[0062] Based on the above embodiments, this application provides a preferred embodiment in which the determination of historical operational data to be removed based on feature frequencies includes:
[0063] Calculate the feature frequency score based on the feature frequency;
[0064] Sort the feature frequency scores in ascending order and mark the window interval of the first β feature frequency scores as the elimination interval;
[0065] Delete the historical operating data in the elimination interval.
[0066] It should be noted that when determining the historical operational data to be removed based on the feature frequency, the feature frequency score is calculated. This application embodiment does not impose specific limitations on the calculation method of the feature frequency score; it can be calculated according to a preset formula. Specifically, the preset formula can assign corresponding weight scores to obtain the corresponding feature frequency score, or it can divide the specific feature frequency data into corresponding frequency intervals, with different frequency intervals mapped to corresponding feature frequency scores. Alternatively, it can determine the corresponding score based on the frequency of feature value occurrences; this is not limited here and can be calculated according to the actual situation.
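The scoring rule is left open above, so the sketch below uses one possible rule chosen purely for illustration: the score of a window is the average, over the selected features, of the frequency of that window's feature value. This averaging rule and the name `window_scores` are assumptions of this sketch and should not be read as the preset formula of this application.

```python
def window_scores(freq_by_feature):
    """Average the per-feature frequencies of each window into one score.

    freq_by_feature maps a feature name to a list holding, for each window,
    the frequency of that window's feature value.
    """
    names = list(freq_by_feature)
    n_windows = len(freq_by_feature[names[0]])
    return [
        sum(freq_by_feature[name][w] for name in names) / len(names)
        for w in range(n_windows)
    ]


scores = window_scores({"mean": [3, 3, 1], "variance": [2, 1, 1]})
```

A weighted sum, or a lookup from frequency bands to scores, would fit the same interface.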
[0067] Regarding the processing of feature frequency scores: each feature value occurs with a different frequency within the sliding window intervals, so the corresponding feature frequency scores also differ, and the scores may fluctuate significantly during the statistical process. For example, if the feature frequency scores corresponding to 5 feature values are 10, 10, 5, 10, and 9, the score 5 differs markedly from the other scores, and the corresponding data should be removed as an outlier. Existing removal methods rely on manual experience. To achieve automated removal, the feature frequency scores are sorted in a certain order, and the feature values corresponding to the first β or last β sorted feature frequency scores are removed.
[0068] It can be understood that the scores can be sorted in a specific order. Preferably, they are sorted from smallest to largest, a marking parameter β is set, and the window intervals of the first β feature frequency scores are marked as elimination intervals. Alternatively, they can be sorted from largest to smallest, with the window intervals of the last β feature frequency scores marked as elimination intervals. This improves elimination efficiency and the accuracy of the determined elimination intervals, and the historical data within the elimination intervals can then be deleted. This application does not impose a specific limitation on the value of β. If abnormal data still exists after the historical data in the first β window intervals has been eliminated, the value of β can be modified manually and the data eliminated again. This application is not limited to the methods described above, and the method can be selected according to the actual situation.
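The ascending sort and β marking can be sketched as follows, reusing the example scores 10, 10, 5, 10, 9. The window boundaries, the function names, and β = 1 are illustrative assumptions; windows are given as (start, end) sample indices with `end` exclusive.

```python
def elimination_intervals(scores, windows, beta):
    """Return the window intervals of the `beta` lowest feature frequency scores."""
    # sorted() is stable, so windows with equal scores keep their original order.
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    return [windows[i] for i in order[:beta]]


def surviving_indices(n_samples, intervals):
    """Return the sample indices left after deleting every elimination interval."""
    dropped = set()
    for start, end in intervals:
        dropped.update(range(start, end))
    return [t for t in range(n_samples) if t not in dropped]


scores = [10, 10, 5, 10, 9]
windows = [(0, 2), (2, 4), (4, 6), (6, 8), (8, 10)]
marked = elimination_intervals(scores, windows, beta=1)
kept = surviving_indices(10, marked)
```

With β = 1 only the window scoring 5 is marked. Raising β to 2 would also mark the window scoring 9, which shows why β may need manual adjustment when abnormal data remains.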
[0069] As can be seen, the method provided in this application acquires historical operating data of the equipment at multiple measuring points, performs sliding windowing processing on the historical operating data, calculates the feature values of each measuring point within each sliding window and calculates the feature frequency score, and deletes the original data of the first β feature frequency score window intervals in ascending order, repeating the operation until there is no abnormal data in the state matrix. Through the above method, unsupervised anomaly detection is performed, requiring no or only a small amount of manual intervention, greatly liberating labor and significantly improving the efficiency of identification and removal; moreover, this method can comprehensively consider the distribution of equipment operating data, better avoiding the problem of excessive normal values being removed and abnormal values being retained due to setting a fixed removal threshold.
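The repetition described above can be sketched as a simple loop. How "no abnormal data" is decided is not specified here, so the sketch takes both the single removal pass and the stopping check as caller-supplied functions. `remove_once`, `has_abnormal`, and the `max_rounds` safety cap are hypothetical names introduced for this illustration, and the toy callables at the bottom are placeholders rather than the method of this application.

```python
def iterative_removal(data, remove_once, has_abnormal, max_rounds=10):
    """Repeat one removal pass until no abnormal data remains or the cap is hit."""
    rounds = 0
    while has_abnormal(data) and rounds < max_rounds:
        data = remove_once(data)
        rounds += 1
    return data, rounds


# Toy stand-ins: treat values above 100 as abnormal, drop the largest value per pass.
def toy_has_abnormal(values):
    return any(v > 100 for v in values)


def toy_remove_once(values):
    largest = max(values)
    return [v for v in values if v != largest]


cleaned, rounds = iterative_removal([1, 2, 500, 3, 900], toy_remove_once, toy_has_abnormal)
```

In practice `remove_once` would be one full pass of windowing, feature frequency scoring, and deletion of the first β intervals.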
[0070] Based on the above embodiments, this application provides a preferred embodiment which, after acquiring historical operating data of the equipment at multiple measuring points and before performing sliding windowing processing on the historical operating data and calculating the feature values of each measuring point within each window, further includes:
[0071] Normalize the historical operating data of each measuring point.
[0072] It should be noted that before performing sliding windowing processing on the historical operating data and calculating the feature values of each measuring point within each window, the historical operating data of each measuring point is normalized, and then the historical operating data after normalization is subjected to sliding windowing processing. This application embodiment only provides a preferred implementation method, and this application embodiment is not limited to the above processing method, and can be selected according to the actual situation.
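The normalization method is not specified above. As one common choice, the sketch below applies min-max scaling to each measuring point separately so that points with different units become comparable before windowing. Both the choice of min-max scaling and the function name are assumptions of this sketch.

```python
def min_max_normalize(series):
    """Scale one measuring point's series to the range [0, 1]."""
    low, high = min(series), max(series)
    if high == low:
        # A constant series carries no spread; map it to zeros to avoid division by zero.
        return [0.0 for _ in series]
    return [(x - low) / (high - low) for x in series]


voltage = [218.0, 220.0, 222.0]
normalized = min_max_normalize(voltage)
```

Z-score standardization would be an equally plausible choice. Note that min-max scaling is sensitive to extreme values, which matters here because the outliers have not yet been removed at this stage.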
[0073] As can be seen, the method provided in this application acquires historical operating data of the equipment at multiple measuring points, normalizes the historical operating data, and then performs sliding windowing processing. It calculates the feature values of each measuring point within each sliding window and calculates the feature frequency score. The original data within the window intervals of the first β feature frequency scores are deleted in ascending order. This process is repeated until there is no abnormal data in the state matrix. This method enables unsupervised anomaly detection, requiring little or no manual intervention, greatly reducing labor costs and significantly improving identification and removal efficiency. Furthermore, this method comprehensively considers the distribution of equipment operating data, effectively avoiding the problem of excessive normal values being removed and abnormal values being retained due to setting fixed removal thresholds.
[0074] Based on the above embodiments, this application provides a preferred embodiment, which further includes, after acquiring historical operating data of the device at multiple measuring points:
[0075] The historical operational data of each measuring point are quantitatively described using a correlation analysis algorithm;
[0076] Align the historical operating data of each measuring point on the time coordinate and form the original state matrix of the equipment;
[0077] The historical operation data includes the actual measurement point data, test data, and model calculation data of the equipment. In the original state matrix of the equipment, the horizontal axis is the time axis and the vertical axis is the corresponding selected measurement point.
[0078] It should be noted that after obtaining the historical operating data of the equipment, the historical operating data is processed to form the original state matrix of the equipment. The method for forming the original state matrix can be, but is not limited to, quantitatively describing the historical operating data of each measuring point using a correlation analysis algorithm, aligning the historical operating data of each measuring point on the time axis, and forming the original state matrix. The historical operating data can include, but is not limited to, actual measuring point data, test data, and model calculation data. Actual measuring point data can be the actual current, voltage, and motor speed values measured by corresponding sensors on the equipment. Test data is simulated data obtained by artificially pressurizing the equipment in its actual operating environment. Model calculation data can be the flow difference or pressure difference obtained using a certain model between multiple pipe inlets and outlets corresponding to the equipment. This embodiment only refers to the general data of the corresponding equipment; the specific data can be set according to the actual situation. In the original state matrix of the equipment, the horizontal axis is the time axis, and the vertical axis is the corresponding selected measuring point. This embodiment is not limited to the above-mentioned method of forming the original state matrix of the equipment and can be changed according to the actual situation.
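A minimal sketch of the time alignment step, assuming each measuring point is supplied as a mapping from timestamp to value and that only timestamps present for every point are kept. This intersection rule, the data layout, and the function name are assumptions of the sketch, and the correlation analysis step is omitted. Rows correspond to measuring points and columns to time, matching the axis convention described above.

```python
def build_state_matrix(readings):
    """Align measuring points on shared timestamps.

    readings maps a measuring point name to a {timestamp: value} mapping.
    Returns (point_names, timestamps, matrix), where matrix[i][j] is the value
    of point i at timestamp j.
    """
    points = sorted(readings)
    shared = set.intersection(*(set(readings[p]) for p in points))
    timestamps = sorted(shared)
    matrix = [[readings[p][t] for t in timestamps] for p in points]
    return points, timestamps, matrix


readings = {
    "current": {0: 1.0, 1: 1.1, 2: 1.2},
    "voltage": {1: 220.0, 2: 221.0, 3: 219.0},
}
points, timestamps, matrix = build_state_matrix(readings)
```

Interpolating missing samples instead of discarding unmatched timestamps would retain more data, at the cost of introducing estimated values into the matrix.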
[0079] As can be seen, the method provided in this application acquires historical operating data of the equipment from multiple measuring points to form an original equipment state matrix. The historical operating data in the original equipment state matrix is normalized, and then a sliding windowing process is applied to the original equipment state matrix. Feature values of each measuring point within each sliding window are calculated, and feature frequency scores are calculated. The original data within the first β feature frequency score window intervals are deleted in ascending order. This process is repeated until no abnormal data remains in the state matrix. This method enables unsupervised anomaly detection, requiring little or no manual intervention, significantly reducing labor costs and greatly improving identification and removal efficiency. Furthermore, this method comprehensively considers the distribution of equipment operating data, effectively avoiding the problem of excessive normal values being removed and abnormal values being retained due to setting fixed removal thresholds.
[0080] Based on the above embodiments, this application provides a preferred embodiment, which performs sliding windowing processing on historical operating data and calculates the feature values of each measuring point within each sliding window, including:
[0081] Obtain historical operation data corresponding to the measurement points within each sliding window;
[0082] Determine the corresponding feature data based on the historical operation data within each sliding window;
[0083] Pre-acquire the confidence parameters corresponding to each feature data;
[0084] Determine the confidence interval for each feature data based on the initial feature value and confidence parameter corresponding to each feature data;
[0085] The overlapping intervals between confidence intervals within each sliding window are determined based on the feature data and corresponding confidence intervals within each sliding window.
[0086] The initial feature values in each sliding window corresponding to the overlapping interval are set to be the same, and are the feature values of the corresponding measurement points.
[0087] It can be understood that the same processing is applied within each sliding window. Taking one sliding window as an example, the window contains measuring point data, and the feature data is determined from the historical operating data corresponding to that measuring point data. Note that, assuming a sliding window contains 100 historical operating data points, each feature data corresponds to a data processing method applied to those points, such as taking the mean or the variance: the mean obtained is one feature data, and the variance obtained is another. Feature data includes, but is not limited to, data mean, variance, energy value, effective value, peak value, peak-to-peak value, and kurtosis. In other words, the required feature data is determined based on the current historical operating data. In this embodiment, seven feature data can be determined from the current historical operating data: data mean, variance, energy value, effective value, peak value, peak-to-peak value, and kurtosis. The processing used to obtain each feature data can be the same as or different from existing processing methods; no limitation is made here. The corresponding feature value can be determined from each feature data.
[0088] Correspondingly, the confidence parameter for each feature data is obtained. Since the initial feature values corresponding to the feature data are the initial results of different data processing methods (e.g., the mean, the variance, and the peak value are each one feature data), the initial feature values obtained for the different feature data differ, and the confidence parameters set for them can be the same or different. As an example, each feature corresponds to its own confidence parameter, that is, the confidence parameters of the feature data differ from one another. The corresponding confidence interval is determined based on the initial feature value and confidence parameter corresponding to each feature data.
[0089] Taking the feature mean as an example, set a confidence parameter and determine the confidence interval of the feature value based on the confidence parameter;
[0090] The confidence interval is MC = [M - a, M + a], where MC represents the confidence interval, M represents the feature value, and a represents the confidence parameter.
[0091] Since each sliding window is in a sliding state, the data within each window may be the same or different. The initial feature values of the feature data corresponding to the historical running data of adjacent sliding windows have small differences. Therefore, adjacent sliding windows can determine the overlapping intervals of their confidence intervals based on their respective feature data and corresponding confidence intervals. Based on the overlapping intervals, the corresponding initial feature values can be updated to be the same. This determination process can be based on the fact that the initial feature value in the sliding window following the overlapping interval is the same as the initial feature value in the sliding window preceding the overlapping interval, thus pre-processing the feature values.
[0092] Taking two sliding windows as an example, each sliding window has 100 historical data points. The characteristic data determined by each 100 historical data points includes seven types of characteristic data: mean, variance, energy value, effective value, peak value, peak-to-peak value, and kurtosis. Taking the mean as an example, the initial characteristic values (average results) calculated based on the 100 historical data points in the first and second sliding windows are 5 and 6, respectively. The confidence parameters for the first and second sliding windows are both 1. Therefore, the first confidence interval of the first sliding window is [4,6], the second confidence interval of the second sliding window is [5,7], and the overlapping interval of the two confidence intervals is [5,6]. Based on the initial characteristic value of 5 and the overlapping interval in the first sliding window, the characteristic value (average) of the characteristic data (data average) of this measuring point is determined to be 5.
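The two-window example can be reproduced with the sketch below. It builds the interval [M - a, M + a] for each window and, when the interval of a window overlaps that of the preceding window, sets its feature value to the preceding window's value. Comparing against the preceding window's already adopted value (so that equal values can chain across several windows) and treating the intervals as closed are assumptions of this sketch, as are the function names.

```python
def confidence_interval(value, a):
    """Return the interval [value - a, value + a] as a (low, high) pair."""
    return (value - a, value + a)


def overlap(first, second):
    """Return the overlapping interval of two closed intervals, or None."""
    low = max(first[0], second[0])
    high = min(first[1], second[1])
    return (low, high) if low <= high else None


def merge_adjacent(initial_values, a):
    """Give a window the previous window's value when their intervals overlap."""
    merged = []
    for value in initial_values:
        if merged and overlap(confidence_interval(merged[-1], a),
                              confidence_interval(value, a)) is not None:
            merged.append(merged[-1])
        else:
            merged.append(value)
    return merged


shared = overlap(confidence_interval(5, 1), confidence_interval(6, 1))
feature_values = merge_adjacent([5, 6, 9], a=1)
```

The third window, with mean 9, has interval [8, 10], which does not overlap [4, 6], so it keeps its own value. Windows that end up sharing a value are then counted together when the feature frequency is computed.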
[0093] It should be noted that the embodiments of this application do not make specific limitations on the size of the confidence interval, nor do they make specific limitations on the value of the confidence parameter.
[0094] As can be seen, by setting confidence intervals, feature values whose confidence intervals overlap are judged to be the same, which greatly improves the efficiency of identification and removal. Moreover, the method takes the distribution of the equipment operating data into account, and so better avoids the problem, caused by a fixed removal threshold, of too many normal values being removed while abnormal values are retained.
[0095] Based on the above embodiments, this application provides a preferred embodiment in which the feature frequency score is calculated according to the feature frequency, including:
[0096] Calculate the feature frequency score according to the preset formula;
[0097] The preset formula is:
[0098]
[0099] Where $R$ is the feature frequency score, $n$ is the number of selected features, $M_i$ is the feature value of a certain feature, and $f(M_i)$ is the frequency of occurrence of the feature value $M_i$.
[0100] As can be seen, the method provided in this application calculates the feature frequency score from the number of features, the feature value of each feature, and the frequency of occurrence of that feature value. The feature frequency scores are then sorted in ascending order, the original data within the window intervals of the first β scores are deleted, and this operation is repeated until there is no abnormal data in the state matrix. The method thus performs unsupervised anomaly detection, requiring little or no manual intervention, which greatly reduces labor costs and significantly improves the efficiency of identification and removal. Furthermore, the method takes the distribution of the equipment operating data into account, and so avoids the problem, caused by a fixed removal threshold, of too many normal values being removed while abnormal values are retained.
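The removal step described above (sort the scores in ascending order, then delete the original data in the window intervals of the first β scores) can be sketched as follows. The preset formula is not reproduced in this text, so the sketch takes the scores as given input rather than computing them. The function name `remove_lowest_scoring_windows`, the representation of a window as a half-open index range, and the reading of β as a count of windows are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

Window = Tuple[int, int]  # assumed half-open index range [start, end)


def remove_lowest_scoring_windows(
    data: Sequence[float],
    windows: Sequence[Window],
    scores: Sequence[float],
    beta: int,
) -> Tuple[List[float], List[Window]]:
    """Delete the data covered by the beta windows with the lowest scores."""
    # Sort window indices by score in ascending order and take the first beta.
    order = sorted(range(len(windows)), key=lambda k: scores[k])
    rejected = [windows[k] for k in order[:beta]]
    # Collect every sample index that falls inside a rejected window interval.
    drop = set()
    for start, end in rejected:
        drop.update(range(start, end))
    kept = [x for i, x in enumerate(data) if i not in drop]
    return kept, rejected


samples = [1.0, 1.1, 0.9, 9.5, 9.7, 1.0, 1.2, 0.8]
windows = [(0, 2), (2, 4), (4, 6), (6, 8)]
scores = [0.9, 0.4, 0.3, 0.8]  # hypothetical feature frequency scores
kept, rejected = remove_lowest_scoring_windows(samples, windows, scores, beta=2)
```

Here the two lowest scores belong to the windows `(4, 6)` and `(2, 4)`, so the samples at indices 2 to 5 are dropped and the rest are kept. Overlapping windows would need no special handling, because indices are collected in a set before deletion.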
[0101] Based on the above embodiments, this application provides a preferred embodiment, which further includes, after determining the historical operating data to be removed based on the feature frequency:
[0102] Determine whether any abnormal data remains in the original state matrix of the equipment; if so, return to the step of performing sliding window processing on the historical operating data and calculating the feature value of each measuring point within each sliding window; if not, all abnormal values have been removed.
[0103] As can be seen, the method provided in this embodiment performs unsupervised anomaly detection by repeatedly judging and removing abnormal data. It requires little or no manual intervention, which greatly reduces labor and significantly improves the efficiency of identification and removal. Moreover, the method takes the distribution of the equipment operating data into account, and so better avoids the problem, caused by a fixed removal threshold, of too many normal values being removed while abnormal values are retained.
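The repeat-until-clean loop of paragraph [0102] can be sketched as below. This passage does not specify how the presence of abnormal data is judged, so `has_abnormal` is a hypothetical caller-supplied predicate, `one_pass` stands in for one round of windowing, scoring, and removal, and `max_rounds` is an added safety bound that is not part of the described method.

```python
from typing import Callable, List


def remove_until_clean(
    data: List[float],
    one_pass: Callable[[List[float]], List[float]],
    has_abnormal: Callable[[List[float]], bool],
    max_rounds: int = 100,
) -> List[float]:
    """Run a removal pass, then repeat while abnormal data is still reported."""
    for _ in range(max_rounds):
        data = one_pass(data)       # windowing + scoring + removal (stand-in)
        if not has_abnormal(data):  # judgment step after each removal
            break
    return data


# Toy stand-ins, for illustration only; they are not the patent's criteria.
def toy_has_abnormal(xs: List[float]) -> bool:
    return any(abs(x) > 5 for x in xs)


def toy_one_pass(xs: List[float]) -> List[float]:
    idx = xs.index(max(xs, key=abs))  # drop the single largest-magnitude sample
    return xs[:idx] + xs[idx + 1:]


cleaned = remove_until_clean([1.0, 9.0, 1.2, -8.0, 0.9], toy_one_pass, toy_has_abnormal)
```

In this toy run the first pass drops 9.0, the predicate still reports abnormal data because of -8.0, and the second pass drops it, after which the loop stops.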
[0104] From the perspective of functional modules, this application also provides an embodiment of an equipment outlier removal system. Figure 2 is a structural diagram of the equipment outlier removal system provided by this application. As shown in Figure 2, the system includes:
[0105] Acquisition module 10, used to acquire historical operating data of the equipment at multiple measuring points;
[0106] Processing module 11, used to perform sliding window processing on the historical operating data and calculate the feature value of each measuring point within each sliding window;
[0107] Calculation module 12, used to obtain the feature frequency corresponding to each feature value;
[0108] Determination module 13, used to determine, based on the feature frequency, the historical operating data that needs to be removed.
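As a rough illustration of what the processing module and the calculation module do, the sketch below slides a window over the data of one measuring point, computes the seven features named in paragraph [0092] for each window, and counts how often each feature value occurs. Everything here is an assumption made for illustration: the function names, the non-overlapping step, the textbook definitions chosen for each feature (population variance, energy as the sum of squares, effective value as the root mean square, kurtosis as the fourth standardized moment), and the reading of "feature frequency" as a raw occurrence count.

```python
import math
from collections import Counter
from typing import Dict, List, Sequence


def window_features(x: Sequence[float]) -> Dict[str, float]:
    """Seven features for one window, using assumed textbook definitions."""
    n = len(x)
    mean = sum(x) / n
    variance = sum((v - mean) ** 2 for v in x) / n  # population variance
    energy = sum(v * v for v in x)
    fourth = sum((v - mean) ** 4 for v in x) / n
    return {
        "mean": mean,
        "variance": variance,
        "energy": energy,
        "effective_value": math.sqrt(energy / n),  # root mean square
        "peak": max(abs(v) for v in x),
        "peak_to_peak": max(x) - min(x),
        # Defined as 0.0 for a constant window to avoid dividing by zero.
        "kurtosis": fourth / variance ** 2 if variance > 0 else 0.0,
    }


def sliding_windows(x: Sequence[float], size: int, step: int) -> List[Sequence[float]]:
    """Split x into windows of `size` samples, advancing by `step` samples."""
    return [x[s:s + size] for s in range(0, len(x) - size + 1, step)]


def feature_frequency(values: Sequence[float]) -> Dict[float, int]:
    """Count how many windows share each feature value."""
    return dict(Counter(values))


signal = [2.0, 2.0, 2.0, 2.0, 8.0, 10.0, 2.0, 2.0]
feats = [window_features(w) for w in sliding_windows(signal, size=2, step=2)]
means = [f["mean"] for f in feats]
freq = feature_frequency(means)
```

In this toy signal three windows share the mean 2.0 and one window has the mean 9.0, so the rare value stands out by its low count. The sketch compares feature values by exact equality; in the method described above, values are first unified through overlapping confidence intervals before their frequency is counted.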
[0109] Since the embodiments of the system part correspond to the embodiments of the method part, please refer to the description of the embodiments of the method part for the embodiments of the system part, and they will not be repeated here.
[0110] The equipment outlier removal system provided in this embodiment corresponds to the equipment outlier removal method described above, and therefore has the same beneficial effects as the method described above.
[0111] Figure 3 is a structural diagram of an equipment outlier removal apparatus provided in another embodiment of this application. As shown in Figure 3, the equipment outlier removal apparatus includes: a memory 20 for storing a computer program;
[0112] and a processor 21 for implementing the steps of the equipment outlier removal method described in the above embodiments when executing the computer program.
[0113] The device outlier removal device provided in this embodiment may include, but is not limited to, smartphones, tablets, laptops, or desktop computers.
[0114] The processor 21 may include one or more processing cores, such as a quad-core processor or an octa-core processor. The processor 21 may be implemented using at least one of the following hardware forms: Digital Signal Processor (DSP), Field-Programmable Gate Array (FPGA), or Programmable Logic Array (PLA). The processor 21 may also include a main processor and a coprocessor. The main processor, also known as the Central Processing Unit (CPU), processes data in the awake state; the coprocessor is a low-power processor that processes data in the standby state. In some embodiments, the processor 21 may integrate a Graphics Processing Unit (GPU), which is responsible for rendering and drawing the content to be displayed on the screen. In some embodiments, the processor 21 may also include an Artificial Intelligence (AI) processor, which handles computational operations related to machine learning.
[0115] The memory 20 may include one or more computer-readable storage media, which may be non-transitory. The memory 20 may also include high-speed random access memory and non-volatile memory, such as one or more disk storage devices or flash memory devices. In this embodiment, the memory 20 is used to store at least the following computer program 201, which, after being loaded and executed by the processor 21, is capable of implementing the relevant steps of the device outlier removal method disclosed in any of the foregoing embodiments. In addition, the resources stored in the memory 20 may also include an operating system 202 and data 203, and the storage method may be temporary or permanent storage. The operating system 202 may include Windows, Unix, Linux, etc. The data 203 may include, but is not limited to, the data included in the device outlier removal method.
[0116] In some embodiments, the device outlier removal device may further include a display screen 22, an input / output interface 23, a communication interface 24, a power supply 25, and a communication bus 26.
[0117] Those skilled in the art will understand that the structure shown in Figure 3 does not constitute a limitation on the equipment outlier removal apparatus, which may include more or fewer components than shown.
[0118] The equipment outlier removal apparatus provided in this application includes a memory and a processor. When the processor executes the program stored in the memory, it implements the equipment outlier removal method described above.
[0119] Finally, this application also provides an embodiment corresponding to a computer-readable storage medium. The computer-readable storage medium stores a computer program, which, when executed by a processor, implements the steps described in the above method embodiments.
[0120] It is understood that if the methods in the above embodiments are implemented as software functional units and sold or used as independent products, they can be stored in a computer-readable storage medium. Based on this understanding, the technical solution of this application, in essence, or the part that contributes to the prior art, or all or part of the technical solution, can be embodied in the form of a software product. This computer software product is stored in a storage medium and executes all or part of the steps of the methods in the various embodiments of this application. The aforementioned storage medium includes various media capable of storing program code, such as USB flash drives, portable hard drives, read-only memory (ROM), random access memory (RAM), magnetic disks, or optical disks.
[0121] The foregoing has provided a detailed description of a method, system, apparatus, and medium for removing outliers from devices provided in this application. The various embodiments in the specification are described in a progressive manner, with each embodiment focusing on its differences from other embodiments. Similar or identical parts between embodiments can be referred to interchangeably. For the apparatus disclosed in the embodiments, since it corresponds to the method disclosed in the embodiments, the description is relatively simple; relevant parts can be referred to in the method section. It should be noted that those skilled in the art can make various improvements and modifications to this application without departing from the principles of this application, and these improvements and modifications also fall within the protection scope of the claims of this application.
[0122] It should also be noted that, in this specification, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any such actual relationship or order between these entities or operations. Furthermore, the terms "comprising," "including," or any other variations thereof are intended to cover non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or apparatus. Without further limitations, an element defined by the phrase "comprising one..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that includes the element.
Claims
1. An equipment outlier removal method, characterized in that the method includes: acquiring historical operating data of the equipment at multiple measuring points; performing sliding window processing on the historical operating data and calculating the feature value of each measuring point within each sliding window; obtaining the feature frequency corresponding to each feature value; and determining, based on the feature frequency, the historical operating data that needs to be removed; wherein the step of performing sliding window processing on the historical operating data and calculating the feature value of each measuring point within each sliding window includes: obtaining the historical operating data corresponding to the measuring points within each sliding window; determining the corresponding feature data based on the historical operating data within each sliding window; obtaining in advance the confidence parameter corresponding to each feature data; determining the confidence interval corresponding to each feature data based on the initial feature value and the confidence parameter corresponding to that feature data; determining the overlapping intervals between the confidence intervals of the sliding windows based on the feature data and the corresponding confidence intervals in each sliding window, wherein adjacent sliding windows determine the overlapping interval of their confidence intervals based on their respective feature data and the corresponding confidence intervals; and setting the initial feature values within the sliding windows corresponding to the overlapping interval to be the same, these being the feature values of the corresponding measuring points.
2. The equipment outlier removal method according to claim 1, characterized in that determining, based on the feature frequency, the historical operating data that needs to be removed includes: calculating a feature frequency score based on the feature frequency; sorting the feature frequency scores in ascending order and marking the window intervals of the first β feature frequency scores as removal intervals; and deleting the historical operating data in the removal intervals.
3. The equipment outlier removal method according to claim 2, characterized in that, after acquiring the historical operating data of the equipment at multiple measuring points and before performing sliding window processing on the historical operating data and calculating the feature value of each measuring point within each sliding window, the method further includes: normalizing the historical operating data of each measuring point.
4. The equipment outlier removal method according to claim 3, characterized in that, after acquiring the historical operating data of the equipment at multiple measuring points, the method further includes: quantitatively describing the historical operating data of each measuring point using a correlation analysis algorithm; and aligning the historical operating data of each measuring point on the time coordinate to form the original state matrix of the equipment; wherein the historical operating data includes actual measuring point data, test data, and model calculation data of the equipment, the horizontal axis of the original state matrix of the equipment is the time coordinate, and the vertical axis is the corresponding selected measuring points.
5. The equipment outlier removal method according to claim 4, characterized in that calculating the feature frequency score based on the feature frequency includes: calculating the feature frequency score according to a preset formula; the preset formula is as follows: ; wherein $R$ is the feature frequency score, $n$ is the number of selected features, $M_i$ is the feature value of a certain feature, and $f(M_i)$ is the frequency of occurrence of the feature value $M_i$.
6. The equipment outlier removal method according to any one of claims 1 to 5, characterized in that, after determining, based on the feature frequency, the historical operating data that needs to be removed, the method further includes: determining whether any abnormal data exists in the original state matrix of the equipment; and if so, returning to the step of performing sliding window processing on the historical operating data and calculating the feature value of each measuring point within each sliding window.
7. An equipment outlier removal system, characterized in that the system includes: an acquisition module, used to acquire historical operating data of the equipment at multiple measuring points; a processing module, used to perform sliding window processing on the historical operating data and calculate the feature value of each measuring point within each sliding window; a calculation module, used to obtain the feature frequency corresponding to each feature value; and a determination module, used to determine, based on the feature frequency, the historical operating data that needs to be removed; wherein performing sliding window processing on the historical operating data and calculating the feature value of each measuring point within each sliding window includes: obtaining the historical operating data corresponding to the measuring points within each sliding window; determining the corresponding feature data based on the historical operating data within each sliding window; obtaining in advance the confidence parameter corresponding to each feature data; determining the confidence interval corresponding to each feature data based on the initial feature value and the confidence parameter corresponding to that feature data; determining the overlapping intervals between the confidence intervals of the sliding windows based on the feature data and the corresponding confidence intervals in each sliding window, wherein adjacent sliding windows determine the overlapping interval of their confidence intervals based on their respective feature data and the corresponding confidence intervals; and setting the initial feature values within the sliding windows corresponding to the overlapping interval to be the same, these being the feature values of the corresponding measuring points.
8. An equipment outlier removal apparatus, characterized in that it includes: a memory for storing a computer program; and a processor configured to implement the steps of the equipment outlier removal method according to any one of claims 1 to 6 when executing the computer program.
9. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program that, when executed by a processor, implements the steps of the equipment outlier removal method according to any one of claims 1 to 6.