Optimization of time series anomaly detection
By decomposing time series data and generating confidence intervals, outlier data points are identified and processed, thus solving the problems of error and data drift in time series prediction models and achieving more accurate and reliable predictions.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- INTERNATIONAL BUSINESS MACHINES CORPORATION
- Filing Date
- 2024-11-26
- Publication Date
- 2026-06-19
AI Technical Summary
Existing time series forecasting models are prone to errors and false positive predictions when faced with outlier data points, and they also struggle to effectively identify and handle data drift issues.
By decomposing time series data into residuals, trend, and seasonal components, confidence intervals are generated, outlier data points are identified and labeled, the model is adjusted to prevent error propagation, and a semi-supervised anomaly detection model is used for training and analysis.
It improves the accuracy and reliability of time series forecasting, reduces false positive predictions, prevents errors caused by data drift, and enhances the robustness of the model.
[Figure: abstract drawing, CN122249816A]
Abstract
Description
Background Technology
[0001] This invention relates to artificial intelligence and machine learning, and more specifically, to anomaly detection in time series data.
[0002] A time series dataset is a collection of points with values arranged chronologically. The data points consist of an independent variable that is a time variable and a dependent variable that is a measured phenomenon. Sensors (e.g., the Internet of Things) or similar measuring devices typically generate data sequences or time series data. Time series data can also originate from financial markets, crop or agricultural yields, sunlight exposure, electricity usage, internet traffic, and even wildlife populations. When sensors operate correctly, variations in the data points are generally within expected ranges.
[0003] In its simplest terms, time series forecasting is a task in which an initial set of elements in a sequence is given, and a number of future elements in the sequence are predicted based on that initial set. Time series forecasting occurs when scientific forecasts are made based on historical time-stamped data. It involves building models through historical analysis and using those models to make observations and drive strategic decisions for the future.
Summary of the Invention
[0004] According to embodiments of the present invention, a computer-implemented method for anomaly adjustment in a time series forecasting model is disclosed. The computer-implemented method includes decomposing time series data into residual components, trend components, and seasonal components. The residual and trend components are segmented based on time values. A first data point of the time series data is classified as an anomaly. It is determined whether the first data point is within an outlier confidence interval. In response to determining that the first data point is within the outlier confidence interval, the first data point is marked as a non-outlier. It is determined whether the first data point is within a horizontally shifted confidence interval of the same residual segment as the first data point. In response to the number of data points marked as non-outliers exceeding a threshold, the average difference between a first trend segment corresponding to the residual segment containing the first data point and a second trend segment immediately preceding it is calculated. It is determined whether the average difference between the first and second trend segments is within a horizontally shifted confidence interval. In response to the average difference being within the horizontally shifted confidence interval, the variance difference between the first and second trend segments is calculated. It is determined whether the variance difference is within a variance confidence interval. In response to determining that the variance difference is within the variance confidence interval, the outlier classification of the first data point is removed. Embodiments of the present invention have the advantage of preventing outlier time series data points from causing errors in the prediction of future data points due to data drift or other machine learning data problems.
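By way of illustration only, the sequence of checks described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: the names `Interval` and `should_clear_anomaly` are hypothetical, the statistic of the data point that is compared with the outlier confidence interval is passed in as `point_statistic` because the text does not fix it, population variance is assumed, and the check of the data point against the horizontally shifted confidence interval of its residual segment is omitted because the text does not state what follows from it.

```python
from dataclasses import dataclass
from statistics import fmean, pvariance
from typing import Sequence


@dataclass(frozen=True)
class Interval:
    """Closed interval [low, high]; a hypothetical helper for this sketch."""
    low: float
    high: float

    def contains(self, value: float) -> bool:
        return self.low <= value <= self.high


def should_clear_anomaly(
    point_statistic: float,
    non_outlier_count: int,
    count_threshold: int,
    current_trend_segment: Sequence[float],
    previous_trend_segment: Sequence[float],
    outlier_ci: Interval,
    shift_ci: Interval,
    variance_ci: Interval,
) -> bool:
    """Return True if the anomaly classification of a flagged point can be removed.

    non_outlier_count is assumed to be the number of points already marked
    as non-outliers, including the point under test.
    """
    # Check 1: the flagged point must lie within the outlier confidence interval.
    if not outlier_ci.contains(point_statistic):
        return False
    # Check 2: enough points must have been marked as non-outliers.
    if non_outlier_count <= count_threshold:
        return False
    # Check 3: the shift in mean between adjacent trend segments must be
    # within the horizontally shifted confidence interval.
    mean_diff = fmean(current_trend_segment) - fmean(previous_trend_segment)
    if not shift_ci.contains(mean_diff):
        return False
    # Check 4: the change in variance between the same two trend segments
    # must be within the variance confidence interval.
    var_diff = pvariance(current_trend_segment) - pvariance(previous_trend_segment)
    return variance_ci.contains(var_diff)
```

In this sketch the checks are ordered so that the cheaper tests run first and the function returns as soon as one of them fails, which mirrors the "in response to" chaining of the steps described above.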
[0005] According to a preferred embodiment, the computer-implemented method for anomaly adjustment in a time series forecasting model further includes predicting one or more time series data points of the time series data based at least in part on a first data point, wherein the anomaly classification of the first data point has been removed.
[0006] According to a preferred embodiment, the computer-implemented method for anomaly adjustment in a time series forecasting model further includes training a semi-supervised anomaly detection model, wherein the anomaly detection model explores features based on historical time series training data.
[0007] According to a preferred embodiment, the computer-implemented method for anomaly adjustment in a time series forecasting model further includes decomposing historical time series data into historical residual components. The historical residual components are segmented. The mean of each historical residual component segment is calculated. The difference between the means of each adjacent historical residual component segment is obtained. The mean difference is calculated based on the difference between the means of each adjacent historical residual component segment. The standard deviation is calculated based on the mean difference. Outlier confidence intervals are generated, wherein the maximum value of the outlier confidence interval is the mean difference plus the standard deviation multiplied by a threshold, and the minimum value of the outlier confidence interval is the mean difference minus the standard deviation multiplied by the threshold.
[0008] According to a preferred embodiment, the computer-implemented method for anomaly adjustment in a time series forecasting model further includes decomposing historical time series data into historical trend components. The historical trend components are segmented. The average value of each historical trend component segment is calculated. The difference between the average values of adjacent historical trend component segments is obtained. An average difference is calculated based on the difference between the average values of adjacent historical trend component segments. A standard deviation is calculated based on the average difference. A horizontal shift interval is generated, wherein the maximum value of the horizontal shift interval is the average difference plus the standard deviation multiplied by a threshold, and the minimum value of the horizontal shift interval is the average difference minus the standard deviation multiplied by the threshold.
[0009] According to a preferred embodiment, the computer-implemented method for anomaly adjustment in a time series forecasting model further includes: decomposing historical time series data into historical residual components; segmenting the historical residual components; calculating the variance of each historical residual component segment; obtaining the difference between the variances of adjacent historical residual component segments; calculating the average variance difference based on the differences between the variances of adjacent historical residual component segments; calculating the standard deviation based on the average variance difference; and generating variance confidence intervals, wherein the maximum value of the variance confidence interval is the average variance difference plus the standard deviation multiplied by a threshold, and the minimum value of the variance confidence interval is the average variance difference minus the standard deviation multiplied by the threshold.
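The three intervals described in the preceding embodiments share one construction: segment a component, compute a statistic per segment, difference adjacent segments, and place a band around the average difference. A minimal Python sketch of that construction might look as follows. The function names `split_into_segments` and `adjacent_difference_interval` are hypothetical, fixed-length segments are assumed (the text only says the components are segmented), and the population standard deviation of the differences is assumed because the text does not say whether the sample or population form is used.

```python
from statistics import fmean, pstdev, pvariance
from typing import Callable, List, Sequence, Tuple


def split_into_segments(values: Sequence[float], segment_length: int) -> List[Sequence[float]]:
    """Split values into consecutive, non-overlapping segments of equal length.

    A trailing remainder shorter than segment_length is dropped (an assumption).
    """
    usable = len(values) - len(values) % segment_length
    return [values[i:i + segment_length] for i in range(0, usable, segment_length)]


def adjacent_difference_interval(
    values: Sequence[float],
    segment_length: int,
    threshold: float,
    statistic: Callable[[Sequence[float]], float] = fmean,
) -> Tuple[float, float]:
    """Return (minimum, maximum) of the interval built from adjacent-segment differences.

    statistic=fmean gives the mean-based intervals; statistic=pvariance gives
    the variance-based interval.
    """
    segments = split_into_segments(values, segment_length)
    per_segment = [statistic(segment) for segment in segments]
    # Difference between each segment's statistic and that of the segment before it.
    diffs = [later - earlier for earlier, later in zip(per_segment, per_segment[1:])]
    if len(diffs) < 2:
        raise ValueError("at least three segments are needed to estimate a spread")
    mean_diff = fmean(diffs)
    spread = pstdev(diffs)  # population standard deviation, assumed
    return (mean_diff - threshold * spread, mean_diff + threshold * spread)
```

For example, `adjacent_difference_interval(residual, 24, 0.5)` would build a mean-based interval from hypothetical hourly residuals segmented by day, and passing `statistic=pvariance` would build the variance-based interval from the same segments.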
[0010] According to a preferred embodiment, in the computer-implemented method for anomaly adjustment in a time series forecasting model, the time series data is associated with power generation.
[0011] According to another embodiment of the present invention, a computer system for anomaly adjustment in a time series forecasting model may be disclosed. The computer system may include a processor, a computer-readable storage device, and a computer-readable tangible storage device and program instructions stored on the computer-readable storage device for execution by the processor via the computer-readable storage device, wherein the computer system is configured to perform a method. The method may include decomposing time series data into residual components, trend components, and seasonal components. The residual components and trend components are segmented based on time values. A first data point of the time series data is classified as an anomaly. It is determined whether the first data point is within an outlier confidence interval. In response to determining that the first data point is within an outlier confidence interval, the first data point is marked as a non-outlier. It is determined whether the first data point is within a horizontally shifted confidence interval of the same residual segment as the first data point. In response to the number of non-outlier data points marked as non-outliers exceeding a threshold, the average difference between a first trend segment corresponding to the residual segment containing the first data point and a second trend segment immediately preceding it is calculated. It is determined whether the average difference between the first trend segment and the second trend segment is within a horizontally shifted confidence interval. In response to the mean difference being within the horizontal shift confidence interval, the variance difference between the first and second trend segments is calculated. It is then determined whether the variance difference is within the variance confidence interval. 
In response to determining that the variance difference is within the variance confidence interval, the outlier classification of the first data point is removed. Embodiments of the present invention can have the advantage of preventing outlier time series data points from causing errors in the prediction of future data points due to data drift or other machine learning data problems.
[0012] According to another embodiment of the present invention, a computer program product for anomaly adjustment in a time series forecasting model may be disclosed. The computer program product may include a computer-readable storage device having program instructions embodied therein, wherein execution of the program instructions by a computer processor causes the computing device to perform a method. The method may include decomposing time series data into residual components, trend components, and seasonal components. The residual components and trend components are segmented based on time values. A first data point of the time series data is classified as an anomaly. Whether the first data point is within an outlier confidence interval is determined. In response to determining that the first data point is within an outlier confidence interval, the first data point is marked as a non-outlier. Whether the first data point is within a horizontally shifted confidence interval of the same residual segment as the first data point is determined. In response to the number of non-outlier data points marked as non-outliers exceeding a threshold, the average difference between a first trend segment corresponding to the residual segment containing the first data point and a second trend segment immediately preceding it is calculated. Whether the average difference between the first trend segment and the second trend segment is within a horizontally shifted confidence interval is determined. In response to the mean difference being within the horizontal shift confidence interval, the variance difference between the first trend segment and the second trend segment is calculated. It is then determined whether the variance difference is within the variance confidence interval. In response to determining that the variance difference is within the variance confidence interval, the outlier classification of the first data point is removed. 
Embodiments of the present invention can have the advantage of preventing outlier time series data points from causing errors in the prediction of future data points due to data drift or other machine learning data problems.
Attached Figure Description
[0013] Preferred embodiments of the invention will now be described by way of example only and with reference to the following figures:
[0014] Figure 1 is a block diagram depicting an exemplary computing environment according to an embodiment of the present invention.
[0015] Figure 2A is a block diagram depicting a time series anomaly detection system 210 according to an embodiment of the present invention.
[0016] Figure 2B is a block diagram depicting a time series anomaly detection engine 200 according to an embodiment of the present invention.
[0017] Figure 3 is a block diagram depicting a method 300 for time series anomaly detection according to an embodiment of the present invention.
Detailed Implementation
[0018] In time series data prediction models, anomaly detection of input data points is a crucial activity. Training a time series prediction model can be a semi-supervised task. A time series prediction model can quickly determine whether a new input is anomalous based on previous historical data and the time corresponding to the new input. However, upon receiving an anomalous data point, the entire model is then adjusted based on that anomalous data point. Therefore, based on the anomalous data point, future data points predicted by the model are then more likely to be anomalous, triggering false positives until more data points are received to correct the initial anomalous data point. Embodiments of the present invention recognize the advantages of time series prediction models that determine whether received data points are anomalous. Furthermore, in time series data prediction, immediate prediction in response to incoming data points is often not necessary, thus allowing for the determination of whether predictions based on data points are likely to trigger a cascading series of false positives.
[0019] In this embodiment, the time series model predictions can be adjusted to provide more accurate and reliable results that are robust to anomalies or outliers. This embodiment can train a semi-supervised anomaly detection model using historical time series data and analyze the training data to extract features by decomposing the training data into trend, seasonal, and residual components. These components can be analyzed to generate one or more confidence intervals for different anomalous data points based on the detected characteristics of the time series data. When the time series model receives anomalous time series data, the generated predictions can be compared with the confidence intervals to determine whether the generated predictions are false positives.
[0020] In this embodiment, time series decomposition can obtain trend components, seasonal components, and residual components. In time series decomposition, a time series X_t of length n can be decomposed in the following way:
[0021]
[0022] Here, the decomposition consists of a trend-cycle component, a seasonal component, and a residual component.
[0023] The trend component can be expressed as a moving average of X_t based on a moving window of size l, centered on time t. The method for generating the trend component depends on whether l is even or odd. If l is even, the trend component can be generated as follows:
[0024]
[0025] If l is odd, the trend components can be generated as follows:
[0026]
[0027] where ⌊x⌋ denotes the largest integer not greater than x. In the embodiment, the window size l can be specified. If the window size is not specified, a default rule can be used: if the seasonal period s is greater than 1, then l = s; if s = 1, then l = 3 for n ≤ 10 and l = 5 for n > 10.
[0028] In the embodiment, the seasonal irregular component can be obtained in the following manner:
[0029]
[0030] In addition, seasonal components can be obtained by averaging the seasonal irregularities of each season:
[0031]
[0032] where j is a non-negative integer and t and k are congruent modulo s, meaning that t and k have the same remainder after being divided by s. The adjusted seasonal component is then:
[0033]
[0034] And seasonal factors are defined as: and .
[0035] In the embodiment, if s = 1 and The residual components can be obtained through the following methods:
[0036]
[0037] However, if s = 1, The residual components can then be obtained as follows:
[0038]
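As a rough illustration of the decomposition steps described above, the following Python sketch performs a classical additive decomposition with a centered moving average. It is a sketch under stated assumptions rather than the formulas of this disclosure: the additive form X_t = trend + seasonal + residual is assumed, the even-window case uses the common convention of half weights on the two end points, the moving window is set equal to the seasonal period, only seasonal periods greater than 1 are handled, and the function names are hypothetical.

```python
from statistics import fmean
from typing import List, Optional, Sequence, Tuple


def centered_moving_average(x: Sequence[float], window: int) -> List[Optional[float]]:
    """Centered moving average of x; None where the window does not fit."""
    n = len(x)
    half = window // 2
    trend: List[Optional[float]] = [None] * n
    for t in range(half, n - half):
        if window % 2 == 1:
            # Odd window: plain average of the window centered on t.
            trend[t] = fmean(x[t - half:t + half + 1])
        else:
            # Even window: half weight on the two end points so the
            # average stays centered on t.
            inner = sum(x[t - half + 1:t + half])
            trend[t] = (0.5 * x[t - half] + inner + 0.5 * x[t + half]) / window
    return trend


def decompose_additive(
    x: Sequence[float], period: int
) -> Tuple[List[Optional[float]], List[float], List[Optional[float]]]:
    """Return (trend, seasonal, residual) for an additive model with period > 1."""
    if period < 2 or len(x) < 2 * period:
        raise ValueError("need period >= 2 and at least two full periods of data")
    trend = centered_moving_average(x, period)
    # Seasonal-irregular part: the series with the trend removed.
    detrended = [xi - ti if ti is not None else None for xi, ti in zip(x, trend)]
    # Average the seasonal-irregular values that share the same position in the period.
    raw = [
        fmean([d for d in detrended[k::period] if d is not None])
        for k in range(period)
    ]
    # Adjust the seasonal averages so that they sum to zero over one period.
    offset = fmean(raw)
    seasonal = [raw[t % period] - offset for t in range(len(x))]
    residual = [
        xi - ti - si if ti is not None else None
        for xi, ti, si in zip(x, trend, seasonal)
    ]
    return trend, seasonal, residual
```

The trend and residual are left undefined (`None`) at the ends of the series where a full centered window is not available; a production implementation would need an explicit policy for those points.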
[0039] In one embodiment, the trend component can be used to generate horizontally shifted confidence intervals. The trend-cycle component can be divided into multiple segments, each corresponding to one or more data points within a given time series. The average value of the data points within each segment can be obtained. The difference between the average value of each segment and the average value of the adjacent segment can then be obtained. Using these differences, the average difference and the standard deviation of the differences can be calculated. Confidence intervals can be generated based on the average difference and the standard deviation. In one embodiment, the average difference can be obtained as follows:
[0040]
[0041] In the embodiment, the standard deviation of the difference can be obtained via the following:
[0042]
[0043] Using the standard deviation, confidence intervals can be determined, where the interval extends from the average difference minus C times the standard deviation to the average difference plus C times the standard deviation, and C is a threshold (e.g., 0.1 to 0.5).
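For orientation, the horizontally shifted confidence interval described above can be written out as follows. The notation is an assumption introduced here for illustration (K segments, segment means m_i, differences d_i), and the population form of the standard deviation is assumed; the original formulas of this disclosure may use different symbols or normalization.

```latex
% Assumed notation: m_i is the mean of trend segment i, for i = 1, ..., K.
\begin{align*}
d_i &= m_{i+1} - m_i, \qquad i = 1, \dots, K-1 \\
\bar{d} &= \frac{1}{K-1} \sum_{i=1}^{K-1} d_i \\
\sigma_d &= \sqrt{\frac{1}{K-1} \sum_{i=1}^{K-1} \left( d_i - \bar{d} \right)^2} \\
\text{interval} &= \left[\, \bar{d} - C\,\sigma_d,\; \bar{d} + C\,\sigma_d \,\right]
\end{align*}
```

With a threshold C between 0.1 and 0.5 the band is narrower than one standard deviation on either side of the average difference, so only shifts close to the historically typical shift fall inside it.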
[0044] Various aspects of this disclosure are described by narrative text, flowcharts, block diagrams of computer systems, and / or block diagrams of machine logic included in embodiments of a computer program product (CPP). Regarding any flowchart, depending on the technology involved, operations may be performed in a different order than that shown in a given flowchart. For example, again according to the technology involved, two operations shown in consecutive flowchart blocks may be performed in reverse order, as a single integrated step, simultaneously, or in a manner that at least partially overlaps in time.
[0045] Computer Program Product Embodiment (“CPP Embodiment” or “CPP”) is a term used in this disclosure to describe any collection of one or more storage media (also referred to as “media”) collectively included in a collection of one or more storage devices, the collection of one or more storage devices collectively including machine-readable code corresponding to instructions and / or data for performing computer operations specified in a given CPP claim. A “storage device” is any tangible device capable of holding and storing instructions used by a computer processor. Without limitation, a computer-readable storage medium can be an electronic storage medium, a magnetic storage medium, an optical storage medium, an electromagnetic storage medium, a semiconductor storage medium, a mechanical storage medium, or any suitable combination of the foregoing. Some known types of storage devices that include these media include: magnetic disks, hard disks, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), static random access memory (SRAM), compact disc read-only memory (CD-ROM), digital versatile disc (DVD), memory sticks, floppy disks, mechanical encoding devices (such as punch cards or pits / platforms formed in the main surface of the disk), or any suitable combination of the foregoing. As used in this disclosure, computer-readable storage media should not be construed as storing transient signals themselves, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through waveguides, optical pulses through fiber optic cables, electrical signals transmitted through wires, and / or other transmission media. 
As those skilled in the art will understand, data is typically moved at certain incidental points in time during normal operation of the storage device, such as during access, defragmentation, or garbage collection; however, this does not render the storage device transient, as the data is not transient when it is stored.
[0046] Referring now to Figure 1, a computing environment 100 is depicted. The computing environment 100 is an example of an environment for executing at least some of the computer code involved in performing the methods of the present invention, such as a time series anomaly detection engine 200. In addition to the time series anomaly detection engine 200, the computing environment 100 includes, for example, a computer 101, a wide area network (WAN) 102, an end-user device (EUD) 103, a remote server 104, a public cloud 105, and a private cloud 106. In this embodiment, the computer 101 includes a processor group 110 (including processing circuitry 120 and a cache 121), a communication structure 111, volatile memory 112, persistent storage device 113 (including an operating system 122 and the time series anomaly detection engine 200, as described above), a peripheral device group 114 (including a user interface (UI) device group 123, a storage device 124, and an Internet of Things (IoT) sensor group 125), and a network module 115. The remote server 104 includes a remote database 130. The public cloud 105 includes a gateway 140, a cloud coordination module 141, a host physical machine group 142, a virtual machine group 143, and a container group 144.
[0047] Computer 101 can take the form of a desktop computer, laptop computer, tablet computer, smartphone, smartwatch or other wearable computer, mainframe computer, quantum computer, or any other form of computer or mobile device now known or to be developed in the future capable of running programs, accessing networks, or querying databases such as remote database 130. As is well known in the field of computer technology, and depending on the technology, the performance of a computer-implemented method can be distributed across multiple computers and / or multiple locations. On the other hand, in this presentation of computing environment 100, the detailed discussion focuses on a single computer, specifically computer 101, to keep the presentation as simple as possible. Computer 101 can reside in the cloud, even though it is not shown in a cloud in Figure 1. On the other hand, computer 101 is not required to be in the cloud except to any extent that may be affirmatively indicated.
[0048] Processor group 110 includes one or more computer processors of any type now known or to be developed in the future. Processing circuitry 120 may be distributed across multiple packages, such as multiple cooperating integrated circuit chips. Processing circuitry 120 may implement multiple processor threads and / or multiple processor cores. Cache 121 is memory located within the processor chip package and is typically used for data or code that should be readily accessible by the threads or cores running on processor group 110. Cache memory is typically organized into multiple levels based on its relative proximity to the processing circuitry. Alternatively, some or all of the cache in the processor group may be located “off-chip.” In some computing environments, processor group 110 may be designed to work with qubits and perform quantum computing.
[0049] Computer-readable program instructions are typically loaded onto computer 101 to cause the processor group 110 of computer 101 to perform a series of operational steps to implement a computer-implemented method, such that the instructions thus executed instantiate the method specified in the flowchart and / or the narrative description of the computer-implemented method included in this document (collectively, the “method of the invention”). These computer-readable program instructions are stored in various types of computer-readable storage media, such as cache 121 and other storage media discussed below. The processor group 110 accesses the program instructions and associated data to control and direct the execution of the method of the invention. In computing environment 100, at least some of the instructions for performing the method of the invention may be stored in the time series anomaly detection engine 200 in persistent storage device 113.
[0050] Communication structure 111 is a signal transmission path that allows various components of computer 101 to communicate with each other. Typically, this structure consists of switches and conductive paths, such as switches and conductive paths that form buses, bridges, physical input / output ports, etc. Other types of signal communication paths can be used, such as fiber optic communication paths and / or wireless communication paths.
[0051] Volatile memory 112 is any type of volatile memory now known or to be developed in the future. Examples include dynamic random access memory (RAM) or static RAM. Typically, volatile memory 112 is characterized by random access, but this is not necessary unless explicitly indicated. In computer 101, volatile memory 112 is located in a single package and is internal to computer 101; however, alternatively or additionally, volatile memory may be distributed across multiple packages and / or located externally relative to computer 101.
[0052] The persistent storage device 113 is any form of non-volatile memory for a computer, now known or to be developed in the future. The non-volatility of this storage device means that the stored data is retained regardless of whether power is supplied to the computer 101 and / or directly to the persistent storage device 113. The persistent storage device 113 may be a read-only memory (ROM), but typically at least a portion of the persistent storage device allows for data writing, data deletion, and data rewriting. Some familiar forms of persistent storage devices include hard disks and solid-state storage devices. The operating system 122 may take several forms, such as various known proprietary operating systems or open-source Portable Operating System Interface type operating systems that employ a kernel. The code included in block 200 typically includes at least some of the computer code involved in performing the methods of the present invention.
[0053] Peripheral device group 114 includes a collection of peripheral devices for computer 101. Data communication connections between peripheral devices and other components of computer 101 can be implemented in various ways, such as Bluetooth connectivity, near field communication (NFC) connectivity, connections made by cables (such as Universal Serial Bus (USB) type cables), plug-in connections (e.g., secure digital (SD) cards), connections made via local area communication networks, and even connections made via wide area networks such as the Internet. In various embodiments, UI device group 123 may include components such as displays, speakers, microphones, wearable devices (such as goggles and smartwatches), keyboards, mice, printers, touchpads, game controllers, and haptic devices. Storage device 124 is an external storage device, such as an external hard drive, or a pluggable storage device, such as an SD card. Storage device 124 can be persistent and / or volatile. In some embodiments, storage device 124 may take the form of a quantum computing storage device for storing data in the form of qubits. In embodiments where computer 101 requires substantial storage (e.g., where computer 101 locally stores and manages a large database), this storage can be provided by a peripheral storage device designed to store very large amounts of data, such as a storage area network (SAN) shared by multiple geographically distributed computers. The IoT sensor group 125 consists of sensors that can be used in IoT applications. For example, one sensor could be a thermometer, while another could be a motion detector.
[0054] Network module 115 is a collection of computer software, hardware, and firmware that allows computer 101 to communicate with other computers via WAN 102. Network module 115 may include hardware such as a modem or Wi-Fi transceiver, software for packetizing and / or depacketizing data transmitted over the communication network, and / or web browser software for transmitting data over the Internet. In some embodiments, the network control and network forwarding functions of network module 115 are performed on the same physical hardware device. In other embodiments (e.g., embodiments utilizing Software-Defined Networking (SDN)), the control and forwarding functions of network module 115 are performed on physically separate devices, such that the control function manages several different network hardware devices. Computer-readable program instructions for performing the methods of the present invention can typically be downloaded to computer 101 from an external computer or external storage device via a network adapter card or network interface included in network module 115.
[0055] WAN 102 is any wide area network (e.g., the Internet) capable of transmitting computer data over non-local distances using any technology known now or developed in the future for transmitting computer data. In some embodiments, WAN 102 may be replaced by and / or supplemented by a local area network (LAN), which is designed to transmit data between devices located in a local area, such as a Wi-Fi network. WAN and / or LAN typically include computer hardware such as copper transmission cables, fiber optic transmissions, wireless transmissions, routers, firewalls, switches, gateway computers, and edge servers.
[0056] End-user device (EUD) 103 is any computer system used and controlled by an end user (e.g., a customer of the enterprise operating computer 101) and can take any of the forms discussed above in conjunction with computer 101. EUD 103 typically receives helpful and useful data from the operation of computer 101. For example, assuming computer 101 is designed to provide recommendations to the end user, these recommendations are typically transmitted to EUD 103 from network module 115 of computer 101 via WAN 102. In this way, EUD 103 can display or otherwise present the recommendations to the end user. In some embodiments, EUD 103 can be a client device, such as a thin client, a heavy client, a mainframe computer, a desktop computer, etc.
[0057] Remote server 104 is any computer system that provides at least some data and / or functionality to computer 101. Remote server 104 can be controlled and used by the same entity operating computer 101. Remote server 104 represents a machine that collects and stores helpful and useful data used by other computers such as computer 101. For example, if computer 101 is designed and programmed to provide recommendations based on historical data, that historical data can be provided to computer 101 from a remote database 130 of remote server 104.
[0058] Public cloud 105 is any computer system that can be used by multiple entities, providing on-demand availability of computer system resources and / or other computing capabilities (especially data storage (cloud storage) and computing power) without direct active management by users. Cloud computing typically leverages resource sharing to achieve consistency and economies of scale. The direct and active management of the computing resources of public cloud 105 is performed by the computer hardware and / or software of cloud coordination module 141. The computing resources provided by public cloud 105 are typically implemented by virtual computing environments running on various computers constituting host physical machine group 142, which is the full set of physical computers in and / or available to public cloud 105. Virtual computing environments (VCEs) typically take the form of virtual machines from virtual machine group 143 and / or containers from container group 144. It should be understood that these VCEs can be stored as images and can be transferred between various physical machine hosts as images or after the VCEs are instantiated. Cloud coordination module 141 manages the transfer and storage of images, deploys new instantiations of VCEs, and manages the active instantiations of VCE deployments. Gateway 140 is a collection of computer software, hardware, and firmware that allows public cloud 105 to communicate via WAN 102.
[0059] Now, we will provide some further explanation of Virtualized Computing Environments (VCEs). A VCE can be stored as an "image." New active instances of a VCE can be instantiated from this image. Two common types of VCEs are virtual machines and containers. A container is a VCE that uses operating system-level virtualization. This refers to an operating system feature where the kernel allows multiple isolated user-space instances, called containers, to exist. From the perspective of the programs running within them, these isolated user-space instances typically appear as actual computers. Computer programs running on a regular operating system can utilize all the resources of that computer, such as connected devices, files and folders, network shares, CPU power, and quantifiable hardware capabilities. However, programs running within a container can only use the contents of the container and the devices allocated to the container; this is a characteristic known as containerization.
[0060] Private cloud 106 is similar to public cloud 105, except that computing resources are available only to a single enterprise. While private cloud 106 is depicted as communicating with WAN 102, in other embodiments, private cloud may be completely disconnected from the Internet and accessible only via a local / private network. A hybrid cloud is a combination of multiple clouds of different types (e.g., private, community, or public cloud types) typically implemented by different vendors. Each of the multiple clouds remains a separate and discrete entity, but the larger hybrid cloud architecture is bound together by standardization or proprietary technology that enables coordination, management, and / or data / application portability across the multiple component clouds. In this embodiment, public cloud 105 and private cloud 106 are both part of a larger hybrid cloud.
[0061] Referring now to FIG. 2, a block diagram of a time series anomaly detection system 210 according to an embodiment of the present invention is shown. Time series anomaly detection system 210 includes a server 212, on which a time series anomaly detection engine 200 and a time series prediction model 205 run and on which a knowledge base set 216 is stored. Time series anomaly detection system 210 also includes a network 218 and a time series data generator device 220. Server 212 is shown communicating with time series data generator device 220 via network 218, but time series data generator device 220 can also be connected directly to server 212.
[0062] Time series forecasting model 205 (also referred to herein as time series prediction model 205) is a computer model capable of predicting future data points in a time series, a task also referred to as forecasting. In embodiments, time series forecasting model 205 may be an artificial intelligence or machine learning model trained using one or more historical datasets of time series data points. Time series forecasting model 205 may be trained in a semi-supervised manner on data without anomalies so that it accurately learns the expected behavior of the time series data. In embodiments, time series forecasting model 205 may predict one or more (e.g., 2, 3, N…N+1) future data points of a time series received in real time. In embodiments, time series forecasting model 205 utilizes a sliding window with a stride over a time period. This allows time series forecasting model 205 to construct a context window and predict or forecast one or more data points in the near future. Time series forecasting model 205 may be based on a single forecasting method or a combination of forecasting methods, such as autoregression, moving average, vector autoregression, simple exponential smoothing, and seasonal autoregressive moving average. Time series forecasting model 205 may be based on a deep learning architecture, such as recurrent neural networks, long short-term memory, or multilayer perceptrons. Time series forecasting model 205 may also be based on machine learning models, such as XGBoost and random forest.
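The sliding window with a stride described above can be illustrated with a minimal sketch. The function name `sliding_windows` and the parameters `window`, `horizon`, and `stride` are hypothetical and do not appear in the specification; the sketch only shows how context windows and the data points to be forecast might be paired, not how time series forecasting model 205 is actually implemented.

```python
from typing import List, Sequence, Tuple


def sliding_windows(
    series: Sequence[float], window: int, horizon: int, stride: int = 1
) -> List[Tuple[List[float], List[float]]]:
    """Pair each context window with the `horizon` points that follow it.

    `window`, `horizon`, and `stride` are illustrative parameters (assumptions).
    """
    pairs = []
    # Last start index that still leaves room for a full context and target.
    last_start = len(series) - window - horizon
    for start in range(0, last_start + 1, stride):
        context = list(series[start:start + window])
        target = list(series[start + window:start + window + horizon])
        pairs.append((context, target))
    return pairs


# Context of 3 points, forecast 2 points ahead, advance 2 points per step.
pairs = sliding_windows([1, 2, 3, 4, 5, 6, 7], window=3, horizon=2, stride=2)
```

With these inputs the sketch yields two pairs: the context `[1, 2, 3]` with target `[4, 5]`, and the context `[3, 4, 5]` with target `[6, 7]`.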
[0063] Knowledge base set 216 is a database capable of storing historical time-series data points. In an embodiment, knowledge base set 216 can incorporate newly received data from time-series data generator device 220 into historical time-series data points in a human-readable format. Knowledge base set 216 can be a database or similar data structure that can retain data points or similar information from devices recording time-series events.
[0064] Time series data generator device 220 is one or more devices capable of providing data points in a time-based format. For example, time series data generator device 220 may be a clock-coupled thermometer. In this example, the thermometer may acquire temperature readings continuously or at predetermined intervals. In another example, time series data generator device 220 may be a network traffic monitor capable of monitoring data packets uploaded and downloaded at specific points within a telecommunications system or local area network. In yet another example, time series data generator device 220 may be a program that monitors the frequency and amount of deposits and withdrawals in a financial technology system. In yet another example, time series data generator device 220 may monitor electricity usage associated with a power grid. Other examples of time series data generator device 220 are possible.
[0065] FIG. 2B is a block diagram of time series anomaly detection engine 200 according to an embodiment of the present invention. Time series anomaly detection engine 200 is shown with a time series decomposition module 232, a confidence interval generation module 234, and an outlier identification module 236.
[0066] The time series decomposition module 232 is a computer module that can decompose time series data into its basic components. In an embodiment, the time series decomposition module 232 may receive a set of time series data points from a historical dataset, or real-time time series data points from a device such as the time series data generator device 220. The time series decomposition module 232 can decompose the time series data points into component parts: a trend component, a seasonal component, and a residual component. The trend component indicates the long-term direction of the series, that is, an overall increasing, decreasing, or static pattern over time. The seasonal component captures a systematic, recurring pattern (e.g., calendar-related movement such as seasonal temperature variation). The residual component is the non-systematic, short-term fluctuation in the time series dataset.
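The specification does not name a particular decomposition algorithm. As one possible illustration, the sketch below uses a classical additive decomposition (centered moving average for the trend, per-phase averages for the seasonal component), which is an assumption rather than the method of time series decomposition module 232. The function names and the `period` parameter are hypothetical.

```python
from statistics import mean
from typing import List, Optional, Sequence, Tuple


def centered_moving_average(series: Sequence[float], period: int) -> List[Optional[float]]:
    """Trend estimate; the edges where the window does not fit are left as None."""
    n, half = len(series), period // 2
    trend: List[Optional[float]] = [None] * n
    for i in range(half, n - half):
        window = series[i - half:i + half + 1]
        total = sum(window)
        if period % 2 == 0:
            # Even period: give the two end points half weight (a 2 x period average).
            total -= 0.5 * (window[0] + window[-1])
        trend[i] = total / period
    return trend


def decompose_additive(
    series: Sequence[float], period: int
) -> Tuple[List[Optional[float]], List[float], List[Optional[float]]]:
    """Split a series so that series = trend + seasonal + residual (additive model).

    Assumes the series spans at least two full periods.
    """
    trend = centered_moving_average(series, period)
    # Average the detrended values at each phase of the seasonal cycle.
    buckets: List[List[float]] = [[] for _ in range(period)]
    for i, (y, t) in enumerate(zip(series, trend)):
        if t is not None:
            buckets[i % period].append(y - t)
    raw = [mean(b) for b in buckets]
    offset = mean(raw)  # center the seasonal pattern on zero
    pattern = [r - offset for r in raw]
    seasonal = [pattern[i % period] for i in range(len(series))]
    residual = [
        None if t is None else y - t - s
        for y, t, s in zip(series, trend, seasonal)
    ]
    return trend, seasonal, residual


# Synthetic example: a linear trend plus a period-4 seasonal pattern, no noise.
pattern = [1.0, -1.0, 2.0, -2.0]
series = [0.5 * i + pattern[i % 4] for i in range(16)]
trend, seasonal, residual = decompose_additive(series, period=4)
```

Because the synthetic series contains no noise, the recovered residual is zero wherever the trend is defined; on real data the residual would carry the short-term fluctuations that the later confidence intervals are built from.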
[0067] The confidence interval generation module 234 is a computer module capable of generating confidence intervals associated with each basic component of a time series dataset. In an embodiment, the confidence interval generation module 234 may receive the output of the time series decomposition module 232 (i.e., the trend component, seasonal component, and residual component). Confidence intervals can be generated based on intervals associated with time series data points. For example, the confidence interval generation module 234 may generate confidence intervals for horizontal shifts associated with the trend component. In another example, the confidence interval generation module 234 may generate confidence intervals for the residual values. In another embodiment, the confidence interval generation module 234 may generate confidence intervals for the variance associated with the residual values.
[0068] In one embodiment, the confidence interval generation module 234 can generate confidence intervals for outliers from the residual components. The confidence interval generation module 234 can divide the residual component data points into equal time periods. The confidence interval generation module 234 can calculate the mean of each individual segment. The confidence interval generation module 234 can calculate the difference between the means of each adjacent time period. The confidence interval generation module 234 can calculate the average of the differences between adjacent time periods and determine the standard deviation. The confidence interval generation module 234 can obtain confidence intervals for outliers based on the mean and standard deviation. In one embodiment, the confidence interval may have the mean as the midpoint of the confidence interval plus or minus a multiple of the standard deviation (e.g., 1, 2, N…N+1). In another embodiment, the confidence interval may have the mean as the midpoint of the confidence interval plus or minus a constant multiplied by the generated standard deviation. This constant may be predetermined or dynamically determined.
[0069] In one embodiment, the confidence interval generation module 234 can generate confidence intervals for horizontal shift from the trend component. The confidence interval generation module 234 can divide the trend component data points into equal time periods. The confidence interval generation module 234 can calculate the mean of each individual segment. The confidence interval generation module 234 can calculate the difference between the means of each adjacent time period. The confidence interval generation module 234 can calculate the average of the differences between adjacent time periods and determine the standard deviation. The confidence interval generation module 234 can obtain confidence intervals for horizontal shift based on the mean and standard deviation. In one embodiment, the confidence interval may have the mean as the midpoint of the confidence interval plus or minus a multiple of the standard deviation (e.g., 1, 2, n…n+1). In another embodiment, the confidence interval may have the mean as the midpoint of the confidence interval plus or minus a constant multiplied by the generated standard deviation. This constant may be predetermined or dynamically determined.
[0070] In one embodiment, the confidence interval generation module 234 can generate confidence intervals for the variance from the residual components. The confidence interval generation module 234 can divide the residual component data points into equal time periods. The confidence interval generation module 234 can calculate the variance value for each individual segment. The confidence interval generation module 234 can calculate the difference between the variances of each adjacent time period. The confidence interval generation module 234 can calculate the average of the variance differences between adjacent time periods and determine the standard deviation. The confidence interval generation module 234 can obtain confidence intervals for the variance based on the average and standard deviation of the variance differences. In one embodiment, the confidence interval may have the average as the midpoint of the confidence interval plus or minus a multiple of the standard deviation (e.g., 1, 2, n…n+1). In another embodiment, the confidence interval may have the average as the midpoint of the confidence interval plus or minus a constant multiplied by the generated standard deviation. This constant may be predetermined or dynamically determined.
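The three interval constructions described above follow the same pattern: segment a component into equal time periods, compute a statistic per segment, difference adjacent segments, and place the interval at the average difference plus or minus a multiple of the standard deviation. The sketch below is one reading of that pattern. It assumes the standard deviation is taken over the adjacent-segment differences (the text does not state this explicitly), uses the sample standard deviation and the population variance per segment as arbitrary choices, and introduces the hypothetical names `adjacent_difference_interval`, `segment_size`, and `k`.

```python
from statistics import mean, pvariance, stdev
from typing import Callable, List, Sequence, Tuple


def split_into_segments(values: Sequence[float], segment_size: int) -> List[List[float]]:
    """Equal-length segments; a trailing partial segment is dropped."""
    return [
        list(values[i:i + segment_size])
        for i in range(0, len(values) - segment_size + 1, segment_size)
    ]


def adjacent_difference_interval(
    values: Sequence[float],
    segment_size: int,
    statistic: Callable[[Sequence[float]], float] = mean,
    k: float = 2.0,
) -> Tuple[float, float]:
    """Interval = average adjacent-segment difference +/- k * standard deviation."""
    per_segment = [statistic(s) for s in split_into_segments(values, segment_size)]
    diffs = [later - earlier for earlier, later in zip(per_segment, per_segment[1:])]
    if len(diffs) < 2:
        raise ValueError("need at least three segments to estimate a spread")
    center = mean(diffs)
    spread = stdev(diffs)  # assumption: spread of the differences themselves
    return center - k * spread, center + k * spread


# Segment means as the statistic (outlier or horizontal shift intervals).
outlier_interval = adjacent_difference_interval(
    [0.0, 0.0, 1.0, 1.0, 3.0, 3.0], segment_size=2, statistic=mean, k=1.0
)
# Segment variances as the statistic (variance interval).
variance_interval = adjacent_difference_interval(
    [0.0, 2.0, 1.0, 1.0, 0.0, 4.0], segment_size=2, statistic=pvariance, k=1.0
)
```

Passing the trend component with `statistic=mean` would give a horizontal shift interval under the same assumptions, and `k` plays the role of the predetermined or dynamically determined constant.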
[0071] Outlier identification module 236 is a computer module capable of identifying outlier data points in time series data. In an embodiment, outlier identification module 236 may receive as input the confidence interval values generated by confidence interval generation module 234 and the components of real-time time series data points decomposed by time series decomposition module 232. Outlier identification module 236 may compare the decomposed real-time data point components with the confidence intervals (e.g., horizontal shift, variance, and outlier). If outlier identification module 236 determines that a decomposed real-time data point is outside one or more confidence intervals, then outlier identification module 236 may identify or label the data point as an outlier. The type of outlier corresponds to the confidence interval into which the data point does not fall. For example, the label may be horizontal shift, variance, and / or outlier.
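The comparison step can be sketched as a simple interval check. The sketch assumes that a comparable value has already been computed for each interval type from the decomposed real-time data; how those values are derived is not shown here. The function name `label_point`, the dictionary layout, and the numeric values are hypothetical.

```python
from typing import Dict, List, Tuple

Interval = Tuple[float, float]


def label_point(observed: Dict[str, float], intervals: Dict[str, Interval]) -> List[str]:
    """Return the interval types that the observed values fall outside of.

    An empty list means the data point is not labeled as an outlier.
    """
    labels = []
    for kind, value in observed.items():
        low, high = intervals[kind]
        if not (low <= value <= high):
            labels.append(kind)
    return labels


# Illustrative numbers only (assumptions, not taken from the specification).
intervals = {
    "outlier": (-1.0, 1.0),
    "horizontal shift": (-0.5, 0.5),
    "variance": (-0.2, 0.2),
}
observed = {"outlier": 3.2, "horizontal shift": 0.1, "variance": 0.05}
labels = label_point(observed, intervals)
```

Here only the outlier interval is violated, so the data point would carry the single label "outlier"; a point violating several intervals would carry several labels.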
[0072] When a data point is identified or marked as an anomaly, the outlier identification module 236 can signal or cause the time series prediction model 205 to ignore the data point when making its predictions. In an embodiment, the outlier identification module 236 can send the expected average value of the data point at that time in place of the outlier, to prevent the time series prediction model 205 from basing its future predictions on the outlier data point. In an embodiment, the outlier identification module 236 can maintain a running list of identified outlier data points. If multiple outlier data points are detected consecutively, the outlier identification module 236 can send the identified outlier data points to the time series prediction model 205 with instructions to include the data points in its predictions, to prevent erroneous predictions or drift.
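The substitution and running-list behavior can be sketched as a small stateful gate placed in front of the prediction model. The class name `OutlierGate`, the `max_consecutive` threshold, and the action strings are hypothetical; the specification does not say how many consecutive outliers trigger inclusion, so the value used here is an assumption.

```python
from typing import List, Tuple


class OutlierGate:
    """Decide what to forward to the prediction model for each incoming point."""

    def __init__(self, max_consecutive: int = 3) -> None:
        # Assumed threshold: this many consecutive outliers are treated as a real change.
        self.max_consecutive = max_consecutive
        self.pending: List[float] = []  # running list of consecutive outliers

    def process(self, value: float, is_outlier: bool, expected_mean: float) -> Tuple[str, List[float]]:
        if not is_outlier:
            self.pending.clear()
            return "pass", [value]
        self.pending.append(value)
        if len(self.pending) >= self.max_consecutive:
            # Sustained run: release the held outliers so the model can adapt.
            released = list(self.pending)
            self.pending.clear()
            return "include", released
        # Isolated outlier so far: forward the expected average instead.
        return "substitute", [expected_mean]


gate = OutlierGate(max_consecutive=3)
results = [
    gate.process(10.0, False, 10.0),
    gate.process(50.0, True, 10.0),
    gate.process(52.0, True, 10.0),
    gate.process(51.0, True, 10.0),
]
```

In this run the first point passes through, the next two outliers are replaced by the expected average, and the third consecutive outlier causes all three held values to be released for inclusion.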
[0073] Referring now to FIG. 3, a flowchart 300 depicts the steps for anomaly detection in time series data. In step 302, the time series data is decomposed into residual components, seasonal components, and trend components. In step 304, the residual components and trend components are divided into equal time periods. In step 306, for the residual components and trend components, the mean of each time period, the differences between the means of adjacent time periods, and the standard deviation of those differences are determined. In step 308, confidence intervals for horizontal shift are obtained based on the mean and standard deviation of the trend components. In step 310, confidence intervals for outliers and confidence intervals for variance are obtained based on the mean and standard deviation of the residual components. In step 312, the basic components of the real-time time series data points are compared with their corresponding confidence intervals. In step 314, in response to a basic component being outside the confidence interval, the data point is marked as an anomaly.
[0074] Various embodiments of the invention have been described for illustrative purposes, but are not intended to be exhaustive or limited to the disclosed embodiments. Many modifications and variations will be apparent to those skilled in the art without departing from the scope and spirit of the described embodiments. The terminology used herein has been chosen to best explain the principles of the embodiments, their practical application, or technical improvements to existing technologies on the market, or to enable others skilled in the art to understand the embodiments disclosed herein.
Claims
1. A computer-implemented method for anomaly adjustment in a time series forecasting model, comprising: Time series data is decomposed into residual components, trend components, and seasonal components; The residual components and the trend components are segmented based on time values; The first data point of the time series data is classified as an anomaly; Determine whether the first data point is within the outlier confidence interval; In response to determining that the first data point is within the outlier confidence interval, the first data point is marked as a non-outlier. Determine whether the first data point is within the horizontal shift position information interval of the same residual segment as the first data point; In response to the number of outlier data points marked as outliers exceeding a threshold, the average difference between a first trend segment corresponding to the residual segment including the first data point and a second trend segment immediately preceding it is calculated. Determine whether the average difference between the first trend segment and the second trend segment is within the horizontal shift confidence interval; In response to the average difference being within the horizontal shift position confidence interval, the variance difference between the first trend segment and the second trend segment is calculated; Determine whether the variance difference is within the variance confidence interval; In response to determining that the variance difference is within the variance confidence interval, the outlier classification of the first data point is removed.
2. The computer-implemented method according to claim 1 further includes: One or more time series data points of the time series data are predicted based at least in part on the first data point, wherein the anomaly classification of the first data point has been removed.
3. The computer-implemented method according to claim 1 or 2 further includes: A semi-supervised anomaly detection model is trained, wherein the anomaly detection model is based on the feature exploration of historical time series training data.
4. The computer-implemented method according to claim 3 further includes: Decompose historical time series data into historical residual components; The historical residual components are segmented; Calculate the average value of each historical residual component segment in the historical residual component segment; Calculate the difference between the average values of each adjacent historical residual component segment; The average difference is calculated based on the difference between the average values of each adjacent historical residual component segment; The standard deviation is calculated based on the average difference. Generate outlier confidence intervals, wherein the maximum value of the outlier confidence interval is the mean difference plus the standard deviation multiplied by a threshold, and the minimum value of the outlier confidence interval is the mean difference minus the standard deviation multiplied by the threshold.
5. The computer-implemented method according to claim 3, further comprising: Decompose historical time series data into historical trend components; The historical trend components are segmented; Calculate the average value of each historical trend component segment in the historical trend component segments; Calculate the difference between the average values of each adjacent historical trend component segment; The average difference is calculated based on the difference between the average values of each adjacent historical trend component segment; The standard deviation is calculated based on the average difference; and A horizontal shift interval is generated, wherein the maximum value of the horizontal shift interval is the average difference plus the standard deviation multiplied by a threshold, and the minimum value of the horizontal shift interval is the average difference minus the standard deviation multiplied by the threshold.
6. The computer-implemented method according to claim 3 further includes: Decompose historical time series data into historical residual components; The historical residual components are segmented; Calculate the variance of each historical residual component segment in the historical residual component segments; Calculate the difference between each variance of adjacent historical residual component segments; The average variance difference is calculated based on the difference between the average values of each adjacent historical residual component segment; The standard deviation is calculated based on the average variance value; Generate a variance confidence interval, wherein the maximum value of the variance confidence interval is the mean difference plus the standard deviation multiplied by a threshold, and the minimum value of the variance confidence interval is the mean difference minus the standard deviation multiplied by the threshold.
7. The computer-implemented method according to any one of the preceding claims, wherein, The time-series data is associated with power generation.
8. A computer system for anomaly adjustment in a time series forecasting model, comprising: processor, Computer-readable storage, A computer-readable tangible storage device, and program instructions stored on the computer-readable storage device, the program instructions being executable by a processor via the computer-readable storage device, wherein the computer system is configured to perform a method comprising: Time series data is decomposed into residual components, trend components, and seasonal components; The residual components and the trend components are segmented based on time values; The first data point of the time series data is classified as an anomaly; Determine whether the first data point is within the outlier confidence interval, and in response to determining that the first data point is within the outlier confidence interval, mark the first data point as a non-outlier. Determine whether the first data point is within the horizontal shift position information interval of the same residual segment as the first data point; In response to the number of outlier data points marked as outliers exceeding a threshold, the average difference between a first trend segment corresponding to the residual segment including the first data point and a second trend segment immediately preceding it is calculated. Determine whether the average difference between the first trend segment and the second trend segment is within the horizontal shift confidence interval; In response to the average difference being within the horizontal shift position confidence interval, the variance difference between the first trend segment and the second trend segment is calculated; Determine whether the variance difference is within the variance confidence interval; In response to determining that the variance difference is within the variance confidence interval, the outlier classification of the first data point is removed.
9. The computer system of claim 8, further comprising program instructions stored in the computer-readable storage device for execution by the processor via the computer-readable storage device, wherein, The computer system is configured to perform the method, and further includes: One or more time series data points of the time series data are predicted based at least in part on the first data point, wherein the anomaly classification of the first data point has been removed.
10. The computer system of claim 8 or 9, comprising program instructions stored on the computer-readable storage device for execution by the processor via the computer-readable storage device, wherein, The computer system is configured to perform the method, and further includes: A semi-supervised anomaly detection model is trained, wherein the anomaly detection model is based on the feature exploration of historical time series training data.
11. The computer system of claim 10, further comprising program instructions stored in the computer-readable storage device for execution by the processor via the computer-readable storage device, wherein, The computer system is configured to perform the method, and further includes: Decompose historical time series data into historical residual components; The historical residual components are segmented; Calculate the average value of each historical residual component segment in the historical residual component segment; Calculate the difference between the average values of each adjacent historical residual component segment; The average difference is calculated based on the difference between the average values of each adjacent historical residual component segment; The standard deviation is calculated based on the average difference. Generate outlier confidence intervals, wherein the maximum value of the outlier confidence interval is the mean difference plus the standard deviation multiplied by a threshold, and the minimum value of the outlier confidence interval is the mean difference minus the standard deviation multiplied by the threshold.
12. The computer system of claim 10, comprising program instructions stored on the computer-readable storage device for execution by the processor via the computer-readable storage device, wherein, The computer system is configured to perform the method, and further includes: Decompose historical time series data into historical trend components; The historical trend components are segmented; Calculate the average value of each historical trend component segment in the historical trend component segments; Calculate the difference between the average values of each adjacent historical trend component segment; The average difference is calculated based on the difference between the average values of each adjacent historical trend component segment; The standard deviation is calculated based on the average difference; and A horizontal shift interval is generated, wherein the maximum value of the horizontal shift interval is the average difference plus the standard deviation multiplied by a threshold, and the minimum value of the horizontal shift interval is the average difference minus the standard deviation multiplied by the threshold.
13. The computer system of claim 10, comprising program instructions stored on the computer-readable storage device for execution by the processor via the computer-readable storage device, wherein, The computer system is configured to perform the method, and further includes: Decompose historical time series data into historical residual components; The historical residual components are segmented; Calculate the variance of each historical residual component segment in the historical residual component segments; Calculate the difference between each variance of adjacent historical residual component segments; The average variance difference is calculated based on the difference between the average values of each adjacent historical residual component segment; The standard deviation is calculated based on the average variance value; Generate a variance confidence interval, wherein the maximum value of the variance confidence interval is the mean difference plus the standard deviation multiplied by a threshold, and the minimum value of the variance confidence interval is the mean difference minus the standard deviation multiplied by the threshold.
14. The computer system according to any one of claims 8 to 13, wherein, The time-series data is associated with power generation.
15. A computer program product for anomaly adjustment in a time series forecasting model, the computer program product comprising a computer-readable storage medium having program instructions embodied therein, wherein, The execution of the program instructions by the computer processor causes the computing device to: Time series data is decomposed into residual components, trend components, and seasonal components; The residual components and the trend components are segmented based on time values; The first data point of the time series data is classified as an anomaly; Determine whether the first data point is within the outlier confidence interval, and in response to determining that the first data point is within the outlier confidence interval, mark the first data point as a non-outlier. Determine whether the first data point is within the horizontal shift position information interval of the same residual segment as the first data point; In response to the number of outlier data points marked as outliers exceeding a threshold, the average difference between a first trend segment corresponding to the residual segment including the first data point and a second trend segment immediately preceding it is calculated. Determine whether the average difference between the first trend segment and the second trend segment is within the horizontal shift confidence interval; In response to the average difference being within the horizontal shift position confidence interval, the variance difference between the first trend segment and the second trend segment is calculated; Determine whether the variance difference is within the variance confidence interval; In response to determining that the variance difference is within the variance confidence interval, the outlier classification of the first data point is removed.
16. The computer program product according to claim 15, wherein, The execution of the program instructions also causes the computing device to: One or more time series data points of the time series data are predicted based at least in part on the first data point, wherein the anomaly classification of the first data point has been removed.
17. The computer program product of claim 15 or 16, wherein execution of the program instructions further causes the computing device to: Training a semi-supervised anomaly detection model, in which, The anomaly detection model is based on the exploration of characteristics of historical time series training data.
18. The computer program product according to claim 17, wherein, The execution of the program instructions also causes the computing device to: Decompose historical time series data into historical residual components; The historical residual components are segmented; Calculate the average value of each historical residual component segment in the historical residual component segment; Calculate the difference between the average values of each adjacent historical residual component segment; The average difference is calculated based on the difference between the average values of each adjacent historical residual component segment; The standard deviation is calculated based on the average difference. Generate outlier confidence intervals, wherein the maximum value of the outlier confidence interval is the mean difference plus the standard deviation multiplied by a threshold, and the minimum value of the outlier confidence interval is the mean difference minus the standard deviation multiplied by the threshold.
19. The computer program product according to claim 17, wherein, The execution of the program instructions also causes the computing device to: Decompose historical time series data into historical trend components; The historical trend components are segmented; Calculate the average value of each historical trend component segment in the historical trend component segments; Calculate the difference between the average values of each adjacent historical trend component segment; The average difference is calculated based on the difference between the average values of each adjacent historical trend component segment; The standard deviation is calculated based on the average difference; and A horizontal shift interval is generated, wherein the maximum value of the horizontal shift interval is the average difference plus the standard deviation multiplied by a threshold, and the minimum value of the horizontal shift interval is the average difference minus the standard deviation multiplied by the threshold.
20. The computer program product according to claim 17, wherein, The execution of the program instructions also causes the computing device to: Decompose historical time series data into historical residual components; The historical residual components are segmented; Calculate the variance of each historical residual component segment in the historical residual component segments; Calculate the difference between each variance of adjacent historical residual component segments; The average variance difference is calculated based on the difference between the average values of each adjacent historical residual component segment; The standard deviation is calculated based on the average variance value; Generate a variance confidence interval, wherein the maximum value of the variance confidence interval is the mean difference plus the standard deviation multiplied by a threshold, and the minimum value of the variance confidence interval is the mean difference minus the standard deviation multiplied by the threshold.
21. A computer program comprising program code means, wherein when the program is run on a computer, the program code means is adapted to perform the method of any one of claims 1 to 7.