Time sequence anomaly point detection method and device

A time series anomaly detection technique, applied in the field of data processing, that addresses the problem of large prediction deviations and achieves higher accuracy, smaller prediction deviations, and a lower false alarm rate.

Status: Inactive | Publication Date: 2018-11-09
HARBIN INST OF TECH

AI Technical Summary

Problems solved by technology

[0012] The technical problem to be solved by the present invention is to provide a method and device for detecting abnormal points in a time series that use a replacement strategy: abnormal values are replaced with the values predicted by the model, so that the prediction deviation is reduced as much as possible.

Examples

Embodiment 1

[0046] As shown in Figure 1, the time series outlier detection method provided by the embodiment of the present invention may include the following steps:

[0047] Step S101: using the training set T to train the regression model of the time series;

[0048] Step S102: Predict the current time series value from the trained regression model and the input time series segment immediately preceding the current time, and perform anomaly detection on the observed current time series value by comparing it with the predicted value. Preferably, step S102 judges whether the difference between the predicted current time series value and the observed current time series value is greater than a preset threshold; if so, the observed current time series value is considered abnormal; otherwise, it is considered normal.

[0049] Step S103: According to the res...
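
Steps S101 to S103 can be sketched in a few lines of Python. This is a minimal illustration, not the patented implementation: the source does not specify the regression model, so a one-lag linear autoregression fitted by least squares stands in for it, and the names `fit_ar1` and `check_and_replace` are hypothetical. The replacement behaviour follows the description in the abstract, since the text of step S103 is truncated here.

```python
from typing import Callable, Sequence, Tuple


def fit_ar1(train: Sequence[float]) -> Callable[[Sequence[float]], float]:
    """Step S101 (sketch): fit y[t] = a * y[t-1] + b by ordinary least squares.

    Assumption: a one-lag linear model stands in for the unspecified regressor.
    """
    xs, ys = train[:-1], train[1:]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = cov_xy / var_x if var_x else 0.0  # flat training data: fall back to the mean
    b = mean_y - a * mean_x
    # The returned model predicts the next value from the last value of a window.
    return lambda window: a * window[-1] + b


def check_and_replace(
    predicted: float, observed: float, threshold: float
) -> Tuple[bool, float]:
    """Steps S102 and S103 (sketch): flag the observation, then pick the value to keep."""
    is_anomaly = abs(predicted - observed) > threshold
    # An abnormal observation is replaced by the prediction before moving on.
    return is_anomaly, (predicted if is_anomaly else observed)


model = fit_ar1([1.0, 2.0, 3.0, 4.0, 5.0])
prediction = model([5.0])
flag, kept = check_and_replace(prediction, observed=20.0, threshold=3.0)
```

Keeping the prediction in place of an abnormal observation means the next input window is not contaminated by the outlier, which is the point of the replacement strategy.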

Embodiment 2

[0054] On the basis of the time series outlier detection method provided in Embodiment 1, the discrimination threshold of the outlier is optimized, wherein:

[0055] Step S101 also includes: calculating the standard deviation σ of the training set;

[0056] The preset threshold used in step S102 is determined by the following formula:

[0057] D=3σ;

[0058] Where D is the preset threshold, and σ is the standard deviation of the training set.

[0059] In the anomaly detection stage, the choice of threshold has an important impact on the determination of outliers. If the threshold is too large, true outliers will be missed and the false negative rate will increase; if the threshold is too small, many false positives will appear. Only by selecting the threshold reasonably can the outliers be judged correctly, making subsequent replacement strategies more successful. In threshold selection, this method adopts the 3σ criterion. When the sample is large enough, it can be approximately considered as a normal distr...
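
The threshold D = 3σ from Embodiment 2 can be computed directly from the training set. The sketch below is illustrative only: the function name `three_sigma_threshold` is hypothetical, and the use of the population standard deviation (`statistics.pstdev`) is an assumption, since the source does not say whether the population or sample form is intended.

```python
import statistics
from typing import Sequence


def three_sigma_threshold(training_set: Sequence[float]) -> float:
    """Return D = 3 * sigma, where sigma is the standard deviation of the training set.

    Assumption: population standard deviation; swap in statistics.stdev
    if the sample form is wanted.
    """
    sigma = statistics.pstdev(training_set)
    return 3.0 * sigma


training_set = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
D = three_sigma_threshold(training_set)  # sigma is 2.0 for this data, so D is 6.0
```

For normally distributed data, about 99.7% of values lie within three standard deviations of the mean, which is why a deviation larger than 3σ is treated as unlikely to be ordinary noise.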

Embodiment 3

[0062] Combining Embodiments 1 and 2, the present invention provides the time series abnormal point detection method of Embodiment 3. The specific implementation process is as follows:

[0063] Step S201: the process starts;

[0064] Step S202: Establish a regression model Regressor, and use the training set T to train the regression model;

[0065] Step S203: Calculate the standard deviation σ of the training set;

[0066] Step S204: Let the abnormal point set P be an empty set;

[0067] Step S205: set the time sequence number index=k+1; k is the input length of the regression model.

[0068] Step S206: Determine whether the time sequence number index is less than or equal to the total length of the time series S, if yes, go to step S207, otherwise go to step S212;

[0069] Step S207: Input the segment [S_{index-k-1}, ..., S_{index-1}] of the time series T into the regression model Regressor, and predict the sequence value pred...
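
The loop of steps S204 to S207 can be sketched as follows. The steps after S207 are cut off in this text, so the compare-and-replace part of the loop follows the abstract rather than the elided steps. Everything named here (`detect_anomalies`, `mean_regressor`, the 0-based indexing, and the window of exactly k preceding values) is an assumption made for illustration, and the mean-of-window predictor merely stands in for a trained Regressor.

```python
import statistics
from typing import Callable, List, Sequence, Tuple


def detect_anomalies(
    series: Sequence[float],
    regressor: Callable[[Sequence[float]], float],
    k: int,
    sigma: float,
) -> Tuple[List[int], List[float]]:
    """Walk the series, flag points that deviate from the prediction, replace them."""
    threshold = 3.0 * sigma              # D = 3 * sigma, as in Embodiment 2
    cleaned = list(series)               # working copy that receives replacements
    anomalies: List[int] = []            # S204: abnormal point set P starts empty
    index = k                            # S205: 0-based equivalent of index = k + 1
    while index < len(cleaned):          # S206: stop at the end of the series
        window = cleaned[index - k:index]    # S207: the k values before the current one
        pred = regressor(window)
        if abs(pred - cleaned[index]) > threshold:
            anomalies.append(index)      # record the abnormal point
            cleaned[index] = pred        # replace it so later windows stay clean
        index += 1
    return anomalies, cleaned


def mean_regressor(window: Sequence[float]) -> float:
    """Hypothetical stand-in for a trained model: predict the mean of the window."""
    return statistics.fmean(window)


series = [10.0, 10.0, 10.0, 10.0, 50.0, 10.0, 10.0]
found, repaired = detect_anomalies(series, mean_regressor, k=3, sigma=2.0)
```

Because the window is read from the working copy, the spike at position 4 is replaced before it can distort the predictions for positions 5 and 6.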

Abstract

The invention relates to the technical field of data processing, and provides a time series anomaly point detection method and device. The method comprises the steps of: training a regression model of the time series on a training set; predicting the sequence value at the current time from the trained regression model and the input time series segment before the current time, and performing anomaly detection on the observed sequence value at the current time according to the predicted value; and, when the anomaly detection result indicates that the observed value at the current time is abnormal, replacing the observed value with the predicted value and continuing anomaly point detection at the next time in the series. In the time series point anomaly detection task, a regression prediction method is adopted and abnormal values are replaced by predicted values, so that the prediction deviation is reduced as much as possible and the detection accuracy is increased.

Description

Technical field

[0001] The invention relates to the technical field of data processing, in particular to a time series abnormal point detection method and device.

Background technique

[0002] A time series is a sequence of numerical data collected in chronological order. Time series are widespread in finance, industry, commerce, medical care, meteorology and other fields. Stock prices on the stock exchange over time, data collected by various sensors in factories, monthly merchandise sales in stores, electrocardiograms of patients, and precipitation in a certain area are all time series.

[0003] In traditional data mining, outliers may be removed as noise so that they do not affect the results of data mining. However, in some cases outliers contain important information, and mining and analyzing them can yield a lot of useful knowledge. For example, in seismic data, the abnormal value may be a precursor to an earthquake; the abnormal sensor data in the factory may indicate a failure in a c...

Application Information

IPC(8): G06F17/30, G06F17/18, G06F11/07
CPC: G06F11/0751, G06F17/18
Inventor: 王宏志, 李子珏, 高宏, 万晓珑
Owner: HARBIN INST OF TECH