Long-time-series delta-anomaly-point detection method based on probabilistic suffix tree (PST)

A technique combining a probabilistic suffix tree with long time series, applicable to pattern recognition in signals, instruments, and character and pattern recognition, that addresses problems such as existing algorithms rarely detecting abnormal data points
CN107844731A · Inactive · Publication Date: 2018-03-27 · FUDAN UNIV

Patent Information

Authority / Receiving Office
CN · China
Current Assignee / Owner
FUDAN UNIV
Publication Date
2018-03-27
Estimated Expiration
Not applicable · inactive patent


Abstract

The invention belongs to the field of anomaly detection for time series data, and relates to a method for detecting anomaly points in long symbol strings based on a probabilistic suffix tree (PST). The method uses a discretization technique for continuous data together with a probabilistic suffix tree model to detect anomalous data points in long time series. Its steps are: discretizing the originally continuous long time series data to obtain a long symbol string; constructing the probabilistic suffix tree from a symbolized training data set; using the constructed PST to detect the delta-anomaly-points in a data set to be tested; and using the F1-measure to evaluate the detection effect. Experimental results show that the method effectively supports various long time series, achieves high recall, accuracy and precision, detects anomalies well, and can be applied in fields such as aerospace, medical data analysis, financial data analysis and detection of abnormal network behavior.
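The four steps in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the patent text: equal-width binning as the discretization scheme, a bounded-depth context table standing in for the PST, add-one smoothing, and the rule that a point is flagged when its conditional probability falls below a threshold `delta`. The function names (`symbolize`, `build_pst`, `detect_delta_anomalies`, `f1_measure`) are hypothetical.

```python
from collections import defaultdict


def symbolize(values, alphabet="abcd"):
    """Step 1 (assumed scheme): equal-width binning of values into symbols."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return alphabet[0] * len(values)
    width = (hi - lo) / len(alphabet)
    return "".join(
        alphabet[min(int((v - lo) / width), len(alphabet) - 1)] for v in values
    )


def build_pst(symbols, max_depth=3):
    """Step 2: count next-symbol frequencies for contexts of length 0..max_depth."""
    counts = defaultdict(lambda: defaultdict(int))
    for i, sym in enumerate(symbols):
        for depth in range(min(max_depth, i) + 1):
            counts[symbols[i - depth:i]][sym] += 1
    return counts


def next_symbol_prob(pst, history, sym, alphabet_size, max_depth=3):
    """P(sym | longest suffix of history seen in training), add-one smoothed."""
    for depth in range(min(max_depth, len(history)), -1, -1):
        context = history[len(history) - depth:]
        if context in pst:
            seen = pst[context]
            return (seen.get(sym, 0) + 1) / (sum(seen.values()) + alphabet_size)
    return 1.0 / alphabet_size  # empty model: fall back to a uniform guess


def detect_delta_anomalies(pst, symbols, delta, alphabet_size, max_depth=3):
    """Step 3 (assumed rule): flag positions whose probability is below delta."""
    flagged = []
    for i, sym in enumerate(symbols):
        history = symbols[max(0, i - max_depth):i]
        if next_symbol_prob(pst, history, sym, alphabet_size, max_depth) < delta:
            flagged.append(i)
    return flagged


def f1_measure(predicted, actual):
    """Step 4: harmonic mean of precision and recall over flagged positions."""
    predicted, actual = set(predicted), set(actual)
    true_pos = len(predicted & actual)
    if true_pos == 0:
        return 0.0
    precision = true_pos / len(predicted)
    recall = true_pos / len(actual)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    pst = build_pst("abcd" * 50)  # training string with a regular pattern
    flagged = detect_delta_anomalies(pst, "abcdabcaabcd", delta=0.1, alphabet_size=4)
    print(flagged, f1_measure(flagged, [7]))
```

In this sketch a single substituted symbol tends to flag two positions: the unexpected symbol itself and the symbol after it, which then follows an unfamiliar context. A real implementation would also need a pruning rule for the tree and a principled choice of `delta`, neither of which is specified in the text available here.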

Description

Technical field

[0001] The invention belongs to the technical field of time series anomaly detection. It relates to a technique for discretizing an original time series by means of a symbolization method, and in particular to a method for detecting anomaly points in long symbol strings based on a probabilistic suffix tree.

Background technology

[0002] The prior art discloses that time series data is a form of data that appears frequently in everyday applications and is widely used in fields such as aerospace, medical data analysis, financial data analysis, detection of abnormal network behavior, and weather forecasting. In these fields, mining frequent patterns in a sequence may fail to reveal the abnormal information hidden in the data's behavior, yet this abnormal information usually reflects real problems. For example, abnormal data in a user's daily operation records may mean that the account password has been compromised or comp...

Claims
