
Method for automatically removing outlier points from time series data

A technology for time series data and outlier points, applied in electrical digital data processing, special data processing applications, instruments, etc., to reduce the influence of outliers on data analysis accuracy and to avoid invalid analysis results

Status: Inactive
Publication Date: 2012-06-20
XI AN JIAOTONG UNIV
Cites: 2; Cited by: 17

AI Technical Summary

Problems solved by technology

This method solves the problem of using a computer to automatically identify and remove the large number of outliers in a data space.

Method used

Variance-based density clustering is combined with the inherent characteristics of time series data to identify outlier points automatically: a time window divides the series in time, and the variance and mean are used to measure the density threshold, radius, etc.


Examples


Embodiment Construction

[0019] The present invention will be described in detail below in conjunction with the accompanying drawings.

[0020] The invention provides a method for automatically removing outlier points from time series data. The method uses variance-based density clustering, combined with the inherent characteristics of time series data, to identify outliers automatically. The basic idea is to identify outlier points by density clustering driven by the variance, the mean, and a time window: on the one hand, a time window divides the time series data in time; on the other hand, the variance, mean, etc. are used to measure the density threshold, radius, etc.
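The idea in [0020] can be illustrated with a minimal Python sketch. It is not the patented procedure, whose exact formulas are not given in this text. It assumes that each fixed-size time window derives its neighborhood radius from its own standard deviation and that a point is an outlier when too few other points in the window fall within that radius. The function name windowed_density_outliers and the parameters window, radius_scale, and min_fraction are hypothetical.

```python
from statistics import pstdev


def windowed_density_outliers(values, window=10, radius_scale=0.5, min_fraction=0.3):
    """Return indices of points with too few neighbors inside their time window.

    Assumptions (not from the patent text): the radius is radius_scale times
    the window's standard deviation, and the density threshold is
    min_fraction of the window size.
    """
    outliers = []
    for start in range(0, len(values), window):
        chunk = values[start:start + window]
        if len(chunk) < 3:
            continue  # too few points to estimate spread
        radius = radius_scale * pstdev(chunk)
        if radius == 0:
            continue  # constant window: no point stands out
        min_neighbors = max(1, int(min_fraction * len(chunk)))
        for offset, v in enumerate(chunk):
            # Count the other points in the window whose value lies within the radius.
            neighbors = sum(
                1 for j, u in enumerate(chunk)
                if j != offset and abs(u - v) <= radius
            )
            if neighbors < min_neighbors:
                outliers.append(start + offset)
    return outliers


readings = [10.0, 10.1, 9.9, 10.2, 10.0, 50.0, 10.1, 9.8, 10.0, 10.1]
flagged = windowed_density_outliers(readings)
cleaned = [v for i, v in enumerate(readings) if i not in set(flagged)]
```

Because the radius is derived from the spread of each window, no absolute threshold has to be chosen per parameter. The scale factor and the density fraction remain tuning choices in this sketch.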

[0021] According to the technical solution of the present invention, the method includes a data configuration module; a module for loading the data set to be identified, converting its data format, and cleaning it; a module for identifying outlier points based on vari...
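As a rough illustration of the loading, format conversion, and cleaning step named in [0021], the following Python sketch converts raw text rows into sorted numeric samples. The input layout (pairs of ISO 8601 timestamp text and value text) and the function name load_and_clean are assumptions made for this example; the source does not specify a data format.

```python
from datetime import datetime


def load_and_clean(rows):
    """Convert (timestamp_text, value_text) pairs to time-sorted (datetime, float) samples.

    Rows that cannot be parsed, or whose value is NaN, are dropped.
    """
    samples = []
    for ts_text, value_text in rows:
        try:
            ts = datetime.fromisoformat(ts_text.strip())
            value = float(value_text)
        except ValueError:
            continue  # skip rows with an unparseable timestamp or value
        if value != value:  # NaN is the only float not equal to itself
            continue
        samples.append((ts, value))
    samples.sort(key=lambda sample: sample[0])
    return samples


raw_rows = [
    ("2012-06-20T10:00:02", "1.5"),
    ("not a time", "2.0"),
    ("2012-06-20T10:00:01", "nan"),
    ("2012-06-20T10:00:00", " 3.0 "),
    ("2012-06-20T10:00:03", ""),
]
samples = load_and_clean(raw_rows)
```

Sorting by timestamp matters here because the identification step described above divides the series into time windows.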



Abstract

The invention provides a method for automatically removing outlier points from time series data. The outlier identification method is general across data, i.e., the same outlier identification rule can be applied to parameters of different forms, and the influence of prior conditions such as expert knowledge is reduced. The method uses an identification parameter configuration module; a module for loading the data set to be identified, converting its data format, and cleaning it; a variance-based density clustering outlier identification module; an outlier identification result explanation module; and necessary components such as a graphical view component for data analysis results and a user interaction component. By using variance-based density clustering combined with the inherent characteristics of time series data, the method identifies outlier points automatically so that a data analyst can clean the data. This reduces the influence of outlier data on the precision of data analysis and on judgment results, and prevents the data analysis results from becoming invalid.
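One way to read the claim that the same identification rule applies to parameters of different forms is that a radius derived from the data's own variance adapts to each parameter's scale. The Python sketch below illustrates that reading only; the function flag_outliers, its default settings, and the two sample series are hypothetical and are not taken from the patent.

```python
from statistics import pstdev


def flag_outliers(values, radius_scale=0.5, min_neighbors=2):
    """Flag points that have fewer than min_neighbors other points within a variance-scaled radius."""
    radius = radius_scale * pstdev(values)
    return [
        i for i, v in enumerate(values)
        # The sum counts the point itself, so subtract 1 to get the other points.
        if sum(abs(u - v) <= radius for u in values) - 1 < min_neighbors
    ]


# Two parameters with very different units and magnitudes, same rule and same settings.
temperature_c = [21.0, 21.2, 20.9, 21.1, 35.0, 21.0]
pressure_pa = [101300.0, 101320.0, 101310.0, 90000.0, 101305.0, 101315.0]

temperature_flags = flag_outliers(temperature_c)
pressure_flags = flag_outliers(pressure_pa)
```

No per-parameter threshold is supplied in either call, which is the sense in which prior expert input is reduced in this sketch.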

Description

Technical field

[0001] The invention belongs to the field of intelligent information processing and computer technology, and in particular relates to a method for automatically removing outlier points from time series data that applies to different time series parameter data.

Background

[0002] Due to environmental interference, random interference, transmission noise, etc., real time series data often contain a large number of outliers, that is, data far outside the allowable range. These outliers are not normal measurement data but noise points. If outliers enter the calculation directly without processing, they often reduce the accuracy of data analysis, interfere with normal judgment results, and in severe cases even invalidate the data analysis results. Human experts can distinguish outliers from normal values more accurately thanks to their rich professional knowledge and experience. But there are many difficu...


Application Information

Patent Type & Authority: Applications (China)
IPC: IPC(8): G06F19/00
Inventors: 鲍军鹏, 赵静
Owner: XI AN JIAOTONG UNIV