Time series data filling and restoring method based on machine learning

A time series and machine learning technology, applied in the fields of machine learning, electrical digital data processing, and special data processing applications. It addresses problems such as the difficulty of filling and restoring large-scale data, time-consuming model training and prediction, and reduced data availability. Its effects include resistance to overfitting, effective and practical time series features, and a higher upper limit on prediction performance.

Active Publication Date: 2019-11-15
杭州知衣科技有限公司

AI Technical Summary

Problems solved by technology

[0006] 1. Simple mean filling, as well as filling methods based on data association or density, introduces serious bias into the restored data because of data volatility, which reduces the availability of the collected data;
[0007] 2. The deep-learning-based data restoration methods used in industry are prone to overfitting and are time-consuming to train and run, which makes them impractical for filling and restoring large-scale data.



Examples


Embodiment Construction

[0051] The technical solutions of the present invention will be further described and illustrated through specific examples below.

[0052] Unless otherwise specified, the methods used in the embodiments of the present invention are conventional methods in the art.

[0053] The present invention provides a method for filling and restoring time series data based on machine learning. Specifically, the steps of the method are as follows:

[0054] Assumptions:

[0055] The sliding time window is N, measured in units of the sampling time interval L (day / hour / minute); the accumulated value collected at sampling moment i is T(i).

[0056] Taking an e-commerce website as an example, the monthly sales volume shown for a product is the cumulative sales over the last 30 days. To calculate the daily sales volume of the product, the monthly sales value would normally need to be collected once a day, so N=30 days, L= ...
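The relationship between the windowed cumulative value and the per-interval value can be illustrated with a minimal sketch. It assumes that T(i) is exactly the sum of the last N per-interval values, so that T(i) - T(i-1) = d(i) - d(i-N). The symbol d and the function names `rolling_sums` and `recover_daily` are hypothetical and do not come from the patent text.

```python
def rolling_sums(daily, n):
    """Return T(i): the sum of the last n daily values, for every full window."""
    return [sum(daily[i - n + 1:i + 1]) for i in range(n - 1, len(daily))]


def recover_daily(totals, first_window):
    """Recover daily values from rolling totals.

    Assumes the daily values of the first window are known and that
    totals[k] - totals[k - 1] == d(newest day) - d(day that left the window).
    """
    n = len(first_window)
    daily = list(first_window)
    for k in range(1, len(totals)):
        # The day entering the window has index n - 1 + k;
        # the day leaving the window has index k - 1.
        daily.append(totals[k] - totals[k - 1] + daily[k - 1])
    return daily


daily = [3, 1, 4, 1, 5, 9, 2, 6]
totals = rolling_sums(daily, 3)
restored = recover_daily(totals, daily[:3])
```

Because each recovered value depends on an earlier recovered value, a single missing or wrong sample propagates forward under this assumption, which is one way to read the error accumulation that the abstract mentions.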


Abstract

The invention relates to the technical field of computer time series data analysis and prediction, in particular to a time series data filling and restoring method based on machine learning. The method includes: filling missing values by using a domain-based median and mean value filling method; estimating the true value at an expected sampling moment through a linear rule; detecting wave crests and wave troughs of the time series and smoothing abnormal values; and taking hundreds of thousands of collected real data records as samples, designing and generating time series features, taking the real results as labels, and training a machine learning model based on XGBoost (eXtreme Gradient Boosting) to predict a large amount of unknown data. The method addresses the problems of numerous missing values, large volatility, and error accumulation in specific time series data, and effectively improves the accuracy of data filling and restoration. Moreover, the complexity of the machine learning model is well controlled, the filling and restoration of hundreds of millions of data records can be completed on the order of hours, and the method has high practical value.
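The first two preprocessing steps named in the abstract can be sketched as follows. This is an illustration under stated assumptions, not the patented procedure: the abstract does not give the neighborhood size or the exact form of the linear rule, so the `radius` parameter, the use of plain linear interpolation, and the function names `fill_with_neighbor_median` and `linear_estimate` are all hypothetical.

```python
from statistics import median


def fill_with_neighbor_median(values, radius=2):
    """Replace None entries with the median of observed values nearby.

    `radius` (an assumed parameter) is how many positions on each side
    are searched for observed values.
    """
    filled = list(values)
    for i, v in enumerate(values):
        if v is None:
            window = values[max(0, i - radius):i + radius + 1]
            neighbors = [x for x in window if x is not None]
            if neighbors:  # leave the gap if nothing nearby was observed
                filled[i] = median(neighbors)
    return filled


def linear_estimate(t0, v0, t1, v1, t):
    """Estimate the value at time t on the line through (t0, v0) and (t1, v1)."""
    if t1 == t0:
        return v0
    return v0 + (v1 - v0) * (t - t0) / (t1 - t0)


filled = fill_with_neighbor_median([10, None, 14, 16], radius=1)
estimate = linear_estimate(0, 100, 10, 200, 5)
```

Here `linear_estimate` stands in for the case where a sample was actually taken slightly before or after the expected sampling moment and the value at the expected moment has to be inferred from two nearby samples.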

Description

Technical field

[0001] The invention relates to the technical field of computer time series data analysis and prediction, in particular to a method for filling and restoring time series data based on machine learning.

Background technique

[0002] At present, information technology is widely used in all walks of life and continuously generates various related data, and data collection and mining technology is also emerging, providing strong support for management decisions in related industries and improving economic and social benefits.

[0003] Data acquisition is the process of collecting, identifying, and selecting data from data sources. Data collection can be divided into real-time collection and interval collection. Real-time acquisition refers to the acquisition of data during its existence. Interval collection refers to the collection of data at equally spaced time points. Ideal real-time collection can retain the original data to the greatest extent, thus provid...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/50, G06N20/00
CPC: G06N20/00
Inventor: 郑泽宇, 温苗苗, 尚文祥, 李鸽, 李娜, 何治, 胡海滨, 何辉辉, 石磊
Owner: 杭州知衣科技有限公司