Time-series-data completion method based on distance matrix

A time series completion technique based on a distance matrix, applied in the fields of electrical digital data processing and special data processing applications. It addresses three problems of existing methods: they neglect the correlation between multi-dimensional data, they are poorly interpretable, and they do not fully consider the nature of time series data.

Active Publication Date: 2018-06-29
NANJING UNIV

AI Technical Summary

Problems solved by technology

The advantage of this method is that it is simple and efficient. The disadvantage is that it ignores the correlation between multidimensional data and performs poorly when a large amount of data, or a continuous run of data, is missing.

2) Non-negative matrix decomposition method: on the premise that multiple time series share the same pattern, each sequence is assumed to be expressible as a linear combination of a set of basis vectors. Non-negative matrix decomposition is applied to the known data to find the basis vectors and the combination coefficients corresponding to each sequence, and the complete time series is restored by multiplying the coefficients by the basis vectors. The advantage of this method is that information from multiple series is fully considered. The disadvantage is that it is poorly interpretable and cannot explicitly model the various intrinsic physical laws.

3) Completion based on the hidden Markov model: the time series is assumed to be a sequence of observations behind which lies a hidden sequence of true states. The true state sequence models the internal physical laws, the mapping from states to observations expresses the relationship between states and values in the time series, and the hidden state sequence corresponding to the missing part is decoded to fill in the missing data. The advantage is that it can explicitly model physical laws, including temporal smoothness. The disadvantage is that it is not suitable for more complex spatial correlations.

In summary, the existing related methods do not fully consider the nature of time series data itself, and their treatment of temporal characteristics is limited to temporal smoothness.
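To make the non-negative matrix decomposition approach described above more concrete, here is a minimal sketch, not the implementation of any cited method. It assumes the series are stacked as rows of a matrix with `NaN` marking missing entries, and it uses masked multiplicative updates (a common textbook variant). The function name `nmf_complete` and the parameters `rank`, `iters`, `seed` and `eps` are hypothetical names chosen for illustration.

```python
import numpy as np

def nmf_complete(X, rank=2, iters=500, seed=0, eps=1e-9):
    """Fill NaN entries of a non-negative matrix X (one series per row).

    Assumption: X is approximated by W @ H, where the rows of H are the
    basis vectors and the rows of W are per-series combination coefficients.
    Only observed entries contribute to the updates.
    """
    X = np.asarray(X, dtype=float)
    M = ~np.isnan(X)                 # True where a value is observed
    Xo = np.where(M, X, 0.0)         # observed values, zeros elsewhere
    rng = np.random.default_rng(seed)
    W = rng.random((X.shape[0], rank)) + 0.1
    H = rng.random((rank, X.shape[1])) + 0.1
    for _ in range(iters):
        WH = (W @ H) * M             # reconstruction restricted to observed entries
        W *= (Xo @ H.T) / (WH @ H.T + eps)
        WH = (W @ H) * M
        H *= (W.T @ Xo) / (W.T @ WH + eps)
    # Keep observed values; take the reconstruction only where data is missing.
    return np.where(M, X, W @ H)

# Three series that share one pattern (rank 1), with one missing value.
truth = np.outer([1.0, 2.0, 3.0], [1.0, 2.0, 3.0, 4.0])
X = truth.copy()
X[1, 2] = np.nan
filled = nmf_complete(X, rank=1)
```

The sketch also illustrates the interpretability concern raised above: the factors `W` and `H` reproduce the data, but nothing in them names a physical law such as periodicity.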



Embodiment Construction

[0030] Embodiments of the present invention will be further described in detail below in conjunction with the accompanying drawings.

[0031] Through analysis of existing time series data sets, the present invention finds that time series data not only has the simple property of temporal smoothness, but also exhibits more complex high-order temporal correlations: cross-time similarity and periodicity. The data show similar and cyclical characteristics over one time span or over multiple time spans. For example, in the above-mentioned scenario of continuously recording user activities with smartphones, the user's activity data is periodic with a cycle of one week and also periodic with a cycle of one day. However, in many complex scenarios that lack prior knowledge, it is very difficult to manually mine all the periodicity behind time series data.
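As a rough illustration of how a pairwise distance matrix can expose such periodicity without prior knowledge, the sketch below averages the distance matrix along each diagonal. For a periodic series the mean distance should drop close to zero at lags that are multiples of the period. The synthetic weekly signal, the absolute-difference distance and the function name `mean_distance_by_lag` are assumptions made for this example; they are not taken from the patent text.

```python
import numpy as np

def mean_distance_by_lag(x, max_lag):
    """Mean of |x[i] - x[i + lag]| for lag = 1..max_lag.

    This is the average of the distance matrix D along its lag-th diagonal.
    """
    x = np.asarray(x, dtype=float)
    D = np.abs(x[:, None] - x[None, :])   # D[i, j] = distance between points i and j
    return np.array([np.diagonal(D, offset=lag).mean()
                     for lag in range(1, max_lag + 1)])

# Hypothetical daily samples with a weekly cycle (period of 7 points).
t = np.arange(28)
x = np.sin(2 * np.pi * t / 7)

profile = mean_distance_by_lag(x, max_lag=14)
best_lag = int(np.argmin(profile)) + 1    # lag with the smallest mean distance
```

Here `best_lag` lands on a multiple of 7, so the cycle length is recovered from the data rather than supplied by hand.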

[0032] Below in conjunction with specific embodiment, further illustra...


Abstract

The invention discloses a time series data completion method based on a distance matrix. The high-order temporal correlations within time series data are mined and used, so that missing data is completed from similar data points in the same series. The method comprises the following steps: first, a distance matrix D of the time series is built from the data using a chosen distance measure, where the element D_ij in the i-th row and j-th column is the distance between the i-th and j-th data points of the series; second, based on the distance matrix D, the k segments of the original time series that are closest to the segment with missing data are found; third, the data of the segment with missing data is completed from these k nearest segments. The method achieves a good completion effect in real scenarios with missing time series data. It is also highly interpretable, its physical meaning is clear, and it can be extended in many ways, so it can be applied effectively in a variety of real scenarios.
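The three steps in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented procedure: the distance measure is the absolute difference, the segment with missing data is extended by `context` observed points on each side, the distance between two segments is the mean of the D entries over the observed offsets, and the fill value is the plain average of the k nearest segments. The function names and the `k` and `context` parameters are hypothetical.

```python
import numpy as np

def distance_matrix(x):
    """Step 1: D[i, j] = |x[i] - x[j]| (NaN where either point is missing)."""
    x = np.asarray(x, dtype=float)
    return np.abs(x[:, None] - x[None, :])

def complete_gap(x, k=2, context=3):
    """Fill the NaN gap in x from its k nearest fully observed segments."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    missing = np.isnan(x)
    gap = np.flatnonzero(missing)
    if gap.size == 0:
        return x.copy()
    # Target segment: the gap plus `context` points on each side.
    lo = max(gap[0] - context, 0)
    hi = min(gap[-1] + context + 1, n)
    width = hi - lo
    offsets = np.arange(width)
    observed = ~missing[lo:hi]            # offsets with known values in the target
    D = distance_matrix(x)

    # Step 2: score every fully observed candidate segment against the target.
    candidates = []
    for s in range(n - width + 1):
        if missing[s:s + width].any():
            continue                      # skip segments that overlap missing data
        d = D[lo + offsets[observed], s + offsets[observed]].mean()
        candidates.append((d, s))
    if not candidates:
        raise ValueError("no fully observed segment of the required width")
    nearest = [s for _, s in sorted(candidates)[:k]]

    # Step 3: fill each missing offset with the average over the k nearest segments.
    filled = x.copy()
    for t in offsets[~observed]:
        filled[lo + t] = np.mean([x[s + t] for s in nearest])
    return filled

# Periodic toy series with two consecutive missing points.
truth = np.array([0, 1, 2, 1] * 7, dtype=float)
series = truth.copy()
series[[9, 10]] = np.nan
completed = complete_gap(series, k=2, context=3)
```

Because the nearest segments are actual stretches of the same series, each filled value can be traced back to the time spans it was copied from, which is the interpretability property the abstract emphasizes.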

Description

Technical field

[0001] The invention belongs to the field of computer applications. It relates to an efficient data completion method for data loss caused by equipment performance limitations, network transmission errors, user privacy protection and other reasons during time series data collection and transmission, and specifically to a time series data completion method based on a distance matrix.

Background technique

[0002] Time series data is a collection of observations obtained in chronological order. Its main properties are large data volume, high dimensionality, and the need for continuous updating. Time series data is ubiquitous in many different kinds of applications, such as behavior capture, sensor networks, weather forecasting, and financial market modeling. The main purpose of analyzing time series is to identify the underlying patterns behind the data in order to predict future trends. There are many existing mathematical tools fo...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30, G06K9/62
CPC: G06F16/215, G06F18/24143
Inventors: 汪亮, 吴思萌, 陶先平, 吕建
Owner: NANJING UNIV