Coding indexing mode for time sequences

A time series indexing technology, applied in special data processing applications, instruments, and electrical digital data processing, that addresses the problems of high data redundancy and low indexing efficiency.

Inactive Publication Date: 2014-10-15
肖瑞

AI Technical Summary

Problems solved by technology

[0003] To solve the problems of high data redundancy and low indexing efficiency in existing time series indexing methods, this invention proposes a coded indexing method based on time series trends. The method encodes a time series into a unique code value in O(n) time, then builds a B-tree index over the time series using the code values, so that time series can be retrieved quickly and similarity queries can be matched effectively through the index.
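As a rough illustration of indexing by code value, the sketch below keeps code values in sorted order and answers exact and range lookups. It is an assumption-laden stand-in, not the patented structure: Python's standard library has no B-tree, so a sorted list searched with `bisect` plays that role, and the class name `CodeIndex`, the series ids, and the sample code values are all hypothetical. Treating a range of code values as a similarity query is likewise one plausible reading that the text above does not confirm.

```python
import bisect
from fractions import Fraction
from typing import Dict, List


class CodeIndex:
    """Ordered key index; a sorted list stands in for the B-tree here."""

    def __init__(self) -> None:
        self._keys: List[Fraction] = []          # code values, kept sorted
        self._ids: Dict[Fraction, List[str]] = {}  # code value -> series ids

    def insert(self, code: Fraction, series_id: str) -> None:
        """Register a series under its code value."""
        if code not in self._ids:
            bisect.insort(self._keys, code)
            self._ids[code] = []
        self._ids[code].append(series_id)

    def lookup(self, code: Fraction) -> List[str]:
        """Exact match: ids of all series with this code value."""
        return list(self._ids.get(code, []))

    def range_query(self, low: Fraction, high: Fraction) -> List[str]:
        """Ids of all series whose code value lies in [low, high)."""
        lo = bisect.bisect_left(self._keys, low)
        hi = bisect.bisect_left(self._keys, high)
        return [sid for key in self._keys[lo:hi] for sid in self._ids[key]]


# Illustrative code values only; real ones would come from the encoder.
index = CodeIndex()
index.insert(Fraction(57, 64), "ts-a")
index.insert(Fraction(13, 16), "ts-b")
index.insert(Fraction(1, 4), "ts-c")

exact = index.lookup(Fraction(57, 64))
nearby = index.range_query(Fraction(3, 4), Fraction(1))
```

A production system would use the database's own B-tree index on a stored code column; the sorted list only mirrors the ordered-key behaviour that makes range scans cheap.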



Examples


Embodiment Construction

[0011] The implementation consists of four parts. The first part converts uncertain time series into definite expected sequences. The second part divides the time series into intervals and maps them to symbols, yielding symbol sequences. The third part applies arithmetic coding to the symbol sequences. The fourth part is the algorithm analysis.

[0012] Part one:

[0013] A definite time series is represented as an ordered sequence of definite sampled values, one at each time point. In an uncertain time series, the uncertainty is represented as a set of sample observations at each time point: the value at each time point is a random variable, and the uncertain time series is treated as a time-ordered sequence of random variables.

[0014] Definition 1. (Time series) An uncertain time series of length n is a sequence of n elements, written as: TS = {(t_1, X_1, P_1 ...
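The conversion of an uncertain series into a definite expected sequence can be sketched as follows. Because Definition 1 is truncated here, the sketch assumes that each element carries a time point, a set of observed values X, and matching probabilities P, and that the "expected sequence" is the probability-weighted mean at each time point. The type alias `UncertainPoint`, the function name `expected_sequence`, and the sample data are hypothetical.

```python
from typing import List, Sequence, Tuple

# Assumed element layout: (time point, observed values, their probabilities).
UncertainPoint = Tuple[float, Sequence[float], Sequence[float]]


def expected_sequence(ts: List[UncertainPoint]) -> List[Tuple[float, float]]:
    """Collapse each time point's observations into their expected value."""
    result = []
    for t, values, probs in ts:
        if len(values) != len(probs):
            raise ValueError("values and probabilities must have equal length")
        total = sum(probs)
        if total <= 0:
            raise ValueError("probabilities must sum to a positive number")
        # Divide by the total so slightly unnormalised weights still work.
        mean = sum(v * p for v, p in zip(values, probs)) / total
        result.append((t, mean))
    return result


# Illustrative data only.
uncertain = [
    (1, [10.0, 12.0], [0.5, 0.5]),
    (2, [11.0, 15.0], [0.25, 0.75]),
    (3, [9.0], [1.0]),
]
expected = expected_sequence(uncertain)
```

The pass touches each observation once, so the cost is linear in the total number of observations.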



Abstract

The invention relates to a fast indexing method that applies arithmetic coding to the variation trend of time series (both definite and uncertain time series), which is significant for prediction, classification, data mining, knowledge discovery, and similar tasks on time series. The method addresses the high data redundancy, or the low matching accuracy and indexing efficiency, caused by spatial indexing of time series. Through the index it can effectively perform similarity matching, clustering, and classification of time series (of equal or unequal length), and it greatly reduces the time and space complexity of these tasks. The method first segments the time dimension of each time series into intervals and maps the series to a symbol sequence according to its variation trend within each interval; it then applies arithmetic coding to the symbol sequence to form a unique code value; finally, it uses the code values to build a B-Tree index over the set of time series.
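The segmentation, symbol mapping, and arithmetic coding steps can be sketched as below. Several details are assumptions, since the abstract does not give them: a three-symbol trend alphabet (D for down, F for flat, U for up), a trend judged by comparing the last and first value of each interval, a fixed uniform symbol model, and an end marker `$` appended so that sequences of unequal length cannot share a code value. The names `to_symbols` and `arithmetic_code` are hypothetical.

```python
from fractions import Fraction
from typing import Dict, Sequence, Tuple

END = "$"                 # assumed end marker, keeps codes prefix-free
ALPHABET = END + "DFU"    # assumed trend symbols: Down, Flat, Up

# Assumed uniform model: each symbol owns an equal slice of [0, 1).
MODEL: Dict[str, Tuple[Fraction, Fraction]] = {
    s: (Fraction(i, len(ALPHABET)), Fraction(i + 1, len(ALPHABET)))
    for i, s in enumerate(ALPHABET)
}


def to_symbols(values: Sequence[float], width: int, eps: float = 0.0) -> str:
    """Cut the series into intervals of `width` steps and label each trend."""
    symbols = []
    for start in range(0, len(values) - 1, width):
        end = min(start + width, len(values) - 1)
        delta = values[end] - values[start]
        symbols.append("U" if delta > eps else "D" if delta < -eps else "F")
    return "".join(symbols)


def arithmetic_code(symbols: str) -> Fraction:
    """Narrow [0, 1) once per symbol; return the final interval's low end."""
    low, span = Fraction(0), Fraction(1)
    for s in symbols + END:
        s_low, s_high = MODEL[s]
        low += span * s_low
        span *= s_high - s_low
    return low


series = [1, 2, 3, 3, 3, 2, 1]
symbols = to_symbols(series, width=2)   # rise, plateau, fall
code = arithmetic_code(symbols)
```

Exact `Fraction` arithmetic is used here for clarity; the digits of the code value grow with the sequence length, so a practical encoder would use fixed-precision integer arithmetic with renormalisation instead. The resulting code values are ordered keys that can be stored in a B-Tree index.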

Description

Technical field

[0001] The invention relates to a time series indexing method that can effectively index the time series in a database, quickly retrieve the corresponding time series through the index, and effectively perform sequence similarity matching through the index.

Background technique

[0002] Because of the huge amount of data in time series databases, time series must be indexed so that retrieval and similarity matching can be completed quickly. Because time series are high-dimensional and their similarity is mostly measured by Euclidean distance, most indexing methods use spatial index structures. Broadly, the spatial indexing techniques used for time series fall into two types: tree structures (including R-tree, K-D tree, and quadtree) and grid files. They mainly include F-index, ST-index, vp-tree, FastMap, etc., but these indexing methods cause high data redundancy or reduce...


Application Information

Patent Type & Authority Applications (China)
IPC IPC(8): G06F17/30
CPC G06F16/2246
Inventor 肖瑞, 刘国华, 宋转, 肖桂来, 刘力, 刘佩, 郑宁
Owner 肖瑞