Method for searching for approximate sequence of given time sequence from time sequence database

Time series approximate-search technology, applied in the fields of electrical digital data processing, special data processing applications, instruments, etc.

Status: Inactive; Publication Date: 2012-10-17
FUDAN UNIV
Cites: 1; Cited by: 13

AI Technical Summary

Problems solved by technology

[0006] The purpose of the present invention is to solve the problem of approximate querying over massive time series by providing a method for searching a massive time series database for sequences approximate to a given time series.

Examples

Embodiment Construction

[0042] Next, the sequence set {(3,3,3,3), (2,2,2,2), (3,3,4,4)} is used to illustrate index construction, assuming a node threshold of 2 (a node may hold at most two sequences). The index is built as follows:

[0043] 1. Create the root node, root;

[0044] 2. Insert the first sequence (3,3,3,3). The root is empty, so the sequence is placed in it; the root's mean range is updated to [3,3] and its standard deviation range to [0,0];

[0045] 3. Insert the second sequence (2,2,2,2). The root now holds two sequences, which does not exceed the threshold; the mean range is updated to [2,3] and the standard deviation range remains [0,0];

[0046] 4. Insert the third sequence (3,3,4,4). The root's mean range is updated to [2,3.5] and its standard deviation range to [0,0.5]. The number of sequences in the node now exceeds the threshold, so node splitting is considered, =2.5;

[0047] 5. Consider the case of splitting according to the average ...
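
The range updates in steps 1 to 4 can be traced with a minimal Python sketch. It is an illustration under stated assumptions, not the patented implementation: the `Node` class and its field names are hypothetical, each sequence is treated as a single segment covering its whole length, and the standard deviation is taken as the population standard deviation (the choice that reproduces the 0.5 reported for (3,3,4,4)). The splitting strategy itself is not sketched, because the source text is truncated at that point.

```python
from dataclasses import dataclass, field
from statistics import fmean, pstdev
from typing import List, Optional, Tuple


@dataclass
class Node:
    """Hypothetical index node: stored sequences plus [min, max] ranges."""
    threshold: int
    sequences: List[Tuple[float, ...]] = field(default_factory=list)
    mean_range: Optional[Tuple[float, float]] = None
    std_range: Optional[Tuple[float, float]] = None

    def insert(self, seq: Tuple[float, ...]) -> bool:
        """Store seq, widen both ranges, and report whether a split is due."""
        m = fmean(seq)   # mean over the whole sequence (single-segment assumption)
        s = pstdev(seq)  # population standard deviation (assumption)
        self.sequences.append(seq)
        if self.mean_range is None or self.std_range is None:
            # First sequence in the node: both ranges collapse to a point.
            self.mean_range, self.std_range = (m, m), (s, s)
        else:
            self.mean_range = (min(self.mean_range[0], m), max(self.mean_range[1], m))
            self.std_range = (min(self.std_range[0], s), max(self.std_range[1], s))
        # A split is considered once the node holds more sequences than the threshold.
        return len(self.sequences) > self.threshold


root = Node(threshold=2)
for seq in [(3, 3, 3, 3), (2, 2, 2, 2), (3, 3, 4, 4)]:
    needs_split = root.insert(seq)
    print(seq, root.mean_range, root.std_range, needs_split)
```

After the third insertion the sketch reports a mean range of (2.0, 3.5), a standard deviation range of (0.0, 0.5), and a split flag of True, which agrees with step 4.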

Abstract

The invention belongs to the technical field of data mining and in particular relates to a method for searching a massive time series database for sequences approximate to a given time series. The method comprises the following steps: adopting a tree-based index node representation; constructing the index by inserting sequences one by one according to the index's algorithm framework; selecting an optimal strategy for node splitting; and querying on the basis of the DSTree index to find sequences approximate to the given time series in the massive time series database. The indexing method adjusts the length and dimensionality of the indexed subsequences according to the data distribution of the time series, and its new index representation satisfies the requirement of providing an upper distance bound, so that query efficiency can be greatly improved.

Description

Technical field

[0001] The invention belongs to the technical field of data mining, and in particular relates to a method for searching a massive time series database for sequences approximate to a given time series.

Background technique

[0002] Approximate time series query is a hot topic in data mining. For the massive time series stored in a database, quickly and accurately finding the time series most similar to a given sequence is of great significance in transportation networks, sensor networks, financial analysis, and other settings. Building an index over the time series in the database enables effective dimensionality reduction and query pruning, so that queries can be executed accurately and quickly.

[0003] The basic idea of index construction includes two aspects. On the one hand, the search space is reduced by dividing the value space of each dimension of the time series in a manner similar to the multidimensional index of the...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30
Inventors: 王鹏, 汪卫, 汪洋, 祝然威
Owner: FUDAN UNIV