A deep K-means clustering method for time series data

A time series clustering technology, applicable to instruments, character and pattern recognition, computer components, and similar fields. It addresses the sensitivity of existing methods to noise and outliers and the poor fit of separately extracted features to clustering, so that clustering is carried out more effectively, accuracy improves, and the influence of noise is reduced.

Inactive Publication Date: 2019-06-21
SOUTH CHINA UNIV OF TECH
Cites: 0 | Cited by: 8

AI Technical Summary

Problems solved by technology

Existing time series clustering algorithms of this type have several problems: first, they are sensitive to noise and outliers; second, they are sensitive to shifts in the phase or amplitude of the time series.
Staged methods first extract the features of the time series and then perform the clustering operation.
These two stages are independent of each other, so the extracted features may not be well suited to clustering.
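
The phase-shift sensitivity mentioned above can be illustrated with a small sketch. It assumes plain Euclidean distance on raw values as the similarity measure, which is an illustrative choice and not something the patent text specifies; the series `base`, `shifted`, and `flat` are hypothetical toy data.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length series."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

n = 64
# One period of a sine wave (toy data, not from the patent).
base = [math.sin(2 * math.pi * t / n) for t in range(n)]
# The same shape shifted by half a period.
shifted = [math.sin(2 * math.pi * t / n + math.pi) for t in range(n)]
# A constant series with no shape at all.
flat = [0.0] * n

d_shift = euclidean(base, shifted)
d_flat = euclidean(base, flat)
print(d_shift, d_flat)
```

Under this measure the shifted copy of the same waveform ends up farther from the original than a flat line does, which is the kind of behavior a raw-value distance exhibits when series differ only in phase.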


Examples

Embodiment

[0054] As shown in Figure 1, the deep K-means clustering method for time series data disclosed in this embodiment includes the following steps:

[0055] Step S1: Obtain a time series data set, preprocess the data, and separate the sample information from the category information. The data come from the UCR_TS_Archive_2015 archive, of which 48 data sets are used, both synthetic and real, covering various fields. Each data set contains 56 to 9236 sequences. Sequences within the same data set have the same length, and sequence lengths across data sets range from 24 to 1882. The data sets are labeled, and each sequence belongs to exactly one class; in clustering, the class is interpreted as the cluster to which the sequence belongs.
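
A minimal sketch of Step S1 follows. It assumes each row is comma-separated text with the class label in the first field, and it uses z-normalization as the preprocessing step; the patent text does not state which preprocessing is applied, so that choice and the helper names `parse_row`, `z_normalize`, and `load_rows` are assumptions made for illustration.

```python
import math

def parse_row(line, sep=","):
    """Split one text row into (label, values); the label is assumed to be the first field."""
    fields = [f for f in line.strip().split(sep) if f]
    label = int(float(fields[0]))
    values = [float(v) for v in fields[1:]]
    return label, values

def z_normalize(values, eps=1e-8):
    """Rescale a series to zero mean and unit variance (assumed preprocessing)."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / (std + eps) for v in values]

def load_rows(lines):
    """Separate sample information (series) from category information (labels)."""
    samples, labels = [], []
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        label, values = parse_row(line)
        samples.append(z_normalize(values))
        labels.append(label)
    return samples, labels

# Toy rows in the assumed layout: label, v1, v2, ...
rows = ["1,0.0,1.0,2.0,3.0", "2,5.0,5.5,6.0,6.5"]
samples, labels = load_rows(rows)
```

Only `samples` would be fed to the model; `labels` would be held back for evaluating the clustering afterwards.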

[0056] Step S2: Construct an encoder-decoder (codec) model, in which the encoder uses a three-layer convolutional structure and the decoder uses a three-layer deconvolutional structure.
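
To make the three-layer convolution and deconvolution structure more concrete, the sketch below tracks only the sequence length through such a model, using the standard output-length formulas for 1-D convolution and transposed convolution (no dilation). The kernel size, stride, padding, and output padding are assumed values chosen for illustration; the patent text does not give them.

```python
def conv1d_out_len(length, kernel, stride, padding):
    """Output length of a 1-D convolution (no dilation)."""
    return (length + 2 * padding - kernel) // stride + 1

def deconv1d_out_len(length, kernel, stride, padding, output_padding):
    """Output length of a 1-D transposed convolution (no dilation)."""
    return (length - 1) * stride - 2 * padding + kernel + output_padding

def codec_lengths(length, layers=3, kernel=3, stride=2, padding=1, output_padding=1):
    """Return the sequence lengths after each encoder and decoder layer.

    All hyperparameters are assumptions for this sketch.
    """
    enc = [length]
    for _ in range(layers):
        enc.append(conv1d_out_len(enc[-1], kernel, stride, padding))
    dec = [enc[-1]]
    for _ in range(layers):
        dec.append(deconv1d_out_len(dec[-1], kernel, stride, padding, output_padding))
    return enc, dec

enc, dec = codec_lengths(128)
print(enc)  # [128, 64, 32, 16]
print(dec)  # [16, 32, 64, 128]
```

With these assumed settings the decoder restores the input length only when it is divisible by 8. For example, a length of 1882 comes back as 1888, so an implementation along these lines would need to pad or crop such sequences.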

[0057] Step S3, forming a...


Abstract

The invention discloses a deep K-means clustering method for time series data, which comprises the following steps: obtaining a time series data set, preprocessing it, separating the sample information and category information of the data, and inputting the sample information into a model; constructing an encoder-decoder model with a convolution-deconvolution framework; introducing the K-means clustering loss into the encoder-decoder model to form a deep K-means clustering model for time series data, fusing the extracted features with the clustering target; training the constructed model with the back-propagation algorithm to guide the generation of the hidden-layer state; and finally performing K-means clustering on the hidden-layer state representation produced by training and calculating the Rand index. Because the extracted features are fused with the clustering target, the generated hidden-layer state can not only reconstruct the original sample but also favors the formation of a cluster structure, so that the clustering operation is carried out more effectively and the clustering precision is improved.
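
The Rand index used in the final evaluation step can be computed as the fraction of sample pairs on which the predicted clustering and the reference labels agree. The sketch below is a straightforward pairwise implementation; the function name `rand_index` and the toy label lists are illustrative and not taken from the patent.

```python
from itertools import combinations

def rand_index(true_labels, pred_labels):
    """Fraction of sample pairs treated consistently by both labelings.

    A pair counts as an agreement when both labelings put the two samples
    together, or both put them apart.
    """
    if len(true_labels) != len(pred_labels):
        raise ValueError("label lists must have the same length")
    if len(true_labels) < 2:
        raise ValueError("need at least two samples")
    agree = 0
    total = 0
    for i, j in combinations(range(len(true_labels)), 2):
        same_true = true_labels[i] == true_labels[j]
        same_pred = pred_labels[i] == pred_labels[j]
        if same_true == same_pred:
            agree += 1
        total += 1
    return agree / total

# Cluster ids are arbitrary, so a relabeled but identical partition scores 1.0.
print(rand_index([0, 0, 1, 1], [1, 1, 0, 0]))  # 1.0
```

The pairwise loop is quadratic in the number of samples, which is acceptable for data sets of the sizes quoted in the embodiment.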

Description

Technical field

[0001] The present invention relates to the technical field of time series clustering, in particular to a deep K-means clustering method for time series data.

Background

[0002] Clustering is a very common task in data mining. Its purpose is to divide the data samples into a number of disjoint subsets, each of which forms a cluster. In particular, the distance between clusters should be as large as possible, and the distance within clusters should be as small as possible. Time series clustering is one type of clustering.

[0003] Time series data are of interest because they are ubiquitous in fields such as science, engineering, business, finance, economics, medical care, and government. The purpose of time series clustering is to identify groups with similar properties based on a given similarity measure. It is an important and useful technique for exploratory research on the characteristics of different groups in a given time series data set. ...
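
The goal of keeping within-cluster distances small is what the standard K-means objective measures: the sum of squared distances from each point to its assigned centroid. The sketch below is a plain K-means loop on toy 2-D points, offered as a generic illustration; it is not the patent's model, and the points, initial centroids, and function names are assumptions.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def assign(points, centroids):
    """Index of the nearest centroid for each point."""
    return [min(range(len(centroids)), key=lambda k: sq_dist(p, centroids[k]))
            for p in points]

def update(points, assignments, centroids):
    """Move each centroid to the mean of its members; keep it if the cluster is empty."""
    new_centroids = []
    for k, c in enumerate(centroids):
        members = [p for p, a in zip(points, assignments) if a == k]
        if not members:
            new_centroids.append(list(c))
        else:
            new_centroids.append([sum(m[d] for m in members) / len(members)
                                  for d in range(len(c))])
    return new_centroids

def kmeans_loss(points, assignments, centroids):
    """Within-cluster sum of squared distances (the K-means objective)."""
    return sum(sq_dist(p, centroids[a]) for p, a in zip(points, assignments))

def kmeans(points, centroids, iters=10):
    """Alternate assignment and centroid updates for a fixed number of iterations."""
    for _ in range(iters):
        assignments = assign(points, centroids)
        centroids = update(points, assignments, centroids)
    return assign(points, centroids), centroids

# Toy data: two well-separated groups (hypothetical values).
points = [[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [6.0, 5.0]]
assignments, centroids = kmeans(points, [[0.0, 0.0], [6.0, 5.0]])
loss = kmeans_loss(points, assignments, centroids)
print(assignments, centroids, loss)
```

In a deep clustering setting, the same quantity would be computed on hidden-layer representations instead of raw points, so that minimizing it pulls representations of similar series toward a shared centroid.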


Application Information

Patent Type & Authority: Applications (China)
IPC: IPC(8): G06K9/62
Inventor: 马千里, 李森, 郑佳炜
Owner: SOUTH CHINA UNIV OF TECH