
Fast spectral clustering method based on multi-scale data structure

A multi-scale data structure technique, applied in the field of clustering, that addresses the problems that the DBSCAN algorithm handles high-dimensional data poorly, that clustering distances differ greatly between clusters, and that the computational cost is high.

Active Publication Date: 2019-07-05
SUZHOU UNIV
6 Cites, 1 Cited by

AI Technical Summary

Problems solved by technology

[0007] Although the K-means algorithm processes data efficiently, it cannot effectively handle non-convex data sets. It is therefore usable only as one component of a data processing method and cannot complete some data classification tasks on its own.
[0008] The DBSCAN algorithm does not handle high-dimensional data well. In addition, if the density distribution of the data is uneven and the clustering distances differ greatly, the clustering result will be poor.
[0009] The spectral clustering algorithm, one of the most effective clustering algorithms at present, handles many types of data well, and its characteristics also give it advantages on high-dimensional data. However, spectral clustering must construct a similarity matrix and then solve for the eigenvectors of that matrix, and the computational overhead of both steps is very high. For today's large-scale data sets and large-scale image data, such a computational burden is unacceptable.
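
To make the cost described in [0009] concrete, the following minimal sketch builds the dense point-to-point similarity matrix that standard spectral clustering starts from. The Gaussian kernel, the bandwidth sigma, and the function name dense_similarity are assumptions chosen for illustration; the source does not specify them. For n points the matrix holds n^2 entries, and a dense eigen-decomposition of it typically costs on the order of n^3 operations.

```python
import math

def dense_similarity(points, sigma=1.0):
    """Build the full n x n similarity matrix (Gaussian kernel is an assumption)."""
    n = len(points)
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            # Squared Euclidean distance between points i and j.
            d2 = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
            w = math.exp(-d2 / (2.0 * sigma ** 2))
            W[i][j] = W[j][i] = w  # the matrix is symmetric
    return W

points = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0)]
W = dense_similarity(points)
print(len(W) * len(W[0]))  # 16 stored entries for n = 4
```

Doubling the number of points quadruples the number of entries, which is why this step alone becomes impractical on large data sets.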



Embodiment Construction

[0046] The present invention will be further described below in conjunction with the accompanying drawings and specific embodiments, so that those skilled in the art can better understand the present invention and implement it, but the examples given are not intended to limit the present invention.

[0047] Step 1: For input d-dimensional spatial data V = {v1, v2, ..., vn}, use the K-d tree algorithm to preprocess the data to obtain a series of data sets U = {u1, u2, ..., um} (where n is the number of data points, d is the dimension of the data, and m is the number of data sets) and a tree structure. The specific steps are as follows:

[0048] Step 1.1: Construct the root node S0, whose data is the entire data set V. Calculate the variance V on each dimension of the data set, find the dimension corresponding to the maximum variance maxV, and denote it maxDim (because the larger the variance, the lower the coupling between the data, and the lower the coupling between the data...
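
Step 1.1 can be sketched as follows. Only the choice of the maximum-variance dimension (maxDim) comes from the text above; because the remainder of the step is truncated in this source, the median split, the leaf_size stopping rule, and the function names max_variance_dim and kd_partition are assumptions made for illustration, not the patent's stated procedure.

```python
import statistics

def max_variance_dim(points):
    """Return the index of the dimension with the largest variance (maxDim)."""
    d = len(points[0])
    variances = [statistics.pvariance([p[k] for p in points]) for k in range(d)]
    return max(range(d), key=lambda k: variances[k])

def kd_partition(points, leaf_size=2):
    """Recursively split points and return the leaf sets (hypothetical helper)."""
    if leaf_size < 1:
        raise ValueError("leaf_size must be at least 1")
    if len(points) <= leaf_size:
        return [points]
    max_dim = max_variance_dim(points)
    ordered = sorted(points, key=lambda p: p[max_dim])
    mid = len(ordered) // 2  # median split: an assumption, not from the source
    return (kd_partition(ordered[:mid], leaf_size)
            + kd_partition(ordered[mid:], leaf_size))

# Toy data: almost all of the spread is along the second coordinate.
V = [(0.0, 0.0), (0.2, 9.0), (0.1, 1.0), (0.3, 10.0),
     (0.0, 0.5), (0.1, 9.5), (0.2, 0.2), (0.3, 9.8)]
U = kd_partition(V, leaf_size=2)
print(len(U))  # number of data sets m
```

Each leaf of the recursion plays the role of one data set in U, so m is controlled by the stopping rule rather than fixed in advance.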


Abstract

The invention discloses a fast spectral clustering method based on a multi-scale data structure. The method comprises the following steps: step 1, for the input d-dimensional spatial data V = {v1, v2, ..., vn}, adopting a K-d tree algorithm to preprocess the data to obtain a series of data sets U = {u1, u2, ..., um} (where n is the number of data points, d is the dimension of the data, and m is the number of data sets) and a tree structure; step 2, calculating a similarity matrix W between the sets, where, in the concrete implementation of the kernel function used to calculate W, some sampling points are selected from the sets and the similarity of two sets is measured using the Euclidean distance between the sampling points. The beneficial effects of the method are that the K-d tree algorithm is innovatively adopted to obtain the series of data sets, and the original similarity matrix constructed from individual data points is replaced by the similarity matrix calculated between the sets.
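
Step 2 of the abstract can be illustrated with the minimal sketch below. The abstract only states that sampling points are drawn from each set and compared by Euclidean distance; the Gaussian kernel, the use of the mean sampled distance, the diagonal value of 1.0, and the names set_similarity and set_similarity_matrix are all assumptions made for illustration.

```python
import math
import random

def set_similarity(set_a, set_b, n_samples, sigma, rng):
    """Similarity of two sets from sampled points (kernel choice is assumed)."""
    sa = rng.sample(set_a, min(n_samples, len(set_a)))
    sb = rng.sample(set_b, min(n_samples, len(set_b)))
    # Mean Euclidean distance over all sampled pairs.
    dists = [math.dist(p, q) for p in sa for q in sb]
    mean_d = sum(dists) / len(dists)
    return math.exp(-mean_d ** 2 / (2.0 * sigma ** 2))

def set_similarity_matrix(sets, n_samples=3, sigma=1.0, seed=0):
    """Build the m x m matrix W between sets instead of n x n between points."""
    rng = random.Random(seed)
    m = len(sets)
    W = [[1.0] * m for _ in range(m)]  # diagonal fixed to 1.0 by assumption
    for i in range(m):
        for j in range(i + 1, m):
            w = set_similarity(sets[i], sets[j], n_samples, sigma, rng)
            W[i][j] = W[j][i] = w  # compute once, mirror for symmetry
    return W

# Three toy sets: the first two lie close together, the third is far away.
U = [
    [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1)],
    [(0.3, 0.0), (0.4, 0.2)],
    [(5.0, 5.0), (5.1, 5.2), (5.2, 4.9)],
]
W = set_similarity_matrix(U)
```

Here W has m x m = 9 entries for 8 underlying points; with m much smaller than n, both the matrix construction and the subsequent eigenvector computation operate on a far smaller problem.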

Description

Technical field

[0001] The invention relates to the field of clustering, in particular to a fast spectral clustering method based on a multi-scale data structure.

Background

[0002] From the perspective of machine vision and machine learning, clustering is an unsupervised learning process that groups data according to the similarity between data items, so that data in the same category have the greatest similarity to each other while data in different categories have the least similarity to each other. Data clustering is widely used in the segmentation of medical images and the classification of financial data. In recent years, with the development of artificial intelligence, machine learning, and computer vision, research on data clustering methods has become particularly important. Currently, data clustering methods generally fall into the following categories: 1. Partition-based clustering methods; 2. Density-based clustering methods; 3. Graph theo...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06K9/62
CPC: G06F18/23
Inventor: 陈旻昕, 张重阳, 朱国丰, 吴晨健, 陈虹
Owner: SUZHOU UNIV