Fast spectral clustering method based on multi-scale data structure

A multi-scale data structure technique, applied in the field of clustering, that addresses three problems: the DBSCAN algorithm handles high-dimensional data poorly, clustering degrades when clusters differ greatly, and the computational cost is high.

Active Publication Date: 2019-07-05

SUZHOU UNIV

Problems solved by technology

[0007] Although the K-means algorithm processes data efficiently, it cannot effectively handle non-convex data sets. It can serve only as one component of a data processing method and cannot complete some data classification tasks on its own.

[0008] The DBSCAN algorithm does not handle high-dimensional data well. In addition, if the density distribution of the data is uneven and the clustering distances differ greatly, the clustering result is poor.

[0009] The spectral clustering algorithm, one of the most effective clustering algorithms at present, handles many types of data well, and its characteristics also give it advantages on high-dimensional data. However, spectral clustering must construct a similarity matrix and then solve for the eigenvectors of that matrix. Both steps are computationally expensive, and for today's large-scale data sets and large-scale image data this computational burden is unacceptable.
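To make the cost in [0009] concrete, here is a minimal sketch of the dense similarity matrix that conventional spectral clustering builds over all n points. The Gaussian kernel, the `sigma` parameter, and the function name are assumptions for illustration; this passage does not name a kernel.

```python
import math

def gaussian_similarity_matrix(points, sigma=1.0):
    """Dense n x n similarity matrix (Gaussian kernel is an assumed choice)."""
    n = len(points)
    # The diagonal is left at 0.0, a common convention for similarity graphs.
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # Squared Euclidean distance between point i and point j.
            d2 = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
            W[i][j] = W[j][i] = math.exp(-d2 / (2.0 * sigma ** 2))
    return W

points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)]
W = gaussian_similarity_matrix(points)
# Storage and pairwise work both grow quadratically with the number of points.
print(len(W) * len(W[0]))
```

Every pair of points contributes one kernel evaluation, so both memory and time scale as n squared before the eigenvector step even begins.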

Embodiment Construction

[0046] The present invention will be further described below in conjunction with the accompanying drawings and specific embodiments, so that those skilled in the art can better understand the present invention and implement it, but the examples given are not intended to limit the present invention.

[0047] Step 1: For the input d-dimensional spatial data V = {v1, v2, ..., vn}, use the K-d tree algorithm to preprocess the data to obtain a series of data sets U = {u1, u2, ..., um} and a tree structure, where n is the number of data points, d is the dimension of the data, and m is the number of data sets. The specific steps are as follows:

[0048] Step 1.1: Construct the root node S0, whose data is the entire data set V. Calculate the variance on each dimension of the data set, find the dimension corresponding to the maximum variance maxV, and set it to maxDim (because the larger the variance, the lower the coupling between the data, and the lower the coupling between the data...
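The partitioning in Step 1 can be sketched roughly as follows. Choosing the dimension with the largest variance follows Step 1.1. The median split, the `leaf_size` stopping rule, and the function names are assumptions, since the source text is truncated before it states how a node is divided or when splitting stops.

```python
from statistics import pvariance

def max_variance_dim(points):
    """Index of the dimension with the largest variance (maxDim in Step 1.1)."""
    d = len(points[0])
    variances = [pvariance([p[k] for p in points]) for k in range(d)]
    return max(range(d), key=lambda k: variances[k])

def kd_partition(points, leaf_size):
    """Split recursively until each leaf holds at most leaf_size points.

    Assumption: nodes are split at the median along maxDim, as in a
    standard k-d tree; leaf_size must be at least 1.
    """
    if len(points) <= leaf_size:
        return [points]
    dim = max_variance_dim(points)
    ordered = sorted(points, key=lambda p: p[dim])
    mid = len(ordered) // 2
    return kd_partition(ordered[:mid], leaf_size) + kd_partition(ordered[mid:], leaf_size)

points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
leaves = kd_partition(points, leaf_size=2)
print(leaves)
```

The returned leaves play the role of the data sets U = {u1, u2, ..., um}; a fuller version would also keep the parent and child links to retain the tree structure.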

Abstract

The invention discloses a fast spectral clustering method based on a multi-scale data structure. The method comprises the following steps: Step 1, for the input d-dimensional spatial data V = {v1, v2, ..., vn}, a K-d tree algorithm is used to preprocess the data to obtain a series of data sets U = {u1, u2, ..., um} (where n is the number of data points, d is the dimension of the data, and m is the number of data sets) and a tree structure; Step 2, a similarity matrix W between the sets is calculated, where, in the concrete implementation of the kernel function used to compute W, some sampling points are selected from each set and the similarity of two sets is measured by the Euclidean distance between their sampling points. The beneficial effect of the method is that the K-d tree algorithm is innovatively adopted to obtain the series of data sets, and the original similarity matrix constructed over individual data points is replaced by a similarity matrix calculated between the sets.
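A minimal sketch of the set-level similarity matrix described in Step 2 is given below. The abstract states only that sampling points are drawn from the sets and compared by Euclidean distance. Averaging the pairwise distances, passing the result through a Gaussian kernel, the parameters `k` and `sigma`, and all function names are assumptions made for illustration.

```python
import math
import random

def sample_points(group, k, rng):
    """Take up to k sampling points from a set (all of them if it is small)."""
    return list(group) if len(group) <= k else rng.sample(list(group), k)

def set_similarity(set_a, set_b, k, sigma, rng):
    """Similarity of two sets from Euclidean distances between sampled points."""
    sa = sample_points(set_a, k, rng)
    sb = sample_points(set_b, k, rng)
    dists = [math.dist(p, q) for p in sa for q in sb]
    mean_d = sum(dists) / len(dists)  # assumed aggregation: mean distance
    return math.exp(-mean_d ** 2 / (2.0 * sigma ** 2))  # assumed kernel

def set_similarity_matrix(sets, k=3, sigma=1.0, seed=0):
    """m x m matrix over sets, in place of an n x n matrix over points."""
    rng = random.Random(seed)
    m = len(sets)
    W = [[0.0] * m for _ in range(m)]
    for i in range(m):
        for j in range(i + 1, m):
            W[i][j] = W[j][i] = set_similarity(sets[i], sets[j], k, sigma, rng)
    return W

sets = [
    [(0.0, 0.0), (0.0, 1.0)],
    [(1.0, 0.0), (1.0, 1.0)],
    [(10.0, 10.0), (10.0, 11.0)],
]
W_sets = set_similarity_matrix(sets)
print(len(W_sets), "x", len(W_sets[0]))
```

Because m is typically much smaller than n, both the matrix construction and the subsequent eigenvector computation operate on far fewer entries than in the point-level formulation.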

Description

Technical field

[0001] The invention relates to the field of clustering, in particular to a fast spectral clustering method based on a multi-scale data structure.

Background

[0002] From the perspective of machine vision and machine learning, clustering is an unsupervised learning process that groups data according to the similarity between data points, so that data in the same category are as similar to each other as possible and data in different categories are as dissimilar as possible. Data clustering is widely used in the segmentation of medical images and the classification of financial data. In recent years, with the development of artificial intelligence, machine learning, and computer vision, research on data clustering methods has become particularly important. Currently, data clustering methods generally fall into the following categories: 1. Partition-based clustering methods; 2. Density-based clustering methods; 3. Graph theo...

Application Information

Patent Type & Authority: Applications (China)

IPC(8): G06K9/62

CPC: G06F18/23

Inventors: 陈旻昕, 张重阳, 朱国丰, 吴晨健, 陈虹

Owner: SUZHOU UNIV
