Fast spectral clustering method based on multi-scale data structure

A multi-scale data structure technique, applied in the field of clustering, that addresses problems such as an unacceptable computational burden, the inability to effectively process non-convex data sets, and high computational overhead.

Active Publication Date: 2020-10-30

SUZHOU UNIV

Problems solved by technology

[0007] Although the K-means algorithm processes data efficiently, it cannot effectively handle non-convex data sets. It can serve only as one component of a larger data processing method and cannot complete some data classification tasks on its own.

[0008] The DBSCAN algorithm does not handle high-dimensional data well. In addition, if the density of the data is unevenly distributed and the distances between clusters differ greatly, the clustering result is poor.

[0009] The spectral clustering algorithm, one of the most effective clustering algorithms at present, handles many types of data well, and its characteristics also give it advantages on high-dimensional data. However, during data processing the spectral clustering algorithm must construct a similarity matrix and then solve for the eigenvectors of that matrix. The computational overhead of these two steps is very high. For today's large-scale data and large-scale image data, such a computational burden is unacceptable.
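To make the cost described in [0009] concrete, the following is a minimal sketch of the first expensive step: building a dense similarity matrix over all n data points. The Gaussian kernel, the `sigma` parameter, the zero diagonal, and the function name are assumptions chosen for illustration; the source does not specify which kernel is used. The point is only that the matrix has n × n entries, so memory and time grow quadratically with n before any eigenvectors are computed.

```python
import math

def gaussian_similarity_matrix(points, sigma=1.0):
    """Dense n x n similarity matrix (Gaussian kernel is an assumption)."""
    n = len(points)
    # Diagonal left at 0.0 (no self-similarity), a common convention.
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            sq_dist = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
            W[i][j] = W[j][i] = math.exp(-sq_dist / (2 * sigma ** 2))
    return W

# Four 2-dimensional points: the matrix already holds 4 * 4 entries.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0)]
W = gaussian_similarity_matrix(points)
entries = sum(len(row) for row in W)  # grows as n * n
```

Doubling the number of points quadruples the number of entries, which is why replacing points with a much smaller number of sets can reduce the cost substantially.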

Embodiment Construction

[0046] The present invention will be further described below in conjunction with the accompanying drawings and specific embodiments, so that those skilled in the art can better understand the present invention and implement it, but the examples given are not intended to limit the present invention.

[0047] Step 1: For the input d-dimensional spatial data V = {v_1, v_2, ..., v_n}, use the K-d tree algorithm to preprocess the data, obtaining a series of data sets U = {u_1, u_2, ..., u_m} and a tree structure (where n is the number of data points, d is the dimension of the data, and m is the number of data sets). The specific steps are as follows:

[0048] Step 1.1: Construct the root node S_0, whose data is the entire data set V. Compute the variance of the data set along each dimension, find the dimension corresponding to the maximum variance maxV, and denote it maxDim (because the larger the variance, the lower the coupling between the data, and the lower the coupling between the data...
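The dimension selection in Step 1.1 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `max_variance_dimension` is hypothetical, and the use of the population variance (rather than the sample variance) is an assumption that does not change which dimension is selected. What happens after maxDim is chosen is truncated in the text above and is not reproduced here.

```python
from statistics import pvariance

def max_variance_dimension(data):
    """Return (maxDim, per-dimension variances) for a list of d-dimensional points."""
    d = len(data[0])
    # Variance of the data set along each of the d dimensions.
    variances = [pvariance([point[k] for point in data]) for k in range(d)]
    # maxDim is the index of the dimension with the largest variance (maxV).
    max_dim = max(range(d), key=lambda k: variances[k])
    return max_dim, variances

# Root node S_0 holds the whole data set V; dimension 0 is far more spread out.
V = [(0.0, 1.0), (10.0, 2.0), (20.0, 3.0), (30.0, 4.0)]
max_dim, variances = max_variance_dimension(V)
```

For this toy data set the first coordinate spans 0 to 30 while the second spans only 1 to 4, so the first dimension is chosen as maxDim.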

Abstract

The invention discloses a fast spectral clustering method based on a multi-scale data structure. The method comprises: Step 1: for the input d-dimensional spatial data V = {v_1, v_2, ..., v_n}, use the K-d tree algorithm to preprocess the data, obtaining a series of data sets U = {u_1, u_2, ..., u_m} and a tree structure (where n is the number of data points, d is the dimension of the data, and m is the number of data sets); Step 2: calculate the similarity matrix W between sets. In the concrete implementation, the kernel function used to calculate W selects some sampling points from each set and uses the Euclidean distance between the sampling points to measure the similarity of the two sets. Beneficial effects of the invention: the method innovatively adopts the K-d tree algorithm to obtain a series of data sets, and replaces the original similarity matrix built over individual data points with a similarity matrix computed between sets.
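The two steps in the abstract can be sketched roughly as below. This is a hedged illustration under several assumptions that are not stated in the source: the split at the median of the maximum-variance dimension, the `leaf_size` stopping rule, the number of sampling points, averaging the pairwise Euclidean distances, and the Gaussian kernel with `sigma` are all choices made for this example, and every function name is hypothetical.

```python
import math
import random
from statistics import median, pvariance

def kd_partition(points, leaf_size):
    """Step 1 sketch: recursively split points into leaf sets u_1..u_m."""
    if len(points) <= leaf_size:
        return [points]
    d = len(points[0])
    dim = max(range(d), key=lambda k: pvariance([p[k] for p in points]))
    cut = median(p[dim] for p in points)  # median split is an assumption
    left = [p for p in points if p[dim] < cut]
    right = [p for p in points if p[dim] >= cut]
    if not left or not right:  # degenerate split: stop here
        return [points]
    return kd_partition(left, leaf_size) + kd_partition(right, leaf_size)

def set_similarity(a, b, n_samples, sigma, rng):
    """Similarity of two sets from Euclidean distances between sampled points."""
    sa = rng.sample(a, min(n_samples, len(a)))
    sb = rng.sample(b, min(n_samples, len(b)))
    # Averaging the sampled pairwise distances is an assumption.
    mean_dist = sum(math.dist(p, q) for p in sa for q in sb) / (len(sa) * len(sb))
    return math.exp(-mean_dist ** 2 / (2 * sigma ** 2))

def set_similarity_matrix(sets, n_samples=3, sigma=1.0, seed=0):
    """Step 2 sketch: m x m similarity matrix W between sets."""
    rng = random.Random(seed)
    m = len(sets)
    W = [[0.0] * m for _ in range(m)]
    for i in range(m):
        for j in range(i + 1, m):
            W[i][j] = W[j][i] = set_similarity(sets[i], sets[j], n_samples, sigma, rng)
    return W

# Toy data: two well-separated groups of 16 points each in 2 dimensions.
data_rng = random.Random(1)
V = [(data_rng.gauss(0, 0.3), data_rng.gauss(0, 0.3)) for _ in range(16)]
V += [(data_rng.gauss(5, 0.3), data_rng.gauss(5, 0.3)) for _ in range(16)]
U = kd_partition(V, leaf_size=4)
W = set_similarity_matrix(U)
```

Here W has one row per set rather than one row per data point, so the matrix passed to the later eigenvector computation is m × m instead of n × n.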

Description

Technical field

[0001] The invention relates to the field of clustering, in particular to a fast spectral clustering method based on a multi-scale data structure.

Background

[0002] From the perspective of machine vision and machine learning, clustering is an unsupervised learning process that groups data according to the similarity between data points, so that data in the same category are as similar to each other as possible and data in different categories are as dissimilar as possible. Data clustering is widely used in the segmentation of medical images and the classification of financial data. In recent years, with the development of artificial intelligence, machine learning, and computer vision, research on data clustering methods has become particularly important. Current data clustering methods generally fall into the following categories: 1. Partition-based clustering methods; 2. Density-based clustering methods; 3. Graph theo...

Application Information

Patent Type & Authority: Patents (China)

IPC(8): G06K9/62

CPC: G06F18/23

Inventor: 陈旻昕, 张重阳, 朱国丰, 吴晨健, 陈虹

Owner: SUZHOU UNIV
