Clustering algorithm on basis of minimum spanning trees and initial clustering centers

A clustering technique based on initial cluster centers, applied in computing, computer components, instruments, and related fields, that aims to stabilize clustering results, avoid noise interference, and improve clustering accuracy.

Status: Inactive; Publication Date: 2018-03-06
SHANGHAI NORMAL UNIVERSITY

AI Technical Summary

Problems solved by technology

[0007] In view of the above-mentioned defects of the prior art, the technical problem to be solved by the present invention is to determine the initial clustering center of the data set in combination with the geodesic distance between data points, so that the selection of the initial clustering center can bet...


Examples


Embodiment

[0054] (1) Artificially generate a data set D = {d_1, d_2, ..., d_90} containing 90 data points. The number of categories in this data set is K = 3, and each data point is 2-dimensional. The attributes of all data points are listed below:

[0055] d_1 (0.6497, 1.7818), d_2 (1.6068, 1.0395), d_3 (1.9584, 0.6588), d_4 (1.8344, 0.3428), d_5 (0.9730, 0.8138), d_6 (0.6096, 1.4670), d_7 (0.0519, -0.1745), d_8 (1.0918, 0.9471), d_9 (1.7432, 1.6060), d_10 (0.7359, -0.1005), d_11 (0.8657, 1.5185), d_12 (1.4774, 1.5292), d_13 (1.6692, 0.3167), d_14 (-0.0975, -0.1654), d_15 (1.1549, 1.3355), d_16 (1.1517, 1.6261), d_17 (0.8829, 1.4484), d_18 (1.6915, 1.5909), d_19 (0.7179, 0.0843), d_20 (0.3600, 1.4522), d_21 (1.2138, 0.5507), d_22 (1.3562, 0.5652), d_23 (0.8315, -0.2052), d_24 (1.6167, 1.8522), d_25 (-0.1759, 1.1356), d_26 (0.1674, 1.7774), d_27 (-1.3797, 2.1704), d_28 (1.4039, 0.1191), d_29 (0.2324, 1.4981), d_30 (0.8039, 0.9901), d_31 (4...



Abstract

The invention discloses a clustering algorithm based on minimum spanning trees and initial cluster centers. The algorithm includes the following steps: S1, inputting the data set D to be clustered and its number of categories K; S2, constructing the minimum spanning tree T<D> of the data set and computing the geodesic distance between any two nodes in T<D>; S3, selecting the initial cluster centers; S4, constructing the minimum spanning tree T of the initial cluster centers; S5, disconnecting K-1 edges of the minimum spanning tree T<D> of the data set. The algorithm combines minimum spanning tree clustering with the selection of initial cluster centers. The edges to be deleted are constrained to the paths between initial cluster centers, and they are identified using a combined value of the density and the distance of the edges that form those paths. As a result, clustering accuracy can be further improved and interference from noise in the data set can be avoided.
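Step S2 of the abstract can be sketched as follows. This is a minimal illustration, not the patented procedure: it assumes Euclidean edge weights, builds the tree with Prim's algorithm, and assumes that the "geodesic distance" between two nodes means the sum of edge weights along the unique path joining them in the tree. The function names `prim_mst` and `geodesic_distances` are hypothetical.

```python
import math
from collections import defaultdict

def prim_mst(points):
    """Return MST edges (i, j, weight) of the complete Euclidean graph on points."""
    n = len(points)
    if n == 0:
        return []
    in_tree = [False] * n
    best = [math.inf] * n   # cheapest known edge linking each node to the tree
    parent = [-1] * n
    best[0] = 0.0
    edges = []
    for _ in range(n):
        # Pick the node outside the tree that is cheapest to attach.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((parent[u], u, best[u]))
        # Relax the attachment cost of every node still outside the tree.
        for v in range(n):
            if not in_tree[v]:
                w = math.dist(points[u], points[v])
                if w < best[v]:
                    best[v] = w
                    parent[v] = u
    return edges

def geodesic_distances(n, edges):
    """All-pairs path length along the tree (assumed meaning of 'geodesic')."""
    adj = defaultdict(list)
    for i, j, w in edges:
        adj[i].append((j, w))
        adj[j].append((i, w))
    dist = [[math.inf] * n for _ in range(n)]
    for s in range(n):
        dist[s][s] = 0.0
        stack = [s]
        while stack:  # depth-first walk; a tree has exactly one path per pair
            u = stack.pop()
            for v, w in adj[u]:
                if dist[s][v] == math.inf:
                    dist[s][v] = dist[s][u] + w
                    stack.append(v)
    return dist

# Toy L-shaped data (not from the patent): the tree path from node 0 to node 3
# is longer than the straight-line distance between them.
toy_points = [(0, 0), (1, 0), (2, 0), (2, 1)]
toy_edges = prim_mst(toy_points)
toy_geo = geodesic_distances(len(toy_points), toy_edges)
```

On the toy points, the path from node 0 to node 3 follows three unit-length edges, so its tree distance exceeds the straight-line distance. That gap is why a tree-based distance can follow the shape of the data where a Euclidean distance cannot.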

Description

Technical field

[0001] The invention relates to the fields of machine learning and data mining, and in particular to a clustering algorithm based on minimum spanning trees and initial cluster centers.

Background art

[0002] The purpose of clustering is to discover categories of objects such that objects in the same category are relatively similar, while objects in different categories are dissimilar. Clustering is an active research topic in fields such as pattern recognition, machine learning, and data mining, and various clustering algorithms have been proposed for different applications.

[0003] The k-means clustering algorithm proposed by Forgy is a partition-based clustering algorithm. Because k-means depends on the selection of the initial cluster centers and is sensitive to noisy data, its results are prone to local optima and instability. The hierarc...
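The dependence of k-means on its initial cluster centers, noted in paragraph [0003], can be illustrated with a small sketch. This is a plain Lloyd-style iteration written for illustration only; the four toy points, the two sets of starting centers, and the helper names `kmeans` and `sse` are assumptions, not taken from the patent.

```python
import math

def kmeans(points, centers, max_iter=100):
    """Plain Lloyd iteration starting from the given initial centers."""
    centers = [tuple(c) for c in centers]
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest center.
        labels = [min(range(len(centers)), key=lambda k: math.dist(p, centers[k]))
                  for p in points]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for k in range(len(centers)):
            members = [p for p, l in zip(points, labels) if l == k]
            if members:
                new_centers.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centers.append(centers[k])  # keep an empty cluster's center
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers

def sse(points, labels, centers):
    """Sum of squared distances from each point to its assigned center."""
    return sum(math.dist(p, centers[l]) ** 2 for p, l in zip(points, labels))

# Four corners of a wide rectangle: the natural split is left versus right.
rect = [(0, 0), (0, 1), (4, 0), (4, 1)]
good_labels, good_centers = kmeans(rect, [(0, 0.5), (4, 0.5)])
bad_labels, bad_centers = kmeans(rect, [(2, 0), (2, 1)])
good_sse = sse(rect, good_labels, good_centers)
bad_sse = sse(rect, bad_labels, bad_centers)
```

With the first set of starting centers the iteration settles on the left/right split. With the second set it stays stuck on a top/bottom split whose squared error is much larger, even though neither run changes after its first pass. Both are fixed points of the iteration, which is the local-optimum behavior the paragraph describes.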


Application Information

IPC(8): G06K9/62
CPC: G06F18/2323, G06F18/23213
Inventor: 马燕, 吕晓波, 李顺宝, 黄慧, 张玉萍
Owner SHANGHAI NORMAL UNIVERSITY