An improved DPC clustering algorithm and system based on a symmetric neighbor relation

A clustering algorithm and neighbor-relation technique, applied in the field of clustering algorithms. It addresses the problem that the value of the truncation distance parameter d_c is difficult to determine and strongly influences the clustering results. It avoids parameter selection errors and improves the clustering effect.

Status: Inactive | Publication Date: 2019-06-14
CHONGQING UNIV

AI Technical Summary

Problems solved by technology

However, the DPC algorithm still has some defects: the value of the truncation distance parameter d_c is difficult to determine and has a serious impact on the clustering results; if non-central points are assigned unreasonably, a "domino" effect easily occurs; and the clustering results on data sets of uneven density, especially manifold data sets, are not ideal.
[0003] For example, the invention patent application with application publication number CN107392249A discloses a density peak clustering method optimized by K-nearest-neighbor similarity. When assigning each point, it uses the K nearest neighbors to detect pointing points and finally assigns the remaining points to the clusters of their pointing points, which improves the accuracy and scope of application of the DPC algorithm. However, it sorts the distances between all points in ascending order and takes the distance at the 2% to 5% position as the cut-off distance d_c, so the problem that the value of d_c is difficult to determine remains.
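
The percentile rule described in [0003] can be sketched as follows. This is a minimal illustration, not the cited application's implementation: the function name `cutoff_distance`, the rounding rule for the percentile position, and the toy points are all assumptions made for the example.

```python
import math
from itertools import combinations

def cutoff_distance(points, percent=2.0):
    """Return d_c as the distance found at the given percentile position
    of the pairwise distances sorted in ascending order.
    (Hypothetical helper; the rounding rule below is an assumption.)"""
    dists = sorted(math.dist(p, q) for p, q in combinations(points, 2))
    if not dists:
        raise ValueError("need at least two points")
    # Percentile position in the sorted list, clamped to a valid index.
    idx = min(len(dists) - 1, max(0, round(len(dists) * percent / 100.0) - 1))
    return dists[idx]

# Toy data: a tight group of three points and a distant pair.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (5.0, 5.0), (5.0, 6.0)]
d_c_2 = cutoff_distance(points, percent=2.0)
d_c_50 = cutoff_distance(points, percent=50.0)
print(d_c_2, d_c_50)
```

On this toy data the chosen d_c changes considerably with the percentile, which is the sensitivity the passage above refers to.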

Examples

Embodiment 1

[0061] As shown in Figure 1, this embodiment provides an improved DPC clustering algorithm based on symmetric neighbor relationship, comprising the following steps:

[0062] S1: Obtain the influence space of each sample point;

[0063] Calculate the distances between all sample points; create a KD tree and use it to obtain the k-nearest neighbors and the reverse k-nearest neighbors of each sample point; then intersect the k-nearest neighbors and the reverse k-nearest neighbors to obtain the influence space of each sample point.
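
Step S1 can be sketched as below. This is a hedged illustration only: for a dependency-free example it swaps the KD tree for a brute-force neighbor search (the resulting neighbor sets are the same, only slower to compute), and the names `k_nearest` and `influence_space`, the index-based tie-breaking, and the toy points are assumptions.

```python
import math

def k_nearest(points, k):
    """Brute-force k-nearest neighbours (as index sets) of every point,
    excluding the point itself. Stands in for a KD-tree query."""
    knn = []
    for i, p in enumerate(points):
        order = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: (math.dist(p, points[j]), j),  # ties broken by index
        )
        knn.append(set(order[:k]))
    return knn

def influence_space(points, k):
    """Influence space of i = kNN(i) intersected with reverse-kNN(i)."""
    knn = k_nearest(points, k)
    # Reverse k-NN of i: every j that lists i among its own k nearest neighbours.
    rknn = [set() for _ in points]
    for j, neighbours in enumerate(knn):
        for i in neighbours:
            rknn[i].add(j)
    return [knn[i] & rknn[i] for i in range(len(points))]

points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (5.0, 5.0), (5.0, 6.0), (10.0, 0.0)]
spaces = influence_space(points, k=2)
print(spaces)
```

Because the intersection keeps only mutual neighbors, the relation is symmetric: j is in the influence space of i exactly when i is in the influence space of j. The isolated point at (10, 0) ends up with an empty influence space.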

[0064] S2: Calculate the product of sample point density and distance;

[0065] First, the density is calculated from the Gaussian kernel function formula and the influence space obtained in step S1; then the distance of each sample point is obtained according to the density, the product of the density and the distance of each sample point is calculated, and the sample point is calculated according to its density and di...
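
A rough sketch of step S2 follows. The exact kernel formula is not reproduced in this excerpt, so the density used here, the sum of exp(-d^2) over the influence space, is an assumption. The distance follows the usual DPC definition (distance to the nearest point of higher density, or the largest distance for the densest point). The function name and the hand-written influence spaces are hypothetical.

```python
import math

def density_distance_product(points, spaces):
    """Return (rho, delta, gamma) for each point.
    rho   : assumed Gaussian-kernel density over the influence space
    delta : distance to the nearest point with strictly higher density
    gamma : rho * delta, used to rank candidate cluster centres"""
    n = len(points)
    # Assumption: density is the sum of exp(-d^2) over influence-space members.
    rho = [
        sum(math.exp(-math.dist(points[i], points[j]) ** 2) for j in spaces[i])
        for i in range(n)
    ]
    delta = []
    for i in range(n):
        higher = [math.dist(points[i], points[j])
                  for j in range(n) if rho[j] > rho[i]]
        if higher:
            delta.append(min(higher))
        else:
            # Densest point: use its largest distance to any other point.
            delta.append(max(math.dist(points[i], points[j])
                             for j in range(n) if j != i))
    gamma = [r * d for r, d in zip(rho, delta)]
    return rho, delta, gamma

# Toy input; the influence spaces are written out by hand for illustration.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (5.0, 5.0), (5.0, 6.0), (10.0, 0.0)]
spaces = [{1, 2}, {0, 2}, {0, 1}, {4}, {3}, set()]
rho, delta, gamma = density_distance_product(points, spaces)
print(gamma)
```

Points with a large product are natural candidates for cluster centers, while a point with an empty influence space gets zero density and therefore a zero product.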

Embodiment 2

[0127] This embodiment provides a system that can implement the improved DPC clustering algorithm based on symmetric neighbor relationship described in Embodiment 1. As shown in Figure 7, the system is an electronic device 310 (such as a computer server with program execution capability), which includes at least one processor 311, a power supply 314, and a memory 312 and an input-output interface 313 communicatively connected to the at least one processor 311. The memory 312 stores instructions executable by the at least one processor 311; when executed, the instructions enable the at least one processor 311 to perform the method disclosed in any of the foregoing embodiments. The input-output interface 313 can include a display, a keyboard, a mouse, and a USB interface for...

Abstract

The invention discloses an improved DPC clustering algorithm and system based on a symmetric neighbor relation. The algorithm comprises the following steps: S1, obtaining the influence space of each sample point; S2, calculating the product of the sample point density and the distance; and S3, performing the clustering operation by breadth-first traversal. With this method, parameters can be selected reasonably, the domino effect is avoided, and the clustering effect is improved; in particular, a very good clustering effect can be obtained on data sets with non-uniform density.

Description

Technical field

[0001] The invention relates to the field of clustering algorithms, in particular to an improved DPC clustering algorithm and system based on symmetric neighbor relationship.

Background art

[0002] Clustering is an important data mining method and belongs to the descriptive class of data mining methods. Clustering divides data objects into multiple groups or clusters; objects in the same group are similar to each other, and objects in different groups are dissimilar. Unlike classification, clustering is an unsupervised learning method that requires no prior knowledge and no labeling of the data. There are various clustering algorithms, including partitional clustering, hierarchical clustering, density-based clustering, distribution-based clustering, and spectral clustering. The DPC algorithm is a density-based clustering algorithm that was published in the journal Science in 2014. The algorithm finds the center of the c...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06K9/62
Inventor: 吴春蓉李佳颜晓彤杨小川杨瑞龙
Owner: CHONGQING UNIV