Density peak clustering algorithm based on K neighbors and shared neighbors

A density peak clustering algorithm based on K-nearest neighbors and shared nearest neighbors. Applied in computing, computer components, instruments, and similar fields, it addresses the poor clustering quality of the existing density peak clustering algorithm.

Inactive Publication Date: 2019-09-13
NORTHWESTERN POLYTECHNICAL UNIV
0 Cites, 14 Cited by

AI Technical Summary

Problems solved by technology

[0004] In order to overcome the deficiency of poor clustering effect of the existing density peak clustering algorithm, th


Examples


Embodiment Construction

[0048] Refer to Figures 1-6. The specific steps of the density peak clustering algorithm based on K-nearest neighbors and shared nearest neighbors in the present invention are as follows:

[0049] Step 1. Input the data Data to be clustered, the neighbor parameter K and the radius r of the neighborhood;

[0050] Step 2. Preprocess the data, including filling in missing values and normalizing the data;
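
Step 2 does not say how missing values are filled or which normalization is used. As a minimal sketch, assuming missing entries are encoded as NaN, filled with the column mean, and each feature is min-max scaled to [0, 1] (the function name `preprocess` and both choices are illustrative assumptions, not taken from the patent):

```python
import numpy as np

def preprocess(data):
    """Fill NaN entries with the column mean, then min-max scale each feature.

    Both the imputation rule and the scaling are assumptions made for
    illustration; the source only says "filling of missing values and
    data normalization".
    """
    X = np.asarray(data, dtype=float).copy()
    col_mean = np.nanmean(X, axis=0)           # per-feature mean, ignoring NaN
    rows, cols = np.where(np.isnan(X))
    X[rows, cols] = col_mean[cols]             # assumed imputation: column mean
    lo, hi = X.min(axis=0), X.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)     # constant features: avoid dividing by zero
    return (X - lo) / span
```

Scaling every feature to the same range keeps the Euclidean distances used in the later steps from being dominated by a single feature with large units.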

[0051] Step 3. Calculate the distances between data samples, then calculate ρ and δ for each data sample point according to formulas (1), (2), and (3);

[0052]

[0053]
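
The images for formulas (1) and (2) did not survive in this text. Assuming they are the two local-density definitions of the standard DPC algorithm (a cutoff kernel and a Gaussian kernel, both of which depend on a cutoff distance), a plausible reconstruction is:

```latex
% Assumed reconstruction: standard DPC local density, not verified against the patent images.
\rho_i = \sum_{j \neq i} \chi\left(d_{ij} - d_c\right),
\qquad
\chi(x) =
\begin{cases}
1, & x < 0 \\
0, & x \geq 0
\end{cases}
\tag{1}

\rho_i = \sum_{j \neq i} \exp\left(-\left(\frac{d_{ij}}{d_c}\right)^{2}\right)
\tag{2}
```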

[0054] Here, d_c in formulas (1) and (2) is the cutoff distance, and d_ij is the Euclidean distance between sample i and sample j.

[0055]
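
The image for formula (3) is also missing. Assuming it is the standard DPC definition of δ (the distance from a sample to its nearest sample of higher local density, with the densest sample conventionally given its maximum distance to any other sample), it would read:

```latex
% Assumed reconstruction: standard DPC relative distance, not verified against the patent image.
\delta_i =
\begin{cases}
\min\limits_{j:\ \rho_j > \rho_i} d_{ij}, & \text{if some } j \text{ has } \rho_j > \rho_i \\
\max\limits_{j} d_{ij}, & \text{otherwise}
\end{cases}
\tag{3}
```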

[0056] Here, d_ij is the Euclidean distance between sample i and sample j, and ρ is the local density of the sample points.
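
Step 3 can be sketched as follows. The sketch assumes the standard DPC definitions: a Gaussian-kernel local density with cutoff distance d_c, and δ as the distance to the nearest sample of higher density. The function name `rho_delta` and its return values are illustrative, not part of the patent.

```python
import numpy as np

def rho_delta(X, d_c):
    """Return (d, rho, delta, nearest_higher) under assumed standard DPC definitions."""
    X = np.asarray(X, dtype=float)
    n = len(X)
    # Pairwise Euclidean distances d_ij
    d = np.sqrt(((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1))
    # Gaussian-kernel density; subtract the self term exp(0) = 1
    rho = np.exp(-(d / d_c) ** 2).sum(axis=1) - 1.0
    delta = np.zeros(n)
    nearest_higher = np.full(n, -1)            # -1 marks "no denser sample"
    for i in range(n):
        higher = np.where(rho > rho[i])[0]     # samples denser than i
        if higher.size == 0:
            delta[i] = d[i].max()              # densest sample: use the max distance
        else:
            j = higher[np.argmin(d[i, higher])]
            delta[i] = d[i, j]
            nearest_higher[i] = j
    return d, rho, delta, nearest_higher
```

The double loop over pairs costs O(n²) time and memory, which is acceptable for a sketch but would need a spatial index for large data sets.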

[0057] Step 4. Construct a decision graph from the values of ρ and δ, and select the set C of cluster centers;
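
The text does not say how centers are read off the decision graph. One common heuristic, used here purely as an assumption, ranks samples by γ = ρ·δ and takes the top few, since cluster centers combine high local density with a large distance to any denser sample. The function name `select_centers` and the parameter `n_centers` are hypothetical.

```python
import numpy as np

def select_centers(rho, delta, n_centers):
    """Pick the n_centers samples with the largest gamma = rho * delta.

    This ranking rule is an assumed stand-in for inspecting the decision
    graph by eye; it is not stated in the source.
    """
    rho = np.asarray(rho, dtype=float)
    delta = np.asarray(delta, dtype=float)
    gamma = rho * delta
    order = np.argsort(-gamma, kind="stable")  # descending gamma, ties keep index order
    return order[:n_centers]
```

For example, `select_centers([5, 1, 4, 0.5], [10, 0.5, 8, 9], 2)` selects samples 0 and 2, whose γ values (50 and 32) stand well apart from the rest.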

[...



Abstract

The invention discloses a density peak clustering algorithm based on K-nearest neighbors and shared nearest neighbors, which addresses the poor clustering quality of the existing density peak clustering (DPC) algorithm. The technical scheme improves the DPC algorithm using K-nearest-neighbor (KNN) information and shared-nearest-neighbor (SNN) similarity: the assignment of each data sample point is determined by its KNN distribution information and its SNN similarity. The more points in KNN(i) of a sample i that belong to a given cluster, and the smaller their Euclidean distance to i, the greater the similarity between the sample points, the greater the attribution value of sample i with respect to that cluster, and the greater the probability that sample i is assigned to it. Cluster centers appear in regions of higher local density. The proposed algorithm avoids the shortcomings of the DPC algorithm in measuring sample density, as well as the chain of assignment errors, similar to a domino effect, that arises during sample assignment, and it achieves a good clustering effect.
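
The abstract gives the assignment idea only qualitatively, so the following is a rough sketch rather than the patented rule. It assumes SNN similarity is the number of shared K-nearest neighbors and uses a hypothetical weight (1 + shared) / (1 + d_ij), so that a cluster scores higher for sample i when more of i's neighbors belong to it, share neighbors with i, and lie close to i. The names `knn_sets` and `attribution` and the weighting itself are illustrative assumptions.

```python
import numpy as np

def knn_sets(d, k):
    """Index sets of the k nearest neighbors of each sample (self excluded)."""
    order = np.argsort(d, axis=1, kind="stable")
    return [set(row[row != i][:k].tolist()) for i, row in enumerate(order)]

def attribution(i, d, labels, knn, n_clusters):
    """Hypothetical attribution score of sample i for each cluster.

    labels[j] is the cluster index of sample j, or -1 if j is unassigned.
    """
    scores = np.zeros(n_clusters)
    for j in knn[i]:
        if labels[j] < 0:                      # neighbor not assigned yet
            continue
        shared = len(knn[i] & knn[j])          # assumed SNN similarity: shared-neighbor count
        scores[labels[j]] += (1 + shared) / (1.0 + d[i, j])
    return scores
```

A sample would then be assigned to the cluster with the largest score, for instance via `int(np.argmax(attribution(i, d, labels, knn, n_clusters)))`. Because each sample is judged by several neighbors rather than by a single denser neighbor, one wrong assignment is less likely to propagate.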

Description

Technical field

[0001] The invention relates to a density peak clustering algorithm, in particular to a density peak clustering algorithm based on K-nearest neighbors and shared nearest neighbors.

Background technique

[0002] Cluster analysis is a statistical analysis method for studying classification problems, and it is an important technique in data mining. It groups data samples and sets of objects in an unsupervised manner, and is widely used in data mining, pattern recognition, document retrieval, image segmentation, and other fields. Especially in the context of big data, the existence of massive and diverse data has drawn extensive attention to research on clustering algorithms that can automatically understand, process, and summarize data. The main purpose of clustering is to divide a given set of objects into groups or clusters with common characteristics, so that the data similarity within a group is relatively high, while the differences between groups are more obvio...


Application Information

IPC(8): G06K9/62
CPC: G06F18/2321
Inventor 殷茗王文杰马怀宇姜继娇孟丹荔张煊宇马子琛芦菲娅杨益王一博周翔熊敏光李欣吴瑜
Owner NORTHWESTERN POLYTECHNICAL UNIV