Hierarchical clustering method based on mutual shared nearest neighbors

A hierarchical clustering and nearest neighbor technology, applied to text database clustering/classification, structured data retrieval, instrumentation, etc. It addresses point division errors and low clustering accuracy, achieving high clustering accuracy and avoiding the cumulative expansion of errors.

Inactive Publication Date: 2014-12-17
XIAN UNIV OF TECH
3 Cites, 14 Cited by

AI Technical Summary

Problems solved by technology

[0004] The purpose of the present invention is to provide a hierarchical clustering method based on mutual shared nearest neighbors. It solves the problem of low clustering accuracy caused by point division errors when the set of sub-clusters is generated during the sparsification and graph partitioning stages of clustering based on a K-nearest neighbor graph.


Examples


Embodiment

[0087] The hierarchical clustering method of the present invention, based on mutual shared nearest neighbors, consists of three key steps: calculating the nearest neighbor matrices, matrix division, and hierarchical clustering. First, the nearest neighbor matrices T1 and T2 of the entire data set D are calculated (k1 and k2 are input parameters, with k2 > k1). Next, the nearest neighbor ranking matrix M is calculated from T1 and T2, and the local density is calculated from M to obtain the set of sub-clusters. Finally, the similarity between the sub-clusters is calculated and the sub-clusters are aggregated to obtain K clusters.
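The text available here does not define the ranking matrix M, the local density, or the sub-cluster similarity, so none of those are reproduced. As a rough illustration only, the sketch below shows the generic notions that the method's name refers to: two points are "mutual" neighbors when each appears in the other's neighbor list, and their "shared" neighbors are the points common to both lists. The function name `shared_neighbor_counts` and the example neighbor lists are assumptions made for this sketch, not part of the patent.

```python
import numpy as np

def shared_neighbor_counts(knn):
    """Generic mutual shared-nearest-neighbor counts (illustrative only).

    knn: (n, k) integer matrix; row i lists the neighbor indices of point i.
    Returns an (n, n) matrix S where S[i, j] is the number of neighbors that
    i and j have in common if i and j are mutual neighbors, and 0 otherwise.
    """
    knn = np.asarray(knn)
    n = knn.shape[0]
    # member[i, j] is True when j appears in the neighbor list of i.
    member = np.zeros((n, n), dtype=bool)
    member[np.arange(n)[:, None], knn] = True
    # i and j are mutual neighbors when each is in the other's list.
    mutual = member & member.T
    # Entry (i, j) of this product is the size of N(i) intersected with N(j).
    shared = member.astype(int) @ member.astype(int).T
    S = np.where(mutual, shared, 0)
    np.fill_diagonal(S, 0)
    return S

# Hypothetical neighbor lists for five points (k = 2).
example_knn = np.array([[1, 2], [0, 2], [1, 0], [4, 0], [3, 1]])
S = shared_neighbor_counts(example_knn)
```

In this toy input, points 0, 1, and 2 are pairwise mutual neighbors and each pair shares one neighbor, while points 3 and 4 are mutual neighbors that share none.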

[0088] First, the nearest neighbor matrix is calculated. The specific process is as follows:

[0089] Assume that the k1 nearest neighbor matrix of data set D is k1 is the input parameter, 0 k2 is the algorithm input parameter, 0 ij ], and 0 for The first k1 colum...
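Because the paragraph above is truncated, the exact matrix definitions are not recoverable here. The following is a minimal sketch, assuming Euclidean distance and assuming each row of a nearest neighbor matrix lists the indices of a point's nearest neighbors in increasing order of distance. The helper name `knn_index_matrix` and the sample data are hypothetical.

```python
import numpy as np

def knn_index_matrix(data, k):
    """Return an (n, k) matrix whose row i holds the indices of the k nearest
    neighbors of point i, closest first (Euclidean distance, self excluded)."""
    data = np.asarray(data, dtype=float)
    n = data.shape[0]
    if not 0 < k < n:
        raise ValueError("k must satisfy 0 < k < n")
    # Pairwise Euclidean distances via broadcasting.
    diff = data[:, None, :] - data[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    # A point is never its own neighbor.
    np.fill_diagonal(dist, np.inf)
    # A stable sort keeps tie-breaking deterministic (lower index first).
    order = np.argsort(dist, axis=1, kind="stable")
    return order[:, :k]

# Hypothetical data set D with two well-separated groups.
D = np.array([[0.0, 0.0], [0.0, 1.0], [0.0, 3.0], [10.0, 0.0], [10.0, 1.5]])
k1, k2 = 1, 2  # input parameters with k2 > k1
T1 = knn_index_matrix(D, k1)
T2 = knn_index_matrix(D, k2)
```

With neighbor lists sorted this way, the first k1 columns of T2 in this sketch coincide with T1, so in practice only the larger matrix needs to be computed.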


Abstract

The invention discloses a hierarchical clustering method based on mutual shared nearest neighbors. The method first calculates a nearest neighbor matrix T1 and a nearest neighbor matrix T2 of the whole data set D, then calculates a nearest neighbor ranking matrix M from T1 and T2. A local density is calculated from the ranking matrix M to obtain a set of sub-clusters. Finally, the similarity between sub-clusters is calculated and the sub-clusters are merged to obtain the final partitioning result. The method solves the problem of low clustering accuracy caused by point division errors when the set of sub-clusters is generated during the sparsification and graph partitioning stages of conventional clustering based on a K-nearest neighbor graph.

Description

Technical field

[0001] The invention belongs to the technical field of data mining in computer science and technology, and relates to a hierarchical clustering method based on mutual shared nearest neighbors.

Background art

[0002] Cluster analysis is an important research topic in the field of data mining. Clustering technology has been widely used in telecommunications, retail, biology, marketing and other fields. Clustering is a form of unsupervised classification: it groups the points of a data set into clusters according to the characteristics of the objects themselves, so that points within a cluster are as similar as possible and points in different clusters are as dissimilar as possible. Existing clustering algorithms are generally divided into: 1. Partition-based clustering algorithms represented by K-means, Fuzzy K-means, and k-medoids; 2. Hierarchical clustering algorithms represented by QROCK, CURE, and BIRCH; 3. Dens...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30
CPC: G06F16/285, G06F16/35
Inventor: 周红芳, 王心怡, 刘园, 郭杰, 段文聪, 何馨依, 刘杰, 李锦
Owner: XIAN UNIV OF TECH