A Hierarchical Clustering Method Based on Mutual Shared Nearest Neighbors

A hierarchical clustering method based on mutual shared nearest neighbors, applicable to text database clustering/classification, structured data retrieval, instrumentation, etc. It addresses point-partitioning errors and low clustering accuracy, and achieves high clustering accuracy, high precision, and high overall clustering purity.

Inactive Publication Date: 2017-11-03
XIAN UNIV OF TECH

AI Technical Summary

Problems solved by technology

[0004] The purpose of the present invention is to provide a hierarchical clustering method based on mutual shared nearest neighbors. It addresses the low clustering accuracy of K-nearest-neighbor-graph clustering, which is caused by points being assigned incorrectly when the sub-cluster set is generated during sparsification and graph partitioning.


Examples

Embodiment

[0087] The hierarchical clustering method of the present invention, based on mutual shared nearest neighbors, consists of three key steps: calculating the nearest neighbor matrices, matrix partitioning, and hierarchical clustering. First, the nearest neighbor matrices T1 and T2 of the entire data set D are calculated (k1 and k2 are input parameters, with k2 > k1). Next, the nearest neighbor ranking matrix M is calculated from T1 and T2. The local density is then calculated from M to obtain a set of sub-clusters. Finally, the similarity between sub-clusters is calculated, and the sub-clusters are agglomerated to obtain K clusters.

[0088] First, the nearest neighbor matrices are calculated. The specific process is as follows:

[0089] Suppose the k1-nearest-neighbor matrix of dataset D is ..., where k1 is an input parameter, 0 ... k2 is an input parameter of the algorithm, 0 ... ij], and 0 ... for the first k...

Abstract

The invention discloses a hierarchical clustering method based on mutual shared nearest neighbors. The method first calculates the nearest neighbor matrices T1 and T2 of the whole data set D, then calculates a nearest neighbor ranking matrix M from T1 and T2, then calculates the local density from M to obtain a set of sub-clusters, and finally calculates the similarity between sub-clusters and agglomerates them to obtain the final partitioning result. The method addresses the low clustering accuracy of conventional K-nearest-neighbor-graph clustering, which is caused by point-partitioning errors when the sub-cluster set is generated during sparsification and graph partitioning.
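The final agglomeration step can be illustrated with a generic merge loop in Python. This is a sketch under stated assumptions: this excerpt does not define the sub-cluster similarity, so `centroid_similarity` (negative centroid distance) is only a placeholder, and the target cluster count `K` and the function name `agglomerate` are likewise hypothetical. Any similarity function with the same signature can be passed in instead.

```python
import numpy as np


def centroid_similarity(D, a, b):
    """Placeholder similarity (NOT the patent's measure): the negative
    Euclidean distance between the centroids of sub-clusters a and b."""
    return -float(np.linalg.norm(D[a].mean(axis=0) - D[b].mean(axis=0)))


def agglomerate(D, subclusters, K, similarity=centroid_similarity):
    """Repeatedly merge the most similar pair of sub-clusters until K remain.

    subclusters is a list of lists of point indices into D.
    """
    D = np.asarray(D, dtype=float)
    clusters = [list(c) for c in subclusters]
    if not 1 <= K <= len(clusters):
        raise ValueError("K must be between 1 and the number of sub-clusters")
    while len(clusters) > K:
        best = None
        # Exhaustive search over all pairs for the highest similarity.
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                s = similarity(D, clusters[i], clusters[j])
                if best is None or s > best[0]:
                    best = (s, i, j)
        _, i, j = best
        # Merge cluster j into cluster i, then drop j (j > i, so i is stable).
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


if __name__ == "__main__":
    points = [[0, 0], [0, 1], [5, 0], [5, 1], [20, 0]]
    print(agglomerate(points, [[0], [1], [2, 3], [4]], K=2))
```

Keeping the similarity as a parameter separates the merge procedure from the similarity measure, which is the part this excerpt leaves unspecified.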

Description

Technical field

[0001] The invention belongs to the technical field of data mining in computer science and technology, and relates to a hierarchical clustering method based on mutual shared nearest neighbors.

Background

[0002] Cluster analysis is an important research topic in the field of data mining. Clustering technology has been widely used in telecommunications, retail, biology, marketing, and other fields. Clustering is a form of unsupervised classification: it groups data points into clusters based on the characteristics of the objects themselves, so that points within a cluster are as similar as possible and points in different clusters are as dissimilar as possible. Existing clustering algorithms are generally divided into: 1. partition-based clustering algorithms, represented by K-means, Fuzzy K-means, and k-medoids; 2. hierarchical clustering algorithms, represented by QROCK, CURE, and BIRCH; 3. Dens...

Application Information

Patent Type & Authority: Patents (China)
IPC (8): G06F17/30
CPC: G06F16/285, G06F16/35
Inventor: 周红芳, 王心怡, 刘园, 郭杰, 段文聪, 何馨依, 刘杰, 李锦
Owner: XIAN UNIV OF TECH