Similarity measurement and truncation method in chameleon algorithm

A similarity measurement and truncation technique for the Chameleon algorithm, applied in computing, addresses two problems: parameters for the threshold method are difficult to select, and the chosen values strongly influence the result.

Active Publication Date: 2018-12-04
哈尔滨泛海科技开发有限公司

AI Technical Summary

Problems solved by technology

In the stage of merging and partitioning clusters, parameters for the threshold method are difficult to select, and different parameter values have a large impact on the results. The function method does not fully evaluate the similarity between clusters through density, and each iteration depends on the minimum bisection of the hypergraph into two equal parts.


Examples

Embodiment

[0063] The present invention first runs the improved algorithm on four artificial datasets to verify its correctness, and then runs four clustering algorithms on the UCI data set to verify its validity. The characteristics of the four artificial datasets are shown in Figure 5, and the corresponding data views are shown in Figure 2.

[0064] Figure 2 shows the clusters found by Chameleon using the same set of parameter values for all four datasets. The present invention uses α=1, β=1, k=10 in the optimized function definition, and uses combinations of colors and glyphs to represent the points of different clusters, so points belonging to the same cluster share the same color and glyph. The results show that Chameleon is able to find the earliest points in the dataset that are true clusters, that is, they correspond to the earliest iterations of the Chameleon algorithm to determine true clusters and place them in a c...

Abstract

The invention discloses a similarity measurement and truncation method in the Chameleon algorithm, belonging to the technical field of agglomerative hierarchical clustering algorithms. Chameleon runs on a sparse graph in which nodes represent data items and weighted edges represent similarities between the data items. Chameleon finds the clusters in a dataset with a two-stage algorithm: in the first stage, a k-nearest neighbor graph Gk is constructed from the dataset, and a graph partitioning algorithm divides the data items into several relatively small sub-clusters; in the second stage, the real clusters are found by repeatedly merging the sub-clusters. The improved algorithm improves on the conventional Chameleon clustering algorithm by introducing recursive bisection, flood fill and first hop truncation. The invention also provides a method that automatically selects the best clustering result from the dendrogram (tree diagram) of the modified Chameleon algorithm.
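
The first stage described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented procedure: the similarity weight 1 / (1 + Euclidean distance), the function names `knn_graph` and `connected_components`, and the toy points are all hypothetical. The excerpt does not say how the invention applies flood fill; using it to label connected components of the graph is an assumption made here for illustration.

```python
import math


def knn_graph(points, k):
    """Build an undirected k-nearest-neighbor graph as {node: {neighbor: weight}}.

    The edge weight is an assumed similarity, 1 / (1 + Euclidean distance).
    """
    n = len(points)
    graph = {i: {} for i in range(n)}
    for i in range(n):
        # Rank every other point by its distance to point i.
        ranked = sorted(
            (math.dist(points[i], points[j]), j) for j in range(n) if j != i
        )
        for dist, j in ranked[:k]:
            weight = 1.0 / (1.0 + dist)
            graph[i][j] = weight
            graph[j][i] = weight  # store the edge in both directions
    return graph


def connected_components(graph):
    """Label each node with a component id using an iterative flood fill."""
    labels = {}
    next_label = 0
    for start in graph:
        if start in labels:
            continue
        labels[start] = next_label
        stack = [start]
        while stack:
            node = stack.pop()
            for neighbor in graph[node]:
                if neighbor not in labels:
                    labels[neighbor] = next_label
                    stack.append(neighbor)
        next_label += 1
    return labels


# Two well-separated groups of three points each (toy data).
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
          (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
graph = knn_graph(points, k=2)
labels = connected_components(graph)
```

With k=2 each point links only to the other two points in its own group, so the flood fill separates the toy data into two components. In Chameleon proper, a graph partitioning algorithm would then split such a graph further into small sub-clusters.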

Description

Technical field

[0001] The invention belongs to the technical field of agglomerative hierarchical clustering algorithms, and in particular relates to similarity measurement and truncation methods in the Chameleon algorithm.

Background technique

[0002] Among clustering algorithms, the ROCK hierarchical clustering algorithm mainly emphasizes the relative interconnectivity between clusters and therefore ignores the relative proximity between them. The CURE hierarchical clustering algorithm, by contrast, emphasizes the relative proximity between clusters but ignores their relative interconnectivity. To address the shortcomings of the CURE and ROCK algorithms, researchers developed a dynamic-modeling hierarchical clustering algorithm, the Chameleon algorithm, which not only considers the relative interconnectivity and relative proximity between clusters but also takes into account the internal characteristics of the ...
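
For reference, the relative interconnectivity (RI) and relative proximity (called relative closeness, RC, in the original Chameleon paper by Karypis, Han and Kumar) can be computed as sketched below. The code follows the definitions published for the conventional Chameleon algorithm, not the modified similarity measure of this invention, whose formula does not appear in this excerpt. The function names and the numeric inputs are hypothetical.

```python
def relative_interconnectivity(ec_between, ec_i, ec_j):
    """RI = |EC(Ci,Cj)| / ((|EC(Ci)| + |EC(Cj)|) / 2).

    ec_between: total weight of the edges connecting clusters Ci and Cj.
    ec_i, ec_j: total weight of the min-cut bisector of each cluster.
    """
    return ec_between / ((ec_i + ec_j) / 2.0)


def relative_closeness(avg_between, avg_i, avg_j, size_i, size_j):
    """RC = average connecting-edge weight divided by the size-weighted
    average of the clusters' internal bisector edge weights."""
    total = size_i + size_j
    internal = (size_i / total) * avg_i + (size_j / total) * avg_j
    return avg_between / internal


def merge_score(ri, rc, alpha=1.0):
    """Function-based merge criterion of conventional Chameleon: RI * RC**alpha."""
    return ri * rc ** alpha


# Hypothetical values for one candidate pair of sub-clusters.
ri = relative_interconnectivity(ec_between=4.0, ec_i=2.0, ec_j=6.0)
rc = relative_closeness(avg_between=0.5, avg_i=0.4, avg_j=0.6,
                        size_i=10, size_j=10)
score = merge_score(ri, rc, alpha=1.0)
```

In the conventional algorithm, the pair of sub-clusters with the highest score is merged at each step; a larger alpha gives more weight to closeness than to interconnectivity.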

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06K9/62
CPC: G06F18/23, G06F18/22, G06F18/2193
Inventor: 董宇欣, 姜凯, 谢晓东, 褚慈, 秦帅, 印桂生, 王野, 王红滨, 王勇军, 白云鹏
Owner: 哈尔滨泛海科技开发有限公司