A clustering method based on near neighbor density and manifold distance

A clustering method using manifold techniques, applied in the field of clustering. It addresses the problems that existing methods cannot accurately describe complex data distributions, cannot discover clusters of arbitrary shape, and do not consider the impact of unequal attribute weights. It achieves improved clustering accuracy, reduced data volume, and improved operating efficiency.

Inactive Publication Date: 2019-01-25
LIAONING UNIVERSITY

AI Technical Summary

Problems solved by technology

Euclidean distance cannot accurately describe data with a complex distribution, so clusters of arbitrary shape cannot be found.
In addition, most scholars treat all attributes of the data as equally important in cluster analysis research: each attribute is given the same weight during algorithm design, and the impact of unequal attribute weights on the results of data mining analysis is not considered.
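To make the point about unequal attribute weights concrete, the following is a minimal sketch of entropy-based feature weighting feeding a weighted Euclidean distance. It uses the standard entropy weight method as an assumption; the source refers to an "improved information entropy" whose details are not given here, so this is not the patented formula. The function names `entropy_weights` and `weighted_euclidean` are hypothetical.

```python
import math

def entropy_weights(rows):
    """Standard entropy weight method (assumed here; not the patent's improved variant).

    rows: list of samples, each a list of numeric feature values (at least 2 samples).
    Returns one weight per feature; the weights sum to 1.
    """
    n, m = len(rows), len(rows[0])
    diversity = []
    for j in range(m):
        col = [r[j] for r in rows]
        lo, hi = min(col), max(col)
        span = hi - lo
        if span == 0:
            # A constant feature cannot separate samples, so it gets no weight.
            diversity.append(0.0)
            continue
        scaled = [(v - lo) / span for v in col]      # min-max scale to [0, 1]
        total = sum(scaled)
        probs = [s / total for s in scaled]          # column as a distribution
        entropy = -sum(p * math.log(p) for p in probs if p > 0) / math.log(n)
        diversity.append(1.0 - entropy)              # low entropy -> high weight
    total_div = sum(diversity)
    if total_div == 0:
        return [1.0 / m] * m                         # fall back to equal weights
    return [d / total_div for d in diversity]

def weighted_euclidean(a, b, weights):
    """Euclidean distance with a per-feature weight on each squared difference."""
    return math.sqrt(sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b)))

data = [[0.0, 5.0], [1.0, 5.0], [10.0, 5.0]]
weights = entropy_weights(data)
print(weights, weighted_euclidean(data[0], data[2], weights))
```

In this toy data the second feature is constant, so all of the weight goes to the first feature and the second no longer contributes to the distance.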



Examples


Embodiment 1

[0091] 1) Experimental data sets

[0092] To test the clustering performance and efficiency of the MD-CDData algorithm, three artificial data sets and three UCI data sets are used in the experiments, and the MD-CDData algorithm is compared with the standard k-means, TPC, DBSCAN, and TPC-ABC algorithms. Table 1 lists some properties of the data sets used in the experiments.

[0093] Table 1 Datasets used in experiments

[0094]

[0095] The first three data sets are artificial data sets with complex nonlinear distribution structures. Their distribution shapes are roughly as follows: two parallel line segments, one half-ring, two solid blocks, and four parallel line segments (two long and two short). The latter three data sets are UCI public data sets, which have high dimensionality and contain various data distribution structures.

[0096] 2) Experimental results and analysis


Abstract

A clustering method based on near neighbor density and manifold distance comprises the following steps: 1) calculating the weight of each feature according to an improved information entropy; 2) calculating the near neighbor density of each sample according to the weighted Euclidean distance, and selecting a center point according to the near neighbor density; 3) calculating the Euclidean distances between the samples in the data set obtained in step 2, and constructing an adjacency graph; 4) calculating the manifold distance between every two vertices in the adjacency graph to form a manifold distance matrix; 5) selecting k initial cluster centers and assigning each point to the cluster whose center has the smallest manifold distance to it; and 6) updating the cluster centers, then repeating step 5 until the cluster centers no longer change or the upper limit on the number of iterations is reached. Through this method, the invention provides a clustering method with high running efficiency and good clustering precision.
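Steps 3 to 6 of the abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the patented procedure: the adjacency graph is taken to be a symmetric k-nearest-neighbor graph, the manifold distance is approximated by the shortest path length in that graph, and cluster centers are updated as medoids. Feature weighting and density-based selection (steps 1 and 2) are omitted, and all function names are hypothetical.

```python
import math

def knn_graph(points, k):
    """Step 3 (assumed form): symmetric k-nearest-neighbor graph with Euclidean edges."""
    n = len(points)
    graph = [[math.inf] * n for _ in range(n)]
    for i in range(n):
        graph[i][i] = 0.0
        nearest = sorted((math.dist(points[i], points[j]), j) for j in range(n) if j != i)
        for d, j in nearest[:k]:
            graph[i][j] = graph[j][i] = d
    return graph

def manifold_distances(graph):
    """Step 4 (assumed form): all-pairs shortest paths via Floyd-Warshall."""
    n = len(graph)
    dist = [row[:] for row in graph]
    for mid in range(n):
        for i in range(n):
            for j in range(n):
                via = dist[i][mid] + dist[mid][j]
                if via < dist[i][j]:
                    dist[i][j] = via
    return dist

def assign(dist, centers):
    """Step 5: label each point with the index of its closest center."""
    return [min(range(len(centers)), key=lambda c: dist[i][centers[c]])
            for i in range(len(dist))]

def cluster(dist, centers, max_iter=100):
    """Steps 5-6: alternate assignment and medoid update until centers stop changing."""
    centers = list(centers)
    for _ in range(max_iter):
        labels = assign(dist, centers)
        updated = []
        for c, old in enumerate(centers):
            members = [i for i, lab in enumerate(labels) if lab == c]
            # Medoid: the member with the smallest total distance to its cluster.
            updated.append(min(members, key=lambda p: sum(dist[p][q] for q in members))
                           if members else old)
        if updated == centers:
            break
        centers = updated
    return assign(dist, centers), centers

# Two parallel line segments, similar in spirit to the artificial data sets.
pts = [(float(x), 0.0) for x in range(10)] + [(float(x), 5.0) for x in range(10)]
dist = manifold_distances(knn_graph(pts, 2))
labels, centers = cluster(dist, [0, 10])
print(labels, centers)
```

With k = 2 the two segments form separate graph components, so the distance between them is infinite and each segment becomes its own cluster. A real implementation would need a policy for disconnected graphs and a faster shortest-path routine than the cubic-time Floyd-Warshall loop used here.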

Description

Technical field

[0001] The invention relates to a clustering method, in particular to a clustering method based on near neighbor density and manifold distance.

Background technique

[0002] With the development of information technology and Internet of Things technology, modern industry has accumulated a large amount of data over time. However, because the variables in the production process influence and depend on one another, the distribution of these data in the sample space is complex. Industrial big data contains great value, and how to improve its availability and mine value from complexly distributed industrial big data has become a research hotspot.

[0003] As an important data mining method, clustering divides a data set into several clusters such that objects within a cluster are as similar as possible and objects in different clusters are dissimilar, so as to discover potential data patterns and internal connections in the data set. Most clu...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/2458, G06K9/62
CPC: G06F18/23213, G06F18/2193
Inventor: 王妍, 李俊, 杨冰清, 曾辉, 李玉诺
Owner: LIAONING UNIVERSITY