Grid clustering algorithm based on density peak value

A grid clustering technique based on density peaks, applied in computing, computer components, instruments, and related fields, that addresses problems such as the inability to process large-scale data sets and the failure to select center points.

Inactive Publication Date: 2017-12-19
CHONGQING UNIV OF POSTS & TELECOMM
View PDF0 Cites 16 Cited by

AI Technical Summary

Problems solved by technology

However, the density peak clustering algorithm is a typical density-based clustering algorithm and cannot handle large-scale data sets.
[0005] Building on the density peak clustering idea of quickly finding cluster centers, a grid granulation clustering algorithm is proposed. It granulates the data with a grid and uses the number of sample points in each grid cell as that cell's density, which avoids the center-selection failures caused by the local density formula.


Examples

Embodiment Construction

[0066] The technical solutions in the embodiments of the present invention will be described clearly and in detail below with reference to the drawings in the embodiments of the present invention. The described embodiments are only some of the embodiments of the invention.

[0067] The technical solution by which the present invention solves the above technical problems is as follows:

[0068] DPC clustering algorithm

[0069] The idea of the density peak clustering algorithm is simple and novel. First, two quantities are calculated for each point: its local density and its distance to the nearest point of higher density. The selection of cluster centers rests on two basic assumptions: (1) the density of a cluster center is higher than the density of its neighboring sample points; (2) the distance between a cluster center and any other cluster center of higher density is relatively large. Obviously, the cluster center point is a point with a large local density and a large dista...
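The two per-point quantities described above can be sketched in Python as follows. The source text is truncated before it gives any formula, so this sketch assumes the commonly used cutoff-kernel density (a count of neighbors within a cutoff distance); the function name `dpc_quantities` and the parameter `d_c` are hypothetical names chosen for illustration.

```python
import numpy as np

def dpc_quantities(points, d_c):
    """Return (rho, delta) for every point.

    rho[i]   : number of other points closer than the cutoff d_c
               (cutoff-kernel density; an assumption, not taken from the source).
    delta[i] : distance to the nearest point of strictly higher density;
               for a point with no denser point, its largest distance
               to any other point.
    """
    points = np.asarray(points, dtype=float)
    # Full pairwise Euclidean distances: O(n^2) time and memory,
    # which is why plain density peak clustering scales poorly.
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))

    rho = (dist < d_c).sum(axis=1) - 1  # subtract 1 to exclude the point itself

    delta = np.empty(len(points))
    for i in range(len(points)):
        higher = rho > rho[i]
        delta[i] = dist[i, higher].min() if higher.any() else dist[i].max()
    return rho, delta
```

Points for which both `rho` and `delta` are large are the candidate cluster centers. Ranking by the product `rho * delta` is one common way to pick them, though the source does not say which selection rule it uses.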

Abstract

The invention claims a grid clustering algorithm based on density peaks that can efficiently process large-scale data. The algorithm comprises the following steps: first, the N-dimensional space is granulated into non-intersecting rectangular grid cells, and the information of each cell is collected; next, the density peak clustering idea of searching for center points is used to determine the center cells, that is, a center grid cell is surrounded by cells of lower local density and lies at a relatively large distance from any grid cell of higher local density; finally, the grid cells near each center grid cell are merged to obtain the clustering result. Simulation experiments on UCI (University of California, Irvine) and artificial data sets indicate that the algorithm obtains cluster centers quickly, handles the clustering of large-scale data effectively, and is highly efficient. Compared with the original density peak clustering algorithm, the running time on different data sets is reduced by a factor of 10 to 100, while the accuracy loss is kept at 5-8%.
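The three steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the uniform cell width `cell_size`, the `rho * delta` rule for picking center cells, the fixed number of centers `n_centers`, and the merge rule (each remaining cell joins the nearest already-labelled cell, visited in order of decreasing density) are all assumptions made for illustration, as is the function name.

```python
import numpy as np
from collections import Counter

def grid_density_peak_cluster(points, cell_size, n_centers):
    """Cluster points by running a density-peak search over grid cells."""
    points = np.asarray(points, dtype=float)

    # Step 1: granulate the space; map each point to its integer cell index.
    idx = np.floor((points - points.min(axis=0)) / cell_size).astype(int)
    cells = [tuple(c) for c in idx]

    # Cell density = number of sample points falling in the cell.
    counts = Counter(cells)
    keys = list(counts)                      # non-empty cells only
    coords = np.array(keys, dtype=float)
    rho = np.array([counts[k] for k in keys], dtype=float)

    # Step 2: delta = distance to the nearest cell of strictly higher density.
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    delta = np.empty(len(keys))
    for i in range(len(keys)):
        higher = rho > rho[i]
        delta[i] = dist[i, higher].min() if higher.any() else dist[i].max()

    # Center cells: largest rho * delta (assumed selection rule).
    center_ids = np.argsort(rho * delta)[::-1][:n_centers]
    labels = np.full(len(keys), -1)
    labels[center_ids] = np.arange(len(center_ids))

    # Step 3: merge cells around the centers, densest cells first (assumed rule).
    for i in np.argsort(-rho, kind="stable"):
        if labels[i] == -1:
            labelled = np.flatnonzero(labels != -1)
            labels[i] = labels[labelled[np.argmin(dist[i, labelled])]]

    # Every point inherits the label of its cell.
    cell_label = dict(zip(keys, labels))
    return np.array([cell_label[c] for c in cells])
```

In this sketch the pairwise distance matrix is built over non-empty cells rather than over individual points, so its cost depends on the number of occupied cells, which is typically far smaller than the number of samples. That is consistent with the speedup the abstract reports, although the sketch itself makes no claim about the 10 to 100 times figure.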

Description

Technical field

[0001] The invention belongs to the fields of data mining and artificial intelligence, and in particular relates to a grid clustering algorithm based on density peaks.

Background

[0002] Clustering is an important research topic in data mining and artificial intelligence. It is an unsupervised pattern recognition method: it discovers potentially similar patterns in a data set without the guidance of any prior information and groups the data accordingly, so that similarity within a class is as large as possible and the difference between classes is as large as possible. In recent years, data mining has been widely used in many fields, such as imaging [1], medicine [2], and aviation [3]. Over the same period, the volume of spatial data acquired from satellites, imagery, and other sources has grown exponentially. Due to the huge data magnitude and the complexity of data types, improving ...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06K9/62
CPC: G06F18/23213, G06F18/2193
Inventor: 王飞, 王国胤, 杨洁, 李苑, 欧阳卫华
Owner: CHONGQING UNIV OF POSTS & TELECOMM