Parallel and highly efficient grid-and-density-based multi-dimensional spatial data clustering algorithm (GRIDEN)

A multi-dimensional spatial data clustering technology, applied in the fields of data mining and big data analysis, that addresses the insufficient precision of grid-based spatial data clustering algorithms and the insufficient efficiency of density-based ones, providing strong parallel computing capability and reduced time complexity.

Pending Publication Date: 2018-04-13
CHINA TOBACCO GUANGXI IND

AI Technical Summary

Problems solved by technology

[0005] The present invention aims to solve two problems: existing density-based spatial data clustering algorithms are not efficient enough, and existing grid-based spatial data clustering algorithms are not accurate enough. By combining the accuracy of density-based clustering with the speed of grid-based clustering, together with the ideas and methods of parallel computing, the method achieves reliable calculation accuracy and high computational efficiency.

Examples

Embodiment 1

[0037] This embodiment provides a grid-and-density-based multi-dimensional spatial data clustering algorithm, GRIDEN, which, as shown in Figure 1, includes:

[0038] S1: Create a D-dimensional spatial data grid G according to the preset neighbor distance parameter ε, grid division coefficient k, and D-dimensional spatial data set P, and map the data points in P to G.

[0039] S2: Calculate the neighbor grid subset S with respect to the neighbor distance parameter ε and the grid division coefficient k.

[0040] S3: Perform unsupervised spatial grid clustering on the D-dimensional spatial data grid G according to the minimum neighbor parameter Min_N and the neighbor grid subset S, and then classify and label the entire D-dimensional spatial data set P according to the clustering result of the D-dimensional grid in which each data point is located.
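Steps S1 and S2 above can be sketched roughly as follows. This is an illustration only, not the patented implementation: the source does not state the cell size or the exact definition of the neighbor grid subset S, so the sketch assumes a cell side length of ε / k and takes S to be every cell offset (including the cell itself) whose closest faces lie within ε of the centre cell. The function names `build_grid` and `neighbor_offsets` are hypothetical.

```python
import itertools
import math
from collections import defaultdict


def build_grid(points, eps, k):
    """S1 (sketch): map each point index to an integer grid cell.

    Assumption: each cell has side length eps / k. The source only says
    that G is built from eps, k and the data set P.
    """
    side = eps / k
    dims = len(points[0])
    # Use the per-dimension minimum as the grid origin (an assumption).
    origin = [min(p[d] for p in points) for d in range(dims)]
    grid = defaultdict(list)
    for i, p in enumerate(points):
        cell = tuple(math.floor((p[d] - origin[d]) / side) for d in range(dims))
        grid[cell].append(i)
    return dict(grid)


def neighbor_offsets(dims, k):
    """S2 (sketch): integer offsets of cells that may hold eps-neighbors.

    Measured in cell sides, the gap between a cell and the cell at offset o
    along one axis is max(|o| - 1, 0). With side = eps / k, the offset is
    kept when the Euclidean gap is at most k cell sides (that is, eps).
    """
    reach = k + 1
    offsets = []
    for off in itertools.product(range(-reach, reach + 1), repeat=dims):
        gap_sq = sum(max(abs(o) - 1, 0) ** 2 for o in off)
        if gap_sq <= k * k:
            offsets.append(off)
    return offsets


# Usage: three 2-D points, eps = 1.0, k = 2 (cell side 0.5).
grid = build_grid([(0.0, 0.0), (0.4, 0.1), (2.0, 2.0)], eps=1.0, k=2)
offsets = neighbor_offsets(dims=2, k=2)
print(grid)
print(len(offsets))
```

Because the offsets depend only on the dimension and k, they can be computed once and reused for every cell, which is one plausible reason for treating S as a separate step.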

[0041] In the above scheme, first create a D-dimensional spatial data grid G composed of ...

Embodiment 2

[0043] In step S3 above, the unsupervised spatial grid clustering process can be computed in parallel in four sequential steps. This embodiment provides one implementation.

[0044] Specifically, as shown in Figure 2, it includes the following steps:

[0045] S31: According to the preset minimum neighbor parameter Min_N, for every non-empty grid in the D-dimensional spatial data grid G, calculate in parallel the total number of data points in its neighbor grid subset S; grids whose total is greater than Min_N are marked as core grids and given independent class labels;

[0046] S32: Iteratively traverse all core grids in the D-dimensional spatial data grid G in parallel, merging each core grid and all other core grids in its neighbor grid subset S into one class; if the iteration process has not ended, then continue the iter...
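A minimal sequential sketch of S31 and S32 follows, under stated assumptions. It takes a mapping from cell index to point count and a list of neighbor offsets as inputs, marks core grids using the strict "greater than Min_N" test from S31, and merges neighboring core grids by repeatedly propagating the smallest label until nothing changes. The minimum-label propagation rule is an assumption chosen for illustration; the source text is truncated and does not spell out the merge rule or the remaining steps. The names `mark_core_cells` and `merge_core_cells` are hypothetical.

```python
def mark_core_cells(counts, offsets, min_n):
    """S31 (sketch): a non-empty cell is core when the points in its
    neighbor cells total more than min_n."""
    core = set()
    for cell in counts:  # each cell is independent, so this loop could run in parallel
        total = 0
        for off in offsets:
            nb = tuple(c + o for c, o in zip(cell, off))
            total += counts.get(nb, 0)
        if total > min_n:
            core.add(cell)
    return core


def merge_core_cells(core, offsets):
    """S32 (sketch): give every core cell its own label, then iterate,
    letting each cell adopt the smallest label among its core neighbors,
    until a full pass changes nothing."""
    labels = {cell: i for i, cell in enumerate(sorted(core))}
    changed = True
    while changed:
        changed = False
        for cell in core:
            for off in offsets:
                nb = tuple(c + o for c, o in zip(cell, off))
                if nb in core and labels[nb] < labels[cell]:
                    labels[cell] = labels[nb]
                    changed = True
    return labels


# Usage: a 1-D grid with two dense regions and one sparse cell.
counts = {(0,): 3, (1,): 2, (5,): 1, (9,): 4, (10,): 4}
offsets = [(-1,), (0,), (1,)]
core = mark_core_cells(counts, offsets, min_n=3)
labels = merge_core_cells(core, offsets)
print(labels)
```

Labels only ever decrease and are bounded below, so the loop terminates. With symmetric offsets, each connected group of core grids ends up sharing one label.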

Embodiment 3

[0050] Figure 5 is a schematic diagram of the hardware structure of the electronic equipment for the grid-and-density-based multi-dimensional spatial data clustering algorithm GRIDEN provided in this embodiment. As shown in Figure 5, the equipment includes:

[0051] one or more processors 701 and a memory 702; Figure 5 takes one processor 701 as an example.

[0052] The device for executing the grid-and-density-based multi-dimensional spatial data clustering algorithm GRIDEN may also include an input device 703 and an output device 704.

[0053] The processor 701, the memory 702, the input device 703 and the output device 704 may be connected via a bus or in other ways; Figure 5 takes connection via a bus as an example.

[0054] As a non-volatile computer-readable storage medium, the memory 702 can be used to store non-volatile software programs, non-volatile computer-executable programs and modules, such as the grid-and-density-based multidimensional The progra...

Abstract

The invention particularly relates to a parallel and highly efficient grid-and-density-based multi-dimensional spatial data clustering algorithm (GRIDEN). The density-based clustering parameters, namely a neighbor distance ε, a minimum neighbor number Min_N and a grid partitioning coefficient k, are preset; a D-dimensional spatial data grid G is created according to the preset values and a D-dimensional spatial data set P; the neighbor grid subset S with respect to ε and k is calculated; unsupervised spatial grid clustering is carried out on the D-dimensional spatial data grid G according to the neighbor grid subset S; and classification labeling is carried out on the entire D-dimensional spatial data set P according to the clustering results of the D-dimensional grid in which each data point is located. Through the technical solution of the invention, density-based unsupervised clustering can be carried out on massive multi-dimensional spatial data sets, and highly efficient, fast and parallel spatial data clustering calculation is realized.

Description

Technical field

[0001] The invention relates to the fields of data mining and big data analysis, in particular to a parallel and efficient grid-and-density-based multi-dimensional spatial data clustering algorithm, GRIDEN.

Background

[0002] Spatial data clustering is widely used in many information technology fields, such as data mining, pattern recognition, machine learning, artificial intelligence, visual analysis and geographic information systems. In the era of big data, it can be used to explore and discover potential patterns and value in data, and can be applied to many disciplines, such as astronomy, bioinformatics, bibliometrics, social network analysis, economic network analysis, transportation network analysis, meteorological analysis and smart city development. There are four traditional spatial data clustering methods: 1) partition-based clustering; 2) density-based clustering; 3) hierarchical clustering; 4) grid-based clustering.

[0003...

Application Information

IPC(8): G06F17/30, G06K9/62
CPC: G06F16/2465, G06F2216/03, G06F18/23211
Inventor: 邓超, 陈智斌, 郭晓惠, 农英雄, 黄聪, 李喆, 韦屹, 汪倍贝, 钱方远
Owner: CHINA TOBACCO GUANGXI IND