Multi-GPU Density Peak Clustering Method Based on Locality Sensitive Hashing

A locality-sensitive hashing and density peak technology, applied in text database clustering/classification, database models, unstructured text data retrieval, etc. It addresses the problems of high computational complexity and high time consumption, and improves both read/write speed and parameter calculation speed.

Active Publication Date: 2021-05-14
NAT UNIV OF DEFENSE TECH
4 Cites, 0 Cited by

AI Technical Summary

Problems solved by technology

[0010] The present invention aims to solve the problem that existing density peak clustering methods have excessive computational complexity and time consumption when clustering large-scale, high-dimensional data sets, and provides a multi-GPU density peak clustering method based on locality-sensitive hashing.

Examples

Embodiment Construction

[0042] For a better understanding of the technical solutions in this application, they are described clearly and in detail below, with reference to the drawings and the specific implementations in the embodiments of this application:

[0043] The multi-GPU parallel method for density peak clustering consists of four processes: computing the distance matrix; computing the local density; computing the distance δ; and computing the cluster centers and assigning clusters. Before the local density is calculated on each GPU, the parameter d_c must be calculated. This application mainly optimizes the calculation of the distance matrix and of the parameter d_c in a multi-GPU environment, to improve computing parallelism and speed. To make full use of the parallel acceleration of multiple GPUs, the method starts as many CPU threads as there are GPUs, and allocates an appropriate amount of data to each th...
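The four processes named in [0043] can be illustrated with a small single-machine sketch. This is not the patented multi-GPU implementation: it follows the generic density peak clustering recipe (cutoff-kernel density, δ as the distance to the nearest point of higher density), and the function names, the percentile rule in `choose_dc`, and the `n_clusters` parameter are assumptions made for illustration only.

```python
import math

def pairwise_distances(points):
    """Process 1: full Euclidean distance matrix."""
    n = len(points)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            dist[i][j] = dist[j][i] = math.dist(points[i], points[j])
    return dist

def choose_dc(dist, fraction=0.02):
    """Assumption: d_c is a low percentile of all pairwise distances."""
    n = len(dist)
    flat = sorted(dist[i][j] for i in range(n) for j in range(i + 1, n))
    idx = min(len(flat) - 1, max(0, int(round(fraction * len(flat))) - 1))
    return flat[idx]

def local_density(dist, dc):
    """Process 2: number of other points closer than d_c (cutoff kernel)."""
    n = len(dist)
    return [sum(1 for j in range(n) if j != i and dist[i][j] < dc)
            for i in range(n)]

def delta_and_parent(dist, rho):
    """Process 3: distance to the nearest point of higher density."""
    n = len(dist)
    order = sorted(range(n), key=lambda i: -rho[i])  # densest first
    delta, parent = [0.0] * n, [-1] * n
    delta[order[0]] = max(dist[order[0]])  # convention for the densest point
    for rank in range(1, n):
        i = order[rank]
        j = min(order[:rank], key=lambda k: dist[i][k])
        delta[i], parent[i] = dist[i][j], j
    return delta, parent, order

def assign_clusters(rho, delta, parent, order, n_clusters):
    """Process 4: pick centers by rho * delta, then propagate labels."""
    top = order[0]  # the densest point is always taken as a center
    rest = sorted((i for i in order if i != top),
                  key=lambda i: -rho[i] * delta[i])
    centers = [top] + rest[:n_clusters - 1]
    labels = [-1] * len(rho)
    for c, i in enumerate(centers):
        labels[i] = c
    for i in order:  # parents are always labeled before their children
        if labels[i] == -1:
            labels[i] = labels[parent[i]]
    return labels, centers
```

On very small data sets a percentile-based d_c can be too tight to be useful, so passing an explicit `dc` to `local_density` may be preferable there. The distance matrix and the d_c statistic are the two steps the application says it targets for multi-GPU optimization.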

Abstract

The invention belongs to the field of data mining. To solve the problems of excessive computational complexity and time consumption when existing density peak clustering methods cluster large-scale, high-dimensional data sets, it provides a multi-GPU density peak clustering method based on locality-sensitive hashing. The method comprises four processes: calculating the distance matrix; calculating the local density; calculating the distance δ; and calculating the cluster centers and assigning clusters. The core idea is to partition the original data by locality-sensitive hashing and to use shared memory to improve read/write speed. Because locality-sensitive hashing places similar data on the same GPU, unnecessary distance calculations are reduced. In addition, multiple hash functions are designed and combined through OR operations to partition the data, which reduces the chance that similar data are mapped to different GPUs, and a multi-GPU statistical method based on the Message Passing Interface (MPI) is implemented, which improves the parameter calculation speed.
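The abstract says only that several hash functions are combined through OR operations; the hash family and the assignment rule are not given in this excerpt. The following sketch is therefore an assumption throughout: it uses random-projection hashes and copies each point into every partition that any of its hashes selects, so two points share a partition whenever at least one hash agrees on them. `make_hash`, `partition_or`, `bucket_width`, and `n_parts` are hypothetical names.

```python
import random

def make_hash(dim, bucket_width, n_parts, rng):
    """Build one random-projection hash mapping a point to a partition id.

    Assumption: a Gaussian projection followed by bucketing, a common
    locality-sensitive family for Euclidean distance.
    """
    a = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    b = rng.uniform(0.0, bucket_width)

    def h(point):
        proj = sum(ai * xi for ai, xi in zip(a, point))
        return int((proj + b) // bucket_width) % n_parts

    return h

def partition_or(points, hashes, n_parts):
    """OR-combination: a point joins every partition chosen by any hash."""
    parts = [set() for _ in range(n_parts)]
    for idx, p in enumerate(points):
        for h in hashes:
            parts[h(p)].add(idx)
    return [sorted(s) for s in parts]

if __name__ == "__main__":
    rng = random.Random(0)
    pts = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0)]
    hashes = [make_hash(2, 4.0, 2, rng) for _ in range(3)]
    print(partition_or(pts, hashes, 2))  # one index list per partition (GPU)
```

Replicating a point into more than one partition costs extra memory and some duplicated distance work, in exchange for a lower chance that two nearby points never meet on the same device. Whether the patented method replicates points in this way is not stated in the excerpt.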

Description

Technical field

[0001] The invention belongs to the field of data mining, and in particular relates to a multi-GPU (Graphics Processing Unit) density peak clustering method based on locality-sensitive hashing.

Background

[0002] Clustering is an unsupervised classification technique. Its purpose is to divide an unlabeled data set into a finite number of categories, or clusters, according to the similarity between the data, so that similarity within a group is large and the difference between groups is large. Because clustering can find hidden patterns in data sets, it has been widely used in many areas of scientific research, such as machine learning, computer vision and bioinformatics. At present, the main clustering methods are the following: (1) K-means and K-medoids methods, which use the center of the data points as the corresponding cluster center, have the shortcoming that they can only find spherical clusters, and are not suitable for non-spherical clus...

Application Information

Patent Type & Authority: Patents (China)
IPC: IPC(8): G06F16/28, G06F16/35
Inventor: 李东升, 葛可适, 苏华友
Owner: NAT UNIV OF DEFENSE TECH