Multi-GPU density peak clustering method based on locality-sensitive hashing

A locality-sensitive hashing and density peak clustering technique, applied in fields such as special data processing applications, instruments, and electrical digital data processing. It addresses the high computational complexity and long running time of existing methods, with the effect of faster reads and writes and faster parameter calculation.

Active Publication Date: 2018-11-27
NAT UNIV OF DEFENSE TECH
AI Technical Summary

Problems solved by technology

[0010] The present invention addresses the excessive computational complexity and running time of existing density peak clustering methods on large-scale, high-dimensional data sets by providing a multi-GPU density peak clustering method based on locality-sensitive hashing.

Examples

Embodiment Construction

[0042] For a better understanding of the technical solutions in this application, they are described clearly and in detail below with reference to the drawings and specific implementations in the embodiments of this application:

[0043] The multi-GPU parallel method for density peak clustering consists of four steps: computing the distance matrix; computing the local density; computing the distance δ; and computing the cluster centers and assigning clusters. Before the local density is calculated in each GPU, the parameter d_c must be calculated. This application mainly optimizes the calculation of the distance matrix and of the parameter d_c in a multi-GPU environment, to improve parallelism and computing speed. To make full use of the parallel acceleration of multiple GPUs, the method starts as many CPU threads as there are GPUs, and allocates an appropriate amount of data to each th...
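The four steps named in [0043] can be illustrated with a small single-process NumPy sketch. This is not the patented multi-GPU implementation: it runs on the CPU, and several details are assumptions made only for illustration, namely the function name `density_peak_clustering`, the choice of d_c as a percentile of all pairwise distances (`dc_percentile`), the cutoff-style density count, and picking centers by the largest product of density and δ.

```python
import numpy as np

def density_peak_clustering(points, n_clusters, dc_percentile=2.0):
    """Illustrative CPU sketch of the four density peak steps (hypothetical helper)."""
    n = len(points)

    # Step 1: pairwise Euclidean distance matrix.
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))

    # Parameter d_c. ASSUMPTION: taken as a percentile of all pairwise distances.
    d_c = np.percentile(dist[np.triu_indices(n, k=1)], dc_percentile)

    # Step 2: local density = number of other points closer than d_c.
    rho = (dist < d_c).sum(axis=1) - 1  # subtract the point itself

    # Step 3: delta = distance to the nearest point of higher density.
    # Ties in rho are broken by position in `order`.
    order = np.argsort(-rho, kind="stable")
    delta = np.zeros(n)
    nearest_higher = np.full(n, -1)
    delta[order[0]] = dist[order[0]].max()  # convention for the densest point
    for rank in range(1, n):
        i = order[rank]
        higher = order[:rank]
        j = higher[np.argmin(dist[i, higher])]
        delta[i] = dist[i, j]
        nearest_higher[i] = j

    # Step 4: centers have the largest rho * delta; every other point inherits
    # the label of its nearest higher-density neighbor.
    gamma = rho * delta
    centers = order[np.argsort(-gamma[order], kind="stable")[:n_clusters]]
    labels = np.full(n, -1)
    labels[centers] = np.arange(n_clusters)
    for i in order:  # decreasing density, so the neighbor is already labeled
        if labels[i] == -1:
            labels[i] = labels[nearest_higher[i]]
    return labels, rho, delta, d_c
```

The distance matrix in step 1 costs O(n²) time and memory, which is the part the application says it targets when it distributes the work across GPUs.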

Abstract

The invention belongs to the field of data mining and provides a multi-GPU density peak clustering method based on locality-sensitive hashing. It addresses the excessive computational complexity and running time of existing density peak clustering methods on large, high-dimensional data. The method comprises four steps: calculating the distance matrix, calculating the local density, calculating the distance δ, and calculating the cluster centers and assigning clusters. Its distinguishing features are that the original data is partitioned by locality-sensitive hashing and that shared memory is used to speed up reads and writes. Locality-sensitive hashing places similar data on the same GPU, which reduces unnecessary distance calculations. Multiple hash functions are combined with an OR operation to partition the data, which reduces the cases in which similar data is mapped to different GPUs. A multi-GPU statistics method is implemented on top of the Message Passing Interface (MPI), which speeds up parameter calculation.
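The OR combination of several hash functions can be sketched as follows. The abstract does not give the hash family or the mapping from buckets to GPUs, so everything concrete here is an assumption: the random-projection hash h_k(x) = floor((a_k · x + b_k) / w), the helper names `make_hashes`, `hash_points`, `or_candidates` and `bucket_to_partition`, and the modulo mapping from bucket code to partition. It shows only the general idea that two points are treated as similar when at least one hash function puts them in the same bucket.

```python
import numpy as np

def make_hashes(dim, n_hashes, bucket_width, seed=0):
    """Draw random projections and offsets (ASSUMED hash family, not from the source)."""
    rng = np.random.default_rng(seed)
    a = rng.normal(size=(n_hashes, dim))
    b = rng.uniform(0.0, bucket_width, size=n_hashes)
    return a, b

def hash_points(points, a, b, bucket_width):
    """Bucket code of every point under every hash: floor((a_k . x + b_k) / w)."""
    return np.floor((points @ a.T + b) / bucket_width).astype(np.int64)

def or_candidates(codes):
    """OR rule: a pair is a candidate if ANY hash function gives both the same code."""
    same = codes[:, None, :] == codes[None, :, :]
    return same.any(axis=2)

def bucket_to_partition(codes, n_partitions):
    """HYPOTHETICAL mapping of bucket codes to partitions (for example, GPUs)."""
    return codes % n_partitions
```

With a single hash function, two nearby points that straddle a bucket boundary end up separated. Under the OR rule, each additional hash function can only add candidate pairs, so fewer similar pairs are missed, at the cost of more candidate pairs to check.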

Description

Technical field

[0001] The invention belongs to the field of data mining, and in particular relates to a multi-GPU (Graphics Processing Unit) density peak clustering method based on locality-sensitive hashing.

Background

[0002] Clustering is an unsupervised classification technique. Its purpose is to divide an unlabeled data set into a limited number of categories, or clusters, according to the similarity between data points, so that similarity within a group is high and the difference between groups is large. Because clustering can find hidden patterns in data sets, it has been widely used in many areas of scientific research, such as machine learning, computer vision and bioinformatics. At present, the main clustering methods are the following: (1) K-means and K-medoids methods, which use the center of the data points as the corresponding cluster center, have the shortcoming that they can only find spherical clusters, and are not suitable for non-spherical clus...

Application Information

IPC(8): G06F17/30
Inventor 李东升, 葛可适, 苏华友
Owner NAT UNIV OF DEFENSE TECH