Fast global K-means clustering method using OpenCL (Open Computing Language) acceleration

A K-means clustering method, applied in the field of data processing, that addresses the problems of non-portable code, limited application range, and the inability to run with parallel acceleration. It achieves the effects of overcoming poor portability, saving storage space, and increasing load.

Active Publication Date: 2018-07-13
XIDIAN UNIV

AI Technical Summary

Problems solved by technology

The disadvantage of the existing Spark-based method is that its parallelized K-means code cannot be ported to the various hardware devices that support OpenCL (Open Computing Language), so its portability is poor. It is suitable only for distributed systems and cannot be accelerated in parallel on GPU, FPGA, MIC, and other hardware devices, which greatly limits its scope of application.
Another existing method, which accelerates the algorithm on a heterogeneous device, still has the disadvantage that it must store a large amount of intermediate calculation results. When the input image data is too large, these results exceed the storage capacity of the heterogeneous device, so the method cannot process larger image data.




Embodiment Construction

[0034] The present invention will be further described below in conjunction with the accompanying drawings.

[0035] The invention runs on hardware devices that support OpenCL (Open Computing Language) and is implemented with a fast global K-means clustering algorithm.

[0036] The implementation steps of the present invention are further described with reference to Figure 1.

[0037] Step 1, read in the dataset and the total number of clusters.

[0038] Read in the data set, which is stored as a two-dimensional matrix. Each row of the matrix is one data point, so the number of rows is the number of data points, and each column is one attribute of the data.

[0039] Read in the total number of clusters.

[0040] Step 2, transpose the data set of the two-dimensional matrix.

[0041] Copy the two-dimensional matrix in the data set to the global memory of the hardware device.

[0042] Each thread is responsible for one data point in the two-dimensional matrix of the data set and calculates the index of its data point in the mat...
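The index calculation in Step 2 can be illustrated with a minimal Python sketch. It is not the patent's kernel: the function name `transpose_flat` is hypothetical, the matrix is assumed to be stored row-major in a flat buffer, and each loop iteration stands in for one work-item that handles a single matrix element (the excerpt is truncated, so the exact per-thread mapping is an assumption).

```python
def transpose_flat(data, n_points, n_attrs):
    """Transpose a row-major n_points x n_attrs matrix held in a flat list.

    Each loop iteration plays the role of one work-item (thread): it derives
    its (row, col) position from a global id and writes exactly one element.
    """
    out = [0.0] * (n_points * n_attrs)
    for gid in range(n_points * n_attrs):      # gid stands in for a global work-item id
        row, col = divmod(gid, n_attrs)        # position in the original matrix
        out[col * n_points + row] = data[gid]  # position in the transposed matrix
    return out


# Two data points with three attributes each.
matrix = [1, 2, 3,
          4, 5, 6]
transposed = transpose_flat(matrix, n_points=2, n_attrs=3)
```

After the transpose, all values of one attribute are contiguous in memory, which is the layout an attribute-wise parallel reduction would read.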


Abstract

The invention discloses a fast global K-means clustering method accelerated with OpenCL (Open Computing Language). The method comprises the following steps: (1) reading in a data set and the total number of clusters; (2) transposing the data on the OpenCL hardware device; (3) calculating the centroid and using it as the first initial cluster center; (4) clustering with the K-means algorithm; (5) selecting a new initial cluster center on the OpenCL hardware device; (6) judging whether the current number of cluster centers is less than or equal to the total number of clusters, and if so executing step (4), otherwise executing step (7); and (7) outputting the clustering result. The method can process massive clustering data in real time on any hardware device that supports OpenCL.
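The control flow of steps (3) to (7) can be sketched sequentially in plain Python. This is a hedged illustration, not the patented OpenCL implementation: all function names are hypothetical, and the rule in `pick_new_center` is the standard fast global K-means criterion (choose the data point with the largest guaranteed error reduction), which is an assumption because the excerpt does not spell out the patent's exact selection rule.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def centroid(points):
    """Attribute-wise mean of a non-empty list of points."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]


def kmeans(points, centers, max_iter=100):
    """Plain K-means (Lloyd) iterations starting from the given centers."""
    for _ in range(max_iter):
        clusters = [[] for _ in centers]
        for p in points:
            j = min(range(len(centers)), key=lambda c: sq_dist(p, centers[c]))
            clusters[j].append(p)
        # An empty cluster keeps its previous center.
        new_centers = [centroid(c) if c else centers[i]
                       for i, c in enumerate(clusters)]
        if new_centers == centers:  # assignments have stabilized
            break
        centers = new_centers
    return centers


def pick_new_center(points, centers):
    """Assumed rule: the point x maximizing sum_j max(d_j - |x - x_j|^2, 0),
    where d_j is the squared distance from x_j to its nearest current center."""
    d = [min(sq_dist(p, c) for c in centers) for p in points]

    def gain(x):
        return sum(max(dj - sq_dist(x, pj), 0.0) for pj, dj in zip(points, d))

    return list(max(points, key=gain))


def fast_global_kmeans(points, k):
    centers = kmeans(points, [centroid(points)])  # steps (3) and (4)
    while len(centers) < k:                       # step (6), checked before adding
        centers = centers + [pick_new_center(points, centers)]  # step (5)
        centers = kmeans(points, centers)         # step (4)
    return centers                                # step (7)


points = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
centers = fast_global_kmeans(points, 2)
```

The loop tests the center count before adding a center rather than after, which yields the same final centers as the step order in the abstract. In the patented method the distance computations and the center selection run on the OpenCL device; here they run serially for readability.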

Description

Technical field

[0001] The invention belongs to the technical field of data processing, and further relates to a fast global K-means clustering method accelerated by OpenCL (Open Computing Language) hardware devices in the technical field of data mining. The invention parallelizes and accelerates the fast global K-means clustering method and can process massive data in real time on OpenCL hardware devices.

Background

[0002] The fast global K-means algorithm obtains the initial cluster centers with a deterministic method that does not depend on any initial parameter values, rather than by random search. This solves the poor stability of the clustering results of the traditional K-means clustering method, which is based on local search. It also reduces the running time by optimizing the way global K-means adds each new cluster center.

[0003] TCL Group Co., Ltd. disclosed a patent document "A me...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06K9/62, G06K9/00
CPC: G06V10/94, G06F18/23213
Inventor: 朱虎明, 钱新宇, 焦李成, 王坤, 缑水平, 田小林, 张小华, 马晶晶, 马文萍
Owner: XIDIAN UNIV