KNN-GPU acceleration method based on OpenCL

KNN algorithm and distance-computation technology, applied to machine execution devices, concurrent instruction execution, etc.; it addresses problems such as high computing overhead and reduces computation time.

Inactive Publication Date: 2014-09-03
SHANGHAI UNIV
Cites: 2 | Cited by: 14

AI Technical Summary

Problems solved by technology

When the training set that each test sample must be compared against is large, KNN incurs high computational overhead.
In a large classification system the training text set is very large; computing the similarity against every text in it takes an unacceptable amount of time.


Examples

Embodiment 1

[0022] Embodiment 1: A preferred embodiment of the OpenCL-based KNN-GPU acceleration method is described below with reference to the accompanying drawings. This embodiment uses a dimension of 4, 100 test points, and 15 training data points as an example. The method is divided into 6 steps:

[0023] Step 1: Initialize the OpenCL platform: first obtain the OpenCL platform information, then obtain the device ID, and finally create the context of the device's operating environment.

[0024] Step 2: Storage configuration on the device side: three memory buffers are configured on the CPU side. The first stores the input training data, the second stores the input test data, and the third stores the output classification data. The GPU side reads the data in through the corresponding buffers.

[0025] Step 3: Configure the test data and training data on the GPU device side: allocate the number of threads according to the GPU device side, set the si...
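The storage configuration in Step 2 can be pictured as three flat buffers sized from the numbers in this embodiment (dimension 4, 100 test points, 15 training points). The following is only a sketch under assumptions that the source does not state: 32-bit float coordinates, one integer label per test point, row-major layout, and the hypothetical helper names `make_buffers` and `point_at`. It uses plain Python arrays rather than real OpenCL buffer objects.

```python
from array import array

# Sizes taken from Embodiment 1.
DIM, N_TEST, N_TRAIN = 4, 100, 15


def make_buffers(dim, n_test, n_train):
    """Allocate the three host-side buffers (element types are an assumption)."""
    train_buf = array("f", [0.0] * (n_train * dim))  # 1: input training data
    test_buf = array("f", [0.0] * (n_test * dim))    # 2: input test data
    out_buf = array("i", [-1] * n_test)              # 3: output class labels
    return train_buf, test_buf, out_buf


def point_at(buf, index, dim):
    """Return the coordinates of point `index` from a row-major flat buffer."""
    start = index * dim
    return buf[start:start + dim]


train_buf, test_buf, out_buf = make_buffers(DIM, N_TEST, N_TRAIN)
print(len(train_buf), len(test_buf), len(out_buf))  # 60 400 100
```

A flat layout like this is a common choice because a device buffer is a contiguous block of memory; whether the patent uses exactly this layout is not stated in the visible text.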

Embodiment 2

[0045] Embodiment 2: In this embodiment, the dimension is 8 and the number of test points is 4000 as an example for illustration. The steps are the same as those in Embodiment 1 and will not be repeated here.

[0046] In this embodiment, the work-group size is set to 256, and the global index space is divided into N / 256 work-groups. Test result: with the same classification result, the previous CPU-based KNN classification method takes 7.641 seconds, whereas the OpenCL-based KNN-GPU acceleration method proposed by the present invention takes only 0.923 seconds, a speedup of 8.28 times. This shows that as the data dimension grows, the performance advantage of the GPU is brought more fully into play.
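The arithmetic in this paragraph can be checked with a short sketch. The function names are hypothetical, and two points are assumptions rather than statements from the source: that N is the 4000 test points of this embodiment, and that the group count is rounded up when N is not a multiple of 256 (the text only says N / 256).

```python
def num_work_groups(global_size, local_size=256):
    """Work-groups needed to cover `global_size` work-items, rounded up."""
    return (global_size + local_size - 1) // local_size


def speedup(cpu_seconds, gpu_seconds):
    """Ratio of CPU time to GPU time."""
    return cpu_seconds / gpu_seconds


groups = num_work_groups(4000)
print(groups)                            # 16
print(groups * 256)                      # 4096 work-items after padding
print(round(speedup(7.641, 0.923), 2))   # 8.28
```

With rounding up, 4000 items need 16 groups of 256, so the last group would carry 96 idle work-items that a kernel would have to guard against with a bounds check.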

[0047] Experimental results:

[0048] The experiments for the present invention were carried out on the OpenCL 1.0 platform. In the experimental environment, the CPU is an Intel Core i5 processor, me...


Abstract

The invention relates to an OpenCL-based KNN-GPU acceleration method that provides a parallel implementation of the KNN algorithm. First, preparation is carried out on the CPU side: the OpenCL platform is initialized and device-side storage is configured. Next, the computation is performed on the GPU: the device-side test data and training data are mapped onto the index space, distances are calculated, and the results are sorted. Finally, the classification result obtained on the GPU side is returned to the CPU side for display. The method optimizes GPU-side memory access by storing data in and reading it from local memory, which further improves parallel efficiency. Experiments show that the method effectively improves the efficiency of the KNN classification algorithm while keeping classification accuracy unchanged, and that it can be widely applied to the classification of text, images and the like.

Description

Technical field

[0001] The invention relates to the field of GPU parallel acceleration based on the OpenCL platform, and in particular to a new OpenCL-based GPU acceleration method for KNN classification.

Background

[0002] The K-Nearest Neighbor (KNN) algorithm was first proposed by Cover and Hart in 1968 and has since been analyzed and studied extensively and in depth. It is an algorithm based on analogy learning. Distributed samples can obtain higher classification accuracy, and the algorithm has the advantages of robustness and simplicity.

[0003] The basic idea of the KNN algorithm is as follows: assume that each class contains multiple sample data and that each sample has a unique class label indicating its category. Calculate the distance from each sample to the data to be classified, take the k samples nearest to the data to be classified, and determine the class of the sample to be classified according to the category ...
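The basic idea in [0003] can be illustrated with a small sequential reference sketch. This is not the patent's GPU implementation. The source text is cut off before it states the decision rule, so the majority vote used here is the standard KNN rule and is an assumption, as are the Euclidean distance, the function names, and the toy 4-dimensional data.

```python
from collections import Counter


def squared_distance(a, b):
    """Squared Euclidean distance (assumed metric; ordering matches Euclidean)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def knn_classify(train_points, train_labels, query, k):
    """Label `query` by majority vote among its k nearest training samples."""
    # 1. Distance from every training sample to the query.
    dists = [(squared_distance(p, query), label)
             for p, label in zip(train_points, train_labels)]
    # 2. Sort by distance and keep the k nearest.
    dists.sort(key=lambda pair: pair[0])
    nearest_labels = [label for _, label in dists[:k]]
    # 3. Majority vote (assumed decision rule).
    return Counter(nearest_labels).most_common(1)[0][0]


# Toy data, purely illustrative.
train = [(0, 0, 0, 0), (0, 1, 0, 0), (1, 0, 0, 0), (5, 5, 5, 5), (6, 5, 5, 5)]
labels = ["a", "a", "a", "b", "b"]
print(knn_classify(train, labels, (0.2, 0.1, 0.0, 0.0), k=3))  # a
print(knn_classify(train, labels, (5, 5, 5, 6), k=3))          # b
```

Each query is independent of the others, and within one query each distance is independent too, which is what makes the distance step a natural fit for a data-parallel device.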


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F9/38
Inventor: 余小清, 周艺圆, 万旺根, 叶轩楠
Owner: SHANGHAI UNIV