Convolutional neural network acceleration method based on OpenCL standard

A convolutional neural network acceleration technique, applied in the fields of computer vision object detection and convolutional neural network acceleration, that addresses the problem of CPU efficiency failing to meet demand and achieves high-performance parallel computing, improved computing speed, and large data throughput.

Active Publication Date: 2017-11-10
XIDIAN UNIV
7 Cites, 40 Cited by

AI Technical Summary

Problems solved by technology

As deep learning algorithms, especially convolutional neural network algorithms, require more and more computing power, CPU execution efficiency can no longer meet the demand.

Embodiment Construction

[0032] The technical solutions and effects of the present invention are described in further detail below with reference to the accompanying drawings.

[0033] Referring to Figure 1, the implementation steps of the present invention are as follows:

[0034] Step 1, read in the original 3D image data and transfer it to the global memory of the GPU.

[0035] 1.1) The input is a three-dimensional color road image of size 448*448; the original image data is read into the host memory;

[0036] 1.2) An AMD R9 200 GPU is chosen as the acceleration device, although the choice is not limited to it; the original image data in the host memory is expanded by one bit each and then transferred to the global memory of the GPU.
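Step 1 can be illustrated with a minimal host-side sketch in Python. It rests on an assumption that the source does not confirm: that "expanded by one bit each" means padding every border of each channel with one zero element, as is commonly done so that a 3*3 convolution preserves the image size. The names `pad_channel` and `flatten_image` are hypothetical, the constant-valued image is placeholder data, and the actual copy to GPU global memory (an OpenCL host call) is deliberately left out.

```python
def pad_channel(channel, pad=1, fill=0.0):
    """Return a copy of a 2D channel (list of rows) with `pad` fill values on every border."""
    padded_width = len(channel[0]) + 2 * pad
    top = [[fill] * padded_width for _ in range(pad)]
    body = [[fill] * pad + list(row) + [fill] * pad for row in channel]
    bottom = [[fill] * padded_width for _ in range(pad)]
    return top + body + bottom


def flatten_image(channels):
    """Flatten channels into one contiguous list: channel by channel, row by row.

    A flat buffer like this is the kind of host data that would then be
    copied into device global memory.
    """
    return [value for channel in channels for row in channel for value in row]


# Hypothetical 448*448 image with 3 color channels (constant placeholder values).
image = [[[1.0] * 448 for _ in range(448)] for _ in range(3)]

# Assumption: one zero element is added on each border of every channel.
padded = [pad_channel(channel) for channel in image]
host_buffer = flatten_image(padded)

print(len(padded[0]), len(padded[0][0]), len(host_buffer))  # prints: 450 450 607500
```

Under this assumption each 448*448 channel grows to 450*450, so the flat host buffer holds 3 * 450 * 450 = 607500 values.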

[0037] Step 2, read the weight data and bias data into the global memory of the GPU.

[0038] 2.1) The weight data and bias data obtained by the convolutional neural network training are first stored in a text file, and then the text file is read into the host memory;

[0039] 2.2...
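The host-side half of Step 2 (reading trained parameters from a text file into host memory) might look like the following sketch. The file layout is an assumption, since the source does not describe it: numbers separated by whitespace or commas, with all weights of a layer first and one bias per output channel after them. The function names and the tiny demo layer are hypothetical.

```python
import os
import tempfile


def parse_values(text):
    """Parse whitespace- or comma-separated numbers into a list of floats."""
    return [float(token) for token in text.replace(",", " ").split()]


def load_values(path):
    """Read a text file of numbers into host memory as a flat list of floats."""
    with open(path, "r", encoding="utf-8") as handle:
        return parse_values(handle.read())


def split_layer(values, out_channels, in_channels, kernel_size):
    """Split a flat list into (weights, biases) for one convolution layer.

    Assumed layout: all weights first, then one bias per output channel.
    """
    weight_count = out_channels * in_channels * kernel_size * kernel_size
    expected = weight_count + out_channels
    if len(values) != expected:
        raise ValueError(f"expected {expected} values, got {len(values)}")
    return values[:weight_count], values[weight_count:]


# Demo with a hypothetical layer: 2 output channels, 1 input channel, 1*1 kernels.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False, encoding="utf-8") as tmp:
    tmp.write("0.5 -1.25\n0.1 0.2\n")
    tmp_path = tmp.name
try:
    weights, biases = split_layer(load_values(tmp_path), 2, 1, 1)
finally:
    os.remove(tmp_path)

print(weights, biases)  # prints: [0.5, -1.25] [0.1, 0.2]
```

Checking the value count against the layer shape before any transfer catches a truncated or mismatched parameter file early, on the host, rather than as wrong results on the device.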

Abstract

The invention proposes a convolutional neural network acceleration method based on the OpenCL standard, mainly to solve the problem that existing CPUs process convolutional neural networks inefficiently. The method comprises the following steps: 1, reading the original three-dimensional image data and transferring it to the global memory of a GPU; 2, reading the weight data and bias data into the global memory of the GPU; 3, reading the original image data from the global memory of the GPU into the local memory of the GPU; 4, initializing the parameters and constructing the leaky rectified linear activation function Leaky-ReLU; 5, calculating the picture data of the twelfth layer of the convolutional neural network; 6, calculating the picture data of the fifteenth layer of the convolutional neural network; and 7, calculating the picture data of the eighteenth layer of the convolutional neural network, storing the picture data in the GPU, transferring it to the host memory, and reporting the operation time. The method improves the operation speed of the convolutional neural network and can be applied to object detection in computer vision.
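As a CPU reference for steps 4 to 7, the sketch below shows Leaky-ReLU applied to the output of a 3*3 convolution and a simple way to report the operation time. It is not the patented GPU kernel. The negative slope of 0.1, the single-channel layout, and the function names are assumptions; the source gives neither the slope nor the kernel sizes of the layers. In an OpenCL version, each output element computed in the inner loops would typically be assigned to one work-item.

```python
import time


def leaky_relu(x, slope=0.1):
    """Leaky-ReLU: pass positive inputs through, scale the rest by `slope`.

    The slope value 0.1 is an assumption; the source does not state it.
    """
    return x if x > 0 else slope * x


def conv3x3_leaky(padded, kernel, bias):
    """3*3 convolution over one padded channel, then bias and Leaky-ReLU.

    Computed as cross-correlation (no kernel flip), the usual CNN convention.
    The output is 2 rows and 2 columns smaller than the padded input.
    """
    height = len(padded) - 2
    width = len(padded[0]) - 2
    output = []
    for r in range(height):
        row = []
        for c in range(width):
            acc = bias
            for i in range(3):
                for j in range(3):
                    acc += padded[r + i][c + j] * kernel[i][j]
            row.append(leaky_relu(acc))
        output.append(row)
    return output


# Hypothetical 2*2 channel, already zero-padded to 4*4.
padded = [
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 2.0, 0.0],
    [0.0, 3.0, 4.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]
# Identity kernel: each output equals the matching input pixel plus the bias.
kernel = [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]

start = time.perf_counter()
result = conv3x3_leaky(padded, kernel, bias=-2.5)
elapsed = time.perf_counter() - start

print(result)
print(f"operation time: {elapsed:.6f} s")
```

With the identity kernel and a bias of -2.5, the pre-activation values are -1.5, -0.5, 0.5 and 1.5, so the two negative values are scaled by the slope while the two positive ones pass through unchanged.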

Description

Technical field

[0001] The invention belongs to the technical field of unmanned driving perception, and in particular relates to a convolutional neural network acceleration method that can be used for object detection in computer vision.

Background technique

[0002] As neural network research has deepened, researchers have found that convolution operations on image input resemble the way neurons in biological vision receive local input, and adding convolution operations to neural networks has become the mainstream trend. Because the structure of the convolutional neural network (CNN) is designed specifically for the characteristics of visual input, the CNN has become the natural choice in the field of computer vision. As an application domain of computer vision, the perception part of unmanned driving will inevitably become a stage on which CNNs play a role.

[0003] The main computing tool for traditional deep learning algo...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F15/16, G06F9/30, G06N3/04
CPC: G06F9/30007, G06F15/161, G06N3/04
Inventor: 王树龙, 殷伟, 刘而云, 刘红侠, 杜守刚
Owner: XIDIAN UNIV