Method and system for accelerating deep learning algorithm on field programmable gate array platform

A deep learning acceleration technology for gate arrays, applied in neural learning methods, physical implementations, biological neural network models, and related fields. It addresses the high power consumption of GPUs, achieving low power consumption while accelerating deep learning algorithms.

Active Publication Date: 2019-03-22
SUZHOU INST FOR ADVANCED STUDY USTC

AI Technical Summary

Problems solved by technology

GPU power consumption is relatively high: a single GPU often draws more power than a mainstream CPU of the same period, and usually consumes dozens or even hundreds of times more energy than an FPGA.




Embodiment

[0062] The field programmable gate array platform in the embodiments of the present invention refers to a computing system that integrates both a general purpose processor (GPP) and a field programmable gate array (FPGA) chip. The data path between the FPGA and the GPP can use the PCI-E bus protocol, the AXI bus protocol, or similar. The drawings of the embodiments illustrate the data path using the AXI bus protocol as an example, but the present invention is not limited thereto.
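The idea that the GPP-to-FPGA data path is interchangeable can be sketched in host-side Python. This is only an illustration under stated assumptions: the names `DataPath`, `LoopbackPath`, and `offload` are hypothetical and do not come from the patent, and the loopback class merely stands in for a real AXI or PCI-E driver by echoing back what was sent.

```python
from abc import ABC, abstractmethod
from collections import deque


class DataPath(ABC):
    """Hypothetical abstraction over the GPP-to-FPGA link (not from the patent)."""

    @abstractmethod
    def send(self, payload: bytes) -> None:
        """Push a buffer from the GPP toward the FPGA."""

    @abstractmethod
    def receive(self) -> bytes:
        """Pull the next result buffer back to the GPP."""


class LoopbackPath(DataPath):
    """Stand-in for a real AXI or PCI-E driver: returns buffers unchanged."""

    def __init__(self, protocol: str) -> None:
        self.protocol = protocol  # label only, e.g. "AXI" or "PCI-E"
        self._queue = deque()

    def send(self, payload: bytes) -> None:
        self._queue.append(payload)

    def receive(self) -> bytes:
        return self._queue.popleft()


def offload(path: DataPath, data: bytes) -> bytes:
    """Host code depends only on DataPath, so the bus protocol can be swapped."""
    path.send(data)
    return path.receive()


if __name__ == "__main__":
    for protocol in ("AXI", "PCI-E"):
        result = offload(LoopbackPath(protocol), b"\x01\x02\x03")
        print(protocol, result)
```

Because `offload` is written against the abstract interface, switching from an AXI-backed path to a PCI-E-backed one would not change the calling code in this sketch.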

[0063] Figure 1 is a flowchart of a method 100 for accelerating a deep learning algorithm on a field programmable gate array platform according to an embodiment of the present invention. The method 100 includes:

[0064] S110, according to the deep learning prediction process and training process, wherein the training process includes a local pre-training process and a gl...


Abstract

The invention discloses a method and system for accelerating deep learning algorithms on a field-programmable gate array (FPGA) platform. The platform consists of a general-purpose processor, a field-programmable gate array, and a storage module. The method comprises: determining, from the deep learning prediction and training processes and by combining deep neural networks and convolutional neural networks, the general computation part suitable for running on the FPGA platform; determining a software/hardware cooperative computing scheme based on that general computation part; and determining the number and type of IP cores to solidify according to the FPGA's computing logic resources and bandwidth, then performing acceleration on the platform with the resulting hardware computing units. A hardware processing unit for accelerating deep learning algorithms can thus be designed rapidly according to the available hardware resources; compared with a general-purpose processor, the processing unit offers high performance and low power consumption.
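The last step of the method, sizing the IP cores against the FPGA's logic resources and bandwidth, can be sketched as a simple budget calculation. This is a minimal sketch under assumptions, not the patent's procedure: the resource dimensions (LUTs, DSP slices, memory bandwidth), the class names, and all numbers below are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IPCore:
    name: str
    luts: int            # logic cost per instance (hypothetical figure)
    dsps: int            # DSP slices per instance (hypothetical figure)
    bandwidth_mbps: int  # sustained bandwidth needed per instance


@dataclass(frozen=True)
class FPGABudget:
    luts: int
    dsps: int
    bandwidth_mbps: int


def max_instances(core: IPCore, budget: FPGABudget) -> int:
    """Largest number of copies of `core` that fits every budget dimension."""
    pairs = [
        (budget.luts, core.luts),
        (budget.dsps, core.dsps),
        (budget.bandwidth_mbps, core.bandwidth_mbps),
    ]
    # A dimension with zero cost never limits the count, so skip it.
    limits = [available // cost for available, cost in pairs if cost > 0]
    return min(limits) if limits else 0


def plan(cores: list, budget: FPGABudget) -> dict:
    """Instance count each core type could reach if it used the FPGA alone."""
    return {core.name: max_instances(core, budget) for core in cores}


if __name__ == "__main__":
    budget = FPGABudget(luts=50_000, dsps=200, bandwidth_mbps=4_000)
    cores = [
        IPCore("matmul", luts=6_000, dsps=32, bandwidth_mbps=800),
        IPCore("activation", luts=1_000, dsps=0, bandwidth_mbps=400),
    ]
    print(plan(cores, budget))
```

In this example the hypothetical `matmul` core is limited by bandwidth rather than logic, which illustrates why the abstract names both the computing logic resources and the bandwidth as inputs to the decision.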

Description

Technical field

[0001] The invention relates to the field of computer hardware acceleration, in particular to a method and system for accelerating deep learning algorithms on a field programmable gate array platform.

Background art

[0002] Deep learning has achieved remarkable results in solving high-level abstract cognitive problems, bringing machine learning to a new level. It has both high scientific research value and strong practicality, which makes it very popular in academia and industry alike. However, in order to solve more abstract and complex learning problems, deep learning networks keep growing in size, and the complexity of computation and data is increasing dramatically. For example, the network of the Google Cat system has about 1 billion neurons. Accelerating deep learning algorithms with high performance and low energy consumption has become a research hotspot for scientific research and commercial institutions.

[0003] Usua...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06N3/06, G06N3/08
CPC: G06N3/063, G06N3/08
Inventor: 周学海, 王超, 余奇, 周徐达, 赵洋洋, 李曦, 陈香兰
Owner SUZHOU INST FOR ADVANCED STUDY USTC