A design method of YOLO network forward inference accelerator based on FPGA

A design method and forward-inference technique, applied to biological neural network models, neural architectures, etc. It addresses problems such as the accelerated network's scale exceeding what an FPGA can accommodate, which limits the networks that can be accelerated, and achieves stable operation, low on-chip resource usage, and improved speed.

Active Publication Date: 2019-01-15
Shenzhen Research Institute of Beijing University of Posts and Telecommunications (北京邮电大学深圳研究院)

AI Technical Summary

Problems solved by technology

[0005] Existing FPGA-based neural network accelerators often store all the intermediate calculation results of each network layer in on-chip static memory and the weights required by the network in off-chip dynamic memory. This limits the size of the network that can be accelerated.
At this stage, as the requirements for task complexity and precision become higher...



Examples


Embodiment 1

[0039] A design method for an FPGA-based YOLO network forward-inference accelerator. The accelerator comprises an FPGA chip and a DRAM; the BRAM inside the FPGA chip serves as the data buffer, the DRAM serves as the main storage device, and a ping-pong structure is used in the DRAM. The accelerator design method comprises the following steps:

[0040] (1) Perform 8-bit fixed-point quantization on the original network data to find the decimal-point position that has the least impact on detection accuracy, forming a quantization scheme. Quantization is carried out layer by layer;

[0041] (2) The FPGA chip computes YOLO's nine convolutional layers in parallel;

[0042] (3) Location mapping.
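The ping-pong DRAM arrangement mentioned in [0039] can be illustrated with a minimal software sketch. This is a sequential simulation only: in hardware the prefetch DMA and the compute engine run concurrently, and the names `load` and `compute` are illustrative placeholders, not taken from the patent.

```python
def pingpong_pipeline(layers, load, compute):
    """Double-buffered layer pipeline: while the compute engine works on
    one buffer ("ping"), the next layer's data is fetched into the other
    ("pong"); the two roles swap on every layer."""
    buffers = [None, None]
    buffers[0] = load(0)                 # prefetch first layer into "ping"
    results = []
    for i in range(len(layers)):
        cur, nxt = i % 2, (i + 1) % 2
        if i + 1 < len(layers):
            buffers[nxt] = load(i + 1)   # prefetch next layer into "pong"
        results.append(compute(buffers[cur]))
    return results

# Toy usage: "loading" layer i yields i*10, "computing" adds 1.
out = pingpong_pipeline([0, 1, 2], load=lambda i: i * 10,
                        compute=lambda d: d + 1)
# → [1, 11, 21]
```

In the real accelerator the swap hides DRAM latency behind computation; here the alternation of the two buffer slots is the only behavior being modeled.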

[0043] Specifically, the quantization process for a given layer in step (1) is:

[0044] a) Quantize the weight data of the original network: when quantizing according to a given decimal-point position of an 8-bit fixe...
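As a rough illustration of the layer-by-layer search in step (1), the 8-bit decimal-point (fractional-bit) position for one layer's weights can be sketched as follows. This sketch minimizes weight reconstruction error as a cheap proxy for detection-accuracy loss (the patent evaluates the accuracy impact directly); all function names are illustrative assumptions, not from the patent.

```python
import numpy as np

def quantize_fixed8(w, frac_bits):
    """Map float weights to signed 8-bit fixed point with `frac_bits`
    fractional bits, i.e. the decimal point sits `frac_bits` from the LSB."""
    scale = 2.0 ** frac_bits
    return np.clip(np.round(w * scale), -128, 127).astype(np.int8)

def dequantize(q, frac_bits):
    """Recover an approximate float value from the fixed-point code."""
    return q.astype(np.float32) / (2.0 ** frac_bits)

def best_frac_bits(w, candidates=range(8)):
    """Try every candidate decimal-point position and keep the one with
    the smallest mean-squared reconstruction error for this layer."""
    errors = {f: float(np.mean((w - dequantize(quantize_fixed8(w, f), f)) ** 2))
              for f in candidates}
    return min(errors, key=errors.get)

# Per-layer scheme on synthetic conv weights (64 filters, 3x3x3).
np.random.seed(0)
layer_w = (np.random.randn(64, 3, 3, 3) * 0.1).astype(np.float32)
f = best_frac_bits(layer_w)
q = quantize_fixed8(layer_w, f)
```

Repeating this search independently for each of the nine layers yields the per-layer quantization scheme described in [0040]; small-magnitude weight layers end up with more fractional bits, large-magnitude layers with fewer.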



Abstract

The invention provides a design method for a YOLO network forward-inference accelerator based on an FPGA. The accelerator comprises an FPGA chip and a DRAM; the BRAM inside the FPGA chip serves as a data buffer, and the DRAM serves as the main storage device. The design method comprises the following steps: (1) perform 8-bit fixed-point quantization on the original network data to find the decimal-point position with the least influence on detection accuracy, forming a quantization scheme, with quantization carried out layer by layer; (2) compute YOLO's nine convolutional layers in parallel on the FPGA chip; (3) position mapping. The invention addresses the technical problem that on-chip FPGA storage resources have not grown as fast as neural network scale, so that a general target detection network is difficult to port to an FPGA chip under the traditional design approach, thereby achieving higher speed with fewer on-chip resources.

Description

Technical field

[0001] The invention relates to the technical field of deep learning and hardware structure design, in particular to a design method for forward-inference acceleration of a target detection network on an FPGA.

Background technique

[0002] In recent years, machine learning algorithms based on convolutional neural networks (Convolutional Neural Networks, CNN) have been widely applied to computer vision tasks. However, for large-scale CNN networks, intensive computation, intensive storage, and large resource consumption pose great challenges to these tasks. Traditional general-purpose processors struggle to meet practical requirements under such high computing pressure and large data throughput. Therefore, hardware accelerators based on GPUs, FPGAs, and ASICs have been proposed and widely adopted. [0003] An FPGA (Field Programmable Gate Array) is a product of further development on th...

Claims


Application Information

IPC(8): G06N3/04
CPC: G06N3/045
Inventors: Zhang Yifan (张轶凡), Chen Hao (陈昊), Ying Shanchuan (应山川), Li Wei (李玮)
Owner: Shenzhen Research Institute of Beijing University of Posts and Telecommunications (北京邮电大学深圳研究院)