A Design Method of FPGA-Based Yolo Network Forward Reasoning Accelerator

A design method for a forward reasoning accelerator, applicable to biological neural network models, neural architectures, and related areas. It addresses the limit that FPGA on-chip storage places on the scale of network that can be accelerated, with the stated effects of higher speed, lower on-chip resource use, and stable operation.

Active Publication Date: 2020-09-04
北京邮电大学深圳研究院

AI Technical Summary

Problems solved by technology

[0005] Existing FPGA-based neural network accelerators often store all intermediate results of the network layers in on-chip static memory and store the network weights in off-chip dynamic memory. This limits the size of the network that can be accelerated.
As task complexity and precision requirements rise, convolutional neural networks keep growing in scale and in total parameter count. FPGA process technology and the memory resources available on chip have not grown as quickly, so an FPGA designed in the previous way cannot accommodate a network of this size.

Examples

Embodiment 1

[0040] A design method for an FPGA-based YOLO network forward reasoning accelerator. The accelerator comprises an FPGA chip and DRAM: the BRAM in the FPGA chip serves as the data buffer, the DRAM serves as the main storage device, and a ping-pong structure is used in the DRAM. The accelerator design method comprises the following steps:

[0041] (1) Perform 8-bit fixed-point quantization on the original network data, find the decimal point position that has the least impact on detection accuracy, and form a quantization scheme. The quantization is carried out layer by layer;

[0042] (2) The FPGA chip performs parallel computation on YOLO's nine-layer convolutional network;

[0043] (3) Position mapping.
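The ping-pong structure in DRAM mentioned in [0040] can be pictured with a small software model. This is only a sketch under assumptions: the source states that a ping-pong structure is used in DRAM but does not describe how the two regions are assigned, so the idea that each layer reads one region and writes the other, and the names `PingPongDram`, `load_input`, and `run_layer`, are hypothetical.

```python
class PingPongDram:
    """Toy model of two DRAM regions that swap read/write roles per layer.

    Assumption (not from the source): each layer reads its input from one
    region and writes its output to the other, then the roles swap.
    """

    def __init__(self):
        self.regions = [[], []]  # two hypothetical DRAM regions
        self.read_idx = 0        # region the next layer will read from

    def load_input(self, data):
        """Place the network input in the current read region."""
        self.regions[self.read_idx] = list(data)

    def run_layer(self, layer_fn):
        """Read from one region, write the result to the other, then swap."""
        src = self.regions[self.read_idx]
        dst_idx = 1 - self.read_idx
        self.regions[dst_idx] = layer_fn(src)
        self.read_idx = dst_idx  # the output region becomes the next input
        return self.regions[dst_idx]


buf = PingPongDram()
buf.load_input([1, 2, 3])
buf.run_layer(lambda xs: [x * 2 for x in xs])  # stand-in for one layer
buf.run_layer(lambda xs: [x + 1 for x in xs])  # stand-in for the next layer
```

In this model only two regions are ever live, regardless of how many layers run, which is one plausible reason to keep intermediate results off chip in this way.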

[0044] Specifically, the quantization process for a given layer in step (1) is:

[0045] a) Quantize the weight data of the original network: when quantizing according to a certain decimal position of an 8bit fixe...
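As an illustration of step (1), the sketch below searches, layer by layer, for the number of fractional bits of a signed 8-bit fixed-point format. It rests on several assumptions that are not in the source: the patent selects the decimal point position by its impact on detection accuracy, whereas this sketch uses squared reconstruction error of the weights as a simple stand-in, and the candidate range of 0 to 7 fractional bits and all function names are hypothetical.

```python
def quantize(values, frac_bits):
    """Convert floats to signed 8-bit fixed-point codes with frac_bits fractional bits."""
    scale = 2 ** frac_bits
    codes = []
    for v in values:
        q = round(v * scale)
        codes.append(max(-128, min(127, q)))  # saturate to the int8 range
    return codes


def dequantize(codes, frac_bits):
    """Map fixed-point codes back to floats."""
    scale = 2 ** frac_bits
    return [q / scale for q in codes]


def best_frac_bits(values, candidates=range(0, 8)):
    """Pick the decimal point position with the smallest squared error.

    Assumption: squared error is a proxy for the detection-accuracy
    criterion named in the source. Ties go to the smallest candidate.
    """
    def error(frac_bits):
        approx = dequantize(quantize(values, frac_bits), frac_bits)
        return sum((a - b) ** 2 for a, b in zip(values, approx))

    return min(candidates, key=error)


def quantize_layers(layers):
    """Build a per-layer scheme: layer name -> chosen fractional bit count."""
    return {name: best_frac_bits(weights) for name, weights in layers.items()}


scheme = quantize_layers({
    "conv1": [0.5, -0.25, 0.125],  # made-up weights for illustration
    "conv2": [100.0, -50.0],
})
```

Layers with small weights end up with more fractional bits and layers with large weights with fewer, which is why the scheme is chosen per layer and not once for the whole network.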

Abstract

The present invention proposes a design method for an FPGA-based YOLO network forward reasoning accelerator. The accelerator includes an FPGA chip and a DRAM; the BRAM in the FPGA chip is used as a data buffer, and the DRAM is used as the main storage device. The accelerator design method includes the following steps: (1) perform 8-bit fixed-point quantization on the original network data, find the decimal point position that has the least impact on detection accuracy, and form a quantization scheme, with the quantization carried out layer by layer; (2) the FPGA chip performs parallel computation on YOLO's nine-layer convolutional network; (3) position mapping. The method addresses the technical problem that, in the prior art, storage resources on the FPGA chip grow more slowly than neural network scale, which makes it difficult to port a general target detection network to an FPGA chip using the traditional design approach. It achieves faster speed while using fewer on-chip resources.

Description

Technical field

[0001] The invention relates to the technical field of deep learning and hardware structure design, in particular to a design method for forward reasoning acceleration of a target detection network on an FPGA.

Background

[0002] In recent years, machine learning algorithms based on convolutional neural networks (Convolutional Neural Network, CNN) have been widely applied to computer vision tasks. For large-scale CNNs, however, intensive computation, intensive storage, and high resource consumption pose great challenges to these tasks. Traditional general-purpose processors struggle to meet practical requirements under such high computing pressure and large data throughput. Hardware accelerators based on GPUs, FPGAs, and ASICs have therefore been proposed and widely used.

[0003] An FPGA (Field Programmable Gate Array) is a product of further development on th...

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06N3/04
CPC: G06N3/045
Inventor: 张轶凡, 陈昊, 应山川, 李玮
Owner: 北京邮电大学深圳研究院