FPGA (Field Programmable Gate Array)-based YOLOv2-tiny neural network low-delay hardware accelerator implementation method

A hardware accelerator and neural network technology, applied in the fields of biological neural network models, physical implementation, and neural architecture. It addresses the high latency of hardware accelerators and achieves reduced startup time, improved usage efficiency, and reduced computing time.

Pending Publication Date: 2019-12-10
合肥辉羲智能科技有限公司

AI Technical Summary

Problems solved by technology

[0006] To address the high latency of prior-art hardware accelerators for the YOLO network, the present invention proposes a method for implementing an FPGA-based low-latency hardware accelerator for the YOLOv2-tiny neural network. The method can significantly reduce the latency of the overall system and improve the utilization efficiency of the DSPs.


Embodiment Construction

[0053] To make the technical means, inventive features, objectives, and effects of the present invention easy to understand, the invention is further described below with reference to the accompanying drawings and embodiments.

[0054] The present invention is a method for implementing an FPGA-based low-latency hardware accelerator for the YOLOv2-tiny neural network. The hardware platform is the Xilinx ZC706 development board, the dataset used for training and testing is KITTI, and the input image size is 1280×384. The specific network structure is shown in Table 1.

[0055] Table 1 YOLOv2-tiny network structure

[0056]
Name | Main parameters | Input size | Output size
Conv1 | Convolution layer, convolution kernel (3,3,16) | (1280,384,3) | (1280,384,16)
BN1 | Batch normalization layer | (1280,384,16) | (1280,384,16)
Maxpool1 | Pooling layer, pooling kernel (2,2) | (1280,384,16) | (640,192,16)
Conv2 | Convolution layer, convolution...
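The input and output sizes listed for Conv1 and Maxpool1 in Table 1 are consistent with a 3×3 convolution that preserves the spatial size and a 2×2 pooling that halves it. The following Python sketch reproduces those two rows. The stride and padding values are assumptions inferred from the listed sizes and are not stated in the excerpt, and the function names are hypothetical.

```python
def conv_output_shape(shape, kernel, filters, stride=1, padding="same"):
    """Output (width, height, channels) of a square-kernel convolution.

    Assumption: stride 1 with 'same' zero padding, chosen because Table 1
    lists identical spatial sizes before and after Conv1.
    """
    width, height, _ = shape
    pad = (kernel - 1) // 2 if padding == "same" else 0
    out_w = (width + 2 * pad - kernel) // stride + 1
    out_h = (height + 2 * pad - kernel) // stride + 1
    return (out_w, out_h, filters)


def maxpool_output_shape(shape, pool=2, stride=2):
    """Output shape of max pooling.

    Assumption: stride equals the pooling kernel size (2), chosen because
    Table 1 lists 1280x384 -> 640x192 for Maxpool1.
    """
    width, height, channels = shape
    out_w = (width - pool) // stride + 1
    out_h = (height - pool) // stride + 1
    return (out_w, out_h, channels)


# Walk the first rows of Table 1, starting from the 1280x384 RGB input.
x = (1280, 384, 3)
x = conv_output_shape(x, kernel=3, filters=16)   # Conv1
conv1_shape = x
x = maxpool_output_shape(x)                      # Maxpool1 (BN1 keeps the shape)
pool1_shape = x
print(conv1_shape, pool1_shape)
```

Batch normalization is omitted from the walk because it does not change the tensor shape, as the BN1 row shows.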


Abstract

The invention discloses a method for implementing an FPGA-based low-latency hardware accelerator for the YOLOv2-tiny neural network. The method comprises: network quantization; overall hardware architecture design of the YOLOv2-tiny-based object detection system; design of the convolution layer processing unit; and double-multiplier design and design space exploration. The invention can significantly reduce the latency of the whole system and improve the utilization efficiency of the DSPs.
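The excerpt does not describe how the double multiplier is built. One widely used way to obtain two products from a single wide multiplier, such as an FPGA DSP slice, is operand packing: two narrow operands are placed in separate bit fields of one wide operand and multiplied by a shared third operand. The Python sketch below models that arithmetic only. It assumes signed 8-bit operands after quantization and a 16-bit field offset; both are illustrative assumptions, the function name is hypothetical, and the scheme may differ from the one claimed in the patent.

```python
def packed_dual_multiply(a, b, c, shift=16):
    """Return (a*b, a*c) using one wide multiplication.

    Assumptions (not taken from the patent text): a, b, c are signed 8-bit
    integers, and the two products are separated by `shift` bits.
    """
    for v in (a, b, c):
        if not -128 <= v <= 127:
            raise ValueError("operands must fit in signed 8 bits")

    # Pack b into the upper field and c into the lower field.
    packed = (b << shift) + c

    # The single wide multiplication: a*packed = (a*b << shift) + a*c.
    product = a * packed

    # Recover a*c from the low field, interpreting it as a signed value.
    mask = (1 << shift) - 1
    low = product & mask
    if low >= 1 << (shift - 1):
        low -= 1 << shift

    # Removing the low product leaves (a*b) << shift exactly.
    high = (product - low) >> shift
    return high, low


ab, ac = packed_dual_multiply(-7, 12, -100)
print(ab, ac)
```

The sign correction on the low field matters: when `a*c` is negative it borrows from the upper field, so the upper product is only correct after the recovered low product is subtracted. With 8-bit operands every product lies between -16256 and 16384, which fits in the 16-bit signed field assumed here.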

Description

Technical field

[0001] The invention belongs to the technical field of deep learning and convolutional neural network hardware accelerators, and in particular relates to a method for implementing an FPGA-based low-latency hardware accelerator for the YOLOv2-tiny neural network.

Background art

[0002] In recent years, major breakthroughs have been made in the field of convolutional neural networks (CNNs), which have greatly improved the performance of CNN-based object detection algorithms. Classification challenge results on the PASCAL VOC dataset show that, since 2007, the mean average precision (mAP) of object detection algorithms has increased from 20% to 85%. This excellent performance has led to the wide use of object detection algorithms in automated systems such as robots, autonomous driving, and drones.

[0003] However, the high accuracy of object detection algorithms comes with high computational complexity. If the CP...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06N3/04, G06N3/063
CPC: G06N3/063, G06N3/045
Inventor: 郭谦, 张津铭, 李杰, 李岑, 蒋剑飞, 绳伟光, 景乃锋, 王琴, 贺光辉
Owner 合肥辉羲智能科技有限公司