Multi-parallel strategy convolutional network accelerator based on FPGA

A convolutional network and accelerator technology, applied in the field of network computing, can solve the problems of computational redundancy, high computational complexity, and low implementation efficiency, achieve high parallel processing efficiency, solve computational redundancy, and improve computational speed

Pending Publication Date: 2020-12-11
REDNOVA INNOVATIONS INC

AI Technical Summary

Problems solved by technology

The calculation efficiency of floating-point accelerators is lower than that of fixed-point accelerators, while fixed-point accelerators often neglect the accuracy of the fixed-point network. Existing quantization methods that do address accuracy lean toward software implementation and do not consider the computational characteristics of the FPGA, so their computational complexity is high and their implementation efficiency is low.

[0005] In response to the above problems, one existing approach is Google's Integer Arithmetic Only (IAO) method, which computes and expresses the forward inference process of the network entirely in integer arithmetic. This both matches the computational characteristics of the FPGA platform and preserves the network's accuracy after quantization, but it suffers from a computational redundancy problem.
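The IAO scheme mentioned above maps each real value to an integer via real = scale × (q − zero_point), so a layer's output can be requantized with only an integer multiply and a shift. The following is a minimal sketch of that requantization step under the standard IAO conventions; the function name, scale values, and int8 output range are illustrative assumptions, not details taken from this patent.

```python
import math

def iao_requantize(acc_int32, in_scale, w_scale, out_scale, out_zero_point=0):
    """Requantize an int32 accumulator to an int8 output, integer-arithmetic only.

    acc_int32 is the int32 sum of products of zero-point-adjusted 8-bit
    inputs and weights. The combined scale M = in_scale * w_scale / out_scale
    is the only real-valued quantity; in hardware it is replaced offline by a
    fixed-point multiplier m0 and a right shift n, so inference uses no
    floating point at all.
    """
    M = in_scale * w_scale / out_scale
    # Express M as m0 * 2^-n with m0 an integer in [2^30, 2^31) (fixed point).
    n = 30 - math.floor(math.log2(M))
    m0 = round(M * (1 << n))
    q = (acc_int32 * m0) >> n          # integer multiply + arithmetic shift only
    q += out_zero_point
    return max(-128, min(127, q))      # saturate to the int8 range
```

The fixed-point multiplier and shift are precomputed once per layer, which is what makes the scheme a good fit for FPGA datapaths.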

Method used


Examples


Embodiment 1

[0042] As shown in Figure 2, input parallelism uses feature templates to process N input feature maps in parallel. Input feature maps enter the row caches row by row and column by column; when a row cache is full, the data of the previous row spills into the next row cache, and as the pixels stream through, a window the size of the feature template becomes available at the exit of each row cache.
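The row-cache behavior described above can be sketched in software as follows. This is a behavioral model only (the hardware uses shift registers and BRAM-backed FIFOs); the function name, the 3×3 template size, and the image layout are illustrative assumptions.

```python
from collections import deque

def stream_windows(image, K=3):
    """Behavioral model of K row caches feeding a K x K feature template.

    Pixels arrive row by row, column by column. Keeping the most recent K
    rows models the cascade in which a full row cache spills its oldest row
    into the next one; once K rows are cached, every new column position
    exposes a complete K x K window at the cache exits.
    Yields (top_row, left_col, window) for each valid template position.
    """
    W = len(image[0])
    line_buffers = deque(maxlen=K)          # the K row caches
    for r, row in enumerate(image):
        line_buffers.append(row)            # stream one row into the caches
        if len(line_buffers) == K:          # caches full: windows are valid
            for c in range(W - K + 1):
                window = [buf[c:c + K] for buf in line_buffers]
                yield r - K + 1, c, window
```

In hardware each window is produced in a single cycle as the pixel stream advances, which is what makes the N input maps processable in parallel with N copies of this structure.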

[0043] As shown in Figure 3, pixel parallelism completes the convolution of multiple consecutive pixels at the same time using an 8-bit pixel strategy: the top-level interface is 32 bits wide, so with a 3×3 feature template it can hold the input feature map data needed to convolve 4 pixels simultaneously.
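A sketch of that pixel-parallel idea: a 32-bit word carries four 8-bit pixels, and because four horizontally adjacent 3×3 windows overlap, one 3×6 input patch suffices to produce four output pixels per step. The little-endian byte packing and the function names are assumptions for illustration.

```python
def unpack_word(word32):
    """Split a 32-bit input word into four unsigned 8-bit pixels (lowest byte first)."""
    return [(word32 >> (8 * i)) & 0xFF for i in range(4)]

def conv4(rows6, kernel):
    """Convolve a 3 x 6 input patch with a 3 x 3 kernel, yielding 4 outputs.

    The four adjacent 3x3 windows share the 3 x 6 patch (4 + 3 - 1 = 6
    columns), so the hardware reuses the cached pixels instead of
    refetching them for every output position.
    """
    out = []
    for p in range(4):                      # four parallel output pixels
        acc = 0
        for i in range(3):
            for j in range(3):
                acc += rows6[i][p + j] * kernel[i][j]
        out.append(acc)
    return out
```

In the accelerator the four accumulations run in the same cycle on separate multiplier arrays; the loop here only models the dataflow.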

[0044] As shown in Figure 4, output parallelism processes N input feature maps in parallel, convolving the same input feature map with the weights of N groups of output channels...
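Output parallelism can be sketched as one cached input window broadcast to N weight groups, one per output channel, all multiplied in the same cycle. The function name and the use of NumPy are illustrative assumptions; the hardware would use N side-by-side multiplier-accumulator arrays.

```python
import numpy as np

def output_parallel_step(window, weight_groups):
    """One output-parallel cycle: the same K x K input window is convolved
    with N kernels (one per output channel), returning N partial sums.

    Broadcasting the single cached window to all N weight groups is what
    lets the accelerator scale throughput without refetching input pixels.
    """
    w = np.asarray(window, dtype=np.int32)
    return [int(np.sum(w * np.asarray(k, dtype=np.int32))) for k in weight_groups]
```

Because the three strategies act on independent dimensions (input maps, pixel positions, output channels), their degrees of parallelism multiply when combined.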



Abstract

The invention discloses an FPGA-based multi-parallel strategy convolutional network accelerator in the field of network computing. The accelerator comprises a single-layer network computing structure consisting of a BN layer, a convolution layer, an activation layer and a pooling layer, the four layers forming a pipeline. The BN layer merges the input data; the convolution layer carries out the bulk of the multiply-accumulate operations, comprises a first convolution layer, middle convolution layers and a last convolution layer, and performs convolution using one or more of input parallelism, pixel parallelism and output parallelism; the activation layer and the pooling layer perform pipelined computation on the output of the convolution layer; and the final pooled and activated result is stored in RAM (Random Access Memory). Because the three parallel structures can be freely combined, with the degree of each parallelism configured at will, the accelerator achieves high flexibility and high parallel processing efficiency.
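The abstract says the BN layer "merges" the input data before convolution. One common way such a merge is realized on FPGAs (an assumption here, not a claim about this patent's exact method) is to collapse the BatchNorm statistics offline into a single per-channel scale and bias, so that at runtime only one multiply-add per pixel remains:

```python
import numpy as np

def fold_bn(gamma, beta, mean, var, eps=1e-5):
    """Collapse BatchNorm y = gamma * (x - mean) / sqrt(var + eps) + beta
    into the single affine form y = scale * x + bias (per channel).

    gamma, beta, mean, var are the learned/recorded per-channel BN
    parameters; scale and bias can then be precomputed and baked into the
    pipeline ahead of the convolution layer.
    """
    scale = gamma / np.sqrt(var + eps)
    bias = beta - mean * scale
    return scale, bias
```

Folding the two BN operations into one affine step keeps the four-layer pipeline free of an extra normalization stage at inference time.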

Description

technical field

[0001] The invention relates to the field of network computing, and in particular to an FPGA-based multi-parallel strategy convolutional network accelerator.

Background technique

[0002] In recent years, deep learning has greatly accelerated the development of machine learning and artificial intelligence and has achieved remarkable results across research fields and commercial applications.

[0003] The Field Programmable Gate Array (FPGA) is one of the preferred platforms for embedded implementation of deep learning algorithms: it has low power consumption and a degree of inherent parallelism, and it is well suited to the real-time requirements of such algorithms.

[0004] FPGA accelerators can be divided into fixed-point accelerators and floating-point accelerators. A fixed-point accelerator mainly designs parallel acceleration units for the convolution calculation process to achieve efficient convolution calculations. A floating-point accelerator also designs a par...

Claims


Application Information

Patent Type & Authority: Application (China)
IPC(8): G06N3/04, G06N3/063
CPC: G06N3/063, G06N3/045
Inventor: 王堃, 王铭宇, 吴晨
Owner REDNOVA INNOVATIONS INC