Multi-parallel strategy convolutional network accelerator based on FPGA

A convolutional network and accelerator technology, applied in the field of network computing, can solve the problems of computational redundancy, high computational complexity, and low implementation efficiency, achieve high parallel processing efficiency, solve computational redundancy, and improve computational speed

Pending Publication Date: 2020-12-11
REDNOVA INNOVATIONS INC

AI Technical Summary

Problems solved by technology

The calculation efficiency of floating-point accelerators is lower than that of fixed-point accelerators, while fixed-point accelerators often neglect the accuracy of the fixed-point network. Existing quantization methods that do address accuracy lean toward software implementation and do not consider the computational characteristics of the FPGA, so their computational complexity is high and their implementation efficiency is low.

[0005] In response to the above problems, one existing approach is Google's Integer Arithmetic Only (IAO) method, which computes and expresses the forward inference process of the network entirely in integer arithmetic. This both matches the computational characteristics of the FPGA platform and preserves the network's accuracy after quantization, but it suffers from a computational redundancy problem.
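The IAO scheme mentioned above maps each real value to an integer via real = scale × (q − zero_point), so a layer's output can be requantized with only an integer multiply and a shift. The following is a minimal sketch of that requantization step under the standard IAO conventions; the function name, scale values, and int8 output range are illustrative assumptions, not details taken from this patent.

```python
import math

def iao_requantize(acc_int32, in_scale, w_scale, out_scale, out_zero_point=0):
    """Requantize an int32 accumulator to an int8 output, integer-arithmetic only.

    acc_int32 is the int32 sum of products of zero-point-adjusted 8-bit
    inputs and weights. The combined scale M = in_scale * w_scale / out_scale
    is the only real-valued quantity; in hardware it is replaced offline by a
    fixed-point multiplier m0 and a right shift n, so inference uses no
    floating point at all.
    """
    M = in_scale * w_scale / out_scale
    # Express M as m0 * 2^-n with m0 an integer in [2^30, 2^31) (fixed point).
    n = 30 - math.floor(math.log2(M))
    m0 = round(M * (1 << n))
    q = (acc_int32 * m0) >> n          # integer multiply + arithmetic shift only
    q += out_zero_point
    return max(-128, min(127, q))      # saturate to the int8 range
```

The fixed-point multiplier and shift are precomputed once per layer, which is what makes the scheme a good fit for FPGA datapaths.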

Method used


Examples


Embodiment 1

[0042] As shown in Figure 2, input parallelism uses feature templates to process N input feature maps in parallel. Input feature maps enter the row caches row by row and column by column; when a row cache is full, the data of the previous row spills into the next row cache, and as the pixels stream through, a window the size of the feature template becomes available at the exit of each row cache.
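The row-cache behavior described above can be sketched in software as follows. This is a behavioral model only (the hardware uses shift registers and BRAM-backed FIFOs); the function name, the 3×3 template size, and the image layout are illustrative assumptions.

```python
from collections import deque

def stream_windows(image, K=3):
    """Behavioral model of K row caches feeding a K x K feature template.

    Pixels arrive row by row, column by column. Keeping the most recent K
    rows models the cascade in which a full row cache spills its oldest row
    into the next one; once K rows are cached, every new column position
    exposes a complete K x K window at the cache exits.
    Yields (top_row, left_col, window) for each valid template position.
    """
    W = len(image[0])
    line_buffers = deque(maxlen=K)          # the K row caches
    for r, row in enumerate(image):
        line_buffers.append(row)            # stream one row into the caches
        if len(line_buffers) == K:          # caches full: windows are valid
            for c in range(W - K + 1):
                window = [buf[c:c + K] for buf in line_buffers]
                yield r - K + 1, c, window
```

In hardware each window is produced in a single cycle as the pixel stream advances, which is what makes the N input maps processable in parallel with N copies of this structure.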

[0043] As shown in Figure 3, pixel parallelism completes the convolution of multiple consecutive pixels at the same time using an 8-bit pixel strategy: the top-level interface is 32 bits wide, so with a 3×3 feature template it can hold the input feature map data needed to convolve 4 pixels simultaneously.
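A sketch of that pixel-parallel idea: a 32-bit word carries four 8-bit pixels, and because four horizontally adjacent 3×3 windows overlap, one 3×6 input patch suffices to produce four output pixels per step. The little-endian byte packing and the function names are assumptions for illustration.

```python
def unpack_word(word32):
    """Split a 32-bit input word into four unsigned 8-bit pixels (lowest byte first)."""
    return [(word32 >> (8 * i)) & 0xFF for i in range(4)]

def conv4(rows6, kernel):
    """Convolve a 3 x 6 input patch with a 3 x 3 kernel, yielding 4 outputs.

    The four adjacent 3x3 windows share the 3 x 6 patch (4 + 3 - 1 = 6
    columns), so the hardware reuses the cached pixels instead of
    refetching them for every output position.
    """
    out = []
    for p in range(4):                      # four parallel output pixels
        acc = 0
        for i in range(3):
            for j in range(3):
                acc += rows6[i][p + j] * kernel[i][j]
        out.append(acc)
    return out
```

In the accelerator the four accumulations run in the same cycle on separate multiplier arrays; the loop here only models the dataflow.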

[0044] As shown in Figure 4, output parallelism processes N input feature maps in parallel, convolving the same input feature map with the weights of N groups of output channels...
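Output parallelism can be sketched as one cached input window broadcast to N weight groups, one per output channel, all multiplied in the same cycle. The function name and the use of NumPy are illustrative assumptions; the hardware would use N side-by-side multiplier-accumulator arrays.

```python
import numpy as np

def output_parallel_step(window, weight_groups):
    """One output-parallel cycle: the same K x K input window is convolved
    with N kernels (one per output channel), returning N partial sums.

    Broadcasting the single cached window to all N weight groups is what
    lets the accelerator scale throughput without refetching input pixels.
    """
    w = np.asarray(window, dtype=np.int32)
    return [int(np.sum(w * np.asarray(k, dtype=np.int32))) for k in weight_groups]
```

Because the three strategies act on independent dimensions (input maps, pixel positions, output channels), their degrees of parallelism multiply when combined.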



Abstract

The invention discloses an FPGA-based multi-parallel strategy convolutional network accelerator in the field of network computing. The accelerator comprises a single-layer network computing structure consisting of a BN layer, a convolution layer, an activation layer and a pooling layer, the four layers forming a pipeline. The BN layer merges the input data; the convolution layer carries out the bulk of the multiply-accumulate operations, comprises a first convolution layer, middle convolution layers and a last convolution layer, and performs convolution using one or more of input parallelism, pixel parallelism and output parallelism; the activation layer and the pooling layer perform pipelined computation on the output of the convolution layer; and the final pooled and activated result is stored in RAM (Random Access Memory). Because the three parallel structures can be freely combined, with the degree of each parallelism configured at will, the accelerator achieves high flexibility and high parallel processing efficiency.
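The abstract says the BN layer "merges" the input data before convolution. One common way such a merge is realized on FPGAs (an assumption here, not a claim about this patent's exact method) is to collapse the BatchNorm statistics offline into a single per-channel scale and bias, so that at runtime only one multiply-add per pixel remains:

```python
import numpy as np

def fold_bn(gamma, beta, mean, var, eps=1e-5):
    """Collapse BatchNorm y = gamma * (x - mean) / sqrt(var + eps) + beta
    into the single affine form y = scale * x + bias (per channel).

    gamma, beta, mean, var are the learned/recorded per-channel BN
    parameters; scale and bias can then be precomputed and baked into the
    pipeline ahead of the convolution layer.
    """
    scale = gamma / np.sqrt(var + eps)
    bias = beta - mean * scale
    return scale, bias
```

Folding the two BN operations into one affine step keeps the four-layer pipeline free of an extra normalization stage at inference time.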

Description

technical field

[0001] The invention relates to the field of network computing, and in particular to an FPGA-based multi-parallel strategy convolutional network accelerator.

Background technique

[0002] In recent years, deep learning has greatly accelerated the development of machine learning and artificial intelligence and has achieved remarkable results across research fields and commercial applications.

[0003] The Field Programmable Gate Array (FPGA) is one of the preferred platforms for embedded implementation of deep learning algorithms: it has low power consumption and a degree of inherent parallelism, and it is well suited to the real-time requirements of such algorithms.

[0004] FPGA accelerators can be divided into fixed-point accelerators and floating-point accelerators. A fixed-point accelerator mainly designs parallel acceleration units for the convolution calculation process to achieve efficient convolution calculations. A floating-point accelerator also designs a par...

Claims


Application Information

Patent Type & Authority: Application (China)
IPC(8): G06N3/04, G06N3/063
CPC: G06N3/063, G06N3/045
Inventor: 王堃, 王铭宇, 吴晨
Owner REDNOVA INNOVATIONS INC