FPGA (Field Programmable Gate Array)-based configurable CNN (Convolutional Neural Network) multiplication accumulator supporting 8-bit and 16-bit data

A multiply-accumulate data technology, applied in the field of image recognition, that addresses the problems of high cost, high design threshold, and large resource consumption, so as to reduce storage pressure, reduce complexity, and speed up deployment.

Pending Publication Date: 2021-07-20
GUANGDONG UNIV OF TECH

AI Technical Summary

Problems solved by technology

[0003] At present, the platforms that can implement CNNs mainly include CPU, GPU, FPGA and ASIC. Because the CPU is designed for general-purpose performance, it is not well suited to computing CNNs, which require a large number of arithmetic operations. The GPU, thanks to its excellent parallel computing capability, is widely used for training artificial neural networks, but because of its high power consumption it is not suitable for scenarios with strict power requirements; GPUs are also costly, so at present they are mostly used in the cloud. The ASIC, for its part, performs well in terms of cost, but its design threshold is high and its design cycle is long. The FPGA is often used for verification before ASIC tape-out because it is programmable and its design flow is close to that of an ASIC. Although the energy efficiency of an FPGA is not as good as that of an ASIC, its flexibility to modify the design ...
[0004] Existing FPGA-based configurable CNN multiply accumulators can only accelerate simple, highly repetitive calculations. Some complex operations or random logic in a CNN, such as computing powers of the natural constant e, or randomized post-processing optimizations such as dropout, are difficult to implement on an FPGA or require a lot of resources, and the performance gained is not proportional to the resources invested.

Examples

Embodiment Construction

[0033] As shown in Figure 1, the present invention provides an FPGA-based configurable CNN multiply accumulator supporting 8-bit and 16-bit data widths, comprising a control module, an input feature map register, a weight register, a partial sum register, a PE array and an output feature map register, where:

[0034] The control module controls the timing of the entire convolution calculation. After receiving the start signal, the control module first generates the enable signals and the addresses for reading data from external storage according to the convolution configuration signal. Because the input feature map, the weights and the partial sums are read in parallel, three read-enable signals and three corresponding address signals are generated. Once the input feature map has been read, the convolution calculation starts, and the control module directs the PE array to begin computing the convolution. During the convolution calculation, the control module generates a moving sig...
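The three parallel read requests described in [0034] can be illustrated with a small behavioral sketch. It assumes a simple linear address layout with one word read per stream per cycle; `ConvConfig`, its fields, and `read_requests` are hypothetical names chosen for illustration, since the source does not specify the address map or the format of the convolution configuration signal.

```python
from dataclasses import dataclass


@dataclass
class ConvConfig:
    """Hypothetical stand-in for the convolution configuration signal."""
    ifmap_base: int    # first address of the input feature map (assumed)
    ifmap_words: int   # number of input feature map words to read (assumed)
    weight_base: int   # first address of the weights (assumed)
    weight_words: int  # number of weight words to read (assumed)
    psum_base: int     # first address of the partial sums (assumed)
    psum_words: int    # number of partial-sum words to read (assumed)


def read_requests(cfg):
    """Yield one triple per cycle: (ifmap, weight, psum) read requests.

    Each request is an (enable, address) pair. A stream that has finished
    reading keeps its enable low and reports no address (None).
    """
    streams = [
        (cfg.ifmap_base, cfg.ifmap_words),
        (cfg.weight_base, cfg.weight_words),
        (cfg.psum_base, cfg.psum_words),
    ]
    total_cycles = max(words for _, words in streams)
    for cycle in range(total_cycles):
        yield tuple(
            (True, base + cycle) if cycle < words else (False, None)
            for base, words in streams
        )


if __name__ == "__main__":
    cfg = ConvConfig(0, 3, 100, 2, 200, 1)
    for cycle, triple in enumerate(read_requests(cfg)):
        print(cycle, triple)
```

All three streams advance in the same cycle, which mirrors the parallel reads in the text; a real controller would also need the handshaking and the movement signals that the truncated paragraph goes on to describe.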

Abstract

The invention discloses an FPGA-based configurable CNN multiply accumulator that supports 8-bit and 16-bit data widths. It comprises a control module, an input feature map register, a weight register, a partial sum register, a PE array and an output feature map register. The control module controls the timing of the whole convolution calculation. The input feature map register stores an input feature map and outputs its pixels to the PE array in convolution order. The weight register provides the input weights to the PE array. The partial sum register is a register array with only one layer. The PE array completes the convolution calculation, and the output feature map register stores the values produced by the PE array once the calculation is complete. The invention speeds up the design and deployment of CNN hardware accelerators and simplifies the design process.
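As a rough software illustration of the data path summarized in the abstract, the sketch below models a sliding-window multiply-accumulate over an input feature map, with an optional partial-sum input and a selectable 8-bit or 16-bit operand width. It is a behavioral model under stated assumptions, not the patented circuit: the function names are hypothetical, operands are assumed to be signed integers, stride is fixed at 1 with no padding, and the accumulator is an unbounded Python integer because the source does not give the accumulator width.

```python
def check_width(values, bits):
    """Raise ValueError if any value does not fit a signed `bits`-bit integer."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    for v in values:
        if not lo <= v <= hi:
            raise ValueError(f"{v} does not fit in a signed {bits}-bit operand")


def mac_conv2d(ifmap, weights, psum=None, bits=8):
    """Sliding-window multiply-accumulate (stride 1, no padding).

    ifmap, weights: 2-D lists of integers that must fit the selected width.
    psum: optional 2-D list of partial sums that the result is added onto.
    bits: operand width, 8 or 16 (the two widths named in the source).
    """
    if bits not in (8, 16):
        raise ValueError("only 8-bit and 16-bit operands are modeled")
    check_width([v for row in ifmap for v in row], bits)
    check_width([v for row in weights for v in row], bits)

    kh, kw = len(weights), len(weights[0])
    oh, ow = len(ifmap) - kh + 1, len(ifmap[0]) - kw + 1
    # Start from the partial sums if given, otherwise from zero.
    out = [row[:] for row in psum] if psum is not None else [[0] * ow for _ in range(oh)]

    for r in range(oh):
        for c in range(ow):
            acc = out[r][c]
            for i in range(kh):
                for j in range(kw):
                    acc += ifmap[r + i][c + j] * weights[i][j]  # one MAC step
            out[r][c] = acc
    return out


if __name__ == "__main__":
    ifmap = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    kernel = [[1, 0], [0, -1]]
    print(mac_conv2d(ifmap, kernel))  # → [[-4, -4], [-4, -4]]
```

Passing `psum` lets a caller accumulate across input channels, which is the role a partial sum register would play; in hardware the inner loops would be unrolled across the PE array rather than executed sequentially.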

Description

Technical field

[0001] The invention relates to the field of image recognition, in particular to an FPGA-based configurable CNN multiply accumulator supporting 8-bit and 16-bit data.

Background technique

[0002] Deep learning is a new field that has developed very rapidly in recent years. As one of the most commonly used deep learning models, convolutional neural networks are widely used in image processing, face recognition, audio retrieval and other fields because of their excellent feature learning capabilities. As convolutional neural network architectures develop, networks keep getting deeper and their structures change rapidly, so computing a network requires a large number of arithmetic operations. At the same time, application scenarios continue to expand, which places higher demands on the real-time performance of the network. In addition, the progress made in n...

Application Information

IPC(8): G06F7/523, G06N3/04
CPC: G06F7/523, G06N3/045, Y02D10/00
Inventor: 胡湘宏, 李学铭, 黄宏敏, 陈淘生, 刘梓豪, 熊晓明
Owner: GUANGDONG UNIV OF TECH