
Coarse-grained reconfigurable convolution neural network accelerator and system

A convolutional neural network accelerator based on coarse-grained reconfigurable technology, in the field of high-energy-efficiency hardware accelerator design. It addresses the problems of low operation speed, large circuit scale, and long development cycles, and achieves reduced reconfiguration overhead, shorter reconfiguration time, and faster reconfiguration.

Active Publication Date: 2017-07-14
TSINGHUA UNIV

AI Technical Summary

Problems solved by technology

Traditional DSPs (Digital Signal Processors) suffer from low computing speed, a non-reconfigurable hardware structure, long development and upgrade cycles, and poor portability; these shortcomings become more pronounced in large-scale computation.
ASICs have clear advantages in performance, area, and power consumption, but changing application requirements and rapidly increasing complexity make ASIC design and verification difficult, and the long development cycle makes it hard to meet the demand for rapid product deployment.
Among programmable logic devices, Xilinx's Virtex-6 series FPGAs use 600 MHz DSP48E1 slices to achieve a performance of more than 1000 GMACS (1×10^12 multiply-accumulate operations per second). For large-scale computation, however, the circuit to be configured becomes too large, synthesis and configuration take too long, and the actual operating frequency is not high, so it is difficult to sustain high performance while also pursuing flexibility and low power consumption.
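The throughput figure above can be sanity-checked with simple arithmetic. This sketch assumes the 2016 DSP48E1 slices of the largest Virtex-6 device and one multiply-accumulate per slice per cycle; these figures are illustrative assumptions, not taken from the patent.

```python
# Back-of-the-envelope check of the cited FPGA throughput figure
# (assumption: 2016 DSP48E1 slices, the maximum in the largest
# Virtex-6 part, each doing one MAC per 600 MHz cycle).
clock_hz = 600e6          # DSP48E1 slice clock rate
macs_per_slice = 1        # one multiply-accumulate per cycle per slice
num_slices = 2016         # assumed slice count

peak_gmacs = clock_hz * macs_per_slice * num_slices / 1e9
print(f"peak throughput: {peak_gmacs:.1f} GMACS")  # ~1209.6 GMACS, > 1000 GMACS
```

This is a peak number; as the text notes, the configured circuit for large-scale computation rarely sustains it in practice.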




Embodiment Construction

[0021] The specific embodiments of the present invention are described in further detail below in conjunction with the accompanying drawings. The following examples illustrate the present invention but are not intended to limit its scope.

[0022] Figure 1 shows a coarse-grained reconfigurable convolutional neural network accelerator comprising multiple processing unit clusters. Each processing unit cluster includes several basic computing units, which are connected through a sub-addition unit (ADDB1-ADDB4 in Figure 1). The sub-addition units of the processing unit clusters are each connected to a mother addition unit (ADDB0 in Figure 1); the sub-addition units have the same structure as the mother addition unit. Each sub-addition unit generates the partial sum of its adjacent basic computing uni...
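The adder hierarchy described above can be sketched behaviorally: each cluster's sub-addition unit (ADDB1-ADDB4) reduces the outputs of its basic computing units to a partial sum, and the mother addition unit (ADDB0) accumulates the cluster partial sums. This is a minimal model for illustration; the cluster count, cluster sizes, and data values are assumptions, not taken from the patent.

```python
# Behavioral sketch of the two-level adder tree: four processing-unit
# clusters, each reduced by a sub-addition unit (ADDB1-ADDB4), with the
# mother addition unit (ADDB0) accumulating the cluster partial sums.

def sub_add(partial_products):
    """Sub-addition unit: reduce one cluster's basic-unit outputs to a partial sum."""
    return sum(partial_products)

def mother_add(cluster_sums):
    """Mother addition unit (ADDB0): accumulate the sub-adder outputs."""
    return sum(cluster_sums)

# Each inner list stands for the multiply outputs of one cluster's basic
# computing units for a single convolution output point (values illustrative).
clusters = [
    [1, 2, 3, 4],   # cluster 1 -> ADDB1
    [5, 6, 7, 8],   # cluster 2 -> ADDB2
    [1, 1, 1, 1],   # cluster 3 -> ADDB3
    [2, 2, 2, 2],   # cluster 4 -> ADDB4
]

result = mother_add(sub_add(c) for c in clusters)
print(result)  # 48
```

Because the sub-addition units share the mother unit's structure, the same reduction step repeats at both levels; reconfiguring the mapping of weights to clusters changes the convolution-kernel shape without changing this datapath.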



Abstract

The invention provides a coarse-grained reconfigurable convolution neural network accelerator and system. The accelerator comprises a number of processing unit clusters. Each processing unit cluster comprises a number of basic computing units, which are connected through a sub-addition unit. The sub-addition units of the processing unit clusters are each connected to a mother addition unit. Each sub-addition unit generates the partial sum of a number of adjacent basic computing units, and the mother addition unit accumulates the sub-addition units' outputs. The invention adopts a coarse-grained reconfigurable approach, linking different weight and image data paths through SRAM or other interconnection units to realize different convolution kernel processing structures. The accelerator can efficiently support networks and convolution kernels of different sizes, and the cost of reconfiguration is significantly reduced.

Description

technical field [0001] The invention relates to the technical field of high-energy-efficiency hardware accelerator design, and more particularly, to a coarse-grained reconfigurable convolutional neural network accelerator and system. Background technique [0002] A Convolutional Neural Network (CNN) is a feedforward neural network whose artificial neurons respond to surrounding units within part of their coverage area (the receptive field); it performs excellently on large-scale image processing. Convolutional neural networks have become the most commonly used algorithms in image recognition and speech recognition. These methods are computationally intensive and require dedicated accelerators, and they also have good application prospects in mobile devices. However, due to the limited resources of mobile devices, accelerators currently designed on GPU and FPGA (Field-Programmable Gate Array) platforms are difficult to use on ...
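To make the computational intensity mentioned above concrete, the following naive 2D convolution shows that each output point costs one multiply-accumulate per kernel element; it is a sketch for illustration, with shapes and values chosen here rather than taken from the patent.

```python
# Naive 2D convolution (valid padding, stride 1), to show why CNN
# inference is dominated by multiply-accumulate (MAC) operations,
# the workload the accelerator targets.

def conv2d(image, kernel):
    kh, kw = len(kernel), len(kernel[0])
    oh = len(image) - kh + 1          # output height (valid padding)
    ow = len(image[0]) - kw + 1       # output width
    out = [[0] * ow for _ in range(oh)]
    for i in range(oh):
        for j in range(ow):
            acc = 0
            for di in range(kh):
                for dj in range(kw):
                    acc += image[i + di][j + dj] * kernel[di][dj]  # one MAC
            out[i][j] = acc
    return out

image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
kernel = [[1, 0],
          [0, 1]]                      # illustrative 2x2 kernel

print(conv2d(image, kernel))          # [[6, 8], [12, 14]]
```

For an H×W image and K×K kernel this costs roughly H·W·K² MACs per channel pair, which is why real networks, with many channels and layers, need hardware that parallelizes the inner MAC loops as the clusters of basic computing units do here.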

Claims


Application Information

IPC(8): G06N 3/063; G06T 1/20; G06T 1/60
CPC: G06N 3/063; G06T 1/20; G06T 1/60; G06T 2200/28
Inventors: 袁哲, 刘勇攀, 杨华中, 岳金山, 李金阳
Owner TSINGHUA UNIV