Method and device for reducing first-layer convolution calculation delay of CNN accelerator

A technique for the first-layer convolution of CNN accelerators, applied in the parallel design of convolutional neural network accelerators and in the field of hardware-accelerated convolutional neural networks. Reported effects: improved utilization rate, reduced calculation delay, and an obvious acceleration effect.

Pending Publication Date: 2020-04-21
TIANJIN UNIV
7 Cites, 0 Cited by

AI Technical Summary

Problems solved by technology

A single computing engine uses the same circuit to meet the calculation requirements of all convolutional layers. Compared with a fully customized architecture, this effectively saves hardware resources and improves the versatility of the circuit. Its disadvantage is also obvious: because all convolutional layers use the same calculation mode, some individual layers (especially the first convolutional layer) cannot make full use of the parallel computing resources when operated in this mode, so calculation efficiency is low and delay is large.
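The utilization problem described above can be illustrated with a small arithmetic sketch. Everything numeric here is an assumption for illustration and not taken from the patent: the engine is imagined to have 32 parallel multiplier lanes, the first layer is assumed to see 3 input channels (an RGB image), and the function names are hypothetical.

```python
from math import ceil


def channel_only_utilization(in_channels: int, lanes: int) -> float:
    """Share of lanes doing useful work when parallelism is across channels only.

    The engine needs ceil(in_channels / lanes) passes; idle lanes in the
    last pass count against utilization.
    """
    passes = ceil(in_channels / lanes)
    return in_channels / (passes * lanes)


def channel_and_block_utilization(in_channels: int, lanes: int) -> float:
    """Share of lanes doing useful work when spare lanes take other feature blocks.

    Lanes are grouped by in_channels; each group works on a different
    block of the same feature map (hypothetical grouping for illustration).
    """
    groups = lanes // in_channels
    return (groups * in_channels) / lanes


if __name__ == "__main__":
    LANES = 32  # assumed lane count, not from the source
    print("first layer, channels only  :", channel_only_utilization(3, LANES))
    print("first layer, channels+blocks:", channel_and_block_utilization(3, LANES))
    print("deeper layer (64 channels)  :", channel_only_utilization(64, LANES))
```

Under these assumed numbers, a deeper layer with many channels keeps every lane busy, while a 3-channel first layer leaves most lanes idle unless they are given other feature blocks to process.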



Embodiment Construction

[0028] The present invention mainly addresses the large first-layer convolution calculation delay in current single-computing-engine architectures. It increases calculation parallelism by optimizing the calculation mode of the first-layer convolution, thereby effectively reducing the first-layer convolution calculation delay.

[0029] To this end, the technical solution adopted by the present invention is to increase the computational parallelism of the first-layer convolution. That is, when calculating the first-layer convolution, calculations are performed in parallel not only across different channels but also across different feature blocks within the same channel; for the other convolutional layers, calculations are performed in parallel only across different channels. The corresponding hardware architecture is as follows: every 3 convolutions form a group, followed by a first-level addition tree, a judgmen...
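The calculation mode in [0029] can be sketched in software as follows. This is a minimal functional model under stated assumptions, not the patented circuit: the names `conv_plane`, `adder_tree`, `conv_group`, `split_rows` and `first_layer_conv` are hypothetical, the convolution is assumed to be stride 1 with no padding, feature blocks are assumed to be row bands of the input, and the loop over blocks runs sequentially here where the hardware groups would run concurrently.

```python
from typing import List

Plane = List[List[int]]  # one channel, indexed [row][col]


def conv_plane(x: Plane, k: Plane) -> Plane:
    """Stride-1, no-padding 2-D convolution (cross-correlation) of one channel."""
    kh, kw = len(k), len(k[0])
    oh, ow = len(x) - kh + 1, len(x[0]) - kw + 1
    return [[sum(x[r + i][c + j] * k[i][j] for i in range(kh) for j in range(kw))
             for c in range(ow)] for r in range(oh)]


def adder_tree(planes: List[Plane]) -> Plane:
    """Element-wise sum of the per-channel partial results."""
    return [[sum(vals) for vals in zip(*rows)] for rows in zip(*planes)]


def conv_group(x: List[Plane], w: List[Plane]) -> Plane:
    """One group: one convolution per input channel, then an adder tree."""
    return adder_tree([conv_plane(xc, wc) for xc, wc in zip(x, w)])


def split_rows(x: List[Plane], kh: int, n_blocks: int) -> List[List[Plane]]:
    """Cut the input into row bands; neighbouring bands share kh - 1 rows."""
    out_rows = len(x[0]) - kh + 1
    step = -(-out_rows // n_blocks)  # ceiling division
    blocks = []
    for start in range(0, out_rows, step):
        stop = min(start + step, out_rows)
        blocks.append([ch[start:stop + kh - 1] for ch in x])
    return blocks


def first_layer_conv(x: List[Plane], w: List[Plane], n_blocks: int) -> Plane:
    """Channel-parallel and block-parallel mode for the first layer."""
    result: Plane = []
    for block in split_rows(x, len(w[0]), n_blocks):  # concurrent in hardware
        result.extend(conv_group(block, w))
    return result


if __name__ == "__main__":
    # Small made-up 3-channel 6x6 input and 3x3 kernels.
    x = [[[(c + 1) * (r * 6 + col) % 7 for col in range(6)] for r in range(6)]
         for c in range(3)]
    w = [[[c + i - j for j in range(3)] for i in range(3)] for c in range(3)]
    assert first_layer_conv(x, w, n_blocks=2) == conv_group(x, w)
    print("block-parallel result matches channel-only result")
```

The check at the end only confirms that splitting the feature map into blocks leaves the result unchanged; the latency benefit comes from the blocks being processed at the same time, which this sequential model does not measure.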


Abstract

The invention relates to the field of hardware-accelerated convolutional neural networks and aims to improve the calculation parallelism of the first-layer convolution by optimizing its calculation mode, thereby effectively reducing the first-layer convolution calculation delay. To this end, the technical scheme adopted by the invention is a method and a device for reducing the first-layer convolution calculation delay of a CNN accelerator, characterized in that when the first-layer convolution is calculated, parallel calculation is carried out across different channels and, at the same time, across different feature image blocks within the same channel; for the other convolutional layers, parallel calculation is carried out only across different channels. The method and device are mainly applied to integrated circuit design and manufacturing.

Description

Technical field

[0001] The invention relates to the field of hardware-accelerated convolutional neural networks, in particular to the parallel design of convolutional neural network accelerators. Specifically, the invention relates to a method and a device for reducing the first-layer convolution calculation delay of a convolutional neural network accelerator.

Background

[0002] In recent years, convolutional neural networks (CNNs) have been widely used in computer vision tasks such as image classification, object detection and scene segmentation because of their high classification accuracy [1]. However, commonly used CNN models involve a huge amount of calculation, and implementing a convolutional neural network algorithm in software is very time-consuming. Hardware-accelerated designs of CNN algorithms have therefore emerged. The hardware accelerator maps the neural network algorithm to the hardware and makes full use of the para...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06N3/04, G06N3/063
CPC: G06N3/063, G06N3/045
Inventors: 刘强, 刘杰
Owner: TIANJIN UNIV