A convolutional neural network accelerator

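A convolutional neural network accelerator technique, applied in the field of convolutional neural network accelerators, that addresses problems such as implementation difficulty and achieves faster convolution computation, higher operation speed, and reduced storage.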

Active Publication Date: 2019-03-29
INST OF COMPUTING TECH CHINESE ACAD OF SCI

AI Technical Summary

Problems solved by technology

(2) Most of the bits of some weights are slack
However, achieving this goal will be difficult because existing MAC computing models need to be modified and the hardware architecture rebuilt to support the new computing model


Examples


Embodiment Construction

[0029] The invention restructures the inference computation mode of the DCNN model. It replaces the classic computing mode, multiply-accumulate (MAC), with a split accumulator (SAC). The classic multiplication is replaced by a series of low-cost adders. The SAC makes full use of the basic bits in the weights and consists of adders, shifters, and similar components, with no multiplier. A traditional multiplier performs a shift-and-add operation for every weight/activation pair, where "weight/activation" means "weight and activation". In the present invention, by contrast, multiple weight/activation pairs are accumulated first, and the shift-and-add summation is performed only once, which yields a large speedup.
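The contrast between MAC and SAC described in [0029] can be illustrated with a small software sketch. This is not the patented hardware design. It assumes unsigned integer weights of a fixed bit width and treats "basic bits" as the 1 bits of a weight. The function names `mac_dot` and `sac_dot` and the parameter `bits` are hypothetical, chosen only for this illustration.

```python
def mac_dot(weights, activations):
    """Classic MAC: one multiplication per weight/activation pair."""
    return sum(w * a for w, a in zip(weights, activations))


def sac_dot(weights, activations, bits=8):
    """Split-accumulator style dot product (illustrative sketch only).

    Assumes unsigned weights that fit in `bits` bits. Activations are
    accumulated per bit position with plain additions; shifting happens
    once per bit position at the end, not once per pair.
    """
    # One accumulator per bit position of the weight.
    segment_sums = [0] * bits
    for w, a in zip(weights, activations):
        for b in range(bits):
            if (w >> b) & 1:          # basic (1) bit: add the activation
                segment_sums[b] += a  # slack (0) bits contribute nothing
    # Single final shift-and-add pass over the accumulators.
    return sum(s << b for b, s in enumerate(segment_sums))


weights = [5, 3, 12]
activations = [2, 7, 1]
print(mac_dot(weights, activations), sac_dot(weights, activations))
```

Both functions return the same value. The point of the sketch is that `sac_dot` uses only additions inside the loop and defers all shifting to one final pass.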

[0030] Finally, the present invention proposes a Tetris accelerator to exploit the maximum potential of the weight kneading technique and the split accumulator SAC. The Tetris accelerator consists of a series of...



Abstract

A convolutional neural network accelerator is disclosed. The original weights are arranged in calculation order and bitwise aligned to obtain a weight matrix. The slack bits in the weight matrix are removed to obtain a reduced matrix with vacancies, and the vacancies are filled with the basic bits of each column of the reduced matrix, in calculation order, to obtain an intermediate matrix. The empty rows of the intermediate matrix are removed and its remaining vacancies are set to 0 to obtain a kneading matrix, each row of which serves as a kneading weight. Position information linking each bit of a kneading weight to its activation value is obtained from the correspondence between the activation values and the basic bits of the original weights. The kneading weights are sent to a split accumulator, which divides each kneading weight bitwise into a plurality of weight segments and, according to the position information, sums the weight segments with the corresponding activation values. The processing result is sent to an adder tree, where it is shifted and added to obtain the output feature map.
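The kneading steps in the abstract can be sketched in software as follows. This is a minimal illustration under stated assumptions, not the accelerator's actual data path. It assumes unsigned weights of a fixed bit width, treats slack bits as 0 bits and basic bits as 1 bits, and represents the position information as a per-bit index into the activation list. The names `knead`, `split_accumulate`, and `bits` are hypothetical.

```python
def knead(weights, bits=8):
    """Pack the basic (1) bits of each bit column upward, in calculation order.

    Returns (kneaded_weights, positions). positions[r][b] is the index of
    the activation paired with bit b of kneaded weight r, or None where
    the bit is a zero-filled vacancy.
    """
    # For each bit column, the indices of weights whose bit is 1, in order.
    columns = [[i for i, w in enumerate(weights) if (w >> b) & 1]
               for b in range(bits)]
    # Rows beyond the tallest column would be empty, so they are dropped.
    depth = max((len(col) for col in columns), default=0)
    kneaded, positions = [], []
    for r in range(depth):
        kw, pos = 0, [None] * bits
        for b, col in enumerate(columns):
            if r < len(col):
                kw |= 1 << b      # basic bit moved up into row r
                pos[b] = col[r]   # remember which activation it pairs with
        kneaded.append(kw)
        positions.append(pos)
    return kneaded, positions


def split_accumulate(kneaded, positions, activations, bits=8):
    """Sum activations per bit segment, then shift and add once."""
    segment_sums = [0] * bits
    for kw, pos in zip(kneaded, positions):
        for b in range(bits):
            if (kw >> b) & 1:
                segment_sums[b] += activations[pos[b]]
    return sum(s << b for b, s in enumerate(segment_sums))


weights = [5, 3, 12, 0]
activations = [2, 7, 1, 9]
kneaded, positions = knead(weights)
print(len(weights), "weights kneaded into", len(kneaded))
print(split_accumulate(kneaded, positions, activations))
```

In this example four original weights are kneaded into two, and the accumulated result equals the ordinary dot product of the original weights and activations.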

Description

Technical field

[0001] The invention relates to the field of neural network computing, and in particular to a convolutional neural network accelerator.

Background art

[0002] Deep convolutional neural networks have made significant progress in machine learning applications such as real-time image recognition, detection, and natural language processing. To improve accuracy, advanced deep convolutional neural network (DCNN) architectures use complex connections and a large number of neurons and synapses to meet the demands of high-precision, complex tasks. In the convolution operation, each weight is multiplied by its corresponding activation value, and the products are then summed. That is, each weight and its activation value form a pair.

[0003] Given the limitations of traditional general-purpose processor architectures, many researchers have proposed dedicated accelerators for specific computational modes of modern DCNNs. DCNN consists of mu...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/15, G06N3/04, G06N3/08
CPC: G06F17/15, G06N3/063, G06N3/082, G06N3/045, H03M7/4037, H03M7/70, H03M7/3059, G06N3/08, G06N3/048, G06F5/01, G06F7/50, H03M7/40, G06N3/04
Inventor: 李晓维, 魏鑫, 路航
Owner: INST OF COMPUTING TECH CHINESE ACAD OF SCI