Sparse matrix-vector multiplication computational unit for permuted block-diagonal weight matrices

A sparse matrix-vector multiplication calculation unit for sparse weight matrices. It addresses problems such as increased system power consumption and the failure to consider the sparsity of intermediate products, and it avoids unnecessary accumulation operations and eliminates pointer storage overhead.

Active Publication Date: 2021-07-09
扬州伊达实业有限公司

AI Technical Summary

Problems solved by technology

On the contrary, the additional comparison operations increase the power consumption of the entire system.
(3) The calculation unit does not consider the sparsity of the intermediate products. Because any product generated by a zero-valued weight or a zero-valued excitation is also zero, the intermediate products are at least as sparse as the weight matrix.



Embodiment

[0025] As shown in Figure 1, the sparse matrix-vector multiplication calculation unit for permuted block-diagonal weight matrices of this embodiment includes several processing units and accumulators. The outputs of the processing units are connected to the accumulators, which accumulate the outputs of all processing units. To reduce consumption, the unit is organized in N stages: the processing units form the first stage, and the accumulators occupy the second through Nth stages. The outputs of every two processing units are connected to one accumulator in the second stage, and the outputs of every two accumulators in stage n+1 are connected to one accumulator in stage n+2, where n ∈ [1, N-2]. The Nth stage contains a single accumulator. If a stage contains an odd number of processing units or accumulators, the unpaired one is connected on its own to an accumulator in the next stage, which is equivalent to adding zero.
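The staged reduction described in [0025] can be sketched in software as a pairwise adder tree. This is only an illustrative model, not the patented circuit: the function names `adder_tree_sum` and `num_stages` are hypothetical, and the processing-unit outputs are assumed to be plain numbers.

```python
from typing import List


def adder_tree_sum(pe_outputs: List[float]) -> float:
    """Reduce the stage-1 (processing unit) outputs pairwise until one value remains."""
    if not pe_outputs:
        return 0
    stage = list(pe_outputs)
    while len(stage) > 1:
        next_stage = []
        for i in range(0, len(stage), 2):
            left = stage[i]
            # An unpaired element is forwarded alone, which is the same as adding 0.
            right = stage[i + 1] if i + 1 < len(stage) else 0
            next_stage.append(left + right)
        stage = next_stage
    return stage[0]


def num_stages(num_pes: int) -> int:
    """Count the stages N: stage 1 holds the processing units, stages 2..N the accumulators.

    Assumes num_pes >= 2 so that at least one accumulator stage exists.
    """
    stages = 1
    width = num_pes
    while width > 1:
        width = (width + 1) // 2  # each accumulator takes at most two inputs
        stages += 1
    return stages
```

For example, five processing units would need stages of width 5, 3, 2 and 1, so N = 4 under these assumptions.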

[0026] Assuming that the length of the input excitation vector ...


Abstract

The invention relates to a sparse matrix-vector multiplication calculation unit for permuted block-diagonal weight matrices, comprising several processing units and an accumulator, with the outputs of the processing units connected to the accumulator. The calculation unit makes full use of the sparsity of the pruned weight matrix and avoids multiplications between zero-valued weights and the corresponding input excitation elements. Zero-skipping can also be enabled dynamically according to the sparsity of the input excitations. The sparsity of the intermediate products obtained by multiplying the weights with the corresponding input excitations is fully exploited, so accumulation operations involving zero-valued products are avoided. The designed pointer generator eliminates the storage overhead of the pointers that record the positions of non-zero values.
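A minimal software sketch of the zero-skipping idea follows. The listing does not spell out the permutation layout, so this sketch assumes a common one: the matrix is split into p×p blocks, and each block keeps one weight per row at local column (row + offset) mod p. The names `generate_column` and `pbd_spmv`, the `values`/`offsets` data layout, and the skip counter are all hypothetical illustrations rather than the patented design.

```python
from typing import List, Tuple


def generate_column(block_col: int, local_row: int, offset: int, p: int) -> int:
    """Hypothetical pointer generator: compute the column index instead of storing it."""
    return block_col * p + (local_row + offset) % p


def pbd_spmv(
    values: List[List[List[float]]],
    offsets: List[List[int]],
    x: List[float],
    p: int,
) -> Tuple[List[float], int]:
    """Multiply an (assumed) permuted block-diagonal matrix by x, skipping zeros.

    values[br][bc][r] is the single kept weight in local row r of block (br, bc);
    offsets[br][bc] is that block's assumed permutation offset.
    Returns the output vector and the number of skipped multiply-accumulates.
    """
    y = [0] * (len(values) * p)
    skipped = 0
    for br, row_of_blocks in enumerate(values):
        for bc, block in enumerate(row_of_blocks):
            for r, w in enumerate(block):
                col = generate_column(bc, r, offsets[br][bc], p)
                a = x[col]
                # A zero weight or zero excitation gives a zero product,
                # so both the multiplication and the accumulation are skipped.
                if w == 0 or a == 0:
                    skipped += 1
                    continue
                y[br * p + r] += w * a
    return y, skipped
```

In this model the only stored metadata is one offset per block; the per-element column indices are recomputed on the fly, which is the kind of saving the abstract attributes to the pointer generator.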

Description

Technical field

[0001] The invention relates to a sparse matrix-vector multiplication calculation unit for permuted block-diagonal weight matrices, and belongs to the technical field of integrated circuit design.

Background technique

[0002] In recent years, deep learning algorithms have begun to replace traditional algorithms and have become mainstream in many fields because of their excellent performance. However, current mainstream processors (CPU, GPU, DSP, etc.) are still poorly suited to the data-intensive computing characteristics of these algorithms. As a result, there has been a wave of research on deep learning processors in both academia and industry.

[0003] The fully connected operation is a very important type of operation in neural network algorithms, and it is also one with a very large amount of computation. Performing such operations efficiently is therefore key to improving the performance of deep learning processors. ...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F30/39
Inventor: 郑勇陈柱佳舒毅
Owner: 扬州伊达实业有限公司