Apparatus and method for realizing sparse neural network

A sparse neural network technology, applied in neural learning methods, biological neural network models, neural architectures, etc. It addresses problems such as limited acceleration, and achieves improved performance, reduced on-chip buffering, and reduced memory access.

Pending Publication Date: 2017-10-10
XILINX INC
Cites: 2. Cited by: 43.

AI Technical Summary

Problems solved by technology

However, CPUs and GPUs cannot fully exploit the benefits of sparsity, and the acceleration they achieve is extremely limited.


Embodiment Construction

[0031] Compression and Computation for Deep Neural Networks (DNNs)

[0032] The fully connected (FC) layer of a DNN performs the following computation:

[0033] b = f(Wa + v)    (3)

[0034] where a is the input activation (excitation) vector, b is the output activation vector, v is the bias, W is the weight matrix, and f is the non-linear function, typically the rectified linear unit (ReLU) in CNNs and some RNNs. Sometimes v is combined with W by appending an additional 1 to the vector a, so the bias is ignored in the following paragraphs.
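
The bias-folding step mentioned in [0034] can be illustrated with a minimal Python sketch. It assumes f is ReLU and uses small random matrices; the helper names (`relu`, `matvec`, `fc_layer`, `fold_bias`) are hypothetical and do not come from the patent.

```python
import random

def relu(x):
    """Element-wise rectified linear unit."""
    return [max(t, 0.0) for t in x]

def matvec(W, a):
    """Dense matrix-vector product; W is a list of rows."""
    return [sum(w_ij * a_j for w_ij, a_j in zip(row, a)) for row in W]

def fc_layer(W, a, v):
    """Equation (3): b = f(W a + v), with f assumed to be ReLU."""
    return relu([s + v_i for s, v_i in zip(matvec(W, a), v)])

def fold_bias(W, v):
    """Append the bias v as one extra column of W."""
    return [row + [v_i] for row, v_i in zip(W, v)]

random.seed(0)
W = [[random.uniform(-1, 1) for _ in range(4)] for _ in range(3)]
a = [random.uniform(-1, 1) for _ in range(4)]
v = [random.uniform(-1, 1) for _ in range(3)]

b_explicit = fc_layer(W, a, v)
# Folded form: extend a with a constant 1 so the extra column contributes v.
b_folded = relu(matvec(fold_bias(W, v), a + [1.0]))
```

Both forms give the same output, which is why the bias can be dropped from the notation without loss of generality.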

[0035] For a typical FC layer, such as FC7 of VGG-16 or AlexNet, the activation vector is 4K long and the weight matrix is 4K×4K (16M weights). Weights are represented as single-precision floating-point numbers, so such a layer requires 64 MB of storage. The output activation of equation (3) is computed element by element, as:

[0036] b_i = ReLU( Σ_{j=0}^{n-1} W_ij a_j )
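
The storage figure and the element-wise computation described in [0035] can be checked with a short sketch. It assumes K = 1024, 4 bytes per single-precision weight, f = ReLU, and the bias already folded away; the function names and the small example matrix are hypothetical.

```python
def fc_weight_storage_bytes(rows, cols, bytes_per_weight=4):
    """Storage for a dense weight matrix (4 bytes = single precision)."""
    return rows * cols * bytes_per_weight

K = 1024  # assumption: "4K" is read as 4 * 1024
n_weights = (4 * K) * (4 * K)                                   # 16M weights
storage_mb = fc_weight_storage_bytes(4 * K, 4 * K) / (1024 * 1024)

def output_element(W, a, i):
    """b_i = ReLU(sum_j W[i][j] * a[j]), bias ignored."""
    return max(sum(W[i][j] * a[j] for j in range(len(a))), 0.0)

# Tiny illustrative layer: 3 outputs, 3 inputs.
W = [[1.0, -2.0, 0.0],
     [0.0, 0.5, 1.0],
     [-1.0, 0.0, 0.0]]
a = [3.0, 1.0, 4.0]
b = [output_element(W, a, i) for i in range(len(W))]
```

Each output element needs one full row of W, so a dense 4K×4K layer reads all 16M weights per input vector. This is the memory traffic that sparsity is meant to cut.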

[0037] The paper "Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding" by Song Han et al. int...

Abstract

The invention provides an apparatus and method for realizing a sparse neural network. The apparatus includes an input receiving unit, a sparse matrix reading unit, M×N computing units, a control unit, and an output buffer unit. The invention also provides a parallel computing method for the apparatus. The method supports sharing along the input-vector dimension and sharing along the weight-matrix dimension of the sparse neural network. The apparatus and method substantially reduce memory access, reduce the amount of on-chip buffering, effectively balance on-chip buffering, I/O access, and computation, and improve the performance of the computing modules.
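
The abstract does not spell out the data flow, so the following Python sketch shows only one possible reading of an M×N arrangement, under stated assumptions: each of M input vectors is shared by N computing units, and each of N groups of weight rows is shared by M computing units. The interleaved row assignment, the sequential loops standing in for parallel hardware, and all function names are hypothetical, not taken from the patent.

```python
def to_sparse_rows(W):
    """Keep only nonzero weights: one list of (column, value) pairs per row."""
    return [[(j, w) for j, w in enumerate(row) if w != 0.0] for row in W]

def compute_unit(sparse_rows, a):
    """One computing unit: sparse dot products of its rows with one input vector."""
    return [sum((w * a[j] for j, w in row), 0.0) for row in sparse_rows]

def grid_spmv(W, inputs, n_row_groups):
    """Assumed M x N grid: unit (m, n) pairs input vector m with row group n."""
    sparse = to_sparse_rows(W)
    # Row group n holds rows n, n + N, n + 2N, ... (interleaving is an assumption).
    groups = [sparse[n::n_row_groups] for n in range(n_row_groups)]
    outputs = []
    for a in inputs:                          # M input vectors
        b = [0.0] * len(W)
        for n, group in enumerate(groups):    # N weight row groups
            for k, value in enumerate(compute_unit(group, a)):
                b[n + k * n_row_groups] = value
        outputs.append(b)
    return outputs

W = [[0.0, 2.0, 0.0],
     [1.0, 0.0, 0.0],
     [0.0, 0.0, 3.0],
     [0.0, 0.0, 0.0]]
inputs = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]     # M = 2
result = grid_spmv(W, inputs, n_row_groups=2)   # N = 2
```

In this reading, a weight row group is fetched once and reused for all M input vectors, and an input vector is fetched once and reused by all N row groups, which is one way the stated reduction in memory access could arise.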

Description

Technical Field

[0001] The object of the present invention is to provide an apparatus and method for implementing a sparse neural network accelerator, so as to improve computational efficiency.

Background

[0002] An artificial neural network (ANN), also referred to simply as a neural network (NN), is an algorithmic mathematical model that imitates the behavioral characteristics of animal neural networks and performs distributed parallel information processing. In recent years, neural networks have developed rapidly and are widely used in many fields, including image recognition, speech recognition, natural language processing, weather forecasting, gene expression, and content recommendation.

[0003] As shown in Figure 1, the accumulated stimulus of a neuron is the sum of the stimuli delivered by other neurons, each multiplied by its corresponding weight. X_j denotes this accumulation, Y_i denotes the amount of stimulus transmitted by a neuron, W_i represe...


Application Information

IPC(8): G06N3/04
CPC: G06N3/04, G06N3/063, G06N3/045, G06N3/082
Inventor: 谢东亮, 康君龙, 韩松
Owner: XILINX INC