FPGA accelerator of LSTM neural network and acceleration method of FPGA accelerator

A neural network and accelerator technology, applied in the field of FPGA accelerators and their acceleration methods, which can solve problems such as load imbalance.

Active Publication Date: 2019-08-09
NANJING UNIV

AI Technical Summary

Problems solved by technology

[0005] In order to solve the technical problem of unbalanced load in the operation of sparse neural network, the present invention proposes an FPGA accelerator of LSTM neural network and its acceleration method



Examples


Embodiment Construction

[0015] The above solution will be further described below in conjunction with specific embodiments. It should be understood that these examples are intended to illustrate the present invention, not to limit its scope.

[0016] Figure 1 is a system architecture diagram of an FPGA-based LSTM neural network accelerator for optimizing system computing performance. When the parameter scale of the original LSTM neural network exceeds the storage resource limit of the FPGA, the model parameters are compressed by pruning and quantization. The compressed neural network model becomes sparser, and its weight matrix becomes a sparse matrix. To improve parallelism and the operational balance of the computing units, the compressed sparse matrix is rearranged row by row so that, from top to bottom, the number of non-zero weights in each row gradually decreases. The non-zero weight values are then evenly distributed to each operation unit to ensure t...
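The row rearrangement and load balancing described above can be sketched in Python. This is a minimal software model, not the patent's hardware implementation: the function name `balance_rows` and the greedy least-loaded assignment are illustrative assumptions consistent with "evenly distributing non-zero weights to each operation unit".

```python
import numpy as np

def balance_rows(weights, num_pes):
    """Sort sparse-matrix rows by non-zero count (descending), then
    greedily assign each row to the currently least-loaded processing
    element, so the total non-zero workload is spread evenly."""
    nnz_per_row = [np.count_nonzero(row) for row in weights]
    # Rearrange rows so non-zero counts decrease from top to bottom.
    order = sorted(range(len(weights)),
                   key=lambda r: nnz_per_row[r], reverse=True)

    pe_rows = [[] for _ in range(num_pes)]   # row indices per PE
    pe_load = [0] * num_pes                  # non-zero count per PE
    for r in order:
        pe = pe_load.index(min(pe_load))     # least-loaded PE so far
        pe_rows[pe].append(r)
        pe_load[pe] += nnz_per_row[r]
    return pe_rows, pe_load

# Toy pruned weight matrix: 6 rows with uneven sparsity (12 non-zeros).
w = np.array([
    [1, 0, 2, 0],
    [0, 0, 0, 3],
    [4, 5, 6, 7],
    [0, 8, 0, 0],
    [9, 0, 1, 2],
    [0, 0, 3, 0],
])
rows, load = balance_rows(w, num_pes=2)
print(load)  # both PEs carry 6 non-zeros each
```

Without the rearrangement, a naive row split (first half vs. second half) would give one unit 7 non-zeros and the other 5, leaving the lighter unit idle at the end of each cycle.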


Abstract

The invention provides an FPGA accelerator for an LSTM neural network and an acceleration method thereof. The accelerator comprises a data distribution unit, an operation unit, a control unit and a storage unit; the operation unit comprises a sparse matrix-vector multiplication module, a nonlinear activation function module and an element-wise multiply-accumulate module. The control unit sends a control signal to the data distribution unit, which reads the input excitation values and the neural network weight parameters from the storage unit and feeds them to the operation unit for computation. The non-zero weight values are evenly distributed among the operation units, so that no operation resource idles and the operation performance of the whole network is improved. Meanwhile, the pruned neural network is stored in sparse form: the weight values of each column are stored in the same address space and encoded by row index, which improves operation performance and data throughput while preserving accuracy.
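The column-wise sparse storage and sparse matrix-vector multiplication that the abstract describes can be sketched as a CSC-like encoding: each column's non-zero values are kept together with their row indices. The helper names (`encode_columns`, `spmv`) and the NumPy model are illustrative assumptions, not the patent's hardware format.

```python
import numpy as np

def encode_columns(dense):
    """Store the pruned matrix column by column: for each column, keep
    its non-zero values in one address space together with their row
    indices (a CSC-like encoding)."""
    cols = []
    for j in range(dense.shape[1]):
        rows = np.nonzero(dense[:, j])[0]
        cols.append((rows, dense[rows, j]))
    return cols

def spmv(cols, x, num_rows):
    """Sparse matrix-vector multiply y = W @ x over the column encoding:
    each stored non-zero in column j is scaled by x[j] and accumulated
    into y at its row index."""
    y = np.zeros(num_rows)
    for j, (rows, vals) in enumerate(cols):
        y[rows] += vals * x[j]
    return y

# Toy pruned weight matrix and input excitation vector.
w = np.array([[0., 2., 0.],
              [1., 0., 3.],
              [0., 0., 4.]])
x = np.array([1., 1., 1.])
cols = encode_columns(w)
y = spmv(cols, x, num_rows=3)
print(y)  # matches the dense product w @ x
```

Storing only non-zeros with their row indices is what lets the accelerator skip multiplications by zero, which is where the throughput gain over dense computation comes from.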

Description

Technical Field

[0001] The invention relates to the field of computer hardware acceleration, and in particular to an FPGA accelerator oriented to LSTM neural networks and an acceleration method thereof.

Background

[0002] A recurrent neural network (RNN) is usually used for processing sequential signal data. Owing to its memory of historical data, an RNN can be applied in machine translation, speech recognition, user behavior learning and other fields. The LSTM network, a variant of the RNN, solves the problems of gradient explosion and gradient vanishing in RNN training. Because the LSTM network achieves higher accuracy than the plain RNN, academic research on LSTM accelerators has surged.

[0003] By its nature, the computation process of the LSTM network is computation-intensive. Executing the LSTM algorithm on a general-purpose hardware platform such as a CPU does no...

Claims


Application Information

Patent Type & Authority: Application (China)
IPC(8): G06N 3/063; G06N 3/04; G06N 3/08
CPC: G06N 3/063; G06N 3/082; G06N 3/044; G06N 3/045; Y02D 10/00
Inventors: 潘红兵, 查羿, 郭良蛟, 秦子迪, 苏岩, 朱杏伟
Owner NANJING UNIV