Efficient LSTM accelerator based on FPGA

The accelerator is classified under instruments, memory systems, machine execution devices and related fields. It addresses the limited parallel computing capability of general-purpose platforms, their high memory bandwidth usage and power consumption, and their difficulty in meeting the combined power consumption and performance requirements of practical intelligent applications.

Active Publication Date: 2021-07-30
NANJING UNIV OF AERONAUTICS & ASTRONAUTICS

AI Technical Summary

Problems solved by technology

[0003] Existing general-purpose computing platforms (CPU, GPU) are limited by their serial execution structure, so their parallel computing capability is very limited. In addition, because computing and storage are separated, data movement consumes a large amount of memory bandwidth and power. As a result, these platforms struggle to meet the combined power consumption and performance requirements of practical intelligent applications.

Embodiment Construction

[0042] The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some of the embodiments of the present invention, not all of them. The terms "temporary" and "first" are used to distinguish different stages of algorithm training and have no limiting meaning. All other embodiments obtained by those skilled in the art from the embodiments of the present invention without creative effort fall within the protection scope of the present invention.

[0043] As shown in Figure 1, an FPGA-based high-efficiency LSTM accelerator internally comprises a plurality of computing units (PEs), a storage unit, a fully connected module, a softmax module and a control unit;

[0044] The calculation unit is composed of a matrix-vector multiplication mod...

Abstract

The invention discloses an efficient LSTM (Long Short-Term Memory) accelerator based on an FPGA (Field Programmable Gate Array). The FPGA accelerator internally comprises a plurality of calculation units, a storage unit and a control unit. Each calculation unit comprises a matrix-vector multiplication module and an Element operation module. The matrix-vector multiplication module is composed of 4 * N DSPs and four adders: N DSPs perform the multiply-accumulate operations of weight data and input data in parallel, and the N multiply-accumulate results are added to obtain the result vector of a single gate, while the multiply-accumulate operations of the four gates are executed in parallel. The Element operation module calculates the cell state value and the output data at the current moment. The calculation units adopt a parallel operation and multiplexing strategy. The storage unit caches the weight data, input data, output values and cell state values required by LSTM network calculation. The control unit controls the state transitions of LSTM network calculation and the data stream transmission process. Compared with a general-purpose processor, the FPGA accelerator offers high performance, low power consumption and high throughput.
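The data flow described in the abstract can be illustrated with a small software model. The following is a minimal sketch in plain Python, not the patented hardware design: it assumes the standard LSTM equations, a concatenated input/previous-output vector, per-gate biases, and the gate names `i`, `f`, `g`, `o`, none of which are spelled out in the text above. The function names (`gate_mac`, `matvec_four_gates`, `element_ops`, `lstm_step`) are hypothetical.

```python
import math

# Assumed gate names: input, forget, candidate, output (not named in the source).
GATES = ("i", "f", "g", "o")


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def gate_mac(weight_row, inputs):
    """Model one gate's multiply-accumulate lane group.

    Each product stands in for one DSP multiply; the final sum stands in
    for the adder that combines the N partial results.
    """
    products = [w * x for w, x in zip(weight_row, inputs)]
    return sum(products)


def matvec_four_gates(weights, biases, xh):
    """Compute the pre-activation vectors of all four gates.

    In hardware the four gates would run in parallel; here they are
    simply evaluated one after another.
    """
    return {
        gate: [gate_mac(row, xh) + bias
               for row, bias in zip(weights[gate], biases[gate])]
        for gate in GATES
    }


def element_ops(pre, c_prev):
    """Element-wise stage: new cell state and output (standard LSTM)."""
    c_t, h_t = [], []
    for k, c_old in enumerate(c_prev):
        i = sigmoid(pre["i"][k])
        f = sigmoid(pre["f"][k])
        g = math.tanh(pre["g"][k])
        o = sigmoid(pre["o"][k])
        c_new = f * c_old + i * g
        c_t.append(c_new)
        h_t.append(o * math.tanh(c_new))
    return c_t, h_t


def lstm_step(weights, biases, x_t, h_prev, c_prev):
    """One time step: matrix-vector stage followed by element-wise stage."""
    xh = list(x_t) + list(h_prev)  # assumed concatenation of input and previous output
    pre = matvec_four_gates(weights, biases, xh)
    return element_ops(pre, c_prev)
```

Splitting the step into a multiply-accumulate stage and an element-wise stage mirrors the two modules named in the abstract; how the real design schedules, pipelines or quantizes these stages is not stated in the text and is not modeled here.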

Description

Technical field

[0001] The invention relates to the field of computer hardware acceleration, in particular to an FPGA-based high-efficiency LSTM accelerator and a design method thereof.

Background art

[0002] Long Short-Term Memory (LSTM) is a typical recurrent neural network that effectively addresses the long-term dependence problem of recurrent neural networks. As applications scale up, the computation and storage complexity of neural network models also increase, and large-scale models must balance storage, computing performance and energy consumption during training and prediction. How to run neural network algorithms with high performance and low energy consumption is therefore an active research topic.

[0003] Existing general-purpose computing platforms (CPU, GPU) are limited by the structure of serial execution, and th...

Application Information

Patent Type & Authority: Applications (China)
IPC: IPC(8): G06N3/063, G06N3/04, G06F15/78, G06F9/38, G06F9/302, G06F9/30, G06F12/0897, G06F1/03
CPC: G06N3/063, G06F15/781, G06F9/3893, G06F9/3867, G06F9/3001, G06F9/30036, G06F12/0897, G06F1/0307, G06N3/048, G06N3/044, Y02D10/00
Inventor: 葛芬, 杨滢, 张炜, 张伟枫, 李梓瑜, 岳鑫, 周芳, 吴宁
Owner: NANJING UNIV OF AERONAUTICS & ASTRONAUTICS