FPGA accelerator of LSTM neural network and acceleration method of FPGA accelerator

A neural network and accelerator technology, applied in the field of FPGA accelerators and their acceleration methods, which can solve problems such as load imbalance.

Active Publication Date: 2019-08-09
NANJING UNIV

AI Technical Summary

Problems solved by technology

[0005] In order to solve the technical problem of unbalanced load in the operation of sparse neural network, the present invention proposes an FPGA accelerator of LSTM neural network and its acceleration method



Examples


Embodiment Construction

[0015] The above solution will be further described below in conjunction with specific embodiments. It should be understood that these examples are intended to illustrate the present invention, not to limit its scope.

[0016] Figure 1 is a system architecture diagram of an FPGA-based LSTM neural network accelerator for optimizing system computing performance. When the parameter scale of the original LSTM neural network exceeds the storage resource limit of the FPGA, the model parameters are compressed by pruning and quantization. The compressed neural network model becomes sparser, and its weight matrix becomes a sparse matrix. To improve parallelism and the operational balance of the computing units, the compressed sparse matrix is rearranged row by row so that, from top to bottom, the number of non-zero weights in each row gradually decreases. The non-zero weight values are then evenly distributed to each operation unit to ensure t...
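The row rearrangement and load balancing described above can be sketched in Python. This is a minimal software model, not the patent's hardware implementation: the function name `balance_rows` and the greedy least-loaded assignment are illustrative assumptions consistent with "evenly distributing non-zero weights to each operation unit".

```python
import numpy as np

def balance_rows(weights, num_pes):
    """Sort sparse-matrix rows by non-zero count (descending), then
    greedily assign each row to the currently least-loaded processing
    element, so the total non-zero workload is spread evenly."""
    nnz_per_row = [np.count_nonzero(row) for row in weights]
    # Rearrange rows so non-zero counts decrease from top to bottom.
    order = sorted(range(len(weights)),
                   key=lambda r: nnz_per_row[r], reverse=True)

    pe_rows = [[] for _ in range(num_pes)]   # row indices per PE
    pe_load = [0] * num_pes                  # non-zero count per PE
    for r in order:
        pe = pe_load.index(min(pe_load))     # least-loaded PE so far
        pe_rows[pe].append(r)
        pe_load[pe] += nnz_per_row[r]
    return pe_rows, pe_load

# Toy pruned weight matrix: 6 rows with uneven sparsity (12 non-zeros).
w = np.array([
    [1, 0, 2, 0],
    [0, 0, 0, 3],
    [4, 5, 6, 7],
    [0, 8, 0, 0],
    [9, 0, 1, 2],
    [0, 0, 3, 0],
])
rows, load = balance_rows(w, num_pes=2)
print(load)  # both PEs carry 6 non-zeros each
```

Without the rearrangement, a naive row split (first half vs. second half) would give one unit 7 non-zeros and the other 5, leaving the lighter unit idle at the end of each cycle.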


Abstract

The invention provides an FPGA accelerator for an LSTM neural network and an acceleration method thereof. The accelerator comprises a data distribution unit, an operation unit, a control unit and a storage unit; the operation unit comprises a sparse matrix-vector multiplication module, a nonlinear activation function module and an element-wise multiply-accumulate module. The control unit sends a control signal to the data distribution unit, which reads the input excitation values and the neural network weight parameters from the storage unit and feeds them to the operation unit for computation. The non-zero weight values are evenly distributed among the operation units, so that no operation resource idles and the operation performance of the whole network is improved. Meanwhile, the pruned neural network is stored in sparse form: the weight values of each column are stored in the same address space and encoded by row index, which improves operation performance and data throughput while preserving accuracy.
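The column-wise sparse storage and sparse matrix-vector multiplication that the abstract describes can be sketched as a CSC-like encoding: each column's non-zero values are kept together with their row indices. The helper names (`encode_columns`, `spmv`) and the NumPy model are illustrative assumptions, not the patent's hardware format.

```python
import numpy as np

def encode_columns(dense):
    """Store the pruned matrix column by column: for each column, keep
    its non-zero values in one address space together with their row
    indices (a CSC-like encoding)."""
    cols = []
    for j in range(dense.shape[1]):
        rows = np.nonzero(dense[:, j])[0]
        cols.append((rows, dense[rows, j]))
    return cols

def spmv(cols, x, num_rows):
    """Sparse matrix-vector multiply y = W @ x over the column encoding:
    each stored non-zero in column j is scaled by x[j] and accumulated
    into y at its row index."""
    y = np.zeros(num_rows)
    for j, (rows, vals) in enumerate(cols):
        y[rows] += vals * x[j]
    return y

# Toy pruned weight matrix and input excitation vector.
w = np.array([[0., 2., 0.],
              [1., 0., 3.],
              [0., 0., 4.]])
x = np.array([1., 1., 1.])
cols = encode_columns(w)
y = spmv(cols, x, num_rows=3)
print(y)  # matches the dense product w @ x
```

Storing only non-zeros with their row indices is what lets the accelerator skip multiplications by zero, which is where the throughput gain over dense computation comes from.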

Description

Technical Field

[0001] The invention relates to the field of computer hardware acceleration, and in particular to an FPGA accelerator oriented to LSTM neural networks and an acceleration method thereof.

Background

[0002] A recurrent neural network (RNN) is usually used for processing sequential signal data. Owing to its memory of historical data, an RNN can be applied in machine translation, speech recognition, user behavior learning and other fields. The LSTM network, a variant of the RNN, solves the problems of gradient explosion and gradient vanishing in RNN training. Because the LSTM network achieves higher accuracy than the plain RNN, academic research on LSTM accelerators has surged.

[0003] By its nature, the computation process of the LSTM network is computation-intensive. Executing the LSTM algorithm on a general-purpose hardware platform such as a CPU does no...

Claims


Application Information

Patent Type & Authority: Application (China)
IPC(8): G06N 3/063; G06N 3/04; G06N 3/08
CPC: G06N 3/063; G06N 3/082; G06N 3/044; G06N 3/045; Y02D 10/00
Inventors: 潘红兵, 查羿, 郭良蛟, 秦子迪, 苏岩, 朱杏伟
Owner NANJING UNIV