Hardware acceleration implementation system and method for RNN forward propagation model based on transverse systolic array

The invention applies systolic array and forward propagation technology to the hardware acceleration of an RNN forward propagation model. It addresses the non-configurability and poor flexibility of existing hardware and its inability to keep up with increasingly large computing networks.

Active Publication Date: 2020-02-21
NANJING UNIV

AI Technical Summary

Problems solved by technology

[0002] In the era of artificial intelligence, neural network algorithms such as convolutional neural networks (CNN) and deep neural networks (DNN) have been widely used in fields such as image recognition. This wide and frequent use has also exposed a problem: a traditional neural network processes each input in isolation, so the inputs at earlier and later moments are treated as completely unrelated.
[0003] The input, hidden, and output layers of an RNN are all fully connected and involve a large number of multiply-accumulate operations. Such a computationally intensive algorithm usually needs hardware acceleration. Traditional CPUs have few logic operation units and cannot keep up with increasingly large computing networks. GPUs have strong computing power, but their hardware structure cannot be flexibly configured. ASICs, as dedicated processing chips, offer low power consumption, small area, and high performance, but have poor flexibility and cannot be configured for specific needs.


Examples


Embodiment 1

[0048] As shown in figure 1, the hardware acceleration implementation system of the RNN forward propagation model based on the transverse systolic array includes a data control unit, a forward propagation calculation unit, and a data cache unit. The data control unit receives and generates control signals and controls the transmission and calculation of data between modules. The forward propagation calculation unit passes the data into the transverse systolic array, calculates the hidden-layer neurons and then the output-layer neurons in sequence, and completes the RNN forward model operation. The data cache unit provides storage space for the data participating in the calculation and for the calculation results. All data are 16-bit fixed-point numbers.
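The 16-bit fixed-point representation mentioned above can be illustrated with a minimal software sketch. The source does not state how the 16 bits are split between integer and fractional parts, so the Q8.8 layout, the saturation behavior, and all function names below are assumptions made only for illustration.

```python
# Assumption: Q8.8 layout (sign + 7 integer bits, 8 fractional bits).
# The source only says "16-bit fixed-point"; the split is hypothetical.
FRAC_BITS = 8
INT16_MIN, INT16_MAX = -(1 << 15), (1 << 15) - 1


def saturate(value: int) -> int:
    """Clamp an integer to the signed 16-bit range."""
    return max(INT16_MIN, min(INT16_MAX, value))


def to_fixed(x: float) -> int:
    """Quantize a float to a saturated Q8.8 integer."""
    return saturate(round(x * (1 << FRAC_BITS)))


def to_float(q: int) -> float:
    """Convert a Q8.8 integer back to a float."""
    return q / (1 << FRAC_BITS)


def fixed_dot(a, b) -> int:
    """Multiply-accumulate two Q8.8 vectors.

    Products are summed in a wide accumulator and rescaled once at
    the end, then saturated back to 16 bits.
    """
    acc = 0
    for x, y in zip(a, b):
        acc += x * y
    return saturate(acc >> FRAC_BITS)
```

For instance, `fixed_dot([to_fixed(1.5), to_fixed(-2.0)], [to_fixed(2.0), to_fixed(0.25)])` yields the Q8.8 encoding of 2.5. Accumulating in a wider register before rescaling is a common design choice for multiply-accumulate units, but whether the patented hardware does so is not stated in the source.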

[0049] The data transmitted by the forward propagation calculation unit to the transverse systolic array at least includes an input vector x, weight matrices U, W, V and ...

Embodiment 2

[0061] A method for implementing hardware acceleration of the RNN forward propagation model based on a transverse systolic array includes the following steps:

[0062] S1. Initialization step: configure the network parameters, which at least include the number of nodes in the input layer, hidden layer, and output layer, the time series length, and the number of batches to be processed.

[0063] S2. Hidden-layer neuron calculation step: the data is passed into the transverse systolic array, and the hidden-layer neurons are calculated on that array. The weights used in the calculation follow a blocked design: the weight matrix of the hidden-layer calculation is divided into blocks by rows, and h_t = Φ(U x_t + W h_{t-1} + b) is calculated, where x_t is the input vector at the current time step and h_{t-1} is the hidden-layer excitation value for the input vector x_{t-1} of the previous time step. The excitation value of the RNN network is generated through the matrix-vector multiplication and vector sum...
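The row-wise blocking of the hidden-layer calculation can be sketched in software as follows. This is a plain Python illustration, not a model of the hardware: the function names, the `block_rows` parameter, and the choice of tanh for Φ are assumptions, since the source does not name the activation function or the block size.

```python
import math


def matvec(rows, vec):
    """Multiply a matrix (given as a list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vec)) for row in rows]


def hidden_step_blocked(U, W, b, x_t, h_prev, block_rows=2, phi=math.tanh):
    """Compute h_t = phi(U x_t + W h_prev + b) one row block at a time.

    Each block of rows of U and W produces a slice of h_t on its own,
    so the blocks are independent and could be processed in parallel.
    """
    n_hidden = len(b)
    h_t = []
    for start in range(0, n_hidden, block_rows):
        end = min(start + block_rows, n_hidden)
        ux = matvec(U[start:end], x_t)      # block of U times x_t
        wh = matvec(W[start:end], h_prev)   # block of W times h_{t-1}
        for i in range(end - start):
            h_t.append(phi(ux[i] + wh[i] + b[start + i]))
    return h_t
```

Because every row of the weight matrices contributes to exactly one hidden neuron, the blocked result is identical to the unblocked one for any block size; only the scheduling of the work changes.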


Abstract

The invention discloses a hardware acceleration implementation system and method for an RNN forward propagation model based on a transverse systolic array. The method first configures the network parameters and initializes the data. The data is then passed into the transverse systolic array, where the weights used in the calculation follow a blocked design: the weight matrix of the hidden-layer calculation is partitioned by rows, and matrix-vector multiplication, vector summation, and the activation function are applied to calculate the hidden-layer neurons. From the resulting hidden-layer neurons, a further matrix-vector multiplication, vector summation, and activation function operation generate the RNN output-layer result. Finally, the output required by the RNN network is generated according to the configured time series length. In this method the hidden layer and the output layer are parallelized in multiple dimensions, which improves the pipelining of the calculation. The method also exploits the sharing of weight matrix parameters in the RNN, and the blocked design further increases the degree of parallelism. Flexibility, scalability, storage resource utilization, and speedup ratio are high, and calculation is greatly reduced.
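The overall flow summarized in the abstract (hidden layer, then output layer, repeated over the time series) can be sketched as below. The source gives the hidden-layer formula but the output-layer formula is not visible here, so the form o_t = Ψ(V h_t + c), the output bias `c`, the zero initial hidden state, and the tanh defaults are all assumptions for illustration; only U, W, V, x, and b are names taken from the text.

```python
import math


def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in M]


def vec_add(*vecs):
    """Element-wise sum of equally sized vectors."""
    return [sum(parts) for parts in zip(*vecs)]


def rnn_forward(U, W, V, b, c, xs, phi=math.tanh, psi=math.tanh):
    """Run the forward pass over the input sequence xs.

    Hidden layer:  h_t = phi(U x_t + W h_{t-1} + b)   (from the source)
    Output layer:  o_t = psi(V h_t + c)               (assumed form)
    The same U, W, V are reused at every time step (parameter sharing).
    """
    h = [0.0] * len(b)  # assumption: zero initial hidden state
    outputs = []
    for x_t in xs:
        h = [phi(s) for s in vec_add(matvec(U, x_t), matvec(W, h), b)]
        outputs.append([psi(s) for s in vec_add(matvec(V, h), c)])
    return outputs, h
```

The length of `xs` plays the role of the configured time series length: one output vector is produced per time step, and the weight matrices only need to be loaded once for the whole sequence.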

Description

Technical field

[0001] The invention belongs to the technical field of artificial intelligence hardware acceleration, and in particular relates to a system and method for hardware acceleration of an RNN forward propagation model based on a transverse systolic array.

Background

[0002] In the era of artificial intelligence, neural network algorithms such as convolutional neural networks (CNN) and deep neural networks (DNN) have been widely used in fields such as image recognition. This wide and frequent use has also exposed a problem: a traditional neural network processes each input in isolation, so the inputs at earlier and later moments are treated as completely unrelated. For example, when predicting the next word of a sentence, the preceding and following words in the sentence are not independent, and a CNN or DNN cannot handle this, while Recurren...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06N3/063, G06F15/80, G06N3/04
CPC: G06N3/063, G06F15/8046, G06N3/045, G06N3/044
Inventor: 傅玉祥, 高珺, 李丽, 宋文清, 黄延, 李伟
Owner: NANJING UNIV