Hardware Acceleration Implementation Method of RNN Forward Propagation Model Based on Transversal Systolic Array

A systolic-array-based forward propagation technique, applied to hardware acceleration systems for RNN forward propagation models, that addresses the problems of non-configurable hardware, poor flexibility, and insufficient capacity for increasingly large computing networks.

Active Publication Date: 2021-04-23
NANJING UNIV
6 Cites, 0 Cited by

AI Technical Summary

Problems solved by technology

[0002] In the era of artificial intelligence, neural network algorithms such as convolutional neural networks (CNN) and deep neural networks (DNN) have been widely used in fields such as image recognition. Such wide and frequent use has also exposed a problem: a traditional neural network processes each input in isolation, so the inputs at earlier and later moments are treated as completely unrelated.
[0003] The input layer, hidden layer, and output layer of an RNN all use fully connected computation, which involves a large number of multiply-accumulate operations. Such computationally intensive algorithms usually need hardware acceleration. However, a traditional CPU has relatively few logic operation units and cannot keep up with increasingly large networks. A GPU has strong computing power, but its hardware structure cannot be flexibly configured. An ASIC, as a dedicated processing chip, offers low power consumption, small area, and high performance, but it has poor flexibility and cannot be configured for specific needs.


Examples


Embodiment 1

[0048] The hardware acceleration system for the RNN forward propagation model based on the transverse systolic array, as shown in Figure 1, includes a data control unit, a forward propagation calculation unit, and a data cache unit. The data control unit receives and generates control signals and controls the transfer and computation of data between modules. The forward propagation calculation unit feeds data into the transverse systolic array, computes the hidden-layer neurons and then the output-layer neurons in sequence, and thereby completes the RNN forward model operation. The data cache unit provides storage for the data that participates in the computation and for the computed results. All data are 16-bit fixed-point numbers.
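
The 16-bit fixed-point representation mentioned above can be illustrated with a small software model. This is only a sketch: the source does not state how the 16 bits are split, so the Q8.8 split (`FRAC_BITS = 8`), the saturation behavior, and all function names here are assumptions made for illustration.

```python
# Assumption: Q8.8 layout (8 integer bits, 8 fractional bits).
# The source only says "16-bit fixed-point numbers".
FRAC_BITS = 8
INT16_MIN, INT16_MAX = -32768, 32767


def saturate(value: int) -> int:
    """Clamp an integer to the signed 16-bit range."""
    return max(INT16_MIN, min(INT16_MAX, value))


def to_fixed(x: float) -> int:
    """Quantize a real number to the assumed 16-bit fixed-point format."""
    return saturate(round(x * (1 << FRAC_BITS)))


def to_float(q: int) -> float:
    """Convert a fixed-point value back to a real number."""
    return q / (1 << FRAC_BITS)


def fixed_mac(acc: int, a: int, b: int) -> int:
    """Multiply-accumulate into a wide accumulator.

    The product of two Q8.8 values carries 2 * FRAC_BITS fractional bits,
    so the accumulator is kept wide until rescale() is called.
    """
    return acc + a * b


def rescale(acc: int) -> int:
    """Shift a wide accumulator back to the 16-bit format, with saturation."""
    return saturate(acc >> FRAC_BITS)
```

For example, `rescale(fixed_mac(0, to_fixed(1.5), to_fixed(2.0)))` yields the fixed-point encoding of 3.0. Keeping the accumulator wide and rescaling once at the end is a common way to limit rounding error in multiply-accumulate chains; whether the patented design does this is not stated in the source.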

[0049] The data transmitted by the forward propagation calculation unit to the transverse systolic array includes at least an input vector, a weight matrix, and a bias...

Embodiment 2

[0061] A method for hardware acceleration of the RNN forward propagation model based on a transverse systolic array includes the following steps:

[0062] S1. Initialization step: configure the network parameters, which include at least the number of nodes in the input layer, hidden layer, and output layer, the time series length, and the number of batches to be processed.
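
As a rough illustration of step S1, the configurable parameters listed above could be grouped into a single record. The class name `RnnConfig`, its field names, and the helper methods are hypothetical and do not come from the patent; the multiply-accumulate count assumes a standard fully connected RNN in which the hidden layer sees the current input and the previous hidden state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RnnConfig:
    """Hypothetical container for the parameters configured in step S1."""
    input_nodes: int
    hidden_nodes: int
    output_nodes: int
    seq_length: int   # time series length
    batches: int      # number of batches to process

    def __post_init__(self) -> None:
        # Reject non-positive or non-integer settings early.
        for name, value in vars(self).items():
            if not isinstance(value, int) or value <= 0:
                raise ValueError(f"{name} must be a positive integer, got {value!r}")

    def macs_per_step(self) -> int:
        """Multiply-accumulates for one time step of a fully connected RNN."""
        hidden = self.hidden_nodes * (self.input_nodes + self.hidden_nodes)
        output = self.output_nodes * self.hidden_nodes
        return hidden + output

    def total_macs(self) -> int:
        """Multiply-accumulates over the whole sequence and all batches."""
        return self.macs_per_step() * self.seq_length * self.batches
```

With `RnnConfig(4, 8, 2, 10, 1)`, one time step needs 8 × (4 + 8) + 2 × 8 = 112 multiply-accumulate operations, which gives a quick feel for how the workload scales with the configured layer sizes.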

[0063] S2. Hidden-layer neuron calculation step: the data is passed into the transverse systolic array, and the hidden-layer neurons are computed on that array. The weights used in the calculation are organized in blocks: the weight matrix for the hidden-layer calculation is divided into blocks by rows. The calculation uses the input vector at the current moment and the hidden-layer vector from the previous moment. The excitation value of the RNN network is generated through matrix-vector multiplication, vector summation, and activation functi...
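
The row-wise blocking described in S2 can be sketched in software as follows. The symbols are assumptions, because the formula in the source is not recoverable: `U` is the input-to-hidden weight matrix, `W` the hidden-to-hidden weight matrix, `b` the bias, and `tanh` stands in for the unspecified activation function. Each row block produces a disjoint slice of the hidden vector, so in hardware the blocks could be processed independently; this sketch simply loops over them.

```python
import math


def matvec(rows, vec):
    """Matrix-vector product for a list-of-rows matrix."""
    return [sum(w * v for w, v in zip(row, vec)) for row in rows]


def hidden_step_blocked(U, W, b, x_t, h_prev, block_rows, act=math.tanh):
    """One hidden-layer update with weight matrices split into row blocks.

    Assumed form: h_t = act(U @ x_t + W @ h_prev + b), evaluated one
    block of `block_rows` rows at a time.
    """
    h_t = []
    for start in range(0, len(b), block_rows):
        stop = start + block_rows
        ux = matvec(U[start:stop], x_t)       # matrix-vector multiplication
        wh = matvec(W[start:stop], h_prev)    # matrix-vector multiplication
        # Vector summation followed by the activation function.
        h_t.extend(act(p + q + bias)
                   for p, q, bias in zip(ux, wh, b[start:stop]))
    return h_t
```

Because every output element depends only on its own row of `U` and `W`, the result is the same for any block size, which is what makes blocking by rows a natural unit of parallelism.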


Abstract

The invention discloses a system and method for hardware acceleration of an RNN forward propagation model based on a transverse systolic array. First, the network parameters are configured and the data is initialized. Then, on the transverse systolic array, with the weights organized in blocks and the hidden-layer weight matrix divided into blocks by rows, the hidden-layer neurons are computed through matrix-vector multiplication, vector summation, and activation function operations. Next, from the resulting hidden-layer neurons, the results of the RNN output layer are generated through matrix-vector multiplication, vector summation, and activation function operations. Finally, the output required by the RNN network is produced according to the configured time series length. In this method, the hidden layer and the output layer are computed with multi-dimensional parallelism, which improves pipelining of the calculation. In addition, the block design exploits the parameter sharing of the weight matrices in the RNN network to further increase computational parallelism. The design offers high flexibility, strong scalability, high utilization of storage resources, and a high speed-up ratio, greatly reducing computation time.
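
The overall flow in the abstract (hidden layer, then output layer, repeated over the configured time series length) can be expressed as a plain floating-point reference model. This is a sketch under assumptions, not the patented hardware design: the weight names `U`, `W`, `V`, the biases `b` and `c`, the zero initial hidden state, and the choice of `tanh` and sigmoid activations are all illustrative, since the source does not specify them.

```python
import math


def matvec(rows, vec):
    """Matrix-vector product for a list-of-rows matrix."""
    return [sum(w * v for w, v in zip(row, vec)) for row in rows]


def sigmoid(z):
    """Logistic function, used here as an assumed output activation."""
    return 1.0 / (1.0 + math.exp(-z))


def rnn_forward(xs, U, W, b, V, c, hidden_act=math.tanh, output_act=sigmoid):
    """Run an RNN forward pass over a sequence of input vectors.

    xs holds one input vector per time step. The same weights are reused
    at every step, which is the parameter sharing the abstract refers to.
    """
    h = [0.0] * len(b)  # assumption: hidden state starts at zero
    outputs = []
    for x_t in xs:
        # Hidden layer: matrix-vector products, vector sum, activation.
        pre_h = [p + q + bias
                 for p, q, bias in zip(matvec(U, x_t), matvec(W, h), b)]
        h = [hidden_act(z) for z in pre_h]
        # Output layer: matrix-vector product, vector sum, activation.
        pre_y = [p + bias for p, bias in zip(matvec(V, h), c)]
        outputs.append([output_act(z) for z in pre_y])
    return outputs
```

A reference model of this kind is typically used to check a hardware implementation: the accelerator's fixed-point outputs can be compared step by step against `rnn_forward` within a quantization tolerance.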

Description

Technical field

[0001] The invention belongs to the technical field of artificial intelligence hardware acceleration, and in particular relates to a system and method for hardware acceleration of an RNN forward propagation model based on a transverse systolic array.

Background

[0002] In the era of artificial intelligence, neural network algorithms such as convolutional neural networks (CNN) and deep neural networks (DNN) have been widely used in fields such as image recognition. Such wide and frequent use has also exposed a problem: a traditional neural network processes each input in isolation, so the inputs at earlier and later moments are treated as completely unrelated. For example, when predicting the next word of a sentence, the preceding and following words are not independent, and a CNN or DNN cannot capture this dependence, while Recurren...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06N3/063, G06F15/80, G06N3/04
CPC: G06N3/063, G06F15/8046, G06N3/045, G06N3/044
Inventor: 傅玉祥, 高珺, 李丽, 宋文清, 黄延, 李伟
Owner NANJING UNIV