
51 results about "Systolic arrays" patented technology

Sparse neural network processor based on systolic array

The invention provides a sparse neural network processor based on a systolic array. The processor comprises a storage unit, a control unit, a sparse matrix operation array, a calculation unit, and a confluence array. The storage unit stores weights, gradients, features, and the instruction sequences used to schedule data streams. Under the control of the instruction sequence, the control unit fetches the data required by training and inference from the storage unit, converts it into a sparse-matrix operation format, and sends it to the sparse matrix operation array. The sparse matrix operation array comprises a plurality of processing units connected as a systolic array and completes the sparse matrix operations. The calculation unit performs element-wise operations such as nonlinear activation functions. The confluence array delivers the same data segment to different rows of the systolic array through internal data transfer, reducing storage overhead. The processor fully exploits the sparsity of the weights and features, improves the speed and power-consumption ratio of neural network training and inference, and offers advantages such as high concurrency and low bandwidth requirements.
Owner:BEIHANG UNIV
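
The following is a minimal, hedged sketch of the core idea in the abstract above: a weight-stationary systolic array whose processing elements skip multiply-accumulate work when their stored weight is zero. The class and function names (PE, systolic_matvec) are illustrative assumptions, not identifiers from the patent, and the model is functional rather than cycle-accurate.

```python
# Hypothetical sketch of a sparsity-aware, weight-stationary systolic array.
import numpy as np

class PE:
    """One processing element: holds a fixed weight, forwards inputs
    rightward and accumulates partial sums along the row."""
    def __init__(self, weight):
        self.weight = weight
        self.macs = 0  # count of multiply-accumulates actually performed

    def step(self, x_in, psum_in):
        # Exploit sparsity: a zero weight contributes nothing, so the
        # MAC is skipped entirely (saving cycles/energy in hardware).
        if self.weight != 0:
            psum_in = psum_in + self.weight * x_in
            self.macs += 1
        return x_in, psum_in  # x moves on, psum accumulates

def systolic_matvec(W, x):
    """Compute y = W @ x on a grid of PEs, one row of PEs per output."""
    rows, cols = W.shape
    grid = [[PE(W[r, c]) for c in range(cols)] for r in range(rows)]
    y = np.zeros(rows)
    for r in range(rows):
        psum = 0.0
        for c in range(cols):
            _, psum = grid[r][c].step(x[c], psum)
        y[r] = psum
    done = sum(pe.macs for row in grid for pe in row)
    print(f"MACs skipped due to sparsity: {rows * cols - done}/{rows * cols}")
    return y

W = np.array([[0., 2., 0.], [1., 0., 0.], [0., 0., 3.]])  # sparse weights
x = np.array([1., 2., 3.])
print(systolic_matvec(W, x))  # [4. 1. 9.], matching W @ x
```

In the sketch, the skipped-MAC counter stands in for the speed and power savings the abstract claims from exploiting weight sparsity.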

Hardware acceleration implementation system and method for an RNN forward propagation model based on a transverse systolic array

The invention discloses a hardware acceleration system and method for an RNN forward-propagation model based on a transverse systolic array. The method comprises: first, configuring the network parameters, initializing data, and loading the transverse systolic array, where the weights used in calculation follow a blocking design; partitioning the weight matrix of the hidden-layer calculation by rows; performing matrix-vector multiplication, vector summation, and activation-function operations to compute the hidden-layer neurons; from the obtained hidden-layer neurons, again performing matrix-vector multiplication, vector summation, and activation-function operations to generate the RNN output-layer result; and finally, generating the output required by the RNN network according to the time-sequence-length configuration. The method parallelizes the hidden layer and the output layer in multiple dimensions, improving the pipelining of the calculation; it also exploits the weight-matrix parameter sharing characteristic of the RNN and adopts the blocking design to further increase the degree of parallelism. It offers high flexibility, scalability, storage-resource utilization, and acceleration ratio, and greatly reduces computation.
Owner:NANJING UNIV
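
The sketch below illustrates the row-blocked RNN forward pass the abstract describes: the hidden-layer weight matrices are partitioned by rows, each block is computed as an independent matrix-vector product (the units a systolic array could process in parallel), and the output is produced over the configured sequence length. The block size, the tanh activation, and all names here are assumptions for illustration; the patent's actual hardware mapping is not reproduced.

```python
# Hedged functional model of a row-blocked RNN forward pass.
import numpy as np

def rnn_forward_blocked(W_xh, W_hh, W_hy, xs, block=2):
    """Forward-propagate a simple RNN, computing each hidden state in
    row blocks, mirroring a weight matrix partitioned by rows."""
    H = W_hh.shape[0]
    h = np.zeros(H)
    outputs = []
    for x in xs:  # iterate over the configured sequence length
        h_new = np.empty(H)
        for r in range(0, H, block):
            # Each row block is an independent matrix-vector product,
            # so blocks can be assigned to parallel processing elements.
            rows = slice(r, min(r + block, H))
            h_new[rows] = np.tanh(W_xh[rows] @ x + W_hh[rows] @ h)
        h = h_new
        outputs.append(W_hy @ h)  # output layer reuses the same scheme
    return outputs

rng = np.random.default_rng(0)
W_xh = rng.standard_normal((4, 3))   # input-to-hidden weights
W_hh = rng.standard_normal((4, 4))   # hidden-to-hidden weights
W_hy = rng.standard_normal((2, 4))   # hidden-to-output weights
xs = [rng.standard_normal(3) for _ in range(5)]  # sequence length 5
print(rnn_forward_blocked(W_xh, W_hh, W_hy, xs)[-1])
```

Because the recurrent weights are reused at every time step, loading each row block once and streaming inputs through it amortizes weight movement, which is the parameter-sharing property the abstract highlights.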