Excitation data block processing method for hardware accelerator and hardware accelerator

A technology relating to hardware accelerators and excitation data, applied in physical realization, neural architectures, biological neural network models, etc. It addresses the large cache required for excitation data, which affects system efficiency and power consumption, and achieves the effects of reducing the demand for storage resources and avoiding data dependence.

Active Publication Date: 2022-02-18
INST OF ELECTRONICS ENG CHINA ACAD OF ENG PHYSICS

AI Technical Summary

Problems solved by technology

During inference on convolutional network hardware, the excitation data requires a large cache. The limited on-chip SRAM of current AI hardware accelerators therefore cannot hold the whole computation on chip, and off-chip access to the excitation data cannot be reduced to zero, which greatly affects the efficiency and power consumption of the system.




Embodiment Construction

[0023] The present invention will be further described in detail below in conjunction with the embodiments and the accompanying drawings.

[0024] In the convolutional network block parallel processing method of the present invention, H represents the size of the data in the height direction, W represents the size in the width direction, N represents the size of the input channel, M represents the size of the output channel, and K represents the size of the convolution kernel. Pa and Pb represent the single-cycle parallelism in the H and W directions respectively, and Po represents the single-cycle parallelism on the output channel. Input and output data are collectively referred to as excitation data.

[0025] A block processing method for excitation data of a hardware accelerator, wherein the hardware accelerator performs block parallel processing on the excitation data of the convolutional neural network, and stores the block parallel processed excitation data in a ...



Abstract

The invention discloses an excitation data block processing method for a hardware accelerator, and the hardware accelerator itself. The method comprises: 1) when the AI hardware accelerator performs parallel computation of a convolutional network, dividing the convolutional layers into shallow convolutional layers and deep convolutional layers; 2) partitioning the shallow convolutional layers into blocks; 3) computing the blocked shallow convolutional layers; and 4) after all shallow convolutional layers have been computed, merging the results, using them as the input data of the deep convolutional layers, and computing the deep convolutional layers. The hardware accelerator includes a shared excitation data storage unit and a register array. It performs block parallel processing on the excitation data of the convolutional neural network and stores the blocked excitation data in the shared excitation data storage unit, so that off-chip access to the excitation data is zero. The invention thus eliminates off-chip access for excitation data and greatly reduces the on-chip SRAM required by the AI accelerator.
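The partition, compute, and merge steps can be illustrated with a minimal single-channel sketch in pure Python. It is not the patent's implementation: the function names are hypothetical, only one layer is shown, blocks are taken along the height direction only, and each block is assumed to read K-1 extra input rows (a halo) so that the merged result matches the unblocked convolution.

```python
def conv2d_valid(x, k):
    """Plain 'valid' 2-D convolution (cross-correlation) on lists of lists."""
    kh, kw = len(k), len(k[0])
    out_h, out_w = len(x) - kh + 1, len(x[0]) - kw + 1
    return [
        [
            sum(x[i + a][j + b] * k[a][b] for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def conv2d_blocked(x, k, block_h):
    """Compute the same result block by block along the height direction.

    Each block of block_h output rows needs block_h + K - 1 input rows
    (assumed halo handling; the source does not specify it).
    """
    kh = len(k)
    out_h = len(x) - kh + 1
    merged = []
    for r0 in range(0, out_h, block_h):
        r1 = min(r0 + block_h, out_h)
        tile = x[r0 : r1 + kh - 1]             # input rows for this block, incl. halo
        merged.extend(conv2d_valid(tile, k))   # merge block outputs in order
    return merged


# A 6x6 input and a 3x3 kernel; blocked and unblocked results agree.
x = [[i * 6 + j for j in range(6)] for i in range(6)]
k = [[1, 0, -1], [2, 0, -2], [1, 0, -1]]
same = conv2d_blocked(x, k, block_h=2) == conv2d_valid(x, k)
```

Only one tile of input rows needs to be resident while a block is processed, which is the sense in which blocking can lower the buffer requirement for a layer.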

Description

Technical field

[0001] The present invention relates to the technical field of digital signal processing, in particular to hardware accelerator design methods, and specifically to an excitation data block processing method for a hardware accelerator and the hardware accelerator.

Background

[0002] The fields of artificial intelligence and parallel computing involve a large number of multidimensional matrix operations. To achieve real-time signal processing, AI hardware accelerators usually integrate multiple parallel computing units that process data in parallel for fast inference. Achieving efficient parallel computation on such a hardware platform is a major difficulty in AI hardware accelerator design. The main difficulty lies in: the energy consumption of data reading, especially the reading of off-chip DRAM data, is much higher than that of addition, Energy consumption ...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06N3/063, G06N3/04
CPC: G06N3/063, G06N3/045, Y02D10/00
Inventor: 贺迅, 马建平, 刘友江, 曹韬
Owner: INST OF ELECTRONICS ENG CHINA ACAD OF ENG PHYSICS