Reconfigurable general standard convolution accelerator design method based on HLS
A design method for a general, standard, reconfigurable convolution accelerator, applicable to neural architectures, physical implementation, biological neural network models, etc. It addresses the insufficient parallelism of existing convolutional neural network designs, the difficulty of deploying them in practical embedded applications, and the low resource density of FPGA chips, with the effect of improving overall computing speed, reducing system power consumption, and saving design time.
Embodiment Construction
[0030] The present invention will be described in further detail below with reference to the accompanying drawings and specific embodiments. It should be understood that the specific embodiments described here serve only to explain the present invention, not to limit it.
[0031] Figure 1 shows the overall block diagram of the accelerator of the present invention. The loop tiling factors are q and p, and the degree of parallelism can be controlled by changing the sizes of q and p. For the weight data, this embodiment transfers only the weights required for the current calculation to on-chip BRAM, so the on-chip BRAM size is p*q*k*k, where k is the size of the convolution kernel. For the feature-map input, the "Line Buffer" in the library functions is used to build a cache structure of q groups of k lines each, and the input data is cached by shifting and insertion. Taking the 3*3 convolution calculation as an example, use the "...
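The line-buffer scheme described above can be sketched in plain C++. This is a minimal, hypothetical illustration, not the patented implementation: it handles a single feature map and a single kernel (i.e. p = q = 1), and the sizes K, W, and H are illustrative assumptions. K-1 row buffers hold the most recent input lines, and a K x K window shifts by one column per incoming pixel, matching the "shifting and insertion" caching described in [0031].

```cpp
#include <array>
#include <cstddef>

constexpr std::size_t K = 3;  // convolution kernel size (k in the text)
constexpr std::size_t W = 6;  // feature-map width (illustrative)
constexpr std::size_t H = 5;  // feature-map height (illustrative)

// Convolve one H x W feature map with one K x K kernel using K-1 line
// buffers: each incoming pixel is inserted, older rows shift up, and a
// K x K window slides across the buffered rows, emitting one output per
// fully populated window.
std::array<std::array<int, W - K + 1>, H - K + 1>
conv2d_linebuffer(const std::array<std::array<int, W>, H>& in,
                  const std::array<std::array<int, K>, K>& kernel) {
    std::array<std::array<int, W>, K - 1> lines{};  // K-1 buffered rows
    std::array<std::array<int, K>, K> window{};     // sliding K x K window
    std::array<std::array<int, W - K + 1>, H - K + 1> out{};

    for (std::size_t r = 0; r < H; ++r) {
        for (std::size_t c = 0; c < W; ++c) {
            const int px = in[r][c];
            // Shift the window left by one column.
            for (std::size_t i = 0; i < K; ++i)
                for (std::size_t j = 0; j + 1 < K; ++j)
                    window[i][j] = window[i][j + 1];
            // Insert the new column: buffered rows r-2..r-1, plus px.
            for (std::size_t i = 0; i + 1 < K; ++i)
                window[i][K - 1] = lines[i][c];
            window[K - 1][K - 1] = px;
            // Update the line buffers: rows move up, newest row gets px.
            for (std::size_t i = 0; i + 2 < K; ++i)
                lines[i][c] = lines[i + 1][c];
            lines[K - 2][c] = px;
            // Once K rows and K columns are buffered, emit one output.
            if (r >= K - 1 && c >= K - 1) {
                int acc = 0;
                for (std::size_t i = 0; i < K; ++i)
                    for (std::size_t j = 0; j < K; ++j)
                        acc += window[i][j] * kernel[i][j];
                out[r - K + 1][c - K + 1] = acc;
            }
        }
    }
    return out;
}
```

In an HLS flow, the window and line-buffer arrays would additionally be partitioned into registers and the inner loop pipelined so that one output is produced per clock cycle; extending the loops over p output channels and q input channels would then realize the parallelism control via the tiling factors.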