
FPGA-based neural network accelerator supporting channel separation convolution

A neural network and channel separation technology, applied in the field of neural network accelerator hardware structures. Channel separation convolution reduces the amount of calculation and the storage cost, but existing accelerators run it with degraded performance and poor utilization of computing resources in time and space; the invention reduces memory access and improves energy efficiency.

Active Publication Date: 2021-05-07
SOUTHEAST UNIV

AI Technical Summary

Benefits of technology

The technical effects are that, by interconnecting the computing units and caches through a flexible layered network-on-chip, the accelerator meets the differing data-bandwidth requirements of channel separation convolution (channel-by-channel and point-by-point convolution), traditional convolution, and full connection on a single piece of hardware. This keeps the utilization rate of the computing units relatively high, reduces memory access, and improves energy efficiency and inference speed.

Problems solved by technology

Convolutional neural networks demand large amounts of computation and memory, which limits their deployment in realistic applications of artificial intelligence. Channel separation convolution reduces the amount of calculation and the storage cost, but its data-bandwidth requirements differ from those of traditional convolution and full connection, so accelerators designed around a single convolution type suffer performance degradation and poor utilization of computing resources in time and space. The technical problem is therefore to design an accelerator architecture that supports all of these convolution types efficiently.




Embodiment Construction

[0038] The technical solutions and beneficial effects of the present invention will be described in detail below in conjunction with the accompanying drawings.

[0039] As shown in Figure 1, the working methods of the hardware structure of the convolutional neural network accelerator designed in the present invention are described in detail below, taking the four convolution types shown in Table 1 as examples.

[0040] The external control processor first writes relevant parameters, such as the size of the input feature values of this layer, the number of channels, the padding, the convolution calculation method (full connection, channel separation convolution, or traditional convolution), and the on-chip network data-flow configuration information, into the relevant accelerator registers through the configuration bus. Secondly, it controls the DMA to write the input feature values and the weight values into the corresponding input buffer sub-area and weight buffer sub-area in the ORMU units, respectively…
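As a minimal sketch of this configuration sequence, assuming a memory-mapped configuration bus: the base address, all register offsets and buffer offsets, and the dma_copy() helper below are hypothetical placeholders for illustration, not names taken from the patent.

```c
#include <stdint.h>

/* Hypothetical memory map; offsets and field names are illustrative only. */
#define ACC_BASE        0x40000000u
#define REG_FMAP_SIZE   (ACC_BASE + 0x00)  /* input feature-value size          */
#define REG_CHANNELS    (ACC_BASE + 0x04)  /* number of channels                */
#define REG_PADDING     (ACC_BASE + 0x08)  /* padding                           */
#define REG_CONV_MODE   (ACC_BASE + 0x0C)  /* convolution calculation method    */
#define REG_NOC_CFG     (ACC_BASE + 0x10)  /* on-chip-network data-flow config  */
#define REG_START       (ACC_BASE + 0x14)  /* start computation                 */

enum conv_mode {                 /* the three methods named in the text */
    CONV_FULLY_CONNECTED   = 0,
    CONV_CHANNEL_SEPARATED = 1,  /* channel-by-channel + point-by-point */
    CONV_TRADITIONAL       = 2,
};

static inline void reg_write(uintptr_t addr, uint32_t val)
{
    *(volatile uint32_t *)addr = val;  /* write over the configuration bus */
}

/* dma_copy() stands in for the DMA engine that moves data into the
 * per-ORMU input and weight buffer sub-areas. */
extern void dma_copy(uintptr_t dst, const void *src, uint32_t bytes);

void configure_layer(uint32_t fmap_size, uint32_t channels, uint32_t padding,
                     enum conv_mode mode, uint32_t noc_cfg,
                     const void *ifmap, uint32_t ifmap_bytes,
                     const void *weights, uint32_t weight_bytes)
{
    /* 1. The external control processor writes the layer parameters. */
    reg_write(REG_FMAP_SIZE, fmap_size);
    reg_write(REG_CHANNELS,  channels);
    reg_write(REG_PADDING,   padding);
    reg_write(REG_CONV_MODE, (uint32_t)mode);
    reg_write(REG_NOC_CFG,   noc_cfg);

    /* 2. DMA the input feature values and weights into the ORMU buffer
     *    sub-areas (the two destination offsets are placeholders). */
    dma_copy(ACC_BASE + 0x1000, ifmap,   ifmap_bytes);
    dma_copy(ACC_BASE + 0x2000, weights, weight_bytes);

    /* 3. Kick off computation; completion is signalled by the interrupt
     *    described for the ping-pong register file in the abstract. */
    reg_write(REG_START, 1);
}
```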



Abstract

The invention discloses an FPGA-based neural network accelerator supporting channel separation convolution. The accelerator comprises a ping-pong register file, an array of output feature value row mapping units (ORMUs) with configurable data flow, a functional unit module, a memory interface module, and the like. The ping-pong register file receives configuration and control words from the control processor, and an interrupt signal is sent out after calculation is completed. In the ORMU array, the ORMU units and the caches are interconnected through a configurable network-on-chip, so that the calculation of neural networks with different data-bandwidth requirements is supported. The functional unit module realizes pooling, ReLU activation, batch normalization (BN), and similar functions. The memory interface module transmits the weights and the feature values. By supporting the different data-bandwidth requirements of channel separation convolution (channel-by-channel convolution and point-by-point convolution), traditional convolution, and full connection through the flexible layered network-on-chip, the invention ensures a relatively high utilization rate of the calculation units and greatly improves the inference/calculation speed.
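For readers unfamiliar with the terminology, the following plain-C sketch contrasts traditional convolution with the two passes of channel separation convolution: channel-by-channel (depthwise) and point-by-point (1x1). It is a reference model of the arithmetic only, with assumed tensor layouts, not a model of the accelerator's data path.

```c
/* Reference model of the convolution styles named in the abstract.
 * Tensors are laid out as [channel][row][col]; kernel size K, stride 1,
 * no padding, so the spatial output is (H-K+1) x (W-K+1). */

/* Traditional convolution: every output channel reads all C_in inputs. */
void conv_traditional(const float *in, const float *w, float *out,
                      int C_in, int C_out, int H, int W, int K)
{
    int Ho = H - K + 1, Wo = W - K + 1;
    for (int co = 0; co < C_out; co++)
        for (int y = 0; y < Ho; y++)
            for (int x = 0; x < Wo; x++) {
                float acc = 0.0f;
                for (int ci = 0; ci < C_in; ci++)
                    for (int ky = 0; ky < K; ky++)
                        for (int kx = 0; kx < K; kx++)
                            acc += in[(ci*H + y+ky)*W + x+kx]
                                 * w[((co*C_in + ci)*K + ky)*K + kx];
                out[(co*Ho + y)*Wo + x] = acc;
            }
}

/* Channel-by-channel (depthwise) pass: each channel is filtered alone. */
void conv_depthwise(const float *in, const float *w, float *out,
                    int C, int H, int W, int K)
{
    int Ho = H - K + 1, Wo = W - K + 1;
    for (int c = 0; c < C; c++)
        for (int y = 0; y < Ho; y++)
            for (int x = 0; x < Wo; x++) {
                float acc = 0.0f;
                for (int ky = 0; ky < K; ky++)
                    for (int kx = 0; kx < K; kx++)
                        acc += in[(c*H + y+ky)*W + x+kx]
                             * w[(c*K + ky)*K + kx];
                out[(c*Ho + y)*Wo + x] = acc;
            }
}

/* Point-by-point (1x1) pass: mixes all channels at each pixel. */
void conv_pointwise(const float *in, const float *w, float *out,
                    int C_in, int C_out, int H, int W)
{
    for (int co = 0; co < C_out; co++)
        for (int y = 0; y < H; y++)
            for (int x = 0; x < W; x++) {
                float acc = 0.0f;
                for (int ci = 0; ci < C_in; ci++)
                    acc += in[(ci*H + y)*W + x] * w[co*C_in + ci];
                out[(co*H + y)*W + x] = acc;
            }
}
```

The depthwise pass touches each input channel once with a small K x K kernel, while the pointwise pass streams all channels per pixel; their ratios of computation to data movement therefore differ sharply from the traditional case, which is why a configurable network-on-chip is needed to keep the computing-unit array busy across all three modes.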



Application Information

Owner: SOUTHEAST UNIV