Parallel rapid FIR filter algorithm-based convolutional neural network hardware accelerator

A technology relating to convolutional neural networks and hardware accelerators, applied to biological neural network models and their physical realization, which can solve the problems of low parallelism and the high computational complexity of deep convolutional neural networks.

Active Publication Date: 2018-01-26
南京风兴科技有限公司

AI Technical Summary

Problems solved by technology

[0004] The present invention aims to solve technical problems such as the high computational complexity and low parallelism of deep convolutional neural networks.



Examples


Embodiment Construction

[0023] Embodiments of the invention are described in detail below, and examples of them are illustrated in the accompanying drawings. Throughout the description, the same name is used to refer to modules with the same or similar functionality. The invention is described below with reference to these illustrative embodiments and the accompanying drawings.

[0024] Figure 1 shows the overall structure diagram of the present invention. The convolutional neural network hardware accelerator based on the parallel fast FIR filter algorithm includes:

[0025] 1. P multifunctional processors (with the convolution kernel size set to k×k), used to receive input pixel neurons, perform operations such as bit-width conversion, convolution, adder-tree accumulation, linear correction (ReLU), and max pooling, and write the results to the corresponding storage unit (a rough software sketch of this processing chain is given after this list).

[0026] 2. A pixel memory, used to store portions of the input images and feature maps.

[0027] 3. A weight cache...
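
As a rough illustration of the processing chain named in item 1, the following NumPy sketch strings the stages together for a single input channel. The function names, the signed 8-bit quantization step, the 3×3 kernel and the 2×2 pooling window are illustrative assumptions, not details taken from the patent.

```python
import numpy as np

def quantize(x, bits=8):
    """Bit-width conversion: round and clip to a signed fixed-point range (assumed step)."""
    lo, hi = -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    return np.clip(np.round(x), lo, hi).astype(np.int32)

def convolve2d(pixels, kernel):
    """Direct k x k convolution (valid padding); the accelerator replaces this with fast convolution units."""
    k = kernel.shape[0]
    h, w = pixels.shape[0] - k + 1, pixels.shape[1] - k + 1
    out = np.zeros((h, w), dtype=np.int64)
    for i in range(h):
        for j in range(w):
            out[i, j] = np.sum(pixels[i:i + k, j:j + k] * kernel)
    return out

def adder_tree(partial_sums):
    """Adder tree: accumulate partial sums coming from several input channels."""
    return np.sum(np.stack(partial_sums), axis=0)

def relu(x):
    """Linear correction (ReLU)."""
    return np.maximum(x, 0)

def max_pool(x, size=2):
    """Non-overlapping max pooling over an assumed size x size window."""
    h, w = x.shape[0] // size, x.shape[1] // size
    return x[:h * size, :w * size].reshape(h, size, w, size).max(axis=(1, 3))

# Toy run: one 8x8 input channel convolved with one 3x3 kernel.
pixels = quantize(np.random.randn(8, 8) * 16)
kernel = quantize(np.random.randn(3, 3) * 4)
feature = max_pool(relu(adder_tree([convolve2d(pixels, kernel)])))
print(feature.shape)  # (3, 3)
```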



Abstract

The invention discloses a parallel rapid FIR filter algorithm-based convolutional neural network hardware accelerator. The accelerator mainly consists of calculation logic and a storage unit, wherein the calculation logic mainly comprises a multifunctional processor, a rapid convolution unit and a convolutional calculation array formed by the rapid convolution units; and the storage unit comprises a pixel memory, a weight cache, an additional memory and an off-chip dynamic memory. The accelerator can process convolutional neural network calculations at three levels of parallelism: row (line) parallelism, in-layer parallelism and inter-layer parallelism. It is therefore suitable for applications requiring multiple degrees of parallelism, so that convolutional neural network calculations can be processed efficiently and a considerable data throughput rate can be achieved.
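
To make the three levels of parallelism named in the abstract concrete, the sketch below writes one convolutional layer as a plain loop nest and marks, in comments, the dimension each level would exploit. This is an interpretive illustration only, not the patent's actual scheduling, and the layer sizes are arbitrary.

```python
import numpy as np

def conv_layer(inputs, weights):
    """Reference loop nest for one conv layer (valid padding, stride 1).

    inputs:  (in_channels, H, W)
    weights: (out_channels, in_channels, k, k)
    """
    out_ch, _, k, _ = weights.shape
    h, w = inputs.shape[1] - k + 1, inputs.shape[2] - k + 1
    out = np.zeros((out_ch, h, w))
    for oc in range(out_ch):      # in-layer parallelism: several output feature maps at once
        for row in range(h):      # row parallelism: fast convolution units emit several rows per pass
            for col in range(w):
                out[oc, row, col] = np.sum(
                    inputs[:, row:row + k, col:col + k] * weights[oc])
    return out

# Inter-layer parallelism: successive layers form a pipeline over the feature maps.
x = np.random.randn(3, 16, 16)
w1 = np.random.randn(8, 3, 3, 3)
w2 = np.random.randn(4, 8, 3, 3)
y = conv_layer(conv_layer(x, w1), w2)
print(y.shape)  # (4, 12, 12)
```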

Description

Technical field

[0001] The invention relates to the field of computer and electronic information technology, and in particular to a convolutional neural network hardware accelerator based on a parallel fast FIR filter algorithm.

Background technique

[0002] Convolutional neural networks (CNNs) are among the most popular deep learning algorithms today. Owing to their excellent performance in image and sound recognition, they have been widely adopted in academia and industry. Running recognition systems based on convolutional neural network algorithms on local processors holds great promise and is a direction of future development. However, because of the high computational complexity of deep convolutional neural networks, a general dedicated graphics processor can perform training and recognition quickly but cannot reduce the computational complexity itself. A convolutional neural network mainly consists of convolutional layers, pooling layers, and fully connected layers to...
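
For background on the technique named in the title: parallel fast FIR algorithms (FFA) compute several FIR outputs per step while sharing sub-filters, trading a few extra additions for fewer multiplications. The sketch below is the textbook 2-parallel FFA, which uses three half-length sub-filters instead of four, checked against direct convolution; it is offered as background only, and the patent's own fast convolution unit may decompose the computation differently.

```python
import numpy as np

def fir_2parallel_ffa(x, h):
    """2-parallel fast FIR algorithm (FFA): 3 half-length sub-filters instead of 4.

    Classic identity:
        Y0 = H0*X0 + z^{-1} (H1*X1)
        Y1 = (H0+H1)*(X0+X1) - H0*X0 - H1*X1
    where H0/H1 and X0/X1 are the even/odd polyphase components of h and x.
    """
    n_out = len(x) + len(h) - 1            # length of the direct full convolution
    if len(x) % 2:                          # pad to even length so the two phases align
        x = np.append(x, 0.0)
    if len(h) % 2:
        h = np.append(h, 0.0)
    x0, x1 = x[0::2], x[1::2]
    h0, h1 = h[0::2], h[1::2]

    a = np.convolve(h0, x0)                 # H0*X0
    b = np.convolve(h1, x1)                 # H1*X1
    c = np.convolve(h0 + h1, x0 + x1)       # (H0+H1)*(X0+X1), the shared sub-filter

    y0 = np.concatenate((a, [0.0]))
    y0[1:] += b                             # Y0 = A + z^{-1} B
    y1 = c - a - b                          # Y1 = C - A - B (cross terms without a 4th filter)

    y = np.zeros(2 * len(y0))
    y[0::2] = y0                            # even output phase
    y[1::2][:len(y1)] = y1                  # odd output phase
    return y[:n_out]

# Check against direct convolution on random data.
rng = np.random.default_rng(0)
x, h = rng.standard_normal(64), rng.standard_normal(8)
assert np.allclose(fir_2parallel_ffa(x, h), np.convolve(x, h))
print("2-parallel FFA matches direct FIR filtering")
```

In the 2-parallel case the sub-filter count drops from four to three, i.e. roughly 25% fewer multiplications, and larger savings follow from cascading the decomposition to higher degrees of parallelism.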

Claims


Application Information

IPC(8): G06N3/06; G06N3/063
Inventors: 王中风, 王稷琛, 林军
Owner 南京风兴科技有限公司