A convolutional neural network accelerator based on PSoC

This convolutional neural network accelerator technology, applied in the field of convolutional neural network accelerators, addresses the large calculation volume and large bandwidth requirements involved in neural networks. Its effects are reduced bandwidth requirements and improved parallel processing efficiency.

Active Publication Date: 2018-12-28
GUANGDONG UNIV OF TECH +1

AI Technical Summary

Problems solved by technology

All multiply-add calculations in the multiplication and addition calculation module are performed in parallel, and the module supports convolution with different convolution kernel sizes, which addresses the large calculation volume and large bandwidth requirements involved in the neural network.
The software part implements the softmax classifier, the non-maximum suppression algorithm, and the image processing algorithms that cannot be realized in hardware logic, and handles the configuration of convolutional neural networks with different network structures.



Examples


Embodiment 1

[0023] To handle the large amount of calculation in a convolutional neural network, improve the efficiency of parallel processing, and reduce the bandwidth requirement, the present invention provides a PSoC-based convolutional neural network accelerator 100, as shown in Figure 1, including: off-chip memory 101, CPU 102, feature map input memory 103, feature map output memory 104, bias memory 105, weight memory 106, direct memory access (DMA) 107, and the same number of computing units 108 as neurons.

[0024] Under the control of CPU 102, the direct memory access (DMA) 107 reads data from off-chip memory 101 and transfers it to feature map input memory 103, bias memory 105, and weight memory 106, or writes data from feature map output memory 104 back to off-chip memory 101. The CPU 102 controls the storage locations of the input feature map, bias, weight, and output feature map in the off-chip memory, as well as the parameter transmission of the multi-layer convolutional neural ...
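The data movement described in paragraph [0024] can be pictured with a small software model. This is only a sketch under stated assumptions: the class name `Accelerator`, the methods `dma_load` and `dma_store`, the flat-list memories, and the address layout are all hypothetical and are not taken from the patent; only the memory names and the direction of each transfer come from the text.

```python
from dataclasses import dataclass, field


@dataclass
class Accelerator:
    """Toy model of the memories named in the text (sizes and layout are assumptions)."""
    off_chip: list = field(default_factory=list)      # off-chip memory 101
    feature_in: list = field(default_factory=list)    # feature map input memory 103
    feature_out: list = field(default_factory=list)   # feature map output memory 104
    bias: list = field(default_factory=list)          # bias memory 105
    weight: list = field(default_factory=list)        # weight memory 106

    def dma_load(self, target: str, start: int, length: int) -> None:
        """Copy off_chip[start:start+length] into one of the on-chip input buffers."""
        if target not in ("feature_in", "bias", "weight"):
            raise ValueError(f"DMA cannot load into {target!r}")
        setattr(self, target, self.off_chip[start:start + length])

    def dma_store(self, start: int) -> None:
        """Write the output feature map buffer back to off-chip memory at `start`."""
        end = start + len(self.feature_out)
        if end > len(self.off_chip):
            self.off_chip.extend([0] * (end - len(self.off_chip)))
        self.off_chip[start:end] = self.feature_out


# The "CPU" decides where each tensor lives in off-chip memory (hypothetical layout).
acc = Accelerator(off_chip=[1, 2, 3, 4, 10, 20, 5])
acc.dma_load("feature_in", start=0, length=4)
acc.dma_load("weight", start=4, length=2)
acc.dma_load("bias", start=6, length=1)

# Stand-in for the computing units: one multiply-add per input value.
acc.feature_out = [x * acc.weight[0] + acc.bias[0] for x in acc.feature_in]
acc.dma_store(start=7)
```

In this model the CPU role is reduced to choosing the `start` and `length` arguments, which mirrors the statement that the CPU controls where each tensor is stored off-chip.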

Embodiment 2

[0031] Correspondingly, with reference to Figure 6, the present invention further describes the method flow of convolutional neural network calculation based on the PSoC-based convolutional neural network accelerator.

[0032] The CPU is programmed in embedded software. The deep convolutional neural network is constructed in software, and the relevant configuration is sent over the bus as command values written to the control registers.

[0033] Examples of configuration commands are shown in the following table:

[0034] The input of the first layer is the x1 input feature map data and the x3 weight data; the calculation results are passed to the maximum pooling module and the activation function module to obtain the x2 output feature map data.
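The first-layer data flow in paragraph [0034] (convolution, then maximum pooling, then activation) can be sketched in plain Python. The names `x1`, `x3`, and `x2` follow the text; everything else is an assumption made for illustration: the 5x5 input values, the 2x2 kernel, stride 1 with no padding, a 2x2 pooling window, and ReLU as the activation function, which the source does not specify.

```python
def conv2d(fmap, kernel, bias=0.0):
    """Stride-1, no-padding 2D convolution (CNN-style, kernel not flipped)."""
    k = len(kernel)
    rows, cols = len(fmap) - k + 1, len(fmap[0]) - k + 1
    return [
        [
            bias + sum(fmap[r + i][c + j] * kernel[i][j]
                       for i in range(k) for j in range(k))
            for c in range(cols)
        ]
        for r in range(rows)
    ]


def max_pool(fmap, size=2):
    """Non-overlapping max pooling with a size x size window."""
    return [
        [
            max(fmap[r + i][c + j] for i in range(size) for j in range(size))
            for c in range(0, len(fmap[0]) - size + 1, size)
        ]
        for r in range(0, len(fmap) - size + 1, size)
    ]


def relu(fmap):
    """ReLU activation (an assumed choice; the source names no specific function)."""
    return [[max(0.0, v) for v in row] for row in fmap]


# x1: input feature map, x3: weights (both made-up example values).
x1 = [
    [1, 2, 0, 1, 3],
    [0, 1, 2, 1, 0],
    [2, 0, 1, 0, 1],
    [1, 1, 0, 2, 2],
    [0, 3, 1, 1, 0],
]
x3 = [[1, 0], [0, -1]]

# x2: output feature map after convolution -> max pooling -> activation.
x2 = relu(max_pool(conv2d(x1, x3)))
print(x2)  # [[0.0, 2.0], [1.0, 2.0]]
```

Because `conv2d` takes the kernel size from `x3`, the same function handles other kernel sizes, in the spirit of the stated support for different convolution kernel sizes.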

[0035]

[0036] The output feature map of the convolutional layer is stored in the off-chip memory in M layers, where M takes the values 1, 3, 5, 7, .... The output feature map ...


Abstract

This patent discloses a convolutional neural network accelerator based on a PSoC device, including an off-chip memory, a CPU, a feature map input memory, a feature map output memory, a bias memory, a weight memory, a direct memory access (DMA) unit, and the same number of calculation units as neurons. Each calculation unit comprises a first-in first-out queue, a state machine, a data selector, an average pooling module, a maximum pooling module, a multiplication and addition calculation module, and an activation function module. The calculations in the multiplication and addition calculation module are executed in parallel, and the accelerator can be used for convolutional neural network systems of various architectures. The invention makes full use of the programmable part of the PSoC (Programmable System on Chip) device to implement the convolutional neural network calculations that have a large calculation amount and high parallelism, and uses the CPU to implement the serial algorithms and the state control.
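To make the composition of a calculation unit more concrete, the following is a minimal software sketch, not the patented hardware design. The class `CalcUnit`, its methods, the one-dimensional non-overlapping windows, the pooling group size, and the use of ReLU are all assumptions for illustration. The `pool` argument stands in for the data selector that chooses between the average and maximum pooling modules, and the loop in `run` stands in for the state machine.

```python
from collections import deque


class CalcUnit:
    """Toy calculation unit: FIFO input, multiply-add, selectable pooling, activation."""

    def __init__(self, weights, bias=0.0, pool="max"):
        if pool not in ("max", "avg"):
            raise ValueError("pool must be 'max' or 'avg'")
        self.weights = list(weights)
        self.bias = bias
        self.pool = pool          # plays the role of the data selector setting
        self.fifo = deque()       # first-in first-out queue of input values

    def push(self, values):
        """Queue input values in arrival order."""
        self.fifo.extend(values)

    def mac(self):
        """Pop one window from the FIFO and return its multiply-add result."""
        window = [self.fifo.popleft() for _ in self.weights]
        # Hardware would form these products in parallel; software sums them in turn.
        return self.bias + sum(w * x for w, x in zip(self.weights, window))

    def run(self, pool_size=2):
        """Drain the FIFO, pool groups of multiply-add results, then apply ReLU."""
        results = []
        while len(self.fifo) >= len(self.weights):
            results.append(self.mac())
        pooled = []
        for i in range(0, len(results) - pool_size + 1, pool_size):
            group = results[i:i + pool_size]
            value = max(group) if self.pool == "max" else sum(group) / pool_size
            pooled.append(max(0.0, value))
        return pooled


unit = CalcUnit(weights=[1, -1], pool="max")
unit.push([5, 2, 1, 4, 3, 3, 9, 1])
max_out = unit.run()

unit = CalcUnit(weights=[1, -1], pool="avg")
unit.push([5, 2, 1, 4, 3, 3, 9, 1])
avg_out = unit.run()

print(max_out, avg_out)  # [3.0, 8.0] [0.0, 4.0]
```

Running the same input through both pooling settings shows how a single selector value changes the output path without touching the multiply-add stage.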

Description

Technical field

[0001] The invention relates to convolutional neural network structure technology, in particular to a PSoC-based convolutional neural network accelerator.

Background technique

[0002] With local weight sharing, the convolutional neural network has unique advantages in image processing. Its layout is closer to that of an actual biological neural network. Shared weights reduce both the complexity and the computational load of the neural network. At present, convolutional neural networks are widely used in video surveillance, machine vision, pattern recognition, image search, and other fields.

[0003] However, a hardware implementation of a convolutional network requires a lot of hardware resources and suffers from low bandwidth utilization and low data reuse. Convolutional neural networks need to support convolution operations of different sizes, pooling operations, and fully connected operations. At the same time, many convolutional neural network app...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06N3/04, G06N3/063
CPC: G06N3/063, G06N3/045
Inventor: 熊晓明, 李子聪, 曾宇航, 胡湘宏
Owner: GUANGDONG UNIV OF TECH