Lightweight neural network hardware accelerator based on depthwise separable convolution

A convolutional neural network hardware accelerator technology, applied to biological neural network models, neural architectures, neural learning methods, and the like. It addresses problems such as low systolic-array utilization and high cost and energy loss, and achieves reduced high-power off-chip memory access, resource savings, and improved processing performance.

Active Publication Date: 2021-06-25
CHONGQING UNIV

AI Technical Summary

Problems solved by technology

Existing convolutional neural network hardware accelerators such as Eyeriss and the Google TPU suit most known neural network models and offer strong versatility, but for network m…


Embodiment Construction

[0028] The present invention is further described below in conjunction with the accompanying drawings.

[0029] In this example, as shown in Figure 3 and Figure 4, a lightweight neural network hardware accelerator based on depthwise separable convolution includes a parallel array of A-way K×K channel convolution processing units, a parallel array of A-way 1×1 point convolution processing units, and an on-chip memory for buffering the convolutional neural network and the input and output feature maps. The convolutional neural network is a lightweight network obtained by compressing MobileNet with a quantization-aware training method.
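For orientation, here is a minimal NumPy sketch of the computation these two arrays implement together: a K×K depthwise (channel) convolution that filters each channel independently, followed by a 1×1 point convolution that mixes channels per pixel. The function names, shapes, and the no-padding choice are illustrative assumptions, not details from the patent.

```python
import numpy as np

def depthwise_conv(x, w):
    """K x K channel convolution: each channel is filtered by its own kernel.
    x: (C, H, W) input feature map, w: (C, K, K) per-channel kernels."""
    C, H, W = x.shape
    K = w.shape[1]
    y = np.zeros((C, H - K + 1, W - K + 1), dtype=np.float32)
    for c in range(C):
        for i in range(H - K + 1):
            for j in range(W - K + 1):
                y[c, i, j] = np.sum(x[c, i:i+K, j:j+K] * w[c])
    return y

def pointwise_conv(x, w):
    """1 x 1 point convolution: a per-pixel linear mix across channels.
    x: (C, H, W), w: (M, C) -> (M, H, W)."""
    C, H, W = x.shape
    return (w @ x.reshape(C, H * W)).reshape(-1, H, W)

x  = np.random.randn(16, 8, 8).astype(np.float32)   # 16-channel input
dw = np.random.randn(16, 3, 3).astype(np.float32)   # K = 3 channel kernels
pw = np.random.randn(32, 16).astype(np.float32)     # 32 output channels
out = pointwise_conv(depthwise_conv(x, dw), pw)     # shape (32, 6, 6)
```

Splitting a standard convolution this way cuts the multiplications per output pixel from roughly K·K·C·M to K·K·C + C·M, which is why dedicating one processing-unit array to each stage is attractive.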

[0030] As shown in Figure 4, the A-way K×K channel convolution processing unit parallel array and the multi-way 1×1 point convolution processing unit parallel array are deployed as a pixel-level pipeline.
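A pixel-level pipeline here means that each output pixel of the K×K stage feeds the 1×1 stage as soon as it is produced, so the intermediate feature map never has to be written to off-chip memory. A minimal sketch of that dataflow, using a Python generator as a stand-in for the hardware handshake (an illustrative reading of the figure, not the patent's circuit):

```python
import numpy as np

def depthwise_pixels(x, w):
    """Yield one depthwise-stage output pixel (all C channel values) at a
    time, mimicking the K x K array producing results pixel by pixel."""
    C, H, W = x.shape
    K = w.shape[1]
    for i in range(H - K + 1):
        for j in range(W - K + 1):
            yield i, j, np.array([np.sum(x[c, i:i+K, j:j+K] * w[c])
                                  for c in range(C)], dtype=np.float32)

def pipelined_separable_conv(x, dw, pw):
    """Consume each depthwise pixel immediately in the 1 x 1 stage, so no
    full intermediate feature map is ever buffered."""
    C, H, W = x.shape
    K = dw.shape[1]
    out = np.zeros((pw.shape[0], H - K + 1, W - K + 1), dtype=np.float32)
    for i, j, pixel in depthwise_pixels(x, dw):
        out[:, i, j] = pw @ pixel     # 1 x 1 point convolution of one pixel
    return out
```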

[0031] As shown in Figure 5, each K×K channel convolution processing unit in the A-way K×K channel…
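The paragraph is cut off here, but the Abstract states that each K×K channel convolution processing unit comprises a multiplier, an adder, and an activation function calculation unit. A behavioral sketch of one unit under that description follows; the serial multiply-accumulate schedule and the choice of ReLU (MobileNet's usual activation) are assumptions:

```python
import numpy as np

def channel_conv_pe(window, kernel, bias=0.0):
    """One K x K channel convolution processing unit, modeled behaviorally:
    the multiplier and adder accumulate K*K products, then the activation
    function calculation unit (assumed ReLU) fires once per output pixel."""
    acc = bias
    for a, b in zip(window.ravel(), kernel.ravel()):
        acc += a * b                  # multiplier output feeding the adder
    return max(acc, 0.0)              # activation function calculation unit

win = np.arange(9, dtype=np.float32).reshape(3, 3)
ker = np.ones((3, 3), dtype=np.float32)
print(channel_conv_pe(win, ker))      # 36.0
```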



Abstract

The invention discloses a lightweight neural network hardware accelerator based on depthwise separable convolution. The accelerator comprises a parallel array of A-way K×K channel convolution processing units, a parallel array of A-way 1×1 point convolution processing units, and an on-chip memory for buffering the convolutional neural network and the input and output feature maps. The convolutional neural network is a lightweight network obtained by compressing MobileNet with a quantization-aware training method. The A-way K×K channel convolution processing unit array and the multi-way 1×1 point convolution processing unit array are deployed in a pixel-level pipeline. Each K×K channel convolution processing unit comprises a multiplier, an adder, and an activation function calculation unit; each 1×1 point convolution processing unit comprises a multiplexer, a two-stage adder tree, and an accumulator. The invention eliminates the high-energy off-chip memory accesses generated during inference by prior-art accelerators, saves resources, and improves processing performance.
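The Abstract's 1×1 unit (multiplexer, two-stage adder tree, accumulator) suggests that channel products are reduced in a small tree and partial sums are accumulated over several cycles. The sketch below follows that reading; a two-stage tree reduces four products per cycle, and the 4-lane width plus the cycle schedule are illustrative assumptions:

```python
import numpy as np

def two_stage_adder_tree(p):
    """Two-stage adder tree over four products: stage 1 reduces 4 -> 2,
    stage 2 reduces 2 -> 1."""
    return (p[0] + p[1]) + (p[2] + p[3])

def pointwise_pe(acts, weights):
    """One 1 x 1 point convolution processing unit, modeled behaviorally.
    Each cycle the multiplexer selects the next four channels, their
    products pass through the adder tree, and the accumulator collects
    partial sums until all C channels are covered (C divisible by 4)."""
    acc = 0.0                          # accumulator register
    for base in range(0, acts.shape[0], 4):
        prods = acts[base:base+4] * weights[base:base+4]
        acc += two_stage_adder_tree(prods)
    return acc

a = np.arange(16, dtype=np.float32)
w = np.ones(16, dtype=np.float32)
print(pointwise_pe(a, w))              # 120.0 (= 0 + 1 + ... + 15)
```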

Description

Technical Field

[0001] The invention belongs to the technical field of neural network hardware accelerators, and in particular relates to a lightweight neural network hardware accelerator based on depthwise separable convolution.

Background Art

[0002] Convolutional neural networks have achieved great success in image classification, medical image segmentation, and object tracking. Typical convolutional neural networks (such as VGG16 and GoogLeNet) are computationally intensive and rely on costly, energy-inefficient graphics processing units or remote computing centers, which makes them difficult to deploy on portable or mobile real-time systems with tight energy and cost budgets. Previous research has focused on two directions to solve this problem. One is to optimize the convolutional neural network at the algorithm level to reduce computation and storage access (for example, topology optimization and model compression). The other is to design VL…
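The embodiment states that MobileNet is compressed with quantization-aware training, one of the model-compression techniques mentioned above. For reference, a minimal sketch of the symmetric fake-quantization step such training typically inserts into the forward pass; the 8-bit width and the symmetric per-tensor scheme are assumptions, as the patent text shown here gives no details:

```python
import numpy as np

def fake_quant(x, bits=8):
    """QAT-style 'fake quantization': round values onto a b-bit grid in the
    forward pass so the trained weights survive fixed-point inference."""
    qmax = 2 ** (bits - 1) - 1                       # 127 for 8 bits
    amax = np.max(np.abs(x))
    scale = amax / qmax if amax > 0 else 1.0
    q = np.clip(np.round(x / scale), -qmax - 1, qmax)
    return q * scale                                 # dequantized value

w = np.random.randn(16, 3, 3).astype(np.float32)
print(np.max(np.abs(w - fake_quant(w))))             # error <= scale / 2
```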

Claims


Application Information

IPC(8): G06N3/063; G06N3/04; G06N3/08; G06F17/15
CPC: G06N3/063; G06N3/082; G06F17/15; G06N3/045
Inventors: 林英撑, 李睿, 石匆, 何伟, 张玲, 杨晶
Owner: CHONGQING UNIV