Lightweight neural network hardware accelerator based on depthwise separable convolution

A convolutional neural network hardware accelerator technology, applied to biological neural network models, neural architectures, neural learning methods, and the like. It addresses problems such as low systolic-array utilization and high cost and energy loss, and achieves reduced high-power off-chip memory access, resource savings, and improved processing performance.

Active Publication Date: 2021-06-25
CHONGQING UNIV

AI Technical Summary

Problems solved by technology

Existing convolutional neural network hardware accelerators such as Eyeriss and the Google TPU suit most known neural network models and offer strong versatility, but for network m…


Embodiment Construction

[0028] The present invention is further described below in conjunction with the accompanying drawings.

[0029] In this example, as shown in Figure 3 and Figure 4, a lightweight neural network hardware accelerator based on depthwise separable convolution includes a parallel array of A-way K×K channel convolution processing units, a parallel array of A-way 1×1 point convolution processing units, and an on-chip memory for buffering the convolutional neural network and the input and output feature maps. The convolutional neural network is a lightweight network obtained by compressing MobileNet with a quantization-aware training method.
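For orientation, here is a minimal NumPy sketch of the computation these two arrays implement together: a K×K depthwise (channel) convolution that filters each channel independently, followed by a 1×1 point convolution that mixes channels per pixel. The function names, shapes, and the no-padding choice are illustrative assumptions, not details from the patent.

```python
import numpy as np

def depthwise_conv(x, w):
    """K x K channel convolution: each channel is filtered by its own kernel.
    x: (C, H, W) input feature map, w: (C, K, K) per-channel kernels."""
    C, H, W = x.shape
    K = w.shape[1]
    y = np.zeros((C, H - K + 1, W - K + 1), dtype=np.float32)
    for c in range(C):
        for i in range(H - K + 1):
            for j in range(W - K + 1):
                y[c, i, j] = np.sum(x[c, i:i+K, j:j+K] * w[c])
    return y

def pointwise_conv(x, w):
    """1 x 1 point convolution: a per-pixel linear mix across channels.
    x: (C, H, W), w: (M, C) -> (M, H, W)."""
    C, H, W = x.shape
    return (w @ x.reshape(C, H * W)).reshape(-1, H, W)

x  = np.random.randn(16, 8, 8).astype(np.float32)   # 16-channel input
dw = np.random.randn(16, 3, 3).astype(np.float32)   # K = 3 channel kernels
pw = np.random.randn(32, 16).astype(np.float32)     # 32 output channels
out = pointwise_conv(depthwise_conv(x, dw), pw)     # shape (32, 6, 6)
```

Splitting a standard convolution this way cuts the multiplications per output pixel from roughly K·K·C·M to K·K·C + C·M, which is why dedicating one processing-unit array to each stage is attractive.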

[0030] As shown in Figure 4, the A-way K×K channel convolution processing unit parallel array and the multi-way 1×1 point convolution processing unit parallel array are deployed as a pixel-level pipeline.
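A pixel-level pipeline here means that each output pixel of the K×K stage feeds the 1×1 stage as soon as it is produced, so the intermediate feature map never has to be written to off-chip memory. A minimal sketch of that dataflow, using a Python generator as a stand-in for the hardware handshake (an illustrative reading of the figure, not the patent's circuit):

```python
import numpy as np

def depthwise_pixels(x, w):
    """Yield one depthwise-stage output pixel (all C channel values) at a
    time, mimicking the K x K array producing results pixel by pixel."""
    C, H, W = x.shape
    K = w.shape[1]
    for i in range(H - K + 1):
        for j in range(W - K + 1):
            yield i, j, np.array([np.sum(x[c, i:i+K, j:j+K] * w[c])
                                  for c in range(C)], dtype=np.float32)

def pipelined_separable_conv(x, dw, pw):
    """Consume each depthwise pixel immediately in the 1 x 1 stage, so no
    full intermediate feature map is ever buffered."""
    C, H, W = x.shape
    K = dw.shape[1]
    out = np.zeros((pw.shape[0], H - K + 1, W - K + 1), dtype=np.float32)
    for i, j, pixel in depthwise_pixels(x, dw):
        out[:, i, j] = pw @ pixel     # 1 x 1 point convolution of one pixel
    return out
```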

[0031] As shown in Figure 5, each K×K channel convolution processing unit in the A-way K×K channel…
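The paragraph is cut off here, but the Abstract states that each K×K channel convolution processing unit comprises a multiplier, an adder, and an activation function calculation unit. A behavioral sketch of one unit under that description follows; the serial multiply-accumulate schedule and the choice of ReLU (MobileNet's usual activation) are assumptions:

```python
import numpy as np

def channel_conv_pe(window, kernel, bias=0.0):
    """One K x K channel convolution processing unit, modeled behaviorally:
    the multiplier and adder accumulate K*K products, then the activation
    function calculation unit (assumed ReLU) fires once per output pixel."""
    acc = bias
    for a, b in zip(window.ravel(), kernel.ravel()):
        acc += a * b                  # multiplier output feeding the adder
    return max(acc, 0.0)              # activation function calculation unit

win = np.arange(9, dtype=np.float32).reshape(3, 3)
ker = np.ones((3, 3), dtype=np.float32)
print(channel_conv_pe(win, ker))      # 36.0
```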



Abstract

The invention discloses a lightweight neural network hardware accelerator based on depthwise separable convolution. The accelerator comprises a parallel array of A-way K×K channel convolution processing units, a parallel array of A-way 1×1 point convolution processing units, and an on-chip memory for buffering the convolutional neural network and the input and output feature maps. The convolutional neural network is a lightweight network obtained by compressing MobileNet with a quantization-aware training method. The A-way K×K channel convolution processing unit array and the multi-way 1×1 point convolution processing unit array are deployed in a pixel-level pipeline. Each K×K channel convolution processing unit comprises a multiplier, an adder, and an activation function calculation unit; each 1×1 point convolution processing unit comprises a multiplexer, a two-stage adder tree, and an accumulator. The invention eliminates the high-energy off-chip memory accesses generated during inference by prior-art accelerators, saves resources, and improves processing performance.
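The Abstract's 1×1 unit (multiplexer, two-stage adder tree, accumulator) suggests that channel products are reduced in a small tree and partial sums are accumulated over several cycles. The sketch below follows that reading; a two-stage tree reduces four products per cycle, and the 4-lane width plus the cycle schedule are illustrative assumptions:

```python
import numpy as np

def two_stage_adder_tree(p):
    """Two-stage adder tree over four products: stage 1 reduces 4 -> 2,
    stage 2 reduces 2 -> 1."""
    return (p[0] + p[1]) + (p[2] + p[3])

def pointwise_pe(acts, weights):
    """One 1 x 1 point convolution processing unit, modeled behaviorally.
    Each cycle the multiplexer selects the next four channels, their
    products pass through the adder tree, and the accumulator collects
    partial sums until all C channels are covered (C divisible by 4)."""
    acc = 0.0                          # accumulator register
    for base in range(0, acts.shape[0], 4):
        prods = acts[base:base+4] * weights[base:base+4]
        acc += two_stage_adder_tree(prods)
    return acc

a = np.arange(16, dtype=np.float32)
w = np.ones(16, dtype=np.float32)
print(pointwise_pe(a, w))              # 120.0 (= 0 + 1 + ... + 15)
```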

Description

Technical Field

[0001] The invention belongs to the technical field of neural network hardware accelerators, and in particular relates to a lightweight neural network hardware accelerator based on depthwise separable convolution.

Background Art

[0002] Convolutional neural networks have achieved great success in image classification, medical image segmentation, and object tracking. Typical convolutional neural networks (such as VGG16 and GoogLeNet) are computationally intensive and rely on costly, energy-inefficient graphics processing units or remote computing centers, which makes them difficult to deploy on portable or mobile real-time systems with tight energy and cost budgets. Previous research has focused on two directions to solve this problem. One is to optimize the convolutional neural network at the algorithm level to reduce computation and storage access (for example, topology optimization and model compression). The other is to design VL…
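The embodiment states that MobileNet is compressed with quantization-aware training, one of the model-compression techniques mentioned above. For reference, a minimal sketch of the symmetric fake-quantization step such training typically inserts into the forward pass; the 8-bit width and the symmetric per-tensor scheme are assumptions, as the patent text shown here gives no details:

```python
import numpy as np

def fake_quant(x, bits=8):
    """QAT-style 'fake quantization': round values onto a b-bit grid in the
    forward pass so the trained weights survive fixed-point inference."""
    qmax = 2 ** (bits - 1) - 1                       # 127 for 8 bits
    amax = np.max(np.abs(x))
    scale = amax / qmax if amax > 0 else 1.0
    q = np.clip(np.round(x / scale), -qmax - 1, qmax)
    return q * scale                                 # dequantized value

w = np.random.randn(16, 3, 3).astype(np.float32)
print(np.max(np.abs(w - fake_quant(w))))             # error <= scale / 2
```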

Claims


Application Information

IPC(8): G06N3/063; G06N3/04; G06N3/08; G06F17/15
CPC: G06N3/063; G06N3/082; G06F17/15; G06N3/045
Inventors: 林英撑, 李睿, 石匆, 何伟, 张玲, 杨晶
Owner: CHONGQING UNIV