
An ARM-based Embedded Convolutional Neural Network Acceleration Method

This application concerns convolutional neural network technology in the field of embedded convolutional neural network acceleration. It addresses the inefficiency of running such networks directly on embedded devices, achieving broad applicability and improved computing efficiency.

Active Publication Date: 2022-03-25
SOUTH CHINA UNIV OF TECH

AI Technical Summary

Problems solved by technology

[0005] Although many lightweight convolutional neural networks exist, directly deploying them to run on embedded devices is inefficient.



Examples


Embodiment Construction

[0035] The optimization method of the present invention will be further described in detail below in conjunction with the drawings and MobileNetV1; however, the present invention is also applicable to other neural networks that use 1×1 convolution and 3×3 depthwise separable convolution.

[0036] As shown in Figure 3, the ARM-based embedded convolutional neural network acceleration method provided by the present invention comprises the following steps:

[0037] Step 1, use Caffe or other deep learning frameworks to train the lightweight convolutional neural network MobileNetV1.

[0038] Step 2, export the trained MobileNetV1 network structure and weights to a file.

[0039] Step 3, design a program that imports the weight file and implements the forward computation of the neural network according to the trained network structure. Different layers in the neural network can be represented by different functions, whose parameters include layer specification parameters, input feature maps, etc.
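As a rough illustration of Step 3, the sketch below represents a layer as a function taking specification parameters plus input and output feature maps, using a scalar 1×1 convolution as the example. The struct and function names are hypothetical, not taken from the patent text.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical layer descriptor: the patent only says each layer is a
   function taking specification parameters and feature maps. */
typedef struct {
    int in_c, out_c;   /* input / output channel counts        */
    int h, w;          /* spatial size of the input feature map */
} ConvSpec;

/* Scalar reference for a 1x1 convolution: each output pixel is a dot
   product over input channels (layout: channel-major, C x H x W). */
void conv1x1(const ConvSpec *s, const float *in, const float *weight,
             float *out)
{
    int hw = s->h * s->w;
    for (int oc = 0; oc < s->out_c; ++oc)
        for (int p = 0; p < hw; ++p) {
            float acc = 0.0f;
            for (int ic = 0; ic < s->in_c; ++ic)
                acc += weight[oc * s->in_c + ic] * in[ic * hw + p];
            out[oc * hw + p] = acc;
        }
}
```

In this naive form the inner loop strides through memory by `hw` floats per step, which is what the memory rearrangement of the next steps is meant to fix.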



Abstract

The invention discloses an ARM-based embedded convolutional neural network acceleration method, which addresses the limited hardware resources of embedded devices and the high computational complexity of convolutional neural networks. The time-consuming 1×1 convolution and 3×3 depthwise separable convolution commonly used in lightweight convolutional neural networks are optimized using ARM NEON technology. Specifically, for the 1×1 convolution, memory rearrangement is performed first and ARM NEON vector optimization is then applied; for the 3×3 depthwise separable convolution, ARM NEON vector optimization is applied directly. This accelerates the computation of convolutional neural networks and makes full use of the hardware computing resources of embedded devices, so that convolutional neural networks deployed on embedded terminals run faster and are more practical.
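The abstract's two-stage treatment of the 1×1 convolution (rearrange memory, then vectorize) can be sketched in portable scalar C. Assuming a channel-major input layout and 4-pixel blocks chosen to match NEON's 4-lane `float32x4_t` registers (both assumptions; the patent page does not give the exact layout), the rearrangement places the four pixels processed together into consecutive memory so the inner loop becomes directly vectorizable:

```c
#include <assert.h>

/* Rearrange in[C][HW] into packed[HW/4 blocks][C][4]: within each block
   of 4 pixels, the 4 values of every channel sit contiguously.
   (Illustrative layout, not taken verbatim from the patent.) */
void pack_input(const float *in, float *packed, int c, int hw)
{
    for (int b = 0; b < hw / 4; ++b)            /* pixel block */
        for (int ch = 0; ch < c; ++ch)          /* channel     */
            for (int i = 0; i < 4; ++i)         /* lane        */
                packed[(b * c + ch) * 4 + i] = in[ch * hw + b * 4 + i];
}

/* 1x1 convolution over the packed layout.  The innermost 4-wide loop
   now reads 4 consecutive floats per channel, which is the pattern a
   NEON implementation would turn into one vector load + multiply-add. */
void conv1x1_packed(const float *packed, const float *weight, float *out,
                    int in_c, int out_c, int hw)
{
    for (int oc = 0; oc < out_c; ++oc)
        for (int b = 0; b < hw / 4; ++b) {
            float acc[4] = {0, 0, 0, 0};
            for (int ic = 0; ic < in_c; ++ic)
                for (int i = 0; i < 4; ++i)
                    acc[i] += weight[oc * in_c + ic] *
                              packed[(b * in_c + ic) * 4 + i];
            for (int i = 0; i < 4; ++i)
                out[oc * hw + b * 4 + i] = acc[i];
        }
}
```

A NEON version would replace the 4-wide inner loop with `vld1q_f32`, `vmlaq_n_f32`, and `vst1q_f32`; the packed layout is what makes those loads and stores contiguous.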

Description

Technical Field

[0001] The present invention relates to the technical field of embedded convolutional neural network acceleration, and in particular to an ARM-based embedded convolutional neural network acceleration method.

Background

[0002] Deep learning algorithms based on convolutional neural networks have achieved great success in many areas of computer vision. However, as the performance of deep convolutional neural networks has improved, their parameter counts and computational cost have grown steadily. Because deep convolutional neural networks place high demands on hardware computing power, deploying them on devices with limited computing resources, such as embedded devices, has become a challenge.

[0003] At present, a feasible approach is to design a lightweight convolutional neural network structure and deploy that structure to embedded devices.
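For reference, the other operation named in the abstract, the 3×3 depthwise convolution at the core of depthwise separable convolution, applies one 3×3 kernel per channel rather than mixing channels. A minimal scalar sketch (stride 1, no padding; layout and names are illustrative, not from the patent):

```c
#include <assert.h>

/* Scalar reference for 3x3 depthwise convolution, stride 1, "valid"
   padding.  Each channel is filtered independently by its own 3x3
   kernel; a NEON version would compute several adjacent output pixels
   per vector instruction instead of this innermost scalar loop. */
void dwconv3x3(const float *in, const float *kernel, float *out,
               int c, int h, int w)
{
    int oh = h - 2, ow = w - 2;                 /* output size */
    for (int ch = 0; ch < c; ++ch)
        for (int y = 0; y < oh; ++y)
            for (int x = 0; x < ow; ++x) {
                float acc = 0.0f;
                for (int ky = 0; ky < 3; ++ky)
                    for (int kx = 0; kx < 3; ++kx)
                        acc += kernel[ch * 9 + ky * 3 + kx] *
                               in[(ch * h + y + ky) * w + x + kx];
                out[(ch * oh + y) * ow + x] = acc;
            }
}
```

Because each channel is independent, the 3×3 depthwise case has no cross-channel reduction, which is why the patent can vectorize it directly without the memory rearrangement needed for the 1×1 convolution.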

Claims


Application Information

Patent Type & Authority: Patent (China)
IPC (IPC(8)): G06N3/04; G06N3/063; G06N3/08
CPC: G06N3/063; G06N3/08; G06N3/045
Inventor: 毕盛, 张英杰, 董敏
Owner: SOUTH CHINA UNIV OF TECH