
An embedded convolutional neural network acceleration method based on ARM

A convolutional neural network technology, applied in the field of embedded convolutional neural network acceleration, that addresses the inefficiency of running such networks directly on embedded devices and achieves broad applicability and improved computing efficiency.

Active Publication Date: 2019-03-08
SOUTH CHINA UNIV OF TECH

Problems solved by technology

[0005] Although there are many lightweight convolutional neural networks, directly deploying them to run on embedded devices is not efficient.

Embodiment Construction

[0035] The optimization method of the present invention is described in further detail below in conjunction with the drawings and MobileNetV1; the invention is also applicable to other neural networks that use 1×1 convolution and 3×3 depthwise separable convolution.

[0036] As shown in Figure 3, the ARM-based embedded convolutional neural network acceleration method provided by the present invention comprises the following steps:

[0037] Step 1, use Caffe or other deep learning frameworks to train the lightweight convolutional neural network MobileNetV1.

[0038] Step 2, export the trained MobileNetV1 network structure and weights to a file.

[0039] Step 3, design a program that imports the weight file and implements the forward computation of the neural network according to the trained network structure. Different layers of the network can be represented by different functions, whose parameters include layer specification parameters, input feature maps,...



Abstract

The invention discloses an ARM-based embedded convolutional neural network acceleration method, which addresses the limited hardware resources of embedded devices and the high computational complexity of convolutional neural networks. The time-consuming 1×1 convolution and 3×3 depthwise separable convolution commonly used in lightweight convolutional neural networks are optimized using the ARM NEON technique. Specifically, the data for the 1×1 convolution is first rearranged, and ARM NEON vector optimization is then applied to the 3×3 depthwise separable convolution. This speeds up the computation of the convolutional neural network and makes full use of the hardware computing resources of embedded devices, so that convolutional neural networks deployed on embedded terminals run faster and are more practical.

Description

Technical Field

[0001] The present invention relates to the technical field of embedded convolutional neural network acceleration, and in particular to an ARM-based embedded convolutional neural network acceleration method.

Background Technique

[0002] Deep learning algorithms based on convolutional neural networks have achieved great success in many areas of computer vision. However, as the performance of deep convolutional neural networks continues to improve, their parameter counts and computational costs keep growing. Because of these heavy demands on hardware computing power, deploying deep convolutional neural networks on devices with limited computing resources, such as embedded devices, has become a challenge.

[0003] At present, a feasible approach is to design a lightweight convolutional neural network structure and deploy the structure to embe...


Application Information

Patent Type & Authority: Application (China)
IPC (8): G06N3/04, G06N3/063, G06N3/08
CPC: G06N3/063, G06N3/08, G06N3/045
Inventors: 毕盛, 张英杰, 董敏
Owner: SOUTH CHINA UNIV OF TECH