
Acceleration design method of CNN network suitable for low-resource embedded chip

An embedded-chip design method applied in the field of neural network deep learning. It addresses problems such as insufficient DDR bandwidth, an under-utilized CPU, and slow computation.

Pending Publication Date: 2020-06-26
杭州雄迈集成电路技术股份有限公司

AI Technical Summary

Problems solved by technology

[0005] 1. Computation is slow and the CPU is not fully utilized;
[0006] 2. DDR bandwidth is already insufficient, yet a large amount of additional data reading and writing is introduced.


Examples


Specific Embodiment 1

[0036] S1: Read the neural network structure file, analyze the data size and convolution kernel size of each layer, calculate and confirm the data buffer space required by each layer, and then allocate the corresponding address range in SRAM.
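As a rough illustration of the last part of S1, the sketch below hands out consecutive SRAM address ranges once the per-layer buffer sizes are known. It is a minimal bump allocator under stated assumptions, not the patent's allocation scheme: the names `Region`, `align_up`, and `allocate_sram`, the 16-byte alignment, and the sizes in the example are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Region:
    name: str
    start: int  # first byte offset inside SRAM
    end: int    # one past the last byte


def align_up(value: int, alignment: int) -> int:
    """Round value up to the next multiple of alignment."""
    return (value + alignment - 1) // alignment * alignment


def allocate_sram(buffer_sizes, sram_size, alignment=16):
    """Assign consecutive, aligned address ranges to each layer buffer.

    buffer_sizes is a list of (layer_name, size_in_bytes) pairs that were
    already computed from the network structure file. Raises MemoryError
    when the buffers do not fit in sram_size bytes.
    """
    regions = []
    cursor = 0
    for name, size in buffer_sizes:
        start = align_up(cursor, alignment)
        end = start + size
        if end > sram_size:
            raise MemoryError(f"{name} ends at {end}, SRAM has {sram_size} bytes")
        regions.append(Region(name, start, end))
        cursor = end
    return regions


# Hypothetical buffer sizes, for illustration only.
layout = allocate_sram(
    [("input", 1000), ("conv1", 4100), ("pool1", 512)], sram_size=8192
)
for region in layout:
    print(f"{region.name}: [{region.start}, {region.end})")
```

Failing early with `MemoryError` reflects the fact that on a low-resource chip the layout either fits in SRAM or the network cannot run this way at all, so it is worth checking before any data is moved.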

[0037] Taking the deep learning framework Caffe as an example, parse the corresponding prototxt file. From the network structure, obtain the shape of each layer's data (denoted C, H, W) and the size of each layer's convolution kernel (denoted Kh, Kw). The first layer needs to reserve a (Kh+1)-th slice so that the DMA can move data in ahead of time, while the data of the second through n-th layers can be computed and recycled only after K slices. The memory consumption of each layer is therefore recorded as follows:
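One way to see why the first layer reserves an extra slice is to treat the buffer as a ring of row slices. The sketch below is an interpretation, not the patent's procedure: it assumes stride 1, no padding, and that row `r` is stored in slot `r % slots`; the function name `prefetch_plan` is hypothetical. With `Kh + 1` slots the row being prefetched never lands in a slot the current window is reading, whereas with only `Kh` slots it always does.

```python
def prefetch_plan(height, kh, slots):
    """List (output_row, prefetch_row, prefetch_slot, conflict) tuples.

    For each output row, the kh input rows in its window are being read.
    The next input row would be prefetched into slot (row % slots).
    conflict is True when that slot still holds a row of the window.
    """
    plan = []
    for out_row in range(height - kh + 1):
        window = range(out_row, out_row + kh)
        busy = {row % slots for row in window}
        nxt = out_row + kh
        if nxt >= height:
            # Last window: nothing left to prefetch.
            plan.append((out_row, None, None, False))
            continue
        slot = nxt % slots
        plan.append((out_row, nxt, slot, slot in busy))
    return plan


# Hypothetical sizes: 6 input rows, kernel height 3.
for entry in prefetch_plan(height=6, kh=3, slots=4):
    print("Kh+1 slots:", entry)
for entry in prefetch_plan(height=6, kh=3, slots=3):
    print("Kh slots:  ", entry)
```

In the `Kh`-slot case every prefetch would overwrite the oldest row of the window, so the transfer has to wait until the window has been consumed; the extra slot is what lets the move and the computation overlap.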

[0038] Input layer: data is moved from DDR to SRAM by DMA. The unit of each DMA move is (W*C), and the required SRAM size is recorded as:...
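To make the (W*C) move unit concrete, the sketch below lists one transfer per input row. It assumes, beyond what the text states, that the feature map is stored contiguously in DDR in H, W, C order, so that a row of W*C elements is a single contiguous block; `row_transfers`, `elem_bytes`, and `ddr_base` are hypothetical names, and no SRAM size formula is implied.

```python
def row_transfers(h, w, c, elem_bytes=1, ddr_base=0):
    """Return one (ddr_address, length_in_bytes) pair per input row.

    Assumes a contiguous H, W, C layout in DDR, so each row of w*c
    elements can be described by a single address and length.
    """
    row_bytes = w * c * elem_bytes
    return [(ddr_base + row * row_bytes, row_bytes) for row in range(h)]


# Hypothetical 4x8 input with 3 channels, one byte per element.
for address, length in row_transfers(h=4, w=8, c=3):
    print(address, length)
```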


Abstract

The invention discloses an acceleration design method for CNN networks suitable for low-resource embedded chips, and relates to the technical field of neural network deep learning. The method comprises the following steps: redesigning the calculation direction of the convolution/pooling operation and adopting a calculation order that gives priority to the channel direction; optimizing the space allocation and calculation process of the convolution/pooling layers in the CNN network; and using the Neon instruction set to improve computing performance. The invention makes full use of resources such as DMA, SRAM, and Neon that are commonly available in existing embedded chips. The calculation steps of the convolution and pooling units in the CNN network are arranged for efficiency and SRAM space is reused. Data movement and numerical calculation are separated: DMA moves the data, Neon performs the numerical calculation, and intermediate results are not written to DDR. As a result, the CNN network runs faster and more efficiently on a low-resource embedded chip.
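The channel-priority calculation order can be pictured with a small reference loop. The sketch below is one reading of that idea rather than the patent's implementation: it assumes a flat buffer in H, W, C order with the channel index varying fastest, a single output channel, stride 1, and no padding, and the name `conv_hwc` is hypothetical. Under that layout the Kw*C values under one kernel row are contiguous, so the innermost loop is a plain linear dot product of the kind a SIMD unit such as Neon handles well.

```python
def conv_hwc(inp, h, w, c, kernel, kh, kw):
    """Single-output-channel convolution over a flat H, W, C buffer.

    inp has length h*w*c and is indexed as (y*w + x)*c + ch.
    kernel has length kh*kw*c and uses the same ordering.
    Stride 1, no padding.
    """
    out_h, out_w = h - kh + 1, w - kw + 1
    run = kw * c  # contiguous elements under one kernel row
    out = []
    for y in range(out_h):
        for x in range(out_w):
            acc = 0
            for ky in range(kh):
                src = ((y + ky) * w + x) * c
                ker = ky * run
                # Channel-fastest contiguous run: one linear dot product.
                for i in range(run):
                    acc += inp[src + i] * kernel[ker + i]
            out.append(acc)
    return out


# Hypothetical 3x3 input with 2 channels and an all-ones 2x2 kernel.
result = conv_hwc(list(range(18)), 3, 3, 2, [1] * 8, 2, 2)
print(result)
```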

Description

Technical field

[0001] The invention belongs to the technical field of neural network deep learning, and in particular relates to an acceleration design method for CNN networks suitable for low-resource embedded chips.

Background

[0002] The Convolutional Neural Network (CNN) is an important innovation in the field of deep learning. As a typical multi-layer neural network, the convolutional neural network has always been at the core of research. The local connection and weight sharing it adopts reduce the number of weights, which makes the network easier to optimize, and also reduce the complexity of the model, which lowers the risk of overfitting. CNNs have therefore been applied to many machine vision tasks with great success.

[0003] The excellent effect brought by CNN has stimulated the demand of many existing embedded device manufacturers to intelligently empower their original products through softwar...


Application Information

IPC(8): G06N3/08, G06F13/28, G06N3/04
CPC: G06N3/082, G06F13/28, G06N3/045, Y02D10/00
Inventor: 葛益军
Owner: 杭州雄迈集成电路技术股份有限公司