Acceleration design method of CNN network suitable for low-resource embedded chip

An embedded-chip design method, applied in the field of neural network deep learning, that addresses problems such as insufficient DDR bandwidth, an underutilized CPU, and slow computation.

Pending Publication Date: 2020-06-26

杭州雄迈集成电路技术股份有限公司


Problems solved by technology

[0005] 1. Computation is slow and the CPU is not fully utilized;

[0006] 2. DDR bandwidth is insufficient, and a large amount of additional data reading and writing is incurred.

Examples

Specific Embodiment 1

[0036] S1: Read the neural network structure file, analyze the data size and convolution kernel size of each layer, calculate the data buffer space that each layer requires, and then allocate a corresponding address range in SRAM for each layer.
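The allocation part of step S1 can be illustrated with a minimal sketch. It assumes the per-layer buffer sizes have already been computed and simply assigns consecutive, aligned address ranges; the names `Region`, `align_up`, `allocate_sram`, and the 16-byte alignment are hypothetical and do not come from the patent text.

```python
from dataclasses import dataclass


@dataclass
class Region:
    name: str
    start: int  # byte offset from the SRAM base (inclusive)
    end: int    # byte offset from the SRAM base (exclusive)


def align_up(value, alignment):
    """Round value up to the next multiple of alignment."""
    return (value + alignment - 1) // alignment * alignment


def allocate_sram(layer_sizes, sram_bytes, alignment=16):
    """Assign each layer a consecutive address range inside SRAM.

    layer_sizes: list of (layer_name, buffer_size_in_bytes) pairs.
    The alignment value is an assumption made for this sketch.
    """
    regions = []
    cursor = 0
    for name, size in layer_sizes:
        start = align_up(cursor, alignment)
        end = start + size
        if end > sram_bytes:
            raise MemoryError(f"layer {name!r} does not fit in SRAM")
        regions.append(Region(name, start, end))
        cursor = end
    return regions
```

Calling `allocate_sram([("conv1", 100), ("pool1", 40)], sram_bytes=256)` would place the second buffer at the first aligned offset after the first one; a request that exceeds `sram_bytes` raises `MemoryError`.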

[0037] Taking the deep learning framework Caffe as an example, parse the corresponding prototxt file. From the network structure, obtain the shape of each layer's data (denoted C, H, W) and the size of each layer's convolution kernel (denoted Kh, Kw). The first layer needs to reserve a (Kh+1)-th slice so that the DMA can move data in ahead of time, whereas for the second through n-th layers the data can be computed and its space recycled only after K slices. The memory consumption of each layer is therefore recorded as follows:
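The slice-recycling idea can be sketched as a ring buffer of Kh row slices: an output row is computed as soon as Kh input slices are present, and the oldest slot is then overwritten by the next incoming slice. This is an illustration under simplifying assumptions (single channel, stride 1, no padding), not the patent's implementation; the extra prefetch slice for the first layer is not modeled, and `conv_row` and `streaming_conv` are hypothetical names.

```python
def conv_row(rows, kernel):
    """Compute one output row from Kh input rows (stride 1, no padding)."""
    kh, kw = len(kernel), len(kernel[0])
    out_w = len(rows[0]) - kw + 1
    return [
        sum(rows[i][x + j] * kernel[i][j] for i in range(kh) for j in range(kw))
        for x in range(out_w)
    ]


def streaming_conv(row_source, kernel):
    """Yield output rows while holding only Kh input slices at a time."""
    kh = len(kernel)
    ring = [None] * kh  # Kh slice slots, reused cyclically
    for n, row in enumerate(row_source):
        ring[n % kh] = row  # overwrite the slot of the oldest slice
        if n + 1 >= kh:
            # Collect the Kh live slices in oldest-to-newest order.
            window = [ring[(n + 1 + i) % kh] for i in range(kh)]
            yield conv_row(window, kernel)
```

Because only Kh slots are ever held, the buffer for such a layer scales with the kernel height rather than with the full feature-map height.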

[0038] Input layer: data is moved from DDR to SRAM by DMA. The DMA unit for each move is (W*C), and the required SRAM size is recorded as:...

Abstract

The invention discloses an acceleration design method for a CNN suitable for a low-resource embedded chip, and relates to the technical field of neural network deep learning. The method comprises the following steps: redesigning the calculation direction of the convolution/pooling operation and adopting a calculation order that gives priority to the channel direction; optimizing the space allocation and calculation process of the convolution/pooling layers in the CNN; and using the Neon instruction set to improve computing performance. According to the invention, resources commonly available in existing embedded chips, such as DMA, SRAM, and Neon, are fully utilized. The calculation steps of the convolution and pooling units in the CNN are arranged optimally and SRAM space is reused. Data movement and numerical calculation are separated: DMA moves the data and Neon performs the numerical calculation, and intermediate results are not written to DDR. As a result, the CNN can run faster and more efficiently on a low-resource embedded chip.
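One way to read "channel direction priority" is a height-width-channel (HWC) memory layout, in which the C channel values of a pixel are adjacent in memory. The sketch below assumes that layout, a single output channel, stride 1, and no padding; under those assumptions each kernel row covers one contiguous run of Kw*C values, which is the kind of access pattern a SIMD unit such as Neon can consume directly. The function name `conv_hwc` and the flat-list representation are hypothetical, and plain Python stands in for the vectorized inner loop.

```python
def conv_hwc(data, H, W, C, weights, Kh, Kw):
    """Single-output-channel convolution over a flat HWC buffer.

    data:    flat list of H*W*C values, element (h, w, c) at (h*W + w)*C + c
    weights: flat list of Kh*Kw*C values, laid out as (Kh, Kw, C)
    Returns a flat list of (H-Kh+1)*(W-Kw+1) outputs in row-major order.
    """
    span = Kw * C  # contiguous values covered by one kernel row
    out = []
    for y in range(H - Kh + 1):
        for x in range(W - Kw + 1):
            acc = 0
            for i in range(Kh):
                d0 = ((y + i) * W + x) * C  # start of the contiguous input run
                w0 = i * span               # start of the matching weight run
                # One long dot product over adjacent memory; a SIMD
                # implementation would vectorize this loop.
                for k in range(span):
                    acc += data[d0 + k] * weights[w0 + k]
            out.append(acc)
    return out
```

The design point is that the innermost loop walks both buffers with unit stride, so no gather or per-channel striding is needed.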

Description

Technical field

[0001] The invention belongs to the technical field of neural network deep learning, and in particular relates to an acceleration design method for a CNN suitable for low-resource embedded chips.

Background

[0002] The Convolutional Neural Network (CNN) is an important innovation in the field of deep learning. As a typical multi-layer neural network, the convolutional neural network has always been at the core of research. Its local connections and weight sharing reduce the number of weights, which makes the network easier to optimize, and also reduce the complexity of the model, which lowers the risk of overfitting. CNNs have therefore been applied to many machine-vision tasks with great success.

[0003] The excellent effect brought by CNN has stimulated the demand of many existing embedded device manufacturers to intelligently empower their original products through softwar...

Application Information

IPC(8): G06N3/08, G06F13/28, G06N3/04

CPC: G06N3/082, G06F13/28, G06N3/045, Y02D10/00

Inventor 葛益军

Owner 杭州雄迈集成电路技术股份有限公司
