
Convolution operation method based on expansion access on heterogeneous many-core architecture

A convolution operation method for heterogeneous many-core processors, applied in the field of deep learning. It addresses the problems of existing methods: they cannot make full use of processor computing resources, their optimization effect is poor, and they put pressure on system bandwidth. The method saves memory bandwidth, reduces memory access requirements, and improves performance.

Pending Publication Date: 2022-03-22
JIANGNAN INST OF COMPUTING TECH

AI Technical Summary

Problems solved by technology

[0004] Some optimized convolution methods already exist. One example is im2col, which converts the convolution into a matrix multiplication and then relies on an optimized matrix multiplication routine. However, this method has to expand the input to K*K times its original size, which puts additional pressure on system memory.
Heterogeneous many-core processors contain a large number of slave cores and have powerful computing capability, so memory access bandwidth is the bottleneck of the system. For a compute-intensive operation such as convolution, a method like im2col not only fails to make full use of the processor's computing resources but also puts heavy pressure on system bandwidth, so the optimization effect is poor. How to use memory access bandwidth efficiently and reduce memory access pressure is therefore the key to exploiting the full performance of the processor.
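
To make the memory cost of im2col concrete, the following is a minimal sketch in plain Python. It is not taken from the patent: the function names `im2col`, `conv_via_im2col`, and `expansion_ratio` are hypothetical, and the sketch assumes a single-channel 2-D input with no padding. It shows that every output position stores its own copy of a K*K patch, so the unfolded matrix approaches K*K times the input size when the stride is 1 and the input is large.

```python
def im2col(inp, k, stride=1):
    """Unfold a 2-D input (list of rows) into one K*K patch per output position.

    Assumes no padding; overlapping windows are copied, not shared.
    """
    hi, wi = len(inp), len(inp[0])
    ho = (hi - k) // stride + 1
    wo = (wi - k) // stride + 1
    return [
        [inp[oy * stride + ky][ox * stride + kx]
         for ky in range(k) for kx in range(k)]
        for oy in range(ho) for ox in range(wo)
    ]


def conv_via_im2col(inp, weight, stride=1):
    """Convolution as a matrix-vector product over the unfolded patches."""
    k = len(weight)
    flat_w = [w for row in weight for w in row]
    return [sum(a * b for a, b in zip(patch, flat_w))
            for patch in im2col(inp, k, stride)]


def expansion_ratio(hi, wi, k, stride=1):
    """Elements stored by im2col divided by elements in the original input."""
    ho = (hi - k) // stride + 1
    wo = (wi - k) // stride + 1
    return (ho * wo * k * k) / (hi * wi)


if __name__ == "__main__":
    # For a 3*3 weight and stride 1 the ratio tends towards K*K = 9.
    for size in (6, 64, 1000):
        print(size, expansion_ratio(size, size, 3))
```

The ratio is smaller for tiny inputs because border positions appear in fewer windows, but the extra copies are what the paragraph above identifies as the source of memory pressure.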



Examples


Embodiment

[0019] Embodiment: The present invention provides a convolution operation method based on expansion access on a heterogeneous many-core architecture, which specifically includes the following steps:

[0020] S1. Take input, weight, and stride as inputs, where input has shape Hi*Wi and weight has shape K*K, and calculate the output shape Ho*Wo from the shapes of input and weight;

[0021] S2. According to the output shape, divide the convolution task evenly among the cores along the Ho and Wo dimensions according to the logical number of each core, so that each core processes a task of size Ho_BLOCK*Wo_BLOCK;

[0022] S3. Each core calculates the required input size Hi_BLOCK*Wi_BLOCK from its own task size, where Hi_BLOCK=Ho_BLOCK*stride+K-1 and Wi_BLOCK=Wo_BLOCK*stride+K-1;

[0023] S4. Each core performs the convolution calculation using the fetched input (Hi_BLOCK*Wi_BLOCK) and weight;

[0024] S5. Steps S3 and S4 are repeated until the...
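
The steps above can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the patent's implementation: all function names are hypothetical, the cores are simulated by a sequential loop, the cores are assumed to form a `grid_rows` by `grid_cols` grid indexed by logical number, blocks are sized by ceiling division, and the output shape uses the common no-padding formula (the text only says the shape is calculated from input and weight). The input block size follows the formula given in S3; slicing clamps the fetch at the input boundary.

```python
def output_shape(hi, wi, k, stride):
    # S1 (assumption: no padding).
    return (hi - k) // stride + 1, (wi - k) // stride + 1


def core_block(core_id, grid_rows, grid_cols, ho, wo):
    # S2: map a logical core number to its output block.
    ho_blk = -(-ho // grid_rows)  # ceiling division
    wo_blk = -(-wo // grid_cols)
    r, c = divmod(core_id, grid_cols)
    oy0, ox0 = r * ho_blk, c * wo_blk
    # Edge cores may get a smaller (or empty) block.
    return (oy0, ox0,
            max(0, min(ho_blk, ho - oy0)),
            max(0, min(wo_blk, wo - ox0)))


def input_block_size(ho_block, wo_block, k, stride):
    # S3: formula as stated in the text.
    return ho_block * stride + k - 1, wo_block * stride + k - 1


def conv_tiled(inp, weight, stride, grid_rows, grid_cols):
    hi, wi, k = len(inp), len(inp[0]), len(weight)
    ho, wo = output_shape(hi, wi, k, stride)
    out = [[0.0] * wo for _ in range(ho)]
    for core_id in range(grid_rows * grid_cols):  # stand-in for parallel cores
        oy0, ox0, hb, wb = core_block(core_id, grid_rows, grid_cols, ho, wo)
        if hb == 0 or wb == 0:
            continue
        hib, wib = input_block_size(hb, wb, k, stride)
        iy0, ix0 = oy0 * stride, ox0 * stride
        # One expanded fetch per core: the block plus its K-1 halo.
        tile = [row[ix0:ix0 + wib] for row in inp[iy0:iy0 + hib]]
        # S4: convolve entirely from the local tile.
        for y in range(hb):
            for x in range(wb):
                out[oy0 + y][ox0 + x] = sum(
                    tile[y * stride + ky][x * stride + kx] * weight[ky][kx]
                    for ky in range(k) for kx in range(k))
    return out


def conv_direct(inp, weight, stride):
    # Untiled reference used only to check the tiled result.
    k = len(weight)
    ho, wo = output_shape(len(inp), len(inp[0]), k, stride)
    return [[sum(inp[y * stride + ky][x * stride + kx] * weight[ky][kx]
                 for ky in range(k) for kx in range(k))
             for x in range(wo)] for y in range(ho)]


if __name__ == "__main__":
    inp = [[float((r * 7 + c * 3) % 11) for c in range(10)] for r in range(9)]
    weight = [[1.0, 0.0, -1.0], [2.0, 0.0, -2.0], [1.0, 0.0, -1.0]]
    print(conv_tiled(inp, weight, 2, 2, 2) == conv_direct(inp, weight, 2))
```

In this sketch each simulated core reads its input region once and reuses it for every output in its block, so the overlapping rows and columns between neighboring windows inside a block are not fetched again. Only the K-1 halo around each block is read by more than one core.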



Abstract

The invention discloses a convolution operation method based on expansion access on a heterogeneous many-core architecture. The method comprises the following steps: S1, taking an input, a weight, and a stride as inputs, where the input is Hi*Wi and the weight is K*K, and calculating the output shape Ho*Wo from the shapes of the input and the weight; S2, according to the output shape, dividing the convolution task evenly among the cores along the Ho and Wo dimensions according to the logical number of each core; S3, each core determining its required input size according to its own task size; S4, each core carrying out the convolution calculation using the fetched input (Hi_BLOCK*Wi_BLOCK) and the weight; and S5, repeating S3 and S4 until the calculation is finished. The method saves memory bandwidth resources while making full use of many-core computing resources.

Description

Technical field

[0001] The invention relates to a convolution operation method based on expansion access on a heterogeneous many-core architecture, and belongs to the technical field of deep learning.

Background technique

[0002] Convolution is one of the most important concepts in deep learning. During the training and inference of a convolutional neural network, the convolution operation accounts for the vast majority of the computation. High-performance computing platforms usually provide dedicated solutions. For compute-intensive functions such as convolution in deep learning, the problem to be solved is how to supply enough data to the powerful compute cores in a timely manner and how to improve data reuse.

[0003] The convolution operation is the core operation of CNNs in artificial intelligence. The data used by each K*K window of the convolution overlaps with that of other windows but is not identical. If the number of data is frequent, if the chara...


Application Information

IPC(8): G06F17/15, G06F9/30, G06F15/16
CPC: G06F17/153, G06F15/161, G06F9/30007
Inventor: 袁欣辉, 尹万旺, 林蓉芬, 魏迪, 郑岩, 王飞, 孙浩男, 孙强, 史俊达, 王丹云
Owner JIANGNAN INST OF COMPUTING TECH