Convolution calculation data reuse method based on heterogeneous many-core processor

A technology relating to many-core processors and convolution, applied in the field of deep learning, that reduces memory access requirements, improves the data reuse rate, and achieves high performance.

Active Publication Date: 2021-03-26
JIANGNAN INST OF COMPUTING TECH

AI Technical Summary

Problems solved by technology

For some heterogeneous many-core processors, the maximum memory access speed does not match the powerful computing capability of the many cores, so convolution calculations can exploit only 10% to 20% of the CPU's computing performance.

Method used


Image

  • Convolution calculation data reuse method based on heterogeneous many-core processor

Examples


Embodiment

[0015] Embodiment: a convolution calculation data reuse method based on a heterogeneous many-core processor. On a large-scale heterogeneous system, the CPU completes the convolution calculation of data block C from data block A and data block B; the method includes the following steps:

[0016] S1. According to the number of cores NUM of the heterogeneous many-core processor, map the cores two-dimensionally into an N*N grid, where N is the largest integer not exceeding the square root of NUM, and number the N*N cores. Divide data block A, data block B, and data block C each into N*N equal two-dimensional blocks. The (i, j)th core reads the (j, i)th block of data block A, data block B, and data block C from memory into its own on-chip memory. The convolution calculation of data block C(i, j) requires data block A(i, k) and data block B(k, j), where k = 1, 2, ..., N.
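
To make the block layout of step S1 concrete, the following is a minimal sketch in Python, assuming square operands that divide evenly into N*N tiles; NumPy, the helper names, and the per-core dictionary standing in for on-chip memory are illustrative assumptions, not part of the patent.

```python
import math
import numpy as np

def map_cores(num_cores):
    """Step S1: N is the largest integer not exceeding sqrt(NUM)."""
    return math.isqrt(num_cores)

def split_blocks(mat, n):
    """Divide a matrix into an n*n grid of equal two-dimensional blocks."""
    bh, bw = mat.shape[0] // n, mat.shape[1] // n
    return [[mat[i*bh:(i+1)*bh, j*bw:(j+1)*bw] for j in range(n)]
            for i in range(n)]

NUM = 64                           # number of cores (illustrative)
N = map_cores(NUM)                 # N = 8 for this example
A = np.random.rand(N * 4, N * 4)   # data block A (sizes are illustrative)
B = np.random.rand(N * 4, N * 4)   # data block B
C = np.zeros((N * 4, N * 4))       # data block C (result)

A_blk = split_blocks(A, N)
B_blk = split_blocks(B, N)
C_blk = split_blocks(C, N)

# The (i, j)th core reads the (j, i)th block of A, B and C from memory
# into its own on-chip memory (modelled here as a per-core dict).
on_chip = {(i, j): {"A": A_blk[j][i].copy(),
                    "B": B_blk[j][i].copy(),
                    "C": C_blk[j][i].copy()}
           for i in range(N) for j in range(N)}
```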



Abstract

The invention discloses a convolution calculation data reuse method based on a heterogeneous many-core processor. A CPU completes the convolution calculation of a data block C from a data block A and a data block B. The method comprises the following steps: S1, according to the number of cores of the heterogeneous many-core processor, mapping the cores two-dimensionally into an N*N grid, dividing data block A, data block B, and data block C each into N*N blocks, and having the (i, j)th core read the (j, i)th block of each from memory into its own on-chip memory, wherein the convolution calculation of data block C(i, j) requires data block A(i, k) and data block B(k, j), k = 1, 2, ..., N; and S2, looping over k from 1 to N and completing the kth convolution calculation of data block C using the obtained data block A and data block B. The memory access requirement of convolution calculation on the heterogeneous many-core processor is remarkably reduced and the many-core computing capability is fully exploited, so that high-performance convolution calculation is realized and the computing performance of convolution on the heterogeneous many-core processor is improved.
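
As a continuation of the S1 sketch given under the Embodiment above, the loop of step S2 might look as follows. The inter-core exchange (reading a peer core's on-chip block instead of main memory) and the use of a block matrix product as a stand-in for the per-block convolution kernel are assumptions made for illustration, since the patent text here is abridged.

```python
# Step S2 (hedged sketch, reusing N, on_chip, A, B, C from the S1 sketch):
# loop k = 1..N; block C(i, j) accumulates A(i, k) * B(k, j).
# After S1, block C(i, j) sits in core (j, i)'s on-chip memory,
# A(i, k) in core (k, i)'s, and B(k, j) in core (j, k)'s, so every
# operand can be obtained on chip with no further main-memory reads.
for k in range(N):
    for i in range(N):
        for j in range(N):
            a_ik = on_chip[(k, i)]["A"]          # block A(i, k)
            b_kj = on_chip[(j, k)]["B"]          # block B(k, j)
            on_chip[(j, i)]["C"] += a_ik @ b_kj  # stand-in for the conv kernel

# Gather the per-core results back into C (verification only).
bh = C.shape[0] // N
for i in range(N):
    for j in range(N):
        C[i*bh:(i+1)*bh, j*bh:(j+1)*bh] = on_chip[(j, i)]["C"]
assert np.allclose(C, A @ B)  # holds because the block kernel here is matmul
```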

Description

Technical field

[0001] The invention relates to a convolution calculation data reuse method based on heterogeneous many-core processors, and belongs to the technical field of deep learning.

Background technique

[0002] Convolution is one of the most important concepts in deep learning. During the training and inference of a convolutional neural network, convolution operations account for most of the computation. High-performance computing platforms usually provide specialized solutions for such core operations. For computation-intensive functions, such as convolution in deep learning, providing enough data to the powerful computing cores in a timely manner is a problem that needs to be solved. Heterogeneous many-core processors have great computing power, a multi-level storage hierarchy, and efficient on-chip communication, making efficient data reuse possible. [0003] At present, the commonly used convolution calculation optimiz...
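
For background only (this formula does not appear in the patent text), the convolution layer referred to here computes each output element as an accumulation over input channels and kernel window positions:

```latex
% Standard convolution layer, background reference only:
% O = output feature map, I = input feature map, W = kernel weights,
% n = output channel, C = input channels, K x K = kernel window.
O_{n,x,y} = \sum_{c=1}^{C} \sum_{u=1}^{K} \sum_{v=1}^{K}
            W_{n,c,u,v}\, I_{c,\,x+u-1,\,y+v-1}
```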

Claims


Application Information

IPC(8): G06F9/54; G06N3/063
CPC: G06F9/545; G06N3/063; Y02D10/00
Inventor: 林蓉芬, 袁欣辉, 尹万旺, 魏迪, 杨金才, 王丹云, 董恩铭
Owner: JIANGNAN INST OF COMPUTING TECH