
GEMM-based deep neural network acceleration method and system

A deep neural network acceleration technology, applied in biological neural network models, neural architectures, climate sustainability, etc., that addresses the problem of small matrix slices reducing overall efficiency and achieves the effect of reducing computation time.

Pending Publication Date: 2022-07-08
SOUTH CHINA UNIV OF TECH +1

AI Technical Summary

Problems solved by technology

[0007] GEMM operations usually involve slicing the matrices and operating on the slices. If regular and irregular matrix multiplications are mixed together and handled with the same fragmentation strategy, an unsuitable strategy may produce many small matrix slices, which introduces unnecessary memory loads and reduces overall efficiency.
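
To make the cost of many small slices concrete, the following is a minimal sketch under a simple cost model that is an assumption of this example and is not taken from the patent: each output tile of size tile_m x tile_n loads one tile_m x k panel of A and one k x tile_n panel of B, with no reuse across tiles. The function name `tile_loads` and the matrix sizes are hypothetical.

```python
import math


def tile_loads(m, n, k, tile_m, tile_n):
    """Count tiles and elements loaded for C (m x n) = A (m x k) @ B (k x n).

    Assumed cost model (not from the patent): every output tile loads its
    tile_m x k panel of A and its k x tile_n panel of B once, and nothing is
    shared between tiles. Edge tiles are counted at the full tile size.
    """
    tiles = math.ceil(m / tile_m) * math.ceil(n / tile_n)
    loads_per_tile = tile_m * k + k * tile_n
    return tiles, tiles * loads_per_tile


# Same 256 x 256 x 256 product, two hypothetical tile shapes.
small = tile_loads(256, 256, 256, 16, 16)
large = tile_loads(256, 256, 256, 64, 64)
print("16x16 tiles:", small)
print("64x64 tiles:", large)
```

Under this model, the 16 x 16 tiling loads four times as many elements as the 64 x 64 tiling for the same product, which illustrates why a tile shape that is too small for the matrix adds memory traffic.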



Examples


Embodiment 1

[0049] As shown in Figure 1, a flow chart of a GEMM-based deep neural network acceleration method, the method includes the following steps: judging whether the input matrix multiplication is a regular matrix multiplication or an irregular matrix multiplication; if it is a regular matrix multiplication, traversing the pre-established regular matrix multiplication sharding strategies and selecting the optimal sharding strategy based on Kernel Occupancy; if it is an irregular matrix multiplication, generating sharding strategies according to the matrix dimensions and the preset irregular matrix multiplication sharding strategy, traversing the generated sharding strategies, and selecting the optimal sharding strategy based on Kernel Occupancy; sharding the matrix according to the selected optimal sharding strategy, and computing all matrix shards to obtain the operation result. The working flow chart of the GEMM computing accelerator is shown in Figure 2...
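
The selection flow in [0049] can be sketched roughly as follows. This is an illustration under stated assumptions, not the patent's implementation: the excerpt does not say how a multiplication is classified as regular, how irregular strategies are generated, or how Kernel Occupancy is computed. The regularity test, the preset tile list, the strategy generator, the compute unit count `NUM_CU`, and the occupancy proxy are therefore all hypothetical stand-ins.

```python
import math

# Hypothetical preset tile shapes (tile_m, tile_n) for regular products.
REGULAR_STRATEGIES = [(128, 128), (64, 64), (32, 32)]
NUM_CU = 60  # assumed number of compute units, for illustration only


def is_regular(m, n, k, ratio=4):
    """Assumed criterion: no dimension is more than `ratio` times another."""
    dims = (m, n, k)
    return max(dims) <= ratio * min(dims)


def generate_irregular_strategies(m, n):
    """Assumed generator: power-of-two tiles clipped to the matrix dimensions."""
    sizes = [16, 32, 64, 128]
    return [(min(tm, m), min(tn, n)) for tm in sizes for tn in sizes]


def occupancy(m, n, tile_m, tile_n, num_cu=NUM_CU):
    """Stand-in for Kernel Occupancy: share of CU slots kept busy,
    assuming one tile per CU per wave."""
    tiles = math.ceil(m / tile_m) * math.ceil(n / tile_n)
    waves = math.ceil(tiles / num_cu)
    return tiles / (waves * num_cu)


def select_strategy(m, n, k):
    """Pick the candidate with the highest occupancy; ties go to larger tiles."""
    if is_regular(m, n, k):
        candidates = REGULAR_STRATEGIES
    else:
        candidates = generate_irregular_strategies(m, n)
    return max(candidates, key=lambda t: (occupancy(m, n, *t), t[0] * t[1]))


def shards(m, n, tile_m, tile_n):
    """Yield (row_start, row_end, col_start, col_end) for each output shard."""
    for r in range(0, m, tile_m):
        for c in range(0, n, tile_n):
            yield r, min(r + tile_m, m), c, min(c + tile_n, n)


regular_choice = select_strategy(512, 512, 512)
irregular_choice = select_strategy(2048, 16, 64)
```

With these assumed values, `select_strategy(512, 512, 512)` takes the regular branch and `select_strategy(2048, 16, 64)` takes the irregular branch; the chosen tile shape can then be passed to `shards` to enumerate the blocks to compute. A real occupancy figure would also depend on per-kernel register and shared-memory usage, which this proxy ignores.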

Embodiment 2

[0108] As shown in Figure 3, a structure diagram of a GEMM-based deep neural network acceleration system, this embodiment provides a GEMM-based deep neural network acceleration system. The system includes an input judgment module, a selection strategy module, and a slice operation module. The specific functions of each module are as follows:

[0109] The input judgment module is used to judge whether the input matrix multiplication is a regular matrix multiplication or an irregular matrix multiplication;

[0110] The selection strategy module is used to traverse the preset regular matrix multiplication sharding strategies if the input is a regular matrix multiplication, and select the optimal sharding strategy based on Kernel Occupancy; if the input is an irregular matrix multiplication, it generates sharding strategies according to the matrix dimensions and the preset irregular matrix multiplication sharding strategy, traverses the generated sharding strategies, and selec...


Abstract

The invention belongs to the technical field of GEMM operation acceleration and relates to a GEMM-based deep neural network acceleration method and system. The method comprises the following steps: first, judging whether an input matrix multiplication is a regular matrix multiplication or an irregular matrix multiplication. Different fragmentation modes are adopted for the two types: for a regular matrix multiplication, the pre-formulated fragmentation strategies are traversed to select an optimal strategy; for an irregular matrix multiplication, fragmentation strategies are first generated according to the pre-formulated strategy and the selection is then made among them. In both cases, Kernel Occupancy is used as the basis for selection. The matrix is then fragmented according to the selected strategy, the matrix slices are computed, and the results are combined. By applying two different dynamic fragmentation modes together with Kernel Occupancy to GEMM, the method makes the fragment sizes fit the matrices better, reduces unnecessary memory loads, and improves the occupancy of the CUs.

Description

Technical field

[0001] The invention relates to the field of GEMM operation acceleration, in particular to a GEMM-based deep neural network acceleration method and system.

Background art

[0002] GEMM (General Matrix Multiplication) is a widely used linear algebra operation. Its calculation form is C ← αAB + βC, where A, B, and C are matrices, and α and β are scalars. As a basic building block of high-performance computing, GEMM has a wide range of applications, including traditional fields such as statistics and scientific computing, as well as emerging fields such as deep learning and big data analysis.

[0003] The size and shape of the GEMM operations involved usually differ between fields and applications. For example, scientific computing and big data analysis need to process large-scale matrices, whereas deep neural network applications typically involve small to medium matrix multiplications.

[0004] cuBLAS and rocBLAS are commonly used linear algeb...
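
The GEMM form given in [0002] can be written out directly. The following is a minimal pure-Python reference, intended only to make the roles of α, β, A, B, and C explicit; the function name `gemm` and the sample values are illustrative and do not come from the patent, and a real accelerator would not compute the product this way.

```python
def gemm(alpha, a, b, beta, c):
    """Return alpha * (a @ b) + beta * c for matrices given as lists of rows."""
    m, k, n = len(a), len(b), len(b[0])
    # Shape checks: a is m x k, b is k x n, c is m x n.
    assert all(len(row) == k for row in a), "inner dimensions must match"
    assert len(c) == m and all(len(row) == n for row in c), "c must be m x n"
    return [
        [alpha * sum(a[i][p] * b[p][j] for p in range(k)) + beta * c[i][j]
         for j in range(n)]
        for i in range(m)
    ]


a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
c = [[1, 1], [1, 1]]
result = gemm(2, a, b, 3, c)
print(result)
```

Setting α = 1 and β = 0 reduces this to a plain matrix product, while a nonzero β accumulates into an existing C.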


Application Information

IPC(8): G06N3/04, G06F17/16, G06F7/523
CPC: G06F17/16, G06F7/523, G06N3/045, Y02D10/00
Inventor: 舒惠瑶, 冼允廷, 陆璐
Owner: SOUTH CHINA UNIV OF TECH