GEMM-based deep neural network acceleration method and system

A deep neural network acceleration technology, applied in biological neural network models, neural architectures, climate sustainability, etc. It addresses the problem that overly small matrix shards reduce overall efficiency, and achieves the effect of reducing time.

Pending Publication Date: 2022-07-08

SOUTH CHINA UNIV OF TECH +1


Problems solved by technology

[0007] GEMM operations usually consist of matrix sharding followed by computation on the shards. If regular and irregular matrix multiplications are handled together with the same sharding strategy, an unsuitable strategy can produce many small matrix shards. This introduces unnecessary memory loads and lowers overall efficiency.
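To make the cost of small shards concrete, the following is a minimal sketch under a simplified cost model that is an assumption of this example, not something stated in the patent: each output shard loads its panel of A rows and its panel of B columns once, with no reuse across shards. The function name `global_loads` and the tile sizes are hypothetical.

```python
def global_loads(m, n, k, tile_m, tile_n):
    """Count elements of A and B read when C (m x n) is computed shard by shard.

    Assumed model: every output shard reads its own A panel (rows x k)
    and B panel (k x cols) exactly once, with no caching between shards.
    """
    total = 0
    for i0 in range(0, m, tile_m):
        rows = min(tile_m, m - i0)      # edge shards may be smaller
        for j0 in range(0, n, tile_n):
            cols = min(tile_n, n - j0)
            total += rows * k + k * cols
    return total


small = global_loads(64, 64, 64, 8, 8)    # many small shards
large = global_loads(64, 64, 64, 32, 32)  # fewer, larger shards
print(small, large, small // large)
```

Under this model, shrinking the shard edge by a factor of four multiplies the number of loaded elements by four for the same result, which is the effect [0007] describes.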


Examples


Embodiment 1

[0049] As shown in Figure 1, a flow chart of the GEMM-based deep neural network acceleration method, the method includes the following steps: determine whether the input matrix multiplication is a regular or an irregular matrix multiplication; if it is regular, traverse the pre-established regular matrix multiplication sharding strategies and select the best one based on Kernel Occupancy; if it is irregular, generate sharding strategies according to the matrix dimensions and the preset irregular matrix multiplication sharding strategy, traverse the generated strategies, and select the optimal one based on Kernel Occupancy; shard the matrix according to the selected strategy, and compute all matrix shards to obtain the result. The working flow chart of the GEMM computing accelerator, as in Figure 2...
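The steps in [0049] can be sketched roughly as follows. This is an illustration under stated assumptions, not the patented implementation: the excerpt does not say how "regular" is defined, what the preset strategies are, or how Kernel Occupancy is computed. Here, "regular" is assumed to mean that all dimensions are multiples of 16, the tile lists are invented, and `occupancy_proxy` is a stand-in that only measures how evenly shards fill an assumed 8 compute units; real kernel occupancy also depends on registers, shared memory, and thread-block size.

```python
import math

# Hypothetical preset (tile_m, tile_n) shapes for regular multiplications.
REGULAR_STRATEGIES = [(64, 64), (32, 32), (16, 16)]
NUM_CUS = 8  # assumed number of compute units on the device


def is_regular(m, n, k, base=16):
    """Assumed criterion: all three dimensions are multiples of `base`."""
    return m % base == 0 and n % base == 0 and k % base == 0


def generate_irregular_strategies(m, n, candidates=(8, 16, 32, 64)):
    """Assumed generator: tile shapes no larger than the matrix dimensions."""
    smallest = candidates[0]
    return [(tm, tn)
            for tm in candidates if tm <= max(m, smallest)
            for tn in candidates if tn <= max(n, smallest)]


def occupancy_proxy(m, n, tile, num_cus=NUM_CUS):
    """Stand-in for Kernel Occupancy: share of CU slots kept busy."""
    tiles = math.ceil(m / tile[0]) * math.ceil(n / tile[1])
    waves = math.ceil(tiles / num_cus)
    return tiles / (waves * num_cus)


def select_strategy(m, n, k):
    """Pick the candidate strategy with the highest occupancy proxy."""
    if is_regular(m, n, k):
        strategies = REGULAR_STRATEGIES
    else:
        strategies = generate_irregular_strategies(m, n)
    return max(strategies, key=lambda t: occupancy_proxy(m, n, t))


def sharded_matmul(a, b):
    """Shard C by the selected strategy and compute every shard."""
    m, k, n = len(a), len(b), len(b[0])
    tile_m, tile_n = select_strategy(m, n, k)
    c = [[0] * n for _ in range(m)]
    for i0 in range(0, m, tile_m):
        for j0 in range(0, n, tile_n):
            for i in range(i0, min(i0 + tile_m, m)):
                for j in range(j0, min(j0 + tile_n, n)):
                    c[i][j] = sum(a[i][p] * b[p][j] for p in range(k))
    return c


print(select_strategy(64, 64, 64))
print(sharded_matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]]))
```

The selection step is deliberately separate from the computation step, so a different occupancy estimate or strategy list can be swapped in without touching the sharded loop.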

Embodiment 2

[0108] As shown in Figure 3, a structure diagram of the GEMM-based deep neural network acceleration system, this embodiment provides a GEMM-based deep neural network acceleration system. The system includes an input judgment module, a strategy selection module, and a shard operation module. The specific functions of each module are as follows:

[0109] The input judgment module is used to determine whether the input matrix multiplication is a regular or an irregular matrix multiplication;

[0110] The strategy selection module is used to traverse the preset regular matrix multiplication sharding strategies if the multiplication is regular, and select the best sharding strategy based on Kernel Occupancy; if the multiplication is irregular, it generates sharding strategies according to the matrix dimensions and the preset irregular matrix multiplication sharding strategy, traverses the generated strategies, and selec...


Abstract

The invention belongs to the technical field of GEMM operation acceleration and relates to a GEMM-based deep neural network acceleration method and system. The method comprises the following steps: first, determining whether an input matrix multiplication is regular or irregular; then applying a different sharding mode to each type: for a regular matrix multiplication, traversing the pre-formulated sharding strategies to select an optimal one, and for an irregular matrix multiplication, first generating sharding strategies according to the pre-formulated strategy and then selecting among them; using Kernel Occupancy as the basis for strategy selection; sharding the matrix according to the selected strategy; and computing the matrix shards and combining the results. By applying two different dynamic sharding modes and Kernel Occupancy to GEMM, the method improves how well the shard size fits the matrix, reduces unnecessary memory loads, and improves compute unit (CU) occupancy.

Description

Technical field

[0001] The invention relates to the field of GEMM operation acceleration, in particular to a GEMM-based deep neural network acceleration method and system.

Background

[0002] GEMM (General Matrix Multiplication) is a widely used linear algebra operation. Its form is C ← αAB + βC, where A, B, and C are matrices and α and β are scalars. As a basic building block of high-performance computing, GEMM has a wide range of applications, including traditional fields such as statistics and scientific computing, as well as emerging fields such as deep learning and big data analysis.

[0003] The size and shape of the GEMM operations involved usually differ between fields and applications. For example, scientific computing and big data analysis need to process large-scale matrices, whereas deep neural network applications usually involve small to medium matrix multiplications.

[0004] cuBLAS and rocBLAS are commonly used linear algeb...
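For reference, the GEMM form described in [0002], C ← αAB + βC, can be written out directly. The following is a minimal, unoptimized sketch on nested Python lists; the function name `gemm` and its argument order are choices made for this example and do not mirror any particular library's interface.

```python
def gemm(alpha, a, b, beta, c):
    """Return alpha * (A @ B) + beta * C for matrices given as nested lists.

    A is m x k, B is k x n, C is m x n; alpha and beta are scalars.
    """
    m, k, n = len(a), len(b), len(b[0])
    return [
        [alpha * sum(a[i][p] * b[p][j] for p in range(k)) + beta * c[i][j]
         for j in range(n)]
        for i in range(m)
    ]


result = gemm(2, [[1, 2], [3, 4]], [[5, 6], [7, 8]], 1, [[1, 1], [1, 1]])
print(result)
```

Setting alpha to 1 and beta to 0 reduces this to a plain matrix product, which is the case most relevant to the neural network layers discussed here.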


Application Information


IPC(8): G06N3/04, G06F17/16, G06F7/523

CPC: G06F17/16, G06F7/523, G06N3/045, Y02D10/00

Inventor: 舒惠瑶, 冼允廷, 陆璐

Owner: SOUTH CHINA UNIV OF TECH
