L1 cache sharing method for GPU

An L1 cache sharing method, applied in the GPU field, which can solve problems such as high memory access overhead, unbalanced resource utilization, and the inability to fully utilize L1 cache resources.

Active Publication Date: 2021-09-10
NAT INNOVATION INST OF DEFENSE TECH PLA ACAD OF MILITARY SCI


Problems solved by technology

[0004] However, although a spatially multitasked GPU that runs computation-intensive and memory-intensive programs at the same time can effectively improve the overall resource utilization of the system, running different programs on different SMs causes resource utilization on the SMs, especially of the L1 cache (Level 1 cache, L1 Cache), to become unbalanced, which limits further improvement of multi-tasking GPU performance.
Specifically, for SMs running memory-intensive programs,




Embodiment Construction

[0033] In order to make the purpose, technical solution, and advantages of the present invention clearer, the technical solution of the present invention is described clearly and completely below in conjunction with specific embodiments of the present invention and the corresponding drawings. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by persons of ordinary skill in the art from these embodiments without creative effort fall within the protection scope of the present invention.

[0034] The technical solution provided by an embodiment of the present invention will be described in detail below with reference to the accompanying drawings.

[0035] Referring to Figure 1, an embodiment of the present invention provides a method for sharing the L1 cache of a GPU. The method is used for a spatially multitasked GPU that simultaneously runs a computati...



Abstract

The invention discloses an L1 cache sharing method for a GPU (Graphics Processing Unit), comprising the following steps:
S11: judge whether the local memory access request queue is empty; if so, execute S21; if not, execute S12.
S12: take out the request and access the L1 cache.
S13: judge whether the request hits; if so, return the data; if not, execute S14.
S14: judge whether the program is memory-intensive; if so, send the request to another SM and execute S15; if not, send the request to the L2 cache.
S15: judge whether a cached data block needs to be replaced; if so, send a data block replacement request to the other SM.
S21: judge whether the remote memory access request queue is empty; if not, execute S22.
S22: take out the request and access the L1 cache.
S23: judge whether the request hits; if so, return the data and execute S24; if not, send the request to the L2 cache and execute S24.
S24: judge whether the remote data queue is empty; if not, store the data block to be replaced into the L1 cache.
With the invention, memory-intensive programs can make use of the L1 cache on SMs that run computation-intensive programs.
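The steps above can be sketched in simplified form. This is only an illustrative model of the S11–S24 flow as described in the abstract, not the patent's implementation: the class, queue names, and the single-victim eviction policy are assumptions, and a real GPU would handle these requests in hardware per cache line.

```python
from collections import deque

class SM:
    """Simplified model of one streaming multiprocessor and its L1 cache."""

    def __init__(self, name, memory_intensive, l1_capacity=4):
        self.name = name
        self.memory_intensive = memory_intensive
        self.l1 = {}                     # L1 cache: address -> data block
        self.l1_capacity = l1_capacity
        self.local_requests = deque()    # S11: this SM's own access requests
        self.remote_requests = deque()   # S21: requests forwarded by a peer SM
        self.remote_data = deque()       # S24: replaced blocks sent by a peer SM

    def handle_local(self, peer, l2_queue):
        # S11/S12: take out a local request, if any, and probe the L1 cache.
        if not self.local_requests:
            return None
        addr = self.local_requests.popleft()
        if addr in self.l1:              # S13: hit -> return the data
            return self.l1[addr]
        if self.memory_intensive:        # S14: miss on a memory-intensive SM
            peer.remote_requests.append(addr)        # forward to the peer's L1
            if len(self.l1) >= self.l1_capacity:     # S15: replacement needed?
                victim = next(iter(self.l1))         # simplistic victim choice
                peer.remote_data.append((victim, self.l1.pop(victim)))
        else:                            # compute-intensive SM: fall to L2
            l2_queue.append(addr)
        return None

    def handle_remote(self, l2_queue):
        # S21/S22/S23: serve one forwarded request against the local L1.
        if self.remote_requests:
            addr = self.remote_requests.popleft()
            if addr not in self.l1:
                l2_queue.append(addr)    # remote miss falls through to L2
        # S24: absorb replaced blocks from the peer into this L1.
        while self.remote_data:
            addr, data = self.remote_data.popleft()
            self.l1[addr] = data
```

Under this sketch, a miss on a memory-intensive SM is diverted to a peer SM's underused L1 rather than going straight to L2, which is the load-balancing effect the method targets.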

Description

Technical field

[0001] The invention relates to the technical field of GPUs, and in particular to an L1 cache sharing method for GPUs.

Background technique

[0002] A Graphics Processing Unit (GPU) is a microprocessor used for image- and graphics-related computation. GPUs are widely used in cloud computing platforms and data centers because of their powerful computing capability, providing users with computing services. Compared with a single-task GPU that runs only one task at a time, a multi-task GPU can run multiple tasks at the same time, which effectively improves resource utilization. Specifically, a multi-task GPU can simultaneously run a computation-intensive program and a memory-intensive program on one GPU, so that the computation resources and storage resources of the GPU are fully utilized at the same time.

[0003] At present, the spatial multitasking method is mainly used to enable a GPU to run multiple tasks at the same time. Specifically, in the sp...
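The spatial multitasking described above amounts to statically partitioning the GPU's SMs between the two co-running programs. A minimal sketch of that idea, assuming a hypothetical split ratio (the patent text does not specify one):

```python
def partition_sms(num_sms, compute_share):
    """Assign each SM either the compute- or the memory-intensive program.

    compute_share is the fraction of SMs given to the compute-intensive
    kernel; the remainder run the memory-intensive kernel. The 50/50
    default below is purely illustrative.
    """
    n_compute = int(num_sms * compute_share)
    return ["compute"] * n_compute + ["memory"] * (num_sms - n_compute)

# Example: an 8-SM GPU split evenly between the two programs.
layout = partition_sms(num_sms=8, compute_share=0.5)
```

The imbalance the patent addresses follows directly from such a split: the "compute" SMs leave their L1 caches largely idle while the "memory" SMs overload theirs.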


Application Information

IPC: G06F12/0811; G06T1/20
CPC: G06T1/20; G06F12/0811; Y02D10/00
Inventors: 赵夏何益百张拥军张光达陈任之隋京高王承智王璐王君展
Owner: NAT INNOVATION INST OF DEFENSE TECH PLA ACAD OF MILITARY SCI