Method for distributing tasks by general purpose graphic processing unit in multi-task concurrent execution manner

Applying graphics processor and multi-tasking technology in the field of high-performance computing, the invention addresses three problems: low utilization of computing resources, the unaddressed scheduling of thread blocks from different kernel functions within a streaming multiprocessor, and the absence of an optimization scheme for level-1 data cache pollution.

Active Publication Date: 2016-06-08
PEKING UNIV

AI Technical Summary

Problems solved by technology

This method is a coarse-grained concurrency technique. Although it can balance the utilization of streaming multiprocessors and off-chip memory resources, the low utilization of computing resources inside each streaming multiprocessor remains a serious problem.
[0007] In 2014, Lee et al. from the Korea Advanced Institute of Science and Technology (KAIST) proposed a hybrid concurrent kernel function execution scheme (published in: 2014 IEEE 20th International Symposium on High Performance Computer Architecture (HPCA), pages 260-271).
However, this scheme does not specifically address the scheduling of thread blocks from different kernel functions within a streaming multiprocessor, nor does it propose an optimization scheme for level-1 data cache pollution.


Examples


Detailed Description of the Embodiments

[0048] The present invention is further described below through embodiments, with reference to the accompanying drawings, without limiting the scope of the invention in any way.

[0049] Figure 1 is a schematic diagram of dispatching thread blocks of different kernel functions to the same streaming multiprocessor using the thread block distribution engine method of the present invention. As shown in Figure 1, the rectangles in (a), from top to bottom, are different kernel functions, each containing multiple thread blocks: kernel function A and kernel function B. White squares represent thread blocks of kernel function A, and black squares represent thread blocks of kernel function B. Part (b) shows thread blocks of different kernel functions residing in the same streaming multiprocessor. The rectangles on the left side of the thread block distribution engine in the figure represent kernel function A and kernel function B, from top to bottom, and the white squares represent the t...


Abstract

The invention discloses a method for distributing tasks on a general-purpose graphics processing unit under multi-task concurrent execution. The method first classifies kernel functions using a thread block distribution engine method, then performs classified counting on the kernel functions to obtain the number of thread blocks of each kernel function to be distributed to a streaming multiprocessor, and finally distributes the corresponding numbers of thread blocks of the different kernel functions to the same streaming multiprocessor. This improves the resource utilization of each streaming multiprocessor in the general-purpose graphics processing unit and enhances system performance and energy efficiency. A level-1 data cache bypass method can additionally be applied: it dynamically determines which kernel function's thread blocks are to be bypassed and then performs bypassing according to the number of bypassed thread blocks of that kernel function, relieving pressure on the level-1 data cache and further improving performance.
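The final distribution step described in the abstract can be illustrated with a small host-side simulation. This is only a sketch under stated assumptions, not the patented engine: the `Kernel` and `StreamingMultiprocessor` classes, the `distribute` function, and the `quota_per_sm` values are all hypothetical names introduced here, and the per-kernel quotas are taken as given inputs because the classification and counting procedure that produces them is not shown in this excerpt.

```python
from dataclasses import dataclass, field


@dataclass
class Kernel:
    # Hypothetical model of a kernel function awaiting dispatch.
    name: str
    pending_blocks: int  # thread blocks not yet dispatched
    quota_per_sm: int    # assumed input: blocks of this kernel allowed per SM


@dataclass
class StreamingMultiprocessor:
    # Hypothetical model of one SM and the blocks resident on it.
    sm_id: int
    resident: dict = field(default_factory=dict)  # kernel name -> block count


def distribute(kernels, sms):
    """Place up to quota_per_sm blocks of every kernel on every SM,
    so that thread blocks of different kernels share the same SM."""
    for sm in sms:
        for k in kernels:
            n = min(k.quota_per_sm, k.pending_blocks)
            if n > 0:
                sm.resident[k.name] = sm.resident.get(k.name, 0) + n
                k.pending_blocks -= n
    return sms


# Example: kernel A gets 3 slots per SM, kernel B gets 1 (illustrative values).
a = Kernel("A", pending_blocks=8, quota_per_sm=3)
b = Kernel("B", pending_blocks=4, quota_per_sm=1)
sms = distribute([a, b], [StreamingMultiprocessor(0), StreamingMultiprocessor(1)])
for sm in sms:
    print(sm.sm_id, sm.resident)
```

With these illustrative inputs each simulated streaming multiprocessor ends up holding three blocks of A and one block of B, and two blocks of each kernel remain pending for a later round. A real engine would also have to account for register, shared memory, and thread limits, which this sketch ignores.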

Description

Technical field

[0001] The invention belongs to the technical field of high-performance computing and relates to multi-task concurrent execution methods in high-performance computing, in particular to a task distribution method for multi-task concurrent execution on a general-purpose graphics processing unit (GPGPU).

Background

[0002] A general-purpose graphics processing unit (GPGPU) is a processor that exploits the many-core structure, multi-threading, and high memory access bandwidth of graphics processors to handle high-performance computing tasks such as biological computing, image processing, and physical simulation. Modern computing tasks urgently demand high performance and high throughput, so general-purpose graphics processing units are widely used in computing and play an increasingly important role. Moreover, with the development of cloud computing and the popularization of computing terminals, more and mor...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F9/38, G06T1/00
CPC: G06F9/3836, G06T1/00
Inventors: Liang Yun (梁云), Li Xiuhong (李秀红)
Owner: PEKING UNIV