GPU parallel compute resource allocation method and device

A parallel computing and resource allocation technology, applied in the computer field, that addresses poor acceleration, low parallel efficiency, and a low degree of parallelism, and thereby improves computing efficiency.

Active Publication Date: 2018-12-25
SICHUAN ENERGY INTERNET RES INST TSINGHUA UNIV +1

AI Technical Summary

Problems solved by technology

[0003] The inventors found that, for massive computing tasks, Kernel stream parallelism provides only coarse-grained, task-level parallelism. The parallelism available between different Kernels is limited, so the actual acceleration is poor.
In particular, for computing tasks whose processes are highly serialized (such as sparse matrix factorization, sparse triangular equation solving, and other sparse algorithms based on directed graphs), the parallelism inside each Kernel is very low and the parallelism between Kernels remains limited. A large share of computing resources therefore sits idle during actual computation, and the actual parallel efficiency is very low.
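
To make the serialization problem concrete, the following is a minimal Python sketch of level scheduling for a sparse lower-triangular solve. It is an illustration only, not the method claimed in the patent: the function name `level_schedule` and the 6x6 dependency pattern are hypothetical. Rows in the same level do not depend on each other and could run in parallel; the levels themselves must run one after another.

```python
from collections import defaultdict


def level_schedule(n, lower_entries):
    """Group the rows of a sparse lower-triangular system into dependency levels.

    lower_entries holds (i, j) pairs with j < i, meaning that solving row i
    needs the already-solved unknown x[j]. Rows within one returned level are
    mutually independent; the levels must be processed in order.
    """
    deps = defaultdict(list)
    for i, j in lower_entries:
        if j >= i:
            raise ValueError("expected strictly lower-triangular entries")
        deps[i].append(j)

    level = [0] * n
    # Visiting rows in increasing order guarantees every dependency j < i
    # already has its level assigned.
    for i in range(n):
        level[i] = 1 + max((level[j] for j in deps[i]), default=-1)

    levels = [[] for _ in range(max(level) + 1)] if n else []
    for i in range(n):
        levels[level[i]].append(i)
    return levels


# Hypothetical 6x6 sparsity pattern (off-diagonal entries only).
entries = [(1, 0), (2, 1), (3, 0), (4, 3), (5, 2), (5, 4)]
levels = level_schedule(6, entries)
widths = [len(rows) for rows in levels]
print(levels)  # [[0], [1, 3], [2, 4], [5]]
print(widths)  # [1, 2, 2, 1]
```

In this small pattern no level contains more than two rows, so a device with thousands of threads would be almost entirely idle at every step, which is the situation paragraph [0003] describes.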



Examples


Embodiment Construction

[0049] To make the purpose, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some of the embodiments of the present invention, not all of them. The components of the embodiments of the invention generally described and illustrated in the figures herein may be arranged and designed in a variety of different configurations.

[0050] Accordingly, the following detailed description of the embodiments of the invention provided in the accompanying drawings is not intended to limit the scope of the claimed invention, but merely represents selected embodiments of the invention. Based on the embodiments of the present invention, all other embodiments obtained by persons of ordinary skill in the a...



Abstract

The invention relates to the field of computer technology and provides a GPU parallel compute resource allocation method and apparatus. The method includes: obtaining a computing task determined by a computing process; processing the computing task under initial parameters with a two-layer parallel computing model to obtain a layered directed acyclic graph model; when the layered directed acyclic graph model is used with the two-layer parallel computing model to compute the task, labeling the thread blocks according to the number of preset parameters and the number of calculation elements corresponding to each preset parameter; and allocating the calculation elements corresponding to each preset parameter according to the label of each thread block, so that each thread block calculates the calculation elements allocated to it. The method can effectively improve parallel computing efficiency.
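
The labeling and allocation step can be sketched in Python as follows. This is one possible reading of the abstract under stated assumptions, not the patented implementation: the function `label_thread_blocks`, the fixed `elements_per_block` capacity, and the label layout `(parameter index, local block index)` are all hypothetical choices made for illustration.

```python
def label_thread_blocks(elements_per_param, elements_per_block):
    """Label thread blocks and assign each one a slice of calculation elements.

    elements_per_param[p] is the number of calculation elements that belong
    to preset parameter p. Each block is labeled (p, local_id) and receives
    at most elements_per_block elements of parameter p.
    """
    if elements_per_block <= 0:
        raise ValueError("elements_per_block must be positive")

    blocks = []
    for param_id, count in enumerate(elements_per_param):
        n_blocks = -(-count // elements_per_block)  # ceiling division
        for local_id in range(n_blocks):
            start = local_id * elements_per_block
            stop = min(start + elements_per_block, count)
            blocks.append({
                "block": len(blocks),            # global block index
                "label": (param_id, local_id),   # assumed label layout
                "elements": list(range(start, stop)),
            })
    return blocks


# Two hypothetical preset parameters with 5 and 3 calculation elements.
blocks = label_thread_blocks([5, 3], elements_per_block=2)
for b in blocks:
    print(b["block"], b["label"], b["elements"])
```

Because every block carries its parameter index in the label, the work for all preset parameters can be issued together, and a block only needs its own label to find the elements it is responsible for.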

Description

Technical field

[0001] The present invention relates to the field of computer technology, in particular to a GPU parallel computing resource allocation method and device.

Background

[0002] With the rapid development of computer technology, traditional CPU process technology has gradually approached its physical limits, and the growth of computing power has fallen far behind "Moore's Law". Improvements in computing power increasingly rely on new parallel computing technologies such as multi-core and many-core processors. In recent years the GPU, as an advanced many-core heterogeneous parallel computing device, has been widely used to accelerate large-scale, compute-intensive tasks such as climate simulation, protein folding, and deep learning. Taking a GPU that supports the NVIDIA CUDA architecture as an example, when computing tasks are processed, they are organized into one or more Kernels (GPU kernel functions), each containing a large number of threads....


Application Information

IPC(8): G06F9/50, G06F9/48
CPC: G06F9/4843, G06F9/5027, G06F2209/5018
Inventor: 宋炎侃, 陈颖, 于智同, 黄少伟, 沈沉
Owner SICHUAN ENERGY INTERNET RES INST TSINGHUA UNIV