GPU cluster environment-oriented method for avoiding GPU resource contention

A method for avoiding GPU resource contention in GPU clusters. It addresses problems such as unreasonable GPU assignment, low GPU utilization, and code that cannot execute in parallel, with the effects of avoiding resource contention, improving system throughput, and optimizing execution.

Active Publication Date: 2018-04-20
CHINA INFORMATION CONSULTING & DESIGNING INST CO LTD +1

AI Technical Summary

Problems solved by technology

During programming, some code cannot or should not run on the GPU: code that cannot be executed in parallel, code whose data migration cost exceeds the benefit of parallel computation, I/O operations, and so on.
Programmers therefore have to specify in advance how code execution is split between the CPU and the GPU, which can leave the GPU idle and results in low GPU utilization.
[0005] 2) The GPU assignment method is unreasonable.
Solving these problems is difficult and challenging.

Examples

Embodiment

[0087] With reference to Figure 1 and Figure 2, a GPU cluster consists of two types of nodes: a GPU head node and GPU computing nodes. There is exactly one GPU head node; all other nodes are GPU computing nodes. The nodes are connected via Ethernet or InfiniBand. Every node in the GPU cluster is configured with the same number and model of NVIDIA Kepler GPUs. Each GPU computing node has a GPU runtime environment of CUDA 7.0 or later installed.

[0088] Building on the existing functions of the GPU cluster platform, three modules are added to the platform: an application GPU behavior feature extraction module, an application scheduling module, and a multi-application fine-grained concurrency module. The specific implementation steps are described with an example:

[0089] In the application GPU behavior feature extraction module, the extracted information includes: GPU memory allocation operations (cudaMalloc, etc.), data copy operations between the host and GPU devi...
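As a rough illustration of what such an extraction step might record, the sketch below folds a list of intercepted CUDA runtime calls into a small feature summary. The trace format, the `summarize_trace` function, and the `GpuFeatures` field names are hypothetical and are not taken from the patent; only `cudaMalloc`, `cudaFree`, and `cudaMemcpy` are real CUDA runtime calls, and the patent's full feature list is truncated above.

```python
from dataclasses import dataclass


@dataclass
class GpuFeatures:
    """Hypothetical per-application summary; field names are illustrative."""
    peak_memory_bytes: int = 0      # largest amount of device memory held at once
    host_to_device_bytes: int = 0   # total bytes copied host -> device
    device_to_host_bytes: int = 0   # total bytes copied device -> host


def summarize_trace(trace):
    """Fold a list of intercepted call records into a GpuFeatures summary.

    Each record is a dict in an assumed format:
      {"call": "cudaMalloc", "ptr": <id>, "bytes": <int>}
      {"call": "cudaFree",   "ptr": <id>}
      {"call": "cudaMemcpy", "bytes": <int>, "kind": "H2D" or "D2H"}
    """
    features = GpuFeatures()
    live = {}     # pointer id -> size of an allocation that is still held
    in_use = 0    # bytes currently allocated on the device
    for rec in trace:
        call = rec["call"]
        if call == "cudaMalloc":
            live[rec["ptr"]] = rec["bytes"]
            in_use += rec["bytes"]
            features.peak_memory_bytes = max(features.peak_memory_bytes, in_use)
        elif call == "cudaFree":
            # Unknown pointers are ignored rather than treated as errors.
            in_use -= live.pop(rec["ptr"], 0)
        elif call == "cudaMemcpy":
            if rec["kind"] == "H2D":
                features.host_to_device_bytes += rec["bytes"]
            elif rec["kind"] == "D2H":
                features.device_to_host_bytes += rec["bytes"]
    return features
```

Tracking the peak rather than the total allocated memory is a design choice of this sketch: a scheduler deciding whether an application fits on a GPU would most plausibly care about the largest footprint held at any one time.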

Abstract

The invention discloses a GPU cluster environment-oriented method for avoiding GPU resource contention. The method comprises constructing a plug-in that supports multi-application fine-grained concurrent execution, extracting application behavior features, and scheduling application tasks. To address the GPU resource contention that may arise when multiple applications run on the same NVIDIA GPU node, a platform that supports multi-application fine-grained concurrent execution is built, so that multiple applications can execute concurrently on the same GPU node as much as possible. The GPU behavior features of each application, including GPU usage patterns and GPU resource requirement information, are extracted. According to the GPU behavior features of each application and the resource usage status of each GPU node in the current GPU cluster, each application is dispatched to a suitable GPU node, thereby minimizing resource contention among multiple independent applications on the same GPU node.
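The dispatch step described in the abstract can be sketched as follows. The abstract does not state the placement policy, so this example assumes a simple one: send each application to the node with the most free GPU memory that can still hold its request. The `pick_node` and `dispatch` functions, the node names, and the use of memory as the only resource are all hypothetical.

```python
def pick_node(app_memory_bytes, free_memory_by_node):
    """Return the node with the most free GPU memory that still fits the
    request, or None if no node can hold it (an assumed policy)."""
    candidates = {
        node: free
        for node, free in free_memory_by_node.items()
        if free >= app_memory_bytes
    }
    if not candidates:
        return None
    # Sort by most free memory first; ties are broken by node name
    # so the choice is deterministic.
    return min(candidates, key=lambda node: (-candidates[node], node))


def dispatch(apps, free_memory_by_node):
    """Place each (app_name, memory_bytes) pair in order, updating free memory."""
    free = dict(free_memory_by_node)   # work on a copy of the caller's view
    placement = {}
    for name, need in apps:
        node = pick_node(need, free)
        placement[name] = node
        if node is not None:
            free[node] -= need
    return placement


if __name__ == "__main__":
    cluster = {"node1": 4, "node2": 6}          # free memory per node, arbitrary units
    jobs = [("a", 3), ("b", 3), ("c", 5)]       # (application, memory requirement)
    print(dispatch(jobs, cluster))
```

An application that fits nowhere is mapped to `None` here; a real scheduler would more likely queue it until resources are released.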

Description

Technical field

[0001] The invention relates to the field of GPU high-performance computing, and in particular to a method for avoiding GPU resource contention in a GPU cluster environment.

Background

[0002] GPU-accelerated computing refers to using graphics processing units (GPUs) together with CPUs to accelerate scientific, analytical, engineering, consumer, and enterprise applications. It can deliver large performance gains by offloading the computationally intensive parts of an application to the GPU while the CPU continues to run the rest of the program code. From the user's point of view, the application runs significantly faster. Using GPUs to accelerate applications has become increasingly popular. For example, in the field of scientific computing, researchers use GPUs to accelerate Monte Carlo simulation experiments; GPUs are used to accelerate numerical calculations; in th...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F9/50
CPC: G06F9/5027, G06F9/5066
Inventor: 东方, 师晓敏, 罗军舟, 查付政, 王睿, 孙斌
Owner: CHINA INFORMATION CONSULTING & DESIGNING INST CO LTD