A resource management method and system for GPU clusters

A GPU cluster resource management technology, applied in the field of high-performance computing, that addresses the heavy load on management nodes and the waste of GPU resources, and achieves the effect of reducing load.

Active Publication Date: 2016-05-25
HUAZHONG UNIV OF SCI & TECH

AI Technical Summary

Problems solved by technology

[0005] In view of the above defects or improvement needs of the prior art, the present invention provides a GPU cluster-oriented resource management method and system. Its purpose is to solve two technical problems of the prior art: the heavy load on management nodes and the waste of GPU resources.



Embodiment Construction

[0047] In order to make the object, technical solution and advantages of the present invention clearer, the present invention will be further described in detail below in conjunction with the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain the present invention, not to limit the present invention. In addition, the technical features involved in the various embodiments of the present invention described below can be combined with each other as long as they do not constitute a conflict with each other.

[0048] As shown in Figure 2, the resource management method for GPU clusters of the present invention includes the following steps:

[0049] (1) The main management node establishes the resource information table and the task information table. Specifically, the resource information table records the node number, CPU count, idle CPU count, GPU count, idle GPU count, etc. of each node in t...
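A minimal sketch of how the two tables from step (1) might be represented. Only the resource table columns (node number, CPU count, idle CPU count, GPU count, idle GPU count) come from the text above; the `TaskRecord` fields, the `build_tables` helper, and the sample node values are hypothetical and shown for illustration only.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class NodeResources:
    """One row of the resource information table (columns named in step (1))."""
    node_number: int
    cpu_count: int
    idle_cpu_count: int
    gpu_count: int
    idle_gpu_count: int


@dataclass
class TaskRecord:
    """One row of the task information table. All fields here are assumptions."""
    task_number: int
    task_type: str  # "CPU" or "GPU"
    # node_number -> (CPUs held, GPUs held); hypothetical bookkeeping
    holdings: Dict[int, Tuple[int, int]] = field(default_factory=dict)


def build_tables(nodes):
    """Create both tables from (node_number, cpus, gpus) tuples; everything starts idle."""
    resource_table = {
        node_number: NodeResources(node_number, cpus, cpus, gpus, gpus)
        for node_number, cpus, gpus in nodes
    }
    task_table: Dict[int, TaskRecord] = {}
    return resource_table, task_table


# Example cluster with two nodes (made-up sizes).
resource_table, task_table = build_tables([(0, 16, 2), (1, 16, 4)])
```

Keeping idle counts next to the totals lets the main management node answer "which nodes have free GPUs" with a single pass over the table.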


Abstract

The invention discloses a resource management method for GPU (Graphics Processing Unit) clusters, comprising the following steps: a main management node establishes two tables (a resource information table and a task information table); the main management node receives a new task; the main management node determines whether the task is a CPU (Central Processing Unit) task or a GPU task; the main management node searches for free resources that meet the task's requirements. If the task is a CPU task, a secondary management node preprocesses the task's data and distributes pieces of the data to all nodes it manages for computation; after the computation, the main management node reclaims, by task number, the CPU resources of all nodes managed by the secondary management node. If the task is a GPU task, the main management node reclaims, by task number, the GPU resources of all nodes managed by the secondary management node when GPU computation is detected to have finished; meanwhile, the CPUs of those nodes are used to post-process the result until post-processing is finished. The invention thus treats CPU resources and GPU resources differently and, through task detection, can quickly reclaim free GPU resources.
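The flow in the abstract can be sketched roughly as follows. This is an illustrative simplification, not the patented implementation: the `MainManager` class, its method names, and the per-node request parameters are hypothetical, the secondary management node is omitted, and reclaiming CPUs once post-processing ends is an assumption. The point it shows is that a GPU task's GPUs are returned to the pool by task number as soon as GPU computation finishes, while its CPUs stay allocated for post-processing.

```python
from dataclasses import dataclass


@dataclass
class Node:
    idle_cpus: int
    idle_gpus: int


class MainManager:
    def __init__(self, nodes):
        self.nodes = nodes   # node number -> Node (stands in for the resource table)
        self.tasks = {}      # task number -> allocation record (stands in for the task table)

    def submit(self, task_number, nodes_needed, cpus_per_node, gpus_per_node=0):
        """Classify the task, find nodes with enough idle resources, allocate them."""
        task_type = "GPU" if gpus_per_node > 0 else "CPU"
        candidates = [
            n for n, node in self.nodes.items()
            if node.idle_cpus >= cpus_per_node and node.idle_gpus >= gpus_per_node
        ]
        if len(candidates) < nodes_needed:
            return None  # not enough free resources right now
        chosen = candidates[:nodes_needed]
        for n in chosen:
            self.nodes[n].idle_cpus -= cpus_per_node
            self.nodes[n].idle_gpus -= gpus_per_node
        self.tasks[task_number] = {
            "type": task_type,
            "cpus": {n: cpus_per_node for n in chosen},
            "gpus": {n: gpus_per_node for n in chosen},
        }
        return chosen

    def reclaim_gpus(self, task_number):
        """Called when GPU computation is detected to be finished."""
        held = self.tasks[task_number]["gpus"]
        for n, count in held.items():
            self.nodes[n].idle_gpus += count
        held.clear()

    def reclaim_cpus(self, task_number):
        """Called after CPU computation (CPU task) or post-processing (GPU task)."""
        record = self.tasks.pop(task_number)
        for n, count in record["cpus"].items():
            self.nodes[n].idle_cpus += count


mgr = MainManager({0: Node(8, 2), 1: Node(8, 2)})
mgr.submit(task_number=1, nodes_needed=2, cpus_per_node=4, gpus_per_node=2)
mgr.reclaim_gpus(1)   # GPU phase done; CPUs are still post-processing
gpus_after_gpu_phase = [node.idle_gpus for node in mgr.nodes.values()]
cpus_during_post = [node.idle_cpus for node in mgr.nodes.values()]
mgr.reclaim_cpus(1)   # post-processing done
cpus_final = [node.idle_cpus for node in mgr.nodes.values()]
```

Between the two reclaim calls, another GPU task could already be placed on the freed GPUs, which is the source of the faster GPU turnaround the abstract describes.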

Description

Technical field

[0001] The invention belongs to the field of high-performance computing and, more specifically, relates to a GPU cluster-oriented resource management method and system.

Background art

[0002] In recent years, with the continuous development of high-performance computing, GPU clusters have received more and more attention. The high performance of a GPU cluster is mainly attributed to its massively parallel multi-core structure, its high throughput in multi-threaded floating-point arithmetic, and its use of large on-chip caches to significantly reduce the time spent moving large amounts of data. GPU clusters not only provide a huge leap in speed, but also significantly reduce space, power, and cooling requirements.

[0003] However, the current management of GPU clusters mainly follows the management mode of CPU clusters: it schedules CPU cores and adopts a single, centralized, unified management mode: the entire cluster has only ...


Application Information

Patent Type & Authority: Patents (China)
IPC: IPC(8): G06F9/50
CPC: G06F9/5011
Inventor: 金海, 郑然, 冯晓文, 朱磊
Owner: HUAZHONG UNIV OF SCI & TECH