A hybrid load scheduling optimization method for deep learning of heterogeneous GPU clusters

The invention relates to GPU cluster and deep learning technology, applied in the field of GPU clusters. It addresses problems such as poor performance, failure to consider the heterogeneous characteristics of nodes, and inability to exploit the performance advantages of heterogeneous computing nodes, and achieves the effect of improving execution efficiency.

Active Publication Date: 2022-07-22
CHINA UNIV OF MINING & TECH (BEIJING)
11 Cites, 0 Cited by

AI Technical Summary

Problems solved by technology

The existing allocation method does not consider the heterogeneity between nodes and cannot exploit the performance advantages of heterogeneous computing nodes. In a heterogeneous environment, it performs poorly when processing mixed deep learning workloads.



Embodiment Construction

[0017] Referring to Figure 1, a scheme for optimizing the execution efficiency of mixed deep learning workloads on a heterogeneous GPU cluster includes: statically adding node type labels to the multiple lower-layer computing nodes of the heterogeneous GPU cluster. The GPU cluster is composed of three or more lower-layer computing nodes.

[0018] When the cluster has three lower-layer computing nodes, they are equipped with multiple K80 GPUs, multiple P40 GPUs, and multiple V100 GPUs, respectively.
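The static labeling step in [0017] and [0018] can be illustrated with a minimal sketch. The node names, the `node-type` label key, and the `label_nodes` helper are hypothetical and are not taken from the patent; only the three GPU models come from the text.

```python
from collections import Counter

# Hypothetical inventory: node name -> GPU model installed on that node.
# Only the GPU models (K80, P40, V100) come from the patent text.
NODE_GPUS = {
    "node-a": "K80",
    "node-b": "P40",
    "node-c": "V100",
}


def label_nodes(node_gpus):
    """Attach a static node-type label to every node, derived from its GPU model."""
    return {node: {"node-type": gpu.lower()} for node, gpu in node_gpus.items()}


labels = label_nodes(NODE_GPUS)

# Count how many nodes carry each label, so a scheduler can see the cluster mix.
type_counts = Counter(label["node-type"] for label in labels.values())
```

Because the labels are assigned once from a fixed inventory, a scheduler can later select nodes by type without probing the hardware again.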

[0019] Next, the upper-layer applications of the distributed cluster are classified; the classified applications include a VAE task, a DCGAN task, and a ResNet-50 task.

[0020] For the multiple applications served by the upper layer of the distributed cluster, the different types of lower-layer computing nodes are evenly distributed among the applications to ...
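One way to picture the even distribution in [0020] is a round-robin deal of each node type across the applications. This is only a sketch under assumptions: the node names and the `distribute_evenly` helper are hypothetical, and the patent text available here does not say how its scheduling module performs the split.

```python
from collections import defaultdict
from itertools import cycle


def distribute_evenly(nodes_by_type, apps):
    """Deal the nodes of each type to the applications in round-robin order."""
    allocation = {app: defaultdict(list) for app in apps}
    for node_type, nodes in nodes_by_type.items():
        # cycle(apps) restarts for every node type, so each type is split evenly.
        for node, app in zip(nodes, cycle(apps)):
            allocation[app][node_type].append(node)
    return allocation


# Hypothetical cluster: three nodes of each type (names are illustrative only).
nodes_by_type = {
    "k80": ["k80-0", "k80-1", "k80-2"],
    "p40": ["p40-0", "p40-1", "p40-2"],
    "v100": ["v100-0", "v100-1", "v100-2"],
}
apps = ["VAE", "DCGAN", "ResNet-50"]

allocation = distribute_evenly(nodes_by_type, apps)
```

With this starting allocation every application holds the same number of nodes of every type, which gives a uniform baseline for measuring per-type run times.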


Abstract

A hybrid load scheduling optimization method for deep learning on heterogeneous GPU clusters, comprising: statically adding node type labels to the multiple lower-layer computing nodes of a heterogeneous GPU cluster; classifying the upper-layer applications of the distributed cluster; evenly distributing, through the scheduling module, the different types of lower-layer computing nodes among the multiple applications; calculating the time that each type of lower-layer computing node requires to run each application; finding, from these times, the performance differences of the applications across heterogeneous GPUs; and trading on those performance differences using the second-price trading method. According to the invention, in a heterogeneous GPU cluster the scheduling optimization model handles mixed deep learning workloads better than a traditional distributed processing framework; especially when the cluster environment is complex and heterogeneity is severe, it can fully utilize cluster resources and significantly improve the execution efficiency of the system.
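The abstract does not spell out how bids or payments are defined, so the following is only a generic second-price (Vickrey) auction sketch, not the patented procedure. It assumes that each application bids its speedup on a faster GPU type relative to a slower one; the run times, the bid definition, and the `second_price_auction` helper are all hypothetical.

```python
def second_price_auction(bids):
    """Return (winner, price): the highest bidder wins and pays the second-highest bid."""
    if len(bids) < 2:
        raise ValueError("a second-price auction needs at least two bidders")
    ranked = sorted(bids.items(), key=lambda item: item[1], reverse=True)
    winner = ranked[0][0]
    price = ranked[1][1]
    return winner, price


# Hypothetical run times in seconds per application and node type.
# These numbers are made up for illustration; they are not measurements.
run_time = {
    "VAE": {"k80": 100.0, "v100": 80.0},
    "DCGAN": {"k80": 200.0, "v100": 100.0},
    "ResNet-50": {"k80": 400.0, "v100": 100.0},
}

# Assumed bid: how much faster the application runs on a V100 than on a K80.
bids = {app: times["k80"] / times["v100"] for app, times in run_time.items()}

winner, price = second_price_auction(bids)
```

In this made-up example the application that benefits most from the faster GPU wins it, and the price it pays is set by the runner-up's bid, which is the defining property of a second-price auction.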

Description

Technical field: [0001] The invention relates to the technical field of GPU clusters, and in particular to a hybrid load scheduling optimization method for deep learning on heterogeneous GPU clusters.

Background: [0002] With the development of information technology and the gradual expansion of cluster scale, the upper-layer applications of distributed clusters are becoming more complex, for example common web search and voice assistants. These applications are produced by training deep learning tasks. The lower-layer nodes of a distributed cluster consist of a large number of GPU servers that provide computing resources for deep learning training tasks. However, as GPU servers are continuously optimized and updated, the lower-layer nodes have gradually become heterogeneous. Therefore, how to allocate reasonable and efficient computing resources for deep learning mixed workloads in heterogeneous GPU clusters has bec...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F9/50, G06N3/063, G06N3/04, G06N3/08
CPC: G06F9/505, G06F9/5072, G06N3/063, G06N3/08, G06N3/045
Inventor: 张潇, 田琨
Owner: CHINA UNIV OF MINING & TECH (BEIJING)