GPU resource management and intelligent scheduling method for deep learning

A deep learning and resource management technology, applied in resource allocation, machine learning, genetic laws, etc., that addresses low GPU resource utilization and poor job execution performance, achieving good framework compatibility and ease of use, enhanced scheduling, and improved execution performance.

Pending Publication Date: 2021-02-26
NANJING UNIV

AI Technical Summary

Problems solved by technology

[0008] Purpose of the invention: In view of the problems and deficiencies in the prior art described above, the purpose of the present invention is to provide a deep learning-oriented GPU resource management and intelligent scheduling method that solves the low GPU resource utilization and poor job execution performance of existing systems in deep learning scenarios.

Examples

Embodiment Construction

[0028] The present invention is further illustrated below with reference to the accompanying drawings and specific embodiments. It should be understood that these embodiments only illustrate the present invention and are not intended to limit its scope. After reading the present invention, modifications in various equivalent forms made by those skilled in the art all fall within the scope defined by the appended claims of this application.

[0029] The present invention proposes a deep learning-oriented GPU resource management and intelligent scheduling method, which solves the problems of low GPU resource utilization and poor job execution performance in deep learning scenarios.

[0030] As shown in Figure 1, the complete process of the present invention includes 8 parts: the job submission stage, authority verification stage, job manager startup stage, resource application stage, job feature modeling and ...

Abstract

The invention discloses a GPU resource management and intelligent scheduling method for deep learning. The method comprises the following steps: 1, a user submits a deep learning job through a front-end interface module, the job comprising a deep learning program to be executed and a training data set; 2, after verification, the job is added to the to-be-scheduled queue of the corresponding scheduler; 3, an independent job manager is started for the job; 4, the computing resources required to run the job are requested from a resource manager; 5, feature modeling and analysis are carried out on the job to be scheduled; 6, a resource scheduling scheme is generated according to the job characteristics and the characteristics of the cluster computing nodes; 7, the job is scheduled to a specified computing node according to the scheduling scheme; 8, the job executor starts a container and executes the deep learning program. The method addresses the low GPU resource utilization and poor job execution performance of existing cluster resource scheduling methods in deep learning scenarios.
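
The eight steps in the abstract can be sketched as a small in-memory pipeline. This is only an illustrative sketch under stated assumptions, not the patented implementation: the names `Node`, `Job`, `Scheduler`, `model_features`, and `plan` are hypothetical, the feature model simply records the requested GPU count, and the placement rule is a plain best-fit heuristic standing in for the scheduling scheme, whose actual algorithm is not given in this excerpt. Job manager startup, the resource request, and container launch (steps 3, 4, and 8) are collapsed into state changes.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    name: str
    free_gpus: int


@dataclass
class Job:
    user: str
    program: str            # deep learning program to execute (step 1)
    dataset: str            # training data set (step 1)
    gpus_requested: int
    features: dict = field(default_factory=dict)
    node: Optional[str] = None
    state: str = "submitted"


class Scheduler:
    """Hypothetical stand-in for the scheduler and resource manager."""

    def __init__(self, nodes, allowed_users):
        self.nodes = list(nodes)
        self.allowed_users = set(allowed_users)
        self.queue = deque()

    def submit(self, job):
        # Steps 1-2: accept the job, verify the user, enqueue it.
        if job.user not in self.allowed_users:
            job.state = "rejected"
            return False
        job.state = "queued"
        self.queue.append(job)
        return True

    def model_features(self, job):
        # Step 5 (assumed placeholder): the only feature is the GPU demand.
        job.features = {"gpus": job.gpus_requested}

    def plan(self, job):
        # Step 6 (assumed placeholder): best fit, i.e. the node with the
        # fewest free GPUs that can still hold the job.
        fits = [n for n in self.nodes if n.free_gpus >= job.features["gpus"]]
        return min(fits, key=lambda n: n.free_gpus) if fits else None

    def run_once(self):
        # Steps 3-8 for the job at the head of the queue.
        if not self.queue:
            return None
        job = self.queue[0]
        self.model_features(job)
        node = self.plan(job)
        if node is None:
            job.state = "waiting"   # no node can satisfy the request yet
            return job
        self.queue.popleft()
        node.free_gpus -= job.features["gpus"]   # step 7: bind job to node
        job.node = node.name
        job.state = "running"       # step 8: executor would start a container
        return job


if __name__ == "__main__":
    cluster = [Node("node-a", 4), Node("node-b", 2)]
    sched = Scheduler(cluster, allowed_users={"alice"})
    sched.submit(Job("alice", "train.py", "dataset-1", gpus_requested=2))
    placed = sched.run_once()
    print(placed.state, placed.node)
```

Best fit is used here only because it is easy to follow: it keeps larger nodes free for larger jobs. Any other placement policy could replace `plan` without changing the rest of the flow.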

Description

Technical field

[0001] The invention relates to the technical field of cluster resource scheduling, in particular to a deep learning-oriented GPU resource management and intelligent scheduling method.

Background technology

[0002] Research and practice in recent years have shown that, compared with traditional machine learning techniques, deep learning can achieve higher accuracy in fields such as computer vision and speech recognition, and it has therefore been widely used. Training a deep learning model involves a large amount of computation, and the Graphics Processing Unit (GPU) can perform this simple but large-scale computing task more efficiently, so it has become an important basic computing resource for running deep learning programs.

[0003] Since GPU cards are usually expensive, it is costly to deploy an independent private cluster for each user (group), and users do not always perform model training, so users usually share these GPU resources to reduc...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F9/50, G06F9/48, G06F9/455, G06N3/12, G06N20/00
CPC: G06F9/45558, G06F9/4881, G06F9/5011, G06F9/5016, G06F9/5027, G06F2009/4557, G06F2009/45575, G06N3/126, G06N20/00
Inventor: 顾荣, 刘率, 王肇康, 袁春风, 黄宜华
Owner: NANJING UNIV