The invention discloses a deep learning-oriented multi-type GPU cluster resource management and scheduling method and system. The method comprises the following steps: dividing a
GPU cluster into a plurality of GPU groups according to the model of a GPU, counting the idle operational capability of each GPU group, obtaining all users accessing the
GPU cluster, and recording the minimum operational capability requirement of each user; and periodically accessing a job queue, obtaining the highest-priority pending job in the job queue, and scheduling GPU cluster resources according to that job. According to the invention, GPUs of different brands and models are uniformly managed as one cluster for
deep learning, which reduces the number of GPU clusters to be maintained and simplifies GPU cluster management. The requirements of different deep learning users can be met: reasonable user attributes are set according to user requirements, so users do not need to be familiar with or concerned about the GPU cluster environment. Resources are scheduled according to the users' operational capability requirements and priorities, resources meeting those requirements are allocated automatically by the scheduling method, and the resource utilization rate of the different GPU type groups is increased.
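
The steps described above can be illustrated with a minimal Python sketch. It is not the claimed implementation, only one possible reading under stated assumptions: operational capability is a single number per GPU in arbitrary units; a user's minimum requirement is treated as a per-GPU capability floor; each job requests a fixed number of GPUs; and, among the groups that qualify, the one with the lowest sufficient capability is chosen so that stronger GPUs stay free for more demanding users. The names Gpu, Job, Cluster, schedule_once, and the model labels are hypothetical and do not come from the source.

```python
import heapq
import itertools
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Gpu:
    gpu_id: str
    model: str          # placeholder model label, used as the grouping key
    capability: float   # per-GPU capability, arbitrary units (assumed metric)
    busy: bool = False


@dataclass
class Job:
    name: str
    user: str
    priority: int       # larger value means higher priority (assumed convention)
    gpus_needed: int


class Cluster:
    def __init__(self, gpus, user_min_capability):
        # Step 1: divide the cluster into GPU groups keyed by GPU model.
        groups = defaultdict(list)
        for gpu in gpus:
            groups[gpu.model].append(gpu)
        self.groups = dict(groups)
        # Step 3: record each user's minimum capability requirement.
        self.user_min_capability = dict(user_min_capability)
        self._queue = []                 # heap of (-priority, seq, job)
        self._seq = itertools.count()    # tie-breaker keeps FIFO order

    def idle_capability(self, model):
        # Step 2: idle capability of a group = sum over its idle GPUs.
        return sum(g.capability for g in self.groups.get(model, []) if not g.busy)

    def submit(self, job):
        heapq.heappush(self._queue, (-job.priority, next(self._seq), job))

    def schedule_once(self):
        # Step 4: one periodic pass over the queue. Only the highest-priority
        # job is considered; if it cannot be placed it stays at the head.
        if not self._queue:
            return None
        job = self._queue[0][2]
        floor = self.user_min_capability[job.user]
        candidates = []
        for model, gpus in self.groups.items():
            idle = [g for g in gpus if not g.busy]
            # Groups are homogeneous, so the first GPU gives the group capability.
            if gpus[0].capability >= floor and len(idle) >= job.gpus_needed:
                candidates.append((gpus[0].capability, model, idle))
        if not candidates:
            return None
        # Assumed policy: lowest capability that still meets the floor.
        _, model, idle = min(candidates, key=lambda c: c[0])
        chosen = idle[:job.gpus_needed]
        for g in chosen:
            g.busy = True
        heapq.heappop(self._queue)
        return job.name, model, [g.gpu_id for g in chosen]


# Usage: two groups of two GPUs each, two users with different floors.
gpus = [Gpu(f"a-{i}", "model-A", 10.0) for i in range(2)]
gpus += [Gpu(f"b-{i}", "model-B", 30.0) for i in range(2)]
cluster = Cluster(gpus, {"alice": 5.0, "bob": 20.0})
cluster.submit(Job("train-small", "alice", priority=1, gpus_needed=1))
cluster.submit(Job("train-large", "bob", priority=5, gpus_needed=2))
first = cluster.schedule_once()    # bob's job is taken first (priority 5)
second = cluster.schedule_once()   # then alice's job
```

In this sketch the user only supplies a job and a priority; the group is selected from the recorded user attribute, which mirrors the point that users need not know the cluster layout. A real scheduler would also need job completion, resource release, and a policy for jobs that block the head of the queue, none of which the text above specifies.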