Elastic scheduling method, system, device and storage medium for GPU virtualization computing power

A scheduling method and virtualization technology, applied in the field of GPU virtualization, that addresses problems such as conventional deployment schemes failing to meet demand, the lack of a corresponding solution in related technologies, and reduced service availability, and that improves GPU computing power utilization and deployment efficiency.

Active Publication Date: 2021-05-28
杭州博盾习言科技有限公司

AI Technical Summary

Problems solved by technology

However, unlike AI training, which has a relatively fixed computing cycle and a long running time, calls to an AI inference container fluctuate with the business, and there are often periodic peaks and valleys.
Therefore, when large-scale, highly concurrent node demand arises, the conventional deployment scheme cannot meet such requirements.
[0004] Before an existing AI inference container is deployed, its quota of GPU virtualization computing power can only be finalized after a series of pressure testing operations and manual analysis of monitoring logs, so the method is relatively complicated to use.
In addition, after the AI inference container is deployed, its computing power quota and the number of container instances are fixed. When a traffic burst occurs, the container cannot cope with it or mitigate the impact of the traffic peak, which reduces business availability.
[0005] Therefore, the current allocation of GPU virtualization computing power quotas for AI inference containers still relies on manual operations and related experience. As a result, the method is complicated to use, the deployment efficiency of AI inference containers is low, and computing power resources cannot be used well in the face of the peak-valley effect. No corresponding solution to this has been proposed in related technologies.
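
The cost of the peak-valley effect under a fixed instance count can be illustrated with a small calculation. This is only a sketch under stated assumptions: the load profile, the per-instance capacity, and both function names are hypothetical and do not come from the patent text.

```python
import math


def fixed_utilization(load, capacity):
    """Average utilization when the instance count is fixed at the peak requirement."""
    instances = max(1, math.ceil(max(load) / capacity))
    return sum(load) / (instances * capacity * len(load))


def elastic_utilization(load, capacity):
    """Average utilization when the instance count follows the load in each interval."""
    provisioned = sum(max(1, math.ceil(q / capacity)) * capacity for q in load)
    return sum(load) / provisioned


# Hypothetical requests-per-second profile with one peak and long valleys.
load = [50, 120, 380, 790, 410, 90]
capacity = 100  # hypothetical requests per second that one instance can serve

print(f"fixed:   {fixed_utilization(load, capacity):.2f}")
print(f"elastic: {elastic_utilization(load, capacity):.2f}")
```

With these made-up numbers, sizing for the peak leaves most of the provisioned capacity idle during the valleys, which is the inefficiency paragraph [0005] describes.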




Detailed Description of Embodiments

[0024] In order to make the purpose, technical solutions and advantages of the present application clearer, the present application will be described and illustrated below in conjunction with the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain the present application, and are not intended to limit the present application. Based on the embodiments provided in the present application, all other embodiments obtained by persons of ordinary skill in the art without creative efforts shall fall within the protection scope of the present application.

[0025] At present, GPU virtualization computing power quotas are allocated manually, without a systematic feedback process. This approach has two problems. The first is that it is complicated to use and requires a long manual operation process: first, manual pressure testing is required, observing the...


Abstract

This application relates to an elastic scheduling method, system, device and storage medium for GPU virtualization computing power, and belongs to the technical field of GPU virtualization. The method automatically determines the computing power quota of a container according to indicators; detects the real-time running indicators of the container in the business scenario; and automatically adjusts the number of container instances according to the real-time running indicators and preset computing power elastic scheduling conditions. It thereby determines the computing power quota allocated to an AI inference container automatically, improves the deployment efficiency of AI inference containers, and greatly improves the utilization rate of GPU computing power.
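
The adjustment step in the abstract (comparing real-time running indicators with preset elastic scheduling conditions to change the number of container instances) might look roughly like the sketch below. It is not the patent's implementation. The choice of average GPU utilization as the indicator, the threshold values, the one-instance step size, and the names `ScalingPolicy` and `desired_instances` are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class ScalingPolicy:
    # Hypothetical preset conditions; the source does not specify any values.
    scale_out_above: float = 0.8  # add an instance above this utilization
    scale_in_below: float = 0.3   # remove an instance below this utilization
    min_instances: int = 1
    max_instances: int = 10


def desired_instances(current: int, gpu_utilization: float,
                      policy: ScalingPolicy) -> int:
    """Return the instance count to run after one utilization reading (0..1)."""
    target = current
    if gpu_utilization > policy.scale_out_above:
        target = current + 1
    elif gpu_utilization < policy.scale_in_below:
        target = current - 1
    # Clamp so the count never leaves the configured bounds.
    return max(policy.min_instances, min(policy.max_instances, target))


policy = ScalingPolicy()
print(desired_instances(2, 0.90, policy))  # traffic peak: scale out
print(desired_instances(2, 0.10, policy))  # valley: scale in
```

Keeping a gap between the scale-out and scale-in thresholds is a common way to avoid oscillating between instance counts when the indicator hovers near a single threshold.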

Description

Technical field

[0001] The present application relates to the technical field of GPU virtualization, in particular to an elastic scheduling method, system, device and storage medium for GPU virtualization computing power.

Background

[0002] With the rapid development of AI technology, many algorithm developers have provided the most basic guarantee for deep learning technology, and as cloud computing becomes more and more mature, it provides the computing power guarantee for the progress of AI technology. The gradual maturity of GPU virtualization technology and container technology has improved the resource utilization rate of online inference and the flexibility of business deployment, thereby promoting the vigorous development of business. Containerization is a method of software development in which the program, its dependent components and packages, and related environment variable configuration files will be ...


Application Information

Patent Type & Authority: Patents (China)
IPC (8): G06F9/455, G06F9/445, G06F9/50, G06N5/04
CPC: G06F9/4451, G06F9/45558, G06F9/5027, G06F2009/45562, G06N5/04
Inventor: 谢建超
Owner: 杭州博盾习言科技有限公司