
Cloud computation resource scheduling method oriented to distributed machine learning

A resource scheduling technology for distributed machine learning, applied in the software field. It addresses two problems of existing methods, namely that they ignore the overall quality of model training tasks and incur large overhead, and it improves resource utilization while adapting quickly to dynamic changes in tasks and load.

Inactive Publication Date: 2018-10-23
JIANGSU HOPERUN SOFTWARE CO LTD
0 Cites, 11 Cited by

AI Technical Summary

Problems solved by technology

[0004] The above methods mainly have the following problems. First, they rely on offline analysis of task execution; when the user debugs or adjusts the model, the computation structure is likely to change frequently, which brings huge overhead. Second, resource scheduling focuses on resource utilization and the performance of individual tasks, but ignores the overall task completion quality of model training.



Examples


Embodiment Construction

[0016] The present invention is described in detail below with reference to a specific embodiment and the accompanying drawing. Figure 1 shows the method flow of the embodiment of the present invention:

[0017] Establish a resource scheduling framework, as shown in Figure 1. The scheduler coordinates the resource allocation of multiple machine learning model training jobs that share cloud computing resources. The job driver contains the iterative training logic, generates the tasks for each iteration, and tracks the overall progress of the job. The scheduler communicates with the drivers of concurrently executing jobs, tracks job progress, and periodically updates resource allocations. At the beginning of each scheduling phase, the scheduler allocates resources based on each job's workload, resource requirements, and task progress. Each job consists of a set of tasks, and each task processes the data of one dataset partition. In machine learning model training, tasks update...
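The scheduler and driver roles described in [0017] can be sketched as follows. This is only an illustrative sketch, not the patented method: the class names `JobDriver` and `Scheduler`, the slot-based resource unit, and the proportional-to-remaining-work allocation rule are all assumptions introduced here, since the visible text does not state the actual allocation policy.

```python
from dataclasses import dataclass


@dataclass
class JobDriver:
    """Hypothetical driver holding the iteration state of one training job."""
    name: str
    num_partitions: int       # one task per dataset partition per iteration
    total_iterations: int     # assumed known workload (not specified in the source)
    completed_iterations: int = 0

    def tasks_for_next_iteration(self):
        # Each task processes the data of one dataset partition.
        return [(self.name, self.completed_iterations, p)
                for p in range(self.num_partitions)]

    def remaining_work(self):
        # Remaining tasks = remaining iterations x partitions.
        return (self.total_iterations - self.completed_iterations) * self.num_partitions

    def run_iteration(self):
        if self.completed_iterations < self.total_iterations:
            self.completed_iterations += 1


class Scheduler:
    """Hypothetical scheduler that re-splits a fixed pool of slots each phase."""

    def __init__(self, total_slots):
        self.total_slots = total_slots
        self.drivers = []

    def register(self, driver):
        self.drivers.append(driver)

    def allocate(self):
        # Assumed placeholder policy: slots proportional to remaining work.
        allocation = {d.name: 0 for d in self.drivers}
        active = [d for d in self.drivers if d.remaining_work() > 0]
        total = sum(d.remaining_work() for d in active)
        if total == 0:
            return allocation
        for d in active:
            allocation[d.name] = (self.total_slots * d.remaining_work()) // total
        # Integer division may leave a few slots; give them to the largest jobs.
        leftover = self.total_slots - sum(allocation.values())
        by_work = sorted(active, key=lambda d: d.remaining_work(), reverse=True)
        for d in by_work[:leftover]:
            allocation[d.name] += 1
        return allocation
```

Calling `allocate()` at the start of every scheduling phase, after the drivers have reported progress through `run_iteration()`, mirrors the periodic update described above; a finished job drops out and its slots flow to the jobs that are still running.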


Abstract

The invention relates to a cloud computing resource scheduling method oriented to distributed machine learning. The method comprises the steps of: using historical data to establish a model relating the number of iterations to the improvement in model quality, so that the influence of resource allocation on model quality improvement can be predicted online; and formulating a resource allocation strategy that maximizes the overall performance of multiple concurrently executing model training tasks running on a cloud computing platform. As a result, resource utilization is improved and the method adapts rapidly to dynamic changes in tasks and load.
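The two steps in the abstract can be illustrated with a small sketch. Everything specific in it is an assumption made for illustration: the visible text does not give the functional form of the quality model, so the sketch fits loss(k) = a / k + b to historical (iteration, loss) pairs, assumes each slot buys a fixed number of iterations per scheduling phase, and uses a greedy rule that hands each slot to the job with the largest predicted marginal loss reduction. The function names are hypothetical.

```python
def fit_inverse_curve(history):
    """Least-squares fit of loss(k) = a / k + b to (iteration, loss) pairs."""
    xs = [1.0 / k for k, _ in history]
    ys = [loss for _, loss in history]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    a = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / var_x
    b = mean_y - a * mean_x
    return a, b


def predicted_gain(a, done, extra):
    """Predicted loss reduction from `extra` more iterations after `done` (done >= 1)."""
    return a / done - a / (done + extra)


def greedy_allocate(jobs, total_slots, iters_per_slot=1):
    """jobs maps name -> (a, done_iterations); returns name -> number of slots."""
    alloc = {name: 0 for name in jobs}

    def marginal(name):
        # Extra predicted gain if this job received one more slot.
        a, done = jobs[name]
        have = alloc[name] * iters_per_slot
        return (predicted_gain(a, done, have + iters_per_slot)
                - predicted_gain(a, done, have))

    for _ in range(total_slots):
        best = max(jobs, key=marginal)
        alloc[best] += 1
    return alloc
```

Under the assumed curve, returns diminish as a job accumulates iterations, so a job that has just started tends to attract slots ahead of one that is close to convergence, which is the kind of overall-quality trade-off the abstract describes.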

Description

Technical field

[0001] The invention relates to a cloud computing resource scheduling method, in particular to a cloud computing resource scheduling method oriented to distributed machine learning, and belongs to the field of software technology.

Background art

[0002] Machine learning is an increasingly important technology for large-scale data analysis, widely used in online search, marketing, healthcare and information security. Machine learning includes two stages: training and inference. The training stage builds a machine learning model from the training data set, and the inference stage uses the model to make predictions for new inputs. A machine learning model is an approximate function of the input-to-output mapping. Model training usually operates on a large-scale data set and proceeds through multiple iterations until convergence. Model training is an exploratory process in which hyperparameters and model structure are adjusted through repeated training to g...


Application Information

IPC(8): G06F9/50, H04L29/08
CPC: G06F9/505, G06F9/5027, H04L67/10
Inventor: 周红卫, 刘延新, 李亚琼, 李守超, 吴昊
Owner: JIANGSU HOPERUN SOFTWARE CO LTD