Cluster job scheduling method and system for task multi-copy execution

A job scheduling and multi-copy execution technology, applied in the computer field, that can solve problems such as poor performance, task completion times that cannot be estimated, and degraded cluster performance.

Active Publication Date: 2021-10-01
SHANGHAI JIAOTONG UNIV
Cites: 10; Cited by: 0
AI Technical Summary

Problems solved by technology

[0004] As clusters grow in size and complexity, ensuring that cluster performance is measurable and predictable has become increasingly important. However, the prevalence of lagging in clusters is a key factor undermining that predictability: when the execution time of a task on a certain computing node is greatly extended, the task completion time cannot be estimated, which greatly affects cluster performance.
The most basic way to deal with lagging is to run several copies of the lagging task on other machines. As soon as any copy finishes, the task is complete, and the other copies that are still running are terminated and their data cleared. The most classic algorithm is speculative execution: based on the progress of each task, it speculates which tasks will become lagging tasks and then executes copies of those tasks on other machines. This passive approach to executing copies does not perform well for tasks that are particularly sensitive to latency.
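
The "first copy to finish wins, the rest are terminated" behaviour described above can be illustrated with a minimal Python sketch. It is not taken from the patent: the function names (`run_copy`, `run_with_copies`, `first_finisher`) are hypothetical, and each copy is simulated with a sleep rather than real work on a cluster node.

```python
import asyncio


async def run_copy(copy_id: int, duration: float) -> int:
    """Hypothetical stand-in for one copy of a task; `duration` simulates its run time."""
    await asyncio.sleep(duration)
    return copy_id


async def run_with_copies(durations: list[float]) -> tuple[int, int]:
    """Start one copy per entry in `durations`.

    Returns (id of a copy that finished first, number of copies cancelled).
    """
    copies = [asyncio.create_task(run_copy(i, d)) for i, d in enumerate(durations)]
    # Wait only until the first copy completes.
    done, pending = await asyncio.wait(copies, return_when=asyncio.FIRST_COMPLETED)
    # Terminate the copies that are still running and wait for them to unwind.
    for task in pending:
        task.cancel()
    await asyncio.gather(*pending, return_exceptions=True)
    winner = next(iter(done)).result()
    return winner, len(pending)


def first_finisher(durations: list[float]) -> tuple[int, int]:
    """Synchronous convenience wrapper around `run_with_copies`."""
    return asyncio.run(run_with_copies(durations))
```

Calling `first_finisher([0.3, 0.01, 0.3])` starts three simulated copies, returns the index of the fast one, and reports that the two slower copies were cancelled.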

Examples

Embodiment Construction

[0085] The present invention will be described in detail below in conjunction with specific embodiments. The following examples will help those skilled in the art to further understand the present invention, but do not limit the present invention in any form. It should be noted that those skilled in the art can make several changes and improvements without departing from the concept of the present invention. These all belong to the protection scope of the present invention.

[0086] The present invention proposes a machine learning-based cluster job scheduling strategy for multi-copy execution of tasks. A machine learning method is used to find the computing nodes (computing machines) that are lagging among those currently running tasks, so that copies of the tasks on the lagging nodes are started at the same time as all tasks. An optimization model with the goal of minimizing task execution time and operating cost is established, and then the opti...

Abstract

The present invention provides a cluster job scheduling method and system for multi-copy execution of tasks, comprising two steps. Searching for lagging machines: a machine learning method is used to find the lagging machines among those currently running tasks. Calculating the optimal number of copies: copies of the tasks on lagging machines are started at the same time as all tasks, an optimization model with the goal of minimizing task execution time and operating cost is established, and the optimal number of copies to start is obtained by solving the optimization model with the alternating direction method. The invention eliminates the detection process and the execution time of a lagging task before it is discovered. It establishes an optimization model whose goal is to minimize both the job processing time and the computing cost in the cluster, subject to the constraints that the number of tasks executed in the cluster does not exceed the number of available computing nodes and that the number of copies of each task does not exceed a given threshold.
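
To make the shape of such a model concrete, here is a rough Python sketch under loudly stated assumptions. The per-task objective `mean_time / r + cost_per_copy * r` (where `r` counts all running instances of a task, the original included) is a hypothetical stand-in for the patent's time-plus-cost objective, which this page does not reproduce. The solver is a simple greedy search, not the alternating direction method named in the abstract. Only the two constraints (total instances bounded by available nodes, per-task instances bounded by a threshold) come from the text.

```python
def choose_copy_counts(mean_times, cost_per_copy, nodes, max_copies):
    """Greedy sketch: pick how many instances of each task to start.

    mean_times    -- assumed mean single-instance run time of each task
    cost_per_copy -- assumed cost weight charged per running instance
    nodes         -- number of available computing nodes (constraint from the text)
    max_copies    -- per-task threshold on instances (constraint from the text)
    """
    if nodes < len(mean_times):
        raise ValueError("need at least one node per task")

    def objective(t, r):
        # Hypothetical trade-off: more instances shorten the wait but cost more.
        return t / r + cost_per_copy * r

    counts = [1] * len(mean_times)  # every task runs at least once
    used = len(counts)
    while used < nodes:
        best_gain, best_task = 0.0, None
        for i, t in enumerate(mean_times):
            if counts[i] >= max_copies:
                continue
            gain = objective(t, counts[i]) - objective(t, counts[i] + 1)
            if gain > best_gain:
                best_gain, best_task = gain, i
        if best_task is None:  # no extra instance lowers the objective
            break
        counts[best_task] += 1
        used += 1
    return counts
```

With `choose_copy_counts([10.0, 1.0], 1.0, 5, 3)`, the slow task is given extra instances up to the threshold while the fast task keeps a single instance, because an extra instance of the fast task would cost more than it saves under the assumed objective.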

Description

Technical field

[0001] The present invention relates to the field of computer technology, in particular to a machine learning-based cluster job scheduling method and system for multi-copy execution of tasks.

Background technique

[0002] Support Vector Machine (SVM) is a machine learning method based on statistical learning theory, developed in the mid-1990s. It improves the generalization ability of a learning machine by seeking the minimum structural risk, minimizing both the empirical risk and the confidence range, so that good statistical regularity can be obtained even with a small statistical sample size. It is a two-class classification model whose basic form is a linear classifier with the largest margin in the feature space; that is, the learning strategy of the support vector machine is to maximize the margin, which can ultimately be transformed into solving a convex quadratic programming problem.

[0003] Alte...
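
For reference, the margin maximization mentioned in [0002] is commonly written in textbooks as the hard-margin problem below. The symbols are standard notation assumed here, not taken from the patent: w is the weight vector, b the bias, x_i the i-th training sample, y_i its label in {-1, +1}, and n the number of samples.

```latex
\min_{w,\,b} \; \frac{1}{2}\lVert w \rVert^{2}
\quad \text{subject to} \quad
y_i \left( w^{\top} x_i + b \right) \ge 1, \qquad i = 1, \dots, n
```

The objective is quadratic and the constraints are linear in (w, b), which is why the problem is a convex quadratic program.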

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F9/48
CPC: G06F9/4881
Inventor: 薛广涛, 曹燕华, 钱诗友, 俞嘉地, 李明禄
Owner: SHANGHAI JIAOTONG UNIV