Self-learning feedback task scheduling method for a Hadoop multi-job environment

A task scheduling and self-learning technology, applied in multiprogramming arrangements, resource allocation, etc. It addresses problems such as evident limitations, actual weights that differ from the assumed ones, and the lack of a reference value, so as to improve accuracy, raise the hit rate, and promote optimal utilization.
CN103440167A · Active · Publication Date: 2013-12-11 · FUZHOU UNIV

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
FUZHOU UNIV
Publication Date
2013-12-11


Abstract

The invention relates to a task scheduling method in the field of high-performance clusters. The method comprises the following steps: in the job submission stage, each Hadoop worker node analyzes tasks to obtain stage weights that match their actual execution and, after processing these with the geometric mean method, establishes a stage-weight reference standard for the remaining tasks of the job; in the task feedback stage, the remaining tasks of the job adopt this reference standard, and the remaining execution time of each task is estimated by combining it with the progress of the task's sub-stages; in the job feedback stage, the geometric mean of the stage weights of all tasks is computed stage by stage, and a mapping record from job name to stage weights is established as a reference for subsequent jobs executed on the node. With this method, self-learning and information feedback can be carried out separately for the tasks of each job in an environment where multiple jobs execute in parallel. This yields more accurate stage-weight estimates, improves the accuracy of the estimated remaining task execution time, increases the hit rate when selecting lagging tasks, and promotes optimal utilization of cluster resources.
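The two calculations named in the abstract, the geometric mean of stage weights and the remaining-time estimate, can be illustrated with a minimal Python sketch. It is not the patented procedure: the function names, the renormalisation of the averaged weights to sum to 1, the formula remaining = elapsed × (1 − progress) / progress, the three-stage example weights, and the job name "wordcount" are all assumptions made for illustration.

```python
import math


def geometric_mean_weights(observed):
    """Combine per-task stage weights with a per-stage geometric mean.

    `observed` holds one weight vector per finished task; each vector lists
    the fraction of task time spent in each stage (all values must be > 0).
    Renormalising the result to sum to 1 is an assumption of this sketch.
    """
    n_tasks = len(observed)
    n_stages = len(observed[0])
    means = [
        math.exp(sum(math.log(task[i]) for task in observed) / n_tasks)
        for i in range(n_stages)
    ]
    total = sum(means)
    return [m / total for m in means]


def estimate_remaining_time(weights, stage_index, stage_progress, elapsed):
    """Estimate remaining time from stage weights and sub-stage progress.

    Overall progress = weights of the finished stages plus the current
    stage's weight scaled by its own progress (0.0 to 1.0). The remaining
    time then assumes the task keeps its average rate so far.
    """
    progress = sum(weights[:stage_index]) + weights[stage_index] * stage_progress
    if progress <= 0:
        raise ValueError("no progress yet; remaining time cannot be estimated")
    return elapsed * (1.0 - progress) / progress


# Hypothetical per-node record mapping a job name to its learned stage weights.
weight_history = {}

# Stage weights measured on two finished tasks of the same job (made-up values).
observed = [[0.2, 0.3, 0.5], [0.2, 0.3, 0.5]]
reference = geometric_mean_weights(observed)
weight_history["wordcount"] = reference

# A task halfway through its second stage after 35 seconds of execution.
remaining = estimate_remaining_time(
    reference, stage_index=1, stage_progress=0.5, elapsed=35.0
)
print(round(remaining, 3))
```

With these made-up weights, the task's overall progress is 0.2 + 0.3 × 0.5 = 0.35, so the sketch estimates about 65 seconds remaining. A later job with the same name on this node could start from `weight_history["wordcount"]` instead of uniform weights.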

Description

Technical field

[0001] The invention relates to a task scheduling method in the field of high-performance clusters, and in particular to a task scheduling method based on autonomous task learning and an information feedback mechanism in a Hadoop multi-job environment.

Background art

[0002] MapReduce is a parallel data processing model for large-scale data-intensive applications. As an open source implementation of MapReduce, Hadoop has been widely used in various fields. However, existing Hadoop has significant limitations: it was originally designed for homogeneous environments, and its default scheduling mechanism likewise assumes homogeneous nodes and linear task execution. In actual applications, differences in hardware configuration, resource virtualization, and so on mean that these assumptions do not hold.

[0003] Usually, a Hadoop cluster is composed of many conventional computers. These machines are ...
