Hadoop job scheduling method based on genetic algorithm

A genetic-algorithm-based job scheduling technique for Hadoop that addresses the inability of prior-art scheduling to take both job fairness and job execution efficiency into account.

Inactive
Publication Date: 2015-04-29

XI'AN POLYTECHNIC UNIVERSITY

3 Cites, 19 Cited by


AI Technical Summary

Problems solved by technology

[0005] The purpose of the present invention is to provide a Hadoop job scheduling method based on a genetic algorithm that solves two problems in the prior art: cluster resources must be preconfigured before job scheduling, and job fairness and job execution efficiency cannot both be taken into account.



Embodiment Construction

[0056] The present invention will be described in detail below in conjunction with the accompanying drawings and specific embodiments.

[0057] Referring to Figure 1, the Hadoop job scheduling method based on a genetic algorithm of the present invention comprises the following steps:

[0058] Step 1: Job Preprocessing

[0059] At the JobTracker node, first summarize the jobs waiting to be scheduled and the TaskTracker nodes in the cluster. For each job m in the job queue, count its number of splits l_m and the maximum number of TaskTrackers to which it can be scheduled, b_m, as shown in Table 1:

[0060] Table 1

[0061]
job      split    TaskTracker
job 1    l_1      b_1
job 2    l_2      b_2
……       ……       ……
job m    l_m      b_m

[0062] Here, job 1, job 2, ..., job m are listed in first-come, first-served order.
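The rows of Table 1 can be pictured as a small in-memory structure. The following is only a minimal sketch: the `JobInfo` class, the `build_job_table` helper, and the sample queue values are hypothetical names and numbers chosen for illustration, not part of the patent text or of any Hadoop API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JobInfo:
    """One row of Table 1 (hypothetical structure, for illustration only)."""
    name: str
    splits: int        # l_m: number of splits of the job
    max_trackers: int  # b_m: maximum number of TaskTrackers the job can be scheduled on


def build_job_table(queue):
    """Build the table rows from (name, splits, max_trackers) tuples.

    The input order is preserved, which models the first-come,
    first-served ordering of the job queue.
    """
    return [JobInfo(name, splits, max_trackers) for name, splits, max_trackers in queue]


# Made-up sample queue; real values would come from the JobTracker.
queue = [("job 1", 6, 3), ("job 2", 4, 2), ("job 3", 9, 4)]
table = build_job_table(queue)
for row in table:
    print(f"{row.name}\t{row.splits}\t{row.max_trackers}")
```

Keeping the rows in a list rather than a dictionary makes the arrival order explicit, which is the only ordering the text specifies.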

[0063] For each TaskTracker node, read the maximum number of parallel slots s in the corresponding con...


Abstract

The invention discloses a Hadoop job scheduling method based on a genetic algorithm. The method comprises the following steps: first, preprocessing the jobs to generate an encoding and decoding table; second, generating a plurality of initial scheduling tables for executing the jobs and sorting them by fitness to obtain a scheduling table list; finally, applying genetic operations to the scheduling tables in the list to form a final scheduling table list. The top-ranked scheduling table in the final list is taken as the optimal scheduling table, and the tasks of the different jobs are distributed to the corresponding TaskTrackers for execution according to it, which completes the Hadoop job scheduling task. With this method, platform resources do not need to be preconfigured before jobs are scheduled; they are acquired, counted, and allocated dynamically during scheduling, which reduces the administrator's burden. Furthermore, the method can control both the total completion time and the average completion time of the jobs, so that fairness of job execution is guaranteed while execution efficiency is also ensured.
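The generate, sort by fitness, and evolve loop described in the abstract can be sketched as follows. This is an illustrative toy, not the patented method: the excerpt does not give the encoding and decoding table, the fitness formula, or the genetic operators, so the unit task duration, the one-task-at-a-time trackers, the equally weighted fitness, the one-point crossover, and the single-gene mutation below are all assumptions, and the per-job TaskTracker limit b_m is not enforced.

```python
import random


def decode(schedule, tasks, n_trackers):
    """Return (makespan, average job completion time) for one schedule.

    Assumption: every task takes one time unit and each tracker runs
    its assigned tasks one after another in list order.
    """
    tracker_clock = [0] * n_trackers
    job_finish = {}
    for (job, _split), tracker in zip(tasks, schedule):
        tracker_clock[tracker] += 1
        job_finish[job] = max(job_finish.get(job, 0), tracker_clock[tracker])
    makespan = max(tracker_clock)
    average = sum(job_finish.values()) / len(job_finish)
    return makespan, average


def fitness(schedule, tasks, n_trackers):
    """Lower is better. Equal weighting of the two terms is an assumption."""
    makespan, average = decode(schedule, tasks, n_trackers)
    return makespan + average


def evolve(job_splits, n_trackers, pop_size=20, generations=50, seed=0):
    """Evolve schedules; a schedule maps each task to a tracker index."""
    rng = random.Random(seed)
    tasks = [(j, s) for j, n in enumerate(job_splits) for s in range(n)]

    def key(schedule):
        return fitness(schedule, tasks, n_trackers)

    # Initial scheduling tables, sorted by fitness.
    population = [[rng.randrange(n_trackers) for _ in tasks] for _ in range(pop_size)]
    population.sort(key=key)
    for _ in range(generations):
        parents = population[: pop_size // 2]  # keep the better half
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, len(tasks))     # one-point crossover
            child = a[:cut] + b[cut:]
            i = rng.randrange(len(tasks))
            child[i] = rng.randrange(n_trackers)   # single-gene mutation
            children.append(child)
        population = sorted(parents + children, key=key)
    return population[0], tasks  # top-ranked schedule and the task list


# Hypothetical workload: three jobs with 4, 3, and 5 splits on 3 trackers.
best, tasks = evolve(job_splits=[4, 3, 5], n_trackers=3)
makespan, average = decode(best, tasks, 3)
print(best, makespan, average)
```

Because the better half of each generation is carried over unchanged, the best fitness in the list never gets worse from one generation to the next, and the fixed seed makes a run reproducible.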

Description

Technical field

[0001] The invention belongs to the field of information technology and relates to a Hadoop job scheduling method based on a genetic algorithm.

Background art

[0002] Apache Hadoop is an open-source distributed platform composed mainly of two core projects, MapReduce and HDFS. MapReduce is the core computing framework of Hadoop. It is a software framework with a master-slave structure, divided into two roles: JobTracker and TaskTracker. The JobTracker node preprocesses a job's data to form task fragments (splits) and distributes them to the TaskTracker nodes so that tasks run in parallel; each split is then processed in the Map stage, the results are aggregated in the Reduce stage, and the final output is saved. HDFS is the storage cornerstone on which Hadoop realizes distributed computing. It is a highly fault-tolerant system suitable for deployment on inexpensive machines. HDFS is also a frame...


Application Information


IPC(8): G06F9/50

Inventors: 薛涛, 燕明磊

Owner: XI'AN POLYTECHNIC UNIVERSITY
