Scheduling method and system for job task of file

A job task scheduling method and system, applied in the field of distributed computing. It addresses the increased IO overhead that arises when a task and the data blocks it needs are not on the same node, and it reduces IO overhead, balances load, and satisfies data locality.

Inactive Publication Date: 2011-08-17
NAT UNIV OF DEFENSE TECH
Cites: 1 · Cited by: 14

AI Technical Summary

Problems solved by technology

[0005] Because the data blocks are scattered across multiple data nodes, and Map tasks must also be distributed to multiple data nodes for execution, a Map task may be assigned to a node other than the one that holds the data block it requires. This increases IO overhead during task execution.


Examples

Embodiment Construction

[0040] To enable those skilled in the art to better understand the scheme of the present application, the technical solutions in the embodiments of the application are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some of the embodiments of the application, not all of them. All other embodiments obtained by persons of ordinary skill in the art based on the embodiments in this application, without creative effort, fall within the scope of protection of this application.

[0041] The flow chart of the file job task scheduling method provided by the embodiment of the present application is shown in Figure 1. The method includes:

[0042] Step S101: Find the nodes where the data blocks of the file required to execute the job task are located. The file has multiple data blocks.

[0043] Each data block of the file has a copy data block, and t...
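
Step S101 amounts to a lookup from the file's data blocks to the nodes that store them. The following is a minimal sketch in Python, assuming a hypothetical in-memory table `block_locations` that maps each block ID to the nodes holding that block or a replica of it. The names and the table layout are illustrative and are not taken from the application.

```python
from collections import defaultdict

def nodes_holding_file(block_locations):
    """Invert a block -> nodes table into a node -> blocks table.

    block_locations: dict mapping a block ID to the list of nodes that
    store the block or one of its replicas (hypothetical layout).
    """
    blocks_by_node = defaultdict(list)
    for block_id, nodes in block_locations.items():
        for node in nodes:
            blocks_by_node[node].append(block_id)
    return dict(blocks_by_node)

# Illustrative data: a file with three blocks, each stored on two nodes.
example_locations = {
    "blk_0": ["node_a", "node_b"],
    "blk_1": ["node_b", "node_c"],
    "blk_2": ["node_a", "node_c"],
}
candidates = nodes_holding_file(example_locations)
```

The keys of `candidates` are the nodes that a scheduler could consider for the job task under this sketch.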


Abstract

The application discloses a scheduling method for a job task of a file, comprising the following steps: finding the nodes where the data blocks of the file needed to carry out the job task are located; calculating the load that the file produces on each node holding its data blocks, together with the operation load of each of those nodes; comparing the operation loads of the nodes and taking the node with the lightest load as the preset node; and scheduling the job task onto the preset node when the sum of the load produced by the file on the preset node and the operation load of the preset node is smaller than a set load threshold. With this method, when a job task is submitted for a file, the system can assign the task to the calculated preset node. Data locality is thereby satisfied to the greatest extent, the extra IO (Input/Output) overhead caused by data movement during parallel task execution is reduced, and the system load is more balanced.
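
The steps in the abstract can be sketched as follows. This is a minimal illustration in Python, not the application's implementation: the function name `choose_node`, the dictionaries `file_load` and `operation_load`, and the numeric load units are all assumptions. Returning `None` when the threshold check fails is a placeholder, because the abstract does not say what happens in that case.

```python
def choose_node(file_load, operation_load, load_threshold):
    """Pick the node on which to run the job task, or None.

    file_load: dict node -> load the file would produce on that node
        (only nodes that hold data blocks of the file appear here).
    operation_load: dict node -> current operation load of the node.
    load_threshold: the set load threshold.
    All three inputs are hypothetical representations.
    """
    if not file_load:
        return None
    # Compare the operation loads of the candidate nodes;
    # the node with the lightest load becomes the preset node.
    preset = min(file_load, key=lambda node: operation_load[node])
    # Schedule there only if file load + operation load is below the threshold.
    if file_load[preset] + operation_load[preset] < load_threshold:
        return preset
    return None  # assumption: the abstract does not describe this case

# Illustrative values only.
example_file_load = {"node_a": 2, "node_b": 2, "node_c": 2}
example_operation_load = {"node_a": 5, "node_b": 3, "node_c": 7}
chosen = choose_node(example_file_load, example_operation_load, 10)
```

With these illustrative values, `node_b` has the lightest operation load and its combined load of 5 is below the threshold of 10, so it is selected.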

Description

Technical field

[0001] The present application relates to the field of distributed computing, in particular to a scheduling method and system for file job tasks.

Background technique

[0002] Invented by Google, MapReduce is an emerging parallel programming model. It packages parallelization, fault tolerance, data distribution, load balancing, and so on into one library, and reduces all operations that the system performs on data to two steps: the Map (mapping) stage and the Reduce (reduction) stage. As a result, developers without much experience in parallel computing can also develop parallel applications for processing massive data.

[0003] When using the MapReduce model for parallel computing of large-scale data, in the Map stage, a MapReduce job (that is, a computing request from a user) needs to be split into multiple Map tasks and distributed to multiple nodes for execution. Map task) to the node to complete the calculation, so as to reduce the system...
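
The Map and Reduce stages described in the background can be illustrated with a small single-process word count in Python. This sketch only shows the programming model; the function names are hypothetical, and a real MapReduce system would run the Map tasks on separate nodes.

```python
from collections import defaultdict

def map_task(block):
    """Map stage: emit a (word, 1) pair for every word in one data block."""
    return [(word, 1) for word in block.split()]

def reduce_task(pairs):
    """Reduce stage: sum the counts emitted for each word."""
    totals = defaultdict(int)
    for word, count in pairs:
        totals[word] += count
    return dict(totals)

def run_job(blocks):
    """Split the job into one Map task per block, then reduce the results."""
    intermediate = []
    for block in blocks:  # each iteration stands in for one Map task
        intermediate.extend(map_task(block))
    return reduce_task(intermediate)

word_counts = run_job(["a b a", "b c"])
```

Each call to `map_task` reads exactly one block, which is why placing the task on a node that already stores that block avoids moving data.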


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F9/46, G06F9/50
Inventor: 杨树强王凯王怀民吴泉源贾焰周斌韩伟红滕猛陈志坤赵辉金松昌罗荣凌舒琦
Owner: NAT UNIV OF DEFENSE TECH