
Hadoop task scheduling method and device

A task scheduling technology, applied in the fields of multiprogramming arrangements, program control design, instruments, etc. It addresses problems such as the lack of an effective existing solution, increased I/O resource usage and network bandwidth consumption, and the FIFO scheduler's poor use of cluster resources, achieving the effects of optimized scheduling and improved resource utilization.

Active Publication Date: 2022-07-05
BANK OF CHINA
5 Cites, 0 Cited by

AI Technical Summary

Problems solved by technology

FIFO preferentially runs the tasks of the same job at the head of the queue, which can reduce the throughput of the entire system, but it severely limits the processing power of the cluster: tasks of the same job often have the same characteristics, so I/O and CPU resources are not fully used, and a task performing I/O is blocked because the scheduler prevents it from using the CPU until the I/O operation is complete.
With the increase in the number of users and user programs in the Hadoop cluster, the FIFO scheduler cannot make good use of cluster resources, nor can it meet the service quality requirements of different applications, and in severe cases, it will also affect the normal operation of jobs.
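The underutilization described above can be illustrated with a toy model. This is only a sketch under stated assumptions: the per-task demand figures, the two-slot node, and the cost rule (a batch takes as long as its most contended resource) are hypothetical and are not taken from the patent or from Hadoop.

```python
from itertools import chain

# Hypothetical per-task resource demand as (cpu_units, io_units).
CPU_TASK = (100, 10)
IO_TASK = (10, 100)


def makespan(order, slots=2):
    """Run tasks `slots` at a time, in the given order, on one node.

    Toy cost model (an assumption): each batch takes as long as its most
    contended resource, i.e. max(total CPU demand, total I/O demand).
    """
    total = 0
    for i in range(0, len(order), slots):
        batch = order[i:i + slots]
        cpu = sum(task[0] for task in batch)
        io = sum(task[1] for task in batch)
        total += max(cpu, io)
    return total


job_a = [CPU_TASK] * 4  # CPU-intensive job at the head of the queue
job_b = [IO_TASK] * 4   # I/O-intensive job queued behind it

# FIFO order: all tasks of job A, then all tasks of job B.
fifo_order = job_a + job_b
# Mixed order: alternate one CPU-bound task with one I/O-bound task.
mixed_order = list(chain.from_iterable(zip(job_a, job_b)))

fifo_time = makespan(fifo_order)
mixed_time = makespan(mixed_order)
print(fifo_time, mixed_time)
```

In this model the FIFO order pairs tasks that compete for the same resource while the other resource sits idle, so it finishes later than the order that pairs a CPU-bound task with an I/O-bound one.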
[0004] Capacity Scheduler allocates resources to each queue in proportion and sets strict constraints to prevent resource monopoly. This solves the problem of multi-user scheduling, but the scheduling strategy lacks support for load balancing, and data locality is not ideal.
[0005] Fair Scheduler tries to allocate resources equally to all jobs. If a user submits a new job, some resources are released to the new job. This method ensures that all jobs get the same amount of resources, but data locality is not ideal, so data must be fetched from other nodes, which increases I/O resource usage and network bandwidth consumption.
[0006] For the above problems, no effective solutions have been proposed so far.
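The two allocation policies described in [0004] and [0005] can be sketched as simple arithmetic. This is a minimal illustration, not the actual Hadoop implementation; the function names, queue names, and slot counts are hypothetical.

```python
def capacity_shares(total_slots, queue_percent):
    """Split slots across queues by a configured percentage (floor division)."""
    return {queue: total_slots * pct // 100 for queue, pct in queue_percent.items()}


def fair_shares(total_slots, jobs):
    """Give every running job the same number of slots (floor division)."""
    per_job = total_slots // len(jobs)
    return {job: per_job for job in jobs}


# Capacity-style split: each queue gets a fixed proportion of the cluster.
cap = capacity_shares(100, {"prod": 70, "dev": 30})

# Fair-style split: when new jobs arrive, existing jobs give up part of their share.
before = fair_shares(100, ["job1", "job2"])
after = fair_shares(100, ["job1", "job2", "job3", "job4"])
print(cap, before, after)
```

Neither function looks at where a task's input data is stored, which is the data locality weakness the paragraphs above point out.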



Examples

Detailed Description of the Embodiments

[0026] The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention. Obviously, the described embodiments are only a part of the embodiments of the present invention, but not all of the embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those of ordinary skill in the art without creative efforts shall fall within the protection scope of the present invention.

[0027] Before introducing the embodiments of the present invention, the technical terms involved in the embodiments of the present invention are first introduced.

[0028] 1. MapReduce is a programming model for parallel operations on large-scale datasets.

[0029] 2. Hadoop is an open source framework developed based on the MapReduce computing model and the Google file system for processing large-scale data in a distributed environme...

Abstract

The present invention provides a Hadoop task scheduling method and device. The method includes the following steps: acquiring information on multiple tasks of a job that have already run on each node, and determining the job type of each node; predicting the load of each node according to its job type; determining, according to the job type and the load of each node, the compatibility between each node's job type and each node; and assigning the tasks of the job that have not yet run to the nodes according to that compatibility. The solution schedules Hadoop tasks based on load prediction, which can improve the resource utilization of Hadoop clusters.
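The four steps in the abstract can be sketched as follows. The abstract does not give the classification rule, the load predictor, or the compatibility formula, so every concrete choice here is an assumption made for illustration: the CPU-versus-I/O comparison, the average of recent utilization samples, the score `100 - load`, the fixed per-task cost, and all names (`Node`, `job_type`, `predict_load`, `compatibility`, `assign`).

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    finished: list  # (cpu_seconds, io_seconds) for tasks already run on this node
    cpu_util: list  # recent CPU utilization samples, in percent
    io_util: list   # recent I/O utilization samples, in percent


def job_type(node):
    """Step 1 (assumed rule): classify by which resource finished tasks used most."""
    cpu = sum(c for c, _ in node.finished)
    io = sum(i for _, i in node.finished)
    return "cpu" if cpu >= io else "io"


def predict_load(node, kind):
    """Step 2 (assumed predictor): average recent utilization of the stressed resource."""
    samples = node.cpu_util if kind == "cpu" else node.io_util
    return sum(samples) // len(samples)


def compatibility(load):
    """Step 3 (assumed score): more predicted headroom means a better fit."""
    return 100 - load


def assign(pending, nodes, cost_per_task=10):
    """Step 4: greedily give each pending task to the currently best-fitting node."""
    score = {n.name: compatibility(predict_load(n, job_type(n))) for n in nodes}
    plan = {}
    for task in pending:
        best = max(score, key=score.get)
        plan[task] = best
        score[best] -= cost_per_task  # hypothetical load added by one more task
    return plan


nodes = [
    Node("n1", [(8, 2), (9, 1)], [60, 80], [10, 30]),
    Node("n2", [(1, 9), (2, 8)], [20, 40], [30, 50]),
]
plan = assign(["t1", "t2", "t3", "t4"], nodes)
print(plan)
```

With these sample numbers, node `n2` starts with more headroom and receives the first three tasks; once the scores level out, the last task goes to `n1`.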

Description

Technical field

[0001] The invention relates to the technical field of Hadoop task scheduling, in particular to a Hadoop task scheduling method and device.

Background

[0002] Hadoop is an open-source distributed storage and processing system used for big data batch jobs, such as big data analysis and web page indexing. Hadoop's default job scheduling algorithm is based on FIFO. Currently, Hadoop is configured with a variety of job schedulers, including FIFO (the default scheduler), Capacity Scheduler (computing capacity scheduler), and Fair Scheduler (fair scheduler).

[0003] Clusters often run different types of jobs at the same time, and these different types of workloads have different resource requirements. For example, I/O-intensive workloads use more I/O resources, while CPU-intensive workloads use more computing resources. FIFO prioritizes the tasks of the same job at the top of the queue, which can reduce th...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F9/48, G06F9/50
CPC: G06F9/4881, G06F9/505, G06F9/5061
Inventor: 祝春祥, 翁星晨
Owner: BANK OF CHINA