Resource scheduling method under Hadoop-based multi-job environment

A resource scheduling technology for multi-job environments, applied in the field of big data, that addresses low utilization of cluster resources and poor system performance.

Active Publication Date: 2015-09-16
HUAZHONG UNIV OF SCI & TECH

AI Technical Summary

Problems solved by technology

[0006] Aiming at the defects of the existing resource scheduling technology, the purpose of the present invention is to provide a resource scheduling method that can dynamically adjust resource requirements according to the heterogen


Embodiment Construction

[0064] In order to make the object, technical solution and advantages of the present invention clearer, the present invention will be further described in detail below in conjunction with the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain the present invention, not to limit the present invention. In addition, the technical features involved in the various embodiments of the present invention described below can be combined with each other as long as they do not constitute a conflict with each other.

[0065] As shown in Figure 1, the resource scheduling system for the Hadoop-based multi-job environment has a three-part architecture: a client, a Hadoop 2.0 cluster platform, and a monitoring server. The Hadoop cluster includes a master node and multiple computing nodes. The Resource Manager is a process that runs independently on the master node ...

Abstract

The invention discloses a resource scheduling method under a Hadoop-based multi-job environment. The method includes: (1) collecting, in real time, monitoring information from three sources: cluster load, the Hadoop platform, and hardware; (2) collecting, in real time, job execution monitoring information for the user's jobs on each computing node of the cluster; (3) aggregating the three-source monitoring data, building a model to evaluate the computing capability of each node, and dividing the cluster's nodes into superior computing nodes and inferior computing nodes; (4) for a superior computing node, starting a job task resource demand allocation policy based on similarity evaluation; (5) for an inferior computing node, falling back to Yarn's default resource demand allocation policy. The method addresses the resource fragmentation caused by the overly coarse granularity with which conventional Yarn resource schedulers divide job resource demands. It takes the heterogeneity of both cluster nodes and jobs into account and raises the execution concurrency of the cluster by allocating node resources reasonably and effectively, thereby improving the execution efficiency of multiple jobs on the Hadoop cluster.
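Steps (3) to (5) of the abstract can be illustrated with a minimal sketch. The source does not disclose the capability model, so everything here is an assumption made for illustration: the `NodeMetrics` fields, the weighted-sum score, the weights, the use of the cluster mean as the superior/inferior threshold, and the policy labels are all hypothetical and are not the patented method.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class NodeMetrics:
    """Hypothetical per-node monitoring snapshot; each value is normalized to [0, 1]."""
    name: str
    cpu_idle: float
    mem_free: float
    io_idle: float


def capability_score(m, weights=(0.4, 0.4, 0.2)):
    """Assumed model: a weighted sum of idle CPU, free memory, and idle I/O."""
    w_cpu, w_mem, w_io = weights
    return w_cpu * m.cpu_idle + w_mem * m.mem_free + w_io * m.io_idle


def classify_nodes(metrics):
    """Label each node 'superior' or 'inferior' relative to the cluster mean score.

    The mean-based threshold is an assumption; the source only says nodes
    are divided into the two classes after modeling.
    """
    scores = {m.name: capability_score(m) for m in metrics}
    threshold = mean(scores.values())
    return {
        name: "superior" if score >= threshold else "inferior"
        for name, score in scores.items()
    }


def choose_policy(node_class):
    """Superior nodes get the similarity-based policy; others use Yarn's default."""
    return "similarity_based" if node_class == "superior" else "yarn_default"


if __name__ == "__main__":
    cluster = [
        NodeMetrics("n1", cpu_idle=0.9, mem_free=0.8, io_idle=0.7),
        NodeMetrics("n2", cpu_idle=0.2, mem_free=0.3, io_idle=0.1),
    ]
    for name, node_class in classify_nodes(cluster).items():
        print(name, node_class, choose_policy(node_class))
```

In this sketch the classification is recomputed from the latest snapshot each time it is called, which mirrors the real-time collection described in steps (1) and (2); a real scheduler would obtain these values from the monitoring server rather than from hard-coded data.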

Description

Technical field

[0001] The invention belongs to the technical field of big data, and more specifically relates to a resource scheduling method under a Hadoop-based multi-job environment.

Background art

[0002] With the advent of the era of big data and the Internet, big data technology has become a research hotspot in academia and industry, and Hadoop, as an open source big data processing platform, has been widely used in both enterprise and academic research. However, the first generation of Hadoop has practical problems such as a single point of failure, low resource utilization, and an inability to support multiple computing frameworks. To overcome these shortcomings, Apache launched the second generation of Hadoop and rebuilt the resource management module as an independent, general-purpose resource management system, Yarn, which is responsible for the resource allocation and task scheduling of the cluster. Yarn enables multiple computing frameworks (MapReduce...

Application Information

IPC(8): G06F17/30, G06F9/48, G06F11/14
CPC: G06F9/4843, G06F9/4881, G06F11/1461, G06F16/27
Inventor: 王芳, 冯丹, 杨静怡, 潘佳艺, 周俊
Owner: HUAZHONG UNIV OF SCI & TECH