MapReduce short job optimization system and method based on resource reuse

A job optimization and resource reuse technology, applied in the fields of resource allocation, multiprogramming arrangements, and program control design, that addresses problems such as damaged characteristics, an excessive number of reserved processes, and wasted cluster resources.

Active Publication Date: 2016-07-27
SHANDONG UNIV
9 Cites, 24 Cited by

AI Technical Summary

Problems solved by technology

Using the process pool reduces the cost of starting new jobs, but Tenzing has two disadvantages: one is that the number of reserved processes exceeds actual needs, which wastes cluster resources; the other is that the Tenzin


Embodiment Construction

[0098] The present invention will be further described below in conjunction with the accompanying drawings and embodiments.

[0099] Hadoop is a parallel computing platform for processing large-scale data sets, with the advantages of good scalability, high fault tolerance, and ease of programming. Although it was originally designed to process large-scale jobs in parallel across a large number of computing nodes, in actual production Hadoop is often used to process small-scale short jobs. Because Hadoop's design does not account for the characteristics of short jobs, short jobs execute relatively inefficiently in Hadoop. To address this, the present invention first analyzes the job execution process in Hadoop and describes the problems that arise when processing short jobs. Then, based on the observation that tasks run for multiple rounds under high load conditions, a short job optimization mechanism based on resource reuse is proposed to reduce t...

Abstract

The invention discloses a MapReduce short job optimization system and method based on resource reuse. The system comprises a master node, a primary slave node, and a plurality of secondary slave nodes. The master node is connected to the primary slave node, and the primary slave node is connected to the secondary slave nodes. A resource manager and a primary scheduler are deployed on the master node; an application manager, a task performance estimator, and a sub-scheduler are deployed on the primary slave node; and node managers are deployed on the secondary slave nodes. The sub-scheduler is connected to the task performance estimator and to the master node. The system and method optimize short-job performance by increasing the effective resource utilization ratio: resources are allocated and recovered less often, the time otherwise spent on allocation and recovery is used to run short jobs, and execution performance improves because jobs spend less time waiting for resources.
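The effect described in the abstract can be illustrated with a toy model. The sketch below is not the patented scheduler; it only compares allocating a fresh container per task against reusing each container across several rounds of tasks. The function names, the greedy slot choice, and the fixed allocation and recovery costs are all assumptions made for illustration, and the task performance estimator is not modeled.

```python
from dataclasses import dataclass


@dataclass
class Result:
    allocations: int   # number of allocate/recover cycles performed
    makespan: float    # time at which the last task (and cleanup) finishes


def run_without_reuse(task_times, slots, alloc_cost, recover_cost):
    """Baseline: every task pays for its own allocation and recovery."""
    free_at = [0.0] * slots                  # time each slot becomes free
    for t in task_times:
        i = free_at.index(min(free_at))      # earliest available slot
        free_at[i] += alloc_cost + t + recover_cost
    return Result(len(task_times), max(free_at))


def run_with_reuse(task_times, slots, alloc_cost, recover_cost):
    """Reuse: a container is allocated once and runs tasks round after round."""
    used = min(slots, len(task_times))       # never allocate idle containers
    free_at = [alloc_cost] * used            # one allocation per container
    for t in task_times:
        i = free_at.index(min(free_at))
        free_at[i] += t                      # next task starts immediately
    # Containers are recovered once, after their last task.
    return Result(used, max(free_at) + recover_cost if used else 0.0)


if __name__ == "__main__":
    tasks = [2.0] * 8                        # eight short tasks (hypothetical)
    print(run_without_reuse(tasks, slots=2, alloc_cost=1.0, recover_cost=0.5))
    print(run_with_reuse(tasks, slots=2, alloc_cost=1.0, recover_cost=0.5))
```

With these made-up numbers, the baseline performs eight allocation cycles while the reuse variant performs two, and the time saved on allocation and recovery goes directly into running tasks.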

Description

Technical field

[0001] The invention relates to a resource reuse-based MapReduce short job optimization system and method.

Background

[0002] Industries such as the Internet, finance, and media face the challenge of processing large-scale data sets, but conventional data processing tools and computing models cannot meet their requirements. The MapReduce model proposed by Google provides an effective solution, and Hadoop is an open source implementation of MapReduce. Hadoop decomposes a submitted job into finer-grained Map tasks and Reduce tasks. These tasks run in parallel on multiple nodes in the cluster, which greatly reduces the running time of the job. Hadoop hides the details of parallel computing, such as distributing data to computing nodes and rerunning failed tasks, so that users can focus on their specific business logic. Moreover, Hadoop provides good linear scalability, data redundancy, and high fault tolerance of computing, which make ...
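As a rough illustration of how a job decomposes into Map tasks and Reduce tasks, the following sketch counts words over a few input splits in plain Python. It does not use the Hadoop API; the function names and the in-memory shuffle step are assumptions for illustration, and the tasks run sequentially here where Hadoop would run them in parallel across nodes.

```python
from collections import defaultdict


def map_task(split):
    """One Map task: emit a (word, 1) pair for each word in its input split."""
    return [(word, 1) for word in split.split()]


def shuffle(mapped):
    """Group the values emitted by all Map tasks by key."""
    groups = defaultdict(list)
    for pairs in mapped:
        for key, value in pairs:
            groups[key].append(value)
    return groups


def reduce_task(key, values):
    """One Reduce task: sum the counts collected for a single key."""
    return key, sum(values)


def run_job(splits):
    """Run one Map task per split, then one Reduce task per distinct key."""
    mapped = [map_task(s) for s in splits]
    grouped = shuffle(mapped)
    return dict(reduce_task(k, v) for k, v in grouped.items())


if __name__ == "__main__":
    print(run_job(["a b a", "b c"]))
```

For a short job with only a few small splits like this one, the useful work per task is tiny, which is why fixed per-task overheads such as resource allocation matter so much.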

Application Information

IPC(8): G06F9/48, G06F9/50
CPC: G06F9/4843, G06F9/5038
Inventors: 史玉良, 崔立真, 李庆忠, 郑永清, 张开会
Owner: SHANDONG UNIV