Optimal localized task scheduling method based on MapReduce

A task scheduling technology, applied in the computer field, that addresses the inability of existing methods to achieve optimal data localization and optimal overall job execution time, as well as their narrow scope of application. Its effects are a shorter overall execution time, a higher degree of parallelism, and reduced network bandwidth usage.

Inactive Publication Date: 2015-03-25
UNIV OF ELECTRONICS SCI & TECH OF CHINA
3 Cites, 21 Cited by

AI Technical Summary

Problems solved by technology

[0004] Various scheduling methods exist to improve the degree of data localization in the Map stage, but they suffer from low practicability and a limited scope of application.
Zaharia et al. proposed a delay scheduling algorithm that effectively improves data localization ("Delay scheduling: a simple technique for achieving locality and fairness in cluster scheduling," in Proceedings of the 5th European Conference on Computer Systems. ACM, 2010, pp. 265–278). However, delay scheduling gains locality at the expense of the execution efficiency of individual jobs, and it is not widely applicable: when only one or a few jobs are running, it cannot achieve optimal data localization or optimal overall job execution time.
Xie et al. proposed a method that distributes data in advance according to the performance of the computing nodes ("Improving MapReduce performance through data placement in heterogeneous Hadoop clusters," in Parallel & Distributed Processing, Workshops and PhD Forum (IPDPSW), 2010 IEEE International Symposium on. IEEE, 2010, pp. 1–9). This method requires measuring the performance of each computing node in advance, which is impractical on a MapReduce platform where the computing resources of each node can be set dynamically by adjusting parameters.

Examples

Embodiment

[0068] This embodiment was run on an experimental Hadoop cluster of 11 physical computing nodes (1 master node and 10 slave nodes) with a data block size of 128 MB. The test cases wc16, wc22, wc38, wc60, and wc98 were run, where wc16 denotes a wordcount test case whose input consists of 16 data blocks. The resulting localization improvement ratio is shown in Figure 4: the degree of localization improved by up to 17.9%. The performance improvement of the Map stage and of the entire MapReduce stage when the network is not congested is shown in Figure 5: the performance of the Map stage increased by 19.7%, and that of the entire MapReduce stage by 17.8%. [...] The performance improvement of the Map stage and of the entire MapReduce stage is shown in Figure 6: the performance of the Map stage improved by 70.4%, and the performance of the entire MapReduce stage has be...

Abstract

The invention provides a MapReduce task scheduling algorithm that works in both homogeneous and heterogeneous cluster environments, and belongs to the technical field of computers. The scheduling algorithm takes the processing performance of every computing node in the cluster into account, abstracts computing nodes and computing tasks into a bipartite graph, and forms the final overall task scheduling scheme by suitably expanding the bipartite graph and applying the KM (Kuhn-Munkres) weighted optimal matching algorithm. Experimental data show that the degree of data localization in the Map stage can be raised to almost 100%, and the overall execution time of a MapReduce job can be reduced by up to 67.1%.
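The bipartite-graph formulation in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the patented method: the function names `hungarian_min` and `schedule` are hypothetical, the "expansion" is assumed to mean one graph vertex per free map slot on each node, and the edge weight is assumed to be 1 when a node holds a replica of the task's input block and 0 otherwise (the patent also weighs node processing performance, which is not modelled here). The matching itself uses a standard Kuhn-Munkres (Hungarian) routine on negated weights.

```python
from typing import Dict, List, Set


def hungarian_min(cost: List[List[float]]) -> List[int]:
    """Kuhn-Munkres: return assignment[row] = column minimizing total cost.

    Requires len(cost) <= len(cost[0]) (no more rows than columns).
    """
    n, m = len(cost), len(cost[0])
    inf = float("inf")
    u = [0.0] * (n + 1)          # row potentials (1-based)
    v = [0.0] * (m + 1)          # column potentials (1-based)
    p = [0] * (m + 1)            # p[j] = row matched to column j, 0 = free
    way = [0] * (m + 1)          # back-pointers along the augmenting path
    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [inf] * (m + 1)
        used = [False] * (m + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = p[j0], inf, 0
            for j in range(1, m + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            for j in range(m + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:
                break
        while j0:                # flip the augmenting path
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    assignment = [-1] * n
    for j in range(1, m + 1):
        if p[j]:
            assignment[p[j] - 1] = j - 1
    return assignment


def schedule(tasks: Dict[str, Set[str]], slots: Dict[str, int]) -> Dict[str, str]:
    """Assign each map task to a node, maximizing the number of local tasks.

    tasks: task id -> nodes holding a replica of its input block (assumed input).
    slots: node id -> number of free map slots (assumed input).
    """
    # Assumed expansion step: one bipartite vertex per free slot.
    slot_nodes = [node for node, k in slots.items() for _ in range(k)]
    task_ids = list(tasks)
    if len(task_ids) > len(slot_nodes):
        raise ValueError("this sketch needs at least one free slot per task")
    # Assumed weight: 1 if local, 0 if remote; negated because we minimize.
    cost = [[-1 if node in tasks[t] else 0 for node in slot_nodes]
            for t in task_ids]
    return {t: slot_nodes[c] for t, c in zip(task_ids, hungarian_min(cost))}


# t1 can run locally on A or B, t2 only on B; a greedy scheduler that
# places t1 on B would force t2 to read its block over the network.
print(schedule({"t1": {"A", "B"}, "t2": {"B"}}, {"A": 1, "B": 1}))
# → {'t1': 'A', 't2': 'B'}
```

The small example shows why a global matching can beat per-task greedy placement: the optimal plan keeps both tasks local, whereas a first-come choice for t1 can strand t2 on a remote node.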

Description

Technical field

[0001] The invention belongs to the technical field of computers, and in particular relates to an optimal localized task scheduling method based on MapReduce.

Background art

[0002] MapReduce task scheduling directly affects the execution time of MapReduce computing jobs, and an efficient scheduling algorithm can effectively improve job execution efficiency.

[0003] The degree of data localization directly affects the execution efficiency of MapReduce jobs. A MapReduce job consists mainly of a Map stage and a Reduce stage. The intermediate output produced by the computing nodes in the Map stage must be transmitted over the network to the computing nodes of the Reduce stage as their input; this intermediate stage is called Shuffle. The network bandwidth consumed by data transmission in the Shuffle stage and by the persistent storage of data in the Reduce stage is unavoidable. Under the conditi...
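The Map, Shuffle, and Reduce stages described in [0003] can be illustrated with a minimal in-memory word count, the same workload the embodiment uses for its test cases. This is a single-process sketch, not Hadoop code: the function names and the byte-sum partitioner are assumptions chosen for illustration, and nothing here crosses a network.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def map_phase(block: str) -> List[Tuple[str, int]]:
    """Map: emit (word, 1) for every word in one input block."""
    return [(word, 1) for word in block.split()]


def shuffle_phase(mapped: Iterable[List[Tuple[str, int]]],
                  num_reducers: int) -> List[Dict[str, List[int]]]:
    """Shuffle: route every key to one reducer and group its values.

    In a real cluster this is the step that moves intermediate data
    over the network from Map nodes to Reduce nodes.
    """
    partitions: List[Dict[str, List[int]]] = [
        defaultdict(list) for _ in range(num_reducers)
    ]
    for pairs in mapped:
        for key, value in pairs:
            # Hypothetical deterministic partitioner: byte sum modulo reducers.
            partitions[sum(key.encode()) % num_reducers][key].append(value)
    return partitions


def reduce_phase(partition: Dict[str, List[int]]) -> Dict[str, int]:
    """Reduce: sum the grouped counts for each word."""
    return {key: sum(values) for key, values in partition.items()}


def word_count(blocks: List[str], num_reducers: int = 2) -> Dict[str, int]:
    """Run Map on each block, Shuffle the output, then Reduce each partition."""
    mapped = [map_phase(block) for block in blocks]
    result: Dict[str, int] = {}
    for partition in shuffle_phase(mapped, num_reducers):
        result.update(reduce_phase(partition))
    return result


print(word_count(["a b a", "b c"]))
```

Each string in `blocks` stands in for one data block; running a map task on the node that already stores its block is what the text calls data localization.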

Application Information

Patent Type & Authority: Applications (China)
IPC: IPC(8): G06F9/50
Inventor: 高胜立, 薛瑞尼, 敖立翔, 管仲洋
Owner: UNIV OF ELECTRONICS SCI & TECH OF CHINA