MapReduce optimizing method suitable for iterative computations

An optimization method for iterative computing, applied in the field of MapReduce optimization. It addresses the problem that existing implementations cannot support iterative computing transparently and efficiently and are poorly suited to it. The stated effects are improved cluster utilization, reduced competition for network resources, and good data locality.

Active Publication Date: 2014-03-05
HUAZHONG UNIV OF SCI & TECH

AI Technical Summary

Problems solved by technology

The current Hadoop (an open-source implementation of the MapReduce model) cannot support iterative computing transparently and efficiently, and some of its features are poorly suited to iterative computing.

Examples

[0091] To verify the feasibility and effectiveness of the present invention, a computer program implementing it was executed in the experimental environment shown in Table 1. The test results are shown in Table 2 and Table 3.

[0092] Table 1: Experimental configuration environment

[0093]

[0094] In Table 2 and Table 3, the present invention is compared against Hadoop-0.20.0 and HaLoop, with fuzzy C-Means as the experimental algorithm. Table 2 compares the volume of dynamic data transmitted over the network by the three MapReduce implementations at different experimental scales. Table 3 compares the execution time of the three implementations for different numbers of iterations at a fixed experimental scale. The experimental results show that the present invention improves both network data transmission and execution time.
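To illustrate why fuzzy C-Means is an iterative workload, the following is a minimal one-dimensional sketch in plain Python. It is not the patent's implementation. The function names `fcm_step` and `fcm`, the fuzzifier `m = 2.0`, and the stopping tolerance are all assumptions made for illustration. The comments label the input points as static data and the cluster centers as dynamic data; that labeling is also an assumption, since this excerpt mentions the two kinds of data without defining them.

```python
def fcm_step(points, centers, m=2.0, eps=1e-12):
    """One fuzzy C-Means iteration on 1-D data; returns the updated centers."""
    num = [0.0] * len(centers)
    den = [0.0] * len(centers)
    for x in points:
        # Map-like work: compute this point's membership in every cluster.
        # eps keeps the distances non-zero when a point sits on a center.
        dists = [abs(x - c) + eps for c in centers]
        for j, dj in enumerate(dists):
            u = 1.0 / sum((dj / dk) ** (2.0 / (m - 1.0)) for dk in dists)
            w = u ** m
            # Reduce-like work: accumulate weighted sums per cluster.
            num[j] += w * x
            den[j] += w
    return [n / d for n, d in zip(num, den)]


def fcm(points, centers, tol=1e-6, max_iter=100):
    """Repeat fcm_step until the centers move by less than tol."""
    iterations = 0
    for iterations in range(1, max_iter + 1):
        # points: identical in every iteration (assumed "static" data).
        # centers: recomputed in every iteration (assumed "dynamic" data).
        new_centers = fcm_step(points, centers, m=2.0)
        shift = max(abs(a - b) for a, b in zip(new_centers, centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, iterations


if __name__ == "__main__":
    data = [0.0, 0.1, 0.2, 9.8, 9.9, 10.0]
    print(fcm(data, [1.0, 8.0]))
```

Each pass reads the same points again and changes only the centers, which is the pattern that makes repeated job submission on an unmodified MapReduce runtime costly.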

[0095] Tab...

Abstract

The invention discloses a MapReduce optimization method suitable for iterative computing. The method is applied to a Hadoop cluster system comprising a master node and a plurality of slave nodes, and includes the following steps: the master node receives a plurality of Hadoop jobs submitted by a user; a job service process on the master node places the jobs in a job queue, where they wait to be scheduled by the master node's job scheduler; the master node waits for task requests from the slave nodes; when the master node receives a task request, its job scheduler preferentially schedules data-local tasks; and if the requesting slave node has no data-local task, predictive scheduling is performed according to the task types of the Hadoop jobs. The method supports traditional data-intensive applications and also supports iterative computing transparently and efficiently; dynamic data and static data can be handled separately; and the volume of data transmitted can be reduced.
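The scheduling steps listed in the abstract can be sketched as follows. This is a hedged illustration in plain Python, not Hadoop code and not the patented scheduler. The classes `Task`, `Job`, and `JobScheduler`, the `preferred_nodes` field, and the helper `pick_by_type` are hypothetical names. In particular, the abstract does not say what the type-based fallback actually does, so `pick_by_type` is only a placeholder that prefers map tasks over reduce tasks.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Task:
    task_id: str
    task_type: str                              # "map" or "reduce"
    preferred_nodes: frozenset = frozenset()    # nodes holding the task's input


@dataclass
class Job:
    job_id: str
    pending: list = field(default_factory=list)


class JobScheduler:
    """Hypothetical scheduler: queue jobs, answer task requests from nodes."""

    def __init__(self):
        self.queue = deque()

    def submit(self, job):
        # Submitted jobs wait in a queue until they are scheduled.
        self.queue.append(job)

    def handle_request(self, node):
        # Step 1: prefer a task whose input data is local to the requester.
        for job in self.queue:
            for task in job.pending:
                if node in task.preferred_nodes:
                    job.pending.remove(task)
                    return task, "local"
        # Step 2: no local task exists, so fall back to a type-based choice.
        for job in self.queue:
            if job.pending:
                task = self.pick_by_type(job)
                job.pending.remove(task)
                return task, "fallback"
        return None, "idle"

    @staticmethod
    def pick_by_type(job):
        # Placeholder policy (an assumption, not taken from the source):
        # hand out map tasks before reduce tasks.
        maps = [t for t in job.pending if t.task_type == "map"]
        return (maps or job.pending)[0]


if __name__ == "__main__":
    scheduler = JobScheduler()
    scheduler.submit(Job("job-1", [
        Task("t1", "map", frozenset({"node-1"})),
        Task("t2", "reduce", frozenset({"node-2"})),
    ]))
    for requester in ("node-2", "node-3", "node-3"):
        print(requester, scheduler.handle_request(requester))
```

The locality check runs across all queued jobs before any fallback is considered, which mirrors the order of the steps in the abstract.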

Description

Technical field

[0001] The invention belongs to the field of parallel computing and massive data processing, and more specifically relates to a MapReduce optimization method suitable for iterative computing.

Background

[0002] In the 21st century, the scale of data processing keeps growing: terabyte-scale data sets are increasingly common, and petabyte-scale data sets have appeared. Data at this scale is far beyond the processing power of a personal computer, and this demand has driven the development of parallel and distributed computing models. Google's MapReduce model emerged in this context and is a popular data-intensive computing model for large cluster environments.

[0003] MapReduce is a programming model for parallel operations on large-scale data sets (greater than 1TB). The concepts "Map" and "Reduce", and their main ideas, are borrowed from functional programming languages,...
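As a small illustration of the "Map" and "Reduce" concepts mentioned above, here is a single-process word count in plain Python. It is a sketch of the programming model only, not of Hadoop's API. The function names `map_phase`, `shuffle`, `reduce_phase`, and `word_count` are hypothetical.

```python
from collections import defaultdict
from functools import reduce


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group emitted values by key, as the framework does between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: fold all values for one key into a single count."""
    return key, reduce(lambda a, b: a + b, values)


def word_count(documents):
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["map reduce map", "reduce shuffle"]))
```

Because each call to `map_phase` depends only on its own document and each call to `reduce_phase` only on its own key, both phases can be distributed across machines, which is the property the model relies on.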

Application Information

Patent Type & Authority: Applications (China)
IPC: IPC(8): G06F9/50
Inventor: 金海, 郑然, 余根茂, 章勤, 朱磊
Owner: HUAZHONG UNIV OF SCI & TECH