Non-accurate task parallel processing method based on MapReduce

A parallel processing method for non-accurate tasks, applied in the field of cloud computing. It addresses the problems that redistributing lagging Map subtasks cannot guarantee a shorter Map task time and wastes computing resources, and it achieves the effect of shortening execution time and improving efficiency.

Active Publication Date: 2014-03-19
NAT UNIV OF DEFENSE TECH

AI Technical Summary

Problems solved by technology

Although redistributing lagging Map subtasks can mitigate their lag, it cannot guarantee that the time of the entire Map task will be shortened; and even when that time is shortened, additional computing resources are wasted.



Examples


Embodiment

[0041] Test environment: A computing cluster of 40 homogeneous computing nodes in the same rack was built using virtual machine technology; 32 of these nodes are used for Map subtask execution. The rack contains multiple physical nodes, each with a quad-core Xeon processor and 4 GB of memory and each hosting four homogeneous virtual machines. The nodes are interconnected through switches in the rack.

[0042] The experiment is a WordCount task containing 128 Map subtasks, each responsible for processing 128 MB of data. Severe subtask lag is simulated by adjusting random parameters. The judgment condition of the Check node is a number of completed subtasks: once the number of Map subtask completion messages received reaches this value, the Map task execution process ends.
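To make the effect of a completion-count condition concrete, the following is a minimal sketch, not the experiment's actual code. It assumes a greedy scheduler that hands the next subtask to whichever node becomes free first, and the durations used in the usage note are made-up values. The helper `map_phase_time` is a hypothetical name introduced only for this illustration.

```python
import heapq


def map_phase_time(durations, slots, required):
    """Return the time at which `required` Map subtasks have completed.

    durations -- per-subtask run times (illustrative values, in seconds)
    slots     -- number of nodes executing Map subtasks
    required  -- completion count used as the Check node's condition
    """
    if not 1 <= required <= len(durations):
        raise ValueError("required must be between 1 and the subtask count")

    # Min-heap of the times at which each node becomes free.
    free_at = [0.0] * slots
    heapq.heapify(free_at)

    finish_times = []
    for duration in durations:
        start = heapq.heappop(free_at)   # earliest available node
        end = start + duration
        finish_times.append(end)
        heapq.heappush(free_at, end)

    # The Map phase ends when the required-th completion message arrives.
    finish_times.sort()
    return finish_times[required - 1]
```

For example, with 128 subtasks of 10 s each on 32 nodes and a single assumed 100 s straggler, `map_phase_time(durations, 32, 128)` gives 100.0, while a condition of 127 completions gives 50.0, because the Map phase no longer waits for the lagging subtask.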

[0043] Experiment 1: Verify the effect of lagging subtasks on the execution flow of MapReduce.

[0044] 128 Map subtasks of 128MB data a...


Abstract

A MapReduce-based parallel processing method for non-accurate tasks comprises the following steps: (1) M Map subtask execution nodes and one Check node are selected, and the Check node stores the logic for judging whether a task with non-accurate computing characteristics is complete; (2) the task is divided into N Map subtasks, which are distributed to the M execution nodes for execution; (3) after each Map subtask finishes, its completion information and the result information relevant to the judging logic are sent to the Check node; (4) each time the Check node receives Map subtask completion information, it accumulates that information into the previous result, until the accumulated result satisfies the judging logic; (5) the Map subtasks whose completion information the Check node has not received are terminated, the Map task execution process ends, and the Reduce subtasks are carried out synchronously; and (6) the Reduce task result is obtained and execution of the whole task is complete.
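The six steps can be sketched on a single machine as follows. This is only an illustration under stated assumptions, not the patented implementation: threads stand in for execution nodes, a shared `threading.Event` stands in for the message that ends unfinished subtasks, and the names `run_inexact_job`, `fast_subtask`, and `lagging_subtask` are hypothetical.

```python
import threading
from concurrent.futures import ThreadPoolExecutor, as_completed


def run_inexact_job(subtasks, num_nodes, is_satisfied, reduce_fn):
    """Run Map subtasks until the Check logic is satisfied, then Reduce.

    subtasks     -- callables taking a threading.Event; each returns a partial
                    result, or None if it saw the stop signal and gave up
    num_nodes    -- number of concurrent executors (M in the abstract)
    is_satisfied -- judging logic applied to the accumulated results
    reduce_fn    -- combines the accumulated partial results
    """
    stop = threading.Event()   # set by the Check logic to end stragglers
    accumulated = []           # results the Check logic has received so far

    with ThreadPoolExecutor(max_workers=num_nodes) as pool:
        futures = [pool.submit(task, stop) for task in subtasks]
        for future in as_completed(futures):
            result = future.result()
            if result is not None:
                accumulated.append(result)
            if is_satisfied(accumulated):
                stop.set()             # running subtasks should exit
                for f in futures:
                    f.cancel()         # queued subtasks never start
                break
    return reduce_fn(accumulated)


def fast_subtask(stop):
    # Assumed workload: pretend one data block was processed.
    return 1


def lagging_subtask(stop):
    # Simulated straggler: waits up to 30 s, but exits once told to stop.
    if stop.wait(timeout=30):
        return None
    return 1
```

Calling `run_inexact_job([lagging_subtask] * 2 + [fast_subtask] * 6, num_nodes=8, is_satisfied=lambda acc: len(acc) >= 6, reduce_fn=sum)` returns once six completions have arrived, without waiting out the two simulated stragglers. Cooperative cancellation through an event is a design choice of this sketch; the source does not say how the unfinished subtasks are ended.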

Description

Technical field

[0001] The invention belongs to the field of cloud computing and relates to a MapReduce-based task parallel processing method oriented to non-accurate computing applications.

Background technique

[0002] MapReduce is the most popular parallel computing method in the field of cloud computing, and its appearance was a milestone event in this field. MapReduce supports the use of a large number of computing nodes to process big data jobs in parallel, which is of great significance for data-intensive computing. The MapReduce execution flow is shown in Figure 1. Its calculation process consists of two basic tasks: Map and Reduce. The Map task divides the input data into several data subsets and generates, for each subset, a Map subtask running on a single node; each subtask generates an intermediate result in the form of <key, value>. The Reduce task is likewise composed of several subtasks, each running on a single node...
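The Map and Reduce roles described in the background can be illustrated with a small WordCount sketch. This is an assumption-laden illustration in plain Python that runs sequentially in one process; the function names are hypothetical, and a real MapReduce framework would distribute the subtasks across nodes.

```python
from collections import defaultdict


def map_subtask(data_subset):
    """Emit one <word, 1> pair for every word in this data subset."""
    return [(word, 1) for word in data_subset.split()]


def reduce_subtask(key, values):
    """Combine all intermediate values that share the same key."""
    return key, sum(values)


def word_count(data_subsets):
    # Group the intermediate <key, value> pairs by key.
    grouped = defaultdict(list)
    for subset in data_subsets:
        for key, value in map_subtask(subset):
            grouped[key].append(value)
    # Run one Reduce step per key and collect the final counts.
    return dict(reduce_subtask(k, v) for k, v in grouped.items())
```

Here `word_count(["a b a", "b c"])` yields `{"a": 2, "b": 2, "c": 1}`: each string plays the part of one data subset handled by one Map subtask.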


Application Information

Patent Type & Authority Applications(China)
IPC(8): G06F9/50, G06F9/38
Inventor: 汪昌健, 彭宇行, 李慧霸, 黄震, 彭绍亮, 李姗姗, 李兵, 谢宝宁
Owner NAT UNIV OF DEFENSE TECH