
Method and device for distributed computation on basis of MapReduce

A distributed computing technique, applied in the field of distributed computing, that addresses the problem of increased processing pressure on the master node (Namenode).

Active Publication Date: 2013-12-18
阿里巴巴(成都)软件技术有限公司

AI Technical Summary

Problems solved by technology

In a distributed file system such as HDFS, an excessive number of files increases the processing pressure on the master node (Namenode), turning read and write operations in the distributed file system into a processing bottleneck.



Embodiment Construction

[0062] Referring to Figure 5, the MapReduce-based distributed computing method of the present application comprises the following steps:

[0063] Step S51: the client application sends an instruction to the MapReduce compilation tool.

[0064] The instruction either directly contains the data it operates on, or contains the storage address of that data in the distributed file system; the latter is more common.

[0065] The instruction is in a format executable by the MapReduce compilation tool, such as a HiveQL statement or a Pig Latin statement.

[0066] Step S52: the MapReduce compilation tool compiles the instruction into one or more jobs, queries the database for the job record corresponding to the instruction, and uses the number of reducers in the job record found as the number of reducers for the job. The MapReduce compilation tool submits each job and the corresponding number of reducers to the MapReduce so...
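The lookup in step S52 can be illustrated with a minimal Python sketch. Everything named here is hypothetical and not taken from the patent: the `JobRecord` and `JobHistory` types, the in-memory dictionary standing in for the database, the whitespace-normalized hash used as the lookup key, and the fallback to a default reducer count when no successful record exists.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, Optional

DEFAULT_REDUCERS = 1  # assumed fallback value; the source does not specify one


@dataclass
class JobRecord:
    """Hypothetical record of how a previously run job was configured."""
    num_reducers: int
    succeeded: bool


class JobHistory:
    """In-memory stand-in for the database of job records."""

    def __init__(self) -> None:
        self._records: Dict[str, JobRecord] = {}

    @staticmethod
    def key_for(instruction: str) -> str:
        # Assumption: instructions that differ only in whitespace are the same.
        normalized = " ".join(instruction.split())
        return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

    def record(self, instruction: str, num_reducers: int, succeeded: bool) -> None:
        # Called after a job finishes, so later runs can reuse its setting.
        self._records[self.key_for(instruction)] = JobRecord(num_reducers, succeeded)

    def lookup(self, instruction: str) -> Optional[JobRecord]:
        return self._records.get(self.key_for(instruction))


def choose_reducers(history: JobHistory, instruction: str,
                    default: int = DEFAULT_REDUCERS) -> int:
    """Reuse the reducer count of a successful earlier run, else the default."""
    found = history.lookup(instruction)
    if found is not None and found.succeeded:
        return found.num_reducers
    return default
```

In this sketch, calling `history.record(...)` after each job and `choose_reducers(...)` before the next one is what closes the loop: the first run of an instruction uses the default, and later runs inherit the count stored for it.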


Abstract

The invention discloses a method and device for distributed computation based on MapReduce. At the level of the MapReduce compilation tool, the method processes data both before and after the MapReduce software framework processes a job: on one hand it records the framework's processing results for historical jobs, and on the other hand it processes the current job using the successful processing results of historical jobs. This forms a closed-loop feedback system that solves the difficult problem of how to set the number of reducers.

Description

Technical field

[0001] The present application relates to a distributed computing method.

Background technique

[0002] The Hadoop project is a software framework developed by the Apache Software Foundation to support distributed applications that process large amounts of data across many independent computers; the amount of data processed can reach the petabyte level (1 PB is 10^15 bytes).

[0003] See Figure 1. The Hadoop software framework includes at least:

[0004] A distributed file system 10 at the bottom layer, commonly HDFS (Hadoop Distributed File System).

[0005] HDFS is a distributed file system with a master-slave architecture, consisting of one master node, the Namenode (name node), and multiple slave nodes, the Datanodes (data nodes). HDFS divides a file into one or more data blocks, and these data blocks are stored on a set of Datanodes. Namenode is responsible for m...
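The block division described in paragraph [0005] can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the block size and Datanode names are caller-supplied placeholders, and the round-robin placement is a simplification chosen for this sketch (real HDFS also replicates each block and applies its own placement policy).

```python
from typing import Dict, List


def split_into_blocks(file_size: int, block_size: int) -> List[int]:
    """Return the size in bytes of each block a file would be divided into."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    if file_size < 0:
        raise ValueError("file_size must not be negative")
    full_blocks, remainder = divmod(file_size, block_size)
    sizes = [block_size] * full_blocks
    if remainder:
        sizes.append(remainder)  # the last block may be shorter than the rest
    return sizes


def place_blocks(num_blocks: int, datanodes: List[str]) -> Dict[int, str]:
    """Assign block indices to Datanodes round-robin (illustrative assumption)."""
    if not datanodes:
        raise ValueError("at least one Datanode is required")
    return {i: datanodes[i % len(datanodes)] for i in range(num_blocks)}
```

For example, `split_into_blocks(300, 128)` yields three blocks, and `place_blocks(3, ["dn1", "dn2"])` spreads them over the two named nodes. The Namenode's role in this picture is to hold the resulting mapping, which is why the number of files and blocks affects its load.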


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F9/50
Inventor: 王勇, 廖新涛, 徐冬
Owner: 阿里巴巴(成都)软件技术有限公司