Distributed data processing method, device and system

A distributed data processing technology, applied in the field of data processing, that addresses the large network traffic caused by data migration. It avoids data migration between working nodes, keeps disk reads and writes from occupying system resources, and improves distributed data processing.

Active Publication Date: 2016-12-28
无锡苏惠信息技术服务有限公司
4 Cites, 0 Cited by

AI Technical Summary

Problems solved by technology

[0004] Embodiments of the present invention provide a distributed data processing method, device, and system that avoid the large network traffic caused by data migration between working nodes during distributed data processing, and thereby improve the system's distributed data processing performance.

Examples

Embodiment 1

[0028] Figure 1 is a flow chart of the distributed data processing method provided by Embodiment 1 of the present invention. As shown in Figure 1, the method provided in this embodiment is applied to data processing in a Map-Reduce system, which may include a master node and at least two working nodes. The method may be executed by a distributed data processing device; the device may be the master node and may be implemented in software and/or hardware.

[0029] The distributed data processing method provided in this embodiment specifically includes:

[0030] Step 10: the master node generates a Map task according to the obtained upload node indication information and the task acquisition request sent by the working node, wherein the upload node indication information includes addr...
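
The locality rule in Step 10 can be illustrated with a minimal sketch. It assumes the upload node indication information can be modeled as a mapping from data block id to working node address; the names `MapTask`, `generate_map_task`, and `upload_node_info` are hypothetical and do not come from the patent text.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set


@dataclass
class MapTask:
    worker_addr: str       # working node that will execute the task
    block_ids: List[str]   # blocks covered by the task, all stored on that node


def generate_map_task(
    upload_node_info: Dict[str, str],  # assumed shape: block id -> node address
    requesting_worker: str,
    assigned: Set[str],                # block ids already covered by earlier tasks
) -> Optional[MapTask]:
    """Build a Map task only from blocks stored on the requesting node."""
    local_blocks = [
        block_id
        for block_id, addr in upload_node_info.items()
        if addr == requesting_worker and block_id not in assigned
    ]
    if not local_blocks:
        return None  # nothing local is left for this node
    assigned.update(local_blocks)
    return MapTask(worker_addr=requesting_worker, block_ids=local_blocks)


# Illustrative data: three blocks spread over two working nodes.
upload_node_info = {"b0": "10.0.0.1", "b1": "10.0.0.2", "b2": "10.0.0.1"}
assigned: Set[str] = set()
task = generate_map_task(upload_node_info, "10.0.0.1", assigned)
```

Because the task is built only from blocks already held by the node that asked for work, no block has to move between working nodes before the Map phase runs.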

Embodiment 2

[0039] Figure 3 is a flow chart of the distributed data processing method provided by Embodiment 2 of the present invention. As shown in Figure 3, this embodiment builds on the embodiment shown in Figure 1. Before Step 10, in which the master node generates a Map task according to the obtained upload node indication information and the task acquisition request sent by the working node, the following step may be included:

[0040] Step 30: the master node generates file division indication information and the upload node indication information according to the file information sent by the client, and sends both to the client, so that the client divides the file to be processed into a plurality of data blocks according to the file division indication information, and sends each data block to a corresponding wor...
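
Step 30 can be sketched as follows. This is only an illustration under stated assumptions: the file information is reduced to a file size, the file division indication information is modeled as a list of byte ranges, and blocks are placed on working nodes round-robin. The patent text shown here does not specify a placement policy, and `plan_upload` and `split_file` are hypothetical names.

```python
from typing import Dict, List, Tuple


def plan_upload(
    file_size: int, block_size: int, workers: List[str]
) -> Tuple[List[Tuple[int, int]], Dict[int, str]]:
    """Master side: return (file division info, upload node info).

    Division info is a list of (offset, length) byte ranges. Upload node info
    maps each block index to a working node address. Round-robin placement is
    an assumption made for this sketch.
    """
    if block_size <= 0 or not workers:
        raise ValueError("block_size must be positive and workers non-empty")
    division = [
        (offset, min(block_size, file_size - offset))
        for offset in range(0, file_size, block_size)
    ]
    upload_nodes = {i: workers[i % len(workers)] for i in range(len(division))}
    return division, upload_nodes


def split_file(data: bytes, division: List[Tuple[int, int]]) -> List[bytes]:
    """Client side: cut the file into data blocks as the master indicated."""
    return [data[offset:offset + length] for offset, length in division]


data = b"abcdefghij"  # 10-byte stand-in for the file to be processed
division, upload_nodes = plan_upload(len(data), 4, ["w1", "w2"])
blocks = split_file(data, division)
```

The master keeps `upload_nodes`, so it later knows which working node holds each block without asking the nodes.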

Embodiment 3

[0060] Figure 7 is a schematic structural diagram of a distributed data processing device provided in Embodiment 3 of the present invention. As shown in Figure 7, the device provided in this embodiment can implement each step of the distributed data processing method provided in any embodiment of the present invention; those steps are not repeated here.

[0061] The distributed data processing device provided in this embodiment includes a task generating unit 11 and a task allocating unit 12. The task generating unit 11 is configured to generate a Map task according to the obtained upload node indication information and the task acquisition request sent by the working node, wherein the upload node indication information includes the addresses of the working nodes corresponding to the plurality of data blocks respectively, and the data blocks corresponding to the Map task are distributed on the working node...
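
One way to picture the two-unit structure of the device is the sketch below. The class names mirror units 11 and 12, but the interfaces, the `send` callback, and the `MasterDevice` wrapper are assumptions made for illustration and are not taken from the patent.

```python
from typing import Callable, Dict, List


class TaskGeneratingUnit:
    """Stands in for unit 11: selects the blocks held by the requesting node."""

    def __init__(self, upload_node_info: Dict[str, str]) -> None:
        # Assumed shape: block id -> working node address.
        self.upload_node_info = upload_node_info

    def generate(self, requesting_worker: str) -> List[str]:
        return [
            block_id
            for block_id, addr in self.upload_node_info.items()
            if addr == requesting_worker
        ]


class TaskAllocatingUnit:
    """Stands in for unit 12: hands the Map task to the requesting node."""

    def __init__(self, send: Callable[[str, List[str]], None]) -> None:
        self.send = send  # hypothetical transport callback

    def allocate(self, worker: str, block_ids: List[str]) -> None:
        self.send(worker, block_ids)


class MasterDevice:
    """Hypothetical wrapper that chains the two units for one request."""

    def __init__(self, generator: TaskGeneratingUnit, allocator: TaskAllocatingUnit) -> None:
        self.generator = generator
        self.allocator = allocator

    def on_task_request(self, worker: str) -> None:
        self.allocator.allocate(worker, self.generator.generate(worker))


sent = []  # records (worker, block ids) pairs instead of using a network
device = MasterDevice(
    TaskGeneratingUnit({"b0": "w1", "b1": "w2", "b2": "w1"}),
    TaskAllocatingUnit(lambda worker, block_ids: sent.append((worker, block_ids))),
)
device.on_task_request("w2")
```

Keeping generation and allocation in separate units lets the transport be swapped (here, a list append) without touching the locality logic.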

Abstract

An embodiment of the invention provides a distributed data processing (DDP) method, device and system. The DDP method comprises the following steps: a Map task is generated according to obtained upload node indication information and a task acquisition request sent by a working node, wherein the upload node indication information comprises the addresses of the working nodes corresponding to a plurality of data blocks respectively, and the data blocks corresponding to the Map task are distributed on the working node that sent the task acquisition request; and the Map task is allocated to that working node, so that the working node processes the data blocks corresponding to the Map task. The DDP method, device and system provided by the embodiment avoid the large network traffic caused by data migration among working nodes during DDP, prevent disk reads and writes from occupying system resources, and improve the DDP performance of the system.

Description

Technical field

[0001] Embodiments of the present invention relate to data processing technologies, and in particular to a distributed data processing method, device, and system.

Background

[0002] With the rapid development of Internet technology, the era of massive data has arrived, and how to process massive data has become a severe test that must be faced. A Map-Reduce system is a distributed parallel system, usually used in distributed massive data processing scenarios. The Map-Reduce system realizes distributed processing of data through a mapping (Map) phase and a reduction (Reduce) phase.

[0003] An existing Map-Reduce system usually has multiple working nodes for data processing. After the client divides the file to be processed into multiple data blocks, the data blocks are uploaded block by block to the working nodes. However, since the multiple data blocks corresponding to the Map task executed by the working node are not necessar...
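
As background for the Map and Reduce phases mentioned above, here is a minimal single-process sketch using word counting as the job. It is a generic illustration of the Map-Reduce idea, not the system described in the patent; `map_phase` and `reduce_phase` are hypothetical names, and a real system would run the map calls on separate working nodes.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def map_phase(block: str) -> List[Tuple[str, int]]:
    """Map: turn one data block into (key, value) pairs."""
    return [(word, 1) for word in block.split()]


def reduce_phase(pairs: Iterable[Tuple[str, int]]) -> Dict[str, int]:
    """Reduce: combine all values that share a key."""
    totals: Dict[str, int] = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


blocks = ["a b a", "b c"]  # two stand-in data blocks
intermediate = [pair for block in blocks for pair in map_phase(block)]
counts = reduce_phase(intermediate)
```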

Application Information

Patent Type & Authority: Patents (China)
IPC: IPC(8): G06F9/50
Inventor: 钱剑锋, 颜友亮
Owner: 无锡苏惠信息技术服务有限公司