Method for processing cross-task data in a distributed network system

A distributed-network and task-data technique applied to the processing of cross-task data. It addresses the problems that the locality of cross-task data is not considered and that task computation speed is limited as a result; it avoids repeated loading of input data and allows that data to be reused.

Inactive Publication Date: 2011-07-27
FUDAN UNIV

AI Technical Summary

Problems solved by technology

Current designs and implementations of the map/reduce parallel programming model (such as Hadoop) do not take data locality across tasks into account, so in such applications each task must reload its input data from the distributed file system over the network. Repeated loading of the same input data limits the computation speed of the task.

Embodiment Construction

[0019] As shown in Figure 2, the distributed file system architecture operated according to the method of the present invention includes a name node, namely the master node, and multiple worker nodes, namely the slave nodes. On the cluster, each job executor corresponds to one worker node, and the master node (not shown in the figure) is responsible for scheduling all worker nodes.

[0020] According to the method of the present invention, a cache system comprising a cache server and a cache client is set up at each distributed slave node and works in an access-proxy mode similar to a distributed file system. As shown in the figure, all read requests from the job executor are forwarded by the cache client to the cache server; the cache server reads the requested data from the distributed file system, saves it in a local cache in shared memory, and notifies the master node by means of a semaphore. The cache system can save the data read by the t...
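The read path described in [0020] can be illustrated with a minimal sketch. It is not the patent's implementation: the class names `CacheServer` and `CacheClient`, the `dfs_read` and `notify_master` callbacks, and the `dfs_reads` counter are all hypothetical, an in-process dictionary stands in for the shared-memory cache, and a plain callback stands in for the semaphore-based notification.

```python
from typing import Callable, Dict


class CacheServer:
    """Per-slave-node cache in front of the distributed file system (sketch)."""

    def __init__(
        self,
        node_id: str,
        dfs_read: Callable[[str], bytes],           # assumed DFS read function
        notify_master: Callable[[str, str], None],  # stand-in for the semaphore signal
    ) -> None:
        self.node_id = node_id
        self._dfs_read = dfs_read
        self._notify_master = notify_master
        self._cache: Dict[str, bytes] = {}  # stand-in for the shared-memory cache
        self.dfs_reads = 0                  # counts reads that actually hit the DFS

    def read(self, data_id: str) -> bytes:
        if data_id not in self._cache:
            # Cache miss: load once from the DFS, keep a local copy,
            # and tell the master node where the data now lives.
            self._cache[data_id] = self._dfs_read(data_id)
            self.dfs_reads += 1
            self._notify_master(self.node_id, data_id)
        return self._cache[data_id]


class CacheClient:
    """Access proxy used by the job executor; forwards every read to the server."""

    def __init__(self, server: CacheServer) -> None:
        self._server = server

    def read(self, data_id: str) -> bytes:
        return self._server.read(data_id)
```

In this sketch a second read of the same identifier on the same node is served from the local copy, so the distributed file system is contacted only once per cached item.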

Abstract

The invention belongs to the technical field of network information and relates to a method for processing cross-task data in a distributed network system that includes a master node and a plurality of slave nodes. The method comprises the following steps: providing a corresponding cache on each slave node; saving the data of the current task executed on a slave node in the corresponding cache, generating a message comprising the saved data and the position information of the saved data, and transmitting the message to the master node; and, when a subsequent task is requested at any slave node of the network and the data it requires is the same as the previously saved data, having the master node issue a scheduling instruction to a slave node that holds that data, commanding it to execute the subsequent task using the previously saved data. In this way cross-task data is reused, repeated loading of the same input data from the distributed file system is avoided, repeated access to the same data becomes faster, and network traffic is effectively reduced.
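The scheduling step in the abstract can be sketched as follows. This is only an illustration under stated assumptions: the `MasterNode` class, its `report_cached` and `schedule` methods, and the round-robin fallback for data that no node has cached are hypothetical and are not taken from the patent text, which does not say how a node is chosen when no cached copy exists.

```python
from typing import Dict, List, Set


class MasterNode:
    """Tracks which slave nodes cache which data and schedules tasks accordingly."""

    def __init__(self, slave_ids: List[str]) -> None:
        self._slaves = list(slave_ids)
        self._locations: Dict[str, Set[str]] = {}  # data id -> nodes caching it
        self._next = 0                             # round-robin cursor (assumption)

    def report_cached(self, node_id: str, data_id: str) -> None:
        # Called when a slave node reports that it has saved data in its cache.
        self._locations.setdefault(data_id, set()).add(node_id)

    def schedule(self, data_id: str) -> str:
        holders = self._locations.get(data_id)
        if holders:
            # Some node already caches this data: send the task there.
            # min() just makes the choice deterministic in this sketch.
            return min(holders)
        # No cached copy anywhere: fall back to round-robin (assumed policy).
        node = self._slaves[self._next % len(self._slaves)]
        self._next += 1
        return node
```

For example, once a node has reported that it caches a given input, a later task that needs the same input is directed to that node rather than to the next node in rotation.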

Description

Technical field

[0001] The invention belongs to the technical field of network information, and in particular relates to a method for processing cross-task data in a distributed network system.

Background

[0002] With the growth of user business needs and the development of network technology, the storage and computation of massive data pose new challenges to traditional computer systems. Massive data refers to a data collection of very large volume (often on the order of terabytes or more), which is much larger than general-purpose databases and far exceeds the storage and processing capabilities of a single computer. To solve the management and access problems of massive data, a task that requires such large computing and storage capabilities is divided into many subtasks through a distributed network system; these subtasks are then assigned and scheduled to multiple interconnected computer nodes for parallel processing, and finally combine th...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): H04L29/08
Inventor: Chen Haibo (陈海波), Xiao Zhiwei (肖之慰), Zang Binyu (臧斌宇)
Owner: FUDAN UNIV