A shuffle method for non-volatile memory

Non-volatile memory and persistence technology, applied in the field of big data processing, addresses the problems of high memory performance requirements, large time overhead, and dependence on network performance, and achieves improved efficiency, fast positioning, and better space utilization.

Active Publication Date: 2020-06-05

INST OF COMPUTING TECH CHINESE ACAD OF SCI

5 Cites, 0 Cited by


Problems solved by technology

[0005] Themis, published in the Proceedings of the 3rd ACM Symposium on Cloud Computing (SoCC), 2012, proposes a dynamic memory allocation strategy in the Shuffle stage to hold data in flight, so that during processing the data is read from and written to disk only twice and the rest of the process does not interact with the disk. SpongeFiles, published in the Proceedings of the 2014 ACM SIGMOD International Conference on Management of Data, proposes sharing the unused memory space in Tasks. Both methods accelerate Shuffle through memory alone, which places high demands on memory performance.

[0006] In addition, Sailfish, published in the Proceedings of the 3rd ACM Symposium on Cloud Computing (SoCC), 2012, proposes that when Shuffle data is written, the data of each partition produced by the Map Tasks is gathered together and stored in a distributed file system. Hadoop-A, published in the Proceedings of the 2011 International Conference for High Performance Computing, Networking, Storage and Analysis, proposes exploiting a high-speed network (RDMA) and using the Network-Levitated Merge algorithm to execute the Shuffle stage. The disadvantage of these two methods is that they depend too heavily on network performance, and accessing data through a file system incurs a relatively large time overhead.



Embodiment Construction

[0028] In order to make the object, technical solution and advantages of the present invention clearer, the Shuffle method for non-volatile memory provided in the embodiment of the present invention will be described below with reference to the accompanying drawings.

[0029] To study the impact of Shuffle performance on overall performance, the inventor took the Sort application as an example and measured how its running time on Spark changes as the amount of Shuffle data changes.

[0030] Figure 2 is a graph of the influence of Shuffle data volume on Sort execution time. As shown in Figure 2, as the amount of Shuffle data increases, the performance of Spark drops significantly. This is because the data is partitioned when it is read between the Map task and the Reduce task. Therefore, for a given Reduce task, the amount of data read from one Map task is inversely proportional to the total number of Reduce tasks; this wil...
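The inverse relationship between per-fetch size and the number of Reduce tasks can be illustrated with a small back-of-the-envelope sketch. It assumes, which the source does not state, that a Map task's output is spread evenly across partitions; the function name and all numbers are hypothetical.

```python
def fetch_size_per_map(map_output_bytes: int, num_reduce_tasks: int) -> float:
    """Bytes one Reduce task reads from one Map task, assuming even partitioning."""
    if num_reduce_tasks <= 0:
        raise ValueError("num_reduce_tasks must be positive")
    return map_output_bytes / num_reduce_tasks


# Hypothetical numbers: one 128 MiB Map output split across more and more Reduce tasks.
map_output = 128 * 1024 * 1024
sizes = {r: fetch_size_per_map(map_output, r) for r in (16, 256, 4096)}

for reducers, size in sizes.items():
    print(f"{reducers:>5} Reduce tasks -> {size / 1024:.0f} KiB per fetch")
```

Under this assumption, each individual read shrinks as the number of Reduce tasks grows, while the number of reads grows by the same factor.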


Abstract

The invention relates to a Shuffle method for non-volatile memory. The Shuffle method includes the following steps: using a partition ID to write the output data of a Map task into a persistent buffer; and pulling the data in the persistent buffer that corresponds to a Reduce task.
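The two steps in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the "persistent buffer" here is an ordinary in-process dictionary standing in for a non-volatile memory region, and the class and function names (`PersistentBufferSketch`, `run_map_task`, `run_reduce_task`) are hypothetical. The Map output is assumed to arrive already tagged with a partition ID.

```python
from collections import defaultdict


class PersistentBufferSketch:
    """Stand-in for a persistent buffer: a plain dict, NOT real non-volatile memory."""

    def __init__(self, num_partitions: int) -> None:
        self.num_partitions = num_partitions
        # partition ID -> list of (key, value) records written by Map tasks
        self._regions = defaultdict(list)

    def write(self, partition_id: int, record: tuple) -> None:
        if not 0 <= partition_id < self.num_partitions:
            raise ValueError("partition ID out of range")
        self._regions[partition_id].append(record)

    def pull(self, partition_id: int) -> list:
        # Return a copy so the caller cannot mutate the buffer contents.
        return list(self._regions.get(partition_id, []))


def run_map_task(tagged_output, buffer: PersistentBufferSketch) -> None:
    """Step 1: write Map output into the buffer, addressed by partition ID."""
    for partition_id, key, value in tagged_output:
        buffer.write(partition_id, (key, value))


def run_reduce_task(partition_id: int, buffer: PersistentBufferSketch) -> dict:
    """Step 2: pull only the data belonging to this Reduce task, then sum per key."""
    totals = defaultdict(int)
    for key, value in buffer.pull(partition_id):
        totals[key] += value
    return dict(totals)


# Hypothetical Map output: (partition ID, key, value) triples.
buffer = PersistentBufferSketch(num_partitions=2)
run_map_task([(0, "a", 1), (1, "b", 2), (0, "a", 3)], buffer)
results = {pid: run_reduce_task(pid, buffer) for pid in range(2)}
```

Addressing the buffer by partition ID means each Reduce task touches only its own region, rather than scanning every Map output.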

Description

Technical field

[0001] The invention relates to the technical field of big data processing, and in particular to a Shuffle method for non-volatile memory.

Background technique

[0002] With the development of science and technology, the world has entered the era of big data. MapReduce is a popular programming model for large-scale parallel data computing, and how to optimize its performance has always been a hot topic in the industry.

[0003] Shuffle is a specific stage between the Map stage and the Reduce stage in the MapReduce framework. Figure 1 is a schematic diagram of the MapReduce process. As shown in Figure 1, Shuffle is the process by which, when the output of Map is to be consumed by Reduce, the output is hashed by key and distributed to each Reduce. Shuffle involves disk reads and writes as well as network transmission, so Shuffle performance directly affects the operating efficiency of the entire program. [...
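The hash-by-key distribution described in [0003] can be sketched as below. This is a generic illustration of hash partitioning, not code from the patent or from any MapReduce framework; the function names are hypothetical, and CRC32 is chosen here only because it is stable across Python processes (the built-in `hash()` for strings is not).

```python
import zlib
from collections import defaultdict


def reducer_for(key: str, num_reducers: int) -> int:
    """Map a key to a Reduce task index using a process-stable hash."""
    return zlib.crc32(key.encode("utf-8")) % num_reducers


def shuffle(map_output, num_reducers: int) -> dict:
    """Group (key, value) pairs by the Reduce task that should receive them."""
    buckets = defaultdict(list)
    for key, value in map_output:
        buckets[reducer_for(key, num_reducers)].append((key, value))
    return dict(buckets)


# Hypothetical Map output for a word count.
pairs = [("spark", 1), ("shuffle", 1), ("spark", 1), ("nvm", 1)]
buckets = shuffle(pairs, num_reducers=3)
```

Because the target Reduce task depends only on the key, every record with the same key lands in the same bucket, which is what lets each Reduce task aggregate its keys independently.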


Application Information


Patent Type & Authority: Patents (China)

IPC(8): G06F8/30

CPC: G06F8/31

Inventors: 潘锋烽, 熊劲

Owner: INST OF COMPUTING TECH CHINESE ACAD OF SCI
