
Memory data set replacing system and replacing method for big data processing system

A technology for big data processing and in-memory data, applied in digital data processing, memory systems, and memory architecture access/allocation. It addresses the problem that recomputation overhead for evicted data sets increases a program's overall running time, and achieves low algorithm complexity, low analysis cost, and low overhead.

Active Publication Date: 2016-08-03
HUAZHONG UNIV OF SCI & TECH

AI Technical Summary

Problems solved by technology

When data that has been evicted is needed again later in the program, the processing system must rerun the operations that generated it. The operation steps in this process are identical to those performed when the data was first generated, and the time required is close to the time spent on the first generation. The computational overhead introduced by this recalculation increases the overall running time of the program.
[0007] Most current big data processing systems use the LRU (Least Recently Used) algorithm to decide which memory data set to replace. LRU removes the memory data set that has gone unused for the longest time. Because it does not consider the cost of recomputing data, data sets with high recomputation overhead may be replaced out of memory, increasing the overall running time of the system.
Other common replacement algorithms fare no better: the FIFO (First In, First Out) algorithm decides based on the generation time of the memory data set, and the LFU (Least Frequently Used) algorithm decides based on the access frequency of the data set. Neither can guarantee minimal overhead when data sets must be recomputed.
At present, big data distributed processing systems lack a memory data set replacement method with low recomputation overhead.
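To make the criticism concrete, the LRU baseline can be sketched as follows. This is an illustrative minimal implementation, not code from the patent; note that the `recompute_cost` field is tracked but never consulted when choosing a victim, which is exactly the shortcoming described above.

```python
from collections import OrderedDict

class LRUCache:
    """Minimal LRU eviction sketch (illustrative, not from the patent).

    Evicts the least recently used entry regardless of how expensive
    that entry would be to regenerate.
    """

    def __init__(self, capacity):
        self.capacity = capacity
        self._data = OrderedDict()  # key -> (value, recompute_cost)

    def get(self, key):
        self._data.move_to_end(key)  # mark as most recently used
        return self._data[key][0]

    def put(self, key, value, recompute_cost=0.0):
        if key in self._data:
            self._data.move_to_end(key)
        elif len(self._data) >= self.capacity:
            # LRU ignores recompute_cost: a dataset that is expensive to
            # regenerate is evicted simply because it sat idle longest.
            self._data.popitem(last=False)
        self._data[key] = (value, recompute_cost)
```

For example, with capacity 2, a dataset with recomputation cost 100 is evicted as readily as one with cost 1 once it becomes the least recently used entry.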




Embodiment Construction

[0030] In order to make the object, technical solution and advantages of the present invention clearer, the present invention will be further described in detail below in conjunction with the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain the present invention, not to limit the present invention.

[0031] As shown in Figure 1, the present invention provides a memory data set replacement system for a big data processing system, comprising an analysis module, an information monitoring module, and a decision-making module. The analysis module analyzes the logic of the upper-layer user program to obtain the set of operation steps that generate each memory data set in each operation stage; the information monitoring module monitors the running user program, collects information as memory data sets are generated, and submits it to the decision-making module; the decision-ma...
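The decision-making step described above can be sketched under an assumed greedy policy: evict the data sets whose recomputation cost is lowest until enough memory is freed. The function name, field names, and the greedy rule below are illustrative assumptions, not the patent's exact algorithm.

```python
def choose_victims(datasets, needed_bytes):
    """Sketch of a cost-aware eviction decision (assumed greedy policy).

    `datasets` maps a dataset name to a dict with its in-memory `size`
    in bytes and its `cost` to recompute (e.g. seconds, as reported by
    a monitoring component). Returns the names to evict, cheapest to
    recompute first, until at least `needed_bytes` would be freed.
    """
    victims, freed = [], 0
    # Consider monitored datasets in order of recomputation cost,
    # cheapest first, so expensive-to-regenerate data stays in memory.
    for name, info in sorted(datasets.items(), key=lambda kv: kv[1]["cost"]):
        if freed >= needed_bytes:
            break
        victims.append(name)
        freed += info["size"]
    return victims
```

Contrast with LRU: here a rarely used but expensive-to-regenerate data set survives, while cheap-to-recompute data is sacrificed first.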



Abstract

The invention discloses a memory data set replacement system for a big data processing system. The system comprises an analysis module, an information monitoring module, and a decision-making module. The analysis module performs logic analysis on the upper-layer user program to obtain the set of operation steps that generate each memory data set in each operation stage; the information monitoring module monitors the running user program, collects information as memory data sets are generated, and submits it to the decision-making module; the decision-making module analyzes and sorts the collected information, judges whether a memory data set in the system needs to be replaced in the current stage, determines which memory data set to remove when replacement is needed, and notifies the system to load the new memory data set. The system is completely transparent to the upper-layer user program and can flexibly decide the replacement scheme according to the real-time state of the system. Compared with conventional replacement methods, it substantially reduces the impact on system performance.

Description

technical field

[0001] The invention belongs to the field of big data distributed computing, and more specifically relates to a memory data set replacement system and replacement method for a big data processing system.

Background technique

[0002] In recent years, big data distributed processing systems have received extensive attention, and several have emerged in industry and academia: Spark, Flink, Dryad, GraphX, etc. Developers use the APIs provided by such systems to describe big data applications. The amount of data processed by such applications usually far exceeds the computing power of a single machine, so in practice these systems are deployed on clusters containing multiple computing nodes.

[0003] The input data processed by a big data application is usually located in storage shared by the cluster (such as HDFS), and the big data processing system divides the program into multiple operation ...
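The regeneration mechanism the background alludes to can be sketched under an assumed lineage model, in the spirit of systems like Spark: an evicted data set is rebuilt by replaying the recorded operation steps that produced it. The `lineage` structure and function below are illustrative assumptions, not the patent's own data structures.

```python
def recompute(dataset, lineage, cache):
    """Sketch of lineage-based regeneration (assumed model).

    `lineage` maps a dataset name to (parent_name_or_None, transform_fn).
    An evicted dataset is rebuilt by recursively recomputing its parent
    and replaying the transform -- the recomputation overhead that a
    cost-aware replacement policy tries to minimize.
    """
    if dataset in cache:
        return cache[dataset]  # still resident in memory
    parent, transform = lineage[dataset]
    parent_value = recompute(parent, lineage, cache) if parent else None
    value = transform(parent_value)
    cache[dataset] = value
    return value
```

Replaying the full chain is why eviction is not free: the steps run again exactly as on first generation, at comparable cost.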

Claims


Application Information

Patent Type & Authority: Application (China)
IPC(8): G06F11/30, G06F11/34, G06F12/126
CPC: G06F11/3006, G06F11/302, G06F11/3037, G06F11/3051, G06F11/3409, G06F12/126, G06F2201/805, G06F2201/81, G06F2201/885, G06F2212/1016
Inventors: Shi Xuanhua (石宣化), Jin Hai (金海), Geng Yuanzhen (耿元振), Wang Fei (王斐)
Owner HUAZHONG UNIV OF SCI & TECH