Spark memory replacement method based on dynamic capacity

A replacement-algorithm and dynamic-capacity technique, applied in memory systems, instruments, computing, etc., that reduces read and write overhead, improves the cache hit rate, and improves operating efficiency.

Pending Publication Date: 2022-06-10
CHANGSHA UNIVERSITY OF SCIENCE AND TECHNOLOGY

AI Technical Summary

Problems solved by technology

However, the above algorithms still have shortcomings that lead to unsatisfactory results in some specific scenarios. For example, LRFU is an improvement on the LRU algorithm: it additionally records each data block's usage history and gives greater weight to more recent uses in order to estimate how likely the RDD is to be used in the future. To estimate that likelihood with reasonable accuracy, the algorithm needs a certain number of iterations. When the number of iterations is small, or when most of the work consists of linear jobs, the algorithm tends to evict data blocks that still need to be reused, which lowers the cache hit rate and degrades Spark's performance. The other algorithms have similar drawbacks.
It can be seen that prior-art cache replacement designs do not consider different scenarios comprehensively and lack optimizations tailored to different usage scenarios, so there is still room to reduce Spark's read and write overhead at run time.
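
To make the LRFU weighting concrete, the following is a minimal Python sketch of the combined recency and frequency (CRF) score as it is usually defined for LRFU, where each past access contributes (1/2)^(lam * age). The function name `crf`, the parameter `lam`, and the timestamps are illustrative assumptions and do not come from the patent text.

```python
def crf(access_times, now, lam=0.5):
    """Combined recency/frequency score of one block (illustrative sketch).

    Each recorded access at time t contributes 0.5 ** (lam * (now - t)),
    so recent accesses weigh more than old ones. With lam = 0 the score is
    a plain access count; larger lam makes the score favor recency.
    """
    return sum(0.5 ** (lam * (now - t)) for t in access_times)


# Two blocks with a single recorded access each (few iterations so far).
old_block = crf([0], now=10)     # touched once, long ago
recent_block = crf([9], now=10)  # touched once, just now
```

With only one recorded access per block, `old_block` scores far below `recent_block` regardless of which block a later stage will actually need. This is the small-iteration or linear-job situation described above, in which a block that should be kept can be evicted first.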

Embodiment Construction

[0045] It is to be understood that terms such as "length", "width", "upper", "lower", "front", "rear", "first", "second", "vertical", "horizontal", "top", "bottom", "inner", and "outer" indicate orientations or positional relationships based on those shown in the drawings. They are used only to facilitate and simplify the description of the application, and do not indicate or imply that a device or element must have a particular orientation or be constructed and operated in a particular orientation; they should therefore not be construed as limiting the application.

[0046] In addition, the terms "first" and "second" are used for descriptive purposes only, and are not to be interpreted as indicating or implying relative importance or as implicitly specifying the quantity of the indicated technical features. Thus, a feature defined as "first" or "second" may explicitly or implicitly include one or more of these feat...

Abstract

The invention discloses a Spark memory replacement method based on dynamic capacity. A first replacement algorithm and a second replacement algorithm are configured. When a new RDD needs to be stored, one of the two algorithms is selected according to the degree of memory tension. When the second replacement algorithm is triggered, the existing RDDs are maintained in two tables, one for RDDs with a dependency count and one for RDDs without, and the weight of each existing RDD is obtained. The method then checks whether the table of RDDs without a dependency count is empty. If it is not empty, the RDDs in that table are evicted from memory one by one in ascending order of weight, stopping once there is enough space to cache the new RDD. If it is empty, the table of RDDs with a dependency count is traversed and RDDs are evicted in the same way until there is enough space for the new RDD. Under different memory conditions, the scheme reduces read and write overhead and its impact on Spark's run-time performance, increases the cache hit rate, and improves Spark's running efficiency.
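
The eviction order of the second replacement algorithm can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation. The abstract gives neither the weight formula nor the criterion for memory tension, so weights are taken as precomputed inputs and algorithm selection is left out. The names `CachedRDD` and `select_victims` are hypothetical. Falling through to the dependency-counted table when the first table cannot free enough space is also an assumption, since the abstract only describes the case where the first table is empty.

```python
from dataclasses import dataclass


@dataclass
class CachedRDD:
    rdd_id: str
    size: int        # memory occupied by the cached RDD
    weight: float    # precomputed; the formula is not given in the abstract
    dep_count: int   # 0 is treated here as "no dependency count"


def select_victims(cached, needed, free):
    """Return ids of RDDs to evict so that free space reaches `needed`."""
    # Maintain two tables, each ordered by ascending weight.
    no_dep = sorted((r for r in cached if r.dep_count == 0),
                    key=lambda r: r.weight)
    with_dep = sorted((r for r in cached if r.dep_count > 0),
                      key=lambda r: r.weight)
    victims = []
    # Evict from the table without dependency counts first, then
    # (assumption) continue with the dependency-counted table.
    for table in (no_dep, with_dep):
        for rdd in table:
            if free >= needed:
                return victims
            victims.append(rdd.rdd_id)
            free += rdd.size
    return victims


cache = [
    CachedRDD("a", size=10, weight=0.5, dep_count=0),
    CachedRDD("b", size=10, weight=0.1, dep_count=0),
    CachedRDD("c", size=10, weight=0.05, dep_count=2),
]
victims = select_victims(cache, needed=15, free=0)
```

In this example `victims` contains `"b"` and then `"a"`. The RDD `"c"` has the lowest weight but is kept, because RDDs that other RDDs still depend on are considered only after the table without dependency counts has been used up.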

Description

Technical field

[0001] The present application relates to the technical field of the Spark big data computing engine, and more specifically to a Spark memory replacement method based on dynamic capacity.

Background

[0002] Apache Spark is a fast, general-purpose computing engine designed for large-scale data processing. It has the advantages of Hadoop MapReduce but differs from MapReduce in that the intermediate output of a job can be stored in memory, so it no longer needs to read and write HDFS. Spark is therefore widely used in data mining, machine learning, and other MapReduce-style algorithms that require iteration. In the original Spark, the computing engine stores data files in memory to reduce read and write overhead the next time the data is used. When the memory space is not enough to store a new RDD (resilient distributed dataset) data file, the Spark engine calls the replacement algorithm to evict old RDDs and rea...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F12/123, G06F12/127
CPC: G06F12/123, G06F12/127
Inventor: 王进, 张睿涵, 张经宇, 王磊, 王静, 王建新
Owner: CHANGSHA UNIVERSITY OF SCIENCE AND TECHNOLOGY