Dynamically adaptive LSM (Log-structured merge) tree combination method and system

An LSM tree with dynamic adaptation technology, applied in special data processing applications, instruments, and electrical digital data processing. It addresses unstable query performance, low merge efficiency, and wasted I/O, improving merge efficiency and query speed by optimizing how data is organized.

Active Publication Date: 2015-12-16
INST OF COMPUTING TECH CHINESE ACAD OF SCI

AI Technical Summary

Problems solved by technology

[0010] (1) The data distribution is inconsistent. A row of data may reside in a single file or span multiple files; in the worst case it appears in every file. As a result, different keys involve different numbers of files at query time, and query performance is unstable;
[0011] (2) A lot of I/O is wasted: a file may be merged multiple times and is rewritten repeatedly, growing like a snowball;
[0012] (3) Since there is no guarantee of how long it will take for deleted data to be cleared, a large amount of space may be wasted under deletion-intensive workloads;
[0013] (4) During the merging process, the old files can only be deleted after the new files are written, which requires additional disk space.
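Problem (1) can be illustrated with a minimal sketch that counts how many files a point query has to consult when the files' key ranges overlap. The file names and key ranges are made-up illustrative values, not data from the patent.

```python
from typing import List, NamedTuple


class SSTFile(NamedTuple):
    """A sorted file described only by the key range it covers (hypothetical)."""
    name: str
    lo: str  # smallest key in the file
    hi: str  # largest key in the file


def files_for_key(files: List[SSTFile], key: str) -> List[str]:
    """Return the names of all files whose key range could contain `key`."""
    return [f.name for f in files if f.lo <= key <= f.hi]


# Three files with overlapping ranges, as produced by unmerged flushes.
files = [
    SSTFile("f1", "a", "m"),
    SSTFile("f2", "c", "z"),
    SSTFile("f3", "k", "p"),
]

print(files_for_key(files, "b"))  # ['f1']
print(files_for_key(files, "l"))  # ['f1', 'f2', 'f3']
```

A lookup for `"b"` touches one file while a lookup for `"l"` touches all three, which is the kind of per-key variation in query cost the paragraph describes.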
The problem with the Stripe-based scheme is that the distribution and number of Stripes in Level-1 are difficult to adjust dynamically. If the Stripes are too small, the files written from Level-0 will be too small; if the Stripes are too large, each Stripe covers a very wide interval and runs into the same problems internally that TieredCompact has inside a Region.
[0032] The existing technology cannot adapt dynamically to the distribution of data, and cannot efficiently reorganize data as that distribution changes. Merging is inefficient, and useless merges are often performed, occupying system resources.
[0033] In summary, the existing technology has obvious inconveniences and defects in practical use, so improvement is necessary.


Examples

Embodiment Construction

[0073] In order to make the object, technical solution and advantages of the present invention clearer, the present invention will be further described in detail below in conjunction with the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain the present invention, not to limit the present invention.

[0074] While researching LSM-based data storage systems, the inventors found that the defects in the prior art are caused by improper merging strategies. After studying the existing merging methods and the way traditional databases organize data, they found that this defect can be resolved by organizing data in a tree structure. LevelDBCompact organizes files by level, but there is no direct connection between levels: when a merge occurs between levels, it does not know which lower-level intervals are involved. StripeCompact defines two levels, but this l...

Abstract

The invention belongs to the technical field of file processing and provides a dynamically adaptive LSM (log-structured merge) tree merging method. The method comprises: dividing the key range into a plurality of nodes and organizing the nodes into a tree structure, wherein each node corresponds to a key interval and holds the files whose keys fall within that interval; dynamically adjusting the shape of the tree according to the distribution of the current data; when a newly written file arrives, traversing the tree to find the optimal node and placing the file in that node; and, when processing files, performing Minor Compact inside a node and executing Major Compact through the leaf nodes. The invention also provides a dynamically adaptive LSM tree merging system that implements the method. The method and system thereby adapt dynamically to the data distribution and improve the efficiency of data merging.
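The file-placement step in the abstract can be sketched roughly as follows. This is not the patented algorithm: the abstract does not define "optimal node", so the sketch assumes it means the deepest node whose key interval fully covers the file. The names `Node`, `find_node`, `put_file`, `minor_compact`, and the `max_files` threshold are all hypothetical, and dynamic reshaping of the tree and Major Compact are not modeled.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Node:
    lo: str  # inclusive lower bound of the node's key interval
    hi: str  # inclusive upper bound of the node's key interval
    children: List["Node"] = field(default_factory=list)
    files: List[Dict[str, str]] = field(default_factory=list)  # oldest first


def find_node(root: Node, lo: str, hi: str) -> Node:
    """Descend to the deepest node whose interval fully covers [lo, hi]."""
    node = root
    while True:
        for child in node.children:
            if child.lo <= lo and hi <= child.hi:
                node = child
                break
        else:
            return node  # no child covers the range, so stay here


def minor_compact(node: Node) -> None:
    """Merge all files inside one node; newer files overwrite older keys."""
    merged: Dict[str, str] = {}
    for f in node.files:
        merged.update(f)
    node.files = [dict(sorted(merged.items()))]


def put_file(root: Node, data: Dict[str, str], max_files: int = 2) -> Node:
    """Place a non-empty new file in the tree, compacting the node if it fills up."""
    node = find_node(root, min(data), max(data))
    node.files.append(data)
    if len(node.files) > max_files:  # assumed trigger, not from the patent
        minor_compact(node)
    return node


root = Node("a", "z", children=[Node("a", "m"), Node("n", "z")])
put_file(root, {"b": "1", "d": "2"})  # fits the left child
put_file(root, {"c": "3", "x": "4"})  # spans both children, stays at the root
put_file(root, {"b": "5"})
put_file(root, {"d": "6"})            # third file in the left child triggers a merge
print(root.children[0].files)         # [{'b': '5', 'd': '6'}]
```

Because a file only sinks as far as a node that fully contains its key range, merges inside a node never need to touch files belonging to sibling intervals, which is the property the tree organization is meant to provide.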

Description

Technical field

[0001] The invention relates to the technical field of file processing, in particular to a dynamically adaptive LSM tree merging method and system.

Background technique

[0002] The log-structured merge tree, also known as the LSM tree, is a data organization method commonly used by NoSQL databases. It defers and batches changes to the index and efficiently migrates updates to disk in a merge-sort-like fashion. In a concrete implementation, a node of the LSM tree is often a file. Each file is sorted internally, but the files are unordered relative to one another. A query therefore has to consult all files and merge the results from each, resulting in low performance. For this reason, several files are generally merged into one large file. Merging reduces the number of files and physically removes deleted data, which reduces the number of files involved in each query and improves query efficiency. This process is called Compact. The compact process reads multiple files and merges them and rewrites t...
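The merge-sort-like Compact step described in the background can be sketched as a k-way merge over sorted files. This is a generic illustration rather than the method of any particular database: the entry layout `(key, sequence number, value)` and the convention that `None` marks a delete are assumptions made for the example.

```python
import heapq
from typing import List, Optional, Tuple

# (key, sequence number, value); value None marks a delete (assumed convention).
Entry = Tuple[str, int, Optional[str]]


def compact(files: List[List[Entry]]) -> List[Tuple[str, str]]:
    """Merge files that are each sorted by key into one sorted file.

    Keeps only the newest version of every key and drops deleted keys.
    """
    # Order by key, then by descending sequence number so the newest comes first.
    merged = heapq.merge(*files, key=lambda e: (e[0], -e[1]))
    out: List[Tuple[str, str]] = []
    last_key = None
    for key, _seq, value in merged:
        if key == last_key:
            continue  # an older version of a key already handled
        last_key = key
        if value is not None:
            out.append((key, value))
    return out


old = [("a", 1, "1"), ("b", 1, "2"), ("c", 1, "3")]
new = [("b", 2, None), ("c", 2, "30"), ("d", 2, "4")]
print(compact([old, new]))  # [('a', '1'), ('c', '30'), ('d', '4')]
```

After the merge a query consults one file instead of two, and the deleted key `"b"` no longer occupies space.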

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30
CPC: G06F16/2246, G06F16/24553
Inventor: 程学旗, 张虔熙, 张敬亮, 廖华明
Owner: INST OF COMPUTING TECH CHINESE ACAD OF SCI