Cache optimization method for reading performance of KV storage system based on LSM-tree

A cache optimization technology for storage systems, applied in the field of storage systems, that addresses problems such as severe read amplification and achieves improved performance, separation of hot and cold data, and high cache efficiency.

Active Publication Date: 2022-04-26
NANJING UNIV OF POSTS & TELECOMM

AI Technical Summary

Problems solved by technology

Therefore, the more levels there are, the more severe the read amplification becomes, possibly exceeding 300 times.

Embodiment Construction

[0032] Embodiments of the present invention will be described below in conjunction with the accompanying drawings.

[0033] As shown in Figure 2, the present invention relates to a cache optimization method for the read performance of an LSM-tree-based KV storage system. The method mainly includes the following steps:

[0034] Step 1. Add KeyRange Cache and BF Cache to the memory of the KV storage system.

[0035] Caching is one of the main techniques for improving read performance, and it is an effective way to address the read amplification that the LSM-tree suffers because of its layered structure and its compaction operations. On the one hand, the deeper the level at which a KV pair resides, the more storage I/O is saved when it is cached, which means a higher caching benefit. On the other hand, the larger the KV pair, the greater the caching cost. A good caching scheme should consider not only the cost of caching but also the benefit...
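As a rough illustration of Step 1, the two in-memory structures might be laid out as in the following Python sketch. It is not the patent's implementation: the class and method names (`KeyRangeEntry`, `Caches`, `add_sst`, `mark_hot`, `mark_cold`) are hypothetical, storing the BFP pointer inside each KeyRange Cache entry is an assumption, and a plain Python set stands in for a real Bloom filter. How data is judged hot or cold is left to the caller, since the visible text does not give the criterion.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class KeyRangeEntry:
    """One entry per on-disk SST: its key range plus an optional BFP pointer."""
    sst_id: str
    min_key: str
    max_key: str
    bfp: Optional[str] = None  # key into the BF Cache; None plays the role of NULL


@dataclass
class Caches:
    # key_range_cache[i] mirrors level Li on disk, with one entry per SST.
    key_range_cache: List[List[KeyRangeEntry]] = field(default_factory=list)
    # BF Cache: holds filters only for SSTs judged to contain hot data.
    # A set of keys is used here as a stand-in for a Bloom filter.
    bf_cache: Dict[str, Set[str]] = field(default_factory=dict)

    def add_sst(self, level: int, sst_id: str, keys: List[str]) -> KeyRangeEntry:
        """Register an SST at the given level and cache its key range."""
        while len(self.key_range_cache) <= level:
            self.key_range_cache.append([])
        entry = KeyRangeEntry(sst_id, min(keys), max(keys))
        self.key_range_cache[level].append(entry)
        return entry

    def mark_hot(self, entry: KeyRangeEntry, bloom: Set[str]) -> None:
        """Cache the SST's filter and point the entry's BFP at it."""
        self.bf_cache[entry.sst_id] = bloom
        entry.bfp = entry.sst_id

    def mark_cold(self, entry: KeyRangeEntry) -> None:
        """Drop the cached filter (if any) and reset the BFP to NULL."""
        self.bf_cache.pop(entry.sst_id, None)
        entry.bfp = None
```

Keeping the filter out of the entry itself means a cold SST costs only its key range in memory, which is the point of caching Bloom filters for hot SSTs only.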

Abstract

The invention discloses a cache optimization method for the read performance of an LSM-tree-based KV storage system, comprising the following steps: adding to memory a KeyRange Cache whose structure mirrors the level hierarchy of the SSTs (Sorted String Tables) on disk, so that the entries of each level correspond one-to-one with the SSTs of the corresponding level on disk and each entry caches the key range of its SST; adding a BF Cache that caches only the Bloom filters of SSTs containing hot data; separating the data into hot and cold; if the data are judged to be hot, caching the corresponding Bloom filter in the BF Cache and pointing the BFP pointer at it; otherwise, setting the BFP pointer to NULL; and, to query a specified key k1, searching the MemTable and then the Immutable MemTable and returning the value of k1 directly if it is found in either; if it is not found, searching the KeyRange Cache level by level from the lowest level to the highest; if a key range containing k1 is found, judging whether the data are hot and then looking up k1 in the corresponding SST on disk; and if k1 is not found at that level, continuing to the next level. With this method, a level-by-level search of the LSM-tree is not needed, the read speed is increased, and high read performance is achieved.
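The query path described in the abstract can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the patented implementation: the function `get` and its parameters are hypothetical, each KeyRange Cache entry is modeled as a `(min_key, max_key, sst_id, bfp)` tuple, dictionaries stand in for the in-memory tables and for SST reads from disk, and a set stands in for a Bloom filter. The sketch also assumes that a cold SST (BFP is NULL) is read directly, and that overlapping entries within a level are listed newest first.

```python
from typing import Dict, List, Optional, Set, Tuple

# Hypothetical entry layout: (min_key, max_key, sst_id, bfp)
Entry = Tuple[str, str, str, Optional[str]]


def get(
    k: str,
    memtable: Dict[str, str],
    immutable: Dict[str, str],
    key_range_cache: List[List[Entry]],
    bf_cache: Dict[str, Set[str]],
    disk: Dict[str, Dict[str, str]],
) -> Optional[str]:
    # 1. Check the in-memory tables first, newest data first.
    for table in (memtable, immutable):
        if k in table:
            return table[k]

    # 2. Walk the KeyRange Cache from the lowest level (L0) upward.
    for level in key_range_cache:
        for min_key, max_key, sst_id, bfp in level:
            if not (min_key <= k <= max_key):
                continue  # key range rules this SST out without any I/O
            if bfp is not None and k not in bf_cache[bfp]:
                continue  # hot SST: the cached filter rules the key out
            # Only now touch the SST on disk (modeled as a dict lookup).
            value = disk[sst_id].get(k)
            if value is not None:
                return value
        # Not found at this level: continue with the next one.
    return None
```

In this model a disk read happens only when the cached key range, and for hot SSTs the cached filter, both admit the key, which is how the method avoids probing every level on disk.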

Description

Technical field

[0001] The invention relates to a cache optimization method for the read performance of an LSM-tree-based KV storage system, and belongs to the technical field of storage systems.

Background technique

[0002] Nowadays, many KV storage systems, such as LevelDB and RocksDB, are designed around the LSM-tree (Log-Structured Merge-tree). An LSM-tree mainly consists of three parts: the MemTable, the Immutable MemTable, and the SSTable (i.e., SST), as shown in Figure 1. The MemTable and the Immutable MemTable reside in memory. The most recently updated data is stored in the MemTable, and when the MemTable reaches a certain size it is converted into an Immutable MemTable. Because this data is held temporarily in memory, which is not reliable storage, it would be lost on power failure; a WAL (write-ahead log) is therefore usually used to ensure the reliability of the data. The Immutable MemTable is then flushed to disk and becomes an SST on L0. SST (...
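The write path outlined in the background (WAL, MemTable, Immutable MemTable, flush to L0) can be pictured with a toy Python model. The class `TinyLSM` and its `memtable_limit` parameter are hypothetical names introduced for illustration; a list stands in for the on-disk log, the size threshold counts entries rather than bytes, and the flush runs immediately instead of in the background as a real engine would do.

```python
from typing import Dict, List, Optional, Tuple


class TinyLSM:
    """Toy model of the LSM-tree write path; not a real storage engine."""

    def __init__(self, memtable_limit: int = 2) -> None:
        self.memtable_limit = memtable_limit
        self.wal: List[Tuple[str, str]] = []   # stand-in for the write-ahead log
        self.memtable: Dict[str, str] = {}
        self.immutable: Optional[Dict[str, str]] = None
        self.l0: List[List[Tuple[str, str]]] = []  # each SST: sorted (key, value) pairs

    def put(self, key: str, value: str) -> None:
        # Log first, so an update held only in memory survives a power loss.
        self.wal.append((key, value))
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self._rotate()

    def _rotate(self) -> None:
        # The full MemTable becomes the Immutable MemTable; writes go to a fresh one.
        self.immutable = self.memtable
        self.memtable = {}
        self._flush()

    def _flush(self) -> None:
        # Write the Immutable MemTable out as a sorted SST on L0.
        if self.immutable is None:
            return
        self.l0.append(sorted(self.immutable.items()))
        self.immutable = None
        # Simplification: every logged entry is now covered by the flushed SST.
        self.wal.clear()
```

Each flush adds one more SST to L0, and compaction later pushes data into deeper levels, so a key that is not in memory may have to be looked for in several places on disk.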

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F3/06
CPC: G06F3/0604, G06F3/0614, G06F3/0622, G06F3/0628
Inventor: 陈思晔, 陈珊珊
Owner: NANJING UNIV OF POSTS & TELECOMM