A hash-based HDFS backend storage system

Storage system and hash function technology, applied to file systems, error detection and response, and data input/output processing. It addresses low disk utilization and read-write amplification, shortening the I/O path and improving access performance, data security, and consistency.

Active Publication Date: 2022-07-19
BEIJING INST OF COMP TECH & APPL

AI Technical Summary

Problems solved by technology

[0004] The technical problem to be solved by the present invention is how to eliminate the read-write amplification and low disk utilization that arise when HDFS uses the local file system for back-end storage, and thereby improve the read-write performance of HDFS.



Examples

Embodiment Construction

[0031] To make the purpose, content, and advantages of the present invention clearer, specific embodiments of the invention are described in detail below with reference to the accompanying drawings.

[0032] To improve the read-write performance of HDFS and to solve the read-write amplification and low disk utilization that arise when HDFS uses the local file system for back-end storage, the present invention proposes a design method for a hash-based HDFS back-end storage system, together with the storage system itself. The storage system bypasses the local file system so that the DataNode manages the disk directly, and it keeps checksums in memory, thereby improving the overall read and write performance of HDFS.
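The idea of keeping checksums in memory can be illustrated with a minimal Python sketch. It is not the patented implementation: the table `checksums`, the helpers `record` and `verify`, and the 512-byte chunk size are all assumptions made for illustration, and plain CRC32 from the standard library stands in for whatever checksum the real system uses.

```python
import zlib

BYTES_PER_CHECKSUM = 512  # assumed chunk size, chosen for illustration

# Hypothetical in-memory table: block ID -> list of per-chunk CRC32 values.
checksums = {}


def chunk_checksums(data: bytes) -> list:
    """Compute one CRC32 per fixed-size chunk of the block data."""
    return [
        zlib.crc32(data[i:i + BYTES_PER_CHECKSUM])
        for i in range(0, len(data), BYTES_PER_CHECKSUM)
    ]


def record(block_id: int, data: bytes) -> None:
    """Store the checksums for a block in memory at write time."""
    checksums[block_id] = chunk_checksums(data)


def verify(block_id: int, data: bytes) -> bool:
    """Check data read back from disk against the in-memory checksums."""
    return checksums.get(block_id) == chunk_checksums(data)
```

In this sketch a read is verified without touching a separate checksum file on disk, which is one plausible reading of how an in-memory checksum shortens the I/O path.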

[0033] A hash-based HDFS back-end storage system designed by the present invention can provide better performance and higher disk utilization for the distributed file system HDFS, such as figu...


Abstract

The invention relates to a hash-based HDFS back-end storage system and belongs to the technical field of data storage. The invention adds a hash function to the DataNode part of HDFS, which maps an HDFS data block ID to the corresponding memory and disk address offsets. The DataNode can then interact with memory and disk directly according to these offsets, bypassing the original local file system layer. Metadata related to the HDFS back-end storage system is stored in the disk header, and the in-memory data structures can be reconstructed from this metadata when the cluster is restarted, which ensures data security and consistency. The invention shortens the I/O path of HDFS, improves its access performance, and improves the user's data storage experience.
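The mapping described in the abstract can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the patent's actual design: the modulo hash `slot_for`, the constants `HEADER_SIZE`, `SLOT_SIZE`, and `NUM_SLOTS`, the header layout, and all function names are hypothetical. An `io.BytesIO` buffer stands in for the raw disk, and hash collisions, which this excerpt does not discuss, are ignored.

```python
import io
import struct

HEADER_SIZE = 4096  # assumed size of the metadata region at the start of the disk
SLOT_SIZE = 1024    # assumed fixed slot size (real HDFS blocks are far larger)
NUM_SLOTS = 8       # assumed number of slots on the device


def slot_for(block_id: int) -> int:
    """Hypothetical hash function: map a block ID to a slot index."""
    return block_id % NUM_SLOTS


def disk_offset(block_id: int) -> int:
    """Byte offset of the block's slot, placed after the header region."""
    return HEADER_SIZE + slot_for(block_id) * SLOT_SIZE


def write_block(dev, block_id: int, data: bytes) -> None:
    """Write block data at its hashed offset, with no file system in between."""
    if len(data) > SLOT_SIZE:
        raise ValueError("block does not fit in one slot")
    dev.seek(disk_offset(block_id))
    dev.write(data)


def read_block(dev, block_id: int, length: int) -> bytes:
    """Read block data back from its hashed offset."""
    dev.seek(disk_offset(block_id))
    return dev.read(length)


def save_index(dev, index: dict) -> None:
    """Persist the in-memory index (block ID -> length) in the disk header."""
    payload = struct.pack("<I", len(index))
    for block_id, length in sorted(index.items()):
        payload += struct.pack("<QI", block_id, length)
    if len(payload) > HEADER_SIZE:
        raise ValueError("index does not fit in the header")
    dev.seek(0)
    dev.write(payload.ljust(HEADER_SIZE, b"\0"))


def load_index(dev) -> dict:
    """Rebuild the in-memory index from the disk header, e.g. after a restart."""
    dev.seek(0)
    header = dev.read(HEADER_SIZE)
    (count,) = struct.unpack_from("<I", header, 0)
    entry_size = struct.calcsize("<QI")
    index = {}
    for i in range(count):
        block_id, length = struct.unpack_from("<QI", header, 4 + i * entry_size)
        index[block_id] = length
    return index
```

A short usage example, with a zero-filled buffer playing the role of the disk:

```python
dev = io.BytesIO(bytes(HEADER_SIZE + NUM_SLOTS * SLOT_SIZE))
write_block(dev, 1073741825, b"hello")
save_index(dev, {1073741825: 5})
restored = load_index(dev)  # simulates reconstruction after a restart
data = read_block(dev, 1073741825, restored[1073741825])
```

Because the offset is computed from the block ID alone, locating a block needs no directory lookup, which is the path-shortening effect the abstract describes.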

Description

Technical field

[0001] The invention relates to the technical field of data storage, in particular to a hash-based HDFS back-end storage system.

Background

[0002] In the distributed file system HDFS, the file data of the cluster is divided into fixed-size blocks that are stored across the DataNodes. Each DataNode stores these data blocks directly in its local file system as files. The local file system offers good stability and compatibility, and HDFS can read and write data simply by calling the file system interface, but this design incurs a performance cost. First, metadata such as the directory structure and file permissions maintained by the local file system is never used by the HDFS cluster, so it is invalid data. Second, the local file system maintains its own journaling mechanism to ensure data consistency, which is redundant given the consistency guarantees HDFS already provides. The storage of invalid data and redundant data not only...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F16/13, G06F16/16, G06F16/182, G06F3/06, G06F11/10
CPC: G06F16/137, G06F16/162, G06F16/182, G06F3/0632, G06F3/0644, G06F3/064, G06F3/0643, G06F3/0652, G06F3/0676, G06F11/1004
Inventor: 刘彬彬, 殷双飞, 邓玲
Owner: BEIJING INST OF COMP TECH & APPL