A distributed metadata management method and system based on a file pre-creation strategy

A metadata management method, applicable to file metadata retrieval, file systems, and other metadata-based database retrieval, that addresses frequent metadata access, scalability bottlenecks, and high network overhead, with the effect of reducing access latency, improving access performance, and reducing access load.

Active Publication Date: 2020-05-15
SUN YAT SEN UNIV +1

AI Technical Summary

Problems solved by technology

However, because metadata is distributed, obtaining a file's metadata requires an additional lookup to locate it, which increases the complexity of file access and the network overhead.
[0004] Among traditional distributed file systems, GFS, HDFS, Lustre, and PVFS all use single-node metadata management, which is a scalability bottleneck. The Panasas distributed file system uses a distributed metadata management strategy, but it requires special hardware support and is not suited to general application scenarios. The Ceph distributed file system implements distributed metadata management through dynamic subtree partitioning: the metadata namespace is divided into subtrees, which are assigned to different metadata management nodes. This approach solves the scalability problem, but whenever the namespace changes it triggers a large amount of data migration and therefore substantial network overhead.
[0005] Although traditional distributed file systems can provide high-performance storage services, that advantage is hard to realize under large-scale mass-storage workloads.
Although metadata occupies little space, it is accessed very frequently, and metadata management in traditional distributed file systems struggles to cope with today's large-scale data storage requirements.
To use high-performance distributed file systems for big data processing, some researchers archive large numbers of small files and merge them into large files, which reduces the access pressure on the metadata server. However, this introduces additional overhead and works against the high performance of the distributed file system.

Examples

Embodiment Construction

[0048] As shown in Figure 1, the distributed metadata management method of this embodiment, based on the file pre-creation strategy, adds a proxy server in front of the metadata server of the distributed file system to process metadata requests from clients. As shown in Figure 2, the detailed steps by which the proxy server performs metadata management include:

[0049] 1) Initialize the file index table, which records the file information pre-created by the metadata server;

[0050] 2) Wait for a metadata request from the client. If the request is a metadata creation request, go to step 3); if it is a metadata read or write request, go to step 6);

[0051] 3) Determine whether the metadata creation request is a dense metadata creation request. If so, go to step 4); otherwise, go to step 5);

[0052] 4) Assign the assignable i...
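
The routing in steps 2) and 3) can be sketched roughly as follows. The visible text does not say how a creation request is judged to be dense, so the sliding-window rate check here (`DenseCreateDetector`, `threshold`, `window`) is purely an assumption for illustration, and `route` is a hypothetical helper that only returns the number of the step that would handle the request.

```python
from collections import deque


class DenseCreateDetector:
    """Hypothetical heuristic (not from the patent text): creations count as
    'dense' when at least `threshold` create requests arrived within the
    last `window` seconds."""

    def __init__(self, threshold=3, window=1.0):
        self.threshold = threshold
        self.window = window
        self._stamps = deque()

    def observe(self, now):
        """Record one create request at time `now`; return True if dense."""
        self._stamps.append(now)
        # Drop timestamps that have fallen out of the sliding window.
        while self._stamps and now - self._stamps[0] > self.window:
            self._stamps.popleft()
        return len(self._stamps) >= self.threshold


def route(request_kind, now, detector):
    """Return the step number (as numbered in the text) for a request."""
    if request_kind == "create":
        # Step 3: dense creations go to step 4, all others to step 5.
        return 4 if detector.observe(now) else 5
    if request_kind in ("read", "write"):
        # Step 2: read and write requests go straight to step 6.
        return 6
    raise ValueError(f"unknown request kind: {request_kind!r}")
```

Passing the timestamp in explicitly, rather than reading a clock inside the detector, keeps the sketch deterministic and easy to exercise with synthetic request traces.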

Abstract

The invention discloses a distributed metadata management method and system based on a file pre-creation strategy. A proxy server is added in front of the metadata server of the distributed file system, and the proxy server records locally, in a file index table, the file information pre-created by the metadata server. For dense metadata creation requests, the proxy first allocates pre-created file information directly from the file index table and returns the result to the client, then synchronizes with the metadata server at regular intervals. For metadata read and write requests, it first checks whether the request hits the file index table and, on a hit, reads or writes the file index table directly. The invention optimizes the metadata management of a high-performance distributed file system so that it can provide storage services for large-scale data processing, makes full use of the I/O capability of the distributed storage system, effectively integrates the high-performance and big data fields, and provides high-performance shared storage services for large-scale data processing platforms and applications.
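
A minimal in-memory sketch of the mechanism described in the abstract might look like the following. All names here (`MetadataServerStub`, `PreCreateProxy`, `precreate`, `commit`, `pool_size`, the `dirty` flag) are hypothetical and are not taken from the patent; the periodic synchronization is reduced to an explicit `sync()` call, and what happens on an index-table miss for a write is left as a placeholder because the visible text does not describe it.

```python
class MetadataServerStub:
    """Stand-in for the real metadata server (hypothetical interface)."""

    def __init__(self):
        self.files = {}      # name -> committed metadata
        self._next_id = 0

    def precreate(self, count):
        """Reserve `count` file slots in advance and return their ids."""
        ids = list(range(self._next_id, self._next_id + count))
        self._next_id += count
        return ids

    def commit(self, name, file_id, meta):
        self.files[name] = {"id": file_id, **meta}

    def lookup(self, name):
        return self.files.get(name)


class PreCreateProxy:
    """Proxy in front of the metadata server, holding a file index table."""

    def __init__(self, mds, pool_size=4):
        self.mds = mds
        self.pool_size = pool_size
        self.free_ids = mds.precreate(pool_size)  # pre-created, unassigned
        self.index = {}  # file index table: name -> {"id", "meta", "dirty"}

    def create_dense(self, name):
        """Serve a dense create from the pre-created pool, without a
        round trip to the metadata server unless the pool is empty."""
        if not self.free_ids:
            self.free_ids = self.mds.precreate(self.pool_size)
        file_id = self.free_ids.pop(0)
        self.index[name] = {"id": file_id, "meta": {"size": 0}, "dirty": True}
        return file_id

    def read(self, name):
        entry = self.index.get(name)
        if entry is not None:            # hit: answer from the index table
            return {"id": entry["id"], **entry["meta"]}
        return self.mds.lookup(name)     # miss: fall back (assumption)

    def write(self, name, **meta):
        entry = self.index.get(name)
        if entry is None:
            return False                 # miss handling is not sketched here
        entry["meta"].update(meta)
        entry["dirty"] = True
        return True

    def sync(self):
        """Flush dirty entries to the metadata server; return the count.
        A real proxy would call this on a timer."""
        flushed = 0
        for name, entry in self.index.items():
            if entry["dirty"]:
                self.mds.commit(name, entry["id"], entry["meta"])
                entry["dirty"] = False
                flushed += 1
        return flushed
```

In this sketch a burst of `create_dense` calls touches the metadata server only when the pool runs out, and the server learns the file names only at the next `sync()`, which is where the claimed reduction in access load would come from.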

Description

Technical field

[0001] The invention belongs to the field of file systems for large-scale data storage, and in particular relates to a method and system for managing metadata with a file pre-creation strategy in a distributed file system.

Background

[0002] With the rapid development of the Internet, the demand for transmitting and storing massive amounts of information keeps growing. In the big data era in particular, the large number of small files generated on the Internet requires the support of high-performance storage systems, which further drives the convergence of the big data and high-performance fields. Distributed file systems from the high-performance field can provide high-performance storage services, and they are now widely used in big data processing platforms. A distributed file system consists mainly of a metadata server and a data server. When the client reads and writes files in the distributed file sy...

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F16/13, G06F16/14, G06F16/182, G06F16/907, H04L29/08
CPC: H04L67/1097, H04L67/56
Inventor: 肖侬, 黎红波, 陈志广, 卢宇彤
Owner: SUN YAT SEN UNIV