Big data environment oriented metadata organization method and system

A metadata organization technology for big data environments, applied in the field of distributed file systems, that can solve problems such as imbalance

Inactive Publication Date: 2016-05-04
HUAZHONG UNIV OF SCI & TECH
7 Cites, 19 Cited by

AI Technical Summary

Problems solved by technology

[0004] In view of the above defects or improvement needs of the prior art, the present invention provides a metadata organization method and system oriented to a big data environment. It addresses the metadata migration problem caused by rename operations, and can quickly determine the location of metadata in the back-end storage cluster and reduce the memor...


Examples


Embodiment Construction

[0083] In order to make the object, technical solution and advantages of the present invention clearer, the present invention will be further described in detail below in conjunction with the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain the present invention, not to limit the present invention. In addition, the technical features involved in the various embodiments of the present invention described below can be combined with each other as long as they do not constitute a conflict with each other.

[0084] The technical terms used in the present invention are first explained below:

[0085] Distribution code: a globally unique 32-bit unsigned integer, which is used to compute the corresponding server node in the back-end server cluster according to a consistent hashing algorithm.
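The mapping from a distribution code to a server node can be illustrated with a minimal consistent-hashing sketch. This is not the patent's implementation: the server names, the use of MD5 to place servers on the ring, and the number of virtual nodes per server are all assumptions made for illustration. The only detail taken from the text is that the distribution code is a 32-bit unsigned integer that is resolved to a node by consistent hashing.

```python
import bisect
import hashlib


def hash32(value: str) -> int:
    """Map a string to a 32-bit unsigned integer (MD5 is an illustrative choice)."""
    return int.from_bytes(hashlib.md5(value.encode("utf-8")).digest()[:4], "big")


class ConsistentHashRing:
    """Hypothetical ring that places servers on the 32-bit code space."""

    def __init__(self, servers, vnodes=64):
        if not servers:
            raise ValueError("at least one server is required")
        # Each server contributes several virtual points to smooth the load.
        points = sorted(
            (hash32(f"{server}#{i}"), server)
            for server in servers
            for i in range(vnodes)
        )
        self._keys = [point for point, _ in points]
        self._servers = [server for _, server in points]

    def locate(self, distribution_code: int) -> str:
        """Return the server owning the first ring point at or after the code."""
        if not 0 <= distribution_code < 2**32:
            raise ValueError("distribution code must fit in 32 unsigned bits")
        # Wrap around to the first point when the code is past the last one.
        idx = bisect.bisect_left(self._keys, distribution_code) % len(self._keys)
        return self._servers[idx]


ring = ConsistentHashRing(["mds-0", "mds-1", "mds-2"])
print(ring.locate(305419896))  # one of the three assumed server names
```

Because each server's points on the ring do not depend on the other servers, removing one server only reassigns the distribution codes that server owned; all other codes keep their node.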

[0086] Distribution code list: The metadata of files and subdirectories under t...


Abstract

The invention discloses a metadata organization method oriented to a big data environment, comprising the following steps: a client receives a file creation request from a user and sends it to an index server; the index server, according to the absolute path of the file to be created in the request, obtains the global ID (identifier) and the distribution code list of the file's parent directory, derives the key of a key-value pair from the global ID of the parent directory and the name of the file to be created, saves the file index information of the file under that key, and obtains a distribution code used for storing the metadata. If the distribution code is newly added, or more than one distribution code exists in the distribution code list of the parent directory, a Bloom filter and the global ID of the file are used to update the bitmap of the distribution code. The method and system solve the technical problem that existing methods, in order to preserve the locality of reference of metadata, store the metadata of a large directory centrally and therefore suffer from unbalanced load.
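The file creation flow in the abstract can be sketched as follows. This is a simplified illustration under stated assumptions, not the claimed implementation: the class and field names, the bitmap size, the number of Bloom hash functions, the use of SHA-256, and in particular the rule for when a new distribution code is added (here, a hypothetical per-code file limit) are all invented for the example. The abstract only states that the key is derived from the parent directory's global ID and the filename, and that the bitmap is updated when the code is new or the parent already has more than one code.

```python
import hashlib
import itertools
import posixpath
from dataclasses import dataclass, field

BITMAP_BITS = 1024  # assumed bitmap size
NUM_HASHES = 3      # assumed number of Bloom hash functions


def bloom_positions(global_id: int):
    """Yield the bit positions a global ID maps to (SHA-256 is an assumption)."""
    for seed in range(NUM_HASHES):
        digest = hashlib.sha256(f"{seed}:{global_id}".encode()).digest()
        yield int.from_bytes(digest[:8], "big") % BITMAP_BITS


@dataclass
class DistributionCode:
    code: int
    bitmap: int = 0  # Bloom filter bits packed into one integer
    count: int = 0   # files placed under this code (used by the assumed split rule)

    def add(self, global_id: int) -> None:
        for pos in bloom_positions(global_id):
            self.bitmap |= 1 << pos

    def might_contain(self, global_id: int) -> bool:
        # Standard Bloom membership test: false positives are possible.
        return all((self.bitmap >> pos) & 1 for pos in bloom_positions(global_id))


@dataclass
class Directory:
    global_id: int
    codes: list = field(default_factory=list)


class IndexServer:
    """Hypothetical index server holding directory and file index information."""

    def __init__(self, max_files_per_code: int = 2):
        self._ids = itertools.count(1)       # global ID allocator (assumed)
        self._codes = itertools.count(1000)  # distribution code allocator (assumed)
        self.max_files_per_code = max_files_per_code
        self.file_index = {}                 # (parent global ID, filename) -> file global ID
        root = Directory(next(self._ids), [DistributionCode(next(self._codes))])
        self.directories = {"/": root}       # absolute path -> Directory

    def create_file(self, abs_path: str):
        parent_path, name = posixpath.split(abs_path)
        parent = self.directories[parent_path]
        key = (parent.global_id, name)       # key built from parent ID and filename
        if key in self.file_index:
            raise FileExistsError(abs_path)
        file_id = next(self._ids)
        self.file_index[key] = file_id
        target, newly_added = parent.codes[-1], False
        if target.count >= self.max_files_per_code:  # assumed split policy
            target = DistributionCode(next(self._codes))
            parent.codes.append(target)
            newly_added = True
        target.count += 1
        # Bitmap update condition as stated in the abstract.
        if newly_added or len(parent.codes) > 1:
            target.add(file_id)
        return file_id, target.code


server = IndexServer()
for path in ["/a", "/b", "/c"]:
    print(path, server.create_file(path))
```

With the assumed limit of two files per code, the first two files share the root directory's initial distribution code and leave its bitmap untouched; the third file triggers a new code, whose bitmap records the file's global ID.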

Description

Technical field

[0001] The invention belongs to the technical field of distributed file systems, and more particularly relates to a metadata organization method and system for a big data environment.

Background technique

[0002] With the advent of the era of big data, the scale and quantity of data are constantly increasing, and the scale and complexity of the metadata managed by existing distributed file systems are increasing as well. Recent research shows that the number of metadata items managed by a distributed file system will exceed one billion, and that metadata operations account for 50%-80% of all operations in the system. In addition, the number of files per directory is not uniform: 90% of directories contain fewer than 128 files, while a few directories contain more than a million files. These characteristics pose great challenges to metadata management in a big data environment.

[0003] Existing distributed file systems adopt different metad...


Application Information

IPC(8): G06F17/30
CPC: G06F16/2255
Inventor: 李春花, 周可, 杨勇
Owner: HUAZHONG UNIV OF SCI & TECH