Method for managing metadata of a deduplication storage system based on locality-sensitive hashing

A locality-sensitive hashing and storage system technology, applied in the fields of electrical digital data processing, special data processing applications, instruments, etc., that addresses the sharp growth in metadata volume and the high lookup overhead, and improves the recognition rate of similar files.

Active Publication Date: 2011-02-02
TSINGHUA UNIV

AI Technical Summary

Problems solved by technology

[0005] Building a primary storage deduplication system poses two important technical challenges: (1) how to eliminate the large computing overhead caused by deduplication; (2) compared with an ordinary storage system, the amount of metadata in a deduplication storage system increases sharply, and every data write must first check whether the data to be written already exists in the system, a lookup whose overhead is extremely high.
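The second challenge can be illustrated with a minimal baseline sketch, which is not the patented method. It assumes fixed-content blocks identified by a SHA-1 fingerprint and a single global index; the class name `NaiveDedupStore` and its fields are hypothetical. Every write consults the full index, which is the lookup cost the paragraph above describes once the index no longer fits in memory.

```python
import hashlib


class NaiveDedupStore:
    """Baseline sketch: one global fingerprint index consulted on every write."""

    def __init__(self):
        self.index = {}   # fingerprint -> block location (hypothetical layout)
        self.blocks = []  # stand-in for on-disk block storage

    def write_block(self, data: bytes) -> int:
        """Store the block only if its fingerprint is unknown; return its location."""
        fp = hashlib.sha1(data).hexdigest()
        loc = self.index.get(fp)  # the per-write lookup the text calls expensive
        if loc is None:
            loc = len(self.blocks)
            self.blocks.append(data)
            self.index[fp] = loc
        return loc


store = NaiveDedupStore()
first = store.write_block(b"hello")
second = store.write_block(b"hello")  # duplicate: no new block is stored
print(first == second, len(store.blocks))
```

In this sketch the index holds one entry per unique block, so it grows with the data set regardless of how the blocks are grouped into files.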


Embodiment Construction

[0027] To make the purpose, content, and advantages of the present invention clearer, embodiments of the present invention are described in further detail below with reference to the accompanying drawings.

[0028] The main principle of the present invention is as follows. In a deduplication storage system, the access pattern for data block metadata is file-related: the metadata of a file is usually accessed consecutively. Organizing and accessing the metadata of the same file together therefore greatly reduces the number of random disk accesses and improves metadata management performance. In addition, if a metadata lookup can be restricted to a small set whose search result is the same as that of searching the entire data set, lookup speed improves. For deduplication storage systems, doing this means requiring similar files (that is, files containin...
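The idea of grouping the metadata of similar files and searching only a small candidate set might be sketched as follows. This is a hedged illustration, not the patent's algorithm: the source does not say which locality-sensitive hash it uses, so MinHash over block fingerprints is assumed here, and `LshMetadataIndex`, `min_hash`, and `num_hashes` are hypothetical names.

```python
import hashlib
from collections import defaultdict


def fingerprint(block: bytes) -> str:
    """Content fingerprint of one data block (SHA-1 is an assumption)."""
    return hashlib.sha1(block).hexdigest()


def min_hash(fingerprints, seed: int) -> str:
    """MinHash signature: the smallest salted hash over a file's fingerprints."""
    return min(hashlib.sha1(f"{seed}:{fp}".encode()).hexdigest()
               for fp in fingerprints)


class LshMetadataIndex:
    """Groups block metadata of similar files under shared LSH bucket keys."""

    def __init__(self, num_hashes: int = 4):
        self.num_hashes = num_hashes
        self.buckets = defaultdict(set)  # (seed, signature) -> fingerprints

    def _keys(self, fps):
        return [(seed, min_hash(fps, seed)) for seed in range(self.num_hashes)]

    def candidates(self, fps):
        """Union of the buckets the file hashes to: the small set to search."""
        found = set()
        for key in self._keys(fps):
            found |= self.buckets.get(key, set())
        return found

    def add_file(self, blocks):
        """Return fingerprints not found in the candidate set, then index the file."""
        fps = [fingerprint(b) for b in blocks]
        if not fps:
            return []
        known = self.candidates(fps)
        new = [fp for fp in dict.fromkeys(fps) if fp not in known]
        for key in self._keys(fps):
            self.buckets[key].update(fps)
        return new


index = LshMetadataIndex()
index.add_file([b"a", b"b", b"c", b"d"])
# An identical file hashes to the same buckets, so no block is reported as new.
print(index.add_file([b"a", b"b", b"c", b"d"]))
```

Because only the matching buckets are searched, a duplicate block in a file that lands in no shared bucket is missed. That is the trade-off between lookup speed and deduplication effect that such a scheme accepts.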


Abstract

The invention discloses a method for managing the metadata of a deduplication storage system based on locality-sensitive hashing. A locality-sensitive hash function is used to quickly group together the metadata of the data blocks of similar files, so that when a data block is written into the deduplication storage system, the system can quickly determine whether the block already exists. This improves the metadata lookup performance of the deduplication storage system and ultimately its throughput. In the method, the query speed, memory overhead, and deduplication effect of the metadata management system are tuned by setting the number of locality-sensitive hash functions used, which adjusts the recognition rate of similar files. The method thus adapts metadata management to the differing requirements of deduplication storage systems: using multiple hash functions improves the recognition rate of similar files and the deduplication capability of the system, while the memory overhead of the metadata index is reduced.
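The effect of the number of hash functions on the recognition rate can be made concrete under one assumption the source does not state: that each function behaves like an independent MinHash, for which two files with Jaccard similarity s collide with probability s. The function names below are hypothetical, and the memory estimate simply counts one in-memory key per file per hash function.

```python
def detection_probability(similarity: float, num_hashes: int) -> float:
    """Chance that at least one of num_hashes independent MinHash values matches."""
    return 1.0 - (1.0 - similarity) ** num_hashes


def index_keys(num_files: int, num_hashes: int) -> int:
    """In-memory keys if each file contributes one key per hash function."""
    return num_files * num_hashes


for k in (1, 2, 4, 8):
    p = detection_probability(0.5, k)
    print(f"k={k}: recognition rate {p:.4f}, keys for 1000 files {index_keys(1000, k)}")
```

Under this model, adding hash functions raises the chance of recognizing a similar file, at the cost of more index keys and more buckets to probe on each lookup.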

Description

Technical field

[0001] The invention relates to the technical field of computer data storage, in particular to a method for managing the metadata of a deduplication storage system based on locality-sensitive hashing.

Background

[0002] With the explosive growth of digital information, data occupies more and more space. Over the past 10 years, the capacity of storage systems deployed in many industries has grown from tens of GB to hundreds of TB, or even several PB, an increase of more than 10,000 times. As data grows exponentially, enterprises must maintain more and more point-in-time copies for fast backup and recovery, and the cost of managing and retaining data, along with data center space and power consumption, keeps rising. Studies have found that up to 60% of the data stored in application systems is redundant, and the problem becomes more serious over time, and people may spend more than 10 times the storag...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30
Inventors: 余宏亮, 孙竞
Owner TSINGHUA UNIV