Redundancy removal file system based on limited binary tree bloom filter and construction method of redundancy removal file system

A Bloom filter and file system technology, applied to de-redundancy file systems and their construction. It addresses the lack of block positioning and the growing error probability of ordinary Bloom filters, and achieves low CPU usage, fewer misjudgments, and storage with a high de-redundancy rate.

Inactive Publication Date: 2013-10-09
BEIHANG UNIV
Cites: 2; Cited by: 38

AI Technical Summary

Problems solved by technology

However, an ordinary fixed-length Bloom filter can only determine whether a file block exists; it cannot locate the block. In addition, as the amount of stored data grows, the probability of a wrong existence judgment keeps increasing.
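The growth in error probability can be illustrated with the standard textbook approximation for a Bloom filter's false-positive rate, p ≈ (1 - e^(-kn/m))^k, where m is the number of bits, k the number of hash functions, and n the number of stored items. The sketch below is only an illustration; the function name and the sample parameters are assumptions and do not come from the patent.

```python
import math


def bloom_false_positive_rate(num_bits: int, num_hashes: int, num_items: int) -> float:
    """Approximate false-positive rate of a standard Bloom filter.

    Uses the usual approximation p = (1 - exp(-k * n / m)) ** k.
    """
    return (1.0 - math.exp(-num_hashes * num_items / num_bits)) ** num_hashes


if __name__ == "__main__":
    # Hypothetical fixed-size filter: 8192 bits, 4 hash functions.
    for n in (100, 1000, 5000, 20000):
        rate = bloom_false_positive_rate(8192, 4, n)
        print(f"items={n:6d}  false-positive rate={rate:.6f}")
```

With the filter size held fixed, the rate rises steadily toward 1 as more items are stored, which is the behavior the paragraph above describes.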

Examples

Embodiment Construction

[0036] The present invention will be further described in detail with reference to the accompanying drawings and embodiments. These implementation examples are described in sufficient detail to enable those skilled in the art to understand and practice the present invention. Logical, implementation and other changes may be made in the implementation without departing from the spirit and scope of the invention. Therefore, the following detailed description should not be taken in a limiting sense, and the scope of the present invention is defined only by the claims.

[0037] The fingerprint of a data block can be computed with a hash algorithm that has a low collision rate, such as MD5 or SHA. In the description of the specific implementation of the present invention, the fingerprint of the data block is obtained with the MD5 algorithm (Message-Digest Algorithm 5), and the resulting MD5 value is used as the fingerprint of the data block.
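As a minimal sketch of the fingerprinting step in [0037], the snippet below computes an MD5 fingerprint with Python's standard `hashlib` module. The function name `block_fingerprint` is a hypothetical helper chosen for illustration, not a name from the patent.

```python
import hashlib


def block_fingerprint(block: bytes) -> str:
    """Return the MD5 digest of a data block as a 32-character hex string."""
    return hashlib.md5(block).hexdigest()


if __name__ == "__main__":
    first = block_fingerprint(b"some data block")
    second = block_fingerprint(b"some data block")
    other = block_fingerprint(b"a different data block")
    # Identical blocks yield identical fingerprints, so duplicates can be detected.
    print(first == second, first == other)
```

Identical blocks always map to the same fingerprint, which is what lets a de-redundancy system detect that a block has already been stored.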

[...

Abstract

The invention provides a limited binary tree Bloom filter, a redundancy removal file system based on it, and a method of constructing that file system. In the limited binary tree Bloom filter, the number of nodes in each layer has an upper limit, and each node is a two-stage Bloom filter consisting of a standard Bloom filter and a second part that stores the fingerprints and addresses of data blocks. To look up a data block, its fingerprint is first searched in the standard Bloom filter. If the fingerprint is not found there, the node misses; otherwise the search continues in the second part, and the node hits only if an exactly matching fingerprint is found there. The redundancy removal file system and its construction method implement the writing, reading and deletion of files on top of the limited binary tree Bloom filter. The second-stage query reduces misjudgments, and the filter, the file system and the construction method offer low memory occupation, low CPU usage, low additional space overhead, a high redundancy removal ratio, and good access performance and extensibility.
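The two-stage lookup inside a single node can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the class name `TwoStageBloomNode`, the bit-array size, the number of hash functions, the double-hashing scheme used to pick bit positions, and the use of a dictionary for the second part are all hypothetical choices. The tree structure and its per-layer node limit are not modeled.

```python
import hashlib
from typing import Dict, List, Optional


class TwoStageBloomNode:
    """One node: a standard Bloom filter plus a fingerprint -> address table."""

    def __init__(self, num_bits: int = 1024, num_hashes: int = 4) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)    # one byte per bit, for readability
        self.table: Dict[bytes, int] = {}  # second part: fingerprint -> block address

    def _positions(self, fingerprint: bytes) -> List[int]:
        # Assumed scheme: derive the bit positions by double hashing a SHA-256 digest.
        digest = hashlib.sha256(fingerprint).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def insert(self, fingerprint: bytes, address: int) -> None:
        for pos in self._positions(fingerprint):
            self.bits[pos] = 1
        self.table[fingerprint] = address

    def lookup(self, fingerprint: bytes) -> Optional[int]:
        # Stage 1: if any bit is clear, the fingerprint was never inserted (miss).
        if not all(self.bits[pos] for pos in self._positions(fingerprint)):
            return None
        # Stage 2: an exact match in the table is required for a hit, which
        # filters out false positives from stage 1 and yields the block address.
        return self.table.get(fingerprint)


if __name__ == "__main__":
    node = TwoStageBloomNode()
    fp = hashlib.md5(b"block A").digest()
    node.insert(fp, address=42)
    print(node.lookup(fp))
    print(node.lookup(hashlib.md5(b"block B").digest()))
```

Because a hit requires an exact fingerprint match in the second part, a stage 1 false positive cannot be reported as a hit, and a hit also returns the block's address, which a plain Bloom filter cannot provide.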

Description

Technical field

[0001] The invention belongs to the technical field of redundant data management and relates to a de-redundancy file system based on a limited binary tree Bloom filter and a construction method thereof, which are used to solve the problem of storing data of dynamic scale.

Background technique

[0002] In the digital age, the volume and complexity of data are exploding. According to the International Data Corporation, nearly 75% of the data in the world are duplicates, that is, only 25% of the data is unique, and more than 90% of redundant data exists in backup files. With the increasing popularity of cloud storage and cloud computing applications, data centers may continue to emerge in the near future. Applying data de-redundancy technology to store this big data can reduce the redundancy rate of the data and improve the efficiency of data storage and transmission.

[0003] The most common way to remove redundancy is to divide the complete file into smaller file blo...

Application Information

IPC(8): G06F17/30
Inventor 姜博, 刘俊龙, 王星河, 龙翔, 高小鹏, 万寒
Owner BEIHANG UNIV