Method for retrieving data block indexes

A data block index retrieval technology, applied in the fields of electrical digital data processing, special data processing applications, and instruments. It addresses reduced deduplication rates, slow index record retrieval, and limits on the scale of deduplication systems. It reduces memory occupancy without affecting the deduplication rate and keeps memory consumption stable.

Active Publication Date: 2014-05-07
HUAZHONG UNIV OF SCI & TECH

AI Technical Summary

Problems solved by technology

[0016] The present invention provides a data block index retrieval method. It addresses two problems of existing data block index retrieval methods: either the scale of the deduplication system is limited and the deduplication rate is reduced, or index record retrieval becomes slow as the scale of the deduplication system increases. The aim is to increase the efficiency and scalability of deduplication systems and to reduce their cost when storing large amounts of data.



Examples


Embodiment Construction

[0058] The present invention will be further described below in conjunction with the accompanying drawings.

[0059] As shown in Figure 1, the present invention includes a fingerprint retrieval step and a new index record storing step. The new index record storing step comprises four sub-steps: index record creation, write cache package judgment, write cache queue judgment, and disk refreshing.
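The four sub-steps of storing a new index record can be pictured with a minimal Python sketch. The excerpt names the sub-steps but does not describe their internals, so everything here is an assumption made for illustration: the `WriteBuffer` class, the `package_size` and `queue_len` thresholds, the reading of "judgment" as a fullness check, and the reading of "disk refreshing" as flushing queued packages to disk (modelled by an in-memory list).

```python
from collections import deque


class WriteBuffer:
    """Hypothetical write path: records -> package -> queue -> disk."""

    def __init__(self, package_size=2, queue_len=2):
        self.package_size = package_size  # assumed records per package
        self.queue_len = queue_len        # assumed packages per queue
        self.current = {}                 # open write cache package
        self.queue = deque()              # sealed packages awaiting flush
        self.disk = []                    # stand-in for the disk index file

    def store(self, fingerprint, location):
        # Sub-step 1: create the index record.
        self.current[fingerprint] = location
        # Sub-step 2: write cache package check (assumed: is it full?).
        if len(self.current) >= self.package_size:
            self.queue.append(self.current)
            self.current = {}
        # Sub-step 3: write cache queue check (assumed: is it full?).
        if len(self.queue) >= self.queue_len:
            self.flush()

    def flush(self):
        # Sub-step 4: move every queued package to disk in arrival order.
        while self.queue:
            self.disk.append(self.queue.popleft())


buf = WriteBuffer(package_size=2, queue_len=2)
for i in range(5):
    buf.store(f"fp{i}", i)
```

Batching records into packages before touching the disk is one plausible way to keep memory use bounded while avoiding a disk write per record; whether the patent uses exactly this policy is not stated in the excerpt.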

[0060] As shown in Figure 2, the fingerprint retrieval step comprises the following sub-steps: Bloom filter judgment, read cache queue judgment, moving copy, write cache queue judgment, reverse mapping set judgment, dynamic identification set judgment, and disk access.
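A simplified sketch of such a tiered lookup follows, in Python. It covers only the Bloom filter, read cache, write cache, and disk tiers, in the order listed above. The moving copy, reverse mapping set, and dynamic identification set sub-steps are omitted because the excerpt does not describe them. The `BloomFilter` class, the `lookup` function, the use of dictionaries for the caches and the disk index, and the choice to copy a disk hit into the read cache are all illustrative assumptions, not the patent's design.

```python
import hashlib


class BloomFilter:
    """Small Bloom filter: no false negatives, occasional false positives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8 + 1)

    def _positions(self, key):
        # Derive num_hashes bit positions by salting SHA-256 with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(bytes([i]) + key).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for p in self._positions(key):
            self.bits[p // 8] |= 1 << (p % 8)

    def might_contain(self, key):
        return all(self.bits[p // 8] & (1 << (p % 8))
                   for p in self._positions(key))


def lookup(fp, bloom, read_cache, write_cache, disk_index):
    """Return (location, tier); tier is 'new' when fp is not indexed."""
    if not bloom.might_contain(fp):
        return None, "new"                  # definitely never stored
    if fp in read_cache:
        return read_cache[fp], "read_cache"
    if fp in write_cache:
        return write_cache[fp], "write_cache"
    if fp in disk_index:
        read_cache[fp] = disk_index[fp]     # assumed: cache the disk hit
        return disk_index[fp], "disk"
    return None, "new"                      # Bloom filter false positive


bloom = BloomFilter()
read_cache, write_cache = {}, {}
disk_index = {b"fp-known": 7}
bloom.add(b"fp-known")

first = lookup(b"fp-known", bloom, read_cache, write_cache, disk_index)
second = lookup(b"fp-known", bloom, read_cache, write_cache, disk_index)
missing = lookup(b"fp-unseen", bloom, read_cache, write_cache, disk_index)
```

Here the first lookup of a known fingerprint falls through to the disk tier, the second is served from the read cache, and an unseen fingerprint is reported as new without any disk access in the common case.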

[0061] As shown in Figure 3, the disk access sub-step includes the following processes:

[0062] (1-7-1) locate the disk index file according to the index record package ident...


Abstract

The invention discloses a method for retrieving data block indexes, which belongs to storage and backup methods for computer data. It addresses the problems of data deduplication systems that use existing data block index retrieval methods: either the scale of the system is limited and the deduplication rate is reduced, or retrieval becomes slow as the scale of the system increases. The method includes a fingerprint retrieval step and a new index record storing step. The fingerprint retrieval step includes the sub-steps of Bloom filter judgment, read cache queue judgment, moving copy, write cache queue judgment, reverse mapping set judgment, dynamic identification set judgment, and disk access. The new index record storing step includes the sub-steps of index record creation, write cache package judgment, write cache queue judgment, and disk refreshing. The method improves the retrieval efficiency of deduplication indexes in massive data environments, keeps memory usage low, scales well, and can provide retrieval services for large-scale deduplication indexes.

Description

Technical field

[0001] The invention belongs to computer data storage and backup methods, and in particular relates to a data block index retrieval method for data deduplication.

Background

[0002] Data deduplication (DD) deletes duplicate data blocks in the global data set and retains only one copy of each, thereby eliminating redundant data. It can effectively improve storage efficiency and utilization, and greatly reduce the ... It is also a green storage technology that can effectively reduce energy consumption, and it is widely used in the field of storage backup. However, at large storage capacities, especially when the data block granularity is fine, the fingerprint data used to identify the data blocks is too large to fit in memory and must be stored on disk. Therefore, in a data deduplication system, the index system for retrieving fingerprints becomes the key to system performance. The existing data deduplication system, or...
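The basic idea of fingerprint-based deduplication described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: SHA-1 is used as the fingerprint function, an in-memory dictionary stands in for the fingerprint index, and `deduplicate` is a hypothetical helper. The excerpt does not say which hash function or index structure the invention uses.

```python
import hashlib


def deduplicate(blocks):
    """Keep one copy of each distinct block.

    Returns (store, recipe): store maps fingerprint -> block bytes,
    and recipe lists the fingerprints needed to rebuild the input.
    """
    store = {}
    recipe = []
    for block in blocks:
        fp = hashlib.sha1(block).hexdigest()  # assumed fingerprint function
        if fp not in store:                   # index lookup: seen before?
            store[fp] = block                 # first occurrence is kept
        recipe.append(fp)
    return store, recipe


blocks = [b"alpha", b"beta", b"alpha", b"gamma", b"beta"]
store, recipe = deduplicate(blocks)
restored = [store[fp] for fp in recipe]
```

In this toy version the `fp not in store` test is a cheap in-memory lookup. At the scale the background section describes, the fingerprint index no longer fits in memory, which is why that lookup becomes the performance bottleneck.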


Application Information

Patent Type & Authority: Patents (China)
IPC: IPC(8): G06F17/30
Inventor: 周可, 王桦, 宋兵强, 夏德军
Owner: HUAZHONG UNIV OF SCI & TECH