
Convergence blocking method and device for data deduplication

A blocking technology for data deduplication, applied in the computer field, that addresses low deduplication efficiency, a growing number of hash-value calculations, and reduced blocking efficiency.

Active Publication Date: 2017-05-03
SANGFOR TECH INC

AI Technical Summary

Problems solved by technology

[0002] At present, data stream deduplication divides the stream into blocks and compares the hash values of those blocks to determine whether data is repeated. The larger the average block length after blocking, the coarser the deduplication granularity and the lower the deduplication rate; the smaller the average block length, the lower the deduplication efficiency.
However, the widely used content-based blocking method slides a window over the data byte by byte, then calculates and judges the hash value of the window content at each position. Because the hash values in content-based blocking are effectively random, the lengths of the resulting blocks follow an exponential distribution: there are many ultra-small data blocks and many very large data blocks. The many small data blocks make the total number of blocks large, which increases the number of times the window hash must be calculated and judged and reduces blocking efficiency.
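The exponential shape described above can be illustrated with a small calculation. The sketch below is not taken from the patent: it assumes an idealized model in which every byte position is a blocking position independently with probability 1/mean_length, so block lengths are geometrically distributed (the discrete counterpart of an exponential distribution). The function name and the 8192-byte mean used in the usage note are hypothetical.

```python
def fraction_at_most(length: int, mean_length: int) -> float:
    """Share of blocks no longer than `length` bytes.

    Assumption (idealized model, not from the source): each byte position
    is a blocking position independently with probability 1/mean_length,
    so block lengths follow a geometric distribution.
    """
    p = 1.0 / mean_length
    # P(L <= length) = 1 - (1 - p)^length for a geometric block length L
    return 1.0 - (1.0 - p) ** length
```

Under this model, with a hypothetical mean of 8192 bytes, `fraction_at_most(2048, 8192)` is about 0.22 and `1 - fraction_at_most(16384, 8192)` is about 0.135: roughly a fifth of all blocks are shorter than a quarter of the mean, while more than a tenth exceed twice the mean.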



Examples


Embodiment Construction

[0066] It should be understood that the specific embodiments described here are only used to explain the present invention, not to limit the present invention.

[0067] The present invention provides a convergence blocking method for data deduplication. Referring to Figure 1, in the first embodiment, the convergence blocking method for data deduplication includes:

[0068] Step S10: record the starting position of the data stream as a blocking position, and move the sliding window forward step by step from the starting position.

[0069] For a given file or data stream that needs to be divided into blocks, start the blocking operation and set the current position as Cur. Take the length of the sliding window as 48 bytes, and move the sliding window step by step from the Cur position toward the end position of the data stream. In this embodiment, each step is 1 byte.
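The window movement in step S10 can be sketched as follows. This is a minimal illustration rather than the patent's implementation: the 48-byte window and 1-byte step come from this embodiment, while the function name `sliding_windows` and the choice to place the first window so that it begins exactly at Cur are assumptions.

```python
from typing import Iterator, Tuple

WINDOW_SIZE = 48  # bytes, as stated in this embodiment
STEP = 1          # bytes per move, as stated in this embodiment


def sliding_windows(data: bytes, cur: int = 0) -> Iterator[Tuple[int, bytes]]:
    """Yield (position, window) pairs while moving toward the end of `data`.

    Assumption: the first window starts at `cur`, and `position` is the
    offset just past the last byte of the window.
    """
    pos = cur + WINDOW_SIZE
    while pos <= len(data):
        yield pos, data[pos - WINDOW_SIZE:pos]
        pos += STEP
```

Each yielded window is the content whose hash value a content-based blocking method would calculate and judge. Re-slicing the window at every step keeps the sketch short; a practical implementation would normally update a rolling hash incrementally instead.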

[0070] Step S20: each time the sliding window moves by one step, determine whether the current position...


Abstract

The invention discloses a convergence blocking method for data deduplication. The method comprises the following steps: recording the initial position of a data stream as a blocking position, and moving a sliding window forward step by step from the initial position; each time the sliding window moves one step, judging whether the current position of the sliding window is the end position of the data stream; and if not, dynamically selecting, according to the current block length, the judgment condition used to decide whether the current position is a blocking position. The invention also discloses a convergence blocking device for data deduplication. According to the disclosed method and device, the current block length is introduced as a blocking parameter, and the blocking judgment condition is dynamically relaxed or tightened, so that the average length of the data blocks can be controlled, the total number of data blocks can be reduced, and deduplication blocking efficiency can be increased.
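The idea of choosing the judgment condition from the current block length can be sketched as below. The concrete conditions used by the patent are not reproduced in this excerpt, so everything specific here is an assumption made for illustration: the target length, the two bit masks, the rule "strict mask below the target, loose mask at or above it", and the use of BLAKE2b as a stand-in for a rolling hash. Only the 48-byte window comes from the described embodiment.

```python
import hashlib
from typing import List

WINDOW_SIZE = 48               # bytes, from the described embodiment
TARGET_LENGTH = 4096           # hypothetical target block length
STRICT_MASK = (1 << 14) - 1    # hypothetical: tightened condition (14 zero bits)
LOOSE_MASK = (1 << 10) - 1     # hypothetical: relaxed condition (10 zero bits)


def window_hash(window: bytes) -> int:
    # Stand-in for a rolling hash; chosen only to keep the sketch short.
    return int.from_bytes(hashlib.blake2b(window, digest_size=8).digest(), "big")


def is_cut_point(window: bytes, current_length: int) -> bool:
    """Judge the window with a condition chosen by the current block length."""
    mask = STRICT_MASK if current_length < TARGET_LENGTH else LOOSE_MASK
    return window_hash(window) & mask == 0


def chunk(data: bytes) -> List[int]:
    """Return blocking positions, starting with 0 and ending with len(data)."""
    cuts = [0]
    start = 0
    pos = start + WINDOW_SIZE
    while pos < len(data):
        if is_cut_point(data[pos - WINDOW_SIZE:pos], pos - start):
            cuts.append(pos)
            start = pos                # the new blocking position becomes Cur
            pos = start + WINDOW_SIZE
        else:
            pos += 1                   # 1-byte step
    if cuts[-1] != len(data):
        cuts.append(len(data))         # the end of the stream closes the last block
    return cuts
```

In this sketch a block shorter than the hypothetical target is cut only under the tightened condition, and once it passes the target the relaxed condition makes a cut more likely, which pulls block lengths toward the target instead of letting them spread exponentially.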

Description

Technical field

[0001] The invention relates to the field of computers, in particular to a convergence blocking method and device for data deduplication in a storage system.

Background

[0002] At present, data stream deduplication divides the stream into blocks and compares the hash values of those blocks to determine whether data is repeated. The larger the average block length after blocking, the coarser the deduplication granularity and the lower the deduplication rate; the smaller the average block length, the lower the deduplication efficiency. However, the widely used content-based blocking method slides a window over the data byte by byte, then calculates and judges the hash value of the window content at each position. Because the hash values in content-based blocking are effectively random, the lengths of the resulting blocks follow an exponential distribution: there are a lot of ultra-small...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F3/06
CPC: G06F3/0626, G06F3/0641
Inventor: 夏文, 付忞, 吴大立, 古亮
Owner: SANGFOR TECH INC