Method and system for subblock splitting for hash-based deduplication
A sub-block and hash-value technology, applied in the computer field, that addresses problems such as reducing storage requirements
Embodiment Construction
[0016] As mentioned earlier, hash-based deduplication involves dividing data into sub-blocks of variable or fixed size, computing a hash value for each sub-block, and matching identical sub-blocks by their hash values. However, hash-based deduplication systems suffer inefficiencies and productivity losses because sub-block sizes vary widely. Imposing artificial minimum and maximum sizes on sub-blocks reduces the probability of finding valid sub-block boundaries and also biases the average sub-block size. Moreover, artificial minimum and maximum sizes destroy the fundamental property of reproducible sub-block boundaries.
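The general scheme described in paragraph [0016] can be illustrated with a minimal sketch. This is not the method claimed in this application; it only shows generic fixed-size and content-defined splitting followed by hash matching. All names (`split_fixed`, `split_variable`, `deduplicate`, `shared_blocks`), the 16-byte window, the 6-bit mask, the use of CRC-32 recomputed at every position in place of a true rolling hash, and SHA-256 as the sub-block hash are assumptions chosen for illustration.

```python
import hashlib
import random
import zlib
from typing import Callable, Dict, List, Optional, Tuple

WINDOW = 16          # assumed: bytes of trailing context that decide a boundary
MASK = (1 << 6) - 1  # assumed: cut when the low 6 hash bits are zero (about 1 in 64 positions)


def split_fixed(data: bytes, size: int = 64) -> List[bytes]:
    """Fixed-size sub-blocks; the last one may be shorter."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def split_variable(data: bytes, min_size: int = 1,
                   max_size: Optional[int] = None) -> List[bytes]:
    """Variable-size sub-blocks with content-defined boundaries.

    A natural boundary falls after byte `end` when the hash of the preceding
    WINDOW bytes matches MASK. min_size suppresses early boundaries and
    max_size forces a cut (the artificial limits the text refers to).
    """
    blocks, start = [], 0
    for end in range(1, len(data) + 1):
        length = end - start
        natural = end >= WINDOW and (zlib.crc32(data[end - WINDOW:end]) & MASK) == 0
        forced = max_size is not None and length >= max_size
        if (natural and length >= min_size) or forced:
            blocks.append(data[start:end])
            start = end
    if start < len(data):
        blocks.append(data[start:])  # trailing remainder
    return blocks


def deduplicate(blocks: List[bytes]) -> Tuple[Dict[str, bytes], List[str]]:
    """Store each distinct sub-block once, keyed by its hash value."""
    store: Dict[str, bytes] = {}
    recipe: List[str] = []  # ordered hashes needed to rebuild the data
    for block in blocks:
        digest = hashlib.sha256(block).hexdigest()
        store.setdefault(digest, block)
        recipe.append(digest)
    return store, recipe


def shared_blocks(splitter: Callable[[bytes], List[bytes]], a: bytes, b: bytes) -> int:
    """Count distinct sub-block hashes that two inputs have in common."""
    _, recipe_a = deduplicate(splitter(a))
    _, recipe_b = deduplicate(splitter(b))
    return len(set(recipe_a) & set(recipe_b))


if __name__ == "__main__":
    rng = random.Random(0)
    base = bytes(rng.randrange(256) for _ in range(4096))
    edited = b"X" + base  # a one-byte insertion shifts all later bytes
    print("fixed-size matches:     ", shared_blocks(split_fixed, base, edited))
    print("content-defined matches:", shared_blocks(split_variable, base, edited))
    sizes = [len(b) for b in split_variable(base)]
    print("content-defined sizes: min", min(sizes), "max", max(sizes))
```

In this sketch, a boundary chosen purely from content reappears at the same place in the data after an insertion, so later sub-blocks still match. Passing `min_size` or `max_size` makes each cut depend on where the previous cut fell, which is the dependence that paragraph [0016] identifies as undermining reproducible boundaries.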
[0017] Thus, the illustrated embodiments seek to provide defined minimum and maximum sub-block sizes (for ease of data management) and a tight distribution of sub-block sizes around a predictable average size (for predictability of storage and processing resource consumption), while generating reproducible and statistical...