
Method and system for subblock splitting for hash-based deduplication

A sub-block and hash-value technology, applied in the computer field, that addresses problems such as reducing storage requirements.

Inactive Publication Date: 2016-12-28
INT BUSINESS MASCH CORP

AI Technical Summary

Problems solved by technology

Subsequent copies are replaced by pointers to the stored occurrence, which significantly reduces storage requirements if the data is in fact duplicated.


Embodiment Construction

[0016] As mentioned earlier, hash-based deduplication involves dividing data into sub-blocks of variable or fixed size, computing a hash value for each sub-block, and matching identical sub-blocks by their hash values. However, hash-based deduplication systems suffer inefficiencies and productivity losses when sub-block sizes vary widely. Imposing artificial minimum and maximum sizes on sub-blocks reduces the probability of finding valid sub-block boundaries and also reduces the effect of bias on the average sub-block size. Moreover, artificial minimum and maximum sizes on sub-blocks destroy the fundamental property of reproducible sub-block boundaries.
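The basic scheme in this paragraph (split into sub-blocks, hash each one, match identical sub-blocks by hash) can be sketched as follows. This is only an illustrative sketch, not the patented method: it assumes fixed-size sub-blocks and SHA-256, and the names `deduplicate`, `restore`, `store`, and `recipe` are hypothetical.

```python
import hashlib


def deduplicate(data: bytes, block_size: int = 4096):
    """Split data into fixed-size sub-blocks and keep each distinct one once.

    Assumption: fixed-size sub-blocks and SHA-256 are chosen for illustration.
    """
    store = {}   # hash digest -> sub-block bytes (first occurrence only)
    recipe = []  # ordered digests; each acts as a pointer into the store
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        digest = hashlib.sha256(block).digest()
        # A repeated sub-block has the same digest, so it is not stored again.
        store.setdefault(digest, block)
        recipe.append(digest)
    return store, recipe


def restore(store, recipe) -> bytes:
    """Rebuild the original data by following the pointers in order."""
    return b"".join(store[digest] for digest in recipe)


data = b"A" * 4096 * 3 + b"B" * 4096
store, recipe = deduplicate(data)
print(len(recipe), "sub-blocks referenced,", len(store), "stored")
```

Fixed-size splitting is the simplest case; a single inserted byte shifts every later sub-block, which is why variable-size, content-defined boundaries are of interest.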

[0017] Thus, the illustrated embodiments seek to provide defined minimum and maximum sub-block sizes (for ease of data management) and a tight distribution of sub-block sizes around a predictable average size (for predictability of storage and processing resource consumption), while generating reproducible and statistical...


Abstract

Methods and systems for hash-based deduplication are disclosed. Sub-block partitioning for hash-based deduplication is performed by defining a minimum and a maximum size for the sub-block. For each boundary start position of a sub-block, a search for the boundary position of the subsequent sub-block is started, using multiple search criteria to test the hash values calculated during the search. If one of the hash values satisfies one of the multiple search criteria, the position of that hash value is declared the boundary end position of the sub-block. If the maximum sub-block size is reached before any of these criteria is satisfied, the position of an alternative hash value, selected based on another of the multiple search criteria, is declared the boundary end position of the sub-block.
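A rough sketch of the boundary search summarized in the abstract is given below. Everything specific in it is an assumption made for illustration, not taken from the patent: the hash is CRC-32 over a small trailing window, the primary criterion is "low bits under `primary_mask` are zero", the alternative criterion is the looser `backup_mask`, the search starts once `min_size` bytes have been passed, and a hard cut at `max_size` is used if no candidate exists.

```python
import zlib


def split_sub_blocks(data: bytes, min_size: int = 64, max_size: int = 1024,
                     window: int = 16, primary_mask: int = 0xFF,
                     backup_mask: int = 0x3F):
    """Return the end positions of sub-blocks found in data.

    Hypothetical criteria: a position qualifies when the CRC-32 of the
    preceding `window` bytes has all bits under the mask equal to zero.
    """
    assert window <= min_size <= max_size
    boundaries = []
    start, n = 0, len(data)
    while start < n:
        if n - start <= min_size:
            boundaries.append(n)  # short tail: one final sub-block
            break
        limit = min(start + max_size, n)
        end = None
        backup = None  # latest position meeting the looser criterion
        for pos in range(start + min_size, limit + 1):
            h = zlib.crc32(data[pos - window:pos])
            if h & primary_mask == 0:
                end = pos  # primary criterion met: boundary found
                break
            if h & backup_mask == 0:
                backup = pos
        if end is None:
            reached_max = limit == start + max_size
            # Maximum size reached: fall back to the alternative candidate.
            end = backup if (reached_max and backup is not None) else limit
        boundaries.append(end)
        start = end
    return boundaries
```

Because each hash depends only on the bytes in its window, the same content tends to yield the same boundaries wherever it appears. Every sub-block except possibly the last one falls between `min_size` and `max_size` bytes.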

Description

Technical field

[0001] The present invention relates generally to computers, and more particularly to improved sub-block partitioning for hash-based deduplication in a computing environment.

Background

[0002] In today's society, computer systems are ubiquitous. Computer systems may be found at the workplace, at home, or at school. Computer systems may include data storage systems, or disk storage systems, to process and store data. Huge volumes of data must be processed every day, and current trends indicate that these volumes will continue to increase for the foreseeable future. One effective way to alleviate this problem is through the use of deduplication. The idea underlying a deduplication system is to exploit the fact that most of the available data is copied and forwarded again and again without any change, by locating duplicate data and storing only its first occurrence. Subsequent copies are replaced by pointers to stored occurrences, which significantl...


Application Information

Patent Type & Authority: Patents (China)
IPC: IPC(8): G06F17/30
CPC: G06F3/0641, G06F3/0608, G06F3/067, G06F16/2365, G06F16/245, G06F16/951, G06F16/1748, G06F16/1752, G06F16/2255
Inventors: L. Aronovich (L·阿罗诺维奇), M. Hirsch (M·海尔什)
Owner: INT BUSINESS MASCH CORP