A Data Deduplication Blocking Method Based on Extremum

A data blocking (chunking) technique, applied in the fields of electrical digital data processing, special data processing applications, instruments, etc. It addresses the excessive operations, performance bottleneck, and excessive time consumption of existing blocking methods, and improves the deduplication rate.

Active Publication Date: 2017-08-25
HUAZHONG UNIV OF SCI & TECH

AI Technical Summary

Problems solved by technology

Existing blocking methods need too many operations to judge each cut point and are too time-consuming, which makes data blocking the performance bottleneck of the write path of the entire data deduplication system.

Embodiment Construction

[0039] In order to make the object, technical solution and advantages of the present invention clearer, the present invention will be further described in detail below in conjunction with the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain the present invention, not to limit the present invention. In addition, the technical features involved in the various embodiments of the present invention described below can be combined with each other as long as they do not constitute a conflict with each other.

[0040] The invention provides an extremum-based blocking method for data deduplication. The method uses a sliding window to find the extremum in a local region (for convenience of description, the maximum is used here as the example). The sliding window has two attributes: position P and value V, the first value of the unblocked data stream The P of the w...

Abstract

The invention discloses an extremum-based blocking method for data deduplication that improves on existing blocking methods. It differs from them in three ways: (1) to solve the boundary-shift problem, the local extremum is sought in an asymmetric local region rather than a symmetric one; (2) the position of the local extremum (the extreme point) is placed in the middle of a data block rather than used as the block boundary; (3) when identical extreme values occur, the position where the value first occurs is taken as the extreme point. Because of the first two differences, the method needs few operations to judge a cut point, so it achieves much higher throughput than existing blocking methods. Because of the third difference, the method can detect and eliminate duplicate data in some low-entropy strings. In addition, the data blocks it generates have a smaller block-length variance, and no block-length limits are enforced, so it achieves a deduplication rate equal to or higher than that of traditional blocking methods.
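The three points in the abstract can be illustrated with a minimal Python sketch. It is not the patent's implementation: the function names `ae_cut_point` and `ae_chunks`, the `window` parameter, the use of single bytes as values, and the choice to include the byte at position P + window in the block are all assumptions made for illustration. The maximum is used as the extremum.

```python
from typing import List


def ae_cut_point(data: bytes, start: int, window: int) -> int:
    """Return the exclusive end index of the block that begins at `start`.

    Assumption: each byte is one "value"; a real system might compare
    wider multi-byte values instead.
    """
    n = len(data)
    if start >= n:
        return n
    max_pos = start          # position P of the current local maximum
    max_val = data[start]    # value V of the current local maximum
    i = start + 1
    while i < n:
        if data[i] > max_val:
            # Strictly greater only: on ties the first occurrence stays
            # the extreme point (point 3 of the abstract).
            max_val = data[i]
            max_pos = i
        elif i == max_pos + window:
            # A fixed window to the right of the maximum holds no larger
            # value; the region to its left has variable length, so the
            # search region is asymmetric (point 1) and the maximum sits
            # inside the block, not on its boundary (point 2).
            return i + 1
        i += 1
    return n  # end of stream: the remainder forms the last block


def ae_chunks(data: bytes, window: int) -> List[bytes]:
    """Split `data` into consecutive blocks using ae_cut_point."""
    chunks = []
    start = 0
    while start < len(data):
        end = ae_cut_point(data, start, window)
        chunks.append(data[start:end])
        start = end
    return chunks


if __name__ == "__main__":
    sample = bytes([1, 5, 2, 3, 9, 1, 1, 1, 4, 4, 4, 4])
    for block in ae_chunks(sample, window=2):
        print(list(block))
```

Each byte costs one comparison plus, at most, one position check, which is consistent with the abstract's claim that few operations are needed per cut-point judgment. In this sketch a run of identical bytes is cut into equal blocks of `window + 1` bytes, which is one way the first-occurrence rule lets repeated low-entropy data produce identical blocks.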

Description

Technical field

[0001] The invention belongs to the fields of computer storage technology and computer networks, and more specifically relates to an extremum-based blocking method for data deduplication.

Background

[0002] With the rapid development of networks, more and more individual users and enterprises are connected to the Internet, and the total amount of data is growing explosively. According to statistics, over the 10 years from 2014 the total amount of global data will grow by 40% per year, which means it roughly doubles every two years; by 2020 the global total is estimated to reach 44 ZB. Storing and transmitting such large amounts of data is a major challenge today. As a technology that can effectively eliminate redundant data, data deduplication has become a research hotspot in the fields of storage and network optimization.

[0003] Although data deduplication technol...

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F17/30
CPC: G06F16/24556
Inventor: 冯丹, 张宇成, 夏文, 付忞, 黄方亭, 周玉坤
Owner: HUAZHONG UNIV OF SCI & TECH