Data deduplication method, apparatus and system

A data deduplication technology, applied to input/output processing in electrical digital data processing, that addresses the problems of large metadata and a large number of data blocks degrading system performance, and thereby improves system performance.

Inactive Publication Date: 2016-07-27
LENOVO (BEIJING) LTD

AI Technical Summary

Problems solved by technology

However, since the metadata size of each data block is fixed, a finer-grained data block division makes the overall metadata larger and requires the system to process more data blocks, which degrades system performance.
Therefore, an existing data deduplication system based on a single-level data block division strategy has to trade off deduplication effect against system performance, and sometimes even sacrifices deduplication effect for system performance, that is, it adopts a coarser-grained data block division.
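
The trade-off above can be made concrete with a little arithmetic. The following is a minimal sketch, not taken from the patent: the 64-byte metadata record, the 1 GiB file size, and the 64 KiB and 4 KiB block sizes are all assumed values chosen only for illustration.

```python
def metadata_overhead(file_size: int, block_size: int, meta_per_block: int) -> tuple[int, int]:
    """Return (number of blocks, total metadata bytes) for fixed-size blocks.

    meta_per_block is the fixed metadata size per block (a hypothetical value).
    """
    n_blocks = -(-file_size // block_size)  # ceiling division
    return n_blocks, n_blocks * meta_per_block


GIB = 1024 ** 3
META_PER_BLOCK = 64  # assumed bytes of metadata per block, for illustration only

for block_size in (64 * 1024, 4 * 1024):  # coarse vs. fine granularity (assumed sizes)
    blocks, meta = metadata_overhead(GIB, block_size, META_PER_BLOCK)
    print(f"block size {block_size:>6} B: {blocks:>7} blocks, {meta} B of metadata")
```

Under these assumptions, moving from 64 KiB to 4 KiB blocks multiplies both the block count and the metadata volume by 16, which is the cost that a single-level strategy has to weigh against better deduplication.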

Examples

Embodiment Construction

[0016] Hereinafter, preferred embodiments of the present invention will be described in detail with reference to the accompanying drawings. Note that in this specification and the drawings, substantially the same steps and elements are denoted by the same reference numerals, and repeated explanations of these steps and elements will be omitted.

[0017] First, a method 10 for data deduplication according to an embodiment of the present invention is described with reference to Figure 1. Figure 1 is a flowchart illustrating the method 10 for data deduplication according to an embodiment of the present invention.

[0018] As shown in Figure 1, when the method 10 for data deduplication according to the embodiment of the present invention starts, first, in step S101, coarse-grained data block division is performed on the target file, so as to divide the target file into a plurality of coarse-grained data blocks. Next, in step S102, duplicate data block detection is performed on the plural...

Abstract

The present invention provides a data deduplication method, apparatus and system. The method comprises: performing coarse-grained data block division on a target file, so as to divide the target file into a plurality of coarse-grained data blocks; performing duplicate data block detection on the coarse-grained data blocks to obtain a first result; based on the first result, performing fine-grained data block division on each non-duplicate coarse-grained data block among the plurality of coarse-grained data blocks, so as to divide that coarse-grained data block into a plurality of fine-grained data blocks; performing duplicate data block detection on the fine-grained data blocks to obtain a second result; and, based on the second result, storing the non-duplicate fine-grained data blocks among the plurality of fine-grained data blocks. The size of each coarse-grained data block is greater than the size of each fine-grained data block, and the block boundaries of the coarse-grained division are a subset of the block boundaries of the fine-grained division.
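
The steps in the abstract can be sketched as follows. This is an illustrative sketch under stated assumptions, not the patented implementation: it assumes fixed-size blocks, SHA-256 fingerprints for duplicate detection, and in-memory dictionaries as the index, none of which the source specifies. The names `TwoLevelStore`, `put`, and `get` are hypothetical. Requiring the coarse size to be a multiple of the fine size is one simple way to make every coarse boundary also a fine boundary.

```python
import hashlib


def fingerprint(block: bytes) -> str:
    # Assumed fingerprint function; the source does not name one.
    return hashlib.sha256(block).hexdigest()


def split(data: bytes, size: int) -> list[bytes]:
    # Fixed-size division (an assumption; the source only says "division").
    return [data[i:i + size] for i in range(0, len(data), size)]


class TwoLevelStore:
    """Hypothetical two-level deduplicating store."""

    def __init__(self, coarse_size: int, fine_size: int) -> None:
        if coarse_size <= fine_size or coarse_size % fine_size != 0:
            raise ValueError("coarse size must be a larger multiple of fine size")
        self.coarse_size = coarse_size
        self.fine_size = fine_size
        self.coarse_index: dict[str, list[str]] = {}  # coarse fp -> fine fps
        self.fine_blocks: dict[str, bytes] = {}       # fine fp -> stored bytes

    def put(self, data: bytes) -> list[str]:
        """Deduplicate data and return its recipe (list of coarse fingerprints)."""
        recipe = []
        for coarse in split(data, self.coarse_size):
            cfp = fingerprint(coarse)
            if cfp not in self.coarse_index:  # first result: non-duplicate coarse block
                fine_fps = []
                for fine in split(coarse, self.fine_size):
                    ffp = fingerprint(fine)
                    # Second result: only non-duplicate fine blocks are stored.
                    self.fine_blocks.setdefault(ffp, fine)
                    fine_fps.append(ffp)
                self.coarse_index[cfp] = fine_fps
            recipe.append(cfp)
        return recipe

    def get(self, recipe: list[str]) -> bytes:
        """Rebuild the original data from a recipe."""
        return b"".join(
            self.fine_blocks[ffp] for cfp in recipe for ffp in self.coarse_index[cfp]
        )
```

A duplicate coarse block is skipped without being subdivided, so fine-grained work is only spent on data that is not already known at the coarse level. For example, storing `b"AAAABBBB" * 2 + b"AAAACCCC"` with `TwoLevelStore(8, 4)` subdivides only two of the three coarse blocks and keeps three fine blocks.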

Description

Technical field

[0001] The present invention relates to the field of data deduplication, and more specifically, the present invention relates to a method, device and system for data deduplication.

Background art

[0002] Data deduplication refers to the removal of duplicate (redundant) data from the data to be stored, so as to reduce the amount of stored data without destroying the integrity and fidelity of the original data, thereby saving storage resources and reducing hardware costs. Data deduplication is usually implemented by dividing the data to be stored into multiple data blocks according to specific rules, then removing the duplicate data blocks among them and storing only the remaining non-duplicate data blocks.

[0003] Existing data deduplication systems generally adopt a single-level data block division strategy. In single-level data block division, generally, in order to improve the effect of deduplication, it is necessary to adopt ...
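
For comparison, the single-level scheme described in the background can be sketched as below. This is a minimal sketch assuming fixed-size blocks and SHA-256 fingerprints, neither of which the source specifies; `dedup_single_level` is a hypothetical helper name, and the sample data is made up.

```python
import hashlib


def dedup_single_level(data: bytes, block_size: int) -> tuple[int, int]:
    """Return (total blocks, unique blocks) for one fixed block size."""
    seen = set()
    total = 0
    for i in range(0, len(data), block_size):
        seen.add(hashlib.sha256(data[i:i + block_size]).digest())
        total += 1
    return total, len(seen)


data = b"AAAABBBB" + b"AAAACCCC" + b"AAAABBBB"  # made-up sample data
print(dedup_single_level(data, 8))  # coarse division
print(dedup_single_level(data, 4))  # fine division
```

On this sample, the coarse division keeps 2 unique blocks of 8 bytes (16 bytes stored) while the fine division keeps 3 unique blocks of 4 bytes (12 bytes stored), at the price of handling twice as many blocks.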

Application Information

Patent Type & Authority: Applications (China)
IPC: IPC(8): G06F3/06
CPC: G06F3/0641
Inventor: 郑阳李明强严正山王敏赵鑫
Owner: LENOVO (BEIJING) LTD