Method, device and system for processing repeating data

A duplicate data processing technique applied in the storage field. It addresses the low efficiency of existing deduplication approaches, which cannot keep up with the storage requirements of gradually increasing data volumes, and achieves the effects of saving processing time, improving processing efficiency, and meeting those storage requirements.

Active Publication Date: 2011-10-12
CHENGDU HUAWEI TECH

AI Technical Summary

Problems solved by technology

[0005] Embodiments of the present invention provide a duplicate data processing method, device, and system to solve the problem in the prior art that deduplicating data with the CDC algorithm is inefficient and cannot meet the storage requirements of gradually increasing data volumes.



Examples


Embodiment Construction

[0021] To make the purpose, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. The described embodiments are some, but not all, of the embodiments of the present invention. All other embodiments obtained by persons of ordinary skill in the art based on these embodiments without creative effort fall within the protection scope of the present invention.

[0022] Figure 1 is a flowchart of an embodiment of the duplicate data processing method of the present invention. As shown in Figure 1, the method of this embodiment may include:

[0023] Step 101: divide the data object into blocks using a sliding window, and obtain the data of each block. In order to obtain the data of each block, th...


Abstract

The embodiments of the invention provide a method, device, and system for processing duplicate data. The method comprises the following steps. First, a data object is divided into blocks with a sliding window to obtain the data of each block. To obtain the data of each block, the sliding start position of the sliding window jumps forward from the end position of the previous block's data by a minimum block length; the start position of each block's data is the end position of the previous block's data; and the length of each block's data equals the sum of the minimum block length and the distance the sliding window slides while that block is processed. Second, the data of each block is matched against the block data already stored in the storage device. If the data of a block is already stored in the storage device, that block's data is deleted, and the copy stored in the storage device is used as the data of that block in the data object. The embodiments of the invention increase the efficiency of duplicate data deletion and meet the storage requirements of gradually increasing data volumes.
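The two steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The window size, minimum block length, boundary mask, the use of CRC-32 as the window fingerprint, and the use of SHA-256 digests to match blocks are all assumptions chosen for illustration, as are the function names `split_blocks` and `deduplicate`. The sketch also assumes the window covers the bytes immediately before a candidate boundary, so that the block length equals the minimum block length plus the distance slid.

```python
import hashlib
import zlib

WINDOW = 16    # sliding-window size in bytes (assumed value)
MIN_LEN = 64   # minimum block length in bytes (assumed value)
MASK = 0x3F    # a boundary is declared when these fingerprint bits are all zero (assumed)


def split_blocks(data: bytes, window: int = WINDOW,
                 min_len: int = MIN_LEN, mask: int = MASK):
    """Return a list of (start, end) offsets covering `data`."""
    if min_len < window:
        raise ValueError("min_len must be at least the window size")
    blocks = []
    start = 0
    while start < len(data):
        # Jump forward by the minimum block length before sliding.
        cut = start + min_len
        slid = 0
        while cut + slid < len(data):
            pos = cut + slid
            # CRC-32 stands in for the fingerprint algorithm (assumption).
            if zlib.crc32(data[pos - window:pos]) & mask == 0:
                break
            slid += 1
        # Block length is min_len + slid, truncated at the end of the data.
        end = min(cut + slid, len(data))
        blocks.append((start, end))
        start = end  # the next block starts where this one ended
    return blocks


def deduplicate(data: bytes, store: dict):
    """Store each new block once; return the list of keys describing `data`."""
    recipe = []
    for start, end in split_blocks(data):
        block = data[start:end]
        key = hashlib.sha256(block).hexdigest()  # matching by digest is an assumption
        if key not in store:
            store[key] = block  # first copy is kept
        # A duplicate block is dropped; only the reference to the stored copy remains.
        recipe.append(key)
    return recipe
```

Calling `deduplicate` twice with the same input and the same `store` leaves the store unchanged on the second call, and joining `store[k]` for each key in the returned list reproduces the original data.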

Description

Technical field

[0001] Embodiments of the present invention relate to storage technologies, and in particular, to a method, device, and system for processing duplicate data.

Background

[0002] Data deduplication, also known as intelligent compression or single-instance storage, is a storage technology that automatically searches for duplicate data, keeps only one copy of identical data, and replaces the other duplicate copies with pointers to that single copy, thereby eliminating redundant data and reducing storage capacity requirements.

[0003] In the prior art, the data deduplication method may use a variable-length chunking algorithm (Content-Defined Chunking, hereinafter referred to as CDC). Specifically, this method uses a fingerprint algorithm to calculate the fingerprint of the data object within the sliding window. If the predetermined condition is met, the start position and the end position of the sliding window are used as the boundary of the data block, and the f...
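To give a rough sense of the cost of fingerprinting at every window position, the sketch below counts how many window fingerprints a chunker computes for a given starting offset past the previous boundary. It is an illustration under stated assumptions, not the prior-art algorithm itself: CRC-32 stands in for the fingerprint algorithm, the mask test stands in for the predetermined condition, and the function name `count_fingerprints` and its parameters are hypothetical.

```python
import zlib


def count_fingerprints(data: bytes, skip: int,
                       window: int = 16, mask: int = 0x3F) -> int:
    """Count the window fingerprints computed while chunking `data`.

    `skip` is the distance from the previous boundary to the first candidate
    boundary. With skip == window the scan tests every possible position;
    a larger skip models jumping ahead by a minimum block length.
    """
    if skip < window:
        raise ValueError("skip must be at least the window size")
    evaluations = 0
    start = 0
    while start < len(data):
        pos = start + skip
        while pos < len(data):
            evaluations += 1
            # CRC-32 and the mask test are stand-ins (assumptions).
            if zlib.crc32(data[pos - window:pos]) & mask == 0:
                break
            pos += 1
        start = min(pos, len(data))
    return evaluations
```

Comparing `count_fingerprints(data, 16)` with `count_fingerprints(data, 64)` on the same input shows how many fingerprint computations are avoided when the window does not slide over the first bytes of each block. The exact saving depends on the data and on the assumed parameters.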


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30
Inventor: 段雨梅谢勇徐君
Owner: CHENGDU HUAWEI TECH