Concurrent hierarchy type replicated data eliminating method and system

A data deduplication method and system, applied in the field of information security. It addresses problems such as increased computing overhead, the inability to fully utilize computing resources, and a reduced deduplication rate. Its effects are to relieve computing resource bottlenecks, improve the deduplication rate, and deduplicate data efficiently.

Inactive Publication Date: 2010-12-15
INST OF COMPUTING TECH CHINESE ACAD OF SCI
4 Cites, 51 Cited by

AI Technical Summary

Problems solved by technology

[0005] Reducing the deduplication granularity can significantly improve the deduplication rate, but it also significantly increases the computing overhead. The increase is especially large when a byte-level deduplication mechanism is used, and it seriously affects the rate at which data can be deduplicated. At the same time, none of the existing deduplication mechanisms can make full use of the computing resources of multi-core systems.

Examples

Embodiment Construction

[0062] To make the purpose, technical solution and advantages of the present invention clearer, the concurrent hierarchical data deduplication method and system of the present invention are described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain the present invention, not to limit it.

[0063] The concurrent hierarchical data deduplication method and system of the present invention starts one or more threads according to the configuration of the multi-core system. The data is divided into blocks, checked for duplicates and encoded by multiple threads in parallel, with each thread independently completing the whole deduplication process. The results are then written to the storage device either by the different threads themselves or by a single unified thread. After the concurrent hierarchical data deduplication syste...
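The thread layout described in [0063] can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes fixed-size chunks, SHA-256 fingerprints, a lock-protected in-memory fingerprint set, and a plain list standing in for the storage device. The names dedup_worker, writer, run and CHUNK_SIZE are hypothetical. The sketch shows the "unified thread" variant, in which one writer thread collects the output of all worker threads.

```python
import hashlib
import queue
import threading

CHUNK_SIZE = 4  # hypothetical chunk size in bytes, kept tiny for illustration


def dedup_worker(inputs, results, index, index_lock):
    """Chunk, duplicate-check and encode buffers until a None sentinel arrives."""
    while True:
        item = inputs.get()
        if item is None:
            break
        seq, data = item
        encoded = []
        for off in range(0, len(data), CHUNK_SIZE):
            chunk = data[off:off + CHUNK_SIZE]
            fingerprint = hashlib.sha256(chunk).hexdigest()
            # The fingerprint index is shared by all workers, so guard it.
            with index_lock:
                is_new = fingerprint not in index
                if is_new:
                    index.add(fingerprint)
            if is_new:
                encoded.append(("data", fingerprint, chunk))
            else:
                encoded.append(("ref", fingerprint, None))
        results.put((seq, encoded))


def writer(results, storage, n_items):
    """Single unified thread that writes every encoded buffer to storage."""
    for _ in range(n_items):
        storage.append(results.get())


def run(buffers, n_threads=4):
    """Deduplicate a list of byte buffers with n_threads worker threads."""
    inputs, results = queue.Queue(), queue.Queue()
    index, index_lock = set(), threading.Lock()
    storage = []  # stand-in for the storage device
    workers = [
        threading.Thread(target=dedup_worker,
                         args=(inputs, results, index, index_lock))
        for _ in range(n_threads)
    ]
    write_thread = threading.Thread(target=writer,
                                    args=(results, storage, len(buffers)))
    for t in workers + [write_thread]:
        t.start()
    for seq, buf in enumerate(buffers):
        inputs.put((seq, buf))
    for _ in workers:
        inputs.put(None)  # one sentinel per worker
    for t in workers + [write_thread]:
        t.join()
    return sorted(storage, key=lambda record: record[0])
```

Calling run([b"aaaabbbb", b"aaaacccc"]) stores each distinct chunk once and records the repeated chunk as a reference. Which buffer ends up holding the data record for a shared chunk depends on thread scheduling. In CPython, hashlib releases the global interpreter lock only for larger inputs, so a real multi-core deployment would more likely use processes or native threads.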

Abstract

The invention discloses a concurrent hierarchical data deduplication method and system. The method comprises the following steps: an input device receives externally input data and puts it into a shared buffer queue; a plurality of chunking devices concurrently fetch data from the buffer queue, divide it into blocks, and pass the blocks to a plurality of coarse-grained deduplicators; the coarse-grained deduplicators perform coarse-grained deduplication and judge whether each data block is a duplicate. If it is, the index information of the duplicate data block is written to the storage device through a data read-write subsystem. Otherwise, fine-grained deduplicators perform fine-grained deduplication on the non-duplicate data block, and the deduplicated data block and its index information are stored in the storage device through the data read-write subsystem.
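The two-level decision in the abstract (coarse-grained check first, fine-grained deduplication only for blocks that are not duplicates) can be sketched as follows. This is a single-threaded illustration under assumptions that do not come from the patent: fixed-size coarse blocks, fixed-size sub-blocks as a stand-in for the fine-grained (for example byte-level) stage, SHA-256 fingerprints, and in-memory dictionaries in place of the data read-write subsystem. The class name HierarchicalDedup and the constants COARSE and FINE are hypothetical.

```python
import hashlib

COARSE = 8  # hypothetical coarse block size in bytes
FINE = 2    # hypothetical fine sub-block size in bytes


def fingerprint(data):
    """Return a SHA-256 hex digest used as the block identity."""
    return hashlib.sha256(data).hexdigest()


class HierarchicalDedup:
    def __init__(self):
        self.block_parts = {}  # coarse fingerprint -> list of fine fingerprints
        self.fine_store = {}   # fine fingerprint -> unique sub-block bytes
        self.recipe = []       # index information, one entry per input block

    def write(self, data):
        for off in range(0, len(data), COARSE):
            block = data[off:off + COARSE]
            block_fp = fingerprint(block)
            # Coarse-grained stage: a known block only needs an index entry.
            if block_fp in self.block_parts:
                self.recipe.append(("dup", block_fp))
                continue
            # Fine-grained stage: store only sub-blocks not seen before.
            parts = []
            for sub_off in range(0, len(block), FINE):
                sub = block[sub_off:sub_off + FINE]
                sub_fp = fingerprint(sub)
                self.fine_store.setdefault(sub_fp, sub)
                parts.append(sub_fp)
            self.block_parts[block_fp] = parts
            self.recipe.append(("new", block_fp))

    def read(self):
        """Rebuild the original byte stream from the index information."""
        return b"".join(
            self.fine_store[sub_fp]
            for _, block_fp in self.recipe
            for sub_fp in self.block_parts[block_fp]
        )
```

The cheap coarse check filters out whole duplicate blocks, so the more expensive fine-grained pass runs only on data that is new at block level. In the system described by the abstract, several chunkers and deduplicators run these stages concurrently on data taken from the shared buffer queue.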

Description

Technical field

[0001] The present invention relates to the field of information security, and in particular to a concurrent hierarchical data deduplication method and system that can effectively utilize multi-core computing resources.

Background

[0002] As the degree of informatization continues to rise, the amount of data continues to explode. According to statistics, the world produced 5 EB of data in 2002, and the amount has grown rapidly at a rate of 30% per year. It is estimated that by 2010 the total amount of global data will exceed 988 EB. At the same time, the importance of data continues to increase, and more and more data needs to be stored centrally through archiving and backup. According to statistics from the Enterprise Strategy Group (ESG), the amount of archiving and backup data is growing rapidly at a rate of 60% per year; its scale has reached the PB level and will soon grow to hundreds of PB. The amount of data in the backup system is u...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F3/06, G06F12/06
Inventor: 王树鹏, 云晓春, 包秀国, 李楠宁
Owner: INST OF COMPUTING TECH CHINESE ACAD OF SCI