A repeated data deleting method and device

A data deduplication technology, applied in the fields of electrical digital data processing and input/output processes for data processing. It addresses the low efficiency and high resource consumption of existing methods by avoiding lock overhead, reducing the consumption of system resources, and improving the efficiency of data deduplication.

Inactive Publication Date: 2017-02-15
ZHENGZHOU YUNHAI INFORMATION TECH CO LTD

AI Technical Summary

Problems solved by technology

[0004] The purpose of the present invention is to provide a deduplication method and device that solve the low efficiency and high resource consumption of existing deduplication methods that rely on mutex locks.

Embodiment Construction

[0037] To help those skilled in the art better understand the solution of the present invention, the invention is described in further detail below with reference to the accompanying drawings and specific embodiments. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by persons of ordinary skill in the art based on these embodiments, without creative effort, fall within the protection scope of the present invention.

[0038] A flowchart of a specific embodiment of the data deduplication method provided by the present invention is shown in Figure 1. The method includes:

[0039] Step S101: Divide the data stream into data blocks with a preset block size;

[0040] Step S102: Perform fingerprint calculation on each data block, and add the calculated fingerprint information to the attributes of the data block struc...
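
Steps S101 and S102 can be illustrated with a minimal Python sketch. It is not the patented implementation: the 4096-byte block size, the `DataBlock` structure, and the function names are assumptions made for illustration, and SHA-1 is chosen only because the background section names MD5 and SHA1 as typical fingerprint hashes.

```python
import hashlib
from dataclasses import dataclass

BLOCK_SIZE = 4096  # assumed preset block size; the source gives no value


@dataclass
class DataBlock:
    """Hypothetical data block structure carrying its fingerprint as an attribute."""
    offset: int
    data: bytes
    fingerprint: bytes = b""


def split_into_blocks(stream: bytes, block_size: int = BLOCK_SIZE) -> list:
    """Step S101: cut the stream into fixed-size blocks (the last one may be shorter)."""
    return [
        DataBlock(offset=i, data=stream[i:i + block_size])
        for i in range(0, len(stream), block_size)
    ]


def add_fingerprints(blocks: list) -> list:
    """Step S102: hash each block and store the digest on the block structure."""
    for block in blocks:
        block.fingerprint = hashlib.sha1(block.data).digest()
    return blocks


if __name__ == "__main__":
    stream = b"a" * BLOCK_SIZE + b"b" * BLOCK_SIZE + b"a" * BLOCK_SIZE
    for block in add_fingerprints(split_into_blocks(stream)):
        print(block.offset, block.fingerprint.hex())
```

In this sketch the first and third blocks hold identical data and therefore receive identical fingerprints, which is what a later duplicate check relies on.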

Abstract

The invention provides a data deduplication method and device. The method comprises: dividing a data stream into data blocks of a preset block size; performing fingerprint calculation on each data block and adding the calculated fingerprint information to the attributes of the data block structure; and obtaining a fixed-length prefix of each calculated fingerprint and distributing the data blocks into different processing queues according to that prefix. The worker threads of the processing queues perform duplicate checks in parallel to delete duplicate data among the data blocks. Because received data blocks are distributed into processing queues by the fixed-length prefix of their fingerprints, and a single thread processes the data blocks in each queue, a duplicate check only searches the deduplication metadata sub-list that corresponds to that fingerprint prefix, so the overhead of consistency locks is avoided. The worker threads of the processing queues carry out the duplicate checks in parallel, which reduces the system resources consumed by deduplication computation and increases deduplication efficiency.
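
The prefix-based dispatch described in the abstract can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the 2-bit prefix (four queues), the use of SHA-1, the dictionary used as a metadata sub-list, and all function names are hypothetical. Python's `queue.Queue` is internally synchronized for the hand-off, so the sketch only illustrates that each fingerprint sub-list is touched by exactly one thread and therefore needs no lock of its own.

```python
import hashlib
import queue
import threading

PREFIX_BITS = 2                 # assumed fixed prefix length, giving 4 queues
NUM_QUEUES = 1 << PREFIX_BITS


def queue_index(fingerprint: bytes) -> int:
    """Select a processing queue from the top PREFIX_BITS bits of the fingerprint."""
    return fingerprint[0] >> (8 - PREFIX_BITS)


def worker(q: queue.Queue, sub_index: dict, unique_out: list) -> None:
    """Single thread per queue: it alone reads and writes its sub_index, so no lock."""
    while True:
        item = q.get()
        if item is None:        # sentinel: no more blocks for this queue
            break
        fingerprint, data = item
        if fingerprint not in sub_index:
            sub_index[fingerprint] = len(data)   # placeholder metadata entry
            unique_out.append(data)


def deduplicate(blocks: list) -> list:
    """Route blocks by fingerprint prefix and return the unique blocks."""
    queues = [queue.Queue() for _ in range(NUM_QUEUES)]
    sub_indexes = [{} for _ in range(NUM_QUEUES)]
    outputs = [[] for _ in range(NUM_QUEUES)]
    threads = [
        threading.Thread(target=worker, args=(queues[i], sub_indexes[i], outputs[i]))
        for i in range(NUM_QUEUES)
    ]
    for t in threads:
        t.start()
    for data in blocks:
        fingerprint = hashlib.sha1(data).digest()
        queues[queue_index(fingerprint)].put((fingerprint, data))
    for q in queues:
        q.put(None)
    for t in threads:
        t.join()
    return [data for out in outputs for data in out]


if __name__ == "__main__":
    print(len(deduplicate([b"x", b"y", b"x", b"z", b"y"])))
```

Identical blocks always share a fingerprint and hence a prefix, so they always reach the same queue and the same sub-list; that property is what lets each worker check for duplicates without coordinating with the others.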

Description

Technical field

[0001] The invention relates to the technical field of data storage, in particular to a method and device for deduplicating data.

Background

[0002] With the development of information technology and Internet technology, data of all kinds is growing severalfold year by year, and solid-state storage media, which offer high performance but relatively small capacity, account for an increasing proportion of storage systems. Data deduplication technology divides the data stream sent to the storage system into blocks, computes and compares their fingerprints, and deletes duplicate data, making efficient use of storage resources. In an implementation of data deduplication, the data stream written by the user to storage is first divided according to a certain block size, and then the deduplication module performs a hash calculation (MD5 or SHA1) on all the data of each divided data block to generate the globally unique data fingerprint of a data blo...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F3/06
CPC: G06F3/0608, G06F3/0641
Inventor: 苑忠科, 殷雷
Owner: ZHENGZHOU YUNHAI INFORMATION TECH CO LTD