A Duplicate Data Detection Method Based on Rabin Fingerprint and XOR Calculation

A duplicate data detection technology, applied in computing, digital data protection, and electrical digital data processing, that accelerates the duplicate data detection process, reduces the amount of computation, and improves data security.

Active Publication Date: 2021-02-19
XI AN JIAOTONG UNIV

AI Technical Summary

Problems solved by technology

Therefore, there is an urgent need for new duplicate data detection methods that eliminate the performance overhead of traditional SHA-1 / MD5 calculations.


Examples

Embodiment Construction

[0026] The present invention is described in further detail below with reference to the accompanying drawings:

[0027] Referring to Figure 1, the duplicate data detection method based on Rabin fingerprint and XOR calculation of the present invention comprises the following steps:

[0028] 1) Calculate the Rabin fingerprint value of the current data block, which is obtained by the following steps:

[0029] (1) After data segmentation starts, read 48 bytes at a time into the sliding window and calculate the Rabin fingerprint value using formula (1). At the same time, create a counter and initialize it to 0.

[0030] f(A)=A(t) mod P(t) (1)

[0031] where f(A) is the Rabin fingerprint value,

[0032] (2) Determine from the Rabin fingerprint value whether the current window forms a boundary point. According to the classic content-based chunking algorithm of deduplication systems, there are two conditions for judging whether the window forms a boundary point:

[0...
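Formula (1) can be illustrated with a minimal Python sketch. It assumes the usual reading of a Rabin fingerprint, in which the window bytes are treated as a polynomial A(t) with coefficients in GF(2) and reduced modulo a fixed polynomial P(t). The modulus t^64 + t^4 + t^3 + t + 1 and the names `poly_mod` and `rabin_fingerprint` are assumptions chosen for illustration; the source does not state which P(t) the invention uses. The sketch recomputes each window from scratch rather than using a rolling update, and it does not implement the boundary-point conditions, which are not given in the text above.

```python
WINDOW_SIZE = 48  # bytes read into the sliding window, as in step (1)

# Assumed modulus for illustration only: t^64 + t^4 + t^3 + t + 1.
# The source does not specify P(t).
P = (1 << 64) | 0x1B


def poly_mod(a: int, p: int) -> int:
    """Remainder of a(t) divided by p(t), both polynomials over GF(2).

    Bit i of each integer is the coefficient of t^i, so polynomial
    subtraction is a bitwise XOR.
    """
    deg_p = p.bit_length() - 1
    while a.bit_length() - 1 >= deg_p:
        shift = a.bit_length() - 1 - deg_p
        a ^= p << shift  # cancel the current leading term of a(t)
    return a


def rabin_fingerprint(window: bytes, p: int = P) -> int:
    """f(A) = A(t) mod P(t) for one window of bytes."""
    a = int.from_bytes(window, "big")
    return poly_mod(a, p)


if __name__ == "__main__":
    data = bytes(range(64))
    # Naive sliding window: one fingerprint per byte offset.
    for offset in range(len(data) - WINDOW_SIZE + 1):
        window = data[offset:offset + WINDOW_SIZE]
        print(offset, hex(rabin_fingerprint(window)))
```

Because the reduction is linear over GF(2), the fingerprint of the XOR of two equal-length windows equals the XOR of their fingerprints, which is the property that makes incremental (rolling) updates possible in practice.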


Abstract

A duplicate data detection method based on Rabin fingerprint and XOR calculation. The method calculates the Rabin fingerprint value of the current data block and checks whether that fingerprint value exists in the database. If it does not, the current data block is judged to be new data. Otherwise, all data blocks in the database that have the same Rabin fingerprint value as the current data block are found and read out, and each read data block is compared with the current data block by XOR calculation. If the XOR result is 1 for every read data block, the current data block is a new data block; otherwise, the current data block is judged to be a duplicate data block. The invention can significantly reduce the performance cost of fingerprint calculation on novel non-volatile storage devices and can eliminate the potential security risk in traditional duplicate data detection methods.
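The detection flow in the abstract can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the patented implementation: the "database" is an in-memory dictionary, the class `DedupStore` and the helpers `xor_equal` and `weak_fingerprint` are hypothetical names, and an XOR result of "1" is interpreted here as "the XOR of the two blocks is nonzero", meaning the blocks differ in at least one bit. The stand-in fingerprint is deliberately weak so that collisions are easy to demonstrate; a Rabin fingerprint would take its place.

```python
from collections import defaultdict
from typing import Callable


def xor_equal(a: bytes, b: bytes) -> bool:
    """True if the two blocks are bit-for-bit identical (XOR is all zeros)."""
    if len(a) != len(b):
        return False
    return (int.from_bytes(a, "big") ^ int.from_bytes(b, "big")) == 0


def weak_fingerprint(block: bytes) -> int:
    """Stand-in fingerprint (assumption): byte sum mod 256, collides easily."""
    return sum(block) % 256


class DedupStore:
    """Hypothetical in-memory stand-in for the fingerprint database."""

    def __init__(self, fingerprint: Callable[[bytes], int]):
        self._fingerprint = fingerprint
        self._blocks_by_fp = defaultdict(list)

    def check_and_store(self, block: bytes) -> bool:
        """Return True if block is a duplicate; otherwise store it as new."""
        fp = self._fingerprint(block)
        candidates = self._blocks_by_fp.get(fp)
        if candidates:
            # Fingerprint hit: confirm with a full XOR comparison so that
            # a fingerprint collision is never mistaken for a duplicate.
            for stored in candidates:
                if xor_equal(stored, block):
                    return True
        # No fingerprint hit, or every candidate differed: new data.
        self._blocks_by_fp[fp].append(block)
        return False


if __name__ == "__main__":
    store = DedupStore(weak_fingerprint)
    print(store.check_and_store(b"ab"))  # first time seen
    print(store.check_and_store(b"ab"))  # same content again
    print(store.check_and_store(b"ba"))  # same fingerprint, different content
```

The byte-level comparison on a fingerprint hit is what lets a cheap, collision-prone fingerprint stand in for SHA-1 / MD5: correctness no longer depends on the fingerprint being collision-resistant.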

Description

Technical field

[0001] The invention belongs to the field of data deduplication in storage systems and relates to a duplicate data detection method for novel non-volatile storage devices, in particular to a duplicate data detection method based on Rabin fingerprint and XOR calculation.

Background technique

[0002] In recent years, non-volatile memory (NVM), represented by PCM, STT-RAM and 3D XPoint, has attracted increasing attention from industry and academia. NVM devices offer high bandwidth and low latency, with read and write performance comparable to DRAM, and, like a disk, retain data after power failure. Intel's Optane SSD, based on 3D XPoint technology, is already on the market. Compared with traditional flash media, NVM brings disruptive changes to the structure of computer systems. NVM is byte-addressable and has read and write performance close to that of DRAM. It can be used as memory, and the CPU can directly ac...


Application Information

Patent Type & Authority: Patents (China)
IPC: IPC(8): G06F21/62
CPC: G06F21/6227
Inventor: 王龙翔, 董小社, 张兴军, 朱正东, 陈衡, 王宇菲
Owner: XI AN JIAOTONG UNIV