Method for implementing repeated data deletion technology based on single-hash averaging Bloom filter

A data deduplication technique and its implementation method, applied in the field of computer technology. It addresses the increased resource consumption, reduced computing power, and lower cost-effectiveness of Bloom filters, and achieves mutual independence, low computational cost, and fast filtering.

Pending Publication Date: 2021-01-01

SOUTH CHINA UNIV OF TECH

0 Cites, 4 Cited by


Problems solved by technology

Although Bloom filters are widely used in network applications, the standard Bloom filter (SBF) places high requirements on its hash functions (mutual independence and good randomness) and has limited storage space. As a result, the Bloom filter consumes more resources, computing power is reduced, and cost-effectiveness is lower.
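For contrast, the following is a minimal Python sketch of a standard Bloom filter, showing why it needs k full hash computations per item. It is an illustration only and is not taken from the patent: the class name `StandardBloomFilter`, the use of SHA-256 with a one-byte prefix to imitate k independent hash functions, and the one-byte-per-bit storage are all assumptions.

```python
import hashlib
from typing import Iterator


class StandardBloomFilter:
    """Illustrative standard Bloom filter (hypothetical names, not from the patent)."""

    def __init__(self, m: int, k: int) -> None:
        if not 1 <= k <= 255:
            raise ValueError("this sketch supports 1 <= k <= 255")
        self.m = m                  # number of bit positions
        self.k = k                  # number of hash functions
        self.bits = bytearray(m)    # one byte per bit, for readability

    def _positions(self, item: bytes) -> Iterator[int]:
        # Assumption: k "independent" hashes are imitated by prefixing
        # the item with the hash index; each one is a full SHA-256 pass.
        for i in range(self.k):
            digest = hashlib.sha256(bytes([i]) + item).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, item: bytes) -> None:
        for pos in self._positions(item):
            self.bits[pos] = 1

    def might_contain(self, item: bytes) -> bool:
        # False means "definitely not stored"; True means "possibly stored".
        return all(self.bits[pos] for pos in self._positions(item))


sbf = StandardBloomFilter(m=24, k=3)
sbf.add(b"x1")
```

Every `add` and every query here costs k expensive hash evaluations, which is the overhead the single-hash approach aims to reduce.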

Embodiment

[0050] The method for implementing the deduplication technology based on the single-hash uniform distribution filter, as shown in Figure 1, includes the following steps:

[0051] S1. Determine the length of the storage area and the length of each partition; determine the first data set D1, containing the D data items to be stored; set j = 1; and determine the second data set D2 to be queried.

[0052] The length of the storage area is the storage size M of the single-hash uniform distribution filter. With k partitions, the final length of each partition is the integer part of M / k, and the length of the last partition can be less than M / k.

[0053] In this embodiment, the first data set D1 contains one data item x_1, the storage area length is 24, and k is 3, so the 3 partitions p_1, p_2, p_3 each have length 8.
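The partition layout in this embodiment can be sketched in Python as follows. The function name `partition_bounds` is hypothetical, and the sketch assumes that any bits left over when M is not divisible by k are simply left unused; the source text does not spell out that case.

```python
from typing import List, Tuple


def partition_bounds(M: int, k: int) -> List[Tuple[int, int]]:
    """Split a storage area of length M into k consecutive partitions.

    Each partition has length floor(M / k), following paragraph [0052].
    Returns half-open (start, end) index ranges.
    Assumption: leftover positions (M mod k) are not assigned to any partition.
    """
    if k < 1 or M < k:
        raise ValueError("need k >= 1 and M >= k")
    length = M // k
    return [(i * length, (i + 1) * length) for i in range(k)]


# Values from the embodiment: M = 24, k = 3.
bounds = partition_bounds(24, 3)
print(bounds)  # [(0, 8), (8, 16), (16, 24)]
```

With M = 24 and k = 3, partitions p_1, p_2, p_3 cover positions 0 to 7, 8 to 15, and 16 to 23 respectively.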

[0054] S2. Select a hash function with high requirements within the range of the storage area, perform the hash calculation on the j-th data item d_j of the first data set D1, and obtain hash ...


Abstract

The invention discloses a method for implementing a repeated data deletion (deduplication) technology based on a single-hash averaging Bloom filter. First, a hash function with high requirements is applied within the partition range. Then k hash mappings are generated through k hash functions, each of which is a modulo operation of extremely low computational cost, and the results are scaled and mapped onto partitions of equal size. A single-hash averaging Bloom filter is computed from the stored data and saved. For new data, a new single-hash averaging Bloom filter is generated; if its mapped blocks do not coincide with those already stored, the new data is proved not to exist. With this method, data that may be duplicated can be filtered quickly and effectively.
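The flow described in the abstract can be sketched in Python under several assumptions that the source does not confirm: SHA-256 stands in for the single "high-requirement" hash, the k cheap hash functions are modelled as modulo operations with hypothetical moduli (11, 13, 17), and "scaling" is modelled as a proportional mapping of each remainder into a partition of equal length. The class and method names are illustrative.

```python
import hashlib
from typing import Iterator, List


class SingleHashFilter:
    """Illustrative single-hash, equal-partition filter (not the patent's exact scheme)."""

    def __init__(self, M: int, moduli: List[int]) -> None:
        self.k = len(moduli)               # number of partitions
        self.length = M // self.k          # equal partition length
        self.moduli = moduli               # hypothetical: one modulus per partition
        self.bits = bytearray(self.k * self.length)  # one byte per bit

    def _base_hash(self, item: bytes) -> int:
        # The only expensive hash per item; SHA-256 is an assumed stand-in.
        return int.from_bytes(hashlib.sha256(item).digest()[:8], "big")

    def positions(self, item: bytes) -> Iterator[int]:
        h = self._base_hash(item)
        for i, m in enumerate(self.moduli):
            r = h % m                      # cheap modulo "hash" number i
            offset = r * self.length // m  # scale [0, m) into [0, length)
            yield i * self.length + offset # position inside partition i

    def add(self, item: bytes) -> None:
        for pos in self.positions(item):
            self.bits[pos] = 1

    def might_contain(self, item: bytes) -> bool:
        # False proves the item was never stored; True means "possibly stored".
        return all(self.bits[pos] for pos in self.positions(item))


# Build the filter from the stored data set, then screen the queried data set.
stored = SingleHashFilter(M=24, moduli=[11, 13, 17])
for item in [b"x1"]:
    stored.add(item)

queries = [b"x1", b"x2"]
definitely_new = [q for q in queries if not stored.might_contain(q)]
```

Items in `definitely_new` are guaranteed not to be duplicates; any other queried item may be a duplicate and would need an exact comparison, since the filter can return false positives.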

Description

Technical field

[0001] The present invention relates to the field of computer technology, in particular to a method for implementing a deduplication technology based on a single-hash uniform distribution filter.

Background technique

[0002] Network applications today often need to screen large amounts of data and review qualifications, for example in data deduplication. Adding a filter structure is usually a good solution, and the Bloom filter is one of the most commonly used such structures. Although Bloom filters are widely used in network applications, the standard Bloom filter (SBF) places high requirements on its hash functions (mutual independence and good randomness) and has limited storage space. As a result, the Bloom filter consumes more resources, computing power is reduced, and cost-effectiveness is lower. Therefore, how to reduce the resources consumed by the Bloom filter and at the same time redu...


Application Information


Patent Type & Authority: Applications (China)

IPC(8): G06F16/215, G06F16/22

CPC: G06F16/215, G06F16/2255

Inventor: 齐德昱, 俞快

Owner: SOUTH CHINA UNIV OF TECH
