Rapid data de-duplication method adapted to big data application

A deduplication technology for big data, applied in special data processing applications, redundancy in operations, data error detection, response error generation, and related fields. It addresses the low deduplication rate of existing methods and their inability to adapt to complex, changeable application environments, including big data environments, and reduces the backup window and storage overhead.

Active Publication Date: 2013-09-25

和宇健康科技股份有限公司

3 Cites, 17 Cited by


AI Technical Summary

Problems solved by technology

The sliding-window, content-based variable-length chunking approach can effectively detect redundant data in data that is frequently modified. However, because fingerprint values are recalculated repeatedly as the window slides, its deduplication rate is low, and it is not suitable for big data application environments.
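The cost noted above, a fingerprint updated at every window position, can be seen in a minimal sketch of sliding-window chunking. This is a generic illustration and not the patented method; the function name `cdc_chunks`, the window size, the mask, and the size limits are all assumed values chosen for the example.

```python
WINDOW = 16           # sliding window length in bytes (assumed value)
MASK = (1 << 6) - 1   # cut when the low 6 hash bits are zero (assumed value)
BASE = 257            # polynomial base for the rolling hash
MOD = (1 << 31) - 1   # modulus for the rolling hash


def cdc_chunks(data, min_size=32, max_size=1024):
    """Split data at content-defined boundaries with a polynomial rolling hash.

    The hash is updated once per input byte, which is the per-position
    fingerprint work that limits throughput on large inputs.
    """
    chunks = []
    start = 0                          # first byte of the current chunk
    h = 0                              # rolling hash of the current window
    top = pow(BASE, WINDOW - 1, MOD)   # weight of the byte leaving the window
    for i, b in enumerate(data):
        pos = i - start
        if pos >= WINDOW:
            # Drop the oldest byte so the hash covers only the last WINDOW bytes.
            h = (h - data[i - WINDOW] * top) % MOD
        h = (h * BASE + b) % MOD
        size = pos + 1
        if (size >= min_size and (h & MASK) == 0) or size >= max_size:
            chunks.append(data[start:i + 1])
            start = i + 1
            h = 0
    if start < len(data):
        chunks.append(data[start:])    # trailing chunk may be shorter than min_size
    return chunks


if __name__ == "__main__":
    sample = bytes(range(256)) * 20
    parts = cdc_chunks(sample)
    print(len(parts), "chunks,", sum(len(p) for p in parts), "bytes")
```

Because a boundary depends only on the bytes inside the window, an edit near the start of a file tends to shift boundaries only locally, which is why this family of algorithms handles modified data well despite the per-byte hashing cost.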

[0004] In summary, the above deduplication methods have their own limitations, and a single deduplication method cannot effectively adapt to complex and changeable application environments.

Method used


Embodiment Construction

[0037] The method of the present invention will be described in detail below with reference to the accompanying drawings.

[0038] As shown in Figures 1 and 2, the present invention deduplicates the redundant data that arises during backup. Considering the impact of existing deduplication methods on the backup window in big data applications, and their limited scope of application, the method combines the advantages of the variable-length chunking and fixed-length chunking algorithms. It uses a deduplication factor and an acceleration factor to greatly improve the deduplication rate while preserving the deduplication ratio. The overall idea of the method is shown in Figure 1.

[0039] Data deduplication is suitable for application environments with a large amount of redundant data, such as backup systems, e-mail systems, data migration, and disaster recovery. In these application environments, a high deduplication rate can be ach...
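To make the effect of redundancy concrete, here is a minimal sketch of fixed-length block deduplication over a backup stream. It is an illustration under stated assumptions, not the method of the invention: the names `dedup_fixed`, `restore`, and `dedup_ratio`, the 4096-byte block size, and the use of SHA-256 as the fingerprint are all choices made for this example, and the ratio is taken to mean original size divided by stored size.

```python
import hashlib


def dedup_fixed(data, block_size=4096):
    """Split data into fixed-length blocks and keep one copy per fingerprint."""
    store = {}    # fingerprint -> block bytes (unique blocks only)
    recipe = []   # ordered fingerprints needed to rebuild the original data
    for off in range(0, len(data), block_size):
        block = data[off:off + block_size]
        fp = hashlib.sha256(block).hexdigest()
        store.setdefault(fp, block)   # store the block only if it is new
        recipe.append(fp)
    return store, recipe


def restore(store, recipe):
    """Rebuild the original data from the block store and the recipe."""
    return b"".join(store[fp] for fp in recipe)


def dedup_ratio(data, store):
    """Original size divided by stored size (assumed definition of the ratio)."""
    stored = sum(len(block) for block in store.values())
    return len(data) / stored if stored else 1.0


if __name__ == "__main__":
    backup = b"A" * 4096 * 3 + b"B" * 4096   # three identical blocks plus one
    store, recipe = dedup_fixed(backup)
    print(len(store), "unique blocks, ratio", dedup_ratio(backup, store))
```

Fixed-length blocks are cheap to fingerprint because each byte is hashed once, but a single inserted byte shifts every later block boundary, which is the weakness that content-based variable-length chunking addresses.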


Abstract

The invention provides a rapid data deduplication method for big data applications. The method is applied to backup deduplication systems in big data applications and solves the problems that the existing content-based variable-length chunking algorithm has a low deduplication rate and cannot identify redundant data quickly. By adjusting a deduplication factor and an acceleration factor during chunking, the method substantially improves the deduplication rate while preserving the deduplication ratio, performs deduplication detection quickly, balances the trade-off between deduplication ratio and deduplication rate, shortens the backup window, and saves network bandwidth and storage space.

Description

Technical field

[0001] The invention belongs to the technical field of computer information storage, and in particular relates to a fast data deduplication method suitable for big data applications.

Background

[0002] In the information age, data is growing explosively and the era of big data is arriving. Big data is characterized by huge data volume, varied types, low value density, and fast generation speed. In the era of big data, a large amount of redundant data arises during data backup and storage. How to eliminate duplicate data during backup, and so reduce storage space and network bandwidth consumption, has become a hot research topic in the storage field.

[0003] The most effective way to eliminate redundant data in the backup process is to use data deduplication technology. It is generally believed that deduplication technology includes file-level full-file deduplication technology, block-level fi...
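As a small illustration of the file-level, full-file approach named in [0003], the sketch below groups files whose entire contents share one fingerprint. The function name `find_duplicate_files`, the in-memory mapping of file names to bytes, and the choice of SHA-256 are assumptions made for this example only.

```python
import hashlib
from collections import defaultdict


def find_duplicate_files(files):
    """Group file names whose full contents have the same SHA-256 digest.

    `files` maps a file name to its content as bytes (hypothetical input shape).
    """
    by_digest = defaultdict(list)
    for name, content in files.items():
        by_digest[hashlib.sha256(content).hexdigest()].append(name)
    # Only digests shared by more than one file indicate redundant copies.
    return [sorted(names) for names in by_digest.values() if len(names) > 1]


if __name__ == "__main__":
    sample = {"a.txt": b"hello", "b.txt": b"hello", "c.txt": b"hellp"}
    print(find_duplicate_files(sample))
```

Whole-file fingerprints are the cheapest to compute and index, but a one-byte change makes two files look completely different, so this level finds only exact copies.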

Claims


Application Information


IPC(8): G06F17/30, G06F11/14

Inventor: 张兴军, 朱国峰, 董小社, 朱跃光, 王龙翔, 姜晓夏

Owner: 和宇健康科技股份有限公司
