Big-data-oriented cloud disaster tolerant backup method

A disaster recovery backup and big data technology, applied in the field of data error detection and error-response generation. It addresses problems such as the limited throughput of big data storage systems, failure to meet real-time system requirements, and increased client load, and achieves the effects of enhanced remote disaster recovery, reduced disaster recovery cost, and a lower risk of data leakage.

Active Publication Date: 2015-09-23
SOUTH CHINA UNIV OF TECH

AI Technical Summary

Problems solved by technology

One approach deduplicates the backup database files in the resource pool, but as the database files grow, this differential-deletion method also causes system performance bottlenecks.
On the other hand, client-side compressed storage is used to relieve the high load on the storage server: the client typically runs a data deduplication program on the input file to generate split data blocks and their corresponding fingerprint values, then issues a query request for each fingerprint value; a distribution server records the storage location of each split data block and forwards the query request to the corresponding duplicate-data processing device according to the fingerprint value; the duplicate-data processing device judges whether the fingerprint value already exists, and if not, stores the new split data block on the storage server under the new fingerprint value. Such operations, however, usually increase the load on the client.
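
As a rough illustration of this client-side flow (not the patent's exact implementation), the sketch below splits a file into fixed-size blocks, fingerprints each one, and routes fingerprint queries to partitioned duplicate-data processors. The block size, the SHA-256 hash, and names such as DistributionServer and store_on_storage_server are assumptions made for illustration only.

```python
import hashlib

CHUNK_SIZE = 4 * 1024 * 1024  # assumed split size; the patent does not fix one

def split_and_fingerprint(path):
    """Client side: split the input file and compute a fingerprint per block."""
    with open(path, "rb") as f:
        while block := f.read(CHUNK_SIZE):
            yield hashlib.sha256(block).hexdigest(), block

def store_on_storage_server(block):
    # Stub: a real system would write the block to a storage server
    # and return its address there.
    return "blob:" + hashlib.sha256(block).hexdigest()[:12]

class DuplicateProcessor:
    """One duplicate-data processing device, owning a slice of fingerprint space."""
    def __init__(self):
        self.known = {}                      # fingerprint -> storage location

    def handle(self, fp, block):
        if fp not in self.known:             # unseen fingerprint: store the new block
            self.known[fp] = store_on_storage_server(block)
        return self.known[fp]

class DistributionServer:
    """Forwards each fingerprint query to the responsible processor and
    records where every split block is stored."""
    def __init__(self, processors):
        self.processors = processors
        self.locations = {}                  # fingerprint -> storage location

    def query(self, fp, block):
        proc = self.processors[int(fp[:8], 16) % len(self.processors)]
        self.locations[fp] = proc.handle(fp, block)
        return self.locations[fp]
```

A client would iterate over split_and_fingerprint(path) and call query(fp, block) for each pair; a duplicate block resolves to its existing location without being stored again, which is precisely the extra hashing work that shifts load onto the client.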
Practice has shown that data in big data storage systems has different access heat: the access volume and update rate of hot data far exceed those of older, cold data. Distinguishing data by heat inevitably involves a large amount of data-block segmentation and reassembly, while the I/O performance of the storage media and the bandwidth of the storage network usually limit the throughput of the big data storage system.
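
To make the hot/cold distinction concrete, here is a minimal access-heat counter, assuming a simple hit-count threshold; the text above does not specify how heat is measured, so the threshold and the Counter-based design are assumptions.

```python
from collections import Counter

class HeatTracker:
    """Counts accesses per block fingerprint. Blocks at or above the threshold
    are treated as hot (candidates for fast front-end caching); the rest are
    cold (candidates for centralized archival). The threshold is an assumption."""
    def __init__(self, hot_threshold=100):
        self.hits = Counter()
        self.hot_threshold = hot_threshold

    def record_access(self, fp):
        self.hits[fp] += 1

    def is_hot(self, fp):
        return self.hits[fp] >= self.hot_threshold
```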
[0005] Current disaster recovery backup systems usually use HDFS on a private cloud as the platform and MapReduce tasks to split the data, combined with content-aware deduplication, or they store data directly in the public cloud and rely on the public cloud's deduplication technology and multi-copy remote disaster recovery strategy. These approaches are suitable only for offline storage backup services and usually cannot meet current real-time system requirements.

Embodiment Construction

[0028] The present invention will be further described in detail below in conjunction with the embodiments and the accompanying drawings, but the embodiments of the present invention are not limited thereto.

[0029] The present invention uses content-aware data deduplication to perform distributed deduplication. After the server of the cloud storage network completes a disaster recovery backup for a client of the production system, it reads and extracts the metadata of the data objects in the backup set and stores it in the cache nodes of the cloud storage network. When new metadata arrives, the metadata array spaces of the old and new versions are compared; if metadata of the same version is found, the corresponding data objects are further compared byte by byte, so that changed data is found even when the metadata versions are identical. If a data object is a duplicate, a pointer is assigned to it and the duplicate object is deleted. Th...
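
A minimal sketch of the comparison logic in [0029], under some assumptions: metadata is an (object id, version) pair, "assigning a pointer" means mapping the duplicate to the retained copy's fingerprint, and load_object fetches a stored object by fingerprint. All of these names are illustrative, not the patent's.

```python
import hashlib

def load_object(fingerprint, store):
    # Hypothetical fetch of a previously stored object from the cloud storage network.
    return store[fingerprint]

def dedup_backup_set(cached_meta, objects, store):
    """cached_meta: {object_id: (version, fingerprint)} kept on the cache nodes.
    objects: iterable of (object_id, version, data) from the new backup set.
    Returns pointers assigned to duplicates and the objects that must be stored."""
    pointers, to_store = {}, []
    for obj_id, version, data in objects:
        prev = cached_meta.get(obj_id)
        if prev and prev[0] == version:
            # Same metadata version: compare byte by byte, since changed data
            # can hide behind an identical version tag.
            if data == load_object(prev[1], store):
                pointers[obj_id] = prev[1]   # duplicate: point at retained copy
                continue                     # and drop the duplicate object
        fp = hashlib.sha256(data).hexdigest()
        cached_meta[obj_id] = (version, fp)
        store[fp] = data
        to_store.append((fp, data))
    return pointers, to_store
```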

Abstract

The invention discloses a big-data-oriented cloud disaster tolerant backup method, which comprises the following steps: building file-block hash fingerprints and snapshot pointers to realize compressed storage backup of different versions of a file, while transmitting the file-block fingerprints to a private cloud storage system; building a file-block fingerprint index database on the private cloud; comparing hash fingerprints through a MapReduce task to perform primary deduplication on transmitted blocks; performing fine-grained, content-based secondary blocking and hashing on each data block; calculating the similarity matrix and block-pointer distribution of the data blocks through another MapReduce subtask; counting the access heat of each data block; caching the fingerprint index database and hot data in the storage front end; storing cold data and archived backup data centrally; building version snapshots; and regularly backing up the data to a public cloud storage system. By caching the fingerprint database and hot data, the method solves problems such as the poor real-time performance of data deduplication in conventional disaster tolerant backup.
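
To illustrate the first step of the abstract (block hash fingerprints plus snapshot pointers for compressed versioned backup), here is a minimal sketch; the fixed block size and the SHA-256 hash are assumptions, as the abstract names neither.

```python
import hashlib

CHUNK = 4 * 1024 * 1024  # assumed block size

def snapshot(data, store):
    """Build a version snapshot: an ordered list of block fingerprints.
    Blocks already present in `store` are shared with earlier versions, so
    only changed blocks consume new space (compressed storage of versions)."""
    pointers = []
    for i in range(0, len(data), CHUNK):
        block = data[i:i + CHUNK]
        fp = hashlib.sha256(block).hexdigest()
        store.setdefault(fp, block)   # one physical copy per fingerprint
        pointers.append(fp)
    return pointers
```

Successive versions then differ only in the pointers that changed, and transmitting just the fingerprints to the private cloud lets the MapReduce comparison run without moving unchanged blocks.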

Description

Technical Field

[0001] The invention relates to the field of data backup, in particular to a big-data-oriented cloud disaster recovery backup method.

Background

[0002] In the past, data protection solutions were based on data deduplication in stand-alone devices, but data storage and backup networks are evolving toward large-scale distributed storage networks, in which multiple storage and data processing devices are connected by high-speed communication lines to provide cloud storage and highly available services. Disaster recovery backup of massive heterogeneous data usually uses a distributed cloud storage network: a backup set is stored on different devices in the form of data blocks, which shares the load across devices and improves data fault tolerance. However, the same data block may be stored repeatedly on different devices, so a large amount of redundant data accumulates in the cloud storage network...

Application Information

IPC(8): G06F11/14
Inventors: 林伟伟, 张子龙, 钟坯平
Owner: SOUTH CHINA UNIV OF TECH