Data backup method and system based on a submodular model

A data backup technique based on a submodular model, applied to redundancy in computing for data error detection, electrical digital data processing, and response error generation. It reduces the number of container reads, lowers disk read and write overhead, and increases the data deduplication rate.

Active Publication Date: 2017-08-04
HUAZHONG UNIV OF SCI & TECH

AI Technical Summary

Problems solved by technology

However, when the selected containers hold many identical data blocks, once one of those blocks is chosen to deduplicate and restore every block with the same content in the data stream, the blocks with the same content in the other containers will never be referenced and become redundant data blocks. These blocks are therefore not referenced data blocks and cannot be counted in container u


Embodiment Construction

[0035] To make the objectives, technical solution, and advantages of the present invention clearer, the invention is described in further detail below with reference to the accompanying drawings and embodiments. The specific embodiments described here serve only to explain the invention, not to limit it. In addition, the technical features of the various embodiments described below can be combined with one another as long as they do not conflict.

[0036] As shown in Figure 1, the method of the invention comprises the following steps:

[0037] (1) Data preprocessing: divide the data stream to be backed up into multiple data blocks, group the blocks into fixed-size data segments, and use the data segment as the basic unit for deduplication and recovery operations;

[0038] (2) Generate a total collection of containers: obtain the referenced containe...
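The preprocessing in step (1) can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the source does not say how blocks are cut or which hash is used, so fixed-size chunking, SHA-1 fingerprints, and the names `split_into_blocks`, `group_into_segments`, and `fingerprint` are all hypothetical choices made for the example.

```python
import hashlib
from typing import List


def split_into_blocks(stream: bytes, block_size: int = 4096) -> List[bytes]:
    """Cut the stream into blocks.

    Assumption: fixed-size chunking. The source does not specify the
    chunking scheme; content-defined chunking would also fit step (1).
    """
    return [stream[i:i + block_size] for i in range(0, len(stream), block_size)]


def group_into_segments(blocks: List[bytes], segment_bytes: int) -> List[List[bytes]]:
    """Group consecutive blocks into segments of roughly segment_bytes each.

    A segment is closed as soon as it reaches the target size; the last
    segment may be smaller.
    """
    segments: List[List[bytes]] = []
    current: List[bytes] = []
    size = 0
    for block in blocks:
        current.append(block)
        size += len(block)
        if size >= segment_bytes:
            segments.append(current)
            current, size = [], 0
    if current:
        segments.append(current)
    return segments


def fingerprint(block: bytes) -> str:
    """Content fingerprint of a block (assumption: SHA-1 hex digest)."""
    return hashlib.sha1(block).hexdigest()
```

With a 16 KiB stream, 1 KiB blocks, and 4 KiB segments, this yields 16 blocks in 4 segments; blocks with identical content share one fingerprint, which is what later lets the fingerprint index detect duplicates.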


Abstract

The invention discloses a data backup method and system based on a submodular model, belonging to the technical field of computer storage. The method comprises: dividing the data stream into blocks and forming data segments; computing a fingerprint for each data block of a segment, querying the fingerprint index, and obtaining information on the containers referenced by the duplicate data blocks; establishing a submodular function maximization model to select the containers that hold the most referenceable, non-redundant data blocks; and deduplicating the data blocks that reference the selected containers while rewriting the fragmented data blocks that reference other containers. The invention further provides a data backup system based on the submodular model. With this technical scheme, more duplicate data blocks can be removed, the restore bandwidth consumed by redundant and unreferenced data blocks is reduced, and restore performance is improved while a high deduplication rate is maintained.
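The container selection described in the abstract can be illustrated with a small sketch. It assumes a greedy heuristic, which is a standard way to approximately maximize a monotone submodular coverage function under a cardinality limit; the source does not state which algorithm the patent uses. The names `greedy_select`, `plan_segment`, and the `budget` parameter are hypothetical.

```python
from typing import Dict, List, Set, Tuple


def greedy_select(containers: Dict[str, Set[str]],
                  needed: Set[str],
                  budget: int) -> List[str]:
    """Pick up to `budget` containers by largest marginal gain.

    containers: container id -> fingerprints stored in that container.
    needed:     fingerprints of duplicate blocks in the current segment.
    A container's gain counts only needed fingerprints that no previously
    selected container already covers, so redundant copies add nothing.
    """
    selected: List[str] = []
    covered: Set[str] = set()
    for _ in range(budget):
        best_id, best_gain = None, 0
        for cid in sorted(containers):  # sorted for deterministic ties
            if cid in selected:
                continue
            gain = len((containers[cid] & needed) - covered)
            if gain > best_gain:
                best_id, best_gain = cid, gain
        if best_id is None:  # no container adds anything new
            break
        selected.append(best_id)
        covered |= containers[best_id] & needed
    return selected


def plan_segment(segment_fps: List[str],
                 containers: Dict[str, Set[str]],
                 selected: List[str]) -> Tuple[List[str], List[str]]:
    """Split a segment into blocks to deduplicate and blocks to rewrite."""
    available: Set[str] = set()
    for cid in selected:
        available |= containers[cid]
    dedup = [fp for fp in segment_fps if fp in available]
    rewrite = [fp for fp in segment_fps if fp not in available]
    return dedup, rewrite
```

For example, with containers `c1 = {a, b, c}`, `c2 = {a, b}`, `c3 = {d}`, `c4 = {e, x}`, a segment needing `a` through `e`, and a budget of 2, the sketch picks `c1` and then `c3`. Container `c2` is skipped even though it holds two needed blocks, because both are already covered by `c1`; block `e` is rewritten.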

Description

Technical field

[0001] The invention belongs to the technical field of computer storage, and more specifically relates to a data backup method and system based on a submodular model.

Background

[0002] In a backup system, periodic backups consume a large amount of storage space, and successive backups contain a great deal of redundant data, so data deduplication is commonly used to eliminate this redundancy and reduce storage overhead.

[0003] However, in a deduplication-based backup system, the new and old versions of a backup share data blocks, so an originally logically contiguous data stream is scattered across different containers (a container is the basic unit of disk reads and writes in a deduplication system). This creates a large number of data fragments, which seriously reduces restore performance. The reason is that, firstly, the dis...
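The effect of fragmentation on restore cost can be illustrated with a toy model. This is a sketch under a simplifying assumption that is not in the source: the restore process caches only the most recently read container, so every change of container costs one container read. The function name `container_reads` and the sample data are hypothetical.

```python
from typing import Dict, List


def container_reads(recipe: List[str], location: Dict[str, int]) -> int:
    """Count container reads needed to restore blocks in recipe order.

    recipe:   fingerprints in the logical order of the backup stream.
    location: fingerprint -> id of the container that stores the block.
    Assumption: a one-container cache, so a read is counted whenever the
    next block lives in a different container than the previous one.
    """
    reads = 0
    cached = None
    for fp in recipe:
        cid = location[fp]
        if cid != cached:
            reads += 1
            cached = cid
    return reads


recipe = ["a", "b", "c", "d"]
contiguous = {"a": 0, "b": 0, "c": 0, "d": 0}   # all blocks in one container
fragmented = {"a": 0, "b": 5, "c": 0, "d": 7}   # blocks shared with old backups
```

Under this model the contiguous layout needs a single container read for the four blocks, while the fragmented layout needs four, one per block.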


Application Information

IPC(8): G06F11/14
CPC: G06F11/1453, G06F11/1469
Inventor: Hua Yu (华宇), Wu Jie (吴婕), Feng Dan (冯丹), Zuo Pengfei (左鹏飞)
Owner: HUAZHONG UNIV OF SCI & TECH