Method and system for data deduplication in cloud backup process

A data deduplication technology for cloud backup, applied in the field of data processing, that addresses the low deduplication efficiency of the prior art and improves deduplication efficiency.

Active Publication Date: 2016-02-03
ZHEJIANG GONGSHANG UNIVERSITY

AI Technical Summary

Problems solved by technology

[0004] The purpose of the embodiments of the present invention is to provide a method and system for data deduplication in the cloud backup process, so as to solve the problem of low deduplication efficiency in the prior art.



Examples


Embodiment 1

[0023] Figure 1 is a flow chart of the data deduplication method in the cloud backup process provided by the embodiment of the present invention. The method includes the following steps:

[0024] In step S101, the cloud backup client performs classification processing on the data to be backed up.

[0025] In the embodiment of the present invention, the cloud backup client first classifies the data to be backed up into one or more of the following categories:

[0026] 1. FSCF (Fixed-Size Chunk File, i.e., fixed-length block file): a file whose content is formed at one time, changes little, and contains internal redundancy. This type of file includes system image files, virtual machine files, etc.;

[0027] 2. DSCF (Dynamic-Size Chunk File, i.e., dynamic-length block file): a file whose content changes frequently and contains internal redundancy. Such files include Word files, report files,...
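The classification step can be illustrated with a minimal sketch. It assumes that the category is decided by file extension, which the text above does not state; the extension sets, the `classify` function, and the `OTHER` fallback are hypothetical and stand in for whatever rule and further categories the full description defines.

```python
from pathlib import PurePath

# Hypothetical extension lists: the source names only kinds of files
# (system images, virtual machine files, Word files, report files),
# not concrete extensions.
FSCF_EXTENSIONS = {".iso", ".img", ".vmdk", ".vdi"}
DSCF_EXTENSIONS = {".doc", ".docx", ".xls", ".xlsx"}


def classify(filename: str) -> str:
    """Return the assumed category label for a file to be backed up."""
    suffix = PurePath(filename).suffix.lower()
    if suffix in FSCF_EXTENSIONS:
        return "FSCF"  # content formed at one time, changes little
    if suffix in DSCF_EXTENSIONS:
        return "DSCF"  # content changes frequently
    # Placeholder for categories not listed in this excerpt.
    return "OTHER"


if __name__ == "__main__":
    for name in ("disk.VMDK", "report.docx", "notes.bin"):
        print(name, classify(name))
```

The label returned here would then select which slicing algorithm and which sub-database the client uses for the file.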

Embodiment 2

[0058] Figure 5 is a structure diagram of the data deduplication system in the cloud backup process provided by the embodiment of the present invention. For convenience of description, only the parts related to the embodiment of the present invention are shown, including:

[0059] The cloud backup client 501 is configured to classify the data to be backed up, slice the classified data using a preset slicing algorithm, store the fingerprint information of the sliced data in the sub-database and the main database, and send the fingerprint information to the cloud backup server 502. The sub-database is established according to the type of the data to be backed up.

[0060] The cloud backup server 502 is configured to receive the fingerprint information sent by the cloud backup client 501, and perform a global search on the local database of the cloud backup server according to the ...
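The interaction between client 501 and server 502 can be sketched as follows. This is an illustration under several assumptions that the text does not confirm: fixed-size slicing stands in for the preset slicing algorithm, SHA-256 stands in for the fingerprint function, the client consults the category's sub-database before the main database, and the server's subsequent processing is to request only the chunks it does not yet hold. All class and function names are hypothetical.

```python
import hashlib
from collections import defaultdict

CHUNK_SIZE = 4  # deliberately tiny so the demo data splits into several chunks


def slice_fixed(data: bytes, size: int = CHUNK_SIZE) -> list[bytes]:
    """Fixed-size slicing, used here as a stand-in slicing algorithm."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def fingerprint(chunk: bytes) -> str:
    """SHA-256 digest as an assumed fingerprint function."""
    return hashlib.sha256(chunk).hexdigest()


class BackupClient:
    def __init__(self) -> None:
        self.sub_dbs = defaultdict(set)  # one fingerprint set per data category
        self.main_db = set()             # fingerprints across all categories

    def new_chunks(self, category: str, data: bytes) -> dict[str, bytes]:
        """Return fingerprint -> chunk for chunks not yet seen by this client."""
        fresh = {}
        for chunk in slice_fixed(data):
            fp = fingerprint(chunk)
            # Assumed lookup order: category sub-database first, then main.
            if fp in self.sub_dbs[category] or fp in self.main_db:
                continue
            self.sub_dbs[category].add(fp)
            self.main_db.add(fp)
            fresh[fp] = chunk
        return fresh


class BackupServer:
    def __init__(self) -> None:
        self.store = {}  # fingerprint -> chunk, the server's local database

    def missing(self, fingerprints) -> list[str]:
        """Global search: which fingerprints are unknown to the server?"""
        return [fp for fp in fingerprints if fp not in self.store]

    def upload(self, fp: str, chunk: bytes) -> None:
        self.store[fp] = chunk


def backup(client: BackupClient, server: BackupServer,
           category: str, data: bytes) -> int:
    """Run one backup round and return the number of chunks uploaded."""
    fresh = client.new_chunks(category, data)
    wanted = server.missing(fresh.keys())
    for fp in wanted:
        server.upload(fp, fresh[fp])
    return len(wanted)


if __name__ == "__main__":
    client, server = BackupClient(), BackupServer()
    print(backup(client, server, "FSCF", b"aaaabbbbaaaa"))
    print(backup(client, server, "DSCF", b"bbbbcccc"))
```

In this sketch a chunk repeated within one client never leaves the client, and a chunk already uploaded by another client is filtered out by the server-side search, so only fingerprints cross the network for duplicate data.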


Abstract

The invention is applicable to the field of data processing and provides a method for data deduplication in the cloud backup process. The method comprises the following steps: the cloud backup client classifies the data to be backed up; the cloud backup client slices the classified data using a preset slicing algorithm; the cloud backup client stores the fingerprint information of the sliced data in a sub-database and a main database, and sends the fingerprint information to a cloud backup server; and the cloud backup server performs a global search of its local database according to the fingerprint information and carries out subsequent processing according to the search result. The method provided by the invention improves data deduplication efficiency.

Description

Technical field

[0001] The invention belongs to the field of data processing, and in particular relates to a method and system for deduplicating data in a cloud backup process.

Background technique

[0002] With the arrival of the big data era, the amount of data in the information world has grown explosively, reaching the PB, EB, and even ZB scale. According to research, the global data volume will reach 40 ZB by 2020. As data grows, data management centers face more and more problems, and the consumption and maintenance of storage media are becoming increasingly difficult. Some ordinary small companies and individuals can no longer manage their data alone, so they are turning to cloud storage technology, which has attracted much attention in the current market. This technology can greatly reduce the data management costs of companies and individuals. At the same time, cloud storage technology also provides these companies ...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F11/14, G06F17/30
Inventor: 蒋晓宁, 赵文文, 甘志刚
Owner: ZHEJIANG GONGSHANG UNIVERSITY