System and method for deleting repeating data

A deduplication system and deletion method, applied in electrical digital data processing, special data processing applications, instruments, etc. It addresses the problem that the loss or corruption of a single data block affects the restoration of multiple files, improving reliability and reducing wasted storage space.

Inactive Publication Date: 2013-06-26
XIAN UNIV OF TECH

AI Technical Summary

Problems solved by technology

[0005] The purpose of the present invention is to provide a deduplication system that solves a problem in the prior art: if a data block is lost or corrupted, the restoration of multiple files is affected, so reliability is poor.




Embodiment Construction

[0044] The present invention will be described in further detail below in combination with specific embodiments and accompanying drawings.

[0045] The core function of data deduplication is to compare the data to be stored with the data already held in the storage system. If identical data exists, the data has already been saved: the incoming copy is filtered out and replaced by a pointer to the stored copy. Otherwise, the data is saved. By granularity, deduplication can be divided into file-level and data-block-level. Block-level deduplication works at a finer granularity and achieves a higher deduplication rate. The present invention adopts a block-level deduplication algorithm.
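
The compare-then-point behaviour described in [0045] can be illustrated with a minimal sketch. It is not the patented implementation: the `DedupStore` class, the use of SHA-256 as the fingerprint, the fixed 4-byte block size and the in-memory dictionaries are all assumptions chosen only to keep the example short.

```python
import hashlib

BLOCK_SIZE = 4  # hypothetical, unrealistically small block size for demonstration


class DedupStore:
    """Toy block-level deduplicating store (illustrative assumption, not the patent's design)."""

    def __init__(self):
        self.blocks = {}   # fingerprint -> block bytes; each unique block is kept once
        self.recipes = {}  # file name -> ordered fingerprints (the "pointers")

    def save(self, name, data):
        recipe = []
        for i in range(0, len(data), BLOCK_SIZE):
            block = data[i:i + BLOCK_SIZE]
            fp = hashlib.sha256(block).hexdigest()  # fingerprint of the block
            if fp not in self.blocks:               # new data: store it
                self.blocks[fp] = block
            recipe.append(fp)                       # duplicate or not, record a pointer
        self.recipes[name] = recipe

    def restore(self, name):
        # Follow the pointers in order to rebuild the original file.
        return b"".join(self.blocks[fp] for fp in self.recipes[name])


store = DedupStore()
store.save("a.txt", b"AAAABBBBAAAA")
store.save("b.txt", b"BBBBCCCC")
```

Five blocks are written across the two files, but only three distinct blocks are stored. The sketch also makes the reliability problem from [0005] visible: the single stored `BBBB` block is referenced by both files, so losing it would break the restoration of both.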

[0046] There are three main types of data chunking algorithms: fixed-size chunking, variable-length chunking, and sliding-block chunking. The fixed-size chunking algor...


Abstract

The invention discloses a system and a method for data deduplication. The deduplication system has a distributed structure and consists mainly of a client, a management server and storage node servers. The client receives user requests to save or restore files, and splits files into blocks or reassembles them. The management server is responsible for fingerprint comparison, maintenance of the fingerprint mapping database, erasure-code encoding and data compression. The storage node servers store the compressed data blocks. The client is connected to the management server through a local area network, and the management server is connected to the storage node servers through a local area network. Users save and restore files through the client. The file blocks are erasure-coded and compressed, and the compressed data blocks are distributed across different storage node servers. If some storage nodes fail, the data held on the remaining nodes can still be used to restore files. This improves the reliability of the deduplication system and reduces wasted storage space.
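
The recovery property described in the abstract can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say which erasure code is used, so a single XOR parity shard (which tolerates the loss of exactly one shard) stands in for it; `zlib` stands in for the unspecified compression; and the functions `encode` and `decode`, the 4-byte length header and the default of three data shards are hypothetical.

```python
import zlib
from functools import reduce


def xor_bytes(a, b):
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(block, k=3):
    """Split a block into k data shards plus one XOR parity shard, then compress each.

    Each returned element would be sent to a different storage node.
    """
    framed = len(block).to_bytes(4, "big") + block   # length header so padding can be stripped
    shard_len = -(-len(framed) // k)                 # ceiling division
    framed = framed.ljust(shard_len * k, b"\0")      # pad to a multiple of k
    shards = [framed[i * shard_len:(i + 1) * shard_len] for i in range(k)]
    parity = reduce(xor_bytes, shards)
    return [zlib.compress(s) for s in shards + [parity]]


def decode(stored):
    """Rebuild the block from stored shards; at most one entry may be None (a failed node)."""
    shards = [zlib.decompress(s) if s is not None else None for s in stored]
    missing = [i for i, s in enumerate(shards) if s is None]
    if len(missing) > 1:
        raise ValueError("a single parity shard tolerates only one lost shard")
    if missing:
        # XOR of all surviving shards equals the lost one (data or parity).
        shards[missing[0]] = reduce(xor_bytes, [s for s in shards if s is not None])
    framed = b"".join(shards[:-1])                   # drop the parity shard
    n = int.from_bytes(framed[:4], "big")
    return framed[4:4 + n]


data = b"duplicate data " * 20
nodes = encode(data)
nodes[1] = None            # simulate one failed storage node
recovered = decode(nodes)  # equals data
```

A production system would more likely use a code that survives several simultaneous failures, such as Reed-Solomon; the XOR parity here only demonstrates the principle that the surviving nodes hold enough information to rebuild a lost shard.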

Description

Technical field

[0001] The invention belongs to the field of deduplication technology, relates to the field of distributed storage technology, and in particular to a deduplication system based on data compression and erasure code technology; the invention also relates to a deletion method of the deduplication system.

Background technique

[0002] With the rapid development of global informatization, data centers in companies, enterprises and organizations are facing the challenges of increasing data volume and high-speed data growth. Research shows that the era of big data has come. Big data has four characteristics. The most notable feature is the huge volume of data. According to a report, the amount of data created and copied worldwide in 2011 exceeded 1.8ZB (1.8 trillion GB), a nine-fold increase in five years. Studies have found that up to 60% of the data stored in the enterprise is duplicated, and as time goes by, there will be more and more. The existence of a large a...


Application Information

Patent Type & Authority: Applications (China)
IPC: IPC(8): G06F17/30
Inventor: 王磊, 任振刚, 黑新宏, 高阔, 费蓉
Owner: XIAN UNIV OF TECH