Method for deleting duplicated data in file system in real time

A real-time data deduplication technology for file systems, applied in digital data processing, special data processing applications, instruments, etc. It addresses problems such as low chunking efficiency; its effects include improved efficiency, more effective duplicate removal, and lower computing and storage overhead for backup.

Status: Inactive
Publication Date: 2010-12-08
TSINGHUA UNIV
2 cites, 45 cited by

AI Technical Summary

Problems solved by technology

In the DataDomain system, each deduplication pass has to scan the entire disk, so deduplication can only be performed periodically and at a low frequency.
This non-real-time deduplication approach has two clear disadvantages. First, deduplication and data read/write operations are performed independently, at different stages. To support random reads and writes within files, the storage system must keep a complete copy of all data in the primary storage stage, so deduplication saves space only in the secondary storage stage of backup and archiving. Second, chunking is inefficient: even if only a small part of the data is modified, all data in the file, or even on the whole disk, has to be re-cut into chunks.

Examples

Embodiment Construction

[0043] The following describes in detail the real-time deduplication and transmission method of data in the file system proposed by the present invention in conjunction with the accompanying drawings:

[0044] (1) As shown in Figure 1, the method registers a file system driver module under the virtual file system layer of the operating system. The driver receives and responds to the operation commands that applications issue to the file system, and the file system's real-time deduplication management process is responsible for storing the metadata information and the data block contents in the storage device. Specifically, the method uses an embedded database in the storage device to store the file system's metadata. Three tables are set up in this embedded database: a file metadata table, a data block index table and a file composition table. The file metadata table records the metadata of each file in the file system, and this metadata includes file identificat...
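The three-table layout described above can be sketched with SQLite standing in for the embedded database. This is only an illustration under stated assumptions: the source text is cut off before it lists the table columns, so every column except the file identifier is hypothetical, as are the names `open_store`, `store_file` and `load_file`, the 4096-byte block size, the choice of SHA-256, and the idea of keeping block contents inline with a reference count. It also runs as an ordinary program, not as a file system driver.

```python
import hashlib
import sqlite3

BLOCK_SIZE = 4096  # assumed fixed block length, for illustration only

SCHEMA = """
CREATE TABLE file_metadata (
    file_id INTEGER PRIMARY KEY,     -- file identifier (named in the source)
    name    TEXT NOT NULL UNIQUE,    -- assumed column
    size    INTEGER NOT NULL         -- assumed column
);
CREATE TABLE block_index (
    block_hash TEXT PRIMARY KEY,     -- hash of the block content
    content    BLOB NOT NULL,        -- assumed: block stored inline
    ref_count  INTEGER NOT NULL      -- assumed: how many file slots use it
);
CREATE TABLE file_composition (
    file_id    INTEGER NOT NULL REFERENCES file_metadata(file_id),
    seq        INTEGER NOT NULL,     -- position of the block within the file
    block_hash TEXT NOT NULL REFERENCES block_index(block_hash),
    PRIMARY KEY (file_id, seq)
);
"""


def open_store():
    """Create an in-memory database holding the three tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def store_file(conn, name, data):
    """Split data into blocks and store each distinct block only once."""
    cur = conn.execute(
        "INSERT INTO file_metadata (name, size) VALUES (?, ?)", (name, len(data))
    )
    file_id = cur.lastrowid
    for seq, offset in enumerate(range(0, len(data), BLOCK_SIZE)):
        block = data[offset:offset + BLOCK_SIZE]
        digest = hashlib.sha256(block).hexdigest()
        # A known block only gains a reference; a new block is inserted.
        updated = conn.execute(
            "UPDATE block_index SET ref_count = ref_count + 1 WHERE block_hash = ?",
            (digest,),
        )
        if updated.rowcount == 0:
            conn.execute("INSERT INTO block_index VALUES (?, ?, 1)", (digest, block))
        conn.execute(
            "INSERT INTO file_composition VALUES (?, ?, ?)", (file_id, seq, digest)
        )
    conn.commit()
    return file_id


def load_file(conn, file_id):
    """Rebuild a file by following its composition rows in order."""
    rows = conn.execute(
        "SELECT b.content FROM file_composition c "
        "JOIN block_index b ON b.block_hash = c.block_hash "
        "WHERE c.file_id = ? ORDER BY c.seq",
        (file_id,),
    )
    return b"".join(row[0] for row in rows)
```

Storing two files that share a block leaves a single row for that block in `block_index`, while each file keeps its own ordered rows in `file_composition`.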

Abstract

The invention relates to a method for deleting duplicated data in a file system in real time, and belongs to the technical field of computer data storage. In the file system establishment stage, a file metadata table, a data block index table and a file composition table are set up in an embedded database. In the file system operation stage, a file system driver receives and responds to the operation commands that applications issue to the file system, which include creating a new file, writing data to an existing file, reading data from an existing file and deleting the existing data. The method supports both fixed-length and variable-length file chunking, and can delete duplicated data in the file system in real time, saving storage space and improving the utilization of storage equipment. The deduplication process is completely transparent to applications and users, remains seamlessly compatible with the file operations of conventional applications, and avoids almost all negative effects on the user experience.
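To illustrate the difference between the two chunking modes the abstract mentions, here is a minimal sketch. The abstract does not say how variable-length boundaries are chosen, so the content-defined rule below (cut wherever a rolling sum of the last 16 bytes has its low 6 bits equal to zero) is an assumption, and a toy one; the function names, chunk sizes and test data are likewise hypothetical.

```python
import hashlib
import random


def fixed_chunks(data, size=64):
    """Fixed-length chunking: cut every `size` bytes."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def variable_chunks(data, window=16, mask=0x3F):
    """Variable-length chunking: cut where a rolling sum of the last
    `window` bytes matches the mask (a toy content-defined rule)."""
    chunks, start, rolling = [], 0, 0
    for i, byte in enumerate(data):
        rolling += byte
        if i >= window:
            rolling -= data[i - window]  # drop the byte leaving the window
        if rolling & mask == 0:
            chunks.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])  # trailing bytes after the last boundary
    return chunks


def shared(old_chunks, new_chunks):
    """Count distinct chunks (by SHA-256) present in both versions."""
    old = {hashlib.sha256(c).digest() for c in old_chunks}
    new = {hashlib.sha256(c).digest() for c in new_chunks}
    return len(old & new)


# Pseudo-random test data, then the same data with one byte inserted in front.
rng = random.Random(0)
original = bytes(rng.getrandbits(8) for _ in range(20000))
edited = b"!" + original

fixed_shared = shared(fixed_chunks(original), fixed_chunks(edited))
variable_shared = shared(variable_chunks(original), variable_chunks(edited))
print("chunks reused after a 1-byte insert:", fixed_shared, "fixed,",
      variable_shared, "variable")
```

With fixed-length chunks, the inserted byte shifts every later boundary, so few or no chunks match the old version. With the content-defined rule, boundaries depend only on nearby bytes, so the chunks after the edit line up again and can be deduplicated against the original.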

Description

Technical field

[0001] The invention relates to a method for deleting duplicate data in a file system in real time, and belongs to the technical field of computer data storage.

Background art

[0002] With the rapid development of digital devices, human society is entering the digital age across the board, and the amount of data that needs to be stored is growing explosively. Against this background, how to store as much data as possible in as little space as possible, so as to reduce storage costs and improve the scalability of storage systems, has become the hottest issue in the storage field.

[0003] Data deduplication technology emerged at the beginning of this century and has been widely adopted in recent years. The basic idea of data deduplication can be summarized as follows: first, the files in the storage system are divided into several data blocks, and the hash value of the data block content is used to build an index for...
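The basic idea in [0003], splitting data into blocks and indexing them by content hash, can be sketched in a few lines. This is an illustration only: the block size, the use of SHA-256, and the names `deduplicate` and `restore` are assumptions, not details taken from the patent.

```python
import hashlib


def deduplicate(data, block_size=4096):
    """Split data into blocks and keep one copy of each distinct block.

    Returns (index, recipe): index maps a content hash to the block bytes,
    and recipe lists the hashes in the order needed to rebuild the data.
    """
    index = {}
    recipe = []
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        digest = hashlib.sha256(block).hexdigest()
        index.setdefault(digest, block)  # store the block only the first time
        recipe.append(digest)
    return index, recipe


def restore(index, recipe):
    """Rebuild the original data from the index and the ordered recipe."""
    return b"".join(index[digest] for digest in recipe)


# Three identical blocks followed by one different block.
data = b"A" * 4096 * 3 + b"B" * 4096
index, recipe = deduplicate(data)
stored_bytes = sum(len(block) for block in index.values())
```

Here the recipe still has four entries, but only two distinct blocks are stored, so `stored_bytes` is half of `len(data)` while `restore(index, recipe)` returns the original bytes.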

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30
Inventors: 唐力, 汪东升
Owner: TSINGHUA UNIV