Block-level data deduplication method based on NTFS file system

A file system and data backup technology, applied in the computer field, that addresses the problems of large user data volumes and oversized backup image files.

Active Publication Date: 2021-02-26
成都傲梅科技有限公司

AI Technical Summary

Problems solved by technology

[0003] The present invention provides a block-level data deduplication method based on the NTFS file system. It addresses the problem that user data volumes are now very large, so the image files generated by data backup are also very large, particularly when the same data is stored repeatedly.


Examples

Embodiment 1

[0039] As shown in Figure 1, a block-level data deduplication method based on the NTFS file system includes the following steps:

[0040] Step S1. Create a snapshot for the NTFS file system that needs to be backed up;

[0041] Step S2. Construct a bitmap from the snapshot;

[0042] Step S3. Calculate the granularity of the data block according to the size of the NTFS file system;

[0043] Step S4. Calculate the total number of blocks of the data blocks of the NTFS file system;

[0044] Step S5. Find the sector that needs to be backed up according to the bitmap data of the data block;

[0045] Step S6. Read the data of the used sector of the data block and calculate the checksum;

[0046] Step S7. Determine whether the checksum already exists. If it exists, record the index; if not, compress and encrypt the data, record the index, and write the data into the image file;

[0047] Step S8. Determine whether all the data blocks have been backed up; if all th...
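Steps S6 to S8 can be sketched as a loop over data blocks. This is a minimal illustration, not the patented implementation: the source only says "checksum", so SHA-256 is an assumption; `zlib` stands in for the unspecified compression; encryption is omitted; and the names `backup_blocks` and `restore_blocks` are hypothetical. A Python list stands in for the image file.

```python
import hashlib
import zlib


def backup_blocks(blocks):
    """Deduplicate an iterable of byte blocks.

    Returns (index, image). index[i] is the position in `image` of the
    payload for source block i; `image` holds each unique block once.
    """
    seen = {}    # checksum -> position in image
    image = []   # stand-in for the image file
    index = []   # one entry per source block
    for data in blocks:
        # S6: checksum of the block (SHA-256 is an assumption).
        checksum = hashlib.sha256(data).digest()
        # S7: only unseen blocks are compressed and written.
        pos = seen.get(checksum)
        if pos is None:
            pos = len(image)
            image.append(zlib.compress(data))  # encryption omitted here
            seen[checksum] = pos
        index.append(pos)
    # S8: the loop ends once every block has been processed.
    return index, image


def restore_blocks(index, image):
    """Rebuild the original block sequence from the index and image."""
    return [zlib.decompress(image[pos]) for pos in index]
```

With three blocks where the first and third are identical, the image stores two payloads and the index refers to the first payload twice.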

Embodiment 2

[0062] This embodiment gives a more specific technical solution for Embodiment 1.

[0063] First, use snapshot technology to create a snapshot of the NTFS file system to be backed up. Then read the snapshot to analyze the file system's used-cluster information, and construct the file system's used bitmap from it. Next, construct the bitmap of the excluded files from the file information in the exclusion list, and remove those files' bitmaps from the bitmap of the entire file system to generate the final bitmap of the data to be backed up.
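The bitmap exclusion described above amounts to clearing, in the used bitmap, every bit that is set in the excluded-files bitmap. The sketch below assumes one bit per cluster and uses a Python integer as the bitmap; the function names are hypothetical, and reading the cluster lists from a real NTFS snapshot is outside its scope.

```python
def clusters_to_bitmap(clusters):
    """Build an integer bitmap with bit c set for every cluster number c."""
    bitmap = 0
    for c in clusters:
        bitmap |= 1 << c
    return bitmap


def backup_bitmap(used, excluded):
    """Clusters that are in use and do not belong to an excluded file."""
    return used & ~excluded


def is_set(bitmap, cluster):
    """True if the given cluster is marked in the bitmap."""
    return (bitmap >> cluster) & 1 == 1
```

For example, if clusters 0, 1, 2 and 5 are in use and an excluded file occupies clusters 2 and 3, the final bitmap marks clusters 0, 1 and 5.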

[0064] Then divide the entire file system into blocks according to its total size. To do so, divide the number of bytes in the default block by the number of bytes per cluster to obtain the minimum number of clusters each block occupies, and check the calculation. Wh...
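The block-size arithmetic in this paragraph, together with steps S4 and S5, can be sketched as follows. The rounding rules are assumptions, since the source text is truncated before it describes the check; the 64 KiB default block size in the usage note is a hypothetical value, and the bitmap is modelled as a list of booleans, one per cluster.

```python
def clusters_per_block(default_block_bytes, bytes_per_cluster):
    """Minimum number of clusters per block (at least one, an assumption)."""
    if default_block_bytes <= 0 or bytes_per_cluster <= 0:
        raise ValueError("sizes must be positive")
    return max(1, default_block_bytes // bytes_per_cluster)


def total_blocks(total_clusters, cpb):
    """Number of blocks needed to cover the file system, rounding up."""
    return -(-total_clusters // cpb)


def used_clusters_in_block(bitmap, block_no, cpb):
    """Cluster numbers inside one block that the bitmap marks for backup."""
    start = block_no * cpb
    end = min(start + cpb, len(bitmap))
    return [c for c in range(start, end) if bitmap[c]]
```

With a 64 KiB default block and 4 KiB clusters, each block spans 16 clusters, so a file system of 1000 clusters is covered by 63 blocks, the last of which is only partly filled.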

Embodiment 3

[0098] Embodiment 3 is a further optimization of Embodiment 1 and Embodiment 2.

[0099] When the present invention is applied to a cloud storage system that stores massive data files, similar data blocks cannot be removed, resulting in huge image files generated during data backup. To solve this problem, this embodiment makes further optimizations. The scheme is as follows:

[0100] A method for deduplication of block-level data based on an NTFS file system for a cloud storage system, comprising the steps of:

[0101] Step A. Create a snapshot for the file system to be backed up;

[0102] Step B. Construct a bitmap from the snapshot; add a primary index node to the cloud storage system, which is used to obtain the data block fingerprints of files;

[0103] Step C. The primary index node constructs a two-level index, consisting of a primary index and a secondary index, according to the similarity of the files;

[0104] Step D. Deduplicating the ...
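One possible shape for the two-level index of steps B and C is sketched below. The source does not say how file similarity is measured and is truncated at step D, so everything here is an assumption for illustration: the similarity key is taken to be the smallest block fingerprint of a file, SHA-256 is used as the fingerprint, and the class `TwoLevelIndex` and its methods are hypothetical names.

```python
import hashlib


def fingerprint(block):
    """Fingerprint of one data block (SHA-256 is an assumption)."""
    return hashlib.sha256(block).hexdigest()


class TwoLevelIndex:
    """Primary index: similarity key -> secondary index.
    Secondary index: block fingerprint -> storage location."""

    def __init__(self):
        self.primary = {}
        self.next_location = 0

    def add_file(self, blocks):
        """Index a file's blocks; return (locations, number of new blocks)."""
        if not blocks:
            return [], 0
        fps = [fingerprint(b) for b in blocks]
        # Assumed similarity key: the smallest fingerprint in the file.
        key = min(fps)
        secondary = self.primary.setdefault(key, {})
        locations, new = [], 0
        for fp in fps:
            if fp not in secondary:
                secondary[fp] = self.next_location
                self.next_location += 1
                new += 1
            locations.append(secondary[fp])
        return locations, new
```

In this sketch a file is only compared against files that share its similarity key, which keeps each lookup small at the cost of missing duplicates between files that land under different keys.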


Abstract

The invention provides a block-level data deduplication method based on an NTFS file system. The method comprises the following steps: S1, creating a snapshot of the NTFS file system to be backed up; S2, constructing a bitmap from the snapshot; S3, calculating the granularity of the data blocks according to the size of the NTFS file system; S4, calculating the total number of data blocks of the NTFS file system; S5, finding the sectors to be backed up according to the bitmap data of each data block; S6, reading the data of the used sectors of the data block and calculating a checksum; S7, determining whether the checksum already exists; S8, determining whether all the data blocks have been backed up; and S9, if all the data blocks have been backed up, recording the indexes in the image file, which completes block-level data deduplication and the data backup. The method addresses the problem that user data volumes are currently very large, so that the image files generated by data backup are also very large, particularly when the same data is stored repeatedly.

Description

Technical field

[0001] The invention relates to the field of computers and the technical field of data backup, and in particular to a block-level data deduplication method based on an NTFS file system.

Background

[0002] With the rapid development of computer technology, the variety and volume of data keep growing, the demand for data storage keeps rising, and data security problems are becoming more common. Data backup is therefore particularly important. However, because the amount of user data is large, the image files generated for data backup are also quite large, especially when the same data is stored repeatedly. It is therefore necessary to provide a block-level data deduplication method based on the NTFS file system to overcome these problems.

Contents of the invention

[0003] The present invention provides a block-level data deduplication method based on the NTFS file system to solv...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/174, G06F16/13
CPC: G06F16/1748, G06F16/13, Y02D10/00
Inventor: 先泽强
Owner: 成都傲梅科技有限公司