Method and device for cloning NTFS volumes between disks
A cloning method and device in the field of disk technology, applicable to special data processing applications, instruments, and electronic digital data processing. It addresses the problem of low copy efficiency, and achieves the effects of improving copy efficiency, reducing system overhead, and preventing changes to file creation related data.
Active Publication Date: 2019-10-18
CHENGDU YIWO TECH DEV CO LTD
AI-Extracted Technical Summary
Problems solved by technology
With the development of storage technology, the sector size of disks has gradually changed from 512 bytes to 4K, which means th...
Abstract
The invention provides a method and device for cloning NTFS volumes between disks. The method includes: when cloning between different disks, determining the cluster size of the source volume and the cluster size of the target volume; if the cluster size of the source volume is smaller than the cluster size of the target volume, cloning the data stored in each cluster of the source volume into the clusters of the target volume; adjusting the data stored in each cluster of the target volume based on the file identifier corresponding to the data; and reconstructing each metafile corresponding to the target volume based on the data stored in each cluster of the target volume. Data is thus copied directly from the clusters of the source volume into the target volume, which saves the system overhead of frequently opening and closing files and improves copy efficiency.
Application Domain
Input/output to record carriers; File system administration
Technology Topic
Cluster size; Copying
Example Embodiment
[0066] In order to make the objectives, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention are described clearly and completely below in conjunction with the accompanying drawings. Obviously, the described embodiments are some, not all, of the embodiments of the present invention. Based on the embodiments of the present invention, all other embodiments obtained by those of ordinary skill in the art without creative work shall fall within the protection scope of the present invention.
[0067] Refer to Figure 1, which shows a flowchart of a method for cloning an NTFS volume between disks according to an embodiment of the present invention. The method may include the following steps:
[0068] 101: Determine the cluster size of the source volume and the cluster size of the target volume. The source volume is the known storage location of the data to be copied, and the target volume is the location where that data will be stored. The source volume and the target volume may use different minimum storage units, that is, they may store data in clusters of different sizes. For this reason, the cluster size of the source volume and the cluster size of the target volume need to be determined before cloning.
[0069] In this embodiment, the cluster size of the source volume and the cluster size of the target volume can each be expressed as the number of bytes in one cluster. For example, for a volume on a disk with a sector size of 512 bytes, the cluster size can be any of 512, 1K, 2K, 4K, 8K, 16K, 32K, and 64K; for a volume on a disk with a sector size of 4K, the cluster size can be any of 4K, 8K, 16K, 32K, and 64K.
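As an illustration of the sizes listed above, the following minimal Python sketch enumerates candidate cluster sizes for a given sector size. The function name `allowed_cluster_sizes` and the rule that a cluster is never smaller than one sector are assumptions made for this example; the embodiment itself only lists the resulting values.

```python
def allowed_cluster_sizes(sector_size: int) -> list:
    """Candidate cluster sizes in bytes, from 512 up to 64K (powers of two).

    Assumption for this sketch: a cluster cannot be smaller than one sector.
    """
    candidates = [512 << i for i in range(8)]  # 512, 1K, 2K, ..., 64K
    return [size for size in candidates if size >= sector_size]


print(allowed_cluster_sizes(512))   # eight sizes, 512 bytes through 64K
print(allowed_cluster_sizes(4096))  # five sizes, 4K through 64K
```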
[0070] The relationship between the cluster size of the source volume and the cluster size of the target volume falls into three cases: first, the cluster size of the source volume is smaller than the cluster size of the target volume; second, the cluster size of the source volume is larger than the cluster size of the target volume; third, the cluster size of the source volume is equal to the cluster size of the target volume.
[0071] When the source volume stores data, the data of different files may be interleaved across its clusters. If the cluster size of the source volume is equal to the cluster size of the target volume, the data obtained from a cluster of the source volume can be copied directly to a corresponding cluster of the target volume. If the cluster size of the source volume is greater than the cluster size of the target volume, a piece of data obtained from a cluster of the source volume can be stored in multiple clusters of the target volume, where the total size of those clusters equals the size of the piece of data. Therefore, whether the cluster size of the source volume is equal to or greater than the cluster size of the target volume, the data in each cluster of the source volume can be cloned into the clusters of the target volume in turn, and every cluster used in the target volume can be filled. However, if the cluster size of the source volume is smaller than the cluster size of the target volume, data obtained sequentially from the clusters of the source volume may belong to different files, and data belonging to different files cannot be stored in the same cluster of the target volume. In this case some clusters of the target volume cannot be filled, so the stored data needs to be adjusted after it is copied to the clusters of the target volume.
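The three cases reduce to two processing paths. The sketch below, with the hypothetical helper name `select_clone_steps`, returns the step numbers of Figure 1 that apply to a given pair of cluster sizes. It is only a reading aid for the flow, not an implementation of the steps.

```python
def select_clone_steps(source_cluster_size: int, target_cluster_size: int) -> list:
    """Return the step numbers of the flow that apply to this pair of sizes."""
    if source_cluster_size < target_cluster_size:
        # Copy, then adjust partially filled clusters, then rebuild metafiles.
        return [102, 103, 104]
    # Equal or larger source clusters: copy (splitting if larger), then rebuild.
    return [105, 106]


print(select_clone_steps(2048, 4096))
print(select_clone_steps(8192, 4096))
```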
[0072] Based on this, this embodiment describes the cloning process from the source volume to the target volume for two cases: the cluster size of the source volume is smaller than the cluster size of the target volume, and the cluster size of the source volume is equal to or larger than the cluster size of the target volume. The cloning process is described in the following steps.
[0073] 102: If the cluster size of the source volume is smaller than the cluster size of the target volume, clone the data stored in each cluster of the source volume to the clusters of the target volume.
[0074] One copy method is: starting from the first cluster of the source volume, obtain the data stored in each cluster of the source volume in sequence; then, based on the file identifier corresponding to the data stored in each cluster of the source volume, store the data obtained in sequence into the clusters of the target volume in order, so that data corresponding to different file identifiers is never stored in the same cluster of the target volume. Storing the data into the clusters of the target volume in order means that, during copying, the clusters used to store data in the target volume are contiguous, with no free cluster (a cluster that stores no data) among them.
[0075] In this embodiment, the file identifier corresponding to the data characterizes the file to which the data belongs, that is, it is used to determine which file the data belongs to. The data also characterizes the directory organization structure of the file to which it belongs, so that the directory organization structure of the file is copied as well. The file to which the data belongs is not a compressed file. Compressed files are handled separately because the way data is stored in a compressed file changes when the cluster size changes; copying a compressed file directly from the source volume cannot keep the original data corresponding to the compressed file valid. Therefore, compressed files cannot be copied in the manner of this embodiment.
[0076] Obtaining the data stored in each cluster of the source volume in sequence, starting from the first cluster, means that the data is read continuously from the first cluster onward and the corresponding file identifier is ignored during reading. In other words, regardless of whether the data stored in the clusters of the source volume corresponds to different file identifiers, this embodiment reads the clusters one after another starting from the first cluster. The prior art, by contrast, reads data by file identifier: only after all the data corresponding to one file identifier has been obtained can the data corresponding to another file identifier be obtained. Because the data of one file identifier may be stored discontiguously in the source volume, reading by file identifier requires jumping back and forth among the clusters of the source volume. This embodiment avoids such interleaved reads, which improves copy efficiency compared with the prior art.
[0077] Figure 2 shows an example of storing the data obtained in sequence from the clusters of the source volume into the clusters of the target volume in order. In Figure 2, the source volume is an NTFS volume with a cluster size of 2K, and the target volume is an NTFS volume with a cluster size of 4K. The clusters of the source volume store data corresponding to file 1 and file 2, where file 1 has seven data blocks, a to g, and file 2 also has seven data blocks, a to g. The data blocks of file 1 and file 2 are stored in the source volume as shown by number 1 in Figure 2, with each data block occupying one cluster of the source volume.
[0078] The data stored in each cluster of the source volume is copied to the clusters of the target volume in turn. The resulting distribution of data blocks in the clusters of the target volume is shown by number 2 in Figure 2: the clusters used in the target volume are adjacent starting from the first cluster, no free cluster appears among them, and the data blocks of file 1 and file 2 do not share a cluster in the target volume.
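The sequential copy of step 102 can be sketched as follows. This is a simplified model rather than the patented implementation: each source cluster is represented as a `(file_id, block_index)` tuple, `capacity` is the number of source clusters that fit in one target cluster, and the sample layout is invented for illustration (it is not the layout of Figure 2).

```python
def copy_sequential(source, capacity):
    """Read source clusters in order and pack them into target clusters.

    A target cluster only receives a block if it still has room and already
    holds data of the same file; otherwise a new target cluster is started.
    """
    target = []
    for block in source:
        file_id = block[0]
        last = target[-1] if target else None
        if last is not None and len(last) < capacity and last[0][0] == file_id:
            last.append(block)
        else:
            target.append([block])
    return target


# Hypothetical interleaved source layout: 2K source clusters, 4K target clusters.
source = [(1, 0), (2, 0), (2, 1), (1, 1), (1, 2), (2, 2), (1, 3)]
for number, cluster in enumerate(copy_sequential(source, capacity=2), start=1):
    print(number, cluster)
```

In this model the source is read strictly in cluster order, and some target clusters end up half full because the next block read belongs to a different file.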
[0079] 103: Based on the file identifier corresponding to the data, adjust the data stored in each cluster of the target volume. The adjustment merges the data in the clusters of the target volume that correspond to the same file identifier, which reduces the number of clusters in the target volume whose storage space is not fully occupied.
[0080] In this embodiment, one adjustment method is: if the size of a cluster to be adjusted in the target volume is greater than the size of the data stored in it, obtain, from the remaining clusters of the target volume, data that has the same file identifier as the data stored in the cluster to be adjusted, and treat it as the data to be moved, where the amount of data to be moved matches the remaining storage space in the cluster to be adjusted; move the data to be moved into the cluster to be adjusted; and delete the data to be moved from the remaining clusters in which it was stored.
[0081] If the size of a cluster in the target volume is greater than the size of the data stored in it, the cluster is not fully occupied and still has remaining storage space. The cluster can then be treated as the cluster to be adjusted. Specifically, data in the remaining clusters that has the same file identifier as the data stored in the cluster to be adjusted is taken as the data to be moved and moved into the cluster to be adjusted. The copy of that data in the cluster that originally stored it can then be deleted, which prevents the same data from occupying different clusters and leaves more free clusters in the target volume for subsequent data.
[0082] That the amount of data to be moved matches the remaining storage space in the cluster to be adjusted means the following. If the amount of data in the remaining clusters that has the same file identifier as the data stored in the cluster to be adjusted is greater than or equal to the remaining storage space in the cluster to be adjusted, data to be moved equal in amount to the remaining storage space is obtained. If that amount is less than the remaining storage space, all of the data in the remaining clusters that has the same file identifier is taken as the data to be moved.
[0083] Taking Figure 2 as an example again, the data distribution of the target volume numbered 2 shows that the first cluster still has remaining storage space and that the data stored in the sixth cluster corresponds to the same file as the data stored in the first cluster. During adjustment, the data stored in the sixth cluster can be merged into the first cluster as the data to be moved, and then deleted from the sixth cluster. Other clusters can be handled in the same way. After this operation, the data distribution of the clusters in the target volume is as shown by number 3 in Figure 3. Clusters with remaining space may still exist in the target volume after the adjustment; this is determined by the amount of file data stored in the source volume and the cluster size of the target volume.
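A minimal sketch of this first adjustment method follows, using the same simplified model of `(file_id, block_index)` tuples. The function name `adjust_fill` and the choice to search for donor clusters from the end of the volume are assumptions of this example; the embodiment only requires that the moved data has the same file identifier and fits the remaining space.

```python
def adjust_fill(target, capacity):
    """Fill partially used clusters with same-file data taken from later clusters."""
    clusters = [list(c) for c in target]  # work on a copy
    for i, cluster in enumerate(clusters):
        if not cluster:
            continue
        file_id = cluster[0][0]
        # Assumption: look for donors starting from the last cluster.
        for j in range(len(clusters) - 1, i, -1):
            free = capacity - len(cluster)
            if free == 0:
                break
            donor = clusters[j]
            if donor and donor[0][0] == file_id:
                moved = donor[:free]  # at most the remaining space
                del donor[:free]      # delete the moved data from the donor
                cluster.extend(moved)
    return clusters


layout = [[(1, 0)], [(2, 0), (2, 1)], [(1, 1), (1, 2)], [(2, 2)], [(1, 3)]]
print(adjust_fill(layout, capacity=2))
```

With this sample layout the first cluster ends up holding blocks 0 and 3 of file 1, so the number of partially filled clusters drops but the blocks of a file are no longer in order.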
[0084] Figure 3 shows one adjustment method provided by this embodiment. However, after adjustment in the manner shown in Figure 3, the data corresponding to the same file identifier is out of order, so the data must be reordered before the data stored in the clusters of the target volume is used. To solve this problem, this embodiment provides another adjustment method, after which the data corresponding to the same file identifier stored in the clusters of the target volume is in order. The adjustment method is as follows:
[0085] From the remaining clusters of the target volume, obtain, as the data to be moved, data that has the same file identifier as the data stored in the cluster to be adjusted and whose order is consecutive with that data. Data whose order is consecutive with the data stored in the cluster to be adjusted means that, within the data of the same file identifier, the data to be moved is adjacent to and located after the data stored in the cluster to be adjusted. If the data stored in each cluster also carries a data block identifier, it means data whose data block identifier is adjacent to, and located after, the data block identifier of the data stored in the cluster to be adjusted.
[0086] Take the first cluster of the target volume numbered 2 in Figure 2 as an example. The data block identifier of its data is 1a, and the data consecutive with it in order is 1b, so 1b needs to be merged into the first cluster during adjustment. After 1b is moved into the first cluster, the fourth cluster has remaining storage space, and the data in the sixth cluster can be merged into the fourth cluster. Continuing by analogy, the adjusted data distribution is as shown by number 4 in Figure 4.
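The order-preserving adjustment can be sketched in the same simplified model. Here the second element of each `(file_id, block_index)` tuple plays the role of the data block identifier, and `adjust_in_order` is a hypothetical name; the sample layout is invented and is not the layout of Figure 2.

```python
def adjust_in_order(target, capacity):
    """Fill a partially used cluster only with the next block of the same file."""
    clusters = [list(c) for c in target]  # work on a copy
    for i, cluster in enumerate(clusters):
        while cluster and len(cluster) < capacity:
            file_id, index = cluster[-1]
            wanted = (file_id, index + 1)  # the block that directly follows
            donor = next((c for c in clusters[i + 1:] if wanted in c), None)
            if donor is None:
                break  # no consecutive block left; the cluster stays partly free
            donor.remove(wanted)
            cluster.append(wanted)
    return clusters


layout = [[(1, 0)], [(2, 0), (2, 1)], [(1, 1), (1, 2)], [(2, 2)], [(1, 3)]]
print(adjust_in_order(layout, capacity=2))
```

Moving a block out of a later cluster frees space there, which the loop then fills from still later clusters, mirroring the cascade described for the first, fourth, and sixth clusters.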
[0087] One point to note: after adjustment, there may be free clusters among the clusters storing data in the target volume. In that case, the data stored in the target volume can be moved again so that no free cluster remains between the clusters storing data, that is, the data is stored in contiguous clusters, as shown in Figure 5. In addition, besides compressed files there is another special type of file, the sparse file. Sparse files can be copied from the source volume to the target volume using the process shown in Figure 1. They differ from ordinary files (files other than compressed files and sparse files) in that the data stored in the target volume for a sparse file is all zeros.
[0088] 104: Based on the data stored in each cluster of the target volume, reconstruct each metafile corresponding to the target volume. The process may be: based on the data stored in each cluster of the target volume, update the index distribution record data in the first metafile of the target volume, for example by updating the logical cluster number in the non-resident attributes and the virtual cluster number in the index allocation and the attribute list; then, based on the data stored in each cluster of the target volume, reconstruct the remaining metafiles other than the first metafile.
[0089] In the current NTFS system, the meta files of the NTFS system are shown in Table 1.
[0090] Table 1 (NTFS metafiles; table not reproduced in this text)
[0091] The first metafile is the above-mentioned $MFT metafile, which is composed of file records; file records are in turn composed of attributes. There are many types of attributes, and each attribute has an attribute header and an attribute body. The attribute header distinguishes resident attributes from non-resident attributes. A resident attribute is stored in the file record itself. A non-resident attribute is stored in another data area because it does not fit in a file record (a file record is only 1024 bytes), and the logical cluster number of that data area is recorded. Index allocation is a non-resident attribute. Index records are stored in a B+ tree (each index record is 4096 bytes); each index record consists of multiple index entries, and each index entry records the virtual cluster number of its child. The attribute list may be resident or non-resident. It appears when a file record is too large, and it records the sub-record number, the attribute types stored in the sub-record, and the corresponding virtual cluster number. The virtual cluster number identifies a data block of the file. If the cluster size of the source volume is smaller than the cluster size of the target volume, the target volume merges the data blocks stored in the clusters of the source volume, so the virtual cluster number changes. The logical cluster number identifies a cluster in the target volume, in other words the position of the data block in the target volume. Likewise, when the cluster size of the source volume is smaller than the cluster size of the target volume, the position of a file's data block becomes its position in the target volume, so the logical cluster number also changes. In this case, both the virtual cluster numbers and the logical cluster numbers in the $MFT metafile need to be changed.
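To make the renumbering concrete, the sketch below derives new cluster numbers from a final target layout in the simplified `(file_id, block_index)` model. It does not parse or write real $MFT records. The function names are hypothetical, and the sketch assumes that each file's data is packed from its start and that its clusters already appear in file order.

```python
def target_vcn(source_vcn: int, ratio: int) -> int:
    """Virtual cluster number in the target for a block at source_vcn.

    ratio is target cluster size divided by source cluster size.
    """
    return source_vcn // ratio


def remap_file_clusters(clusters):
    """Map each file id to its logical cluster numbers (LCNs) in the target.

    The position of an LCN in the list acts as the virtual cluster number.
    """
    mapping = {}
    for lcn, cluster in enumerate(clusters):
        if cluster:  # skip free clusters
            mapping.setdefault(cluster[0][0], []).append(lcn)
    return mapping


layout = [[(1, 0), (1, 1)], [(2, 0), (2, 1)], [(1, 2), (1, 3)], [(2, 2)], []]
print(remap_file_clusters(layout))
print(target_vcn(3, ratio=2))
```

In this example, source block 3 of file 1 falls into target virtual cluster 1 of that file, which the mapping places at logical cluster 2 of the target volume.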
[0092] For the reconstruction of the metafiles other than $MFT, reference may be made to the prior art. One point needs to be explained: in the process of copying data from the source volume to the target volume, $Bitmap, $LogFile, $BadClus, $MFTMirr, and $Boot change, so these metafiles need to be rebuilt. The process is as follows:
[0093] Rebuild the volume bitmap according to the target partition data distribution, so as to update the $Bitmap metafile record according to the reconstructed volume bitmap;
[0094] Rebuild the file system log file to update the $LogFile file record, such as clearing all the contents of the $LogFile file;
[0095] Clear the bad cluster information to update the $BadClus metafile record;
[0096] Rebuild $MFTMirr so that the content in the $MFTMirr file is the same as the first 4 file records of $MFT;
[0097] Rebuild $Boot to update the basic parameter information of the file system.
[0098] For the foregoing reconstruction process, please refer to the prior art, which is not described in this embodiment.
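As one illustration of the rebuild steps, a volume bitmap can be derived from the final data distribution by setting one bit per used cluster. The sketch below uses the simplified layout model (a list of clusters, empty list meaning free). The function name is hypothetical, and the least-significant-bit-first ordering within each byte is an assumption of this example rather than something stated in the embodiment.

```python
def build_volume_bitmap(clusters) -> bytes:
    """One bit per cluster: 1 if the cluster stores data, 0 if it is free."""
    bitmap = bytearray((len(clusters) + 7) // 8)
    for lcn, cluster in enumerate(clusters):
        if cluster:
            # Assumption: cluster 0 maps to the lowest bit of byte 0.
            bitmap[lcn // 8] |= 1 << (lcn % 8)
    return bytes(bitmap)


layout = [[(1, 0), (1, 1)], [(2, 0), (2, 1)], [(1, 2), (1, 3)], [(2, 2)], []]
print(build_volume_bitmap(layout).hex())
```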
[0099] 105: If the cluster size of the source volume is greater than or equal to the cluster size of the target volume, clone the data stored in each cluster of the source volume to each cluster of the target volume.
[0100] One copy method is: starting from the first cluster of the source volume, obtain the data stored in each cluster of the source volume in sequence; then, based on the file identifier corresponding to the data stored in each cluster of the source volume, store the data obtained in sequence into the clusters of the target volume in order.
[0101] If the cluster size of the source volume is equal to the cluster size of the target volume, the data stored in one cluster of the source volume occupies exactly one cluster of the target volume. Therefore, each time a piece of data is read from a cluster of the source volume, one cluster of the target volume is selected to store it, as shown in Figure 6. It can be seen that, when the cluster sizes are equal, the distribution of data in the clusters of the source volume is the same as its distribution in the clusters of the target volume.
[0102] If the cluster size of the source volume is greater than the cluster size of the target volume, the data stored in one cluster of the source volume needs to occupy multiple clusters of the target volume, so the data needs to be split after it is read from a cluster of the source volume. The number of data blocks obtained by splitting depends on the cluster size of the target volume and the cluster size of the source volume. Specifically, the number of data blocks is N = X/Y, where X is the cluster size of the source volume and Y is the cluster size of the target volume; accordingly, the target volume needs N clusters to store the data of one cluster of the source volume.
[0103] Take Figure 7 as an example. In Figure 7, the cluster size of the source volume is 8K and the cluster size of the target volume is 4K, so the data in one cluster of the source volume needs to be stored in two clusters of the target volume. In Figure 7, each piece of data is split into two data blocks; for example, 1a is split into 1a1 and 1a2.
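The split N = X/Y can be sketched as a byte-level operation. The function name `split_cluster` is hypothetical, and the sketch assumes that the source cluster size is an exact multiple of the target cluster size, which holds for the power-of-two sizes listed earlier.

```python
def split_cluster(data: bytes, target_cluster_size: int) -> list:
    """Split the contents of one source cluster into target-sized pieces."""
    if len(data) % target_cluster_size != 0:
        raise ValueError("source cluster size must be a multiple of the target size")
    return [
        data[offset:offset + target_cluster_size]
        for offset in range(0, len(data), target_cluster_size)
    ]


source_cluster = bytes(8 * 1024)             # X = 8K
pieces = split_cluster(source_cluster, 4096)  # Y = 4K
print(len(pieces))                            # N = X / Y
```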
[0104] 106: Based on the data stored in each cluster of the target volume, reconstruct each metafile corresponding to the target volume. The metafiles may be rebuilt as follows. If the cluster size of the source volume is equal to the cluster size of the target volume, update the index distribution record data in the first metafile of the target volume, for example by updating the logical cluster number in the non-resident attributes, and rebuild the remaining metafiles other than the first metafile. If the cluster size of the source volume is larger than the cluster size of the target volume, update the index distribution record data in the first metafile of the target volume, for example by updating the logical cluster number in the non-resident attributes and the virtual cluster number in the index allocation and the attribute list, and rebuild the remaining metafiles other than the first metafile. For details, refer to step 104; they are not described again here.
[0105] With the above technical solution, when cloning between different disks, the cluster size of the source volume and the cluster size of the target volume are determined. If the cluster size of the source volume is smaller than the cluster size of the target volume, the data stored in each cluster of the source volume is cloned into the clusters of the target volume, the data stored in each cluster of the target volume is adjusted based on the file identifier corresponding to the data, and each metafile corresponding to the target volume is reconstructed based on the data stored in each cluster of the target volume. Data is thus copied directly from the clusters of the source volume to the target volume, which saves the system overhead of frequently opening and closing many files and improves copy efficiency. In addition, the copied data characterizes the directory organization structure of the file to which it belongs. When this data is copied, file creation related data, such as the file creation time and the named streams of multi-stream files, is copied to the target volume as well. This prevents the file creation related data from changing and ensures the normal operation of programs that rely on it.
[0106] Refer to Figure 8, which shows a flowchart of another method for cloning an NTFS volume between disks provided by an embodiment of the present invention and explains the cloning process when the file to which the data belongs is a compressed file. On the basis of Figure 1, the method may further include the following steps:
[0107] 107: If the file to which the data belongs is a compressed file, obtain the data belonging to the compressed file from each cluster in the source volume.
[0108] When a compressed file is stored in clusters, it can use a certain number of clusters as a compression unit. For example, the source volume uses 16 clusters as one compression unit. When a compressed file with a size of 16 clusters is stored in one compression unit, its storage layout is: 1 2 3 4 5' 6' 7' 8' 9' 10' 11' 12' 13' 14' 15' 16', where clusters 1 to 4 store the actual data of the compressed file and clusters 5 to 16 can be regarded as sparse clusters whose stored data is 0. When copying the data of the compressed file, the data stored in clusters 1 to 4 can therefore be copied directly; decompressing clusters 1 to 4 yields data with a size of 16 clusters.
[0109] 108: Decompress the data belonging to the compressed file to obtain the original data corresponding to the compressed file. As for the method used for decompression, it depends on which method is used for compression, which is not described in this embodiment.
[0110] 109: Store the original data in each cluster of the target volume in turn. How many clusters in the target volume are required to store the original data depends on the data volume of the original data and the cluster size of the target volume.
[0111] An example follows. Suppose a compressed file has 16*100 clusters, that is, a compression unit has 16 clusters and 100 compression units are used to store the compressed file. The cluster size of the source volume is 2K and the cluster size of the target volume is 4K. The corresponding process from the source volume to the target volume is:
[0112] 1) Read 16 clusters of the compressed file in sequence each time;
[0113] 2) Decompress the data of the 16 clusters to obtain the corresponding original data;
[0114] 3) Obtain 8 free clusters from the target volume (8 rather than 16, because the cluster size of the target volume is twice the cluster size of the source volume). For example, if the first 100 clusters of the target volume are already occupied by ordinary files, find 8 free clusters starting from cluster 101. Write the original data sequentially into the 8 clusters obtained.
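The cluster counts in this example follow from simple arithmetic, sketched below. The helper name `target_clusters_needed` is hypothetical, and the decompression itself is left out because the embodiment does not specify the algorithm.

```python
def target_clusters_needed(original_size: int, target_cluster_size: int) -> int:
    """Number of target clusters needed for the decompressed data (rounded up)."""
    return -(-original_size // target_cluster_size)  # ceiling division


SOURCE_CLUSTER = 2 * 1024   # 2K source clusters
TARGET_CLUSTER = 4 * 1024   # 4K target clusters
UNIT_CLUSTERS = 16          # clusters per compression unit
UNITS = 100                 # compression units in the file

unit_bytes = UNIT_CLUSTERS * SOURCE_CLUSTER  # decompressed size of one unit
per_unit = target_clusters_needed(unit_bytes, TARGET_CLUSTER)
print(per_unit)          # target clusters per compression unit
print(per_unit * UNITS)  # target clusters for the whole file
```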
[0115] When the original data is written to the clusters of the target volume, the metafiles also need to be modified. For the specific process, refer to the foregoing embodiment; it is not described again here.
[0116] For simplicity of description, the foregoing method embodiments are each expressed as a series of combined actions. Those skilled in the art should understand, however, that the present invention is not limited by the described sequence of actions, because according to the present invention some steps can be performed in another order or simultaneously. Those skilled in the art should also understand that the embodiments described in the specification are all preferred embodiments, and the actions and modules involved are not necessarily required by the present invention.
[0117] Corresponding to the above method embodiment, an embodiment of the present invention also provides a device for cloning NTFS volumes between disks. Its structure is shown in Figure 9, and it may include: a determination unit 001, a judgment unit 002, a copy unit 003, an adjustment unit 004, and a reconstruction unit 005.
[0118] The determination unit 001 is used to determine the cluster size of the source volume and the cluster size of the target volume.
[0119] The source volume is the known storage location of the data to be copied, and the target volume is the location where that data will be stored. The source volume and the target volume may use different minimum disk units, that is, they may store data in clusters of different sizes. For this reason, the cluster size of the source volume and the cluster size of the target volume need to be determined before cloning.
[0120] The relationship between the cluster size of the source volume and the cluster size of the target volume falls into three cases: first, the cluster size of the source volume is smaller than the cluster size of the target volume; second, the cluster size of the source volume is larger than the cluster size of the target volume; third, the cluster size of the source volume is equal to the cluster size of the target volume.
[0121] When the source volume stores data, the data of different files may be interleaved across its clusters. If the cluster size of the source volume is equal to the cluster size of the target volume, the data obtained from a cluster of the source volume can be copied directly to a corresponding cluster of the target volume. If the cluster size of the source volume is greater than the cluster size of the target volume, a piece of data obtained from a cluster of the source volume can be stored in multiple clusters of the target volume, where the total size of those clusters equals the size of the piece of data. Therefore, whether the cluster size of the source volume is equal to or greater than the cluster size of the target volume, the data in each cluster of the source volume can be cloned into the clusters of the target volume in turn, and every cluster used in the target volume can be filled. However, if the cluster size of the source volume is smaller than the cluster size of the target volume, data obtained sequentially from the clusters of the source volume may belong to different files, and data belonging to different files cannot be stored in the same cluster of the target volume. In this case some clusters of the target volume cannot be filled, so the stored data needs to be adjusted after it is copied to the clusters of the target volume.
[0122] In view of this, in this embodiment, before cloning, the judgment unit 002 first determines whether the cluster size of the source volume is smaller than the cluster size of the target volume, and the corresponding processing is triggered based on the judgment result. Specifically, the copy unit 003, the adjustment unit 004, and the reconstruction unit 005 handle the case in which the cluster size of the source volume is smaller than the cluster size of the target volume; the copy unit 003 and the reconstruction unit 005 handle the cases in which the cluster size of the source volume is larger than or equal to the cluster size of the target volume. The implementation is as follows.
[0123] The copy unit 003 is used to clone data stored in each cluster of the source volume to each cluster of the target volume.
[0124] In this embodiment, one copying method of the copy unit 003 is: sequentially acquire the data stored in each cluster of the source volume, starting from the first cluster of the source volume; then, based on the file identifier corresponding to the data stored in each cluster of the source volume, store the sequentially acquired data in the clusters of the target volume in turn, so that data corresponding to different file identifiers is not stored in the same cluster of the target volume. Storing the data in the clusters of the target volume in turn means that the clusters used to store data in the target volume are contiguous during copying, with no free cluster among them (that is, no cluster that stores no data).
[0125] In this embodiment, the file identifier corresponding to the data characterizes the file to which the data belongs, that is, the file identifier is used to determine which file the data belongs to. The data characterizes the directory organization structure of the file to which it belongs, so that the directory organization structure of the file is copied as well. The file to which the data belongs is not a compressed file. Compressed files are handled separately because the way data is stored in a compressed file changes when the cluster size changes, so copying a compressed file directly from the source volume cannot keep the original data corresponding to the compressed file valid. Therefore, compressed files cannot be copied in the manner of this embodiment.
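The copying method of the copy unit 003 can be sketched as follows. This is only an illustration under stated assumptions: the names `TargetCluster` and `copy_clusters` are hypothetical, each source cluster is modelled as a `(file_id, data)` pair, and it is assumed that consecutive pieces with the same file identifier may share a target cluster while a change of file identifier always starts a new one.

```python
from dataclasses import dataclass, field


@dataclass
class TargetCluster:
    file_id: int
    data: bytearray = field(default_factory=bytearray)


def copy_clusters(source, target_cluster_size):
    """Copy (file_id, data) source clusters, in order, into target clusters.

    A new target cluster is started whenever the file identifier changes or
    the current target cluster cannot hold the next piece, so no target
    cluster mixes data of different files. Assumes each source piece is no
    larger than one target cluster.
    """
    target = []
    for file_id, data in source:
        current = target[-1] if target else None
        fits = (current is not None
                and current.file_id == file_id
                and len(current.data) + len(data) <= target_cluster_size)
        if not fits:
            current = TargetCluster(file_id)
            target.append(current)  # target clusters stay contiguous
        current.data += data
    return target


# Source clusters of 4 bytes, target clusters of 8 bytes.
source_volume = [(1, b"AAAA"), (1, b"AAAA"), (2, b"BBBB"), (1, b"AAAA")]
copied = copy_clusters(source_volume, target_cluster_size=8)
```

With this input, the second and third target clusters are each only half full, which is the situation the adjustment step is meant to reduce.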
[0126] Please refer to the method embodiment for the specific description of the foregoing copying method, which will not be described in this embodiment.
[0127] The adjustment unit 004 is configured to adjust the data stored in each cluster of the target volume based on the file identifier corresponding to the data if the cluster size of the source volume is smaller than the cluster size of the target volume. The adjustment merges the data held in clusters of the target volume that correspond to the same file identifier, reducing the number of clusters in the target volume whose storage space is not full.
[0128] In this embodiment, referring to Figure 10, the adjustment unit 004 includes an obtaining subunit 401, a moving subunit 402, and a deleting subunit 403.
[0129] The obtaining subunit 401 is used to obtain, if the size of a cluster to be adjusted in the target volume is greater than the size of the data stored in that cluster, data with the same file identifier as the data stored in the cluster to be adjusted from the remaining clusters of the target volume as the data to be moved. The amount of data to be moved matches the remaining storage space in the cluster to be adjusted.
[0130] If the size of a cluster in the target volume is greater than the size of the data stored in the cluster, the cluster is not fully occupied and still has remaining storage space. The cluster can then be treated as the cluster to be adjusted. Specifically, data in the remaining clusters that has the same file identifier as the data stored in the cluster to be adjusted is taken as the data to be moved and is moved to the cluster to be adjusted. Once the data to be moved has been moved, the copy in the cluster that originally stored it can be deleted, which prevents the same data from occupying different clusters and leaves more free clusters in the target volume for subsequent data.
[0131] In this embodiment, the obtaining subunit 401 is specifically configured to obtain, from the remaining clusters of the target volume, data that has the same file identifier as the data stored in the cluster to be adjusted and is consecutive in sequence with that data, as the data to be moved. After this adjustment, the data corresponding to the same file identifier stored in the clusters of the target volume remains in order, that is, it is not scrambled.
[0132] The moving subunit 402 is used to move the data to be moved to the cluster to be adjusted.
[0133] The deleting subunit 403 is used to delete the data to be moved stored in the remaining clusters.
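The cooperation of the obtaining, moving, and deleting subunits can be sketched as a single pass over the target clusters. This is a hedged illustration rather than the method of the embodiment: `Cluster` and `adjust_clusters` are hypothetical names, clusters are modelled in memory, and it is assumed that a cluster emptied by the adjustment is simply released.

```python
from dataclasses import dataclass


@dataclass
class Cluster:
    file_id: int
    data: bytearray


def adjust_clusters(clusters, cluster_size):
    """Fill partly used clusters with later data of the same file, in order."""
    i = 0
    while i < len(clusters):
        to_adjust = clusters[i]
        free = cluster_size - len(to_adjust.data)
        j = i + 1
        while free > 0 and j < len(clusters):
            donor = clusters[j]
            if donor.file_id != to_adjust.file_id:
                j += 1
                continue
            moved = donor.data[:free]        # obtain: sized to the free space
            to_adjust.data += moved          # move into the cluster to adjust
            del donor.data[:len(moved)]      # delete from the original cluster
            free -= len(moved)
            if donor.data:
                j += 1
            else:
                del clusters[j]              # emptied cluster becomes free
        i += 1
    return clusters


volume = [
    Cluster(1, bytearray(b"01")),
    Cluster(2, bytearray(b"xyz")),
    Cluster(1, bytearray(b"2345")),
    Cluster(2, bytearray(b"w")),
    Cluster(1, bytearray(b"6789a")),
]
adjusted = adjust_clusters(volume, cluster_size=8)
```

Because data is always taken from the start of the nearest later cluster with the same file identifier, the bytes of each file keep their original order, matching the requirement that the moved data be consecutive with the data already in the cluster to be adjusted.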
[0134] For a specific description of the adjustment method of the adjustment unit 004, please refer to the method embodiment, which is not described in this embodiment.
[0135] The reconstruction unit 005 is used to reconstruct each metafile corresponding to the target volume based on the data stored in each cluster of the target volume. For the reconstruction process of the above reconstruction unit 005, please refer to the method embodiment, which will not be described in this embodiment.
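This paragraph does not state which metafiles are rebuilt or how. As one plausible illustration only, the sketch below rebuilds a cluster allocation bitmap with one bit per cluster, similar in spirit to the NTFS `$Bitmap` metafile. The function name `build_allocation_bitmap` and the least-significant-bit-first bit order are assumptions made for the example.

```python
def build_allocation_bitmap(used_clusters, total_clusters):
    """Return a bitmap in which bit n is set when cluster n holds data."""
    bitmap = bytearray((total_clusters + 7) // 8)
    for index in used_clusters:
        if not 0 <= index < total_clusters:
            raise ValueError(f"cluster index {index} is outside the volume")
        # Assumed layout: cluster n maps to bit (n % 8) of byte (n // 8).
        bitmap[index // 8] |= 1 << (index % 8)
    return bytes(bitmap)


# Clusters 0, 1, 2 and 9 of a 16-cluster volume hold data after cloning.
bitmap = build_allocation_bitmap([0, 1, 2, 9], total_clusters=16)
```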
[0136] With the above technical solution, when cloning between different disks, the cluster size of the source volume and the cluster size of the target volume are determined. If the cluster size of the source volume is smaller than the cluster size of the target volume, the data stored in each cluster of the source volume is cloned to the clusters of the target volume, the data stored in each cluster of the target volume is adjusted based on the file identifier corresponding to the data, and each metafile corresponding to the target volume is reconstructed based on the data stored in each cluster of the target volume. Data is thus copied directly from the clusters of the source volume to the target volume, which saves the system consumption of frequently opening and closing many files and improves copy efficiency. In addition, the copied data characterizes the directory organization structure of the file to which the data belongs. While this data is copied, file-creation-related data such as the file creation time and the named streams of multi-stream files can be copied to the target volume, which prevents changes to file-creation-related data and ensures the normal operation of programs that rely on it.
[0137] Referring to Figure 11, the embodiment of the present invention also provides another inter-disk NTFS volume cloning device which, on the basis of Figure 9, may further include an obtaining unit 006, a decompression unit 007, and a storage unit 008.
[0138] The judging unit 002 is also used to judge whether the file to which the data belongs is a compressed file.
[0139] The obtaining unit 006 is configured to obtain data belonging to the compressed file from each cluster in the source volume if the file to which the data belongs is a compressed file.
[0140] The decompression unit 007 is configured to decompress the data belonging to the compressed file to obtain original data corresponding to the compressed file.
[0141] The storage unit 008 is used to sequentially store the original data in each cluster of the target volume.
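The compressed-file path handled by units 006 to 008 can be sketched as obtain, decompress, then store sequentially. This is a simplified illustration under loud assumptions: `zlib` is used only as a stand-in because the Python standard library has no NTFS decompressor, and the name `clone_compressed_file` and the `(file_id, data)` cluster model are hypothetical.

```python
import zlib


def clone_compressed_file(source_clusters, file_id, target_cluster_size):
    """Collect one file's compressed data, decompress it, and re-chunk it."""
    # Obtain: gather the data belonging to the compressed file, in order.
    compressed = b"".join(data for fid, data in source_clusters if fid == file_id)
    # Decompress: zlib is a stand-in here, not the NTFS compression format.
    original = zlib.decompress(compressed)
    # Store: write the original data into target clusters sequentially.
    return [original[i:i + target_cluster_size]
            for i in range(0, len(original), target_cluster_size)]


payload = b"NTFS volume clone " * 10
packed = zlib.compress(payload)

# Spread the compressed bytes over 16-byte source clusters, interleaved
# with clusters that belong to another file.
source = []
for offset in range(0, len(packed), 16):
    source.append((7, packed[offset:offset + 16]))
    source.append((8, b"other file data."))

target = clone_compressed_file(source, file_id=7, target_cluster_size=64)
```

Storing the decompressed data means the result no longer depends on how the source cluster size shaped the compressed layout.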
[0142] For the description of the working process of the above-mentioned units, please refer to the method embodiment, which will not be described in this embodiment.
[0143] The embodiment of the present invention also provides a terminal whose structure is shown in Figure 12. It may include a processor 11 and a disk 12, where the disk 12 is the disk on which the target volume is located.
[0144] The processor 11 is used to determine the cluster size of the source volume and the cluster size of the target volume. If the cluster size of the source volume is smaller than the cluster size of the target volume, the processor 11 clones the data stored in each cluster of the source volume into the clusters of the target volume; adjusts the data stored in each cluster of the target volume based on the file identifier corresponding to the data, where the file identifier characterizes the file to which the data belongs, the data characterizes the directory organization structure of that file, and the file is not a compressed file; and reconstructs each metafile corresponding to the target volume based on the data stored in each cluster of the target volume.
[0145] In this embodiment, the processor 11 is also configured, if the cluster size of the source volume is greater than or equal to the cluster size of the target volume, to clone the data stored in each cluster of the source volume to the clusters of the target volume and to reconstruct each metafile corresponding to the target volume based on the data stored in each cluster of the target volume.
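The two branches described for the processor 11 differ only in whether the adjustment step runs. A minimal sketch of that decision follows; the function name `plan_clone_steps` and the step labels are illustrative assumptions.

```python
def plan_clone_steps(source_cluster_size, target_cluster_size):
    """Return the ordered processing steps for a given pair of cluster sizes."""
    if source_cluster_size <= 0 or target_cluster_size <= 0:
        raise ValueError("cluster sizes must be positive")
    if source_cluster_size < target_cluster_size:
        # Smaller source clusters can leave target clusters partly filled,
        # so the data is adjusted before the metafiles are rebuilt.
        return ["copy", "adjust", "reconstruct_metafiles"]
    # Equal or larger source clusters fill the target clusters directly.
    return ["copy", "reconstruct_metafiles"]


steps_small_source = plan_clone_steps(512, 4096)
steps_equal = plan_clone_steps(4096, 4096)
steps_large_source = plan_clone_steps(8192, 4096)
```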
[0146] In this embodiment, the processor 11 is specifically configured to sequentially obtain data stored in each cluster of the source volume starting from the first cluster of the source volume, and based on the file identifier corresponding to the data stored in each cluster of the source volume, The data stored in each cluster of the source volume obtained sequentially is stored in each cluster of the target volume in turn.
[0147] In this embodiment, the processor 11 adjusting the data stored in each cluster of the target volume based on the file identifier corresponding to the data includes the following steps:
[0148] If the size of a cluster to be adjusted in the target volume is greater than the size of the data stored in the cluster to be adjusted, data with the same file identifier as the data stored in the cluster to be adjusted is obtained from the remaining clusters of the target volume as the data to be moved, where the amount of data to be moved matches the remaining storage space in the cluster to be adjusted; the data to be moved is moved to the cluster to be adjusted; and the data to be moved stored in the remaining clusters is deleted.
[0149] In this embodiment, one way for the processor 11 to obtain, from the remaining clusters of the target volume, data with the same file identifier as the data stored in the cluster to be adjusted as the data to be moved is: obtain, from the remaining clusters of the target volume, data that has the same file identifier as the data stored in the cluster to be adjusted and is consecutive in sequence with that data, as the data to be moved.
[0150] In this embodiment, the processor 11 is also configured, if the file to which the data belongs is a compressed file, to obtain the data belonging to the compressed file from each cluster in the source volume; decompress the data belonging to the compressed file to obtain the original data corresponding to the compressed file; and store the original data sequentially in the clusters of the target volume.
[0151] For a specific description of the working process of the aforementioned processor 11, please refer to the method embodiment, which is not described in this embodiment.
[0152] The embodiment of the present invention also provides a storage medium in which computer program code is stored, and the computer program code is executed to implement the above-mentioned inter-disk NTFS volume cloning method.
[0153] It should be noted that the various embodiments in this specification are described in a progressive manner, and each embodiment focuses on its differences from the other embodiments. For the same or similar parts, the embodiments may be referred to one another. As for the device embodiment, since it is basically similar to the method embodiment, its description is relatively brief; for related parts, please refer to the description of the method embodiment.
[0154] Finally, it should be noted that in this document, relational terms such as first and second are only used to distinguish one entity or operation from another entity or operation, and do not necessarily require or imply any such actual relationship or sequence between these entities or operations. Moreover, the terms "include", "comprise" or any other variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article or device including a series of elements includes not only those elements but also other elements that are not explicitly listed, or elements inherent to the process, method, article or device. Without further restrictions, an element defined by the phrase "including a..." does not exclude the existence of other identical elements in the process, method, article or device that includes the element.
[0155] The foregoing description of the disclosed embodiments enables those skilled in the art to implement or use the present invention. Various modifications to these embodiments will be obvious to those skilled in the art, and the general principles defined in this document can be implemented in other embodiments without departing from the spirit or scope of the present invention. Therefore, the present invention will not be limited to the embodiments shown in this document, but should conform to the widest scope consistent with the principles and novel features disclosed in this document.
[0156] The above are only the preferred embodiments of the present invention. It should be pointed out that those of ordinary skill in the art can make several improvements and modifications without departing from the principle of the present invention, and these improvements and modifications should also be regarded as falling within the protection scope of the present invention.