A distributed file storage asynchronous remote replication method and device
By recording detailed differences on the file system metadata server and triggering incremental or full synchronization at thresholds, the problem of resource waste in existing technologies is solved, achieving efficient remote asynchronous file storage and improving synchronization performance and user experience.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- CHINA ELECTRONICS CLOUD DIGITAL INTELLIGENCE TECH CO LTD
- Filing Date
- 2022-10-08
- Publication Date
- 2026-06-26
AI Technical Summary
Existing technologies for remote asynchronous file storage cannot achieve efficient data storage with minimal processing resources, leading to resource waste or synchronization delays.
The master file system metadata server (MDS) records detailed differences in data changes through asynchronous remote replication. When the amount of data changes or directory differences exceeds a threshold, the corresponding synchronization program is triggered to perform incremental or full synchronization, adopting a multi-level difference recording strategy.
With minimal resource consumption, it meets the needs of asynchronous remote replication of file storage for different scenarios and scales, improves synchronization performance and user experience, and reduces the impact of difference log storage space and host access.
Figure CN115576911B_ABST
Description
Technical Field
[0001] This invention relates to the field of computer technology, and in particular to a method and apparatus for asynchronous remote replication of distributed file storage.
Background Technology
[0002] Existing remote asynchronous file storage mainly relies on the data-synchronizing side either storing only the changed data or performing full synchronous storage of all data. This approach does not take the amount of changed data into account and uniformly applies either full storage or changed-data storage. As a result, processing resources are wasted when the amount of changed data is small but full storage is used, and likewise when the amount of changed data is large but the changed data must first be identified and then stored. How to achieve asynchronous data storage with minimal processing resources has therefore become an urgent problem to be solved.
Summary of the Invention
[0003] This invention provides a method and apparatus for asynchronous remote replication of distributed file storage, in order to solve the problem that the prior art cannot achieve asynchronous data storage with minimal processing resources.
[0004] In a first aspect, the present invention provides a distributed file storage asynchronous remote replication method, the method comprising: recording detailed differences in data changes through a file system metadata server (MDS) on the asynchronous remote replication master, so that the synchronization program on the asynchronous remote replication master triggers the synchronization program on the asynchronous remote replication slave to perform data synchronization based on the recorded detailed differences; when the magnitude of the data changes on the asynchronous remote replication master exceeds a first preset threshold, the MDS on the asynchronous remote replication master is triggered to record directory differences in the data changes, and the synchronization program on the asynchronous remote replication master triggers the synchronization program on the asynchronous remote replication slave to perform data synchronization based on the directory information differences; when the magnitude of the directory information differences on the asynchronous remote replication master exceeds a second preset threshold, the synchronization program on the asynchronous remote replication master performs full synchronization of all data changes on the asynchronous remote replication master.
[0005] Optionally, the first preset threshold is a threshold for the number of data differences, a threshold for the time of difference records, or a threshold for the capacity of data differences; the second preset threshold is a threshold for the number of directory differences, a threshold for the time of difference records, or a threshold for the capacity of directory differences.
[0006] Optionally, recording detailed differences in data changes via the asynchronous remote replication master's MDS includes: when difference data is generated, the asynchronous remote replication master's MDS records the difference data through a distributed key-value store, wherein the key prefix of the difference data is
[0007] `detail_diff_snapid_type_inode`, where `inode` is the index number of the file or directory with the difference, `type` indicates whether the entry is a file or a directory, `snapid` is the current snapshot ID, and `detail_diff` is a fixed prefix indicating that the entry stores detailed difference information.
[0008] Optionally, the method further includes: recording the number of difference data by using the total number of difference records;
[0009] When the total number of difference records is less than the data difference record threshold, or the difference record time is greater than the difference record time threshold, or the data difference capacity is greater than the data difference capacity threshold, the synchronization process of the asynchronous remote replication master performs data synchronization based on the recorded detailed differences, including:
[0010] The synchronization process of the asynchronous remote replication master reads each difference data record and, according to the difference type of the difference bitmap record, reads the latest value of the file / directory from the snapshot for synchronization.
[0011] Optionally, the MDS of the asynchronous remote replication master records directory differences of data changes, including: the MDS of the asynchronous remote replication master first converts the existing detailed differences into directory differences and deletes the recorded detailed differences. At the same time, before the arrival of a new round of synchronization cycle, the parent directory of the new metadata and data changes generated is recorded as the directory where the differences exist.
[0012] Optionally, the synchronization program on the asynchronous remote replication master performs synchronization based on directory differences, including: for directories whose paths recorded in the difference data have not changed between the old and new snapshots, the synchronization program on the asynchronous remote replication master sorts all subdirectories / files in the two directories by inode number and compares them one by one. For subfiles with differences in metadata / extended attributes / data, and for subdirectories with differences in metadata / extended attributes, the program directly generates corresponding difference tasks for the differences; for subfiles unique to the old directory, deletion records are generated first, and for subfiles unique to the new directory, creation records are generated first; for directories whose paths have changed, the program continues to traverse the entire directory tree and only examines the descendant directories in the directory tree, generating deletion records for the descendant directories in the old snapshot and creation records for these descendant directories in the new snapshot.
[0013] Optionally, the step of comparing each sub-file / subdirectory and generating difference data in real time includes performing the following processing in sequence:
[0014] For directories that have both creation and deletion records, merge the creation and deletion records into a single rename record.
[0015] For all directory rename records, generate a synchronization task for the rename directory. At the same time, in the directory of the new and old snapshots, traverse all its subfiles, sort them by inode, and compare them. Generate synchronization tasks for the metadata, data, and extended attributes of the subfiles according to the actual situation, or create and delete records for the subfiles.
[0016] For each new record created in all directories, a synchronization task for the new directory is generated.
[0017] For all files that have both creation and deletion records, merge the creation and deletion records into a single file rename record.
[0018] For all records of creating, deleting, and renaming files, generate corresponding file creation, deletion, and renaming synchronization tasks;
[0019] For all directory deletion records, generate a synchronization task for the deleted directories;
[0020] The asynchronous remote replication master's synchronization program performs data synchronization based on directory differences, including: the difference tasks generated by the asynchronous remote replication master's synchronization program need to be executed in order from the root directory to the leaf directories / files to prevent execution failure on the slave end; the asynchronous remote replication slave's synchronization program parses the difference data sent by the asynchronous remote replication master's synchronization program and writes the data to the corresponding location.
[0021] Optionally, the method further includes: both the synchronization program of the asynchronous remote replication master and the synchronization program of the asynchronous remote replication slave delete previously stored snapshots according to a preset deletion cycle.
[0022] In a second aspect, the present invention provides a distributed file storage asynchronous remote replication device, which is deployed at the asynchronous remote replication master, and the device includes a first processing unit, a second processing unit and a third processing unit;
[0023] The first processing unit is used to trigger the file system metadata server (MDS) of the asynchronous remote replication master to record detailed differences in the data changes of the asynchronous remote replication master, so that the first synchronization program on the asynchronous remote replication master can trigger the second synchronization program on the asynchronous remote replication slave to perform data synchronization based on the recorded detailed differences.
[0024] The second processing unit is configured to, when the amount of data change at the asynchronous remote replication master exceeds a first preset threshold, trigger the MDS to record directory differences of the data changes, and cause the first synchronization program to trigger the second synchronization program to perform data synchronization based on the directory information differences.
[0025] The third processing unit is configured to, when the difference in directory information at the asynchronous remote replication master exceeds a second preset threshold, trigger the second synchronization program to perform full synchronization of the difference data.
[0026] In a third aspect, the present invention provides a computer-readable storage medium storing a computer program which, when executed by at least one processor, implements any of the above-described distributed file storage asynchronous remote replication methods of the present invention.
[0027] The beneficial effects of this invention are as follows:
[0028] This invention, through extensive research, establishes thresholds for data change and directory information difference. By comparing the amount of data change and directory information difference with these thresholds, it determines whether to store the changed data or perform a full data storage. Specifically, this invention employs a multi-level difference recording approach. Detailed differences are recorded in the early stages, resulting in the highest incremental synchronization performance. When the difference exceeds the threshold, directory differences are used, significantly reducing the storage space of the difference log and minimizing the impact on host access. This also avoids a full synchronization of the entire directory tree, improving synchronization performance to some extent. Furthermore, when directory differences also exceed the threshold, no further differences are recorded; at this point, the entire directory tree is considered to have undergone significant changes, and the next synchronization will be a full synchronization. Therefore, this invention can meet the needs of asynchronous remote file storage replication for different scenarios and scales with minimal resource consumption, thereby greatly improving the user experience.
[0029] The above description is merely an overview of the technical solution of the present invention. In order to better understand the technical means of the present invention and to implement it in accordance with the contents of the specification, and in order to make the above and other objects, features and advantages of the present invention more apparent and understandable, specific embodiments of the present invention are described below.
Attached Figure Description
[0030] Various other advantages and benefits will become apparent to those skilled in the art upon reading the following detailed description of preferred embodiments. The accompanying drawings are for illustrative purposes only and are not intended to limit the invention. Furthermore, the same reference numerals denote the same parts throughout the drawings. In the drawings:
[0031] Figure 1 is a flowchart illustrating a distributed file storage asynchronous remote replication method provided in the first embodiment of the present invention;
[0032] Figure 2 is a system architecture diagram of the master cluster and slave cluster provided in the first embodiment of the present invention;
[0033] Figure 3 is a schematic diagram of data synchronization from the master cluster to the slave cluster provided in the first embodiment of the present invention;
[0034] Figure 4 is a schematic diagram of the structure of a distributed file storage asynchronous remote replication device provided in the first embodiment of the present invention.
Detailed Implementation
[0035] This invention addresses the problem of existing methods failing to balance minimal resource consumption with effective remote asynchronous data storage. It records detailed differences in the early stages, achieving the highest incremental synchronization performance. When the difference exceeds a threshold, directory differences are used, significantly reducing the storage space of the difference log and minimizing the impact of host access. This also avoids a full synchronization of the entire directory tree, improving synchronization performance to some extent. When directory differences also exceed a threshold, no further differences are recorded, indicating a significant change in the entire directory tree. The next synchronization then performs a full synchronization. Practice has shown that this multi-level difference recording method can meet the needs of asynchronous remote file storage replication for different scenarios and scales with minimal resource consumption, thus greatly improving user experience. The invention will be further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative and do not limit the scope of the invention.
[0036] The first embodiment of the present invention provides a distributed file storage asynchronous remote replication method; see Figure 1. The method includes:
[0037] S101. The asynchronous remote replication master uses the file system metadata server (MDS) to record detailed differences in data changes, so that the synchronization program on the asynchronous remote replication master can trigger the synchronization program on the asynchronous remote replication slave to perform data synchronization based on the recorded detailed differences.
[0038] See Figure 2. In this embodiment of the invention, remote replication synchronizes the initial full data and the incremental difference data from the master cluster to the slave cluster. The remote replication subsystem of the master cluster is responsible for configuring remote replication and synchronizing it to the slave cluster. It periodically executes remote replication tasks according to the current remote replication status, replicating data to the slave cluster. After receiving the data, the remote replication subsystem of the slave cluster writes the data to the file system. The synchronization program in this embodiment of the invention can be an MDS, i.e., a file system metadata server. The MDS is mainly responsible for reading and writing file system metadata and implements the file system snapshot function. When the remote replication function is enabled, the MDS is also responsible for recording difference data whenever the metadata changes. Throughout the method embodiment, the synchronization program also uses an OSD, i.e., an object server. The OSD is responsible for storing file data. A file consists of multiple fixed-length objects. When a file has snapshots, each object records the data differences between the snapshots.
[0039] S102. When the amount of data change at the asynchronous remote replication master exceeds the first preset threshold, the MDS at the asynchronous remote replication master is triggered to record directory differences of the data changes, and the synchronization program at the asynchronous remote replication master triggers the synchronization program at the asynchronous remote replication slave to perform data synchronization based on the directory information differences.
[0040] Specifically, in this embodiment of the invention, when the amount of data change at the asynchronous remote replication master exceeds a first preset threshold, the first synchronization program is triggered to record directory information differences. Based on the recorded directory differences, the new and old snapshots are compared and analyzed to generate corresponding detailed differences, which are then sent to the second synchronization program, enabling the second synchronization program to perform data synchronization based on the recorded detailed differences.
[0041] S103. When the magnitude of the directory information differences at the asynchronous remote replication master exceeds the second preset threshold, the synchronization program of the asynchronous remote replication master performs full synchronization of all data changes of the asynchronous remote replication master.
[0042] In other words, this embodiment of the invention employs a multi-level difference recording method. Detailed differences are recorded in the early stages, resulting in the highest incremental synchronization performance. When the amount of differences exceeds a threshold, directory differences are used, significantly reducing the storage space of the difference log and minimizing the impact of host access. This also avoids a full synchronization of the entire directory tree, improving synchronization performance to some extent. Furthermore, when directory differences also exceed a threshold, no more differences are recorded; at this point, the entire directory tree is considered to have undergone significant changes, and the next synchronization will be a full synchronization. Therefore, the multi-level difference recording method of this invention can meet the needs of asynchronous remote copying of file storage in different scenarios and of different scales with minimal resource consumption, thereby greatly improving the user experience.
[0043] It should be noted that the first preset threshold and the second preset threshold in the embodiments of the present invention can be arbitrarily set according to specific needs. For example, in specific implementation, the first preset threshold can be set as a data difference number threshold, a difference record time threshold, or a data difference capacity threshold; while the second preset threshold can be set as a directory difference number threshold, a difference record time threshold, or a directory difference capacity threshold. The specific threshold values can also be arbitrarily set, as long as the remote copy storage requirements can be guaranteed. The present invention does not impose specific limitations on this.
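The three-level escalation in S101 to S103 can be sketched as a simple threshold check. This is only an illustrative sketch: the names `DiffLevel` and `choose_level` are hypothetical, and count-based thresholds are assumed here, although the text equally allows time or capacity thresholds.

```python
from enum import Enum


class DiffLevel(Enum):
    DETAILED = 1   # record per-file / per-directory detailed differences
    DIRECTORY = 2  # record only the directories that contain differences
    FULL = 3       # record nothing; the next cycle is a full synchronization


def choose_level(detail_count, dir_count, first_threshold, second_threshold):
    """Pick the recording level from two count thresholds (assumed form)."""
    if detail_count <= first_threshold:
        return DiffLevel.DETAILED
    if dir_count <= second_threshold:
        return DiffLevel.DIRECTORY
    return DiffLevel.FULL
```

For example, `choose_level(101, 5, 100, 50)` selects the directory level, because the detailed count has exceeded the first threshold while the directory count is still within the second.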
[0044] In specific implementation, this embodiment of the invention records detailed differences in data changes through the file system metadata server (MDS) on the asynchronous remote replication master. Specifically, when difference data is generated, the MDS records the difference data in a distributed key-value store. The key prefix of the difference data is detail_diff_snapid_type_inode, where inode is the index number of the file or directory with the difference, type indicates whether the entry is a file or a directory, snapid is the current snapshot ID, and detail_diff is a fixed prefix indicating that the entry stores detailed difference information.
[0045] In other words, the embodiments of the present invention use a unified distributed key-value pair to store and record differential data, and use the key-value pair in the specified format to record all differential data, so as to facilitate the recording, copying and access of differential data.
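A minimal sketch of how such a key could be assembled is shown below. The helper name `detail_diff_key`, the string encoding of `type` as `"file"` or `"dir"`, and the use of a plain dictionary in place of the distributed key-value store are all assumptions made for illustration.

```python
def detail_diff_key(snapid, entry_type, inode):
    """Build a key of the form detail_diff_<snapid>_<type>_<inode>."""
    if entry_type not in ("file", "dir"):
        raise ValueError("entry_type must be 'file' or 'dir'")
    return f"detail_diff_{snapid}_{entry_type}_{inode}"


# A dict stands in for the distributed key-value store in this sketch.
store = {}
store[detail_diff_key(7, "file", 42)] = b""
```

Because every key shares the fixed `detail_diff` prefix and the snapshot ID, all differences of one synchronization cycle can be found with a single prefix scan.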
[0046] Meanwhile, this embodiment of the invention also maintains a difference record count, total, to record the number of difference data items. When the total number of difference records is less than the data difference record threshold, or the difference record time is greater than the difference record time threshold, or the data difference capacity is greater than the data difference capacity threshold, the synchronization process of the asynchronous remote replication master traverses all the difference data and converts the difference data into directory information differences. At the same time, for new metadata and data changes generated before the arrival of a new synchronization cycle, the parent directory is also recorded as a difference directory.
[0047] For example, when the overall data change exceeds a first preset threshold, the MDS needs to traverse all recorded detailed differences and record the parent directories of these detailed differences as directory differences. Suppose that directories A and B contain differences at this time, and just before the synchronization period arrives, the data in directory C also changes. At this time, directory C also needs to be recorded as a directory difference. That is to say, the embodiment of the present invention compares the amount of changed data with the first and second preset thresholds based on the synchronization period as a whole. If the first preset threshold is reached within the synchronization period, the MDS no longer records detailed differences when processing metadata requests, but only records directory differences, which reduces the performance impact on metadata requests and also reduces the storage space occupied by difference records.
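The A, B, C example above can be sketched as follows. The sketch identifies entries by path rather than by inode purely for readability, and the function name `escalate` is hypothetical.

```python
import posixpath


def escalate(detailed_paths):
    """Convert detailed differences into the set of their parent directories."""
    return {posixpath.dirname(path) for path in detailed_paths}


# Detailed differences recorded so far touch directories /A and /B.
detailed = {"/A/f1", "/A/f2", "/B/g1"}
dir_diffs = escalate(detailed)
detailed.clear()  # the detailed records are deleted after conversion

# A later change in /C before the cycle ends records only its parent directory.
dir_diffs.add(posixpath.dirname("/C/h1"))
```

After escalation, two changed files under `/A` cost a single record, which is where the reduction in difference log size comes from.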
[0048] Furthermore, in this embodiment of the invention, remote replication can be configured based on a specific directory and requires the use of file system snapshot features. A full synchronization is performed during the initial synchronization. After the full synchronization is completed, incremental difference synchronization is periodically initiated according to a preset synchronization period. It should be noted that the preset synchronization period described in this embodiment can be arbitrarily set according to actual needs, and this invention does not impose detailed limitations on it.
[0049] In practical implementation, the synchronization program of the asynchronous remote replication master in this embodiment of the invention performs synchronization based on directory differences, including: for directories whose paths recorded by the difference data have not changed in the old and new snapshots: sort all subdirectories / files in the two directories by inode number and compare them one by one; for subfiles with differences in metadata / extended attributes / data, and subdirectories with differences in metadata / extended attributes, generate corresponding difference tasks directly for the differences; for subfiles unique to the old directory, generate deletion records first, and for subfiles unique to the new directory, generate creation records first; for directories whose paths have changed, continue to traverse the entire directory tree and only examine the descendant directories in the directory tree, generate deletion records for the descendant directories in the old snapshot, and generate creation records for these descendant directories in the new snapshot.
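The inode-sorted comparison for a directory whose path is unchanged resembles a merge of two sorted lists. The sketch below assumes each snapshot exposes a mapping from child inode to some comparable state (standing in for metadata, extended attributes and data); the name `compare_children` and the record tuples are illustrative only.

```python
def compare_children(old, new):
    """Compare two {inode: state} maps by walking both in inode order."""
    old_inodes = sorted(old)
    new_inodes = sorted(new)
    i = j = 0
    records = []
    while i < len(old_inodes) and j < len(new_inodes):
        a, b = old_inodes[i], new_inodes[j]
        if a == b:
            if old[a] != new[b]:
                records.append(("modify", a))  # present in both, state differs
            i += 1
            j += 1
        elif a < b:
            records.append(("delete", a))      # unique to the old directory
            i += 1
        else:
            records.append(("create", b))      # unique to the new directory
            j += 1
    records.extend(("delete", a) for a in old_inodes[i:])
    records.extend(("create", b) for b in new_inodes[j:])
    return records
```

Sorting by inode rather than by name means a renamed child still lines up with itself, so it does not show up as an unrelated delete and create at this stage.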
[0050] For directories that have both creation and deletion records, merge the creation and deletion records into a single rename record.
[0051] For all directory rename records, generate a synchronization task for the rename directory. At the same time, in the directory of the new and old snapshots, traverse all its subfiles, sort them by inode, and compare them. Generate synchronization tasks for the metadata, data, and extended attributes of the subfiles according to the actual situation, or create and delete records for the subfiles.
[0052] For each new record created in all directories, a synchronization task for the new directory is generated.
[0053] For all files that have both creation and deletion records, merge the creation and deletion records into a single file rename record.
[0054] For all records of creating, deleting, and renaming files, generate corresponding file creation, deletion, and renaming synchronization tasks;
[0055] For all directory deletion records, generate a synchronization task for the deleted directories;
[0056] The generated synchronization tasks need to be executed in order from the root directory to the leaf directories / files to prevent execution failures on the slave end.
[0057] Finally, the synchronization program on the asynchronous remote replication slave parses the difference data sent by the synchronization program on the asynchronous remote replication master and writes the data to the corresponding location.
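The merging of creation and deletion records into rename records, followed by root-to-leaf ordering, could look roughly like this. The sketch does not distinguish files from directories or reproduce the exact processing sequence listed above; the record layout `(op, inode, path)` and both function names are assumptions.

```python
def merge_renames(records):
    """Merge a delete and a create of the same inode into one rename task."""
    deleted = {ino: path for op, ino, path in records if op == "delete"}
    created = {ino: path for op, ino, path in records if op == "create"}
    tasks = []
    for ino in sorted(deleted.keys() & created.keys()):
        tasks.append(("rename", ino, deleted[ino], created[ino]))
    for ino in sorted(deleted.keys() - created.keys()):
        tasks.append(("delete", ino, deleted[ino], None))
    for ino in sorted(created.keys() - deleted.keys()):
        tasks.append(("create", ino, None, created[ino]))
    return tasks


def order_root_to_leaf(tasks):
    """Sort tasks by the depth of the path they act on, shallowest first."""
    def depth(task):
        _, _, old_path, new_path = task
        path = new_path if new_path is not None else old_path
        trimmed = path.strip("/")
        return trimmed.count("/") + 1 if trimmed else 0
    return sorted(tasks, key=depth)
```

Ordering by depth ensures, for instance, that a new directory `/b` is created on the slave before anything is renamed into `/b/c/x`.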
[0058] The following will provide a detailed explanation and illustration of the method described in the embodiments of the present invention through a specific example:
[0059] The full synchronization process in this embodiment of the invention includes:
[0060] 1. Create a baseline snapshot of the remote replication master directory;
[0061] 2. Scan the baseline snapshot of the master directory and create, under the remote slave directory, a directory structure identical to that of the master directory;
[0062] 3. Write all the data in the master directory to the corresponding location in the slave directory, including subdirectories, file data, and directory and file attributes (including basic attributes, extended attributes, ACLs, and some system-specific value-added feature data);
[0063] 4. After the data replication is complete, create a consistent snapshot of the slave directory.
[0064] The incremental synchronization process in this embodiment of the invention specifically includes:
[0065] 1. When the file system client writes data to the asynchronous remote replication master file storage cluster, the master MDS records the difference log while updating the metadata;
[0066] 2. When the synchronization period arrives, the synchronization program on the asynchronous remote replication master automatically starts synchronization. At this time, a baseline snapshot of this synchronization is created, and the difference log is analyzed to determine the data for incremental synchronization. The difference log records the changes in data and metadata of the master directory of the remote replication configuration between the two snapshots, including the creation, deletion, renaming, attribute modification, and file content modification of files and directories.
[0067] 3. The master synchronization program packages the difference logs into a message and sends it to the slave synchronization program. The slave parses the message, obtains the difference logs, and writes the difference information to the directory or file corresponding to the slave directory configured for remote replication.
[0068] 4. The master and slave ends will delete earlier snapshots based on the actual situation to ensure that there are not too many snapshots in the system;
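Step 3 of the incremental process (packaging difference logs into a message and applying them on the slave) can be illustrated with a toy example. The JSON message layout, the operation names, and the dictionary standing in for the slave directory are all assumptions; the text does not specify a wire format.

```python
import json


def pack_diffs(snapid, diffs):
    """Master side: package the difference logs of one cycle into a message."""
    return json.dumps({"snapid": snapid, "diffs": diffs}).encode("utf-8")


def apply_message(raw, slave_tree):
    """Slave side: parse the message and apply each difference in order."""
    message = json.loads(raw.decode("utf-8"))
    for diff in message["diffs"]:
        op = diff["op"]
        if op in ("create", "write"):
            slave_tree[diff["path"]] = diff.get("data", "")
        elif op == "delete":
            slave_tree.pop(diff["path"], None)
        elif op == "rename":
            slave_tree[diff["new_path"]] = slave_tree.pop(diff["path"])
    return message["snapid"]
```

Returning the snapshot ID lets the slave know which baseline snapshot the applied differences correspond to.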
[0069] Within the framework of snapshot-based asynchronous remote replication, this invention further proposes a difference recording method. This method is divided into three levels of difference recording: Level 1 records detailed differences, Level 2 records directory differences, and Level 3 does not record differences. By classifying the difference records, the system can record all detailed differences when the amount of difference between two snapshots is moderate. Incremental synchronization only synchronizes directories and files with differences, significantly improving performance. Furthermore, this invention can be applied to scenarios with large directory data volumes, such as billions of subdirectories and files, where numerous modifications are involved during the difference recording cycle. In such cases, raising the difference recording level to Level 2 or 3 can significantly reduce the storage space for difference records while expanding the granularity of difference records, thus greatly reducing the impact on the IO performance of the primary cluster. Simultaneously, this invention can also support abnormal scenarios, such as long-term network failures or long-term offline scenarios of the secondary cluster. It can adaptively raise the difference level to prevent the rapid expansion of detailed difference data from causing anomalies or failures in the primary cluster.
[0070] Specifically, the detailed difference recording algorithm in this embodiment of the invention includes:
[0071] All differences can be saved to a key-value database. For example, in a Ceph cluster, RocksDB can be used to store difference records. Key-value databases support efficient insert / modify / delete operations, as well as query and traversal operations.
[0072] The key value prefix for detailed difference data is detail_diff_snapid_hashid, where snapid is the current snapshot ID and hashid is used to distribute the difference data across all nodes in the file storage cluster to improve performance. The range of hash values can be selected according to the actual situation, and the hash value can be calculated using the inodes of the files / directories with differences.
[0073] There is a special key: detail_diff_snapid_hashid_total, which is used to record how many difference records have been recorded under this hash value, that is, the value is a number;
[0074] The other key is detail_diff_snapid_hashid_type_inode, where type represents a file or directory, inode is the inode number of that file or directory, and the core components of the value are as follows, which can be serialized during storage:
[0075] A bitmap that uses bits to represent events such as renaming, deletion, creation, metadata modification, extended attribute modification, and data modification of the file / directory;
[0076] nlink: the reference count of a file;
[0077] Initial path dentry list: the list of dentry names from the remote replication home directory to this inode at the time a difference first occurs in the current synchronization cycle;
[0078] Initial path inode list: the list of inode numbers along the path from the remote replication home directory to this inode at the time a difference first occurs in the current synchronization cycle;
[0079] Rename target path dentry list;
[0080] Rename target path inode list;
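The key layout and value fields described in [0072] to [0080] can be illustrated with a minimal sketch. The bit positions, the bucket count `HASH_BUCKETS`, the JSON serialization, and all class and function names below are assumptions made for illustration; the text only states which fields exist and that the value can be serialized.

```python
import json
from dataclasses import dataclass, field, asdict
from enum import IntFlag


class DiffEvent(IntFlag):
    """One bit per event type listed in [0075]; bit positions are assumed."""
    RENAME = 1 << 0
    DELETE = 1 << 1
    CREATE = 1 << 2
    METADATA = 1 << 3
    XATTR = 1 << 4
    DATA = 1 << 5


@dataclass
class DetailDiffValue:
    """Value stored under a detail_diff_snapid_hashid_type_inode key."""
    bitmap: int = 0
    nlink: int = 1
    initial_dentries: list = field(default_factory=list)
    initial_inodes: list = field(default_factory=list)
    rename_dentries: list = field(default_factory=list)
    rename_inodes: list = field(default_factory=list)

    def serialize(self) -> bytes:
        # JSON is only a stand-in; any serialization format would do.
        return json.dumps(asdict(self), sort_keys=True).encode()

    @staticmethod
    def deserialize(raw: bytes) -> "DetailDiffValue":
        return DetailDiffValue(**json.loads(raw.decode()))


HASH_BUCKETS = 16  # assumed range of hash values


def hash_id(inode: int) -> int:
    """Derive the hash bucket from the inode number (assumed: simple modulo)."""
    return inode % HASH_BUCKETS


def detail_key(snapid: int, kind: str, inode: int) -> str:
    """Key of one detailed difference record; kind is 'file' or 'dir'."""
    return f"detail_diff_{snapid}_{hash_id(inode)}_{kind}_{inode}"


def total_key(snapid: int, hashid: int) -> str:
    """Special key holding the record count of one hash bucket."""
    return f"detail_diff_{snapid}_{hashid}_total"
```

For example, under these assumptions a data change to file inode 35 in snapshot 7 would be stored under the key `detail_diff_7_3_file_35`.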
[0081] In this embodiment of the invention, the MDS records detailed differences by default at the beginning of each synchronization cycle. Whenever a directory or file is modified, the difference is recorded in the KV database in the format described above, and the bitmap is also cached in memory. A subsequent modification of the same type is checked against the in-memory bitmap; if it has already been recorded, it does not need to be written to disk again. The MDS is also responsible for maintaining the detail_diff_snapid_hashid_total values. A background task in the MDS periodically scans all detail_diff_snapid_hashid_total values and accumulates them. If the accumulated value exceeds the threshold, the difference recording level needs to be upgraded. The upgraded difference record is as follows:
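The recording path in [0081] can be sketched as follows. This is a simplified illustration, not the MDS implementation: a plain dictionary stands in for the key-value database, the stored value is reduced to the bitmap alone, and the class name `DetailDiffRecorder`, its methods, and the modulo hash are hypothetical.

```python
class DetailDiffRecorder:
    """Records detailed differences with an in-memory bitmap cache."""

    def __init__(self, snapid, buckets=16):
        self.snapid = snapid
        self.buckets = buckets   # assumed range of hash values
        self.kv = {}             # stand-in for the key-value database
        self.cache = {}          # (kind, inode) -> bitmap cached in memory

    def record(self, kind, inode, event_bit):
        """Record one event; return False if it was already recorded."""
        cached = self.cache.get((kind, inode), 0)
        if cached & event_bit:
            # Same event type already recorded: no database write needed.
            return False
        merged = cached | event_bit
        self.cache[(kind, inode)] = merged
        hashid = inode % self.buckets
        self.kv[f"detail_diff_{self.snapid}_{hashid}_{kind}_{inode}"] = merged
        if cached == 0:
            # First difference for this inode: bump the bucket counter.
            tkey = f"detail_diff_{self.snapid}_{hashid}_total"
            self.kv[tkey] = self.kv.get(tkey, 0) + 1
        return True

    def total_records(self):
        """Accumulate all per-bucket totals, as the background scan does."""
        return sum(v for k, v in self.kv.items() if k.endswith("_total"))

    def needs_upgrade(self, threshold):
        """True when the accumulated total exceeds the threshold."""
        return self.total_records() > threshold
```

A repeated data modification of the same file therefore touches only the in-memory cache, while the first modification of each inode also increments the bucket's `_total` counter.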
[0082] All differences are saved to a KV database. The key of the difference data is prefixed with `dir_diff_snapid_hashid`, where `snapid` is the current snapshot ID, and `hashid` is used to distribute the difference data across all nodes in the file storage cluster to improve performance. The range of hash values can be selected based on the actual situation. The hash value can be calculated from the inode numbers of the directories with differences. See Figure 3 for details.
[0083] In this embodiment of the invention, a special key, dir_diff_snapid_hashid_total, records how many difference records have been stored under this hash value; its value is a number. The other keys have the form dir_diff_snapid_hashid_inode, where inode is the inode number of the difference directory, and the value is empty. Here, a difference directory means only that some sub-file or sub-directory directly under this directory has changed. Changes inside nested sub-directories and changes to the directory itself are not included (if the directory itself changes, its parent directory is recorded as the difference directory).
[0084] In specific implementation, the algorithm for upgrading detailed differences to directory differences in this embodiment of the invention is as follows:
[0085] The MDS maintains a MAP table in which the key is the snapshot ID and the value is a set containing the inode numbers of all difference directories. When a subfile or subdirectory under a directory changes, the directory can be looked up quickly in memory, and a directory that is already present does not need to be written to disk again, which improves performance.
[0086] Iterate through all detailed difference records and record the parent inode number on each record's initial path as a difference directory. If the parent inode number does not exist in the MAP table, write it to disk and insert it into the MAP table.
[0087] If the record has a rename target path, the parent inode number on that path is also recorded as a difference directory. If the parent inode number does not exist in the MAP table, it is written to disk and inserted into the MAP table. Finally, the detailed difference record is deleted to release database space. During this process, the MDS is also responsible for maintaining dir_diff_snapid_hashid_total.
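The upgrade from detailed differences to directory differences described in [0085] to [0087] can be sketched as below. The sketch assumes that each path inode list ends with the changed inode itself, so the parent is the second-to-last entry; that assumption, the dictionary-based record format, and the function name `upgrade_to_dir_diffs` are illustrative and not taken from the text.

```python
def upgrade_to_dir_diffs(snapid, detail_records, buckets=16):
    """Convert detailed difference records into directory difference keys.

    detail_records maps a detailed key to a dict with 'initial_inodes' and
    'rename_inodes' lists. Records are deleted as they are converted.
    Returns the new key-value entries and the in-memory MAP table.
    """
    kv = {}                      # stand-in for the key-value database
    seen = {snapid: set()}       # MAP table: snapshot ID -> set of dir inodes

    def mark(parent):
        if parent in seen[snapid]:
            return               # already persisted, nothing to write
        hashid = parent % buckets
        kv[f"dir_diff_{snapid}_{hashid}_{parent}"] = b""   # empty value
        tkey = f"dir_diff_{snapid}_{hashid}_total"
        kv[tkey] = kv.get(tkey, 0) + 1
        seen[snapid].add(parent)

    for key in list(detail_records):
        rec = detail_records[key]
        # Both the initial path and the rename target path contribute a parent.
        for path in (rec["initial_inodes"], rec.get("rename_inodes", [])):
            if len(path) >= 2:
                mark(path[-2])
        del detail_records[key]  # release the detailed record
    return kv, seen
```

In this sketch a file renamed from directory 2 into directory 9 produces two directory difference keys, one for each parent, and its detailed record is removed.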
[0088] In this embodiment of the invention, a background task in the MDS periodically scans all dir_diff_snapid_hashid_total values and accumulates them. If the accumulated value exceeds the threshold, the difference recording level is upgraded again. After this upgrade, differences are no longer recorded and all existing difference records are cleared. This threshold is rarely reached in normal scenarios; it is typically reached only with very large file systems or during prolonged remote replication failures.
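The two threshold checks together form a simple escalation from Level 1 to Level 3. A minimal sketch follows, assuming count-based thresholds; the names `DiffLevel`, `next_level`, and `sync_mode` are hypothetical.

```python
from enum import IntEnum


class DiffLevel(IntEnum):
    DETAIL = 1      # Level 1: detailed differences
    DIRECTORY = 2   # Level 2: directory differences
    NONE = 3        # Level 3: no differences recorded


def next_level(level, detail_total, dir_total, detail_threshold, dir_threshold):
    """Return the recording level after one background scan."""
    if level == DiffLevel.DETAIL and detail_total > detail_threshold:
        return DiffLevel.DIRECTORY
    if level == DiffLevel.DIRECTORY and dir_total > dir_threshold:
        return DiffLevel.NONE
    return level


def sync_mode(level):
    """Level 3 implies a full synchronization; lower levels are incremental."""
    return "full" if level == DiffLevel.NONE else "incremental"
```

The level only moves upward within a synchronization cycle in this sketch; resetting to Level 1 at the start of the next cycle is left to the caller.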
[0089] The incremental synchronization algorithm in this embodiment of the invention specifically includes:
[0090] For cases with detailed differences, the synchronization program on the remote replication master reads each difference record, retrieves the latest values of files / directories from the new snapshot according to the difference type recorded in the difference bitmap, and synchronizes them to the slave. However, for cases with only directory differences, it is necessary to traverse the directories of the old and new snapshots, compare each subfile / subdirectory, generate detailed differences in real time, and synchronize the differences to the slave.
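For the detailed-difference case, the synchronization program only needs to decode the bitmap to know which aspects of a file or directory to read from the new snapshot. A minimal sketch, assuming the bit positions shown (the text does not specify them) and a hypothetical function name `sync_actions`:

```python
# Assumed bit positions for the event types named in the bitmap description.
EVENT_BITS = {
    "rename": 1 << 0,
    "delete": 1 << 1,
    "create": 1 << 2,
    "metadata": 1 << 3,
    "xattr": 1 << 4,
    "data": 1 << 5,
}


def sync_actions(bitmap):
    """List the difference types set in a detailed difference bitmap."""
    return [name for name, bit in EVENT_BITS.items() if bitmap & bit]
```

With these positions, `sync_actions(0b101000)` yields `["metadata", "data"]`, so only the metadata and data of that inode would be read from the new snapshot.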
[0091] For directories whose paths remain unchanged between the old and new snapshots, the following steps are taken: Sort all subdirectories / files in both directories by inode number and compare them one by one. For subfiles and subdirectories with differences in metadata / extended attributes / data, generate the corresponding difference tasks directly. For subfiles that exist only in the old directory, first generate deletion records, and for subfiles that exist only in the new directory, first generate creation records. For directories whose paths have changed, continue traversing the entire directory tree, examining only the descendant directories in the tree. Generate deletion records for the descendant directories in the old snapshot and creation records for the descendant directories in the new snapshot.
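The comparison of a directory whose path is unchanged is essentially a merge join over two inode-sorted child lists. The following is a sketch under assumed data structures: each snapshot's children are given as a dictionary from inode number to a dict with `metadata`, `xattrs`, and `data` entries, and the function name `compare_children` is hypothetical.

```python
def compare_children(old_children, new_children):
    """Compare the children of one directory in the old and new snapshots.

    Returns (tasks, deletions, creations), where tasks is a list of
    (changed_aspect, inode) pairs for children present in both snapshots.
    """
    old_sorted = sorted(old_children)
    new_sorted = sorted(new_children)
    i = j = 0
    tasks, deletions, creations = [], [], []
    while i < len(old_sorted) and j < len(new_sorted):
        o, n = old_sorted[i], new_sorted[j]
        if o == n:
            # Same inode in both snapshots: emit a task per changed aspect.
            for aspect in ("metadata", "xattrs", "data"):
                if old_children[o][aspect] != new_children[n][aspect]:
                    tasks.append((aspect, o))
            i += 1
            j += 1
        elif o < n:
            deletions.append(o)   # only in the old snapshot
            i += 1
        else:
            creations.append(n)   # only in the new snapshot
            j += 1
    deletions.extend(old_sorted[i:])
    creations.extend(new_sorted[j:])
    return tasks, deletions, creations
```

Sorting by inode number lets both lists be walked once, so the comparison cost is dominated by the sort rather than by pairwise lookups.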
[0092] For directories that have both creation and deletion records, merge the creation and deletion records into a single rename record.
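Merging a creation record and a deletion record into a rename record can be sketched as a join on the inode number. Matching by inode is an assumption here (the text only says that a directory has both records), and the function name `merge_renames` and the inode-to-path dictionaries are illustrative.

```python
def merge_renames(deletions, creations):
    """Merge matching deletion/creation records into rename records.

    deletions maps inode -> old path, creations maps inode -> new path.
    Returns (renames, remaining_deletions, remaining_creations), where
    renames maps inode -> (old path, new path).
    """
    both = deletions.keys() & creations.keys()
    renames = {ino: (deletions[ino], creations[ino]) for ino in both}
    remaining_deletions = {i: p for i, p in deletions.items() if i not in both}
    remaining_creations = {i: p for i, p in creations.items() if i not in both}
    return renames, remaining_deletions, remaining_creations
```

The same join applies later to files that have both creation and deletion records.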
[0093] After completing the above steps, all changed directories have had their differences recorded. At this point, it's possible to identify which directories were renamed, which were created, and which were deleted. Additionally, for all directories whose paths remained unchanged, synchronization tasks have been generated for their files.
[0094] Next, all directory differences are examined again:
[0095] For every directory rename record, generate a synchronization task that renames the directory. At the same time, traverse all subfiles of that directory in both the old and new snapshots, sort them by inode number, and compare them. Depending on the result, generate synchronization tasks for the metadata, data, and extended attributes of the subfiles, or generate creation and deletion records for the subfiles.
[0096] For every directory creation record, generate a synchronization task that creates the new directory.
[0097] At this point, synchronization tasks have been generated for all directory differences except directory deletions.
[0098] For all files that have both creation and deletion records, merge the creation and deletion records into a single rename record. At this point, the differences of all changed files have been recorded. For all file creation, deletion, and rename records, generate the corresponding file creation, deletion, and rename synchronization tasks.
[0099] Finally, for all directory deletion records, generate a synchronization task for deleting directories;
[0100] The generated synchronization tasks need to be executed in order from the root directory to the leaf directories / files, and the directory deletion tasks should be executed last, to prevent execution failures on the slave.
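The ordering rule in [0100] can be sketched as a sort on path depth with directory deletions pushed to the end. The `SyncTask` type, the operation names, and the use of path depth as the root-to-leaf order are assumptions; the text does not specify how directory deletion tasks are ordered among themselves, so the sketch simply applies the same depth order to them.

```python
from typing import NamedTuple


class SyncTask(NamedTuple):
    op: str     # hypothetical operation name, e.g. "mkdir", "write", "rmdir"
    path: str   # absolute path below the replicated home directory


def depth(path):
    """Number of path components; the root has depth 0."""
    return len([part for part in path.split("/") if part])


def order_tasks(tasks):
    """Order tasks root to leaf, with directory deletions last."""
    return sorted(tasks, key=lambda t: (t.op == "rmdir", depth(t.path)))
```

Because a parent always has a smaller depth than its children, a directory creation task is executed before any task that writes inside that directory.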
[0101] In summary, this invention, through extensive research, sets thresholds for data change and directory information difference. By comparing the data change and directory information difference with these thresholds, a specific hierarchical replication strategy is implemented. Specifically, this invention employs a multi-level difference recording method. Detailed differences are recorded in the early stages, resulting in the highest incremental synchronization performance. When the difference exceeds the threshold, directory differences are used, significantly reducing the storage space of the difference log and minimizing the impact on host access. This also avoids a full synchronization of the entire directory tree, improving synchronization performance to some extent. Furthermore, when directory differences also exceed the threshold, no further differences are recorded, indicating a significant change in the entire directory tree, and the next synchronization will be a full synchronization. Therefore, the multi-level difference recording method described in this invention can meet the needs of asynchronous remote file storage replication for different scenarios and scales with minimal resource consumption, thereby greatly improving the user experience.
[0102] A second embodiment of the present invention provides a distributed file storage asynchronous remote replication device, the device being disposed at the asynchronous remote replication master end. Referring to Figure 4, the device includes a first processing unit, a second processing unit, and a third processing unit;
[0103] The first processing unit is used to trigger the file system metadata server (MDS) of the asynchronous remote replication master to record detailed differences in the data changes of the asynchronous remote replication master, so that the first synchronization program on the asynchronous remote replication master can trigger the second synchronization program on the asynchronous remote replication slave to perform data synchronization based on the recorded detailed differences.
[0104] The second processing unit is configured to trigger the MDS to record directory differences of the data changes when the amount of data change at the asynchronous remote replication master exceeds a first preset threshold, and to cause the first synchronization program to trigger the second synchronization program to perform data synchronization based on the directory information differences.
[0105] The third processing unit is configured to, when the difference in directory information at the asynchronous remote replication master exceeds a second preset threshold, trigger the second synchronization program to perform full synchronization of the difference data.
[0106] In other words, in the process of synchronizing data between the master end where data changes and the slave end where data is synchronized, the embodiments of the present invention set up a copying device on the master end. This device is used to record multi-level differences in the changed data, thereby enabling asynchronous remote copying of file storage in different scenarios and of different scales with minimal resource consumption, thus greatly improving the user experience.
[0107] Furthermore, the second processing unit in this embodiment of the invention is also used to, when the amount of data change at the asynchronous remote replication master exceeds a first preset threshold, trigger the MDS to record directory differences of data changes, traverse all directory differences, restore the actual detailed difference data of each directory and file through actual comparison of old and new snapshots, and generate a task to synchronize the detailed difference data to the synchronization program at the asynchronous remote replication slave, so that the synchronization program at the asynchronous remote replication slave can parse the difference data sent by the synchronization program at the asynchronous remote replication master and write the data to the corresponding location.
[0108] The relevant content of the embodiments of the present invention can be understood by referring to the method embodiments of the present invention, and will not be discussed in detail here.
[0109] A third embodiment of the present invention provides a computer-readable storage medium storing a computer program that maps signals. When executed by at least one processor, the computer program implements any of the distributed file storage asynchronous remote copying methods described in the first embodiment of the present invention.
[0110] The relevant content of the embodiments of the present invention can be understood by referring to the method embodiments of the present invention, and will not be discussed in detail here.
[0111] Although preferred embodiments of the invention have been disclosed for illustrative purposes, those skilled in the art will recognize that various modifications, additions, and substitutions are possible, and therefore the scope of the invention should not be limited to the embodiments described above.
Claims
1. A distributed file storage asynchronous remote replication method, characterized in that it comprises: The asynchronous remote replication master uses a file system metadata server (MDS) to record detailed differences in data changes, so that the synchronization program on the asynchronous remote replication master can trigger the synchronization program on the asynchronous remote replication slave to synchronize data based on the recorded detailed differences. When the amount of data change at the asynchronous remote replication master exceeds a first preset threshold, the MDS at the asynchronous remote replication master is triggered to record directory differences of the data changes, and the synchronization program at the asynchronous remote replication master triggers the synchronization program at the asynchronous remote replication slave to perform data synchronization based on the directory information differences. When the directory information difference at the asynchronous remote replication master exceeds a second preset threshold, the synchronization program of the asynchronous remote replication master performs full synchronization of all data changes on the asynchronous remote replication master. The step in which the synchronization program on the asynchronous remote replication master end triggers the synchronization program on the asynchronous remote replication slave end to perform data synchronization based on the directory information differences includes: The synchronization program on the asynchronous remote replication master end traverses all directory differences, restores the actual detailed difference data of each directory and file by comparing the old and new snapshots, and generates a task to synchronize the detailed difference data to the synchronization program on the asynchronous remote replication slave end. 
The synchronization program at the asynchronous remote replication slave end parses the difference data sent by the synchronization program at the asynchronous remote replication master end and writes the data to the corresponding location. The synchronization program on the asynchronous remote replication master end traverses all directory differences, reconstructs the actual detailed difference data for each directory and file by comparing the old and new snapshots, and generates a task to synchronize the detailed difference data to the synchronization program on the asynchronous remote replication slave end, including: For directories whose paths remain unchanged between the old and new snapshots, the following steps are taken: Sort all subdirectories / files in both directories by inode number and compare them one by one. For subfiles and subdirectories with differences in metadata / extended attributes / data, generate corresponding difference tasks directly for the differences. For subfiles unique to the old directory, generate deletion records first, and for subfiles unique to the new directory, generate creation records first. For directories whose paths have changed, continue traversing the entire directory tree and only examine the descendant directories in the directory tree. Generate deletion records for the descendant directories in the old snapshot and create records for these descendant directories in the new snapshot. For directories that have both creation and deletion records, merge the creation and deletion records into a single rename record; For all directory rename records, generate a synchronization task for the rename directory. At the same time, in the directory of the new and old snapshots, traverse all its subfiles, sort them by inode and compare them. Generate synchronization tasks for the metadata, data, and extended attributes of the subfiles, or create and delete records of the subfiles, as needed. 
For every directory creation record, generate a synchronization task for the new directory; For all files that have both creation and deletion records, merge the creation and deletion records into a single rename record. For all file creation, deletion, and rename records, generate corresponding file creation, deletion, and rename synchronization tasks; For all directory deletion records, generate a synchronization task for the deleted directories; The generated synchronization tasks are executed in order from the root directory to the leaf directories / files to prevent execution failures on the slave end.
2. The method according to claim 1, characterized in that, The first preset threshold is a detailed difference count threshold, a difference record time threshold, or a detailed difference capacity threshold; The second preset threshold is the threshold for the number of directory differences, the threshold for the time of difference records, or the threshold for the capacity of directory differences.
3. The method according to claim 1, characterized in that, The method of recording detailed differences in data changes through asynchronous remote replication of the master's file system metadata server (MDS) includes: When difference data is generated, the MDS at the asynchronous remote replication master end records the difference data through distributed key-value storage. The key value of the difference data is prefixed with detail_diff_snapid_type_inode, where inode is the index number of the file or directory with difference, type represents the file or directory, snapid is the current snapshot ID, and detail_diff is a fixed prefix that represents the stored detailed difference information.
4. The method according to claim 3, characterized in that, The method further includes: When the total number of difference records is less than the threshold for the number of detailed difference records, or the difference record time is greater than the threshold for the difference record time, or the data difference capacity is greater than the threshold for the detailed difference capacity, the synchronization process of the asynchronous remote replication master performs data synchronization based on the recorded detailed differences, including: The synchronization process of the asynchronous remote replication master reads each difference data record and, according to the difference type of the difference bitmap record, reads the latest value of the file / directory from the snapshot for synchronization.
5. The method according to claim 4, characterized in that, The directory differences in the MDS record data changes at the asynchronous remote replication master include: The asynchronous remote replication master's MDS first converts the existing detailed differences into directory differences and deletes the recorded detailed differences. At the same time, for the new metadata and data changes generated before the arrival of a new synchronization cycle, it records the parent directory where the differences exist.
6. The method according to claim 5, characterized in that, The method further includes: The synchronization programs of both the asynchronous remote replication master and the asynchronous remote replication slave delete previously stored snapshots according to a preset deletion cycle.
7. A distributed file storage asynchronous remote replication device, wherein the device is installed at the asynchronous remote replication master, characterized in that, The device includes a first processing unit, a second processing unit, and a third processing unit; The first processing unit is used to trigger the file system metadata server (MDS) of the asynchronous remote replication master to record detailed differences in the data changes of the asynchronous remote replication master, so that the first synchronization program on the asynchronous remote replication master can trigger the second synchronization program on the asynchronous remote replication slave to perform data synchronization based on the recorded detailed differences. The second processing unit is configured to trigger the MDS to record directory differences of the data changes when the amount of data change at the asynchronous remote replication master exceeds a first preset threshold, and to cause the first synchronization program to trigger the second synchronization program to perform data synchronization based on the directory information differences. The third processing unit is used to trigger the second synchronization program to perform full synchronization of the difference data when the directory information difference on the asynchronous remote replication master exceeds a second preset threshold. 
The second processing unit is further configured to, when the amount of data change at the asynchronous remote replication master exceeds a first preset threshold, trigger the MDS to record directory differences of data changes, traverse all directory differences, restore the actual detailed difference data of each directory and file through the actual comparison of the old and new snapshots, and generate a task to synchronize the detailed difference data to the synchronization program at the asynchronous remote replication slave, so that the synchronization program at the asynchronous remote replication slave can parse the difference data sent by the synchronization program at the asynchronous remote replication master and write the data to the corresponding location; The synchronization program on the asynchronous remote replication master end traverses all directory differences, reconstructs the actual detailed difference data for each directory and file by comparing the old and new snapshots, and generates a task to synchronize the detailed difference data to the synchronization program on the asynchronous remote replication slave end, including: For directories whose paths remain unchanged between the old and new snapshots, the following steps are taken: Sort all subdirectories / files in both directories by inode number and compare them one by one. For subfiles and subdirectories with differences in metadata / extended attributes / data, generate corresponding difference tasks directly for the differences. For subfiles unique to the old directory, generate deletion records first, and for subfiles unique to the new directory, generate creation records first. For directories whose paths have changed, continue traversing the entire directory tree and only examine the descendant directories in the directory tree. Generate deletion records for the descendant directories in the old snapshot and create records for these descendant directories in the new snapshot. 
For directories that have both creation and deletion records, merge the creation and deletion records into a single rename record; For every directory rename record, generate a synchronization task that renames the directory. At the same time, traverse all subfiles of that directory in both the old and new snapshots, sort them by inode number, and compare them. Generate synchronization tasks for the metadata, data, and extended attributes of the subfiles, or creation and deletion records for the subfiles, as needed. For every directory creation record, generate a synchronization task for the new directory; For all files that have both creation and deletion records, merge the creation and deletion records into a single rename record. For all file creation, deletion, and rename records, generate corresponding file creation, deletion, and rename synchronization tasks; For all directory deletion records, generate a synchronization task for the deleted directories; The generated synchronization tasks are executed in order from the root directory to the leaf directories / files to prevent execution failures on the slave end.
8. A computer-readable storage medium, characterized in that, The computer-readable storage medium stores a computer program with signal mapping, which, when executed by at least one processor, implements the distributed file storage asynchronous remote copying method according to any one of claims 1-6.