Data reconciliation method and electronic device

By acquiring and utilizing a multi-level collaborative reconciliation method based on backup metadata, the problem of new data being mixed into the reconciliation scope during cloud platform backup was solved, achieving accurate reconciliation across all dimensions and improving business stability.

CN122309250A · Pending · Publication Date: 2026-06-30 · DAWNING CLOUD COMPUTING TECH CO LTD +1

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Applications (China)
Current Assignee / Owner: DAWNING CLOUD COMPUTING TECH CO LTD
Filing Date: 2026-06-03
Publication Date: 2026-06-30

AI Technical Summary

Technical Problem

During the cloud platform backup process, new data is generated in real time and mixed into the reconciliation scope, resulting in inaccurate reconciliation results.

Method used

By acquiring the backup metadata dataset of the backup task, including the first metadata, the second metadata, and the third metadata, and utilizing the correlation mapping relationship between task-level information, physical storage location, and global storage information, multi-level collaborative reconciliation can be achieved, avoiding the impact of new data on the reconciliation results during the backup process.

Benefits of technology

It achieves accurate reconciliation of original data and backup original data across all dimensions, improves the accuracy and reliability of reconciliation results, reduces multi-task concurrency conflicts, and ensures the stability of cloud platform business operations and the reliability of data security verification.

✦ Generated by Eureka AI based on patent content.


Abstract

This invention provides a data reconciliation method and electronic device, applicable to the field of cloud platform technology. The data reconciliation method includes: responding to a data reconciliation request received from a cloud platform; obtaining a backup metadata dataset for a backup task based on a task identifier and a backup version identifier in the data reconciliation request; the backup metadata dataset includes first metadata, second metadata, and third metadata; the first metadata includes task-level information of the backup task; the second metadata includes the physical storage location of the original backup data; and the third metadata includes global storage information of the original backup data and the association mapping relationship between the physical storage location and the global storage information; determining the backup process status of the backup task based on the task status in the task-level information; and, if the backup process status is complete, reconciling the original data and the original backup data based on the first metadata, second metadata, and third metadata to obtain a reconciliation result.

Description

Technical Field

[0001] This invention relates to the field of cloud platforms, and more specifically to a data reconciliation method and electronic device.

Background Technology

[0002] Currently, cloud platforms commonly deploy backup tasks to archive and retain original business data, generate backup data, and persistently store it. Simultaneously, metadata is generated to record relevant information about the backup tasks, enabling disaster recovery and reproducibility of business data. To ensure the consistency of backup data, original data, and metadata, cloud platforms must perform reconciliation tasks to identify and resolve anomalies such as data errors, missing data, and inconsistencies.

[0003] In cloud backup scenarios, new data is generated in real time during the backup process and may be mixed into the reconciliation scope, making the reconciliation results inaccurate.

Summary of the Invention

[0004] In view of the above problems, the present invention provides an improved data reconciliation method and electronic device.

[0005] According to a first aspect of the present invention, a data reconciliation method is provided, comprising: responding to receiving a data reconciliation request from a cloud platform, obtaining a backup metadata dataset of a backup task based on a task identifier and a backup version identifier in the data reconciliation request, wherein the backup task is used to back up the original data of a business on the cloud platform via the backup metadata dataset to obtain backup original data corresponding to the backup version identifier, the backup metadata dataset includes first metadata, second metadata, and third metadata, the first metadata including task-level information of the backup task, the second metadata including the physical storage location of the backup original data, and the third metadata including global storage information of the backup original data and the association mapping relationship between the physical storage location and the global storage information; determining the backup process status of the backup task based on the task status in the task-level information; and, if the backup process status is complete, reconciling the original data and the backup original data based on the first metadata, the second metadata, and the third metadata to obtain a reconciliation result.

[0006] According to an embodiment of the present invention, in response to receiving a data reconciliation request from a cloud platform, the backup metadata dataset of the backup task is obtained based on the task identifier and backup version identifier in the data reconciliation request to accurately pinpoint the scope of the backup task; the backup process status of the backup task is determined based on the task status in the task layer information; and the reconciliation process is set after the backup task is completed to avoid new data generated during the backup process from affecting the reconciliation results. Based on the first metadata, second metadata, and third metadata in the backup metadata set, the original data and the backup original data are reconciled. The first metadata includes the task layer information of the backup task, the second metadata includes the physical storage location of the backup original data, and the third metadata includes the global storage information of the backup original data and the association mapping relationship between the physical storage location and the global storage information, realizing multi-level collaborative reconciliation of the task layer (first metadata), physical addressing layer (second metadata), and logical storage rule layer (third metadata), achieving full-dimensional and accurate reconciliation of the original data and the backup original data.

[0007] According to an embodiment of the present invention, the third metadata further includes data management information of the backup original data. Reconciling the original data and the backup original data based on the first metadata, the second metadata, and the third metadata to obtain a reconciliation result includes: when the task identifier corresponds to the file identifier of the third metadata, matching the second metadata and the third metadata according to the association mapping relationship in the third metadata, the global storage information, and the number of entries in the third metadata to obtain a matching result; when the matching result indicates that the second metadata and the third metadata match, retrieving the backup original data from the physical storage location according to the data management information; and performing a consistency comparison between the original data and the backup original data to obtain the reconciliation result.

[0008] According to an embodiment of the present invention, the task identifier corresponds to the file identifier of the third metadata, establishing a stable association between the task layer of the original data and the third metadata. The association between the second and third metadata is verified through the association mapping relationship, global storage information, and the number of entries in the third metadata. When the second and third metadata match, the backup original data is retrieved from the physical storage location in the second metadata using the data management information in the third metadata, thus achieving a stable association between the source original data, various metadata, and the backup original data. Therefore, reconciliation between the original data and the backup original data can be completed. Simultaneously, multiple metadata are used to complete multi-level collaborative reconciliation of association mapping relationships, storage addresses, number of entries, and content consistency, achieving comprehensive and accurate reconciliation between the original data and the backup original data.

[0009] According to an embodiment of the present invention, the second metadata and the third metadata are matched based on the association mapping relationship in the third metadata, global storage information, and the number of entries in the third metadata to obtain a matching result, including: determining the target second metadata that has an association mapping relationship with the third metadata from the second metadata; when the global storage information matches the storage ownership information of the physical storage location of the target second metadata, comparing the number of entries in the third metadata with the number of entries in the target second metadata; when the number of entries in the third metadata is consistent with the number of entries in the target second metadata, obtaining a matching result indicating that the second metadata and the third metadata are matched.
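The three-step matching described in [0009] can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation of the invention: the record classes `SecondMeta` and `ThirdMeta`, their fields (`mapping_key`, `storage_owner`, `global_owners`, `entry_count`), and the use of key equality to represent the association mapping relationship are all hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Optional


@dataclass(frozen=True)
class SecondMeta:
    """Physical addressing layer record (hypothetical fields)."""
    mapping_key: str    # assumed stand-in for the association mapping relationship
    storage_owner: str  # storage ownership information of the physical storage location
    entry_count: int


@dataclass(frozen=True)
class ThirdMeta:
    """Logical storage rule layer record (hypothetical fields)."""
    mapping_key: str
    global_owners: FrozenSet[str]  # ownership listed in the global storage information
    entry_count: int


def find_target(second: Iterable[SecondMeta], third: ThirdMeta) -> Optional[SecondMeta]:
    """Step 1: determine the target second metadata mapped to the third metadata."""
    return next((s for s in second if s.mapping_key == third.mapping_key), None)


def metadata_match(second: Iterable[SecondMeta], third: ThirdMeta) -> bool:
    """Return True only if all three checks pass, in order."""
    target = find_target(second, third)
    if target is None:
        return False
    # Step 2: global storage information must cover the target's storage ownership.
    if target.storage_owner not in third.global_owners:
        return False
    # Step 3: the entry counts of the two layers must be consistent.
    return target.entry_count == third.entry_count
```

Each early return corresponds to one way the two metadata layers can fail to match, so the function stops at the first failed check rather than comparing entry counts for unrelated records.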

[0010] According to embodiments of the present invention, data management information includes at least one of the following: data parsing method, data compression method, data decryption method, and data encapsulation method.
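To illustrate how data management information could drive retrieval of the backup original data and the subsequent consistency comparison, the sketch below assumes that the backup copy is either stored as-is or compressed with zlib, and that consistency is checked by comparing SHA-256 digests. Both choices, and the names `DataManagementInfo`, `restore`, and `is_consistent`, are assumptions made for illustration; the source does not specify a compression or comparison algorithm.

```python
import hashlib
import zlib
from dataclasses import dataclass


@dataclass(frozen=True)
class DataManagementInfo:
    """Hypothetical subset of the data management information."""
    compression: str = "none"  # "none" or "zlib" in this sketch


def restore(stored: bytes, info: DataManagementInfo) -> bytes:
    """Undo the storage-side transformation described by the management info."""
    if info.compression == "zlib":
        return zlib.decompress(stored)
    if info.compression == "none":
        return stored
    raise ValueError(f"unsupported compression method: {info.compression}")


def is_consistent(original: bytes, stored: bytes, info: DataManagementInfo) -> bool:
    """Compare the original data with the restored backup copy by digest."""
    restored = restore(stored, info)
    return hashlib.sha256(original).digest() == hashlib.sha256(restored).digest()
```

Comparing digests rather than raw bytes is one possible design choice: it lets the two sides be hashed independently, which may matter when the original data and the backup copy reside on different storage systems.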

[0011] According to an embodiment of the present invention, the method further includes: obtaining a matching result indicating a mismatch between the second metadata and the third metadata when any of the following conditions are met: the task identifier does not correspond to the file identifier of the third metadata; there is no target second metadata associated with the third metadata in the second metadata; the global storage information does not match the storage ownership information of the physical storage location of the target second metadata; the number of entries in the third metadata is inconsistent with the number of entries in the target second metadata.

[0012] According to an embodiment of the present invention, the above method further includes: marking the third metadata as erroneous metadata when the task identifier and the file identifier of the third metadata do not correspond; and correcting the number of entries of the target second metadata and the associated mapping relationship based on the number of entries of the third metadata when the number of entries of the third metadata is inconsistent with the number of entries of the target second metadata and there are no abnormalities in the backup original data.

[0013] According to embodiments of the present invention, when the task identifier and the file identifier of the third metadata do not correspond, the third metadata is promptly marked as erroneous metadata. This enables rapid and accurate identification of abnormal metadata with disordered identifier associations, prevents abnormal metadata from participating in subsequent mapping matching and data reconciliation, and prevents addressing deviations caused by the propagation of erroneous metadata. Furthermore, provided that the backup original data itself is not abnormal, if the number of entries in the third metadata is inconsistent with the number of entries in the target second metadata, the number of entries and the association mapping relationship of the target second metadata are adaptively corrected based on the reliable number of entries in the third metadata. This automatically repairs discrepancies in the number of entries and disordered mapping links between logical-layer metadata and physical-layer metadata without altering the backup original data, effectively improving the standardization of cloud platform metadata management, self-repair capabilities, and the accuracy of data reconciliation results.
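The marking and correction behaviour described in [0012] and [0013] might look like the following sketch. It assumes that "correspondence" between the task identifier and the file identifier can be modelled as string equality and that correcting the association mapping relationship means re-pointing a key; both are simplifications, and all class, field, and function names are hypothetical.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ThirdMeta:
    file_id: str
    mapping_key: str
    entry_count: int
    erroneous: bool = False


@dataclass(frozen=True)
class SecondMeta:
    mapping_key: str
    entry_count: int


def mark_if_misassociated(task_id: str, third: ThirdMeta) -> ThirdMeta:
    """Flag third metadata whose file identifier does not correspond to the task."""
    if third.file_id != task_id:  # assumption: correspondence == equality
        return replace(third, erroneous=True)
    return third


def correct_second(second: SecondMeta, third: ThirdMeta, backup_data_ok: bool) -> SecondMeta:
    """Align the target second metadata with the third metadata.

    The correction is applied only when the backup original data is not
    abnormal and the third metadata has not been marked erroneous.
    """
    if third.erroneous or not backup_data_ok:
        return second
    if second.entry_count != third.entry_count:
        return replace(second, entry_count=third.entry_count,
                       mapping_key=third.mapping_key)
    return second
```

The records are immutable, so each function returns a corrected copy and leaves the input untouched, which mirrors the stated goal of repairing metadata without altering the backup original data.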

[0014] According to an embodiment of the present invention, the above method further includes: in the case that there is inconsistency between the target second metadata and the backup original data, determining the abnormal instance from the backup original data based on the inconsistency information, clearing the abnormal instance and performing a full backup, wherein the inconsistency information includes at least one of the following: mismatch in the number of physical addresses, or inconsistency in data content.
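As a hedged sketch of how abnormal instances could be determined from the two kinds of inconsistency information named in [0014], the code below compares, per instance, the physical address count and a content checksum recorded in the target second metadata against the values observed in the backup original data. The `InstanceView` record, the per-instance dictionaries, and the reason labels are hypothetical; the full backup that follows clearing is left to the caller.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class InstanceView:
    address_count: int  # number of physical addresses
    checksum: str       # digest of the data content (assumed representation)


def find_abnormal(expected: Dict[str, InstanceView],
                  actual: Dict[str, InstanceView]) -> Dict[str, List[str]]:
    """Map each abnormal instance id to the inconsistencies found for it."""
    report: Dict[str, List[str]] = {}
    for instance_id in expected.keys() & actual.keys():
        exp, act = expected[instance_id], actual[instance_id]
        reasons = []
        if act.address_count != exp.address_count:
            reasons.append("address_count_mismatch")
        if act.checksum != exp.checksum:
            reasons.append("content_mismatch")
        if reasons:
            report[instance_id] = reasons
    return report


def clear_abnormal(actual: Dict[str, InstanceView],
                   report: Dict[str, List[str]]) -> Dict[str, InstanceView]:
    """Drop abnormal instances; the caller would then trigger a full backup."""
    return {k: v for k, v in actual.items() if k not in report}
```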

[0015] According to an embodiment of the present invention, the method further includes: continuing to execute the backup task when the backup process status is in progress, and performing reconciliation once the backup process status switches to complete; and delaying the execution time of the backup task when the backup process status is pending execution.

[0016] According to embodiments of the present invention, when the backup process is in progress, the backup task continues to run, ensuring its smooth progress. Reconciliation is performed only after the backup process is completed, minimizing interference with the normal backup task's execution and preventing newly added data during backup from entering the reconciliation scope. This effectively prevents incremental new data from interfering with the reconciliation baseline and causing distortion in the reconciliation results. Simultaneously, a timing avoidance mechanism is implemented for pending backup tasks, isolating newly added data generated during their execution to prevent data disturbance and result interference during the reconciliation verification process. This improves the accuracy and reliability of data reconciliation results, while also rationally coordinating the execution sequence of cloud platform backup and reconciliation tasks, reducing multi-task concurrency conflicts, and enhancing overall business operation stability and data security verification reliability.

[0017] According to an embodiment of the present invention, the above method further includes: in the event of loss of original data, repairing the original data using historical backup original data.

[0018] A second aspect of the present invention provides a data reconciliation apparatus, comprising: an acquisition module, configured to, in response to receiving a data reconciliation request from a cloud platform, acquire a backup metadata dataset of a backup task based on a task identifier and a backup version identifier in the data reconciliation request, wherein the backup task is used to back up the original data of a business on the cloud platform via the backup metadata dataset to obtain backup original data corresponding to the backup version identifier, the backup metadata dataset including first metadata, second metadata, and third metadata, the first metadata including task-level information of the backup task, the second metadata including the physical storage location of the backup original data, and the third metadata including global storage information of the backup original data and the association mapping relationship between the physical storage location and the global storage information; a determination module, configured to determine the backup process status of the backup task based on the task status in the task-level information; and a reconciliation module, configured to, when the backup process status is complete, reconcile the original data and the backup original data based on the first metadata, the second metadata, and the third metadata to obtain a reconciliation result.

[0019] A third aspect of the present invention provides an electronic device comprising: one or more processors; and a memory for storing one or more computer programs, wherein the one or more processors execute the one or more computer programs to implement the steps of the method described above.

[0020] A fourth aspect of the present invention also provides a computer-readable storage medium having a computer program or instructions stored thereon, wherein the computer program or instructions, when executed by a processor, implement the steps of the above-described method.

[0021] A fifth aspect of the present invention also provides a computer program product, including a computer program or instructions that, when executed by a processor, implement the steps of the above-described method.

[0022] According to an embodiment of the present invention, in response to receiving a data reconciliation request from a cloud platform, the backup metadata dataset of the backup task is obtained based on the task identifier and backup version identifier in the data reconciliation request to accurately pinpoint the scope of the backup task; the backup process status of the backup task is determined based on the task status in the task layer information; and the reconciliation process is set after the backup task is completed to avoid new data generated during the backup process from affecting the reconciliation results. Based on the first metadata, second metadata, and third metadata in the backup metadata set, the original data and the backup original data are reconciled. The first metadata includes the task layer information of the backup task, the second metadata includes the physical storage location of the backup original data, and the third metadata includes the global storage information of the backup original data and the association mapping relationship between the physical storage location and the global storage information, realizing multi-level collaborative reconciliation of the task layer (first metadata), physical addressing layer (second metadata), and logical storage rule layer (third metadata), achieving full-dimensional and accurate reconciliation of the original data and the backup original data.

Brief Description of the Drawings

[0023] The above-described features, other objects, and advantages of the present invention will become clearer from the following description of embodiments of the invention with reference to the accompanying drawings, in which:

[0024] Figure 1 illustrates an application scenario of the data reconciliation method according to an embodiment of the present invention;

[0025] Figure 2 schematically shows a flowchart of a data reconciliation method according to an embodiment of the present invention;

[0026] Figure 3 is a schematic diagram of a data reconciliation method according to another embodiment of the present invention;

[0027] Figure 4 is a schematic diagram of the structure of a data reconciliation device according to an embodiment of the present invention; and

[0028] Figure 5 schematically shows a block diagram of an electronic device suitable for implementing a data reconciliation method according to an embodiment of the present invention.

Detailed Implementation

[0029] Hereinafter, embodiments of the present invention will be described with reference to the accompanying drawings. However, it should be understood that these descriptions are exemplary only and are not intended to limit the scope of the invention. In the following detailed description, numerous specific details are set forth to provide a thorough understanding of the embodiments of the invention for ease of explanation. However, it will be apparent that one or more embodiments may be practiced without these specific details. Furthermore, descriptions of well-known structures and techniques are omitted in the following description to avoid unnecessarily obscuring the concept of the invention.

[0030] The terminology used herein is for the purpose of describing particular embodiments only and is not intended to limit the invention. The terms “comprising,” “including,” etc., as used herein indicate the presence of the stated features, steps, operations, and / or components, but do not exclude the presence or addition of one or more other features, steps, operations, or components.

[0031] All terms used herein (including technical and scientific terms) have the meanings commonly understood by those skilled in the art, unless otherwise defined. It should be noted that the terms used herein are to be interpreted in a manner consistent with the context of this specification, and not in an idealized or overly rigid way.

[0032] When using expressions such as "at least one of A, B and C", they should generally be interpreted in accordance with the meaning that is commonly understood by those skilled in the art (e.g., "a system having at least one of A, B and C" should include, but is not limited to, a system having A alone, a system having B alone, a system having C alone, a system having A and B, a system having A and C, a system having B and C, and / or a system having A, B and C, etc.).

[0033] In the technical solution of this invention, the user information (including but not limited to user personal information, user image information, user device information, such as location information) and data (including but not limited to data used for analysis, stored data, displayed data, etc.) involved are all information and data authorized by the user or fully authorized by all parties. Furthermore, the collection, storage, use, processing, transmission, provision, disclosure, and application of related data all comply with relevant laws, regulations, and standards, take necessary confidentiality measures, do not violate public order and good morals, and provide corresponding operation entry points for users to choose to authorize or refuse.

[0034] Currently, cloud platforms commonly deploy backup tasks to archive and retain original business data, generate backup data, and persistently store it. Simultaneously, metadata is generated to record relevant information about the backup tasks, enabling disaster recovery and reproducibility of business data. To ensure the consistency of backup data, original data, and metadata, cloud platforms must perform reconciliation tasks to identify and resolve anomalies such as data errors, missing data, and inconsistencies.

[0035] In cloud backup scenarios, the backup process generates new data in real time, which may mix the new data into the reconciliation scope, making the reconciliation results inaccurate.

[0036] An embodiment of the present invention provides a data reconciliation method, comprising: responding to receiving a data reconciliation request from a cloud platform, obtaining a backup metadata dataset of a backup task based on a task identifier and a backup version identifier in the data reconciliation request, wherein the backup task is used to back up the original data of a business on the cloud platform via the backup metadata dataset to obtain backup original data corresponding to the backup version identifier, the backup metadata dataset includes first metadata, second metadata, and third metadata, the first metadata includes task-level information of the backup task, the second metadata includes the physical storage location of the backup original data, and the third metadata includes global storage information of the backup original data and the association mapping relationship between the physical storage location and the global storage information; determining the backup process status of the backup task based on the task status in the task-level information; and, if the backup process status is complete, reconciling the original data and the backup original data based on the first metadata, the second metadata, and the third metadata to obtain a reconciliation result.

[0037] Figure 1 illustrates an application scenario of the data reconciliation method according to an embodiment of the present invention.

[0038] As shown in Figure 1, the front-end 101 can be used for user input of backup / data reconciliation requests; the back-end 102 can be used to verify the backup / data reconciliation requests, encapsulate parameters, and call the underlying service 103. The underlying service 103 can execute backup / data reconciliation tasks. For example, the underlying service 103 is configured to execute the core logic of backup and data reconciliation tasks, including cloud platform container running status verification, three-layer metadata progressive comparison, and reading, writing, and verification of backup original data.

[0039] If verification shows that the container 104 on the cloud platform is running normally, the backup process status of the backup task is determined from the task layer information in the first metadata 105.

[0040] If the backup process is in a completed state, the task identifier in the first metadata 105 is first matched with the file identifier in the third metadata 107. The second metadata 106 is then matched with the third metadata 107 based on the association mapping relationship, global storage information, and the number of entries in the third metadata 107.

[0041] If the task identifier in the first metadata 105 matches the file identifier in the third metadata 107, and the second metadata 106 matches the third metadata 107, the backup original data 108 is retrieved from the physical storage location in the second metadata 106 according to the data management information in the third metadata 107.

[0042] A consistency comparison is performed between the original data 109 and the backup original data 108 to obtain the reconciliation result. The reconciliation result is then fed back to the cloud platform's container 104, underlying service 103, backend 102, and frontend 101 in sequence.

[0043] If verification shows that the running status of the container 104 on the cloud platform is abnormal, the process proceeds directly to termination 110. Alternatively, a clear message can be returned to frontend 101: "Container status is abnormal, reconciliation task cannot be executed, please try again later." By verifying the running status of containers before data reconciliation, invalid reconciliation requests under abnormal container scenarios are filtered out, avoiding reconciliation failures or result distortions due to service unavailability, thus ensuring the stability and reliability of the reconciliation task and results.

[0044] Figure 2 schematically shows a flowchart of a data reconciliation method according to an embodiment of the present invention.

[0045] As shown in Figure 2, the data reconciliation method in this embodiment includes operations S210 to S230.

[0046] In operation S210, in response to receiving a data reconciliation request from the cloud platform, the backup metadata dataset of the backup task is obtained based on the task identifier and backup version identifier in the data reconciliation request. The backup task is used to back up the original data of the business on the cloud platform through the backup metadata dataset to obtain the backup original data corresponding to the backup version identifier. The backup metadata dataset includes first metadata, second metadata, and third metadata. The first metadata includes the task layer information of the backup task, the second metadata includes the physical storage location of the backup original data, and the third metadata includes the global storage information of the backup original data and the association mapping relationship between the physical storage location and the global storage information.

[0047] According to embodiments of the present invention, a cloud platform provides a service model that allows computing resources to be provided on demand via the network. Due to the characteristics of cloud platforms, such as underlying storage virtualization, dynamic changes in physical addresses, multi-layer mapping decoupling (e.g., layer-by-layer mapping between original data, mapping layer relationships, and the physical storage location of backup original data), asynchronous eventual consistency between metadata and backup data, block-level incremental and deduplication architecture, multi-tenant elastic scaling, and cross-domain replication, full content comparison is limited by cloud performance and cost. This ultimately results in complex reconciliation logic, large verification errors, and a task implementation difficulty far exceeding that of local backup architectures.

[0048] According to an embodiment of the present invention, a data reconciliation request can be a request instruction initiated by a client or business unit to the cloud platform, which is used to request the cloud platform to reconcile the original data of the corresponding backup version under the backup task with the original backup data.

[0049] According to an embodiment of the present invention, the task identifier can be a unique code for the backup task, and the backup version identifier can be a unique version identifier generated for different backup time points and different backup batches under the same backup task. The task identifier and the backup version identifier can accurately identify the backup metadata dataset and the original backup data to be processed for reconciliation.
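Because the task identifier and the backup version identifier together pinpoint one backup metadata dataset, the lookup in operation S210 can be pictured as a keyed retrieval. The sketch below is illustrative only: the in-memory `catalog` dictionary, the `MetadataSet` tuple, and the sample identifiers are hypothetical stand-ins for whatever metadata store a cloud platform actually uses.

```python
from typing import Dict, List, NamedTuple, Optional, Tuple


class MetadataSet(NamedTuple):
    first: dict         # task-level information
    second: List[dict]  # physical storage locations
    third: List[dict]   # global storage information and association mapping


Catalog = Dict[Tuple[str, str], MetadataSet]


def get_metadata_set(catalog: Catalog, task_id: str,
                     version_id: str) -> Optional[MetadataSet]:
    """Return the dataset for exactly one backup version of one task, if any."""
    return catalog.get((task_id, version_id))


# Hypothetical catalog holding two versions of the same backup task.
catalog: Catalog = {
    ("task-1", "v1"): MetadataSet({"status": "completed"}, [], []),
    ("task-1", "v2"): MetadataSet({"status": "in_progress"}, [], []),
}
```

Keying on both identifiers keeps a request for version `v1` from picking up metadata that belongs to a later batch of the same task.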

[0050] According to embodiments of the present invention, the backup metadata dataset can be a comprehensive metadata collection integrating the task layer, logical layer, and physical layer, serving the functions of cloud platform backup management, association mapping, physical addressing, and data reconciliation. For example, the backup metadata dataset may include information recorded by the backup system, such as the backup time, file size, storage path, checksum, backup version number, original path, and modification time of the file.

[0051] According to an embodiment of the present invention, the original data can be real data from the source, such as working documents, photos, etc. uploaded by users in cloud storage.

[0052] According to an embodiment of the present invention, the backup original data can be a physical backup copy generated on the backend storage after the original data is backed up.

[0053] According to an embodiment of the present invention, the first metadata includes task-level information of the backup task. This task-level information may include configuration information of the backup task, attributes of the backup disk, task status, etc. For example, the configuration information of the backup task may include a backup task identifier, backup source path, backup target disk, backup policy (full / incremental), backup timestamp, etc. The attributes of the backup disk may include capacity attributes, storage tier attributes, running mount status attributes, business association attributes, encryption verification attributes, and access permission attributes, etc. The task status can be used to characterize the completion progress of the backup task; for example, the task status may be 10% complete.

[0054] According to an embodiment of the present invention, the second metadata may be physical addressing layer information. The physical storage location of the backup original data may be at least one of the following: backup disk physical address, distributed storage node path, disk offset address, and underlying physical location of object storage.

[0055] According to embodiments of the present invention, the third metadata includes global storage information of the backup original data and the association mapping relationship between physical storage locations and global storage information. That is, the third metadata layer can be logical storage rule layer information. Global storage information can be used to characterize the logical index of files / blocks of the original data. For example, global storage information can include comprehensive storage information of multiple files / blocks in the backup original data on the cloud platform, such as multiple files / blocks being stored on disk A and disk C respectively. The association mapping relationship between physical storage locations and global storage information can be used to verify whether there is an association between the physical storage location and the storage ownership information in the global storage information, to determine whether the backup original data is missing. For example, if the storage ownership information in the global storage information includes storage address A and storage address B, and the physical storage location also includes the disk under storage address A and the hard disk under storage address B, it can be determined that the storage ownership information of the physical storage location is associated with the global storage information, and the backup original data is not missing. It should be noted that the association mapping relationship between physical storage locations and global storage information can also be an addressing path, without specific limitations.
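As an illustration only, the three metadata layers described above could be modeled as follows. This is a minimal sketch: every class name, field name, and the "disk:offset" location format are hypothetical and are not defined by the source. The helper shows how the association mapping can reveal entries that are indexed logically but have no recorded physical location.

```python
from dataclasses import dataclass, field


@dataclass
class FirstMetadata:
    """Task layer: configuration and task status (field names are assumptions)."""
    task_id: str
    backup_version: str
    source_path: str
    target_disk: str
    policy: str            # "full" or "incremental"
    progress_percent: int  # task status expressed as completion progress


@dataclass
class SecondMetadata:
    """Physical addressing layer: entry name -> physical location."""
    locations: dict[str, str] = field(default_factory=dict)


@dataclass
class ThirdMetadata:
    """Logical storage rule layer: global index plus association mapping."""
    file_id: str
    # entry name -> storage address that owns it (global storage information)
    global_index: dict[str, str] = field(default_factory=dict)
    # entry name -> physical location it is expected to map to
    mapping: dict[str, str] = field(default_factory=dict)


@dataclass
class BackupMetadataSet:
    first: FirstMetadata
    second: SecondMetadata
    third: ThirdMetadata


def missing_entries(ds: BackupMetadataSet) -> list[str]:
    """Return logical entries that have no physical location recorded."""
    return sorted(n for n in ds.third.global_index if n not in ds.second.locations)
```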

[0056] In operation S220, the backup process status of the backup task is determined based on the task status in the task layer information.

[0057] Because the backup process generates new data, data verification or checking during the reconciliation process can introduce data discrepancies, leading to inaccurate reconciliation results. Therefore, by determining the backup task's progress status (e.g., in progress, pending execution, completed), different reconciliation times can be selected based on different backup progress statuses to improve the accuracy of reconciliation results.

[0058] For example, the task status could be 30% complete, with the backup process status being "in progress". Alternatively, the task status could be 0% complete, with the backup process status being "pending".
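The mapping from task status to backup process status can be sketched as below. The 30% and 0% cases follow the examples above; treating 100% as "completed" is an assumption, and the enum and function names are hypothetical.

```python
from enum import Enum


class BackupState(Enum):
    PENDING = "pending"
    IN_PROGRESS = "in progress"
    COMPLETED = "completed"


def backup_state(progress_percent: int) -> BackupState:
    """Derive the backup process status from the task status (completion progress)."""
    if not 0 <= progress_percent <= 100:
        raise ValueError("progress must be between 0 and 100")
    if progress_percent == 0:
        return BackupState.PENDING
    if progress_percent == 100:  # assumption: 100% complete means completed
        return BackupState.COMPLETED
    return BackupState.IN_PROGRESS


def may_reconcile(progress_percent: int) -> bool:
    """Reconciliation is allowed only once the backup task has completed."""
    return backup_state(progress_percent) is BackupState.COMPLETED
```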

[0059] In operation S230, if the backup process status is completed, the original data and the backup original data are reconciled based on the first metadata, the second metadata, and the third metadata to obtain the reconciliation result.

[0060] According to an embodiment of the present invention, reconciliation is performed only after the backup task is completed, which prevents new data generated during the backup process from affecting the reconciliation results.

[0061] For example, the reconciliation task and data scope are determined based on the first metadata, the logical storage benchmark and mapping relationship are provided through the third metadata, and the physical storage location of the backup original data is located and the data is read with the help of the second metadata. The reconciliation of the structure, mapping and content consistency between the original data and the backup original data is achieved through step-by-step linkage.

[0062] For example, using the third metadata as the core, and simultaneously associating the task benchmark of the first metadata and the physical storage benchmark of the second metadata, the original data and the backup original data are retrieved in parallel, and two-way linkage reconciliation is performed.

[0063] The first metadata is used as the task dimension benchmark, the second metadata is used as the physical storage location benchmark, and the third metadata is used as the storage logic benchmark.

[0064] The original data identifier, data size, backup scope, and version attributes corresponding to this backup task are parsed from the first metadata and used as the source baseline sample to be reconciled.

[0065] By utilizing the global storage information in the third metadata and the association mapping relationship between the physical storage location and the global storage information, the logical structure of the source original data and the logical mapping relationship on the backup side are respectively associated, and the consistency between the original data and the backup original data in terms of logical hierarchy and file entry affiliation is verified.

[0066] Based on the physical storage location recorded in the second metadata, the backup original data is read. This allows the content consistency between the original data and the backup original data to be verified.

[0067] Therefore, the reconciliation result simultaneously verifies the version matching, logical hierarchy, file entry ownership, and content consistency between the original data and the backup original data.
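The step-by-step linkage described above can be sketched as a single pass over the entries in scope. This is an illustration under stated assumptions: the scope is a list of entry names parsed from the first metadata, the association mapping is a plain dictionary, the backend storage is an in-memory dictionary, and the verdict strings are hypothetical.

```python
def reconcile(
    scope: list[str],
    original: dict[str, bytes],
    logical_map: dict[str, str],
    backup_store: dict[str, bytes],
) -> dict[str, str]:
    """Reconcile each entry in scope and return {entry name: verdict}.

    scope:        entry names parsed from the first metadata (task layer)
    original:     entry name -> source content
    logical_map:  entry name -> physical location (third metadata mapping)
    backup_store: physical location -> stored content (second metadata addressing)
    """
    verdicts = {}
    for name in scope:
        if name not in original:
            verdicts[name] = "original missing"
        elif name not in logical_map:
            verdicts[name] = "no logical mapping"
        elif logical_map[name] not in backup_store:
            verdicts[name] = "backup missing"
        elif backup_store[logical_map[name]] != original[name]:
            verdicts[name] = "content differs"
        else:
            verdicts[name] = "consistent"
    return verdicts
```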

[0068] According to an embodiment of the present invention, in response to receiving a data reconciliation request from a cloud platform, the backup metadata dataset of the backup task is obtained based on the task identifier and backup version identifier in the data reconciliation request to accurately pinpoint the scope of the backup task; the backup process status of the backup task is determined based on the task status in the task layer information; and the reconciliation process is set after the backup task is completed to avoid new data generated during the backup process from affecting the reconciliation results. Based on the first metadata, second metadata, and third metadata in the backup metadata set, the original data and the backup original data are reconciled. The first metadata includes the task layer information of the backup task, the second metadata includes the physical storage location of the backup original data, and the third metadata includes the global storage information of the backup original data and the association mapping relationship between the physical storage location and the global storage information, realizing multi-level collaborative reconciliation of the task layer (first metadata), physical addressing layer (second metadata), and logical storage rule layer (third metadata), achieving full-dimensional and accurate reconciliation of the original data and the backup original data.

[0069] According to an embodiment of the present invention, the third metadata further includes data management information for backing up the original data; based on the first metadata, the second metadata, and the third metadata, the original data and the backup original data are reconciled to obtain a reconciliation result, including: when the task identifier corresponds to the file identifier of the third metadata, matching the second metadata and the third metadata according to the association mapping relationship in the third metadata, global storage information, and the number of entries in the third metadata to obtain a matching result; when the matching result indicates that the second metadata and the third metadata match, retrieving the backup original data from the physical storage location according to the data management information; and performing a consistency comparison between the original data and the backup original data to obtain a reconciliation result.

[0070] According to an embodiment of the present invention, the third metadata may exist on the server in the form of a folder, and the name of the folder may be the file identifier of the third metadata. Furthermore, if there is a one-to-one correspondence between the task identifier and the file identifier of the third metadata, the third metadata is determined to be the metadata of the backup task.

[0071] According to an embodiment of the present invention, the number of entries of the third metadata can be the total number of entries of all files / blocks of the original data, which can accurately pinpoint whether the second metadata is missing, so as to determine whether the backup original data is missing files / blocks.

[0072] According to embodiments of the present invention, by matching the second metadata with the third metadata through the association mapping relationship, global storage information and the number of entries of the third metadata, the consistency verification of physical storage address and number of files / blocks between the original data and the backup original data can be achieved.

[0073] For example, first count the total number of entries for the third metadata based on global storage information, and then count the number of entries for the second metadata. If the number of entries is the same, then use the association mapping relationship between the second and third metadata to verify the binding correspondence between logical entries and physical entries one by one. Through the dual verification of quantity and mapping, the matching verification of the second and third metadata is completed.

[0074] For example, based on the association mapping relationship and global storage information in the third metadata, the corresponding second metadata is retrieved; logical storage entries and physical storage entries are matched one by one according to the mapping binding rules, and the total number of entries in the second metadata and the third metadata is compared; if the mapping relationship matches and the number of entries is equal, the second metadata and the third metadata are determined to be a match.
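The dual verification of quantity and mapping in the two examples above might look like the following sketch. It assumes the association mapping is available as a dictionary from logical entry to physical address and that the second metadata exposes a set of physical addresses; both representations and the function name are hypothetical.

```python
def metadata_match(logical_to_physical: dict[str, str], physical_entries: set[str]) -> bool:
    """Check entry counts first, then verify the bindings one by one."""
    # Step 1: quantity check between logical and physical entries.
    if len(logical_to_physical) != len(physical_entries):
        return False
    # Step 2: every logical entry must bind to a distinct, existing physical entry.
    bound = list(logical_to_physical.values())
    return len(set(bound)) == len(bound) and set(bound) == physical_entries
```

Checking the count first is cheap and lets the more expensive one-by-one binding check be skipped when entries are obviously missing or redundant.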

[0075] According to embodiments of the present invention, data management information may include the method for reading the backup original data, the text format, the content verification method, etc. The backup original data can be correctly read using the data management information.

[0076] If the second metadata and the third metadata match, the backup original data is retrieved from the physical storage location according to the reading method specified in the data management information. The original data and the backup original data are then compared for consistency using the content verification method specified in the data management information to obtain the reconciliation result. For example, the reading method may include decompression, decryption, and parsing processes. The content verification method may be global verification (overall file digest verification, full data fingerprint comparison), cyclic redundancy check, etc.
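A minimal sketch of reading and content verification follows. It assumes zlib compression as the reading method and uses SHA-256 for digest verification and CRC32 for the cyclic redundancy check; the source does not name specific algorithms, so these choices and the function names are illustrative only.

```python
import hashlib
import zlib


def read_backup(raw: bytes, compressed: bool) -> bytes:
    """Apply the reading method from the data management information (assumed: zlib)."""
    return zlib.decompress(raw) if compressed else raw


def digest_equal(original: bytes, backup: bytes) -> bool:
    """Global verification: compare whole-content digests."""
    return hashlib.sha256(original).digest() == hashlib.sha256(backup).digest()


def crc_equal(original: bytes, backup: bytes) -> bool:
    """Cyclic redundancy check: cheaper, but weaker than a cryptographic digest."""
    return zlib.crc32(original) == zlib.crc32(backup)
```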

[0077] In cloud backup scenarios, due to the characteristics of underlying storage virtualization, dynamic changes in physical addresses, asynchronous eventual consistency between metadata and backup data, and block-level incremental and deduplication architecture, it is difficult to establish a stable relationship between the source original data, metadata, and backup original data. At the same time, the comparison of full content is limited by cloud performance and cost, which ultimately results in complex reconciliation logic, large verification errors, and high difficulty in task implementation.

[0078] In light of this, the task identifier corresponds to the file identifier of the third metadata, establishing a stable association between the task layer of the original data and the third metadata. The association between the second and third metadata is verified through the association mapping relationships, global storage information, and the number of entries in the third metadata. When the second and third metadata match, the backup original data is retrieved from the physical storage location in the second metadata using the data management information in the third metadata. This achieves a stable association between the source original data, various metadata, and the backup original data, thus enabling reconciliation between the original data and the backup original data. Simultaneously, multiple metadata are used to complete multi-level collaborative reconciliation of association mapping relationships, storage addresses, number of entries, and content consistency, achieving comprehensive and accurate reconciliation between the original data and the backup original data.

[0079] According to an embodiment of the present invention, the second metadata and the third metadata are matched based on the association mapping relationship in the third metadata, global storage information, and the number of entries in the third metadata to obtain a matching result, including: determining the target second metadata that has an association mapping relationship with the third metadata from the second metadata; when the global storage information matches the storage ownership information of the physical storage location of the target second metadata, comparing the number of entries in the third metadata with the number of entries in the target second metadata; when the number of entries in the third metadata is consistent with the number of entries in the target second metadata, obtaining a matching result indicating that the second metadata and the third metadata are matched.

[0080] Missing metadata can make it difficult to find backups of the original data, leading to the loss of business data. Therefore, by identifying target second metadata that has a mapping relationship with third metadata from the second metadata, and assuming that the global storage information matches the storage ownership information of the target second metadata's physical storage location, comparing the number of entries in the third metadata with the number of entries in the target second metadata can determine whether metadata is missing.

[0081] Matching global storage information with the storage ownership information of the physical storage location of the target second metadata can be achieved by: the storage cluster, storage pool, backup disk, and storage node ownership domain corresponding to the global storage information in the third metadata being consistent with the actual storage cluster, storage pool, backup disk, and storage node ownership information of the physical storage location associated with the second metadata.

[0082] The number of entries in the third metadata is consistent with the number of entries in the target second metadata when the total number of logical directory entries, file entries, or data block entries contained in the third metadata is equal to the total number of physical directory entries, physical file entries, or physical data block entries corresponding to the target second metadata, and the two sets of entries form a one-to-one correspondence.

[0083] According to embodiments of the present invention, target second metadata that matches the third metadata is accurately selected from the second metadata based on the association mapping relationship, eliminating irrelevant metadata interference and narrowing the verification scope; then, by matching and verifying the global storage information with the storage ownership information of the physical storage location corresponding to the target second metadata, erroneous associations between the second and third metadata across storage domains and storage resources can be effectively avoided, preventing metadata mapping from going out of bounds and becoming disordered; subsequently, under the premise that the storage ownership matching is compliant, the number of entries of the third metadata and the target second metadata is compared and verified to ensure that the number of logical storage entries and physical storage entries are completely corresponding and without missing redundancy; through the progressive multi-layer verification logic of association filtering, ownership verification, and quantity comparison, the rigor and accuracy of the matching between the second and third metadata are greatly improved, avoiding misjudgments and omissions caused by a single verification dimension, and providing a reliable and compliant metadata matching foundation for subsequent backup original data addressing and data consistency reconciliation based on metadata, ensuring the stable and reliable operation of cloud platform backup reconciliation business.

[0084] According to embodiments of the present invention, data management information includes at least one of the following: data parsing method, data compression method, data decryption method, and data encapsulation method.

[0085] For example, the data parsing method can be one of the following: parsing according to a fixed data block length, parsing according to backup block rules, or parsing according to the order of protocol fields.

[0086] For example, data compression methods can be one of the following: block streaming compression, archive packaging compression, differential incremental compression, or general format compression.

[0087] For example, data decryption methods can be one of the following: symmetric algorithm decryption, national cryptographic algorithm decryption, asymmetric key decryption, block-level storage decryption, or file-level overall decryption.

[0088] For example, data encapsulation methods can be one of the following: encapsulation by data block, encapsulation by directory as a whole, encapsulation by custom message format, or encapsulation by logical storage entry.

[0089] According to an embodiment of the present invention, the method further includes: obtaining a matching result indicating a mismatch between the second metadata and the third metadata when any of the following conditions are met: the task identifier does not correspond to the file identifier of the third metadata; there is no target second metadata associated with the third metadata in the second metadata; the global storage information does not match the storage ownership information of the physical storage location of the target second metadata; the number of entries in the third metadata is inconsistent with the number of entries in the target second metadata.

[0090] According to an embodiment of the present invention, if the task identifier does not correspond to the file identifier of the third metadata, it may be that the third metadata is not the metadata of the backup task, making it difficult to match the third metadata with the second metadata and thus no matching result can be obtained.

[0091] According to an embodiment of the present invention, the second metadata does not contain target second metadata associated with the third metadata, that is, the second metadata is unrelated to the third metadata.

[0092] The global storage information does not match the storage ownership information of the physical storage location of the target second metadata, which means that the second metadata and the third metadata are incorrectly associated across storage domains and storage resources, resulting in disordered metadata mapping.

[0093] The number of entries in the third metadata is inconsistent with the number of entries in the target second metadata, meaning that the number of logical storage entries does not correspond to the number of physical storage entries, or there are missing or redundant entries.
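The progressive matching (association filtering, ownership verification, quantity comparison) and the four mismatch conditions above can be sketched together. All type and field names are hypothetical; in particular, comparing the task identifier to the file identifier by equality and using a shared `mapping_key` to express the association mapping relationship are simplifying assumptions.

```python
from dataclasses import dataclass
from enum import Enum


class MatchResult(Enum):
    MATCH = "match"
    TASK_ID_MISMATCH = "task identifier does not correspond to file identifier"
    NO_TARGET = "no target second metadata"
    OWNERSHIP_MISMATCH = "storage ownership mismatch"
    COUNT_MISMATCH = "entry count mismatch"


@dataclass(frozen=True)
class Ownership:
    cluster: str
    pool: str
    disk: str
    node: str


@dataclass
class SecondMeta:
    mapping_key: str
    ownership: Ownership
    entry_count: int


@dataclass
class ThirdMeta:
    file_id: str
    mapping_key: str
    ownership: Ownership
    entry_count: int


def match_metadata(task_id: str, third: ThirdMeta, seconds: list[SecondMeta]) -> MatchResult:
    if task_id != third.file_id:
        return MatchResult.TASK_ID_MISMATCH
    # Association filtering: pick the second metadata mapped to this third metadata.
    target = next((s for s in seconds if s.mapping_key == third.mapping_key), None)
    if target is None:
        return MatchResult.NO_TARGET
    # Ownership verification: cluster, pool, disk, and node must all agree.
    if target.ownership != third.ownership:
        return MatchResult.OWNERSHIP_MISMATCH
    # Quantity comparison: logical and physical entry counts must be equal.
    if target.entry_count != third.entry_count:
        return MatchResult.COUNT_MISMATCH
    return MatchResult.MATCH
```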

[0094] According to an embodiment of the present invention, the method further includes: continuing to execute the backup task when the backup process status is in progress, and performing reconciliation when the backup process status switches to completed; and delaying the execution time of the backup task when the backup process status is pending.

[0095] According to an embodiment of the present invention, a data reconciliation request may include a reconciliation task list, which may include multiple reconciliation tasks. These multiple reconciliation tasks can be categorized according to their backup process status into a first reconciliation task corresponding to a completed backup task, a second reconciliation task corresponding to a currently executing backup task, and a third reconciliation task corresponding to a backup task yet to be executed.

[0096] The first data reconciliation phase targets backup tasks that have been completed. It includes executing the first reconciliation task, which reconciles the original data of the completed backup task with the backup original data.

[0097] The second data reconciliation phase targets the backup task currently being executed. Its second reconciliation task reconciles the original data of the currently executing backup task with the newly generated backup original data. The purpose of this second data reconciliation phase is to minimize disruption to the backup task's execution.

[0098] According to an embodiment of the present invention, when the backup process status is in progress, the backup task continues to be executed, and when the backup process status switches to completed and the first reconciliation task is completed, the second reconciliation task is executed.

[0099] It should be noted that the second data reconciliation stage only begins after the first reconciliation task and the ongoing backup task have been completed. The goal is to minimize disruption to the backup task's execution and to avoid introducing new data generated during the backup process that could affect the reconciliation results.

[0100] The third data reconciliation phase involves delaying the execution of pending backup tasks until the second data reconciliation phase is complete. After these backup tasks are completed, the third reconciliation task is executed. This third reconciliation task reconciles the original data of the backup tasks with the newly generated backup original data.

[0101] The third data reconciliation stage sets up a timing avoidance mechanism for the backup tasks to be executed. This mechanism can isolate the new data generated during the execution of the backup tasks from the time sequence, thus preventing the new data generated during the backup process of the backup tasks from affecting the reconciliation results of the first and second reconciliation tasks.

[0102] According to embodiments of the present invention, when the backup process is in progress, the backup task continues to run, ensuring its smooth progress. Reconciliation is performed only after the backup process is completed, minimizing interference with the normal backup task's execution and preventing newly added data during backup from entering the reconciliation scope. This effectively prevents incremental new data from interfering with the reconciliation baseline and causing distortion in the reconciliation results. Simultaneously, a timing avoidance mechanism is implemented for pending backup tasks, isolating newly added data generated during their execution to prevent data disturbance and result interference during the reconciliation verification process. This improves the accuracy and reliability of data reconciliation results, while also rationally coordinating the execution sequence of cloud platform backup and reconciliation tasks, reducing multi-task concurrency conflicts, and enhancing overall business operation stability and data security verification reliability.
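The ordering of the three reconciliation phases can be sketched as a simple planner. This is illustrative only: the state strings and step names are hypothetical, and a real scheduler would wait on actual task completion rather than return a static plan.

```python
def plan_reconciliation(states: dict[str, str]) -> list[tuple[str, str]]:
    """Order backup and reconciliation steps by backup process status.

    states maps task id -> "completed", "in_progress", or "pending".
    """
    completed = [t for t, s in states.items() if s == "completed"]
    running = [t for t, s in states.items() if s == "in_progress"]
    pending = [t for t, s in states.items() if s == "pending"]

    steps: list[tuple[str, str]] = []
    # Phase 1: completed backups can be reconciled immediately.
    steps += [("reconcile", t) for t in completed]
    # Phase 2: let running backups finish undisturbed, then reconcile them.
    for t in running:
        steps.append(("wait_for_backup", t))
        steps.append(("reconcile", t))
    # Phase 3: pending backups are delayed until phases 1 and 2 are done.
    for t in pending:
        steps.append(("run_delayed_backup", t))
        steps.append(("reconcile", t))
    return steps
```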

[0103] According to an embodiment of the present invention, the above method further includes: marking the third metadata as erroneous metadata when the task identifier and the file identifier of the third metadata do not correspond; and correcting the number of entries of the target second metadata and the associated mapping relationship based on the number of entries of the third metadata when the number of entries of the third metadata is inconsistent with the number of entries of the target second metadata and there are no abnormalities in the backup original data.

[0104] Related technologies typically implement only the data verification function and do not provide repair. In contrast, this application addresses data reconciliation in cloud backup scenarios and covers the entire process of backup task execution, reconciliation, and automatic repair.

[0105] According to an embodiment of the present invention, if erroneous metadata or erroneous backup original data exists after the reconciliation is completed, it can be automatically repaired. The repair method varies depending on the type of data error.

[0106] If the task identifier and the file identifier of the third metadata do not correspond, then the third metadata corresponding to the backup task does not exist. The repair method is to mark the third metadata as erroneous metadata.

[0107] If the number of entries in the third metadata does not match the number of entries in the target second metadata, and the backup original data is normal, the target second metadata may have missing or redundant entries. Since the third metadata includes the global storage information of the backup original data, the repair method can be to correct the number of entries in the target second metadata to match the number of entries in the third metadata, and to repair the physical storage address of the target second metadata according to the association mapping relationship.

[0108] According to embodiments of the present invention, when the task identifier and the file identifier of the third metadata do not match, the corresponding third metadata is promptly marked as erroneous metadata. This enables rapid identification and accurate verification of abnormal metadata with disordered identifier associations, preventing abnormal metadata from participating in subsequent mapping matching and data reconciliation processes, and preventing addressing deviations caused by the propagation of erroneous metadata. Furthermore, provided the backup original data itself is not abnormal, if the number of entries in the third metadata is inconsistent with the number of entries in the target second metadata, the number of entries and the association mapping relationship of the target second metadata are adaptively corrected based on the reliable number of entries in the third metadata. This automatically repairs the discrepancies in the number of entries and the disordered mapping links between logical-layer metadata and physical-layer metadata without altering the actual backup business data, effectively improving the standardization of cloud platform metadata management, self-repair capabilities, and the accuracy of data reconciliation results.
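The entry-count repair can be sketched as follows, treating the third metadata as the trusted baseline. The dictionary representations are hypothetical, and keeping the existing address for entries already present in the second metadata (rather than overwriting them from the mapping) is an assumption.

```python
def repair_second_metadata(
    third_mapping: dict[str, str],
    second_locations: dict[str, str],
) -> tuple[dict[str, str], list[str], list[str]]:
    """Correct the target second metadata against the third metadata.

    Returns (repaired second metadata, restored entries, dropped entries).
    The backup original data itself is not touched.
    """
    missing = sorted(n for n in third_mapping if n not in second_locations)
    redundant = sorted(n for n in second_locations if n not in third_mapping)
    # Restore missing addresses from the association mapping; drop redundant entries.
    repaired = {n: second_locations.get(n, third_mapping[n]) for n in third_mapping}
    return repaired, missing, redundant
```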

[0109] According to an embodiment of the present invention, the above method further includes: in the case that there is inconsistency between the target second metadata and the backup original data, determining the abnormal instance from the backup original data based on the inconsistency information, clearing the abnormal instance and performing a full backup, wherein the inconsistency information includes at least one of the following: mismatch in the number of physical addresses, or inconsistency in data content.

[0110] According to an embodiment of the present invention, based on the identified inconsistency information, the corresponding abnormal data instances are located and filtered in the backup original data. The abnormal instances are abnormal data blocks or abnormal file entries that cause the number of physical addresses to be mismatched or the data content to be deviated.

[0111] The identified abnormal instances are cleared and removed from the backup original data, including the corresponding storage data and related invalid entries. After clearing the abnormal instances, a full backup of the original business data is initiated again based on the first and third metadata, and complete and compliant backup original data is regenerated. The corresponding second and third metadata entries and physical address mappings are updated synchronously to ensure that the updated target second metadata is completely consistent with the newly generated backup original data in terms of the number of physical addresses and data content. This provides an accurate and abnormal-free data benchmark for subsequent cross-metadata matching and data reconciliation.
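A minimal sketch of locating abnormal instances, clearing them, and regenerating the backup is shown below. It assumes the second metadata records an expected CRC32 per physical address and that storage is an in-memory dictionary; the address format `disk0:<n>` and both function names are hypothetical.

```python
import zlib


def find_abnormal(second: dict[str, int], store: dict[str, bytes]) -> list[str]:
    """second: physical address -> expected CRC32; store: physical address -> bytes."""
    # Address count mismatch: stored blocks that no metadata entry refers to.
    abnormal = [addr for addr in store if addr not in second]
    # Content inconsistency: stored bytes do not match the recorded checksum.
    abnormal += [
        addr for addr, crc in second.items()
        if addr in store and zlib.crc32(store[addr]) != crc
    ]
    return sorted(abnormal)


def clear_and_full_backup(
    source: dict[str, bytes],
    second: dict[str, int],
    store: dict[str, bytes],
) -> list[str]:
    """Clear abnormal instances, then regenerate the backup and its metadata."""
    abnormal = find_abnormal(second, store)
    for addr in abnormal:
        del store[addr]
    # A full backup rewrites every source entry and refreshes the second metadata.
    store.clear()
    second.clear()
    for i, (_name, data) in enumerate(sorted(source.items())):
        addr = f"disk0:{i}"
        store[addr] = data
        second[addr] = zlib.crc32(data)
    return abnormal
```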

[0112] According to an embodiment of the present invention, the above method further includes: in the event of loss of original data, repairing the original data using historical backup original data.

[0113] According to embodiments of the present invention, when the original data on the business side is missing in terms of the number of entries, logical path, physical storage location, or data content, or cannot be matched with the metadata in the backup metadata set, it can be determined that the original data has been lost. For example, if the number of file or directory entries in the actual original data is less than the baseline number recorded in the first metadata, it can be determined that the original data has been lost. For example, if the logical path or file entries recorded in the second metadata do not exist on the business side, it can also be determined that the original data has been lost. For example, if the backup side has complete entries (the total number of entries corresponding to the physical storage address in the third metadata), but the corresponding entries in the original business data are missing, and the number of entries cannot be matched, it can also be determined that the original data is partially lost.

[0114] According to an embodiment of the present invention, the historical backup original data can be the backup original data with the most data content and the shortest time interval between the data update time and the current time. For example, the original data can be replaced with the backup original data, and a backup error warning message can be sent to the user interface.

[0115] According to an embodiment of the present invention, when the loss of original data is identified, the missing original data can be recovered and repaired by using historical backup original data. The lost original business data can be quickly supplemented based on the retained historical complete backup data without the need for manual re-collection or manual reconstruction, which greatly reduces the risk of business interruption and manual operation and maintenance costs caused by data loss, and effectively restores the normal availability of business data.
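Selecting a historical backup and restoring lost entries could be sketched as below. The source names two selection criteria (most data content, shortest interval to the current time) without saying how to combine them; ranking by entry count first and recency second is an assumption, as are the type and function names. The sketch restores only the missing entries rather than replacing the original data wholesale.

```python
from dataclasses import dataclass


@dataclass
class BackupVersion:
    version: str
    updated_at: float          # update time as epoch seconds
    entries: dict[str, bytes]  # entry name -> backed-up content


def pick_history(versions: list[BackupVersion]) -> BackupVersion:
    """Prefer the version with the most entries, then the most recent one."""
    return max(versions, key=lambda v: (len(v.entries), v.updated_at))


def restore_missing(original: dict[str, bytes], versions: list[BackupVersion]) -> list[str]:
    """Fill entries missing from the original data; return the restored names."""
    best = pick_history(versions)
    restored = sorted(n for n in best.entries if n not in original)
    for name in restored:
        original[name] = best.entries[name]
    return restored
```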

[0116] Figure 3 shows a schematic diagram of a data reconciliation method according to another embodiment of the present invention.

[0117] As shown in Figure 3, multiple backup tasks in the cloud platform's reconciliation task list 301 are classified according to their backup process status, resulting in backup tasks 302 that have been completed, backup tasks 303 that are currently being executed, and backup tasks 304 that are yet to be executed.

[0118] The first data reconciliation stage 305 can be used to reconcile data for a completed backup task. Specifically, this includes: when the task identifier corresponds to the file identifier of the third metadata, matching the second metadata with the third metadata based on the association mapping relationship in the third metadata, the global storage information, and the number of entries in the third metadata, to obtain a matching result; when the matching result indicates that the second metadata and the third metadata match, retrieving the backup original data from the physical storage location according to the data management information; and performing a consistency comparison between the original data and the backup original data to obtain a reconciliation result. A matching result indicating a mismatch between the second metadata and the third metadata is obtained if any of the following conditions is met: the task identifier does not correspond to the file identifier of the third metadata; the second metadata does not contain target second metadata associated with the third metadata; the global storage information does not match the storage ownership information of the physical storage location of the target second metadata; or the number of entries in the third metadata is inconsistent with the number of entries in the target second metadata.

[0119] Once the first data reconciliation stage 305 is completed and the running backup task 303 has finished, the second data reconciliation stage 306 begins. The second data reconciliation stage 306 follows the same process as the first data reconciliation stage 305.

[0120] After the second data reconciliation stage 306 is completed, if the results of the first and second data reconciliation stages indicate that the reconciliation data is abnormal (307), the data repair stage 309 is initiated. After the repair is completed, the pending backup task 304 is executed.

[0121] The data repair stage 309 may specifically include: marking the third metadata as erroneous metadata when the task identifier does not correspond to the file identifier of the third metadata; correcting the number of entries in the target second metadata and the association mapping relationship based on the number of entries in the third metadata when the number of entries in the third metadata is inconsistent with the number of entries in the target second metadata and the backup original data is normal; and, when there is an inconsistency between the target second metadata and the backup original data, determining abnormal instances from the backup original data based on the inconsistency information, clearing the abnormal instances, and then performing a full backup, wherein the inconsistency information includes at least one of the following: a mismatch in the number of physical addresses, or inconsistent data content.
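As a rough illustration, the repair stage can be modeled as a function that maps the detected conditions to a list of repair actions. The `RepairAction` enum, the `plan_repairs` function, and its parameters are hypothetical names introduced here; the patent does not prescribe how the conditions are represented.

```python
from enum import Enum
from typing import List


class RepairAction(Enum):
    MARK_THIRD_METADATA_ERRONEOUS = 1
    CORRECT_SECOND_METADATA = 2
    CLEAR_AND_FULL_BACKUP = 3


def plan_repairs(ids_correspond: bool,
                 third_count: int,
                 target_second_count: int,
                 backup_data_normal: bool,
                 inconsistencies: List[str]) -> List[RepairAction]:
    """Choose repair actions from the conditions described for stage 309."""
    actions = []
    # Task identifier and file identifier of the third metadata do not correspond.
    if not ids_correspond:
        actions.append(RepairAction.MARK_THIRD_METADATA_ERRONEOUS)
    # Entry counts differ while the backup original data itself is normal:
    # the second metadata is corrected from the third metadata.
    if third_count != target_second_count and backup_data_normal:
        actions.append(RepairAction.CORRECT_SECOND_METADATA)
    # Inconsistency between second metadata and backup data, for example a
    # physical address count mismatch or inconsistent data content.
    if inconsistencies:
        actions.append(RepairAction.CLEAR_AND_FULL_BACKUP)
    return actions


plan = plan_repairs(
    ids_correspond=True,
    third_count=5,
    target_second_count=4,
    backup_data_normal=True,
    inconsistencies=[],
)
```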

[0122] After the second data reconciliation stage 306 is completed, if the results of both the first and second data reconciliation stages indicate that the reconciliation data is normal (308), the pending backup task 304 is executed.

[0123] Once backup task 304 is completed, the process proceeds to the third data reconciliation stage 310. The process of the third data reconciliation stage 310 is consistent with that of the first data reconciliation stage 305.
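The overall ordering of the three reconciliation stages, the optional repair stage, and the pending backup task can be sketched as a small driver function. Everything here is an assumption made for illustration: the callables (`reconcile`, `wait_for_running`, `repair`, `run_pending`) are hypothetical hooks, `reconcile` is assumed to return `True` when the reconciliation data is normal, and repair is assumed to be triggered when either of the first two stages reports an abnormal result.

```python
from typing import Callable, List, Tuple


def run_flow(reconcile: Callable[[str], bool],
             wait_for_running: Callable[[], None],
             repair: Callable[[], None],
             run_pending: Callable[[], None]) -> Tuple[List[str], bool]:
    """Drive stages 305, 306, 309 (if needed), task 304, and stage 310 in order."""
    events = []
    first_ok = reconcile("first")        # stage 305 on completed tasks 302
    events.append("reconcile:first")
    wait_for_running()                   # running tasks 303 finish
    events.append("wait:running")
    second_ok = reconcile("second")      # stage 306
    events.append("reconcile:second")
    if not (first_ok and second_ok):     # abnormal result (307)
        repair()                         # data repair stage 309
        events.append("repair")
    run_pending()                        # pending tasks 304 execute
    events.append("run:pending")
    third_ok = reconcile("third")        # stage 310
    events.append("reconcile:third")
    return events, third_ok


# Usage with stubbed hooks: the second stage reports an anomaly.
outcomes = {"first": True, "second": False, "third": True}
events, third_ok = run_flow(
    reconcile=lambda stage: outcomes[stage],
    wait_for_running=lambda: None,
    repair=lambda: None,
    run_pending=lambda: None,
)
```

Running the pending tasks only after the first two stages (and any repair) keeps newly generated backup data from entering the scope of a stage already in progress.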

[0124] It should be noted that when a data reconciliation stage fails, the reconciliation data and temporary files may be deleted, but the failed stage cannot be retried; failures during the reconciliation process are displayed on the front-end interface. By contrast, a failed data repair stage can be retried repeatedly.

[0125] Figure 4 shows a schematic block diagram of a data reconciliation device according to an embodiment of the present invention.

[0126] As shown in Figure 4, the data reconciliation device 400 of this embodiment includes an acquisition module 410, a determination module 420, and a reconciliation module 430.

[0127] The acquisition module 410 is used to respond to a data reconciliation request received from the cloud platform, and to acquire the backup metadata dataset of the backup task based on the task identifier and backup version identifier in the data reconciliation request. The backup task is used to back up the original business data on the cloud platform via the backup metadata dataset to obtain the backup original data corresponding to the backup version identifier. The backup metadata dataset includes first metadata, second metadata, and third metadata. The first metadata includes the task-level information of the backup task, the second metadata includes the physical storage location of the backup original data, and the third metadata includes the global storage information of the backup original data and the association mapping relationship between the physical storage location and the global storage information. In one embodiment, the acquisition module 410 can be used to perform the operation S210 described above, which will not be repeated here.

[0128] The determination module 420 is used to determine the backup process status of the backup task based on the task status in the task-level information. In one embodiment, the determination module 420 can be used to perform the operation S220 described above, which will not be repeated here.

[0129] The reconciliation module 430 is used to reconcile the original data and the backup original data based on the first metadata, the second metadata, and the third metadata when the backup process status is complete, and to obtain a reconciliation result. In one embodiment, the reconciliation module 430 can be used to perform the operation S230 described above, which will not be repeated here.

[0130] According to an embodiment of the present invention, the third metadata further includes data management information of the backup original data; the reconciliation module 430 includes a matching submodule and an acquisition submodule. The matching submodule is used to match the second metadata with the third metadata, when the task identifier corresponds to the file identifier of the third metadata, based on the association mapping relationship in the third metadata, the global storage information, and the number of entries in the third metadata, to obtain a matching result. The acquisition submodule is used to retrieve the backup original data from the physical storage location according to the data management information when the matching result indicates that the second metadata and the third metadata match, and to perform a consistency comparison between the original data and the backup original data to obtain a reconciliation result.

[0131] According to an embodiment of the present invention, the matching submodule includes a determining unit, a comparing unit, and an obtaining unit. The determining unit is used to determine a target second metadata that has an associated mapping relationship with the third metadata from the second metadata; the comparing unit is used to compare the number of entries of the third metadata with the number of entries of the target second metadata when the global storage information matches the storage ownership information of the physical storage location of the target second metadata; the obtaining unit is used to obtain a matching result indicating that the second metadata and the third metadata match when the number of entries of the third metadata is consistent with the number of entries of the target second metadata.

[0132] According to embodiments of the present invention, data management information includes at least one of the following: data parsing method, data compression method, data decryption method, and data encapsulation method.
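To illustrate how data management information might guide retrieval before the consistency comparison, the sketch below undoes a compression step and then compares content digests. The specific choices are assumptions: the patent names no algorithms, so `zlib` stands in for the data compression method and SHA-256 for the comparison, the `management_info` dictionary layout is hypothetical, and decryption, parsing, and decapsulation are omitted.

```python
import hashlib
import zlib


def restore_backup_bytes(stored: bytes, management_info: dict) -> bytes:
    """Undo the storage-side transformation named in the management info."""
    data = stored
    # Assumed layout: {"compression": "zlib"}; other methods are not handled here.
    if management_info.get("compression") == "zlib":
        data = zlib.decompress(data)
    return data


def is_consistent(original: bytes, stored: bytes, management_info: dict) -> bool:
    """Compare original data with restored backup data by SHA-256 digest."""
    restored = restore_backup_bytes(stored, management_info)
    return hashlib.sha256(original).digest() == hashlib.sha256(restored).digest()


original = b"order-1,100\norder-2,250\n"
stored = zlib.compress(original)
ok = is_consistent(original, stored, {"compression": "zlib"})
```

Comparing digests rather than raw bytes is one possible design; it would allow the original side to send only a hash when the two copies live on different hosts.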

[0133] According to an embodiment of the present invention, the above-described apparatus further includes an obtaining module. The obtaining module is configured to obtain a matching result indicating a mismatch between the second metadata and the third metadata if any of the following conditions are met: the task identifier does not correspond to the file identifier of the third metadata; there is no target second metadata associated with the third metadata in the second metadata; the global storage information does not match the storage ownership information of the physical storage location of the target second metadata; the number of entries in the third metadata is inconsistent with the number of entries in the target second metadata.

[0134] According to an embodiment of the present invention, the above-described apparatus further includes a marking module and a correction module. The marking module is used to mark the third metadata as erroneous metadata when the task identifier and the file identifier of the third metadata do not correspond; the correction module is used to correct the number of entries of the target second metadata and the associated mapping relationship based on the number of entries of the third metadata when the number of entries of the third metadata is inconsistent with the number of entries of the target second metadata and the backup original data is normal.

[0135] According to an embodiment of the present invention, the above-described apparatus further includes a clearing module. The clearing module is used to determine abnormal instances from the original backup data based on the inconsistency information when there is inconsistency between the target second metadata and the original backup data, clear the abnormal instances, and then perform a full backup. The inconsistency information includes at least one of the following: a mismatch in the number of physical addresses, or inconsistent data content.

[0136] According to an embodiment of the present invention, the above-described apparatus further includes an execution module and a delay module. The execution module is used to continue executing the backup task when the backup process is in the running state, and to perform reconciliation when the backup process state switches to completed; the delay module is used to delay the execution time of the backup task when the backup process is in the pending state.
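The behavior of the execution module and the delay module can be sketched as a dispatch on the backup process status. The status strings, the returned action labels, and the ten-minute default delay are all hypothetical; the patent does not specify how long a pending task is delayed.

```python
from datetime import datetime, timedelta
from typing import Tuple


def handle_status(status: str,
                  scheduled_at: datetime,
                  delay: timedelta = timedelta(minutes=10)) -> Tuple[str, datetime]:
    """Return the action to take and the (possibly delayed) backup start time."""
    if status == "completed":
        # Backup finished: reconciliation can start immediately.
        return ("reconcile_now", scheduled_at)
    if status == "running":
        # Let the backup continue; reconcile once it switches to completed.
        return ("reconcile_after_completion", scheduled_at)
    if status == "pending":
        # Push the backup start time back so it does not overlap reconciliation.
        return ("delay_backup", scheduled_at + delay)
    raise ValueError(f"unknown backup process status: {status}")


start = datetime(2026, 1, 1, 0, 0)
action, new_start = handle_status("pending", start)
```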

[0137] According to an embodiment of the present invention, the above-described apparatus further includes a repair module. The repair module is used to repair the original data using historical backup original data in the event of original data loss.
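A minimal sketch of the repair step, assuming that both the original data and the historical backup original data can be viewed as key-to-record mappings (an assumption made purely for illustration; `repair_original` is a hypothetical name):

```python
from typing import Dict, List


def repair_original(original: Dict[str, bytes],
                    historical_backup: Dict[str, bytes]) -> List[str]:
    """Restore records missing from the original data; return the restored keys."""
    missing = sorted(key for key in historical_backup if key not in original)
    for key in missing:
        original[key] = historical_backup[key]
    return missing


original_data = {"order-1": b"100"}
backup_data = {"order-1": b"100", "order-2": b"250", "order-3": b"75"}
restored_keys = repair_original(original_data, backup_data)
```

Only missing records are copied in this sketch; records present on both sides are left untouched, so the repair does not overwrite newer original data with older backup content.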

[0138] According to embodiments of the present invention, any plurality of modules among the acquisition module 410, determination module 420, and reconciliation module 430 may be combined into one module, or any one of these modules may be split into multiple modules. Alternatively, at least a portion of the functionality of one or more of these modules may be combined with at least a portion of the functionality of other modules and implemented in one module. According to embodiments of the present invention, at least one of the acquisition module 410, determination module 420, and reconciliation module 430 may be at least partially implemented as hardware circuitry, such as a field-programmable gate array (FPGA), a programmable logic array (PLA), a system-on-a-chip, a system-on-a-substrate, a system-on-package, an application-specific integrated circuit (ASIC), or any other reasonable means of integrating or packaging circuitry, or implemented in software, hardware, or firmware, or in any appropriate combination of any of these three implementation methods. Alternatively, at least one of the acquisition module 410, determination module 420, and reconciliation module 430 may be at least partially implemented as a computer program module, which, when run, can perform corresponding functions.

[0139] Figure 5 schematically shows a block diagram of an electronic device suitable for implementing a data reconciliation method according to an embodiment of the present invention.

[0140] As shown in Figure 5, an electronic device 500 according to an embodiment of the present invention includes a processor 501, which can perform various appropriate actions and processes according to a program stored in a read-only memory (ROM) 502 or a program loaded from a storage section 508 into a random access memory (RAM) 503. The processor 501 may include, for example, a general-purpose microprocessor (e.g., a CPU), an instruction set processor and/or an associated chipset, and/or a special-purpose microprocessor (e.g., an application-specific integrated circuit (ASIC)). The processor 501 may also include onboard memory for caching purposes. The processor 501 may include a single processing unit or multiple processing units for performing different actions of the method flow according to an embodiment of the present invention.

[0141] RAM 503 stores various programs and data required for the operation of electronic device 500. Processor 501, ROM 502, and RAM 503 are interconnected via bus 504. Processor 501 executes various operations of the method flow according to embodiments of the present invention by executing programs in ROM 502 and / or RAM 503. It should be noted that the programs may also be stored in one or more memories other than ROM 502 and RAM 503. Processor 501 may also execute various operations of the method flow according to embodiments of the present invention by executing programs stored in said one or more memories.

[0142] According to an embodiment of the present invention, the electronic device 500 may further include an input / output (I / O) interface 505, which is also connected to a bus 504. The electronic device 500 may also include one or more of the following components connected to the input / output (I / O) interface 505: an input section 506 including a keyboard, mouse, etc.; an output section 507 including a cathode ray tube (CRT), liquid crystal display (LCD), etc., and a speaker, etc.; a storage section 508 including a hard disk, etc.; and a communication section 509 including a network interface card such as a LAN card, modem, etc. The communication section 509 performs communication processing via a network such as the Internet. A drive 510 is also connected to the input / output (I / O) interface 505 as needed. A removable medium 511, such as a disk, optical disk, magneto-optical disk, semiconductor memory, etc., is installed on the drive 510 as needed so that computer programs read from it can be installed into the storage section 508 as needed.

[0143] The present invention also provides a computer-readable storage medium, which may be included in the device / apparatus / system described in the above embodiments; or it may exist independently and not assembled into the device / apparatus / system. The computer-readable storage medium carries one or more programs, which, when executed, implement the method according to the embodiments of the present invention.

[0144] According to embodiments of the present invention, the computer-readable storage medium may be a non-volatile computer-readable storage medium, including, but not limited to: portable computer disks, hard disks, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), portable compact disk read-only memory (CD-ROM), optical storage devices, magnetic storage devices, or any suitable combination thereof. In the present invention, the computer-readable storage medium may be any tangible medium containing or storing a program that can be used by or in conjunction with an instruction execution system, apparatus, or device. For example, according to embodiments of the present invention, the computer-readable storage medium may include the ROM 502 and/or RAM 503 described above and/or one or more memories other than ROM 502 and RAM 503.

[0145] Embodiments of the present invention also include a computer program product comprising a computer program containing program code for performing the methods shown in the flowchart. When the computer program product is run on a computer system, the program code enables the computer system to implement the data reconciliation method provided in the embodiments of the present invention.

[0146] When the computer program is executed by the processor 501, it performs the functions defined in the system / apparatus of this invention. According to embodiments of the invention, the systems, apparatuses, modules, units, etc., described above can be implemented by computer program modules.

[0147] In one embodiment, the computer program may rely on a tangible storage medium such as an optical storage device or a magnetic storage device. In another embodiment, the computer program may also be transmitted and distributed in the form of signals over a network medium, and may be downloaded and installed via the communication section 509, and / or installed from a removable medium 511. The program code contained in the computer program can be transmitted using any suitable network medium, including but not limited to: wireless, wired, etc., or any suitable combination thereof.

[0148] In such an embodiment, the computer program can be downloaded and installed from a network via communication section 509, and / or installed from removable medium 511. When the computer program is executed by processor 501, it performs the functions defined in the system of this embodiment of the invention. According to embodiments of the invention, the systems, devices, apparatuses, modules, units, etc., described above can be implemented by computer program modules.

[0149] According to embodiments of the present invention, program code for executing the computer programs provided in the embodiments of the present invention can be written in any combination of one or more programming languages. Specifically, these computer programs can be implemented using high-level procedural and/or object-oriented programming languages, and/or assembly/machine languages. Programming languages include, but are not limited to, Java, C++, Python, C, or similar programming languages. The program code can be executed entirely on the user's computing device, partially on the user's computing device, partially on a remote computing device, or entirely on a remote computing device or server. In cases involving a remote computing device, the remote computing device can be connected to the user's computing device via any type of network, including a local area network (LAN) or a wide area network (WAN), or it can be connected to an external computing device (e.g., via the Internet using an Internet service provider).

[0150] The flowcharts and block diagrams in the accompanying drawings illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of the present invention. In this regard, each block in a flowchart or block diagram may represent a module, segment, or portion of code containing one or more executable instructions for implementing a specified logical function. It should also be noted that in some alternative implementations, the functions indicated in the blocks may occur in a different order than those indicated in the drawings. For example, two consecutively indicated blocks may actually be executed substantially in parallel, and they may sometimes be executed in reverse order, depending on the functions involved. It should also be noted that each block in a block diagram or flowchart, and combinations of blocks in a block diagram or flowchart, may be implemented using a dedicated hardware-based system that performs the specified function or operation, or using a combination of dedicated hardware and computer instructions.

[0151] Those skilled in the art will understand that the features described in the various embodiments of the present invention can be combined and/or integrated in various ways, even if such combinations or integrations are not explicitly described in the present invention. In particular, the features described in the various embodiments of the present invention can be combined and/or integrated in various ways without departing from the spirit and teachings of the present invention. All such combinations and/or integrations fall within the scope of the present invention.

[0152] The embodiments of the present invention have been described above. However, these embodiments are merely illustrative and not intended to limit the scope of the invention. Although various embodiments have been described above, this does not mean that the measures in the various embodiments cannot be used advantageously in combination. Various substitutions and modifications can be made by those skilled in the art without departing from the scope of the invention, and all such substitutions and modifications should fall within the scope of the invention.

Claims

1. A data reconciliation method, characterized in that the method comprises: in response to receiving a data reconciliation request from a cloud platform, obtaining a backup metadata dataset of a backup task according to a task identifier and a backup version identifier in the data reconciliation request, wherein the backup task is used to back up original business data on the cloud platform via the backup metadata dataset to obtain backup original data corresponding to the backup version identifier, the backup metadata dataset includes first metadata, second metadata, and third metadata, the first metadata includes task-level information of the backup task, the second metadata includes the physical storage location of the backup original data, and the third metadata includes global storage information of the backup original data and an association mapping relationship between the physical storage location and the global storage information; determining a backup process status of the backup task based on a task status in the task-level information; and, if the backup process status is complete, reconciling the original data and the backup original data based on the first metadata, the second metadata, and the third metadata to obtain a reconciliation result.

2. The method of claim 1, wherein the third metadata further includes data management information of the backup original data; and reconciling the original data and the backup original data based on the first metadata, the second metadata, and the third metadata to obtain the reconciliation result includes: when the task identifier corresponds to the file identifier of the third metadata, matching the second metadata with the third metadata according to the association mapping relationship in the third metadata, the global storage information, and the number of entries in the third metadata to obtain a matching result; if the matching result indicates that the second metadata matches the third metadata, retrieving the backup original data from the physical storage location according to the data management information; and comparing the original data and the backup original data for consistency to obtain the reconciliation result.

3. The method of claim 2, wherein matching the second metadata with the third metadata based on the association mapping relationship in the third metadata, the global storage information, and the number of entries in the third metadata to obtain a matching result includes: determining, from the second metadata, the target second metadata that has the association mapping relationship with the third metadata; if the global storage information matches the storage ownership information of the physical storage location of the target second metadata, comparing the number of entries in the third metadata with the number of entries in the target second metadata; and, when the number of entries in the third metadata is the same as the number of entries in the target second metadata, obtaining the matching result indicating that the second metadata matches the third metadata.

4. The method according to any one of claims 2 to 3, characterized in that the data management information includes at least one of the following: data parsing method, data compression method, data decryption method, and data encapsulation method.

5. The method according to claim 3, characterized in that the method further includes: obtaining a matching result indicating a mismatch between the second metadata and the third metadata if any one of the following conditions is met: the task identifier does not correspond to the file identifier of the third metadata; the target second metadata associated with the third metadata is not present in the second metadata; the global storage information does not match the storage ownership information of the physical storage location of the target second metadata; or the number of entries in the third metadata is inconsistent with the number of entries in the target second metadata.

6. The method according to claim 3, characterized in that the method further includes: if the task identifier does not correspond to the file identifier of the third metadata, marking the third metadata as erroneous metadata; and, if the number of entries in the third metadata is inconsistent with the number of entries in the target second metadata and the backup original data is normal, correcting the number of entries in the target second metadata and the association mapping relationship according to the number of entries in the third metadata.

7. The method according to claim 3, characterized in that the method further includes: in the event of an inconsistency between the target second metadata and the backup original data, determining an abnormal instance from the backup original data based on the inconsistency information, clearing the abnormal instance, and then performing a full backup, wherein the inconsistency information includes at least one of the following: a mismatch in the number of physical addresses, or inconsistent data content.

8. The method according to claim 1, characterized in that the method further includes: if the backup process status is running, continuing to execute the backup task, and performing reconciliation after the backup process status switches to completed; and, if the backup process status is pending, delaying the execution time of the backup task.

9. The method according to claim 1, characterized in that the method further includes: in the event of loss of the original data, repairing the original data using historical backup original data.

10. An electronic device, comprising: one or more processors; and a memory for storing one or more computer programs, characterized in that the one or more processors execute the one or more computer programs to implement the steps of the method according to any one of claims 1 to 9.