A continuous data protection method, apparatus, device, medium and product

CN122308731A · Pending · Publication Date: 2026-06-30 · CHINA UNITED NETWORK COMM GRP CO LTD +2

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
CHINA UNITED NETWORK COMM GRP CO LTD
Filing Date
2026-03-17
Publication Date
2026-06-30

AI Technical Summary

Technical Problem

Existing CDP systems lack effective request management and batch processing mechanisms when copying write input/output requests, resulting in a large number of fine-grained write operations impacting the storage system, increasing I/O load and network bandwidth consumption, and reducing data protection efficiency. This is especially true in high-concurrency, high-data-volume application scenarios where it is difficult to balance the real-time performance of data protection with system performance.

Method used

The method acquires write input/output requests from the target device, generates write records, and caches them in a data cache area. When a preset batch processing condition is triggered, the cached records are aggregated into a batch, which is first persistently stored in the local cache device and then sent to the remote distributed object storage cluster. The cluster supports building data snapshots at any point in time. Transmission uses zero-copy sending with scatter-gather I/O, and data protection is optimized through a two-level storage strategy.

Benefits of technology

It improves storage I/O efficiency, enhances the reliability and flexibility of data protection, reduces latency, meets the recovery point requirements of high-frequency trading systems and real-time analysis platforms, and mitigates the loss of data protection efficiency caused by frequent disk access and network transmission.

✦ Generated by Eureka AI based on patent content.


Abstract

This application provides a method, apparatus, device, medium, and product for continuous data protection, relating to the field of data processing technology. The method acquires a write input/output (I/O) request from a target device and copies it to obtain an original request and a copy request. The original request is submitted to the target device, and a write record is generated based on the copy request. This ensures both normal business processing by the target device and complete recording of data changes. Next, the write records are cached in a data cache area and aggregated into a batch when a preset batch processing condition is triggered. This avoids a large number of fine-grained write operations directly impacting the storage system and reduces the loss of data protection efficiency caused by frequent disk access and network transmission. Subsequently, the aggregated batch is first persistently stored in a local cache device and, once persistence succeeds, sent to a remote distributed object storage cluster. This improves storage I/O efficiency and enhances the reliability and flexibility of data protection.

Description

Technical Field

[0001] This application relates to the field of data processing technology, and in particular to a method, apparatus, device, medium and product for continuous data protection.

Background Technology

[0002] With the widespread application of technologies such as artificial intelligence, big data analytics, and cloud computing, various business systems are becoming increasingly reliant on data. Any data loss or service interruption can lead to serious economic losses and business continuity risks. Therefore, implementing an efficient, reliable, and low-latency Continuous Data Protection (CDP) mechanism is of paramount importance.

[0003] Existing CDP systems lack effective request management and batch processing mechanisms when copying write input/output (I/O) requests. This results in a large number of fine-grained write operations directly impacting the storage system, increasing I/O load and network bandwidth consumption. This not only affects the performance of the target device but may also reduce overall data protection efficiency due to frequent disk accesses and network transmissions. Furthermore, CDP solutions lack flexible batch processing and aggregation strategies when persistently storing copied write records to local or remote storage devices, leading to frequent and inefficient storage I/O operations.

Summary of the Invention

[0004] This application provides a method, apparatus, device, medium, and product for continuous data protection to solve the problems in the prior art.

[0005] Firstly, this application provides a method for continuous data protection, comprising:

[0006] Obtain the write input/output request of the target device, and copy the write input/output request to obtain the original request and the copy request;

[0007] The original request is submitted to the target device, and a write record is generated based on the data content of the copy request and the corresponding logical block address and timestamp;

[0008] The write records are cached in the data cache area, and when a preset batch processing condition is triggered, the multiple write records accumulated in the data cache area are aggregated into a batch.

[0009] The batch is persistently stored in a local cache device;

[0010] After the batch is successfully persisted to storage, the batch is sent to a remote distributed object storage cluster; wherein the remote distributed object storage cluster is configured to receive and save the batch, and to support the construction of data snapshots at any point in time based on the batch.

[0011] In one possible design, the preset batch processing conditions include at least two of the following: a time threshold, a record count threshold, and a byte count threshold.

[0012] Triggering the preset batch processing conditions includes:

[0013] If the caching duration of a write record in the data cache area exceeds the time threshold, batch aggregation is triggered;

[0014] Or, if the number of write records in the data cache area exceeds the record count threshold, batch aggregation is triggered;

[0015] Or, if the total number of bytes of the write records in the data cache area exceeds the byte count threshold, batch aggregation is triggered.

[0016] In one possible design, the local cache device is a high-speed storage device with non-volatile properties;

[0017] The step of persistently storing the batch to a local cache device includes:

[0018] Obtain the available space of the local cache device, wherein the available space is the difference between the total capacity of the local cache device and the occupied space;

[0019] If the available space is greater than or equal to the sum of the batch size and the preset safety margin, the batch is written sequentially to the local cache device according to the circular buffer mechanism;

[0020] Update the superblock information of the local cache device, wherein the superblock information includes the data write position, the position up to which data has been sent, and a checksum.
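
The three steps above can be sketched as follows. This is a minimal in-memory illustration, not the implementation described in the application: the class name `RingCache`, the monotonic position counters, and the superblock layout (two 64-bit positions followed by a CRC32) are all assumptions made for the example.

```python
import struct
import zlib


class RingCache:
    """In-memory stand-in for a non-volatile local cache device (hypothetical)."""

    def __init__(self, capacity: int, safety_margin: int):
        self.capacity = capacity
        self.safety_margin = safety_margin
        self.buf = bytearray(capacity)
        self.write_pos = 0  # total bytes ever written (monotonic counter)
        self.sent_pos = 0   # total bytes already sent to the remote cluster
        self.superblock = self._pack_superblock()

    def available(self) -> int:
        """Available space = total capacity - occupied (written but unsent) space."""
        return self.capacity - (self.write_pos - self.sent_pos)

    def _pack_superblock(self) -> bytes:
        # Assumed layout: write position, sent position, CRC32 over both.
        body = struct.pack("<QQ", self.write_pos, self.sent_pos)
        return body + struct.pack("<I", zlib.crc32(body))

    def append(self, batch: bytes) -> bool:
        """Write the batch sequentially, wrapping at the end of the buffer."""
        if self.available() < len(batch) + self.safety_margin:
            return False  # not enough room: caller must wait for sends to drain
        start = self.write_pos % self.capacity
        first = min(len(batch), self.capacity - start)
        self.buf[start:start + first] = batch[:first]
        self.buf[0:len(batch) - first] = batch[first:]  # wrapped tail, may be empty
        self.write_pos += len(batch)
        self.superblock = self._pack_superblock()
        return True

    def mark_sent(self, nbytes: int) -> None:
        """Advance the sent position once the remote side has the data."""
        self.sent_pos = min(self.sent_pos + nbytes, self.write_pos)
        self.superblock = self._pack_superblock()
```

Keeping the positions as ever-increasing counters and reducing them modulo the capacity only when indexing makes the occupied space a simple subtraction; a real device would additionally flush the superblock to stable storage after each update.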

[0021] In one possible design, sending the batch to a remote distributed object storage cluster includes:

[0022] Construct a transmission data packet containing a message header, batch data, and a checksum, wherein the message header includes a target device identifier and batch data length information;

[0023] The data packets are transmitted using scatter-gather input/output combined with zero-copy transmission; wherein zero-copy transmission is a transmission mode in which the data is not copied through user-space memory.

[0024] The network connection with the remote distributed object storage cluster is maintained using a preset connection pool. If the network is interrupted, a disconnection and reconnection process is triggered, and the batches that were not successfully sent are resent after a successful reconnection.
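
As a rough illustration of the packet layout and gather-style sending, consider the sketch below. The header format (32-bit device identifier and 32-bit batch length), the CRC32 trailer, and the function names are assumptions for the example. `socket.sendmsg` is a standard Python call available on Unix-like systems; it avoids concatenating the buffers in user space but is not by itself kernel-level zero-copy, and connection pooling and reconnection are omitted.

```python
import socket
import struct
import zlib

# Assumed wire format: device id (u32) + batch length (u32), then data, then CRC32.
HEADER = struct.Struct("<II")
TRAILER = struct.Struct("<I")


def build_packet(device_id: int, batch: bytes) -> list:
    """Return [header, batch, checksum] as separate buffers, without joining them."""
    header = HEADER.pack(device_id, len(batch))
    crc = zlib.crc32(batch, zlib.crc32(header))  # running CRC over header + batch
    return [header, batch, TRAILER.pack(crc)]


def send_packet(sock: socket.socket, buffers: list) -> None:
    """Gather-write all buffers with one sendmsg call; finish any short write."""
    total = sum(len(b) for b in buffers)
    sent = sock.sendmsg(buffers)
    if sent < total:
        sock.sendall(b"".join(buffers)[sent:])


def parse_packet(packet: bytes):
    """Receiver side: validate the checksum and return (device_id, batch)."""
    device_id, length = HEADER.unpack_from(packet, 0)
    end = HEADER.size + length
    (crc,) = TRAILER.unpack_from(packet, end)
    if crc != zlib.crc32(packet[:end]):
        raise ValueError("checksum mismatch")
    return device_id, packet[HEADER.size:end]
```

A sender would call `send_packet(sock, build_packet(device_id, batch))` on a connected stream socket and keep the batch in the local cache until the send is confirmed, so that it can be resent after a reconnect.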

[0025] In one possible design, the remote distributed object storage cluster stores the batch in the following ways:

[0026] The remote distributed object storage cluster appends the received batches to the log object, and the log object is named according to the target device identifier and sequence number;

[0027] If the log object exceeds a preset size or a preset time interval, the batch data in the log object is merged into a mirror object; wherein, the mirror object is used to store a complete data copy of the target device;

[0028] An incremental snapshot is created based on the mirror object. The incremental snapshot is used to record change data that is different from the parent snapshot.
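
The log, mirror, and incremental-snapshot relationship can be pictured with a small in-memory model. Everything here is hypothetical: the class `DeviceStore`, the object naming pattern, the use of a record count as the "preset size", and the representation of a batch as a list of `(lba, timestamp, data)` tuples are assumptions chosen to keep the sketch short.

```python
from typing import Dict, List, Tuple

Record = Tuple[int, int, bytes]  # (logical block address, timestamp, data)


class DeviceStore:
    """Toy model of the remote side for a single target device (hypothetical)."""

    def __init__(self, device_id: str, max_log_records: int):
        self.device_id = device_id
        self.max_log_records = max_log_records  # stands in for the preset size
        self.seq = 0
        self.logs: Dict[str, List[Record]] = {}      # log objects by name
        self.mirror: Dict[int, bytes] = {}           # complete copy of the device
        self.snapshots: List[Dict[int, bytes]] = []  # each holds changed blocks only
        self._parent_view: Dict[int, bytes] = {}     # state as of the last snapshot

    def _log_name(self) -> str:
        # Named from the target device identifier and a sequence number.
        return f"{self.device_id}.log.{self.seq:08d}"

    def append_batch(self, batch: List[Record]) -> None:
        """Append a received batch to the current log object; merge if too large."""
        log = self.logs.setdefault(self._log_name(), [])
        log.extend(batch)
        if len(log) > self.max_log_records:
            self._merge(log)
            self.seq += 1  # subsequent batches go to a new log object

    def _merge(self, log: List[Record]) -> None:
        """Apply the log to the mirror, then record what differs from the parent."""
        for lba, _ts, data in sorted(log, key=lambda r: r[1]):
            self.mirror[lba] = data
        delta = {lba: d for lba, d in self.mirror.items()
                 if self._parent_view.get(lba) != d}
        self.snapshots.append(delta)
        self._parent_view.update(delta)
```

Because each snapshot stores only the blocks that differ from its parent, reconstructing the full state at a snapshot means applying the chain of deltas in order; the time-interval trigger mentioned above is left out of this sketch.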

[0029] In one possible design, the method further includes:

[0030] Find the base snapshot whose timestamp is less than or equal to the target recovery time.

[0031] Read the incremental log object corresponding to the base snapshot, extract the batch data in the incremental log object and parse it to obtain the corresponding historical write records;

[0032] The historical write records are sorted by logical block address and timestamp, and the target historical write record with the latest timestamp among those with the same logical block address is retained;

[0033] A temporary snapshot is created based on the base snapshot, and the target historical write records are applied to the temporary snapshot.

[0034] The data in the temporary snapshot is written back to the target recovery device in order of preset block size to complete the data recovery.
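
The recovery steps can be sketched as below, under stated assumptions: snapshots are modeled as `(timestamp, {lba: data})` pairs, the newest eligible snapshot is chosen as the base, and only log records with timestamps after the base and up to the target time are replayed. These choices, and all function names, are illustrative; chunking the write-back by a preset block size is left out.

```python
from typing import Dict, List, Tuple

Record = Tuple[int, int, bytes]          # (logical block address, timestamp, data)
Snapshot = Tuple[int, Dict[int, bytes]]  # (timestamp, {lba: data})


def find_base_snapshot(snapshots: List[Snapshot], target_ts: int) -> Snapshot:
    """Pick the newest snapshot whose timestamp is <= the target recovery time."""
    eligible = [s for s in snapshots if s[0] <= target_ts]
    if not eligible:
        raise LookupError("no snapshot at or before the target time")
    return max(eligible, key=lambda s: s[0])


def latest_per_lba(records: List[Record], base_ts: int,
                   target_ts: int) -> Dict[int, Record]:
    """Sort by (lba, timestamp) and keep the newest in-range record per LBA."""
    latest: Dict[int, Record] = {}
    for lba, ts, data in sorted(records, key=lambda r: (r[0], r[1])):
        if base_ts < ts <= target_ts:
            latest[lba] = (lba, ts, data)  # later timestamps overwrite earlier ones
    return latest


def recover(snapshots: List[Snapshot], log_records: List[Record],
            target_ts: int) -> List[Tuple[int, bytes]]:
    """Return the blocks to write back to the recovery device, ordered by LBA."""
    base_ts, base_blocks = find_base_snapshot(snapshots, target_ts)
    temp = dict(base_blocks)  # temporary snapshot: a copy of the base
    for lba, (_lba, _ts, data) in latest_per_lba(log_records, base_ts, target_ts).items():
        temp[lba] = data
    return [(lba, temp[lba]) for lba in sorted(temp)]
```

Deduplicating by logical block address before applying the records means each block is written to the temporary snapshot at most once, regardless of how many times it was overwritten in the recovery window.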

[0035] Secondly, this application provides a continuous data protection device, comprising:

[0036] The copy module is used to acquire the write input / output request of the target device and copy the write input / output request to obtain the original request and the copy request;

[0037] The generation module is used to submit the original request to the target device and generate a write record based on the data content of the copy request and the corresponding logical block address and timestamp;

[0038] The batch aggregation module is used to cache the write records to the data cache area, and when a preset batch processing condition is triggered, aggregate multiple write records accumulated in the data cache area into a batch.

[0039] A storage module is used to persistently store the batch to a local cache device;

[0040] The sending module is used to send the batch to a remote distributed object storage cluster after the batch is successfully persisted to storage; wherein the remote distributed object storage cluster is configured to receive and save the batch, and to support the construction of data snapshots at any point in time based on the batch.

[0041] Thirdly, this application provides an electronic device, including: a processor, and a memory communicatively connected to the processor;

[0042] The memory stores computer-executable instructions;

[0043] The processor executes the computer-executable instructions stored in the memory to implement the method as described in any of the first aspects.

[0044] Fourthly, embodiments of this application provide a computer-readable storage medium storing computer-executable instructions, which, when executed by a processor, are used to implement the method as described in any of the first aspects.

[0045] Fifthly, embodiments of this application provide a computer program product, including a computer program that, when executed by a processor, implements the method described in any of the first aspects.

[0046] This application provides a continuous data protection method, apparatus, device, medium, and product. The method acquires a write input/output (I/O) request of the target device, copies it to obtain an original request and a copy request, submits the original request to the target device, and generates a write record based on the copy request. This ensures both normal business processing by the target device and complete recording of data changes. Next, the write records are cached in a data cache area and aggregated into a batch when a preset batch processing condition is triggered. This avoids a large number of fine-grained write operations directly impacting the storage system and reduces the loss of data protection efficiency caused by frequent disk access and network transmission. Especially in high-concurrency, high-data-volume application scenarios, it better balances real-time data protection against system performance. Afterward, the aggregated batch is first persistently stored in a local cache device to ensure reliable local data storage. Only after the batch has been successfully persisted is it sent to a remote distributed object storage cluster. The remote distributed object storage cluster supports building data snapshots at any point in time based on the batch. Thus, through a two-level storage approach combined with flexible batch processing and aggregation strategies, not only is the efficiency of storage I/O operations improved, but the reliability and flexibility of data protection are also enhanced, providing efficient, reliable, and low-latency continuous data protection.

Attached Figure Description

[0047] The accompanying drawings, which are incorporated in and form part of this specification, illustrate embodiments consistent with this application and, together with the description, serve to explain the principles of this application.

[0048] Figure 1 is an application scenario diagram corresponding to a continuous data protection method provided in an embodiment of this application;

[0049] Figure 2 is a flowchart of a continuous data protection method provided in an embodiment of this application;

[0050] Figure 3 is a flowchart of a continuous data protection method provided in another embodiment of this application;

[0051] Figure 4 is a schematic diagram of a continuous data protection device provided in an embodiment of this application;

[0052] Figure 5 is a structural example diagram of an electronic device provided in an embodiment of this application.

[0053] The accompanying drawings illustrate specific embodiments of this application, which will be described in more detail below. These drawings and descriptions are not intended to limit the scope of the concept in any way, but rather to illustrate the concept of this application to those skilled in the art through reference to particular embodiments.

Detailed Implementation

[0054] Exemplary embodiments will now be described in detail, examples of which are illustrated in the accompanying drawings. When the following description relates to the drawings, unless otherwise indicated, the same numbers in different drawings denote the same or similar elements. The embodiments described in the following exemplary embodiments do not represent all embodiments consistent with this application. Rather, they are merely examples of apparatuses and methods consistent with some aspects of this application as detailed in the appended claims.

[0055] To clearly understand the technical solution of this application, the solutions of the prior art will be described in detail first.

[0056] In the digital age, data security and continuity protection have become critical requirements for enterprise operations and personal information management. With the widespread application of technologies such as artificial intelligence, big data analytics, and cloud computing, various business systems are increasingly reliant on data. Any data loss or service interruption can lead to severe economic losses and business continuity risks. Therefore, implementing an efficient, reliable, and low-latency Continuous Data Protection (CDP) mechanism is of paramount importance.

[0057] Existing data protection solutions, such as periodic backups and snapshots, can recover data to some extent but leave significant time gaps between protection points and therefore carry a risk of data loss. Particularly in high-frequency transaction systems, real-time analytics platforms, and mission-critical applications, these solutions fail to meet stringent Recovery Point Objective (RPO) requirements. Existing continuous data protection (CDP) solutions largely rely on centralized storage architectures, achieving continuous data protection by capturing and replicating all write operations. However, existing CDP systems lack effective request management and batch processing mechanisms when replicating write input/output (I/O) requests. This results in a large number of fine-grained write operations directly impacting the storage system, increasing I/O load and network bandwidth consumption. This not only affects the performance of the target device but may also reduce overall data protection efficiency due to frequent disk accesses and network transmissions. Especially in high-concurrency, high-volume application scenarios, existing CDP solutions struggle to balance the real-time nature of data protection with system performance. Furthermore, CDP solutions lack flexible batch processing and aggregation strategies when persistently storing replicated write records to local or remote storage devices, leading to frequent and inefficient storage I/O operations.

[0058] Figure 1 is an application scenario diagram corresponding to a continuous data protection method provided in an embodiment of this application. As shown in Figure 1, the application scenario provided in this embodiment includes: a business server 10, a CDP control device 11, a local cache device 12 (such as a high-speed SSD array, used for temporary persistent storage of batch data), and a remote distributed object storage cluster 13 (used for long-term storage and supporting snapshot construction). The business server 10 and the CDP control device 11 are connected via PCIe or a high-speed network, and the CDP control device 11 establishes data transmission links with the local cache device 12 and the remote distributed object storage cluster 13, respectively.

[0059] Optionally, after the business server 10 generates a write I/O request, the CDP control device 11 obtains the write I/O request, copies it into an original request and a copy request, and submits the original request to the target device as usual so that the business is not affected. Subsequently, a structured write record is generated based on the data content of the copy request and the corresponding logical block address and timestamp. The write record is then cached in the data cache area. When the preset batch processing condition is triggered, the accumulated write records are aggregated into a batch to reduce scattered write operations. After that, the CDP control device 11 persists the batch data to the local cache device 12 so that the data is quickly written to nearby storage. After the batch data is successfully stored, the CDP control device 11 sends the batch data to the remote distributed object storage cluster 13. The remote distributed object storage cluster 13 receives and saves the batch data and supports building data snapshots at any point in time based on historical batch data to meet data backtracking requirements.

[0060] The technical solution of this application and how the technical solution of this application solves the above-mentioned technical problems are described in detail below with specific embodiments. These specific embodiments can be combined with each other, and the same or similar concepts or processes may not be described again in some embodiments. The embodiments of this application will now be described with reference to the accompanying drawings.

[0061] In the several embodiments provided in this application, it should be understood that the disclosed devices and methods can be implemented in other ways. For example, the device embodiments described above are merely illustrative; for instance, the division of modules is only a logical functional division, and in actual implementation, there may be other division methods. For example, multiple modules may be combined or integrated into another system, or some features may be ignored or not executed. Furthermore, the coupling or direct coupling or communication connection shown or discussed may be indirect coupling or communication connection through some interfaces, devices, or modules, and may be electrical, mechanical, or other forms.

[0062] Figure 2 is a flowchart of a continuous data protection method according to an embodiment of this application. As shown in Figure 2, the execution subject of this embodiment is a continuous data protection device. This device can be implemented through a computer program, or through a medium storing the relevant computer program, such as a USB flash drive and/or optical disc; alternatively, it can be implemented through a physical device that integrates or installs the relevant computer program, such as a chip or electronic device. The electronic device may be a computer or a server, etc. The continuous data protection method provided in this embodiment includes the following steps:

[0063] S201. Obtain the write input / output request of the target device and copy the write input / output request to obtain the original request and the copy request.

[0064] It should be noted that the purpose of this step is to offload write I / O requests, ensuring that normal business read / write operations on the target device are not affected, while simultaneously capturing write request data for subsequent protection. The target device refers to the business device requiring data protection, including but not limited to servers, storage nodes, and business terminals. Write input / output requests generated by the target device are the core operations that cause data changes on the target device, such as data addition, modification, and updates.

[0065] Optionally, a preset request capture module can capture all write I/O requests generated by the target device in real time to ensure no write requests are missed and to guarantee the continuity of data protection. Subsequently, a request replication module replicates each captured write I/O request, yielding two identical requests: the original request and the copy request. The original request is used to maintain the normal business processes of the target device, while the copy request is used for subsequent data protection-related processing, such as write record generation and persistent storage. The original and copy requests are transmitted independently and do not interfere with each other, so that data protection operations do not affect the business performance of the target device.

[0066] S202. Submit the original request to the target device and generate a write record based on the data content of the copy request and the corresponding logical block address and timestamp.

[0067] The purpose of this step is to transform the copy request into a data carrier that can be used for storage and recovery, i.e., a write record. First, the generated original request is submitted to the target device according to the target device's normal business process, ensuring that the target device can respond to the write request in a timely manner, complete the data read and write operations, and guarantee business continuity. The submission process of the original request is consistent with the write request processing of the target device in the existing technology and requires no modification of the target device's business logic, which reduces the deployment difficulty of the solution.

[0068] Specifically, based on the generated copy request, key information of the copy request is extracted to generate a write record. Key information includes, but is not limited to, the data content corresponding to the copy request (the data itself that the target device needs to write), the logical block address corresponding to the data (used to identify the location of the data in the target device's storage medium, providing a basis for address location during subsequent data recovery), and the timestamp of the request processing (used to record the time when the write request was generated, providing support for the construction of data snapshots at any subsequent time point and the location of the time point for data recovery).

[0069] Optionally, the above key information is integrated according to a preset format to generate a complete write record. Each write record uniquely corresponds to a copy request, accurately recording the details of data changes in a single write operation, and providing data support for subsequent data recovery.
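
For illustration, a write record with these three fields could be modeled as shown below. The application does not specify the "preset format", so the binary layout here (64-bit logical block address, 64-bit nanosecond timestamp, 32-bit payload length, then the payload) and the names `WriteRecord`, `encode`, `decode`, and `record_from_copy_request` are assumptions.

```python
from __future__ import annotations

import struct
import time
from dataclasses import dataclass

# Assumed header layout: LBA (u64), timestamp in ns (u64), payload length (u32).
_HEADER = struct.Struct("<QQI")


@dataclass(frozen=True)
class WriteRecord:
    lba: int           # logical block address of the write
    timestamp_ns: int  # time at which the write request was captured
    data: bytes        # data content carried by the copy request

    def encode(self) -> bytes:
        """Serialize as a fixed-size header followed by the payload."""
        return _HEADER.pack(self.lba, self.timestamp_ns, len(self.data)) + self.data

    @classmethod
    def decode(cls, buf: bytes, offset: int = 0) -> tuple[WriteRecord, int]:
        """Parse one record at offset; return the record and the next offset."""
        lba, ts, length = _HEADER.unpack_from(buf, offset)
        start = offset + _HEADER.size
        return cls(lba, ts, bytes(buf[start:start + length])), start + length


def record_from_copy_request(lba: int, data: bytes) -> WriteRecord:
    """Build a write record from a copy request's address and payload."""
    return WriteRecord(lba=lba, timestamp_ns=time.time_ns(), data=data)
```

Returning the next offset from `decode` lets a reader walk through a batch of concatenated records without any extra framing.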

[0070] S203. Cache the write records to the data cache area, and when the preset batch processing condition is triggered, aggregate the multiple write records accumulated in the data cache area into a batch.

[0071] This step replaces the existing technology's "instant transfer" with "cache aggregation," reducing the pressure on the storage system from fine-grained write operations.

[0072] The data cache is a pre-defined temporary storage space used to temporarily store generated write records. Its function is to buffer write records and avoid performing a storage operation for each write record generated, thereby reducing the frequency of storage I / O calls.

[0073] Specifically, each generated write record is cached sequentially in the data cache area. The data cache area adopts a first-in-first-out storage logic to ensure that the time order of write records is not disrupted and to guarantee the time sequence consistency during data recovery.

[0074] Optionally, when the preset batch processing conditions are met, the multiple write records accumulated in the data cache area are aggregated and merged into a single batch. The preset batch processing conditions are triggers configured in advance for the aggregation operation, and their specific settings can be adjusted flexibly according to the actual application scenario. The aim is to process write records in batches, reduce the number of individual storage operations, and lower storage I/O load and network transmission pressure.

[0075] S204. Persist the batch to the local cache device.

[0076] The purpose of this step is to ensure the local reliability of data and avoid data loss due to network anomalies, remote storage failures, etc. The local cache device refers to a storage device deployed in the same local environment as the request processing module. Local cache devices are characterized by fast read / write speeds and low latency, enabling rapid batch persistent storage.

[0077] Specifically, after batch aggregation of write records is completed, the batch is transferred to a local cache device. Using a pre-defined persistent storage strategy, all write records in the batch are permanently stored in the local cache device, ensuring that data is not lost due to power outages, restarts, or other abnormal events. The storage operations on the local cache device are performed on a batch basis, rather than on individual write records, thus significantly reducing the number of I / O calls to local storage, improving local storage efficiency, and providing a stable data source for subsequent remote transmission.

[0078] S205. After the batch is successfully persisted to storage, the batch is sent to a remote distributed object storage cluster; wherein the remote distributed object storage cluster is configured to receive and save the batch, and to support building data snapshots at any point in time based on the batch.

[0079] The purpose of this step is to achieve remote off-site backup of write record batches, further improving the reliability of data protection and providing flexible timing options for data recovery. The remote distributed object storage cluster is a distributed storage system deployed in a different location. It features high availability, high scalability, and strong fault tolerance, preventing data loss due to local environmental anomalies (such as local equipment failures or natural disasters), thus achieving off-site disaster recovery protection for data.

[0080] Optionally, the persistent storage result of the batch on the local cache device is first verified. Only after confirming that the batch has been stored successfully (i.e., no data is missing and the data is intact) is the batch transmitted to the remote distributed object storage cluster via the network, which avoids data redundancy or data loss caused by unsuccessful local storage.
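
The persist-verify-send ordering can be expressed compactly. In this sketch the verification is a CRC32 comparison against a read-back of the stored batch; that check, the function name, and the callback parameters are assumptions, since the application does not say how the storage result is verified.

```python
import zlib
from typing import Callable


def persist_then_send(batch: bytes,
                      write_local: Callable[[bytes], None],
                      read_local: Callable[[], bytes],
                      send_remote: Callable[[bytes], None]) -> bool:
    """Send a batch to the remote cluster only after local persistence is verified.

    write_local / read_local / send_remote are hypothetical callbacks that wrap
    the local cache device and the network sender.
    """
    expected = zlib.crc32(batch)
    write_local(batch)
    # Verify the stored copy before anything leaves the machine.
    if zlib.crc32(read_local()) != expected:
        return False  # local copy missing or corrupted: do not send
    send_remote(batch)
    return True
```

Returning `False` without sending leaves the decision to retry local persistence with the caller.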

[0081] Optionally, after receiving the batches, the remote distributed object storage cluster saves all batches according to a preset storage strategy. Optionally, the snapshot construction module constructs a data snapshot at any point in time based on all stored batches (combined with the write record timestamps corresponding to each batch), providing support for subsequent data recovery. That is, users can restore the data to any specified point in time as needed.

[0082] This application provides a continuous data protection method that acquires a write input/output (I/O) request from the target device and copies it to obtain an original request and a copy request. The original request is submitted to the target device, and a write record is generated based on the copy request. This ensures that the target device can process business normally while data change information is recorded completely. Next, the write records are cached in a data cache area and aggregated into a batch when a preset batch processing condition is triggered. This addresses the lack of effective request management and batch processing mechanisms in existing CDP systems: it avoids a large number of fine-grained write operations directly impacting the storage system, reduces the performance pressure on the target device, and reduces the loss of data protection efficiency caused by frequent disk access and network transmission. Especially in high-concurrency, large-data-volume application scenarios, it better balances the real-time performance of data protection against system performance. Subsequently, the aggregated batches are first persistently stored on a local cache device to ensure reliable local storage of the data. After successful persistence, the batches are sent to a remote distributed object storage cluster, which supports building data snapshots at any point in time based on the batches. By combining a two-level storage approach with flexible batch processing and aggregation strategies, not only is the efficiency of storage I/O operations improved, but the reliability and flexibility of data protection are also enhanced. This can meet the recovery point objective (RPO) requirements of high-frequency transaction systems, real-time analysis platforms, and critical business applications, reducing the economic losses and business continuity risks caused by data loss or service interruption, and providing more efficient, reliable, and low-latency continuous data protection.

[0083] As an optional implementation, based on any of the above embodiments, the preset batch processing conditions include at least two of the following: a time threshold, a record count threshold, and a byte count threshold; triggering the preset batch processing conditions includes the following steps:

[0084] Batch aggregation is triggered when the cache duration of a write record in the data cache area exceeds the time threshold; or when the number of write records in the data cache area exceeds the record count threshold; or when the total number of bytes of the write records in the data cache area exceeds the byte count threshold.
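
A minimal sketch of these three triggers follows. The class name `BatchAggregator`, its parameters, and the injectable clock are assumptions for illustration; records are treated as opaque byte strings and the cache is a plain first-in-first-out list.

```python
import time
from typing import Callable, List, Optional


class BatchAggregator:
    """FIFO cache that flushes when a time, count, or byte threshold is exceeded."""

    def __init__(self, max_age_s: float, max_records: int, max_bytes: int,
                 clock: Callable[[], float] = time.monotonic):
        self.max_age_s = max_age_s
        self.max_records = max_records
        self.max_bytes = max_bytes
        self._clock = clock
        self._records: List[bytes] = []
        self._bytes = 0
        self._oldest: Optional[float] = None  # cache start time of the oldest record

    def add(self, record: bytes) -> Optional[List[bytes]]:
        """Cache one record; return a batch if any threshold is now exceeded."""
        if not self._records:
            self._oldest = self._clock()
        self._records.append(record)
        self._bytes += len(record)
        return self.poll()

    def poll(self) -> Optional[List[bytes]]:
        """Check the triggers; call periodically so the time threshold can fire."""
        if not self._records:
            return None
        too_old = self._clock() - self._oldest > self.max_age_s
        too_many = len(self._records) > self.max_records
        too_big = self._bytes > self.max_bytes
        if too_old or too_many or too_big:
            batch, self._records = self._records, []
            self._bytes = 0
            self._oldest = None
            return batch  # records stay in arrival order
        return None
```

Because records are appended in arrival order, the oldest record is always the first one, so tracking a single start time is enough to evaluate the time threshold for the whole cache.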

[0085] The time threshold refers to the preset maximum cache duration of write records. It is a fixed duration set in advance according to the real-time requirements of data protection in the business scenario, such as 100 milliseconds or 500 milliseconds. The purpose of the time threshold is to ensure the real-time performance of data protection and to prevent write records from staying in the data cache area for too long, which would increase the risk of data loss or increase the RPO during data recovery.

[0086] Optionally, a cache start timestamp is added to each write record entering the data cache area, and the cache duration of the write record (the difference between the current time and the cache start timestamp) is calculated in real time. When the cache duration of any write record in the data cache area exceeds a preset time threshold, regardless of whether the number of write records or the total number of bytes in the cache area have reached the corresponding threshold, a batch aggregation operation is immediately triggered to aggregate all currently accumulated write records in the data cache area into a batch, followed by the local persistence step S204 and the remote transmission step S205. This method is suitable for scenarios with high real-time data protection requirements (such as high-frequency trading systems). By limiting the cache duration through time thresholds, it ensures that write records can be aggregated and stored in a timely manner, reducing the risk of data loss.

[0087] The record count threshold is the maximum number of write records in a single batch aggregation, such as 50, 100, or 200 records. It is a fixed value set in advance based on the I/O processing capacity of the storage device and the network bandwidth. Its purpose is to balance storage efficiency and system load by preventing a single aggregation from containing too few write records (which fails to reduce the I/O load) or too many (which prolongs aggregation and puts excessive pressure on a single storage operation).

[0088] Optionally, the total number of write records currently accumulated in the data cache area is counted in real time. When the count exceeds the preset record count threshold, a batch aggregation operation is triggered regardless of whether the cache duration or the total number of bytes has reached its threshold, aggregating all write records in the data cache area into a single batch. This method suits scenarios with a stable write I/O request frequency and a small data volume per write record (such as ordinary business data entry systems). Controlling the number of records aggregated per batch keeps the storage I/O load balanced and improves storage efficiency.

[0089] The byte count threshold is the maximum total number of bytes in a single batch aggregation, such as 1 MB, 5 MB, or 10 MB. It is a fixed value set in advance based on the optimal data volume for a single read/write of the storage device and the size of the network transmission unit. Its function is to optimize storage and network transmission efficiency, avoiding low transmission efficiency when a single aggregation is too small and high transmission latency when it is too large.

[0090] Optionally, the total number of bytes of all write records accumulated in the data cache area (i.e., the sum of the bytes of each write record) is computed in real time. When the total exceeds the preset byte count threshold, a batch aggregation operation is triggered regardless of whether the cache duration or the number of write records has reached its threshold, aggregating all write records in the data cache area into a single batch. This method suits scenarios where the data volume of individual write records varies greatly and the data volume of write requests fluctuates significantly (such as real-time video data storage systems). Controlling the total number of bytes per aggregation keeps the batch data volume matched to the processing capabilities of the storage device and network, improving overall data protection efficiency.

[0091] It should be noted that in this embodiment, the preset batch processing conditions must include at least two of the above three thresholds, that is, at least simultaneously setting the time threshold and record count threshold, the time threshold and byte count threshold, or the record count threshold and byte count threshold, or simultaneously setting all three thresholds to avoid the limitations caused by a single threshold. For example, if only the record count threshold is set, and the write request frequency is extremely low, the number of write records in the cache may not reach the threshold for a long time, resulting in excessive write record caching time and affecting data real-time performance. If only the time threshold is set, and the write request frequency is extremely high, batch aggregation may be frequently triggered in a short period of time, resulting in too many batches and failing to adequately reduce storage I / O load. If only the byte count threshold is set, there may be situations where the number of write records is too high but the total number of bytes does not reach the threshold, or the total number of bytes meets the standard but the number of records is too low, affecting aggregation efficiency. By using a combination of at least two thresholds, complementary advantages can be achieved, ensuring the real-time performance of data protection while effectively reducing system load, improving aggregation efficiency, and adapting to more diverse business scenarios.
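The combined trigger logic described above can be sketched as follows. This is a minimal illustration rather than the claimed implementation: the names `WriteRecord`, `BatchBuffer`, `should_flush`, and `drain` are hypothetical, and the default threshold values are only the example figures mentioned in the text.

```python
import time
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class WriteRecord:
    lba: int           # logical block address of the write
    data: bytes        # payload taken from the copy request
    cached_at: float   # cache start timestamp, in seconds


@dataclass
class BatchBuffer:
    # Hypothetical defaults; a threshold set to None is disabled.
    max_age_s: Optional[float] = 0.1     # time threshold (100 ms)
    max_records: Optional[int] = 100     # record count threshold
    max_bytes: Optional[int] = 1 << 20   # byte count threshold (1 MB)
    records: List[WriteRecord] = field(default_factory=list)
    total_bytes: int = 0

    def __post_init__(self):
        enabled = [t for t in (self.max_age_s, self.max_records, self.max_bytes)
                   if t is not None]
        if len(enabled) < 2:
            raise ValueError("at least two thresholds must be configured")

    def add(self, lba, data, now=None):
        now = time.monotonic() if now is None else now
        self.records.append(WriteRecord(lba, data, now))
        self.total_bytes += len(data)

    def should_flush(self, now=None):
        """Return True as soon as any enabled threshold is exceeded."""
        if not self.records:
            return False
        now = time.monotonic() if now is None else now
        # Records are appended in arrival order, so the first one is the oldest.
        if self.max_age_s is not None and now - self.records[0].cached_at > self.max_age_s:
            return True
        if self.max_records is not None and len(self.records) > self.max_records:
            return True
        if self.max_bytes is not None and self.total_bytes > self.max_bytes:
            return True
        return False

    def drain(self):
        """Hand over all accumulated records as one batch and reset the buffer."""
        batch, self.records, self.total_bytes = self.records, [], 0
        return batch
```

Because the three checks are joined by "or", whichever threshold is exceeded first triggers aggregation, which is how the thresholds compensate for one another under low or high write frequency.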

[0092] As an optional implementation, based on any of the above embodiments, the local cache device is a high-speed storage device with non-volatile properties.

[0093] Non-volatile means that stored data is not lost when the storage device is powered off, restarted, or experiences a temporary failure, and can be retained stably over the long term, which ensures the reliability of batch data in the local storage stage. High-speed means that the device's read and write speeds are higher than those of traditional mechanical hard drives, so batch data can be written and read quickly. This meets the fast processing needs of high-concurrency scenarios and prevents batches from accumulating because of slow local storage, which would impair the real-time performance of data protection.

[0094] Optionally, high-speed storage devices with non-volatile properties include, but are not limited to, solid-state drives (SSDs), non-volatile memory (NVM), and embedded multimedia cards (eMMC). These can be flexibly selected based on the storage capacity requirements, read / write speed requirements, and cost budget of the actual business scenario, without requiring modifications to other steps of the overall solution, thus improving the deployment flexibility and compatibility of the solution.

[0095] Specifically, persistently storing batches to a local cache device includes the following steps:

[0096] First, obtain the available space of the local cache device, where the available space is the difference between the total capacity of the local cache device and the space already occupied.

[0097] The purpose of this step is to perform a preliminary space check to avoid batch write failures, data loss, or storage anomalies due to insufficient space on the local cache device. Optionally, the space detection module of the local cache device can read the total capacity of the local cache device (i.e., the maximum space that the device can use to store data) and the occupied space (i.e., the space occupied by the stored batch data, system files, etc.) in real time, and calculate the available space that can be used to store the new batch of data using the formula: Available space = Total capacity - Occupied space.

[0098] It should be noted that the space detection operation is performed in real time and is triggered before each batch of writes to ensure the timeliness of the detection results. This avoids discrepancies between the calculated available space and the actual available space caused by multiple batches of concurrent writes or other processes occupying space, which could lead to write failures. In addition, the space detection process is lightweight and does not consume excessive system resources or device bandwidth, thus not affecting the read and write performance of the local cache device or the efficiency of the overall data protection process.

[0099] Secondly, if the available space is greater than or equal to the sum of the batch size and the preset safety margin, the batches are written to the local cache device in sequence according to the circular buffer mechanism.

[0100] This step is the core operation of batch persistent storage, which includes two stages: space threshold judgment and ordered writing. This ensures the feasibility of writing and the orderliness of batch storage.

[0101] The prerequisite for batch writing is that "available space ≥ batch size + preset safety margin". The preset safety margin is a fixed space (e.g., 100MB, 500MB) pre-set based on the business scenario. Its purpose is to reserve emergency space, preventing insufficient space during writing due to device system usage, space fragmentation, etc., when available space is exactly equal to the batch size, thus further improving the success rate of batch writing. In addition, the preset safety margin can also be used to handle unexpected situations (such as temporary failure of local caching devices, space occupied by data verification, etc.), providing additional security for batch storage. If the available space is less than the sum of the batch size and the preset safety margin, a space warning is triggered, and the batch write is not executed to avoid data loss.
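The pre-write space check amounts to a single comparison. The sketch below assumes capacities are tracked in bytes; the exception `InsufficientSpace` and the function names are hypothetical.

```python
class InsufficientSpace(Exception):
    """Raised instead of writing; the caller is expected to issue a space warning."""


def available_space(total_capacity: int, occupied: int) -> int:
    # Available space = total capacity - occupied space
    return total_capacity - occupied


def check_before_write(total_capacity: int, occupied: int,
                       batch_size: int, safety_margin: int) -> int:
    """Return the available space if the batch may be written, else raise."""
    available = available_space(total_capacity, occupied)
    if available < batch_size + safety_margin:
        raise InsufficientSpace(
            f"need {batch_size + safety_margin} bytes, only {available} available")
    return available
```

For example, with a capacity of 1000 bytes of which 400 are occupied, a 500-byte batch and a 100-byte margin pass the check exactly (600 ≥ 600), whereas one more occupied byte would cause the write to be refused.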

[0102] The circular buffer mechanism is the core method of batch writing. It divides the storage space of the local cache device into multiple contiguous storage blocks that form a circular storage queue, with each storage block holding one batch. Batch data is written to the storage blocks in first-in, first-out order. When writing reaches the end of the queue, it wraps around to the beginning and overwrites batch data that has already been transmitted remotely and no longer needs to be retained, or retains core batch data according to a preset retention strategy.

[0103] It should be noted that the circular buffer mechanism eliminates the need for frequent allocation and release of storage space, reducing storage fragmentation and improving the space utilization of the local cache device. In addition, batch writing is performed in a fixed order, which can accurately record the write position of each batch, facilitating subsequent batch reading, remote transmission, and data traceability. Moreover, the circular buffer mechanism has high write efficiency, adapting to the needs of multiple batches of continuous writing in high-concurrency scenarios, matching the high-speed storage characteristics of the local cache device, and further improving batch storage efficiency.

[0104] Optionally, during the batch writing process, a sequential writing method is adopted, that is, all write records of each batch are written to the storage block in the order of their generation time, to ensure the temporal consistency of write records within the batch, to support sequential reading and timestamp matching during subsequent data recovery, and to avoid data recovery anomalies due to disordered writing order.
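The circular write order can be illustrated with an in-memory model. This is a simplified sketch: `RingCache` and its methods are hypothetical names, slots are modelled as list entries rather than device blocks, and the preset retention strategy is not modelled.

```python
class RingCache:
    """Fixed number of slots; a slot is reused only after its batch was sent."""

    def __init__(self, slot_count: int):
        self.slots = [None] * slot_count
        self.write_pos = 0   # total number of batches written so far
        self.sent_pos = 0    # total number of batches confirmed as sent

    def write(self, batch: bytes) -> int:
        """Write the batch to the next slot and return the slot index."""
        if self.write_pos - self.sent_pos >= len(self.slots):
            # Wrapping around now would overwrite a batch not yet transmitted.
            raise BufferError("all slots hold unsent batches")
        slot = self.write_pos % len(self.slots)
        self.slots[slot] = batch
        self.write_pos += 1
        return slot

    def mark_sent(self) -> None:
        """Record that the oldest unsent batch reached the remote cluster."""
        if self.sent_pos < self.write_pos:
            self.sent_pos += 1

    def unsent(self):
        """Return unsent batches in the order they were written."""
        return [self.slots[i % len(self.slots)]
                for i in range(self.sent_pos, self.write_pos)]
```

Keeping monotonically increasing counters and taking the modulus only when indexing makes the "full" and "empty" states easy to distinguish, which is one common design choice for ring buffers.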

[0105] Finally, the superblock information of the local cache device is updated. The superblock information includes the data write location, the sent data location, and the checksum.

[0106] The purpose of this step is to update the storage status in real time, ensuring that the storage information of the local cache device is traceable and verifiable, and providing a basis for subsequent batch reading, remote transmission verification, and data anomaly investigation.

[0107] The superblock is a region in the local cache device used to store device status information and data management information. It is equivalent to the index directory of the local cache device. The information stored in it is real-time and complete, and can be read and modified by the system in real time.

[0108] Specifically, the data write location records the exact storage address (i.e., the corresponding storage block location) of the latest batch of data in the local cache device's circular storage queue. This facilitates quick location of the batch when reading the data or performing remote transmission operations, reducing address lookup time and improving remote transmission efficiency. Furthermore, it provides a location reference for writing the next batch of data, ensuring the continuity and orderliness of batch writing.

[0109] The "sent data location" records the specific storage address of batch data that has been successfully transmitted to the remote distributed object storage cluster in the local cache device's circular storage queue. This helps the system identify which batches of data have been remotely backed up and can be overwritten, and which batches of data have not been remotely transmitted and need to be retained, thus avoiding data loss caused by mistakenly overwriting untransmitted batches.

[0110] The checksum is a check code calculated over the currently written batch data using a preset checksum algorithm (such as CRC or MD5). It is used to verify whether the batch data was damaged, lost, or tampered with during writing, ensuring the integrity of the batch data. Before batch data is subsequently read for remote transmission, its checksum can be recalculated and compared with the checksum stored in the superblock. If the two match, the batch data is complete and transmission can proceed; if they do not match, the batch data is abnormal, and a data rewrite or anomaly handling mechanism is triggered to ensure data reliability.

[0111] It should be noted that the update operation of the superblock information is completed synchronously with the batch data writing operation. That is, after the batch data is written, the corresponding information in the superblock is updated immediately to ensure that the superblock information is consistent with the actual storage state of the local cache device, and to avoid problems such as batch positioning errors and data verification failures caused by information update delays.
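A minimal sketch of the superblock update and the pre-transmission check follows, assuming CRC32 as the preset checksum algorithm. The `Superblock` fields mirror the three items named in the text; the function names are hypothetical.

```python
import zlib
from dataclasses import dataclass


@dataclass
class Superblock:
    write_pos: int = 0   # data write location: slot of the latest batch
    sent_pos: int = 0    # sent data location: slot of the last batch sent
    checksum: int = 0    # CRC32 of the most recently written batch


def record_write(sb: Superblock, slot: int, batch: bytes) -> None:
    """Update the superblock immediately after the batch has been written."""
    sb.write_pos = slot
    sb.checksum = zlib.crc32(batch)


def verify_before_send(sb: Superblock, batch_read_back: bytes) -> bool:
    """Recompute the checksum of the batch read from the cache and compare."""
    return zlib.crc32(batch_read_back) == sb.checksum
```

If `verify_before_send` returns `False`, the caller would trigger the rewrite or anomaly handling path instead of transmitting the batch.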

[0112] As an optional implementation, based on any of the above embodiments, sending the batch to a remote distributed object storage cluster includes the following steps:

[0113] First, construct a transmission data packet containing a message header, batch data, and a checksum. The message header includes the target device identifier and batch data length information.

[0114] Optionally, the batch data to be transmitted can be structured and encapsulated, integrating the message header, batch data, and checksum into a unified format transmission data packet. This ensures the integrity and identifiability of the transmitted data, while the checksum enables data integrity verification during transmission, thus mitigating the risk of data corruption in advance.

[0115] The message header is the "header identifier" of the transmitted data packet, which is used to quickly inform the remote distributed object storage cluster of the basic information of the data packet, so that the receiving end can quickly parse, identify and process the data packet, obtain key information without parsing the complete data packet, and improve the parsing efficiency of the data packet.

[0116] The target device identifier is a unique code that identifies the target device (such as a combination of device ID, IP address, and port number). The purpose of the target device identifier is to allow the receiving end to accurately identify which target device the batch of data originated from. This facilitates the remote distributed object storage cluster's categorization and management of batch data by target device, and also facilitates quick location of the corresponding batch data during subsequent data recovery. The batch data length information informs the remote distributed object storage cluster of the specific number of bytes in the currently transmitted data packet. This allows the remote distributed object storage cluster to verify the integrity of the received data. If the received data volume does not match the batch data length information, it indicates that data loss occurred during transmission, triggering a retransmission mechanism.

[0117] Among them, batch data is the core content of the transmitted data packet, namely the complete batch data that has been aggregated in S203 and persisted locally in S204. The batch data includes all write records in the batch and information such as the corresponding timestamp and logical block address. During the batch data encapsulation process, the timing consistency of the write records in the batch is maintained to ensure that the batch data stored at the receiving end is completely consistent with the batch data stored in the local cache device.

[0118] The checksum is a unique check code calculated using a preset check algorithm (such as CRC32, MD5, SHA-1, etc.) for the entire transmitted data packet (or only for batch data, which can be flexibly set according to the scenario). The function of the checksum is to realize data integrity verification during the transmission process.

[0119] Optionally, the sending end calculates the checksum and encapsulates it into the transmitted data packet. After receiving the data packet, the receiving end recalculates the checksum using the same verification algorithm and compares the calculated result with the checksum in the data packet. If the comparison matches, it means that the data packet has not been damaged, tampered with, or lost during transmission and can be parsed and stored normally. If the comparison does not match, it means that the data packet is abnormal. The receiving end refuses to store it and notifies the sending end to trigger a retransmission to ensure that the batch data transmitted to the receiving end is complete and reliable.

[0120] It should be noted that the format of the transmitted data packets is preset to a unified standard, and the sending and receiving ends use the same parsing rules to ensure that the data packets can be parsed normally. Standardized encapsulation can improve transmission compatibility, adapt to different types of remote distributed object storage clusters, and improve the versatility of the solution.
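One possible packet layout is sketched below. The field widths, the byte order, and the choice of CRC32 over header plus batch data are assumptions made for illustration; the text only requires that both ends agree on a unified format.

```python
import struct
import zlib


def build_packet(device_id: str, batch: bytes) -> bytes:
    """Header (device id length, device id, batch length) + batch + CRC32."""
    dev = device_id.encode("utf-8")
    header = struct.pack("!H", len(dev)) + dev + struct.pack("!I", len(batch))
    checksum = zlib.crc32(header + batch)
    return header + batch + struct.pack("!I", checksum)


def parse_packet(packet: bytes):
    """Return (device_id, batch); raise ValueError if the packet is damaged."""
    (dev_len,) = struct.unpack_from("!H", packet, 0)
    device_id = packet[2:2 + dev_len].decode("utf-8")
    (batch_len,) = struct.unpack_from("!I", packet, 2 + dev_len)
    start = 2 + dev_len + 4
    end = start + batch_len
    if len(packet) != end + 4:
        raise ValueError("length mismatch: request retransmission")
    (expected,) = struct.unpack_from("!I", packet, end)
    if zlib.crc32(packet[:end]) != expected:
        raise ValueError("checksum mismatch: request retransmission")
    return device_id, packet[start:end]
```

The receiver checks the declared batch length first and the checksum second, matching the two integrity checks described above; either failure leads to a retransmission request.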

[0121] Secondly, the transmission data packet is sent with zero copy using scatter-gather I/O, where zero copy means that the transmission does not pass through user-space memory.

[0122] The purpose of this step is to improve transmission efficiency, reduce system load, and solve the defects of high transmission latency and high CPU and memory resource consumption caused by multiple data copies in the existing transmission method, so as to adapt to the fast transmission needs in high-concurrency and large data volume scenarios.

[0123] Optionally, scatter-gather I/O can be used to read the parts of the data packet from separate locations and send them together. Combined with zero-copy technology, this reduces the number of times data is copied between memory and devices, shortens the transmission path, increases the transmission rate, and reduces system resource consumption.

[0124] Scatter-gather I/O can directly read the message header, batch data, and checksum from the different memory addresses where they are stored, without first assembling them into one contiguous memory region, and sends them as a single transmission data packet. This reduces data movement and assembly operations in memory, saving CPU resources and memory bandwidth. It is especially suitable for scenarios with large batch data volumes whose components are stored separately, further improving transmission efficiency.

[0125] Zero-copy transmission skips the copying process in user-space memory, enabling direct transmission from the local cache device to kernel-space memory and then to the network interface card. Data is not copied or processed through user-space memory, reducing the number of data copies. Zero-copy shortens the data transmission link, reduces CPU and memory resource usage, and improves transmission speed. Especially in high-concurrency, large-volume batch transmission scenarios, it effectively avoids excessive transmission latency and system resource exhaustion, ensuring real-time data protection.

[0126] It should be noted that scatter-gather I/O works together with zero-copy technology. Scatter-gather I/O solves the problem of efficiently assembling the components of the data packet, while zero-copy technology solves the problem of efficiently moving the data during transmission. Combining the two maximizes transmission efficiency, reduces system load, and ensures that the transmission process does not affect the normal business operation of the target device.
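The gather half of this step can be sketched with Python's `socket.sendmsg`, which is available on Unix-like systems and accepts a list of separate buffers. The helper name `send_gathered` is hypothetical. Note the limitation: this avoids concatenating the parts in user space, but the kernel still copies from the user buffers, so it is not by itself the zero-copy path described above; a true zero-copy path would rely on a facility such as `os.sendfile` from the cache device's file descriptor, which is not shown here.

```python
import socket


def send_gathered(sock: socket.socket, parts) -> None:
    """Send several separate buffers as one stream without joining them first."""
    views = [memoryview(p) for p in parts if len(p)]
    while views:
        sent = sock.sendmsg(views)   # one gather write over all remaining buffers
        # Drop buffers that were sent completely.
        while views and sent >= len(views[0]):
            sent -= len(views[0])
            views.pop(0)
        # Trim a buffer that was only partially sent.
        if views and sent:
            views[0] = views[0][sent:]
```

A caller would pass the message header, the batch data, and the checksum as three separate buffers, for example `send_gathered(conn, [header, batch, checksum_bytes])`.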

[0127] Finally, a preset connection pool is used to maintain the network connections with the remote distributed object storage cluster. If the network is interrupted, a reconnection process is triggered, and the batches that were not successfully sent are resent after reconnection succeeds.

[0128] The purpose of this step is to ensure transmission reliability, solve problems such as data loss due to network interruptions during existing transmission processes, resource waste and increased latency caused by frequent connection creation / closing, and ensure that batch data can be successfully transmitted to the remote distributed object storage cluster to achieve off-site disaster recovery protection.

[0129] Optionally, the preset connection pool can be used to manage network connections in a unified manner, enabling connection reuse and reducing the overhead of creating and closing connections. A reconnection mechanism is established to cope with network interruptions, so that transmission resumes automatically and no data is lost.

[0130] The preset connection pool is a pre-created, stable network connection queue with the remote distributed object storage cluster, managed, reused, and monitored in a unified manner. Connections in the preset connection pool can be reused repeatedly. Each time a batch of data is transmitted, an idle connection is retrieved from the preset connection pool, and the connection is returned to the pool after transmission is complete, eliminating the need for frequent connection creation and closure. Furthermore, the preset connection pool monitors the status of each connection in real time (e.g., whether it is normal or idle), promptly cleaning up and replenishing abnormal connections to ensure that connections in the preset connection pool are always available, improving network connection stability and reuse rate, and reducing system resource consumption.

[0131] Optionally, the network connection status with the remote distributed object storage cluster can be monitored in real time. If a network interruption is detected (such as network failure, link disconnection, or no response from the receiving end), the disconnection and reconnection process can be triggered immediately without manual intervention.

[0132] Optionally, a maximum number of reconnection attempts (e.g., 3 or 5) and a reconnection interval (e.g., 100 milliseconds or 500 milliseconds) are preset. Reconnection is attempted repeatedly at the preset interval. If a reconnection succeeds within the preset number of attempts, batch transmission resumes. If all attempts fail, an alarm mechanism is triggered and the information of the batches that were not successfully sent is recorded. After the network is restored, reconnection and retransmission are triggered again to ensure that no batch data is missed.

[0133] Optionally, after a successful reconnection, the system automatically retrieves batches of data that were not successfully sent. This can be done by comparing the sent data location with the data write location in the superblock information of the local cache device to locate the batches that were not successfully sent. The system then resends the batches that were not successfully sent in the order of their generation time, ensuring that all batches of data can be successfully transmitted to the remote distributed object storage cluster.
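The reconnection and resend flow can be sketched as two small helpers. Both are illustrative: `connect` and `send` are caller-supplied callables, the positions are modelled as indices into an ordered list of batches, and the connection pool itself is not modelled.

```python
import time


def reconnect(connect, max_attempts=3, interval_s=0.1, sleep=time.sleep):
    """Try to re-establish a connection; return it, or None if all attempts fail."""
    for attempt in range(1, max_attempts + 1):
        try:
            return connect()
        except ConnectionError:
            if attempt < max_attempts:
                sleep(interval_s)   # wait for the preset reconnection interval
    return None   # the caller raises an alarm and records the unsent batches


def resend_unsent(send, batches, sent_pos, write_pos):
    """Resend batches in [sent_pos, write_pos) in generation order.

    Returns the new sent position, to be stored back in the superblock.
    """
    for pos in range(sent_pos, write_pos):
        send(batches[pos])
        sent_pos = pos + 1
    return sent_pos
```

The gap between the sent data location and the data write location identifies exactly the batches that still need transmission, so no separate retry queue is required in this sketch.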

[0134] As an optional implementation, based on any of the above embodiments, the method of storing batches in a remote distributed object storage cluster includes the following steps:

[0135] First, the remote distributed object storage cluster appends the received batches to the log object, and the log object is named according to the target device identifier and sequence number.

[0136] The purpose of this step is to achieve orderly and real-time archiving of batches, ensuring that received batch data is not lost or out of sequence, and facilitating classified management by target device, thus providing clear basic data support for subsequent data integration and snapshot creation.

[0137] Optionally, the "log append write" mode is adopted to replace the traditional "random write" mode, avoiding data overwriting and timing disorder during the writing process. Through standardized naming rules, the orderly differentiation and rapid retrieval of data from multiple target devices and multiple batches can be achieved, solving the problems of batch disorder and retrieval difficulties in existing remote storage.

[0138] The log object is a pre-defined storage medium in the remote distributed object storage cluster used for temporarily archiving batch data. Essentially, the log object is a linear storage structure that records batch data in chronological order. It employs an "append-write" mode, meaning that each received batch is directly appended to the end of the corresponding log object without modifying any batch data already stored within it. This write mode is simple and efficient, requiring no address location or data migration, and can quickly complete batch reception and archiving, adapting to real-time storage needs in high-concurrency batch transmission scenarios. Furthermore, append-write strictly preserves the batch data reception sequence (consistent with the batch generation sequence), ensuring accurate reconstruction of the data change process and guaranteeing data temporal consistency during subsequent data integration and recovery.

[0139] It should be noted that the log object is scalable and can dynamically expand according to the amount of batch data received without manual intervention, avoiding batch storage failure due to insufficient log object capacity and improving the reliability of remote storage. In addition, the log object adopts a distributed storage architecture, and its data is synchronously backed up to multiple nodes in the cluster, avoiding batch data loss due to the failure of a single node and making full use of the high availability characteristics of the remote distributed object storage cluster.

[0140] Optionally, log objects are named according to the rule of "target device identifier and sequence number". The target device identifier is a unique code that identifies the target device (consistent with the target device identifier in the data packet header, such as a combination of device ID, IP address, and port number). The sequence number is a unique sequence code for the log object (e.g., 001, 002, 003...) used to distinguish different log objects corresponding to the same target device. This naming rule achieves "classification by device and differentiation by sequence number", facilitating the rapid identification of the target device corresponding to each log object by the remote distributed object storage cluster, and simplifying the management of multiple log objects for the same target device, such as data integration by sequence number and cleaning up expired log objects.

[0141] For example, for a target device identified as "Transaction Server ID001", its first log object is named "Transaction Server ID001_001", the second log object is named "Transaction Server ID001_002", and so on. By naming, the target device and its corresponding log object can be quickly located, improving the retrieval efficiency of batch data.

[0142] In addition, batch data for the same target device is appended to the current log object corresponding to that target device until the log object meets the merging conditions. Then, a new log object with a new sequence number is created, and the batch appending continues. This ensures that batch data for the same target device is centrally archived, improving management efficiency.
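The naming rule and append-only behaviour can be modelled in memory as follows. `LogStore` is a hypothetical stand-in for the cluster's object interface, and the three-digit, underscore-separated sequence format follows the example names given above.

```python
class LogStore:
    """In-memory model of per-device, append-only log objects."""

    def __init__(self):
        self.objects = {}   # log object name -> list of batches, in arrival order
        self.seq = {}       # target device identifier -> current sequence number

    def current_name(self, device_id: str) -> str:
        seq = self.seq.setdefault(device_id, 1)
        return f"{device_id}_{seq:03d}"

    def append(self, device_id: str, batch: bytes) -> str:
        """Append the batch to the device's current log object; never modify old data."""
        name = self.current_name(device_id)
        self.objects.setdefault(name, []).append(batch)
        return name

    def rotate(self, device_id: str) -> str:
        """Start a new log object once the current one meets the merging conditions."""
        self.seq[device_id] = self.seq.get(device_id, 1) + 1
        return self.current_name(device_id)
```

Appending for "Transaction Server ID001" writes to "Transaction Server ID001_001" until `rotate` is called, after which new batches go to "Transaction Server ID001_002".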

[0143] Secondly, if the log object exceeds the preset size or preset time interval, the batch data in the log object will be merged into the mirror object; the mirror object is used to store a complete data copy of the target device.

[0144] The purpose of this step is to achieve efficient integration of batch data, generate a complete data copy of the target device, solve the problems of scattered storage, data redundancy, and low retrieval efficiency of batch data in log objects, and provide complete basic data for subsequent incremental snapshot creation, balancing storage efficiency and data integrity.

[0145] Optionally, by setting preset merging conditions, scattered batch data in the log object can be merged periodically or quantitatively to form a complete data copy (mirror object) of the target device. This reduces data redundancy, improves the efficiency of data retrieval and snapshot creation, and retains the core information of the batch data to ensure data traceability.

[0146] Optionally, the merging condition is either "the log object exceeds the preset size" or "the log object exceeds the preset time interval". Meeting either condition triggers the merge operation. This ensures that a log object neither slows retrieval because its data volume is too large nor leaves batch data scattered for a long time because it has existed too long, thus balancing data integration efficiency and storage performance.

[0147] The preset size is the maximum capacity of a log object (e.g., 10 GB, 50 GB), set in advance based on the storage capacity and data retrieval efficiency requirements of the remote distributed object storage cluster. Its purpose is to limit the size of the log object so that an oversized log object does not degrade write and retrieval efficiency. The preset time interval is the maximum time span of a log object (e.g., 1 hour, 6 hours), set in advance based on the real-time data integration requirements of the business scenario. Its purpose is to ensure that batch data is merged into the mirror object in a timely manner and is not left scattered in the log object for a long time, which would reduce the efficiency of subsequent snapshot creation and the speed of data recovery.

[0148] Optionally, the remote distributed object storage cluster monitors the size and creation time of each log object in real time. When the current size of a log object exceeds the preset size or its creation time exceeds the preset time interval, a merge operation is immediately triggered to merge all batch data in the log object into the corresponding target device's mirror object. After the merge is completed, the log object can be cleaned up according to a preset strategy (such as deleting after backup or archiving), releasing storage space and improving space utilization.

[0149] The mirror object is a storage object in the remote distributed object storage cluster used to store a complete copy of the target device's data. Each target device corresponds to a unique mirror object, which stores the latest complete data state of the target device and is essentially a "mirror backup" of the target device's data. The core of the merge operation is to parse and integrate the multiple batches of data scattered in the log object in the order in which the batches were generated, remove redundant data (for example, when the same data has multiple modification records, only the latest modification record is retained), supplement missing data, and finally update the mirror object, so that the mirror object always maintains the complete data state of the target device.
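
As an illustration of the merge step, the following sketch folds batches into a mirror while keeping only the newest record per logical block address. It assumes, purely for the example, that the mirror can be modeled as a dictionary keyed by logical block address; `WriteRecord` and `merge_batches` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class WriteRecord:
    lba: int        # logical block address
    timestamp: int  # time of the write
    data: bytes     # data content


def merge_batches(mirror: Dict[int, WriteRecord],
                  batches: Iterable[List[WriteRecord]]) -> Dict[int, WriteRecord]:
    """Fold batches (in generation order) into a copy of the mirror.

    For each logical block address only the record with the latest
    timestamp survives; addresses not touched by any batch are kept.
    """
    merged = dict(mirror)
    for batch in batches:
        for rec in batch:
            current = merged.get(rec.lba)
            if current is None or rec.timestamp >= current.timestamp:
                merged[rec.lba] = rec
    return merged
```

Storing the whole record, rather than only its data, keeps the timestamp and logical block address available after the merge.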

[0150] Optionally, during the merging process, information such as timestamps and logical block addresses corresponding to each batch of data is retained to ensure that the data in the mirror object is not only complete but also traceable in its change process, providing support for subsequent incremental snapshot creation and data recovery at any point in time. In addition, the merging operation adopts a distributed parallel processing method, which can quickly complete the parsing and integration of multiple batches of data, avoiding the impact of excessive merging time on the overall performance of remote storage, and adapting to the batch merging needs of high concurrency and large data volume.

[0151] The mirror object serves as a complete data copy of the target device, providing the foundation for data recovery. When data loss or damage occurs on the target device, the complete data can be quickly recovered directly based on the mirror object, significantly shortening recovery time. Furthermore, the mirror object provides a baseline for subsequent incremental snapshot creation. Incremental snapshots only need to record changes from the mirror object (or the previous snapshot), without storing the complete data, reducing snapshot storage space usage and improving the efficiency of snapshot creation and management.

[0152] Finally, an incremental snapshot is created based on the mirror object. The incremental snapshot is used to record change data that is different from the parent snapshot.

[0153] The purpose of this step is to achieve lightweight snapshot management, support data recovery at any point in time, and solve the shortcomings of existing full snapshots, such as large storage space consumption, long creation time, and cumbersome management. While ensuring the flexibility of data recovery, it improves snapshot creation efficiency and space utilization.

[0154] Optionally, the mirror object (or the previous incremental snapshot) is used as the parent snapshot, and only the batch data changes (i.e., the newly added and modified data) between the two snapshots are recorded. The complete data is not stored, which realizes the lightweight creation and storage of snapshots. Through the chain association of multiple incremental snapshots, data recovery at any point in time can be supported.

[0155] Optionally, incremental snapshots are created with the mirror object as the initial parent snapshot. Each subsequent incremental snapshot creation uses the previously created incremental snapshot as the parent snapshot, forming a chain structure of "mirror object → incremental snapshot 1 → incremental snapshot 2 → ... → incremental snapshot N". Optionally, during the creation process, the remote distributed object storage cluster compares the batch data changes of the current mirror object (or the previous incremental snapshot) with the snapshot being created, identifies newly added or modified batch data (i.e., changed data different from the parent snapshot), and only records this changed data in the newly created incremental snapshot, without storing the complete mirror data.

[0156] For example, in the initial state, incremental snapshot 1 is created based on the mirror object. At this time, incremental snapshot 1 records all batch change data from the time the mirror object was created until the time of incremental snapshot 1 creation. When incremental snapshot 2 is created later, the changes in incremental snapshot 1 and the current batch data are compared, and only the change data within this time period is recorded, and so on.

[0157] Incremental snapshots record changes to target device data at different points in time. Through a chain of linked incremental snapshots, the data state of the target device at any given time can be accurately restored, supporting data recovery at any point in time. Specifically, when restoring data to a specific point in time, only the parent snapshot (a mirror object or an incremental snapshot) corresponding to that point in time needs to be loaded. Then, the changed data from all incremental snapshots after the parent snapshot up to the specified point in time is sequentially overlaid. This quickly restores the complete data state of the target device at that point in time, eliminating the need to restore all data and significantly shortening data recovery time. Furthermore, incremental snapshots support individual management and deletion. Snapshot retention policies can be set according to business needs (e.g., retaining incremental snapshots from the most recent 7 days or 30 days). Deleting expired incremental snapshots only releases the data storage space corresponding to that snapshot, without affecting the parent snapshot or other incremental snapshots, further improving storage space utilization and reducing storage costs.
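
The chain structure can be illustrated with a short sketch. It assumes, for the example only, that the mirror object is modeled as the root of the chain holding the full block map, and that each incremental snapshot holds only its changed blocks; `Snapshot` and `materialize` are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Snapshot:
    timestamp: int
    changes: Dict[int, bytes] = field(default_factory=dict)  # lba -> data
    parent: Optional["Snapshot"] = None  # None for the mirror object (root)


def materialize(snapshot: Snapshot) -> Dict[int, bytes]:
    """Rebuild the full block map at `snapshot` by overlaying the chain."""
    chain = []
    node: Optional[Snapshot] = snapshot
    while node is not None:       # walk back to the root
        chain.append(node)
        node = node.parent
    state: Dict[int, bytes] = {}
    for node in reversed(chain):  # apply oldest first, newest last
        state.update(node.changes)
    return state
```

For example, with a mirror holding blocks 0 and 1, a first incremental snapshot that modifies block 1, and a second that adds block 2, `materialize` on the second snapshot yields all three blocks with the modified value for block 1.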

[0158] It should be noted that after an incremental snapshot is created, it will establish an associated index with the corresponding parent snapshot, log object, and batch data to facilitate quick location and retrieval. It will also calculate a checksum for each incremental snapshot to verify the integrity of the snapshot data and avoid data recovery anomalies caused by snapshot data corruption.

[0159] Figure 3 is a flowchart of a continuous data protection method provided in another embodiment of this application. As shown in Figure 3, as an optional implementation, based on any of the above embodiments, the method further includes the following steps:

[0160] S301. Find the base snapshot whose timestamp is less than or equal to the target recovery time point.

[0161] This step uses timestamp matching to determine the baseline for data recovery, providing a foundation for the overlay of subsequent incremental data and ensuring the temporal consistency and integrity of the recovered data.

[0162] The target recovery time point refers to the specific point in time to which the data of the target recovery device needs to be restored. It is preset by the user according to actual needs, for example the time before a certain fault occurred or the time before a data misoperation. The target recovery time point can be manually entered by the user or automatically generated by the system based on a fault detection signal. The precision of the target recovery time point is consistent with the precision of the timestamps of the write records described above, ensuring accurate recovery positioning.

[0163] A base snapshot refers to a pre-built, complete data snapshot in a remote distributed object storage cluster whose timestamp is closest to and less than or equal to the target recovery time. Essentially, a base snapshot is a complete data image of the target device at a specific historical point in time, containing all data corresponding to all write records that have been persistently stored before that point in time. The remote distributed object storage cluster stores multiple snapshots with different timestamps, and all snapshots can be archived in ascending or descending order of timestamp for easy and rapid retrieval.

[0164] Optionally, the timestamp (denoted as T) corresponding to the user-preset target recovery time point is obtained; then, the snapshot retrieval module built into the remote distributed object storage cluster is invoked to traverse the timestamps of all snapshots in the cluster, filtering out all candidate snapshots with timestamps ≤ T; finally, the snapshot with the largest timestamp is selected as the base snapshot from all candidate snapshots. It should be noted that the purpose of selecting the candidate snapshot with the largest timestamp is to minimize the amount of incremental data that needs to be superimposed subsequently, shorten data recovery time, and improve recovery efficiency. The closer the timestamp of the base snapshot is to the target recovery time point, the less incremental data exists between it and the target recovery time point, the less workload there is in subsequent parsing and application, thereby reducing recovery latency.

[0165] Optionally, if there is no snapshot with a timestamp ≤ T in the remote distributed object storage cluster, that is, the target recovery time point is earlier than the timestamp of the earliest snapshot in the cluster, an exception prompt mechanism is triggered to inform the user that data recovery at that time point cannot be performed to avoid data loss during the recovery process; if the target recovery time point is exactly equal to the timestamp of a certain snapshot, then that snapshot is directly used as the base snapshot, and no incremental data needs to be superimposed subsequently, and recovery can be completed directly based on that snapshot.
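
The selection rule in S301 can be sketched as follows. The paragraphs above describe traversing all snapshot timestamps; this sketch instead assumes the timestamps are already held in ascending order and uses a binary search, which is a substitution made for the example. `find_base_snapshot` and `NoBaseSnapshotError` are hypothetical names.

```python
from bisect import bisect_right
from typing import Sequence


class NoBaseSnapshotError(LookupError):
    """Raised when the target time precedes the earliest snapshot."""


def find_base_snapshot(timestamps: Sequence[int], target: int) -> int:
    """Return the largest snapshot timestamp that is <= target.

    `timestamps` must be sorted in ascending order.
    """
    idx = bisect_right(timestamps, target)
    if idx == 0:
        raise NoBaseSnapshotError(
            f"no snapshot at or before time point {target}")
    return timestamps[idx - 1]
```

When the returned timestamp equals the target, no incremental data needs to be overlaid, which matches the exact-match case described above.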

[0166] S302. Read the incremental log object corresponding to the base snapshot, extract the batch data in the incremental log object, and parse it to obtain the corresponding historical write records.

[0167] This step involves associating the base snapshot with the incremental log object to extract all data change records between the base snapshot time and the target recovery time, providing support for the integrity of subsequent data recovery.

[0168] An incremental log object is a collection of logs stored in the remote distributed object storage cluster that is associated with a snapshot and records all data changes after that snapshot's time point. In essence, incremental log objects are ordered archives of the batch data transmitted and stored in the above steps. Each incremental log object corresponds to a unique base snapshot, and all batch data within the log object is stored in order of batch generation timestamp.

[0169] Optionally, in the remote distributed object storage cluster, each snapshot contains a unique snapshot identifier, and the corresponding incremental log object contains the same snapshot identifier. Furthermore, a fixed association index is established between the two. Through this association index, the incremental log object corresponding to the base snapshot can be quickly located and read, which avoids abnormal recovery data caused by looking up the wrong log object and improves lookup efficiency.

[0170] Optionally, during the batch data extraction process, the batch data is further filtered so that only batches whose batch timestamp is less than or equal to the target recovery time point are retained. That is, only the batch data generated between the base snapshot time point and the target recovery time point is extracted, and batch data after the target recovery time point is discarded. This ensures that the extracted incremental data accurately matches the target recovery time point and that redundant data does not interfere with the recovery result.

[0171] Optionally, each batch of filtered data is decapsulated to extract each independent write record (i.e., historical write record) within the batch. During the parsing process, it is necessary to accurately extract the information corresponding to each historical write record, including but not limited to data content, logical block address, and timestamp, to ensure that the parsed historical write records are consistent with the write records generated in the previous steps and that the information is complete, providing accurate data support for subsequent sorting and application steps.
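
The filtering and decapsulation in S302 can be sketched as follows. The `Batch` and `WriteRecord` layouts and the function name `extract_history` are assumptions for the example; in particular, treating the base snapshot time as an exclusive lower bound is an assumption, since the incremental log object is described as holding only changes made after the snapshot.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class WriteRecord:
    lba: int
    timestamp: int
    data: bytes


@dataclass(frozen=True)
class Batch:
    timestamp: int                    # batch generation timestamp
    records: Tuple[WriteRecord, ...]  # write records carried by the batch


def extract_history(batches: Iterable[Batch],
                    base_ts: int, target_ts: int) -> List[WriteRecord]:
    """Keep batches in (base_ts, target_ts] and unpack their records."""
    history: List[WriteRecord] = []
    for batch in batches:
        if base_ts < batch.timestamp <= target_ts:
            history.extend(batch.records)
    return history
```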

[0172] It should be noted that by using snapshot association, batch filtering, and parsing extraction, the incremental data can be obtained accurately and efficiently. It is not necessary to read all the data in the entire storage cluster. Only the incremental batches of the required time period are extracted, which reduces the amount of data read, reduces the I / O load of the storage cluster, avoids the omission or redundancy of incremental data, and ensures the integrity and accuracy of the recovered data.

[0173] S303. Sort historical write records by logical block address and timestamp, and retain the target historical write record with the latest timestamp among those with the same logical block address.

[0174] This step eliminates redundant historical write records by sorting in sequence and deduplicating at the same address, ensuring that the recovery data corresponding to each logical block address is the latest data before the target recovery time point. This avoids data conflicts and redundancy caused by duplicate write records, thereby improving the accuracy and efficiency of the recovery data.

[0175] Between the base snapshot time and the target recovery time, there may be multiple write operations to the same logical block address, meaning multiple historical write records correspond to the same logical block address but have different timestamps. Directly applying all historical write records to the base snapshot would cause data at the same logical block address to be overwritten multiple times, which would not only increase recovery time but also potentially lead to data overwriting errors, affecting the accuracy of the recovery results.

[0176] Optionally, a dual sorting rule is used to arrange all parsed historical write records in an ordered manner. First, the records are sorted by logical block address, grouping historical write records with the same logical block address together. Write records with different logical block addresses are then arranged in ascending or descending order of address encoding. Within the same logical block address group, the records are further sorted in ascending order of timestamp, ensuring that write records at the same address are arranged in chronological order of their creation time. The sorting process employs efficient sorting algorithms (such as quicksort and mergesort) to ensure that sorting can be completed quickly even in scenarios with a large number of historical write records, avoiding sorting delays that could affect overall recovery efficiency.

[0177] Optionally, the sorted historical write records are grouped for deduplication. For each group of historical write records with the same logical block address, only the write record with the largest timestamp is retained as the target historical write record corresponding to that logical block address; other historical write records with smaller timestamps within the group are discarded. The historical write record with the largest timestamp corresponds to the last write operation of that logical block address before the target recovery time point, and its data content is the final data that should be presented at that address at the target recovery time point. Retaining this record ensures the accuracy of the recovered data; discarding redundant records reduces the amount of data subsequently applied to temporary snapshots, lowers the data write load, and shortens the recovery time.
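
The dual sort and per-address deduplication in S303 can be sketched as follows. The names `WriteRecord` and `latest_per_lba` are hypothetical, and the sketch relies on Python's built-in `sorted` rather than the quicksort or mergesort mentioned above.

```python
from dataclasses import dataclass
from itertools import groupby
from operator import attrgetter
from typing import Iterable, List


@dataclass(frozen=True)
class WriteRecord:
    lba: int
    timestamp: int
    data: bytes


def latest_per_lba(records: Iterable[WriteRecord]) -> List[WriteRecord]:
    """Sort by (lba, timestamp), then keep the last record of each lba group."""
    ordered = sorted(records, key=lambda r: (r.lba, r.timestamp))
    return [list(group)[-1]
            for _, group in groupby(ordered, key=attrgetter("lba"))]
```

Because the records inside each group are in ascending timestamp order, the last element of the group is the write that should be visible at the target recovery time point. The result is ordered by logical block address, which suits the sequential application step that follows.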

[0178] S304. Create a temporary snapshot based on the base snapshot and apply the target historical write records to the temporary snapshot.

[0179] This step uses a temporary snapshot as a data integration carrier to overlay and integrate the complete data of the base snapshot with the incremental data of the target historical write records, generating a data image that is completely consistent with the target recovery time point. This avoids direct manipulation of the base snapshot or the target recovery device, ensuring the security of the base snapshot and the normal operation of the target recovery device.

[0180] Optionally, based on the base snapshot, a separate temporary snapshot is generated by copying the complete data of the base snapshot through the snapshot creation module of the remote distributed object storage cluster. The temporary snapshot is essentially a copy of the base snapshot, with data content completely identical to the base snapshot, and is stored in the cluster's temporary storage area, isolated from the base snapshot and formal business data. The purpose of creating a temporary snapshot is to provide a secure data integration platform. All incremental data overlay operations are performed on the temporary snapshot, without modifying the original data of the base snapshot or directly operating the target recovery device, thus improving the security and flexibility of the data recovery process.

[0181] Optionally, based on the logical block address corresponding to each target historical write record, its data content is written to the corresponding logical block location in the temporary snapshot, thereby achieving the fusion of basic snapshot data and incremental data. Specifically, all target historical write records arranged in order of logical block address are traversed. For each target historical write record, the storage location corresponding to the logical block address in the temporary snapshot is first located. Then, the data content of the target historical write record is used to overwrite the original data at the logical block address in the temporary snapshot. After the application of all target historical write records is completed in sequence, the data content of the temporary snapshot is completely consistent with the target device data at the target recovery time point, forming a data mirror of the target recovery time point.
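
Applying the target historical write records to a temporary snapshot can be sketched as follows. The sketch assumes, for illustration only, that a snapshot is a flat byte image and that a logical block address maps to the byte offset `lba * block_size`; `build_temporary_snapshot` is a hypothetical name.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class WriteRecord:
    lba: int
    timestamp: int
    data: bytes


def build_temporary_snapshot(base: bytes, records: Iterable[WriteRecord],
                             block_size: int = 4096) -> bytearray:
    """Copy the base snapshot and overwrite blocks with the given records."""
    temp = bytearray(base)  # independent copy; the base is never modified
    for rec in records:
        offset = rec.lba * block_size
        end = offset + len(rec.data)
        if end > len(temp):
            raise ValueError(f"record at lba {rec.lba} exceeds snapshot size")
        temp[offset:end] = rec.data  # in-place overwrite, size unchanged
    return temp
```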

[0182] S305. Write the data in the temporary snapshot back to the target recovery device in order of preset block size to complete the data recovery.

[0183] This step integrates the target recovery time point data from the temporary snapshot and writes it back to the target recovery device in an orderly manner according to preset rules, thereby achieving accurate restoration of the data on the target recovery device and finally completing the entire data recovery process.

[0184] The target recovery device refers to the device that needs to perform data recovery operations. It can be the target device in the aforementioned steps (such as recovering its own data after a business device fails) or other backup devices (such as restoring data to a backup device to achieve business disaster recovery backup). The preset block size refers to the fixed data block size (such as 4KB, 8KB, 16KB, etc.) that is set in advance according to the storage characteristics and I / O processing capabilities of the target recovery device. It matches the storage unit size of the target recovery device to ensure the compatibility and efficiency of data write-back.

[0185] Optionally, firstly, all integrated data in the temporary snapshot is read, and the data is divided into multiple consecutive data blocks according to a preset block size. The size of each data block is consistent with the preset block size. If the remaining data is less than the preset block size, it is encapsulated into a single data block according to the actual data volume to ensure the integrity of the data division and avoid data loss. Subsequently, each data block is written back to the corresponding storage location of the target recovery device in ascending or descending order according to the logical block address corresponding to the data block. During the write-back process, a sequential write method is used to reduce the I / O addressing time of the target recovery device and improve the write-back efficiency. After each data block is written back, the data block written back to the target recovery device is compared with the corresponding data block in the temporary snapshot using a preset verification algorithm (such as CRC check or MD5 check). If the comparison is consistent, the write-back of the data block is confirmed to be successful. If the comparison is inconsistent, a rewrite operation is immediately triggered until the data block is successfully written back, ensuring the integrity and accuracy of the written-back data.
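
The block-wise write-back with verification can be sketched as follows. The sketch assumes the target recovery device can be accessed as a seekable binary file-like object, uses CRC32 as the verification algorithm, and bounds the number of rewrite attempts with a hypothetical `max_retries` parameter, whereas the paragraph above describes rewriting until the block succeeds.

```python
import io
import zlib
from typing import BinaryIO


def write_back(snapshot: bytes, device: BinaryIO,
               block_size: int = 4096, max_retries: int = 3) -> int:
    """Write the snapshot to the device block by block, verifying each block.

    Returns the number of blocks written. The last block may be shorter
    than block_size when the data does not divide evenly.
    """
    blocks = 0
    for offset in range(0, len(snapshot), block_size):
        chunk = snapshot[offset:offset + block_size]
        expected = zlib.crc32(chunk)
        for _ in range(max_retries):
            device.seek(offset)
            device.write(chunk)
            device.seek(offset)
            if zlib.crc32(device.read(len(chunk))) == expected:
                break  # block verified
        else:
            raise OSError(f"block at offset {offset} failed verification")
        blocks += 1
    return blocks


# Usage with an in-memory stand-in for the target recovery device.
demo_device = io.BytesIO()
demo_blocks = write_back(b"abcdefghij", demo_device, block_size=4)
```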

[0186] Optionally, after all data blocks have been successfully written back and verified, a data recovery completion notification is generated, informing the user that the data on the target recovery device has been successfully restored to the preset target recovery time point, and recording key information of the recovery process (such as recovery time, amount of recovered data, verification results, etc.). If abnormal situations such as insufficient space on the target recovery device or device failure occur during the write-back process, the write-back operation is immediately paused, an abnormality warning is triggered, and the current recovery progress is saved. After the abnormality is resolved, the write-back can be resumed from the paused progress, avoiding data loss and duplicate operations caused by interruption during the recovery process.

[0187] Figure 4 is a schematic diagram of a continuous data protection device provided in an embodiment of this application. As shown in Figure 4, the continuous data protection device provided in this embodiment is located in an electronic device. The continuous data protection device 40 includes: a replication module 41, a generation module 42, a batch aggregation module 43, a storage module 44, and a sending module 45.

[0188] Specifically, the replication module 41 is used to obtain the write input / output request of the target device and replicate the write input / output request to obtain the original request and the replication request; the generation module 42 is used to submit the original request to the target device and generate a write record based on the data content of the replication request and the corresponding logical block address and timestamp; the batch aggregation module 43 is used to cache the write record in the data cache area, and when the preset batch processing condition is triggered, aggregate multiple write records accumulated in the data cache area into a batch; the storage module 44 is used to persistently store the batch in the local cache device; the sending module 45 is used to send the batch to the remote distributed object storage cluster after the batch is successfully persistently stored; wherein, the remote distributed object storage cluster is configured to receive and save the batch, and support building data snapshots at any point in time based on the batch.

[0189] Optionally, the preset batch processing conditions include at least two of the following: a time threshold, a record count threshold, and a byte count threshold. Optionally, when the preset batch processing conditions are triggered, the batch aggregation module 43 is specifically used to: trigger batch aggregation when the caching duration of the write records in the data cache area exceeds the time threshold; or trigger batch aggregation when the number of write records in the data cache area exceeds the record count threshold; or trigger batch aggregation when the total number of bytes of the write records in the data cache area exceeds the byte count threshold.
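
The three trigger conditions can be sketched as a small aggregator. The class name `BatchAggregator`, its default thresholds, and the choice to evaluate the conditions only when a record is added are assumptions for the example; a real module would also need a timer so that the time threshold fires when no new writes arrive.

```python
from typing import List, Optional


class BatchAggregator:
    """Caches write records and emits a batch when any threshold is exceeded."""

    def __init__(self, max_age_s: float = 0.005,
                 max_records: int = 1000, max_bytes: int = 1 << 20) -> None:
        self.max_age_s = max_age_s
        self.max_records = max_records
        self.max_bytes = max_bytes
        self._records: List[bytes] = []
        self._bytes = 0
        self._first_at: Optional[float] = None  # arrival time of oldest record

    def add(self, payload: bytes, now: float) -> Optional[List[bytes]]:
        """Cache one record; return the batch if a condition is triggered."""
        if self._first_at is None:
            self._first_at = now
        self._records.append(payload)
        self._bytes += len(payload)
        if (now - self._first_at > self.max_age_s
                or len(self._records) > self.max_records
                or self._bytes > self.max_bytes):
            return self.flush()
        return None

    def flush(self) -> List[bytes]:
        """Hand over the accumulated records and reset the cache."""
        batch, self._records = self._records, []
        self._bytes = 0
        self._first_at = None
        return batch
```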

[0190] Optionally, the local cache device is a high-speed storage device with non-volatile properties; optionally, when persistently storing batches in the local cache device, the storage module 44 is specifically used to: obtain the available space of the local cache device, wherein the available space is the difference between the total capacity of the local cache device and the occupied space; if the available space is greater than or equal to the sum of the batch size and the preset safety margin, write the batches sequentially to the local cache device according to the circular buffer mechanism; update the superblock information of the local cache device, wherein the superblock information includes the data write position, the position of the sent data, and the check value.
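
The space check, circular write, and superblock update can be sketched in memory as follows. `RingCache` and `Superblock` are hypothetical names, a `bytearray` stands in for the non-volatile device, and computing the check value as the CRC32 of the most recently written batch is an assumption, since the text does not say what the check value covers.

```python
import zlib
from dataclasses import dataclass


@dataclass
class Superblock:
    write_pos: int = 0  # next position to write
    sent_pos: int = 0   # position up to which data has been sent
    checksum: int = 0   # assumed: CRC32 of the last batch written


class RingCache:
    def __init__(self, capacity: int, safety_margin: int) -> None:
        self.buf = bytearray(capacity)
        self.capacity = capacity
        self.safety_margin = safety_margin
        self.used = 0
        self.sb = Superblock()

    def available(self) -> int:
        """Available space = total capacity - occupied space."""
        return self.capacity - self.used

    def append(self, batch: bytes) -> bool:
        """Write the batch if space >= batch size + safety margin."""
        if self.available() < len(batch) + self.safety_margin:
            return False
        pos = self.sb.write_pos
        first = min(len(batch), self.capacity - pos)
        self.buf[pos:pos + first] = batch[:first]       # up to end of buffer
        self.buf[0:len(batch) - first] = batch[first:]  # wrap to the start
        self.sb.write_pos = (pos + len(batch)) % self.capacity
        self.sb.checksum = zlib.crc32(batch)
        self.used += len(batch)
        return True

    def mark_sent(self, nbytes: int) -> None:
        """Release space for data confirmed as sent to remote storage."""
        nbytes = min(nbytes, self.used)
        self.sb.sent_pos = (self.sb.sent_pos + nbytes) % self.capacity
        self.used -= nbytes
```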

[0191] Optionally, when sending a batch to the remote distributed object storage cluster, the sending module 45 is specifically used to: construct a transmission data packet containing a message header, batch data, and a checksum, wherein the message header includes the target device identifier and batch data length information; send the transmission data packet using scatter-gather input/output with zero copy, wherein zero copy is a copy method that does not pass through user-space memory; and maintain the network connection with the remote distributed object storage cluster using a preset connection pool, triggering a disconnection and reconnection process if the network is interrupted, and resending any batch that was not successfully sent after reconnection succeeds.
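
The packet layout and a gather-style send can be sketched as follows. The field widths (a 16-byte device identifier and a 32-bit length), the use of CRC32 over header plus batch data, and the function names are all assumptions for the example. The sketch shows only the scatter-gather part: `socket.sendmsg` (available on Unix-like systems) hands several buffers to the kernel in one call without first concatenating them, but it is not by itself a kernel-level zero-copy path.

```python
import socket
import struct
import zlib
from typing import List, Tuple

HEADER = struct.Struct("!16sI")  # assumed: device id (16 bytes), batch length
TRAILER = struct.Struct("!I")    # assumed: CRC32 of header + batch data


def build_packet(device_id: bytes, batch: bytes) -> List[bytes]:
    """Return [header, batch, checksum] as separate buffers."""
    header = HEADER.pack(device_id, len(batch))
    crc = zlib.crc32(batch, zlib.crc32(header))  # running CRC over both parts
    return [header, batch, TRAILER.pack(crc)]


def parse_packet(raw: bytes) -> Tuple[bytes, bytes]:
    """Validate the checksum and return (device_id, batch)."""
    device_id, length = HEADER.unpack_from(raw, 0)
    body_end = HEADER.size + length
    (crc,) = TRAILER.unpack_from(raw, body_end)
    if zlib.crc32(raw[:body_end]) != crc:
        raise ValueError("checksum mismatch")
    return device_id.rstrip(b"\0"), raw[HEADER.size:body_end]


def send_packet(sock: socket.socket, buffers: List[bytes]) -> int:
    """Gather-send the buffers; may send fewer bytes than requested."""
    return sock.sendmsg(buffers)
```

A caller would need to handle partial sends by resubmitting the remaining bytes, and the reconnection and resend behaviour described above is outside the scope of this sketch.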

[0192] Optionally, the remote distributed object storage cluster saves batches in the following ways: the remote distributed object storage cluster appends the received batches to a log object, and the log object is named according to the target device identifier and sequence number; if the log object exceeds a preset size or a preset time interval, the batch data in the log object is merged into a mirror object; wherein, the mirror object is used to store a complete data copy of the target device; and an incremental snapshot is created based on the mirror object, the incremental snapshot being used to record changed data that is different from the parent snapshot.

[0193] Optionally, the continuous data protection device provided in this embodiment also includes a data recovery module.

[0194] Optionally, the data recovery module is used to: locate a base snapshot with a timestamp less than or equal to the target recovery time point; read the incremental log object corresponding to the base snapshot, extract batch data from the incremental log object, and parse the corresponding historical write records; sort the historical write records by logical block address and timestamp, and retain the target historical write record with the latest timestamp among the same logical block addresses; create a temporary snapshot based on the base snapshot, and apply the target historical write records to the temporary snapshot; and write the data in the temporary snapshot back to the target recovery device in order of preset block size to complete the data recovery.

[0195] Figure 5 is a schematic diagram of the structure of an electronic device provided in an embodiment of this application. As shown in Figure 5, the electronic device 50 provided in this embodiment includes: a processor 51 and a memory 52 communicatively connected to the processor 51.

[0196] The memory 52 stores computer-executable instructions; the processor 51 executes the computer-executable instructions stored in the memory 52 to implement a continuous data protection method provided in any of the above embodiments.

[0197] The program may include program code, which includes computer-executable instructions. Memory 52 may include high-speed RAM, and may also include non-volatile memory, such as at least one disk storage device.

[0198] In this embodiment, the memory 52 and the processor 51 are connected via a bus. The bus can be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, or an Extended Industry Standard Architecture (EISA) bus, etc. The bus can be divided into address bus, data bus, control bus, etc. For ease of representation, the bus is shown in Figure 5 as a single straight line, but this does not mean that there is only one bus or one type of bus.

[0199] This application also provides a computer-readable storage medium, which stores computer-executable instructions that, when executed by a processor, are used to implement a continuous data protection method provided in any of the above embodiments.

[0200] This application also provides a computer program product, including a computer program that, when executed by a processor, implements a continuous data protection method provided in any of the above embodiments.

[0201] The modules described as separate components may or may not be physically separate. The components shown as modules may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the modules can be selected to implement the solution of this embodiment according to actual needs.

[0202] Furthermore, the functional modules in the various embodiments of this application can be integrated into one processing unit, or each module can exist physically separately, or two or more modules can be integrated into one unit. The unit composed of the above modules can be implemented in hardware or in the form of hardware plus software functional units.

[0203] The integrated modules described above, implemented as software functional modules, can be stored in a computer-readable storage medium. These software functional modules, stored in a storage medium, include several instructions to cause a computer device (which may be a personal computer, server, or network device, etc.) or processor to execute some steps of the methods of the various embodiments of this application.

[0204] It should be understood that the aforementioned processor can be a Central Processing Unit (CPU), or other general-purpose processors, digital signal processors (DSPs), application-specific integrated circuits (ASICs), etc. A general-purpose processor can be a microprocessor or any conventional processor. The steps of the method disclosed in this invention can be directly manifested as execution by a hardware processor, or execution by a combination of hardware and software modules within the processor.

[0205] The memory may include high-speed RAM, and may also include non-volatile storage (NVM), such as at least one disk storage device, and may also be a USB flash drive, external hard drive, read-only memory, disk or optical disc, etc.

[0206] The aforementioned storage medium can be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic storage, flash memory, magnetic disk, or optical disk. The storage medium can be any available medium that can be accessed by a general-purpose or special-purpose computer.

[0207] An exemplary storage medium is coupled to a processor, enabling the processor to read information from and write information to the storage medium. Alternatively, the storage medium can be an integral part of the processor. The processor and storage medium can reside in an Application Specific Integrated Circuit (ASIC). Alternatively, the processor and storage medium can exist as discrete components in an electronic control unit or main control device.

[0208] Those skilled in the art will understand that all or part of the steps of the above-described method embodiments can be implemented by hardware related to program instructions. The aforementioned program can be stored in a computer-readable storage medium. When executed, the program performs the steps of the above-described method embodiments; and the aforementioned storage medium includes various media capable of storing program code, such as ROM, RAM, magnetic disks, or optical disks.

[0209] Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of this application, and are not intended to limit them. Although this application has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that modifications can still be made to the technical solutions described in the foregoing embodiments, or equivalent substitutions can be made to some or all of the technical features therein. Such modifications or substitutions do not cause the essence of the corresponding technical solutions to deviate from the scope of the technical solutions of the embodiments of this application.

Claims

1. A method for continuous data protection, characterized by comprising: obtaining a write input/output request of a target device, and copying the write input/output request to obtain an original request and a copy request; submitting the original request to the target device, and generating a write record based on the data content of the copy request and the corresponding logical block address and timestamp; caching the write record in a data cache area, and, when a preset batch processing condition is triggered, aggregating the multiple write records accumulated in the data cache area into a batch; persistently storing the batch in a local cache device; and, after the batch has been successfully persisted, sending the batch to a remote distributed object storage cluster; wherein the remote distributed object storage cluster is configured to receive and save the batch, and to support the construction of data snapshots at any point in time based on the batch.

2. The method according to claim 1, characterized in that the preset batch processing conditions include at least two of the following: a time threshold, a record count threshold, and a byte count threshold; and triggering the preset batch processing condition includes: if the caching duration of a write record in the data cache area exceeds the time threshold, batch aggregation is triggered; or, if the number of write records in the data cache area exceeds the record count threshold, batch aggregation is triggered; or, if the total number of bytes of the write records in the data cache area exceeds the byte count threshold, batch aggregation is triggered.
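As a non-limiting illustration of the trigger conditions in claim 2, the following Python sketch keeps write records in an in-memory buffer and reports when any configured threshold is exceeded. The class names `WriteRecord` and `BatchBuffer`, the default threshold values, and the choice to measure caching duration from the oldest cached record are assumptions made for this example; the claim does not specify them.

```python
import time
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class WriteRecord:
    lba: int          # logical block address of the write
    timestamp: float  # time at which the write was captured
    data: bytes       # data content taken from the copy request


@dataclass
class BatchBuffer:
    # Hypothetical default thresholds; the claim gives no concrete values.
    max_age_s: float = 0.05
    max_records: int = 1024
    max_bytes: int = 4 * 1024 * 1024
    records: List[WriteRecord] = field(default_factory=list)
    total_bytes: int = 0
    first_cached_at: Optional[float] = None

    def add(self, rec: WriteRecord, now: Optional[float] = None) -> None:
        now = time.monotonic() if now is None else now
        if not self.records:
            # Assumption: caching duration is measured from the oldest record.
            self.first_cached_at = now
        self.records.append(rec)
        self.total_bytes += len(rec.data)

    def should_flush(self, now: Optional[float] = None) -> bool:
        if not self.records:
            return False
        now = time.monotonic() if now is None else now
        return (
            now - self.first_cached_at > self.max_age_s   # time threshold
            or len(self.records) > self.max_records       # record count threshold
            or self.total_bytes > self.max_bytes          # byte count threshold
        )

    def drain(self) -> List[WriteRecord]:
        # Hand the accumulated records over as one batch and reset the buffer.
        batch = self.records
        self.records = []
        self.total_bytes = 0
        self.first_cached_at = None
        return batch
```

A caller would invoke `should_flush()` after each `add()` and on a periodic timer, then pass the result of `drain()` to the persistence step.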

3. The method according to claim 1, characterized in that the local cache device is a non-volatile high-speed storage device; and the step of persistently storing the batch in the local cache device includes: obtaining the available space of the local cache device, wherein the available space is the difference between the total capacity of the local cache device and the occupied space; if the available space is greater than or equal to the sum of the batch size and a preset safety margin, writing the batch sequentially to the local cache device according to a circular buffer mechanism; and updating the superblock information of the local cache device, wherein the superblock information includes the data write location, the sent data location, and a checksum.
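The space check and circular write described in claim 3 can be sketched as follows. This is only an illustration under stated assumptions: a `bytearray` stands in for the non-volatile device, the names `RingCache` and `Superblock` are hypothetical, and the superblock checksum is assumed to be a CRC32 over the two positions, which the claim does not specify.

```python
import zlib
from dataclasses import dataclass


@dataclass
class Superblock:
    write_pos: int = 0  # data write location
    sent_pos: int = 0   # sent data location
    checksum: int = 0   # assumed: CRC32 over the two positions


class RingCache:
    def __init__(self, capacity: int, safety_margin: int):
        self.capacity = capacity
        self.safety_margin = safety_margin
        self.buf = bytearray(capacity)  # stand-in for the local cache device
        self.used = 0                   # occupied space
        self.sb = Superblock()

    def available(self) -> int:
        # Available space = total capacity - occupied space.
        return self.capacity - self.used

    def persist(self, batch: bytes) -> bool:
        # Refuse the write unless the batch plus the safety margin fits.
        if self.available() < len(batch) + self.safety_margin:
            return False
        pos = self.sb.write_pos
        first = min(len(batch), self.capacity - pos)
        self.buf[pos:pos + first] = batch[:first]
        # Wrap around to the start of the buffer for the remainder, if any.
        self.buf[0:len(batch) - first] = batch[first:]
        self.used += len(batch)
        self.sb.write_pos = (pos + len(batch)) % self.capacity
        self._update_checksum()
        return True

    def mark_sent(self, nbytes: int) -> None:
        # Release space once data has reached the remote cluster.
        nbytes = min(nbytes, self.used)
        self.used -= nbytes
        self.sb.sent_pos = (self.sb.sent_pos + nbytes) % self.capacity
        self._update_checksum()

    def _update_checksum(self) -> None:
        raw = (self.sb.write_pos.to_bytes(8, "little")
               + self.sb.sent_pos.to_bytes(8, "little"))
        self.sb.checksum = zlib.crc32(raw)
```

A real implementation would also flush the data and the superblock to the device before reporting success; that step is omitted here.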

4. The method according to claim 1, characterized in that sending the batch to the remote distributed object storage cluster includes: constructing a transmission data packet containing a message header, batch data, and a checksum, wherein the message header includes a target device identifier and batch data length information; transmitting the data packet using scatter-gather input/output with zero-copy transmission, wherein zero-copy transmission is a transfer method in which the data is not copied through user-space memory; and maintaining the network connection with the remote distributed object storage cluster using a preset connection pool, wherein, if the network is interrupted, a disconnection and reconnection process is triggered, and the batches that were not successfully sent are resent after a successful reconnection.
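To illustrate the packet layout in claim 4, the sketch below frames a batch as a header, the batch data, and a checksum, and returns the three parts as separate buffers so that they could be handed to a gather-style send call without first being concatenated. The header layout (a 16-byte device identifier and an 8-byte length), the use of CRC32, and the function names are assumptions for this example. The sketch covers framing only; zero-copy transmission, the connection pool, and reconnection are not shown.

```python
import struct
import zlib

# Hypothetical wire layout, network byte order:
# 16-byte target device identifier followed by an 8-byte batch length.
HEADER = struct.Struct("!16sQ")
TRAILER = struct.Struct("!I")  # assumed: CRC32 over header and batch data


def frame_batch(device_id: bytes, batch: bytes) -> list:
    """Return [header, batch view, checksum] as separate buffers."""
    header = HEADER.pack(device_id, len(batch))
    crc = zlib.crc32(batch, zlib.crc32(header))
    # memoryview avoids making another copy of the batch data here.
    return [header, memoryview(batch), TRAILER.pack(crc)]


def parse_packet(packet: bytes):
    """Validate the checksum and return (device_id, batch data)."""
    device_id, length = HEADER.unpack_from(packet, 0)
    body = packet[HEADER.size:HEADER.size + length]
    (crc,) = TRAILER.unpack_from(packet, HEADER.size + length)
    if zlib.crc32(body, zlib.crc32(packet[:HEADER.size])) != crc:
        raise ValueError("checksum mismatch")
    return device_id.rstrip(b"\0"), body
```

On platforms where it is available, the list returned by `frame_batch` can be passed directly to Python's `socket.sendmsg`, which accepts a sequence of buffers.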

5. The method according to claim 1, characterized in that the remote distributed object storage cluster stores the batch as follows: the remote distributed object storage cluster appends each received batch to a log object, the log object being named according to the target device identifier and a sequence number; if the log object exceeds a preset size or a preset time interval, the batch data in the log object is merged into a mirror object, wherein the mirror object is used to store a complete data copy of the target device; and an incremental snapshot is created based on the mirror object, wherein the incremental snapshot is used to record the data that has changed relative to its parent snapshot.
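The log-object handling in claim 5 might look roughly like the following in-memory sketch. Everything concrete here is an assumption made for illustration: the class name `DeviceStore`, the object name format, the representation of a batch as a list of `(lba, data)` pairs, and the mirror object modelled as a dictionary. Only the size-based merge is shown; the time-interval trigger and incremental snapshot creation are left out.

```python
class DeviceStore:
    def __init__(self, device_id: str, max_log_bytes: int):
        self.device_id = device_id
        self.max_log_bytes = max_log_bytes  # preset size that triggers a merge
        self.seq = 0                        # sequence number of the current log
        self.logs = {}                      # log object name -> list of batches
        self.log_bytes = 0                  # size of the current log object
        self.mirror = {}                    # lba -> data, the full device copy

    def log_name(self) -> str:
        # Hypothetical naming scheme: device identifier plus sequence number.
        return f"{self.device_id}.log.{self.seq:08d}"

    def append(self, batch) -> None:
        """Append a batch, given as a list of (lba, data) pairs."""
        self.logs.setdefault(self.log_name(), []).append(batch)
        self.log_bytes += sum(len(data) for _, data in batch)
        if self.log_bytes > self.max_log_bytes:
            self._merge()

    def _merge(self) -> None:
        # Replay the current log in arrival order so later writes win.
        for batch in self.logs[self.log_name()]:
            for lba, data in batch:
                self.mirror[lba] = data
        self.seq += 1       # subsequent batches go to a new log object
        self.log_bytes = 0
```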

6. The method according to any one of claims 1-5, characterized in that the method further includes: finding the base snapshot whose timestamp is less than or equal to the target recovery time; reading the incremental log object corresponding to the base snapshot, and extracting and parsing the batch data in the incremental log object to obtain the corresponding historical write records; sorting the historical write records by logical block address and timestamp, and, among records with the same logical block address, retaining the target historical write record with the latest timestamp; creating a temporary snapshot based on the base snapshot, and applying the target historical write records to the temporary snapshot; and writing the data in the temporary snapshot back to the target recovery device sequentially in blocks of a preset size to complete the data recovery.
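The sort-and-retain step of claim 6 can be illustrated with a short sketch. Snapshots are modelled as dictionaries from logical block address to data, and records as `(lba, timestamp, data)` tuples; both representations, the function names, and the filtering of records later than the target recovery time are assumptions for this example rather than details stated in the claim.

```python
def latest_per_lba(records, target_time):
    """Keep, for each LBA, the newest record not later than target_time."""
    latest = {}
    # Sort by (lba, timestamp) so that a later record overwrites an earlier one.
    for lba, ts, data in sorted(records, key=lambda r: (r[0], r[1])):
        if ts <= target_time:  # assumption: ignore writes after the target time
            latest[lba] = (ts, data)
    return latest


def restore(base_snapshot, records, target_time):
    """Build a temporary snapshot from the base snapshot plus retained records."""
    temp = dict(base_snapshot)  # the base snapshot itself is left untouched
    for lba, (_, data) in latest_per_lba(records, target_time).items():
        temp[lba] = data
    return temp
```

Retaining only the newest record per logical block address means each block of the temporary snapshot is written once, however many times it was overwritten in the log.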

7. A continuous data protection apparatus, characterized by comprising: a copy module, used to obtain a write input/output request of a target device and copy the write input/output request to obtain an original request and a copy request; a generation module, used to submit the original request to the target device and generate a write record based on the data content of the copy request and the corresponding logical block address and timestamp; a batch aggregation module, used to cache the write record in a data cache area and, when a preset batch processing condition is triggered, aggregate the multiple write records accumulated in the data cache area into a batch; a storage module, used to persistently store the batch in a local cache device; and a sending module, used to send the batch to a remote distributed object storage cluster after the batch has been successfully persisted; wherein the remote distributed object storage cluster is configured to receive and save the batch, and to support the construction of data snapshots at any point in time based on the batch.

8. An electronic device, characterized by comprising: a processor, and a memory communicatively connected to the processor; wherein the memory stores computer-executable instructions, and the processor executes the computer-executable instructions stored in the memory to implement the method according to any one of claims 1-6.

9. A computer-readable storage medium, characterized in that the computer-readable storage medium stores computer-executable instructions which, when executed by a processor, implement the method according to any one of claims 1-6.

10. A computer program product, characterized by comprising a computer program which, when executed by a processor, implements the method according to any one of claims 1-6.