Multi-node cooperative hot backup recovery method and device, equipment and storage medium

By employing a multi-node collaborative hot backup and recovery method, core data is synchronized in real time while non-core data is synchronized asynchronously. This solves the problems of high resource consumption, long recovery time, and high risk of interruption in cloud phone backup and recovery, achieving zero interruption and rapid recovery, and improving the data security and business continuity of cloud phones.

CN122220153A · Pending · Publication Date: 2026-06-16 · XIAOVO TECH

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications (China)
Current Assignee / Owner
XIAOVO TECH
Filing Date
2026-03-26
Publication Date
2026-06-16


Abstract

The application belongs to the technical field of cloud computing and data backup and recovery, and discloses a multi-node collaborative hot backup and recovery method, apparatus, device and storage medium. After triggering a backup task, the scheduling management node instructs the main service node to divide its local running data into layers. Core basic data is then synchronized to all mutual backup storage nodes in real time to complete multi-replica persistence, while for non-core application data only the incremental change fragments since the last backup are transmitted and asynchronously synchronized to at least one mutual backup storage node; when a single node fails, synchronization automatically switches to another node and resumes. On receiving a recovery request, the scheduling management node selects available mutual backup storage nodes through heartbeat negotiation, transmits core basic data to the target service node first, and then schedules the mutual backup storage nodes to transmit the corresponding non-core application data to the target service node on demand, gradually completing full-service recovery. The method achieves zero-interruption backup and rapid hot recovery, and improves the data security and business continuity of cloud phones.

Description

Technical Field

[0001] This invention relates to the field of cloud computing and data backup and recovery technology, and in particular to a multi-node collaborative hot backup and recovery method, apparatus, device and storage medium.

Background Technology

[0002] With the rapid development of cloud computing technology, cloud phones, as an emerging mobile terminal virtualization service, face critical challenges in ensuring data security and business continuity. Existing cloud phone data backup and recovery solutions suffer from the following shortcomings. Existing patent CN121349771A discloses a cloud-based method and system for secure backup of archive data. It employs a regenerable code strategy to split and store backup data, recording fragment root hashes via a consortium blockchain. However, this solution still uses a full backup mode, which produces a large amount of backup data and consumes significant storage resources and network bandwidth. CN121387633A discloses a data synchronization backup method and system for a distributed mobile storage cluster. It achieves data synchronization by constructing a distributed relative clock through ultra-wideband bidirectional ranging. However, this solution focuses on clock synchronization between mobile nodes and does not address the business interruption problem in cloud phone scenarios. CN121387632A discloses a data backup and recovery method and electronic device that generates backup metadata through multiple backups. However, recovery still requires loading the complete backup data, and full recovery typically takes more than 5 minutes, resulting in prolonged core business interruption. Furthermore, it does not differentiate the recovery priority of core system data from that of non-core application data, so core functions cannot be made available quickly. In addition, traditional solutions often require backup or recovery operations to be performed while the cloud phone is powered off, and therefore cannot achieve zero-interruption backup and recovery.

[0003] The above content is only intended to aid understanding of the technical solution of the present invention and does not constitute an admission that it is prior art.

Summary of the Invention

[0004] The main objective of this invention is to provide a multi-node collaborative hot backup and recovery method, apparatus, device, and storage medium, aiming to solve the technical problems of existing cloud phone backup and recovery solutions, such as high resource consumption during full backup, long recovery time, unreasonable recovery priority, and high risk of business interruption.

[0005] To achieve the above objective, the present invention provides a multi-node collaborative hot backup and recovery method, which includes the following steps:

While the main business node is running normally and the business is not interrupted, the scheduling and management node triggers a backup task, instructing the main business node to divide its local running data into two categories: core basic data and non-core application data.

The core basic data is synchronized to all mutual backup storage nodes in real time to complete multi-replica persistence. For the non-core application data, only the incremental change fragments since the last backup are transmitted, and they are asynchronously synchronized to at least one mutual backup storage node. When a single node fails, synchronization automatically switches to another available storage node and continues. The operation of the main business node is not interrupted at any point in the process.

When the scheduling and management node receives a recovery request, it negotiates with all mutual backup storage nodes and the target business node through heartbeats, schedules the available mutual backup storage node with the best network and the most complete data, and transmits the core basic data to the target business node first. Once the target business node finishes loading the core basic data, the core business functions become available immediately.

Based on the user's actual access needs for non-core applications, the scheduling and management node schedules the mutual backup storage nodes to transmit the corresponding non-core application data to the target business node on demand, gradually completing full business recovery.

[0006] In one embodiment, the core basic data includes at least system configuration, account information, key process status, and user core asset data, which are collected through real-time memory snapshots; the non-core application data includes at least third-party application installation packages, application cache, and non-sensitive user files, which are collected through file system block change tracking.
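The layered division in this embodiment can be pictured as a simple category lookup. The following Python sketch is illustrative only: the category labels, the `classify` function, and the choice to reject unknown categories are assumptions made for the example and are not specified by the application.

```python
# Hypothetical category labels, derived from the data types listed in [0006].
CORE_CATEGORIES = {
    "system_configuration",
    "account_information",
    "key_process_status",
    "user_core_assets",
}
NON_CORE_CATEGORIES = {
    "third_party_install_package",
    "application_cache",
    "non_sensitive_user_file",
}


def classify(items):
    """Split (name, category) pairs into core and non-core layers."""
    layers = {"core": [], "non_core": []}
    for name, category in items:
        if category in CORE_CATEGORIES:
            layers["core"].append(name)
        elif category in NON_CORE_CATEGORIES:
            layers["non_core"].append(name)
        else:
            # Assumption: unknown categories are rejected rather than guessed.
            raise ValueError(f"unclassified category: {category}")
    return layers
```

In such a sketch the core layer would feed the real-time synchronization path and the non-core layer the incremental, asynchronous path.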

[0007] In one embodiment, the triggering instruction for the backup task includes any one or more combinations of the following: automatic triggering at a preset period, manual triggering by the user, event triggering when the data change of the main business node reaches a preset threshold, and emergency triggering when the scheduling management node detects node risks.

[0008] In one embodiment, the Raft consensus algorithm is used to ensure the consistency of multiple replicas when the core basic data is synchronized across multiple nodes. After each storage node completes the data writing, it synchronously generates an MD5 checksum and reports it to the scheduling and management node. When the checksums of all nodes are consistent, the core data backup is determined to be successful.

[0009] In one embodiment, the loading time of the core basic data does not exceed a preset time. After the core business functions are available, the main business node marks the non-core application entry that has not been restored as being in a state of recovery, and at the same time synchronizes the recovery progress of the non-core data in the background.

[0010] In one embodiment, the backup storage nodes are distributed across at least two physically isolated remote data centers. Core basic data is synchronized to all remote storage nodes, while non-core application data is only synchronized to storage nodes in the same city. In the event of a cross-regional failure, the core data of the remote storage nodes can be directly called to complete disaster recovery.
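As an illustration of this placement rule, the sketch below selects replication targets by data layer. The node names, the `(name, city)` tuple layout, and the function name `replication_targets` are hypothetical; the application does not prescribe a data model.

```python
def replication_targets(layer, nodes, primary_city):
    """Return the names of storage nodes that should receive data of `layer`.

    nodes: list of (name, city) tuples.
    Core data goes to every storage node; non-core data goes only to
    nodes located in the same city as the main business node.
    """
    if layer == "core":
        return [name for name, _ in nodes]
    if layer == "non_core":
        return [name for name, city in nodes if city == primary_city]
    raise ValueError(f"unknown layer: {layer}")
```

With nodes in two cities, core data would be replicated to both, while non-core data would stay in the main business node's city.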

[0011] In one embodiment, all mutual backup storage nodes collaboratively maintain a unified backup version chain. Each version record includes backup time, data checksum, and storage node distribution information. Recovery requests can specify any historical version to complete data recovery, and the version retention period can be customized.
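A version chain of this kind could be modeled as an ordered list of records. The sketch below is a minimal illustration under assumptions: the class and field names are invented for the example, and pruning by age is only one possible reading of a customizable retention period.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class BackupVersion:
    version_id: str
    backup_time: datetime
    checksum: str
    node_distribution: tuple  # names of storage nodes holding this version


class VersionChain:
    def __init__(self, retention):
        self.retention = retention  # timedelta; configurable per deployment
        self.versions = []

    def append(self, version):
        self.versions.append(version)

    def prune(self, now):
        """Drop versions older than the retention period."""
        self.versions = [
            v for v in self.versions if now - v.backup_time <= self.retention
        ]

    def find(self, version_id):
        """Return the record for a requested historical version, or None."""
        for v in self.versions:
            if v.version_id == version_id:
                return v
        return None
```

A recovery request that names a historical version would then resolve to a record whose `node_distribution` indicates which storage nodes can serve it.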

[0012] Furthermore, to achieve the above objective, the present invention also proposes a multi-node collaborative hot backup and recovery apparatus, which applies the multi-node collaborative hot backup and recovery method described above. The apparatus includes:

A backup trigger module, used to trigger a backup task while the main business node continues to operate normally and the business is not interrupted, and to instruct the main business node to divide its local running data into two categories: core basic data and non-core application data.

A data synchronization module, used to synchronize the core basic data to all mutual backup storage nodes in real time to complete multi-replica persistence, and, for the non-core application data, to transmit only the incremental change fragments since the last backup and asynchronously synchronize them to at least one mutual backup storage node. When a single node fails, synchronization automatically switches to another available storage node and continues. The main business node is not interrupted throughout the process.

A recovery scheduling module, used, when a recovery request is received, to negotiate with all mutual backup storage nodes and the target business node through heartbeats, to schedule the available mutual backup storage node with the best network and the most complete data, and to transmit the core basic data to the target business node first. After the target business node finishes loading the core basic data, the core business functions are available immediately.

An on-demand recovery module, used to schedule the mutual backup storage nodes to transmit the corresponding non-core application data to the target business node on demand, based on the user's actual access needs for non-core applications, gradually completing full business recovery.

[0013] Furthermore, to achieve the above objectives, the present invention also proposes a multi-node collaborative hot backup and recovery device, the multi-node collaborative hot backup and recovery device comprising: a memory, a processor, and a multi-node collaborative hot backup and recovery program stored on the memory and executable on the processor, the multi-node collaborative hot backup and recovery program being configured to implement the steps of the multi-node collaborative hot backup and recovery method described above.

[0014] In addition, to achieve the above objectives, the present invention also proposes a storage medium storing a multi-node collaborative hot backup and recovery program, wherein when the multi-node collaborative hot backup and recovery program is executed by a processor, it implements the steps of the multi-node collaborative hot backup and recovery method described above.

[0015] In this invention, after the scheduling and management node triggers a backup task, it instructs the main business node to divide its local running data into layers. The core basic data is then synchronized in real time to all mutual backup storage nodes to complete multi-replica persistence. For non-core application data, only the incremental changes since the last backup are transmitted and asynchronously synchronized to at least one mutual backup storage node. In the event of a single node failure, synchronization automatically fails over and resumes. Upon receiving a recovery request, the scheduling and management node negotiates via heartbeat, schedules available mutual backup storage nodes, and transmits the core basic data to the target business node first. It then schedules the mutual backup storage nodes to transmit the corresponding non-core application data to the target business node on demand, gradually completing full business recovery. This method achieves zero-interruption backup and rapid hot recovery, improving the data security and business continuity of cloud phones.

Attached Figure Description

[0016] Figure 1 is a flowchart of the first embodiment of the multi-node collaborative hot backup and recovery method of the present invention;

Figure 2 is a diagram of the hot backup and recovery system architecture in the multi-node collaborative hot backup and recovery method of the present invention;

Figure 3 is a flowchart of the hot backup process in the multi-node collaborative hot backup and recovery method of the present invention;

Figure 4 is a flowchart of the hot recovery process in the multi-node collaborative hot backup and recovery method of the present invention;

Figure 5 is a structural block diagram of the first embodiment of the multi-node collaborative hot backup and recovery apparatus of the present invention.

[0017] The realization of the objective, functional features and advantages of the present invention will be further explained in conjunction with the embodiments and with reference to the accompanying drawings.

Detailed Implementation

[0018] It should be understood that the specific embodiments described herein are for illustrative purposes only and are not intended to limit the scope of the invention.

[0019] This invention provides a multi-node collaborative hot backup and recovery method. Referring to Figure 1, Figure 1 is a flowchart of the first embodiment of the multi-node collaborative hot backup and recovery method of the present invention.

[0020] In this embodiment, the multi-node collaborative hot backup and recovery method includes the following steps: Step S10: Under the premise that the main business node is running normally and the business is not interrupted, the scheduling management node triggers the backup task, instructing the main business node to divide the local running data into two categories: core basic data and non-core application data.

[0021] In this embodiment, the execution subject is a multi-node collaborative hot backup and recovery device, which has functions such as data processing, data communication and program execution. The multi-node collaborative hot backup and recovery device can be a computer terminal device or other network device, or other devices with similar functions. This embodiment does not limit this.

[0022] It should be noted that the existing technologies have the following shortcomings. One existing technology discloses a cloud-based method and system for secure backup of archive data, employing a regenerable code strategy to split and store backup data and recording fragment root hashes through a consortium blockchain. However, this solution still uses a full backup mode, which produces a large amount of backup data and consumes significant storage resources and network bandwidth. Another discloses a data synchronization backup method and system for a distributed mobile storage cluster, achieving data synchronization by constructing a distributed relative clock through ultra-wideband bidirectional ranging. However, this solution focuses on clock synchronization between mobile nodes and does not address the business interruption issue in cloud phone scenarios. A third discloses a data backup and recovery method and electronic device that generates backup metadata through multiple backups. However, recovery still requires loading the complete backup data, with full recovery times typically exceeding 5 minutes, leading to prolonged core business interruptions. Furthermore, it does not differentiate the recovery priority of core system data from that of non-core application data, so core functions cannot be made available quickly. In addition, traditional solutions often require backup or recovery operations to be performed while the cloud phone is powered off, and therefore cannot achieve zero-interruption backup and recovery.

[0023] To address the aforementioned technical problems, in this embodiment the scheduling and management node, after triggering a backup task, instructs the main business node to divide its local running data into layers. The core basic data is then synchronized in real time to all mutual backup storage nodes to achieve multi-replica persistence. For non-core application data, only the incremental changes since the last backup are transmitted and asynchronously synchronized to at least one mutual backup storage node. In the event of a single node failure, synchronization automatically fails over and resumes. Upon receiving a recovery request, the scheduling and management node, through heartbeat negotiation, schedules available mutual backup storage nodes and transmits the core basic data to the target business node first. It then schedules the mutual backup storage nodes to transmit the corresponding non-core application data to the target business node on demand, gradually completing full business recovery. This approach achieves zero-interruption backup and rapid hot recovery, improving the data security and business continuity of cloud phones.

[0024] In the specific implementation, this embodiment first describes the system architecture with reference to Figure 2. The system adopts a distributed architecture and mainly includes a backup management module, a data classification module, a synchronization transmission module, a dual-center storage module, a recovery scheduling module, and an application activation module.

1. Backup Management Module: Deployed on the cloud management platform, it is responsible for formulating backup strategies (scheduled backup cycles, data classification rules), triggering backup tasks, and monitoring backup progress and status.

2. Data Classification Module: It classifies cloud phone data into core basic data (system configuration, account information, critical process status) and non-core application data (third-party application installation packages, application cache, user files).

3. Synchronization Transmission Module: Deployed in the data plane proxy, it collects cloud phone running data in real time and synchronizes it to the dual-center storage module according to its classification. It uses incremental synchronization, transmitting only changed data.

4. Dual-Center Storage Module: Composed of mutual backup storage nodes in Tianjin and Shanghai. Core basic data is stored with real-time synchronization, while non-core application data is stored with asynchronous synchronization, to ensure data security.

5. Recovery Scheduling Module: It receives recovery requests and schedules the recovery process according to priority, first restoring core basic data and then restoring non-core application data on demand based on user operations.

6. Application Activation Module: Combined with instant installation technology, it uses the OverlayFS mounting mechanism to mount non-core application data, which is activated quickly when the user clicks on the application, making the application data available immediately.

[0025] Furthermore, based on the above system architecture, the specific processes for hot backup and hot recovery in this embodiment are shown in Figure 3 and Figure 4, respectively.

Figure 3 illustrates the hot backup process. When the system triggers a backup task, it operates as follows:

Step 1: Backup Trigger. The backup management module triggers the backup task according to a preset cycle (every 2 hours by default) or when the user requests it manually. During this time, the cloud phone remains running, and services continue normally.

Step 2: Data Classification and Collection. The data classification module divides the current data of the cloud phone into layers. Core basic data is acquired through real-time memory collection, while non-core application data is collected through file system snapshots.

Step 3: Incremental Synchronization Transmission. The synchronization transmission module transmits core basic data to the dual-center storage module in real time to ensure data consistency. For non-core application data, incremental synchronization is used, transmitting only the file fragments that have changed since the last backup.

Step 4: Dual-Center Storage. After receiving the data, the dual-center storage module stores core basic data synchronously in real time in both the Tianjin and Shanghai centers. Non-core application data is first stored in the local center and then asynchronously synchronized to the backup center. A data verification code is generated at the same time for integrity verification.

Step 5: Backup Status Feedback. After the backup is completed, the synchronization transmission module reports the backup result (success or failure, data volume, and time taken) to the backup management module. The backup management module records the backup log. If the backup fails, a retry mechanism is triggered (3 retries by default).

Figure 4 illustrates the hot recovery process. When a user initiates a recovery request or the system detects a cloud phone failure, the following steps are executed:

Step 1: Recovery Request Reception. The recovery scheduling module receives manual recovery requests from users or automatic recovery requests triggered by system failures. The request includes information such as the target backup version and the cloud phone identifier.

Step 2: Core Data Priority Recovery. The recovery scheduling module reads the core basic data of the target version from the dual-center storage module, transmits it to the target cloud phone via a fast transmission protocol, loads system configuration, account information, and key process status, and starts the cloud phone's core services. At this point, the cloud phone status changes to "Running," and core functions are available. The process takes no more than 30 seconds.

Step 3: Non-Core Data Marking. The recovery scheduling module marks non-core application data, displaying grayed-out application icons on the cloud phone desktop with a "Recovering" message. At this point, the user can operate the core functions normally.

Step 4: Application On-Demand Activation. When the user clicks a grayed-out application icon, the application activation module triggers the non-core application data recovery process. The application base image is mounted using OverlayFS, incremental application data is synchronized, and the application is activated quickly. The entire process takes no more than 3 seconds, and the user does not need to wait for the full recovery to complete.

Step 5: Confirm Recovery Completion. After all non-core application data has been activated according to user operations, the recovery scheduling module updates the cloud phone recovery status to "fully recovered" and pushes a recovery completion notification to the user. The entire recovery process takes no more than 2 minutes.
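The staged recovery described in the steps above can be sketched as a small state model. This is an illustration under assumptions, not the applicant's implementation: the class name `RecoverySession`, the initial label "Restoring core data", and the per-application label "Active" are hypothetical, while "Running", "Recovering", and "fully recovered" are the status strings used in the text.

```python
class RecoverySession:
    """Tracks cloud phone status while core data is restored first and
    non-core applications are activated on demand."""

    def __init__(self, non_core_apps):
        self.machine_status = "Restoring core data"  # hypothetical label
        # Every non-core application starts grayed out as "Recovering".
        self.app_status = {app: "Recovering" for app in non_core_apps}

    def core_data_loaded(self):
        # Core services start; the user can now use core functions.
        self.machine_status = "Running"

    def activate(self, app):
        """Called when the user clicks a grayed-out application icon."""
        if self.machine_status == "Restoring core data":
            raise RuntimeError("core data must be restored first")
        self.app_status[app] = "Active"  # hypothetical label
        if all(state == "Active" for state in self.app_status.values()):
            self.machine_status = "fully recovered"
```

The point of the model is that application activation is driven by user clicks, so the overall status only reaches "fully recovered" once every marked application has been opened or restored.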

[0026] In this implementation, the backup triggering method includes any one or more combinations of the following: automatic triggering at a preset period (default every 2 hours), manual triggering by the user, event triggering when the data change of the main business node reaches a preset threshold, and emergency triggering when the scheduling management node detects node risks. For example, when the scheduling management node detects that a mutual backup storage node is about to undergo planned maintenance, it can trigger an emergency backup in advance to ensure data integrity. After triggering, the scheduling management node issues a backup instruction to the main business node. This instruction includes information such as the backup version identifier and data classification rules, without interrupting the operation of the main business node throughout the process.
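The four trigger conditions can be combined in a single check, as in the following sketch. The 2-hour period is the default stated above; the change threshold of 64 MiB, the `TriggerState` fields, and the reason labels are assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class TriggerState:
    seconds_since_last_backup: float
    changed_bytes: int
    manual_request: bool = False
    node_risk_detected: bool = False


def backup_triggers(state, period_s=2 * 3600, change_threshold=64 * 1024 * 1024):
    """Return every trigger reason that currently applies (possibly none)."""
    reasons = []
    if state.seconds_since_last_backup >= period_s:
        reasons.append("periodic")
    if state.manual_request:
        reasons.append("manual")
    if state.changed_bytes >= change_threshold:  # assumed threshold value
        reasons.append("change_threshold")
    if state.node_risk_detected:
        reasons.append("emergency")
    return reasons
```

A non-empty result would lead the scheduling management node to issue a backup instruction; returning all matching reasons, rather than the first, keeps the combination case visible in the backup log.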

[0027] In this embodiment, core foundational data includes system configuration, account information, key process status, and core user asset data, which are collected through real-time memory snapshots. Specifically, the main business node calls a memory snapshot tool to capture core data of the cloud phone's current operating status within milliseconds, ensuring data consistency and real-time performance. Non-core application data includes third-party application installation packages, application caches, and non-sensitive user files, which are collected through file system block change tracking. The main business node records a block change bitmap since the last backup, reading only the changed data blocks, significantly reducing collection overhead.
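Block change tracking of the kind described here is commonly implemented as a bitmap with one flag per fixed-size block. The sketch below is a minimal illustration: the 4096-byte block size and the class and method names are assumptions, and a real tracker would sit in the storage or file system layer rather than in application code.

```python
class BlockChangeTracker:
    """Records which fixed-size blocks were written since the last backup."""

    def __init__(self, size_bytes, block_size=4096):  # block size is assumed
        self.block_size = block_size
        n_blocks = (size_bytes + block_size - 1) // block_size
        self.dirty = [False] * n_blocks

    def record_write(self, offset, length):
        """Mark every block touched by a write of `length` bytes at `offset`."""
        if length <= 0:
            return
        first = offset // self.block_size
        last = (offset + length - 1) // self.block_size
        for i in range(first, last + 1):
            self.dirty[i] = True

    def changed_blocks(self):
        """Indices of the blocks the next incremental backup must read."""
        return [i for i, is_dirty in enumerate(self.dirty) if is_dirty]

    def reset(self):
        """Clear the bitmap once a backup has completed."""
        self.dirty = [False] * len(self.dirty)
```

Only the blocks returned by `changed_blocks()` need to be read and transmitted, which is where the reduction in collection overhead comes from.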

[0028] Step S20: The core basic data is synchronized to all mutual backup storage nodes in real time to complete multi-replica persistence. For the non-core application data, only the incremental change fragments since the last backup are transmitted and asynchronously synchronized to at least one mutual backup storage node. When a single node fails, synchronization automatically switches to another available storage node and continues.

[0029] It should be noted that this embodiment uses different processing methods for core basic data and non-core application data. Specifically, for core basic data, the main business node synchronizes it to all mutual backup storage nodes in real time to complete multi-replica persistence. During the synchronization process, the Raft consensus algorithm is used to ensure the consistency of multiple replicas. After each storage node completes the data writing, it synchronously generates an MD5 checksum and reports it to the scheduling and management node. When the checksums of all nodes are consistent, the core data backup is considered successful. For non-core application data, the main business node only transmits the incremental change fragments since the last backup, asynchronously synchronizing them to at least one mutual backup storage node. Specifically, the main business node packages the data fragments corresponding to the block change bitmap and sends them to the local mutual backup storage node through an asynchronous transmission channel. After the transmission is completed, the node asynchronously synchronizes them to other backup nodes. It should be emphasized that the loading time of the core basic data does not exceed the preset time. After the core business functions are available, the main business node marks the unrecovered non-core application entry points as being in recovery status, and simultaneously synchronizes the recovery progress of non-core data in the background.
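The checksum comparison performed by the scheduling and management node can be sketched as follows. The example covers only the final agreement check and does not implement Raft; the node names and function names are hypothetical, and MD5 is used because the text names it.

```python
import hashlib


def md5_hex(data: bytes) -> str:
    """Checksum a storage node would report after writing `data`."""
    return hashlib.md5(data).hexdigest()


def core_backup_succeeded(reported, expected_nodes):
    """True only if every expected node reported and all checksums agree.

    reported: dict mapping node name to the checksum it reported.
    expected_nodes: set of node names that must hold a replica.
    """
    if set(reported) != set(expected_nodes):
        return False  # a node is missing or an unknown node reported
    return len(set(reported.values())) == 1
```

Requiring a report from every expected node, and not merely agreement among those that answered, matches the statement that the backup succeeds when the checksums of all nodes are consistent.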

[0030] In one embodiment, when a backup storage node fails, the system automatically switches to another available storage node to continue synchronization. For example, when the primary service node synchronizes data with storage node A, if node A experiences three consecutive heartbeat timeouts, the scheduling management node determines that node A has failed and immediately notifies the primary service node to switch to node B to continue synchronization. The synchronization process is seamless, and the operation of the primary service node is unaffected. After backup is complete, the primary service node reports the backup results to the scheduling management node, including success or failure status, data volume, and time taken. The scheduling management node records backup logs. If backup fails, a retry mechanism is triggered (default 3 retries). If the retries still fail, an alarm is generated to notify maintenance personnel to intervene.
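The failure rule in this example, three consecutive heartbeat timeouts followed by a switch to another node, can be sketched as a counter per storage node. The class and method names are assumptions, and how heartbeats are actually sent and timed is left out.

```python
class HeartbeatMonitor:
    """Declares a storage node failed after consecutive missed heartbeats."""

    def __init__(self, nodes, max_misses=3):
        self.max_misses = max_misses
        self.misses = {node: 0 for node in nodes}

    def record(self, node, ok):
        """Record one heartbeat result; a success resets the miss counter."""
        self.misses[node] = 0 if ok else self.misses[node] + 1

    def is_failed(self, node):
        return self.misses[node] >= self.max_misses

    def pick_target(self, preferred_order):
        """Return the first node in `preferred_order` not marked failed."""
        for node in preferred_order:
            if not self.is_failed(node):
                return node
        return None
```

Because the counter resets on any successful heartbeat, a single transient timeout does not cause a switch; only three in a row do.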

[0031] Step S30: When the scheduling management node receives a recovery request, it negotiates with all mutual backup storage nodes and target service nodes through heartbeats, schedules the available mutual backup storage node with the best network and the most complete data, and prioritizes the transmission of core basic data to the target service node. Once the target service node completes the loading of the core basic data, the core business functions become available immediately.

[0032] It should be noted that recovery requests can originate from: manual requests initiated by users through the management interface, and automatic requests triggered by the system when it detects a failure in the primary business node. The recovery request includes information such as the target backup version, the target business node identifier, and the recovery scope.

[0033] In the specific implementation, after receiving a request, the scheduling management node immediately establishes a heartbeat negotiation with all backup storage nodes and the target service node to obtain network latency, load status, and data integrity information of each storage node, as well as the resource availability status of the target service node. Based on the heartbeat negotiation results, the scheduling management node comprehensively evaluates the network latency, load status, and data integrity of each backup storage node, and schedules the available backup storage node with the best current network and the most complete data to prioritize the transmission of core basic data to the target service node.

[0034] Specifically, the scheduling and management node calculates a comprehensive score for each storage node: Score = Weight 1 × (1 / Network Latency) + Weight 2 × (1 / Load Rate) + Weight 3 × Data Integrity Coefficient. The storage node with the highest score is selected as the data source. Core basic data is transmitted to the target business node via a fast transmission protocol. After receiving the data, the target business node loads system configuration, account information, and key process statuses. Once the core basic data loading is complete, core business functions become available immediately. The entire core data recovery process takes no more than 30 seconds, and the target business node's status changes to "Running," allowing core functions to be used normally. After the core business functions become available, the target business node marks unrecovered non-core application entry points as "Recovering," and the application icons on the desktop are grayed out, indicating to the user that the application is being recovered, but this does not affect the user's operation of core functions. Simultaneously, the target business node synchronizes the recovery progress of non-core data in the background, continuously pulling non-core application data from the backup storage node to gradually complete data persistence.
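The comprehensive score can be computed directly from the formula above. In the sketch below the weights default to 1.0 and latency is taken in milliseconds; both are assumptions, since the text gives neither weight values nor units, and in practice the weights would have to compensate for the different scales of the three terms.

```python
from dataclasses import dataclass


@dataclass
class NodeStatus:
    name: str
    latency_ms: float  # network latency, must be > 0
    load_rate: float   # load rate, must be > 0
    integrity: float   # data integrity coefficient, e.g. 0.0 to 1.0


def score(node, w1=1.0, w2=1.0, w3=1.0):
    """Score = w1 * (1 / latency) + w2 * (1 / load rate) + w3 * integrity."""
    if node.latency_ms <= 0 or node.load_rate <= 0:
        raise ValueError("latency and load rate must be positive")
    return w1 / node.latency_ms + w2 / node.load_rate + w3 * node.integrity


def pick_source(nodes, **weights):
    """Select the storage node with the highest score as the data source."""
    return max(nodes, key=lambda n: score(n, **weights))
```

For example, with equal weights a lightly loaded node can outscore a node with lower latency, while setting `w2=0.0` makes latency and integrity decide.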

[0035] Step S40: Based on the user's actual access needs for non-core applications, the scheduling management node schedules the backup storage nodes to transmit the corresponding non-core application data to the target business nodes as needed, and gradually completes the full business recovery.

[0036] In the actual implementation, when a user clicks the grayed-out application icon, the target business node triggers the on-demand recovery process for that application. Specifically, the target business node sends a recovery request for the application to the scheduling management node, and the scheduling management node schedules the backup storage nodes to transmit the corresponding non-core application data to the target business node as needed.

[0037] In this embodiment, the target business node uses the OverlayFS file system mounting mechanism to mount the application base image and synchronize incremental application data, completing application activation within 3 seconds. Users do not need to wait for full recovery to complete; they can use the application immediately upon clicking, achieving a balance between data recovery and user experience. As users access non-core applications one by one, the target business node restores the corresponding application data as needed, gradually completing full business recovery. After all non-core application data is activated according to user operations, the scheduling and management node updates the cloud machine recovery status to "fully recovered" and pushes a recovery completion notification to the user. The entire recovery process, from triggering to full recovery, takes no more than 2 minutes.
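On Linux, an OverlayFS mount combines a read-only lower directory with a writable upper directory. The sketch below only builds the mount command for one application and does not execute it, since mounting requires elevated privileges; the function name and all paths are hypothetical, and mapping the application base image to `lowerdir` and the incremental data to `upperdir` is an assumption about how the mechanism is applied here.

```python
def overlay_mount_command(lower, upper, work, merged):
    """Build the argument list for mounting an overlay file system.

    lower:  read-only application base image directory
    upper:  writable directory holding the incremental application data
    work:   empty scratch directory on the same file system as `upper`
    merged: mount point where the application sees the combined view
    """
    options = f"lowerdir={lower},upperdir={upper},workdir={work}"
    return ["mount", "-t", "overlay", "overlay", "-o", options, merged]
```

Because the base image is mounted rather than copied, the application can start as soon as the mount succeeds, with the incremental data layered on top.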

[0038] In this embodiment, after the scheduling and management node triggers the backup task, it instructs the main business node to divide the local running data into layers, and then synchronizes the core basic data to all mutual backup storage nodes in real time to complete multi-replica persistence. For non-core application data, only the incremental changes since the last backup are transmitted; they are asynchronously synchronized to at least one mutual backup storage node, and synchronization automatically switches over and continues if a single node fails. When the scheduling and management node receives a recovery request, it negotiates through heartbeats to schedule the available mutual backup storage nodes, prioritizing the transmission of core basic data to the target business node. It then schedules the mutual backup storage nodes to transmit the corresponding non-core application data to the target business node as needed, gradually completing the full business recovery. This method achieves zero-interruption backup and rapid hot recovery, improving the data security and business continuity of cloud phones.
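
A minimal sketch of incremental synchronization with failover is given below for illustration only. It detects changed blocks by comparing SHA-256 digests against the previous backup, which is a stand-in chosen for this example and not the change-tracking mechanism of the embodiment. The function names, the 4096-byte block size, the `store` method on node objects, and the use of `ConnectionError` to signal a failed node are all assumptions.

```python
import hashlib
from typing import Dict, Iterable, Tuple


def changed_fragments(data: bytes,
                      last_hashes: Dict[int, str],
                      block_size: int = 4096
                      ) -> Tuple[Dict[int, bytes], Dict[int, str]]:
    """Return (blocks that differ from the last backup, new digest map)."""
    fragments: Dict[int, bytes] = {}
    hashes: Dict[int, str] = {}
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        index = offset // block_size
        digest = hashlib.sha256(block).hexdigest()
        hashes[index] = digest
        if last_hashes.get(index) != digest:
            fragments[index] = block   # only changed blocks are transmitted
    return fragments, hashes


def sync_with_failover(fragments: Dict[int, bytes], nodes: Iterable):
    """Send fragments to the first reachable node; skip failed nodes."""
    for node in nodes:
        try:
            node.store(fragments)      # hypothetical storage-node method
            return node
        except ConnectionError:
            continue                   # switch to the next available node
    raise RuntimeError("no storage node available")
```

This sketch does not handle data that shrinks between backups, and it sends to a single node; both are simplifications for brevity.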

[0039] Furthermore, this embodiment of the invention also proposes a storage medium storing a multi-node collaborative hot backup and recovery program, which, when executed by a processor, implements the steps of the multi-node collaborative hot backup and recovery method described above.

[0040] Referring to Figure 5, Figure 5 is a structural block diagram of the first embodiment of the multi-node collaborative hot backup and recovery device of the present invention.

[0041] As shown in Figure 5, the multi-node collaborative hot backup and recovery device proposed in this embodiment of the invention includes: a backup triggering module 10, used to trigger a backup task under the premise that the main business node is running normally and the business is not interrupted, and to instruct the main business node to divide the local running data into two categories: core basic data and non-core application data; a data synchronization module 20, used to synchronize the core basic data to all mutual backup storage nodes in real time to complete multi-replica persistence, and, for the non-core application data, to transmit only the incremental change fragments since the last backup and asynchronously synchronize them to at least one mutual backup storage node, automatically switching to another available storage node to continue the synchronization when a single node fails, with the main business node not interrupted throughout the process; a recovery scheduling module 30, used to negotiate with all mutual backup storage nodes and the target business node through heartbeats when a recovery request is received, and to schedule the available mutual backup storage node with the best network and the most complete data to transmit the core basic data to the target business node first, the core business functions becoming available immediately after the target business node completes the loading of the core basic data; and an on-demand recovery module 40, used to schedule backup storage nodes to transmit the corresponding non-core application data to the target business node on demand, based on the user's actual access needs for non-core applications, and to gradually complete the full business recovery.

[0042] In this embodiment, after the scheduling and management node triggers the backup task, it instructs the main business node to divide the local running data into layers, and then synchronizes the core basic data to all mutual backup storage nodes in real time to complete multi-replica persistence. For non-core application data, only the incremental changes since the last backup are transmitted; they are asynchronously synchronized to at least one mutual backup storage node, and synchronization automatically switches over and continues if a single node fails. When the scheduling and management node receives a recovery request, it negotiates through heartbeats to schedule the available mutual backup storage nodes, prioritizing the transmission of core basic data to the target business node. It then schedules the mutual backup storage nodes to transmit the corresponding non-core application data to the target business node as needed, gradually completing the full business recovery. This method achieves zero-interruption backup and rapid hot recovery, improving the data security and business continuity of cloud phones.

[0043] This application embodiment also provides a multi-node collaborative hot backup and recovery device, including a processor, a communication interface, a memory, and a communication bus. The processor, communication interface, and memory communicate with each other through the communication bus. The memory is used to store the multi-node collaborative hot backup and recovery program. When the processor executes the program stored in the memory, it implements the above-mentioned multi-node collaborative hot backup and recovery method.

[0044] The communication bus mentioned in the aforementioned multi-node collaborative hot backup and recovery device can be a Peripheral Component Interconnect (PCI) bus or an Extended Industry Standard Architecture (EISA) bus, etc. This communication bus can be divided into address bus, data bus, control bus, etc.

[0045] The communication interface is used for communication between the aforementioned multi-node collaborative hot backup and recovery device and other devices.

[0046] The memory may include random access memory (RAM) or non-volatile memory (NVM), such as at least one disk storage device. Optionally, the memory may also be at least one storage device located remotely from the aforementioned processor.

[0047] The processors mentioned above can be general-purpose processors, including central processing units (CPUs), network processors (NPs), etc.; they can also be digital signal processors (DSPs), application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), or other programmable logic devices, discrete gate or transistor logic devices, or discrete hardware components.

[0048] In the above embodiments, implementation can be achieved entirely or partially through software, hardware, firmware, or any combination thereof. When implemented using software, it can be implemented entirely or partially in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, all or part of the processes or functions described in the embodiments of this application are generated. The computer can be a general-purpose computer, a special-purpose computer, a computer network, or other programmable device. The computer instructions can be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another. For example, the computer instructions can be transmitted from one website, computer, server, or data center to another website, computer, server, or data center via wired (e.g., coaxial cable, fiber optic, digital subscriber line (DSL)) or wireless (e.g., infrared, wireless, microwave, etc.) means. The computer-readable storage medium can be any available medium that a computer can access or a data storage device such as a server or data center that integrates one or more available media. The available medium can be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., solid state disk (SSD)).

[0049] It should be noted that, in this document, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any such actual relationship or order between these entities or operations. Furthermore, the terms "comprising," "including," or any other variations thereof are intended to cover non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or apparatus. Without further limitations, an element defined by the phrase "comprising one..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that includes said element.

[0050] The various embodiments in this specification are described in a related manner, and similar or identical parts of the embodiments may be cross-referenced. Each embodiment focuses on its differences from the other embodiments. In particular, because the system embodiments are basically similar to the method embodiments, they are described relatively briefly; for the relevant parts, refer to the descriptions of the method embodiments.

[0051] The above embodiments are only used to illustrate the technical solutions of the present invention, and are not intended to limit it. Although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that modifications can still be made to the technical solutions described in the foregoing embodiments, or equivalent substitutions can be made to some of the technical features. Such modifications or substitutions do not cause the essence of the corresponding technical solutions to deviate from the spirit and scope of the technical solutions of the embodiments of the present invention.

[0052] It should be understood that the above are merely illustrative examples and do not constitute any limitation on the technical solutions of the present invention. In specific applications, those skilled in the art can make settings as needed, and the present invention does not impose any restrictions on this.

[0053] It should be noted that the workflow described above is merely illustrative and does not limit the scope of protection of this invention. In practical applications, those skilled in the art can select some or all of the workflow to achieve the purpose of this embodiment according to actual needs, and no restrictions are imposed here.

[0054] In addition, for technical details not described in detail in this embodiment, please refer to the multi-node collaborative hot backup and recovery method provided in any embodiment of the present invention, which will not be repeated here.

[0055] Furthermore, it should be noted that, in this document, the terms "comprising," "including," or any other variations thereof are intended to cover non-exclusive inclusion, such that a process, method, article, or system that comprises a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or system. Unless otherwise specified, an element defined by the phrase "comprising one..." does not exclude the presence of other identical elements in the process, method, article, or system that includes that element.

[0056] The sequence numbers of the above embodiments of the present invention are for descriptive purposes only and do not represent the superiority or inferiority of the embodiments.

[0057] Through the above description of the embodiments, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by means of software plus necessary general-purpose hardware platforms. Of course, they can also be implemented by hardware, but in many cases the former is a better implementation method. Based on this understanding, the technical solution of the present invention, or the part that contributes to the prior art, can be embodied in the form of a software product. This computer software product is stored in a storage medium (such as read-only memory (ROM) / RAM, magnetic disk, optical disk) and includes several instructions to cause a terminal device (which may be a mobile phone, computer, server, or network device, etc.) to execute the methods described in the various embodiments of the present invention.

[0058] The above are merely preferred embodiments of the present invention and do not limit the scope of the patent. Any equivalent structural or procedural transformations made based on the description and drawings of the present invention, or direct or indirect applications in other related technical fields, are similarly included within the scope of patent protection of the present invention.

[0059] It is understood that the system provided in the embodiments of the present invention corresponds to the method provided in the embodiments of the present invention, and the explanation, examples and beneficial effects of the relevant content can be referred to the corresponding parts of the above method.

Claims

1. A multi-node collaborative hot backup and recovery method, characterized in that the multi-node collaborative hot backup and recovery method includes: under the premise that the main business node is running normally and the business is not interrupted, the scheduling and management node triggers a backup task, instructing the main business node to divide the local running data into two categories: core basic data and non-core application data; the core basic data is synchronized to all mutual backup storage nodes in real time to complete multi-replica persistence, while for the non-core application data only the incremental change fragments since the last backup are transmitted and asynchronously synchronized to at least one mutual backup storage node; when a single node fails, synchronization automatically switches to another available storage node and continues, and the operation of the main business node is not interrupted throughout the process; when the scheduling and management node receives a recovery request, it negotiates with all mutual backup storage nodes and the target business node through heartbeats, schedules the available mutual backup storage node with the best network and the most complete data, and prioritizes the transmission of the core basic data to the target business node; once the target business node completes the loading of the core basic data, the core business functions become available immediately; based on the user's actual access needs for non-core applications, the scheduling and management node schedules the backup storage nodes to transmit the corresponding non-core application data to the target business node as needed, gradually completing the full business recovery.

2. The multi-node collaborative hot backup and recovery method as described in claim 1, characterized in that the core basic data includes at least system configuration, account information, key process status, and user core asset data, which are collected through real-time memory snapshots; the non-core application data includes at least third-party application installation packages, application cache, and non-sensitive user files, which are collected through file system block change tracking.

3. The multi-node collaborative hot backup and recovery method as described in claim 2, characterized in that the backup task triggering instructions include any one or more of the following: automatic triggering at a preset period, manual triggering by the user, event triggering when the data change of the main business node reaches a preset threshold, and emergency triggering when the scheduling management node detects node risks.

4. The multi-node collaborative hot backup and recovery method as described in claim 1, characterized in that the core data is synchronized across multiple nodes using the Raft consensus algorithm to ensure consistency among multiple replicas; after each storage node completes the data writing, it synchronously generates an MD5 checksum and reports it to the scheduling and management node; when the checksums of all nodes are consistent, the core data backup is considered successful.

5. The multi-node collaborative hot backup and recovery method as described in claim 1, characterized in that the loading time of the core basic data shall not exceed a preset time; after the core business functions are available, the main business node marks the non-core application entries that have not been restored as being in the recovery state, and at the same time synchronizes the recovery progress of the non-core data in the background.

6. The multi-node collaborative hot backup and recovery method as described in claim 1, characterized in that the backup storage nodes are distributed across at least two physically isolated remote data centers; core basic data is synchronized to all remote storage nodes, while non-core application data is only synchronized to storage nodes in the same city; in the event of a cross-regional failure, the core data of the remote storage nodes can be directly called to complete disaster recovery.

7. The multi-node collaborative hot backup and recovery method as described in claim 1, characterized in that all mutual backup storage nodes collaboratively maintain a unified backup version chain; each version record includes backup time, data checksum, and storage node distribution information; recovery requests can specify any historical version to complete data recovery, and the version retention period can be customized.

8. A multi-node collaborative hot backup and recovery device, characterized in that the multi-node collaborative hot backup and recovery device is applied to the multi-node collaborative hot backup and recovery method as described in any one of claims 1 to 7, the device comprising: a backup trigger module, used to trigger a backup task under the premise that the main business node continues to operate normally and the business is not interrupted, and to instruct the main business node to divide the local running data into two categories: core basic data and non-core application data; a data synchronization module, used to synchronize the core basic data to all mutual backup storage nodes in real time to complete multi-replica persistence, and, for the non-core application data, to transmit only the incremental change fragments since the last backup and asynchronously synchronize them to at least one mutual backup storage node, automatically switching to another available storage node to continue the synchronization when a single node fails, with the main business node not interrupted throughout the process; a recovery scheduling module, used to negotiate with all mutual backup storage nodes and the target business node through heartbeats when a recovery request is received, and to schedule the available mutual backup storage node with the best network and the most complete data to transmit the core basic data to the target business node first, the core business functions becoming available immediately after the target business node completes the loading of the core basic data; and an on-demand recovery module, used to schedule backup storage nodes to transmit the corresponding non-core application data to the target business node on demand, based on the user's actual access needs for non-core applications, and to gradually complete the full business recovery.

9. A multi-node collaborative hot backup and recovery device, characterized in that the multi-node collaborative hot backup and recovery device includes: a memory, a processor, and a multi-node collaborative hot backup and recovery program stored on the memory and executable on the processor, wherein the multi-node collaborative hot backup and recovery program is configured to implement the steps of the multi-node collaborative hot backup and recovery method as described in any one of claims 1 to 7.

10. A storage medium, characterized in that the storage medium stores a multi-node collaborative hot backup and recovery program, which, when executed by a processor, implements the steps of the multi-node collaborative hot backup and recovery method as described in any one of claims 1 to 7.