A node processing method, device, equipment and readable storage medium
By iteratively calculating the redundancy and node order of sub-clusters in a distributed cluster, a list of nodes that can be processed in parallel is determined, solving the problem of low node processing efficiency in existing technologies and achieving efficient and flexible node processing.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- JINAN INSPUR DATA TECH CO LTD
- Filing Date
- 2023-04-07
- Publication Date
- 2026-06-26
AI Technical Summary
In distributed clusters, existing technologies are inefficient in node processing, cannot simultaneously meet the unique rule requirements of each sub-cluster, and are difficult to adapt to system evolution, often ignoring the specific needs of sub-clusters.
By determining the list of nodes to be operated on and sending it to the sub-cluster, the sub-cluster determines the list of nodes that can be processed in parallel based on its own redundancy and the optimal execution order of the nodes, and iterates the calculation step by step until all nodes have completed the processing, thus simplifying the overall control and rule calculation.
It improves the efficiency and flexibility of node processing, adapts to the evolution of distributed systems, does not ignore the requirements of sub-cluster rules, and does not restrict the deployment method, thus realizing multi-node parallel processing.
Figure: abstract drawing (CN116405495B)
Description
Technical Field
[0001] This application relates to the field of distributed cluster technology, and more specifically, to a node processing method, apparatus, device, and readable storage medium.
Background Technology
[0002] A distributed cluster contains multiple nodes, each of which may deploy various software modules. Nodes deploying the same software module form a sub-cluster. Similarly, a distributed storage cluster consists of a group of interconnected storage nodes, each with multiple responsibilities and belonging to multiple sub-clusters.
[0003] In distributed clusters, during scenarios where nodes are temporarily offline due to maintenance, restarts, or online upgrades, to avoid impacting business operations, only one batch of nodes can be operated on at a time. The next batch can only be operated on after these nodes have recovered. A distributed cluster may contain dozens, hundreds, or even more nodes; to improve operational efficiency, each batch should select as many nodes as possible for execution.
[0004] Each sub-cluster has its own unique redundancy rules and optimal node offline order. Furthermore, in a distributed cluster, a node may belong to multiple sub-clusters with different requirements. Therefore, how to select nodes to perform operations in batches while meeting these requirements becomes a problem to be solved.
[0005] Currently, some distributed clusters, to avoid the aforementioned node selection problem, simply execute operations node by node, resulting in low efficiency. Others ignore the rule requirements of some complex sub-clusters and simplify the process by adding restrictions to the software module deployment mode. This not only ignores the rule requirements of some sub-clusters but also imposes requirements on the software module deployment mode. Additionally, some methods collect the rules and node information of all sub-clusters and perform complex algorithmic calculations to obtain a batch of nodes that can be operated in parallel. These products, on the one hand, struggle to adapt to the continuous evolution of distributed systems, and on the other hand, often fail to fully consider the specific needs of each sub-cluster, resulting in suboptimal final results from the node processing.
[0006] In summary, how to improve node processing efficiency, simplicity, and flexibility without ignoring the rule requirements of sub-clusters or restricting their deployment methods is a technical problem that urgently needs to be solved by those skilled in the art.
Summary of the Invention
[0007] In view of this, the purpose of this application is to provide a node processing method, apparatus, device and readable storage medium to improve node processing efficiency, simplicity and flexibility, without ignoring the rule requirements of sub-clusters and without restricting the deployment method of sub-clusters.
[0008] To achieve the above objectives, this application provides the following technical solution:
[0009] A node processing method, comprising:
[0010] Determine the list of nodes to be operated on; the list of nodes to be operated on includes unprocessed nodes in the distributed cluster.
[0011] The list of nodes to be operated is sent to one of the sub-clusters in the distributed cluster, and the sub-cluster determines the list of nodes that can be processed in parallel based on the list of nodes to be operated, its own redundancy, and the optimal execution order of the nodes.
[0012] Receive the list of nodes that can be processed in parallel from the sub-cluster, and send the list of nodes that can be processed in parallel to the next sub-cluster, until the list of nodes that can be processed in parallel from the last sub-cluster is received.
[0013] Parallel processing is performed on the nodes in one of the lists of nodes that can be processed in parallel sent by the last sub-cluster, and the process returns to the step of determining the list of nodes to be operated on, until all nodes in the distributed cluster have completed processing.
[0014] Preferably, the method further includes:
[0015] Determine the priority of all the aforementioned sub-clusters;
[0016] Sending the list of nodes to be operated to one of the sub-clusters in the distributed cluster includes:
[0017] Send the list of nodes to be operated to the highest priority sub-cluster in the distributed cluster;
[0018] Sending the list of nodes that can be processed in parallel to the next sub-cluster includes:
[0019] The list of nodes that can be processed in parallel is sent to the next sub-cluster in descending order of priority.
[0020] Preferably, the priority of all said sub-clusters is determined, including:
[0021] Obtain the priority of each of the sub-clusters from each of the sub-clusters.
[0022] Preferably, receiving the list of nodes capable of parallel processing sent by the sub-cluster includes:
[0023] Receive all the lists of nodes that can be processed in parallel sent by the sub-cluster;
[0024] Performing parallel processing on the nodes in one of the lists of nodes that can be processed in parallel sent by the last sub-cluster includes:
[0025] The nodes in the best list of parallelizable nodes sent by the last sub-cluster are processed in parallel.
[0026] Preferably, receiving the list of nodes capable of parallel processing sent by the sub-cluster includes:
[0027] Receive a single list (one batch) of nodes that can be processed in parallel from the sub-cluster.
[0028] Preferably, sending the list of nodes to be operated to one of the sub-clusters in the distributed cluster includes:
[0029] Call the node operation batch acquisition interface of one of the sub-clusters to send the list of nodes to be operated to the sub-cluster through the node operation batch acquisition interface;
[0030] Sending the list of nodes that can be processed in parallel to the next sub-cluster includes:
[0031] Call the node operation batch acquisition interface of the next sub-cluster to send the list of nodes that can be processed in parallel to the next sub-cluster through that interface.
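The interface-based exchange described above could be sketched as follows. This is a minimal illustration only: the source names a "node operation batch acquisition interface" but does not specify its signature, so `NodeBatchProvider`, `get_operation_batch`, `EchoSubCluster`, and `ask` are hypothetical names chosen for this example.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class NodeBatchProvider(Protocol):
    """Hypothetical shape of a sub-cluster's batch acquisition interface."""

    def get_operation_batch(self, candidates: list[int]) -> list[int]:
        """Take candidate node ids and return those that may be processed in parallel."""
        ...


class EchoSubCluster:
    """Trivial stand-in that imposes no constraint and returns every candidate."""

    def get_operation_batch(self, candidates: list[int]) -> list[int]:
        return list(candidates)


def ask(sub_cluster: NodeBatchProvider, candidates: list[int]) -> list[int]:
    # The control module only calls the interface; the rules stay inside the sub-cluster.
    return sub_cluster.get_operation_batch(candidates)


print(ask(EchoSubCluster(), [1, 2, 3]))  # [1, 2, 3]
```

Under this assumption, the control module needs no knowledge of any sub-cluster's redundancy rules; it depends only on the shared call shape.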
[0032] Preferably, after performing parallel processing on the nodes in one of the parallelizable node lists sent by the last said sub-cluster, the method further includes:
[0033] Display the node information of the nodes that are performing parallel processing.
[0034] A node processing apparatus, comprising:
[0035] The first determining module is used to determine a list of nodes to be operated on; the list of nodes to be operated on includes unprocessed nodes in the distributed cluster.
[0036] The first sending module is used to send the list of nodes to be operated to one of the sub-clusters in the distributed cluster, and the sub-cluster determines the list of nodes that can be processed in parallel based on the list of nodes to be operated, its own redundancy and the optimal execution order of the nodes.
[0037] The receiving module is used to receive the list of nodes that can be processed in parallel from the sub-cluster, and send the list of nodes that can be processed in parallel to the next sub-cluster, until the list of nodes that can be processed in parallel from the last sub-cluster is received.
[0038] The parallel processing module is used to perform parallel processing on the nodes in one of the lists of nodes that can be processed in parallel sent by the last sub-cluster, and to return to the step of determining the list of nodes to be operated on until all nodes in the distributed cluster have completed processing.
[0039] A node processing device, comprising:
[0040] Memory, used to store computer programs;
[0041] A processor for executing the computer program to implement the steps of the node processing method as described in any of the preceding claims.
[0042] A readable storage medium storing a computer program that, when executed by a processor, implements the steps of the node processing method as described in any of the preceding claims.
[0043] This application provides a node processing method, apparatus, device, and readable storage medium. The method includes: determining a list of nodes to be operated on, where the list includes the unprocessed nodes in a distributed cluster; sending the list of nodes to be operated on to one of the sub-clusters in the distributed cluster, so that the sub-cluster determines a list of nodes that can be processed in parallel based on the list of nodes to be operated on, its own redundancy, and the optimal execution order of the nodes; receiving the list of nodes that can be processed in parallel from the sub-cluster and sending it to the next sub-cluster, until the list of nodes that can be processed in parallel from the last sub-cluster is received; and performing parallel processing on the nodes in one of the lists of nodes that can be processed in parallel sent by the last sub-cluster, then returning to the step of determining the list of nodes to be operated on, until all nodes in the distributed cluster have completed processing.
[0044] In the technical solution disclosed in this application, a list of nodes to be operated on, containing the unprocessed nodes in the distributed cluster, is determined and sent to one of the sub-clusters in the distributed cluster. That sub-cluster determines a list of nodes that can be processed in parallel based on the list of nodes to be operated on, its own redundancy, and the optimal execution order of its nodes. The list received from that sub-cluster is then sent to the next sub-cluster, which determines a new list of nodes that can be processed in parallel based on the received list, its own redundancy, and the optimal execution order of its nodes, and so on, until the last sub-cluster sends its list of nodes that can be processed in parallel. The nodes in one of the lists sent by the last sub-cluster are then processed in parallel, and the process returns to the step of determining the list of nodes to be operated on, until all nodes in the distributed cluster have completed processing. This process delegates the computation of parallelizable nodes to the sub-clusters, each of which is familiar with its own redundancy rules. Doing so simplifies the overall implementation, decouples overall control from the rule-specific calculations, and ensures that every sub-cluster participates in determining the list of parallelizable nodes. The rule requirements of no sub-cluster are ignored and the deployment method of the sub-clusters is not restricted, so the solution applies to sub-clusters deployed in any way. Furthermore, this application does not need to collect the rule and node information of all sub-clusters for a complex algorithmic calculation, so it is relatively simple to implement while still accounting for the needs of every sub-cluster. When a new sub-cluster joins, it can directly participate in determining the list of parallelizable nodes and in the final node processing in the same way, which gives the solution high flexibility. In addition, because the nodes in one of the parallelizable node lists sent by the last sub-cluster are processed in parallel, multiple nodes are processed at once, which improves node processing efficiency.
Attached Figure Description
[0045] To more clearly illustrate the technical solutions in the embodiments of this application or the prior art, the drawings used in the description of the embodiments or the prior art will be briefly introduced below. Obviously, the drawings described below are only embodiments of this application. For those skilled in the art, other drawings can be obtained based on the provided drawings without creative effort.
[0046] Figure 1 is a schematic diagram of the structure of a distributed storage cluster provided in an embodiment of this application;
[0047] Figure 2 is a flowchart of a node processing method provided in an embodiment of this application;
[0048] Figure 3 is a flowchart of another node processing method provided in an embodiment of this application;
[0049] Figure 4 is a schematic diagram of the structure of a node processing apparatus provided in an embodiment of this application;
[0050] Figure 5 is a schematic diagram of the structure of a node processing device provided in an embodiment of this application.
Detailed Implementation
[0051] Taking a distributed storage cluster as an example of a distributed cluster, a distributed storage cluster consists of a group of interconnected storage nodes. Each storage node undertakes multiple responsibilities and belongs to multiple sub-clusters. For example, Figure 1 shows a schematic diagram of a distributed storage cluster in which Nodes 1 to 6 constitute the cluster. Nodes 1, 2, and 3 run cluster monitoring services and form a system monitoring sub-cluster; Nodes 1, 4, and 5 constitute storage pool 1; and Nodes 4, 5, and 6 constitute storage pool 2.
[0052] In distributed storage clusters, during scenarios where nodes are temporarily offline due to maintenance, restarts, or online upgrades, to avoid impacting business operations, only one batch of nodes can be operated on at a time. The next batch can only be operated on after these nodes have recovered. A distributed storage cluster may contain dozens, hundreds, or even more nodes. To improve operational efficiency, each batch needs to select as many nodes as possible for operation.
[0053] Each sub-cluster has its own unique redundancy rules and optimal node offline order. For example, in the system monitoring sub-cluster shown in the diagram above, the total number of nodes offline simultaneously must be less than half of the total number of nodes in the sub-cluster. Furthermore, since master node switching has a greater impact on the system than switching between ordinary nodes, the number of master node switches during operations should be minimized. For example, master node operations can be scheduled last, requiring only one master node switch in the entire process. Storage pools vary depending on their data redundancy requirements. For instance, a 3-replica storage pool requires no more than two nodes to be offline simultaneously, while a 2-replica storage pool requires no more than one. In a distributed storage system, nodes may belong to multiple sub-clusters with different requirements. How to select nodes for operations in batches while meeting these requirements becomes a problem to be solved.
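The redundancy rules in this paragraph can be read as an upper bound on how many nodes of a sub-cluster may be offline at once. The sketch below illustrates that reading. The function names are hypothetical, and the general formula `replicas - 1` is an assumption extrapolated from the two examples given in the text (3 replicas allow two nodes offline, 2 replicas allow one).

```python
def max_offline_monitor(total_nodes: int) -> int:
    """Monitoring sub-cluster: strictly fewer than half of its nodes may be offline."""
    # Largest integer k that satisfies k < total_nodes / 2.
    return (total_nodes - 1) // 2


def max_offline_replica_pool(replicas: int) -> int:
    """Replica storage pool: assumed bound of replicas - 1 nodes offline at once."""
    return replicas - 1


print(max_offline_monitor(3))        # 1
print(max_offline_replica_pool(3))   # 2
print(max_offline_replica_pool(2))   # 1
```

For the three-node system monitoring sub-cluster of Figure 1, this gives at most one node offline at a time.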
[0054] Currently, some distributed clusters, to avoid the aforementioned node selection problem, simply execute operations node by node, resulting in low efficiency. Others ignore the rule requirements of some complex sub-clusters and simplify the process by adding restrictions to the software module deployment mode. This not only ignores the rule requirements of some sub-clusters but also imposes requirements on the software module deployment mode. Additionally, some methods collect the rules and node information of all sub-clusters and perform complex algorithmic calculations to obtain a batch of nodes that can be operated in parallel. These products, on the one hand, struggle to adapt to the continuous evolution of distributed systems, and on the other hand, often fail to fully consider the specific needs of each sub-cluster, resulting in suboptimal final results from the node processing.
[0055] The core of this application is to provide a node processing method, apparatus, device, and readable storage medium to improve node processing efficiency, simplicity, and flexibility, without ignoring the rule requirements of sub-clusters or restricting the deployment method of sub-clusters.
[0056] The technical solutions of the embodiments of this application will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of this application, and not all embodiments. Based on the embodiments of this application, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of this application.
[0057] See Figure 2 and Figure 3. Figure 2 shows a flowchart of a node processing method provided in an embodiment of this application, and Figure 3 shows a flowchart of another node processing method provided in an embodiment of this application. It should be noted that Figure 3 uses a distributed cluster containing three sub-clusters as an example; the implementation for other numbers of sub-clusters is similar. The node processing method provided in an embodiment of this application may include:
[0058] S11: Determine the list of nodes to be operated on; the list of nodes to be operated on contains unprocessed nodes in the distributed cluster.
[0059] In this application, the node processing method is executed by a control module, which is responsible for overall scheduling: it invokes each sub-cluster in turn and finally obtains the batch of nodes to be processed in parallel.
[0060] Specifically, the control module first determines the list of nodes to be operated on, which includes unprocessed nodes in the distributed cluster. For example, if all nodes in the distributed cluster have not yet performed an operation, the list of nodes to be operated on includes all nodes in the distributed cluster; if some nodes have already been processed, the list of nodes to be operated on does not include these nodes, but only includes all nodes that have not yet been processed.
[0061] Identifying the unprocessed nodes lets each sub-cluster select only from nodes that have not yet been processed, which avoids processing any node more than once.
[0062] It should be noted that the distributed cluster mentioned in this application can specifically be a distributed storage cluster, or a cluster consisting of multiple sub-clusters with specific redundancy rules.
[0063] S12: Send the list of nodes to be operated to one of the sub-clusters in the distributed cluster. The sub-cluster determines the list of nodes that can be processed in parallel based on the list of nodes to be operated, its own redundancy, and the optimal execution order of the nodes.
[0064] After determining the list of nodes to be operated on, the control module can send the list to one of the sub-clusters in the distributed cluster (referred to as the first sub-cluster). Upon receiving the list, the first sub-cluster can determine a list of nodes that can be processed in parallel based on the unprocessed nodes in the list, its own redundancy, and the optimal execution order of its nodes. This list includes the unprocessed nodes that lie outside the first sub-cluster, together with the unprocessed nodes inside the first sub-cluster that are selected according to its redundancy and optimal execution order. For example, suppose the distributed cluster contains ten nodes, with nodes 1-5 forming sub-cluster 1, nodes 6-10 forming sub-cluster 2, and nodes 1-3 forming sub-cluster 3. If sub-cluster 1 has a redundancy of 4 (meaning at least four of its nodes must be online at any given time) and its optimal execution order runs from node 1 to node 5, then sub-cluster 1 can determine that one of its lists of nodes that can be processed in parallel contains node 1 and nodes 6-10 (six nodes in total).
[0065] Here, a sub-cluster's list of nodes that can be processed in parallel is one whose nodes can all go offline at the same time without affecting the normal operation of that sub-cluster.
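The ten-node example in step S12 can be reproduced with a small sketch. The class and field names (`SubCluster`, `members`, `min_online`, `parallel_list`) are hypothetical. The sketch also assumes that every member not in the candidate list is currently online, which the source does not state explicitly.

```python
from dataclasses import dataclass


@dataclass
class SubCluster:
    name: str
    members: list[int]   # node ids, listed in the preferred (optimal) execution order
    min_online: int      # redundancy: number of members that must stay online

    def parallel_list(self, candidates: list[int]) -> list[int]:
        """Return the candidates that may go offline together, from this sub-cluster's view."""
        # Number of members allowed offline at once (assumes all other members are online).
        budget = max(len(self.members) - self.min_online, 0)
        member_set = set(self.members)
        candidate_set = set(candidates)
        # Own members: take them in the preferred order, up to the budget.
        chosen = [n for n in self.members if n in candidate_set][:budget]
        # Nodes outside this sub-cluster are not constrained by it and pass through.
        outside = [n for n in candidates if n not in member_set]
        return sorted(chosen + outside)


sc1 = SubCluster("sub-cluster 1", members=[1, 2, 3, 4, 5], min_online=4)
result = sc1.parallel_list(list(range(1, 11)))
print(result)  # [1, 6, 7, 8, 9, 10]
```

The output matches the list given in the text: node 1 from sub-cluster 1, plus nodes 6-10, which sub-cluster 1 does not constrain.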
[0066] S13: Receive the list of nodes that can be processed in parallel from the sub-cluster, and send the list of nodes that can be processed in parallel to the next sub-cluster, until the list of nodes that can be processed in parallel from the last sub-cluster is received.
[0067] After determining its list of nodes that can be processed in parallel, the first sub-cluster can send this list to the control module. The control module receives the list and sends it to the next sub-cluster in the distributed cluster (referred to as the second sub-cluster). The second sub-cluster determines its own list of nodes that can be processed in parallel based on the nodes in the list it receives, its own redundancy, and the optimal execution order of its nodes. Specifically, the second sub-cluster's list includes the nodes selected by the first sub-cluster that lie outside the second sub-cluster, together with the nodes inside the second sub-cluster that it selects from the received list according to its own redundancy and optimal execution order. In other words, the list output by the previous sub-cluster is the input from which the next sub-cluster derives its output list; the second sub-cluster's list is determined from the first sub-cluster's list.
[0068] After obtaining its corresponding list of parallelizable nodes, the second sub-cluster sends its list to the control module. The control module receives this list and forwards it to the next sub-cluster (the third sub-cluster). The third sub-cluster determines its own list of parallelizable nodes based on the second sub-cluster's list, its own redundancy, and the optimal execution order of the nodes… This continues until the last sub-cluster in the distributed cluster determines its list of parallelizable nodes and sends it to the control module, thus obtaining the final list of parallelizable nodes.
[0069] In other words, this application decomposes the computational processing for obtaining parallelizable nodes into various sub-clusters. Each sub-cluster sequentially determines its list of parallelizable nodes based on its own redundancy rules and characteristics. The list of parallelizable nodes output by the previous sub-cluster is used as input for the next sub-cluster to obtain its output list of parallelizable nodes. Through iterative computation across all sub-clusters, the final list of parallelizable nodes is obtained. By decomposing the computation for obtaining parallelizable nodes into sub-clusters familiar with their own redundancy rules, this application simplifies the overall implementation, decouples overall control from specific rule computation, and improves the adaptability of the solution. It can easily adapt to changes in sub-clusters and conveniently obtain the list of parallelizable nodes in the distributed cluster, thus improving operational efficiency. The node processing method provided in this application has a wide range of applicable scenarios, commonly including batch node offline maintenance, restart, and online upgrades in distributed clusters.
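The chaining described here, where each sub-cluster's output list becomes the next sub-cluster's input, might look like the following sketch. It reuses the ten-node layout from the earlier example. The helper name `filter_by_sub_cluster` is hypothetical, and the redundancy values for sub-clusters 2 and 3 are assumed for illustration because the source gives a value only for sub-cluster 1.

```python
def filter_by_sub_cluster(candidates, members, min_online):
    """One sub-cluster's step: keep outside nodes, cap its own members by redundancy."""
    budget = max(len(members) - min_online, 0)
    own = [n for n in members if n in candidates][:budget]
    outside = [n for n in candidates if n not in members]
    return sorted(own + outside)


# (members in preferred execution order, min_online).
# The min_online values for sub-clusters 2 and 3 are assumptions.
sub_clusters = [
    ([1, 2, 3, 4, 5], 4),     # sub-cluster 1
    ([6, 7, 8, 9, 10], 3),    # sub-cluster 2
    ([1, 2, 3], 2),           # sub-cluster 3
]

batch = list(range(1, 11))    # all ten nodes are still unprocessed
for members, min_online in sub_clusters:
    batch = filter_by_sub_cluster(batch, members, min_online)
    print(batch)
# [1, 6, 7, 8, 9, 10]
# [1, 6, 7]
# [1, 6, 7]
```

Each step can only narrow the list it receives, so the final list satisfies every sub-cluster's rule without the control module knowing any of those rules.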
[0070] S14: Perform parallel processing on the nodes in one of the lists of nodes that can be processed in parallel sent by the last sub-cluster, and return to the step of determining the list of nodes to be operated on, until all nodes in the distributed cluster have completed processing.
[0071] After receiving the lists of nodes that can be processed in parallel from the last sub-cluster, the control module can process the nodes in one of those lists in parallel, thereby improving node processing efficiency. Processing the nodes of only one list avoids the repeated processing of nodes that could result from processing several lists at the same time.
[0072] After the nodes in one of the lists sent by the last sub-cluster have been processed in parallel, the process can return to the step of determining the list of nodes to be operated on, that is, to step S11, until all nodes in the distributed cluster have completed their operations.
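Steps S11 to S14 together form a loop in the control module, which can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: `plan_batches` and `filter_by_sub_cluster` are hypothetical names, "processing" a batch is reduced to recording it, nodes are assumed to recover before the next round, and the redundancy values for sub-clusters 2 and 3 are invented for the example.

```python
def filter_by_sub_cluster(candidates, members, min_online):
    """One sub-cluster's step: keep outside nodes, cap its own members by redundancy."""
    budget = max(len(members) - min_online, 0)
    own = [n for n in members if n in candidates][:budget]
    outside = [n for n in candidates if n not in members]
    return sorted(own + outside)


def plan_batches(all_nodes, sub_clusters):
    """Repeat S11-S14 until every node has been processed; return the batches in order."""
    processed = set()
    batches = []
    while len(processed) < len(all_nodes):
        # S11: the list of nodes to be operated on holds the unprocessed nodes.
        pending = [n for n in all_nodes if n not in processed]
        # S12/S13: pass the list through every sub-cluster in turn.
        batch = pending
        for members, min_online in sub_clusters:
            batch = filter_by_sub_cluster(batch, members, min_online)
        if not batch:
            raise RuntimeError("no node can be taken offline under the current rules")
        # S14: process the batch in parallel (only recorded here), then loop again.
        batches.append(batch)
        processed.update(batch)
    return batches


sub_clusters = [
    ([1, 2, 3, 4, 5], 4),     # sub-cluster 1 (redundancy from the text)
    ([6, 7, 8, 9, 10], 3),    # sub-cluster 2 (assumed redundancy)
    ([1, 2, 3], 2),           # sub-cluster 3 (assumed redundancy)
]
batches = plan_batches(list(range(1, 11)), sub_clusters)
print(batches)  # [[1, 6, 7], [2, 8, 9], [3, 10], [4], [5]]
```

With these assumed rules, ten nodes are handled in five rounds instead of ten, and the empty-batch check guards against rules that would otherwise make the loop spin forever.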
[0073] As described above, compared to existing methods that simply process each node one by one, this application performs parallel processing on all nodes in the final list of parallelizable nodes to improve node processing efficiency. Compared to existing methods that ignore the rule requirements of some complex sub-clusters and simplify processing by adding restrictions to the software module deployment mode, this application performs iterative calculations on all sub-clusters in the distributed cluster, considering the rule requirements of each sub-cluster during iterative calculations, and imposes no restrictions on the deployment mode of the sub-clusters. Compared to existing methods that collect the rule and node information of all sub-clusters and perform complex algorithm calculations to obtain a batch of nodes that can be operated in parallel, this application decomposes the calculation of obtaining parallelizable nodes into each sub-cluster that is familiar with its own redundant rules, simplifying the overall implementation, decoupling the overall control from the specific rule calculation, and considering the needs of each sub-cluster while adapting well to the evolution of the distributed cluster (the addition of new sub-clusters, the departure of sub-clusters, etc.).
[0074] In the technical solution disclosed in this application, a list of nodes to be operated on, containing the unprocessed nodes in the distributed cluster, is determined and sent to one of the sub-clusters in the distributed cluster. That sub-cluster determines a list of nodes that can be processed in parallel based on the list of nodes to be operated on, its own redundancy, and the optimal execution order of its nodes. The list received from that sub-cluster is then sent to the next sub-cluster, which determines a new list of nodes that can be processed in parallel based on the received list, its own redundancy, and the optimal execution order of its nodes, and so on, until the last sub-cluster sends its list of nodes that can be processed in parallel. The nodes in one of the lists sent by the last sub-cluster are then processed in parallel, and the process returns to the step of determining the list of nodes to be operated on, until all nodes in the distributed cluster have completed processing. This process delegates the computation of parallelizable nodes to the sub-clusters, each of which is familiar with its own redundancy rules. Doing so simplifies the overall implementation, decouples overall control from the rule-specific calculations, and ensures that every sub-cluster participates in determining the list of parallelizable nodes. The rule requirements of no sub-cluster are ignored and the deployment method of the sub-clusters is not restricted, so the solution applies to sub-clusters deployed in any way. Furthermore, this application does not need to collect the rule and node information of all sub-clusters for a complex algorithmic calculation, so it is relatively simple to implement while still accounting for the needs of every sub-cluster. When a new sub-cluster joins, it can directly participate in determining the list of parallelizable nodes and in the final node processing in the same way, which gives the solution high flexibility. In addition, because the nodes in one of the parallelizable node lists sent by the last sub-cluster are processed in parallel, multiple nodes are processed at once, which improves node processing efficiency.
[0075] The node processing method provided in this application embodiment may further include:
[0076] Determine the priority of all sub-clusters;
[0077] Sending the list of nodes to be operated on to one of the sub-clusters of the distributed cluster can include:
[0078] Send the list of nodes to be operated on to the highest priority sub-cluster in the distributed cluster;
[0079] Sending the list of nodes that can be processed in parallel to the next sub-cluster may include:
[0080] The list of nodes that can be processed in parallel is sent to the next sub-cluster in descending order of priority.
[0081] In this application, the control module can also determine the priority of all sub-clusters in the distributed cluster. For example, the priority of each sub-cluster can be predefined in the control module, and the priority of each sub-cluster can be defined according to the redundancy, importance, etc. of each sub-cluster.
[0082] After determining the priorities of the sub-clusters, the control module can send the list of nodes to be operated on to the highest-priority sub-cluster, so that this sub-cluster is the first to determine a list of nodes that can be processed in parallel. Likewise, each list of nodes that can be processed in parallel is sent to the next sub-cluster in descending order of priority. In other words, the iterative calculation proceeds from the highest-priority sub-cluster to the lowest, so that the node processing order suggested by high-priority sub-clusters is preserved as far as possible.
[0083] The above process enables the control module to determine the priority of the sub-clusters in each loop and to determine the nodes that can be processed in parallel in descending order of sub-cluster priority.
[0084] In the node processing method provided in this application, determining the priority of all sub-clusters may include:
[0085] Obtain the priority of each sub-cluster from each sub-cluster.
[0086] In this application, when determining the priority of all sub-clusters, each sub-cluster can provide its own priority to improve the rationality of priority determination. Then, the control module obtains the priority of each sub-cluster from each sub-cluster.
[0087] The above methods can improve the rationality of determining the priority of each sub-cluster.
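As an illustration of sub-clusters reporting their own priorities, the following Python sketch lets each sub-cluster compute a priority and the control module collect them. The class `SubCluster`, the method `report_priority`, and the scoring rule are hypothetical; the application only states that priority may depend on factors such as redundancy and importance.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class SubCluster:
    name: str
    redundancy: int   # assumed meaning: members that may be offline at once
    importance: int   # assumed meaning: larger is more important

    def report_priority(self) -> int:
        # Assumed rule for this sketch: more important and less redundant
        # sub-clusters rank higher.
        return self.importance * 10 - self.redundancy


def collect_priorities(subclusters: List[SubCluster]) -> Dict[str, int]:
    """Control-module side: ask every sub-cluster for its own priority."""
    return {sc.name: sc.report_priority() for sc in subclusters}


def descending_order(subclusters: List[SubCluster]) -> List[str]:
    """Names of the sub-clusters, highest priority first."""
    prio = collect_priorities(subclusters)
    return sorted(prio, key=lambda name: prio[name], reverse=True)


scs = [
    SubCluster("gateway", redundancy=3, importance=1),
    SubCluster("monitoring", redundancy=1, importance=3),
    SubCluster("storage", redundancy=2, importance=2),
]
print(descending_order(scs))  # ['monitoring', 'storage', 'gateway']
```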
[0088] In the node processing method provided by this application, receiving the list of nodes that can be processed in parallel sent by a sub-cluster may include:
[0089] Receiving all lists of nodes that can be processed in parallel sent by the sub-cluster;
[0090] Performing parallel processing on the nodes in one of the lists of nodes that can be processed in parallel sent by the last sub-cluster may include:
[0091] Performing parallel processing on the nodes in the optimal list of nodes that can be processed in parallel sent by the last sub-cluster.
[0092] In this application, when receiving the list of parallelizable nodes sent by the sub-cluster, it is possible to receive all the parallelizable node lists sent by the sub-cluster. That is, each sub-cluster sends all possible parallelizable node lists to the control module, so that the control module can obtain all the parallelizable node lists.
[0093] Furthermore, when the control module sends the list of nodes that can be processed in parallel to the next sub-cluster, it can send every possible list returned by the previous sub-cluster. For example, as shown in Figure 3, sub-cluster 1 has n possible lists of nodes that can be processed in parallel (i.e., batch 1, batch 2, ..., batch n in Figure 3). The cluster monitoring subsystem mentioned earlier with reference to Figure 1 can place its master node in batch n (i.e., in the n-th list of nodes that can be processed in parallel) and then send all n lists to the control module. The control module can then send each list corresponding to sub-cluster 1 to sub-cluster 2, and sub-cluster 2 can determine its own lists of nodes that can be processed in parallel from each list corresponding to sub-cluster 1. Specifically, as shown in Figure 3, the control module sends the node list of batch x corresponding to sub-cluster 1 to sub-cluster 2, and sub-cluster 2 can return batch x.1, batch x.2, ..., batch x.m. In this way, the node batches returned by the previous sub-cluster serve as input parameters for determining the node batches of the next sub-cluster.
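The batch x to batch x.1, x.2, ... expansion can be sketched in Python as below. The helper names (`first_level`, `expand`) and the two example sub-cluster rules are assumptions for illustration; only the labelling scheme follows the description of Figure 3.

```python
from typing import Callable, Dict, List

# Hypothetical modelling: a sub-cluster maps one input list to every batch
# (list of nodes that can be processed in parallel) it would accept.
AllBatchesFn = Callable[[List[str]], List[List[str]]]


def first_level(pending: List[str], subcluster: AllBatchesFn) -> Dict[str, List[str]]:
    """Batches 1..n returned by the first sub-cluster, keyed by label."""
    return {str(i): b for i, b in enumerate(subcluster(pending), start=1)}


def expand(labelled: Dict[str, List[str]], subcluster: AllBatchesFn) -> Dict[str, List[str]]:
    """Feed each batch x to the next sub-cluster, yielding batches x.1..x.m."""
    out: Dict[str, List[str]] = {}
    for label, nodes in labelled.items():
        for i, batch in enumerate(subcluster(nodes), start=1):
            out[f"{label}.{i}"] = batch
    return out


# Illustrative rules only (made up for this sketch).
def sub1(nodes: List[str]) -> List[List[str]]:
    # Accepts the nodes two at a time.
    return [nodes[i:i + 2] for i in range(0, len(nodes), 2)]


def sub2(nodes: List[str]) -> List[List[str]]:
    # Only one of its members may be processed at a time; non-members are free.
    members = {"b", "c", "d"}
    free = [n for n in nodes if n not in members]
    mine = [n for n in nodes if n in members]
    if not mine:
        return [free]
    return [free + [m] for m in mine]


level1 = first_level(["a", "b", "c", "d"], sub1)
level2 = expand(level1, sub2)
print(level1)  # {'1': ['a', 'b'], '2': ['c', 'd']}
print(level2)  # {'1.1': ['a', 'b'], '2.1': ['c'], '2.2': ['d']}
```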
[0094] Since each sub-cluster returns all possible lists of nodes that can be processed in parallel to the control module, when the control module performs parallel processing on the nodes in one of the lists sent by the last sub-cluster, it can specifically process the nodes in the optimal list sent by the last sub-cluster. The optimal list can be the list containing the largest number of nodes (if more than one list has the largest number of nodes, one of them can be selected at random, for example), or it can be chosen according to other criteria.
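The "largest list, random tie-break" criterion mentioned above can be written as a short Python sketch. The function name `pick_best` and the optional `rng` parameter are assumptions for the example; other selection criteria are equally possible.

```python
import random
from typing import List, Optional


def pick_best(lists: List[List[str]], rng: Optional[random.Random] = None) -> List[str]:
    """Return a list with the most nodes; ties are broken at random."""
    if not lists:
        raise ValueError("the last sub-cluster returned no list")
    rng = rng or random.Random()
    largest = max(len(nodes) for nodes in lists)
    tied = [nodes for nodes in lists if len(nodes) == largest]
    return rng.choice(tied)


best = pick_best([["a", "b"], ["c"], ["d"]])
print(best)  # ['a', 'b']
```

Choosing the largest list maximizes the number of nodes handled in the current round, which tends to reduce the total number of rounds.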
[0095] In the node processing method provided by this application, receiving the list of nodes that can be processed in parallel sent by a sub-cluster may include:
[0096] Receiving one list of nodes that can be processed in parallel sent by the sub-cluster.
[0097] When there are many nodes and many sub-clusters, having every sub-cluster send all of its lists of nodes that can be processed in parallel to the control module would result in a large number of execution loops, a large number of lists, and a high computational load. To simplify this, each sub-cluster can send only one list of nodes that can be processed in parallel to the control module (which one to send can be determined by the sub-cluster itself or by the next sub-cluster). The control module then forwards this list to the next sub-cluster, and so on. Finally, the control module receives the single list returned by the last sub-cluster and processes the nodes in it. Although the list returned by the last sub-cluster may not be optimal, this approach significantly reduces the processing load.
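To make the difference in load concrete, the following Python sketch counts interface calls per round under a simplifying assumption that is not stated in this application: every sub-cluster returns exactly m lists for each list it receives, and there are k sub-clusters. The function names are hypothetical.

```python
def calls_all_lists(k: int, m: int) -> int:
    """Calls per round when every returned list is forwarded: 1 + m + ... + m**(k-1)."""
    return sum(m ** i for i in range(k))


def calls_single_list(k: int) -> int:
    """Calls per round when each sub-cluster returns exactly one list."""
    return k


print(calls_all_lists(4, 3))   # 40
print(calls_single_list(4))    # 4
```

Under this assumption the all-lists mode grows geometrically with the number of sub-clusters, while the single-list mode grows linearly.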
[0098] In the node processing method provided by this application, sending the list of nodes to be operated on to one of the sub-clusters in the distributed cluster may include:
[0099] Call the "Get Node Operation Batch" interface of one of the sub-clusters and send the list of nodes to be operated to the sub-cluster through the "Get Node Operation Batch" interface.
[0100] Sending the list of nodes that can be processed in parallel to the next sub-cluster may include:
[0101] Call the "Get Node Operation Batch" interface of the next sub-cluster to send the list of nodes to be operated to the sub-cluster.
[0102] In this application, each sub-cluster can implement a uniformly defined "Get Node Operation Batch" interface according to its own redundancy, optimal node execution order, and other requirements. The interface returns one or more lists of nodes that can be processed in parallel, such that the normal operation of the sub-cluster's functions is not affected even if all nodes in a returned list go offline at the same time.
[0103] When the control module sends the list of nodes to be operated to one of the sub-clusters in the distributed cluster, it can call the sub-cluster's "Get Node Operation Batch" interface to send the list of nodes to be operated to that sub-cluster through the "Get Node Operation Batch" interface. The control module can also retrieve the list of nodes that can be processed in parallel from that sub-cluster through the "Get Node Operation Batch" interface. Furthermore, when the control module sends the list of nodes that can be processed in parallel to the next sub-cluster, it can also call the "Get Node Operation Batch" interface of that sub-cluster (i.e., the aforementioned next sub-cluster) to send the list of nodes that can be processed in parallel to that sub-cluster through the "Get Node Operation Batch" interface. The control module can also retrieve the list of nodes that can be processed in parallel from that sub-cluster through the "Get Node Operation Batch" interface.
[0104] In other words, the control module can pass in the input list and obtain the output list for each sub-cluster by calling that sub-cluster's "Get Node Operation Batch" interface, thereby improving the convenience and efficiency of the interaction between the control module and the sub-clusters.
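One possible shape for such a uniformly defined interface is sketched below in Python. The method name `get_node_operation_batch`, the class `ReplicatedService`, and its batching rule (at most `redundancy` members per batch, master last) are assumptions for illustration and are not taken from this application.

```python
from typing import List, Protocol, Set


class SubClusterAPI(Protocol):
    """Hypothetical uniform interface that every sub-cluster implements."""

    def get_node_operation_batch(self, nodes: List[str]) -> List[List[str]]:
        """Return one or more lists of nodes that can be processed in parallel."""
        ...


class ReplicatedService:
    """Example sub-cluster: at most `redundancy` members offline at once."""

    def __init__(self, members: Set[str], redundancy: int, master: str = "") -> None:
        self.members = members
        self.redundancy = redundancy
        self.master = master

    def get_node_operation_batch(self, nodes: List[str]) -> List[List[str]]:
        free = [n for n in nodes if n not in self.members]   # not our nodes
        mine = [n for n in nodes if n in self.members]
        # Assumed "optimal execution order": handle the master last.
        mine.sort(key=lambda n: n == self.master)
        step = max(1, self.redundancy)
        chunks = [mine[i:i + step] for i in range(0, len(mine), step)]
        if not chunks:
            return [free]
        return [free + chunk for chunk in chunks]


def call_subcluster(api: SubClusterAPI, nodes: List[str]) -> List[List[str]]:
    """Control-module side: one call sends the input and retrieves the output."""
    return api.get_node_operation_batch(nodes)


svc = ReplicatedService({"n1", "n2", "n3"}, redundancy=1, master="n1")
batches = call_subcluster(svc, ["n1", "n2", "n3", "n4"])
print(batches)  # [['n4', 'n2'], ['n4', 'n3'], ['n4', 'n1']]
```

Because the control module depends only on the interface, each sub-cluster can change its internal rules without any change to the control module.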
[0105] The node processing method provided in this application embodiment, after performing parallel processing on the nodes in one of the lists of nodes that can be processed in parallel sent by the last sub-cluster, may further include:
[0106] Displaying the node information of the nodes that were processed in parallel.
[0107] In this application, after the nodes in one of the lists of nodes that can be processed in parallel sent by the last sub-cluster have been processed in parallel, the control module can also display the node information (node name, node number, etc.) of those nodes, so that relevant personnel can see from the display which nodes have been processed.
[0108] This application also provides a node processing apparatus. Referring to Figure 4, which shows a schematic structural diagram of a node processing apparatus provided in an embodiment of this application, the apparatus may include:
[0109] The first determining module 41 is used to determine the list of nodes to be operated on; the list of nodes to be operated on contains unprocessed nodes in the distributed cluster.
[0110] The first sending module 42 is used to send the list of nodes to be operated to one of the sub-clusters in the distributed cluster, and the sub-cluster determines the list of nodes that can be processed in parallel based on the list of nodes to be operated, its own redundancy, and the optimal execution order of the nodes.
[0111] The receiving module 43 is used to receive the list of nodes that can be processed in parallel sent by the sub-cluster, and send the list of nodes that can be processed in parallel to the next sub-cluster, until it receives the list of nodes that can be processed in parallel sent by the last sub-cluster.
[0112] The parallel processing module 44 is used to process the nodes in one of the parallel processing node lists sent by the last sub-cluster in parallel, and return to the step of determining the list of nodes to be operated on, until all nodes in the distributed cluster have completed processing.
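The cooperation of the four modules can be sketched as one control loop. This minimal Python example uses the single-list mode and models each sub-cluster as a function. The names `process_all`, `master_last`, and `at_most_two` are hypothetical, and the thread pool is merely one way to stand in for parallel node processing.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List

BatchFn = Callable[[List[str]], List[str]]  # hypothetical sub-cluster model


def process_all(
    nodes: List[str],
    subclusters: List[BatchFn],
    process: Callable[[str], None],
) -> List[List[str]]:
    """Repeat: determine pending list, chain sub-clusters, process the batch."""
    pending = list(nodes)                      # first determining module
    rounds: List[List[str]] = []
    while pending:
        batch = list(pending)
        for subcluster in subclusters:         # sending and receiving modules
            batch = subcluster(batch)
        if not batch:
            raise RuntimeError("no node can be processed in this round")
        with ThreadPoolExecutor() as pool:     # parallel processing module
            list(pool.map(process, batch))
        rounds.append(batch)
        pending = [n for n in pending if n not in batch]
    return rounds


# Illustrative rules only (made up for this sketch).
def master_last(nodes: List[str]) -> List[str]:
    rest = [n for n in nodes if n != "n1"]
    return rest or nodes


def at_most_two(nodes: List[str]) -> List[str]:
    return nodes[:2]


done: List[str] = []
rounds = process_all(["n1", "n2", "n3", "n4"], [master_last, at_most_two], done.append)
print(rounds)  # [['n2', 'n3'], ['n4'], ['n1']]
```

The guard against an empty batch is an addition of this sketch: it prevents an endless loop if the sub-cluster rules ever leave no node eligible.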
[0113] The node processing apparatus provided in this application embodiment may further include:
[0114] The second determining module is used to determine the priority of all sub-clusters;
[0115] The first sending module 42 may include:
[0116] The first sending unit is used to send the list of nodes to be operated to the highest priority sub-cluster in the distributed cluster.
[0117] The receiving module 43 may include:
[0118] The second sending unit is used to send the list of nodes that can be processed in parallel to the next sub-cluster in descending order of priority.
[0119] In the node processing apparatus provided by this application, the second determining module may include:
[0120] The acquisition unit is used to obtain the priority of each sub-cluster from each sub-cluster.
[0121] In the node processing apparatus provided by this application, the receiving module 43 may include:
[0122] The first receiving unit is used to receive the list of all parallelizable nodes sent by the sub-cluster;
[0123] Parallel processing module 44 may include:
[0124] The parallel processing unit is used to process the nodes in the best list of parallelizable nodes sent by the last sub-cluster in parallel.
[0125] In the node processing apparatus provided by this application, the receiving module 43 may include:
[0126] The second receiving unit is used to receive one list of nodes that can be processed in parallel sent by the sub-cluster.
[0127] In the node processing apparatus provided by this application, the first sending module 42 may include:
[0128] The third sending unit is used to call the node operation batch acquisition interface of one of the sub-clusters and send the list of nodes to be operated to the sub-cluster through the node operation batch acquisition interface.
[0129] The receiving module 43 may include:
[0130] The fourth sending unit is used to call the node operation batch acquisition interface of the next sub-cluster and send the list of nodes to be operated to the sub-cluster through the node operation batch acquisition interface.
[0131] The node processing apparatus provided in this application embodiment may further include:
[0132] The display module is used to display the node information of the nodes that were processed in parallel, after the nodes in one of the lists of nodes that can be processed in parallel sent by the last sub-cluster have been processed in parallel.
[0133] This application also provides a node processing device. Referring to Figure 5, which shows a schematic structural diagram of a node processing device provided in an embodiment of this application, the device may include:
[0134] Memory 51 is used to store computer programs;
[0135] When processor 52 executes a computer program stored in memory 51, it can perform the following steps:
[0136] Determining a list of nodes to be operated on, the list containing unprocessed nodes in the distributed cluster; sending the list of nodes to be operated on to one of the sub-clusters in the distributed cluster, the sub-cluster determining a list of nodes that can be processed in parallel based on the list of nodes to be operated on, its own redundancy, and the optimal execution order of the nodes; receiving the list of nodes that can be processed in parallel sent by the sub-cluster and sending it to the next sub-cluster, until the list of nodes that can be processed in parallel sent by the last sub-cluster is received; performing parallel processing on the nodes in one of the lists of nodes that can be processed in parallel sent by the last sub-cluster, and returning to the step of determining the list of nodes to be operated on, until all nodes in the distributed cluster have completed processing.
[0137] This application embodiment also provides a readable storage medium storing a computer program, which, when executed by a processor, can perform the following steps:
[0138] Determining a list of nodes to be operated on, the list containing unprocessed nodes in the distributed cluster; sending the list of nodes to be operated on to one of the sub-clusters in the distributed cluster, the sub-cluster determining a list of nodes that can be processed in parallel based on the list of nodes to be operated on, its own redundancy, and the optimal execution order of the nodes; receiving the list of nodes that can be processed in parallel sent by the sub-cluster and sending it to the next sub-cluster, until the list of nodes that can be processed in parallel sent by the last sub-cluster is received; performing parallel processing on the nodes in one of the lists of nodes that can be processed in parallel sent by the last sub-cluster, and returning to the step of determining the list of nodes to be operated on, until all nodes in the distributed cluster have completed processing.
[0139] The readable storage medium includes various media capable of storing program code, such as USB flash drives, portable hard drives, read-only memory (ROM), random access memory (RAM), magnetic disks, or optical disks.
[0140] For a description of the relevant parts of the node processing apparatus, device and readable storage medium provided in the embodiments of this application, please refer to the detailed description of the relevant parts of the node processing method provided in the embodiments of this application, which will not be repeated here.
[0141] It should be noted that, in this document, relational terms such as "first" and "second" are used merely to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between these entities or operations. Furthermore, the terms "comprising," "including," or any other variations thereof are intended to cover non-exclusive inclusion, such that a process, method, article, or apparatus that includes a list of elements also includes the elements inherent to it. Without further limitation, an element defined by the phrase "comprising a..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that includes said element. Additionally, portions of the technical solutions provided in the embodiments of this application that are consistent with the implementation principles of corresponding technical solutions in the prior art have not been described in detail, to avoid excessive elaboration.
[0142] The above description of the disclosed embodiments enables those skilled in the art to make or use this application. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the general principles defined herein may be implemented in other embodiments without departing from the spirit or scope of this application. Therefore, this application is not to be limited to the embodiments shown herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
Claims
1. A node processing method, characterized in that it comprises: determining a list of nodes to be operated on, the list of nodes to be operated on containing unprocessed nodes in a distributed cluster; sending the list of nodes to be operated on to one of the sub-clusters in the distributed cluster, the sub-cluster determining a list of nodes that can be processed in parallel based on the list of nodes to be operated on, its own redundancy, and the optimal execution order of the nodes; receiving the list of nodes that can be processed in parallel sent by the sub-cluster, and sending the list of nodes that can be processed in parallel to the next sub-cluster, until the list of nodes that can be processed in parallel sent by the last sub-cluster is received; performing parallel processing on the nodes in one of the lists of nodes that can be processed in parallel sent by the last sub-cluster, and returning to the step of determining the list of nodes to be operated on, until all nodes in the distributed cluster have completed processing; wherein receiving the list of nodes that can be processed in parallel sent by the sub-cluster, and sending the list of nodes that can be processed in parallel to the next sub-cluster, until the list of nodes that can be processed in parallel sent by the last sub-cluster is received, comprises: receiving a first list of nodes that can be processed in parallel sent by a first sub-cluster, and sending the first list of nodes that can be processed in parallel to a second sub-cluster, the second sub-cluster determining a new list of nodes that can be processed in parallel based on the received first list of nodes that can be processed in parallel, its own redundancy, and the optimal execution order of the nodes, until the list of nodes that can be processed in parallel sent by the last sub-cluster is received.
2. The node processing method according to claim 1, characterized in that it further comprises: determining the priority of all the sub-clusters; wherein sending the list of nodes to be operated on to one of the sub-clusters in the distributed cluster comprises: sending the list of nodes to be operated on to the highest-priority sub-cluster in the distributed cluster; and sending the list of nodes that can be processed in parallel to the next sub-cluster comprises: sending the list of nodes that can be processed in parallel to the next sub-cluster in descending order of priority.
3. The node processing method according to claim 2, characterized in that determining the priority of all the sub-clusters comprises: obtaining the priority of each of the sub-clusters from each of the sub-clusters.
4. The node processing method according to claim 1, characterized in that receiving the list of nodes that can be processed in parallel sent by the sub-cluster comprises: receiving all lists of nodes that can be processed in parallel sent by the sub-cluster; and performing parallel processing on the nodes in one of the lists of nodes that can be processed in parallel sent by the last sub-cluster comprises: performing parallel processing on the nodes in the optimal list of nodes that can be processed in parallel sent by the last sub-cluster.
5. The node processing method according to claim 1, characterized in that receiving the list of nodes that can be processed in parallel sent by the sub-cluster comprises: receiving one list of nodes that can be processed in parallel sent by the sub-cluster.
6. The node processing method according to claim 1, characterized in that sending the list of nodes to be operated on to one of the sub-clusters in the distributed cluster comprises: calling the node operation batch acquisition interface of one of the sub-clusters, so as to send the list of nodes to be operated on to the sub-cluster through the node operation batch acquisition interface; and sending the list of nodes that can be processed in parallel to the next sub-cluster comprises: calling the node operation batch acquisition interface of the next sub-cluster, so as to send the list of nodes to be operated on to the sub-cluster through the node operation batch acquisition interface.
7. The node processing method according to claim 1, characterized in that, after performing parallel processing on the nodes in one of the lists of nodes that can be processed in parallel sent by the last sub-cluster, the method further comprises: displaying the node information of the nodes that were processed in parallel.
8. A node processing apparatus, characterized in that it comprises: a first determining module, used to determine a list of nodes to be operated on, the list of nodes to be operated on containing unprocessed nodes in a distributed cluster; a first sending module, used to send the list of nodes to be operated on to one of the sub-clusters in the distributed cluster, the sub-cluster determining a list of nodes that can be processed in parallel based on the list of nodes to be operated on, its own redundancy, and the optimal execution order of the nodes; a receiving module, used to receive the list of nodes that can be processed in parallel sent by the sub-cluster, and to send the list of nodes that can be processed in parallel to the next sub-cluster, until the list of nodes that can be processed in parallel sent by the last sub-cluster is received; and a parallel processing module, used to perform parallel processing on the nodes in one of the lists of nodes that can be processed in parallel sent by the last sub-cluster, and to return to the step of determining the list of nodes to be operated on, until all nodes in the distributed cluster have completed processing; wherein the receiving module is specifically used to receive a first list of nodes that can be processed in parallel sent by a first sub-cluster, and to send the first list of nodes that can be processed in parallel to a second sub-cluster, the second sub-cluster determining a new list of nodes that can be processed in parallel based on the received first list of nodes that can be processed in parallel, its own redundancy, and the optimal execution order of the nodes, until the list of nodes that can be processed in parallel sent by the last sub-cluster is received.
9. A node processing device, characterized in that it comprises: a memory, used to store a computer program; and a processor, used to execute the computer program to implement the steps of the node processing method according to any one of claims 1 to 7.
10. A readable storage medium, characterized in that the readable storage medium stores a computer program that, when executed by a processor, implements the steps of the node processing method according to any one of claims 1 to 7.