Task scheduling method, system and computer readable storage medium

By sharding and distributing the initial node set, the problem of insufficient efficiency and accuracy of existing task scheduling systems in handling complex tasks is solved, achieving high efficiency and high practicality in task scheduling.

CN116302419B · Active · Publication Date: 2026-06-23 · CHINA MERCHANTS BANK

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Patents(China)
Current Assignee / Owner: CHINA MERCHANTS BANK
Filing Date: 2023-02-28
Publication Date: 2026-06-23

AI Technical Summary

Technical Problem

Existing task scheduling systems are inefficient and inaccurate when handling complex tasks, and are difficult to scale dynamically, resulting in insufficient practicality.

Method used

The initial node group set is sharded by a splitting service node, and the task data is scheduled in a distributed manner by a scheduling selection service node and an execution proxy service node. Execution parameters are obtained through middleware, which improves the efficiency and accuracy of task scheduling.

Benefits of technology

It enables distributed processing of task data, improves the efficiency and accuracy of complex task scheduling, and enhances the practicality of task scheduling.



Abstract

The application discloses a task scheduling method, system and computer readable storage medium, the method comprises the following steps: obtaining task data and resource data of an initial node group set through a segmentation service node, and performing segmentation on the initial node group set according to the task data, the resource data and a preset shard set to obtain a segmentation result; obtaining execution parameters of the task data through a scheduling selection service node and middleware, and performing scheduling on the task data according to the execution parameters and the segmentation result through the scheduling selection service node and an execution agent service node. The application distributes task data in the initial node group set to corresponding shards by performing segmentation on the initial node group set, so that the task data can be processed in a distributed manner, the efficiency and accuracy of processing complex task scheduling are improved, and the practicality of task scheduling is improved.

Description

Technical Field

[0001] This invention relates to the field of artificial intelligence technology, and in particular to a task scheduling method, system and computer-readable storage medium.

Background Technology

[0002] As enterprises increasingly automate their processes, the importance of task scheduling grows, necessitating the use of task scheduling systems. Currently, task scheduling is primarily based on commercial systems like Control-M or open-source systems such as Airflow and Dolphin. However, Control-M's architecture is not distributed, so it cannot scale dynamically as task volume changes, which makes expansion difficult. While Airflow and Dolphin employ distributed architectures, their scheduling options are limited, leading to low efficiency and accuracy in handling complex tasks and thus limiting the practicality of existing task scheduling solutions. Therefore, improving the practicality of task scheduling is an urgent issue that needs to be addressed.

Summary of the Invention

[0003] The main objective of this invention is to propose a task scheduling method, system, and computer-readable storage medium, aiming to solve the problem of how to improve the practicality of task scheduling.

[0004] To achieve the above objectives, the present invention provides a task scheduling method, which is applied to a task scheduling system. The task scheduling system includes: a segmentation service node, a scheduling selection service node, an execution proxy service node, and middleware. The task scheduling method includes the following steps:

[0005] The task data and resource data of the initial node group set are obtained through the splitting service node, and the initial node group set is split according to the task data, the resource data and the preset shard set to obtain the sharding result;

[0006] The execution parameters of the task data are obtained through the scheduling selection service node and the middleware, and the task data is scheduled according to the execution parameters and the sharding result through the scheduling selection service node and the execution proxy service node.

[0007] Optionally, the step of obtaining task data and resource data of the initial node group set through the splitting service node, and splitting the initial node group set according to the task data, the resource data, and the preset shard set to obtain the sharding result includes:

[0008] The resource data of each initial node group in the initial node group set is obtained through the splitting service node, and the initial node groups in the initial node group set are merged according to the resource data to obtain the target node group set.

[0009] The task data of the initial node group set is obtained through the splitting service node, and the target node group set is split according to the task data and the preset number of shards to obtain the sharding result.

[0010] Optionally, the step of merging the initial node groups in the initial node group set according to the resource data to obtain the target node group set includes:

[0011] The current initial node group is determined from the initial node group set by the segmentation service node;

[0012] The resource data of the current initial node group is compared with the pre-created set of processed resource data through the segmentation service node;

[0013] If the resource data does not intersect with the processed resource data set, the resource data is stored in the processed resource data set through the splitting service node;

[0014] If the resource data intersects with the set of processed resource data, the splitting service node obtains the corresponding processed node group from the pre-created set of processed node groups based on the intersection, merges the current initial node group with the processed node group to obtain a merged node group, and stores the merged node group in the set of processed node groups.

[0015] The current initial node group is updated through the segmentation service node, and the following steps are re-executed: the resource data of the current initial node group is compared with the pre-created set of processed resource data;

[0016] The process continues until all initial node groups in the initial node group set have been compared, and the target node group set is obtained based on the processed node group set.

[0017] Optionally, the step of splitting the target node group set according to the task data and the preset number of shards to obtain the sharding result includes:

[0018] The splitting service node calculates the number of tasks to be carried by each shard in the preset shard set based on the task data and the preset shard set.

[0019] The number of tasks for each target node in the target node group set is obtained through the splitting service node, and the target node group set is split according to the number of tasks and the number of tasks carried to obtain the sharding result.

[0020] Optionally, the steps of obtaining the number of tasks for each target node in the target node group set through the splitting service node, and splitting the target node group set according to the number of tasks and the number of tasks carried to obtain the sharding result include:

[0021] The current target node group is determined in the target node group set by the splitting service node, and the current shard is determined in the preset shard set;

[0022] The first difference between the number of tasks corresponding to the current target node group and the number of tasks carried by the current shard is calculated by the splitting service node, and the first difference is compared with a preset difference range.

[0023] If the first difference is within the preset difference range, the current target node group is allocated to the current shard through the splitting service node, and the steps of: determining the current target node group in the target node group set and determining the current shard in the preset shard set are executed again.

[0024] If the first difference is not within the preset difference range, the splitting service node calculates the sum of the number of tasks corresponding to the current target node group and the number of tasks corresponding to all pending node groups in the pre-created pending node group set. Based on the sum and the number of tasks carried by the current shard, either the current target node group and all pending node groups are allocated to the current shard, after which the step of determining the current target node group in the target node group set and the current shard in the preset shard set is re-executed; or the current target node group is updated, after which the step of calculating the first difference between the number of tasks corresponding to the current target node group and the number of tasks carried by the current shard is re-executed.

[0025] The sharding result is obtained when all target node groups in the target node group set have been allocated.

[0026] Optionally, the step of, based on the sum and the number of tasks carried by the current shard, either allocating the current target node group and all pending node groups to the current shard and re-executing the step of determining the current target node group in the target node group set and the current shard in the preset shard set, or updating the current target node group and re-executing the step of calculating the first difference between the number of tasks corresponding to the current target node group and the number of tasks carried by the current shard, includes:

[0027] The second difference between the sum and the number of tasks carried by the current shard is calculated by the sharding service node, and the second difference is compared with the preset difference range.

[0028] If the second difference is within the preset difference range, the current target node group and all pending node groups are allocated to the current shard through the splitting service node, and the pending node group set is cleared. Then the step of determining the current target node group in the target node group set and the current shard in the preset shard set is re-executed.

[0029] If the second difference is below the preset difference range, the current target node group is stored in the pending node group set through the splitting service node, and the current target node group is updated. Then the step of calculating the first difference between the number of tasks corresponding to the current target node group and the number of tasks carried by the current shard is re-executed.

[0030] If the second difference is above the preset difference range, the current target node group is retained in the target node group set by the splitting service node, and the current target node group is updated. Then the step of calculating the first difference between the number of tasks corresponding to the current target node group and the number of tasks carried by the current shard is re-executed.

[0031] Optionally, after the step of obtaining the sharding result after all target node groups in the target node group set have been allocated, the following steps are included:

[0032] The splitting service node detects whether there are any unassigned target node groups in the target node group set;

[0033] If they exist, the unassigned target node group is allocated to the corresponding shard according to the preset allocation strategy through the sharding service node, and the sharding result is updated.
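The allocation loop described in [0021] to [0033] can be illustrated with a minimal Python sketch. It is not the patented implementation. The function name allocate_groups, the symmetric form of the preset difference range (plus or minus tolerance), the rounding-up of the per-shard task count, and the fallback strategy for leftover groups (largest first, onto the least-loaded shard) are all assumptions made for illustration; the text does not specify them.

```python
from collections import deque


def allocate_groups(group_sizes, shard_count, tolerance):
    """Assign node groups to shards; returns {shard index: [group names]}.

    group_sizes maps a (hypothetical) group name to its number of tasks.
    shard_count is assumed to be a positive integer.
    """
    # Number of tasks each shard should carry (rounded up, an assumption).
    capacity = -(-sum(group_sizes.values()) // shard_count)
    shards = {i: [] for i in range(shard_count)}

    def in_range(diff):
        # Assumed form of the "preset difference range": [-tolerance, +tolerance].
        return -tolerance <= diff <= tolerance

    target = deque(group_sizes)  # target node group set, in iteration order
    pending = []                 # pending node group set
    retained = []                # groups skipped while filling the current shard
    shard = 0                    # index of the current shard

    def close_shard(names):
        """Allocate names to the current shard and move on to the next shard."""
        nonlocal shard, retained
        shards[shard].extend(names)
        shard += 1
        target.extendleft(reversed(retained))  # reconsider skipped groups
        retained = []

    while target and shard < shard_count:
        name = target.popleft()
        size = group_sizes[name]
        if in_range(size - capacity):          # first difference
            close_shard([name])
            continue
        total = size + sum(group_sizes[p] for p in pending)
        second = total - capacity              # second difference
        if in_range(second):
            close_shard(pending + [name])
            pending = []
        elif second < -tolerance:
            pending.append(name)               # below range: keep accumulating
        else:
            retained.append(name)              # above range: leave in target set

    # Assumed fallback for unassigned groups: largest first, least-loaded shard.
    load = {i: sum(group_sizes[n] for n in names) for i, names in shards.items()}
    leftover = retained + list(target) + pending
    for name in sorted(leftover, key=lambda n: -group_sizes[n]):
        lightest = min(load, key=load.get)
        shards[lightest].append(name)
        load[lightest] += group_sizes[name]
    return shards
```

For example, with five groups of 50, 30, 20, 60 and 40 tasks, two shards and a tolerance of 5, each shard should carry 100 tasks; the first three groups accumulate in the pending set until their sum reaches 100 and are allocated to the first shard, and the remaining two fill the second shard.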

[0034] Optionally, after obtaining the sharding results, the following steps are included:

[0035] The current time is obtained through the segmentation service node. If the current time has not reached the preset daily segmentation time, the task change is detected through the segmentation service node according to the preset time interval.

[0036] If a task change occurs, the resource information corresponding to the changed task is obtained through the splitting service node, and the changed task is allocated to the corresponding shard according to the resource information, or the changed task is added to the initial node group set through the splitting service node, and the steps are re-executed: obtaining the task data and resource data of the initial node group set, and splitting the initial node group set according to the task data, the resource data and the preset shard set to obtain the sharding result.
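The choice in [0036] between placing a changed task directly and re-running the split can be sketched as follows. The decision rule is an assumption: the text only says the changed task is allocated "according to the resource information", so this sketch places the task when its resources overlap exactly one shard and otherwise signals that a full re-split is needed. The names place_changed_task and shard_resources are hypothetical.

```python
from typing import Dict, Optional, Set


def place_changed_task(task_resources: Set[str],
                       shard_resources: Dict[int, Set[str]]) -> Optional[int]:
    """Return the shard for a changed task, or None if a re-split is needed.

    shard_resources maps a shard index to the resources already used by the
    node groups in that shard (an assumed bookkeeping structure).
    """
    # Shards whose resources overlap the resources of the changed task.
    hits = [sid for sid, res in shard_resources.items() if res & task_resources]
    if len(hits) == 1:
        # Unambiguous: keep the task next to the tasks sharing its resources.
        shard_resources[hits[0]] |= task_resources
        return hits[0]
    # No overlap, or overlap with several shards: add the task to the initial
    # node group set and re-execute the split (left to the caller).
    return None
```

Keeping a task in the shard that already holds its resources preserves the property that tasks sharing resources are handled by the same scheduling selection service node.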

[0037] In addition, to achieve the above objectives, the present invention also provides a task scheduling system, the task scheduling system comprising: a segmentation service node, a scheduling selection service node, an execution proxy service node, middleware, a memory, a processor, and a task scheduler stored in the memory and executable on the processor, wherein the task scheduler, when executed by the processor, implements the steps of the task scheduling method described above.

[0038] In addition, to achieve the above objectives, the present invention also provides a computer-readable storage medium storing a task scheduler, which, when executed by a processor, implements the steps of the task scheduling method described above.

[0039] The task scheduling method proposed in this invention obtains task data and resource data of an initial node group set through a segmentation service node, and segments the initial node group set according to the task data, the resource data, and a preset sharding set to obtain sharding results. The execution parameters of the task data are obtained through a scheduling selection service node and the middleware, and the task data is scheduled according to the execution parameters and the sharding results through the scheduling selection service node and the execution proxy service node. This invention, by sharding the initial node group set and distributing the task data within it to corresponding shards, enables distributed processing of task data, improves the efficiency and accuracy of handling complex task scheduling, and thus enhances the practicality of task scheduling.

Attached Figure Description

[0040] Figure 1 is a schematic diagram of the device structure of the hardware operating environment involved in the embodiments of the present invention;

[0041] Figure 2 is a flowchart illustrating the first embodiment of the task scheduling method of the present invention;

[0042] Figure 3 is a flowchart illustrating the second embodiment of the task scheduling method of the present invention;

[0043] Figure 4 is a flowchart illustrating the third embodiment of the task scheduling method of the present invention;

[0044] Figure 5 is a flowchart illustrating the fourth embodiment of the task scheduling method of the present invention;

[0045] Figure 6 is a schematic diagram of the task scheduling system of the present invention.

[0046] The realization of the objective of this invention, its functional features and advantages will be further explained with reference to the accompanying drawings through a series of embodiments.

Detailed Implementation

[0047] It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.

[0048] Figure 1 is a schematic diagram of the device structure of the hardware operating environment involved in the embodiments of the present invention.

[0049] The device in this embodiment of the invention can be a PC or a server.

[0050] As shown in Figure 1, the device may include: a processor 1001, such as a CPU; a network interface 1004; a user interface 1003; a memory 1005; and a communication bus 1002. The communication bus 1002 is used to enable communication between these components. The user interface 1003 may include a display screen or an input unit such as a keyboard; optionally, the user interface 1003 may also include a standard wired interface or a wireless interface. The network interface 1004 may optionally include a standard wired interface or a wireless interface (such as a Wi-Fi interface). The memory 1005 may be high-speed RAM or non-volatile memory, such as a disk drive. Optionally, the memory 1005 may also be a storage device independent of the aforementioned processor 1001.

[0051] Those skilled in the art will understand that the device structure shown in Figure 1 does not constitute a limitation on the device, which may include more or fewer components than shown, combine certain components, or have a different component arrangement.

[0052] As shown in Figure 1, the memory 1005, which serves as a computer storage medium, may include an operating system, a network communication module, a user interface module, and a task scheduler.

[0053] The operating system is a program that manages and controls the task scheduling system and software resources, and supports the operation of the network communication module, user interface module, task scheduler and other programs or software; the network communication module is used to manage and control the network interface 1004; the user interface module is used to manage and control the user interface 1003.

[0054] In the task scheduling system shown in Figure 1, the task scheduling system calls the task scheduler stored in the memory 1005 through the processor 1001 and executes the operations in the various embodiments of the task scheduling method described below.

[0055] Based on the above hardware structure, an embodiment of the task scheduling method of the present invention is proposed.

[0056] As shown in Figure 2, which is a flowchart illustrating the first embodiment of the task scheduling method of the present invention, the method comprises:

[0057] Step S10: Obtain task data and resource data of the initial node group set through the split service node, and split the initial node group set according to the task data, the resource data and the preset sharding set to obtain the sharding result;

[0058] Step S20: Obtain the execution parameters of the task data through the scheduling selection service node and the middleware, and schedule the task data according to the execution parameters and the sharding result through the scheduling selection service node and the execution proxy service node.

[0059] The task scheduling method in this embodiment can be applied to the task scheduling system of a financial institution. The task scheduling system can run on terminal devices such as PCs and smart terminals. For ease of description, the task scheduling system is used as the example below. Through the splitting service node, the task scheduling system obtains the resource data of each initial node group in the initial node group set and merges the initial node groups according to the resource data to obtain the target node group set. Through the splitting service node, it then obtains the task data of the initial node group set and splits the target node group set according to the task data and a preset shard set to obtain the sharding result. Finally, the task scheduling system obtains the execution parameters of the task data through the scheduling selection service node and the middleware, and schedules the task data according to the execution parameters and the sharding result through the scheduling selection service node and the execution proxy service node.

[0060] The task scheduling method in this embodiment obtains task data and resource data of an initial node group set through the splitting service node, and splits the initial node group set according to the task data, the resource data, and a preset shard set to obtain the sharding result. It then obtains the execution parameters of the task data through the scheduling selection service node and the middleware, and schedules the task data according to the execution parameters and the sharding result through the scheduling selection service node and the execution proxy service node. By sharding the initial node group set and distributing the task data within it to the corresponding shards, this invention enables distributed processing of task data and improves the efficiency and accuracy of handling complex task scheduling, thereby enhancing the practicality of task scheduling.

[0061] The following will provide a detailed explanation of each step:

[0062] Step S10: Obtain task data and resource data of the initial node group set through the split service node, and split the initial node group set according to the task data, the resource data and the preset sharding set to obtain the sharding result;

[0063] In this embodiment, the task scheduling system includes a splitting service node. The task scheduling system obtains an initial node group set through the splitting service node. Each initial node group in the initial node group set contains a task data storage structure and a resource data storage structure, from which the task scheduling system can obtain the corresponding task data and resource data. Through the splitting service node, the task scheduling system splits the initial node group set according to the task data, the resource data, and a preset shard set to obtain the sharding result. It should be noted that the preset shard set is configured in the task scheduling system in advance. Each shard can be understood as a server. The task scheduling system allocates the tasks corresponding to each node group to the corresponding shard, and all tasks in a shard are scheduled by the same scheduling selection service node.

[0064] Specifically, step S10 includes:

[0065] Step S101: Obtain resource data of each initial node group in the initial node group set through the split service node, and perform a merging operation on the initial node groups in the initial node group set according to the resource data to obtain the target node group set.

[0066] In this step, the task scheduling system, through the splitting service node, obtains the resource data of each initial node group from the resource data storage structure of that group, compares the resource data of the initial node groups with one another, and merges the initial node groups whose resource data overlap to obtain the target node group set.

[0067] Step S102: Obtain the task data of the initial node group set through the splitting service node, and perform a splitting operation on the target node group set according to the task data and the preset number of shards to obtain the sharding result.

[0068] In this step, the task scheduling system, through the splitting service node, obtains the task data of each initial node group from the task data storage structure of that group. Based on the task data of each initial node group, it determines the task data corresponding to each target node group in the target node group set. Based on the task data of each target node group and the preset shard set, the target node group set is split, and each target node group in the target node group set is assigned to a shard in the preset shard set to obtain the sharding result.

[0069] Further, step S102 includes:

[0070] Step S1021: The splitting service node calculates the number of tasks to be carried by each shard in the preset shard set based on the task data and the preset shard set.

[0071] In this step, the task scheduling system, through the splitting service node, determines the total number of tasks corresponding to the target node group set from the task data of each target node group in the set. Then, based on the total number of tasks and the number of shards in the preset shard set, it calculates the number of tasks that each shard in the preset shard set should carry. For example, if the total number of tasks corresponding to the target node group set is 1 million and the preset shard set contains 5 shards, the number of tasks that each shard should carry is 200,000.
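The arithmetic in this example can be written as a one-line helper. This is a minimal sketch: the function name tasks_per_shard is hypothetical, and rounding up when the total does not divide evenly is an assumption, since the text only gives an evenly divisible case.

```python
def tasks_per_shard(total_tasks: int, shard_count: int) -> int:
    """Number of tasks each shard should carry, rounded up (assumed)."""
    if shard_count <= 0:
        raise ValueError("shard_count must be positive")
    return -(-total_tasks // shard_count)  # ceiling division on integers


# The example from the text: 1 million tasks over 5 shards.
per_shard = tasks_per_shard(1_000_000, 5)  # 200000
```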

[0072] Step S1022: Obtain the number of tasks for each target node in the target node group set through the splitting service node, and perform a splitting operation on the target node group set according to the number of tasks and the number of tasks carried to obtain the sharding result.

[0073] In this step, the task scheduling system, through the splitting service node, obtains the number of tasks of each target node group in the target node group set, determines the shard to which each target node group can be allocated based on that number of tasks and the number of tasks carried by each shard, and then splits the target node group set accordingly to obtain the sharding result.

[0074] Step S20: Obtain the execution parameters of the task data through the scheduling selection service node and the middleware, and schedule the task data according to the execution parameters and the sharding result through the scheduling selection service node and the execution proxy service node.

[0075] In this embodiment, the task scheduling system includes a scheduling selection service node, an execution proxy service node, and middleware. The middleware includes an external event monitoring service node, a front-end routing service node, and an event processing service node. After obtaining the sharding result through the splitting service node, the task scheduling system allocates a scheduling selection service node to each shard according to the sharding result. The execution parameters of the task data in each shard are obtained through the corresponding scheduling selection service node and the middleware. Each scheduling selection service node then distributes the task data in its shard to the corresponding execution proxy service node according to the execution parameters, and that execution proxy service node processes the distributed task data. The execution parameters include the conditions for executing each item of task data, the relationships between items of task data, the service registration information of the task data, monitoring information, heartbeat information, etc.
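How a scheduling selection service node might distribute the tasks of one shard can be sketched as follows. This is only an illustration under stated assumptions: the execution parameters are reduced to the relationships between tasks (a mapping from each task to the tasks it depends on), execution proxy service nodes are chosen round-robin, and each task is treated as finished as soon as it is dispatched. The function name dispatch_shard and the agent names are hypothetical; the text does not describe the selection policy.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def dispatch_shard(dependencies, agents):
    """Return a dispatch plan [(task, agent), ...] for one shard.

    dependencies maps each task to the set of tasks it depends on.
    agents is a non-empty list of execution proxy service node names.
    """
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    plan = []
    turn = 0
    while sorter.is_active():
        # Tasks whose upstream tasks have all completed.
        for task in sorted(sorter.get_ready()):
            plan.append((task, agents[turn % len(agents)]))  # round-robin
            turn += 1
            sorter.done(task)  # simplification: completes immediately
    return plan


plan = dispatch_shard(
    {"extract": set(), "load": {"extract"}, "report": {"load"}},
    ["agent-a", "agent-b"],
)
```

Because every task of a shard passes through a single scheduling selection service node, the ordering decision is made in one place and needs no cross-node coordination.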

[0076] Through the splitting service node, the task scheduling system in this embodiment obtains the resource data of each initial node group in the initial node group set and merges the initial node groups according to the resource data to obtain the target node group set. Through the splitting service node, it then obtains the task data of the initial node group set and splits the target node group set according to the task data and a preset shard set to obtain the sharding result. Finally, it obtains the execution parameters of the task data through the scheduling selection service node and the middleware, and schedules the task data according to the execution parameters and the sharding result through the scheduling selection service node and the execution proxy service node. By sharding the initial node group set, the task data in the initial node group set is allocated to the corresponding shards, enabling distributed processing of task data. This avoids the network overhead of high-frequency lock checks on task data and the risk of inconsistent task data in a distributed system, and improves the efficiency and accuracy of handling complex task scheduling, thereby enhancing the practicality of task scheduling.

[0077] Furthermore, as shown in Figure 3, a second embodiment of the task scheduling method of the present invention is proposed based on the first embodiment.

[0078] The second embodiment of the task scheduling method differs from the first embodiment in that the step of merging the initial node groups in the initial node group set according to the resource data to obtain the target node group set includes:

[0079] Step S1011: The current initial node group is determined from the initial node group set by the splitting service node;

[0080] In this step, the task scheduling system, through the splitting service node, obtains the first sorting information of all initial node groups in the initial node group set and determines the current initial node group according to the first sorting information. In other words, the splitting service node selects the initial node groups one at a time, in the order given by the first sorting information, as the current initial node group, and performs the subsequent steps on it.

[0081] Step S1012: The resource data of the current initial node group is compared with the pre-created set of processed resource data through the splitting service node;

[0082] In this step, the task scheduling system, through the splitting service node, obtains the resource data of the current initial node group and compares it with the pre-created set of processed resource data. It should be noted that the pre-created set of processed resource data is initially empty. That is, before the splitting service node starts merging the initial node groups in the initial node group set, no resource data is stored in the set of processed resource data. As the merging operation proceeds, the corresponding resource data is stored in the set of processed resource data.

[0083] Step S1013: If the resource data does not intersect with the processed resource data set, the resource data is stored in the processed resource data set through the splitting service node;

[0084] In this step, the task scheduling system, through the splitting service node, compares the resource data of the current initial node group with the processed resource data set. If the comparison shows no intersection between them, the resource data of the current initial node group is stored in the processed resource data set through the splitting service node. It can be understood that when the splitting service node processes the first initial node group in the first sorting information as the current initial node group, the processed resource data set is still empty, so no comparison is needed; the resource data of the current initial node group is stored directly in the processed resource data set.

[0085] Step S1014: If the resource data has an intersection with the set of processed resource data, the splitting service node obtains the corresponding processed node group from the pre-created set of processed node groups based on the intersection, merges the current initial node group with the processed node group to obtain a merged node group, and stores the merged node group in the set of processed node groups.

[0086] In this step, the task scheduling system compares the resource data of the current initial node group with the processed resource data set by the split service node. If the comparison result shows an intersection between the resource data of the current initial node group and the processed resource data set, the split service node retrieves the corresponding processed node group from the pre-created processed node group set based on this intersection. The current initial node group and the processed node group are then merged to obtain a merged node group, which is stored in the pre-created merged node group set. It should be noted that the pre-created processed node group set is an empty set; that is, before the task scheduling system begins merging the initial node groups in the initial node group set through the split service node, no node groups are stored in the processed node group set. As the merging operation progresses, the processed node group set will store the corresponding node groups. Similarly, the pre-created merged node group set is an empty set; that is, before the task scheduling system begins merging the initial node groups in the initial node group set through the split service node, no node groups are stored in the merged node group set.

[0087] In a feasible embodiment, the processed resource data set includes resource data r1, r2, r3, and r4, where r1, r2, and r3 correspond to the first initial node group in the processed node group set and r4 corresponds to the second initial node group in the initial node group set. The current initial node group is the third initial node group in the initial node group set, and its resource data is r1. After comparing the resource data of the current initial node group with the processed resource data set through the splitting service node, the task scheduling system determines that they intersect at r1. Through the splitting service node, the system determines that r1 corresponds to the first initial node group in the processed node group set, merges the current initial node group with that first initial node group to obtain a merged node group, stores the merged node group in the processed node group set, and deletes the third initial node group from the initial node group set.

[0088] Step S1015: Update the current initial node group through the split service node, and re-execute the step: compare the resource data of the current initial node group with the pre-created set of processed resource data;

[0089] In this step, after processing the current initial node group, the task scheduling system selects the next initial node group from the initial node group set based on the first sorting information by splitting the service nodes, changes the current initial node group, and re-executes the comparison between the resource data of the current initial node group and the pre-created set of processed resource data, as well as subsequent steps. For example, if the processed current initial node group is the third initial node group in the initial node group set, the task scheduling system selects the next initial node group after the third initial node group, i.e., the fourth initial node group, as the current initial node group, and re-executes the above steps.

[0090] Step S1016: Until all initial node groups in the initial node group set have been compared, the target node group set is obtained based on the processed node group set.

[0091] In this step, the task scheduling system compares all initial node groups in the initial node group set through the steps described above, and then obtains the target node group set based on the final processed node group set. It can be understood that the final processed node group set includes the merged node groups, as well as some initial node groups that could not be merged.

[0092] The task scheduling system in this embodiment compares the resource data corresponding to each initial node group by splitting the service nodes, and merges the initial node groups with overlapping resource data to obtain the target node group set. This lays the foundation for subsequent sharding, helps to enable task data to be processed in a distributed manner, prevents network overhead caused by high-frequency judgment locks on task data and the risk of inconsistency of task data in a distributed system, and improves the efficiency and accuracy of handling complex task scheduling, thereby improving the practicality of task scheduling.
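As an illustration only, the merging procedure of steps S1012 to S1016 might be sketched as follows. The names `NodeGroup` and `merge_node_groups` are hypothetical and do not appear in the specification. The sketch assumes that a group with no overlap is itself stored in the processed node group set (consistent with paragraph [0091]), and that a group overlapping several processed groups is merged with all of them, a case the specification does not address.

```python
from dataclasses import dataclass, field


@dataclass
class NodeGroup:
    """Hypothetical container: a set of tasks and the resources they use."""
    tasks: set[str] = field(default_factory=set)
    resources: set[str] = field(default_factory=set)


def merge_node_groups(initial_groups: list[NodeGroup]) -> list[NodeGroup]:
    processed_resources: set[str] = set()    # starts empty, as in [0082]
    processed_groups: list[NodeGroup] = []   # starts empty, as in [0086]

    # The list order stands in for the "first sorting information".
    for current in initial_groups:
        merged = NodeGroup(set(current.tasks), set(current.resources))
        if not processed_resources.isdisjoint(current.resources):
            # S1014: fold every processed group sharing a resource into `merged`.
            overlapping = [g for g in processed_groups
                           if not g.resources.isdisjoint(current.resources)]
            processed_groups = [g for g in processed_groups
                                if g.resources.isdisjoint(current.resources)]
            for group in overlapping:
                merged.tasks |= group.tasks
                merged.resources |= group.resources
        # S1013 (no overlap) and S1014 (after merging) both end by storing.
        processed_groups.append(merged)
        processed_resources |= current.resources
    return processed_groups   # S1016: the target node group set


# Example from [0087]: the third group shares r1 with the first group.
example = [
    NodeGroup({"t1"}, {"r1", "r2", "r3"}),
    NodeGroup({"t2"}, {"r4"}),
    NodeGroup({"t3"}, {"r1"}),
]
target_groups = merge_node_groups(example)
```

With this input, the first and third groups end up in one merged group and the second group remains on its own.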

[0093] Furthermore, as shown in Figure 4, based on the first and second embodiments of the task scheduling method of the present invention, a third embodiment of the task scheduling method of the present invention is proposed.

[0094] The third embodiment of the task scheduling method differs from the first and second embodiments in that the step of obtaining the number of tasks for each target node in the target node group set through the splitting service node, and performing a splitting operation on the target node group set according to the number of tasks and the number of tasks carried to obtain the sharding result includes:

[0095] Step S10221: The current target node group is determined in the target node group set by the splitting service node, and the current shard is determined in the preset shard set;

[0096] In this step, the task scheduling system obtains the second sorting information of all target nodes in the target node group set by the split service node, and obtains the third sorting information of all shards in the preset shard set. The task scheduling system determines the current target node group in the target node group set according to the second sorting information by the split service node, and determines the current shard in the preset shard set according to the third sorting information.

[0097] Step S10222: Calculate the first difference between the number of tasks corresponding to the current target node group and the number of tasks carried by the current shard through the splitting service node, and compare the first difference with a preset difference range;

[0098] In this step, the task scheduling system calculates the first difference between the number of tasks corresponding to the current target node group and the number of tasks carried by the current shard by splitting the service nodes, and compares the first difference with the preset difference range. It should be noted that the preset difference range is set in advance in the task scheduling system and can be adjusted accordingly according to specific circumstances.

[0099] Step S10223: If the first difference is within the preset difference range, the current target node group is allocated to the current shard through the splitting service node, and the steps are re-executed: determine the current target node group in the target node group set, and determine the current shard in the preset shard set;

[0100] In this step, the task scheduling system compares the first difference with the preset difference range through the splitting service node. If the first difference is within the preset difference range, the system allocates the current target node group to the current shard through the splitting service node, and then re-executes the steps of determining the current target node group in the target node group set and determining the current shard in the preset shard set, as well as subsequent steps. For example, if the preset difference range is -5000 to 5000, the number of tasks corresponding to the current target node group is 196,000, and the number of tasks carried by the current shard is 200,000, then the first difference calculated through the splitting service node is 196,000 - 200,000 = -4000. Since -4000 is within the preset difference range of -5000 to 5000, the task scheduling system allocates the current target node group to the current shard.

[0101] Step S10224: If the first difference is not within the preset difference range, the sum of the number of tasks corresponding to the current target node group and the number of tasks corresponding to all node groups to be allocated in the pre-created set of node groups to be allocated is calculated through the splitting service node. Based on the sum and the number of tasks carried by the current shard, either the current target node group and all node groups to be allocated are allocated to the current shard and the steps are re-executed: determine the current target node group in the target node group set, and determine the current shard in the preset shard set; or the current target node group is updated and the step is re-executed: calculate the first difference between the number of tasks corresponding to the current target node group and the number of tasks carried by the current shard.

[0102] Further, step S10224 includes:

[0103] Step S102241: Calculate the second difference between the sum and the number of tasks carried by the current shard through the splitting service node, and compare the second difference with the preset difference range;

[0104] In this step, the task scheduling system compares the first difference with the preset difference range through the splitting service node. If the first difference is not within the preset difference range, the system calculates the sum of the number of tasks corresponding to the current target node group and the number of tasks corresponding to all node groups to be allocated in the pre-created set of node groups to be allocated. The system then calculates the second difference between this sum and the number of tasks carried by the current shard, and compares the second difference with the preset difference range.

[0105] Step S102242: If the second difference is within the preset difference range, the current target node group and all the node groups to be allocated are allocated to the current shard through the splitting service node, and the set of node groups to be allocated is cleared. Then, the following steps are executed again: determine the current target node group in the target node group set, and determine the current shard in the preset shard set.

[0106] In this step, the task scheduling system compares the second difference with the preset difference range by splitting the service nodes. If the second difference is within the preset difference range, the system then splits the service nodes to allocate the current target node group and all the node groups to be allocated to the current shard, clears the set of node groups to be allocated, and re-executes the steps of determining the current target node group in the target node group set and determining the current shard in the preset shard set, as well as subsequent steps.

[0107] Step S102243: If the second difference is less than the preset difference range, the current target node group is stored in the set of node groups to be allocated through the splitting service node, and the current target node group is updated. Then, the following steps are executed again: calculate the first difference between the number of tasks corresponding to the current target node group and the number of tasks carried by the current shard.

[0108] In this step, the task scheduling system compares the second difference with the preset difference range through the splitting service node. If the second difference is less than the preset difference range, the system stores the current target node group in the set of node groups to be allocated through the splitting service node. Based on the second sorting information, the system then determines the next current target node group in the target node group set and re-executes the calculation of the first difference between the number of tasks corresponding to the current target node group and the number of tasks carried by the current shard, as well as subsequent steps. For example, if the preset difference range is -5000 to 5000 and the second difference is -6000, then the second difference is less than the preset difference range. The task scheduling system stores the current target node group in the set of node groups to be allocated through the splitting service node, selects the next target node group in the target node group set according to the second sorting information as the current target node group, and re-executes the above steps.

[0109] Step S102244: If the second difference is greater than the preset difference range, the current target node group is retained in the target node group set by the splitting service node, and the current target node group is updated. Then, the step of calculating the first difference between the number of tasks corresponding to the current target node group and the number of tasks carried by the current shard is executed again.

[0110] In this step, the task scheduling system compares the second difference with the preset difference range through the splitting service node. If the second difference is greater than the preset difference range, the system retains the current target node group in the target node group set through the splitting service node, updates the current target node group, and re-executes the step of calculating the first difference between the number of tasks corresponding to the current target node group and the number of tasks carried by the current shard, as well as subsequent steps. For example, if the preset difference range is -5000 to 5000 and the second difference is 6000, then the second difference is greater than the preset difference range. The task scheduling system retains the current target node group in the target node group set through the splitting service node, selects the next target node group in the target node group set according to the second sorting information as the current target node group, and re-executes the above steps.
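The three outcomes of comparing a difference with the preset difference range can be illustrated with a small sketch. The function name `classify_difference` and the labels it returns are hypothetical, and treating the bounds of the range as inclusive is an assumption, since the specification does not say how boundary values are handled.

```python
def classify_difference(difference: int, low: int = -5000, high: int = 5000) -> str:
    """Classify a first or second difference against the preset difference range.

    Assumption: both bounds are inclusive.
    """
    if difference < low:
        return "below"    # S102243: store the group in the set to be allocated
    if difference > high:
        return "above"    # S102244: retain the group in the target node group set
    return "within"       # S10223 / S102242: allocate to the current shard


# Worked values taken from paragraphs [0100], [0108] and [0110].
first_difference = 196_000 - 200_000          # group tasks minus tasks carried by shard
outcome_first = classify_difference(first_difference)
outcome_low = classify_difference(-6000)
outcome_high = classify_difference(6000)
```

Here `first_difference` is -4000, which falls inside the range, while -6000 and 6000 fall below and above it respectively.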

[0111] Step S10225: Until all target node groups in the target node group set have been allocated, the sharding result is obtained.

[0112] In this step, the task scheduling system allocates all target node groups in the target node group set through the steps described above, and then obtains the sharding result. It can be understood that, in the sharding result, each shard in the preset shard set is assigned one or more target node groups.

[0113] The task scheduling system in this embodiment splits the target node group set through the splitting service node to obtain the sharding result. Once the task data has been sharded, it is easier to process in a distributed manner, which avoids the network overhead caused by frequent lock checks on task data and the risk of task data inconsistency in a distributed system. It also improves the efficiency and accuracy of handling complex task scheduling, thereby improving the practicality of task scheduling.
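One possible reading of steps S10221 to S10225 is sketched below. This is a simplified illustration under stated assumptions, not the claimed implementation: the function name `allocate_groups` is hypothetical, dictionary order stands in for the second and third sorting information, the range bounds are inclusive, a shard is treated as full once a group (or a group plus the groups to be allocated) lands within range, and the set of node groups to be allocated carries over to the next shard.

```python
def allocate_groups(group_tasks: dict[str, int],
                    shard_capacity: dict[str, int],
                    low: int = -5000,
                    high: int = 5000) -> tuple[dict[str, list[str]], list[str]]:
    """Return (shard -> allocated groups, groups left unassigned)."""
    remaining = list(group_tasks)      # target node groups still in the set
    pending: list[str] = []            # "set of node groups to be allocated"
    result: dict[str, list[str]] = {shard: [] for shard in shard_capacity}

    for shard, capacity in shard_capacity.items():      # S10221: current shard
        i = 0
        while i < len(remaining):                        # S10221: current group
            name = remaining[i]
            first = group_tasks[name] - capacity         # S10222
            if low <= first <= high:                     # S10223
                result[shard].append(remaining.pop(i))
                break
            total = group_tasks[name] + sum(group_tasks[p] for p in pending)
            second = total - capacity                    # S102241
            if low <= second <= high:                    # S102242
                result[shard].extend(pending + [remaining.pop(i)])
                pending.clear()
                break
            if second < low:                             # S102243
                pending.append(remaining.pop(i))         # next group slides to index i
            else:                                        # S102244
                i += 1                                   # group stays in the set
    return result, pending + remaining


# Hypothetical task counts; both shards carry 200,000 tasks.
shards, unassigned = allocate_groups(
    {"A": 196_000, "B": 120_000, "C": 150_000, "D": 78_000},
    {"shard-1": 200_000, "shard-2": 200_000},
)
```

In this run, A fills the first shard directly, B is held as a group to be allocated, C is retained because adding it would overshoot, and D together with B fills the second shard. C is left over for the unassigned-group handling of the fourth embodiment.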

[0114] Furthermore, as shown in Figure 5, based on the first, second and third embodiments of the task scheduling method of the present invention, a fourth embodiment of the task scheduling method of the present invention is proposed.

[0115] The fourth embodiment of the task scheduling method differs from the first, second, and third embodiments in that, after the step of all target node groups in the target node group set have been allocated and the sharding result is obtained, it includes:

[0116] Step S10226: Detect whether there is an unassigned target node group in the target node group set through the splitting service node;

[0117] Step S10227: If it exists, the unassigned target node group is assigned to the corresponding shard according to the preset allocation strategy through the sharding service node, and the sharding result is updated.

[0118] In this embodiment, when the task scheduling system performs the sharding operation on the target node groups in the target node group set through the splitting service node, the second difference may be greater than the preset difference range. Here the second difference is the difference between the sum of the number of tasks corresponding to the current target node group and the number of tasks corresponding to all node groups to be allocated in the pre-created set of node groups to be allocated, on the one hand, and the number of tasks carried by the current shard, on the other. In this case, the current target node group is retained in the target node group set by the splitting service node without further processing. Therefore, after allocating the target node groups in the target node group set, the task scheduling system needs to check whether any unassigned target node groups remain in the target node group set. If so, it determines the number of tasks corresponding to each shard according to the sharding result, sorts the shards from smallest to largest by number of tasks, and sorts the unassigned target node groups from largest to smallest by number of tasks. Finally, following the shard ordering and the ordering of the unassigned target node groups, it assigns the unassigned target node groups to the corresponding shards in sequence and updates the sharding result. For example, suppose there are 4 unassigned target node groups numbered a to d and 5 shards numbered 1 to 5. Sorting the shards from smallest to largest by number of tasks gives 1, 2, 4, 5, 3. Sorting the unassigned target node groups from largest to smallest by number of tasks gives a, c, d, b. Based on the two orderings, the task scheduling system assigns unassigned target node group a to shard 1, group c to shard 2, group d to shard 4, and group b to shard 5. This ensures that every node group is assigned to a shard, which helps improve the efficiency and accuracy of subsequent task execution.
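The allocation strategy in this example can be sketched as follows. The function name `assign_leftovers` and the task counts are hypothetical, chosen only so that the two orderings match the example. Wrapping around to the lightest shard when there are more unassigned groups than shards is an assumption; the specification does not describe that case.

```python
def assign_leftovers(shard_load: dict[int, int],
                     leftover_tasks: dict[str, int]) -> dict[str, int]:
    """Pair the heaviest unassigned groups with the lightest shards, in order."""
    shard_order = sorted(shard_load, key=shard_load.get)               # smallest first
    group_order = sorted(leftover_tasks, key=leftover_tasks.get, reverse=True)
    # Assumption: wrap around if groups outnumber shards.
    return {group: shard_order[i % len(shard_order)]
            for i, group in enumerate(group_order)}


# Hypothetical loads giving shard order 1, 2, 4, 5, 3 and group order a, c, d, b.
shard_load = {1: 100, 2: 200, 3: 500, 4: 300, 5: 400}
leftover_tasks = {"a": 90, "b": 10, "c": 70, "d": 40}
assignment = assign_leftovers(shard_load, leftover_tasks)
```

The resulting mapping sends a to shard 1, c to shard 2, d to shard 4, and b to shard 5, matching the example above.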

[0119] Furthermore, after obtaining the sharding results, the process also includes:

[0120] Step S10228: Obtain the current time through the segmentation service node. If the current time has not reached the preset daily segmentation time, detect whether a task change has occurred according to the preset time interval.

[0121] Step S10229: If a task change occurs, the resource information corresponding to the changed task is obtained through the splitting service node, and the changed task is allocated to the corresponding shard according to the resource information, or the changed task is added to the initial node group set through the splitting service node, and the steps are re-executed: obtaining the task data and resource data of the initial node group set, and splitting the initial node group set according to the task data, the resource data and the preset shard set to obtain the sharding result.

[0122] In this embodiment, after the task scheduling system detects that there are no unassigned target node groups in the target node group set, or after it has assigned the unassigned target node groups, it obtains the current time through the splitting service node. If the current time has not reached the preset daily segmentation time, the splitting service node checks at a preset time interval whether a task change has occurred: it obtains the last update time and last creation time from the pre-created task definition table, concurrency control relationship table, and task lock relationship table, and compares them with the sharding start time. If the last update time or last creation time is later than the sharding start time, a task change is determined to have occurred. When a task change is detected, the resource information corresponding to the changed task is obtained through the splitting service node, and the resource data corresponding to the changed task is queried based on that resource information. The system then determines whether the resource data of the changed task are all located in the same shard. If so, the changed task is allocated to that shard through the splitting service node. If not, the changed task is added to the initial node group set through the splitting service node, and the steps of obtaining the task data and resource data of the initial node group set and sharding the initial node group set according to the task data, the resource data, and the preset shard set to obtain the sharding result, as well as subsequent steps, are re-executed; that is, the initial node group set is sharded again.
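The change detection and placement decision of steps S10228 and S10229 might look like the sketch below. The function names, the `resource_to_shard` lookup, and the use of `None` to signal that re-sharding is needed are all hypothetical. Treating a resource that is not yet in any shard as a reason to re-shard is also an assumption.

```python
from datetime import datetime
from typing import Optional


def task_changed(last_update: datetime,
                 last_creation: datetime,
                 sharding_start: datetime) -> bool:
    """A change is assumed when either timestamp is later than the sharding start."""
    return last_update > sharding_start or last_creation > sharding_start


def place_changed_task(task_resources: set[str],
                       resource_to_shard: dict[str, int]) -> Optional[int]:
    """Return the single shard holding all of the task's resources, else None.

    None means the task should be added to the initial node group set
    and the set sharded again.
    """
    shards = {resource_to_shard.get(resource) for resource in task_resources}
    if len(shards) == 1 and None not in shards:
        return shards.pop()
    return None


# Hypothetical data for illustration.
sharding_start = datetime(2023, 2, 28, 0, 0)
changed = task_changed(datetime(2023, 2, 28, 9, 30), datetime(2023, 2, 27, 8, 0), sharding_start)
resource_to_shard = {"r1": 1, "r2": 1, "r3": 2}
same_shard = place_changed_task({"r1", "r2"}, resource_to_shard)
needs_reshard = place_changed_task({"r1", "r3"}, resource_to_shard)
```

A task whose resources all sit in shard 1 is placed there incrementally, while a task spanning shards 1 and 2 triggers a full re-shard.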

[0123] The task scheduling system in this embodiment, after obtaining the sharding results, allocates the unassigned target node groups to the corresponding shards through the sharding service nodes and updates the sharding results. It also detects whether task changes have occurred through the sharding service nodes. If a task change occurs, the sharding service nodes allocate the changed task to the corresponding shard, or add the changed task to the initial node group set and re-shard the initial node group set. This ensures that all task data can be allocated to the corresponding shards before the task scheduling system schedules the task data, preventing network overhead caused by high-frequency judgment locks on task data and the risk of inconsistency in task data under a distributed system. It also improves the efficiency and accuracy of task scheduling, thereby enhancing the practicality of task scheduling.

[0124] The present invention also provides a task scheduling system.

[0125] The task scheduling system of the present invention includes: a segmentation service node, a scheduling selection service node, an execution proxy service node, middleware, a memory, a processor, and a task scheduler stored in the memory and executable on the processor. When the task scheduler is executed by the processor, it implements the steps of the task scheduling method described above.

[0126] The method implemented when the task scheduler running on the processor is executed can be referred to in various embodiments of the task scheduling method of the present invention, and will not be repeated here.

[0127] As shown in Figure 6, the task scheduling system of the present invention is a distributed system that supports rich scheduling control strategies and is capable of fault awareness and self-healing. The task scheduling system connects to hardware devices through a UI (user tool layer) and an API (interface aggregation layer). It can dynamically expand service capacity without downtime and provides eight task scheduling selection strategies: dependency, time window, resource, concurrency, lock, signal, priority, and message. In addition, the services are aware of and monitor one another and can automatically take over in the event of a fault, ensuring high system availability. The system is mainly divided into six types of service nodes: splitting service nodes (M-Newday, S-Newday), scheduling selection service nodes (M-Selector, S-Selector), execution agent service nodes (Worker), external event listening service nodes (ZK-Cluster: ZooKeeper distributed coordination server cluster), front-end routing service nodes (RabbitMQ-Cluster: RabbitMQ message queue cluster), and event processing service nodes (Oracle-Rac: Oracle-Rac architecture relational database, Redis-Cluster: Redis cache cluster, Kafka-Cluster: Kafka message queue cluster, and Es-Cluster: Elasticsearch storage cluster). All service nodes adopt a multi-instance architecture.

[0128] Task slicing service nodes (M-Newday, S-Newday): Responsible for the task scheduling system's failover process, mainly including task slicing, data initialization, and cluster command functions. A master-slave architecture is adopted, with M-Newday as the master node and S-Newday as the backup node. The task slicing service nodes (M-Newday, S-Newday) also include the task slicing service bus (Newday Threads).

[0129] The scheduling and selection service nodes (M-Selector, S-Selector) are responsible for task selection and decision-making, and task distribution. They adopt a multi-master and multi-standby architecture, with M-Selector as the master node and S-Selector as the standby node. The scheduling and selection service nodes (M-Selector, S-Selector) also include the scheduling and selection service bus (Selector Threads) and the listening bus (Listener Threads).

[0130] Execution agent service node (Worker): Responsible for the specific execution of tasks and status reporting. The execution agent service node (Worker) also includes: execution agent service bus (Taskexecute Threads) and listening bus (Listener Threads).

[0131] External event monitoring service node (ZK-Cluster: ZooKeeper distributed collaborative server cluster): responsible for obtaining service registration information, monitoring information, heartbeat information and splitting results from the splitting service node (M-Newday, S-Newday), scheduling selection service node (M-Selector, S-Selector) and execution agent service node (Worker), and solving the problems of distributed task allocation and high availability failover.

[0132] The front-end routing service node (RabbitMQ-Cluster: RabbitMQ message queue cluster) is responsible for sending the task distribution information of the scheduling and selection service nodes (M-Selector, S-Selector) to the execution agent service nodes (Worker) through the task distribution bus (Task-Dispatcher), and sending the task reporting information of the execution agent service nodes (Worker) to the scheduling and selection service nodes (M-Selector, S-Selector) through the task reporting bus (Task-Response), thereby realizing the asynchronous decoupling between services and the message persistence problem.

[0133] The event processing service nodes include: Oracle-Rac: an Oracle-Rac architecture relational database; Redis-Cluster: a Redis caching cluster; Kafka-Cluster: a Kafka message queue cluster; and Es-Cluster: an Elasticsearch storage cluster. Specifically, Oracle-Rac (Oracle-Rac architecture relational database) receives and stores the daily data preparation of the scheduling selection service nodes (M-Selector, S-Selector); Redis-Cluster (Redis caching cluster) caches task data, improving task data access efficiency and resolving database performance bottlenecks; Kafka-Cluster (Kafka message queue cluster) receives notifications and alarms, implementing log and alarm message notification; and Es-Cluster (Elasticsearch storage cluster) receives and stores the task execution logs sent by the execution agent service nodes (Worker), enabling log persistence and fast retrieval.

[0134] Within this architecture, the task scheduling system integrates various services that work collaboratively. Each service is deployed using multiple instances and dynamically scales horizontally. Once completed, the system aims to efficiently schedule over one million tasks daily, with automatic failover. It supports eight rich scheduling control strategies, including dependency, time window, resource, concurrency, lock, signal, priority, and message scheduling. It also supports various task types, such as scripts, stored procedures, and HTTP interfaces. Scripts include: Shell, Perl, Python, mysql, Crackle, DB2, MapReduce, and Spark.

[0135] The present invention also provides a computer-readable storage medium.

[0136] The present invention provides a computer-readable storage medium storing a task scheduler, which, when executed by a processor, implements the steps of the task scheduling method described above.

[0137] The method implemented when the task scheduler running on the processor is executed can be referred to in various embodiments of the task scheduling method of the present invention, and will not be repeated here.

[0138] It should be noted that, in this document, the terms "comprising," "including," or any other variations thereof are intended to cover non-exclusive inclusion, such that a process, method, article, or system that comprises a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or system. Unless otherwise specified, an element defined by the phrase "comprising one..." does not exclude the presence of other identical elements in the process, method, article, or system that includes that element.

[0139] The sequence numbers of the above embodiments of the present invention are for descriptive purposes only and do not represent the superiority or inferiority of the embodiments.

[0140] Through the above description of the embodiments, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by means of software plus necessary general-purpose hardware platforms. Of course, they can also be implemented by hardware, but in many cases the former is a better implementation method. Based on this understanding, the technical solution of the present invention, or the part that contributes to the prior art, can be embodied in the form of a software product. This computer software product is stored in a storage medium (such as ROM / RAM, magnetic disk, optical disk) as described above, and includes several instructions to cause a terminal device (which may be a mobile phone, computer, server, or network device, etc.) to execute the methods described in the various embodiments of the present invention.

[0141] The above are merely preferred embodiments of the present invention and do not limit the patent scope of the present invention. Any equivalent structural or procedural transformations made based on the content of the present invention's specification and drawings, or direct or indirect applications in other related technical fields, are similarly included within the patent protection scope of the present invention.

Claims

1. A task scheduling method, characterized in that the task scheduling method is applied to a task scheduling system comprising a splitting service node, a scheduling selection service node, an execution proxy service node, and middleware, and the task scheduling method comprises the following steps:
obtaining, through the splitting service node, task data and resource data of an initial node group set, and splitting the initial node group set according to the task data, the resource data, and a preset shard set to obtain a sharding result;
wherein the splitting of the initial node group set comprises: merging the initial node groups in the initial node group set according to the resource data to obtain a target node group set; calculating, according to the task data and the preset shard set, the number of tasks carried by each shard in the preset shard set; obtaining, through the splitting service node, the number of tasks corresponding to each target node group in the target node group set; and splitting the target node group set according to the number of tasks carried by each shard and the number of tasks corresponding to each target node group to obtain the sharding result;
wherein the splitting of the target node group set comprises: determining, through the splitting service node, a current target node group from the target node group set and a current shard from the preset shard set; calculating, through the splitting service node, a first difference between the number of tasks corresponding to the current target node group and the number of tasks carried by the current shard, and comparing the first difference with a preset difference range; if the first difference is within the preset difference range, allocating the current target node group to the current shard through the splitting service node, and re-executing the step of determining a current target node group from the target node group set and a current shard from the preset shard set; if the first difference is not within the preset difference range, calculating, through the splitting service node, the sum of the number of tasks corresponding to the current target node group and the numbers of tasks corresponding to all node groups in a pre-created to-be-allocated node group set; and, according to the sum and the number of tasks carried by the current shard, either allocating the current target node group and all node groups in the to-be-allocated node group set to the current shard and re-executing the step of determining a current target node group from the target node group set and a current shard from the preset shard set, or updating the current target node group and re-executing the step of calculating the first difference between the number of tasks corresponding to the current target node group and the number of tasks carried by the current shard, until all target node groups in the target node group set have been allocated and the sharding result is obtained;
wherein the allocating or updating according to the sum comprises: calculating, through the splitting service node, a second difference between the sum and the number of tasks carried by the current shard, and comparing the second difference with the preset difference range; if the second difference is within the preset difference range, allocating the current target node group and all node groups in the to-be-allocated node group set to the current shard through the splitting service node, clearing the to-be-allocated node group set, and re-executing the step of determining a current target node group from the target node group set and a current shard from the preset shard set; if the second difference is less than the preset difference range, storing the current target node group in the to-be-allocated node group set through the splitting service node, updating the current target node group, and re-executing the step of calculating the first difference between the number of tasks corresponding to the current target node group and the number of tasks carried by the current shard; if the second difference is greater than the preset difference range, retaining the current target node group in the target node group set through the splitting service node, updating the current target node group, and re-executing the step of calculating the first difference between the number of tasks corresponding to the current target node group and the number of tasks carried by the current shard;
and obtaining execution parameters of the task data through the scheduling selection service node and the middleware, and scheduling the task data according to the execution parameters and the sharding result through the scheduling selection service node and the execution proxy service node.
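
The allocation loop recited in claim 1 can be illustrated with a minimal Python sketch. It is one possible reading, not the claimed implementation. The names `NodeGroup`, `split_into_shards`, `shard_loads`, and `diff_range` are hypothetical. The sketch assumes that the preset difference range is a closed interval `(low, high)`, that each difference is computed as group tasks minus shard load, and that the target node group set is traversed once, with any group that could not be placed returned as a leftover.

```python
from dataclasses import dataclass


@dataclass
class NodeGroup:
    name: str
    tasks: int  # number of tasks corresponding to this target node group


def split_into_shards(groups, shard_loads, diff_range):
    """Single-pass sketch of the allocation loop (assumptions in the lead-in).

    groups:      list of NodeGroup (the target node group set)
    shard_loads: number of tasks each shard is meant to carry
    diff_range:  (low, high), assumed closed preset difference range
    Returns (assignment, leftovers); assignment maps shard index -> group names.
    """
    low, high = diff_range
    assignment = {i: [] for i in range(len(shard_loads))}
    pending = []   # the to-be-allocated node group set
    retained = []  # groups left in the target set because they overshoot
    shard = 0
    for group in groups:
        if shard >= len(shard_loads):
            retained.append(group)  # assumption: no shard left, keep as leftover
            continue
        load = shard_loads[shard]
        first_diff = group.tasks - load
        if low <= first_diff <= high:
            # First difference within range: the group alone fills the shard.
            assignment[shard].append(group.name)
            shard += 1
            continue
        total = group.tasks + sum(p.tasks for p in pending)
        second_diff = total - load
        if low <= second_diff <= high:
            # Group plus all pending groups fill the shard; clear the pending set.
            assignment[shard].extend(p.name for p in pending)
            assignment[shard].append(group.name)
            pending.clear()
            shard += 1
        elif second_diff < low:
            pending.append(group)   # still too small: hold for later
        else:
            retained.append(group)  # too large: stays in the target set
    return assignment, pending + retained
```

For example, with groups A=10, B=4, C=5, D=30, E=9, three shards of load 10, and a range of (-1, 1), the sketch places A in shard 0, B and C together in shard 1, E in shard 2, and returns D as a leftover to be handled separately.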

2. The task scheduling method as described in claim 1, characterized in that the step of obtaining, through the splitting service node, task data and resource data of the initial node group set, and splitting the initial node group set according to the task data, the resource data, and the preset shard set to obtain the sharding result comprises: obtaining, through the splitting service node, the resource data of each initial node group in the initial node group set, and merging the initial node groups in the initial node group set according to the resource data to obtain the target node group set; and obtaining, through the splitting service node, the task data of the initial node group set, and splitting the target node group set according to the task data and the preset shard set to obtain the sharding result.

3. The task scheduling method as described in claim 2, characterized in that the step of merging the initial node groups in the initial node group set according to the resource data to obtain the target node group set comprises: determining, through the splitting service node, a current initial node group from the initial node group set; comparing, through the splitting service node, the resource data of the current initial node group with a pre-created processed resource data set; if the resource data does not intersect the processed resource data set, storing the resource data in the processed resource data set through the splitting service node; if the resource data intersects the processed resource data set, obtaining, through the splitting service node, the corresponding processed node group from a pre-created processed node group set based on the intersection, merging the current initial node group with the processed node group to obtain a merged node group, and storing the merged node group in the processed node group set; updating the current initial node group through the splitting service node, and re-executing the step of comparing the resource data of the current initial node group with the pre-created processed resource data set; and, once all initial node groups in the initial node group set have been compared, obtaining the target node group set based on the processed node group set.
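
The merging step of claim 3 can be sketched as follows. This is an illustration under stated assumptions rather than the claimed implementation. The function name `merge_node_groups` and the representation of resource data as a set of resource identifiers are hypothetical. The sketch also assumes that a group with no overlap is stored as its own processed node group, and that a group overlapping several processed node groups is merged with all of them; the claim does not spell out either case.

```python
def merge_node_groups(initial_groups):
    """Merge initial node groups whose resource data intersect.

    initial_groups: dict mapping group name -> set of resource identifiers
                    (hypothetical representation of the resource data)
    Returns a list of (member_names, resources), one per merged target node group.
    """
    processed = []  # each entry: [member names, union of their resources]
    for name, resources in initial_groups.items():
        members, merged_resources = [name], set(resources)
        remaining = []
        for other_members, other_resources in processed:
            if merged_resources & other_resources:
                # Intersection found: fold the processed group into the current one.
                members = other_members + members
                merged_resources |= other_resources
            else:
                remaining.append([other_members, other_resources])
        # Store the (possibly merged) group in the processed set.
        remaining.append([members, merged_resources])
        processed = remaining
    return [(m, r) for m, r in processed]
```

Because processed entries are kept mutually disjoint, one pass over them per group is enough: absorbing one entry cannot create a new overlap with another.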

4. The task scheduling method as described in claim 1, characterized in that, after the step in which all target node groups in the target node group set have been allocated and the sharding result is obtained, the method further comprises: detecting, through the splitting service node, whether any unallocated target node group remains in the target node group set; and, if so, allocating the unallocated target node group to a corresponding shard according to a preset allocation strategy through the splitting service node, and updating the sharding result.
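
Claim 4 leaves the preset allocation strategy open. The sketch below assumes one possible strategy, chosen only for illustration: place the largest remaining group first, each time on the shard that currently carries the fewest tasks. The function name `allocate_leftovers` and its argument shapes are hypothetical.

```python
def allocate_leftovers(shard_tasks, leftovers):
    """Place unallocated target node groups on the least-loaded shard.

    shard_tasks: dict shard id -> number of tasks already allocated
    leftovers:   list of (group name, task count) still unallocated
    Returns (placement, loads): group name -> shard id, and the updated loads.
    The least-loaded-first strategy is an assumption, not taken from the claim.
    """
    placement = {}
    loads = dict(shard_tasks)  # copy so the caller's dict is not modified
    # Largest groups first, so big groups do not pile onto a nearly full shard.
    for name, tasks in sorted(leftovers, key=lambda g: g[1], reverse=True):
        # Ties on load are broken by the smaller shard id.
        target = min(loads, key=lambda s: (loads[s], s))
        placement[name] = target
        loads[target] += tasks
    return placement, loads
```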

5. The task scheduling method according to any one of claims 1-4, characterized in that, after the sharding result is obtained, the method further comprises: obtaining the current time through the splitting service node; if the current time has not reached a preset daily splitting time, detecting task changes through the splitting service node at a preset time interval; and, if a task change occurs, either obtaining, through the splitting service node, resource information corresponding to the changed task and allocating the changed task to a corresponding shard according to the resource information, or adding the changed task to the initial node group set through the splitting service node and re-executing the step of obtaining the task data and resource data of the initial node group set and splitting the initial node group set according to the task data, the resource data, and the preset shard set to obtain the sharding result.
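
The decision described in claim 5 can be sketched as a small function. The claim does not state when the direct allocation is preferred over re-sharding, so the sketch assumes that a changed task goes to the first shard that already holds one of its resources and triggers re-sharding otherwise. It also assumes that, once the daily splitting time has been reached, the change is left to the scheduled daily split. The names `place_changed_task`, `shard_resources`, and `task_resources` are hypothetical.

```python
from datetime import time


def place_changed_task(now, daily_split_time, shard_resources, task_resources):
    """Decide how to handle a changed task between daily splits.

    now, daily_split_time: datetime.time values
    shard_resources: dict shard id -> set of resource identifiers in that shard
    task_resources:  set of resource identifiers used by the changed task
    Returns ("shard", shard_id), ("reshard", None), or ("daily", None).
    """
    if now >= daily_split_time:
        # Assumption: the scheduled daily split picks the task up.
        return ("daily", None)
    for shard_id in sorted(shard_resources):
        if shard_resources[shard_id] & task_resources:
            # The task shares a resource with this shard: allocate it there.
            return ("shard", shard_id)
    # No shard matches: add the task to the initial set and re-run the split.
    return ("reshard", None)
```

A caller would invoke this once per detected change at each polling interval and re-run the full split only when `"reshard"` is returned.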

6. A task scheduling system, characterized in that the task scheduling system comprises: a splitting service node, a scheduling selection service node, an execution proxy service node, middleware, a memory, a processor, and a task scheduling program stored in the memory and executable on the processor, wherein the task scheduling program, when executed by the processor, implements the steps of the task scheduling method as described in any one of claims 1 to 5.

7. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a task scheduling program which, when executed by a processor, implements the steps of the task scheduling method as described in any one of claims 1 to 5.