Method and apparatus for data processing, and program, medium and device
By optimizing task scheduling through a hierarchical orchestration strategy, the problem of excessively long processing times and blocking of non-critical tasks in complex workflows has been solved, resulting in more efficient task processing.
Patent Information
- Authority / Receiving Office
- WO · WO
- Patent Type
- Applications
- Current Assignee / Owner
- BEIJING ZITIAO NETWORK TECH CO LTD
- Filing Date
- 2024-12-20
- Publication Date
- 2026-06-25
AI Technical Summary
Existing task orchestration technologies suffer from problems such as excessive time consumption and non-critical tasks blocking core links when handling complex workflows, resulting in low processing efficiency.
A hierarchical orchestration strategy is adopted. By determining the dependencies and execution times between tasks, tasks are configured into multiple first-level and second-level hierarchies, which can be executed serially or in parallel, thus optimizing task scheduling and reducing overall time consumption.
The approach reduces workflow processing time, improves processing efficiency, prevents non-critical tasks from blocking the core link, and lightens the burden on business developers.
Abstract
Description
Methods, apparatus, programs, media and devices for data processing
Technical Field
[0001] The exemplary embodiments disclosed herein generally relate to the field of computers, and particularly to methods, apparatus, devices, computer-readable storage media, and computer program products for data processing.
Background Technology
[0002] Task orchestration refers to the process of organizing and managing multiple tasks according to specific dependencies and execution order to ensure that each task can be executed efficiently in a predetermined logical sequence. With the continuous development of internet technology, more and more enterprises and organizations are adopting task orchestration technology to optimize business processes, improve work efficiency, and reduce operating costs.
Summary of the Invention
[0003] In a first aspect of this disclosure, a method for data processing is provided. The method includes: determining dependencies between multiple tasks in a workflow to be processed and the execution time of each task; determining a scheduling and orchestration strategy for the multiple tasks based on the dependencies and the execution time of each task, the scheduling and orchestration strategy indicating multiple first-level hierarchies and at least one second-level hierarchy under each first-level hierarchy, each second-level hierarchy including at least one task group, each task group including at least one task from a plurality of tasks, the multiple first-level hierarchies being configured for sequential execution, and each task group in each second-level hierarchy being configured for sequential or parallel execution; and scheduling the multiple tasks in the workflow for execution based on the scheduling and orchestration strategy.
[0004] In a second aspect of this disclosure, an apparatus for data processing is provided. The apparatus includes: an acquisition module configured to determine dependencies between multiple tasks in a workflow to be processed and the execution time of each task; a determination module configured to determine a scheduling and orchestration strategy for the multiple tasks based on the dependencies and the execution time of each task, the scheduling and orchestration strategy indicating multiple first-level hierarchies and at least one second-level hierarchy under each first-level hierarchy, each second-level hierarchy including at least one task group, each task group including at least one task from a plurality of tasks, the multiple first-level hierarchies being configured to be executed sequentially, and each task group in each second-level hierarchy being configured to be executed sequentially or in parallel; and a scheduling module configured to schedule the multiple tasks in the workflow for execution based on the scheduling and orchestration strategy.
[0005] In a third aspect of this disclosure, an electronic device is provided. The device includes at least one processing unit; and at least one memory coupled to the at least one processing unit and storing instructions for execution by the at least one processing unit. When executed by the at least one processing unit, the instructions cause the device to perform the method of the first aspect.
[0006] In a fourth aspect of this disclosure, a computer-readable storage medium is provided. The computer-readable storage medium stores a computer program that can be executed by a processor to implement the method of the first aspect.
[0007] In a fifth aspect of this disclosure, a computer program product is provided. The computer program product includes computer-executable instructions, wherein the computer-executable instructions, when executed by a processor, implement the method of the first aspect.
[0008] It should be understood that the content described in this summary section is not intended to identify key or essential features of the embodiments of this disclosure, nor is it intended to limit the scope of this disclosure. Other features of this disclosure will become readily apparent from the following description.
Brief Description of the Drawings
[0009] The specific embodiments of this application are described in detail below with reference to the accompanying drawings, wherein:
[0010] Figure 1 shows a schematic diagram of an example environment in which embodiments of the present disclosure can be implemented;
[0011] Figure 2A shows a schematic diagram of an example workflow;
[0012] Figure 2B shows a schematic diagram of an example of how a hierarchical orchestration strategy processes the workflow in Figure 2A;
[0013] Figure 3A shows a schematic diagram of another example of the workflow;
[0014] Figure 3B shows a schematic diagram of an example of how a hierarchical orchestration strategy processes the workflow in Figure 3A;
[0015] Figure 4 shows a flowchart of a data processing procedure according to some embodiments of the present disclosure;
[0016] Figure 5 shows a schematic diagram of an example of a scheduling and orchestration strategy according to some embodiments of the present disclosure;
[0017] Figure 6 illustrates a schematic diagram of an example of processing the workflow in Figure 2A using a scheduling and orchestration strategy according to some embodiments of the present disclosure;
[0018] Figure 7 illustrates a schematic diagram of another example of a scheduling and orchestration strategy according to some embodiments of the present disclosure;
[0019] Figure 8 shows a block diagram of an apparatus for data processing according to some embodiments of the present disclosure; and
[0020] Figure 9 shows a block diagram of an electronic device capable of implementing several embodiments of the present disclosure.
Detailed Implementation
[0021] Embodiments of this disclosure will now be described in more detail with reference to the accompanying drawings. While some embodiments of this disclosure are shown in the drawings, it should be understood that this disclosure can be implemented in various forms and should not be construed as limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure can be understood more thoroughly and completely. It should be understood that the accompanying drawings and embodiments of this disclosure are for illustrative purposes only and are not intended to limit the scope of protection of this disclosure.
[0022] It should be noted that the headings of any section / subsection provided herein are not limiting. Various embodiments are described throughout this document, and embodiments of any type may be included under any section / subsection. Furthermore, embodiments described in any section / subsection may be combined in any way with any other embodiments described in the same section / subsection and / or different sections / subsections.
[0023] In the description of embodiments of this disclosure, the term "comprising" and similar terms should be understood as open-ended inclusion, i.e., "including but not limited to". The term "based on" should be understood as "at least partially based on". The term "one embodiment" or "the embodiment" should be understood as "at least one embodiment". The term "some embodiments" should be understood as "at least some embodiments". The terms "first", "second", etc., may refer to different or the same objects. Other explicit and implicit definitions may also be included below.
[0024] The embodiments of this disclosure may involve user data, data acquisition, and / or use. All of these aspects comply with applicable laws, regulations, and relevant provisions. In the embodiments of this disclosure, all data collection, acquisition, processing, manipulation, forwarding, and use are conducted with the user's knowledge and confirmation. Accordingly, in implementing the embodiments of this disclosure, the type, scope of use, and usage scenarios of any data or information that may be involved should be communicated to the user and their authorization obtained in accordance with relevant laws and regulations through appropriate means. The specific methods of notification and / or authorization may vary depending on the actual situation and application scenario, and the scope of this disclosure is not limited in this respect.
[0025] In this specification and the embodiments, any processing of personal information will be carried out only under the premise of legality (such as obtaining the consent of the personal information subject, or being necessary for the performance of a contract), and will only be carried out within the scope stipulated or agreed upon. A user's refusal to process personal information other than that necessary for basic functions will not affect the user's use of basic functions.
[0026] The following will describe in detail various example implementations of this scheme with reference to the accompanying drawings.
[0027] Figure 1 illustrates a schematic diagram of an example environment 100 in which embodiments of the present disclosure can be implemented. As shown in Figure 1, in example environment 100, an application 120 is installed on an electronic device 110, and a user 130 can interact with the application 120 via the electronic device 110 and / or an attached device of the electronic device 110. In some implementations, the application 120 may be authorized to collect input from the user 130 (e.g., text information entered by the user 130) via the electronic device 110.
[0028] In some embodiments, application 120 may be downloaded and installed on electronic device 110. In some embodiments, application 120 may also be accessed in other ways, such as through a web page.
[0029] In embodiments of this disclosure, application 120 can be any suitable application with task processing capabilities, including but not limited to one or more of the following: chat application components (also known as instant messaging application components), browser application components, task application components, calendar application components, objectives and key results (OKR) application components, etc. It is understood that although a single application component is shown in Figure 1, multiple application components can actually be installed on the electronic device 110. In some embodiments, application 120 may include a multi-functional collaboration platform, such as an office collaboration platform (also known as an office suite) that integrates various types of business components to support office work, communication, and other activities. In a multi-functional collaboration platform, people can launch different business components as needed to complete the corresponding information processing, sharing, communication, etc.
[0030] In environment 100 of Figure 1, electronic device 110 can interact with other devices (not shown). For example, electronic device 110 can receive workflow 140 from other devices to enable application 120 to implement one or more functions corresponding to workflow 140. In some embodiments, during interaction between user 130 and application 120, application 120 can also generate workflows. Electronic device 110 can process the workflows generated by application 120 itself, or send the workflows to other devices to assist electronic device 110 in processing the workflows. It is understood that a workflow refers to the automation of part or all of a business process in a computer application environment, which can organize a set of tasks according to a predetermined triggering order and triggering conditions to complete a process. During workflow processing, electronic device 110 can generate scheduling and orchestration strategies for a set of tasks in the workflow to execute the corresponding set of tasks in the workflow based on the scheduling and orchestration strategies.
[0031] In some embodiments, electronic device 110 can be any type of computing device, including terminal devices or server devices. Terminal devices can be any type of mobile terminal, fixed terminal, or portable terminal, including mobile phones, desktop computers, laptop computers, notebook computers, netbook computers, tablet computers, media computers, multimedia tablets, personal communication system (PCS) devices, personal navigation devices, personal digital assistants (PDAs), audio / video players, digital cameras / camcorders, positioning devices, television receivers, radio receivers, e-book devices, gaming devices, or any combination thereof, including accessories and peripherals of these devices or any combination thereof. Server devices may include, for example, computing systems / servers, such as mainframes, edge computing nodes, computing devices in cloud environments, etc.
[0032] It should be understood that the structure and function of the various elements in environment 100 are described for illustrative purposes only and do not imply any limitation on the scope of this disclosure.
[0033] In engineering, the data processing flow for a specific business scenario can be structured through task decomposition and process orchestration. Task orchestration techniques allow a series of tasks with dependencies to be configured and then executed in a specific order, ultimately returning the information required by the business. Two common orchestration techniques are layer-driven orchestration and directed acyclic graphs (DAGs).
[0034] Taking a layered orchestration technique as an example, see Figure 2A, which illustrates a workflow example 200A. In example 200A, the workflow includes three tasks: A, B, and C. Task B depends on task A, meaning its execution requires the result of task A's execution. Task C has no other dependencies, meaning its execution does not require the result of either task A or task B. The execution time for task A is 50ms, for task B it is 100ms, and for task C it is 100ms. Ideally, the expected execution time for all three tasks (A, B, and C) is approximately 150ms, since tasks A and B run back to back (50ms + 100ms) while task C runs alongside them.
[0035] When processing the workflow in Example 200A using a hierarchical orchestration strategy, since task B depends on task A, tasks A and B need to be configured in different layers. Since task C has no other dependencies, it can be configured in the same layer as task A or task B. Refer to Figure 2B, which illustrates Example 200B of processing the workflow in Figure 2A using a hierarchical orchestration strategy. In Example 200B, task A can be configured to execute in one task group (e.g., task group A), task B can be configured to execute in another task group (e.g., task group B), and task C can be configured in the same layer as task A. In this case, the execution of task B depends on the execution result of task A, and that result can only be provided to task B after task group A has completed execution; that is, task group B cannot start until task group A has finished. Therefore, in Example 200B, the actual execution time of this workflow is approximately 200ms, which is a performance penalty relative to the ideal execution time.
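The 200ms figure can be reproduced with a small calculation. The following is a minimal sketch, not the applicant's implementation: it assumes that layers run one after another, that tasks inside a layer run in parallel, and that task C takes 100ms (the value used in the Figure 6 discussion). The names `layered_time` and `FIG_2B_LAYERS` are hypothetical.

```python
# Durations in milliseconds for the Figure 2A workflow. Task C is taken as
# 100 ms, the value used in the Figure 6 discussion (an assumption here).
DURATIONS_MS = {"A": 50, "B": 100, "C": 100}


def layered_time(layers, durations):
    """Total time when layers run one after another and the tasks inside a
    layer run in parallel: the sum of each layer's slowest task."""
    return sum(max(durations[task] for task in layer) for layer in layers)


# Figure 2B: task C shares a layer with task A; task B waits for that layer.
FIG_2B_LAYERS = [["A", "C"], ["B"]]

print(layered_time(FIG_2B_LAYERS, DURATIONS_MS))  # 200
```

Under these assumptions, task B cannot start until the whole first layer has finished, even though it only needs the result of task A.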
[0036] Figure 3A illustrates another example of a workflow, 300A. In example 300A, the workflow includes six tasks: A, B, C, D, E, and F. Task B has a weak dependency on the other tasks; even if task B fails, it does not affect the core logic (the execution of tasks A, C, D, E, and F). The dependencies in this workflow are as follows: task B depends on task A, task C depends on task A, task D depends on task C, task E depends on task D, and task F depends on tasks B and E. The execution times of tasks A, B, C, D, E, and F are 30ms, 200ms, 40ms, 40ms, 30ms, and 50ms, respectively. Ideally, the total execution time for all six tasks would be approximately 280ms, the length of the longest dependency chain (tasks A, B, and F: 30ms + 200ms + 50ms).
[0037] When the workflow in Example 300A is processed using a hierarchical orchestration strategy, the dependency chain (task C depends on task A, task D on task C, task E on task D, and task F on tasks B and E) requires tasks A, C, D, E, and F to be configured in different layers. Because task B depends on task A, is depended on by task F, and has only a weak dependency relationship with the other tasks, task B can be configured in the same layer as task C, task D, or task E. See Figure 3B, which illustrates Example 300B of processing the workflow in Figure 3A using a hierarchical orchestration strategy. In Example 300B, task B is configured in the same layer as task C. In this case, the entire workflow completes in approximately 350ms (30ms + 200ms + 40ms + 30ms + 50ms, since the layer shared by tasks B and C lasts as long as task B), and the non-core task B has to be executed in the core link, so a failure of task B blocks the core link.
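The gap between the ideal and the layered execution time of the Figure 3A workflow can be checked numerically. The sketch below is illustrative only: it takes the "ideal" time to be the longest dependency chain in the task graph and uses the layer layout described for Figure 3B. The function names are hypothetical.

```python
from functools import lru_cache

# Execution times (ms) and dependencies of the Figure 3A workflow.
DURATIONS_MS = {"A": 30, "B": 200, "C": 40, "D": 40, "E": 30, "F": 50}
DEPENDS_ON = {"A": [], "B": ["A"], "C": ["A"], "D": ["C"], "E": ["D"], "F": ["B", "E"]}


def critical_path_ms(durations, depends_on):
    """Earliest possible finish time if every task starts as soon as all of
    its dependencies are done (the longest chain through the graph)."""

    @lru_cache(maxsize=None)
    def finish(task):
        start = max((finish(dep) for dep in depends_on[task]), default=0)
        return start + durations[task]

    return max(finish(task) for task in durations)


def layered_ms(layers, durations):
    """Layers run sequentially; tasks inside a layer run in parallel."""
    return sum(max(durations[task] for task in layer) for layer in layers)


# Figure 3B: task B is placed in the same layer as task C.
FIG_3B_LAYERS = [["A"], ["B", "C"], ["D"], ["E"], ["F"]]

print(critical_path_ms(DURATIONS_MS, DEPENDS_ON))  # 280
print(layered_ms(FIG_3B_LAYERS, DURATIONS_MS))     # 350
```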
[0038] To address this, embodiments of this disclosure propose a data processing scheme. The scheme includes: acquiring a workflow to be processed, the workflow comprising multiple tasks, dependencies between the tasks, and the execution time of each task; determining a scheduling and orchestration strategy for the multiple tasks based on the dependencies and the execution time of each task, the scheduling and orchestration strategy indicating multiple first-level hierarchies and at least one second-level hierarchy under each first-level hierarchy, each second-level hierarchy comprising at least one task group, each task group comprising at least one of the multiple tasks, the multiple first-level hierarchies being configured for sequential execution, and each task group in each second-level hierarchy being configured for sequential or parallel execution; and scheduling the multiple tasks in the workflow for execution based on the scheduling and orchestration strategy. In this manner, the processing time of the workflow can be reduced and processing efficiency improved.
[0039] The following description will continue with reference to the accompanying drawings, which will provide some exemplary embodiments of this disclosure.
[0040] Figure 4 shows a flowchart of a process 400 for data processing according to some embodiments of the present disclosure. Process 400 may be implemented at electronic device 110. Process 400 will now be described with reference to the environment 100 of Figure 1.
[0041] In block 410, electronic device 110 determines the dependencies between multiple tasks in the workflow to be processed and the execution time of each task.
[0042] In some embodiments, the electronic device 110 may acquire the workflow to be processed and workflow information in any suitable manner, wherein the workflow information may indicate multiple tasks included in the workflow, and may also indicate the dependencies between multiple tasks and the execution time of each of the multiple tasks.
[0043] For example, electronic device 110 can receive workflows and workflow information sent by other devices. In this case, the workflow information can be determined by other devices and then sent to electronic device 110.
[0044] For example, electronic device 110 can generate corresponding workflows based on user input (e.g., user 130 can input a task request to application 120, and electronic device 110 can determine one or more corresponding workflows based on the user's input task request). In this case, electronic device 110 itself can determine the workflow information.
[0045] In some embodiments, workflow information may also include other appropriate information, such as the participants or components of the workflow, which is not limited by the embodiments of this disclosure.
[0046] In a workflow with multiple tasks, each task can serve as the smallest unit / granularity of the workflow, and each task can be referred to as a task atom. That is, each workflow can include multiple task atoms. In some embodiments, the electronic device 110 can break down a workflow into multiple tasks. For example, if the workflow instructs "Open the page and add comment XXX to the page," then the electronic device 110 can break down the workflow into task A "Open the page" and task B "Add comment XXX to the page."
[0047] In some embodiments, dependencies between multiple tasks can indicate the execution order of the tasks. For example, if task B depends on task A, task A must be executed first, followed by task B; if task B does not depend on task A, tasks A and B can be executed simultaneously or sequentially.
[0048] In some embodiments, the dependencies between multiple tasks can also indicate sequential execution relationships and / or parallel execution relationships. For example, if task B depends on task A, task A and task B can be determined to have a sequential execution relationship (i.e., one is executed first, then the other); if task B does not depend on task A and the two do not affect each other, task A and task B can be determined to have a parallel execution relationship (i.e., the two tasks can be executed simultaneously or sequentially).
[0049] In some embodiments, the electronic device 110 can determine the dependencies between multiple tasks based on the execution conditions (or execution parameters, etc.) required for each task. For example, the electronic device 110 can determine the execution conditions (also known as the inputs required by the task) and execution results (also known as the outputs of the task) corresponding to each of the multiple tasks, and can determine the dependencies between the multiple tasks based on the execution conditions and execution results corresponding to each task. For example, if task A is "open a page" and task B is "add comment XXX to the page", then task B needs to be executed after task A is successfully executed. It can be said that the execution conditions of task B include the execution result of task A. In this case, it can be determined that task B depends on task A.
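One way to picture this inference is to describe each task by the data it consumes and produces, and to treat task B as dependent on task A whenever task B consumes something task A produces. The sketch below is a simplified illustration under that assumption; the `Task` class, the `infer_dependencies` function, and the item names "page" and "comment" are hypothetical and do not come from the disclosure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    name: str
    inputs: frozenset = frozenset()   # execution conditions (required inputs)
    outputs: frozenset = frozenset()  # execution results (produced outputs)


def infer_dependencies(tasks):
    """Map each task name to the sorted names of the tasks whose outputs
    it needs as inputs."""
    producers = {}
    for task in tasks:
        for item in task.outputs:
            producers[item] = task.name
    return {
        task.name: sorted(
            {producers[i] for i in task.inputs
             if i in producers and producers[i] != task.name}
        )
        for task in tasks
    }


open_page = Task("A", outputs=frozenset({"page"}))
add_comment = Task("B", inputs=frozenset({"page"}), outputs=frozenset({"comment"}))

print(infer_dependencies([open_page, add_comment]))  # {'A': [], 'B': ['A']}
```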
[0050] In some embodiments, the execution time of each of the multiple tasks can be determined based on the execution time of historical tasks. For example, when determining the execution time of task A, electronic device 110 can determine the execution time of task A based on the execution of historical tasks of the same type as task A.
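The disclosure does not say how historical execution times are aggregated. As one possible illustration, the sketch below uses the median of past runs of the same task type, which is less sensitive to an occasional slow run than the mean. The choice of the median, the function name, and the sample data are all assumptions.

```python
from statistics import median


def estimate_execution_ms(task_type, history):
    """Estimate a task's execution time from past runs of the same type.
    The median is an assumed aggregation, not one named in the source."""
    samples = history.get(task_type)
    if not samples:
        raise ValueError(f"no historical runs recorded for {task_type!r}")
    return median(samples)


# Hypothetical history: past durations (ms) keyed by task type.
HISTORY_MS = {"open_page": [48, 50, 55, 49, 120]}

print(estimate_execution_ms("open_page", HISTORY_MS))  # 50
```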
[0051] In block 420, electronic device 110 determines a scheduling and orchestration strategy for the multiple tasks based on the dependencies and the execution time of each task. The scheduling and orchestration strategy indicates multiple first-level hierarchies and at least one second-level hierarchy under each first-level hierarchy. Each second-level hierarchy includes at least one task group, and each task group includes at least one of the multiple tasks. The multiple first-level hierarchies are configured to execute sequentially, and each task group in each second-level hierarchy is configured to execute sequentially or in parallel. In some embodiments, the tasks included in each task group may themselves be configured to execute sequentially or in parallel.
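The nested structure described above can be modeled with a few small data classes. This is a minimal sketch rather than the disclosed implementation: the class names, the task names `t1` to `t5`, and their durations are hypothetical, and the sketch assumes that sibling second-level hierarchies under one first-level hierarchy run in parallel, as in the Figure 6 discussion.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List


class Mode(Enum):
    SERIAL = "serial"
    PARALLEL = "parallel"


def combined_ms(mode: Mode, parts: List[int]) -> int:
    """Serial parts add up; parallel parts last as long as the slowest one."""
    return sum(parts) if mode is Mode.SERIAL else max(parts)


@dataclass
class TaskGroup:
    tasks: Dict[str, int]  # task name -> execution time in ms
    mode: Mode = Mode.SERIAL

    def duration_ms(self) -> int:
        return combined_ms(self.mode, list(self.tasks.values()))


@dataclass
class SecondLevel:
    groups: List[TaskGroup]
    mode: Mode = Mode.PARALLEL  # how the task groups relate to each other

    def duration_ms(self) -> int:
        return combined_ms(self.mode, [g.duration_ms() for g in self.groups])


@dataclass
class FirstLevel:
    second_levels: List[SecondLevel]

    def duration_ms(self) -> int:
        # Assumption: sibling second-level hierarchies run in parallel.
        return max(s.duration_ms() for s in self.second_levels)


def strategy_duration_ms(first_levels: List[FirstLevel]) -> int:
    """First-level hierarchies execute sequentially, so their times add up."""
    return sum(f.duration_ms() for f in first_levels)


STRATEGY = [
    FirstLevel([
        SecondLevel([TaskGroup({"t1": 50, "t2": 100}, Mode.SERIAL)]),
        SecondLevel([TaskGroup({"t3": 100})]),
    ]),
    FirstLevel([
        SecondLevel([TaskGroup({"t4": 30, "t5": 40}, Mode.PARALLEL)]),
    ]),
]

print(strategy_duration_ms(STRATEGY))  # 190
```

Here the first first-level hierarchy lasts 150ms (the slower of its two second-level hierarchies) and the second lasts 40ms, giving 190ms in total.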
[0052] In some embodiments, the electronic device 110 can select, based on the dependencies, a set of tasks with strong dependencies from the multiple tasks. It is understood that a strong dependency means that there is a close and indispensable connection between tasks, that is, one task can only begin execution after another task has completed. For example, if task A and task B have a strong dependency and task B depends on task A, task B can begin only after task A has completed its execution and the result of task A's execution is available to task B.
[0053] For example, electronic device 110 can determine the execution conditions and execution results of each of the multiple tasks, and can determine a group of tasks with strong dependencies between them based on the execution conditions and execution results of each of the multiple tasks.
[0054] Furthermore, the electronic device 110 can also determine multiple first-level hierarchies and at least one second-level hierarchy under each first-level hierarchy based on the execution time of each of the selected set of tasks. In some embodiments, the electronic device 110 can determine multiple first-level hierarchies based on the total execution time corresponding to multiple potential combinations of tasks in a set of tasks, each first-level hierarchy including one or more first tasks in a set of tasks.
[0055] In one example, Figure 5 illustrates a schematic diagram of an example 500 of a scheduling and orchestration strategy according to some embodiments of the present disclosure. As shown in Figure 5, example 500 includes first-level hierarchies 510-1, 510-2, and 510-3; first-level hierarchy 510-1 includes second-level hierarchies 520-1, 520-2, and 520-3; first-level hierarchy 510-2 includes second-level hierarchies 520-4, 520-5, and 520-6; and first-level hierarchy 510-3 includes second-level hierarchies 520-7, 520-8, and 520-9. For ease of description, first-level hierarchies 510-1, 510-2, and 510-3 may be collectively or individually referred to as first-level hierarchy 510, and second-level hierarchies 520-1 through 520-9 may be collectively or individually referred to as second-level hierarchy 520. Assume that the execution time of first-level hierarchy 510-1 is 100ms, the execution time of first-level hierarchy 510-2 is 200ms, and the execution time of first-level hierarchy 510-3 is 50ms. Assume also that the workflow obtained by electronic device 110 includes tasks A, B, C, and D, where task D depends on task B, task B depends on task A, and task C does not depend on any other task, and that the execution times of tasks A, B, C, and D are 50ms, 50ms, 150ms, and 140ms, respectively. Electronic device 110 can then first select, based on the strong dependencies between tasks, a set of tasks (here tasks A, B, and D) from tasks A, B, C, and D, and determine the potential combinations of the tasks in the set and the total execution time corresponding to each potential combination as follows:
[0056] Potential combination 1, task A, total execution time 50ms;
[0057] Potential combination 2, task B, total execution time 50ms;
[0058] Potential combination 3, task D, total execution time 140ms;
[0059] Potential combination 4, tasks A and B, total execution time 100ms;
[0060] Potential combination 5, tasks A and D, total execution time 190ms;
[0061] Potential combination 6, tasks B and D, total execution time 190ms;
[0062] Potential combination 7, tasks A, B, and D, total execution time 240ms.
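The seven potential combinations listed above are the non-empty subsets of the strongly dependent tasks A, B, and D, each with the sum of its members' execution times. A short sketch that enumerates them (the function name `potential_combinations` is hypothetical):

```python
from itertools import combinations

# Execution times (ms) of the strongly dependent tasks from the example.
DURATIONS_MS = {"A": 50, "B": 50, "D": 140}


def potential_combinations(durations):
    """List every non-empty subset of tasks with its total execution time,
    smallest subsets first."""
    names = sorted(durations)
    result = []
    for size in range(1, len(names) + 1):
        for combo in combinations(names, size):
            result.append((combo, sum(durations[task] for task in combo)))
    return result


for number, (combo, total) in enumerate(potential_combinations(DURATIONS_MS), start=1):
    print(f"Potential combination {number}: {', '.join(combo)} -> {total}ms")
```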
[0063] To ensure the processing efficiency of the workflow, the total execution time of the potential combination assigned to a first-level hierarchy 510 should not exceed the execution time of that first-level hierarchy (100ms for first-level hierarchy 510-1, 200ms for first-level hierarchy 510-2, and 50ms for first-level hierarchy 510-3). Therefore, based on the total execution times of the potential combinations and the execution time of each first-level hierarchy, electronic device 110 can preliminarily determine the following candidate scheduling and orchestration strategies, each indicating the first-level hierarchies 510 and the one or more first tasks assigned to each of them:
[0064] Scheduling and orchestration strategy 1:
[0065] First-level hierarchy 510-1: task A;
[0066] First-level hierarchy 510-2: tasks B and D.
[0067] Scheduling and orchestration strategy 2:
[0068] First-level hierarchy 510-1: task B;
[0069] First-level hierarchy 510-2: tasks A and D.
[0070] Scheduling and orchestration strategy 3:
[0071] First-level hierarchy 510-1: tasks A and B;
[0072] First-level hierarchy 510-2: task D.
[0073] For each of the multiple first-level hierarchies, the electronic device 110 can also determine at least one second-level hierarchy 520 under the first-level hierarchy 510 based on the execution time of the one or more first tasks assigned to that first-level hierarchy. Take scheduling and orchestration strategy 2 as an example, and assume that first-level hierarchy 510-2 includes three second-level hierarchies, with execution times of 140ms for second-level hierarchy 520-4, 50ms for second-level hierarchy 520-5, and 10ms for second-level hierarchy 520-6. Applying the same principle (a task's execution time should not exceed the execution time of the hierarchy it is assigned to), the electronic device 110 can determine that task D (140ms) is executed by second-level hierarchy 520-4 and task A (50ms) is executed by second-level hierarchy 520-5.
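The disclosure does not name an algorithm for matching tasks to second-level hierarchies. One simple heuristic that reproduces the assignment in this example is best fit: take the longest task first and place it in the tightest unused second-level hierarchy whose execution time still covers it. The sketch below assumes that heuristic and one task per second-level hierarchy; the function name is hypothetical.

```python
def assign_to_second_levels(task_ms, budget_ms):
    """Best-fit sketch: longest task first, each placed in the unused
    second-level hierarchy with the smallest execution time that fits."""
    free = dict(budget_ms)
    assignment = {}
    for task, ms in sorted(task_ms.items(), key=lambda item: -item[1]):
        candidates = [(budget, name) for name, budget in free.items() if budget >= ms]
        if not candidates:
            raise ValueError(f"no second-level hierarchy can hold task {task}")
        _, chosen = min(candidates)
        assignment[task] = chosen
        del free[chosen]
    return assignment


# Strategy 2, first-level hierarchy 510-2: tasks A (50ms) and D (140ms).
TASKS_MS = {"A": 50, "D": 140}
SECOND_LEVEL_MS = {"520-4": 140, "520-5": 50, "520-6": 10}

print(assign_to_second_levels(TASKS_MS, SECOND_LEVEL_MS))
# {'D': '520-4', 'A': '520-5'}
```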
[0074] In another example, Figure 6 illustrates a schematic diagram of processing the workflow in Figure 2A using a scheduling and orchestration strategy according to some embodiments of the present disclosure. As mentioned above, the actual execution time of the workflow shown in Figure 2A under the layered orchestration strategy is 200ms. The lower half of Figure 6 shows workflow example 200A from Figure 2A, and the upper half shows example 600, in which workflow example 200A is processed using the scheduling and orchestration strategy. As shown in Figure 6, tasks A and B are configured to be executed serially in one second-level hierarchy, task C is configured to be executed in another second-level hierarchy, and the two second-level hierarchies are configured to be executed in parallel. Since the execution time of task A is 50ms, the execution time of task B is 100ms, and the execution time of task C is 100ms, workflow example 200A completes in 150ms, which is 50ms less than with the layered orchestration strategy, thus improving the efficiency of workflow processing.
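The Figure 6 arrangement can be simulated with Python's `asyncio`, using sleeps in place of real work. This is an illustrative sketch only: the durations are those stated in this paragraph, and the helper names are hypothetical. One coroutine runs tasks A and B serially, another runs task C, and the two are awaited together.

```python
import asyncio
import time

# Execution times (ms) as stated for the Figure 6 example.
DURATIONS_MS = {"A": 50, "B": 100, "C": 100}


async def run_task(name, log):
    """Stand-in for real work: sleep for the task's duration, then record it."""
    await asyncio.sleep(DURATIONS_MS[name] / 1000)
    log.append(name)


async def run_serially(names, log):
    """One second-level hierarchy whose tasks execute one after another."""
    for name in names:
        await run_task(name, log)


async def run_figure_6():
    log = []
    start = time.perf_counter()
    # The two second-level hierarchies execute in parallel.
    await asyncio.gather(run_serially(["A", "B"], log), run_serially(["C"], log))
    elapsed_ms = (time.perf_counter() - start) * 1000
    return log, elapsed_ms


if __name__ == "__main__":
    log, elapsed_ms = asyncio.run(run_figure_6())
    print(log, f"about {elapsed_ms:.0f}ms")
```

The measured time should land near 150ms, the length of the serial chain A then B, rather than the 200ms of the layered arrangement; the exact figure varies with timer resolution and system load.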
[0075] In some embodiments, the electronic device 110 can select, based on the dependencies, at least one task from the plurality of tasks that has a weak dependency on other tasks. In contrast to a strong dependency, a weak dependency is a relatively loose connection between tasks, in which the completion of one task does not directly determine when another task can start. For example, suppose task B has only a weak dependency on task A, so that task B may or may not use the execution result of task A. If task A has completed when task B starts executing, task B can call the execution result of task A; if task A has not completed, task B can proceed without that result, and the effect on the final execution result is negligible.
[0076] For example, electronic device 110 can determine the execution conditions and execution results of each of the multiple tasks, and can determine at least one task among the multiple tasks that has a weak dependency relationship with other tasks based on the execution conditions and execution results of each of the multiple tasks.
[0077] Furthermore, the electronic device 110 can also determine the scheduling and orchestration strategy so that it indicates at least one branch, where the at least one branch is configured to execute asynchronously with the multiple first-level hierarchies and includes the at least one task. Configuring at least one branch to execute asynchronously with the multiple first-level hierarchies can also be called the "bypass task" mode. In this mode, the logic of the bypass task and that of the core task can be independent of each other, and the execution result of the bypass task does not affect the core task logic, which solves the problem of non-critical tasks blocking the core link. Moreover, in this mode, business developers do not need to be aware of or manage the additional fine-grained mechanisms, such as channels and locks, that bypass tasks require when developing atomic tasks; they only need to implement the logic of the core atomic task. The first-level hierarchy is responsible for allocating, recording, and releasing the related task information, thereby reducing the burden on business developers and improving both the reusability of the scheduling and orchestration strategy and code quality.
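The isolation property of the "bypass task" mode can be illustrated with a minimal sketch. It assumes a thread pool stands in for whatever asynchronous mechanism an implementation uses; `run_with_bypass` and the sample tasks are hypothetical names, not part of the disclosure. The task functions are plain callables with no locks or channels of their own, and a failure in the bypass task does not reach the core link.

```python
from concurrent.futures import ThreadPoolExecutor


def run_with_bypass(core_layers, bypass_task):
    """Run core layers serially while a bypass task runs asynchronously.

    Returns the core results and the exception raised by the bypass task
    (or None). The bypass outcome never alters the core results.
    """
    core_results = []
    pool = ThreadPoolExecutor(max_workers=1)
    try:
        bypass_future = pool.submit(bypass_task)  # does not block the core link
        for layer in core_layers:  # core layers execute one after another
            core_results.append(layer())
    finally:
        # The sketch waits here only so that the demo is deterministic.
        pool.shutdown(wait=True)
    return core_results, bypass_future.exception()


def failing_bypass():
    raise RuntimeError("bypass failed")


results, bypass_error = run_with_bypass(
    [lambda: "layer-1", lambda: "layer-2"],
    failing_bypass,
)
```

In this sketch the framework function owns the executor, which mirrors the idea that task authors implement only the task logic.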
[0078] In some embodiments, based on the input and output of a second task among the at least one task, the electronic device 110 may determine, in the scheduling and orchestration strategy, a third task for initiating the execution of the second task and a fourth task for invoking the execution result of the second task, both within the plurality of first-level hierarchies.
[0079] In some embodiments, the input to the second task can be the output of one of the multiple first-level hierarchies or the output of a specific task. The output of the second task can be the input of one of the multiple first-level hierarchies or the input of a specific task. The electronic device 110 can determine the initiating node and the callback node of the second task based on the inputs and outputs of the multiple first-level hierarchies and the inputs and outputs of the second task. It is understood that the initiating node can be set at the node that obtains the complete input required by the second task, or it can be any node after that node, as long as the initiating node is before the callback node. The callback node can be set at the node that requires the output of the second task. In some embodiments, the electronic device 110 can construct a third task for initiating the execution of the second task and a fourth task for calling the execution result of the second task based on the input and output nodes of the second task.
[0080] For example, Figure 7 illustrates another example 700 of a scheduling and orchestration strategy according to some embodiments of the present disclosure. As shown in Figure 7, example 700 includes first-level layers 710-1, 710-2, 710-3, 710-4, and 710-5 and a branch 720. Each first-level layer is configured with multiple tasks, and branch 720 is configured with a task that has a weak dependency on other tasks. For ease of discussion, a task that has a weak dependency on other tasks is referred to as an asynchronous task. If the input of the asynchronous task is the output of first-level layer 710-3 and its output is the input of first-level layer 710-5, then the initiating node of the asynchronous task can be set at the node where first-level layers 710-1, 710-2, and 710-3 have completed execution, and the callback node of the asynchronous task can be set at the node where first-level layer 710-5 begins execution.
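The placement of the two nodes in example 700 can be sketched as a list transformation. This is only an illustration under stated assumptions: the layer identifiers are reused as strings, `build_plan` is a hypothetical helper, and the initiating node is placed at the earliest permitted position, although the description allows any position before the callback node.

```python
from typing import List


def build_plan(layers: List[str], branch: str,
               input_from: str, output_to: str) -> List[str]:
    """Insert launch and callback markers for an asynchronous branch.

    input_from: layer whose output is the input of the asynchronous task.
    output_to:  layer whose input is the output of the asynchronous task.
    """
    produce = layers.index(input_from)
    consume = layers.index(output_to)
    if produce >= consume:
        raise ValueError("the initiating node must come before the callback node")
    plan = []
    for position, layer in enumerate(layers):
        if position == consume:
            plan.append(f"callback:{branch}")  # just before the consuming layer starts
        plan.append(layer)
        if position == produce:
            plan.append(f"launch:{branch}")  # right after the input becomes available
    return plan


layers = ["710-1", "710-2", "710-3", "710-4", "710-5"]
plan = build_plan(layers, "720", input_from="710-3", output_to="710-5")
print(plan)
```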
[0081] In some embodiments, the electronic device 110 can also select a set of tasks with strong dependencies and at least one task with weak dependencies from multiple tasks based on dependencies. In this case, the electronic device 110 can determine a scheduling and orchestration strategy for the set of tasks with strong dependencies and a scheduling and orchestration strategy for at least one task with weak dependencies based on the above process, and finally associate the two scheduling and orchestration strategies to obtain the final scheduling and orchestration strategy. It is understood that when associating the above two scheduling and orchestration strategies, the third and fourth tasks corresponding to the scheduling and orchestration strategy for at least one task with weak dependencies can be determined in the scheduling and orchestration strategy corresponding to the set of tasks with strong dependencies, thereby associating the two scheduling and orchestration strategies.
[0082] In some embodiments, a hierarchical structure may be pre-configured in the electronic device 110. This hierarchical structure may include multiple first-level layers and at least one second-level layer under each first-level layer. Each second-level layer includes at least one task group. The multiple first-level layers are configured to execute sequentially, and each task group in each second-level layer is configured to execute sequentially or in parallel. The electronic device 110 may select one or more first-level layers and at least one second-level layer under each selected first-level layer from the pre-configured hierarchical structure to execute multiple tasks in the workflow based on the dependencies between multiple tasks in the workflow and the execution time of each task. Detailed procedures can be found in the description above and will not be repeated here.
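One possible in-memory form of such a pre-configured hierarchical structure is sketched below. The class names, the `parallel` flag, and the choice to run the tasks inside one task group one after another are all assumptions made for illustration; the disclosure does not prescribe a concrete data model.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class TaskGroup:
    tasks: List[Callable[[], str]]  # assumed to run one after another


@dataclass
class SecondLevelLayer:
    groups: List[TaskGroup]
    parallel: bool  # True: task groups run in parallel; False: serially


@dataclass
class FirstLevelLayer:
    second_level: List[SecondLevelLayer]


def run_group(group: TaskGroup) -> List[str]:
    return [task() for task in group.tasks]


def run_second_level(layer: SecondLevelLayer, pool: ThreadPoolExecutor) -> List[str]:
    if layer.parallel:
        futures = [pool.submit(run_group, group) for group in layer.groups]
        per_group = [future.result() for future in futures]  # wait for every group
    else:
        per_group = [run_group(group) for group in layer.groups]
    return [name for names in per_group for name in names]


def run_hierarchy(first_level_layers: List[FirstLevelLayer]) -> List[List[str]]:
    finished = []
    with ThreadPoolExecutor(max_workers=4) as pool:
        for first in first_level_layers:  # first-level layers run sequentially
            names: List[str] = []
            for second in first.second_level:
                names.extend(run_second_level(second, pool))
            finished.append(names)
    return finished


def named(name: str) -> Callable[[], str]:
    return lambda: name  # stand-in for a real task body


hierarchy = [
    FirstLevelLayer([SecondLevelLayer(
        [TaskGroup([named("A")]), TaskGroup([named("B")])], parallel=True)]),
    FirstLevelLayer([SecondLevelLayer(
        [TaskGroup([named("C"), named("D")])], parallel=False)]),
]
order = run_hierarchy(hierarchy)
```

Results are collected in submission order, so the returned lists are stable even when groups run in parallel.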
[0083] Referring again to Figure 4, at block 430, the electronic device 110 schedules the multiple tasks in the workflow for execution based on the scheduling and orchestration strategy.
[0084] The electronic device 110 can schedule the set of tasks with strong dependencies in the workflow based on the execution order of the multiple first-level hierarchies in the scheduling and orchestration strategy and the execution order of the task groups in each second-level hierarchy under each first-level hierarchy. During task execution, the electronic device 110 can also schedule, through the third and fourth tasks, the at least one task in the workflow that has a weak dependency on other tasks. In some embodiments, during task execution, the electronic device 110 can, in response to receiving the input of the second task, cause the third task to initiate the execution of the second task and, in response to determining that the fourth task is being executed, invoke the execution result output by the second task. At that point, if the execution of the second task is completed, the execution result of the second task is invoked for the execution of the fourth task; if the execution of the second task is not completed, the invocation of the execution result of the second task is abandoned.
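The use-if-ready behavior of the third and fourth tasks can be sketched with Python's standard `concurrent.futures` module. The function names `third_task` and `fourth_task` mirror the roles in the text but are hypothetical, as are the `default` fallback value and the choice to treat a failed second task the same as an unfinished one.

```python
import threading
from concurrent.futures import Future, ThreadPoolExecutor


def third_task(pool: ThreadPoolExecutor, second_task, task_input) -> Future:
    """Initiate the second task once its input is available."""
    return pool.submit(second_task, task_input)


def fourth_task(pending: Future, default=None):
    """Use the second task's result if it is ready; otherwise abandon it."""
    if pending.done() and not pending.cancelled() and pending.exception() is None:
        return pending.result()
    return default  # not finished (or failed): do not wait, do not block


release = threading.Event()


def slow_second_task(value):
    release.wait()  # stays unfinished until the demo releases it
    return value * 2


with ThreadPoolExecutor(max_workers=2) as pool:
    fast = third_task(pool, lambda value: value + 1, 1)
    fast.result()  # the demo waits here so the "completed" case is deterministic
    used = fourth_task(fast)

    slow = third_task(pool, slow_second_task, 21)
    abandoned = fourth_task(slow)  # second task still running: result abandoned
    release.set()  # let the slow task finish so the pool can shut down
```

`Future.done()` returns immediately, so the check in `fourth_task` never makes the core link wait on the asynchronous branch.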
[0085] This approach, through nested first-level and second-level hierarchies, reduces workflow processing time and improves efficiency. Simultaneously, the branching pattern addresses the issue of non-critical tasks blocking core processes.
[0086] Embodiments of this disclosure also provide corresponding apparatus for implementing the methods or processes described above. Figure 8 shows a block diagram of an apparatus 800 for data processing according to some embodiments of this disclosure. The apparatus 800 may be implemented as or included in an electronic device 110. The various modules / components in the apparatus 800 may be implemented by hardware, software, firmware, or any combination thereof.
[0087] The apparatus 800 includes: an acquisition module 810 configured to determine the dependencies between multiple tasks in a workflow to be processed and the execution time of each task; a determination module 820 configured to determine a scheduling and orchestration strategy for the multiple tasks based on the dependencies and the execution time of each task, the scheduling and orchestration strategy indicating multiple first-level layers and at least one second-level layer under each first-level layer, each second-level layer including at least one task group, each task group including at least one task among multiple tasks, the multiple first-level layers being configured to be executed serially, and each task group in each second-level layer being configured to be executed serially or in parallel; and a scheduling module 830 configured to schedule the multiple tasks in the workflow for execution based on the scheduling and orchestration strategy.
[0088] In some embodiments, the determination module 820 is further configured to select a set of tasks with strong dependencies from the plurality of tasks based on the dependencies; and to determine the plurality of first-level layers and at least one second-level layer under each first-level layer based on the execution time of each task in the selected set of tasks.
[0089] In some embodiments, the determination module 820 is further configured to determine the plurality of first-level layers based on the total execution time corresponding to a plurality of potential combinations of tasks in the set of tasks, each first-level layer including one or more first tasks in the set of tasks; and, for each of the plurality of first-level layers, to determine at least one second-level layer under the first-level layer based on the execution time of the one or more first tasks assigned to the first-level layer.
[0090] In some embodiments, the determination module 820 is further configured to select, based on the dependencies, at least one task with a weak dependency on other tasks from the plurality of tasks; and to determine the scheduling and orchestration strategy to indicate at least one branch, the at least one branch being configured to execute asynchronously with the plurality of first-level layers and including the at least one task.
[0091] In some embodiments, the determination module 820 is further configured to determine, in the scheduling and orchestration strategy and based on the input and output of a second task in the at least one task, a third task in the plurality of first-level layers for initiating the execution of the second task and a fourth task for invoking the execution result of the second task.
[0092] In some embodiments, the scheduling module 830 is further configured to, during task execution, in response to receiving the input of the second task, cause the third task to initiate the execution of the second task; in response to determining that the fourth task is being executed, invoke the execution result output by the second task; if the execution of the second task is completed, invoke the execution result of the second task for the execution of the fourth task; and if the execution of the second task is not completed, abandon invoking the execution result of the second task.
[0093] In some embodiments, at least one of the multiple tasks included in each task group is configured to be executed serially or in parallel.
[0094] Figure 9 illustrates a block diagram of an electronic device 900 in which one or more embodiments of the present disclosure may be implemented. It should be understood that the electronic device 900 shown in Figure 9 is merely exemplary and should not be construed as limiting the functionality and scope of the embodiments described herein. The electronic device 900 shown in Figure 9 may be included in or implemented as the electronic device 110 of Figure 1.
[0095] As shown in Figure 9, the electronic device 900 is in the form of a general-purpose electronic device. Components of the electronic device 900 may include, but are not limited to, one or more processors or processing units 910, a memory 920, a storage device 930, one or more communication units 940, one or more input devices 950, and one or more output devices 960. The processing unit 910 may be a physical or virtual processor and is capable of performing various processes according to the program stored in the memory 920. In a multiprocessor system, multiple processing units execute computer-executable instructions in parallel to improve the parallel processing capability of the electronic device 900.
[0096] Electronic device 900 typically includes multiple computer storage media. Such media can be any available media accessible to electronic device 900, including but not limited to volatile and non-volatile media, removable and non-removable media. Memory 920 can be volatile memory (e.g., registers, cache, random access memory (RAM)), non-volatile memory (e.g., read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory), or some combination thereof. Storage device 930 can be removable or non-removable media and can include machine-readable media, such as flash drives, disks, or any other media that can be used to store information and / or data and can be accessed within electronic device 900.
[0097] Electronic device 900 may further include additional removable / non-removable, volatile / non-volatile storage media. Although not shown in Figure 9, disk drives for reading from or writing to removable, non-volatile disks (e.g., "floppy disks") and optical disk drives for reading from or writing to removable, non-volatile optical disks may be provided. In these cases, each drive may be connected to a bus (not shown) via one or more data media interfaces. Memory 920 may include computer program product 925 having one or more program modules configured to perform various methods or actions of various embodiments of the present disclosure.
[0098] The communication unit 940 enables communication with other electronic devices via a communication medium. Additionally, the functionality of the components of the electronic device 900 can be implemented using a single computing cluster or multiple computing machines capable of communicating via communication connections. Therefore, the electronic device 900 can operate in a networked environment using logical connections to one or more other servers, network personal computers (PCs), or another network node.
[0099] Input device 950 can be one or more input devices, such as a mouse, keyboard, trackball, etc. Output device 960 can be one or more output devices, such as a monitor, speaker, printer, etc. As needed, electronic device 900 can also communicate via communication unit 940 with one or more external devices (not shown), such as storage devices and display devices, with one or more devices that enable a user to interact with electronic device 900, or with any device (e.g., network card, modem, etc.) that enables electronic device 900 to communicate with one or more other electronic devices. Such communication can be performed via an input / output (I / O) interface (not shown).
[0100] According to an exemplary implementation of this disclosure, a computer-readable storage medium is provided that stores computer-executable instructions thereon, wherein the computer-executable instructions are executed by a processor to implement the methods described above. According to an exemplary implementation of this disclosure, a computer program product is also provided, which is tangibly stored on a non-transitory computer-readable medium and includes computer-executable instructions, which are executed by a processor to implement the methods described above.
[0101] Various aspects of this disclosure are described herein with reference to flowchart illustrations and / or block diagrams of methods, apparatuses, devices, and computer program products implemented according to this disclosure. It should be understood that each block of the flowchart illustrations and / or block diagrams, and combinations of blocks in the flowchart illustrations and / or block diagrams, can be implemented by computer-readable program instructions.
[0102] These computer-readable program instructions can be provided to a processing unit of a general-purpose computer, a special-purpose computer, or other programmable data processing apparatus to produce a machine such that, when executed by the processing unit of the computer or other programmable data processing apparatus, they create means for implementing the functions / actions specified in one or more blocks of the flowchart and / or block diagram. These computer-readable program instructions can also be stored in a computer-readable storage medium that causes a computer, programmable data processing apparatus, and / or other device to operate in a particular manner. Thus, the computer-readable medium storing the instructions comprises an article of manufacture that includes instructions for implementing aspects of the functions / actions specified in one or more blocks of the flowchart and / or block diagram.
[0103] Computer-readable program instructions can be loaded onto a computer, other programmable data processing apparatus, or other device to cause a series of operational steps to be performed on the computer, other programmable data processing apparatus, or other device to produce a computer-implemented process, thereby causing the instructions that execute on the computer, other programmable data processing apparatus, or other device to perform the functions / actions specified in one or more blocks of a flowchart and / or block diagram.
[0104] The flowcharts and block diagrams in the accompanying drawings illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of this disclosure. In this regard, each block in a flowchart or block diagram may represent a module, program segment, or portion of instructions, which contains one or more executable instructions for implementing the specified logical function. In some alternative implementations, the functions indicated in the blocks may occur in a different order than that indicated in the drawings. For example, two consecutive blocks may actually be executed substantially in parallel, and they may sometimes be executed in reverse order, depending on the functions involved. It should also be noted that each block in the block diagrams and / or flowcharts, and combinations of blocks in the block diagrams and / or flowcharts, may be implemented using a dedicated hardware-based system that performs the specified function or action, or using a combination of dedicated hardware and computer instructions.
[0105] Various implementations of this disclosure have been described above. These descriptions are exemplary and not exhaustive, and they are not limited to the disclosed implementations. Many modifications and variations will be apparent to those skilled in the art without departing from the scope and spirit of the described implementations. The terminology used herein is chosen to best explain the principles of the implementations, their practical applications, or their technical improvements over technologies found in the market, or to enable others skilled in the art to understand the various implementations disclosed herein.
Claims
1. A method for data processing, comprising: determining the dependencies between multiple tasks in a workflow to be processed and the execution time of each of the multiple tasks; determining, based on the dependencies and the execution time of each of the multiple tasks, a scheduling and orchestration strategy for the multiple tasks, wherein the scheduling and orchestration strategy indicates multiple first-level layers and at least one second-level layer under each first-level layer, each second-level layer includes at least one task group, each task group includes at least one task among the multiple tasks, the multiple first-level layers are configured to be executed serially, and each task group in each second-level layer is configured to be executed serially or in parallel; and scheduling, based on the scheduling and orchestration strategy, the multiple tasks in the workflow for execution.
2. The method according to claim 1, wherein determining the scheduling and orchestration strategy includes: selecting, based on the dependencies, a set of tasks with strong dependencies from the plurality of tasks; and determining the plurality of first-level layers and the at least one second-level layer under each first-level layer based on the execution time of each task in the selected set of tasks.
3. The method of claim 2, wherein determining the plurality of first-level layers and at least one second-level layer under each first-level layer based on the execution time of each of the selected set of tasks comprises: determining the plurality of first-level layers based on the total execution time corresponding to multiple potential combinations of tasks in the set of tasks, each first-level layer including one or more first tasks in the set of tasks; and, for each of the plurality of first-level layers, determining at least one second-level layer under the first-level layer based on the execution time of the one or more first tasks assigned to the first-level layer.
4. The method according to claim 1, wherein determining the scheduling and orchestration strategy further comprises: selecting, based on the dependencies, at least one task from the plurality of tasks that has a weak dependency relationship with other tasks; and determining the scheduling and orchestration strategy to indicate at least one branch, the at least one branch being configured to execute asynchronously with the plurality of first-level layers and including the at least one task.
5. The method according to claim 4, wherein determining the scheduling and orchestration strategy includes: determining, in the scheduling and orchestration strategy and based on the input and output of a second task in the at least one task, a third task for initiating the execution of the second task and a fourth task for invoking the execution result of the second task.
6. The method of claim 5, wherein scheduling the plurality of tasks in the workflow for execution based on the scheduling and orchestration strategy comprises: during task execution, in response to receiving the input of the second task, causing the third task to initiate the execution of the second task; in response to determining that the fourth task is being executed, invoking the execution result output by the second task; if the execution of the second task is completed, invoking the execution result of the second task for the execution of the fourth task; and if the execution of the second task is not completed, abandoning the invocation of the execution result of the second task.
7. The method of claim 1, wherein at least one of the plurality of tasks contained in each task group is configured to be executed serially or in parallel.
8. An apparatus for data processing, comprising: an acquisition module configured to determine the dependencies between multiple tasks in a workflow to be processed and the execution time of each of the multiple tasks; a determination module configured to determine a scheduling and orchestration strategy for the multiple tasks based on the dependencies and the execution time of each of the multiple tasks, wherein the scheduling and orchestration strategy indicates multiple first-level layers and at least one second-level layer under each first-level layer, each second-level layer includes at least one task group, each task group includes at least one task among the multiple tasks, the multiple first-level layers are configured to be executed serially, and each task group in each second-level layer is configured to be executed serially or in parallel; and a scheduling module configured to schedule the multiple tasks in the workflow for execution based on the scheduling and orchestration strategy.
9. An electronic device, comprising: at least one processing unit; and at least one memory coupled to the at least one processing unit and storing instructions for execution by the at least one processing unit, wherein the instructions, when executed by the at least one processing unit, cause the electronic device to perform the method according to any one of claims 1 to 7.
10. A computer-readable storage medium having a computer program stored thereon, the computer program being executable by a processor to implement the method according to any one of claims 1 to 7.
11. A computer program product comprising computer-executable instructions, wherein the computer-executable instructions, when executed by a processor, implement the method according to any one of claims 1 to 7.