AI model cluster scheduling method and platform supporting multi-scenario task priority

An AI model cluster scheduling method addresses the inability to adjust task priorities dynamically in multi-scenario task scheduling, with the aim of efficient resource utilization and task execution across a range of business scenarios.

CN122240310A · Pending · Publication Date: 2026-06-19 · HENAN ZHIMA INTERACTIVE TECHNOLOGY CO LTD

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications (China)
Current Assignee / Owner
HENAN ZHIMA INTERACTIVE TECHNOLOGY CO LTD
Filing Date
2026-03-19
Publication Date
2026-06-19

AI Technical Summary

Technical Problem

Existing scheduling schemes lack a unified scenario adaptation mechanism for multi-scenario task scheduling, so task priorities cannot be adjusted dynamically. This leads to task timeouts or wasted resources and fails to meet diversified business needs.

Method used

An AI model cluster scheduling method that supports multi-scenario task priorities is adopted. Through scenario feature extraction, dynamic priority calculation, resource pre-scheduling and dynamic adjustment, combined with multi-dimensional weighted algorithms and bidirectional resource collection strategies, efficient task scheduling is achieved.

Benefits of technology

It enables efficient task scheduling in different business scenarios, avoids task timeouts and resource waste, supports multiple business needs, and dynamically adjusts task priorities to meet the urgency and resource utilization of different scenarios.



Abstract

This invention belongs to the field of computer distributed systems technology and discloses an AI model cluster scheduling method and platform that supports multi-scenario task prioritization, including scenario feature extraction, dynamic priority calculation, resource pre-scheduling, and dynamic adjustment. In real-time transaction scenarios the invention prioritizes avoiding timeouts, in batch processing scenarios it prioritizes saving resources, and in emergency scenarios it prioritizes rapid execution. A single system can meet different business needs without requiring a separate scheduling scheme for each scenario. Resources are broken down into resource blocks for precise allocation: high-priority tasks are guaranteed their resources, while low-priority tasks share idle resources, which prevents some nodes from being fully loaded while others sit idle and is beneficial for practical application and operation.

Description

Technical Field

[0001] This invention relates to the field of computer distributed systems technology, and in particular to an AI model cluster scheduling method and platform that supports task priority in multiple scenarios.

Background Technology

[0002] With the rapid development of digital business, distributed clusters have become the core infrastructure supporting large-scale task processing. Their core function is to allocate massive numbers of tasks rationally to cluster nodes through scheduling algorithms, thereby achieving efficient utilization of hardware resources such as CPU, memory, and I/O.

[0003] In practical applications, existing scheduling schemes are often designed for a single business scenario. For example, real-time task scheduling focuses only on "low latency," and batch processing scheduling focuses only on "high resource utilization," so a unified scenario adaptation mechanism is lacking. When the same cluster needs to support multiple scenarios, scheduling parameters must be adjusted manually or multiple scheduling systems must be built, which is cumbersome, prone to conflicts between scenarios, and unable to meet the needs of diversified business development. In existing technologies, task priorities are mostly calculated using fixed weights (such as preset priorities based only on business type), so the weight ratio of the evaluation dimensions cannot be adjusted dynamically according to scenario characteristics. For example, in real-time transaction scenarios the weight of "remaining time limit" is not strengthened, which can easily lead to task timeouts; in emergency scenarios the weight of "business value" is not increased, so core needs cannot be answered quickly. Both shortcomings hinder practical application and operation.

Summary of the Invention

[0004] One of the objectives of this invention is to provide an AI model cluster scheduling method and platform that supports multi-scenario task prioritization.

[0005] To achieve the above objective, the technical solution adopted by this invention is an AI model cluster scheduling method and platform that supports multi-scenario task priorities, including scenario feature extraction, dynamic priority calculation, resource pre-scheduling, and dynamic adjustment:

[0006] S1. Scene Feature Extraction: Receive task submission requests, extract the task's scene identifier, resource requirement parameters, time constraints, and business value tags, match them with a preset scene feature library, and generate scene adaptation results.

[0007] S2. Dynamic Priority Calculation: Based on the scenario adaptation result, the corresponding priority rule set is called, and the priority score is calculated through a multi-dimensional weighted algorithm. The multi-dimensional factors include business value weight (0.4-0.6), remaining time coefficient (0.2-0.3), and resource utilization rate (0.1-0.3), wherein the business value weight is dynamically adjusted according to the scenario.

[0008] S3. Resource Pre-scheduling: Real-time available resource data of cluster nodes is obtained through a two-way resource collection strategy, and, for tasks taken in order of priority score, nodes are filtered to select a set of candidate nodes that meet the resource requirements.

[0009] S4. Scheduling Execution and Feedback: The resource allocation mechanism is used to divide the resources of the candidate node set into resource segments that are adapted to the tasks. The tasks are scheduled in order of priority, and the task execution status and resource usage data are recorded at the same time.

[0010] S5. Dynamic Adjustment: Monitor node load and task waiting time. When the load exceeds the preset threshold or a low-priority task times out, trigger priority reassessment and resource reallocation.
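
As an illustration of the weighted calculation in step S2, the following is a minimal sketch rather than the patented implementation. It assumes that each factor is normalized to [0, 1], that the three weights of a scenario sum to 1, and that the score is a plain weighted sum; the source states only the weight ranges. The names SCENARIO_WEIGHTS, Task, and priority_score, the scenario keys, and the specific weight values are hypothetical.

```python
from dataclasses import dataclass

# Hypothetical per-scenario weights, each picked from the ranges stated in S2:
# business value 0.4-0.6, remaining time 0.2-0.3, resource utilization 0.1-0.3.
SCENARIO_WEIGHTS = {
    "realtime":  {"value": 0.4, "time": 0.3, "resource": 0.3},
    "batch":     {"value": 0.5, "time": 0.2, "resource": 0.3},
    "emergency": {"value": 0.6, "time": 0.3, "resource": 0.1},
}


@dataclass
class Task:
    scenario: str
    business_value: float        # assumed normalized to [0, 1]
    time_urgency: float          # assumed normalized; 1.0 means the deadline is imminent
    resource_utilization: float  # assumed normalized to [0, 1]


def priority_score(task: Task) -> float:
    """Weighted sum of the three factors using the task's scenario weights."""
    w = SCENARIO_WEIGHTS[task.scenario]
    return (w["value"] * task.business_value
            + w["time"] * task.time_urgency
            + w["resource"] * task.resource_utilization)
```

Under these assumed weights, an emergency task with business value 1.0, time urgency 0.5, and resource factor 0.0 scores 0.6 * 1.0 + 0.3 * 0.5 = 0.75.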

[0011] Preferably, the scenario feature library includes time-priority rules for real-time transaction scenarios, resource optimization rules for batch processing scenarios, and business-priority rules for emergency scenarios, and supports dynamic updates of the rule set.
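
The scenario feature library with a dynamically updatable rule set could be modeled as a simple lookup table, as in the sketch below. The scene identifiers, the rule-set names, the class SceneFeatureLibrary, and the fallback to a default rule set for unknown scenes are all assumptions made for illustration; the source does not specify how matching is performed.

```python
from dataclasses import dataclass, field
from typing import Dict


def _default_rules() -> Dict[str, str]:
    # The three scenarios named in the text, mapped to hypothetical rule-set names.
    return {
        "realtime_transaction": "time_priority",
        "batch_processing": "resource_optimization",
        "emergency": "business_priority",
    }


@dataclass
class SceneFeatureLibrary:
    rules: Dict[str, str] = field(default_factory=_default_rules)

    def update_rule(self, scene_id: str, rule_set: str) -> None:
        """Add or replace a rule at runtime (the 'dynamic update' in the text)."""
        self.rules[scene_id] = rule_set

    def match(self, scene_id: str, default: str = "resource_optimization") -> str:
        """Return the rule set for a scene; the default fallback is an assumption."""
        return self.rules.get(scene_id, default)
```

Adding a new business scenario then amounts to one update_rule call, which mirrors the stated benefit that new scenarios need only new rules rather than a rebuilt system.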

[0012] Preferably, the bidirectional resource collection strategy includes uplink active reporting and downlink timed collection;

[0013] The uplink active reporting involves deploying a monitoring agent on cluster nodes to detect resource topology changes in real time and report updated data.

[0014] The downlink timed collection involves the scheduling system sending resource query instructions to all nodes every 1-5 minutes to verify and supplement resource data.
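
One way to combine the two collection directions is to keep a single cluster resource view that accepts both pushed and polled reports. The sketch below assumes a "newest timestamp wins" merge policy, which the source does not specify; ResourceReport, ClusterResourceView, and their fields are hypothetical names.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass
class ResourceReport:
    node: str
    free_cpu: float      # free CPU cores
    free_mem_gb: float   # free memory in GB
    timestamp: float     # time the sample was taken, in seconds


class ClusterResourceView:
    """Keeps the most recent report per node (assumed merge policy)."""

    def __init__(self) -> None:
        self._latest: Dict[str, ResourceReport] = {}

    def on_uplink_report(self, report: ResourceReport) -> None:
        # Pushed by a node's monitoring agent when its resources change.
        self._merge(report)

    def on_downlink_poll(self, reports: Iterable[ResourceReport]) -> None:
        # Results of the scheduler's periodic query (every 1-5 minutes).
        for report in reports:
            self._merge(report)

    def _merge(self, report: ResourceReport) -> None:
        current = self._latest.get(report.node)
        if current is None or report.timestamp >= current.timestamp:
            self._latest[report.node] = report

    def get(self, node: str) -> Optional[ResourceReport]:
        return self._latest.get(node)
```

With this policy a poll result never overwrites a fresher push, while nodes whose agents have not reported are still filled in by the poll, which is one reading of "verify and supplement".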

[0015] Preferably, the resource allocation mechanism includes:

[0016] T1. Decompose node resources into independent resource blocks for CPU, memory, and I/O;

[0017] T2. High-priority tasks are pre-allocated and locked, while low-priority tasks are dynamically allocated using a shared resource pool.

[0018] T3. When resources are insufficient, only high-priority tasks are allowed to preempt resources from medium-priority tasks that do not hold critical resources.
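
The preemption rule in T3 can be expressed as a small predicate plus a victim-selection loop. This is a sketch under simplifying assumptions: only a CPU block is tracked, three priority levels exist, and victims are taken in list order. Priority, RunningTask, can_preempt, and pick_victims are hypothetical names.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import List


class Priority(IntEnum):
    LOW = 0
    MEDIUM = 1
    HIGH = 2


@dataclass
class RunningTask:
    name: str
    priority: Priority
    holds_critical: bool  # True if the task holds critical resources
    cpu: float            # size of the CPU block the task currently holds


def can_preempt(requester: Priority, victim: RunningTask) -> bool:
    """T3: only HIGH may preempt, and only MEDIUM tasks without critical resources."""
    return (requester == Priority.HIGH
            and victim.priority == Priority.MEDIUM
            and not victim.holds_critical)


def pick_victims(requester: Priority, needed_cpu: float,
                 running: List[RunningTask]) -> List[RunningTask]:
    """Return tasks to preempt, or an empty list if the demand cannot be met."""
    victims: List[RunningTask] = []
    freed = 0.0
    for task in running:
        if freed >= needed_cpu:
            break
        if can_preempt(requester, task):
            victims.append(task)
            freed += task.cpu
    return victims if freed >= needed_cpu else []
```

Returning an empty list when the demand cannot be fully met avoids evicting medium-priority work without actually unblocking the high-priority task; that all-or-nothing behavior is a design choice of this sketch, not something the source states.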

[0019] Preferably, the dynamic adjustment step includes an anti-starvation mechanism: for low-priority tasks whose waiting time exceeds a preset threshold, their business value weight is increased by 10% every ten minutes, up to a maximum of 150% of the baseline value.
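
The anti-starvation rule can be read as a capped step function of waiting time. The sketch below assumes that the increase is additive (10% of the baseline per completed ten-minute interval after the threshold) rather than compounding; the source does not say which. The function name boosted_weight and its parameters are hypothetical.

```python
def boosted_weight(baseline: float, waited_min: float, threshold_min: float,
                   step: float = 0.10, interval_min: float = 10.0,
                   cap: float = 1.5) -> float:
    """Business value weight after anti-starvation boosting.

    Adds `step` (10%) of the baseline for every full `interval_min` (10 minutes)
    waited beyond `threshold_min`, capped at `cap` (150%) of the baseline.
    """
    if waited_min <= threshold_min:
        return baseline
    intervals = int((waited_min - threshold_min) // interval_min)
    factor = min(1.0 + step * intervals, cap)
    return baseline * factor
```

Under this reading the cap is reached after five intervals, that is, 50 minutes past the threshold. With a baseline of 0.4 the capped weight is 0.6, which coincides with the upper end of the business value weight range given in S2.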

[0020] Preferably, the platform includes a scene adaptation module, a priority calculation module, a resource management module, a scheduling execution module, and a dynamic monitoring module:

[0021] Q1. Scene Adaptation Module: Stores scene feature library and rule set, performs task scene matching and rule invocation;

[0022] Q2. Priority Calculation Module: Built-in multi-dimensional weighted algorithm, receives scene adaptation results and outputs priority scores;

[0023] Q3. Resource Management Module: Maintains a cluster resource view through a two-way collection strategy, enabling resource segment partitioning and allocation;

[0024] Q4. Scheduling Execution Module: Includes cluster filters and resource allocators, and executes scheduling logic according to priority;

[0025] Q5. Dynamic monitoring module: Collects node load and task status in real time and triggers scheduling adjustment commands.

[0026] Preferably, the resource management module is equipped with a resource recycling mechanism: resources are released immediately upon completion of a task, and a resource rollback process is initiated for abnormally terminated tasks, completing resource cleanup and updating the resource view within 10 minutes.
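
A minimal sketch of the recycling mechanism is given below, tracking only CPU for brevity. It assumes that release is idempotent so that a rollback can be retried safely, and that the 10-minute bound is checked against the time of abnormal termination; ResourcePool, rollback_overdue, and ROLLBACK_DEADLINE_S are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Dict

ROLLBACK_DEADLINE_S = 10 * 60  # the 10-minute cleanup bound stated in the text


@dataclass
class ResourcePool:
    free_cpu: float
    held: Dict[str, float] = field(default_factory=dict)  # task id -> CPU held

    def allocate(self, task_id: str, cpu: float) -> bool:
        """Reserve CPU for a task; returns False if not enough is free."""
        if cpu > self.free_cpu:
            return False
        self.free_cpu -= cpu
        self.held[task_id] = self.held.get(task_id, 0.0) + cpu
        return True

    def release(self, task_id: str) -> float:
        """Return a task's CPU to the pool; calling it twice is harmless."""
        cpu = self.held.pop(task_id, 0.0)
        self.free_cpu += cpu
        return cpu


def rollback_overdue(terminated_at: float, now: float) -> bool:
    """True if cleanup for an abnormally terminated task has missed the bound."""
    return now - terminated_at > ROLLBACK_DEADLINE_S
```

The same release call serves both normal completion and rollback; rollback_overdue gives a monitoring component a way to flag cleanups that exceed the stated bound.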

[0027] Preferably, the dynamic monitoring module includes three levels of early warning thresholds: a yellow warning is triggered when the node CPU utilization exceeds 80%, an orange warning is triggered when it exceeds 90%, and a red warning is triggered when it exceeds 95%, forcibly initiating task migration.
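
The three warning thresholds map directly onto a small classification function. The sketch treats "exceeds" as a strict comparison, so exactly 80% raises no warning; the level names are taken from the text, while warning_level and must_migrate are hypothetical function names.

```python
def warning_level(cpu_util_pct: float) -> str:
    """Classify node CPU utilization (in percent) into the warning levels."""
    if cpu_util_pct > 95:
        return "red"
    if cpu_util_pct > 90:
        return "orange"
    if cpu_util_pct > 80:
        return "yellow"
    return "none"


def must_migrate(cpu_util_pct: float) -> bool:
    """Task migration is forced only at the red level."""
    return warning_level(cpu_util_pct) == "red"
```

Checking the highest threshold first keeps each branch a single comparison and makes the levels mutually exclusive.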

[0028] Compared with the prior art, the beneficial effects of the present invention are as follows:

[0029] (1) In real-time transaction scenarios this invention prioritizes avoiding timeouts, in batch processing scenarios it prioritizes saving resources, and in emergency scenarios it prioritizes rapid execution. One system can meet different business needs without a separate scheduling scheme having to be built for each scenario. Resources are broken down into resource blocks for precise allocation. High-priority tasks are guaranteed their resources and low-priority tasks can share idle resources, which avoids the situation where some nodes are fully loaded while others are idle.

[0030] (2) This invention automatically increases the weight of low-priority tasks when they time out, so that they do not remain stuck in the queue; it thereby takes into account both the urgency of high-priority tasks and the necessity of low-priority tasks. Resource data is collected in both directions so that scheduling decisions are not skewed by inaccurate resource data. A warning is issued when node load exceeds a threshold, and at the highest level tasks are migrated automatically to avoid node crashes. The resources of abnormally terminated tasks can be recovered within 10 minutes without affecting other tasks. The scenario rule base supports dynamic updates: when new business scenarios are added in the future, the system does not need to be rebuilt, and only the corresponding rules need to be added.

Attached Figure Description

[0031] Figure 1 is a schematic diagram of the overall process structure of the present invention.

Detailed Implementation

[0032] The present invention will now be further described in conjunction with specific embodiments. It should be noted that, without conflict, the various embodiments or technical features described below can be arbitrarily combined to form new embodiments.

[0033] In the description of this invention, it should be noted that directional terms such as "center," "lateral," "longitudinal," "length," "width," "thickness," "upper," "lower," "front," "rear," "left," "right," "vertical," "horizontal," "top," "bottom," "inner," "outer," "clockwise," and "counterclockwise" indicate the orientation and positional relationship based on the orientation or positional relationship shown in the accompanying drawings. They are only for the convenience of describing this invention and simplifying the description, and do not indicate or imply that the device or element referred to must have a specific orientation, or be constructed and operated in a specific orientation. They should not be construed as limiting the specific protection scope of this invention.

[0034] It should be noted that the terms "first" and "second" in the specification and claims of this invention are used to distinguish similar objects and are not necessarily used to describe a specific order or sequence.

[0035] In one preferred embodiment of the present invention, as shown in Figure 1, an AI model cluster scheduling method and platform that supports multi-scenario task priorities includes scenario feature extraction, dynamic priority calculation, resource pre-scheduling, and dynamic adjustment.

[0036] S1. Scene Feature Extraction: Receive task submission requests, extract the task's scene identifier, resource requirement parameters, time constraints, and business value tags, match them with a preset scene feature library, and generate scene adaptation results.

[0037] S2. Dynamic Priority Calculation: Based on the scenario adaptation result, the corresponding priority rule set is called, and the priority score is calculated through a multi-dimensional weighted algorithm. The multi-dimensional factors include business value weight (0.4-0.6), remaining time coefficient (0.2-0.3), and resource utilization rate (0.1-0.3). The business value weight is dynamically adjusted according to the scenario.

[0038] S3. Resource Pre-scheduling: Real-time available resource data of cluster nodes is obtained through a two-way resource collection strategy, and, for tasks taken in order of priority score, nodes are filtered to select a set of candidate nodes that meet the resource requirements.

[0039] S4. Scheduling Execution and Feedback: The resource allocation mechanism is used to divide the resources of the candidate node set into resource segments that are adapted to the tasks. The tasks are scheduled in order of priority, and the task execution status and resource usage data are recorded at the same time.

[0040] S5. Dynamic Adjustment: Monitor node load and task waiting time. When the load exceeds the preset threshold or a low-priority task times out, trigger priority reassessment and resource reallocation.
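
Step S3 can be pictured as a filter over the cluster resource view, applied to tasks in descending order of priority score. The sketch below assumes that a node qualifies when its free CPU, memory, and I/O each cover the task's demand; Node, Demand, candidate_nodes, and preschedule are hypothetical names, and the units are illustrative.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Node:
    name: str
    free_cpu: float
    free_mem_gb: float
    free_io_mbps: float


@dataclass
class Demand:
    cpu: float
    mem_gb: float
    io_mbps: float


def candidate_nodes(demand: Demand, nodes: List[Node]) -> List[Node]:
    """Nodes whose free CPU, memory and I/O all cover the demand."""
    return [n for n in nodes
            if n.free_cpu >= demand.cpu
            and n.free_mem_gb >= demand.mem_gb
            and n.free_io_mbps >= demand.io_mbps]


def preschedule(tasks: List[Tuple[str, float, Demand]],
                nodes: List[Node]) -> List[Tuple[str, List[str]]]:
    """For (name, score, demand) tuples, list candidates in score order."""
    ordered = sorted(tasks, key=lambda t: t[1], reverse=True)
    return [(name, [n.name for n in candidate_nodes(demand, nodes)])
            for name, _, demand in ordered]
```

The sketch only selects candidates; dividing the chosen node's resources into segments and binding the task belongs to step S4.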

[0041] The scenario feature library includes time-priority rules for real-time transaction scenarios, resource optimization rules for batch processing scenarios, and business-priority rules for emergency scenarios, and supports dynamic updates of the rule set.

[0042] The two-way resource collection strategy includes uplink proactive reporting and downlink timed collection;

[0043] Uplink proactive reporting: Cluster nodes deploy monitoring agents to detect resource topology changes in real time and report updated data;

[0044] Downlink timed collection: The scheduling system sends resource query instructions to all nodes every 1-5 minutes to verify and supplement resource data.

[0045] Resource allocation mechanisms include:

[0046] T1. Decompose node resources into independent resource blocks for CPU, memory, and I/O;

[0047] T2. High-priority tasks are pre-allocated and locked, while low-priority tasks are dynamically allocated using a shared resource pool.

[0048] T3. When resources are insufficient, only high-priority tasks are allowed to preempt resources from medium-priority tasks that do not hold critical resources.

[0049] The dynamic adjustment step includes an anti-starvation mechanism: for low-priority tasks whose waiting time exceeds a preset threshold, their business value weight is increased by 10% every ten minutes, up to a maximum of 150% of the baseline value.

[0050] The platform includes a scene adaptation module, a priority calculation module, a resource management module, a scheduling execution module, and a dynamic monitoring module:

[0051] Q1. Scene Adaptation Module: Stores scene feature library and rule set, performs task scene matching and rule invocation;

[0052] Q2. Priority Calculation Module: Built-in multi-dimensional weighted algorithm, receives scene adaptation results and outputs priority scores;

[0053] Q3. Resource Management Module: Maintains a cluster resource view through a two-way collection strategy, enabling resource segment partitioning and allocation;

[0054] Q4. Scheduling Execution Module: Includes cluster filters and resource allocators, and executes scheduling logic according to priority;

[0055] Q5. Dynamic monitoring module: Collects node load and task status in real time and triggers scheduling adjustment commands.

[0056] The resource management module is equipped with a resource reclamation mechanism: resources are released immediately upon completion of tasks, and a resource rollback process is initiated for abnormally terminated tasks, completing resource cleanup and updating the resource view within 10 minutes.

[0057] The dynamic monitoring module includes three levels of early warning thresholds: a yellow warning is triggered when the node's CPU utilization exceeds 80%, an orange warning is triggered when it exceeds 90%, and a red warning is triggered when it exceeds 95%, forcibly initiating task migration.

[0058] Working principle:

[0059] When a task is submitted, its key information is first extracted, such as whether it is a real-time transaction, batch processing, or an emergency task. This information is then compared with a pre-defined scenario rule base to determine the corresponding scheduling rule. Multi-dimensional scoring and ranking are used: based on the matched scenario, a fixed algorithm is used to calculate the task priority score, primarily considering three dimensions: business importance, remaining task time, and the amount of resources required. The higher the score, the higher the scheduling priority. On one hand, each node in the cluster actively reports its resource changes; on the other hand, the scheduling system also actively checks all nodes every 1-5 minutes to ensure the accuracy of the resource data it possesses.

[0060] Each node's resources are divided into three independent resource blocks: CPU, memory, and I/O. High-priority tasks directly lock the resource blocks they need, while low-priority tasks share a single resource pool. If resources are insufficient, only high-priority tasks can preempt resources from medium-priority tasks that are not currently processing critical data. Node load and task waiting time are monitored continuously: if node load is too high, task migration is initiated; if low-priority tasks wait too long, their importance weight is increased every 10 minutes so that they do not wait in the queue indefinitely; resources are released immediately upon task completion or abnormal termination to avoid waste.

[0061] The basic principles, main features, and advantages of this invention have been described above. Those skilled in the art should understand that this invention is not limited to the above embodiments. The embodiments and descriptions in the specification are merely principles of the invention. Various changes and modifications can be made without departing from the spirit and scope of the invention, and all such changes and modifications fall within the scope of the invention as claimed. The scope of protection claimed by this invention is defined by the appended claims and their equivalents.

Claims

1. A method for scheduling AI model clusters that supports task priorities across multiple scenarios, characterized in that it includes scene feature extraction, dynamic priority calculation, resource pre-scheduling, and dynamic adjustment:
S1. Scene Feature Extraction: Receive task submission requests, extract the task's scene identifier, resource requirement parameters, time constraints, and business value tags, match them with a preset scene feature library, and generate scene adaptation results.
S2. Dynamic Priority Calculation: Based on the scenario adaptation result, the corresponding priority rule set is called, and the priority score is calculated through a multi-dimensional weighted algorithm. The multi-dimensional factors include business value weight (0.4-0.6), remaining time coefficient (0.2-0.3), and resource utilization rate (0.1-0.3), wherein the business value weight is dynamically adjusted according to the scenario.
S3. Resource Pre-scheduling: Real-time available resource data of cluster nodes is obtained through a two-way resource collection strategy, and, for tasks taken in order of priority score, nodes are filtered to select a set of candidate nodes that meet the resource requirements.
S4. Scheduling Execution and Feedback: The resource allocation mechanism is used to divide the resources of the candidate node set into resource segments that are adapted to the tasks. The tasks are scheduled in order of priority, and the task execution status and resource usage data are recorded at the same time.
S5. Dynamic Adjustment: Monitor node load and task waiting time. When the load exceeds the preset threshold or a low-priority task times out, trigger priority reassessment and resource reallocation.

2. The AI model cluster scheduling method supporting multi-scenario task priority as described in claim 1, characterized in that the scenario feature library includes time-priority rules for real-time transaction scenarios, resource optimization rules for batch processing scenarios, and business-priority rules for emergency scenarios, and supports dynamic updates of the rule set.

3. The AI model cluster scheduling method supporting multi-scenario task priority as described in claim 1, characterized in that the bidirectional resource collection strategy includes uplink active reporting and downlink timed collection; the uplink active reporting involves deploying a monitoring agent on cluster nodes to detect resource topology changes in real time and report updated data; the downlink timed collection involves the scheduling system sending resource query instructions to all nodes every 1-5 minutes to verify and supplement resource data.

4. The AI model cluster scheduling method supporting multi-scenario task priority as described in claim 1, characterized in that the resource allocation mechanism includes:
T1. Decompose node resources into independent resource blocks for CPU, memory, and I/O;
T2. High-priority tasks are pre-allocated and locked, while low-priority tasks are dynamically allocated using a shared resource pool;
T3. When resources are insufficient, only high-priority tasks are allowed to preempt resources from medium-priority tasks that do not hold critical resources.

5. The AI model cluster scheduling method supporting multi-scenario task priority as described in claim 1, characterized in that the dynamic adjustment step includes an anti-starvation mechanism: for low-priority tasks whose waiting time exceeds a preset threshold, their business value weight is increased by 10% every ten minutes, up to a maximum of 150% of the baseline value.

6. An AI model cluster platform that supports multi-scenario task prioritization, characterized in that it includes a scene adaptation module, a priority calculation module, a resource management module, a scheduling execution module, and a dynamic monitoring module:
Q1. Scene Adaptation Module: Stores scene feature library and rule set, performs task scene matching and rule invocation;
Q2. Priority Calculation Module: Built-in multi-dimensional weighted algorithm, receives scene adaptation results and outputs priority scores;
Q3. Resource Management Module: Maintains a cluster resource view through a two-way collection strategy, enabling resource segment partitioning and allocation;
Q4. Scheduling Execution Module: Includes cluster filters and resource allocators, and executes scheduling logic according to priority;
Q5. Dynamic Monitoring Module: Collects node load and task status in real time and triggers scheduling adjustment commands.

7. The AI model cluster platform supporting multi-scenario task prioritization as described in claim 6, characterized in that the resource management module is equipped with a resource recycling mechanism: resources are released immediately upon completion of a task, and a resource rollback process is initiated for tasks that terminate abnormally, completing resource cleanup and updating the resource view within 10 minutes.

8. The AI model cluster platform supporting multi-scenario task prioritization as described in claim 6, characterized in that the dynamic monitoring module includes three levels of early warning thresholds: a yellow warning is triggered when the node's CPU utilization exceeds 80%, an orange warning is triggered when it exceeds 90%, and a red warning is triggered when it exceeds 95%, forcibly initiating task migration.