A load-aware server computing power scheduling method and system
By collecting and integrating multi-dimensional load data and combining load fluctuation coefficients and storage media characteristics, task allocation and data transmission paths are optimized, solving the problem of the disconnect between computing power scheduling and storage status in existing technologies, and improving resource utilization and task execution efficiency in distributed storage scenarios.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- HANGZHOU SHUKE TECHNOLOGY CO LTD
- Filing Date
- 2026-03-25
- Publication Date
- 2026-06-19
AI Technical Summary
Existing server computing power scheduling methods in distributed storage scenarios sever the collaborative relationship between computing power scheduling and distributed storage status, ignore the impact of storage load on task execution, resulting in idle resources or bottlenecks, and are difficult to adapt to the dynamic changes of heterogeneous nodes and mixed task scenarios.
By collecting and fusing multi-dimensional load data, hot and cold data can be identified in real time. Combined with load fluctuation coefficient and storage medium characteristics, node classification and task parsing are performed to optimize task allocation and data transmission paths, and scheduling strategies are dynamically adjusted to adapt to heterogeneous storage environments.
It achieves coordinated scheduling of computing power and storage load, improves overall performance and resource utilization in distributed storage scenarios, reduces cross-node data transmission overhead, adapts to heterogeneous nodes and mixed task scenarios, and improves task execution efficiency and system stability.
Figure CN122240322A_ABST
Abstract
Description
Technical Field
[0001] This invention relates to the field of server scheduling technology, specifically to a load-aware server computing power scheduling method and system.
Background Technology
[0002] In scenarios where distributed storage and cloud computing converge, server clusters need to simultaneously handle multiple tasks such as data storage, computation processing, and data reading and writing. The rationality of computing power scheduling directly determines the overall system performance and resource utilization. Existing load-aware server computing power scheduling technologies mostly focus on computing power metrics such as processor core load, task complexity, and communication overhead. They achieve scheduling optimization through task partitioning, dynamic weight allocation, and heuristic algorithms, such as the HEFT algorithm and LATS algorithm. Although these technologies can improve computing power utilization and reduce task scheduling latency to some extent, they have significant limitations in distributed storage scenarios.
[0003] The core flaw of existing technologies is that they sever the coordination between computing power scheduling and distributed storage status. On the one hand, they rely solely on computing load indicators such as CPU and memory as scheduling criteria, ignoring the storage load of distributed storage nodes (storage capacity utilization, read/write I/O rates, and data shard distribution) and its impact on the execution efficiency of computing tasks. As a result, some computing tasks are assigned to nodes whose storage load is saturated, so that data reads and writes block while computing resources sit idle. On the other hand, they lack awareness of the correlation between stored data. When a task needs to access related data across multiple nodes in distributed storage, traditional scheduling considers only node computing power and does not optimize data transmission paths, which increases cross-node data interaction overhead and reduces the overall execution efficiency of the task. Furthermore, existing scheduling algorithms are mostly designed for homogeneous nodes or a single task type. In distributed storage scenarios with heterogeneous nodes (nodes that differ in storage capacity and read/write performance) and a mix of storage-intensive and compute-intensive tasks, they show poor flexibility and load balancing and struggle to adapt to dynamically changing storage and computing power demands.
Summary of the Invention
[0004] The purpose of this invention is to provide a load-aware server computing power scheduling method and system that solve the problems described in the background art: existing methods and systems sever the collaborative relationship between computing power scheduling and distributed storage status and are mostly designed for homogeneous nodes or a single task type, so in heterogeneous-node, mixed-task scenarios under distributed storage they have poor flexibility, poor load balancing, and difficulty adapting to dynamically changing storage and computing power requirements.
[0005] To achieve the above objectives, the present invention provides the following technical solution: a load-aware server computing power scheduling method, comprising the following steps:
S1. Multi-dimensional load data collection: The computing load data and distributed storage load data of each server node are collected in real time, with data verification and hot/cold data identification performed during collection.
S2. Load fusion assessment and node classification: A load fusion assessment model calculates the comprehensive load score of each node, and node levels are classified in combination with the load fluctuation coefficient and storage medium characteristics.
S3. Task parsing and requirement matching: The attributes of the task to be scheduled and the stored data it requires are parsed, the data privacy level is identified, and mixed-type tasks are split.
S4. Collaborative scheduling strategy execution: Target nodes are selected based on node level and data correlation, data preloading and temporary high-speed link scheduling are performed, and task allocation and data transmission paths are optimized.
S5. Dynamic adjustment and feedback optimization: Task execution and node load are monitored in real time, scheduling parameters are optimized through incremental migration and quality assessment, and the data sharding layout is adjusted periodically.
[0006] In this embodiment, in S1, lightweight monitoring agents deployed on each server node collect dual-dimensional load data in real time: computing power load data and distributed storage load data. The computing power load data includes CPU utilization, memory utilization, GPU computing speed, and task queue length. The distributed storage load data includes storage capacity utilization, read/write I/O throughput, the number and distribution of data shards, and data redundancy between storage nodes. Static attributes of each server node, such as CPU core count, memory capacity, storage media type, and network bandwidth, are collected as well to build a node load status database. The collection cycle can be adjusted dynamically according to the scenario and defaults to 100 ms. A data verification mechanism runs during collection: timestamp synchronization and data hash verification are used to exclude abnormal data, such as missing values caused by collection interruptions or errors caused by network transmission distortion, and each abnormal value is replaced with the average of the previous three cycles to keep the load data accurate and reliable. The hot/cold attributes of data in distributed storage are also collected, and data is marked hot or cold based on access frequency and most recent access time, providing a basis for subsequent data sharding optimization.
[0007] In this embodiment, in S2, a load fusion evaluation model is built on the collected dual-dimensional load data to score and classify the load of each server node. The model uses a weighted summation algorithm whose weight coefficients adapt to the task type and are tuned through training on historical scheduling data: in storage-intensive task scenarios the storage load weight is higher than the computing load weight, and in compute-intensive task scenarios the computing load weight is higher than the storage load weight. The specific formula is as follows:
[0008] $$S = \alpha S_c + \beta S_s$$
where $S$ is the comprehensive load score of a node; $S_c$ is the standardized computing load score, calculated by weighting metrics such as CPU utilization and memory usage, with a higher score indicating a heavier computing load; $S_s$ is the standardized storage load score, calculated by weighting metrics such as storage capacity utilization and read/write I/O throughput, with a higher score indicating a heavier storage load; and $\alpha$ and $\beta$ are the weighting coefficients, with $\alpha + \beta = 1$. In storage-intensive task scenarios $\beta > \alpha$; in compute-intensive task scenarios $\alpha > \beta$. Based on the comprehensive load score $S$, nodes are divided into three levels: idle nodes ($S < T_1$), suitable nodes ($T_1 \le S \le T_2$), and overloaded nodes ($S > T_2$), where $T_1$ and $T_2$ are load thresholds that can be dynamically configured based on cluster size and business needs.
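The weighted summation and three-level classification described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the concrete weight values, the threshold defaults, and all names (`Weights`, `composite_score`, `classify`) are assumptions chosen for the example; the text only requires that the two weights sum to 1 and that their ordering follows the task type.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Weights:
    compute: float  # weight of the standardized computing load score
    storage: float  # weight of the standardized storage load score


# Hypothetical weights; the source only fixes their sum (1) and their ordering.
WEIGHTS = {
    "compute": Weights(compute=0.7, storage=0.3),
    "storage": Weights(compute=0.3, storage=0.7),
}


def composite_score(s_compute: float, s_storage: float, task_type: str) -> float:
    """Weighted sum of the two standardized load scores (both assumed in [0, 1])."""
    w = WEIGHTS[task_type]
    return w.compute * s_compute + w.storage * s_storage


def classify(score: float, t_low: float = 0.3, t_high: float = 0.8) -> str:
    """Map a composite score to a node level; threshold defaults are assumed."""
    if score < t_low:
        return "idle"
    if score <= t_high:
        return "suitable"
    return "overloaded"
```

With these assumed weights, a node with a light computing load but heavy storage load scores higher (looks busier) for a storage-intensive task than for a compute-intensive one, which is the behaviour the adaptive weighting is meant to produce.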
[0009] At the same time, a load fluctuation coefficient is introduced, defined as the standard deviation of a node's comprehensive load score over the past 10 collection cycles. Nodes whose fluctuation coefficient exceeds a preset threshold (15% by default) have their task allocation priority reduced, to prevent task interruption caused by drastic load fluctuations. In addition, the storage media characteristics of each node (SSD vs. HDD) are taken into account: the storage load score is calculated with differentiated weights, with SSD nodes given a higher weight for read/write I/O throughput than HDD nodes, to adapt to the performance differences of heterogeneous storage media.
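A rolling fluctuation coefficient of this kind might be tracked as in the sketch below. Several details are assumptions: scores are taken to be on a 0 to 1 scale so that the 15% threshold becomes 0.15, the population standard deviation is used (the text does not say which variant), and the priority penalty factor and the class name `FluctuationTracker` are hypothetical.

```python
from collections import deque
from statistics import pstdev


class FluctuationTracker:
    """Tracks the standard deviation of a node's recent composite load scores."""

    def __init__(self, window: int = 10, threshold: float = 0.15):
        self.scores = deque(maxlen=window)  # keeps only the last `window` cycles
        self.threshold = threshold

    def add(self, score: float) -> None:
        self.scores.append(score)

    def coefficient(self) -> float:
        # Population standard deviation is an assumption of this sketch.
        return pstdev(self.scores) if len(self.scores) >= 2 else 0.0

    def priority(self, base_priority: float, penalty: float = 0.5) -> float:
        """Scale down the allocation priority of a node that fluctuates too much."""
        if self.coefficient() > self.threshold:
            return base_priority * penalty
        return base_priority
```

A node whose score alternates between 0.1 and 0.9 has a coefficient of 0.4 and is penalized, while a node that holds steady keeps its base priority.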
[0010] In this embodiment, in S3, the task attributes include the task type (storage-intensive, compute-intensive, or hybrid), the required computing resources (CPU and memory requirements), and the required stored data (data identifier, data shard location, and data correlation). Based on the data identifier, the nodes holding the corresponding data shards and the data correlation information, such as whether related data exists across nodes and where the redundant copies are located, are queried from the distributed storage metadata server, and a task requirement list is generated. While the task is parsed, the data privacy level (public, confidential, or top secret) is identified. For data at the confidential level or above, only nodes with encrypted storage capabilities are selected as candidate nodes, and the storage location of the data encryption key is recorded to ensure secure data access during task execution. Hybrid tasks are further split into computing power subtasks and storage subtasks, and the dependencies among the subtasks are made explicit to support subsequent collaborative scheduling.
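The privacy-based candidate filtering and the splitting of hybrid tasks can be illustrated with a short sketch. The types and function names (`Privacy`, `Node`, `candidate_nodes`, `split_hybrid`) are hypothetical, and the dependency direction (the computing subtask waits for the storage subtask) is an assumption made only to show how a dependency might be recorded.

```python
from dataclasses import dataclass
from enum import IntEnum


class Privacy(IntEnum):
    PUBLIC = 0
    CONFIDENTIAL = 1
    TOP_SECRET = 2


@dataclass(frozen=True)
class Node:
    name: str
    encrypted_storage: bool


def candidate_nodes(nodes, level):
    """Confidential and higher data may only go to nodes with encrypted storage."""
    if level >= Privacy.CONFIDENTIAL:
        return [n for n in nodes if n.encrypted_storage]
    return list(nodes)


def split_hybrid(task_id):
    """Split a hybrid task into a storage subtask and a computing subtask.

    Returns (subtasks, dependencies); the dependency direction is assumed.
    """
    storage = f"{task_id}/storage"
    compute = f"{task_id}/compute"
    return [storage, compute], {compute: [storage]}
```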
[0011] In this embodiment, in S4, a differentiated scheduling strategy is executed based on the task requirement list and the node load classification results. It consists of node filtering, data association optimization, and load balancing fine-tuning.
Node filtering: Overloaded nodes are excluded, and candidate nodes are selected from the idle and suitable nodes. A candidate must have static attributes that meet the task's computing power and storage requirements and a storage load score no greater than the preset storage threshold, to avoid storage bottlenecks.
Data association optimization: For tasks that need data associated across multiple nodes, candidate nodes that hold the core data shards and belong to the associated data shard set are preferred as target nodes. If the core data shards are distributed across multiple nodes, the data transmission overhead between the nodes is calculated, and the node combination with the lowest transmission overhead is selected as the target node group, so that data is scheduled locally.
Load balancing fine-tuning: If multiple candidate nodes (groups) remain, the load balancing gain of each is calculated and the task is allocated to the node (group) with the largest gain, keeping the cluster load balanced overall. The gain is derived from the difference between the node's current comprehensive load and the cluster's average load: the smaller the difference, the greater the gain.
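Node filtering and load balancing fine-tuning can be sketched together as below. This is a simplified reading: the gain is modelled as the negated absolute distance between a node's score and the cluster average (one way to realise "smaller difference, greater gain"), data association optimization is left out, and `NodeState`, `select_node`, and the `storage_limit` default are hypothetical.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class NodeState:
    name: str
    level: str            # "idle", "suitable" or "overloaded"
    score: float          # comprehensive load score
    storage_score: float  # standardized storage load score
    free_cpu: int
    free_mem_gb: int


def select_node(nodes, need_cpu, need_mem_gb, storage_limit=0.8):
    """Filter candidates, then pick the one with the largest balancing gain."""
    cluster_avg = mean(n.score for n in nodes)
    candidates = [
        n for n in nodes
        if n.level != "overloaded"
        and n.free_cpu >= need_cpu
        and n.free_mem_gb >= need_mem_gb
        and n.storage_score <= storage_limit
    ]
    if not candidates:
        return None
    # Assumed gain: the closer a node is to the cluster average, the larger the gain.
    return max(candidates, key=lambda n: -abs(n.score - cluster_avg))
```

A fuller version would score node groups rather than single nodes and would apply the data locality preference before the gain comparison.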
[0012] For storage-intensive tasks, a data sharding preloading mechanism is implemented. Before task allocation, the associated data shards are preloaded into the cache of the target node. When the cache space is insufficient, cold data shards are replaced first to reduce read and write latency during task execution. For cross-node collaborative tasks, a temporary high-speed communication link is established to reduce network latency for data interaction between nodes. After the task is completed, the link resources are released to save network bandwidth.
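The cold-first replacement rule used during preloading can be shown with a small cache model. It is a sketch under assumptions: capacity is counted in shards rather than bytes, each shard carries a precomputed hot/cold flag, and when no cold shard is cached the oldest entry is evicted instead (the text does not specify this fallback). `ShardCache` is a hypothetical name.

```python
class ShardCache:
    """Fixed-capacity shard cache that evicts cold shards before hot ones."""

    def __init__(self, capacity: int):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self.entries = {}  # shard_id -> is_hot, kept in insertion order

    def preload(self, shard_id: str, is_hot: bool) -> None:
        if shard_id in self.entries:
            self.entries[shard_id] = is_hot  # already cached: refresh the flag
            return
        if len(self.entries) >= self.capacity:
            # Prefer a cold victim; fall back to the oldest entry (assumption).
            victim = next((s for s, hot in self.entries.items() if not hot), None)
            if victim is None:
                victim = next(iter(self.entries))
            del self.entries[victim]
        self.entries[shard_id] = is_hot
```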
[0013] In this embodiment, during S5, the dual-dimensional load data and task execution status of the target node (group) are monitored in real time. If the node load reaches the overload threshold, task migration is triggered, and some tasks are migrated to idle nodes. During the migration, the data sharding position is optimized simultaneously to reduce the data transmission overhead of subsequent tasks. At the same time, based on historical scheduling data (task execution efficiency, load balancing effect, data transmission overhead), the weight coefficients and load thresholds of the load fusion evaluation model are adaptively adjusted to optimize the scheduling strategy. During task migration, an incremental migration mechanism is adopted, migrating only unfinished task fragments and associated data shards to avoid resource waste caused by full migration. During the feedback optimization process, task execution quality evaluation indicators, such as task completion rate and data transmission error rate, are added. Scheduled cases that fail to meet quality standards are analyzed separately, and the corresponding weight coefficients and screening conditions are adjusted. Periodically, by default every hour, data shards in distributed storage are redistributed, migrating hot data shards to high-performance nodes and cold data shards to large-capacity nodes to optimize the overall storage layout.
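The incremental migration rule (move only unfinished fragments and the shards they still need) reduces to a simple selection, sketched here. `Fragment` and `migration_plan` are hypothetical names, and representing a task as a flat list of fragments is an assumption of the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fragment:
    frag_id: str
    done: bool
    shards: tuple  # identifiers of the data shards this fragment reads


def migration_plan(fragments):
    """Return the unfinished fragment ids and the de-duplicated shards they need."""
    pending = [f for f in fragments if not f.done]
    frag_ids = [f.frag_id for f in pending]
    shards = sorted({s for f in pending for s in f.shards})
    return frag_ids, shards
```

Shards used only by completed fragments never appear in the plan, which is where the saving over a full migration comes from.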
[0014] A load-aware server computing power scheduling system includes a distributed storage cluster, a monitoring and acquisition module, a load assessment module, a task parsing module, a collaborative scheduling module, and a feedback optimization module. The distributed storage cluster has a built-in secure storage and cache management subunit to ensure data security and read / write efficiency; The monitoring and acquisition module collects dual-dimensional load data and performs verification and hot / cold data identification. The load assessment module combines the fluctuation coefficient and media characteristics to generate a comprehensive load score and rating for the node. The task parsing module identifies data privacy levels and splits mixed tasks; The collaborative scheduling module performs preloading and temporary link scheduling; The feedback optimization module optimizes scheduling parameters and adjusts data sharding layout through incremental migration and quality assessment.
[0015] In this embodiment, the distributed storage cluster consists of multiple heterogeneous server nodes. Each node integrates computing power processing units (CPU, memory, and GPU) and storage units (hard disk and SSD). The nodes are interconnected through a high-speed network to provide distributed data storage and collaborative computing power processing. A metadata server is also deployed to store metadata such as data shard distribution, data correlation, and node attributes. The cluster has a built-in secure storage subunit that supports encrypted data storage (using the AES-256 symmetric encryption algorithm) and access control, with differentiated storage strategies configured for data of different privacy levels. It also has a cache management subunit that manages the cache resources of all nodes uniformly, performing intelligent preloading of data shards and hot/cold-based cache replacement to improve data read and write efficiency.
In this embodiment, the monitoring and acquisition module includes a lightweight monitoring agent deployed on each server node. The agent collects computing load data and distributed storage load data in real time, formats the collected data, and transmits it to the load assessment module; it also supports dynamic adjustment of the collection cycle. The module integrates a data verification submodule and a hot/cold data identification submodule. The data verification submodule filters and completes abnormal data through timestamp synchronization and hash verification. The hot/cold data identification submodule marks data attributes based on access frequency and timestamp, providing data support for subsequent scheduling and storage optimization. The module also supports cross-node data synchronization to keep the collected data consistent.
[0016] In this embodiment, the load assessment module constructs a load fusion assessment model, receives the dual-dimensional load data transmitted by the monitoring and acquisition module, performs standardization and weighted summation, generates a comprehensive load score for each node, completes node classification, and synchronizes the results to the collaborative scheduling module. The module includes a load fluctuation analysis submodule and a media adaptation submodule. The load fluctuation analysis submodule calculates the node load fluctuation coefficient and marks nodes with high fluctuations. The media adaptation submodule adjusts the load scoring weights for different storage media characteristics, improving the evaluation model's adaptability to heterogeneous storage clusters. The module also supports dynamic calibration of the load thresholds, automatically adjusting their values based on cluster operating status.
[0017] In this embodiment, the task parsing module receives external tasks to be scheduled, parses the task attributes and data requirements, obtains data sharding distribution and correlation information by querying the metadata server, generates a task requirement list, and transmits it to the collaborative scheduling module. The system integrates a privacy level identification submodule and a task splitting submodule. The privacy level identification submodule identifies the privacy attributes of task-related data and filters nodes that meet security requirements. The task splitting submodule splits mixed-type tasks into computing power subtasks and storage subtasks, clarifies the subtask dependencies, and improves scheduling accuracy.
[0018] In this embodiment, the collaborative scheduling module performs node screening, data association optimization, and load balancing fine-tuning based on the node classification results and the task requirement list to determine the target nodes (groups) and assign tasks, while coordinating the target nodes (groups) to optimize data transmission paths and data shard locations. It includes a preloading management submodule and a temporary link scheduling submodule. The preloading management submodule performs data shard preloading and cache optimization for storage-intensive tasks, while the temporary link scheduling submodule establishes high-speed communication links for cross-node collaborative tasks and releases the link resources dynamically. The module supports task priority adaptation, scheduling high-priority tasks first to ensure the stable execution of core business.
In this embodiment, the feedback optimization module monitors task execution status and node load changes in real time and triggers task migration away from overloaded nodes; it trains and optimizes the parameters of the load fusion evaluation model on historical scheduling data and dynamically adjusts the scheduling strategy, improving scheduling accuracy and adaptability. It includes an incremental migration submodule, a quality assessment submodule, and a shard redistribution submodule. The incremental migration submodule migrates task fragments and their related data incrementally; the quality assessment submodule evaluates scheduling performance through multi-dimensional indicators and optimizes parameters; and the shard redistribution submodule periodically adjusts the data shard layout, migrating hot and cold data intelligently and optimizing the allocation of cluster storage and computing resources.
[0019] Compared with the prior art, the beneficial effects of the present invention are: This load-aware server computing power scheduling method and system enables collaborative scheduling of computing power load and distributed storage load. By combining data correlation to optimize task allocation and data transmission paths, it improves the overall performance, resource utilization and task execution efficiency of server clusters in distributed storage scenarios.
[0020] Furthermore, it enables collaborative scheduling of computing power and storage load: through dual-dimensional load data collection and fusion evaluation, it breaks through the limitations of traditional scheduling that separates computing power and storage, avoids assigning tasks to nodes with saturated storage load, solves the problem of coexistence of idle computing power and storage bottlenecks, and significantly improves the overall resource utilization of the cluster.
[0021] Furthermore, optimize data association scheduling efficiency: Based on the data sharding distribution and association of distributed storage, prioritize local data scheduling to reduce cross-node data transmission overhead, which is especially suitable for multi-node associated data processing scenarios and improves task execution efficiency.
[0022] Furthermore, it adapts to heterogeneous nodes and mixed task scenarios: through adaptive weight adjustment and node grading, the weight of evaluation indicators can be dynamically optimized according to task type and storage / computation intensive tasks, enabling flexible scheduling in heterogeneous distributed storage clusters. The load balancing effect is better than traditional algorithms, improving the overall throughput of the cluster.
[0023] Furthermore, it exhibits strong dynamic adaptability: through real-time monitoring and feedback optimization mechanisms, it can dynamically adjust load thresholds, scheduling weights, and data sharding locations to adapt to dynamic changes in distributed storage and computing tasks, avoid overload, and improve system operational stability.
[0024] Furthermore, it improves both security and performance: data privacy identification, encrypted storage, and access control secure the storage and access of confidential data; data preloading, cache optimization, and temporary high-speed links further reduce task execution latency; and incremental migration and shard redistribution reduce resource waste, balancing system security and operational efficiency.
Attached Figure Description
[0025] Figure 1 is a schematic diagram of the load-aware server computing power scheduling method of the present invention; Figure 2 is a schematic diagram of the load-aware server computing power scheduling system of the present invention.
Detailed Implementation
[0026] The technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of the present invention, and not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.
[0027] This invention provides the following technical solution: a load-aware server computing power scheduling method comprising five steps. Multi-dimensional load data collection: computing power load data and distributed storage load data are collected from each server node in real time, with data verification and hot/cold data identification performed during collection. Load fusion assessment and node classification: a load fusion assessment model calculates the comprehensive load score of each node, and node levels are classified in combination with load fluctuation coefficients and storage media characteristics. Task parsing and demand matching: the attributes of the tasks to be scheduled and the stored data they require are parsed, data privacy levels are identified, and mixed-type tasks are split. Collaborative scheduling strategy execution: target nodes are selected based on node classification and data correlation, data preloading and temporary high-speed link scheduling are performed, and task allocation and data transmission paths are optimized. Dynamic adjustment and feedback optimization: task execution and node load are monitored in real time, scheduling parameters are optimized through incremental migration and quality assessment, and the data sharding layout is adjusted periodically. Together these steps achieve collaborative scheduling of computing power load and distributed storage load and, by using data correlation to optimize task allocation and data transmission paths, improve the overall performance, resource utilization, and task execution efficiency of the server cluster in distributed storage scenarios.
[0028] Example 1: For a better understanding of the above technical solution, it is described in detail below with reference to the accompanying drawings and specific implementations. As shown in Figure 1, a load-aware server computing power scheduling method includes the following steps. In S1, multi-dimensional load data is collected: computing load data and distributed storage load data are collected from each server node in real time, and data verification and hot/cold data identification are performed during collection. Specifically, a lightweight monitoring agent deployed on each server node collects the dual-dimensional load data in real time. The computing load data includes CPU utilization, memory utilization, GPU computing speed, and task queue length. The distributed storage load data includes storage capacity utilization, read/write I/O throughput, the number and distribution of data shards, and data redundancy between storage nodes. Static attributes of each server node, including CPU core count, memory capacity, storage media type, and network bandwidth, are also collected to build a node load status database. The collection period can be adjusted dynamically according to the scenario and defaults to 100 ms. A data verification mechanism runs during collection, using timestamp synchronization and data hash verification to exclude abnormal data, such as missing values caused by collection interruptions or errors caused by network transmission distortion. Each abnormal value is replaced with the average of the previous three periods to keep the load data accurate and reliable. The hot/cold attributes of data in distributed storage are collected at the same time, and data is marked hot or cold based on access frequency and most recent access time, providing a basis for subsequent data sharding optimization.
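The verification step can be sketched as follows: a sample is rejected when its hash does not match or its timestamp drifts too far from the expected collection time, and a rejected sample is replaced by the mean of the previous three accepted periods. The SHA-256 digest over canonical JSON, the 50 ms skew tolerance, the choice to feed imputed values back into the history, and the names `digest` and `SampleValidator` are all assumptions of this illustration.

```python
import hashlib
import json
from collections import deque


def digest(payload: dict) -> str:
    """Hash of a sample, computed over its canonical JSON form (assumed scheme)."""
    return hashlib.sha256(json.dumps(payload, sort_keys=True).encode()).hexdigest()


class SampleValidator:
    def __init__(self, max_skew_s: float = 0.05):
        self.max_skew_s = max_skew_s
        self.history = deque(maxlen=3)  # the previous three periods

    def accept(self, payload, sent_digest, sample_ts, expected_ts):
        """Return the sample if valid, otherwise the mean of the last three periods."""
        valid = (
            payload is not None
            and digest(payload) == sent_digest
            and abs(sample_ts - expected_ts) <= self.max_skew_s
        )
        if not valid:
            if not self.history:
                return None  # nothing to impute from yet
            keys = self.history[0].keys()
            payload = {
                k: sum(h[k] for h in self.history) / len(self.history) for k in keys
            }
        self.history.append(payload)
        return payload
```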
[0029] It should be noted that S1 additionally provides the following mechanisms.
Tiered collection strategy: Different types of data are collected at different frequencies. Computing load data, such as CPU utilization and GPU computing speed, is collected at a high frequency (100 ms by default) to track rapid fluctuations in computing load; distributed storage load data, such as storage capacity utilization and data redundancy, is collected at a low frequency (1 s by default) to reduce collection overhead on storage nodes; static attribute data is collected only when a node is initialized or its configuration changes, to avoid transmitting redundant data.
Edge preprocessing mechanism: The lightweight monitoring agent has built-in edge preprocessing capabilities. It performs preliminary noise reduction and normalization on the raw load data, such as mapping CPU utilization to the range 0-100%, before uploading it to the node load status database, which reduces the computing pressure on the central node.
Adaptive dynamic threshold for hot and cold data: The threshold for distinguishing hot from cold data is adjusted dynamically based on historical cluster access data. For example, when the overall cluster access volume increases, the access frequency threshold for hot data is automatically raised from 10 times/minute to 15 times/minute, so that a large amount of data is not mistakenly marked as hot and does not overload cache resources. The access trend of the data, such as its daily growth rate, is also recorded to provide a basis for predictive scheduling.
Node communication quality acquisition: Network communication quality data between nodes, including link latency, packet loss rate, and bandwidth utilization, is collected synchronously as an auxiliary input for subsequent cross-node task scheduling, so that tasks requiring frequent cross-node interaction are not assigned to node groups with poor communication quality.
Data collection status monitoring and alarm: The online status and data collection success rate of each node's monitoring agent are monitored. When an agent goes offline or its collection success rate falls below the preset threshold of 95%, an alarm is triggered and the system automatically switches to a backup collection channel, for example by collecting the missing data through the backup agent of the cluster management node, to keep data collection continuous.
Data compression and transmission optimization: The collected data is compressed with the lightweight LZ4 algorithm before transmission. Combined with an incremental transmission mechanism that sends only the difference from the previous period, this reduces the bandwidth consumed by cross-node data transmission and is especially suitable for large-scale distributed clusters.
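The incremental transmission idea (send only the metrics that changed since the previous period, compressed) can be sketched as below. Note the substitution: the text names LZ4, but this sketch uses `zlib` from the Python standard library purely so that it runs without third-party packages. The JSON encoding, the function names, and the decision not to handle removed keys are likewise assumptions.

```python
import json
import zlib


def encode_delta(previous: dict, current: dict) -> bytes:
    """Compress only the metrics whose value changed since the previous period."""
    delta = {k: v for k, v in current.items() if previous.get(k) != v}
    # zlib stands in for LZ4 here; swap in an LZ4 binding in a real deployment.
    return zlib.compress(json.dumps(delta, sort_keys=True).encode())


def apply_delta(previous: dict, blob: bytes) -> dict:
    """Rebuild the current sample on the receiving side."""
    merged = dict(previous)
    merged.update(json.loads(zlib.decompress(blob).decode()))
    return merged
```

Keys that disappear between periods are not propagated by this sketch; a production encoding would need an explicit tombstone for them.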
[0030] S1 (Multi-dimensional load data collection) → S2 (Load fusion assessment and node classification): Real-time data synchronization: S1 pushes the verified and hot / cold identified dual-dimensional load data, node static attributes, and network communication quality data to the load fusion evaluation model of S2 in real time, as the basic input for node scoring and classification.
[0031] Dynamic threshold linkage: When S1 detects fluctuations in the overall access volume of the cluster, it synchronously feeds back the results of the cold and hot data threshold adjustment to S2. S2 then adjusts the weight coefficient of the storage load score accordingly to adapt to changes in data popularity.
[0032] Data collection anomaly triggers assessment degradation: If the data collection success rate of S1 is less than 95% and an alarm is triggered, S2 will automatically switch to the assessment mode with historical data as a backup, and calculate the node score based on the effective load data of the most recent 30 minutes to avoid data loss causing assessment failure.
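The degraded assessment mode can be sketched as follows. The 95% success-rate floor and the 30-minute window come from the description; the function name, the `(timestamp, score)` history format, and the use of a plain average over the window are assumptions made for illustration.

```python
from statistics import mean

SUCCESS_RATE_FLOOR = 0.95      # below this, collection is considered degraded
HISTORY_WINDOW_S = 30 * 60     # most recent 30 minutes of valid data


def assessment_input(live_score, success_rate, history, now):
    """Return the score S2 should use for a node.

    history is a list of (timestamp_s, score) pairs for valid samples.
    Averaging the window is an assumption; the description only says the
    score is calculated from the valid data of the last 30 minutes.
    """
    if success_rate >= SUCCESS_RATE_FLOOR:
        return live_score
    recent = [score for ts, score in history if now - ts <= HISTORY_WINDOW_S]
    if not recent:
        return None  # nothing usable; the caller must decide what to do
    return mean(recent)


history = [(0, 10.0), (1000, 40.0), (1900, 60.0)]
normal = assessment_input(90.0, 0.99, history, now=2000)
degraded = assessment_input(90.0, 0.90, history, now=2000)
```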
[0033] In S2, load fusion assessment and node classification are performed by calculating the overall load score of nodes through the load fusion assessment model and classifying nodes based on load fluctuation coefficients and storage media characteristics.
[0034] Specifically, the load fusion evaluation model is constructed from the collected dual-dimensional load data to comprehensively score and classify the load of each server node. The model uses a weighted summation algorithm whose weight coefficients adapt to the task type and are tuned through training on historical scheduling data: in storage-intensive task scenarios the storage load weight is higher than the computing load weight, while in compute-intensive task scenarios the computing load weight is higher than the storage load weight. The specific formula is as follows:
[0035] $$S = \alpha S_c + \beta S_s$$
where $S$ is the comprehensive load score of the node; $S_c$ is the standardized computing load score, calculated by weighting metrics such as CPU utilization and memory usage, with a higher score indicating a heavier computing load; $S_s$ is the standardized storage load score, calculated by weighting metrics such as storage capacity utilization and read/write I/O throughput, with a higher score indicating a heavier storage load; and $\alpha$ and $\beta$ are the weighting coefficients, with $\alpha + \beta = 1$. In storage-intensive task scenarios $\beta > \alpha$; in compute-intensive task scenarios $\alpha > \beta$. Based on the comprehensive load score $S$, nodes are divided into three levels: idle nodes ($S < T_1$), suitable nodes ($T_1 \le S \le T_2$), and overloaded nodes ($S > T_2$), where $T_1$ and $T_2$ are the load thresholds, which can be dynamically configured based on cluster size and business needs.
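A minimal sketch of the weighted summation and three-level classification follows. The function names are hypothetical, and the default thresholds of 30 and 70 points are taken from the example values mentioned in paragraph [0037] rather than from a normative definition.

```python
def overall_score(compute_score, storage_score, storage_weight):
    """Weighted sum of the two standardized scores; the weights sum to 1."""
    if not 0.0 <= storage_weight <= 1.0:
        raise ValueError("storage_weight must lie in [0, 1]")
    compute_weight = 1.0 - storage_weight
    return compute_weight * compute_score + storage_weight * storage_score


def classify(score, low=30.0, high=70.0):
    """Map a comprehensive load score to idle / suitable / overloaded."""
    if score < low:
        return "idle"
    if score <= high:
        return "suitable"
    return "overloaded"


# Compute-intensive scenario: the computing weight (0.75) exceeds the storage weight.
score = overall_score(compute_score=80.0, storage_score=20.0, storage_weight=0.25)
level = classify(score)
```

In a storage-intensive scenario the same call would be made with a `storage_weight` above 0.5, so that storage pressure dominates the result.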
[0036] Simultaneously, a load fluctuation coefficient is introduced, defined as the standard deviation of the node's comprehensive load score over the past 10 collection cycles. Nodes whose fluctuation coefficient exceeds a preset threshold (15% by default) have their task allocation priority reduced to prevent task interruption due to drastic load fluctuations. Furthermore, load score weights are adjusted according to the storage media characteristics of the nodes (SSD vs. HDD): the storage load score calculation assigns differentiated weights, with SSD nodes given a higher weight for read/write I/O throughput than HDD nodes, to adapt to the performance differences of heterogeneous storage media. Data privacy levels (public, confidential, and top secret) are identified during task parsing. For data at the confidential level or above, only nodes with encrypted storage capabilities are selected as candidate nodes, and the storage location of the data encryption key is recorded to ensure data access security during task execution.
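The fluctuation coefficient can be sketched as a rolling standard deviation. The class name is hypothetical, and the choice of the population standard deviation (`statistics.pstdev`) is an assumption, since the description does not say which variant is meant. With scores on a 0-100 scale, a result of 15 corresponds to the 15% default threshold.

```python
from collections import deque
from statistics import pstdev


class FluctuationTracker:
    """Keeps the last `window` comprehensive load scores of one node."""

    def __init__(self, window=10):
        self.scores = deque(maxlen=window)

    def add(self, score):
        self.scores.append(score)

    def coefficient(self):
        """Standard deviation of the retained scores (0.0 until two samples exist)."""
        if len(self.scores) < 2:
            return 0.0
        return pstdev(self.scores)


steady = FluctuationTracker()
noisy = FluctuationTracker()
for i in range(12):
    steady.add(50.0)
    noisy.add(30.0 if i % 2 == 0 else 70.0)
```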
[0037] It should be noted that the weight adaptive training mechanism is as follows: a weight optimizer is built on a reinforcement learning algorithm, with the optimization objectives of minimizing the total latency of cluster task execution and maximizing resource utilization. It is trained iteratively on historical scheduling data (task type, node load status, and execution efficiency) and automatically outputs the optimal combination of computing load weight and storage load weight for each task type. For example, for storage-intensive tasks, when the proportion of SSD nodes in the cluster increases, the weight optimizer adaptively reduces the baseline value of the storage load weight to avoid over-reliance on storage load and wasted computing resources.
Dynamic weighted calculation of the standardized load scores: The standardized computing and storage load scores no longer use fixed weights; instead, the weights of the individual indicators are adjusted dynamically according to the resources the task requires. For CPU-intensive tasks, the weight of CPU utilization is increased; for memory-intensive tasks, the weight of memory usage is increased; similarly, for I/O-intensive storage tasks, the weight of read/write throughput is increased, so that the score calculation closely matches the task requirements.
Cluster load-aware adjustment of load thresholds: The two load thresholds are not static configuration values but are adjusted dynamically based on the overall cluster load level. When the overall cluster load falls below 30%, the lower threshold is lowered (for example, from 30 points to 20 points) and the upper threshold is raised (for example, from 70 points to 80 points), which expands the combined range of idle and suitable nodes and improves the flexibility of task allocation; when the overall cluster load exceeds 80%, the opposite adjustment is applied to strictly limit the node load ceiling and avoid cluster overload.
Fine-grained weighting of storage media characteristics: To address the performance differences between SSDs and HDDs, not only are the weights of read/write throughput adjusted, but a storage media response time metric is also introduced. For SSD nodes, response time is given a lower negative weight, so that the shorter the response time, the lower the storage load score, while the requirement is appropriately relaxed for HDD nodes. For hybrid storage nodes, which are equipped with both SSDs and HDDs, the storage load score is weighted according to the capacity ratio of the two media.
Tiered weighting strategy for high-fluctuation nodes: For nodes whose fluctuation coefficient exceeds the limit, priority is not simply reduced; instead, a tiered reduction is applied according to the range of the fluctuation coefficient. Nodes with fluctuation coefficients between 15% and 20% have their task allocation priority reduced by 20%; nodes with fluctuation coefficients between 20% and 30% have their priority reduced by 50%; nodes with fluctuation coefficients exceeding 30% are excluded from the candidate node pool and are reinstated after their load fluctuation stabilizes, further ensuring the stability of task execution.
Privacy level and node security capability matching verification: When candidate nodes are screened for data at the confidential level or above, a node security capability score is added. This score comprehensively considers factors such as the node's encryption algorithm level, access control precision, and security vulnerability patching frequency. Only nodes with a security capability score ≥80 are selected as candidate nodes. In addition, a key access log is established to monitor key usage in real time and prevent key leakage or unauthorized use.
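The tiered priority reduction for high-fluctuation nodes can be written as a small lookup. The function name is hypothetical, and the handling of the exact boundary values (15, 20, and 30) is an assumption, because the description does not state which tier a boundary value belongs to.

```python
def priority_multiplier(fluctuation_pct):
    """Return the factor applied to a node's task allocation priority.

    Returns None when the node should be removed from the candidate pool.
    Boundary values are assigned to the lower tier (an assumption).
    """
    if fluctuation_pct > 30.0:
        return None   # excluded until the load fluctuation stabilizes
    if fluctuation_pct > 20.0:
        return 0.5    # priority reduced by 50%
    if fluctuation_pct > 15.0:
        return 0.8    # priority reduced by 20%
    return 1.0        # within the default 15% limit


tiers = [priority_multiplier(p) for p in (10.0, 18.0, 25.0, 35.0)]
```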
[0038] S2 (Load fusion assessment and node classification) → S3 (Task parsing and requirement matching): Node capability tag synchronization: S2 synchronizes tags such as the node classification results, load fluctuation coefficient, storage media characteristics, and security capability score to the task parsing module of S3, where they serve as the node attributes against which task requirements are matched.
[0039] Privacy level verification linkage: After S3 identifies the data privacy level, it directly calls the list of nodes with encrypted storage capabilities output by S2 to quickly filter candidate nodes that meet security requirements without having to repeatedly verify node qualifications.
[0040] Task type-driven weight adjustment: After S3 parses the task type (storage-intensive, compute-intensive, or hybrid), it sends a weight adjustment instruction to S2. S2's weight optimizer adaptively updates the computing power and storage weights of the load fusion evaluation model so that subsequent node scores accurately match the task requirements.
[0041] In S3, task parsing and requirement matching are performed, parsing the attributes of the task to be scheduled and the required stored data information, synchronously identifying the data privacy level and splitting mixed-type tasks.
[0042] Specifically, task attributes include task type (storage-intensive, compute-intensive, hybrid), required computing resources (CPU, memory), required data identifiers, data shard locations, and data relationships. Based on the data identifiers, the distributed storage metadata server queries the distribution nodes and data relationship information of the corresponding data shards, such as whether there is cross-node data association and redundant data node locations, generating a task requirement list. When parsing the task, the data privacy level is identified simultaneously: public, confidential, and top secret. For confidential and higher-level data, only nodes with encrypted storage capabilities are selected as candidate nodes, and the storage location of the data encryption key is recorded to ensure the security of data access during task execution. For hybrid tasks, computing power sub-tasks and storage sub-tasks are further broken down, and the dependencies between each sub-task are clarified to provide support for subsequent collaborative scheduling.
[0043] It should be noted that the task priority is dynamically graded and resources are reserved: when parsing task attributes, the task priority identifier is extracted simultaneously (high, medium, low). This is combined with the task submission time and business importance. For example, core business tasks are marked as high priority by default to generate the final priority level. For high-priority tasks, 10%-20% of computing power and storage resources are reserved in advance in the candidate nodes. The reservation time is 5 minutes by default and can be dynamically adjusted to avoid delays in the execution of high-priority tasks due to resource contention. At the same time, the deadline requirement of the task is recorded, and "whether the deadline is met" is used as a constraint condition for subsequent scheduling decisions. Deep data correlation analysis and topology mapping: Not only does it identify cross-node related data, but it also queries the dependency topology of data shards through the metadata server. For example, data shard D3 needs to be generated based on D1 and D2, and a data correlation topology graph is constructed. At the same time, key data shards are marked, such as irreplaceable core data and data without redundant backups. In subsequent scheduling, the access priority of key data is given priority to avoid task failure due to missing data shards or access delays. Intelligent splitting and resource allocation for mixed-type tasks: For mixed-type tasks, an intelligent splitting algorithm based on task behavior characteristics is adopted. By analyzing historical execution data of tasks, such as the proportion of computing power consumption and storage I / O, the resource requirements of computing power subtasks and storage subtasks are automatically divided. 
For example, if the CPU consumption of a mixed task accounts for 60% and the storage I / O consumption accounts for 40% in the historical execution, then after splitting, the computing power subtask is configured with 60% of the total resource requirements, and the storage subtask is configured with 40%. At the same time, dynamic merging of subtasks is supported. If the resource requirements of two subtasks overlap by ≥80% and there is no dependency conflict, they are automatically merged into one subtask for execution, reducing scheduling overhead. A refined data privacy level adaptation strategy: Differentiated processing rules are formulated for data with different privacy levels: Public data has no access restrictions; Confidential data requires dual verification through node identity authentication and key verification, and is only allowed to be transmitted between nodes within the cluster; Top secret data, in addition to meeting the security requirements for confidential data, additionally enables data access behavior auditing, records the access node, access time, operation type, and transmission encryption, and uses the national cryptographic algorithm SM4 for end-to-end encryption; At the same time, dynamic upgrades of privacy levels are supported. When abnormal data access frequency or security risks are detected, the data privacy level is automatically temporarily upgraded by one level until the risk is eliminated. 
Task compatibility verification and execution environment pre-detection: When parsing a task, the compatibility between the task and the execution environment of the server node is verified simultaneously, including the operating system version, dependent software library version, hardware driver requirements, etc.; the execution environment configuration of candidate nodes is queried through the metadata server to eliminate incompatible nodes in advance; for candidate nodes with missing dependencies, a dependency installation list is generated and pushed to the node to complete the dependency pre-installation before task allocation, so as to avoid task execution interruption due to execution environment incompatibility; Data access cost estimation and optimization suggestions: Based on the data shard distribution location, node network bandwidth, and storage media performance, the total data access cost during task execution is estimated, including data transmission time and read / write time. For tasks with access costs exceeding a preset threshold, optimization suggestions are generated, such as pre-migrating scattered related data shards to the same node, selecting a node group with better network bandwidth, and using access cost as an important reference indicator for subsequent node selection, prioritizing the candidate node with the lowest access cost.
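Two of the parsing steps above, splitting a hybrid task by its historical consumption ratio and filtering candidates by privacy level, can be sketched as follows. The function names, the dictionary layout of resources and nodes, and the `encrypted_storage` flag are hypothetical illustrations.

```python
def split_hybrid_task(total_resources, compute_share):
    """Divide total resource requirements between the two subtasks.

    compute_share is the historical proportion of computing consumption,
    e.g. 0.6 when CPU accounted for 60% and storage I/O for 40%.
    """
    if not 0.0 <= compute_share <= 1.0:
        raise ValueError("compute_share must lie in [0, 1]")
    compute = {name: amount * compute_share for name, amount in total_resources.items()}
    storage = {name: amount - compute[name] for name, amount in total_resources.items()}
    return compute, storage


def eligible_nodes(nodes, privacy_level):
    """Public data may use any node; higher levels need encrypted storage."""
    if privacy_level == "public":
        return [node["name"] for node in nodes]
    return [node["name"] for node in nodes if node["encrypted_storage"]]


compute_part, storage_part = split_hybrid_task({"cpu_cores": 10, "memory_gb": 20}, 0.6)
nodes = [
    {"name": "node-a", "encrypted_storage": True},
    {"name": "node-b", "encrypted_storage": False},
]
confidential_candidates = eligible_nodes(nodes, "confidential")
```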
[0044] S3 (Task parsing and requirement matching) → S4 (Cooperative scheduling strategy execution): Task requirement list delivery: The task requirement list generated by S3 includes task attributes, data privacy level, subtask splitting results, and data association topology, which directly serves as the core basis for S4 node selection and scheduling strategy execution.
[0045] Data locality priority scheduling: S3 synchronizes the data correlation parsing results, core shard location, and associated shard distribution to S4. Based on this, S4 prioritizes nodes containing core data shards and associated data sets to achieve data locality scheduling and reduce cross-node transmission overhead.
[0046] Hybrid subtask collaborative scheduling: The computing power / storage subtasks split by S3 are scheduled separately by the heterogeneous node resource adaptation unit of S4. The computing power subtasks are allocated to GPU / high CPU nodes, and the storage subtasks are allocated to SSD nodes to avoid resource contention.
[0047] In S4, the collaborative scheduling strategy is executed, target nodes are selected based on node hierarchy and data correlation, data preloading and temporary high-speed link scheduling are performed, and task allocation and data transmission paths are optimized.
[0048] Specifically, based on the task requirement list and node load classification results, differentiated scheduling strategies are implemented, including node filtering, data association optimization, and load balancing fine-tuning.
Node filtering: Overloaded nodes are excluded, and candidate nodes are filtered from idle and suitable nodes. The filtering criteria are that the node's static attributes meet the task's computing power and storage requirements and that its storage load score is ≤ a preset storage threshold, to avoid storage bottlenecks.
Data association optimization: For tasks that need to call data associated with multiple nodes, priority is given to candidate nodes that contain core data shards and belong to the associated data shard set. If the core data shards are distributed across multiple nodes, the data transmission overhead between the nodes is calculated, and the node combination with the lowest transmission overhead is selected as the target node group to achieve localized data scheduling.
Load balancing fine-tuning: If there are multiple candidate nodes (groups), the load balancing gain of each candidate node (group) is calculated, and the node (group) with the largest gain is selected for task allocation to keep the cluster load balanced overall. The load balancing gain is calculated from the difference between the node's current comprehensive load and the average load of the cluster: the smaller the difference, the greater the gain.
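The load balancing fine-tuning step can be sketched as below. The description only states that a smaller difference from the cluster average yields a larger gain; using the negated absolute difference as the gain is an assumption, and the function and node names are hypothetical.

```python
def load_balancing_gain(node_load, cluster_average):
    """Larger when the node's load is closer to the cluster average."""
    return -abs(node_load - cluster_average)


def pick_target(candidates, cluster_average):
    """candidates maps node name to current comprehensive load score."""
    return max(
        candidates,
        key=lambda name: load_balancing_gain(candidates[name], cluster_average),
    )


target = pick_target({"node-a": 20.0, "node-b": 48.0, "node-c": 65.0}, cluster_average=50.0)
```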
[0049] For storage-intensive tasks, a data sharding preloading mechanism is implemented. Before task allocation, the associated data shards are preloaded into the cache of the target node. When the cache space is insufficient, cold data shards are replaced first to reduce read and write latency during task execution. For cross-node collaborative tasks, a temporary high-speed communication link is established to reduce network latency for data interaction between nodes. After the task is completed, the link resources are released to save network bandwidth.
[0050] It should be noted that candidate nodes are ranked by multi-dimensional fit: after screening, the candidate nodes are sorted on four indicators, namely resource matching degree (computing power and storage), data localization rate, network communication quality, and historical execution success rate. The indicator weights are adjusted dynamically according to the task type; for example, the data localization rate accounts for 40% of the weight for storage-intensive tasks. The top 3 nodes (groups) in the comprehensive ranking are selected for the final decision, which improves scheduling accuracy.
Intelligent priority allocation for preloading: Data are preloaded in batches according to the access priority of the data shards (core data > related data > redundant data) and the task execution sequence; for ultra-large data shards, a streaming preloading mode is adopted so that preloading does not occupy too much cache and block tasks.
Dynamic bandwidth allocation across nodes: When a temporary high-speed link is established, its bandwidth is allocated dynamically according to the task's data transmission requirements and the current network load of the cluster. For example, 80% of the available bandwidth is allocated when transmitting very large files, and 30% when transmitting small batches of interactive data. The link is also routed to avoid network congestion, selecting a transmission path with low latency and a low packet loss rate.
Dynamic task migration trigger mechanism: If, during task execution, the load of the target node rises sharply and approaches the overload threshold, or a communication link fails, some subtasks are automatically migrated to backup candidate nodes. Data consistency is maintained during migration, and only unexecuted task fragments are migrated to reduce migration overhead.
Elastic adaptation of heterogeneous node resources: Heterogeneous nodes, such as GPU nodes and high-I/O SSD nodes, allocate dedicated resource channels to different types of subtasks according to their hardware characteristics. For example, compute-intensive subtasks are scheduled to the dedicated computing power channels of GPU nodes, and storage subtasks are scheduled to the dedicated I/O channels of SSD nodes to avoid resource contention.
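The multi-dimensional ranking of candidate nodes can be illustrated with a weighted sum over the four indicators. Only the 40% weight for data localization in storage-intensive tasks comes from the description; the remaining weights, the metric keys, and the sample values are hypothetical, with every indicator assumed to be normalized to the range 0 to 1.

```python
# Only the 0.4 for data_locality is stated in the description; the rest are assumed.
STORAGE_INTENSIVE_WEIGHTS = {
    "resource_match": 0.2,
    "data_locality": 0.4,
    "network_quality": 0.2,
    "success_rate": 0.2,
}


def rank_candidates(candidates, weights, top_n=3):
    """Return the names of the top_n candidates by weighted indicator sum."""
    def total(metrics):
        return sum(weights[key] * metrics[key] for key in weights)

    ordered = sorted(candidates, key=lambda name: total(candidates[name]), reverse=True)
    return ordered[:top_n]


candidates = {
    "n1": {"resource_match": 0.9, "data_locality": 0.2, "network_quality": 0.8, "success_rate": 0.9},
    "n2": {"resource_match": 0.6, "data_locality": 0.9, "network_quality": 0.7, "success_rate": 0.8},
    "n3": {"resource_match": 0.5, "data_locality": 0.5, "network_quality": 0.5, "success_rate": 0.5},
    "n4": {"resource_match": 0.3, "data_locality": 0.1, "network_quality": 0.4, "success_rate": 0.6},
}
shortlist = rank_candidates(candidates, STORAGE_INTENSIVE_WEIGHTS)
```

Here `n2` ranks first despite a lower resource match than `n1`, because data localization carries the largest weight for this task type.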
[0051] S4 (Cooperative scheduling strategy execution) → S5 (Dynamic adjustment and feedback optimization): Real-time reporting of task execution status: S4 will synchronize the task allocation results, data preloading progress, and temporary link status of the target node (group) to the monitoring unit of S5 in real time, as the basis for dynamic adjustment.
[0052] Load fluctuation triggers migration warning: When S4 detects that the load fluctuation coefficient of the target node exceeds 15%, it immediately sends a warning signal to S5. S5 starts the resource reservation of the backup candidate node in advance, and triggers incremental migration when the node load approaches the threshold to avoid task interruption.
[0053] Closed-loop feedback of scheduling effect data: S4 sends data such as data transmission overhead and node load balancing gain during task execution back to the quality assessment submodule of S5 for iterative optimization of the weight coefficients and load thresholds of the load fusion assessment model.
[0054] In S5, dynamic adjustment and feedback optimization are implemented, task execution and node load are monitored in real time, scheduling parameters are optimized through incremental migration and quality assessment, and data sharding layout is adjusted periodically.
[0055] Specifically, during task execution, the dual-dimensional load data and task execution status of the target nodes (groups) are monitored in real time. If the node load reaches the overload threshold, task migration is triggered, and some tasks are migrated to idle nodes. During the migration process, the data sharding position is optimized simultaneously to reduce the data transmission overhead of subsequent tasks. At the same time, based on historical scheduling data, task execution efficiency, load balancing effect, and data transmission overhead, the weight coefficients and load thresholds of the load fusion evaluation model are adaptively adjusted to optimize the scheduling strategy. During task migration, an incremental migration mechanism is adopted, migrating only incomplete task fragments and associated data shards to avoid resource waste caused by full migration. During the feedback optimization process, task execution quality evaluation indicators, such as task completion rate and data transmission error rate, are added. Scheduled cases that fail to meet quality standards are analyzed separately, and the corresponding weight coefficients and screening conditions are adjusted. Data shards in distributed storage are redistributed periodically by default every 1 hour, migrating hot data shards to high-performance nodes and cold data shards to large-capacity nodes to optimize the overall storage layout.
[0056] It should be noted that load warning and advance scheduling work as follows: node load trends are monitored in real time. When a node's load increases by more than 20% period-over-period for 3 consecutive collection periods but has not reached the overload threshold, an early warning is triggered and subsequent tasks awaiting assignment are diverted to idle nodes in advance, which prevents the node from suddenly becoming overloaded and reduces the frequency of task migration.
Intelligent selection of migration nodes: When a task migration is triggered, nodes with high resource redundancy, high data similarity to the original node, and low network latency are selected from the backup candidate nodes as migration targets. A migration buffer time (300 ms by default) is reserved to ensure that task segments hand over seamlessly without interrupting execution.
A closed-loop iterative mechanism for feedback optimization: A scheduling case library is built from the quality assessment results and organized by task type (storage / computation / hybrid) and node type (SSD / HDD / GPU). Each time 100 cases of the same type have accumulated, the model parameters are automatically iterated once: the differences between the best and worst cases are compared, and the weight coefficients and screening conditions are corrected accordingly to improve scheduling adaptability.
On-demand dynamic adjustment of data shard redistribution: The redistribution cycle is no longer fixed at 1 hour but is tied to the life cycle of hot data. For example, redistribution is triggered when hot data has remained hot for more than 30 minutes and the cluster is idle or its load is below 40%. For ultra-large cold data shards, a lazy migration mode is adopted: the data are migrated to the corresponding node only when there is an access demand, which reduces the overhead of ineffective migration.
Emergency optimization strategy for abnormal scenarios: When the cluster experiences a sudden burst of large-scale task requests or a node failure, emergency mode is activated automatically. The load threshold of idle nodes is temporarily raised by 10%, the scheduling decision process is simplified so that only the core screening conditions are retained, the redistribution of non-critical data is suspended, and task execution continuity is prioritized; the system automatically switches back to normal mode after fault recovery.
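The early-warning rule can be sketched as a check over the most recent load samples. The interpretation that each of the 3 consecutive periods must grow by more than 20% relative to the period before it is an assumption, as are the function name and the default overload threshold of 70 points.

```python
def early_warning(loads, overload_threshold=70.0, growth=0.20, periods=3):
    """Return True when load grew by more than `growth` in each of the last
    `periods` collection periods while staying below the overload threshold."""
    if len(loads) < periods + 1:
        return False  # not enough samples to compare consecutive periods
    recent = loads[-(periods + 1):]
    if recent[-1] >= overload_threshold:
        return False  # already overloaded; migration handles this case
    return all(
        prev > 0 and (curr - prev) / prev > growth
        for prev, curr in zip(recent, recent[1:])
    )


rising = early_warning([20.0, 25.0, 32.0, 40.0])       # +25%, +28%, +25%
uneven = early_warning([20.0, 25.0, 26.0, 40.0])       # second step is only +4%
overloaded = early_warning([30.0, 40.0, 55.0, 75.0])   # last sample is past the threshold
```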
[0057] S5 (Dynamic Adjustment and Feedback Optimization) → S1 / S2 / S3 / S4 (Full-Step Reverse Collaboration): Data sharding redistribution and linked acquisition: When S5 triggers data sharding redistribution, it sends a hot and cold data attribute update command to S1. S1 synchronously adjusts the hot and cold data identification rules to ensure that the new hot data shards are acquired at high frequency.
[0058] Model parameter iteration and synchronous update: After S5 completes the optimization of the load fusion assessment model parameters, it synchronizes the new weight coefficients and load thresholds to S2. S2 updates the node hierarchical logic in real time to improve the accuracy of subsequent assessments.
[0059] Emergency mode full-link collaboration: When S5 triggers emergency mode, it sends a simplified scheduling instruction to S3 / S4. S3 suspends the splitting of mixed tasks and the fine-grained verification of privacy levels, while S4 retains only the core node screening conditions to prioritize task continuity.
[0060] Example 2: This invention provides a load-aware server computing power scheduling system. To better understand the above technical solution, it is described in detail below with reference to the accompanying drawings and specific implementations. As shown in Figure 2, the load-aware server computing power scheduling system includes the following modules: The distributed storage cluster incorporates a secure storage subunit and a cache management subunit to ensure data security and read/write efficiency. Specifically, the distributed storage cluster consists of multiple heterogeneous server nodes. Each node integrates computing power processing units (CPU, memory, GPU) and storage units (hard disk, SSD). The nodes are interconnected through a high-speed network to achieve distributed data storage and collaborative computing power processing. A metadata server is also deployed to store metadata such as data shard distribution, data correlation, and node attributes. The cluster has a built-in secure storage subunit that supports encrypted data storage (using the symmetric encryption algorithm AES-256) and access control, with differentiated storage strategies configured for data of different privacy levels. A cache management subunit is also provided to uniformly manage the cache resources of each node and to implement intelligent preloading and hot/cold replacement of data shards, improving data read/write efficiency.
[0061] It should be noted that when differentiated storage strategies are configured for data with different privacy levels, data integrity verification is integrated at the same time. Based on the SHA-256 hash algorithm and an operation log auditing function, data addition, deletion, modification, and query operations are recorded in real time; the log retention period is configurable, with a default of 90 days. Intelligent detection of abnormal access behavior, such as high-frequency access from different locations or unauthorized operations, is also supported: once detection is triggered, data access permissions are automatically frozen and an alarm is pushed. For top-secret data, a mechanism combining shard encryption with multi-node storage of key shards is additionally enabled, so that decryption requires collecting key shards from at least 3 authorized nodes, further strengthening data security protection.
The cache management subunit manages the cache resources of each node in a unified manner, implements intelligent preloading and hot/cold replacement of data shards to improve data read/write efficiency, and supports dynamic elastic expansion of cache resources: when the cluster cache hit rate falls below a preset threshold (85% by default), part of the memory of idle nodes is automatically requisitioned as temporary cache. In addition, a cache preloading prediction model is built from the task execution sequence and data access popularity to identify data shards that are about to be accessed and cache them first. Combined with a cache partition isolation mechanism, independent cache space is allocated to high-priority tasks to avoid cache resource contention.
[0062] The monitoring and acquisition module collects dual-dimensional load data and performs verification and hot / cold data identification.
[0063] Specifically, the monitoring and acquisition module includes lightweight monitoring agents deployed on each server node for real-time collection of computing load data and distributed storage load data. The collected data is formatted and transmitted to the load assessment module, while also supporting dynamic adjustment of the collection cycle. It integrates a data verification submodule and a hot / cold data identification submodule. The data verification submodule filters and completes abnormal data through timestamp synchronization and hash verification. The hot / cold data identification submodule marks data attributes based on access frequency and timestamps, providing data support for subsequent scheduling and storage optimization. It also supports cross-node data synchronization to ensure the consistency of the collected data.
[0064] It should be noted that when abnormal data are filtered and completed, a data consistency verification mechanism is applied at the same time. Data collected across nodes are cross-compared and isolated outliers are eliminated. If abnormal data appear for three consecutive periods, re-collection by the backup agent is triggered automatically; if re-collection fails, a historical data prediction model is used to generate a reliable substitute value. The hot and cold data identification submodule marks data attributes based on access frequency and timestamps, and refines the marking logic by combining data access correlation with task execution prediction results. It marks groups of related data that are accessed at high frequency as hot data groups, and marks data that have not been accessed for a long time but are expected to be scheduled within the next 72 hours as "quasi-hot data". It also supports dynamic transitions between hot and cold attributes: when the access frequency of cold data reaches the hot threshold, the data are upgraded to hot data in real time and pushed to the cache priority queue.
[0065] Monitoring and acquisition module ↔ Load assessment module: Real-time transmission of dual-dimensional data: The monitoring and acquisition module transmits computing power / storage load data, network communication quality data, and hot and cold data tags to the load assessment module using incremental compression, reducing bandwidth usage.
[0066] Evaluation results are used to optimize data collection: When the load assessment module detects abnormal load fluctuations in a node, it sends a high-frequency collection command to the monitoring and acquisition module, shortening the computing load collection cycle of that node from 100 ms to 50 ms and thereby improving data granularity.
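A minimal sketch of this feedback is given below. The 100 ms and 50 ms cycles come from the embodiment. The embodiment does not define how the fluctuation coefficient is computed, so the coefficient of variation (standard deviation divided by mean) and the 10% trigger limit are assumptions of this sketch.

```python
from statistics import mean, pstdev

NORMAL_CYCLE_MS = 100     # default collection cycle from the embodiment
FAST_CYCLE_MS = 50        # high-frequency cycle from the embodiment
FLUCTUATION_LIMIT = 0.10  # assumed trigger level


def fluctuation_coefficient(samples):
    """Coefficient of variation of recent load samples (assumed definition)."""
    mu = mean(samples)
    return pstdev(samples) / mu if mu else 0.0


def collection_cycle_ms(samples):
    """Pick the collection cycle for a node from its recent load samples."""
    if fluctuation_coefficient(samples) > FLUCTUATION_LIMIT:
        return FAST_CYCLE_MS
    return NORMAL_CYCLE_MS
```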
[0067] The load assessment module combines the fluctuation coefficient and media characteristics to generate a comprehensive load score and rating for the nodes.
[0068] Specifically, the load assessment module constructs a load fusion assessment model, receives the dual-dimensional load data transmitted from the monitoring and acquisition module, performs standardization and weighted summation, generates a comprehensive load score for each node, completes node classification, and synchronizes the results to the collaborative scheduling module. It also includes a load fluctuation analysis submodule and a media adaptation submodule. The load fluctuation analysis submodule calculates the node load fluctuation coefficient and marks nodes with high fluctuations, while the media adaptation submodule adjusts the load score weights according to the characteristics of different storage media, improving the assessment model's adaptability to heterogeneous storage clusters. The module also supports dynamic calibration of load thresholds, whose values are adjusted automatically based on cluster operating status.
[0069] It should be noted that, when calculating the node load fluctuation coefficient and marking high-fluctuation nodes, the load fluctuation analysis submodule also uses the slope of the load fluctuation trend to judge fluctuation stability and triggers an early warning for nodes whose high fluctuation keeps rising. Fluctuation thresholds can be set differently according to task type (the threshold for high-priority tasks is tightened to 10%). When adjusting the load scoring weights for different storage media characteristics, the media adaptation submodule introduces a media performance degradation coefficient to correct the weights dynamically, appropriately increases the storage load weight ratio for HDD nodes that have been in service for more than 3 years, and supports custom mappings between media type and weight. Threshold calibration also takes node hardware resource redundancy into account: the threshold range is appropriately relaxed for nodes with ample CPU and memory redundancy and tightened for nodes with tight resources to avoid overload. Calibration results are synchronized to the node classification system in real time, and the adjustment log is retained.
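The weighted summation and media adaptation can be illustrated with the following sketch. All numeric weights, media factors, and tier boundaries are hypothetical. The embodiment only requires that storage load weigh more for storage-intensive tasks, that computing load weigh more for computing-intensive tasks, and that the storage load weight be raised for HDD nodes in service for more than 3 years. Loads are assumed to be already standardized to the range 0 to 1.

```python
# (compute weight, storage weight) per task type; assumed values.
TASK_WEIGHTS = {"storage": (0.3, 0.7), "compute": (0.7, 0.3)}
# Storage-weight multipliers per media type; assumed custom mapping.
MEDIA_FACTOR = {"ssd": 1.0, "hdd": 1.2}
AGED_HDD_FACTOR = 1.1  # assumed degradation coefficient for HDDs older than 3 years


def load_score(compute_load, storage_load, task_type, media, age_years):
    """Comprehensive load score in [0, 1] from standardized loads in [0, 1]."""
    w_compute, w_storage = TASK_WEIGHTS[task_type]
    factor = MEDIA_FACTOR[media]
    if media == "hdd" and age_years > 3:
        factor *= AGED_HDD_FACTOR
    w_storage *= factor
    # Renormalize so the score stays comparable across media types.
    total = w_compute + w_storage
    return (w_compute * compute_load + w_storage * storage_load) / total


def tier(score):
    """Map a score to a node level; the boundaries are assumptions."""
    if score < 0.4:
        return "light"
    if score < 0.75:
        return "medium"
    return "heavy"
```

With these assumed numbers, a node with low computing load and high storage load scores higher on an aged HDD than on an SSD, which lowers its priority for storage-intensive tasks.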
[0070] The task parsing module identifies data privacy levels and splits mixed tasks.
[0071] Specifically, the task parsing module receives external tasks to be scheduled, parses task attributes and data requirements, obtains data sharding distribution and correlation information by querying the metadata server, generates a task requirement list, and transmits it to the collaborative scheduling module. It integrates a privacy level identification submodule and a task splitting submodule. The privacy level identification submodule identifies the privacy attributes of task-related data and filters nodes that meet security requirements. The task splitting submodule splits mixed-type tasks into computing power subtasks and storage subtasks, clarifies the subtask dependencies, and improves scheduling accuracy.
[0072] It should be noted that, when the privacy attributes of task-related data are identified, the data encryption status and node security qualifications are verified at the same time. For data classified as confidential or above, nodes that support national cryptographic algorithms are matched automatically, and a secure access token with an expiration time is generated to prevent unauthorized access caused by token leakage. The task splitting submodule splits mixed-type tasks into computing power subtasks and storage subtasks based on task resource consumption characteristics and execution sequence, clarifying subtask dependencies and resource allocation. It further supports dynamic merging and splitting of subtasks: adjacent subtasks whose resource requirements overlap by more than 80% are merged automatically to reduce scheduling overhead, and subtasks whose resource requirements exceed the node limit are split into fragments that execute in parallel. Critical-path subtasks are marked and given a higher scheduling priority, improving scheduling accuracy and task execution efficiency.
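The merging of adjacent subtasks can be sketched as follows. The embodiment does not define how resource requirement overlap is measured; this sketch assumes a weighted Jaccard ratio (sum of per-resource minima over sum of per-resource maxima) and assumes that a merged subtask needs the per-resource maximum of its parts. Both are illustrative choices only.

```python
def overlap(req_a, req_b):
    """Weighted Jaccard overlap of two resource requirement dicts (assumed metric)."""
    keys = req_a.keys() | req_b.keys()
    shared = sum(min(req_a.get(k, 0), req_b.get(k, 0)) for k in keys)
    total = sum(max(req_a.get(k, 0), req_b.get(k, 0)) for k in keys)
    return shared / total if total else 0.0


def merge_adjacent(subtasks, threshold=0.8):
    """Merge neighbouring (name, requirements) pairs whose overlap exceeds the threshold."""
    merged = []
    for name, req in subtasks:
        if merged and overlap(merged[-1][1], req) > threshold:
            prev_name, prev_req = merged[-1]
            keys = prev_req.keys() | req.keys()
            combined = {k: max(prev_req.get(k, 0), req.get(k, 0)) for k in keys}
            merged[-1] = (prev_name + "+" + name, combined)
        else:
            merged.append((name, dict(req)))
    return merged
```

For example, two neighbouring subtasks that request 4 CPU cores with 8 GB and 9 GB of memory overlap by 12/13 (about 92%) and are merged, while a following I/O-heavy subtask stays separate.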
[0073] Task parsing module ↔ Collaborative scheduling module: Precise matching of task requirements and node capabilities: The task requirement list generated by the task parsing module is compared in real time with the node classification results of the collaborative scheduling module, and candidate nodes with a resource matching degree of ≥80% are automatically selected.
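Candidate selection by resource matching degree can be sketched as below. The 80% threshold comes from the embodiment, while the definition of the matching degree (the average, over all required resources, of the fraction that a node can satisfy) is an assumption of this sketch, as are the function names.

```python
def matching_degree(required, available):
    """Average satisfied fraction over required resources (assumed definition)."""
    ratios = [
        min(1.0, available.get(name, 0) / amount)
        for name, amount in required.items()
        if amount > 0
    ]
    return sum(ratios) / len(ratios) if ratios else 1.0


def candidate_nodes(required, nodes, threshold=0.8):
    """Return node names with matching degree >= threshold, best match first."""
    scored = [(matching_degree(required, free), name) for name, free in nodes.items()]
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [name for score, name in scored if score >= threshold]
```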
[0074] Subtask dynamic merging and linkage: Subtasks with resource overlap ≥80% marked by the task parsing module are automatically merged into a single task for execution by the collaborative scheduling module, reducing scheduling overhead and improving execution efficiency.
[0075] The collaborative scheduling module performs preloading and temporary link scheduling.
[0076] Specifically, the collaborative scheduling module, based on the node hierarchy results and task requirement list, performs node screening, data association optimization, and load balancing fine-tuning operations to determine target nodes (groups) and assign tasks. Simultaneously, it coordinates the target nodes (groups) to optimize data transmission paths and data sharding locations. It includes a pre-loading management submodule and a temporary link scheduling submodule. The pre-loading management submodule performs data sharding pre-loading and cache optimization for storage-intensive tasks, while the temporary link scheduling submodule establishes high-speed communication links for cross-node collaborative tasks and dynamically releases resources. It supports task priority adaptation, prioritizing the scheduling of high-priority tasks to ensure the stable execution of core business operations.
[0077] It should be noted that, when performing data shard preloading and cache optimization for storage-intensive tasks, the preloading management submodule supports batch preloading ordered by data access priority and task execution sequence, enables a cache locking mechanism for core data shards to prevent them from being replaced, and dynamically adjusts the preloading ratio based on the cache hit rate to avoid wasting cache resources. When establishing links and dynamically releasing resources, the temporary link scheduling submodule selects the link transmission protocol, such as RDMA, based on real-time network bandwidth and latency data between nodes, and reserves dedicated bandwidth channels for high-priority cross-node tasks. Once a link is no longer in use, resource release detection is triggered to ensure that no residual occupation remains. A task preemption mechanism is also introduced: when a high-priority task arrives, a low-priority task can be paused and the resources it occupies migrated, and the low-priority task is resumed after the high-priority task completes. Resource preemption logs are recorded for subsequent scheduling strategy optimization, ensuring the stable execution of core business.
[0078] Collaborative scheduling module ↔ Feedback optimization module: Closed-loop optimization of preloading effect: The preloading management submodule of the collaborative scheduling module synchronizes cache hit rate and preloading delay data to the feedback optimization module, which then adjusts the cache preloading ratio and priority rules accordingly.
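The interaction between cache locking and replacement during preloading can be sketched as follows. The class name `PreloadCache` and its methods are hypothetical. The sketch assumes a fixed number of cache slots and, consistent with claim 5, replaces cold shards before hot ones while never replacing a locked core shard.

```python
class PreloadCache:
    """Fixed-capacity shard cache with locking and cold-first replacement."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = {}  # shard_id -> {"hot": bool, "locked": bool}

    def preload(self, shard_id, hot=False, locked=False):
        """Load a shard; return False if no slot can be freed."""
        if shard_id not in self.entries and len(self.entries) >= self.capacity:
            if not self._evict_one():
                return False
        self.entries[shard_id] = {"hot": hot, "locked": locked}
        return True

    def _evict_one(self):
        """Drop one unlocked shard, preferring cold shards over hot ones."""
        unlocked = [
            (entry["hot"], sid)
            for sid, entry in self.entries.items()
            if not entry["locked"]
        ]
        if not unlocked:
            return False
        # False sorts before True, so a cold shard is chosen first.
        _, victim = min(unlocked)
        del self.entries[victim]
        return True
```

A real cache would track sizes in bytes and recency of access; those details are omitted here to keep the locking and cold-first rules visible.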
[0079] Link resource dynamic release linkage: The temporary link scheduling submodule of the collaborative scheduling module triggers resource release detection after the link is used up, and the feedback optimization module synchronously updates the node network communication quality data for subsequent link path optimization.
[0080] The feedback optimization module optimizes scheduling parameters and adjusts data sharding layout through incremental migration and quality assessment.
[0081] Specifically, the feedback optimization module monitors task execution status and node load changes in real time, triggering task migration for overloaded nodes; it trains and optimizes the parameters of the load fusion evaluation model based on historical scheduling data, dynamically adjusts scheduling strategies, and improves scheduling accuracy and adaptability; it is equipped with an incremental migration submodule, a quality assessment submodule, and a sharding redistribution submodule. The incremental migration submodule realizes incremental migration of task fragments and related data, the quality assessment submodule evaluates scheduling effects and optimizes parameters through multi-dimensional indicators, and the sharding redistribution submodule periodically adjusts the data sharding layout to realize intelligent migration of hot and cold data and optimize cluster storage and computing resource configuration.
[0082] It should be noted that breakpoint resumption and data consistency verification are supported during migration. An incremental synchronization algorithm transmits only the difference data of unexecuted task fragments, and migration tasks are assigned a temporary priority to shorten migration time and avoid interfering with normal task execution. When evaluating scheduling effectiveness and optimizing parameters through multi-dimensional indicators, the quality assessment submodule constructs a four-dimensional evaluation system covering task execution efficiency, resource utilization, data transmission overhead, and task completion rate; performs root cause analysis on scheduling cases whose evaluation results fall below thresholds; and automatically generates model parameter adjustment schemes and verifies their effectiveness. When periodically adjusting the data sharding layout to achieve intelligent migration of hot and cold data, the sharding redistribution submodule selects off-peak periods, such as 2:00-4:00 AM, for shard migration based on the tidal pattern of cluster load. It also introduces a data shard aggregation strategy that merges frequently associated small data shards for storage, reducing metadata management overhead and optimizing cluster storage and computing resource configuration.
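The incremental migration and breakpoint resumption described above can be sketched as a planning step that decides which fragments still need to be transmitted. The fragment layout (`id`, `done`, `digest`) and the use of a content digest for the consistency check are assumptions of this sketch.

```python
def migration_plan(fragments, target_digests):
    """Return the ids of task fragments that still have to be transmitted.

    fragments: list of dicts with keys "id", "done", and "digest" (assumed layout).
    target_digests: mapping of fragment id to the digest already on the target node.
    """
    plan = []
    for frag in fragments:
        if frag["done"]:
            continue  # completed fragments are never migrated
        if target_digests.get(frag["id"]) == frag["digest"]:
            continue  # already on the target and consistent (resumed transfer)
        plan.append(frag["id"])
    return plan
```

Calling `migration_plan` again after an interrupted transfer, with the digests the target now reports, yields only the remaining fragments, which is one simple way to obtain breakpoint resumption together with a consistency check.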
[0083] Feedback optimization module ↔ Distributed storage cluster: Hot and cold data migration linked caching: When the feedback optimization module triggers data shard redistribution, it sends a hot data preloading instruction to the cache management subunit of the distributed storage cluster. The cache management subunit will then cache the hot data shards to high-performance nodes first.
[0084] Privacy level dynamic upgrade linked to secure storage: When the feedback optimization module detects abnormal data access, it sends a privacy level upgrade command to the secure storage subunit, which then automatically upgrades the data encryption strength and freezes abnormal access permissions.
[0085] The contents not described in detail in this specification are existing technologies known to those skilled in the art.
[0086] Although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art can still modify the technical solutions described in the foregoing embodiments or make equivalent substitutions for some of the technical features. Any modifications, equivalent substitutions, improvements, etc., made within the spirit and principles of the present invention should be included within the protection scope of the present invention.
Claims
1. A load-aware server computing power scheduling method, characterized in that: the method includes the following steps: S1. Multi-dimensional load data collection: Computing load data and distributed storage load data of each server node are collected in real time, and data verification and hot / cold data identification are performed during the collection process. S2. Load fusion assessment and node classification: A load fusion assessment model is used to calculate the comprehensive load score of each node, and node levels are classified by combining the load fluctuation coefficient and storage medium characteristics. S3. Task parsing and requirement matching: The attributes of the task to be scheduled and the required stored data information are parsed, and at the same time the data privacy level is identified and mixed-type tasks are split. S4. Collaborative scheduling strategy execution: Target nodes are selected based on node hierarchy and data correlation, data preloading and temporary high-speed link scheduling are performed, and task allocation and data transmission paths are optimized. S5. Dynamic adjustment and feedback optimization: Task execution and node load are monitored in real time, scheduling parameters are optimized through incremental migration and quality assessment, and the data sharding layout is adjusted periodically.
2. The server computing power scheduling method based on load awareness according to claim 1, characterized in that: In S1, the distributed storage load data includes storage capacity utilization, read / write IO throughput, number and distribution of data shards, and data redundancy between storage nodes. The computing load data includes CPU utilization, memory utilization, GPU computing speed, and task queue length. Hot and cold data are marked based on access frequency and most recent access time. A data verification mechanism is added during the collection process: abnormal data is excluded through timestamp synchronization and data hash verification, and is completed using the average of the previous 3 periods, ensuring the accuracy and reliability of the load data.
3. The server computing power scheduling method based on load awareness according to claim 2, characterized in that: The load fusion evaluation model in S2 employs a weighted summation algorithm, with weight coefficients adaptively adjusted according to task type. In storage-intensive task scenarios, the weight of storage load is higher than that of computing load, while in computing-intensive task scenarios, the weight of computing load is higher than that of storage load. Simultaneously, a load fluctuation coefficient is introduced, reducing the task allocation priority for nodes with excessive fluctuation coefficients. Load scoring weights are adjusted based on the characteristics of different storage media. Data privacy levels are simultaneously identified during task parsing; for confidential or higher-level data, only nodes with encrypted storage capabilities are selected as candidate nodes. The storage location of the data encryption key is also recorded to ensure the security of data access during task execution.
4. The server computing power scheduling method based on load awareness according to claim 3, characterized in that: In S3, the task attributes include task type and required storage data. Based on the data identifier, the distribution nodes and data association information of the corresponding data shards are queried from the distributed storage metadata server to generate a task requirement list. When parsing the task, the data privacy level is identified simultaneously. For confidential and higher-level data, only nodes with encrypted storage capabilities are selected as candidate nodes. At the same time, the storage location of the data encryption key is recorded to ensure the security of data access during task execution.
5. The server computing power scheduling method based on load awareness according to claim 4, characterized in that: S4 executes differentiated scheduling strategies based on the task requirement list and node load classification results, including node filtering, data association optimization, and load balancing fine-tuning. For storage-intensive tasks, a data sharding preloading mechanism is executed. Before task allocation, associated data shards are preloaded into the cache of the target node. When the cache space is insufficient, cold data shards are replaced first to reduce read and write latency during task execution.
6. The server computing power scheduling method based on load awareness according to claim 5, characterized in that: In step S5, an incremental migration mechanism is used during task migration, which only migrates task fragments that have not been completed and associated data fragments, thus avoiding resource waste caused by full migration.
7. A load-aware server computing power scheduling system, applied to the load-aware server computing power scheduling method according to any one of claims 1-6, characterized in that: It includes a distributed storage cluster, a monitoring and acquisition module, a load assessment module, a task parsing module, a collaborative scheduling module, and a feedback optimization module; The distributed storage cluster has a built-in secure storage and cache management subunit to ensure data security and read / write efficiency; The monitoring and acquisition module collects dual-dimensional load data and performs verification and hot / cold data identification. The load assessment module combines the fluctuation coefficient and media characteristics to generate a comprehensive load score and rating for the node. The task parsing module identifies data privacy levels and splits mixed tasks; The collaborative scheduling module performs preloading and temporary link scheduling; The feedback optimization module optimizes scheduling parameters and adjusts data sharding layout through incremental migration and quality assessment.
8. A server computing power scheduling system based on load awareness according to claim 7, characterized in that: The distributed storage cluster has a built-in secure storage sub-unit that supports encrypted data storage and access control, and configures differentiated storage strategies for data with different privacy levels. The monitoring and acquisition module integrates a data verification submodule and a hot / cold data identification submodule. The data verification submodule filters and completes abnormal data through timestamp synchronization and hash verification. The hot / cold data identification submodule marks data attributes based on access frequency and timestamp, providing data support for subsequent scheduling and storage optimization.
9. A server computing power scheduling system based on load awareness according to claim 8, characterized in that: The load assessment module includes a load fluctuation analysis submodule and a media adaptation submodule. The load fluctuation analysis submodule calculates the node load fluctuation coefficient and marks nodes with high fluctuations. The media adaptation submodule adjusts the load scoring weights according to the characteristics of different storage media, improving the assessment model's adaptability to heterogeneous storage clusters. The task parsing module integrates a privacy level identification submodule and a task splitting submodule. The privacy level identification submodule identifies the privacy attributes of task-related data and filters nodes that meet security requirements. The task splitting submodule splits mixed-type tasks into computing power subtasks and storage subtasks, clarifies the subtask dependencies, and improves scheduling accuracy.
10. A server computing power scheduling system based on load awareness according to claim 9, characterized in that: The collaborative scheduling module includes a preloading management submodule and a temporary link scheduling submodule. The preloading management submodule performs data fragmentation preloading and cache optimization for storage-intensive tasks, while the temporary link scheduling submodule establishes high-speed communication links for cross-node collaborative tasks and dynamically releases resources. The feedback optimization module is equipped with an incremental migration submodule, a quality assessment submodule, and a shard redistribution submodule. The incremental migration submodule realizes the incremental migration of task fragments and related data. The quality assessment submodule evaluates the scheduling effect and optimizes parameters through multi-dimensional indicators. The shard redistribution submodule periodically adjusts the data shard layout to realize the intelligent migration of hot and cold data and optimize the cluster storage and computing resource configuration.