A computing power cloud resource scheduling method and system based on adaptive learning
The adaptive learning-based cloud resource scheduling method uses training datasets and long short-term memory networks to predict resource demand and optimize resource allocation. This solves the problem of resource idleness or overload in traditional scheduling methods and improves the responsiveness and cost-effectiveness of cloud services.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- 邓慕超
- Filing Date
- 2025-04-18
- Publication Date
- 2026-06-16
AI Technical Summary
Traditional cloud resource scheduling methods struggle to cope with dynamic changes and diverse demands of task loads, leading to idle or overloaded resources, increased operating costs, and limitations on the responsiveness of cloud services in high-concurrency scenarios.
An adaptive learning-based cloud computing resource scheduling method is adopted. By acquiring a training dataset to train a long short-term memory network, short-term resource demand is predicted. Combined with system status and task priority, the resource allocation scheme is optimized to reduce resource idleness or overload.
It improves the responsiveness of cloud services in high-concurrency scenarios, reduces the probability of resource idleness or overload, and lowers operating costs.
Abstract
Description
Technical Field
[0001] This invention relates to the field of information technology, and in particular to a method and system for scheduling computing power cloud resources based on adaptive learning.
Background Technology
[0002] Currently, cloud computing, as a core pillar of the information technology field, directly determines the balance between system performance and cost-effectiveness through its resource scheduling capabilities. This is especially true in computing-intensive applications such as artificial intelligence and big data analytics, where efficient resource utilization has become crucial for industry competition. However, traditional cloud resource scheduling methods often rely on static configuration or simple rule-based strategies, making it difficult to cope with dynamic changes in task load and diverse demands. This leads to widespread resource idleness or overload. This inefficiency not only increases operating costs but also limits the responsiveness of cloud services in high-concurrency scenarios.
[0003] Therefore, it is necessary to provide a novel cloud computing resource scheduling method and system based on adaptive learning to solve the above-mentioned problems in the existing technology.
Summary of the Invention
[0004] The purpose of this invention is to provide a method and system for scheduling computing power cloud resources based on adaptive learning, which reduces the probability of resource idleness or overload.
[0005] To achieve the above objectives, the cloud computing resource scheduling method based on adaptive learning of the present invention includes:
[0006] The original training data is obtained, and then the original training data is divided into time segments using a sliding window method to obtain the training dataset. The original training data includes historical load task volume and historical real-time task request volume.
[0007] The Long Short-Term Memory network is trained using the training dataset to obtain a prediction model;
[0008] The prediction model is used to predict short-term resource demand. Then, combined with the current resource usage status of the system, it is determined whether there is a trend of sudden increase or decrease in resource demand, so as to obtain an adjusted prediction value. Based on the adjusted prediction value, it is determined whether it is necessary to increase the resources in the system.
[0009] The system obtains the status of resources in the system, sorts the system's task priorities, determines the computing power allocation ratio required for each task, and obtains a preliminary resource allocation plan.
[0010] Based on the constraints of resource scheduling, the preliminary resource allocation scheme is optimized to obtain an optimized resource allocation scheme;
[0011] Resources are scheduled according to the optimized resource allocation scheme.
[0012] The beneficial effects of the adaptive learning-based cloud resource scheduling method are as follows: A Long Short-Term Memory (LSTM) network is trained using the training dataset to obtain a prediction model. This model predicts short-term resource demands, and by combining this prediction with the current resource usage status of the system, it determines whether there is a sudden increase or decrease in resource demand, thus obtaining an adjusted prediction value. This ensures the accuracy of the prediction. The system's resource status is then obtained, and the system's task priorities are sorted to determine the required computing power allocation ratio for each task, resulting in a preliminary resource allocation scheme. This preliminary scheme is then optimized based on resource scheduling constraints to obtain an optimized resource allocation scheme. This reduces the probability of resource idleness or overload, reduces operating costs, and improves the responsiveness of cloud services in high-concurrency scenarios.
[0013] Optionally, after obtaining the raw training data, the following may also be included:
[0014] The original training data is then cleaned.
[0015] Optionally, before training the Long Short-Term Memory network using the training dataset, the method further includes:
[0016] The historical workload of each time period is compared with the preset workload threshold.
[0017] Based on the comparison results, each time period is divided into peak or trough periods. A first weight is assigned to the historical workload during the peak period for training the Long Short-Term Memory Network, and a second weight is assigned to the historical workload during the trough period for training the Long Short-Term Memory Network. The first weight is greater than the second weight.
[0018] Optionally, when predicting the forecast value of short-term resource demand using the prediction model, the method further includes:
[0019] When the historical real-time task request volume changes suddenly, the predicted value of the short-term resource demand predicted by the prediction model is compared with the preset sudden threshold.
[0020] If the predicted value of short-term resource demand by the prediction model exceeds the preset sudden threshold, the weight of the historical real-time task request volume in the prediction model is adjusted, and the short-term resource demand is predicted again.
[0021] Optionally, based on the current resource usage status of the system, it can be determined whether there is a sudden increase or decrease in resource demand to obtain adjusted forecast values, including:
[0022] The current resource usage status of the system includes the real-time load quantity. A weighted average of the real-time load quantity and the predicted value of short-term resource demand is taken to obtain the predicted value during adjustment.
[0023] The predicted value during adjustment is compared with the preset adjustment threshold. If the predicted value during adjustment is greater than the preset adjustment threshold, the resource demand trend is determined.
[0024] By observing the changes in the magnitude of the predicted values during multiple consecutive adjustments, it can be determined whether the trend of resource demand is a sudden increase or a decrease, so as to obtain an initial adjustment direction;
[0025] The initial adjustment direction is combined with the task type of the real-time load to update the predicted value during the adjustment, resulting in the adjusted predicted value.
[0026] Optionally, the resources in the system are provided by several processors, and the status of the resources in the system includes the unloaded computing power of several processors; the status of the resources in the system is obtained, and then the task priorities of the system are sorted to determine the computing power allocation ratio required for each task, so as to obtain a preliminary resource allocation scheme, including:
[0027] Obtain the computing power of several processors in the system when they are not under load;
[0028] Based on the system's task requirements for computing power, prioritize the tasks and determine the computing power allocation ratio required for each task.
[0029] Tasks are assigned to several processors in sequence. If the computing power required by the current task exceeds the computing power of the corresponding processor, the current task is assigned to the next processor. If multiple tasks are assigned to the same processor, the tasks are processed according to their priority.
[0030] Optionally, the resource scheduling constraints include processor load computing power limitations; based on the resource scheduling constraints, the initial resource allocation scheme is optimized to obtain an optimized resource allocation scheme, including:
[0031] If it is determined that, after a processor is assigned a task, the processor's loaded computing power exceeds its load computing power limit, the lowest-priority task on that processor is reassigned to obtain an optimized resource allocation scheme.
[0032] Optionally, the cloud computing resource scheduling method based on adaptive learning further includes:
[0033] Determine whether the result of subtracting the average computing power from the processor's current computing power is greater than the equilibrium threshold;
[0034] If the result of subtracting the average computing power from the processor's current computing power is greater than the balance threshold, then the task requiring the least computing power from the processor with the highest current computing power will be assigned to the processor with the lowest current computing power.
[0035] Optionally, the resources in the system are provided by several processors, and the adaptive learning-based cloud computing resource scheduling method further includes:
[0036] Determine whether the difference between the adjusted predicted value and the real-time load is greater than the target threshold;
[0037] If the difference between the adjusted predicted value and the real-time load is greater than the target threshold, the original training data is updated to update the prediction model.
[0038] Optionally, the cloud computing resource scheduling method based on adaptive learning further includes:
[0039] Predicted values are generated using the updated prediction model;
[0040] Determine whether the difference between the predicted value generated by the updated prediction model and the actual load is greater than the target threshold;
[0041] If the difference between the predicted value generated by the updated prediction model and the actual load is greater than the target threshold, the original training data will continue to be updated to update the prediction model.
[0042] Optionally, the resources in the system are provided by several processors, and the adaptive learning-based cloud computing resource scheduling method further includes:
[0043] Retrieve the resource scheduling execution logs from the system;
[0044] The idle rate and overload rate of resources are obtained through the scheduling execution log;
[0045] The resource scheduling frequency is adjusted based on the idle rate and the overload rate.
[0046] This invention also provides a cloud computing resource scheduling system based on adaptive learning, comprising a data processing unit, a training unit, a model running unit, and a resource allocation unit. The data processing unit acquires raw training data and then uses a sliding window method to segment the raw training data by time to obtain a training dataset. The raw training data includes historical load task volume and historical real-time task request volume. The training unit trains a Long Short-Term Memory network using the training dataset to obtain a prediction model. The model running unit predicts short-term resource demand using the prediction model, and then, combined with the current resource usage status of the system, determines whether there is a sudden increase or decrease in resource demand to obtain an adjusted prediction value. Based on the adjusted prediction value, it determines whether to increase resources in the system. The resource allocation unit optimizes the initial resource allocation scheme based on resource scheduling constraints to obtain an optimized resource allocation scheme, and schedules resources according to the optimized resource allocation scheme.
Attached Figure Description
[0047] Figure 1 is a flowchart of a cloud computing resource scheduling method based on adaptive learning in some embodiments of the present invention.
Detailed Implementation
[0048] To make the objectives, technical solutions, and advantages of this invention clearer, the technical solutions in the embodiments of this invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of this invention. All other embodiments obtained by those skilled in the art based on the embodiments of this invention without inventive effort are within the scope of protection of this invention. Unless otherwise defined, the technical or scientific terms used herein should have the ordinary meaning understood by those skilled in the art. The terms "comprising" and similar expressions used herein mean that the element or object preceding the word covers the element or object listed following the word and its equivalents, but do not exclude other elements or objects.
[0049] To address the problems existing in the prior art, embodiments of the present invention provide a computing power cloud resource scheduling method based on adaptive learning. Referring to Figure 1, the cloud resource scheduling method based on adaptive learning includes the following steps:
[0050] S1: Obtain the raw training data, and then use a sliding window method to segment the raw training data by time to obtain the training dataset. The raw training data includes historical load task volume and historical real-time task request volume. For example, at 2 PM on a past day, the system was performing 50 tasks and waiting for 80 tasks. The load task volume is the sum of the tasks being performed and the waiting tasks, which is 130. Relative to this moment, this is the historical load task volume. The task request volume at the current moment is the historical real-time task request volume. For example, if the current time is 12 PM, then the task request volume at 12 PM is the historical real-time task request volume. If the current time is 3 PM, then the task request volume at 3 PM is the historical real-time task request volume.
[0051] In some embodiments, after acquiring the original training data, the process further includes: data cleaning of the original training data. The original training data may contain missing values or outliers due to equipment malfunctions or network latency in the system.
[0052] When missing values are present, imputation methods can be used to fill them. For example, if the historical load of the system for a certain hour is missing, it can be filled by the average of the two hours before and after, or by a percentage of the average of the two hours before and after, such as 70%.
[0053] When outliers are detected, the median can be used to replace them. For example, if the historical load of the system is 50 in the first minute, 60 in the third minute, and 600 in the second minute, it can be seen that the historical load in the second minute is much larger than that in the first and third minutes, exceeding a reasonable range. Therefore, the median of the historical load in the first and third minutes, 55, can be used to replace the historical load in the second minute (600) to ensure data smoothness.
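The two cleaning rules in [0052] and [0053] can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function names, the use of one neighbor on each side, and the outlier test (`ratio` times the neighbor median) are assumptions, since the text only says the outlier exceeds "a reasonable range".

```python
from statistics import median


def fill_missing(values, scale=1.0):
    """Fill None entries with scale * mean of the adjacent known values.

    Assumption: "before and after" is read as one neighbor on each side.
    scale=0.7 corresponds to the "70% of the average" variant in the text.
    """
    filled = list(values)
    for i, value in enumerate(values):
        if value is None:
            neighbors = []
            if i > 0 and values[i - 1] is not None:
                neighbors.append(values[i - 1])
            if i + 1 < len(values) and values[i + 1] is not None:
                neighbors.append(values[i + 1])
            if neighbors:
                filled[i] = scale * sum(neighbors) / len(neighbors)
    return filled


def replace_outliers(values, ratio=3.0):
    """Replace an interior point with the median of its two neighbors
    when it exceeds ratio * that median (ratio is a hypothetical cutoff)."""
    cleaned = list(values)
    for i in range(1, len(values) - 1):
        neighbor_median = median([values[i - 1], values[i + 1]])
        if values[i] > ratio * neighbor_median:
            cleaned[i] = neighbor_median
    return cleaned


# The example from the text: loads of 50, 600, 60 in minutes 1 to 3.
print(replace_outliers([50, 600, 60]))
print(fill_missing([40, None, 60]))
```

Neighbors are always read from the original list, so one replaced value does not influence the decision for the next point.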
[0054] In some embodiments, the process of dividing the original training data by time using a sliding window method can be represented as follows: Setting the sliding window size to 1 hour and the step size to 15 minutes, the 24 hours are divided into 96 time periods, with each time period containing the average value of the original training data. For example, the first time period includes a first time point, a second time point, a third time point, and a fourth time point. The historical workload at the first time point is 44, the historical workload at the second time point is 48, the historical workload at the third time point is 46, and the historical workload at the fourth time point is 42. Therefore, the average of 44, 48, 46, and 42, which is 45, is taken as the historical workload of the first time period.
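The windowed averaging in [0054] can be sketched as below. The function name and the choice to express `window` and `step` in numbers of samples (rather than in minutes) are assumptions made for the illustration; how the patent handles the windows at the end of the day is not stated, so this sketch simply emits full windows only.

```python
def sliding_window_means(samples, window, step):
    """Return the mean of each full window of `window` samples,
    advancing the window start by `step` samples each time."""
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    return [
        sum(samples[start:start + window]) / window
        for start in range(0, len(samples) - window + 1, step)
    ]


# The four time points averaged in the text's example: 44, 48, 46, 42.
print(sliding_window_means([44, 48, 46, 42], window=4, step=1))
```

With 15-minute samples, a 1-hour window is `window=4` and a 15-minute step is `step=1`.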
[0055] In some embodiments, the raw training data obtained in step S1 is at least the raw training data of any past day, or it can be any past 2 days, any 3 days, or even any 30 days.
[0056] In some embodiments, before training the Long Short-Term Memory (LSTM) network using the training dataset, the method further includes: comparing the historical workload of each time period with a preset workload threshold; dividing each time period into peak or trough periods based on the comparison results; assigning a first weight for training the LSTM network to the historical workload during peak periods; and assigning a second weight for training the LSTM network to the historical workload during trough periods, wherein the first weight is greater than the second weight.
[0057] The preset load thresholds include peak load thresholds and trough load thresholds. For example, the peak load threshold is 75, and the trough load threshold is 25. Both thresholds can be adjusted based on actual conditions; no specific restrictions are imposed here. Taking the total historical load for any given time period on any day as an example, if the total historical load from 2 PM to 3 PM is 90, and the total historical load from 10 PM to midnight is 20, then the total historical load of 90 from 2 PM to 3 PM is greater than the peak load threshold of 75, and therefore 2 PM to 3 PM is considered a peak period. Conversely, the total historical load of 20 from 10 PM to midnight is less than the trough load threshold of 25, and therefore 10 PM to midnight is considered a trough period.
[0058] In some embodiments, the first weight is 0.6 and the second weight is 0.4. When the duration of the peak period is 1.5 times the duration of the trough period, the first weight can be adjusted to 0.7 and the second weight can be adjusted to 0.3 to form a balanced training set.
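The peak and trough labeling in [0057] and the weights in [0058] can be sketched as follows. The thresholds (75 and 25) and weights (0.6 and 0.4) come from the text; the "normal" label and its weight of 0.5 for loads between the two thresholds are assumptions, since the text does not say how such periods are treated.

```python
PEAK_THRESHOLD = 75    # peak load threshold from the example
TROUGH_THRESHOLD = 25  # trough load threshold from the example


def classify_period(load):
    """Label a time period by its total historical load."""
    if load > PEAK_THRESHOLD:
        return "peak"
    if load < TROUGH_THRESHOLD:
        return "trough"
    return "normal"  # assumption: not covered by the text


def training_weight(label, first_weight=0.6, second_weight=0.4,
                    normal_weight=0.5):
    """Map a period label to its training weight.

    normal_weight is a hypothetical placeholder for in-between periods.
    """
    return {"peak": first_weight,
            "trough": second_weight,
            "normal": normal_weight}[label]


# 2 PM to 3 PM has a load of 90; 10 PM to midnight has a load of 20.
for load in (90, 20):
    label = classify_period(load)
    print(load, label, training_weight(label))
```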
[0059] S2: Train the Long Short-Term Memory network using the training dataset to obtain a prediction model.
[0060] In some embodiments, in step S2, the Long Short-Term Memory (LSTM) network is selected to build the prediction model. Since the LSTM network is prior art, the specific training process is not described in detail here.
[0061] S3: The predicted value of short-term resource demand is predicted by the prediction model, and then combined with the current resource usage status of the system, it is determined whether there is a trend of sudden increase or decrease in resource demand, so as to obtain the adjusted prediction value.
[0062] In some embodiments, when predicting the short-term resource demand using the prediction model, the method further includes: when the historical real-time task request volume undergoes a sudden change, comparing the predicted short-term resource demand value predicted by the prediction model with a preset sudden threshold; if the predicted short-term resource demand value predicted by the prediction model exceeds the preset sudden threshold, adjusting the weight of the historical real-time task request volume in the prediction model, and re-predicting the predicted short-term resource demand value.
[0063] In some embodiments, when the number of real-time task requests is more than 1.5 times that of the previous task request, it is determined that the current number of real-time task requests has undergone a sudden change.
[0064] In some embodiments, the historical workload and historical real-time task request volume jointly determine the prediction model. When the historical real-time task request volume experiences a sudden change, the predicted short-term resource demand may be affected by the historical real-time task request volume, leading to an abnormally large predicted short-term resource demand. Therefore, when the historical real-time task request volume experiences a sudden change, the predicted short-term resource demand is compared with a preset sudden threshold. When the predicted short-term resource demand is greater than the preset sudden threshold, it indicates that the predicted short-term resource demand is abnormally large. By adjusting the weight of the historical real-time task request volume in the prediction model, the impact of sudden changes in the historical real-time task request volume on the predicted value can be reduced.
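The sudden-change check in [0063] and the re-prediction in [0062] and [0064] can be sketched as follows. The 1.5 factor comes from the text. Everything else is an assumption for illustration: `toy_predict` is a hypothetical stand-in for the trained LSTM model, and the way the request-volume weight enters the model (`request_weight`, lowered to `damped_weight`) is not specified in the text.

```python
def is_sudden_change(current_requests, previous_requests, factor=1.5):
    """True when the current request count exceeds 1.5x the previous one."""
    return current_requests > factor * previous_requests


def predict_with_burst_guard(predict, load_history, request_history,
                             sudden_threshold, request_weight=1.0,
                             damped_weight=0.5):
    """Predict once; if requests jumped and the prediction is abnormally
    large, lower the request-volume weight and predict again."""
    value = predict(load_history, request_history, request_weight)
    jumped = (len(request_history) >= 2 and
              is_sudden_change(request_history[-1], request_history[-2]))
    if jumped and value > sudden_threshold:
        value = predict(load_history, request_history, damped_weight)
    return value


def toy_predict(load_history, request_history, request_weight):
    """Hypothetical predictor: last load plus weighted last request count."""
    return load_history[-1] + request_weight * request_history[-1]


# Requests jump from 40 to 100, so the first prediction (150) exceeds the
# hypothetical threshold of 120 and is recomputed with the damped weight.
print(predict_with_burst_guard(toy_predict, [50], [40, 100],
                               sudden_threshold=120))
```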
[0065] In some embodiments, determining whether there is a sudden increase or decrease in resource demand in combination with the current resource usage status of the system, so as to obtain the adjusted predicted value, includes the following. The current resource usage status of the system includes the real-time load quantity. A weighted average of the real-time load quantity and the predicted value of short-term resource demand is taken to obtain a predicted value during adjustment. The predicted value during adjustment is compared with a preset adjustment threshold; if it is greater than the preset adjustment threshold, the resource demand trend is determined. By observing how the magnitudes of multiple consecutive predicted values during adjustment change, it is determined whether the trend is a sudden increase or a decrease, which gives a preliminary adjustment direction. The preliminary adjustment direction is then combined with the task type of the real-time load to update the predicted value during adjustment, which yields the adjusted predicted value. Specifically, a decrease in the real-time load quantity results in a smaller predicted value during adjustment, and an increase results in a larger one. By integrating historical and real-time data, this method smooths the impact of sudden changes in historical data.
[0066] In some embodiments, take a predicted value of short-term resource demand of 60 as an example; that is, the prediction model predicts that the task volume of the next stage is 60, while the real-time load quantity is 80. The weight of the real-time load quantity is 0.4, and the weight of the predicted value of short-term resource demand is 0.6. Taking the weighted average of the two, 0.4 × 80 + 0.6 × 60, gives a predicted value during adjustment of 68.
[0067] In some embodiments, taking a preset adjustment threshold of 70 as an example, if the predicted value during adjustment, 68, is less than the preset adjustment threshold of 70, the predicted value during adjustment is left at 68, and the adjusted predicted value is 68.
[0068] In some embodiments, taking the preset adjustment threshold of 70 as an example, if the predicted value 78 during adjustment is greater than the preset adjustment threshold of 70, several predicted values during adjustment are obtained. If the several predicted values during adjustment obtained in chronological order include 68, 70, and 75, it can be determined that the resource demand trend is a sudden increase, and the initial adjustment direction is to increase the predicted value. If the several predicted values during adjustment obtained in chronological order include 88, 85, and 79, it can be determined that the resource demand trend is a decrease, and the initial adjustment direction is to decrease the predicted value.
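The weighted average and the trend check from the preceding paragraphs can be sketched as follows. The weights (0.6 and 0.4) and the sample sequences are taken from the text; the function names and the rule that a trend requires a strictly monotonic sequence are assumptions, since the text only says the trend is read from how consecutive values change.

```python
def blend(predicted, realtime_load, w_pred=0.6, w_load=0.4):
    """Weighted average of the model prediction and the real-time load."""
    return w_pred * predicted + w_load * realtime_load


def trend_direction(values):
    """Return "increase" or "decrease" for a strictly monotonic sequence,
    or None when the sequence is too short or mixed (assumption)."""
    if len(values) < 2:
        return None
    diffs = [later - earlier for earlier, later in zip(values, values[1:])]
    if all(d > 0 for d in diffs):
        return "increase"
    if all(d < 0 for d in diffs):
        return "decrease"
    return None


print(blend(60, 80))                  # prediction 60, real-time load 80
print(trend_direction([68, 70, 75]))  # sequence from the text
print(trend_direction([88, 85, 79]))  # sequence from the text
```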
[0069] In some embodiments, the real-time workload tasks can be divided into normal computing tasks, difficult data computing tasks, and simple data computing tasks. The corresponding difficult data computing tasks have a large demand for resources, while the simple data computing tasks have a small demand for resources. In other words, difficult data computing tasks consume more resources than simple data computing tasks.
[0070] For example, a normal computing task requires 50 units of system resources, a difficult data computing task requires 80 units of system resources, and a simple data computing task requires 20 units of system resources.
[0071] When the real-time load task type is a difficult data processing task, it means that the task type of the next stage is also likely to be a difficult data processing task. The system resources required for a task in the next stage are greater. In order to avoid system overload, it is necessary to increase the estimated prediction value. If the initial adjustment direction is to increase the prediction value, then increase the prediction value by 20% based on the prediction value in the adjustment to obtain the adjusted prediction value; if the initial adjustment direction is to decrease the prediction value, then decrease the prediction value by 10% based on the prediction value in the adjustment to obtain the adjusted prediction value.
[0072] When the real-time load task type is a simple data processing task, the task type of the next stage is also likely to be a simple data processing task, and the system resources required for a task in the next stage are smaller. To avoid idle resources, the estimated prediction value needs to be reduced. If the initial adjustment direction is to increase the prediction value, the predicted value during adjustment is increased by 10% to obtain the adjusted prediction value; if the initial adjustment direction is to decrease the prediction value, the predicted value during adjustment is decreased by 20% to obtain the adjusted prediction value.
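The task-type adjustment in [0071] and [0072] amounts to a small lookup table. The sketch below uses the percentages from the text (difficult: +20% or -10%; simple: +10% or -20%); the function name is hypothetical, and leaving the value unchanged for normal tasks or when no direction was determined is an assumption, since the text does not cover those cases.

```python
# Multipliers keyed by (task type, initial adjustment direction).
ADJUSTMENT_FACTORS = {
    ("difficult", "increase"): 1.20,
    ("difficult", "decrease"): 0.90,
    ("simple", "increase"): 1.10,
    ("simple", "decrease"): 0.80,
}


def adjust_for_task_type(value_during_adjustment, task_type, direction):
    """Scale the predicted value during adjustment by task type and trend.

    Combinations not listed in the text (e.g. normal tasks) fall back to
    a factor of 1.0, which is an assumption of this sketch.
    """
    factor = ADJUSTMENT_FACTORS.get((task_type, direction), 1.0)
    return value_during_adjustment * factor


print(adjust_for_task_type(75, "difficult", "increase"))
print(adjust_for_task_type(79, "simple", "decrease"))
```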
[0073] In some embodiments, the adjusted predicted value is compared with the number of tasks at the system's highest load. When the ratio of the adjusted predicted value to the number of tasks at the system's highest load is greater than or equal to 80%, it is determined that the system needs to increase resources, for example, by adding 1 server to the existing 3 servers.
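The scale-up decision in [0073] is a single ratio test. In the sketch below the 80% ratio comes from the text, while the function name and the sample figures (a highest-load task count of 100) are hypothetical.

```python
def needs_more_resources(adjusted_prediction, max_load_tasks, ratio=0.8):
    """True when the adjusted prediction reaches 80% of the number of
    tasks the system handles at its highest load."""
    if max_load_tasks <= 0:
        raise ValueError("max_load_tasks must be positive")
    return adjusted_prediction / max_load_tasks >= ratio


# Hypothetical figures: the system's highest load is 100 tasks.
print(needs_more_resources(82, 100))
print(needs_more_resources(68, 100))
```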
[0074] S4: Obtain the status of resources in the system through the adjusted predicted values, then sort the system's task priorities, determine the computing power allocation ratio required for each task, and obtain a preliminary resource allocation plan.
[0075] In some embodiments, the resources in the system are provided by several processors, and the state of the resources in the system includes the unloaded computing power of the several processors. Obtaining the state of the resources in the system, then prioritizing the tasks in the system to determine the computing power allocation ratio required for each task, so as to obtain a preliminary resource allocation scheme, includes: obtaining the unloaded computing power of the several processors in the system; prioritizing the tasks according to their computing power requirements and determining the computing power allocation ratio required for each task; sequentially allocating tasks to the several processors; if the computing power required by the current task exceeds the unloaded computing power of the corresponding processor, allocating the current task to the next processor; and if the same processor is allocated multiple tasks, processing the tasks according to their priority.
[0076] Taking a system that includes a cluster of Graphics Processing Units (GPUs) as an example, a cluster of GPUs includes 4 GPUs, each GPU has a total computing power of 100 units, and the total computing power of the 4 GPUs is 400 units. 400 units is the resource in the system.
[0077] The four graphics processors are designated as the first, second, third, and fourth graphics processors. The first and second graphics processors are fully loaded, while the third and fourth graphics processors are idle. Therefore, the resource status in the system is as follows: the first graphics processor is fully loaded, the second graphics processor is fully loaded, the third graphics processor is idle, and the fourth graphics processor is idle.
[0078] Suppose there are 5 tasks, designated as Task 1, Task 2, Task 3, Task 4, and Task 5. Task 1 requires 50 units of computing power, Task 2 requires 30 units, Task 3 requires 40 units, Task 4 requires 20 units, and Task 5 requires 10 units. The tasks are prioritized according to their required computing power, with higher priority tasks requiring more computing power. Therefore, the priorities of Task 1, Task 2, Task 3, Task 4, and Task 5 are: Task 1, Task 3, Task 2, Task 4, and Task 5. The computing power allocation ratio for Task 1 is 50 / 150, for Task 2 it's 30 / 150, for Task 3 it's 40 / 150, for Task 4 it's 20 / 150, and for Task 5 it's 10 / 150.
[0079] Initially, the first task is assigned to the first processor, the second task to the second processor, the third task to the third processor, the fourth task to the fourth processor, and the fifth task to the first processor. However, since the first and second processors are fully loaded, the first and fifth tasks are reassigned to the third processor. The third processor then handles three tasks: the first, third, and fifth tasks. The total computing power required for these three tasks is 100 units (measured, e.g., by the number of CPU / GPU cores or floating-point performance), which the third processor can supply. If the third processor's computing power were insufficient, the tasks would be assigned to the next processor, the fourth processor. The third processor processes the first, third, and fifth tasks sequentially according to their priority. The second task is likewise reassigned to the fourth processor, which then handles two tasks: the second and fourth tasks. The total computing power required for these two tasks is 50 units, which the fourth processor can supply. The fourth processor processes the second and fourth tasks sequentially according to their priority.
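The prioritization, allocation ratios, and processor assignment in [0078] and [0079] can be sketched as follows. The task sizes and processor states are taken from the example. The placement rule is an assumption: the sketch places tasks in priority order on the first processor with enough unloaded computing power (a first-fit reading), which happens to reproduce the final assignment described in the text but is not confirmed as the patent's exact procedure.

```python
def prioritize(tasks):
    """Order task names by required computing power, largest first."""
    return sorted(tasks, key=lambda name: tasks[name], reverse=True)


def allocation_ratios(tasks):
    """Share of the total required computing power for each task."""
    total = sum(tasks.values())
    return {name: need / total for name, need in tasks.items()}


def first_fit(tasks, free_power):
    """Assign each task, in priority order, to the first processor whose
    remaining unloaded computing power covers it."""
    remaining = dict(free_power)
    plan = {processor: [] for processor in free_power}
    unassigned = []
    for name in prioritize(tasks):
        for processor in remaining:
            if tasks[name] <= remaining[processor]:
                plan[processor].append(name)
                remaining[processor] -= tasks[name]
                break
        else:
            unassigned.append(name)  # no processor can take this task
    return plan, unassigned


tasks = {"task1": 50, "task2": 30, "task3": 40, "task4": 20, "task5": 10}
# GPUs 1 and 2 are fully loaded; GPUs 3 and 4 are idle (100 units each).
free_power = {"gpu1": 0, "gpu2": 0, "gpu3": 100, "gpu4": 100}

plan, unassigned = first_fit(tasks, free_power)
print(prioritize(tasks))
print(plan)
```

Each processor's list is already in priority order, so processing the list front to back matches "processed according to their priority".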
[0080] S5: Optimize the initial resource allocation scheme by combining the constraints of resource scheduling to obtain an optimized resource allocation scheme.
[0081] In some embodiments, the resource scheduling constraints include the processor's load computing power limit; in combination with the resource scheduling constraints, the preliminary resource allocation scheme is optimized to obtain an optimized resource allocation scheme, including: if it is determined that after the processor is assigned a task, the processor's already loaded computing power exceeds the processor's load computing power limit, then the processor is reassigned to the lowest priority task to obtain an optimized resource allocation scheme.
[0082] Taking the processor's load computing power limit as an example, that is, if the total computing power of the graphics processor is 100 units, then the graphics processor can process a maximum of 80 units of tasks.
[0083] Suppose the graphics processing unit (GPU) handles two tasks, designated Task 1 and Task 2, with Task 1 having a higher priority than Task 2. Task 1 requires 50 units of computing power, while Task 2 requires 35 units, for a total computing power requirement of 85 units. When the GPU is handling Task 1 and Task 2, the load is 85%, exceeding the 80% load capacity limit. Therefore, the GPU needs to be reallocated to Task 2.
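The load-limit check in [0082] and [0083] can be sketched as follows. The 100-unit capacity, 80% limit, and task sizes come from the example; the function name is hypothetical, and repeating the eviction until the processor is within its limit is an assumption (the text only describes a single lowest-priority task being moved).

```python
def enforce_load_limit(assigned, capacity=100, limit_ratio=0.8):
    """Split a processor's tasks into those kept and those to reassign.

    `assigned` is a list of (task name, required units) in priority
    order, highest priority first. Lowest-priority tasks are removed
    until the total load is within capacity * limit_ratio.
    """
    limit = capacity * limit_ratio
    kept = list(assigned)
    to_reassign = []
    while kept and sum(units for _, units in kept) > limit:
        to_reassign.append(kept.pop())  # drop the lowest-priority task
    return kept, to_reassign


# Task 1 (50 units) outranks Task 2 (35 units); together they need 85.
kept, to_reassign = enforce_load_limit([("task1", 50), ("task2", 35)])
print(kept, to_reassign)
```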
[0084] In some embodiments, the cloud resource scheduling method based on adaptive learning includes: determining whether the result of subtracting the average computing power from the computing power already loaded on a processor is greater than a balancing threshold; if so, reassigning the task requiring the least computing power on the processor with the largest loaded computing power to the processor with the smallest loaded computing power. Here, the average computing power refers to the average of the computing power loaded on all processors.
[0085] Assume there are four graphics processing units (GPUs): GPU 1, GPU 2, GPU 3, and GPU 4. GPU 1 handles task 1, GPU 2 handles task 2, GPU 3 handles task 3, and GPU 4 handles tasks 4 and 5. Task 1 requires 38 units of computing power, so GPU 1 carries a load of 38 units. Task 2 requires 35 units, so GPU 2 carries 35 units. Task 3 requires 41 units, so GPU 3 carries 41 units. Task 4 requires 40 units and task 5 requires 6 units, so GPU 4 carries 46 units. The average load across the four GPUs is (38 + 35 + 41 + 46) / 4 = 40 units. If the balancing threshold is set to 5 units, subtracting the average from the load on GPU 4 gives 6 units, which is greater than the balancing threshold. Therefore task 5, the task with the lowest computing power requirement on GPU 4, is reassigned to GPU 2, the GPU with the lowest load.
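The balancing step can be reproduced with the numbers from this example. The sketch below is illustrative only: the function name `rebalance_once` and the nested-dictionary layout are assumptions, and it performs a single balancing move.

```python
def rebalance_once(loads, threshold):
    """Move the smallest task off the most loaded processor if needed.

    loads: dict mapping processor name -> dict of task name -> demand.
    Returns (task, source, destination), or None if no move is required.
    """
    totals = {proc: sum(tasks.values()) for proc, tasks in loads.items()}
    average = sum(totals.values()) / len(totals)
    src = max(totals, key=totals.get)          # most loaded processor
    if totals[src] - average <= threshold:
        return None
    dst = min(totals, key=totals.get)          # least loaded processor
    task = min(loads[src], key=loads[src].get) # smallest task on src
    loads[dst][task] = loads[src].pop(task)
    return task, src, dst


loads = {
    "GPU 1": {"task 1": 38},
    "GPU 2": {"task 2": 35},
    "GPU 3": {"task 3": 41},
    "GPU 4": {"task 4": 40, "task 5": 6},
}
print(rebalance_once(loads, threshold=5))  # ('task 5', 'GPU 4', 'GPU 2')
```

After the move, GPU 4 carries 40 units and GPU 2 carries 41 units, so no processor exceeds the average by more than the threshold.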
[0086] S6: Schedule resources according to the optimized resource allocation scheme.
[0087] In some embodiments, the resources in the system are provided by several processors, and the cloud resource scheduling method based on adaptive learning further includes: determining whether the difference between the adjusted predicted value and the real-time load is greater than a target threshold; if the difference between the adjusted predicted value and the real-time load is greater than the target threshold, then the original training data is updated to update the prediction model.
[0088] In some embodiments, the cloud resource scheduling method based on adaptive learning further includes: generating a predicted value through an updated prediction model; determining whether the difference between the predicted value generated by the updated prediction model and the actual load is greater than a target threshold; if the difference between the predicted value generated by the updated prediction model and the actual load is greater than the target threshold, then continuing to update the original training data to update the prediction model.
[0089] After the adjusted prediction value is generated by the prediction model and the computing power cloud resources are scheduled, the system will generate new original training data. If the difference between the adjusted prediction value and the real-time load is greater than the target threshold, it indicates that the prediction value generated by the prediction model has a large deviation. In this case, the prediction model needs to be corrected and updated. Updating the prediction model with new original training data can reduce the deviation of the subsequently generated prediction values and make the generated prediction values meet the requirements.
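The correction loop described in paragraphs [0087] to [0089] can be sketched as follows. The callables `predict` and `retrain` are hypothetical hooks standing in for the prediction model and its update step. Using the absolute difference and capping the number of rounds with `max_rounds` are assumptions not stated in the text.

```python
from typing import Callable


def update_until_accurate(
    predict: Callable[[], float],
    retrain: Callable[[], None],
    real_time_load: float,
    target_threshold: float,
    max_rounds: int = 5,
) -> int:
    """Retrain while the prediction deviates too far from the observed load.

    Returns the number of retraining rounds that were performed.
    """
    rounds = 0
    while (
        rounds < max_rounds
        and abs(predict() - real_time_load) > target_threshold
    ):
        retrain()   # update the training data and refit the model
        rounds += 1
    return rounds
```

In practice `retrain` would append the newly generated original training data and refit the Long Short-Term Memory network; here it is left abstract so that the control flow stays visible.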
[0090] In some embodiments, the cloud computing resource scheduling method based on adaptive learning further includes: retrieving resource scheduling execution logs from the system; obtaining the idle rate and overload rate of the resources through the scheduling execution logs; and adjusting the resource scheduling frequency according to the idle rate and the overload rate.
[0091] Suppose there are two graphics processing units (GPUs), designated as GPU 1 and GPU 2. Within a preset time period, such as 10 minutes, GPU 1 has an idle rate of 20% and an overload rate of 1%. Since GPU 1's idle rate is too high, its scheduling frequency needs to be increased. Similarly, GPU 2 has an idle rate of 1% and an overload rate of 25%. Since GPU 2's overload rate is too high, its scheduling frequency needs to be decreased. This means that tasks that should be allocated to GPU 2 can be preferentially allocated to GPU 1.
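The adjustment rule in this example can be sketched as below. The text does not state what counts as "too high", so the 10% limits, the step size, and the representation of scheduling frequency as a plain number are all assumptions chosen for illustration. Checking the overload rate first is likewise a design choice of this sketch.

```python
def adjust_frequency(
    current: float,
    idle_rate: float,
    overload_rate: float,
    idle_limit: float = 0.10,       # assumed threshold
    overload_limit: float = 0.10,   # assumed threshold
    step: float = 1.0,              # assumed adjustment step
) -> float:
    """Return the new scheduling frequency for one processor."""
    if overload_rate > overload_limit:
        # Overloaded too often: schedule onto this processor less frequently.
        return max(current - step, 0.0)
    if idle_rate > idle_limit:
        # Idle too often: schedule onto this processor more frequently.
        return current + step
    return current


print(adjust_frequency(5.0, idle_rate=0.20, overload_rate=0.01))  # 6.0 (GPU 1)
print(adjust_frequency(5.0, idle_rate=0.01, overload_rate=0.25))  # 4.0 (GPU 2)
```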
[0092] This invention also provides an adaptive learning-based cloud resource scheduling system for implementing the aforementioned cloud resource scheduling, comprising a data processing unit, a training unit, a model running unit, and a resource allocation unit. The data processing unit acquires raw training data and then uses a sliding window method to segment the raw training data by time to obtain a training dataset. The raw training data includes historical load task volume and historical real-time task request volume. The training unit trains a Long Short-Term Memory network using the training dataset to obtain a prediction model. The model running unit predicts short-term resource demand using the prediction model, and then, combined with the current resource usage status of the system, determines whether there is a sudden increase or decrease in resource demand to obtain an adjusted prediction value. Based on the adjusted prediction value, it determines whether to increase resources in the system. The resource allocation unit optimizes the initial resource allocation scheme based on resource scheduling constraints to obtain an optimized resource allocation scheme, and schedules resources according to the optimized resource allocation scheme.
[0093] While embodiments of the present invention have been described in detail above, it will be apparent to those skilled in the art that various modifications and variations can be made to these embodiments. However, it should be understood that such modifications and variations fall within the scope and spirit of the invention as set forth in the claims. Furthermore, the invention described herein may have other embodiments and can be implemented or carried out in various ways.
Claims
1. A method for scheduling computing power cloud resources based on adaptive learning, characterized in that the method includes: obtaining original training data, and then dividing the original training data into time segments using a sliding window method to obtain a training dataset, wherein the original training data includes historical load task volume and historical real-time task request volume; training a Long Short-Term Memory network using the training dataset to obtain a prediction model; predicting short-term resource demand using the prediction model, then, in combination with the current resource usage status of the system, determining whether there is a trend of sudden increase or decrease in resource demand so as to obtain an adjusted predicted value, and determining, based on the adjusted predicted value, whether it is necessary to increase the resources in the system; obtaining the status of resources in the system, sorting the system's tasks by priority, and determining the computing power allocation ratio required for each task to obtain a preliminary resource allocation scheme; optimizing the preliminary resource allocation scheme in combination with the resource scheduling constraints to obtain an optimized resource allocation scheme; scheduling resources according to the optimized resource allocation scheme; wherein the predicted values of the prediction model include the predicted quantity of tasks, which is used to represent the predicted value of short-term resource demand; and wherein determining, in combination with the current resource usage status of the system, whether there is a trend of sudden increase or decrease in resource demand so as to obtain the adjusted predicted value includes: the current resource usage status of the system includes the real-time load quantity, and a weighted average of the real-time load quantity and the predicted value of short-term resource demand is taken to obtain a predicted value during adjustment;
the predicted value during adjustment is compared with a preset adjustment threshold, and if the predicted value during adjustment is greater than the preset adjustment threshold, the resource demand trend is determined; by observing the changes in magnitude of the predicted values during multiple consecutive adjustments, it is determined whether the trend of resource demand is a sudden increase or a sudden decrease, so as to obtain an initial adjustment direction; the initial adjustment direction is combined with the task type of the real-time load to update the predicted value during adjustment, resulting in the adjusted predicted value.
2. The cloud computing resource scheduling method based on adaptive learning according to claim 1, characterized in that, Before training the Long Short-Term Memory network using the training dataset, the method further includes: The historical workload of each time period is compared with the preset workload threshold. Based on the comparison results, each time period is divided into peak or trough periods. A first weight is assigned to the historical workload during the peak period for training the Long Short-Term Memory Network, and a second weight is assigned to the historical workload during the trough period for training the Long Short-Term Memory Network. The first weight is greater than the second weight.
3. The cloud computing resource scheduling method based on adaptive learning according to claim 1, characterized in that, when predicting short-term resource demand using the prediction model, the method further includes: when the historical real-time task request volume changes suddenly, comparing the predicted value of short-term resource demand produced by the prediction model with a preset burst threshold; if the predicted value exceeds the preset burst threshold, adjusting the weight of the historical real-time task request volume in the prediction model and re-predicting the short-term resource demand.
4. The cloud computing resource scheduling method based on adaptive learning according to claim 1, characterized in that, The system's resources are provided by several processors, and the status of these resources includes the unloaded computing power of each processor. The system's resource status is obtained, and then the system's tasks are prioritized to determine the required computing power allocation ratio for each task, thus obtaining a preliminary resource allocation scheme, including: Obtain the computing power of several processors in the system when they are not under load; Based on the system's task requirements for computing power, prioritize the tasks and determine the computing power allocation ratio required for each task. Tasks are assigned to several processors in sequence. If the computing power required by the current task exceeds the computing power of the corresponding processor, the current task is assigned to the next processor. If multiple tasks are assigned to the same processor, the tasks are processed according to their priority.
5. The cloud computing resource scheduling method based on adaptive learning according to claim 4, characterized in that the resource scheduling constraints include a load computing power limit of the processor, and optimizing the preliminary resource allocation scheme in combination with the resource scheduling constraints to obtain the optimized resource allocation scheme includes: if it is determined that, after a task is assigned to the processor, the computing power already loaded on the processor exceeds the processor's load computing power limit, reassigning the lowest-priority task on the processor to obtain the optimized resource allocation scheme.
6. The cloud computing resource scheduling method based on adaptive learning according to claim 4 or 5, characterized in that the method further includes: determining whether the result of subtracting the average computing power from the computing power already loaded on the processor is greater than a balancing threshold; if so, assigning the task requiring the least computing power on the processor with the largest loaded computing power to the processor with the smallest loaded computing power.
7. The cloud computing resource scheduling method based on adaptive learning according to claim 1, characterized in that, The system's resources are provided by several processors, and the adaptive learning-based cloud computing resource scheduling method further includes: Determine whether the difference between the adjusted predicted value and the real-time load is greater than the target threshold; If the difference between the adjusted predicted value and the real-time load is greater than the target threshold, the original training data is updated to update the prediction model. Predicted values are generated using the updated prediction model; Determine whether the difference between the predicted value generated by the updated prediction model and the actual load is greater than the target threshold; If the difference between the predicted value generated by the updated prediction model and the actual load is greater than the target threshold, the original training data will continue to be updated to update the prediction model.
8. The cloud computing resource scheduling method based on adaptive learning according to claim 1, characterized in that, The system's resources are provided by several processors, and the adaptive learning-based cloud computing resource scheduling method further includes: Retrieve the resource scheduling execution logs from the system; The idle rate and overload rate of resources are obtained through the scheduling execution log; The resource scheduling frequency is adjusted based on the idle rate and the overload rate.
9. A computing power cloud resource scheduling system based on adaptive learning, characterized in that the system is used to implement the cloud computing resource scheduling method based on adaptive learning according to any one of claims 1 to 8, and includes a data processing unit, a training unit, a model running unit, and a resource allocation unit. The data processing unit acquires raw training data and then segments the raw training data by time using a sliding window method to obtain a training dataset. The raw training data includes historical load task volume and historical real-time task request volume. The training unit trains a Long Short-Term Memory network using the training dataset to obtain a prediction model. The model running unit predicts short-term resource demand using the prediction model, then, based on the current resource usage status of the system, determines whether there is a sudden increase or decrease in resource demand to obtain an adjusted prediction value. Based on the adjusted prediction value, it determines whether additional resources are needed in the system. The resource allocation unit optimizes the preliminary resource allocation scheme based on resource scheduling constraints to obtain an optimized resource allocation scheme, and schedules resources according to the optimized resource allocation scheme.