An emerging cloud resource elastic scheduling method and system based on machine learning

By calculating scheduling weights in edge cloud nodes and combining network latency and bandwidth to assess the actual delivery capabilities between nodes, the problems of distorted computing power supply and demand assessment and traffic overflow are solved, and efficient computing power scheduling and task allocation in heterogeneous networks are realized.

CN122248068A · Pending · Publication Date: 2026-06-19 · XIAN MINGFU CLOUD COMPUTING CO LTD

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
XIAN MINGFU CLOUD COMPUTING CO LTD
Filing Date
2026-05-21
Publication Date
2026-06-19

AI Technical Summary

Technical Problem

Existing technologies in cloud computing and edge computing networks suffer from problems such as distorted assessment of computing power supply and demand and inaccurate prediction of network traffic overflow. In particular, when the underlying communication links are congested, it is impossible to accurately assess the network delivery capacity between nodes, resulting in the obstruction or erroneous overflow of business data flow.

Method used

By acquiring the network latency, available bandwidth, and idle computing power coefficient of edge cloud nodes, and combining a single-layer feedforward neural network and a long short-term memory network, scheduling weights are calculated. This process removes false indicators of hardware idle rate, weakens the evaluation weight of remote nodes, and rationally allocates computing tasks to edge nodes that balance data distribution and underlying computing capacity.

Benefits of technology

Achieving efficient and reasonable computing power scheduling in heterogeneous network environments, avoiding restricted network nodes, predicting and avoiding network cascading impacts and traffic overflows, ensuring that computing tasks are executed on nearby nodes with sufficient computing power and complete data, and maintaining stable network response under massive bursts of concurrent services.

✦ Generated by Eureka AI based on patent content.


Abstract

This invention belongs to the field of edge cloud computing network resource scheduling within the information processing discipline, and relates to an emerging cloud resource elastic scheduling method and system based on machine learning. The method includes: obtaining the state sequence of edge cloud nodes in an idle state and a routing adjacency matrix based on routing hop count; calculating network congestion from the state sequence to obtain the computing power supply index of the edge cloud nodes; obtaining the attention coefficient by combining the routing adjacency matrix and the state sequence, and obtaining the computing power demand index from the output of a Long Short-Term Memory network; and obtaining the scheduling weight from the computing power demand index, computing power supply index, and cache affinity, and allocating tasks according to the scheduling weight. By jointly considering external network communication congestion and the risk of future business load surges, the method prevents transmission delays of computing tasks on restricted links and improves the reliability of computing power scheduling.

Description

Technical Field

[0001] This invention belongs to the field of edge cloud computing network resource scheduling within the information processing discipline, and particularly relates to an emerging cloud resource elastic scheduling method and system based on machine learning, used for network resource allocation and task scheduling in heterogeneous edge cloud environments. It can prevent transmission delays of computing tasks on restricted links and improve the reliability of computing power scheduling.

Background Technology

[0002] Cloud computing and edge computing networks play a vital role in the fields of concurrent processing of massive data and distributed computing. In the process of providing computing services, uneven node load and business processing delays often occur. Therefore, efficient and reasonable network resource scheduling is particularly important for ensuring the service quality of cloud platforms.

[0003] Existing related technologies, such as Chinese patent application CN117931424A, disclose a network resource scheduling optimization system for a cloud environment. This system includes a data acquisition module, a machine learning module, and a real-time scheduling decision module. It trains a deep learning neural network on historical network traffic and resource utilization data, uses a time series prediction model to predict future network traffic and resource load, and performs real-time resource allocation accordingly. However, related technologies mainly rely on time series models to independently extrapolate the historical traffic and load of a single node. When congestion occurs in the underlying communication link, the queuing delay of data packets increases significantly and the available bandwidth is limited, causing business data flow to be blocked outside the network link. At this time, the resource load prediction cannot truly reflect the node's data reception and actual delivery capabilities.

[0004] Among existing related technologies, Chinese patent document CN114675972B discloses a cloud network resource elastic scheduling method and system based on an integral algorithm. This scheme updates the integral value by collecting the actual number of CPU clock cycles consumed by each virtual logical server and the percentage of those cycles, and dynamically limits the CPU consumption of VMs within the chip during the next working time based on this integral value, thereby achieving resource allocation. However, this scheme mainly relies on the CPU computing power consumption of a single node and the integral mechanism for one-dimensional rate limiting and allocation, failing to fully consider the underlying communication resistance of cross-regional edge computing networks. When data interaction faces congestion in the underlying links, simply relying on CPU resource integrals cannot accurately assess the actual network delivery capacity between nodes, easily leading to distorted assessments of computing power supply and demand.

[0005] Among existing related technologies, Chinese patent document CN121210092A discloses a cloud resource elastic scheduling optimization method based on reinforcement learning. This method acquires dynamic demand data of tasks in a cloud computing platform and defines task states and action spaces. It then employs a reinforcement learning algorithm combined with a reward function encompassing resource utilization efficiency, task response time, and energy consumption for dynamic resource scheduling. However, when modeling platform resource states, this related technology is not constrained by network routing resistance. Furthermore, when assessing node load evolution, the lack of specific constraints on network space cascading propagation makes it difficult to effectively identify false data associations between nodes without direct links or those located extremely far apart, easily leading to misjudgments of erroneous overflow of business traffic between network-isolated nodes.

Summary of the Invention

[0006] The purpose of this invention is to overcome the shortcomings of the prior art, solve the technical problems of distorted computing power supply and demand assessment and inaccurate prediction of network traffic space overflow, and provide an emerging cloud resource elastic scheduling method and system based on machine learning.

[0007] To achieve the aforementioned objectives, this invention provides an emerging cloud resource elastic scheduling method based on machine learning. The main process includes the following steps: obtaining an idle computing power coefficient based on the average of the processor idle ratio and the memory idle ratio; combining the node network latency and the uplink and downlink available bandwidth of the edge cloud node with the idle computing power coefficient to obtain a state sequence; using the smaller of the uplink and downlink available bandwidths as the equivalent available bandwidth; calculating the ratio of node network latency to equivalent available bandwidth to obtain the network congestion; adjusting the idle computing power coefficient using the network congestion to obtain the computing power supply index of the edge cloud node; taking the state sequence of any edge cloud node as the first sequence, and taking the state sequence of any adjacent node of that edge cloud node as the second sequence; concatenating the first sequence and the second sequence and inputting them into a single-layer feedforward neural network to obtain the correlation between the edge cloud node and its adjacent node; adjusting the correlation according to the number of router devices in the packet forwarding path to obtain the attention coefficient; fusing the second sequences of the adjacent nodes using the attention coefficients to obtain a fused feature vector, and inputting the fused feature vector into a long short-term memory network model to output the computing power demand index; and using the hit ratio of the dependent files required by the task to be scheduled as the cache affinity, obtaining the scheduling weight from the computing power supply index, cache affinity, and computing power demand index, and distributing the task to be scheduled to the edge cloud node ranked first by scheduling weight.

[0008] This invention extracts network congestion and routing hop count from edge cloud nodes to attenuate and truncate the initial idle state of hardware and the data correlation between nodes. It then combines the local cache affinity of nodes to obtain the final scheduling weight, thereby executing task distribution. This transforms the phenomenon of congestion in the underlying communication links into downward correction parameters on the computing power supply side, and transforms the spatial cascading effect of sudden business traffic spreading along the network architecture to adjacent nodes into attention attenuation weights for feature fusion between nodes. By introducing communication indicators and routing resistance, it effectively removes false node associations with no direct links or excessively long spans, offsetting the supply and demand distortion caused by relying solely on hardware idle rate for evaluation. This prompts scheduling decisions to automatically avoid restricted network nodes and high-load nodes that are predicted to face native business impacts, and rationally allocates computing tasks to edge nodes that balance data distribution and underlying computing capacity. This makes the actual delivery of computing power closer to the underlying transmission patterns of the edge network, thereby achieving efficient and reasonable computing power scheduling in a heterogeneous network environment.

[0009] The scheduling weight described in this invention satisfies a relation whose terms are defined as follows: the scheduling weight of the i-th edge cloud node for the k-th task to be scheduled; the computing power supply index of the i-th edge cloud node at the t-th time step; the cache affinity of the i-th edge cloud node for the k-th task to be scheduled; and the computing power demand index of the i-th edge cloud node at the t-th time step.

[0010] This invention transforms the latency risks caused by fetching large files across regions and the overload risks of nodes facing native business demands into collaborative control weights for task distribution priorities. On the numerator side, nodes with locally resident data have their scheduling priority increased, offsetting the transmission overhead of remote data retrieval; on the denominator side, the weight of nodes predicted to be under high load is reduced, avoiding collisions between external tasks and node native tasks competing for hardware bus and computing resources, making scheduling decisions approach a reasonable allocation of in-memory computing resources.
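The numerator and denominator roles described above can be illustrated with a minimal sketch. The exact relation is not given in this text, so the form `supply * (1 + affinity) / (demand + eps)` is an assumption chosen only to reproduce the stated behavior (higher cache affinity raises the weight, higher predicted demand lowers it). The names `scheduling_weight`, `pick_node`, and `eps`, and the sample values, are hypothetical.

```python
def scheduling_weight(supply, affinity, demand, eps=1e-3):
    """Assumed form: supply and cache affinity in the numerator, demand in the denominator."""
    return supply * (1.0 + affinity) / (demand + eps)


def pick_node(nodes):
    """Return the name of the node ranked first by scheduling weight.

    nodes maps a node name to a (supply, affinity, demand) tuple.
    """
    ranked = sorted(nodes, key=lambda name: scheduling_weight(*nodes[name]), reverse=True)
    return ranked[0]


# Hypothetical values: A holds most dependent files locally, B is idle but
# has little cached data and a predicted load spike.
candidates = {
    "A": (0.6, 0.9, 0.3),
    "B": (0.8, 0.1, 0.9),
}
print(pick_node(candidates))  # A
```

Under this assumed form, node B's higher hardware supply does not outweigh its low cache affinity and high predicted demand, so the task goes to node A.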

[0011] The present invention describes using the smaller of the uplink and downlink available bandwidths as the equivalent available bandwidth, which includes: linearly normalizing the uplink and downlink available bandwidths respectively, extracting the normalized uplink available bandwidth and the normalized downlink available bandwidth; and using the smaller of the normalized uplink available bandwidth and the normalized downlink available bandwidth as the equivalent available bandwidth.

[0012] The method of obtaining network congestion by calculating the ratio of node network latency to equivalent available bandwidth according to the present invention includes: adding the equivalent available bandwidth to a preset minimum value to obtain the corrected available bandwidth; calculating the normalized ratio of node network latency to corrected available bandwidth, and using the ratio as the network congestion level.

[0013] This invention extracts the smaller value between the normalized uplink and downlink available bandwidth and uses it as the equivalent available bandwidth. This transforms the phenomenon of data packets being constrained by bottleneck links during bidirectional transmission between edge cloud nodes and the cloud into a capacity limitation parameter for bandwidth bottlenecks. During data interaction, the complete flow of tasks depends on the joint coordination of uplink and downlink. The method of taking the smaller value offsets the false prosperity brought about by unidirectional high bandwidth, removes redundant bandwidth data that cannot be actually used for the entire link business flow, and makes the evaluation index closer to the true bidirectional data throughput boundary of the node, avoiding the overestimation of the node's data carrying capacity.

[0014] The computing power supply index described in this invention satisfies a relation whose terms are defined as follows: the computing power supply index of the i-th edge cloud node at the t-th time step; the idle computing power coefficient of the i-th edge cloud node at the t-th time step; and the network congestion of the i-th edge cloud node at the t-th time step.

[0015] This invention transforms the risk of data not being effectively delivered to nodes due to communication link deterioration into a weighted reduction in hardware computing power. When congestion increases, it rapidly suppresses the computing power score brought by idle hardware, offsetting the false impression of inflated evaluation caused by nodes having idle hardware but no usable network. It transfers the pure hardware idle attribute to the actual network delivery capability, thereby making the supply index approach the true computing capacity boundary of edge nodes under the current communication constraints.

[0016] The present invention describes adjusting the correlation based on the number of router devices in the packet forwarding path to obtain the attention coefficient, which includes: extracting the packet forwarding path between each edge cloud node using the border gateway protocol; counting the number of router devices traversed between any two edge cloud nodes and using the corresponding number of router devices as the routing hop count; and setting the corresponding attention coefficient to zero in response to the routing hop count being greater than the maximum effective hop count threshold.

[0017] The present invention further includes adjusting the correlation based on the number of router devices in the packet forwarding path to obtain the attention coefficient, and further includes: in response to the routing hop count being less than or equal to the maximum effective hop count threshold, combining the routing hop count and the correlation to obtain the attention coefficient.

[0018] The attention coefficient described in this invention satisfies a relation whose terms are defined as follows: the attention coefficient between the i-th edge cloud node and its j-th neighboring cloud node; the correlation between the i-th edge cloud node and its j-th neighboring cloud node; the routing hop count between the i-th edge cloud node and its j-th neighboring cloud node; and the maximum effective hop count threshold.

[0019] In the present invention, fusing the second sequences of the adjacent nodes using the attention coefficients to obtain a fused feature vector, and inputting the fused feature vector into a long short-term memory network model to output a computing power demand index, includes: weighting the corresponding second sequences by their non-zero attention coefficients to obtain a fused feature vector containing spatial structure attributes; constructing an input sequence from the fused feature vectors corresponding to each time step within a historical preset time window; inputting the input sequence into a pre-trained long short-term memory network model for temporal evolution feature extraction and outputting a load prediction value for the next time step; and performing min-max normalization on the load prediction value to obtain the computing power demand index for the next time step.

[0020] This invention uses a non-zero attention coefficient as a weight to perform a weighted summation of state sequences, constructs the fused features into an input sequence, and feeds it into a long short-term memory network to extract temporal evolution features and obtain a computing power demand index. It transforms the spatial cascading phenomenon of sudden traffic bursts in a specific area and outward spread along the topology into a spatiotemporal feature coordination process containing structural attributes. It distributes and transfers the operating load of surrounding adjacent nodes to the central node according to the magnitude of communication resistance. It can coordinate the historical state evolution in the time dimension and the network spillover in the spatial dimension, so that the deduced demand index approaches the real traffic surge transmitted from adjacent links.

[0021] The present invention also provides an emerging cloud resource elastic scheduling system based on machine learning, including a processor and a memory, wherein the memory stores computer program instructions, and when the computer program instructions are executed by the processor, the above-mentioned emerging cloud resource elastic scheduling method based on machine learning is implemented.

[0022] By adopting the above-mentioned technical solution, the present invention generates a computer program from the above-mentioned machine learning-based emerging cloud resource elastic scheduling method and stores it in a memory so that it can be loaded and executed by a processor. This allows for the creation of a terminal device based on the memory and processor, making it convenient to use.

[0023] Compared with existing technologies, this invention has at least the following beneficial effects. First, it incorporates the fluctuation of the underlying communication link into the evaluation of node availability. By jointly considering the bottleneck of uplink and downlink available bandwidth and data packet queuing delays, it eliminates the false computing power indicators caused by hardware idleness and compensates for the local limitations of relying solely on processor utilization for decision-making. This moves the computing power evaluation standard from the hardware level to an actual delivery level that includes network connectivity, providing a benchmark with real reachability for resource and task flow under harsh and complex network conditions. Second, it integrates the blocking effect of router topology on data interaction into the correlation analysis of adjacent nodes. By using the routing hop count as both a hard blocking rule and a soft distance penalty factor, it cuts off and filters out false associations between nodes that are completely isolated in cyberspace but occasionally exhibit waveform overlap. This makes the constructed spatiotemporal input features closely match the trajectory of burst data flows spreading layer by layer along optical fibers, providing a prediction boundary for avoiding network cascading impacts and traffic overflows in advance. Third, it treats the local resident ratio of task-dependent files as a compensatory weighting of node supply and demand, and treats the predicted high-load state as a demotion penalty on allocation priority. The task distribution action automatically steers away from edge cloud nodes that would need to transfer large files remotely across the network or that face traffic peaks, so that computing tasks are executed on a nearby node with sufficient computing power and complete data, maintaining stable network response under massive bursts of concurrent services.

Attached Figure Description

[0024] Figure 1 is a flowchart of the emerging cloud resource elastic scheduling method based on machine learning according to this invention.

[0025] Figure 2 is a time-series variation diagram of the scheduling weights of the edge cloud cluster according to this invention.

Detailed Implementation

[0026] The technical solution of the present invention will now be clearly and completely described in conjunction with the embodiments and accompanying drawings.

[0027] Example 1: This embodiment discloses an emerging cloud resource elastic scheduling method based on machine learning, the main process of which is shown in Figure 1 and includes steps S1-S4: S1. Obtain the state sequence and the routing adjacency matrix.

[0028] Typically, the operational status of edge cloud nodes providing computing power services is constrained by both their own hardware consumption and the external network environment. Furthermore, the interconnection of nodes via fiber optic cables and routing devices determines the data transmission path of sudden traffic surges within the cluster. Scheduling based solely on the hardware load of a single node ignores spatial cascading effects, leading to erroneous overflows of sudden traffic to neighboring nodes without direct connections. Therefore, this invention obtains state sequences and a routing adjacency matrix containing hop counts, providing a foundation for subsequent computing power assessment under the network environment.

[0029] Specifically, when the system is in a no-load, idle state, operation logs are continuously collected within a preset time period using cloud monitoring probes. The processor idle ratio and memory idle ratio are extracted from each operation log within the preset time period. A first average of all processor idle ratios and a second average of all memory idle ratios are calculated. The average of the first and second averages is used as the idle computing power coefficient. Node network latency and available uplink and downlink bandwidth are extracted from the operation logs. Based on timestamps, the idle computing power coefficient, node network latency, and available uplink and downlink bandwidth are combined and normalized to obtain a state sequence. The packet forwarding paths between edge cloud nodes are extracted using the border gateway protocol. The number of router devices traversed between any two edge cloud nodes is counted, and this number is used as the corresponding routing hop count. Based on all routing hop counts, the routing adjacency matrix of the edge cloud node cluster is obtained.

[0030] For example, the preset time period is 5 minutes. Implementers can determine the preset time period based on the actual situation. For example, when the hardware of the edge cloud node is highly volatile or there are many background processes, the time period can be appropriately extended to filter out instantaneous jitter and obtain a more stable baseline idle state. When the system has high requirements for the real-time deployment of nodes, the time period can be appropriately shortened to shorten the initialization waiting time.

[0031] For example, take the state of an edge cloud node cluster containing nodes A, B, and C at a given timestamp. If the idle computing power coefficient of node A is 0.8, the node network latency is 50 ms, and the available uplink and downlink bandwidth is 100 Mbps, then after linear normalization against the preset maximum latency of 100 ms and maximum bandwidth of 1000 Mbps, the state sequence of node A at that timestamp is [0.8, 0.5, 0.1]. Simultaneously, border gateway protocol parsing shows that the packet forwarding path from node A to node B passes through 2 router devices, from node A to node C through 1 router device, and from node B to node C through 3 router devices. These router device counts are used as the corresponding routing hop counts, thus constructing a routing adjacency matrix with diagonal elements of 0: with rows and columns ordered A, B, C, the rows are [0, 2, 1], [2, 0, 3], and [1, 3, 0].
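Step S1 can be sketched with the numbers from this example. This is a minimal illustration under stated assumptions: the function names are hypothetical, the hop counts are treated as symmetric between node pairs, and normalized values are clipped at the preset maximum.

```python
def idle_coefficient(cpu_idle_samples, mem_idle_samples):
    """Average the mean processor idle ratio and the mean memory idle ratio."""
    cpu_avg = sum(cpu_idle_samples) / len(cpu_idle_samples)
    mem_avg = sum(mem_idle_samples) / len(mem_idle_samples)
    return (cpu_avg + mem_avg) / 2.0


def normalize(value, maximum):
    """Linear normalization to [0, 1], clipped at the preset maximum (assumption)."""
    return min(value / maximum, 1.0)


def state_vector(idle_coeff, latency_ms, bandwidth_mbps,
                 max_latency_ms=100.0, max_bandwidth_mbps=1000.0):
    """Combine idle coefficient, normalized latency, and normalized bandwidth."""
    return [idle_coeff,
            normalize(latency_ms, max_latency_ms),
            normalize(bandwidth_mbps, max_bandwidth_mbps)]


def hop_matrix(names, hops):
    """Build a symmetric matrix of routing hop counts with a zero diagonal."""
    index = {name: i for i, name in enumerate(names)}
    matrix = [[0] * len(names) for _ in names]
    for (a, b), count in hops.items():
        matrix[index[a]][index[b]] = count
        matrix[index[b]][index[a]] = count
    return matrix


print(state_vector(0.8, 50, 100))
# [0.8, 0.5, 0.1]
print(hop_matrix(["A", "B", "C"], {("A", "B"): 2, ("A", "C"): 1, ("B", "C"): 3}))
# [[0, 2, 1], [2, 0, 3], [1, 3, 0]]
```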

[0032] S2. Obtain the computing power supply index of edge cloud nodes.

[0033] It should be noted that due to the heterogeneity of edge computing networks, the actual availability of computing power differs from the idle state of the hardware. When edge cloud nodes are in a congested state, increased packet queuing latency and limited available bandwidth cause business data flows to be blocked outside the network link. Conventional methods directly equate hardware idle rate with available computing power, which can lead to task distribution timeouts. Therefore, this invention uses communication indicators that reflect network congestion status to dynamically adjust the initial idle computing power coefficient downwards, obtaining an effective computing power supply status that reflects the actual delivery capability.

[0034] Specifically, the idle computing power coefficient, node network latency, and uplink and downlink available bandwidth of the state sequence at the current time step are obtained. The uplink and downlink available bandwidths are linearly normalized, and the smaller of the normalized uplink and downlink available bandwidths is extracted as the equivalent available bandwidth. The equivalent available bandwidth is added to a preset minimum value to obtain the corrected available bandwidth. The ratio of the normalized node network latency to the corrected available bandwidth is calculated and used as the network congestion level. Combining the idle computing power coefficient and the network congestion level, the computing power supply index of the edge cloud node is obtained.

[0035] For example, the preset minimum value is 0.001.
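The congestion calculation described in this step can be sketched as follows. The function and parameter names are hypothetical; the inputs are assumed to be already linearly normalized to [0, 1].

```python
def network_congestion(latency_norm, uplink_norm, downlink_norm, min_value=0.001):
    """Ratio of normalized latency to the corrected bottleneck bandwidth."""
    equivalent_bw = min(uplink_norm, downlink_norm)  # the slower direction limits the flow
    corrected_bw = equivalent_bw + min_value         # preset minimum avoids division by zero
    return latency_norm / corrected_bw


# Normalized latency 0.5 and normalized bandwidth 0.1 in both directions.
print(round(network_congestion(0.5, 0.1, 0.1), 4))

# A fast uplink does not help when the downlink is the bottleneck.
print(network_congestion(0.5, 0.9, 0.1) == network_congestion(0.5, 0.1, 0.1))  # True
```

Taking the minimum before dividing means that a high bandwidth in only one direction never lowers the congestion value.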

[0036] Specifically, the computing power supply index satisfies a relation whose terms are defined as follows: the computing power supply index of the i-th edge cloud node at the t-th time step; the idle computing power coefficient of the i-th edge cloud node at the t-th time step; and the network congestion of the i-th edge cloud node at the t-th time step.

[0037] Here, the network congestion represents the severity of data transmission congestion between the edge cloud node and the cloud center. A higher value indicates a greater likelihood of underlying link congestion, driving the congestion-dependent adjustment term toward 0 and thereby suppressing the idle computing power coefficient; a smaller value indicates a higher likelihood of smooth network transmission, driving the adjustment term toward 1 and thereby preserving the initial idle computing power coefficient. Following the pattern of network latency, the network congestion and the idle computing power coefficient jointly constrain the boundary of the node's computing power output, avoiding the risk of scheduling computing tasks to restricted network nodes.
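The behavior described here (an adjustment term near 1 when congestion is low and near 0 when it is high) can be illustrated with a short sketch. The exact relation is not given in this text; the exponential decay `exp(-congestion)` is an assumption that merely satisfies those two limits, and `supply_index` is a hypothetical name.

```python
import math


def supply_index(idle_coeff, congestion):
    """Assumed form: idle coefficient scaled by exp(-congestion)."""
    return idle_coeff * math.exp(-congestion)


print(supply_index(0.8, 0.0))          # no congestion: the idle coefficient is preserved (0.8)
print(supply_index(0.8, 5.0) < 0.01)   # heavy congestion: supply is suppressed (True)
```

Any monotonically decreasing mapping from congestion to the interval (0, 1] would show the same qualitative effect; the exponential is used here only for brevity.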

[0038] S3. Obtain the computing power demand index of edge cloud nodes.

[0039] It should be noted that network traffic exhibits spatial cascading and propagation; traffic bursts in specific areas can spread along the network architecture to adjacent nodes. Conventional graph networks, when evaluating node associations, allocate weights based on the similarity of data feature vectors. This ignores the underlying network routing resistance, so the prediction results can contain false overflow paths between nodes that the network does not actually connect. Therefore, this invention uses the routing hop count to attenuate data correlation, ensuring that the extracted spatial features conform to data transmission patterns.

[0040] Specifically, at the current time step, the state sequence of the i-th edge cloud node is obtained and recorded as the first sequence; simultaneously, the state sequence of its j-th neighboring cloud node is obtained and recorded as the second sequence. The first sequence and the second sequence are concatenated as vectors, and the concatenated vector is input into a single-layer feedforward neural network with an output dimension of 1 for dimensionality-reduction mapping, yielding the correlation between the i-th edge cloud node and the j-th neighboring cloud node. The corresponding routing hop count is extracted from the routing adjacency matrix and compared with a preset maximum effective hop count threshold. If the routing hop count is greater than the maximum effective hop count threshold, the j-th neighboring cloud node is determined to be in the overflow blocking zone and its attention coefficient is set to 0; if the routing hop count is less than or equal to the maximum effective hop count threshold, the attention coefficient is obtained by combining the routing hop count and the correlation.

[0041] For example, the maximum effective hop count threshold is 4. Implementers can determine this threshold according to the actual situation. When the physical distribution of the edge cloud cluster is relatively concentrated and the router hierarchy is shallow, the threshold can be appropriately reduced to reduce the computational overhead of invalid nodes. When the cluster topology is distributed in a long chain, the threshold can be appropriately increased to prevent the overflow traffic of remote nodes from being missed.

[0042] Specifically, the attention coefficient satisfies a relation whose terms are defined as follows: the attention coefficient between the i-th edge cloud node and its j-th neighboring cloud node; the correlation between the i-th edge cloud node and its j-th neighboring cloud node; the routing hop count between the i-th edge cloud node and its j-th neighboring cloud node; and the maximum effective hop count threshold.

[0043] Here, the routing hop count represents the resistance to data flow between two nodes. The routing hop count is normalized to the open interval (0, 1), so that the attenuation factor remains greater than zero even for directly connected nodes and the effective correlation between them is preserved. A larger value indicates that the two edge cloud nodes are farther apart in routing levels within the network topology, so the correlation is suppressed; a smaller value indicates a higher probability that the two edge cloud nodes are directly interconnected, so their initial correlation is preserved. Based on the phenomenon of hop count attenuation in network transmission, the routing hop count is applied as a non-linear adjustment to obtain the attention coefficient, avoiding mispredictions of service traffic propagating between network-isolated nodes. The correlation reflects the similarity of the two nodes' business load waveforms; combining it with the routing hop count during feature aggregation eliminates false data associations between nodes that have no direct link or are extremely far apart, so that the subsequently fused features reflect the spillover of network traffic propagating along communication links.
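The hard cutoff and soft attenuation described above can be sketched as follows. The exact relation is not given in this text, so both the normalization `hops / (max_hops + 1)` and the attenuation `exp(-hop_norm)` are assumptions chosen to keep the normalized hop count inside (0, 1) and the attenuation factor above zero. `attention_coefficient` is a hypothetical name; the default threshold of 4 follows the example value in the description.

```python
import math


def attention_coefficient(correlation, hops, max_hops=4):
    """Zero beyond the hop threshold, otherwise correlation attenuated by hop count."""
    if hops > max_hops:
        return 0.0                           # hard block: node lies in the overflow blocking zone
    hop_norm = hops / (max_hops + 1)         # assumed normalization into (0, 1)
    return correlation * math.exp(-hop_norm) # assumed non-linear soft attenuation


for hops in (1, 3, 5):
    print(hops, attention_coefficient(0.9, hops))
```

With these assumptions a neighbor one hop away keeps most of its correlation, a neighbor three hops away keeps less, and a neighbor five hops away contributes nothing.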

[0044] Furthermore, the attention coefficients between a given edge cloud node and all of its neighboring cloud nodes with non-zero attention coefficients are used as weights to perform weighted fusion on the corresponding second sequences, resulting in a fused feature vector containing spatial structure attributes. A pre-trained Long Short-Term Memory (LSTM) network model is then invoked, and the fused feature vectors corresponding to each time step within a historical preset time window are assembled into an input sequence. This input sequence is fed into the LSTM network model for temporal evolution feature extraction, which outputs the load prediction value for the next time step. This load prediction value is then min-max normalized to obtain the computing power demand index for the next time step.
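The windowing and normalization steps around the LSTM model might look like the following minimal sketch. The LSTM itself is left out. The bounds `lower` and `upper` are assumed to come from historical load statistics, and clipping out-of-range predictions is an added assumption; all names are hypothetical.

```python
def build_input_sequence(fused_by_step: list, window: int) -> list:
    """Return the most recent `window` fused feature vectors, oldest first."""
    if window <= 0 or len(fused_by_step) < window:
        raise ValueError("not enough history for the requested window")
    return fused_by_step[-window:]


def min_max_normalize(value: float, lower: float, upper: float) -> float:
    """Map a load prediction into [0, 1] using assumed historical bounds."""
    if upper <= lower:
        raise ValueError("upper bound must exceed lower bound")
    # ASSUMPTION: predictions outside the historical range are clipped.
    clipped = min(max(value, lower), upper)
    return (clipped - lower) / (upper - lower)


if __name__ == "__main__":
    history = [[0.7, 0.4], [0.6, 0.5], [0.9, 0.3]]
    print(build_input_sequence(history, window=2))
    print(min_max_normalize(50.0, lower=0.0, upper=200.0))
```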

[0045] For example, consider edge cloud node A and its neighboring cloud nodes B and C, both with non-zero attention coefficients. The state sequence of node B is [0.8, 0.5, 0.1], and the state sequence of node C is [0.6, 0.4, 0.2]. The attention coefficient between node A and node B is 0.7, and the attention coefficient between node A and node C is 0.3. During feature aggregation, each element in the state sequence of node B is multiplied by its attention coefficient 0.7 to obtain the first feature component [0.56, 0.35, 0.07], and each element in the state sequence of node C is multiplied by its attention coefficient 0.3 to obtain the second feature component [0.18, 0.12, 0.06]. The values at corresponding positions in the two feature components are added together, giving the fused feature vector of node A as [0.74, 0.47, 0.13].
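The worked example above can be reproduced with a short sketch. The function name and the dictionary layout are hypothetical; the numbers are those of nodes B and C in the example.

```python
def fuse_neighbor_sequences(sequences: dict, attention: dict) -> list:
    """Attention-weighted, element-wise sum of neighbor state sequences."""
    # Only neighbors with a non-zero attention coefficient take part in the fusion.
    active = [name for name, coeff in attention.items() if coeff != 0.0]
    if not active:
        raise ValueError("no neighbor with a non-zero attention coefficient")
    length = len(sequences[active[0]])
    fused = [0.0] * length
    for name in active:
        if len(sequences[name]) != length:
            raise ValueError("all state sequences must have the same length")
        for k, value in enumerate(sequences[name]):
            fused[k] += attention[name] * value
    return fused


if __name__ == "__main__":
    sequences = {"B": [0.8, 0.5, 0.1], "C": [0.6, 0.4, 0.2]}
    attention = {"B": 0.7, "C": 0.3}
    fused = fuse_neighbor_sequences(sequences, attention)
    # Rounded to two decimals this matches the fused vector in the example.
    print([round(x, 2) for x in fused])
```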

[0046] In one embodiment, the single-layer feedforward neural network includes a linear fully connected layer and a non-linear activation layer connected in sequence. The weight matrix of the linear fully connected layer is a single column vector, which reduces the concatenated vector (whose dimension is twice the input dimension) to a single scalar through an inner product. The non-linear activation layer employs a Leaky ReLU function. During model training, the single-layer feedforward neural network receives the gradient information propagated back from the downstream Long Short-Term Memory (LSTM) network based on the final prediction residual, and iteratively updates its internal weight matrix using the backpropagation algorithm.
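A minimal sketch of this forward pass, assuming plain Python lists for the sequences and the weight column vector. The negative slope of 0.01 and the absence of a bias term are assumptions, since the text specifies neither; the function names are hypothetical.

```python
def leaky_relu(x: float, negative_slope: float = 0.01) -> float:
    """Leaky ReLU: identity for non-negative inputs, small slope otherwise."""
    return x if x >= 0.0 else negative_slope * x


def correlation_score(first_seq, second_seq, weights, negative_slope: float = 0.01) -> float:
    """Concatenate two state sequences and reduce them to one scalar."""
    concatenated = list(first_seq) + list(second_seq)
    if len(weights) != len(concatenated):
        raise ValueError("weight vector must match the concatenated length")
    # Inner product with the single column vector of the fully connected layer.
    pre_activation = sum(w * x for w, x in zip(weights, concatenated))
    return leaky_relu(pre_activation, negative_slope)


if __name__ == "__main__":
    # Placeholder weights; in practice they would be learned during training.
    print(correlation_score([0.8, 0.5], [0.6, 0.4], [0.25, 0.25, 0.25, 0.25]))
```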

[0047] In one embodiment, the training process of the Long Short-Term Memory (LSTM) network includes: acquiring state sequence samples from historical time periods and the corresponding real computing load labels, and constructing an initial network model comprising the single-layer feedforward neural network and the LSTM network. During training, the LSTM network acts as the temporal evolution backend. It receives the fused feature vectors output by the frontend to construct an input sequence, performs forward propagation to output predicted computing load samples, and calculates the residual between the predicted computing load samples and the real computing load labels. Based on this residual, the LSTM network updates its temporal model parameters using the backpropagation algorithm and simultaneously propagates the error gradient back to the single-layer feedforward neural network through the chain rule. When the residual is less than a preset convergence threshold, the overall network is considered to have converged, and the network at this point is saved as the pre-trained single-layer feedforward neural network and LSTM network.

[0048] S4. Obtain the scheduling weight.

[0049] It should be noted that during global node resource scheduling, the computational tasks to be executed have dependencies on the basic operating environment or model weights. If the target node does not pre-store the relevant dependency files, fetching large files across regions will cause significant delays in task processing. Therefore, this invention extracts the local residency status of the data required for task execution on the node, and combines it with a supply index reflecting the current computing power and a demand index reflecting the future load to impose multi-dimensional constraints, thereby achieving an allocation that balances data distribution and node computing capacity.

[0050] Specifically, in response to receiving a scheduled task from the cloud, the system extracts the hit rate of the dependent files required by the task on each edge cloud node, using this hit rate as the cache affinity. A scheduling weight is obtained by combining the computing power demand index, computing power supply index, and cache affinity. All edge cloud nodes are then sorted in descending order according to the scheduling weight, and the scheduled task is distributed to the top-ranked edge cloud node.
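The ranking and dispatch step can be sketched as below, assuming the scheduling weights have already been computed for every edge cloud node. The function name and the node identifiers are hypothetical, and tie-breaking is not specified in the text (here, ties keep insertion order).

```python
def pick_target_node(weights: dict) -> tuple:
    """Sort nodes by scheduling weight, descending, and return (top node, ranking)."""
    if not weights:
        raise ValueError("no edge cloud nodes to rank")
    # sorted() is stable, so nodes with equal weights keep their original order.
    ranking = sorted(weights, key=weights.get, reverse=True)
    return ranking[0], ranking


if __name__ == "__main__":
    target, ranking = pick_target_node({"A": 0.2, "B": 0.9, "C": 0.5})
    print(target, ranking)
```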

[0051] Specifically, the scheduling weight is determined by a relationship among the following quantities: the scheduling weight of a given edge cloud node for a given task to be scheduled; the computing power supply index of that edge cloud node at the corresponding time step; the cache affinity of that edge cloud node for that task; and the computing power demand index of that edge cloud node at the corresponding time step.

[0052] Here, the cache affinity represents the degree of matching between the local storage environment of the edge cloud node and the data requirements of the target task. A higher value indicates that more of the essential data already resides locally on the edge cloud node, and the larger the cache affinity, the stronger its upward enhancing effect on the computing power supply index. The computing power demand index represents the likelihood of a sudden surge in business traffic on the edge cloud node in the future. The larger this index, the more likely the node is to face impacts from its native business. To avoid contention for processor time slices and bus bandwidth between newly scheduled computing tasks and the node's upcoming native business, a larger demand index enlarges the denominator of the fraction, which weakens the numerator's contribution and reduces the scheduling weight, so that task decisions automatically avoid nodes with predicted high loads. The smaller the index, the more likely the node is to operate smoothly in the future and the weaker the reduction, so the node receives a high elastic scheduling weight and is placed near the top of the ranking. This avoids blindly scheduling tasks to nodes that have abundant computing power but lack the required data or are about to face load surges.

[0053] For example, Figure 2 is a time-series diagram showing the scheduling weight changes of the edge cloud cluster in this invention. As the diagram shows, during the stable operation phase the scheduling model assesses weights differently for each node based on its real-time computing power supply and demand indices, achieving refined resource allocation. When sudden network congestion degrades a node, the scheduling weight of the first faulty node (node A) quickly drops to zero. The model also captures the spatial cascading effect caused by the outward overflow of service traffic, so the weights of the adjacent nodes (nodes B and C) adaptively fall to their lowest levels. This proactively cuts off task assignment to high-risk nodes and avoids task timeouts. Furthermore, during periods of continuous congestion, the scheduling weight is held at zero, which avoids ineffective fluctuations in task scheduling. Finally, once congestion ends, each node gradually recovers according to the availability of its underlying resources, achieving elastic scheduling of cloud resources in complex edge network environments.

[0054] This embodiment also discloses an emerging cloud resource elastic scheduling system based on machine learning, including a processor and a memory. The memory stores computer program instructions, and when the computer program instructions are executed by the processor, the emerging cloud resource elastic scheduling method based on machine learning according to the present invention is implemented.

Claims

1. A machine learning-based emerging cloud resource elastic scheduling method, characterized in that the main process steps include: obtaining the idle computing power coefficient by averaging the processor idle ratio and the memory idle ratio; combining the node network latency and the uplink and downlink available bandwidths of the edge cloud node with the idle computing power coefficient to obtain the state sequence; taking the smaller of the uplink and downlink available bandwidths as the equivalent available bandwidth; obtaining the network congestion level by calculating the ratio of the node network latency to the equivalent available bandwidth; adjusting the idle computing power coefficient using the network congestion level to obtain the computing power supply index of the edge cloud node; taking the state sequence of any edge cloud node as the first sequence, and the state sequence of any neighboring node of the edge cloud node as the second sequence; concatenating the first sequence and the second sequence and inputting them into a single-layer feedforward neural network to obtain the correlation between the edge cloud node and its neighboring node; adjusting the correlation by combining the number of router devices in the packet forwarding path to obtain the attention coefficient; fusing the second sequences of the neighboring nodes using the attention coefficients to obtain the fused feature vector; inputting the fused feature vector into the long short-term memory network model to output the computing power demand index; determining the cache affinity from the hit ratio of the dependent files required by the task to be scheduled; obtaining the scheduling weight based on the computing power supply index, the cache affinity and the computing power demand index; and distributing the task to be scheduled to the top-ranked edge cloud node according to the ranking result of the scheduling weights.

2. The machine learning-based elastic scheduling method for emerging cloud resources according to claim 1, characterized in that the scheduling weight is determined by a relationship among the following quantities: the scheduling weight of a given edge cloud node for a given task to be scheduled; the computing power supply index of that edge cloud node at the corresponding time step; the cache affinity of that edge cloud node for that task; and the computing power demand index of that edge cloud node at the corresponding time step.

3. The machine learning-based elastic scheduling method for emerging cloud resources according to claim 1, characterized in that the step of taking the smaller of the uplink and downlink available bandwidths as the equivalent available bandwidth includes: linearly normalizing the uplink available bandwidth and the downlink available bandwidth respectively to obtain the normalized uplink available bandwidth and the normalized downlink available bandwidth; and taking the smaller of the normalized uplink available bandwidth and the normalized downlink available bandwidth as the equivalent available bandwidth.

4. The machine learning-based elastic scheduling method for emerging cloud resources according to claim 1, characterized in that the step of obtaining the network congestion level by calculating the ratio of the node network latency to the equivalent available bandwidth includes: adding a preset minimum value to the equivalent available bandwidth to obtain the corrected available bandwidth; and calculating the normalized ratio of the node network latency to the corrected available bandwidth, and using that ratio as the network congestion level.

5. The machine learning-based elastic scheduling method for emerging cloud resources according to claim 1, characterized in that the computing power supply index is determined by a relationship among the following quantities: the computing power supply index of a given edge cloud node at a given time step; the idle computing power coefficient of that edge cloud node at that time step; and the network congestion level of that edge cloud node at that time step.

6. The machine learning-based elastic scheduling method for emerging cloud resources according to claim 1, characterized in that the step of adjusting the correlation based on the number of router devices in the packet forwarding path to obtain the attention coefficient includes: extracting the packet forwarding path between each pair of edge cloud nodes using the Border Gateway Protocol; counting the number of router devices traversed between any two edge cloud nodes and using that number as the routing hop count; and setting the corresponding attention coefficient to zero in response to the routing hop count being greater than the maximum effective hop count threshold.

7. The machine learning-based elastic scheduling method for emerging cloud resources according to claim 6, characterized in that the step of adjusting the correlation based on the number of router devices in the packet forwarding path to obtain the attention coefficient further includes: in response to the routing hop count being less than or equal to the maximum effective hop count threshold, combining the routing hop count with the correlation to obtain the attention coefficient.

8. The machine learning-based elastic scheduling method for emerging cloud resources according to claim 7, characterized in that the attention coefficient is determined by a relationship among the following quantities: the attention coefficient between a given edge cloud node and one of its neighboring cloud nodes; the correlation between that edge cloud node and that neighboring cloud node; the number of routing hops between the two nodes; and the maximum effective hop count threshold.

9. The machine learning-based elastic scheduling method for emerging cloud resources according to claim 1, characterized in that the step of fusing the second sequences of the neighboring nodes using the attention coefficients to obtain the fused feature vector, and inputting the fused feature vector into the long short-term memory network model to output the computing power demand index, includes: performing weighted fusion on the corresponding second sequences, using the non-zero attention coefficients as weights, to obtain a fused feature vector containing spatial structure attributes; constructing an input sequence from the fused feature vectors corresponding to each time step within a historical preset time window; inputting the input sequence into a pre-trained long short-term memory network model for temporal evolution feature extraction, and outputting the load prediction value for the next time step; and performing min-max normalization on the load prediction value to obtain the computing power demand index for the next time step.

10. An emerging cloud resource elastic scheduling system based on machine learning, characterized in that it comprises a processor and a memory, the memory storing computer program instructions that, when executed by the processor, implement the machine learning-based elastic scheduling method for emerging cloud resources according to any one of claims 1-9.