Method and system for providing CDNAI services with dynamic computing power allocation
A hierarchical computing power allocation decision model, driven by time-series analysis and feedback, addresses the problem of lagging computing power allocation in CDN and AI services. It enables efficient, low-cost computing power resource management and improves the service quality and resource utilization of the CDN network.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- 南京臻鹏网络科技有限公司
- Filing Date
- 2026-04-29
- Publication Date
- 2026-06-19
AI Technical Summary
Existing computing power allocation solutions for CDN and AI services lack in-depth analysis of traffic trends, resulting in computing power allocation decisions lagging behind actual demand changes. They are unable to adapt to business scenarios with non-stationary characteristics, leading to resource waste or service interruptions, and are unable to distinguish between deterministic peaks and random fluctuations.
The method extracts features from historical traffic data through time-series analysis rules, generates traffic prediction results, constructs a hierarchical computing power structure and an allocation decision model, and dynamically allocates computing power resources to CDN nodes. Feedback information drives the model to evolve adaptively so that computing power supply accurately matches demand.
It improves the accuracy of traffic prediction, reduces the risk of resource idleness and response delay, enhances the service continuity and computing power utilization of CDN nodes, reduces operation and maintenance complexity, and improves the return on investment.
Figure CN122247993A_ABST
Abstract
Description
Technical Field
[0001] This invention relates to the fields of CDN and AI computing power scheduling technology, and in particular to a method and system for providing CDNAI services with dynamic computing power allocation.
Background Technology
[0002] In the field of CDN (Content Delivery Network) and AI service integration, traditional computing power allocation schemes generally adopt static quotas or reactive scheduling based on real-time load. In practice, service providers pre-configure a fixed proportion of AI computing resources for each CDN node, for example allocating GPUs, TPUs, and other computing units based on node bandwidth capacity or historical peak access times. At runtime, the system monitors metrics such as node CPU/GPU utilization and request response latency, sets overload thresholds, and triggers expansion once these thresholds are reached, migrating load to nearby nodes or supplementing resources from the cloud to cope with instantaneous traffic surges. This model relies on manually configured, experience-based rules and lacks in-depth analysis of traffic trends and extraction of long-term patterns.
[0003] Existing conventional practices have two significant limitations. First, computing power allocation decisions lag far behind actual changes in demand. Because reactive scheduling is triggered by real-time state, a node can already be short of computing power while traffic is surging but before the thresholds are reached, so AI inference requests queue up and time out and users experience noticeable service degradation. In particular, when traffic has periodic peaks (such as hot events or e-commerce promotions), nodes cannot reserve computing power in advance, and frequent expansion operations instead trigger resource contention and network fluctuations. Second, static quota mechanisms cannot adapt to the non-stationary nature of computing power demand. AI services in different business scenarios (such as image recognition, speech transcription, and real-time recommendation) differ in their requirements for computing power density and latency tolerance. Traditional practices simply aggregate and allocate resources by node, so high-value services and low-priority tasks compete for the same computing power resources, and overall utilization is often below 40%. At the same time, prediction errors are ignored entirely: the system does not distinguish between "deterministic peaks" and "random fluctuations." When actual traffic deviates from the preset threshold, either over-provisioning leads to waste or under-provisioning causes service interruptions. This coarse-grained management approach makes it difficult to balance resource costs and service quality. Given the increasing scarcity of edge computing power, a more refined dynamic allocation mechanism is urgently needed.
Summary of the Invention
[0004] This invention provides a method and system for providing CDNAI services with dynamic computing power allocation, which can solve the problems in the prior art.
[0005] A first aspect of the present invention provides a method for providing CDNAI services with dynamic allocation of computing power, comprising: Historical traffic data and current network status data within the target area are acquired. Based on time series analysis rules, features are extracted from the historical traffic data to generate a traffic prediction result that includes predicted traffic values and prediction time windows. Based on the traffic prediction results, the AI computing power resources of CDN nodes are hierarchically identified to obtain a computing power hierarchical structure. Based on the traffic prediction results and the computing power hierarchical structure, an allocation decision model that reflects the relationship between prediction uncertainty and computing power supply and demand is constructed. The allocation decision model maps the prediction error boundary to computing power reserve space and generates computing power allocation instructions that include computing power allocation schemes and inter-layer migration strategies. Based on the computing power allocation instructions, computing power resources are allocated to each CDN node, and actual traffic data and computing power usage data are extracted from the allocated operating status to form feedback information. The feedback information is used to establish a quantitative mapping relationship between prediction deviation and actual resource consumption. The error boundary and reserve space mapping logic in the allocation decision model are reconstructed through the quantitative mapping relationship, and the prediction logic of the time series analysis rule is driven to undergo adaptive evolution.
[0006] Extracting features from the historical traffic data based on time-series analysis rules to generate traffic prediction results that include predicted traffic values and prediction time windows includes: Historical traffic data is decomposed into multiple scales along the time dimension to identify periodic fluctuation patterns and non-periodic abrupt-change patterns. Based on the repetition interval of the periodic fluctuation patterns and the triggering conditions of the non-periodic abrupt-change patterns, a temporal evolution law representation is constructed. Based on the stability index of the periodic fluctuation patterns and the predictability index of the non-periodic abrupt-change patterns, the effective range over which each pattern influences future traffic is determined, forming the pattern scope boundary. Based on the pattern scope boundary, the temporal evolution law representation is projected forward in time to generate a probability distribution of traffic changes within a future time period. The traffic value with the highest probability density is extracted from the probability distribution as the predicted traffic value, and the pattern scope boundary is mapped to the effective time range of the predicted traffic value to obtain the prediction time window. This completes feature extraction and generates a traffic prediction result containing the predicted traffic value and the prediction time window.
[0007] Based on the traffic prediction results, the AI computing power resources of CDN nodes are hierarchically identified to obtain a computing power hierarchy structure. Based on the traffic prediction results and the computing power hierarchy structure, an allocation decision model reflecting the relationship between prediction uncertainty and computing power supply and demand is constructed, including: Extract the distribution of prediction confidence and the gradient of change of prediction traffic value from the traffic prediction results. Based on the dispersion of the prediction confidence distribution, classify the AI computing resources of each CDN node into risk levels. Label the computing resources of different risk levels with hierarchical attributes according to the gradient of change of prediction traffic value, forming a computing power layered structure that includes a deterministic computing power layer and an uncertainty buffer layer. The capacity boundary of the deterministic computing power layer in the computing power hierarchical structure is bound to the stable confidence interval of the predicted confidence distribution, and the capacity elasticity space of the uncertainty buffer layer is associated with the fluctuating confidence interval of the predicted confidence distribution, thereby establishing a confidence-driven computing power supply and demand mapping relationship. Based on the aforementioned computing power supply and demand mapping relationship, an allocation decision model reflecting the relationship between prediction uncertainty and computing power supply and demand matching is constructed.
[0008] The process of binding the capacity boundary of the deterministic computing power layer in the hierarchical computing power structure to the stable confidence interval of the predicted confidence distribution, and associating the capacity elasticity space of the uncertainty buffer layer with the fluctuating confidence interval of the predicted confidence distribution, establishes a confidence-driven computing power supply and demand mapping relationship, including: Extract the confidence variance sequence from the predicted confidence distribution, identify the variance convergence segment and the variance divergence segment based on the changing trend of the confidence variance sequence, mark the confidence interval corresponding to the variance convergence segment as the stable confidence interval, and mark the confidence interval corresponding to the variance divergence segment as the fluctuating confidence interval. The confidence boundary value of the stable confidence interval is mapped to the deterministic boundary of the traffic demand. Based on the deterministic boundary, the baseline load capacity required by the deterministic computing layer is calculated, and the baseline load capacity is set as the capacity boundary of the deterministic computing layer, thereby realizing the binding between the capacity boundary and the stable confidence interval. The confidence deviation of the fluctuation confidence interval is mapped to the uncertainty range of traffic demand, and the overload capacity change space that the uncertainty buffer layer needs to absorb is calculated based on the uncertainty range. 
The overload capacity variation space is divided into an elastic expansion interval and an elastic contraction interval, and the elastic expansion interval and the elastic contraction interval together constitute the capacity elastic space of the uncertainty buffer layer, thereby establishing a confidence-driven computing power supply and demand mapping relationship.
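The confidence-to-capacity mapping described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the rule "variance rising from one slice to the next means divergence", the multiplier `k`, and the names `SlicePrediction`, `label_slices`, and `capacity_plan` are all hypothetical, and capacities are expressed in traffic units rather than in GPU or TPU units.

```python
import math
from dataclasses import dataclass

@dataclass
class SlicePrediction:
    mean: float      # predicted traffic for the slice (e.g. Gbit/s)
    variance: float  # variance of the prediction for the slice

def label_slices(preds):
    """Assumed rule: a slice is 'fluctuating' when its variance rises
    relative to the previous slice, otherwise 'stable'."""
    labels = []
    for i, p in enumerate(preds):
        rising = i > 0 and p.variance > preds[i - 1].variance
        labels.append("fluctuating" if rising else "stable")
    return labels

def capacity_plan(preds, k=2.0):
    """Map stable slices to a baseline capacity and fluctuating slices
    to an elastic expansion/contraction range (k is a hypothetical
    confidence multiplier)."""
    labels = label_slices(preds)
    stable = [p for p, lab in zip(preds, labels) if lab == "stable"]
    fluct = [p for p, lab in zip(preds, labels) if lab == "fluctuating"]
    # Deterministic layer: highest upper confidence bound among stable slices.
    baseline = max((p.mean + k * math.sqrt(p.variance) for p in stable), default=0.0)
    # Buffer layer: range spanned by the fluctuating slices around the baseline.
    upper = max((p.mean + k * math.sqrt(p.variance) for p in fluct), default=baseline)
    lower = min((p.mean - k * math.sqrt(p.variance) for p in fluct), default=baseline)
    return {
        "baseline": baseline,
        "expansion": max(0.0, upper - baseline),
        "contraction": max(0.0, baseline - lower),
    }
```

The expansion and contraction values together play the role of the capacity elasticity space of the uncertainty buffer layer; converting them into concrete computing units would need a per-service cost model that the text does not specify.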
[0009] Based on the aforementioned computing power allocation instructions, computing power resources are allocated to each CDN node, and feedback information is formed by extracting actual traffic data and computing power usage data from the allocated operating status, including: The computing power allocation scheme and inter-layer migration strategy in the computing power allocation instruction are analyzed. The target computing power quota of the deterministic computing power layer and the uncertain buffer layer in each CDN node are determined according to the computing power allocation scheme. The timing scheduling rules for the transfer of computing power resources between layers are determined according to the inter-layer migration strategy. According to the time-series scheduling rules, the computing resources of each CDN node are migrated between layers, and the migrated computing resources are redistributed according to the target computing power quota. After the computing resources are allocated, the operating status of each CDN node is continuously monitored, and actual traffic data reflecting the actual arrival of traffic and computing power usage data reflecting the actual consumption of computing power are extracted from the operating status. The actual traffic data and the computing power usage data are timestamped to establish a correlation between the arrival time of traffic and the consumption time of computing power. Based on the correlation, the actual traffic data and the computing power usage data are encapsulated into feedback information containing time-series correlation features.
[0010] Establishing a quantitative mapping relationship between prediction deviation and actual resource consumption using the feedback information, and reconstructing the error boundary and reserve space mapping logic in the allocation decision model through the quantitative mapping relationship includes: Extract actual traffic data and predicted traffic values from traffic prediction results from the feedback information, calculate the distribution characteristics of prediction deviation, and extract computing power usage data and computing power quotas of each layer in the computing power hierarchical structure from the feedback information to calculate the structural characteristics of actual resource consumption. Based on the distribution characteristics of the predicted deviation, the deviation concentration interval and the deviation discrete interval are identified. Based on the structural characteristics of the actual resource consumption, the resource saturation level and the resource redundancy level are identified. A cross-correlation matrix between the deviation interval and the resource level is constructed. By establishing a quantitative mapping relationship between prediction deviation and actual resource consumption through the correlation strength in the cross-correlation matrix, the current error boundary and reserve space mapping logic are extracted from the allocation decision model. The driving degree in the quantitative mapping relationship is used as the basis for reconstruction, and the boundary range of the error boundary and the spatial scale of the reserve space are adjusted in a coordinated manner to complete the reconstruction of the error boundary and reserve space mapping logic in the allocation decision model.
[0011] Based on the distribution characteristics of the predicted deviation, the concentrated and discrete intervals of the deviation are identified. Based on the structural characteristics of the actual resource consumption, the resource saturation level and resource redundancy level are identified. The cross-correlation matrix between the deviation interval and the resource level is constructed as follows: Entropy values are calculated based on the distribution characteristics of the prediction deviation. The degree of deviation clustering is determined based on the magnitude of the entropy value. Time intervals with entropy values less than a preset entropy threshold are marked as deviation concentration intervals, and time intervals with entropy values greater than the preset entropy threshold are marked as deviation dispersion intervals. The structural characteristics of actual resource consumption are traversed hierarchically, and the resource utilization saturation of each level is calculated. Based on the comparison between the resource utilization saturation and the preset saturation benchmark, the levels with saturation exceeding the preset saturation benchmark are marked as resource saturated levels, and the levels with saturation not reaching the preset saturation benchmark are marked as resource redundant levels. Use the deviation concentration interval and the deviation discrete interval as row indices, and the resource saturation level and the resource redundancy level as column indices to initialize the cross-association matrix framework; For each row and column intersection position in the cross-association matrix framework, the consumption response intensity of the corresponding resource level within the corresponding deviation interval is calculated. 
The consumption response intensity is then used as the association weight to fill the corresponding position in the cross-association matrix framework to construct the cross-association matrix between the deviation interval and the resource level.
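A minimal sketch of the entropy labelling and cross-association matrix construction is given below. The text does not define "consumption response intensity", so this sketch assumes it is the mean utilization (usage divided by quota) of the layers in a column class during the windows in a row class. The bin width, both thresholds, the treatment of an entropy exactly equal to the threshold as "dispersed", and all function names are likewise assumptions.

```python
import math
from collections import Counter

def shannon_entropy(deviations, bin_width=5.0):
    """Shannon entropy (bits) of prediction deviations after binning."""
    bins = Counter(math.floor(d / bin_width) for d in deviations)
    n = len(deviations)
    return -sum(c / n * math.log2(c / n) for c in bins.values())

def cross_association(windows, quotas, entropy_threshold=1.0, saturation_benchmark=0.8):
    """windows: list of {"deviations": [...], "usage": {layer: used}}.
    quotas: {layer: quota}. Returns matrix[row][col] of mean utilization."""
    # Row labels: low entropy means deviations cluster together.
    rows = ["concentrated" if shannon_entropy(w["deviations"]) < entropy_threshold
            else "dispersed" for w in windows]
    # Column labels: mean saturation of each layer across all windows.
    cols = {}
    for layer, quota in quotas.items():
        saturation = sum(w["usage"][layer] for w in windows) / (len(windows) * quota)
        cols[layer] = "saturated" if saturation > saturation_benchmark else "redundant"
    totals = {r: {c: 0.0 for c in ("saturated", "redundant")}
              for r in ("concentrated", "dispersed")}
    counts = {r: {c: 0 for c in ("saturated", "redundant")}
              for r in ("concentrated", "dispersed")}
    for w, r in zip(windows, rows):
        for layer, quota in quotas.items():
            totals[r][cols[layer]] += w["usage"][layer] / quota
            counts[r][cols[layer]] += 1
    # Average each cell; cells with no observations stay at 0.0.
    return {r: {c: (totals[r][c] / counts[r][c] if counts[r][c] else 0.0)
                for c in totals[r]} for r in totals}
```

A cell with a high value in the "dispersed" row and "saturated" column would suggest that scattered prediction errors coincide with exhausted capacity, which is the kind of signal the reconstruction step could use to widen the reserve space.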
[0012] A second aspect of this invention provides a CDNAI service provision system with dynamic computing power allocation, comprising: The prediction unit is used to acquire historical traffic data and current network status data within the target area, extract features from the historical traffic data based on time series analysis rules, and generate a traffic prediction result containing the predicted traffic value and the prediction time window. The allocation decision unit is used to classify and identify the AI computing power resources of CDN nodes according to the traffic prediction results to obtain a computing power layer structure. Based on the traffic prediction results and the computing power layer structure, an allocation decision model that reflects the prediction uncertainty and the matching relationship between computing power supply and demand is constructed. The allocation decision model generates computing power allocation instructions that include computing power allocation schemes and inter-layer migration strategies by mapping the prediction error boundary to computing power reserve space. The feedback evolution unit is used to perform computing power resource allocation on each CDN node based on the computing power allocation instruction, and extract actual traffic data and computing power usage data from the allocated operating status to form feedback information. The feedback information is used to establish a quantitative mapping relationship between prediction deviation and actual resource consumption. The quantitative mapping relationship is used to reconstruct the error boundary and reserve space mapping logic in the allocation decision model, and drive the prediction logic of the time series analysis rule to undergo adaptive evolution.
[0013] A third aspect of the present invention provides an electronic device, comprising: a processor; and a memory for storing processor-executable instructions; wherein the processor is configured to invoke the instructions stored in the memory to execute the aforementioned method.
[0014] A fourth aspect of the present invention provides a computer-readable storage medium having stored thereon computer program instructions that, when executed by a processor, implement the aforementioned method.
[0015] Time-series analysis and prediction based on historical traffic data and current network status data significantly improves the accuracy of traffic estimation, providing a highly reliable basis for the pre-allocation of AI computing resources. The combination of prediction results and a hierarchical computing power structure enables a shift from static allocation to dynamic mapping of computing resources, effectively reducing the risks of resource idleness and response delays caused by traffic bursts or decay. The allocation decision model precisely matches the supply and demand relationship of computing power by mapping the prediction error boundary to computing power reserve space, avoiding resource waste or overload under traditional fixed quotas, and improving the service continuity and stability of CDN nodes under traffic fluctuations.
[0016] The feedback-driven adaptive evolution mechanism enables the mapping logic between the error boundary and reserve space of the allocation decision model to be continuously optimized based on actual operational data. The quantitative mapping relationship between prediction deviation and actual resource consumption provides a closed-loop correction path for the logical iteration of time-series analysis rules, allowing the system to adapt to changes in network traffic patterns without manual intervention. This self-learning capability not only reduces operational complexity but also gradually converges resource allocation deviations over long-term operation, increasing computing power utilization to near the theoretical optimal value and significantly reducing computing power procurement and electricity costs.
[0017] The hierarchical identification and inter-layer migration strategy of computing resources gives CDN nodes dynamic scheduling capability across heterogeneous AI tasks. Computing power at different levels can be switched flexibly based on task urgency and prediction confidence, providing tiered protection for high-value, latency-sensitive tasks and for low-priority batch tasks. This mechanism maximizes the utilization of idle computing power at edge nodes while ensuring the quality of critical services, improving the ROI of the entire CDN network. Overall, the method transforms computing power allocation from passive response to proactive prediction and adaptive optimization, providing an efficient and low-cost operational paradigm for large-scale AI service deployment.
Attached Figure Description
[0018] Figure 1 is a flowchart of the method for providing CDNAI services with dynamic computing power allocation according to an embodiment of the present invention. Figure 2 is a flowchart of the decision-making process for hierarchical allocation of AI computing power at CDN nodes in an embodiment of the present invention.
Detailed Implementation
[0019] To make the objectives, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of the present invention, and not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.
[0020] The technical solution of the present invention will be described in detail below with reference to specific embodiments. These specific embodiments can be combined with each other, and the same or similar concepts or processes may not be described again in some embodiments.
[0021] Figure 1 is a flowchart of the method for providing CDNAI services with dynamic computing power allocation according to an embodiment of the present invention.
[0022] The method for providing CDNAI services with dynamic computing power allocation includes: Historical traffic data and current network status data within the target area are acquired. Based on time-series analysis rules, features are extracted from the historical traffic data to generate a traffic prediction result that includes predicted traffic values and prediction time windows. Based on the traffic prediction results, the AI computing power resources of CDN nodes are hierarchically identified to obtain a computing power hierarchical structure. Based on the traffic prediction results and the computing power hierarchical structure, an allocation decision model that reflects the relationship between prediction uncertainty and computing power supply and demand is constructed. The allocation decision model maps the prediction error boundary to computing power reserve space and generates computing power allocation instructions that include computing power allocation schemes and inter-layer migration strategies. Based on the computing power allocation instructions, computing power resources are allocated to each CDN node, and actual traffic data and computing power usage data are extracted from the allocated operating status to form feedback information. The feedback information is used to establish a quantitative mapping relationship between prediction deviation and actual resource consumption. The error boundary and reserve space mapping logic in the allocation decision model are reconstructed through the quantitative mapping relationship, and the prediction logic of the time-series analysis rules is driven to undergo adaptive evolution.
[0023] In one optional implementation, extracting features from the historical traffic data based on time-series analysis rules to generate a traffic prediction result containing predicted traffic values and prediction time windows includes: Historical traffic data is decomposed into multiple scales along the time dimension to identify periodic fluctuation patterns and non-periodic abrupt-change patterns. Based on the repetition interval of the periodic fluctuation patterns and the triggering conditions of the non-periodic abrupt-change patterns, a temporal evolution law representation is constructed. Based on the stability index of the periodic fluctuation patterns and the predictability index of the non-periodic abrupt-change patterns, the effective range over which each pattern influences future traffic is determined, forming the pattern scope boundary. Based on the pattern scope boundary, the temporal evolution law representation is projected forward in time to generate a probability distribution of traffic changes within a future time period. The traffic value with the highest probability density is extracted from the probability distribution as the predicted traffic value, and the pattern scope boundary is mapped to the effective time range of the predicted traffic value to obtain the prediction time window. This completes feature extraction and generates a traffic prediction result containing the predicted traffic value and the prediction time window.
[0024] In real-world applications, the traffic carried by CDN nodes, such as video streaming, web browsing, and file downloads, often exhibits complex time-series characteristics. Accurate traffic prediction therefore requires in-depth feature extraction from historical traffic data. First, the historical traffic data for the past three months within the target region is decomposed at different time granularities. Take the historical traffic curve of a specific CDN node as an example: the curve records the bandwidth consumption every 5 minutes, and these data points constitute a continuous time series. A wavelet transform is applied to this time series with five decomposition scales, corresponding to time granularities of five minutes, one hour, one day, one week, and one month, respectively.
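The idea of viewing one 5-minute series at several time granularities can be illustrated with a small sketch. Note the substitution: this uses plain block averaging rather than a wavelet transform, so it only approximates the coarse (approximation) part of such a decomposition. The scale table, the 30-day month, and the function names are assumptions made for illustration.

```python
# Number of 5-minute samples per granularity (month assumed to be 30 days).
SAMPLES_PER_SCALE = {"5min": 1, "hour": 12, "day": 288, "week": 2016, "month": 8640}

def block_means(series, block):
    """Average consecutive, non-overlapping blocks; a trailing partial block is dropped."""
    return [sum(series[i:i + block]) / block
            for i in range(0, len(series) - block + 1, block)]

def decompose(series):
    """Return one coarse view of the series per scale that the data can support."""
    return {name: block_means(series, size)
            for name, size in SAMPLES_PER_SCALE.items()
            if len(series) >= size}
```

Subtracting a coarse view from a finer one leaves the detail at the finer scale, which is where short-lived spikes would show up.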
[0025] At the hourly scale, traffic is consistently higher than the baseline between 10 AM and 10 PM each day. This daily recurrence of traffic variation with a relatively fixed time span indicates a periodic fluctuation pattern. Fourier spectral analysis gives a dominant period of 24 hours for this fluctuation, with amplitudes ranging from 1.5 to 2.2 times the baseline traffic. At the weekly scale, the traffic peak from Friday evening to Sunday is significantly higher than on weekdays. This seven-day cycle is likewise a periodic fluctuation pattern. The repetition intervals of all identified periodic components are then compiled, and their period length, amplitude range, and phase information are recorded.
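Finding the dominant period by spectral analysis can be sketched with a direct discrete Fourier transform over an hourly series. This is an illustrative sketch only: the function name and the hourly resampling are assumptions, and a production system would more likely use an FFT library.

```python
import math

def dominant_period(samples, sample_hours=1.0):
    """Return the period (in hours) of the strongest non-constant
    frequency component, using a direct O(n^2) DFT."""
    n = len(samples)
    mean = sum(samples) / n
    best_k, best_mag = 0, 0.0
    for k in range(1, n // 2 + 1):
        re = sum((x - mean) * math.cos(2 * math.pi * k * t / n)
                 for t, x in enumerate(samples))
        im = sum((x - mean) * math.sin(2 * math.pi * k * t / n)
                 for t, x in enumerate(samples))
        mag = math.hypot(re, im)
        if mag > best_mag:
            best_k, best_mag = k, mag
    if best_k == 0:
        raise ValueError("series has no periodic component")
    return n / best_k * sample_hours

# One week of synthetic hourly traffic with a daily cycle.
hourly = [100 + 50 * math.sin(2 * math.pi * t / 24) for t in range(168)]
print(dominant_period(hourly))
```

For the synthetic week above, the strongest component completes 7 cycles in 168 hours, which corresponds to a 24-hour period.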
[0026] Meanwhile, multiple sudden traffic spikes are detected in the residual components at the daily and hourly scales. These spikes typically last between 15 and 45 minutes, with peak traffic reaching 3 to 5 times the normal level. Correlation with external event logs shows that these abrupt changes are often related to specific triggering conditions such as the release of trending videos, popular online searches, and breaking news reports. For example, during a live sports event, traffic jumped from the normal 80 Gbit/s to 320 Gbit/s within five minutes; such a rapid, drastic change over a short period is an example of a non-periodic abrupt-change pattern. For these abrupt-change patterns, feature vectors of their triggering conditions are extracted, including dimensions such as event type, time of occurrence, duration, and scope of impact, to construct a knowledge base of abrupt-change events.
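Spike detection of this kind can be sketched as a scan for runs of samples well above the normal level. The 3x threshold mirrors the lower end of the range quoted above, but using it as a detection rule, along with the function name and the output fields, is an assumption of this sketch.

```python
def find_bursts(samples, baseline, ratio=3.0, step_min=5):
    """Find runs of consecutive samples at or above ratio * baseline.
    samples: list of traffic readings taken every step_min minutes."""
    bursts, start = [], None
    # A sentinel at baseline level closes a burst that runs to the end.
    for i, x in enumerate(list(samples) + [baseline]):
        high = x >= ratio * baseline
        if high and start is None:
            start = i
        elif not high and start is not None:
            run = samples[start:i]
            bursts.append({
                "start_index": start,
                "duration_min": len(run) * step_min,
                "peak_ratio": max(run) / baseline,
            })
            start = None
    return bursts

# Traffic jumps from 80 to 320 Gbit/s for four 5-minute samples.
print(find_bursts([80, 80, 320, 320, 320, 320, 80, 80], baseline=80))
```

Each detected burst could then be joined against an event log to fill in the event type and scope for the knowledge base.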
[0027] Based on the above identification results, a temporal evolution law representation is constructed. For periodic fluctuation patterns, a family of superimposed sine functions is used for mathematical modeling, with each periodic component represented as a sine wave with a specific frequency, amplitude, and initial phase. Superimposing all periodic components forms a smooth periodic baseline curve, which reflects how traffic varies over time under normal conditions. For non-periodic abrupt-change patterns, a conditional trigger function is used: when a specific trigger condition is detected, a time-limited pulse function is superimposed on the baseline curve. The shape of this pulse function is determined by the statistical characteristics of historical abrupt-change events. Combining the periodic baseline curve with the conditional trigger function yields a complete temporal evolution law representation model.
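The combination of a sine-superposition baseline and triggered pulses can be sketched as below. The rectangular pulse shape is an assumption chosen for brevity (the text says the shape comes from historical statistics), and the class and function names are hypothetical.

```python
import math
from dataclasses import dataclass

@dataclass
class Periodic:
    period_h: float   # period length in hours
    amplitude: float  # amplitude in traffic units
    phase: float      # initial phase in radians

@dataclass
class Pulse:
    start_h: float     # time at which the trigger fires
    duration_h: float  # how long the pulse lasts
    height: float      # extra traffic while the pulse is active

def traffic_at(t_h, base, components, pulses):
    """Baseline level plus superimposed sines plus any active pulses."""
    value = base + sum(
        c.amplitude * math.sin(2 * math.pi * t_h / c.period_h + c.phase)
        for c in components
    )
    value += sum(p.height for p in pulses
                 if p.start_h <= t_h < p.start_h + p.duration_h)
    return value

daily = Periodic(period_h=24, amplitude=40, phase=0.0)
event = Pulse(start_h=6, duration_h=0.5, height=240)
print(traffic_at(6, base=80, components=[daily], pulses=[event]))
```

Keeping periodic components and pulses as separate lists makes it straightforward to switch a pulse on only when its trigger condition has actually been observed.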
[0028] To determine the extent of the impact of different patterns on future traffic flow, their stability and predictability need to be assessed. For periodic fluctuation patterns, the repeatability of the pattern over the past 30 periods is calculated. Specifically, the traffic envelope for each period is extracted, and the correlation coefficient of the envelope shape between adjacent periods is calculated. When the correlation coefficient is consistently higher than 0.9, the periodic pattern is considered to have high stability, and its impact on the future can extend to the next 10 periods. For example, if the stability index of a daily periodic pattern is 0.93, its effective range can extend to the next 10 days. For a weekly periodic pattern, if the stability index is 0.87, the effective range is the next 8 weeks. Conversely, when the stability index is lower than 0.7, the periodic pattern is considered unstable, and its range is shortened to within the next 3 periods.
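The stability assessment can be sketched as a Pearson correlation between the envelopes of adjacent periods, followed by a mapping from the index to a horizon. Two assumptions are made here and should be read as such: the index is taken as the minimum adjacent-period correlation (one reading of "consistently higher"), and the band between 0.7 and 0.9, for which the text gives only the single example 0.87 to 8 periods, is filled by floored linear interpolation. That interpolation happens to reproduce the example but is not stated in the source.

```python
import math

def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    if va == 0 or vb == 0:
        return 0.0
    return cov / math.sqrt(va * vb)

def stability_index(period_envelopes):
    """Lowest correlation between the envelopes of adjacent periods."""
    return min(pearson(p, q)
               for p, q in zip(period_envelopes, period_envelopes[1:]))

def horizon_periods(index):
    """Number of future periods a pattern is trusted for."""
    if index > 0.9:
        return 10
    if index < 0.7:
        return 3
    # Assumed: floored linear interpolation between (0.7, 3) and (0.9, 10).
    return math.floor(3 + 7 * (index - 0.7) / (0.9 - 0.7))

print(horizon_periods(0.93), horizon_periods(0.87), horizon_periods(0.65))
```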
[0029] For non-periodic abrupt-change patterns, the predictability index is determined by how detectable the triggering conditions are and how much lead time they give. When the triggering conditions of a certain type of abrupt-change event can be perceived 12 hours in advance, for example through social media activity or news release announcements, the predictability index of that pattern is high and the event can be accounted for in the prediction ahead of time. For example, if the start time of a scheduled live sports event is announced 48 hours in advance, the predictability index is close to 1, and the corresponding pulse can be superimposed directly on the traffic prediction for that period. However, for completely random events, such as unannounced system failures or cyberattacks, the predictability index is close to zero and the event cannot be reflected in the prediction in advance; it can only be handled by reserving headroom. Based on historical data, the predictability index is divided into three levels: high predictability corresponds to an index greater than 0.8, with an impact range covering the next 72 hours; medium predictability corresponds to an index between 0.4 and 0.8, with an impact range of the next 24 hours; and low predictability corresponds to an index less than 0.4, which is not included in the deterministic prediction range.
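The three predictability levels translate directly into a small lookup. The thresholds and hour ranges come from the paragraph above; treating exactly 0.4 and exactly 0.8 as "medium" and the function name are assumptions.

```python
def predictability_tier(index):
    """Return (level, impact_range_hours); the range is None when the
    pattern is excluded from the deterministic prediction."""
    if index > 0.8:
        return ("high", 72)
    if index >= 0.4:
        return ("medium", 24)
    return ("low", None)

print(predictability_tier(0.95))  # scheduled event announced well in advance
print(predictability_tier(0.05))  # unannounced failure or attack
```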
[0030] Based on the aforementioned stability and predictability indicators, the boundaries of the model's scope are determined. For highly stable diurnal patterns, the scope boundary is set to extend 240 hours forward from the current time; for moderately stable weekly patterns, the scope boundary is extended 1344 hours forward from the current time. For highly predictable abrupt events, the scope boundary is extended 12 hours before and after the event's predicted occurrence time. The scope boundaries of all models are then marked on the timeline, forming a set of segmented time intervals.
[0031] Based on the boundary of the model's scope, the temporal evolution pattern is represented by temporal projection. The next 72 hours are selected as the target prediction period, which is divided into 144 time slices with a granularity of 30 minutes. For each time slice, all periodic and abrupt change patterns covering that time slice are identified based on its temporal location. The contribution values of these patterns are then summed to obtain the baseline value for the flow prediction of that time slice. Since periodic patterns themselves exhibit amplitude fluctuations, and abrupt change events are probabilistic, an uncertainty interval is added to the flow prediction value for each time slice. The upper and lower bounds of this interval are determined by the standard deviation of historical prediction errors. For time periods with higher stability, the uncertainty interval is narrower; for time periods where abrupt change events occur, the uncertainty interval is significantly wider.
[0032] Using Monte Carlo simulation, 1000 random samples were taken for each time slice. During each sample, an amplitude correction factor was randomly selected based on the historical fluctuation distribution of the periodic pattern, and whether to trigger a sudden event pulse was randomly determined based on the probability of the sudden event. The results of the 1000 samples were statistically summarized to plot a frequency distribution histogram of the flow rate for each time slice. This histogram represents the probability distribution of flow rate changes for that time slice. For a specific time slice, if the probability distribution shows a unimodal shape with the peak value at 120 Gbit / s, it indicates that the flow rate at that moment is near that value. If the probability distribution shows a bimodal shape, it indicates that there are two flow rate states at that moment, which need to be considered simultaneously in subsequent decisions.
[0033] From the probability distribution of each time slice, the flow rate value with the highest probability density is extracted as the predicted flow rate value for that time slice. Specifically, the flow rate interval with the highest frequency in the histogram is statistically analyzed, and the midpoint of that interval is taken as the final predicted value. For example, if the probability distribution of a certain time slice shows that the probability of the flow rate falling within the 115 Gbit / s to 125 Gbit / s interval is 35%, higher than all other intervals, then the predicted flow rate value for that time slice is determined to be 120 Gbit / s. Connecting the predicted flow rate values of all 144 time slices forms a complete flow rate prediction curve for the next 72 hours.
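The sampling and mode-extraction steps for a single time slice can be sketched as below. This is an illustrative sketch only. The normal distribution for the amplitude correction factor, the fixed pulse size, the 10 Gbit/s bin width and the bin origin of 5 (chosen so that one bin spans 115 to 125) are all assumptions, as are the function names.

```python
import random

def simulate_slice(baseline, amp_sigma, event_prob, event_pulse, n=1000, seed=0):
    """Draw n flow samples for one time slice (values in Gbit/s)."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        # Amplitude correction factor; a normal distribution is assumed here.
        factor = rng.gauss(1.0, amp_sigma)
        # Bernoulli trial deciding whether the sudden-event pulse fires.
        pulse = event_pulse if rng.random() < event_prob else 0.0
        samples.append(baseline * factor + pulse)
    return samples

def histogram_mode(samples, bin_width=10.0, origin=5.0):
    """Midpoint of the most frequent histogram bin."""
    counts = {}
    for s in samples:
        idx = int((s - origin) // bin_width)
        counts[idx] = counts.get(idx, 0) + 1
    best = max(counts, key=counts.get)
    return origin + (best + 0.5) * bin_width

samples = simulate_slice(baseline=120.0, amp_sigma=0.02, event_prob=0.0, event_pulse=0.0)
print(histogram_mode(samples))
```

With a non-zero `event_prob` and a large `event_pulse`, the samples split into two clusters, which corresponds to the bimodal case described above; taking only the mode then discards the second state, so a downstream decision step would need the full histogram.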
[0034] To define the effective time range for each predicted flow value, the previously determined pattern scope boundary is mapped to a prediction time window. For time periods dominated by daily cyclical patterns, the length of the prediction time window equals the scope boundary of that daily cyclical pattern; that is, predictions for the same time period on each of the next 10 days remain valid. For time periods affected by abrupt events, the prediction time window covers only the 24 hours surrounding the event, that is, 12 hours before and 12 hours after its predicted occurrence time; beyond this range, the prediction reliability drops sharply. In this way, each predicted flow value is accompanied by a clear validity period, enabling subsequent computing power allocation decisions to accurately grasp the applicable scope of the prediction.
[0035] After the above processing, a structured traffic prediction result is generated, containing a sequence of predicted traffic values for 144 time slices, each with a corresponding prediction time window identifier. For example, the 30th time slice corresponds to the next 15 hours, with a predicted traffic value of 138 Gbit / s and a prediction time window identifier of "dominated by daily cycles, valid for 10 days". The 90th time slice corresponds to the next 45 hours, coinciding with the expected hotspot event period, with a predicted traffic value of 280 Gbit / s and a prediction time window identifier of "overlapping sudden events, valid for 24 hours". This prediction result not only provides specific values for future traffic but also clarifies the confidence level and applicable time limit of each prediction value, providing accurate input information for the subsequent dynamic allocation of computing resources.
[0036] In one optional implementation, the AI computing resources of CDN nodes are hierarchically identified based on the traffic prediction results to obtain a computing power hierarchy structure. Based on the traffic prediction results and the computing power hierarchy structure, an allocation decision model reflecting the relationship between prediction uncertainty and computing power supply and demand is constructed, including: Extract the distribution of prediction confidence and the gradient of change of prediction traffic value from the traffic prediction results. Based on the dispersion of the prediction confidence distribution, classify the AI computing resources of each CDN node into risk levels. Label the computing resources of different risk levels with hierarchical attributes according to the gradient of change of prediction traffic value, forming a computing power layered structure that includes a deterministic computing power layer and an uncertainty buffer layer. The capacity boundary of the deterministic computing power layer in the computing power hierarchical structure is bound to the stable confidence interval of the predicted confidence distribution, and the capacity elasticity space of the uncertainty buffer layer is associated with the fluctuating confidence interval of the predicted confidence distribution, thereby establishing a confidence-driven computing power supply and demand mapping relationship. Based on the aforementioned computing power supply and demand mapping relationship, an allocation decision model reflecting the relationship between prediction uncertainty and computing power supply and demand matching is constructed.
[0037] In the process of classifying and identifying CDN nodes based on their computing resources, the first step is to extract the prediction confidence distribution and the change gradient of the predicted traffic value from the traffic prediction results. The prediction confidence distribution reflects the reliability of the prediction results, and its numerical form is a probability density curve for each time window. For example, when the predicted traffic value for a certain time period is 500 Gbps, the corresponding confidence distribution is approximately normal within the interval [450 Gbps, 550 Gbps], with the central peak located at the predicted traffic value. The dispersion of the distribution characterizes the intensity of the prediction uncertainty. By calculating the standard deviation or variance of this distribution, the stability level of the prediction can be quantified. The larger the standard deviation, the wider the confidence range of the predicted value, and the higher the probability that the actual traffic will deviate from the predicted value.
[0038] The gradient of the predicted traffic is obtained by calculating the difference between the predicted traffic values of adjacent time windows. When the predicted traffic volume jumps from 400 Gbps to 600 Gbps from one time window to the next, the gradient for that period shows a sharp upward trend, indicating a rapid increase in traffic demand. Conversely, if the predicted traffic volume decreases from 700 Gbps to 500 Gbps, the gradient is negative. By cumulatively analyzing the gradients over multiple consecutive time windows, the fluctuation cycles and abrupt change points of traffic demand can be identified. This characteristic information provides a quantitative basis for subsequent risk classification.
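As a minimal sketch of the gradient computation, the first difference of the predicted series can be taken directly. The abrupt-change threshold of 150 Gbps is a hypothetical parameter introduced only for illustration.

```python
def change_gradient(predicted):
    """First difference between adjacent time windows (Gbps per window)."""
    return [b - a for a, b in zip(predicted, predicted[1:])]

def abrupt_points(gradient, threshold):
    """Indices where the absolute gradient exceeds the (assumed) threshold."""
    return [i for i, g in enumerate(gradient) if abs(g) > threshold]

predicted = [400, 600, 700, 500]
gradient = change_gradient(predicted)
print(gradient)                       # [200, 100, -200]
print(abrupt_points(gradient, 150))   # [0, 2]
```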
[0039] After obtaining the dispersion of the prediction confidence distribution, the AI computing resources of each CDN node are risk-classified. The specific classification strategy is based on the concentration index of the confidence distribution. When the standard deviation of the prediction confidence distribution corresponding to a node is less than a set first threshold, the traffic prediction result of that node is judged as high-confidence, and its AI computing resources are classified as low-risk. These nodes have relatively small fluctuations in traffic demand, their historical traffic curves show stable periodic patterns, and the probability of deviation between actual and predicted traffic is low. For nodes with a standard deviation between the first and second thresholds, their prediction results have a moderate degree of uncertainty, and the corresponding AI computing resources are marked as medium-risk. When the standard deviation exceeds the second threshold, the prediction confidence distribution exhibits significant dispersion characteristics, and the node faces a higher possibility of sudden traffic changes or abnormal fluctuations; the corresponding computing resources are classified as high-risk.
[0040] After risk classification, computing resources of different risk levels are labeled with hierarchical attributes based on the gradient of predicted traffic changes. For low-risk computing resources, the gradient of traffic demand changes typically remains within a flat range. These resources are labeled as the deterministic computing layer, undertaking the function of stable basic computing power supply. The capacity configuration of the deterministic computing layer is strictly aligned with the central peak of the predicted traffic value, without setting redundant reserve space, to maximize resource utilization efficiency. The traffic change gradient corresponding to medium-risk computing resources exhibits periodic fluctuation characteristics. These resources are labeled as the elastic adjustment layer, used to cope with traffic fluctuations within a foreseeable range. The capacity of the elastic adjustment layer is dynamically configured according to the fluctuation amplitude of the change gradient. When the absolute value of the gradient increases, the capacity of this layer expands accordingly to cover potential demand increments.
[0041] For high-risk computing resources, the corresponding traffic change gradient is irregular and abrupt, with historical data showing abnormal patterns of short-term traffic surges or drops. These resources are designated as the uncertainty buffer layer, which is specifically intended to handle sudden traffic spikes that exceed the prediction boundaries. The capacity of the uncertainty buffer layer is set by a reservation mechanism; its size is not tied directly to specific predicted traffic values, but is configured probabilistically from the statistical distribution of historical anomalies. For example, if a node's historical peak traffic has exceeded the predicted value by 30%, and such an excess occurs with a frequency of 5%, the uncertainty buffer layer needs to reserve at least 30% additional computing capacity to ensure that service quality is maintained even in abnormal scenarios.
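The risk classification and layer labeling described in the preceding paragraphs can be sketched as follows. The threshold values are hypothetical, since the text only refers to a "first" and "second" threshold, and the layer labels are illustrative identifiers.

```python
from statistics import pstdev

# Risk level to layer label, following the three layers named in the text.
LAYER_BY_RISK = {
    "low": "deterministic",
    "medium": "elastic_adjustment",
    "high": "uncertainty_buffer",
}

def risk_level(confidence_samples, first_threshold, second_threshold):
    """Classify a node by the standard deviation of its confidence distribution."""
    spread = pstdev(confidence_samples)
    if spread < first_threshold:
        return "low"
    if spread <= second_threshold:
        return "medium"
    return "high"

# Hypothetical thresholds in Gbps.
FIRST, SECOND = 20.0, 60.0
for samples in ([500, 500, 500], [450, 500, 550], [300, 500, 700]):
    level = risk_level(samples, FIRST, SECOND)
    print(level, LAYER_BY_RISK[level])
```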
[0042] Through the aforementioned grading and labeling process, a hierarchical computing power structure is formed, comprising a deterministic computing power layer and an uncertainty buffer layer, with an elastic adjustment layer between them. Logically, this structure divides the AI computing power resources of CDN nodes into three layers: the deterministic computing power layer at the bottom provides stable basic computing power support; the elastic adjustment layer in the middle absorbs periodic fluctuations; and the uncertainty buffer layer at the top serves as the last line of defense in abnormal scenarios. The computing power resources within each layer are further subdivided based on the geographical location, hardware configuration, and historical load of the node, forming a multi-dimensional resource identification system.
[0043] When constructing the computing power supply and demand mapping relationship, the capacity boundary of the deterministic computing power layer in the computing power hierarchy is bound to the stable confidence interval of the prediction confidence distribution. The stable confidence interval corresponds to the range of values in the prediction confidence distribution where the cumulative probability density reaches a specific threshold. For example, the traffic range [480 Gbps, 520 Gbps] corresponding to the cumulative probability reaching 80% is the stable confidence interval. The capacity boundary of the deterministic computing power layer is set as the maximum computing power value that can meet all traffic demands within this interval. Assuming that processing 1 Gbps of traffic requires 10 computing power units, the deterministic computing power layer needs to be configured with 5200 computing power units to cover the upper limit demand of 520 Gbps. This binding mechanism ensures that in most normal scenarios, the resource supply of the deterministic computing power layer can accurately match the actual traffic demand, avoiding computing power waste or insufficient supply.
[0044] Simultaneously, the capacity elasticity space of the uncertainty buffer layer is correlated with the fluctuation confidence interval of the prediction confidence distribution. The fluctuation confidence interval corresponds to the tail distribution region outside the stable confidence interval, reflecting the uncertainty boundary of the prediction result. When the cumulative probability expands from 80% to 95%, the traffic range expands to [420 Gbps, 580 Gbps], and the fluctuation confidence intervals are the two segments [420 Gbps, 480 Gbps] and [520 Gbps, 580 Gbps]. The capacity elasticity space of the uncertainty buffer layer needs to cover the additional computing power requirements corresponding to the fluctuation confidence interval. For the above example, an additional 600 computing power units need to be reserved to cope with traffic fluctuation scenarios below 480 Gbps or above 520 Gbps. The configuration of the capacity elasticity space adopts a dynamic adjustment strategy. When the prediction confidence distribution of multiple consecutive time windows tends to converge, the fluctuation confidence interval narrows, and the elasticity space shrinks accordingly; conversely, when the distribution dispersion increases, the elasticity space expands synchronously.
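The arithmetic of the two worked examples above (5200 units for the deterministic layer, 600 additional units for the buffer layer) can be reproduced with the short sketch below. The function names are hypothetical; the conversion factor of 10 computing power units per Gbps is the one assumed in the text.

```python
UNITS_PER_GBPS = 10  # conversion factor assumed in the worked example

def deterministic_capacity(stable_interval):
    """Units needed to cover the upper limit of the stable confidence interval."""
    _, upper = stable_interval
    return upper * UNITS_PER_GBPS

def buffer_elastic_space(stable_interval, wide_interval):
    """Extra units between the stable upper limit and the wider upper limit."""
    return (wide_interval[1] - stable_interval[1]) * UNITS_PER_GBPS

stable = (480, 520)  # interval at 80% cumulative probability
wide = (420, 580)    # interval at 95% cumulative probability
print(deterministic_capacity(stable))      # 5200
print(buffer_elastic_space(stable, wide))  # 600
```

Only the upper tail [520 Gbps, 580 Gbps] adds capacity in this sketch; the lower tail describes load that the deterministic layer already covers.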
[0045] Through the aforementioned binding and association process, a confidence-driven computing power supply and demand mapping relationship is established. This mapping relationship transforms abstract prediction confidence indicators into specific computing power capacity configuration schemes, enabling computing power resource allocation decisions to quantitatively reflect the degree of uncertainty in prediction results. A dynamic adjustment mechanism is embedded in the mapping relationship. When a significant change in the prediction confidence distribution is detected in real time, the capacity boundaries and elasticity spaces at each level immediately trigger a recalculation process, ensuring that the computing power supply and demand matching relationship remains synchronized with the latest state of the prediction results.
[0046] Based on the mapping relationship between computing power supply and demand, an allocation decision model reflecting the relationship between prediction uncertainty and the matching relationship between computing power supply and demand is constructed. This model takes multidimensional features of the prediction confidence distribution as input, including the boundary values of the stable confidence interval, the dispersion of the fluctuating confidence interval, the statistical distribution of historical prediction errors, and the load index of the current network state. Internally, the model constructs a multi-layered decision logic. The first layer targets the deterministic computing power layer, generating precise computing power allocation values based on the center value of the stable confidence interval to ensure the determinism of basic computing power supply. The second layer targets the elastic adjustment layer, generating time-segmented computing power adjustment schemes based on the width of the fluctuating confidence interval and the periodic characteristics of the change gradient, realizing the dynamic scaling of computing power resources. The third layer targets the uncertainty buffer layer, calculating the activation conditions and capacity release strategies of the buffer layer by combining the probability of historical anomalies and the deviation of the current prediction error.
[0047] The allocation decision model achieves quantitative management of uncertainty by mapping the prediction error boundary to computing power reserve space. The prediction error boundary is obtained by statistically analyzing the deviation distribution between historical prediction values and actual traffic values. When the deviation corresponding to the 95th percentile of the historical error is ±50 Gbps, this value is the prediction error boundary. The model converts this boundary value into the capacity requirement of computing power reserve space, reserving 500 computing power units as error buffer reserves according to the conversion coefficient between traffic and computing power. The allocation of reserve space is preferentially tilted towards the uncertainty buffer layer, while retaining some fast response capability in the elastic adjustment layer, forming a multi-layered fault-tolerant system.
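A minimal sketch of this conversion is shown below. It assumes a nearest-rank percentile over absolute errors and the 10 units per Gbps factor used in the earlier example; the sample data are made up so that the 95th percentile lands on 50 Gbps.

```python
UNITS_PER_GBPS = 10  # conversion coefficient between traffic and computing power

def error_boundary(predicted, actual, pct=95):
    """Nearest-rank percentile of the absolute prediction error (Gbps)."""
    errors = sorted(abs(a - p) for p, a in zip(predicted, actual))
    rank = (pct * len(errors) + 99) // 100  # integer ceiling of pct * n / 100
    return errors[rank - 1]

def reserve_units(boundary_gbps):
    """Computing power units to reserve as an error buffer."""
    return boundary_gbps * UNITS_PER_GBPS

# Illustrative history: 20 windows, all predicted at 500 Gbps.
predicted = [500] * 20
deviations = [2 * i for i in range(1, 19)] + [50, 80]
actual = [500 + d if i % 2 else 500 - d for i, d in enumerate(deviations)]

boundary = error_boundary(predicted, actual)
print(boundary, reserve_units(boundary))  # 50 500
```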
[0048] The model output includes computing power allocation instructions containing a computing power allocation scheme and an inter-layer migration strategy. The computing power allocation scheme clearly defines the number of computing power resources configured for each CDN node at each layer under different time windows, presented in tabular or vector form for easy direct invocation by the execution system. The inter-layer migration strategy defines the rules for the flow of computing power resources between different layers. When the load of the deterministic computing power layer exceeds the capacity boundary, some resources of the elastic adjustment layer are immediately deployed to supplement it; when the fluctuation confidence interval continues to narrow, idle resources of the uncertainty buffer layer are deployed to the elastic adjustment layer to improve the overall resource utilization. The migration strategy embeds constraint parameters for triggering conditions and execution latency to ensure the timeliness and smoothness of inter-layer allocation and avoid system oscillations caused by frequent switching.
[0049] In one optional implementation, the capacity boundary of the deterministic computing power layer in the computing power hierarchy is bound to the stable confidence interval of the predicted confidence distribution, and the capacity elasticity space of the uncertainty buffer layer is associated with the fluctuating confidence interval of the predicted confidence distribution, establishing a confidence-driven computing power supply and demand mapping relationship, including: Extract the confidence variance sequence from the predicted confidence distribution, identify the variance convergence segment and the variance divergence segment based on the changing trend of the confidence variance sequence, mark the confidence interval corresponding to the variance convergence segment as the stable confidence interval, and mark the confidence interval corresponding to the variance divergence segment as the fluctuating confidence interval. The confidence boundary value of the stable confidence interval is mapped to the deterministic boundary of the traffic demand. Based on the deterministic boundary, the baseline load capacity required by the deterministic computing layer is calculated, and the baseline load capacity is set as the capacity boundary of the deterministic computing layer, thereby realizing the binding between the capacity boundary and the stable confidence interval. The confidence deviation of the fluctuation confidence interval is mapped to the uncertainty range of traffic demand, and the overload capacity change space that the uncertainty buffer layer needs to absorb is calculated based on the uncertainty range. 
The overload capacity variation space is divided into an elastic expansion interval and an elastic contraction interval, and the elastic expansion interval and the elastic contraction interval together constitute the capacity elastic space of the uncertainty buffer layer, thereby establishing a confidence-driven computing power supply and demand mapping relationship.
[0050] After traffic prediction is completed, it is necessary to establish a precise correspondence between the confidence characteristics of the prediction results and the hierarchical configuration of computing resources. The prediction confidence distribution reflects the reliability of the traffic prediction, and it contains differences between stable and fluctuating regions. By analyzing the statistical characteristics of the confidence distribution, different confidence interval types can be identified, and these intervals can then be associated with different levels in the computing resource hierarchy.
[0051] When extracting the confidence variance sequence from the prediction confidence distribution, it is necessary to calculate the variance value of the confidence score for each prediction point in the time series. Specifically, for consecutive prediction points within a time window, the dispersion of their confidence scores is calculated to form a variance sequence. This variance sequence can reveal the changing pattern of prediction stability over time. When the variance value remains at a low level, it indicates that the prediction results are highly consistent; when the variance value increases significantly, it indicates that the uncertainty of the prediction results increases. Trend analysis of the variance sequence is performed using a sliding window approach, setting the window length to 5 to 10 times the prediction time granularity, and calculating the first difference of the variance within each window. When the difference values of multiple consecutive windows are all negative or close to zero, the time period is determined to be a variance convergence segment; when the difference values of multiple consecutive windows are positive and exceed a preset threshold, the time period is determined to be a variance divergence segment. Within the time interval corresponding to the variance convergence segment, the prediction confidence remains stable, the confidence score values within this interval are concentrated, and the standard deviation is small. The upper and lower bounds of the confidence level in this interval are used as the basis for defining the stable confidence interval, typically ranging from -0.5 to +0.5 standard deviations of the mean confidence level. Within the time interval corresponding to the variance divergence segment, the predicted confidence level exhibits a fluctuating trend, with a discrete distribution of confidence level values and a large standard deviation. 
The upper and lower bounds of the confidence level in this interval are used as the basis for defining the fluctuating confidence interval, typically ranging from -1.5 to +1.5 standard deviations of the mean confidence level.
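The segmentation of the confidence variance sequence can be sketched as follows. This is an illustration under stated assumptions: the run length of 3 windows, the tolerance `eps` for "close to zero", and the divergence threshold are hypothetical parameters that the text leaves open.

```python
from statistics import pvariance

def variance_sequence(confidence, window):
    """Population variance of confidence scores in each sliding window."""
    return [pvariance(confidence[i:i + window])
            for i in range(len(confidence) - window + 1)]

def label_segments(var_seq, threshold, eps=0.01, run=3):
    """Label each first difference using the last `run` differences."""
    diffs = [b - a for a, b in zip(var_seq, var_seq[1:])]
    labels = []
    for i in range(len(diffs)):
        recent = diffs[max(0, i - run + 1):i + 1]
        if len(recent) < run:
            labels.append("undetermined")
        elif all(d <= eps for d in recent):      # negative or close to zero
            labels.append("convergence")
        elif all(d > threshold for d in recent): # positive and above threshold
            labels.append("divergence")
        else:
            labels.append("undetermined")
    return labels

var_seq = [0.5, 0.4, 0.3, 0.3, 0.6, 1.0, 1.5]
labels = label_segments(var_seq, threshold=0.2)
print(labels)
```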
[0052] A stable confidence interval reflects the predictable portion of traffic demand, and the confidence boundary values within this interval can be transformed into deterministic boundaries for traffic demand. The traffic prediction value corresponding to the lower boundary confidence value of the stable confidence interval is taken as the minimum demand for deterministic load, and the traffic prediction value corresponding to the upper boundary confidence value is taken as the maximum demand for deterministic load. The baseline load capacity that the deterministic computing layer needs to support is determined by these two boundary values. Assuming the lower boundary confidence of the stable confidence interval is 0.75, corresponding to a traffic prediction of 800 requests per second, and the upper boundary confidence is 0.85, corresponding to a traffic prediction of 950 requests per second, then the baseline load capacity of the deterministic computing layer should be able to support a processing demand of 800 to 950 requests per second. By mapping the confidence range of the stable confidence interval to a numerical range of traffic processing capacity, the number of computing units required for the deterministic computing layer can be calculated. When the processing capacity of each computing unit is known, the required number of computing units can be obtained by dividing the deterministic boundary of traffic demand by the processing capacity of a single computing unit. This quantity is used as the capacity boundary of the deterministic computing layer to ensure that the resource allocation at this level can stably cope with the high-confidence traffic demand. The capacity boundary is set using a rounding-up principle to ensure that the actual configured computing resources are slightly higher than the theoretical calculated value, avoiding resource shortages under boundary conditions.
[0053] The fluctuation confidence interval reflects the uncertainty of traffic demand, and the deviation of the confidence level within this interval can be converted into the uncertainty range of traffic demand. The deviation of the confidence level within the fluctuation confidence interval is used as a quantitative indicator of traffic fluctuation to calculate the extent to which actual traffic exceeds or falls below the predicted value. Assuming the confidence level deviation of the fluctuation confidence interval is between -15% and +20%, and the corresponding traffic prediction baseline is 1000 requests per second, then the uncertainty range of traffic demand is 850 to 1200 requests per second. The overload capacity variation space that the uncertainty buffer layer needs to absorb is determined by this uncertainty range. The overload capacity variation space refers to the additional computing power reserve that needs to be prepared when the deterministic computing power layer capacity is fixed. This reserve needs to cover all cases where traffic demand exceeds the deterministic boundary. When calculating the overload capacity variation space, the upper limit of the deterministic boundary is subtracted from the upper limit of the uncertainty range to obtain the positive overload capacity demand; the lower limit of the uncertainty range is subtracted from the lower limit of the deterministic boundary to obtain the negative overload capacity demand. Positive overload capacity demand corresponds to the additional computing power that needs to be quickly mobilized in scenarios of sudden traffic surges, while negative overload capacity demand corresponds to the redundant computing power that can be released in scenarios of sudden traffic drops.
[0054] The overload capacity variation space is divided into elastic expansion and elastic contraction zones to achieve bidirectional adjustment capabilities of computing resources. The elastic expansion zone corresponds to positive overload capacity demand; computing resources within this zone are reserved and ready to be activated when actual traffic exceeds the deterministic boundary. The upper limit of the elastic expansion zone's capacity is determined by the maximum deviation of the fluctuation confidence interval. Assuming the maximum deviation corresponds to a traffic increment of 250 requests per second, and a single computing unit's processing capacity is 50 requests per second, then the elastic expansion zone requires 5 reserved computing units. The elastic contraction zone corresponds to negative overload capacity demand; computing resources within this zone can be released based on a decrease in actual traffic. The lower limit of the elastic contraction zone's capacity is determined by the minimum deviation of the fluctuation confidence interval. Assuming the minimum deviation corresponds to a traffic reduction of 150 requests per second, then the elastic contraction zone can release the resources of 3 computing units. The elastic expansion and elastic contraction zones together constitute the capacity elastic space of the uncertainty buffer layer; the total amount of this space reflects the flexible allocation capability of computing resources. The configuration of capacity elasticity space follows the principle of asymmetry. The capacity of the expansion range is usually greater than that of the contraction range because the impact of a sudden increase in traffic on service quality is greater than that of a sudden drop in traffic, and more uplink adjustment space needs to be reserved.
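The unit counts in this example can be checked with a short sketch. Rounding the expansion interval up follows the rounding-up principle stated for capacity boundaries; rounding the contraction interval down, so that only fully freed units are released, is an assumption. Function names are hypothetical.

```python
UNIT_CAPACITY_RPS = 50  # processing capacity of a single computing unit

def expansion_units(traffic_increment_rps, unit_capacity=UNIT_CAPACITY_RPS):
    """Units to reserve for a traffic surge, rounded up."""
    return -(-traffic_increment_rps // unit_capacity)

def contraction_units(traffic_reduction_rps, unit_capacity=UNIT_CAPACITY_RPS):
    """Units that can be released after a traffic drop, rounded down (assumed)."""
    return traffic_reduction_rps // unit_capacity

expand = expansion_units(250)      # maximum deviation: +250 requests per second
contract = contraction_units(150)  # minimum deviation: -150 requests per second
print(expand, contract)            # 5 3
```

The asymmetry principle shows up directly in the result: the expansion interval (5 units) is larger than the contraction interval (3 units).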
[0055] Through the aforementioned mapping mechanism, the statistical characteristics of the confidence distribution directly drive the hierarchical allocation logic of computing resources. A rigid binding is formed between the stable confidence interval and the capacity boundary of the deterministic computing layer, ensuring reliable protection of the traffic demand corresponding to high-confidence predictions. A flexible correlation is formed between the fluctuating confidence interval and the capacity elasticity space of the uncertainty buffer layer, ensuring effective absorption of traffic fluctuations corresponding to low-confidence predictions. This confidence-driven computing supply and demand mapping relationship quantifies the uncertainty of predictions into configuration parameters for computing resources, achieving a precise conversion between the probabilistic characteristics at the prediction level and the resource allocation at the execution level. In actual operation, when the prediction confidence distribution changes, the capacity configuration of the hierarchical computing structure will be adjusted synchronously to maintain the dynamic consistency of the mapping relationship.
[0056] In one optional implementation, performing computing power allocation on each CDN node based on the computing power allocation instruction, and forming feedback information by extracting actual traffic data and computing power usage data from the post-allocation operating status, includes: The computing power allocation scheme and inter-layer migration strategy in the computing power allocation instruction are parsed. The target computing power quotas of the deterministic computing power layer and the uncertainty buffer layer in each CDN node are determined according to the computing power allocation scheme. The time-series scheduling rules for the transfer of computing power resources between layers are determined according to the inter-layer migration strategy. According to the time-series scheduling rules, the computing resources of each CDN node are migrated between layers, and the migrated computing resources are redistributed according to the target computing power quotas. After the computing resources are allocated, the operating status of each CDN node is continuously monitored, and actual traffic data reflecting the actual arrival of traffic and computing power usage data reflecting the actual consumption of computing power are extracted from the operating status. The actual traffic data and the computing power usage data are timestamped to establish a correlation between the arrival time of traffic and the consumption time of computing power. Based on the correlation, the actual traffic data and the computing power usage data are encapsulated into feedback information containing time-series correlation features.
[0057] The parsing process of the allocation instructions involves a structured decomposition of the computing power allocation scheme and the inter-layer migration strategy. The computing power allocation scheme includes a resource quota description for each CDN node. This description uses the node identifier as an index to link the baseline resource amount of the deterministic computing power layer and the elastic resource amount of the uncertain buffer layer. The target computing power quota for the deterministic computing power layer is determined by analyzing the stable components of the predicted traffic value. This quota needs to meet the requirements of AI inference tasks under steady-state load, specifically ensuring the processing capacity for computationally intensive tasks such as image recognition and content analysis. The target computing power quota for the uncertain buffer layer is set based on the traffic fluctuation amplitude within the prediction time window. This quota is used to cope with the surge in instantaneous computing demand caused by sudden traffic spikes, and its capacity is directly related to the prediction error boundary. The inter-layer migration strategy, with time as the main axis, defines the timing and amount of computing power resource transfer between the two layers. The time-series scheduling rules are implemented through a conditional trigger mechanism. When the computing power utilization rate of the uncertain buffer layer exceeds a preset threshold, some resources are automatically allocated from the deterministic computing power layer for inter-layer supplementation.
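A minimal sketch of the conditional trigger described above, assuming a per-node quota record indexed by node identifier. The names `NodeQuota` and `units_to_migrate`, the 0.8 threshold, the migration step, and the floor kept in the deterministic layer are illustrative assumptions, not values given in this description.

```python
from dataclasses import dataclass

@dataclass
class NodeQuota:
    node_id: str
    deterministic_units: int  # baseline resources of the deterministic layer
    buffer_units: int         # elastic resources of the uncertain buffer layer

def units_to_migrate(quota, buffer_in_use, threshold=0.8, step=2, floor=4):
    """Return how many units to move from the deterministic layer to the buffer layer.

    threshold, step and floor are hypothetical tuning parameters.
    """
    if quota.buffer_units <= 0:
        return 0
    utilization = buffer_in_use / quota.buffer_units
    if utilization <= threshold:
        return 0  # buffer layer still has headroom, no trigger
    # Never drain the deterministic layer below its assumed floor.
    spare = max(quota.deterministic_units - floor, 0)
    return min(step, spare)

quota = NodeQuota("node-a", deterministic_units=10, buffer_units=5)
print(units_to_migrate(quota, buffer_in_use=3))  # utilization 0.6, below threshold
print(units_to_migrate(quota, buffer_in_use=5))  # utilization 1.0, trigger fires
```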
[0058] The execution of inter-layer migration operations relies on the resource scheduler within the node. This scheduler maintains a state mapping table of computing resources, recording the current ownership level and availability status of each computing unit. When migration is triggered, the scheduler first filters out computing units occupied by lower-priority tasks in the deterministic computing power layer, suspends the execution of these tasks, saves their context information, and marks the released computing resources as migratable. Subsequently, according to the migration amount parameter in the time-series scheduling rules, a specified number of computing units are remapped to the uncertain buffer layer, the resource ownership labels are updated, and usage restrictions are removed. After migration, the computing resources are redistributed according to the target computing power quota. The deterministic computing power layer prioritizes long-term, stable AI service requests, allocating computing resources to tasks with low latency sensitivity, such as content recommendation and video transcoding. The uncertain buffer layer prioritizes responding to AI inference requests with high real-time requirements, such as user behavior prediction and real-time image processing, and manages the order of computing resource usage after allocation through a task priority queue.
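The migration step can be sketched as follows, assuming the state mapping table is a dictionary keyed by computing unit identifier. The field names (`layer`, `priority`, `task`, `suspended_task`) and the convention that a smaller `priority` value means a lower-priority task are hypothetical choices made for this illustration.

```python
def migrate_units(state_table, amount):
    """Move up to `amount` deterministic-layer units to the buffer layer.

    Units running the lowest-priority tasks are released first.
    """
    candidates = sorted(
        (uid for uid, entry in state_table.items() if entry["layer"] == "deterministic"),
        key=lambda uid: state_table[uid]["priority"],
    )
    moved = []
    for uid in candidates[:amount]:
        entry = state_table[uid]
        # Suspend the task and keep a reference so it can be resumed later.
        entry["suspended_task"] = entry.pop("task", None)
        entry["layer"] = "buffer"  # update the ownership label
        moved.append(uid)
    return moved

table = {
    "u1": {"layer": "deterministic", "priority": 5, "task": "transcode-17"},
    "u2": {"layer": "deterministic", "priority": 1, "task": "recommend-03"},
    "u3": {"layer": "buffer", "priority": 9, "task": "realtime-42"},
}
print(migrate_units(table, amount=1))  # releases the unit with the lowest priority value
```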
[0059] After the allocation is completed, the operational status monitoring is achieved through data acquisition agents deployed on each CDN node. These agents collect multi-dimensional operational metrics of the nodes at a fixed sampling period. The extraction of actual traffic data includes statistics on the number of inbound requests and identification of traffic types. The acquisition agent records the total number of requests arriving at the node within each time slice, while also labeling the content type tags carried by the requests, such as image requests, video streaming requests, or interactive AI service requests. For the actual arrival of traffic, the acquisition agent also records the timestamp of the request arrival, accurate to the millisecond level, providing benchmark data for subsequent timestamp alignment. The extraction of computing power usage data involves end-to-end tracking of the computing resource consumption process. The acquisition agent monitors the utilization rate of each computing unit, task execution time, and energy consumption metrics, specifically recording the number of computing cycles and memory bandwidth usage consumed from receipt to completion of each AI inference task. Actual computing power consumption also includes dynamic changes in inter-layer resources. The acquisition agent continuously tracks the resource usage ratio between the deterministic computing power layer and the uncertain buffer layer, recording computing power reallocation events caused by migration operations and their impact scope.
[0060] Timestamp alignment addresses the synchronization issue between actual traffic data and computing power usage data in the time dimension. Since there is a time offset between traffic arrival and computing power consumption, a correspondence needs to be established between the two. The alignment process first extracts the request arrival timestamp sequence from the actual traffic data, using it as a time baseline. Then, it retrieves the task execution records associated with each request from the computing power usage data, matching them using request identifiers to determine the start and end times of the AI inference task triggered by that request. The correlation between traffic arrival time and computing power consumption time is represented by a mapping between the time difference and resource consumption. Specifically, it calculates the total latency from each request's arrival at the node to completion of computation, as well as the total amount of computing power resources consumed within that latency. Establishing this correlation also needs to consider the impact of task queuing delays. When node computing power resources are strained, some requests will wait in a buffer queue. The alignment process needs to include queuing time in the time difference calculation, distinguishing the proportion of network transmission latency, queuing latency, and actual computation latency.
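A minimal sketch of the alignment by request identifier, assuming each traffic record carries `request_id` and `arrival_ms`, and each task record carries `request_id`, `start_ms`, `end_ms`, and `cycles`. These field names are assumptions. Network transmission latency is left out because it would have to be measured outside the node.

```python
def align(traffic, tasks):
    """Join traffic and task records on request_id and split the latency."""
    tasks_by_id = {t["request_id"]: t for t in tasks}
    aligned = []
    for req in traffic:
        task = tasks_by_id.get(req["request_id"])
        if task is None:
            continue  # request has no matching inference task record
        queue_ms = task["start_ms"] - req["arrival_ms"]   # waiting in the buffer queue
        compute_ms = task["end_ms"] - task["start_ms"]    # actual computation time
        total_ms = queue_ms + compute_ms
        aligned.append({
            "request_id": req["request_id"],
            "total_ms": total_ms,
            "queue_share": queue_ms / total_ms if total_ms else 0.0,
            "cycles": task["cycles"],  # resources consumed within that latency
        })
    return aligned

traffic = [{"request_id": "r1", "arrival_ms": 1000}]
tasks = [{"request_id": "r1", "start_ms": 1030, "end_ms": 1100, "cycles": 5_000_000}]
print(align(traffic, tasks))
```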
[0061] The encapsulation of feedback information organizes the aligned data into a structured format. Temporal correlation features are represented by multi-dimensional vectors, with each vector element corresponding to a snapshot of traffic and computing power status within a time slice. The encapsulation process first aggregates actual traffic data by time slice, calculating the total number of requests arriving, request type distribution, and peak traffic times within each time slice. Simultaneously, computing power usage data is aggregated by the same time slice, calculating the average utilization, peak utilization, and inter-layer migration trigger count for the deterministic computing power layer and the uncertain buffer layer within each time slice. Temporal correlation features also include traffic and computing power matching efficiency indicators. By comparing the deviation between predicted traffic values and actual traffic data, and combining this with computing power quota usage, the execution effect of the allocation plan is quantitatively evaluated. The encapsulated feedback information is appended with metadata tags, annotating the start and end times of the collection period, node geographical location, network topology status, and other contextual information, providing complete data input for subsequent prediction model optimization and decision logic reconstruction. The transmission of feedback information adopts an incremental update mechanism, transmitting only data items that have changed relative to the previous period. This reduces data transmission overhead while ensuring real-time feedback, supporting rapid iteration and continuous optimization of the allocation decision model.
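The time-slice aggregation and the incremental update can be sketched as below. The one-second slice length, the record fields `arrival_ms` and `type`, and the helper names are assumptions made for illustration.

```python
from collections import Counter, defaultdict

def aggregate_by_slice(requests, slice_ms=1000):
    """Count requests and request types per time slice."""
    slices = defaultdict(lambda: {"total": 0, "types": Counter()})
    for r in requests:
        key = r["arrival_ms"] // slice_ms
        slices[key]["total"] += 1
        slices[key]["types"][r["type"]] += 1
    return {k: {"total": v["total"], "types": dict(v["types"])} for k, v in slices.items()}

def changed_items(previous, current):
    """Incremental update: keep only entries that differ from the previous period."""
    return {k: v for k, v in current.items() if previous.get(k) != v}

requests = [
    {"arrival_ms": 100, "type": "image"},
    {"arrival_ms": 900, "type": "video"},
    {"arrival_ms": 1500, "type": "image"},
]
current = aggregate_by_slice(requests)
previous = {0: {"total": 2, "types": {"image": 1, "video": 1}}}
print(changed_items(previous, current))  # only slice 1 is new relative to `previous`
```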
[0062] In one optional implementation, the feedback information is used to establish a quantitative mapping relationship between prediction deviation and actual resource consumption. The mapping relationship is then used to reconstruct the error boundary and reserve space mapping logic in the allocation decision model, including: Extract actual traffic data and predicted traffic values from traffic prediction results from the feedback information, calculate the distribution characteristics of prediction deviation, and extract computing power usage data and computing power quotas of each layer in the computing power hierarchical structure from the feedback information to calculate the structural characteristics of actual resource consumption. Based on the distribution characteristics of the predicted deviation, the deviation concentration interval and the deviation discrete interval are identified. Based on the structural characteristics of the actual resource consumption, the resource saturation level and the resource redundancy level are identified. A cross-correlation matrix between the deviation interval and the resource level is constructed. By establishing a quantitative mapping relationship between prediction deviation and actual resource consumption through the correlation strength in the cross-correlation matrix, the current error boundary and reserve space mapping logic are extracted from the allocation decision model. The driving degree in the quantitative mapping relationship is used as the basis for reconstruction, and the boundary range of the error boundary and the spatial scale of the reserve space are adjusted in a coordinated manner to complete the reconstruction of the error boundary and reserve space mapping logic in the allocation decision model.
[0063] During the operation of the CDNAI service, the continuous accumulation of feedback information provides real data support for the optimization of prediction models and allocation strategies. Feedback information collected from the operational status after allocation execution typically includes two core components: actual traffic data and computing power usage data. Actual traffic data records the actual number of user requests, bandwidth usage, and service response latency at each time point. Computing power usage data reflects the operational status of each CDN node at different times, including GPU computing resources consumed, memory usage, and the number of concurrent inference tasks. By comparing these actual traffic data with the predicted traffic values in the traffic prediction results time-by-time, the deviation value at each prediction time point can be obtained. The distribution characteristics of the prediction deviation are quantified using statistical methods, calculating statistics such as the mean, variance, skewness, and kurtosis of the deviation sequence. These statistics reveal whether the prediction error is concentrated in certain specific time periods or exhibits a discrete distribution throughout the entire prediction period. When the deviation variance is small and the kurtosis is high, it indicates that the prediction error is mainly concentrated in a specific interval; when the deviation variance is large and the distribution is relatively flat, it indicates that the prediction error exhibits a discrete distribution characteristic.
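The deviation statistics named above can be computed as in the following sketch. It uses population moments and non-excess kurtosis; the description does not say which estimator is used, so that choice is an assumption.

```python
def deviation_stats(actual, predicted):
    """Mean, variance, skewness and kurtosis of the deviation sequence (actual - predicted)."""
    dev = [a - p for a, p in zip(actual, predicted)]
    n = len(dev)
    mean = sum(dev) / n
    var = sum((d - mean) ** 2 for d in dev) / n
    if var == 0:
        skew = kurt = 0.0  # all deviations identical, higher moments undefined
    else:
        std = var ** 0.5
        skew = sum(((d - mean) / std) ** 3 for d in dev) / n
        kurt = sum(((d - mean) / std) ** 4 for d in dev) / n
    return {"mean": mean, "variance": var, "skewness": skew, "kurtosis": kurt}

print(deviation_stats(actual=[12, 8, 10, 10], predicted=[10, 10, 10, 10]))
```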
[0064] During the concurrent analysis of computing power usage data, the actual computing power consumption of each CDN node at each moment is extracted from the feedback information and compared with the computing power quota of each layer in the computing power hierarchy. The computing power hierarchy typically includes a basic service layer, an elastic scaling layer, and a peak emergency layer, each with a set upper limit on its computing power quota. The structural characteristics of actual resource consumption are reflected by calculating the resource utilization rate of each layer, defined as the ratio of actual consumed computing power to the quota of that layer. When the resource utilization rate of a layer is consistently close to or reaches 100%, that layer is identified as a resource saturation layer, indicating that the computing power quota of that layer cannot meet the actual demand. When the resource utilization rate of a layer is consistently below a set threshold, such as 50%, that layer is identified as a resource redundancy layer, indicating that the quota allocation of that layer is too high, resulting in idle computing power resources. By traversing the utilization rate data of all layers, a resource consumption feature vector reflecting the operational status of the entire computing power hierarchy can be constructed.
[0065] After obtaining the distribution characteristics of prediction errors and the structural characteristics of actual resource consumption, the concentrated and discrete intervals of prediction errors are further identified. The method for identifying concentrated intervals is to divide the entire prediction period into several time periods and calculate the local variance of the prediction error in each time period. When the local variance of a certain time period is lower than a specific proportion of the global variance, such as 60%, that time period is marked as a concentrated interval, meaning that the prediction error is relatively stable and controllable within that time period. Conversely, when the local variance of a certain time period is significantly higher than the global variance, that time period is marked as a discrete interval, indicating that the prediction error fluctuates greatly and is difficult to predict within that time period. The identification of resource saturation levels is based on the persistence of resource utilization. If the resource utilization of a certain level exceeds 90% in multiple consecutive sampling periods, that level is marked as a resource saturation level. The identification of resource redundancy levels is based on the long-term low resource utilization. If the average resource utilization of a certain level is lower than 40% within a statistical period, that level is marked as a resource redundancy level.
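A minimal sketch of these identification rules. The 60%, 90%, and 40% figures come from the paragraph above. The factor 1.5 standing in for "significantly higher", the run length of three samples, and the "neutral" label for periods matching neither rule are assumptions.

```python
from statistics import mean, pvariance

def label_periods(deviations, period_len, low_ratio=0.6, high_ratio=1.5):
    """Label each time period by comparing its local variance with the global variance."""
    global_var = pvariance(deviations)
    labels = []
    for i in range(0, len(deviations), period_len):
        local_var = pvariance(deviations[i:i + period_len])
        if local_var < low_ratio * global_var:
            labels.append("concentrated")
        elif local_var > high_ratio * global_var:
            labels.append("discrete")
        else:
            labels.append("neutral")
    return labels

def is_saturated(utilization, level=0.9, min_run=3):
    """True if utilization exceeds `level` in `min_run` consecutive samples."""
    run = 0
    for u in utilization:
        run = run + 1 if u > level else 0
        if run >= min_run:
            return True
    return False

def is_redundant(utilization, level=0.4):
    """True if the average utilization over the statistical period is below `level`."""
    return mean(utilization) < level

print(label_periods([1, 1, 1, 1, 10, -10, 10, -10], period_len=4))
print(is_saturated([0.95, 0.92, 0.5, 0.93, 0.94, 0.96]))
```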
[0066] The purpose of constructing the cross-correlation matrix is to reveal the intrinsic relationship between prediction error characteristics and resource consumption characteristics. The matrix's row dimensions correspond to the type of deviation interval, including concentrated and discrete deviation intervals, while the column dimensions correspond to the type of resource level, including resource saturation and resource redundancy levels. Each element in the matrix represents the correlation strength between a specific deviation interval and a specific resource level. The correlation strength is obtained by calculating the co-occurrence frequency and correlation coefficient of the two over time. Specifically, the calculation method involves statistically analyzing the utilization rate distribution of each resource level within the concentrated and discrete deviation intervals, and then quantifying the degree of correlation by calculating conditional probability and mutual information. For example, if the frequency of occurrence of resource saturation levels is significantly higher in the discrete deviation interval than in the concentrated deviation interval, the correlation strength between the two is higher, indicating that the increase in prediction error directly leads to resource shortages at certain levels.
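One way to quantify the correlation strength is the conditional probability of a resource state given the deviation interval type, as sketched below. Using conditional probability alone, without the correlation coefficient or mutual information also mentioned above, is a simplifying assumption, as is the observation format.

```python
ROWS = ("concentrated", "discrete")
COLS = ("saturated", "redundant")

def cooccurrence_matrix(observations):
    """observations: iterable of (interval_type, level_state) pairs, one per sample.

    Returns matrix[row][col] = P(level_state == col | interval_type == row).
    """
    counts = {r: {c: 0 for c in COLS} for r in ROWS}
    for interval, state in observations:
        if interval in counts and state in counts[interval]:
            counts[interval][state] += 1
    matrix = {}
    for r in ROWS:
        total = sum(counts[r].values())
        matrix[r] = {c: (counts[r][c] / total if total else 0.0) for c in COLS}
    return matrix

obs = [("discrete", "saturated")] * 3 + [("discrete", "redundant")] \
    + [("concentrated", "redundant")] * 2
print(cooccurrence_matrix(obs))
```

In this toy data, saturation co-occurs with discrete intervals three times out of four, which is the pattern the paragraph describes as prediction error driving resource shortage.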
[0067] The quantitative mapping relationship is established through the correlation strength values in the cross-correlation matrix, mapping the statistical characteristics of the deviation interval to the adjustment requirements of the resource level. When the correlation strength between the discrete deviation interval and the resource saturation level exceeds a set threshold, it indicates that increased prediction uncertainty leads to a severe shortage of computing resources. In this case, the boundary range of the error boundary is expanded and the spatial scale of the reserve space is increased. When the correlation strength between the concentrated deviation interval and the resource redundancy level is high, it indicates that computing resources are over-allocated when the prediction error is small. In this case, the boundary range of the error boundary is narrowed and the spatial scale of the reserve space is reduced. The quantitative mapping relationship is expressed mathematically as a function from correlation strength to adjustment magnitude: the higher the correlation strength, the larger the adjustment magnitude.
[0068] The current error boundary and reserve space mapping logic is extracted from the allocation decision model. This logic defines how much computing power reserve space should be reserved for each level under a given prediction error boundary. The original mapping logic sets fixed mapping coefficients based on historical experience; for example, the reserve space increases by 5% for every 10% increase in prediction error. During the reconstruction process, the driving degree in the quantified mapping relationship is used as the basis for reconstruction. The driving degree reflects the degree of deviation of the actual operating data from the original mapping logic. The calculation method is to compare the resource requirements predicted by the quantified mapping relationship with the resource requirements calculated by the original mapping logic and calculate the relative deviation rate between the two. This deviation rate is the driving degree. When the driving degree exceeds a set threshold, such as 15%, the reconstruction process of the mapping logic is triggered.
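The driving degree check can be sketched as follows. The coefficient 0.5 reflects the example above (5% more reserve per 10% more error) and the 15% threshold is also taken from the text. The function names and the sample figures are illustrative assumptions.

```python
def original_reserve(base_units, error_increase, coeff=0.5):
    """Reserve space under the original fixed-coefficient mapping logic."""
    return base_units * (1 + coeff * error_increase)

def driving_degree(requirement_from_feedback, requirement_from_original):
    """Relative deviation rate between the two resource requirement estimates."""
    return abs(requirement_from_feedback - requirement_from_original) / requirement_from_original

def needs_reconstruction(degree, threshold=0.15):
    return degree > threshold

baseline = original_reserve(base_units=100, error_increase=0.20)  # about 110 units
degree = driving_degree(requirement_from_feedback=132, requirement_from_original=baseline)
print(round(degree, 3), needs_reconstruction(degree))
```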
[0069] The adjustment of the error boundary range is based on the distribution characteristics of prediction deviations. If the proportion of time occupied by the discrete deviation interval increases, it indicates an increase in prediction uncertainty. In this case, the upper limit of the error boundary is expanded outward, with the expansion magnitude proportional to the growth rate of the deviation variance. If the proportion of time occupied by the concentrated deviation interval increases, it indicates improved prediction stability. In this case, the upper limit of the error boundary is contracted inward, with the contraction magnitude proportional to the decrease rate of the deviation variance. The adjustment of the spatial scale of the reserve space is based on the structural characteristics of actual resource consumption. If the number of resource saturation levels increases or the duration of saturation lengthens, it indicates insufficient reserve space. In this case, the computing power reserve quota for each level is increased, with the increase magnitude proportional to the degree of resource saturation. If the number of resource redundancy levels increases or the redundancy rate increases, it indicates an excess of reserve space. In this case, the computing power reserve quota for each level is reduced, with the reduction magnitude proportional to the resource redundancy rate.
[0070] The coordinated adjustment mechanism ensures that the adjustment of the error boundary and the reserve space remain consistent, avoiding the contradiction of the error boundary expanding without a corresponding increase in the reserve space. The coordination rule is set so that the adjustment magnitude of the error boundary and the adjustment magnitude of the reserve space maintain a specific proportional relationship, determined based on the dominant correlation strength in the cross-correlation matrix. If the correlation strength between the deviation discrete interval and the resource saturation level is the highest, then for every 1% expansion of the error boundary, the reserve space needs to increase synchronously by 0.8% to 1.2%, with the specific increase determined by the refined numerical value of the correlation strength. Through this coordinated adjustment, the mapping logic between the error boundary and the reserve space in the allocation decision model is reconstructed, enabling the model to more accurately reflect the true relationship between prediction uncertainty and the matching of computing power supply and demand.
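A minimal sketch of the coordination rule, assuming the ratio between reserve space growth and error boundary expansion is linearly interpolated between 0.8 and 1.2 by a correlation strength normalized to [0, 1]. The linear form is an assumption; the description only states that the ratio lies in that range and depends on the correlation strength.

```python
def reserve_increase_pct(boundary_expansion_pct, strength, low=0.8, high=1.2):
    """Reserve space increase (in %) paired with a given error boundary expansion (in %)."""
    s = min(max(strength, 0.0), 1.0)    # clamp the normalized correlation strength
    ratio = low + (high - low) * s      # assumed linear interpolation
    return boundary_expansion_pct * ratio

print(reserve_increase_pct(5.0, strength=0.0))
print(reserve_increase_pct(5.0, strength=0.5))
print(reserve_increase_pct(5.0, strength=1.0))
```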
[0071] The reconstructed mapping logic takes effect immediately and is applied to the next round of computing power allocation decision generation. The new mapping logic will drive the allocation decision model to generate computing power allocation schemes that better meet actual needs, reducing resource waste or service quality degradation caused by prediction errors. As feedback information continues to accumulate and the quantified mapping relationship is constantly updated, the reconstruction process of the mapping logic exhibits dynamic iterative characteristics, enabling the entire CDNAI service system to continuously optimize.
[0072] In one optional implementation, identifying the deviation concentration intervals and deviation discrete intervals based on the distribution characteristics of the prediction deviation, identifying the resource saturation levels and resource redundancy levels based on the structural characteristics of actual resource consumption, and constructing the cross-correlation matrix between deviation intervals and resource levels includes: Entropy values are calculated based on the distribution characteristics of the prediction deviation, and the degree of deviation clustering is determined from the magnitude of the entropy value. Time intervals with entropy values less than a preset entropy threshold are marked as deviation concentration intervals, and time intervals with entropy values greater than the preset entropy threshold are marked as deviation discrete intervals. The structural characteristics of actual resource consumption are traversed level by level, and the resource utilization saturation of each level is calculated. The resource utilization saturation is compared with a preset saturation benchmark: levels whose saturation exceeds the benchmark are marked as resource saturation levels, and levels whose saturation does not reach the benchmark are marked as resource redundancy levels. The deviation concentration intervals and deviation discrete intervals are used as row indices, and the resource saturation levels and resource redundancy levels as column indices, to initialize the cross-correlation matrix framework. For each row and column intersection in the framework, the consumption response intensity of the corresponding resource level within the corresponding deviation interval is calculated. The consumption response intensity is then filled into the corresponding position as the association weight, constructing the cross-correlation matrix between the deviation intervals and the resource levels.
[0073] As shown in Figure 2, the method includes: After obtaining the prediction deviation sequence of the CDN system within a specific observation period, a quantitative analysis of its distribution characteristics is performed. The observation period is divided into several time windows of fixed time granularity, each containing multiple prediction deviation sample points. For the set of prediction deviation samples within each time window, the information entropy of its distribution is calculated. The entropy is calculated by discretizing the deviation value range into several value intervals, counting the probability of samples falling into each value interval, and then computing the Shannon entropy of that probability distribution. When the prediction deviations within a time window are highly clustered, the probability distribution concentrates in a few value intervals and the entropy decreases; when the prediction deviations are dispersed within the time window, the probability distribution tends toward uniform and the entropy increases accordingly.
[0074] A preset entropy threshold is set as the judgment benchmark. This threshold is determined through statistical analysis of historical data and reflects the typical entropy level of the prediction deviation distribution under normal operating conditions. The calculated entropy value corresponding to each time window is compared with the preset entropy threshold. If the entropy value of the time window is less than the preset entropy threshold, it indicates that the prediction deviation within the window exhibits a concentrated distribution characteristic, and the time window is marked as a deviation concentration interval. Such intervals often correspond to the systematic shift of the prediction model under specific scenarios. If the entropy value of the time window is greater than the preset entropy threshold, it indicates that the prediction deviation within the window exhibits a discrete distribution characteristic, and the time window is marked as a deviation discrete interval. Such intervals usually reflect prediction fluctuations caused by sudden events or random disturbances.
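The entropy calculation and threshold comparison can be sketched as below. The bin count, the value range `[lo, hi]`, and the handling of a window whose entropy equals the threshold (treated here as discrete) are assumptions, since the description does not fix them.

```python
import math

def window_entropy(samples, lo, hi, bins=4):
    """Shannon entropy (in bits) of the deviation samples over equal-width bins."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for x in samples:
        idx = int((x - lo) / width)
        idx = min(max(idx, 0), bins - 1)  # clamp out-of-range samples into edge bins
        counts[idx] += 1
    n = len(samples)
    return sum((c / n) * math.log2(n / c) for c in counts if c)

def label_window(samples, lo, hi, threshold, bins=4):
    entropy = window_entropy(samples, lo, hi, bins)
    return "concentrated" if entropy < threshold else "discrete"

print(window_entropy([0.1, 0.2, 0.3, 0.4], lo=0, hi=4))   # all samples in one bin
print(window_entropy([0.5, 1.5, 2.5, 3.5], lo=0, hi=4))   # one sample per bin
print(label_window([0.1, 0.2, 0.3, 0.4], lo=0, hi=4, threshold=1.0))
```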
[0075] The structural characteristic analysis of actual resource consumption is based on the established computing power hierarchy, which divides the AI computing power resources of CDN nodes into multiple levels, each corresponding to different service priorities and resource configuration scales. The hierarchy is traversed hierarchically, from the bottom basic computing layer to the top core scheduling layer, extracting the actual resource usage and resource configuration capacity of each level within the current observation period. The resource utilization saturation of each level is calculated, obtained as the ratio of actual usage to configuration capacity; a ratio closer to 1 indicates that the level's resources are nearing full load.
[0076] A preset saturation benchmark is set as the threshold for classifying resource levels. This benchmark value is typically set between 0.75 and 0.85 to balance the need for efficient resource utilization with the requirement of reserving emergency buffer space. During the traversal, the resource utilization saturation calculated for each level is compared with the preset saturation benchmark. If the saturation of a level exceeds the preset saturation benchmark, it indicates that the resources at that level are close to or have reached their utilization limit, posing a risk of resource supply shortage, and the level is marked as a resource-saturated level. If the saturation of a level does not reach the preset saturation benchmark, it indicates that there are still many idle resources at that level, indicating resource redundancy, and the level is marked as a resource-redundant level. This marking mechanism establishes a binary classification system for resource level states, providing structured input for subsequent cross-correlation analysis.
[0077] After the deviation intervals and resource levels have been labeled, a cross-correlation matrix framework is constructed. This matrix adopts a two-dimensional table structure. The row dimension corresponds to the time interval type of the prediction deviation, including all labeled deviation concentration intervals and deviation discrete intervals, with each specific time window serving as an independent row index. The column dimension corresponds to the hierarchical classification of computing resources, including all labeled resource saturation levels and resource redundancy levels, with each specific level number serving as an independent column index. During initialization, all element values of the matrix are set to zero, forming a blank cross-correlation matrix framework. The number of elements in the matrix equals the product of the number of identified deviation intervals and the number of resource levels.
[0078] For each row and column intersection in the cross-correlation matrix framework, the consumption response intensity of the corresponding resource level within the corresponding deviation interval is calculated. The specific calculation process is as follows: A specific deviation interval (corresponding to a row in the matrix) is selected, and the actual resource consumption sequence for that specific resource level (corresponding to a column in the matrix) within that time interval is extracted. The response sensitivity of this consumption sequence relative to the predicted flow change within the deviation interval is calculated, i.e., the ratio of the change in resource consumption to the change in predicted deviation. A time lag factor correction is introduced to examine the response delay of resource consumption adjustment after the prediction deviation occurs, and the response intensity of different time lag intervals is weighted and accumulated. Normalization is performed by combining the duration of the deviation interval with the baseline consumption level of the resource level to eliminate the influence of differences in time scale and level size, resulting in a standardized consumption response intensity value.
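One possible reading of this calculation is sketched below. The description does not give a formula, so the per-step ratio, the lag weighting scheme, and the normalization (averaging over samples to remove the effect of interval duration, then dividing by the level's baseline consumption) are all assumptions.

```python
def response_intensity(deviation, consumption, lag_weights, baseline):
    """Lag-weighted sensitivity of consumption changes to deviation changes.

    deviation, consumption: equally spaced samples within one deviation interval.
    lag_weights[k]: weight of the response observed k samples after the deviation change.
    """
    d_dev = [b - a for a, b in zip(deviation, deviation[1:])]
    d_con = [b - a for a, b in zip(consumption, consumption[1:])]
    total = 0.0
    for lag, weight in enumerate(lag_weights):
        ratios = [
            d_con[i + lag] / d_dev[i]
            for i in range(len(d_dev) - lag)
            if d_dev[i] != 0
        ]
        if ratios:
            # Averaging removes the dependence on interval duration.
            total += weight * sum(ratios) / len(ratios)
    return total / baseline  # removes the dependence on level size

print(response_intensity(
    deviation=[0, 1, 2, 3],
    consumption=[10, 12, 14, 16],
    lag_weights=[0.5, 0.5],
    baseline=10,
))
```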
[0079] The calculated consumption response intensity is used as the correlation weight and filled into the element at the corresponding row and column intersection in the cross-correlation matrix framework. This weight reflects how strongly a specific type of prediction deviation (concentrated or discrete) drives actual consumption at a resource level in a specific state (saturated or redundant). After all row and column intersections have been traversed and filled, a complete cross-correlation matrix is formed. The numerical distribution of this matrix reveals the coupling between the distribution pattern of the prediction deviation and the structural pattern of resource consumption: if the element for a concentrated deviation interval and a resource saturation level is high, the systematic deviation of the prediction directly leads to sustained resource tension at that level; if the element for a discrete deviation interval and a resource saturation level is high, sudden prediction fluctuations have triggered instantaneous overload of that level; if the element for a concentrated deviation interval and a resource redundancy level is low, the prediction deviation has failed to activate the mechanism for mobilizing redundant resources.
[0080] The construction of the cross-correlation matrix provides a quantitative basis for subsequent model reconstruction. Analyzing the distribution of high-weight elements in the matrix identifies which types of prediction deviation have the most significant impact on which resource levels, and thus which parts of the error boundary mapping logic need to be adjusted. Analyzing the distribution of low-weight elements identifies where the correlation between prediction deviation and resource consumption is weak, guiding the optimization of the reserve space mapping logic. The values of the matrix elements are used directly to calculate the weight coefficients of each mapping path in the allocation decision model. Combinations of deviation interval and resource level with high response intensity receive higher computing power reservation priority after model reconstruction, while combinations with low response intensity have their reservation quota reduced accordingly, so that the resource allocation strategy is adjusted on the basis of empirical data.
[0081] In practical applications, the construction of the cross-correlation matrix can be performed periodically. Within each update cycle, the latest prediction deviation data and resource consumption data are extracted, and interval labeling, hierarchical labeling, and response intensity calculations are recalculated to generate an updated cross-correlation matrix. The matrix for the current cycle is compared with the matrices from historical cycles using difference analysis. The evolution direction of the prediction deviation and resource consumption correlation pattern is determined by the changing trends of matrix element values, identifying long-term drift or periodic fluctuations in the system's operating status. When matrix element values show a continuous upward trend over multiple consecutive cycles, a deep diagnostic process for the corresponding deviation interval and resource level is triggered to investigate root causes such as hardware failures, network congestion, or changes in business load patterns. When matrix element values fluctuate drastically, the real-time monitoring frequency of the deviation interval and resource level combination is increased to improve the response speed of anomaly detection.
[0082] The numerical characteristics of the cross-correlation matrix can also be used to drive hierarchical management of computing power allocation strategies. Based on the response intensity ranges of different row and column combinations in the matrix, all combinations of deviation intervals and resource levels are divided into three levels: high-sensitivity, medium-sensitivity, and low-sensitivity. For high-sensitivity combinations, an automatic triggering mechanism is configured. When a prediction deviation of the corresponding type is detected, the computing power expansion or migration process for the corresponding resource level is immediately initiated without manual approval. For medium-sensitivity combinations, an early warning notification mechanism is set up to push the deviation detection results and resource level status to operations and maintenance personnel, who then make a manual decision on whether to execute resource allocation. For low-sensitivity combinations, only deviation and consumption data are recorded for post-event analysis, without triggering real-time allocation actions. This hierarchical management strategy ensures rapid processing of high-response scenarios while avoiding excessive intervention in weakly correlated scenarios, reducing system allocation overhead and stability risks.
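The three-level handling can be sketched as a threshold lookup followed by an action dispatch. The threshold values and action names below are assumptions made for illustration; the text only states that the levels are derived from response intensity ranges.

```python
from enum import Enum


class Sensitivity(Enum):
    HIGH = "high"
    MEDIUM = "medium"
    LOW = "low"


def sensitivity_for(intensity: float, high: float = 0.6, medium: float = 0.3) -> Sensitivity:
    """Map a response intensity to a sensitivity level (thresholds are assumed)."""
    if intensity >= high:
        return Sensitivity.HIGH
    if intensity >= medium:
        return Sensitivity.MEDIUM
    return Sensitivity.LOW


# Action per level, following the handling described in the paragraph above.
ACTIONS = {
    Sensitivity.HIGH: "auto_scale_or_migrate",  # executed without manual approval
    Sensitivity.MEDIUM: "notify_operators",     # operators decide manually
    Sensitivity.LOW: "log_only",                # recorded for post-event analysis
}


def action_for(intensity: float) -> str:
    """Return the handling action for one deviation-interval / resource-level pair."""
    return ACTIONS[sensitivity_for(intensity)]
```

Keeping the thresholds as parameters lets an operator tighten or relax the automatic path without touching the dispatch table.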
[0083] A second aspect of this invention provides a CDNAI service provision system with dynamic computing power allocation, comprising: a prediction unit, configured to acquire historical traffic data and current network status data within the target area, extract features from the historical traffic data based on time series analysis rules, and generate a traffic prediction result containing a predicted traffic value and a prediction time window; an allocation decision unit, configured to hierarchically identify the AI computing power resources of CDN nodes according to the traffic prediction result to obtain a computing power hierarchical structure, and to construct, based on the traffic prediction result and the computing power hierarchical structure, an allocation decision model that reflects the matching relationship between prediction uncertainty and computing power supply and demand, wherein the allocation decision model maps the prediction error boundary to computing power reserve space and generates computing power allocation instructions that include a computing power allocation scheme and an inter-layer migration strategy; and a feedback evolution unit, configured to allocate computing power resources on each CDN node based on the computing power allocation instructions, extract actual traffic data and computing power usage data from the post-allocation operating status to form feedback information, use the feedback information to establish a quantitative mapping relationship between prediction deviation and actual resource consumption, reconstruct the error boundary and reserve space mapping logic in the allocation decision model through the quantitative mapping relationship, and drive the prediction logic of the time series analysis rules to evolve adaptively.
[0084] A third aspect of the present invention provides an electronic device, comprising: a processor; and a memory for storing processor-executable instructions; wherein the processor is configured to invoke the instructions stored in the memory to execute the aforementioned method.
[0085] A fourth aspect of the present invention provides a computer-readable storage medium having stored thereon computer program instructions that, when executed by a processor, implement the aforementioned method.
[0086] This invention can be a method, apparatus, system, and / or computer program product. The computer program product may include a computer-readable storage medium having computer-readable program instructions loaded thereon for performing various aspects of the invention.
[0087] Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of the present invention, and not to limit them; although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that modifications can still be made to the technical solutions described in the foregoing embodiments, or equivalent substitutions can be made to some or all of the technical features; and these modifications or substitutions do not cause the essence of the corresponding technical solutions to deviate from the scope of the technical solutions of the embodiments of the present invention.
Claims
1. A method for providing CDNAI services with dynamic allocation of computing power, characterized in that the method comprises: acquiring historical traffic data and current network status data within the target area, and extracting features from the historical traffic data based on time series analysis rules to generate a traffic prediction result that includes a predicted traffic value and a prediction time window; hierarchically identifying the AI computing power resources of CDN nodes based on the traffic prediction result to obtain a computing power hierarchical structure, and constructing, based on the traffic prediction result and the computing power hierarchical structure, an allocation decision model that reflects the matching relationship between prediction uncertainty and computing power supply and demand, wherein the allocation decision model maps the prediction error boundary to computing power reserve space and generates computing power allocation instructions that include a computing power allocation scheme and an inter-layer migration strategy; allocating computing power resources to each CDN node based on the computing power allocation instructions, and extracting actual traffic data and computing power usage data from the post-allocation operating status to form feedback information; and using the feedback information to establish a quantitative mapping relationship between prediction deviation and actual resource consumption, reconstructing the error boundary and reserve space mapping logic in the allocation decision model through the quantitative mapping relationship, and driving the prediction logic of the time series analysis rules to evolve adaptively.
2. The method according to claim 1, characterized in that extracting features from the historical traffic data based on time series analysis rules to generate a traffic prediction result that includes a predicted traffic value and a prediction time window comprises: decomposing the historical traffic data into multiple scales along the time dimension to identify periodic fluctuation patterns and non-periodic mutation patterns; constructing a temporal evolution law representation based on the repetition interval of the periodic fluctuation patterns and the triggering conditions of the non-periodic mutation patterns; determining, based on the stability index of the periodic fluctuation patterns and the predictability index of the non-periodic mutation patterns, the effective range over which each pattern influences future traffic, thereby forming a pattern scope boundary; extrapolating the temporal evolution law representation over time based on the pattern scope boundary to generate a probability distribution of traffic changes within a future time period; and extracting the traffic value with the highest probability density from the probability distribution as the predicted traffic value, and mapping the pattern scope boundary to the effective time range of the predicted traffic value to obtain the prediction time window, thereby completing feature extraction and generating the traffic prediction result containing the predicted traffic value and the prediction time window.
3. The method according to claim 1, characterized in that hierarchically identifying the AI computing power resources of CDN nodes based on the traffic prediction result to obtain a computing power hierarchical structure, and constructing, based on the traffic prediction result and the computing power hierarchical structure, an allocation decision model that reflects the matching relationship between prediction uncertainty and computing power supply and demand, comprises: extracting the prediction confidence distribution and the change gradient of the predicted traffic value from the traffic prediction result; classifying the AI computing power resources of each CDN node into risk levels based on the dispersion of the prediction confidence distribution; labeling the computing power resources of the different risk levels with layer attributes according to the change gradient of the predicted traffic value, thereby forming a computing power hierarchical structure that includes a deterministic computing power layer and an uncertainty buffer layer; binding the capacity boundary of the deterministic computing power layer in the computing power hierarchical structure to the stable confidence interval of the prediction confidence distribution, and associating the capacity elasticity space of the uncertainty buffer layer with the fluctuating confidence interval of the prediction confidence distribution, thereby establishing a confidence-driven computing power supply and demand mapping relationship; and constructing, based on the computing power supply and demand mapping relationship, the allocation decision model that reflects the matching relationship between prediction uncertainty and computing power supply and demand.
4. The method according to claim 3, characterized in that binding the capacity boundary of the deterministic computing power layer in the computing power hierarchical structure to the stable confidence interval of the prediction confidence distribution, and associating the capacity elasticity space of the uncertainty buffer layer with the fluctuating confidence interval of the prediction confidence distribution, thereby establishing a confidence-driven computing power supply and demand mapping relationship, comprises: extracting a confidence variance sequence from the prediction confidence distribution, identifying a variance convergence segment and a variance divergence segment based on the trend of the confidence variance sequence, marking the confidence interval corresponding to the variance convergence segment as the stable confidence interval, and marking the confidence interval corresponding to the variance divergence segment as the fluctuating confidence interval; mapping the confidence boundary value of the stable confidence interval to a deterministic boundary of traffic demand, calculating the baseline load capacity required by the deterministic computing power layer based on the deterministic boundary, and setting the baseline load capacity as the capacity boundary of the deterministic computing power layer, thereby binding the capacity boundary to the stable confidence interval; mapping the confidence deviation of the fluctuating confidence interval to an uncertainty range of traffic demand, and calculating, based on the uncertainty range, the overload capacity variation space that the uncertainty buffer layer needs to absorb; and dividing the overload capacity variation space into an elastic expansion interval and an elastic contraction interval, which together constitute the capacity elasticity space of the uncertainty buffer layer, thereby establishing the confidence-driven computing power supply and demand mapping relationship.
5. The method according to claim 1, characterized in that allocating computing power resources to each CDN node based on the computing power allocation instructions, and extracting actual traffic data and computing power usage data from the post-allocation operating status to form feedback information, comprises: parsing the computing power allocation scheme and the inter-layer migration strategy in the computing power allocation instructions, determining the target computing power quotas of the deterministic computing power layer and the uncertainty buffer layer in each CDN node according to the computing power allocation scheme, and determining the time-series scheduling rules for transferring computing power resources between layers according to the inter-layer migration strategy; migrating the computing power resources of each CDN node between layers according to the time-series scheduling rules, and redistributing the migrated computing power resources according to the target computing power quotas; after the computing power resources are allocated, continuously monitoring the operating status of each CDN node, and extracting from the operating status actual traffic data reflecting the actual arrival of traffic and computing power usage data reflecting the actual consumption of computing power; and timestamping the actual traffic data and the computing power usage data to establish a correlation between traffic arrival time and computing power consumption time, and encapsulating, based on the correlation, the actual traffic data and the computing power usage data into feedback information containing time-series correlation features.
6. The method according to claim 1, characterized in that using the feedback information to establish a quantitative mapping relationship between prediction deviation and actual resource consumption, and reconstructing the error boundary and reserve space mapping logic in the allocation decision model through the quantitative mapping relationship, comprises: extracting the actual traffic data from the feedback information and the predicted traffic value from the traffic prediction result to calculate the distribution characteristics of the prediction deviation, and extracting the computing power usage data from the feedback information and the computing power quota of each layer in the computing power hierarchical structure to calculate the structural characteristics of actual resource consumption; identifying a deviation concentration interval and a deviation dispersion interval based on the distribution characteristics of the prediction deviation, identifying a resource saturation level and a resource redundancy level based on the structural characteristics of the actual resource consumption, constructing a cross-correlation matrix between the deviation intervals and the resource levels, and establishing the quantitative mapping relationship between prediction deviation and actual resource consumption from the correlation strengths in the cross-correlation matrix; and extracting the current error boundary and reserve space mapping logic from the allocation decision model, and, using the driving degree in the quantitative mapping relationship as the basis for reconstruction, adjusting the boundary range of the error boundary and the spatial scale of the reserve space in a coordinated manner to complete the reconstruction of the error boundary and reserve space mapping logic in the allocation decision model.
7. The method according to claim 6, characterized in that identifying a deviation concentration interval and a deviation dispersion interval based on the distribution characteristics of the prediction deviation, identifying a resource saturation level and a resource redundancy level based on the structural characteristics of the actual resource consumption, and constructing a cross-correlation matrix between the deviation intervals and the resource levels, comprises: calculating entropy values from the distribution characteristics of the prediction deviation, determining the degree of deviation clustering from the magnitude of the entropy values, marking time intervals whose entropy value is less than a preset entropy threshold as deviation concentration intervals, and marking time intervals whose entropy value is greater than the preset entropy threshold as deviation dispersion intervals; traversing the structural characteristics of the actual resource consumption level by level, calculating the resource utilization saturation of each level, and, by comparing the resource utilization saturation with a preset saturation benchmark, marking levels whose saturation exceeds the preset saturation benchmark as resource saturation levels and levels whose saturation does not reach the preset saturation benchmark as resource redundancy levels; initializing a cross-correlation matrix framework with the deviation concentration interval and the deviation dispersion interval as row indices and the resource saturation level and the resource redundancy level as column indices; calculating, for each row-column intersection in the cross-correlation matrix framework, the consumption response intensity of the corresponding resource level within the corresponding deviation interval; and filling the consumption response intensity into the corresponding position of the cross-correlation matrix framework as the correlation weight, thereby constructing the cross-correlation matrix between the deviation intervals and the resource levels.
8. A CDNAI service provision system with dynamic computing power allocation, used to implement the method according to any one of claims 1 to 7, characterized in that the system comprises: a prediction unit, configured to acquire historical traffic data and current network status data within the target area, extract features from the historical traffic data based on time series analysis rules, and generate a traffic prediction result containing a predicted traffic value and a prediction time window; an allocation decision unit, configured to hierarchically identify the AI computing power resources of CDN nodes according to the traffic prediction result to obtain a computing power hierarchical structure, and to construct, based on the traffic prediction result and the computing power hierarchical structure, an allocation decision model that reflects the matching relationship between prediction uncertainty and computing power supply and demand, wherein the allocation decision model maps the prediction error boundary to computing power reserve space and generates computing power allocation instructions that include a computing power allocation scheme and an inter-layer migration strategy; and a feedback evolution unit, configured to allocate computing power resources on each CDN node based on the computing power allocation instructions, extract actual traffic data and computing power usage data from the post-allocation operating status to form feedback information, use the feedback information to establish a quantitative mapping relationship between prediction deviation and actual resource consumption, reconstruct the error boundary and reserve space mapping logic in the allocation decision model through the quantitative mapping relationship, and drive the prediction logic of the time series analysis rules to evolve adaptively.
9. An electronic device, characterized in that it comprises: a processor; and a memory for storing processor-executable instructions; wherein the processor is configured to invoke the instructions stored in the memory to execute the method according to any one of claims 1 to 7.
10. A computer-readable storage medium having computer program instructions stored thereon, characterized in that, when the computer program instructions are executed by a processor, the method according to any one of claims 1 to 7 is implemented.