A method for estimating application execution latency in computation offloading
By establishing a CPU cycle count model and using online analysis and histogram techniques to estimate the task demand distribution, the method addresses latency estimation for compute-intensive tasks, supports efficient computation offloading decisions, and improves the utilization efficiency and energy savings of edge computing resources.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- GUANGZHOU RES INST OF XIAN UNIV OF ELECTRONIC SCI & TECH
- Filing Date
- 2023-03-15
- Publication Date
- 2026-06-19
AI Technical Summary
Existing technologies struggle to accurately estimate the execution latency of computationally intensive tasks, making it difficult for mobile devices to make effective computation offloading decisions under conditions of limited resources and latency sensitivity, thus affecting the trade-off between QoS and energy consumption.
By abstracting the application into a configuration file, a CPU cycle count model is established. Online analysis and histogram techniques are used to estimate the task demand distribution, obtain the cumulative probability distribution function, and calculate the minimum CPU cycle count to support computational offloading decisions.
It effectively reduces the computational complexity and resource consumption of computation offloading decisions, and improves the utilization efficiency and energy saving of edge computing resources.
Figure CN116414667B_ABST
Abstract
Description
Technical Field
[0001] This invention relates to the field of computation offloading technology, specifically to a method for estimating application execution latency during computation offloading.
Background Technology
[0002] Compute offloading is a key technology in edge computing. It aims to optimize the use of edge computing, storage, and communication resources under constraints such as limited resources, energy consumption, and network latency. Because user devices are resource-limited, applications can be offloaded to other devices for execution while still meeting latency and energy consumption requirements. Battery-powered mobile devices, such as mobile phones and laptops, are expected to become important devices for compute offloading applications. Tasks executed on a device can be divided into two types according to their function: compute-intensive tasks and I/O-intensive tasks. Compute-intensive tasks require high-performance computing devices to perform a large amount of computation over a significant amount of time, resulting in high system load and utilization, while their read/write I/O completes in a short time. Common use cases for compute-intensive tasks include image processing, big data, and artificial intelligence applications. Tasks involving network and disk I/O are I/O-intensive tasks, characterized by low CPU consumption (because I/O speed is much lower than CPU and memory speed). I/O-intensive tasks wait on input/output devices during processing; while they wait, the operating system scheduler puts the program into a sleep state, resulting in low CPU utilization. I/O-intensive tasks currently appear in scenarios such as web applications and large-scale system monitoring.
[0003] Compute-intensive tasks are better suited to computation offloading than I/O-intensive tasks for the following reasons:
[0004] (1) The processing time of I/O-intensive tasks usually depends on external systems, so the task completion time cannot be estimated accurately.
[0005] (2) Compute-intensive tasks typically consume a lot of CPU resources, resulting in high energy consumption on the device. Offloading them therefore yields a high benefit for devices that are resource-limited or time-sensitive.
[0006] For example, mobile devices need to improve QoS (Quality of Service) while simultaneously conserving energy; however, these two goals are at odds. For better QoS, system resources (such as the CPU) typically require higher performance but consume more energy. Conversely, for energy efficiency, system resources should consume less energy, resulting in lower performance. Therefore, mobile device operating systems need to manage resources in a QoS-aware and energy-efficient manner, providing the flexibility to balance QoS and energy. The rise of edge computing has made this trade-off possible in mobile devices, allowing applications to execute on the most suitable device with minimal system overhead. Given the resource limitations of user devices, improving the efficiency of edge computing resource utilization and reducing energy consumption has become crucial.
Summary of the Invention
[0007] To address the problems existing in the prior art, this invention proposes a method for estimating application execution latency during computation offloading, comprising the following steps:
[0008] The application for which a computation offloading decision is to be made is abstracted into a configuration file containing application characteristic parameters, and a CPU cycle count model is established.
[0009] The cumulative probability distribution function is obtained by estimating the demand distribution of application tasks.
[0010] The estimation result is calculated based on the cumulative probability distribution function.
[0011] Specifically, estimating the demand distribution of application tasks to obtain the cumulative probability distribution function includes analyzing the task cycles and obtaining the cumulative probability distribution function; the task cycle analysis adopts an online analysis method; the cumulative probability distribution function is obtained by using histogram techniques to estimate the probability of the number of cycles used by each subtask.
[0012] Further, the configuration file is represented as A(L,W,T), where L is the amount of input data for the application's computation offloading, W is the number of CPU cycles required for the application to execute, and T is the deadline for completing the execution of the application.
[0013] Furthermore, the CPU cycle count model is expressed as W = LX, where X is a random variable with an empirical distribution.
[0014] Furthermore, analyzing the task cycles specifically includes:
[0015] Divide the application task into multiple sub-tasks;
[0016] Add a loop counter to the process control block of each of the subtasks;
[0017] During the execution of the subtask, the loop counter monitors and records the number of CPU cycles used by each subtask.
[0018] Furthermore, obtaining the cumulative probability distribution function specifically includes:
[0019] Use an analysis window to track the number of cycles used by the n subtasks, where n is a positive integer;
[0020] Create a histogram based on the number of cycles used by the subtasks;
[0021] The cumulative probability distribution function of the application execution cycle is obtained by approximating the histogram.
[0022] Furthermore, creating a histogram based on the number of cycles used by the subtask includes:
[0023] $C_{\min}$ and $C_{\max}$ are respectively set to the minimum and maximum number of cycles used by the n subtasks;
[0024] The range of cycle counts $[C_{\min}, C_{\max}]$ is divided into t groups of equal size, and $(a_{i-1}, a_i]$ denotes the i-th group, where i is a positive integer, $a_{i-1}$ is the minimum number of cycles in the i-th group, and $a_i$ is the maximum number of cycles in the i-th group;
[0025] Based on the grouping, the probability that the number of cycles used by a subtask does not exceed $a_i$ is calculated, and a histogram is created from these probabilities and the number of cycles used by the n subtasks.
[0026] Furthermore, the step of calculating the estimation result based on the cumulative probability distribution function specifically includes:
[0027] Obtain the input data volume and statistical performance requirements for meeting the deadline;
[0028] The estimation result is calculated based on the cumulative probability distribution function, statistical performance requirement parameters, and input data volume; the estimation result represents the minimum number of CPU cycles required for an application to execute if it meets the statistical performance requirement parameters.
[0029] Furthermore, the method is implemented through a decision-making procedure; the steps of the decision-making procedure include:
[0030] Program segmentation: The application to be processed is divided into multiple sub-tasks.
[0031] Add a counter: Add a loop counter to the process control block of each of the subtasks. The loop counter is used to monitor and record the start, exit and end times of each of the subtasks.
[0032] Analysis window tracking: Use an analysis window to track the start, exit, and end times of each subtask, and calculate the number of CPU cycles consumed by each subtask accordingly.
[0033] Probability estimation: Based on the number of CPU cycles consumed by the subtasks, a histogram and cumulative probability distribution function are established to find the minimum number of CPU cycles required for the application to execute, which meets the statistical performance requirements.
[0034] Calculate the offloading decision: Calculate the computation offloading decision based on the minimum number of CPU cycles.
[0035] Compared with the prior art, the beneficial effects of the present invention are as follows:
[0036] This invention establishes a model of the number of execution cycles required by describing the parameters of the application, and uses online analysis and histogram techniques to estimate, through probabilistic statistical methods, the number of CPU cycles required by the application, providing effective support for computation offloading decisions;
[0037] The online analysis and histogram techniques employed in this invention have low computational complexity, effectively saving the computational resources and energy required for computation offloading decisions on resource-limited edge devices, thereby improving the utilization efficiency of edge computing resources.
Attached Figure Description
[0038] To more clearly illustrate the technical solutions in the embodiments of the present invention or the prior art, the drawings used in the description of the embodiments or the prior art will be briefly introduced below. Obviously, the drawings described below are only some embodiments of the present invention. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.
[0039] Figure 1 is a flowchart of the method for estimating application execution latency during computation offloading according to the present invention;
[0040] Figure 2 is a schematic diagram of some monitoring results of the loop counter in this invention;
[0041] Figure 3 is the cumulative probability distribution histogram of the number of cycles in this invention;
[0042] Figure 4 is the cumulative probability distribution histogram of the number of cycles, as used when looking up the number of cycles for a given cumulative probability in this invention;
[0043] Figure 5 is a flowchart illustrating the decision-making procedure of the present invention.
Detailed Implementation
[0044] The technical solution of the present invention will be clearly and completely described below with reference to the embodiments. Obviously, the described embodiments are only some embodiments of the present invention, and not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those of ordinary skill in the art without creative effort are within the scope of protection of the present invention.
[0045] Resource-constrained mobile devices or low-power devices are typically designed to support multiple power modes, trading performance for energy efficiency; they can also change speed (i.e., frequency and corresponding voltage) and power while the processor is running. Dynamic voltage/frequency scaling (DVS) is a common mechanism for saving energy by reducing CPU speed, with the main goal of minimizing energy consumption without reducing application performance. The effectiveness of DVS depends on the ability to predict application CPU demand, and compute offloading also needs to predict application CPU demand. Overestimation wastes CPU and energy, while underestimation reduces application performance. Generally, there are three methods for predicting application CPU demand: (1) monitoring the CPU utilization of all applications; (2) using the worst-case CPU demand of a single application; and (3) monitoring the runtime CPU usage of each application. Because CPU demand changes dynamically, the first two methods are not suitable for compute-intensive tasks: the first may violate their latency constraints, while the second is too conservative. Therefore, this solution adopts the third method and integrates it into compute offloading scheduling.
[0046] The invention is further described below with reference to the accompanying drawings, but this does not limit the scope of the invention in any way.
[0047] Referring to Figure 1, this invention proposes a method for estimating application execution latency during computation offloading, specifically including the following steps:
[0048] S1. Abstract the application for which a computation offloading decision is to be made into a configuration file containing application characteristic parameters, and establish a CPU cycle count model.
[0049] When making a computation offloading decision, the execution time of the application being offloaded must be known. The application's execution latency is related to the device's CPU frequency and the number of CPU cycles required for the application's execution. A model that fully describes all aspects of the application is complex and must consider many details. However, such a level of detail can make the problem mathematically intractable without being meaningfully helpful for engineering practice. To make the model feasible for engineering use, the application for which the offloading decision is made is abstracted into a profile with three parameters. The application profile is denoted as A(L,W,T), which includes the amount of input data L (in bits), the number of CPU cycles W, and the deadline T. Specifically, L is the amount of input data for computation offloading, W is the number of CPU cycles required for the application to execute, and T is the deadline for completing the application execution. When the application executes on the device, energy consumption is determined by the CPU workload. The workload is measured by the number of CPU cycles W required by the application, which depends on the amount of input data L and the complexity of the algorithm. In the method of this invention, the number of CPU cycles is modeled as a random variable W = LX, where X is a random variable with an empirical distribution.
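As an illustration only, the profile A(L,W,T) and the model W = LX could be sketched in Python as follows. The class and function names are hypothetical, and treating X as "cycles per input bit" is an assumption inferred from L being measured in bits; the sample values are made up.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class AppProfile:
    """Profile A(L, W, T); the field names are illustrative, not from the source."""
    input_bits: int    # L: input data volume in bits
    cpu_cycles: float  # W: CPU cycles required for execution
    deadline_s: float  # T: completion deadline in seconds


def sample_cycles(input_bits, x_samples, rng):
    """Draw one realisation of W = L * X, with X taken from empirical samples."""
    x = rng.choice(x_samples)  # assumed unit: CPU cycles per input bit
    return input_bits * x


rng = random.Random(0)
x_samples = [90.0, 100.0, 110.0]  # hypothetical observations of X
w = sample_cycles(8_000, x_samples, rng)
profile = AppProfile(input_bits=8_000, cpu_cycles=w, deadline_s=0.01)
```

Because X is random, W is a distribution rather than a single number, which is why the following steps estimate a distribution instead of a point value.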
[0050] S2. Estimate the demand distribution of application tasks to obtain the cumulative probability distribution function.
[0051] It should be noted that the method of this invention estimates the demand distribution of application tasks, rather than the instantaneous demand of application tasks, for two reasons: first, the demand distribution is more stable and predictable than the instantaneous demand; second, estimating the number of CPU cycles based on the demand distribution of tasks provides statistical performance guarantees, which is sufficient for compute-intensive applications.
[0052] Estimating the demand distribution of application tasks to obtain the cumulative probability distribution function includes analyzing the task cycle and obtaining the cumulative probability distribution function.
[0053] Specifically, the task cycle analysis employs an online analysis method. Online analysis is a data analysis technique primarily used for quickly querying, analyzing, and summarizing large amounts of data, presenting it to users in an interactive manner to help them quickly understand trends, correlations, and anomalies in the data, thereby supporting decision-making and business processes. In an operating system, a process is an instance of a program in execution, and its static description consists of three parts: a process control block, a program segment, and a dataset. In one embodiment, the application task is divided into multiple subtasks, and a loop counter is added to the process control block of each subtask. During the execution of a subtask, the loop counter monitors and records the number of CPU cycles used; the sum of these CPU cycles is the total number of CPU cycles executed by the application. As shown in Figure 2, in each subtask the loop counter monitors the start, exit, and end times of the program execution. The i-th task starts at time c1, exits after c2-c1 cycles, restarts at c3, and completes at c4. Finally, the number of cycles used by the i-th task is calculated as (c2-c1) + (c4-c3).
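The accounting in the c1 to c4 example can be sketched as below. This is a minimal user-space illustration, not the kernel-level counter the text describes: the `CycleCounter` class, its method names, and the timestamp values are all hypothetical, and it assumes that a subtask's cycle usage is the sum of its running intervals.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CycleCounter:
    """Hypothetical per-subtask counter, standing in for a field of the process control block."""
    used_cycles: int = 0
    resumed_at: Optional[int] = None  # cycle timestamp of the last start/restart

    def start(self, now):
        """Record that the subtask started or resumed at cycle timestamp `now`."""
        self.resumed_at = now

    def stop(self, now):
        """Record that the subtask exited or finished; add the elapsed interval."""
        if self.resumed_at is None:
            raise RuntimeError("stop() called while the subtask was not running")
        self.used_cycles += now - self.resumed_at
        self.resumed_at = None


counter = CycleCounter()
counter.start(1_000)  # c1: subtask starts
counter.stop(1_400)   # c2: subtask exits (preempted)
counter.start(2_000)  # c3: subtask restarts
counter.stop(2_250)   # c4: subtask completes
total = counter.used_cycles  # (c2 - c1) + (c4 - c3)
```

Cycles that elapse between c2 and c3 are not charged to the subtask, since it was not running during that interval.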
[0054] The cumulative probability distribution function is obtained by using a histogram technique to estimate the probability of the number of cycles used by each subtask. Specifically, an analysis window tracks the number of cycles used by n subtasks of the application, where the parameter n is specified by the user or set to a default value. Let $C_{\min}$ and $C_{\max}$ be the minimum and maximum number of cycles in the window, respectively. The range $[C_{\min}, C_{\max}]$ is divided into t groups of equal size:
[0055] $C_{\min} = a_0 < a_1 < \dots < a_{t-1} < a_t = C_{\max}$
[0056] Let $n_i$ be the number of cycle counts that fall into the i-th group $(a_{i-1}, a_i]$, where i is a positive integer, $a_{i-1}$ is the minimum number of cycles in the i-th group, and $a_i$ is the maximum number of cycles in the i-th group. It can be understood that $n_i / n$ is the probability that the number of cycles used in execution lies within $(a_{i-1}, a_i]$, and $\sum_{j=1}^{i} n_j / n$ is the probability that the number of cycles used by a subtask does not exceed $a_i$. This is the cumulative sum of the probabilities of the first i intervals, which forms a histogram such as the one shown in Figure 3. From a probabilistic perspective, the histogram approximates the cumulative probability distribution function of the application's execution cycles:
[0057] F(x) = Pr(X≤x)
[0058] Where X is the random variable with the above empirical distribution, and Pr(X≤x) represents the probability that the random variable X is less than or equal to x.
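The histogram construction above can be sketched as follows. This is an illustrative Python version under stated assumptions: the function name `histogram_cdf`, the use of a bounded `deque` as the analysis window, and the sample cycle counts are all hypothetical, and a sample equal to C_min is placed in the first group.

```python
import math
from collections import deque


def histogram_cdf(cycle_counts, t):
    """Return (boundaries a_0..a_t, cumulative probabilities F(a_1)..F(a_t))."""
    counts = list(cycle_counts)
    n = len(counts)
    if n == 0 or t < 1:
        raise ValueError("need at least one sample and one group")
    c_min, c_max = min(counts), max(counts)
    if c_min == c_max:
        # Degenerate window: every subtask used the same number of cycles.
        return [c_min, c_max], [1.0]
    width = (c_max - c_min) / t
    boundaries = [c_min + i * width for i in range(t)] + [c_max]
    per_group = [0] * t
    for c in counts:
        # Group i covers (a_{i-1}, a_i]; clamp so that C_min lands in group 1.
        idx = math.ceil((c - c_min) / width) - 1
        per_group[min(max(idx, 0), t - 1)] += 1
    cdf, running = [], 0
    for n_i in per_group:
        running += n_i
        cdf.append(running / n)  # sum_{j<=i} n_j / n
    return boundaries, cdf


# Analysis window over the n = 8 most recent subtasks; the oldest sample is evicted.
window = deque(maxlen=8)
window.extend([500, 100, 120, 130, 150, 160, 170, 190, 200])
boundaries, cdf = histogram_cdf(window, t=4)
```

With these made-up samples, the window drops the stale value 500, the boundaries are 100, 125, 150, 175, 200, and each of the four groups holds two samples, so the cumulative probabilities are 0.25, 0.5, 0.75, and 1.0.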
[0059] Existing techniques typically employ gamma distribution modeling to estimate the cumulative probability distribution function of application task demands. The probability density function (PDF) of a gamma distribution is determined by its shape and scale parameters. However, accurately calculating these shape and scale parameters is computationally complex, consuming significant computational resources and energy on resource-constrained edge devices. In contrast, histogram-based estimation of the cumulative probability distribution function of application task demands eliminates the need to fit shape and scale parameters. Furthermore, the histogram technique readily reflects changes in application demand in real time without consuming substantial resources.
[0060] S3. Calculate the estimation result based on the cumulative probability distribution function.
[0061] In computation offloading, the deadline T represents the time requirement for executing the offloaded program on the device after offloading. Because applications must meet soft real-time performance requirements, a statistical performance requirement parameter ρ is used to represent the probability of meeting the application execution deadline T. Typically, application developers or users can specify ρ based on application characteristics or user preferences. Further, let $W_\rho$ be the number of CPU cycles allocated to each application such that it meets the deadline T with probability ρ; the parameter ρ is therefore also called the ACP (Application Completion Probability). Until the application reaches its deadline T, the processor executes the offloaded program at maximum performance. Based on the cumulative probability distribution function, the probability that the application task does not exceed the allocated $W_\rho$ cycles is at least ρ, expressed as:
[0062] $F_W(W_\rho) = \Pr(W \le W_\rho) \ge \rho$
[0063] Using the above formula, the number of CPU cycles required for the program to run under a given parameter ρ can be obtained as follows:
[0064] $W_\rho = F_W^{-1}(\rho)$
[0065] where $F_W^{-1}(\rho)$ is the inverse function of $F_W(W_\rho)$, and $W_\rho$ is the number of cycles required to run when the parameter ρ is given, with the input data measured in bits.
[0066] To find the minimum number of CPU cycles $W_\rho$ required for the application to execute when the input data size is L, histogram techniques are used to estimate and find the minimum number of cycles $W_\rho$ whose cumulative probability is at least ρ. Referring to Figure 4, when the cumulative probability is ρ, the minimum number of CPU cycles required for the corresponding application to execute is $a_m$. That is, $a_m$ satisfies the following formula:
[0067] $F_W(a_m) = \Pr(X \le a_m) \ge \rho$.
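The lookup of $a_m$ can be sketched as a search over the histogram's cumulative probabilities. The function name `min_cycles_for_acp` and the boundary and probability values below are hypothetical; the sketch assumes the cumulative probabilities are non-decreasing and end at 1.0, as they do when built from a histogram.

```python
from bisect import bisect_left


def min_cycles_for_acp(boundaries, cdf, rho):
    """Return the smallest boundary a_m whose cumulative probability is >= rho.

    boundaries holds a_0..a_t and cdf holds F(a_1)..F(a_t) in the same order.
    """
    if not 0.0 < rho <= 1.0:
        raise ValueError("rho must lie in (0, 1]")
    if len(boundaries) != len(cdf) + 1:
        raise ValueError("expected one more boundary than cumulative probabilities")
    # First index m with cdf[m] >= rho; the guard covers rounding in the last entry.
    m = min(bisect_left(cdf, rho), len(cdf) - 1)
    return boundaries[m + 1]


# Hypothetical histogram: four equal-width groups over [100, 200] cycles.
boundaries = [100.0, 125.0, 150.0, 175.0, 200.0]
cdf = [0.25, 0.5, 0.75, 1.0]
w_rho = min_cycles_for_acp(boundaries, cdf, rho=0.7)
```

Here the first group boundary whose cumulative probability reaches 0.7 is 175, so 175 cycles would be allocated. A larger ρ gives a more conservative (larger) allocation, up to $C_{\max}$ for ρ = 1.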
[0068] In one embodiment, the method provided by the present invention is implemented through a decision-making procedure.
[0069] Referring to Figure 5, the decision-making procedure includes the following steps:
[0070] Program segmentation: The application to be processed is divided into multiple subtasks, while maintaining the functional integrity of each subtask during segmentation, so as to facilitate computation offloading.
[0071] Add a counter: Add a loop counter to the process control block of each subtask. This counter will monitor and record the start, exit, and end times of each subtask.
[0072] Analysis window tracking: Use an analysis window to track the start, exit, and end times of each subtask, and calculate the number of CPU cycles consumed by each subtask accordingly.
[0073] Probability estimation: Construct a histogram of the number of CPU cycles used for each subtask and a cumulative probability distribution function of the application execution to find the minimum number of CPU cycles required for the application to execute, which meets the statistical performance requirements.
[0074] Calculate the offloading decision: Calculate the computation offloading decision based on the minimum number of CPU cycles required for the application to execute.
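The text does not specify the rule used in the final step, so the following is only one possible sketch. It assumes that local execution latency is the minimum cycle count divided by the CPU frequency, and that the application is offloaded when local execution would miss the deadline T; the function names, frequencies, and cycle counts are hypothetical, and transmission delay and energy are ignored.

```python
def local_latency_s(min_cycles, cpu_hz):
    """Assumed relation: execution latency = required CPU cycles / CPU frequency."""
    if cpu_hz <= 0:
        raise ValueError("cpu_hz must be positive")
    return min_cycles / cpu_hz


def offload_decision(min_cycles, local_cpu_hz, deadline_s):
    """Hypothetical rule: offload only if local execution would miss the deadline."""
    if local_latency_s(min_cycles, local_cpu_hz) > deadline_s:
        return "offload"
    return "local"


# 175 million cycles against a 0.1 s deadline on two hypothetical devices.
decision_fast = offload_decision(175e6, local_cpu_hz=2e9, deadline_s=0.1)
decision_slow = offload_decision(175e6, local_cpu_hz=1e9, deadline_s=0.1)
```

On the 2 GHz device the estimated latency is 0.0875 s, which meets the deadline, so the task stays local; on the 1 GHz device it is 0.175 s, so the task would be offloaded under this rule.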
[0075] This invention aims to solve the difficulty of obtaining the number of CPU cycles required for application execution when making computation offloading decisions. It establishes a model of the number of execution cycles required by describing the parameters of the application, and uses online analysis and histogram techniques to estimate, through probabilistic statistical methods, the number of CPU cycles required by the application, thus providing effective support for computation offloading decisions.
[0076] The above description is merely an example and illustration of the structure of the present invention. Those skilled in the art can make various modifications or additions to the specific embodiments described, or use similar methods to replace them, as long as they do not deviate from the structure of the invention or exceed the scope defined in the claims, all of which should fall within the protection scope of the present invention.
Claims
1. A method for estimating application execution latency during computation offloading, characterized by comprising the following steps:
abstracting the application for which a computation offloading decision is to be made into a configuration file containing application characteristic parameters, and establishing a CPU cycle count model;
estimating the demand distribution of application tasks to obtain a cumulative probability distribution function;
calculating the estimation result based on the cumulative probability distribution function;
wherein estimating the demand distribution of application tasks to obtain the cumulative probability distribution function includes analyzing the task cycles and obtaining the cumulative probability distribution function; the task cycle analysis adopts an online analysis method; the cumulative probability distribution function is obtained by using histogram techniques to estimate the probability of the number of cycles used by each subtask;
the configuration file is represented as A(L,W,T), where L is the amount of input data for the application's computation offloading, W is the number of CPU cycles required for the application to execute, and T is the deadline for completing the application execution;
the CPU cycle count model is expressed as W = LX, where X is a random variable with an empirical distribution;
analyzing the task cycles specifically includes: dividing the application task into multiple subtasks; adding a loop counter to the process control block of each of the subtasks; during the execution of a subtask, monitoring and recording, by the loop counter, the number of CPU cycles used by each subtask;
obtaining the cumulative probability distribution function specifically includes: using an analysis window to track the number of cycles used by the n subtasks, where n is a positive integer; creating a histogram based on the number of cycles used by the subtasks; obtaining the cumulative probability distribution function of the application execution cycles by approximation from the histogram;
calculating the estimation result based on the cumulative probability distribution function specifically includes: obtaining the input data volume and the statistical performance requirement parameter for meeting the deadline; calculating the estimation result based on the cumulative probability distribution function, the statistical performance requirement parameter, and the input data volume; the estimation result represents the minimum number of CPU cycles required for the application to execute while meeting the statistical performance requirement parameter.
2. The method according to claim 1, characterized in that the step of creating a histogram based on the number of cycles used by the subtasks includes: setting $C_{\min}$ and $C_{\max}$ respectively to the minimum and maximum number of cycles used by the n subtasks; dividing the range of cycle counts $[C_{\min}, C_{\max}]$ into t groups of equal size, with $(a_{i-1}, a_i]$ denoting the i-th group, where i is a positive integer, $a_{i-1}$ is the minimum number of cycles in the i-th group, and $a_i$ is the maximum number of cycles in the i-th group; calculating, based on the grouping, the probability that the number of cycles used by a subtask does not exceed $a_i$, and creating a histogram from these probabilities and the number of cycles used by the n subtasks.
3. The method according to claim 1, characterized in that the method is executed through a decision-making procedure, and the execution steps of the decision-making procedure include:
program segmentation: dividing the application to be processed into multiple subtasks;
adding a counter: adding a loop counter to the process control block of each of the subtasks, the loop counter being used to monitor and record the start, exit, and end times of each of the subtasks;
analysis window tracking: using an analysis window to track the start, exit, and end times of each subtask, and calculating the number of CPU cycles consumed by each subtask accordingly;
probability estimation: based on the number of CPU cycles consumed by the subtasks, establishing a histogram and a cumulative probability distribution function to find the minimum number of CPU cycles required for the application to execute while meeting the statistical performance requirements;
calculating the offloading decision: calculating the computation offloading decision based on the minimum number of CPU cycles.