A method and system for dynamic task scheduling on a heterogeneous computing platform for unmanned aerial vehicles (UAVs)

By acquiring the resource status parameters of heterogeneous computing units in real time, dynamically allocating tasks based on multi-dimensional resource adaptation coefficients, and matching preset operator templates, the problem of task scheduling and execution being disconnected in the dynamic flight environment of UAV heterogeneous computing platforms is solved, thereby improving task processing efficiency and real-time performance.

CN122309059A · Pending · Publication Date: 2026-06-30 · JINAN GOLDENWORLD HIGHWAY INDUSTRY DEVELOPMENT CO LTD

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
JINAN GOLDENWORLD HIGHWAY INDUSTRY DEVELOPMENT CO LTD
Filing Date
2026-02-28
Publication Date
2026-06-30

AI Technical Summary

Technical Problem

Existing heterogeneous computing platforms for unmanned aerial vehicles (UAVs) cannot flexibly schedule tasks in dynamic flight environments, resulting in low task processing efficiency. Furthermore, task scheduling and execution optimization are disconnected, failing to meet real-time requirements.

Method used

By acquiring resource status parameters of heterogeneous computing units in real time, tasks are dynamically allocated based on multi-dimensional resource adaptation coefficients and matched with preset operator templates to achieve pre-optimization of tasks and maximize hardware utilization.

Benefits of technology

It improves the task processing efficiency of the UAV heterogeneous computing platform, enhances its robustness and task execution reliability in dynamic environments, and meets the real-time requirements of dynamic tasks.

✦ Generated by Eureka AI based on patent content.

Smart Images

  • Figure CN122309059A_ABST
Patent Text Reader

Abstract

This application provides a dynamic task scheduling method and system for a heterogeneous computing platform for unmanned aerial vehicles (UAVs), belonging to the field of electronic digital data processing technology. The method can acquire in real time heterogeneous resource status parameter sets of at least two different types of heterogeneous computing units onboard the UAV; based on each heterogeneous resource status parameter set and each task to be executed, it determines the multi-dimensional resource adaptation coefficients corresponding to each heterogeneous computing unit; wherein, the multi-dimensional resource adaptation coefficients are used to characterize the degree of resource adaptation when each heterogeneous computing unit executes the same task. Based on each multi-dimensional resource adaptation coefficient and the heterogeneous resource status parameter set, each task to be executed is dynamically allocated to the corresponding target heterogeneous computing unit; based on the task allocation result and a preset set of operator templates, it matches the operator template corresponding to the task to be executed, so that the target heterogeneous computing unit executes the task according to the matched operator template.

Description

Technical Field

[0001] This application relates to the field of electronic digital data processing technology, and in particular to a dynamic task scheduling method and system for a heterogeneous computing platform for unmanned aerial vehicles (UAVs).

Background Technology

[0002] With the widespread application of drones in fields such as power line inspection, environmental monitoring, and emergency rescue, edge computing scenarios are seeing the emergence of various computing tasks with significantly different computational loads. These include perception tasks such as high-definition image recognition and multi-source sensor data fusion, as well as decision-making tasks such as path planning and obstacle avoidance. Existing drone-borne heterogeneous computing platforms are designed to accommodate the diverse needs of these different tasks.

[0003] However, existing heterogeneous computing scheduling technologies for UAVs still suffer from limitations in flexible and dynamic task scheduling in real-world dynamic flight environments, impacting task processing efficiency. Current technologies often rely on querying the resource utilization of heterogeneous computing platforms for task scheduling, failing to accurately perceive the true state of these resources. This can lead to insufficient precision in scheduling decisions, and using a single factor for task scheduling can result in overly simplistic task allocation. Furthermore, existing technologies often employ a linear model of allocation followed by optimization, so that scheduling and execution optimization remain disconnected and cannot meet the real-time requirements of dynamic UAV missions.

Summary of the Invention

[0004] To address the aforementioned issues, this application provides a method and system for dynamic task scheduling on a heterogeneous computing platform for unmanned aerial vehicles (UAVs), enabling the UAV heterogeneous computing platform to flexibly and dynamically schedule computing tasks and improve task processing efficiency.

[0005] In a first aspect, embodiments of this application provide a dynamic task scheduling method for a heterogeneous computing platform for unmanned aerial vehicles (UAVs), the method comprising: Real-time acquisition of heterogeneous resource status parameter sets of at least two different types of heterogeneous computing units on the UAV; Based on the heterogeneous resource state parameter groups and the tasks to be executed, the multi-dimensional resource adaptation coefficients corresponding to each heterogeneous computing unit are determined; wherein, the multi-dimensional resource adaptation coefficients are used to characterize the degree of resource adaptation when each heterogeneous computing unit executes the same task to be executed. Based on the multi-dimensional resource adaptation coefficients and the heterogeneous resource status parameter group, each of the tasks to be executed is dynamically allocated to the corresponding target heterogeneous computing unit. Based on the task allocation results and the preset operator template set, the operator template corresponding to the task to be executed is matched, so that the target heterogeneous computing unit executes the task to be executed according to the matched operator template.

[0006] In one implementation of this application, based on the heterogeneous resource state parameter groups and the tasks to be executed, the multi-dimensional resource adaptation coefficients corresponding to each heterogeneous computing unit are determined, specifically including: based on each of the tasks to be executed, a corresponding multi-dimensional feature vector is generated; wherein the multi-dimensional feature vector includes the following dimensions: parallelism, allowed execution time, accuracy requirement, and task computation amount; based on the preset resource adaptation weight set corresponding to the heterogeneous computing unit and the heterogeneous resource state parameter group, a weighted calculation is performed on the multi-dimensional feature vector, and the multi-dimensional resource adaptation coefficient is determined from the weighted calculation result.

[0007] In one implementation of this application, the heterogeneous resource status parameter set includes at least preset parallel processing capability parameters, preset latency performance parameters, current utilization parameters, preset precision support parameters, and real-time power consumption parameters; based on the preset resource adaptation weight set corresponding to the heterogeneous computing unit and the heterogeneous resource status parameter set, a weighted calculation is performed on the multi-dimensional feature vector, and the multi-dimensional resource adaptation coefficient is determined from the weighted calculation result, specifically including: the first coefficient is determined based on the product of the preset parallel processing capability parameter and the parallelism feature value in the multi-dimensional feature vector; the second coefficient is determined based on the ratio of the preset latency performance parameter to the allowed execution time feature value in the multi-dimensional feature vector; the third coefficient, corresponding to the first resource adaptation item, is calculated based on the first weight in the preset resource adaptation weight set, the first coefficient, the second coefficient, and the current utilization parameter; the fourth coefficient is determined based on the product of the preset precision support parameter and the accuracy requirement feature value in the multi-dimensional feature vector; the fifth coefficient, corresponding to the second resource adaptation item, is calculated based on the second weight in the preset resource adaptation weight set, the fourth coefficient, and the real-time power consumption parameter; and the multi-dimensional resource adaptation coefficient is determined based on the sum of the third coefficient and the fifth coefficient.

[0008] In one implementation of this application, based on the multi-dimensional resource adaptation coefficients and the heterogeneous resource status parameter group, each task to be executed is dynamically allocated to a corresponding target heterogeneous computing unit, specifically including: Based on the multi-dimensional resource adaptation coefficients, a resource adaptation coefficient matrix corresponding to the task to be executed is generated; Based on the resource adaptation coefficient matrix and preset rules, the target heterogeneous computing unit corresponding to the task to be executed is iteratively selected and allocated, and the virtual state parameters corresponding to each heterogeneous computing unit are updated after each allocation. Based on the updated virtual state parameters, the remaining computing power of the target heterogeneous computing unit is determined. When the remaining computing power is less than the task computing amount of any unassigned task to be executed, the corresponding unassigned task to be executed is split into subtasks, and the split subtasks are assigned to the corresponding target heterogeneous computing unit. The scheduling priority of each split subtask is greater than the scheduling priority of the newly arrived task to be executed.
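The allocation loop described in this paragraph can be illustrated with a minimal Python sketch. It rests on several assumptions that are not in the source: the "preset rule" is taken to be "prefer the unit with the highest adaptation coefficient", remaining computing power and task computation amount share one unit, and the names `Unit`, `Task`, and `allocate` are hypothetical. The priority rule for split subtasks is not modeled.

```python
from dataclasses import dataclass


@dataclass
class Unit:
    name: str
    remaining: float  # virtual remaining computing power (assumed unit: GOPS)


@dataclass
class Task:
    name: str
    load: float  # task computation amount (same assumed unit)


def allocate(tasks, units, coeff):
    """Greedy allocation sketch.

    coeff maps (task name, unit name) to an adaptation coefficient.
    Returns a list of (task name, unit name, assigned load) tuples.
    """
    assignments = []
    for task in tasks:
        remaining_load = task.load
        # Assumed preset rule: try units in descending coefficient order.
        ranked = sorted(units, key=lambda u: coeff[(task.name, u.name)], reverse=True)
        for unit in ranked:
            if remaining_load <= 0:
                break
            if unit.remaining <= 0:
                continue
            # If the unit cannot hold the whole task, only a subtask is placed
            # here and the rest is carried over to the next-ranked unit.
            share = min(remaining_load, unit.remaining)
            unit.remaining -= share  # update the virtual state after each allocation
            remaining_load -= share
            assignments.append((task.name, unit.name, share))
    return assignments


units = [Unit("GPU", 100.0), Unit("FPGA", 50.0)]
tasks = [Task("detect", 80.0), Task("fuse", 40.0)]
coeff = {
    ("detect", "GPU"): 0.9, ("detect", "FPGA"): 0.5,
    ("fuse", "GPU"): 0.7, ("fuse", "FPGA"): 0.6,
}
result = allocate(tasks, units, coeff)
```

In this run the second task does not fit into what is left of the preferred unit, so it is split across two units.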

[0009] In one implementation of this application, the method further includes: Update the multi-dimensional resource adaptation coefficients corresponding to each of the tasks to be executed according to the preset resource adaptation period; When the decrease value of the multi-dimensional resource adaptation coefficient after task allocation is greater than a preset threshold, or when the remaining computing power of the target heterogeneous computing unit is less than the task computing amount of the allocated task to be executed, a task migration prompt message is generated to trigger task migration; wherein, the task migration includes at least migrating the remaining subtasks to other target heterogeneous computing units selected again according to the preset rules after completing the currently executing subtask.
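The two migration conditions in this paragraph reduce to a simple predicate. The following Python sketch is illustrative only; the function name `should_migrate` and its parameter names are hypothetical, and the threshold value is chosen arbitrarily.

```python
def should_migrate(coeff_at_allocation, coeff_now, drop_threshold,
                   remaining_power, task_load):
    """Return True when either migration condition from the text holds."""
    # Condition 1: the adaptation coefficient dropped by more than the threshold.
    coefficient_drop = coeff_at_allocation - coeff_now
    # Condition 2: the unit can no longer cover the allocated task's computation amount.
    return coefficient_drop > drop_threshold or remaining_power < task_load


# Hypothetical values for illustration.
large_drop = should_migrate(0.9, 0.5, 0.3, remaining_power=100.0, task_load=50.0)
stable = should_migrate(0.9, 0.8, 0.3, remaining_power=100.0, task_load=50.0)
starved = should_migrate(0.9, 0.8, 0.3, remaining_power=40.0, task_load=50.0)
```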

[0010] In one implementation of this application, based on the task allocation result and a preset set of operator templates, matching the operator template corresponding to the task to be executed specifically includes: Based on the task allocation results, determine the type identifier of the target heterogeneous computing unit and the operator type corresponding to the task to be executed; Based on the type identifier and the operator type, a corresponding basic operator template is matched from the preset operator template set; wherein, the basic operator template defines a preset optimized computing structure corresponding to the target heterogeneous computing unit; Based on the task precision requirements corresponding to the task to be executed, the precision of the matched basic operator template is configured to generate the operator template corresponding to the task to be executed.
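Template matching as described here amounts to a keyed lookup followed by a precision configuration step. The Python sketch below is a hedged illustration: the template contents, the operator name `conv2d`, the structure labels, and the accuracy cutoff that selects between `fp32` and `int8` are all assumptions and do not come from the application.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class OperatorTemplate:
    unit_type: str
    operator_type: str
    structure: str            # label for the preset optimized computing structure
    precision: str = "unset"  # filled in when the template is configured


# Hypothetical preset operator template set, keyed by (unit type, operator type).
TEMPLATES = {
    ("GPU", "conv2d"): OperatorTemplate("GPU", "conv2d", "tiled-shared-memory"),
    ("FPGA", "conv2d"): OperatorTemplate("FPGA", "conv2d", "systolic-pipeline"),
}


def match_template(unit_type, operator_type, required_accuracy):
    """Look up the basic template, then configure its precision."""
    base = TEMPLATES[(unit_type, operator_type)]
    # Assumed mapping from the task accuracy requirement to a numeric precision.
    precision = "fp32" if required_accuracy >= 0.95 else "int8"
    return replace(base, precision=precision)


configured = match_template("FPGA", "conv2d", required_accuracy=0.9)
```

Keeping the base templates immutable and returning a configured copy means one preset template can serve tasks with different precision requirements.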

[0011] In one implementation of this application, the heterogeneous computing unit includes at least two of the following types: graphics processor, field-programmable gate array, digital signal processor, and neural network processor.

[0012] In one implementation of this application, after the target heterogeneous computing unit executes the task to be executed according to the matched operator template, the method further includes: Obtain the actual performance data of the target heterogeneous computing unit after executing the task to be executed; wherein the actual performance data includes at least one of the following: real-time power consumption of task execution, task execution latency, average computing unit utilization, and execution result error rate; The actual performance data is compared with a preset standard list to determine the calibration parameters in the preset standard list based on the comparison results. The heterogeneous resource status parameter group and / or the multi-dimensional resource adaptation coefficient are then corrected based on the calibration parameters.

[0013] In one implementation of this application, the heterogeneous resource status parameter set of at least two different types of heterogeneous computing units onboard the UAV is acquired in real time, specifically including: the underlying hardware performance counter data of the heterogeneous computing unit is read through the direct memory access interface at a preset sampling period; using a preset adaptive Kalman filter, the initial state parameters corresponding to the underlying hardware performance counter data are filtered to obtain the current resource state parameters; the current resource status parameters for a preset number of consecutive periods are input into a preset lightweight prediction model to obtain the predicted resource status values for a preset future time period; wherein the predicted resource status values are used to calculate the predicted multi-dimensional resource adaptation coefficient to determine whether to update the multi-dimensional resource adaptation coefficient; and the heterogeneous resource status parameter group is determined based on the current resource status parameters and the corresponding predicted resource status values.

[0014] In a second aspect, embodiments of this application also provide a dynamic task scheduling system for a heterogeneous computing platform for unmanned aerial vehicles (UAVs), the system comprising: an acquisition module, used to acquire the heterogeneous resource status parameter set collected in real time; a determination module, used to determine the multi-dimensional resource adaptation coefficients corresponding to each heterogeneous computing unit based on the heterogeneous resource status parameter group and each task to be executed, wherein the multi-dimensional resource adaptation coefficients are used to characterize the degree of resource adaptation when each of the heterogeneous computing units executes the same task to be executed; an allocation module, used to dynamically allocate each of the tasks to be executed to the corresponding target heterogeneous computing units based on the multi-dimensional resource adaptation coefficients and the heterogeneous resource status parameter group; and a matching module, used to match the operator template corresponding to the task to be executed based on the task allocation result and the preset operator template set, so that the target heterogeneous computing unit executes the task to be executed according to the matched operator template.

[0015] Compared with the prior art, the significant advantages of this application are as follows: Through the above technical solution, this application can determine the resource status parameters of different types of heterogeneous computing units in a UAV and evaluate the resource adaptability of each heterogeneous computing unit when executing tasks from multiple dimensions. This enables the perception and analysis of heterogeneous resource status for dynamic allocation of task computing load. Furthermore, while scheduling and allocating tasks, this application also matches operator templates corresponding to the tasks. Instead of optimizing after task allocation, pre-optimized operator templates are used to meet dynamic task execution requirements, maximizing hardware utilization and significantly improving the energy efficiency and overall performance of UAV edge computing. This allows the UAV heterogeneous computing platform to flexibly and dynamically schedule computing tasks, improving task processing efficiency.

Attached Figure Description

[0016] The accompanying drawings, which are included to provide a further understanding of this application and form part of this application, illustrate exemplary embodiments and are used to explain this application, but do not constitute an undue limitation of this application. In the drawings: Figure 1 is a flowchart illustrating a dynamic task scheduling method for a heterogeneous computing platform for unmanned aerial vehicles (UAVs) according to an embodiment of this application. Figure 2 is a schematic diagram of the structure of a dynamic task scheduling system for a heterogeneous computing platform for unmanned aerial vehicles (UAVs) according to an embodiment of this application.

Detailed Implementation

[0017] To make the objectives, technical solutions, and advantages of this application clearer, the technical solutions of this application will be clearly and completely described below in conjunction with specific embodiments and corresponding drawings. Obviously, the described embodiments are only a part of the embodiments of this application, and not all of them. Based on the embodiments in this application, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of this application.

[0018] Existing heterogeneous computing scheduling technologies for unmanned aerial vehicles (UAVs) still suffer from limitations in flexible and dynamic task scheduling in real-world dynamic flight environments, impacting task processing efficiency. Current technologies often rely on querying the resource utilization of heterogeneous computing platforms for task scheduling, failing to accurately perceive the true state of these resources. This can lead to insufficient precision in scheduling decisions, and using a single factor for task scheduling can result in overly simplistic task allocation. Furthermore, existing technologies often employ a linear approach of allocating tasks first and then optimizing, resulting in a fragmented collaboration that cannot meet the real-time requirements of dynamic UAV missions.

[0019] Based on this, the embodiments of this application provide a dynamic task scheduling method and system for a heterogeneous computing platform for unmanned aerial vehicles (UAVs) to solve the technical problems of current UAV heterogeneous computing scheduling technology, such as perception limitations, inability to make accurate task scheduling decisions, and low task processing efficiency caused by the separation of task scheduling and execution optimization.

[0020] The various embodiments of this application are described in detail below with reference to the accompanying drawings.

[0021] This application provides a dynamic task scheduling method for a heterogeneous computing platform for unmanned aerial vehicles (UAVs). This method is applied to a heterogeneous computing platform for UAVs. As shown in Figure 1, the method may include steps S101-S104:

S101, real-time acquisition of heterogeneous resource status parameter groups of at least two different types of heterogeneous computing units on the UAV.

[0022] It should be noted that the execution entity of the dynamic task scheduling method for UAV heterogeneous computing platforms can be the UAV's onboard computer chip or the UAV's onboard chip's central processing unit (CPU). In some application scenarios, the execution entity can also be an edge server or a ground station, which is determined by the user according to the actual use case, and is not specifically limited here.

[0023] The drone used in this application includes at least two different types of heterogeneous computing units, including but not limited to: Graphics Processing Unit (GPU), Field Programmable Gate Array (FPGA), Digital Signal Processor (DSP), and Neural Network Processor. This application does not specify the number or type of heterogeneous computing units onboard the drone. The drone heterogeneous computing platform includes, for example, at least a CPU, GPU, and FPGA, forming the hardware components of the drone's onboard edge heterogeneous computing platform.

[0024] In this embodiment of the application, the real-time acquisition of heterogeneous resource status parameter sets of at least two different types of heterogeneous computing units onboard the UAV specifically includes: The underlying hardware performance counter data of the heterogeneous computing units is read through the Direct Memory Access (DMA) interface at a preset sampling period. A preset adaptive Kalman filter is used to filter the initial state parameters corresponding to the underlying hardware performance counter data to obtain the current resource status parameters. The current resource status parameters for a preset number of consecutive periods are input into a preset lightweight prediction model to obtain predicted resource status values for a preset future time period. These predicted resource status values are used to calculate predicted multi-dimensional resource adaptation coefficients, which determine whether the multi-dimensional resource adaptation coefficients should be updated. Based on the current resource status parameters and the corresponding predicted resource status values, a heterogeneous resource status parameter group is determined.

[0025] In other words, when obtaining the heterogeneous resource status parameter group of the heterogeneous computing units, the DMA interface can be called to read the underlying hardware performance counter data of each heterogeneous computing unit at a preset sampling period. This preset sampling period can be set based on expert experience, for example 10 milliseconds, and this application does not specifically limit it. The underlying hardware performance counter data differs by type of heterogeneous computing unit. For example, for GPUs it includes data such as the active cycles of the Streaming Multiprocessor (SM) and the memory read bandwidth; for FPGAs it includes at least logic unit utilization, bus throughput, and pipeline status.

[0026] Subsequently, data filtering is performed using a preset adaptive Kalman filter. Before filtering, this application also calculates the initial state parameters from the raw read low-level hardware performance counter data. For example, taking a GPU as an example, the GPU's corresponding SM working sampling time within a preset sampling period is obtained from the low-level hardware performance counter data. Then, an initial GPU utilization parameter is obtained using a preset normalization factor and the SM working sampling time, and added to the initial state parameters. Specifically, the difference between adjacent SM working sampling times is calculated, and the ratio of this difference to the preset normalization factor is used as the initial GPU utilization parameter. This preset normalization factor is predefined as the theoretical maximum number of cycles that the heterogeneous computing unit can achieve at the current operating frequency within the preset sampling period. The specific number of cycles is determined by the user based on expert experience, and this normalization factor is not limited here. The initial temperature parameter and initial power consumption parameter can be determined by reading the raw physical data built into each heterogeneous computing unit.
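The utilization calculation described above (difference of adjacent counter readings divided by the normalization factor) can be sketched in Python as follows. The function name `initial_utilization`, the 1 GHz clock, and the clamping to [0, 1] are assumptions added for illustration; the text itself only specifies the difference and the ratio.

```python
def initial_utilization(active_cycle_samples, normalization_factor):
    """Convert cumulative SM active-cycle readings into per-period utilization.

    active_cycle_samples: cumulative counter values, one per sampling period.
    normalization_factor: theoretical maximum cycles per sampling period.
    """
    utilizations = []
    for previous, current in zip(active_cycle_samples, active_cycle_samples[1:]):
        delta = current - previous  # active cycles within one sampling period
        ratio = delta / normalization_factor
        # Clamping is an added safeguard (assumption), e.g. against counter jitter.
        utilizations.append(min(max(ratio, 0.0), 1.0))
    return utilizations


# Assumed example: 1 GHz clock and 10 ms sampling period give 1e7 cycles per period.
samples = [0, 5_000_000, 12_500_000]
util = initial_utilization(samples, normalization_factor=10_000_000)
```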

[0027] Next, the initial state parameters are filtered using a preset adaptive Kalman filter. These initial state parameters may include initial utilization parameters, initial temperature parameters, and initial power consumption parameters. The initial temperature parameter affects the preset parallel processing capability parameters and preset latency performance parameters of this application. After data has been collected for multiple cycles at the preset sampling period, the resulting state sequence is used to update the state transition matrix by recursive least squares, and the filter applies this updated state transition matrix to the aforementioned initial state parameters to obtain the current resource state parameters.
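As a rough illustration of a Kalman filter whose state transition is adapted by recursive least squares (RLS), the Python sketch below handles a single scalar parameter such as utilization. It is not the filter of this application: the scalar model, the noise variances `q` and `r`, the forgetting factor, and the choice to fit the transition coefficient on consecutive measurements are all assumptions.

```python
class AdaptiveScalarKalman:
    """Scalar Kalman filter with an RLS-estimated transition coefficient."""

    def __init__(self, q=1e-3, r=1e-2, forgetting=0.98):
        self.a = 1.0        # state transition coefficient, adapted by RLS
        self.rls_p = 1.0    # RLS covariance for the estimate of a
        self.x = None       # filtered state estimate
        self.p = 1.0        # variance of the state estimate
        self.q = q          # assumed process noise variance
        self.r = r          # assumed measurement noise variance
        self.lam = forgetting
        self.prev_z = None

    def update(self, z):
        if self.x is None:
            # First sample: initialize the state directly from the measurement.
            self.x, self.prev_z = z, z
            return self.x
        # RLS step: fit z_k ~ a * z_{k-1} with exponential forgetting.
        phi = self.prev_z
        gain = self.rls_p * phi / (self.lam + phi * self.rls_p * phi)
        self.a += gain * (z - self.a * phi)
        self.rls_p = (self.rls_p - gain * phi * self.rls_p) / self.lam
        # Kalman prediction with the adapted transition coefficient.
        x_pred = self.a * self.x
        p_pred = self.a * self.p * self.a + self.q
        # Kalman correction with the new measurement.
        k = p_pred / (p_pred + self.r)
        self.x = x_pred + k * (z - x_pred)
        self.p = (1.0 - k) * p_pred
        self.prev_z = z
        return self.x


kalman = AdaptiveScalarKalman()
filtered = [kalman.update(0.5) for _ in range(20)]
```

A multi-parameter version would replace the scalar `a` with a state transition matrix and run the same two stages on vectors.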

[0028] Subsequently, this application also collects the current resource state parameters over a preset number of consecutive periods. Using a pre-trained and pruned lightweight prediction model (such as a Long Short-Term Memory (LSTM) network), it infers the predicted values of the resource state parameters for a preset future time period, such as the next 100 milliseconds. The obtained current resource state parameters and predicted resource state values are then used to construct a heterogeneous resource state parameter set. The lightweight prediction model is an LSTM model with a single hidden layer. It is trained using a training sample set containing parameters such as utilization, temperature, and power consumption collected from actual UAV flight scenarios, along with manually labeled data. The manually labeled data represents the predicted value for each parameter in the training sample set. The model is trained using the training sample set until the loss function value is less than a preset value, resulting in the trained lightweight prediction model. This preset value can be set by the user according to the actual usage scenario, and the loss function can be the mean squared error. Within the preset number of consecutive periods, current resource state parameters, including initial utilization parameters, initial temperature parameters, and initial power consumption parameters, are collected. These parameters are then used to generate parameter sets corresponding to the same time, such as $S_k = [U_k, P_k, T_k]$, where $S_k$ represents the parameter set at time $k$ (or the $k$-th period), $U_k$ represents the current utilization parameter, $P_k$ represents the initial power consumption parameter, and $T_k$ represents the initial temperature parameter. A state matrix is constructed from the parameter sets over the preset number of consecutive periods. For example, if the preset number of periods is $m$, the state matrix is $S = [S_1, S_2, \ldots, S_m]$. The state matrix is used as input to the lightweight prediction model, which outputs the predicted values of resource state parameters every 10 milliseconds within a 100-millisecond time period. These values can include temperature prediction sequences, utilization prediction sequences, and power consumption prediction sequences within the 100-millisecond time period.
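The construction of the state matrix from the most recent m parameter sets can be sketched with a fixed-length sliding window. This Python example covers only the input preparation, not the LSTM itself; the class name `StateWindow` and the sample values are hypothetical.

```python
from collections import deque


class StateWindow:
    """Keeps the latest m parameter sets [U_k, P_k, T_k] as rows of a state matrix."""

    def __init__(self, m):
        self.rows = deque(maxlen=m)  # the oldest row is dropped automatically

    def push(self, utilization, power, temperature):
        self.rows.append((utilization, power, temperature))

    def matrix(self):
        # No matrix (and hence no prediction) until m periods have been collected.
        if len(self.rows) < self.rows.maxlen:
            return None
        return [list(row) for row in self.rows]


window = StateWindow(m=3)
window.push(0.50, 12.0, 60.0)
window.push(0.55, 12.5, 61.0)
not_ready = window.matrix()
window.push(0.60, 13.0, 62.0)
window.push(0.65, 13.5, 63.0)
state_matrix = window.matrix()
```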

[0029] It should be noted that the resource status prediction value of this application can only be generated after a preset number of sampling periods, such as 5. Before that, the multi-dimensional resource adaptation coefficient is not updated based on the resource status prediction value. The preset number can be set according to the actual use scenario and is not specifically limited here. In addition, this application can also pre-set an update period comparison list, which records several update periods corresponding to several state change rates. This update period is used to update the multi-dimensional resource adaptation coefficient. The state change rate is obtained by quantifying the change amplitude of the resource status prediction value and the current resource status parameter within a preset future time period. The state change rate is inversely proportional to the update period; the faster the state change rate, the shorter the update period; the slower the state change rate, the longer the update period.
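The update period comparison list can be pictured as a threshold lookup in which a higher state change rate selects a shorter update period. In the Python sketch below, the thresholds, the periods, and the way the change rate is quantified (largest relative deviation of the predictions from the current value) are all assumptions; the text does not specify them.

```python
import bisect

# Hypothetical comparison list: rate thresholds and the period used in each band.
RATE_BOUNDS = [0.05, 0.20]
UPDATE_PERIODS_MS = [200, 100, 50]  # one more entry than RATE_BOUNDS


def state_change_rate(current, predicted):
    """Assumed quantification: largest relative deviation of predictions from current."""
    largest = max(abs(p - current) for p in predicted)
    return largest / max(abs(current), 1e-9)


def update_period_ms(rate):
    """Faster state changes map to shorter update periods."""
    return UPDATE_PERIODS_MS[bisect.bisect_right(RATE_BOUNDS, rate)]


rate = state_change_rate(0.5, [0.5, 0.6, 0.55])
period = update_period_ms(rate)
```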

[0030] The heterogeneous resource status parameter set obtained in this application includes at least preset parallel processing capability parameters, preset latency performance parameters, current utilization parameters, preset precision support parameters, and real-time power consumption parameters. The preset parallel processing capability parameters, preset latency performance parameters, and preset precision support parameters can be pre-set in the heterogeneous computing unit performance parameter list, and different performance parameter sets can exist for different temperature conditions. The specific settings can be configured by the user according to the actual usage scenario, and are not specifically limited here. The preset parallel processing capability parameters, preset latency performance parameters, and preset precision support parameters are specifically obtained from the heterogeneous computing unit's manual, or can be set during actual use, and are not specifically limited here. For example, the preset parallel processing capability parameter is calibrated according to the hardware specifications of the heterogeneous computing unit, with a value of 1.0 for GPU, 0.6 for FPGA, and 0.8 for DSP. This value reflects the relative ability of different hardware to handle parallel tasks. The preset latency performance parameter can be specifically calibrated according to the minimum response latency of the heterogeneous computing unit, with a value of 1.0 for FPGA, 0.7 for GPU, and 0.9 for DSP. The preset precision support parameters can be calibrated according to the computational precision level supported by the heterogeneous computing units. GPUs support floating-point calculations, so we set it to 1.0; FPGAs are good at fixed-point calculations, so we set it to 0.8; CPUs also have complete floating-point arithmetic units, so we set it to 1.0.

[0031] S102, based on the state parameter groups of each heterogeneous resource and each task to be executed, determine the multi-dimensional resource adaptation coefficients corresponding to each heterogeneous computing unit.

[0032] Among them, the multi-dimensional resource adaptation coefficient is used to characterize the degree of resource adaptation when each heterogeneous computing unit executes the same task.

[0033] In this embodiment of the application, based on the state parameter groups of each heterogeneous resource and each task to be executed, the multi-dimensional resource adaptation coefficients corresponding to each heterogeneous computing unit are determined, specifically including: For each task to be executed, a corresponding multi-dimensional feature vector is generated. This multi-dimensional feature vector includes the following dimensions: parallelism, allowed execution time, accuracy requirements, and task computational load. Based on the preset resource adaptation weight set and the heterogeneous resource state parameter set corresponding to the heterogeneous computing unit, a weighted calculation is performed on the multi-dimensional feature vector, and the multi-dimensional resource adaptation coefficients are determined from the weighted calculation results.

[0034] In other words, after each task to be executed is obtained, its features are quantified along four dimensions: parallelism, allowed execution time, accuracy requirement, and task computation amount. The specific values for the four dimensions can be pre-defined by the user in a lookup table that stores the value of each dimension for different tasks. Specifically, parallelism is the proportion of parallelizable computation in a task relative to its total computation. In practical applications, the user can pre-quantify it within the range [0, 1] based on expert knowledge and the degree of parallelism of the task. Perception tasks (such as YOLO object detection) have higher parallelism, with values from 0.8 to 1.0; decision-making tasks (such as A* path planning) have lower parallelism, with values from 0.1 to 0.4. The specific values are determined by analyzing the task's operator types and computational dependencies. The allowed execution time is pre-normalized according to the task's real-time requirements: the maximum allowable delay is divided by a preset baseline delay (e.g., 100 ms) to obtain a normalized value within the range [0, 1], where a smaller value indicates a more stringent delay requirement. The accuracy requirement is pre-quantified based on the reciprocal of the allowable error rate of the task. For example, if the task requires a recognition accuracy of ≥95%, then T3 = 0.95; if the task requires a positioning error of ≤5 cm, the requirement is converted to a value in the interval [0, 1] using a preset mapping table. The task computation amount can be quantified in giga-operations per second (GOPS): the computational graph of the task is analyzed offline, the total number of multiply-accumulate operations of all operators is counted, and this total is divided by the required execution time of the task to obtain the computational load value. For example, the computational load of YOLOv5 object detection per frame is approximately 80 GOPS.
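As a minimal sketch of this quantification, the following Python snippet builds the four-dimensional feature vector for one task. The names `TaskFeatures` and `make_features`, the clamping of the normalized time to 1.0, and the input numbers in the usage line are illustrative assumptions, not details taken from this application.

```python
from dataclasses import dataclass

BASELINE_DELAY_MS = 100.0  # example baseline delay mentioned in the description


@dataclass(frozen=True)
class TaskFeatures:
    parallelism: float    # share of parallelizable computation, in [0, 1]
    allowed_time: float   # normalized allowed execution time, in [0, 1]
    accuracy: float       # accuracy requirement, in [0, 1]
    compute_gops: float   # task computation amount in GOPS


def make_features(parallelism, max_delay_ms, accuracy, total_macs, exec_time_s):
    """Build the feature vector for one task (hypothetical helper)."""
    if not 0.0 <= parallelism <= 1.0 or not 0.0 <= accuracy <= 1.0:
        raise ValueError("parallelism and accuracy must lie in [0, 1]")
    # Normalize against the baseline delay; clamping to 1.0 is an assumption.
    allowed_time = min(max_delay_ms / BASELINE_DELAY_MS, 1.0)
    # Total multiply-accumulate count divided by the required execution time.
    compute_gops = total_macs / exec_time_s / 1e9
    return TaskFeatures(parallelism, allowed_time, accuracy, compute_gops)


# Illustrative inputs only; they are not measured values.
detect = make_features(0.9, 50.0, 0.95, total_macs=4.0e9, exec_time_s=0.05)
```

Keeping the lookup-table values and the derived values in one immutable record makes it straightforward to pass the same feature vector to every heterogeneous computing unit when the adaptation coefficients are computed.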

[0035] For example, for an object detection task, each of the four dimensions (parallelism, allowed execution time, accuracy requirement, and task computation amount) is assigned a specific value, with the task computation amount expressed in GOPS, that is, billions of operations per second. A four-dimensional feature vector is then generated for each task to be executed. Next, using the preset resource adaptation weight group corresponding to the heterogeneous computing unit and the aforementioned heterogeneous resource status parameter group, a weighted calculation is performed on the multi-dimensional feature vector to obtain the multi-dimensional resource adaptation coefficient.

[0036] More specifically, performing the weighted calculation on the multi-dimensional feature vector, based on the preset resource adaptation weight group and the heterogeneous resource status parameter group corresponding to the heterogeneous computing unit, so as to determine the multi-dimensional resource adaptation coefficient from the result, specifically includes the following. The first coefficient is determined based on the product of the preset parallel processing capability parameter and the parallelism feature value in the multi-dimensional feature vector. The second coefficient is determined based on the ratio of the preset latency performance parameter to the allowed execution time feature value in the multi-dimensional feature vector. The third coefficient, corresponding to the first resource adaptation item, is calculated based on the first weight in the preset resource adaptation weight group, the first coefficient, the second coefficient, and the current utilization parameter. The fourth coefficient is determined based on the product of the preset precision support parameter and the accuracy requirement feature value in the multi-dimensional feature vector. The fifth coefficient, corresponding to the second resource adaptation item, is calculated based on the second weight in the preset resource adaptation weight group, the fourth coefficient, and the real-time power consumption parameter. The multi-dimensional resource adaptation coefficient is determined based on the sum of the third and fifth coefficients.

[0037] The specific formula for calculating the multi-dimensional resource adaptation coefficient is as follows:

[0038] In the formula, the result denotes the multi-dimensional resource adaptation coefficient of the i-th heterogeneous computing unit for the j-th task to be executed. The two weighting terms are the first weight and the second weight in the preset resource adaptation weight group; both can be set based on expert experience, for example from the results of experiments in which experts construct heterogeneous computing test scenarios for tasks such as object detection and semantic segmentation, and no specific limitation is made here. The remaining terms are: the preset parallel processing capability parameter of the i-th heterogeneous computing unit; the parallelism feature value of the j-th task to be executed; the preset latency performance parameter of the i-th heterogeneous computing unit; the allowed execution time feature value of the j-th task to be executed; the current utilization parameter of the i-th heterogeneous computing unit; a preset constant used to prevent the denominator from being 0; the preset precision support parameter of the i-th heterogeneous computing unit; the accuracy requirement feature value of the j-th task to be executed; and the real-time power consumption parameter of the i-th heterogeneous computing unit. The intermediate terms of the formula are the first, second, third, fourth, and fifth coefficients described above.
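The following Python sketch shows only one possible form of the calculation that is consistent with the verbal description of the five coefficients. How the first and second coefficients are combined (multiplied here), the placement of the small constant `eps` in each denominator, and all parameter names are assumptions made for illustration; the actual formula of this application may differ.

```python
def adaptation_coefficient(w1, w2, par_cap, lat_perf, util, acc_support, power,
                           t_par, t_time, t_acc, eps=1e-6):
    """One assumed form of the multi-dimensional resource adaptation coefficient.

    Unit-side inputs: par_cap, lat_perf, util, acc_support, power.
    Task-side inputs: t_par, t_time, t_acc (entries of the feature vector).
    """
    c1 = par_cap * t_par               # first coefficient: capability x parallelism
    c2 = lat_perf / (t_time + eps)     # second coefficient: latency perf / allowed time
    c3 = w1 * c1 * c2 / (util + eps)   # third coefficient (assumed combination)
    c4 = acc_support * t_acc           # fourth coefficient: precision support x accuracy
    c5 = w2 * c4 / (power + eps)       # fifth coefficient (assumed combination)
    return c3 + c5                     # sum of the third and fifth coefficients


score = adaptation_coefficient(
    w1=0.6, w2=0.4, par_cap=1.0, lat_perf=1.0, util=0.5, acc_support=1.0,
    power=2.0, t_par=0.9, t_time=0.5, t_acc=0.95, eps=0.0,
)
```

Under this assumed form, a busier or more power-hungry unit receives a lower coefficient for the same task, which matches the intent of using utilization and power consumption as penalizing terms.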

[0039] By calculating the multi-dimensional resource adaptation coefficient, which integrates multiple factors, the resource adaptation degree of each heterogeneous computing unit to the task under the combined influence of various dimensions can be determined. By quantifying the task into a multi-dimensional feature vector containing parallelism, latency, precision, and computational load, and then weighting and fusing it with the characteristic parameters of heterogeneous computing resources (parallelism capability, latency performance, and precision support), a precise quantified task-resource adaptation coefficient is obtained, overcoming the one-sidedness of decision-making based on a single indicator.

[0040] S103: Based on the multi-dimensional resource adaptation coefficients and the heterogeneous resource status parameter groups, dynamically allocate each task to be executed to the corresponding target heterogeneous computing unit.

[0041] In this embodiment, dynamically allocating each task to be executed to the corresponding target heterogeneous computing unit, based on the multi-dimensional resource adaptation coefficients and the heterogeneous resource status parameter groups, specifically includes the following. Based on each multi-dimensional resource adaptation coefficient, a resource adaptation coefficient matrix corresponding to the tasks to be executed is generated. According to the resource adaptation coefficient matrix and preset rules, the target heterogeneous computing unit corresponding to each task to be executed is iteratively selected for allocation, and the virtual state parameters corresponding to each heterogeneous computing unit are updated after each allocation. Based on the updated virtual state parameters, the remaining computing power of the target heterogeneous computing unit is determined. When the remaining computing power is less than the task computation amount of any unassigned task to be executed, that unassigned task is split into subtasks, and the resulting subtasks are allocated to the corresponding target heterogeneous computing units. The scheduling priority of each subtask is higher than the scheduling priority of newly arrived tasks to be executed.

[0042] In other words, after calculating the aforementioned multi-dimensional resource adaptation coefficients, this application adds each multi-dimensional resource adaptation coefficient to the resource adaptation coefficient matrix, with the tasks to be executed as rows and heterogeneous computing units as columns. Subsequently, according to the priority order of the tasks to be executed, target heterogeneous computing units are allocated to each task in sequence. Specifically, the maximum value among the multi-dimensional resource adaptation coefficients corresponding to the task to be executed is iteratively selected from the resource adaptation coefficient matrix, and the task-resource pair corresponding to the maximum value is determined, thereby obtaining the target heterogeneous computing unit. While allocating the tasks to be executed to the heterogeneous computing units, the virtual state parameters of the heterogeneous computing units, such as their resource computing power parameters, are also updated synchronously. The virtual state parameters are internally maintained predictive parameters used to simulate the resource state after task allocation, including not only the expected resource computing power parameters, but also the expected utilization rate, power consumption, etc. after task allocation. The virtual state parameters can be obtained by updating the current resource state parameters of the heterogeneous computing unit based on the amount of computation required for the execution of the task to be executed, the resulting temperature changes and power consumption changes, etc. The specific method for determining the virtual state parameters can be set by the user based on the actual use scenario, or it can be determined by the expert system by analyzing each task to be assigned to the target heterogeneous computing unit. This application does not make specific limitations in this regard.
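The iterative selection described above can be sketched in Python as follows. The data layout (tuples and dictionaries), the function name `allocate`, and the fallback to the next-best unit when the best-matching unit lacks capacity are simplifying assumptions; only remaining computing power is tracked as a virtual state parameter here, whereas the description also mentions expected utilization and power consumption.

```python
def allocate(tasks, units, coeff):
    """Greedy allocation sketch over a resource adaptation coefficient matrix.

    tasks: list of (task_id, priority, load_gops); higher priority goes first.
    units: dict unit_id -> remaining computing power in GOPS.
    coeff: dict (task_id, unit_id) -> adaptation coefficient (the matrix).
    Returns (assignment dict, list of task ids that could not be placed).
    """
    virtual = dict(units)  # predicted state; the measured readings stay untouched
    assignment, unassigned = {}, []
    for task_id, _priority, load in sorted(tasks, key=lambda t: -t[1]):
        # Rank the units by descending coefficient for this task (one matrix row).
        ranked = sorted(virtual, key=lambda u: coeff[(task_id, u)], reverse=True)
        target = next((u for u in ranked if virtual[u] >= load), None)
        if target is None:
            unassigned.append(task_id)  # candidate for subtask splitting
            continue
        assignment[task_id] = target
        virtual[target] -= load  # update the virtual state after each allocation
    return assignment, unassigned


tasks = [("detect", 2, 80), ("plan", 1, 10), ("stitch", 0, 200)]
units = {"gpu": 100, "dsp": 20}
coeff = {
    ("detect", "gpu"): 0.9, ("detect", "dsp"): 0.2,
    ("plan", "gpu"): 0.5, ("plan", "dsp"): 0.4,
    ("stitch", "gpu"): 0.8, ("stitch", "dsp"): 0.1,
}
assignment, unassigned = allocate(tasks, units, coeff)
```

Working on a copy of the unit state mirrors the role of the virtual state parameters: each decision sees the predicted effect of earlier decisions without waiting for new hardware measurements.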

[0043] Subsequently, the drone will calculate, based on the updated virtual state parameters, whether the remaining computing power of the current target heterogeneous computing unit is less than the computational load of each pending task that is about to be allocated to that heterogeneous computing unit but has not yet been executed. If the remaining computing power is insufficient to allocate the pending tasks, the pending tasks can be further split into multiple subtasks. The scheduling priority of the split subtasks is higher than the scheduling priority of newly arrived pending tasks. Based on the split subtasks, it is determined whether the remaining computing power is greater than the computational load corresponding to the subtask. If so, the subtask is allocated to the target heterogeneous computing unit; otherwise, the multi-dimensional resource adaptation coefficient of the task is re-determined and redistributed. If no heterogeneous computing units have remaining computing power, the task will be reassigned after the target heterogeneous computing unit finishes its current task. The parallelism characteristic value of the unallocated pending tasks that have undergone subtask splitting is greater than a predetermined value. This predetermined value can be understood as a lower limit set by the user to distinguish high-parallelism tasks, and is based on actual usage scenarios; it is not specifically limited here.

[0044] To illustrate subtask splitting: if a newly arrived large image stitching task cannot be handled by any resource because the remaining computing power of every resource is insufficient, the scheduling module determines whether the task qualifies as a high-parallelism task. If so, the task is split into 4 subtasks, the association IDs between the subtasks are marked, and the subtasks are given a higher priority than subsequent new tasks.
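A minimal Python sketch of this splitting step is given below. The helper name `split_task`, the threshold value 0.7, the equal division of the load, and the dictionary fields are hypothetical; the description only requires that the parallelism exceed a user-defined value, that association IDs be marked, and that subtasks outrank newly arrived tasks.

```python
def split_task(task_id, load_gops, parallelism, threshold=0.7, parts=4):
    """Split a high-parallelism task into equal subtasks (hypothetical helper).

    Returns an empty list when the task is not parallel enough to split.
    """
    if parallelism <= threshold:
        return []
    share = load_gops / parts
    return [
        {
            "id": f"{task_id}-{k}",   # subtask identifier
            "parent": task_id,        # association ID linking sibling subtasks
            "load_gops": share,
            "priority_boost": True,   # scheduled ahead of newly arrived tasks
        }
        for k in range(parts)
    ]


subtasks = split_task("stitch", load_gops=200.0, parallelism=0.85)
```

Each subtask carries the parent identifier so that the partial results can be associated again once all siblings have finished.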

[0045] This application achieves global optimization allocation by combining a resource adaptation coefficient matrix with virtual state parameters, and splits subtasks when resource computing power is insufficient, which can avoid resource deadlock and task blocking caused by excessive task size.

[0046] Furthermore, during the dynamic allocation of tasks, the resources of heterogeneous computing units change dynamically, affecting the tasks in execution. Therefore, this application also provides the following embodiments, specifically including: Based on a preset resource adaptation cycle, the multi-dimensional resource adaptation coefficients corresponding to each task to be executed are updated. When it is determined that the decrease in the multi-dimensional resource adaptation coefficients after task allocation is greater than a preset threshold, or when it is determined that the remaining computing power of the target heterogeneous computing unit is less than the task computing volume of the allocated task to be executed, a task migration prompt message is generated to trigger task migration. Task migration includes at least migrating the remaining subtasks to other target heterogeneous computing units selected iteratively according to preset rules after completing the currently executing subtask.

[0047] In other words, this application presets a resource adaptation period and updates the multi-dimensional resource adaptation coefficients at that period. Task migration is triggered when the periodic recalculation shows that a coefficient has dropped significantly since allocation (i.e., the decrease exceeds a preset threshold), or when the remaining computing power of the heterogeneous computing unit is insufficient for the task computation amount of the allocated task to be executed. The preset threshold can be set based on expert experience in the actual use case and is not specifically limited here.

[0048] Typically, the above situations mostly occur when the temperature in the actual operating environment of the UAV rises, affecting the computing power of the heterogeneous computing unit. At this time, the above technical solution can proactively and smoothly redistribute tasks to more suitable hardware, avoiding task interruption or failure caused by hardware overheating, frequency reduction, or failure, thus enhancing the robustness of the UAV heterogeneous computing platform in dynamically changing environments and the reliability of task execution.
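The two migration conditions can be expressed as a short check. The following Python sketch uses hypothetical parameter names, and the numbers in the usage lines are illustrative only.

```python
def should_migrate(coeff_at_allocation, coeff_now, remaining_gops,
                   task_load_gops, drop_threshold):
    """Return True when either migration condition from the description holds."""
    drop = coeff_at_allocation - coeff_now  # decrease since the task was allocated
    return drop > drop_threshold or remaining_gops < task_load_gops


# A unit that has throttled: the coefficient fell from 2.4 to 1.5.
throttled = should_migrate(2.4, 1.5, remaining_gops=90.0,
                           task_load_gops=80.0, drop_threshold=0.5)
# A healthy unit: small drop and enough remaining computing power.
healthy = should_migrate(2.4, 2.3, remaining_gops=90.0,
                         task_load_gops=80.0, drop_threshold=0.5)
```

Such a check would run once per resource adaptation period; when it returns true, the remaining subtasks are moved after the currently executing subtask completes.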

[0049] S104: Based on the task allocation result and the preset operator template set, match the operator template corresponding to the task to be executed, so that the target heterogeneous computing unit executes the task to be executed according to the matched operator template.

[0050] The preset operator template set in this application includes pre-defined basic operator templates for the various heterogeneous computing units of the UAV. Each basic operator template defines a preset optimized computing structure corresponding to the target heterogeneous computing unit. For example, when the target heterogeneous computing unit is a graphics processing unit (GPU), the template configures the thread organization, the shared memory usage, and the parallel computing mode; when the target heterogeneous computing unit is a field-programmable gate array (FPGA), the template configures the pipeline depth, the number of parallel computing units, and the data flow interface; when the target heterogeneous computing unit is a neural network processor (NPU), the template configures the dedicated computing array and data flow for neural network layer operators; and when the target heterogeneous computing unit is a digital signal processor (DSP), the template configures the multiply-accumulate unit array and memory access mode for streaming signal processing. The basic operator templates are developed and configured in advance and are stored in the UAV's onboard storage module for actual use.

[0051] Basic operator templates include 2D convolution operators, max pooling operators, matrix multiplication operators, coordinate transformation operators, Kalman filter operators, etc. The specific settings are determined according to the characteristics of the mission performed by the UAV, and no specific limitations are made here.

[0052] In this embodiment of the application, based on the task allocation result and the preset operator template set, the operator template corresponding to the task to be executed is matched, specifically including: Based on the task allocation results, the type identifier of the target heterogeneous computing unit and the operator type corresponding to the task to be executed are determined. Based on the type identifier and operator type, the corresponding basic operator template is matched from the preset operator template set. The basic operator template defines a preset optimized computing structure corresponding to the target heterogeneous computing unit. Based on the task accuracy requirements corresponding to the task to be executed, the accuracy of the matched basic operator template is configured to generate the operator template corresponding to the task to be executed.

[0053] In other words, after allocating the target heterogeneous computing units for the task to be executed, the type identifier of the target heterogeneous computing unit, such as GPU or FPGA, will be determined. At the same time, the operator types required to execute the task will be parsed. Subsequently, based on the type identifier and operator type, the basic operator templates that can execute the corresponding tasks on the target heterogeneous computing units will be matched from the preset operator template set.
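As a minimal sketch, the matching step can be modeled as a lookup keyed by the pair (type identifier, operator type). The template contents, key strings, and the function name `match_template` below are hypothetical placeholders rather than the template format used by this application.

```python
# Hypothetical template set: (unit type identifier, operator type) -> settings.
TEMPLATES = {
    ("GPU", "conv2d"): {"thread_layout": "tiled", "shared_memory": True},
    ("FPGA", "conv2d"): {"pipeline_depth": 8, "parallel_units": 16},
    ("DSP", "kalman_filter"): {"mac_array": 4, "memory_access": "streaming"},
}


def match_template(unit_type, operator_type, templates=TEMPLATES):
    """Return a copy of the basic operator template for this unit and operator."""
    key = (unit_type, operator_type)
    if key not in templates:
        raise LookupError(f"no basic operator template for {key}")
    # Copy so that later precision configuration does not alter the stored template.
    return dict(templates[key])


gpu_conv = match_template("GPU", "conv2d")
```

Returning a copy keeps the stored basic template unchanged, so the subsequent precision configuration can be applied per task.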

[0054] It should also be noted that the basic operator template includes template variants to meet the actual accuracy requirements of different tasks. Accuracy configuration is performed as follows. After the basic operator template is obtained, it is dynamically quantized according to the accuracy requirement of the task to be executed, generating a quantized operator execution body. The quantized operator execution body then undergoes accuracy verification; if the verification fails, the quantization bit width is increased and quantization is performed again. Dynamic quantization here means selecting a quantization bit width with which to configure the basic operator template so that the template meets the task accuracy requirement. When the task accuracy requirement is greater than or equal to a preset first threshold, a first quantization bit width, such as INT16 (16-bit integer), is used to configure the basic operator template. When the task accuracy requirement is less than a preset second threshold, a second quantization bit width, such as INT4, is used to configure the basic operator template, and operator pruning is enabled. When the task accuracy requirement lies between the preset second threshold and the preset first threshold, a third quantization bit width, such as INT8, is used. The first and second thresholds can be set by the user based on expert experience in the actual use case and are not specifically limited here. The first, second, and third quantization bit widths can likewise be set for the actual scenario and are not specifically limited here. For example, the available quantization types can also include 32-bit single-precision floating point (FP32) and 16-bit half-precision floating point (FP16).
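The threshold rule and the verify-then-widen loop can be sketched in Python as follows. The function names, the `verify` callback supplied by the caller, the decision to keep the pruning flag unchanged when the bit width is increased, and the threshold values in the usage line are all assumptions.

```python
WIDTHS = ["INT4", "INT8", "INT16"]  # ordered from narrowest to widest


def select_bit_width(accuracy_req, first_threshold, second_threshold):
    """Map an accuracy requirement to (bit width, pruning enabled)."""
    if accuracy_req >= first_threshold:
        return "INT16", False
    if accuracy_req < second_threshold:
        return "INT4", True  # pruning is enabled only for the lowest tier
    return "INT8", False


def configure(accuracy_req, first_threshold, second_threshold, verify):
    """Pick a bit width, then widen it until verify(width, prune) passes."""
    width, prune = select_bit_width(accuracy_req, first_threshold, second_threshold)
    index = WIDTHS.index(width)
    while not verify(WIDTHS[index], prune):
        if index == len(WIDTHS) - 1:
            raise RuntimeError("accuracy check failed at the widest bit width")
        index += 1  # increase the quantization bit width and re-quantize
    return WIDTHS[index], prune


# Illustrative check that rejects INT4; a real check would run the quantized operator.
result = configure(0.70, 0.95, 0.80, verify=lambda width, prune: width != "INT4")
```

Starting from the narrowest acceptable width and widening only on failure keeps the quantized operator as cheap as the accuracy verification allows.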

[0055] In one embodiment of this application, after the target heterogeneous computing unit executes the task to be executed according to the matched operator template, the method further includes: Obtain the actual performance data of the target heterogeneous computing unit after executing the task. The actual performance data includes at least one of the following: real-time power consumption during task execution, task execution latency, average computing unit utilization, and execution result error rate. Compare the actual performance data with a preset standard list. Based on the comparison results, determine the calibration parameters in the preset standard list, and use these calibration parameters to correct the heterogeneous resource status parameter group and / or multi-dimensional resource adaptation coefficients.

[0056] In other words, this application can collect, in real time, actual hardware-related performance data of the target heterogeneous computing unit while it executes the task to be executed. A preset standard list stores the performance values predicted before the task is executed. The actual performance data is compared with the data of the corresponding attributes in the preset standard list to determine whether the actual performance is better than expected. If it is not, calibration parameters for fine-tuning the model are determined from the preset standard list. The calibration parameters are then used to update the heterogeneous resource status parameter group, to recalculate the multi-dimensional resource adaptation coefficients, or to do both.
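A minimal Python sketch of this feedback comparison follows. It assumes that lower values are better for every metric compared (for example latency, power, and error rate), and it derives each calibration value by blending the predicted and measured values with a fixed `gain`; the tolerance, the gain, the metric names, and the blending rule are assumptions, since the description only states that calibration parameters are determined from the preset standard list.

```python
def calibrate(predicted, actual, tolerance=0.1, gain=0.5):
    """Return corrected values for metrics that came out worse than predicted.

    predicted: dict metric -> value from the preset standard list.
    actual:    dict metric -> value measured during task execution.
    Lower is assumed to be better for every metric passed in.
    """
    corrections = {}
    for name, expected in predicted.items():
        measured = actual[name]
        if measured > expected * (1 + tolerance):
            # Move the stored value part of the way toward the measurement.
            corrections[name] = expected + gain * (measured - expected)
    return corrections


predicted = {"latency_ms": 20.0, "power_w": 10.0}
actual = {"latency_ms": 30.0, "power_w": 10.5}
corrections = calibrate(predicted, actual)
```

Metrics that stay within the tolerance produce no correction, so the status parameters and adaptation coefficients are only adjusted when execution was noticeably worse than predicted.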

[0057] Through the above technical solution, this application can determine the resource status parameters of different types of heterogeneous computing units in a UAV and evaluate the resource adaptability of each heterogeneous computing unit when executing tasks from multiple dimensions. This enables the perception and analysis of heterogeneous resource status for dynamic allocation of task computing load. Furthermore, while scheduling and allocating tasks, this application also matches operator templates corresponding to the tasks. Instead of optimizing after task allocation, pre-optimized operator templates are used to meet dynamic task execution requirements, maximizing hardware utilization and significantly improving the energy efficiency and overall performance of UAV edge computing. This allows the UAV heterogeneous computing platform to flexibly and dynamically schedule computing tasks, improving task processing efficiency.

[0058] Figure 2 is a schematic structural diagram of a dynamic task scheduling system for a heterogeneous computing platform for unmanned aerial vehicles (UAVs) provided by an embodiment of this application. As shown in Figure 2, the UAV heterogeneous computing platform dynamic task scheduling system 200 includes the following modules. The acquisition module 201 is used to acquire the heterogeneous resource status parameter groups collected in real time. The determination module 202 is used to determine the multi-dimensional resource adaptation coefficients corresponding to each heterogeneous computing unit based on the heterogeneous resource status parameter groups and each task to be executed; the multi-dimensional resource adaptation coefficients characterize the degree of resource adaptation when each heterogeneous computing unit executes the same task. The allocation module 203 is used to dynamically allocate each task to be executed to the corresponding target heterogeneous computing unit based on the multi-dimensional resource adaptation coefficients and the heterogeneous resource status parameter groups. The matching module 204 is used to match the operator template corresponding to the task to be executed based on the task allocation result and a preset operator template set, so that the target heterogeneous computing unit executes the task according to the matched operator template.

[0059] The various embodiments in this application are described in a progressive manner. Similar or identical parts between embodiments can be referred to mutually. Each embodiment focuses on describing the differences from other embodiments. In particular, the system embodiments are basically similar to the method embodiments, so the description is relatively simple; relevant parts can be referred to the descriptions of the method embodiments.

[0060] The system and the method provided in this application correspond one-to-one. Therefore, the system has beneficial technical effects similar to those of its corresponding method. Since the beneficial technical effects of the method have been described in detail above, those of the system are not repeated here.

[0061] It should also be noted that the terms "comprising," "including," or any other variations thereof are intended to cover non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or apparatus. Without further limitations, an element defined by the phrase "comprising one..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that includes said element.

[0062] The above description is merely an embodiment of this application and is not intended to limit this application. Various modifications and variations can be made to this application by those skilled in the art. Any modifications, equivalent substitutions, improvements, etc., made within the spirit and principle of this application should be included within the scope of the claims of this application.

Claims

1. A dynamic task scheduling method for a heterogeneous computing platform for unmanned aerial vehicles (UAVs), characterized in that, The method includes: Real-time acquisition of heterogeneous resource status parameter sets of at least two different types of heterogeneous computing units on the UAV; Based on the heterogeneous resource state parameter groups and the tasks to be executed, the multi-dimensional resource adaptation coefficients corresponding to each heterogeneous computing unit are determined; wherein, the multi-dimensional resource adaptation coefficients are used to characterize the degree of resource adaptation when each heterogeneous computing unit executes the same task to be executed. Based on the multi-dimensional resource adaptation coefficients and the heterogeneous resource status parameter group, each of the tasks to be executed is dynamically allocated to the corresponding target heterogeneous computing unit. Based on the task allocation results and the preset operator template set, the operator template corresponding to the task to be executed is matched, so that the target heterogeneous computing unit executes the task to be executed according to the matched operator template.

2. The method according to claim 1, characterized in that, determining the multi-dimensional resource adaptation coefficients corresponding to each heterogeneous computing unit based on the heterogeneous resource status parameter groups and each task to be executed specifically includes: generating, for each of the tasks to be executed, a corresponding multi-dimensional feature vector; wherein, the multi-dimensional feature vector includes the following dimensions: parallelism, allowed execution time, accuracy requirement, and task computation amount; and performing a weighted calculation on the multi-dimensional feature vector based on the preset resource adaptation weight group corresponding to the heterogeneous computing unit and the heterogeneous resource status parameter group, so as to determine the multi-dimensional resource adaptation coefficient based on the weighted calculation result.

3. The method according to claim 2, characterized in that, the heterogeneous resource status parameter group includes at least a preset parallel processing capability parameter, a preset latency performance parameter, a current utilization parameter, a preset precision support parameter, and a real-time power consumption parameter; and performing the weighted calculation on the multi-dimensional feature vector based on the preset resource adaptation weight group corresponding to the heterogeneous computing unit and the heterogeneous resource status parameter group, so as to determine the multi-dimensional resource adaptation coefficient based on the weighted calculation result, specifically includes: determining a first coefficient based on the product of the preset parallel processing capability parameter and the parallelism feature value in the multi-dimensional feature vector; determining a second coefficient based on the ratio of the preset latency performance parameter to the allowed execution time feature value in the multi-dimensional feature vector; calculating a third coefficient corresponding to a first resource adaptation item based on the first weight in the preset resource adaptation weight group, the first coefficient, the second coefficient, and the current utilization parameter; determining a fourth coefficient based on the product of the preset precision support parameter and the accuracy requirement feature value in the multi-dimensional feature vector; calculating a fifth coefficient corresponding to a second resource adaptation item based on the second weight in the preset resource adaptation weight group, the fourth coefficient, and the real-time power consumption parameter; and determining the multi-dimensional resource adaptation coefficient based on the sum of the third coefficient and the fifth coefficient.

4. The method according to claim 1, characterized in that, Based on the multi-dimensional resource adaptation coefficients and the heterogeneous resource status parameter group, each task to be executed is dynamically allocated to the corresponding target heterogeneous computing unit, specifically including: Based on the multi-dimensional resource adaptation coefficients, a resource adaptation coefficient matrix corresponding to the task to be executed is generated; Based on the resource adaptation coefficient matrix and preset rules, the target heterogeneous computing unit corresponding to the task to be executed is iteratively selected and allocated, and the virtual state parameters corresponding to each heterogeneous computing unit are updated after each allocation. Based on the updated virtual state parameters, the remaining computing power of the target heterogeneous computing unit is determined. When the remaining computing power is less than the task computing amount of any unassigned task to be executed, the corresponding unassigned task to be executed is split into subtasks, and the split subtasks are assigned to the corresponding target heterogeneous computing unit. The scheduling priority of each split subtask is greater than the scheduling priority of the newly arrived task to be executed.

5. The method according to claim 4, characterized in that, The method further includes: Update the multi-dimensional resource adaptation coefficients corresponding to each of the tasks to be executed according to the preset resource adaptation period; When the decrease value of the multi-dimensional resource adaptation coefficient after task allocation is greater than a preset threshold, or when the remaining computing power of the target heterogeneous computing unit is less than the task computing amount of the allocated task to be executed, a task migration prompt message is generated to trigger task migration; wherein, the task migration includes at least migrating the remaining subtasks to other target heterogeneous computing units selected again according to the preset rules after completing the currently executing subtask.

6. The method according to claim 1, characterized in that, Based on the task allocation results and the preset operator template set, the operator template corresponding to the task to be executed is matched, specifically including: Based on the task allocation results, determine the type identifier of the target heterogeneous computing unit and the operator type corresponding to the task to be executed; Based on the type identifier and the operator type, a corresponding basic operator template is matched from the preset operator template set; wherein, the basic operator template defines a preset optimized computing structure corresponding to the target heterogeneous computing unit; Based on the task precision requirements corresponding to the task to be executed, the precision of the matched basic operator template is configured to generate the operator template corresponding to the task to be executed.

7. The method according to claim 1, characterized in that, The heterogeneous computing unit includes at least two of the following types: graphics processor, field-programmable gate array, digital signal processor, and neural network processor.

8. The method according to claim 1, characterized in that, After the target heterogeneous computing unit executes the task to be executed according to the matched operator template, the method further includes: Obtain the actual performance data of the target heterogeneous computing unit after executing the task to be executed; wherein, the actual performance data includes at least one of the following: real-time power consumption of task execution, task execution latency, average computing unit utilization, and execution result error rate; The actual performance data is compared with a preset standard list to determine the calibration parameters in the preset standard list based on the comparison results. The heterogeneous resource status parameter group and / or the multi-dimensional resource adaptation coefficient are then corrected based on the calibration parameters.

9. The method according to claim 1, characterized in that, acquiring in real time the heterogeneous resource status parameter groups of at least two different types of heterogeneous computing units onboard the UAV specifically includes: reading the underlying hardware performance counter data of the heterogeneous computing unit through a direct memory access interface at a preset sampling period; filtering, by means of a preset adaptive Kalman filter, the initial state parameters corresponding to the underlying hardware performance counter data to obtain the current resource status parameters; inputting the current resource status parameters of a preset number of consecutive periods into a preset lightweight prediction model to obtain predicted resource status values for a preset future time period; wherein, the predicted resource status values are used to calculate a predicted multi-dimensional resource adaptation coefficient in order to determine whether to update the multi-dimensional resource adaptation coefficient; and determining the heterogeneous resource status parameter group based on the current resource status parameters and the corresponding predicted resource status values.

10. A dynamic task scheduling system for a heterogeneous computing platform for unmanned aerial vehicles (UAVs), characterized in that, The system includes: The acquisition module is used to acquire the real-time collected heterogeneous resource status parameter set; The determination module is used to determine the multi-dimensional resource adaptation coefficients corresponding to each heterogeneous computing unit based on the heterogeneous resource status parameter group and each task to be executed; wherein, the multi-dimensional resource adaptation coefficients are used to characterize the degree of resource adaptation when each of the heterogeneous computing units executes the same task to be executed. The allocation module is used to dynamically allocate each of the tasks to be executed to the corresponding target heterogeneous computing units based on the multi-dimensional resource adaptation coefficients and the heterogeneous resource status parameter group. The matching module is used to match the operator template corresponding to the task to be executed based on the task allocation result and the preset operator template set, so that the target heterogeneous computing unit executes the task to be executed according to the matched operator template.