Resource allocation method, system, device, medium and program product for terminal equipment

By constructing a multi-objective optimization model and adopting an evolutionary optimization algorithm, the resource allocation of terminal devices is dynamically adjusted, solving the resource scheduling problem of model inference tasks and concurrent tasks in terminal devices, realizing efficient coordination and allocation of resources, and meeting the operational needs of different tasks.

CN122220105A · Pending · Publication Date: 2026-06-16 · RDA MICROELECTRONICS SHANGHAICO LTD

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Applications(China)
Current Assignee / Owner: RDA MICROELECTRONICS SHANGHAICO LTD
Filing Date: 2026-04-10
Publication Date: 2026-06-16

Abstract

Embodiments of the present application provide a resource allocation method, system, device, medium and program product of a terminal device. The method comprises: collecting resource state information of the terminal device, and collecting task state information of a model inference task and at least one concurrent task respectively; obtaining task priorities of the model inference task and the at least one concurrent task, and determining resource requirement information of the model inference task and each concurrent task through resource requirement analysis; constructing a multi-objective optimization model based on the resource state information, the task state information, the task priorities and the resource requirement information, and solving the multi-objective optimization model by using a preset evolutionary optimization algorithm to obtain a resource allocation strategy; and dynamically allocating and adjusting resources of the terminal device according to the resource allocation strategy. The present application achieves the effect of taking into account the model inference task processing performance and the concurrent task response requirement.

Description

Technical Field

[0001] This application relates to the field of computers, and more particularly to a resource allocation method, system, device, medium, and program product for a terminal device.

Background Technology

[0002] With the development of artificial intelligence technology, the application of large-scale models in terminal devices is gradually increasing. Terminal devices include mobile terminals, in-vehicle devices, smart home devices, and various embedded devices. By deploying relevant models on terminal devices, localized data processing can be achieved in application scenarios such as voice processing, image recognition, and environmental perception, meeting the needs of some businesses for responsiveness and local processing capabilities. At the same time, terminal devices typically need to handle multiple tasks in parallel, including communication, audio and video processing, sensor data processing, and system services.

[0003] In existing technologies, to support multitasking in terminal devices, resource management typically employs methods such as pre-configured resource quotas, threshold-based adjustments, or scheduling based on the current operating status. For example, processing resource usage ratios, storage space usage ranges, and communication resource usage ranges can be pre-set for different types of tasks; when device load or battery power reaches certain conditions, some tasks can be subject to rate limiting, frequency reduction, termination, or resource relinquishment. Alternatively, differentiated scheduling methods can be used for different tasks based on the current resource usage.

[0004] Existing technologies can achieve resource management in a multi-tasking environment to a certain extent, but their processing usually relies on preset rules or current status information. When the device's operating status is constantly changing, there are many types of tasks, and tasks compete for resources, it is difficult for the resource adjustment process to take into account the operational needs of different tasks.

Summary of the Invention

[0005] This application provides a resource allocation method, system, device, medium, and program product for a terminal device, which aims to balance the performance of model inference tasks and the response requirements of concurrent tasks.

[0006] In a first aspect, embodiments of this application provide a resource allocation method for a terminal device, including:

[0007] Collect resource status information of the terminal device, and separately collect task status information of a model inference task and at least one concurrent task;

[0008] Obtain the task priorities of the model inference task and the at least one concurrent task, and determine the resource requirement information of the model inference task and each of the concurrent tasks through resource requirement analysis;

[0009] Based on the resource status information, the task status information, the task priority, and the resource demand information, a multi-objective optimization model is constructed, and a preset evolutionary optimization algorithm is used to solve the multi-objective optimization model to obtain a resource allocation strategy.

[0010] According to the resource allocation strategy, the resources of the terminal device are dynamically allocated and adjusted to complete the resource allocation for the model inference task and each of the concurrent tasks.

[0011] In one possible implementation, the step of obtaining the task priorities of the model inference task and the at least one concurrent task, and determining the resource requirement information of the model inference task and each of the concurrent tasks through resource requirement analysis, includes:

[0012] By parsing the task attribute information of the model inference task, the importance information and real-time information corresponding to the model inference task are obtained; and by reading the task attribute information of each concurrent task, the importance information and real-time information corresponding to each concurrent task are extracted.

[0013] Based on the importance information and real-time information, a preset priority determination rule is used to obtain the task priorities corresponding to the model inference task and each concurrent task.

[0014] Based on the task attribute information, obtain the initial resource requirement information of the model inference task and each of the concurrent tasks;

[0015] Based on the task status information and the resource status information, the initial resource requirement information is corrected to determine the resource requirement information corresponding to the model inference task and each of the concurrent tasks; wherein, the resource requirement information includes at least one of computing resource requirements, memory resource requirements and network resource requirements.
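The priority determination step can be illustrated with a minimal sketch. The application does not specify the preset rule, so the weighted sum below, the weights, and the names `TaskAttributes`, `task_priority`, and `rank_tasks` are all assumptions made for illustration; importance and real-time information are assumed to be normalized to [0, 1].

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskAttributes:
    """Hypothetical task attribute record parsed from task metadata."""
    name: str
    importance: float  # assumed normalized to [0, 1]
    real_time: float   # assumed normalized to [0, 1]


def task_priority(attrs, w_importance=0.6, w_real_time=0.4):
    """Assumed preset rule: weighted sum of importance and real-time scores."""
    return w_importance * attrs.importance + w_real_time * attrs.real_time


def rank_tasks(tasks):
    """Order tasks from highest to lowest priority."""
    return sorted(tasks, key=task_priority, reverse=True)


tasks = [
    TaskAttributes("model_inference", importance=0.8, real_time=0.5),
    TaskAttributes("voice_call", importance=0.9, real_time=1.0),
    TaskAttributes("log_upload", importance=0.2, real_time=0.1),
]
ranked = [t.name for t in rank_tasks(tasks)]
```

A lookup table or tiered rule would serve equally well here; the weighted sum is used only because it yields a comparable scalar for every task.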

[0016] In one possible implementation, the step of constructing a multi-objective optimization model based on the resource status information, the task status information, the task priority, and the resource requirement information includes:

[0017] Based on the task priority and resource requirement information, determine the resource allocation variables corresponding to the model inference task and each of the concurrent tasks;

[0018] Based on the task status information and the resource requirement information, construct the model inference performance objective function and the concurrent task response objective function;

[0019] Based on the resource status information, a device energy consumption objective function is constructed, and based on the resource status information, the resource constraints of the multi-objective optimization model are determined.

[0020] Based on the resource allocation variables, the model inference performance objective function, the concurrent task response objective function, the device energy consumption objective function, and the resource constraints, the multi-objective optimization model is constructed.
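One way to picture such a model is sketched below. It is not the formulation used in the application, which does not give explicit formulas: for brevity only a compute share per task is modeled, execution time is assumed to be demand divided by allocated share, energy is assumed linear in the total allocated share, and the names `Task` and `Model` are hypothetical.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class Task:
    name: str
    priority: float        # larger means more important
    compute_demand: float  # assumed work units needed per scheduling period
    is_inference: bool = False


@dataclass(frozen=True)
class Model:
    tasks: Sequence[Task]
    available_compute: float  # taken from the resource status information
    power_per_unit: float     # assumed linear power coefficient

    def is_feasible(self, x: Sequence[float]) -> bool:
        """Resource constraint: positive shares that fit in the available compute."""
        return all(v > 0 for v in x) and sum(x) <= self.available_compute + 1e-9

    def objectives(self, x: Sequence[float]) -> Tuple[float, float, float]:
        """Return (inference time, weighted response time, energy); all are minimized."""
        pairs = list(zip(self.tasks, x))
        inference = sum(t.compute_demand / v for t, v in pairs if t.is_inference)
        response = sum(t.priority * t.compute_demand / v
                       for t, v in pairs if not t.is_inference)
        energy = self.power_per_unit * sum(x)
        return inference, response, energy


model = Model(
    tasks=[
        Task("model_inference", priority=0.7, compute_demand=4.0, is_inference=True),
        Task("voice_call", priority=0.9, compute_demand=1.0),
        Task("sync", priority=0.2, compute_demand=1.0),
    ],
    available_compute=4.0,
    power_per_unit=0.5,
)
x = [2.0, 1.0, 0.5]  # resource allocation variables, one compute share per task
feasible = model.is_feasible(x)
inference_time, response_time, energy = model.objectives(x)
```

The three objectives pull in different directions: larger shares shorten inference and response times but raise energy, which is why a single allocation rarely minimizes all of them at once.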

[0021] In one possible implementation, the evolutionary optimization algorithm includes a genetic optimization algorithm or a non-dominated sorting genetic optimization algorithm;

[0022] Accordingly, a preset evolutionary optimization algorithm is used to solve the multi-objective optimization model to obtain a resource allocation strategy, including:

[0023] Based on the resource allocation variables in the multi-objective optimization model, multiple resource allocation candidate strategies are generated as an initial candidate strategy set.

[0024] Based on each of the resource allocation candidate strategies, the target evaluation results corresponding to each resource allocation candidate strategy are calculated through the multi-objective optimization model, and the fitness of each resource allocation candidate strategy is determined based on the target evaluation results.

[0025] Based on the fitness of each resource allocation candidate strategy, the initial candidate strategy set is sequentially subjected to selection, crossover, and mutation processes to obtain an updated candidate strategy set.

[0026] If the evolutionary optimization algorithm is a non-dominated sorting genetic optimization algorithm, then the updated candidate strategy set is non-dominated sorted to determine the non-dominated solution set, and when the preset convergence condition is met or the preset number of iterations is reached, the resource allocation strategy is determined from the non-dominated solution set.

[0027] Alternatively, if the evolutionary optimization algorithm is a genetic optimization algorithm, then when the preset convergence condition is met or the preset number of iterations is reached, the resource allocation strategy is determined from the updated candidate strategy set.
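The solving steps can be sketched as a simplified non-dominated sorting genetic loop. This is an illustration under stated assumptions rather than the application's algorithm: it omits the crowding-distance step of NSGA-II, uses a toy three-objective function (`toy_objectives`), repairs infeasible candidates by rescaling, and picks the final strategy from the non-dominated set with hypothetical weights. All function names and parameter values are assumptions.

```python
import random


def dominates(a, b):
    """True if a is no worse than b everywhere and strictly better somewhere (minimization)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def non_dominated(pop, evaluate):
    """Return the candidates whose objective vectors no other candidate dominates."""
    scores = [evaluate(p) for p in pop]
    return [p for p, s in zip(pop, scores) if not any(dominates(o, s) for o in scores)]


def repair(x, capacity):
    """Keep shares positive and rescale them if they exceed the capacity constraint."""
    x = [max(v, 1e-3) for v in x]
    total = sum(x)
    return [v * capacity / total for v in x] if total > capacity else x


def solve(evaluate, n_tasks, capacity, pop_size=30, generations=40, seed=0):
    rng = random.Random(seed)
    # Initial candidate strategy set.
    pop = [repair([rng.uniform(0.05, capacity) for _ in range(n_tasks)], capacity)
           for _ in range(pop_size)]
    for _ in range(generations):
        scores = [evaluate(p) for p in pop]
        # Fitness: how many candidates dominate this one (0 means non-dominated).
        fitness = [sum(dominates(o, s) for o in scores) for s in scores]

        def pick():
            # Selection: binary tournament on fitness.
            i, j = rng.randrange(pop_size), rng.randrange(pop_size)
            return pop[i] if fitness[i] <= fitness[j] else pop[j]

        children = []
        while len(children) < pop_size:
            a, b = pick(), pick()
            w = rng.random()
            child = [w * u + (1 - w) * v for u, v in zip(a, b)]  # crossover
            child = [v + rng.gauss(0, 0.05) if rng.random() < 0.2 else v
                     for v in child]                              # mutation
            children.append(repair(child, capacity))
        # Updated set: keep part of the current non-dominated set, fill with children.
        elite = non_dominated(pop, evaluate)[:pop_size // 2]
        pop = elite + children[:pop_size - len(elite)]
    return non_dominated(pop, evaluate)


def toy_objectives(x):
    """Assumed objectives: inference time, weighted response time, energy."""
    return (4.0 / x[0], 0.9 / x[1] + 0.2 / x[2], 0.5 * sum(x))


front = solve(toy_objectives, n_tasks=3, capacity=4.0)
weights = (0.5, 0.3, 0.2)  # hypothetical preference used to pick one strategy
best = min(front, key=lambda x: sum(w * f for w, f in zip(weights, toy_objectives(x))))
```

For the plain genetic variant, the same loop applies with a scalar fitness (for example the weighted sum used in the last line) and the final strategy taken directly from the updated candidate set.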

[0028] In one possible implementation, the step of dynamically allocating and adjusting the resources of the terminal device according to the resource allocation strategy to complete the resource allocation for the model inference task and each of the concurrent tasks includes:

[0029] Based on the resource allocation strategy, determine the device resource allocation parameters corresponding to the model inference task and each of the concurrent tasks;

[0030] Based on the device resource allocation parameters, the heterogeneous computing resources of the terminal device are adjusted to adjust the computing resources of the model inference task and each of the concurrent tasks;

[0031] And / or based on the device resource allocation parameters, allocate and reclaim the memory resources of the terminal device to adjust the memory resources of the model inference task and each of the concurrent tasks;

[0032] And / or based on the device resource allocation parameters, allocate bandwidth to the network resources of the terminal device to adjust the network resources of the model inference task and each of the concurrent tasks.
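As an illustration of turning a strategy into device resource allocation parameters, the sketch below assumes the strategy is expressed as per-task fractional shares and maps them to core counts, memory limits, and bandwidth limits. The types `DeviceCapacity` and `TaskParams`, the share keys, and the reclaim-before-allocate ordering are assumptions; actually enforcing the parameters would rely on platform-specific mechanisms that are not shown.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class DeviceCapacity:
    cpu_cores: int
    memory_mb: int
    bandwidth_kbps: int


@dataclass(frozen=True)
class TaskParams:
    cpu_cores: int
    memory_mb: int
    bandwidth_kbps: int


def to_device_params(strategy: Dict[str, Dict[str, float]],
                     cap: DeviceCapacity) -> Dict[str, TaskParams]:
    """Map per-task shares in [0, 1] to concrete (hypothetical) device parameters."""
    return {
        task: TaskParams(
            cpu_cores=int(share["compute"] * cap.cpu_cores),
            memory_mb=int(share["memory"] * cap.memory_mb),
            bandwidth_kbps=int(share["network"] * cap.bandwidth_kbps),
        )
        for task, share in strategy.items()
    }


def memory_actions(current_mb: Dict[str, int],
                   target: Dict[str, TaskParams]) -> List[Tuple[str, str, int]]:
    """List reclaim actions first so freed memory is available to later allocations."""
    reclaim = [("reclaim", t, current_mb.get(t, 0) - p.memory_mb)
               for t, p in target.items() if current_mb.get(t, 0) > p.memory_mb]
    allocate = [("allocate", t, p.memory_mb - current_mb.get(t, 0))
                for t, p in target.items() if p.memory_mb > current_mb.get(t, 0)]
    return reclaim + allocate


strategy = {
    "model_inference": {"compute": 0.5, "memory": 0.5, "network": 0.125},
    "voice_call": {"compute": 0.25, "memory": 0.125, "network": 0.5},
}
capacity = DeviceCapacity(cpu_cores=8, memory_mb=4096, bandwidth_kbps=2000)
params = to_device_params(strategy, capacity)
actions = memory_actions({"model_inference": 1024, "voice_call": 1024}, params)
```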

[0033] In one possible implementation, the method further includes:

[0034] After the resources of the model inference task and each of the concurrent tasks have been adjusted, continue to collect the resource status information of the terminal device and, separately, the task status information of the model inference task and the at least one concurrent task, to obtain resource allocation execution result information.

[0035] Based on the resource allocation execution result information, a resource allocation effect evaluation result is generated; wherein, the resource allocation effect evaluation result includes at least one of the following: task response effect evaluation result, resource utilization effect evaluation result, and energy consumption effect evaluation result;

[0036] Based on the resource allocation effect evaluation results and preset target conditions, determine the feedback adjustment parameters;

[0037] Based on the feedback adjustment parameters, the model parameters of the multi-objective optimization model and / or the solution parameters of the evolutionary optimization algorithm are adjusted to obtain the adjusted multi-objective optimization model and the adjusted evolutionary optimization algorithm.

[0038] The adjusted evolutionary optimization algorithm is used to solve the adjusted multi-objective optimization model to generate a new resource allocation strategy;

[0039] Based on the new resource allocation strategy, update the resources for the model inference task and each of the concurrent tasks.
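The feedback adjustment can be sketched as follows. The application does not state how the feedback adjustment parameters are derived, so this example assumes two objective weights as the model parameters, a mutation rate as the solution parameter, and a simple rule that raises the weight of any objective whose target was missed; `Evaluation`, `Tuning`, and `feedback_adjust` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Evaluation:
    """Evaluation result or preset target, in the same units."""
    response_ms: float
    utilization: float
    energy_mw: float


@dataclass(frozen=True)
class Tuning:
    w_response: float     # model parameter: weight of the response objective
    w_energy: float       # model parameter: weight of the energy objective
    mutation_rate: float  # solution parameter of the evolutionary algorithm


def feedback_adjust(result: Evaluation, target: Evaluation,
                    tuning: Tuning, step: float = 0.1) -> Tuning:
    """Assumed rule: emphasize missed objectives and widen the search when any target is missed."""
    w_response, w_energy, mutation = tuning.w_response, tuning.w_energy, tuning.mutation_rate
    missed_response = result.response_ms > target.response_ms
    missed_energy = result.energy_mw > target.energy_mw
    missed_utilization = result.utilization < target.utilization
    if missed_response:
        w_response += step
    if missed_energy:
        w_energy += step
    if missed_response or missed_energy or missed_utilization:
        mutation = min(0.5, mutation * 1.5)
    total = w_response + w_energy
    return Tuning(w_response / total, w_energy / total, mutation)


result = Evaluation(response_ms=120.0, utilization=0.7, energy_mw=900.0)
target = Evaluation(response_ms=100.0, utilization=0.6, energy_mw=1000.0)
adjusted = feedback_adjust(result, target, Tuning(0.5, 0.5, 0.1))
```

In this example only the response target is missed, so the adjusted tuning shifts weight toward the response objective and raises the mutation rate before the model is solved again.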

[0040] In a second aspect, embodiments of this application provide a device resource allocation system, including:

[0041] The information acquisition module is used to collect resource status information of the terminal device, as well as task status information of the model inference task and at least one concurrent task respectively.

[0042] The information analysis module is used to obtain the task priorities of the model inference task and the at least one concurrent task, and to determine the resource requirement information of the model inference task and each of the concurrent tasks through resource requirement analysis.

[0043] The information processing module is used to construct a multi-objective optimization model based on the resource status information, the task status information, the task priority, and the resource demand information, and to solve the multi-objective optimization model using a preset evolutionary optimization algorithm to obtain a resource allocation strategy.

[0044] The allocation execution module is used to dynamically allocate and adjust the resources of the terminal device according to the resource allocation strategy, so as to complete the resource allocation for the model inference task and each of the concurrent tasks.

[0045] In a third aspect, embodiments of this application provide an electronic device, including: a memory and a processor;

[0046] The memory stores computer-executable instructions;

[0047] The processor executes the computer-executable instructions stored in the memory, causing the processor to perform the method of the first aspect and / or the various possible implementations of the first aspect as described above.

[0048] In a fourth aspect, embodiments of this application provide a computer-readable storage medium storing computer-executable instructions, which, when executed by a processor, are used to implement the method of the first aspect and / or the various possible implementations of the first aspect.

[0049] In a fifth aspect, embodiments of this application provide a computer program product, including a computer program that, when executed by a processor, implements the method of the first aspect and / or the various possible implementations of the first aspect.

[0050] In a sixth aspect, this application provides a chip including at least one processor for executing program instructions to perform the methods involved in the first aspect and any possible implementation.

[0051] This application provides a resource allocation method, system, device, medium, and program product for a terminal device. By collecting resource status information of the terminal device and task status information of model inference tasks and concurrent tasks, basic data on the device's currently available resources and the running status of each task can be obtained, enabling subsequent resource allocation to be based on the real-time operating status of the terminal device. Furthermore, by obtaining the task priorities of model inference tasks and concurrent tasks, and combining this with resource requirement analysis to determine the resource requirements of each task, the differences in importance, real-time requirements, and resource consumption among different tasks can be distinguished. This allows the resource allocation process to simultaneously consider task priority and actual resource needs, rather than applying a uniform processing method to different tasks. Based on this, a multi-objective optimization model is constructed based on resource status information, task status information, task priorities, and resource requirement information. A preset evolutionary optimization algorithm is used to solve this multi-objective optimization model, achieving a comprehensive trade-off between model inference task processing performance, concurrent task response requirements, and terminal device resource consumption, resulting in a resource allocation strategy that matches the current operating conditions. Finally, by dynamically allocating and adjusting the resources of the terminal device according to the resource allocation strategy, the model inference task and concurrent task can obtain resource support that is adapted to their operation requirements on the same terminal device. 
This solves the technical problem that, when a model inference task and concurrent tasks run simultaneously on the terminal device, it is difficult to coordinate and allocate limited resources according to the device resource status and task operation status so as to balance the processing performance of the model inference task and the response requirements of the concurrent tasks.

Description of the Drawings

[0052] The accompanying drawings, which are incorporated in and form part of this specification, illustrate embodiments consistent with this application and, together with the description, serve to explain the principles of this application.

[0053] Figure 1 is a schematic diagram of the system architecture provided in an embodiment of this application;

[0054] Figure 2 is a flowchart of the resource allocation method for a terminal device provided in an embodiment of this application;

[0055] Figure 3 is a schematic diagram of the strategy update method provided in an embodiment of this application;

[0056] Figure 4 is a schematic diagram of the structure of the device resource allocation system provided in an embodiment of this application;

[0057] Figure 5 is a schematic diagram of the structure of an electronic device provided in an embodiment of this application.

[0058] The accompanying drawings illustrate specific embodiments of this application, which will be described in more detail below. These drawings and descriptions are not intended to limit the scope of the concept in any way, but rather to illustrate the concept of this application to those skilled in the art through reference to particular embodiments.

Detailed Implementation

[0059] Exemplary embodiments will now be described in detail, examples of which are illustrated in the accompanying drawings. When the following description relates to the drawings, unless otherwise indicated, the same numbers in different drawings denote the same or similar elements. The embodiments described in the following exemplary embodiments do not represent all embodiments consistent with this application. Rather, they are merely examples of apparatuses and methods consistent with some aspects of this application as detailed in the appended claims.

[0060] Figure 1 is a schematic diagram of the system architecture provided in an embodiment of this application. As shown in Figure 1, the application scenario of this application may include terminal device 101 and computer device 102, with a communication connection established between them. Terminal device 101 can be a mobile terminal, wearable device, in-vehicle device, smart home device, or other terminal-side device, and is used to run the model inference task and at least one concurrent task; computer device 102 can be a server, workstation, or other device with data processing capabilities, and is used to interact with terminal device 101. The connection relationship in the figure indicates that information can be transmitted between the two, through either wired or wireless communication.

[0061] In this application scenario, terminal device 101 serves as the primary application object of the resource allocation method. It collects its own resource status information, as well as the task status information of the model inference task and concurrent tasks, and performs resource allocation and adjustment based on status changes during operation. Computer device 102 can act as an external device cooperating with terminal device 101, providing task-related data, model-related data, control information, or strategy-related information to terminal device 101. It can also receive status data, execution result data, or evaluation result data reported by terminal device 101. By setting the communication relationship between terminal device 101 and computer device 102, terminal device 101 can form a collaborative processing environment with external devices while running tasks locally. This application processes the resource status information and task status information of terminal device 101 to generate a resource allocation strategy, and dynamically allocates and adjusts the resources of terminal device 101 accordingly, achieving coordinated resource allocation between model inference tasks and concurrent tasks. This schematic diagram mainly illustrates the application environment of the technical solution of this application and the basic connection relationships between related devices.

[0062] The inventive concept of this application lies in establishing a dynamic resource allocation mechanism based on the operating state of the terminal device to address the resource contention problem that occurs when a model inference task and at least one concurrent task run simultaneously in a terminal device. This mechanism does not directly allocate resources using fixed resource quotas or a single rule. Instead, it first collects the current resource state information of the terminal device, as well as the task state information of the model inference task and the concurrent task, to obtain a real-time state basis for the resource and task environments. Based on this, it further obtains the task priority of each task and, combined with resource demand analysis, determines the resource demand information corresponding to each task, enabling different tasks to be distinguished as different scheduling objects and demand objects before resource allocation.

[0063] After obtaining resource status information, task status information, task priority, and resource requirement information, this application further incorporates the above information into a multi-objective optimization model for modeling processing, and uses a preset evolutionary optimization algorithm to solve the multi-objective optimization model to generate a resource allocation strategy. Its core lies in transforming the resource allocation problem between model inference tasks and concurrent tasks from a traditional static configuration problem into a computable and optimizable multi-objective solution problem. This allows the resource allocation process to simultaneously consider the current resource conditions of the terminal device, task running status, task priority relationships, and the mutual influence relationships between task resource requirements, resulting in a resource allocation result that matches the current operating scenario.

[0064] After obtaining the resource allocation strategy, the resources of the terminal device are further dynamically allocated and adjusted according to the strategy to complete the resource allocation for the model inference task and various concurrent tasks. This transforms the resource allocation in the terminal device from a static, experience-based approach to a dynamic, coordinated approach oriented towards the current operating state, enabling the coordinated allocation of the terminal device's limited resources when model inference tasks and concurrent tasks exist in parallel.

[0065] The technical solution of this application and how the technical solution of this application solves the above-mentioned technical problems are described in detail below with specific embodiments. These specific embodiments can be combined with each other, and the same or similar concepts or processes may not be described again in some embodiments. The embodiments of this application will now be described with reference to the accompanying drawings.

[0066] Figure 2 is a flowchart of the resource allocation method for a terminal device provided in an embodiment of this application. As shown in Figure 2, the method includes:

[0067] S21, collect resource status information of the terminal device, and collect task status information of the model inference task and at least one concurrent task respectively.

[0068] In this embodiment, resource status information is used to represent the resource occupancy, availability, and changes of the terminal device at the current running time. Task status information is used to represent the running status, execution progress, and resource usage status of the model inference task and at least one concurrent task. The data collection process can be executed by the monitoring module, scheduling module, or information collection module associated with the system running status in the terminal device, and continuously acquires corresponding information according to a preset sampling period, event triggering method, or a combination of both, to form a status data foundation that matches the current running environment.
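A collection module that combines a preset sampling period with event triggering might look like the sketch below. The probe callables, field names, and the `StateCollector` class are hypothetical; the clock value is passed in explicitly so the behavior is easy to follow, whereas a real module would read a monotonic clock and query the operating system or scheduler.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Snapshot = Dict[str, float]


@dataclass
class StateCollector:
    read_resources: Callable[[], Snapshot]         # hypothetical device resource probe
    read_tasks: Callable[[], Dict[str, Snapshot]]  # hypothetical probe, one entry per task
    period_s: float = 1.0
    _last: float = field(default=float("-inf"), init=False)
    history: List[dict] = field(default_factory=list, init=False)

    def poll(self, now: float, event: bool = False) -> bool:
        """Sample if the period has elapsed or an event fired; return whether a sample was taken."""
        if not event and now - self._last < self.period_s:
            return False
        self.history.append({
            "t": now,
            "resources": self.read_resources(),
            "tasks": self.read_tasks(),
        })
        self._last = now
        return True


collector = StateCollector(
    read_resources=lambda: {"cpu_load": 0.42, "free_memory_mb": 812.0},
    read_tasks=lambda: {
        "model_inference": {"progress": 0.3},
        "voice_call": {"progress": 1.0},
    },
)
sampled = [
    collector.poll(0.0),              # first poll always samples
    collector.poll(0.5),              # period not yet elapsed
    collector.poll(0.6, event=True),  # event trigger forces a sample
    collector.poll(1.2),              # period restarts from the event-triggered sample
]
```

Storing resource and task snapshots under the same timestamp keeps the resource-side and task-side states correlated for later analysis.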

[0069] In practical implementation, the collection of resource status information is a status acquisition based on the overall resource operation of the terminal device, enabling the terminal device to know the current status of available resources and changes in resource load before resource allocation. Separate collection of task status information for the model inference task and at least one concurrent task is used to distinguish the differences in the current operating status of different tasks, avoiding treating multiple tasks as homogeneous objects. By synchronously or correlatedly collecting the resource status information of the terminal device and the task status information of different tasks, a correspondence between resource-side status and task-side status can be established, providing a status input basis for subsequent task priority acquisition, resource demand analysis, multi-objective optimization model construction, and resource allocation strategy generation. This achieves real-time perception of the resource and task environments, providing data support for subsequent coordinated allocation of resources based on the terminal device's operating status.

[0070] S22, obtain the task priorities of the model inference task and at least one concurrent task, and determine the resource requirement information of the model inference task and each concurrent task through resource requirement analysis.

[0071] In this embodiment, task priority is used to represent the order, guarantee level, or scheduling weight of model inference tasks and concurrent tasks in the resource allocation process, while resource demand information is used to characterize the resource requirements of the corresponding task on the terminal device at the current running stage. Based on the collection of resource status information and task status information, different tasks are further distinguished and quantified, so that subsequent resource allocation is no longer uniformly processed based solely on the existence of tasks, but can reflect the differences between different tasks.

[0072] In practical implementation, task priority acquisition is performed separately for the model inference task and at least one concurrent task to establish a comparable priority relationship between tasks. Resource requirement analysis, on the other hand, analyzes the operational needs of the model inference task and each concurrent task to determine the resource requirements for each task. Resource requirement analysis can combine the current task operation status and the current resource environment of the terminal device to determine the degree of resource consumption, changing trends, or allocation tendencies required by the task during subsequent execution, obtaining resource requirement information corresponding to the actual task operation. By acquiring and establishing a correspondence between task priorities and resource requirement information, both the "degree of priority that should be given to different tasks" and the "actual resource allocation situation" can be reflected simultaneously. Obtaining task priorities and determining resource requirement information provides a foundation for subsequently constructing a multi-objective optimization model and generating a resource allocation strategy that matches the task's operational needs.

[0073] S23. Based on resource status information, task status information, task priority and resource demand information, a multi-objective optimization model is constructed, and a preset evolutionary optimization algorithm is used to solve the multi-objective optimization model to obtain the resource allocation strategy.

[0074] In this embodiment, a multi-objective optimization model is used to uniformly represent and correlate multiple objectives related to resource allocation in the terminal device. An evolutionary optimization algorithm is used to iteratively search and optimize different resource allocation results under the constraints of the multi-objective optimization model. A resource allocation strategy is used to represent the resource allocation results corresponding to the model inference task and each concurrent task under the current operating conditions. Based on the obtained resource status information, task status information, task priority, and resource requirement information, multiple types of input information are comprehensively modeled and solved, enabling the resource allocation process to transition from the state perception and requirement analysis stage to the strategy generation stage.

[0075] In practical implementation, the process of constructing a multi-objective optimization model involves uniformly organizing resource status information, task status information, task priority, and resource demand information, and establishing the constraint and influence relationships between these information segments. This allows the allocation requirements of different tasks in a resource-competitive environment to be represented in a model-based manner. By incorporating this information into the multi-objective optimization model, the subsequent solution process no longer processes individual tasks or resources separately, but rather performs a unified analysis of resource allocation for multiple tasks based on the overall operational status. Furthermore, when using a pre-defined evolutionary optimization algorithm to solve the multi-objective optimization model, the candidate resource allocation results are continuously optimized based on the multi-objective relationships and constraints represented in the model to obtain a resource allocation strategy adapted to the current operating conditions of the terminal device.

[0076] S24. Based on the resource allocation strategy, dynamically allocate and adjust the resources of the terminal device to complete the resource allocation for the model inference task and each concurrent task.

[0077] In this embodiment, the resource allocation strategy converts the preceding modeling and solution results into resource control actions on the running terminal device, so that the model inference task and each concurrent task receive the resource support specified by the strategy. Dynamic allocation and adjustment refers to the terminal device changing its resource configuration in real time during operation according to the resource allocation strategy, so that the allocation results track the current task operation status and the resource environment of the terminal device. In a specific implementation, the allocation results expressed in the resource allocation strategy are applied to the execution of the model inference task and each concurrent task, so that each task obtains a resource configuration matching its task priority, task status information, and resource requirement information at the current running stage. Because the resource status information and task status information of the terminal device change continuously, dynamic allocation and adjustment is not a one-time static process; the resource configuration is updated as the operating status of the terminal device changes, keeping the allocation adapted to the actual operating environment. In this way, the resource relationships between the model inference task and the concurrent tasks are coordinated according to the generated resource allocation strategy, reducing the deviation between the resource allocation result and the actual operational requirements of the tasks. In summary, by dynamically allocating and adjusting the resources of the terminal device, the resource allocation strategy is converted into an actual resource allocation result, enabling the terminal device to coordinate and allocate resources based on the current task operation status and resource changes.

[0078] In one embodiment, the task priorities of the model inference task and at least one concurrent task are obtained, and resource requirement information for the model inference task and each concurrent task is determined through resource requirement analysis, including:

[0079] S221, Parse the task attribute information of the model inference task to obtain the importance information and real-time information corresponding to the model inference task, and read the task attribute information of each concurrent task to extract the importance information and real-time information corresponding to each concurrent task;

[0080] S222, Based on the importance information and real-time information, apply a preset priority determination rule to obtain the task priorities corresponding to the model inference task and each concurrent task;

[0081] S223, Based on the task attribute information, obtain the initial resource requirement information for the model inference task and each concurrent task;

[0082] S224, Based on the task status information and resource status information, the initial resource requirement information is corrected to determine the resource requirement information corresponding to the model inference task and each concurrent task; wherein, the resource requirement information includes at least one of computing resource requirements, memory resource requirements and network resource requirements.

[0083] In this embodiment, task attribute information characterizes the inherent attributes of a task, such as its importance and real-time requirements. Importance information characterizes how strongly the corresponding task must be guaranteed within the overall business, while real-time information characterizes how quickly the corresponding task must respond. Obtaining the importance and real-time information of the model inference task and of each concurrent task separately establishes, before the tasks enter subsequent resource allocation processing, a basis for prioritizing the tasks and for analyzing their demands. Extracting attribute information per task identifies the differences between tasks in advance and provides input for the subsequent task priority determination and resource demand analysis.

[0084] For model inference tasks, the importance and real-time information can be obtained by parsing the task attribute information. Similarly, for concurrent tasks, the importance and real-time information can be extracted by reading their respective task attribute information. These concurrent tasks can be communication tasks, audio / video processing tasks, sensor data processing tasks, or background data synchronization tasks, etc. Different types of concurrent tasks differ in terms of business continuity, latency requirements, and resource consumption tendencies; therefore, their corresponding importance and real-time information need to be obtained separately. By processing the task attribute information of model inference tasks and concurrent tasks separately, the subsequent task priority determination results can more accurately reflect the relative urgency and guarantee requirements of each task in the current business environment. This achieves differentiated processing of different task attribute sources, improving the targeting of subsequent task priority determination.

[0085] After obtaining importance and real-time information, a preset priority determination rule is used to obtain the task priorities corresponding to the model inference task and each concurrent task. The preset priority determination rule can be a rule that comprehensively evaluates the importance and real-time information, used to map different tasks to corresponding task priority results. Task priorities can be represented in a hierarchical manner, such as distinguishing between high, medium, and low priorities, or other priority representation methods that can characterize the relative order of tasks. In the operating scenario of terminal devices, communication tasks usually have high real-time requirements, model inference tasks usually need to balance processing performance and response requirements, and background data synchronization tasks have relatively lower requirements for immediate response. Therefore, task priorities corresponding to different tasks can be formed based on the combination of importance and real-time information. Through this process, subsequent resource allocation can no longer be statically differentiated based solely on task category, but can be dynamically differentiated based on the importance and timeliness requirements of tasks in the current scenario.
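
As an illustration of such a preset priority determination rule, the following minimal sketch maps importance and real-time information to a high, medium, or low priority. The 0 to 1 scales, the equal weights, the thresholds, and the names `TaskAttributes` and `task_priority` are assumptions made for this example and are not specified in this embodiment.

```python
from dataclasses import dataclass


@dataclass
class TaskAttributes:
    name: str
    importance: float  # assumed scale: 0.0 (low) to 1.0 (high)
    real_time: float   # assumed scale: 0.0 (relaxed) to 1.0 (strict)


def task_priority(attrs: TaskAttributes,
                  w_importance: float = 0.5,
                  w_real_time: float = 0.5) -> str:
    """Map importance and real-time information to a priority level.

    The weighted score and both thresholds are illustrative assumptions.
    """
    score = w_importance * attrs.importance + w_real_time * attrs.real_time
    if score >= 0.7:
        return "high"
    if score >= 0.4:
        return "medium"
    return "low"


tasks = [
    TaskAttributes("voice_call", importance=0.9, real_time=0.9),
    TaskAttributes("model_inference", importance=0.7, real_time=0.5),
    TaskAttributes("background_sync", importance=0.3, real_time=0.1),
]
priorities = {t.name: task_priority(t) for t in tasks}
```

With these assumed inputs, the communication task is ranked high, the model inference task medium, and the background data synchronization task low, which matches the ordering described above.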

[0086] Initial resource requirement information is used to characterize the preliminary estimate of the resource requirements of a corresponding task before adjustments based on current task status and resource status information. For model inference tasks, initial resource requirement information characterizes the basic resource requirements of the terminal device during inference execution; for concurrent tasks, it characterizes the basic resource requirements of the terminal device under normal operating conditions. Since different tasks differ in attributes, model inference tasks typically focus more on resource support capabilities during processing, while different types of concurrent tasks have different focuses on latency, continuity, or data processing, thus their initial resource requirement information will also differ. By obtaining initial resource requirement information based on task attribute information, a requirement base corresponding to the task's own attributes can be formed first, providing initial input for subsequent adjustments based on real-time operating status.

[0087] Next, the initial resource requirement information is corrected to determine the resource requirement information for the model inference task and each concurrent task. Task status information characterizes the current running state of the model inference task and each concurrent task, such as active, standby, or paused. Resource status information characterizes the current resource environment of the terminal device, such as the occupancy of heterogeneous computing resources, memory resources, and network resources, as well as the power supply status and temperature status. By incorporating task status information and resource status information into the correction, the resource requirement information is no longer a static estimate at the task attribute level; it also reflects the current operating environment of the terminal device and the current operating stage of each task. For example, when a concurrent task is active, its resource requirement information can be increased accordingly; when the terminal device is resource-constrained, power-limited, or running at a high temperature, the resource requirement information can be constrained and corrected according to task priority, so that high-priority tasks keep a high level of guarantee while the resource requirement information of low-priority tasks is appropriately reduced. The corrected resource requirement information is therefore closer to the actual operating conditions of the terminal device, changes dynamically with those conditions, and improves the accuracy of the subsequent multi-objective optimization model construction.
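
The correction described above can be sketched as a pair of multiplicative factors applied to the initial estimate. The factor tables, the single `device_constrained` flag, and the function name `correct_requirement` are hypothetical; this embodiment does not prescribe a particular correction formula.

```python
# Assumed scaling by task state: active tasks need more, paused tasks very little.
STATE_FACTOR = {"active": 1.2, "standby": 0.5, "paused": 0.1}
# Assumed scaling when the device is resource-, power-, or temperature-limited.
CONSTRAINED_FACTOR = {"high": 1.0, "medium": 0.8, "low": 0.5}


def correct_requirement(initial, task_state, priority, device_constrained):
    """Correct an initial per-resource requirement estimate.

    initial: dict mapping a resource name to the initially estimated amount.
    """
    factor = STATE_FACTOR[task_state]
    if device_constrained:
        # High-priority tasks keep their demand; lower priorities are reduced.
        factor *= CONSTRAINED_FACTOR[priority]
    return {resource: amount * factor for resource, amount in initial.items()}


initial = {"compute": 10.0, "memory": 100.0, "network": 5.0}
corrected = correct_requirement(initial, "active", "low", device_constrained=True)
```

Here an active but low-priority task on a constrained device ends up with 60% of its initial estimate, while the same task with high priority would keep the full active-state demand.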

[0088] Specifically, resource requirement information includes at least one of computing resource requirements, memory resource requirements, and network resource requirements. Computing resource requirements characterize the degree of computational processing power demanded by the corresponding task on the terminal device; memory resource requirements characterize the degree of data storage and caching space demanded by the corresponding task during operation; and network resource requirements characterize the degree of communication bandwidth demanded by the corresponding task during data transmission. For model inference tasks, resource requirement information may include at least one of computing resource requirements and memory resource requirements; for concurrent tasks requiring continuous data interaction or continuous transmission, resource requirement information may include network resource requirements. By classifying resource requirement information into at least one of the above requirement types, subsequent resource allocation processing can be modeled and adjusted separately for different resource types, providing a clear data foundation for setting resource allocation variables and generating resource allocation strategies in subsequent multi-objective optimization models.

[0089] In one embodiment, a multi-objective optimization model is constructed based on resource status information, task status information, task priority, and resource requirement information, including:

[0090] S231, Based on task priority and resource requirement information, determine the resource allocation variables corresponding to the model inference task and each concurrent task;

[0091] S232, Based on the task status information and resource requirement information, construct the model inference performance objective function and the concurrent task response objective function;

[0092] S233, Based on the resource status information, construct the device energy consumption objective function and determine the resource constraints of the multi-objective optimization model;

[0093] S234, Based on the resource allocation variables, the model inference performance objective function, the concurrent task response objective function, the device energy consumption objective function, and the resource constraints, construct the multi-objective optimization model.

[0094] In this embodiment, determining the resource allocation variables corresponding to the model inference task and each concurrent task is a process of transforming task scheduling requirements and resource consumption requirements into variable expressions that can participate in model computation. Resource allocation variables are used to characterize the resource acquisition and resource consumption status of the model inference task and each concurrent task on the terminal device, enabling the task priority relationship and resource demand relationship to be quantified and incorporated into the subsequent multi-objective optimization model. Since the model inference task and each concurrent task differ in task priority and resource demand information, determining corresponding resource allocation variables for different tasks allows the resource allocation variables to simultaneously reflect the task guarantee level and resource demand level.

[0095] Specifically, the resource allocation variables corresponding to the model inference task and each concurrent task can characterize their allocation results to different resource types in the terminal device, such as at least one of heterogeneous computing resource allocation, memory resource allocation, and network resource allocation. For tasks with high priority and large resource requirements, the corresponding resource allocation variables can reflect a high resource consumption tendency in subsequent model solving; for tasks with relatively low priority or small resource requirements, the corresponding resource allocation variables can reflect a relatively limited resource consumption tendency. In this way, the resource allocation variables not only reflect the resource requirement information itself, but also reflect the impact of task priority on the order and degree of resource acquisition, enabling a unified representation of the resource competition relationship between the model inference task and each concurrent task at the model level. By setting resource allocation variables separately for different tasks, a quantitative expression of the resource allocation object and allocation degree is achieved.

[0096] After determining the resource allocation variables, a model inference performance objective function and a concurrent task response objective function are constructed. The model inference performance objective function characterizes the performance changes of the model inference task under different resource allocation conditions, while the concurrent task response objective function characterizes the response changes of each concurrent task under different resource allocation conditions. The model inference performance objective function can be constructed around the inference speed and inference accuracy of the model inference task. For example, it can be represented by a combination of inference speed and inference accuracy, such as Maximize(f1) = (inference speed × accuracy). The concurrent task response objective function can be constructed around the response time and task waiting time of the concurrent task. For example, it can be represented by a combination of response time and task waiting time, such as Minimize(f2) = (response time + task waiting time). By constructing the model inference performance objective function and the concurrent task response objective function, a model-based expression of the task processing performance objective and the task response objective is achieved.

[0097] Based on this, a device energy consumption objective function is constructed to characterize the energy consumption changes of the terminal device under different resource allocation conditions. The device energy consumption objective function can be constructed around the usage of heterogeneous computing resources and changes in power supply status. For example, the device energy consumption objective function can be represented by a combination of heterogeneous computing resource utilization and power consumption, such as Minimize(f3) = (heterogeneous computing module utilization × battery consumption). Heterogeneous computing modules include CPU, GPU, and NPU. Resource constraints are used to limit the solution range of the multi-objective optimization model, ensuring that the resource allocation results meet the executable requirements of the terminal device under the current resource environment. Since resource status information can characterize the current occupancy, availability, and changing status of heterogeneous computing resources, memory resources, and network resources, and also reflect power supply and temperature status, resource constraints corresponding to the total resource amount, available resource range, power supply conditions, and temperature conditions can be determined based on the resource status information. By introducing the device energy consumption objective function and resource constraints together, the subsequent model solution can not only focus on the performance of the model inference task and the response of concurrent tasks, but also consider the execution feasibility and energy consumption control requirements of the terminal device under the current resource environment.
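
The three example objective functions and a capacity-type resource constraint can be written down as follows. This is a sketch only: the `Prediction` container, its field names, and `is_feasible` are hypothetical, and how the predicted quantities are derived from a candidate allocation is not specified in this embodiment.

```python
from dataclasses import dataclass


@dataclass
class Prediction:
    """Assumed predicted behaviour of the device under one candidate allocation."""
    inference_speed: float
    accuracy: float
    response_time: float
    waiting_time: float
    utilization: float          # heterogeneous computing module utilization
    battery_consumption: float


def f1(p: Prediction) -> float:
    """Model inference performance objective (to be maximized)."""
    return p.inference_speed * p.accuracy


def f2(p: Prediction) -> float:
    """Concurrent task response objective (to be minimized)."""
    return p.response_time + p.waiting_time


def f3(p: Prediction) -> float:
    """Device energy consumption objective (to be minimized)."""
    return p.utilization * p.battery_consumption


def is_feasible(allocation, capacity) -> bool:
    """Check that the summed allocation of every resource stays within capacity.

    allocation: {task: {resource: amount}}; capacity: {resource: limit}.
    """
    for resource, limit in capacity.items():
        used = sum(per_task.get(resource, 0.0) for per_task in allocation.values())
        if used > limit:
            return False
    return True


p = Prediction(20.0, 0.9, 0.05, 0.02, 0.6, 2.0)
objectives = (f1(p), f2(p), f3(p))
allocation = {"inference": {"memory": 600.0}, "video": {"memory": 300.0}}
feasible = is_feasible(allocation, {"memory": 1024.0})
```

Power supply and temperature constraints could be added to `is_feasible` in the same way, as further limits derived from the resource status information.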

[0098] After determining the resource allocation variables, model inference performance objective function, concurrent task response objective function, device energy consumption objective function, and resource constraints, a multi-objective optimization model is constructed based on these criteria. This multi-objective optimization model integrates model inference task processing performance, concurrent task response requirements, device energy consumption control, and resource usage limitations into a single solution object within a unified computational framework. This model simultaneously characterizes the correlation between "improved model inference performance," "improved concurrent task response," and "reduced device energy consumption," enabling a holistic analysis of the resource competition relationship between the model inference task and various concurrent tasks on the terminal device. By constructing a multi-objective optimization model based on resource allocation variables and multiple objective functions and resource constraints, subsequent optimization using a pre-defined evolutionary optimization algorithm can directly address the multi-objective resource allocation problem under the current operating conditions of the terminal device.

[0099] In one specific embodiment, the evolutionary optimization algorithm includes a genetic optimization algorithm or a non-dominated sorting genetic optimization algorithm. The genetic optimization algorithm is suitable for overall search and iterative optimization of resource allocation problems. This type of algorithm can perform evolutionary processes such as selection, crossover, and mutation around candidate resource allocation results, allowing the resource allocation results to gradually tend towards optimization during continuous iteration. For resource allocation problems in terminal devices, there is resource competition between the model inference task and various concurrent tasks, and the resource status information, task status information, task priority, and resource demand information of the terminal device will jointly affect the solution result. Therefore, using a genetic optimization algorithm can continuously adjust the candidate resource allocation results within a large search range, making the obtained resource allocation strategy more adaptable to the current operating state of the terminal device.

[0100] Non-dominated sorting genetic optimization algorithms are suitable for jointly solving multi-objective optimization models. Because the multi-objective optimization model simultaneously includes the model inference performance, concurrent task response, and device energy consumption objectives, and these objectives are often interdependent, the solution process must not only produce a single candidate solution but also identify the trade-offs among the objectives. Non-dominated sorting genetic optimization algorithms introduce non-dominated sorting into the evolution of candidate solutions, distinguishing candidate resource allocation results by their multi-objective performance and retaining candidate solutions that are balanced across the objectives. This type of algorithm can further determine the set of non-dominated solutions corresponding to the multi-objective solution results, providing a basis for subsequently determining the resource allocation strategy from the non-dominated solution set.

[0101] In application, the choice between genetic optimization algorithms and non-dominated sorting genetic optimization algorithms can be determined based on the solution requirements of the multi-objective optimization model. When the multi-objective optimization model emphasizes overall search and rapid iteration of resource allocation results, genetic optimization algorithms can be used; when the multi-objective optimization model emphasizes preserving balanced solutions among multiple objectives and performing joint trade-offs among multiple objectives, non-dominated sorting genetic optimization algorithms can be used.

[0102] Furthermore, a pre-defined evolutionary optimization algorithm is used to solve the multi-objective optimization model to obtain resource allocation strategies, including:

[0103] S231, Based on the resource allocation variables in the multi-objective optimization model, generate multiple resource allocation candidate strategies as an initial candidate strategy set;

[0104] S232, Based on each resource allocation candidate strategy, calculate the target evaluation results corresponding to each resource allocation candidate strategy through a multi-objective optimization model, and determine the fitness of each resource allocation candidate strategy based on the target evaluation results;

[0105] S233, Based on the fitness of each resource allocation candidate strategy, the initial candidate strategy set is sequentially subjected to selection, crossover, and mutation processing to obtain an updated candidate strategy set;

[0106] S234. If the evolutionary optimization algorithm is a non-dominated sorting genetic optimization algorithm, then the updated candidate strategy set is non-dominated sorted to determine the non-dominated solution set, and when the preset convergence condition is met or the preset number of iterations is reached, the resource allocation strategy is determined from the non-dominated solution set.

[0107] S235, If the evolutionary optimization algorithm is a genetic optimization algorithm, then when the preset convergence condition is met or the preset number of iterations is reached, the resource allocation strategy is determined from the updated candidate strategy set.

[0108] In this embodiment, resource allocation variables are used to characterize the resource acquisition status of the model inference task and each concurrent task on the terminal device. When generating resource allocation candidate strategies, multiple different candidate resource allocation results can be formed based on the value range, allocation ratio, or allocation combination corresponding to the resource allocation variables, and each candidate resource allocation result is represented as a corresponding resource allocation candidate strategy. For example, the initial candidate strategy set can be constructed using a random generation method, so that the values of different candidate strategies on the resource allocation variables are different, covering multiple possible resource allocation situations under the current operating conditions of the terminal device. By generating multiple resource allocation candidate strategies and forming an initial candidate strategy set, the subsequent solution process is not limited to a single resource allocation result, but can be compared and iteratively optimized among multiple candidate solutions.
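
A random initial candidate strategy set might be generated as in the sketch below. The encoding, in which each candidate holds a per-resource share for every task and the shares of one resource sum to 1, is an assumption chosen for illustration, as are the task and resource names.

```python
import random

TASKS = ["inference", "call", "sync"]
RESOURCES = ["compute", "memory", "network"]


def random_candidate(tasks, resources, rng):
    """Build one candidate: for each resource, random shares that sum to 1."""
    candidate = {}
    for resource in resources:
        raw = [rng.random() for _ in tasks]
        total = sum(raw)
        if total == 0.0:
            # Degenerate draw: fall back to equal shares.
            raw, total = [1.0] * len(tasks), float(len(tasks))
        candidate[resource] = {t: value / total for t, value in zip(tasks, raw)}
    return candidate


def initial_population(tasks, resources, size, seed=0):
    """Generate `size` random candidates; the seed makes the run repeatable."""
    rng = random.Random(seed)
    return [random_candidate(tasks, resources, rng) for _ in range(size)]


population = initial_population(TASKS, RESOURCES, size=8)
```

Normalizing the shares keeps every generated candidate within the total amount of each resource, so the capacity constraint holds from the first iteration.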

[0109] After obtaining the initial set of candidate strategies, the objective evaluation results corresponding to each resource allocation candidate strategy are calculated, and the fitness of each resource allocation candidate strategy is determined. The objective evaluation results characterize the performance of each resource allocation candidate strategy in the multi-objective optimization model, and the fitness characterizes the superiority or inferiority of the corresponding resource allocation candidate strategy in the current iteration. The multi-objective optimization model simultaneously includes the objective function of model inference performance, the objective function of concurrent task response, and the objective function of device energy consumption. Therefore, after substituting each resource allocation candidate strategy into the multi-objective optimization model, objective evaluation results corresponding to model inference task processing performance, concurrent task response, and terminal device energy consumption control can be obtained respectively. Furthermore, the fitness of each resource allocation candidate strategy can be determined based on the comprehensive situation of multiple objective evaluation results, so that the fitness can reflect the overall performance of the resource allocation candidate strategy under multi-objective constraints. Through this process, the resource allocation candidate strategies can be transformed from simple combinations of variables into candidate solutions with evaluation results and differences in superiority or inferiority.
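
One simple way to combine the objective evaluation results into a single fitness value is a weighted sum, sketched below. The weighted-sum form, the weights, and the assumption that the three objective values have already been normalized to comparable ranges are all choices made for this example; this embodiment only states that fitness reflects the combined objective evaluation results.

```python
def fitness(objectives, weights=(1.0, 1.0, 1.0)):
    """Scalar fitness from (f1, f2, f3); higher is better.

    f1 is maximized, so it counts positively; f2 and f3 are minimized,
    so they count negatively.
    """
    f1, f2, f3 = objectives
    w1, w2, w3 = weights
    return w1 * f1 - w2 * f2 - w3 * f3


# Assumed, already-normalized objective values for three candidate strategies.
candidates = {
    "A": (0.9, 0.4, 0.5),
    "B": (0.7, 0.1, 0.2),
    "C": (0.5, 0.3, 0.6),
}
scores = {name: fitness(obj) for name, obj in candidates.items()}
best = max(scores, key=scores.get)
```

In this example candidate B scores highest even though candidate A has the best inference performance, because A pays for that performance with worse response and energy values.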

[0110] Next, selection, crossover, and mutation processes are performed. Selection filters out resource allocation candidate strategies with better fitness from the current candidate strategy set. Crossover combines some resource allocation information from multiple candidate strategies. Mutation adjusts some variables in the candidate strategies, introducing new candidate strategy variations. Selection can employ tournament or roulette wheel selection methods to increase the retention probability of resource allocation candidate strategies with better fitness. Crossover reorganizes resource allocation variables from different candidate strategies to form new candidate strategies that incorporate characteristics of multiple strategies. Mutation randomly modifies some resource allocation variables to increase the diversity of the candidate strategy set. By sequentially executing selection, crossover, and mutation processes, the candidate strategy set continuously evolves towards a better direction during iteration, while preventing premature convergence to local results. By updating the initial candidate strategy set through evolution, candidate strategies are progressively optimized, improving the adaptability of subsequent resource allocation strategy solutions to complex operating scenarios of terminal devices.
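
The selection, crossover, and mutation steps can be sketched as follows, with each candidate strategy flattened into a list of allocation values in the range 0 to 1. Tournament selection is one of the two selection methods named above; uniform crossover, the bounded random mutation step, and all parameter values are assumptions for this example.

```python
import random


def tournament_select(population, fitnesses, rng, k=2):
    """Pick k random candidates and return the one with the best fitness."""
    contenders = rng.sample(range(len(population)), k)
    best = max(contenders, key=lambda i: fitnesses[i])
    return population[best]


def uniform_crossover(parent_a, parent_b, rng):
    """Take each variable from one of the two parents with equal probability."""
    return [a if rng.random() < 0.5 else b for a, b in zip(parent_a, parent_b)]


def mutate(candidate, rng, rate=0.1, step=0.1):
    """Randomly perturb some variables, clamping the result to [0, 1]."""
    mutated = []
    for value in candidate:
        if rng.random() < rate:
            value = min(1.0, max(0.0, value + rng.uniform(-step, step)))
        mutated.append(value)
    return mutated


def next_generation(population, fitnesses, rng):
    """Produce an updated candidate set of the same size."""
    children = []
    while len(children) < len(population):
        a = tournament_select(population, fitnesses, rng)
        b = tournament_select(population, fitnesses, rng)
        children.append(mutate(uniform_crossover(a, b, rng), rng))
    return children


rng = random.Random(1)
population = [[rng.random() for _ in range(4)] for _ in range(6)]
fitnesses = [sum(c) for c in population]  # placeholder fitness for the demo
offspring = next_generation(population, fitnesses, rng)
```

If candidates must satisfy a sum constraint, such as shares adding up to 1, a repair or renormalization step would be needed after mutation; that step is omitted here.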

[0111] The Non-Dominated Sorting Genetic Algorithm (NSGA-II) can analyze the dominance relationships among multiple candidate strategies in multi-objective optimization scenarios and retain the strategies that are balanced across the objectives. Specifically, when performing non-dominated sorting on the updated candidate strategy set, the algorithm compares the objective evaluation results of the resource allocation candidate strategies pairwise: one candidate strategy dominates another when it is better on at least one objective and not inferior on any other objective. The candidate strategies that are not dominated by any other candidate strategy constitute the non-dominated solution set. This set represents the balanced solutions formed while solving the multi-objective optimization model, and it retains multiple optional resource allocation results that trade off model inference task processing performance, concurrent task response requirements, and terminal device energy consumption control. When a preset convergence condition is met or a preset number of iterations is reached, a resource allocation strategy can be determined from the non-dominated solution set as the output strategy under the current operating conditions of the terminal device. The preset convergence condition can characterize a state in which the candidate strategy set has stabilized over multiple consecutive iterations, and the preset number of iterations limits the maximum number of rounds of the iterative solution process.
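
The dominance test and the extraction of the non-dominated solution set can be sketched as below. To compare objectives uniformly, each candidate is represented as the tuple (-f1, f2, f3) so that every component is minimized; this representation and the sample values are assumptions. The sketch extracts only the first non-dominated front and does not reproduce the full NSGA-II procedure, which also ranks the remaining fronts and uses crowding distance.

```python
def dominates(a, b):
    """True if a is no worse than b on every objective and better on at least one.

    All objectives are assumed to be expressed as values to minimize.
    """
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def non_dominated(points):
    """Return the candidates that no other candidate dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Each tuple is (-f1, f2, f3) for one candidate strategy (assumed values).
points = [
    (-18.0, 0.07, 1.2),
    (-15.0, 0.05, 1.0),
    (-14.0, 0.08, 1.3),
    (-18.0, 0.09, 1.5),
]
front = non_dominated(points)
```

In this example the first two candidates remain: the first has the best inference performance, the second has the best response and energy values, and neither dominates the other.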

[0112] Genetic optimization algorithms, also known as genetic algorithms (GA), rely on the continuous evolution of a candidate strategy set through selection, crossover, and mutation to gradually obtain resource allocation results with better fitness. In this case, non-dominated sorting of the updated candidate strategy set is unnecessary. Instead, the candidate strategy with the best current fitness is determined from the fitness of each resource allocation candidate strategy in the updated set and is output as the resource allocation strategy. Using the preset convergence condition and the preset number of iterations as output conditions gives the solution process sufficient room for iterative optimization while guaranteeing that an executable resource allocation result is output once either condition is met. The resource allocation strategy determined in this way provides the basis for the subsequent dynamic allocation and adjustment of terminal device resources.
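
The two output conditions can be sketched as a loop that stops either after a maximum number of iterations or when the best fitness has not improved for several consecutive iterations. The `patience` and `tol` parameters, the toy one-variable fitness, and the toy `evolve` step are assumptions for this example. The sketch also keeps the best candidate seen so far, which is a small variation on selecting from the final updated set.

```python
import random


def run_ga(fitness_fn, population, evolve, max_iters=50, patience=5, tol=1e-6):
    """Iterate until the iteration limit or until the best fitness stalls."""
    best = max(population, key=fitness_fn)
    stale = 0
    for _ in range(max_iters):
        population = evolve(population)
        challenger = max(population, key=fitness_fn)
        if fitness_fn(challenger) > fitness_fn(best) + tol:
            best, stale = challenger, 0
        else:
            stale += 1
        if stale >= patience:  # assumed convergence condition
            break
    return best


rng = random.Random(0)


def toy_fitness(x):
    """Toy objective with its maximum at x = 0.3."""
    return -(x - 0.3) ** 2


def toy_evolve(pop):
    """Toy update step: small random perturbation clamped to [0, 1]."""
    return [min(1.0, max(0.0, x + rng.uniform(-0.05, 0.05))) for x in pop]


start = [rng.random() for _ in range(10)]
best = run_ga(toy_fitness, start, toy_evolve)
```

In a full implementation the `evolve` argument would be the selection, crossover, and mutation step, and `fitness_fn` would evaluate a candidate through the multi-objective optimization model.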

[0113] In summary, by first generating an initial set of candidate strategies, then calculating the objective evaluation results and fitness based on a multi-objective optimization model, and further updating the candidate strategy set through selection, crossover, and mutation processes, an iterative solution process for the resource allocation problem of terminal devices can be formed. Furthermore, by combining different types of evolutionary optimization algorithms, in the scenario of non-dominated sorting genetic optimization, the non-dominated solution set can be determined through non-dominated sorting; or in the scenario of genetic optimization, the resource allocation strategy can be directly determined from the updated candidate strategy set. This allows the generation process of the resource allocation strategy to match the solution requirements of the multi-objective optimization model. Iterative optimization and output from the multi-objective optimization model to the resource allocation strategy are realized, providing an executable strategy generation mechanism for the resource coordination and allocation between model inference tasks and concurrent tasks in terminal devices.

[0114] In one embodiment, resources of the terminal device are dynamically allocated and adjusted according to a resource allocation strategy to complete resource allocation for the model inference task and various concurrent tasks, including:

[0115] S241, Based on the resource allocation strategy, determine the device resource allocation parameters corresponding to the model inference task and each concurrent task.

[0116] Specifically, based on the solution results regarding the resource occupancy ratio, resource usage intensity, resource adjustment direction, and resource guarantee relationship for different tasks in the resource allocation strategy, device resource allocation parameters are determined for the model inference task and each concurrent task. These device resource allocation parameters reflect the differences in resource acquisition, resource priority, and dynamic resource adjustment range among different tasks, enabling subsequent resource control processes of the terminal device to be executed directly based on these parameters.

[0117] S242, Based on the device resource allocation parameters, adjust the heterogeneous computing resources of the terminal device so as to adjust the computing resources of the model inference task and each concurrent task.

[0118] Specifically, heterogeneous computing resources are used to characterize a set of resources in a terminal device that perform different computing functions, such as at least one of a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), and a Neural Processing Unit (NPU). When adjusting heterogeneous computing resources, the operating frequency, computing unit allocation status, and the proportion of heterogeneous computing resources occupied by a task can be adjusted based on device resource allocation parameters. This ensures that the model inference task and each concurrent task receive computing resource support appropriate to their task priority and resource requirements under the current operating conditions. For example, when the device resource allocation parameters for a model inference task indicate a high demand for computing power, the degree of heterogeneous computing resource occupancy it receives can be increased accordingly. Similarly, when a concurrent task needs to prioritize meeting response requirements, its computing resource occupancy can be adjusted based on the corresponding device resource allocation parameters. This approach allows the competition for computing resources between different tasks to be concretely implemented as executable heterogeneous computing resource adjustment results on the terminal device side, improving the coordination of computing resource utilization in multi-task parallel operation scenarios.

[0119] S243, based on the device resource allocation parameters, allocate and reclaim the memory resources of the terminal device so as to adjust the memory resources of the model inference task and each concurrent task.

[0120] Specifically, memory resources represent the set of resources available for data storage, caching, and retention of intermediate results during task execution on a terminal device. Model inference tasks typically involve loading model parameters, maintaining intermediate features, and temporarily storing inference results. Each concurrent task also consumes a certain amount of memory resources during execution. Therefore, it is necessary to dynamically allocate and reclaim memory resources based on device resource allocation parameters. For example, based on the device resource allocation parameters corresponding to the model inference task and each concurrent task, appropriate memory resource capacity can be allocated to different tasks. When the task state changes, the task execution phase switches, or memory resource usage changes, the occupied memory resources can be reclaimed and reallocated to avoid long-term inefficient or invalid memory usage on the terminal device. By dynamically allocating and reclaiming memory resources, the memory resource usage status is updated according to the task execution status, improving the memory resource utilization efficiency of the terminal device in a multi-tasking environment.
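The allocate-and-reclaim behavior described above can be sketched as a small budget tracker. This is an illustrative sketch only: the class `MemoryBudget`, the MiB unit, and the reclaim-before-allocate ordering are assumptions, not details taken from the application.

```python
class MemoryBudget:
    """Tracks per-task memory grants against a fixed total (in MiB)."""

    def __init__(self, total_mib):
        self.total_mib = total_mib
        self.grants = {}

    @property
    def free_mib(self):
        return self.total_mib - sum(self.grants.values())

    def apply(self, targets):
        """Move grants toward `targets` (task -> MiB). Reductions and removals
        are applied first so that reclaimed memory can serve the increases."""
        # Reclaim: tasks that are no longer listed, or whose target dropped.
        for task in list(self.grants):
            target = targets.get(task, 0)
            if target < self.grants[task]:
                if target == 0:
                    del self.grants[task]
                else:
                    self.grants[task] = target
        # Allocate: grow in the order given, limited by the free memory.
        for task, target in targets.items():
            current = self.grants.get(task, 0)
            if target > current:
                grant = min(target - current, self.free_mib)
                if current + grant > 0:
                    self.grants[task] = current + grant
        return dict(self.grants)
```

Calling `apply` again whenever the task state or execution phase changes releases memory from tasks whose targets shrank before handing it to tasks whose targets grew.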

[0121] S244, based on the device resource allocation parameters, allocate bandwidth from the network resources of the terminal device so as to adjust the network resources of the model inference task and each concurrent task.

[0122] Specifically, network resources characterize the communication bandwidth resources available to terminal devices during data upload, download, and inter-task data transfer. Different tasks have varying degrees of dependence on network resources. Communication tasks, audio / video processing tasks, and some tasks requiring continuous data interaction typically have high network resource requirements, while model inference tasks may also consume certain network resources in scenarios involving model updates, external data acquisition, or reporting inference results. Therefore, network bandwidth can be allocated to model inference tasks and concurrent tasks separately based on device resource allocation parameters, ensuring that tasks with high real-time requirements or those more sensitive to data transmission at the current stage receive network resource support commensurate with their operational needs. When allocating bandwidth, the range of network bandwidth available to different tasks at the current moment, bandwidth priority, and bandwidth usage intensity can be adjusted according to device resource allocation parameters. This allows terminal devices to coordinate the network resource usage relationship between model inference tasks and concurrent tasks even with limited network resources, improving the rationality of network resource allocation in multi-task concurrent operation scenarios.
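One possible way to realize priority-aware bandwidth allocation is a weighted max-min fair split, sketched below. The application does not name this technique; it is substituted here for illustration. The function name, the per-task `weights` (standing in for bandwidth priority), and the kbit/s unit are hypothetical.

```python
def allocate_bandwidth(demands, weights, capacity):
    """Weighted max-min fair split of `capacity` (e.g. in kbit/s).
    `demands` and `weights` are dicts keyed by task name; weights must be
    positive. No task receives more than it asks for."""
    alloc = {task: 0.0 for task in demands}
    active = {task for task in demands if demands[task] > 0}
    remaining = float(capacity)
    while active and remaining > 1e-9:
        total_w = sum(weights[t] for t in active)
        # Tasks whose demand fits within their weighted share of what is left.
        satisfied = {t for t in active
                     if demands[t] - alloc[t] <= remaining * weights[t] / total_w}
        if not satisfied:
            # Everyone left is bottlenecked: split the rest by weight and stop.
            for t in active:
                alloc[t] += remaining * weights[t] / total_w
            break
        for t in satisfied:
            remaining -= demands[t] - alloc[t]
            alloc[t] = float(demands[t])
        active -= satisfied
    return alloc
```

With this scheme, bandwidth left unused by a lightly loaded task is redistributed by weight among the tasks that still need more.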

[0123] Figure 3 is a schematic flowchart of a strategy update method provided in an embodiment of this application. Based on the above embodiments, as shown in Figure 3, the method includes:

[0124] S31, based on the resource adjustment of the model inference task and each concurrent task, continuously collect resource status information of the terminal device, and collect task status information of the model inference task and the at least one concurrent task respectively, to obtain resource allocation execution result information.

[0125] Specifically, the resource allocation execution results information is used to characterize the operational status of the terminal device and the execution status of the model inference task and each concurrent task after the resource adjustment action is implemented. This allows subsequent feedback optimization to be analyzed based on the actual operational situation after resource adjustment, rather than solely relying on the prediction results from the strategy solving stage. After the model inference task and each concurrent task obtain the corresponding resources, the monitoring and processing process in the terminal device continues to collect data on the changes in the occupancy of heterogeneous computing resources, memory resources, and network resources. Simultaneously, it acquires the task status changes of the model inference task and at least one concurrent task after resource adjustment, establishing a correspondence between the resource-side execution results and the task-side execution results. Continuous rather than one-time data collection allows the resource allocation execution results information to reflect the actual effect of the resource allocation strategy over a period of time, providing a practical data foundation for subsequent resource allocation effectiveness evaluation.

[0126] S32, generate resource allocation effect evaluation results based on the resource allocation execution result information, wherein the resource allocation effect evaluation results include at least one of the following: task response effect evaluation results, resource utilization effect evaluation results, and energy consumption effect evaluation results.

[0127] Specifically, the task response effect evaluation results can reflect the response speed, task waiting status, or operational smoothness of the model inference task and each concurrent task after resource adjustment; the resource utilization effect evaluation results can reflect the utilization of the heterogeneous computing resources, memory resources, and network resources of the terminal device after resource adjustment; and the energy consumption effect evaluation results can reflect the power consumption of the terminal device after resource adjustment and the energy usage related to changes in resource occupancy. Since the resource allocation execution result information includes both resource status information and task status information, resource allocation effectiveness can be evaluated comprehensively along the dimensions of task response, resource utilization, and energy consumption, so that the resource allocation effect evaluation results reflect how the resource allocation strategy performed in each of these aspects. Converting the resource allocation execution result information into resource allocation effect evaluation results turns status monitoring data into effect evaluation results and provides an evaluation basis for determining the subsequent feedback adjustment parameters.
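As a rough illustration of turning monitoring samples into the three kinds of evaluation results, the sketch below averages per-sample measurements. The sample keys `latency_ms`, `utilization`, and `power_mw`, and the choice of a simple mean, are assumptions made for this example; the application does not fix particular metrics.

```python
from statistics import mean

def evaluate_allocation(samples):
    """samples: list of dicts collected after a resource adjustment, each with
    the hypothetical keys 'latency_ms', 'utilization' (0..1) and 'power_mw'."""
    if not samples:
        raise ValueError("no samples collected")
    return {
        # Task response effect: average observed task latency.
        "task_response_ms": mean(s["latency_ms"] for s in samples),
        # Resource utilization effect: average utilization across samples.
        "resource_utilization": mean(s["utilization"] for s in samples),
        # Energy consumption effect: average power draw.
        "energy_mw": mean(s["power_mw"] for s in samples),
    }
```

Averaging over a window of samples, rather than reading a single value, matches the continuous collection described in S31.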

[0128] S33, determine the feedback adjustment parameters based on the resource allocation effect evaluation results and preset target conditions.

[0129] Specifically, preset target conditions characterize the task response goals, resource utilization goals, and energy consumption control goals that terminal devices expect to achieve during resource allocation. Feedback adjustment parameters characterize the parameter correction amount, direction, or rules used when adjusting the multi-objective optimization model and evolutionary optimization algorithm. The resource allocation effect evaluation results are compared with the preset target conditions to determine whether the current resource allocation strategy meets the expected requirements in terms of task response, resource utilization, and energy consumption. When there is a deviation between the resource allocation effect evaluation results and the preset target conditions, feedback adjustment parameters are determined based on the deviation. If the task response effect evaluation results do not meet the preset requirements, the feedback adjustment parameters can be determined in the direction of improving the relevant task guarantee level; if the resource utilization effect evaluation results show that some resources are underutilized or resource occupancy is unbalanced, the feedback adjustment parameters can be determined in the direction of optimizing the resource allocation ratio; if the energy consumption effect evaluation results deviate from the preset target, the feedback adjustment parameters can be determined in the direction of reducing the resource consumption of terminal devices. Through this processing method, the feedback adjustment parameters can directly reflect the difference between the resource allocation strategy execution results and the preset target conditions. This realizes the transformation of resource allocation execution results into feedback adjustment basis, providing a parameter foundation for subsequent adjustment of model parameters and solution parameters.
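The comparison against preset target conditions can be sketched as a signed relative deviation per metric. The metric keys, the tolerance band, and the use of relative deviation as the feedback adjustment parameter are assumptions for illustration only.

```python
def feedback_adjustments(evaluation, targets, tolerance=0.05):
    """Return a signed relative deviation per metric, or 0.0 when the measured
    value lies within the tolerance band around its (nonzero) target.
    A positive value means the measured value is above the target."""
    adjustments = {}
    for metric, target in targets.items():
        deviation = (evaluation[metric] - target) / target
        adjustments[metric] = deviation if abs(deviation) > tolerance else 0.0
    return adjustments
```

For example, a measured response time of 60 ms against a 50 ms target yields a deviation of about 0.2, which could then steer the adjustment toward a higher guarantee level for the affected task.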

[0130] S34, based on the feedback adjustment parameters, adjust the model parameters of the multi-objective optimization model and / or the solution parameters of the evolutionary optimization algorithm to obtain the adjusted multi-objective optimization model and the adjusted evolutionary optimization algorithm.

[0131] Specifically, when adjusting the model parameters of a multi-objective optimization model, adjustments can be made to the relative balance between the model inference performance objective function, the concurrent task response objective function, and the device energy consumption objective function, as well as resource constraints. When adjusting the solution parameters of an evolutionary optimization algorithm, adjustments can be made to the selection, crossover, mutation, and convergence determination parameters during the evolution of the candidate strategy set. By applying the feedback adjustment parameters to both the model and algorithm layers, the subsequent re-solution process can reflect the actual execution effect at the objective expression level and improve the adaptability of strategy search and updates at the solution process level. By adjusting the model parameters of the multi-objective optimization model and / or the solution parameters of the evolutionary optimization algorithm, the feedback information is transformed into model and algorithm corrections, enabling the subsequent resource allocation strategy generation process to more closely reflect the actual operating state of the terminal device.
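A hedged sketch of applying feedback at both the model layer and the algorithm layer follows. It assumes the balance between objective functions is expressed as weights summing to 1 and that the only solution parameter adjusted is a mutation rate; the weight keys, the `step` factor, and the 0.5 cap are hypothetical choices, not taken from the application.

```python
def adjust_parameters(weights, ga_params, feedback, step=0.5):
    """weights: objective weights summing to 1 (hypothetical keys 'inference',
    'response', 'energy'). feedback: signed relative deviations keyed the same
    way, where a positive value means that objective missed its target."""
    # Model layer: raise the weight of objectives that missed, then renormalize.
    raised = {k: w * (1.0 + step * max(feedback.get(k, 0.0), 0.0))
              for k, w in weights.items()}
    total = sum(raised.values())
    new_weights = {k: v / total for k, v in raised.items()}
    # Algorithm layer: search more broadly when any objective is off target
    # (an assumed heuristic), capped at a mutation rate of 0.5.
    worst = max((abs(v) for v in feedback.values()), default=0.0)
    new_ga = dict(ga_params)
    new_ga["mutation_rate"] = min(0.5, ga_params["mutation_rate"] * (1.0 + worst))
    return new_weights, new_ga
```

Renormalizing keeps the weights comparable from one round to the next, so that raising one objective implicitly lowers the others.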

[0132] S35, solve the adjusted multi-objective optimization model using the adjusted evolutionary optimization algorithm to generate a new resource allocation strategy.

[0133] Specifically, the resource allocation problem is further optimized to ensure that the new resource allocation strategy reflects the feedback information from the previous round of resource allocation. Since the adjusted multi-objective optimization model already reflects the difference between the resource allocation results and the preset objective conditions, and the adjusted evolutionary optimization algorithm has been adapted to the solution process, the newly obtained resource allocation strategy is more adaptable to the current resource and task status information of the terminal device. By continuously refining the resource allocation processing logic through real-time monitoring results, the resource allocation strategy can be dynamically updated as the device's operating status, task operating status, and resource usage change. This improves the adaptability of the resource allocation strategy to the dynamic operating environment of the terminal device.

[0134] S36, update the resources of the model inference task and each concurrent task according to the new resource allocation strategy.

[0135] Specifically, since the resource status information, task status information, power supply status, and temperature status in the terminal device are constantly changing, updating the resources of the model inference task and each concurrent task according to the new resource allocation strategy can keep the resource allocation results dynamically consistent with the current operating status of the terminal device, rather than using the results of the previous strategy indefinitely. By updating the resources of the model inference task and each concurrent task according to the new resource allocation strategy, the feedback optimization results are transformed into actual resource update results, improving the sustainability and adaptability of the coordinated allocation of resources between the model inference task and each concurrent task in the terminal device.

[0136] In another embodiment, determining resource requirement information further includes: predicting the trend of resource requirement changes over a future period based on historical monitoring data and current task requirement information; correcting the current resource requirement information based on the trend of resource requirement changes; and determining the resource requirement information for the model inference task and each concurrent task.

[0137] In this embodiment, historical monitoring data may include resource status information and task status information collected by the terminal device during historical operation, such as changes in the occupancy of heterogeneous computing resources, changes in the occupancy of memory resources, changes in the usage of network resources, and changes in the task status of model inference tasks and concurrent tasks over different time periods. Current task demand information may originate from the resource demand analysis results obtained from the model inference task and concurrent tasks at the current moment, used to characterize the degree of demand for at least one of the computing, memory, and network resources by different tasks at the current moment. With historical monitoring data available, the resource demand trend over a future period can be predicted based on the resource usage patterns and task operation patterns reflected in the historical monitoring data, combined with the current task demand information. This resource demand trend can characterize changes such as increasing resource demand, decreasing resource demand, or maintaining relatively stable resource demand, or it can characterize the direction and intensity of changes in different resource types over a future period. In this way, subsequent resource allocation processing can no longer be based solely on the current instantaneous resource demand, but can instead predict future resource demand changes by combining historical operating trajectories.

[0138] After obtaining the resource demand trend, the current resource demand information is revised based on this trend to determine the resource demand information for the model inference task and each concurrent task. By incorporating the resource demand trend into the revision process of the current resource demand information, the final determined resource demand information can simultaneously reflect the current task operation status and the resource demand changes in the near future, improving the adaptability of the resource demand information to subsequent resource allocation processing. Specifically, when the resource demand trend indicates an upward trend in the resource demand of a certain task in the near future, the resource demand information for that task can be increased accordingly; when the resource demand trend indicates a downward trend in the resource demand of a certain task in the near future, the resource demand information for that task can be decreased accordingly; when the resource demand trend indicates that the resource demand remains relatively stable in the near future, the current resource demand information can be maintained or only slightly revised. Through this revision process, the resource demand information for the model inference task and each concurrent task can more closely reflect the actual resource demand changes of the terminal device in the near future.
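The trend-based revision described above can be illustrated with a least-squares slope over equally spaced historical samples. The application does not specify a prediction method; linear extrapolation, the `horizon` of three intervals, and the `stable_band` threshold are assumptions chosen for this sketch.

```python
def predict_trend(history):
    """Least-squares slope of equally spaced samples (units per interval)."""
    n = len(history)
    if n < 2:
        return 0.0
    x_mean = (n - 1) / 2
    y_mean = sum(history) / n
    num = sum((i - x_mean) * (y - y_mean) for i, y in enumerate(history))
    den = sum((i - x_mean) ** 2 for i in range(n))
    return num / den

def correct_demand(current, history, horizon=3, stable_band=0.02):
    """Shift the current demand by the extrapolated trend; leave it unchanged
    when the slope relative to the current demand is within `stable_band`."""
    slope = predict_trend(history)
    if current > 0 and abs(slope) / current <= stable_band:
        return current
    return max(0.0, current + slope * horizon)
```

A rising history increases the revised demand, a falling history decreases it, and a flat history leaves the current value in place, mirroring the three cases in the paragraph above.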

[0139] In one specific embodiment, determining a resource allocation strategy includes: performing front screening on the non-dominated solution set to determine the Pareto front solution set; evaluating each candidate resource allocation strategy in the Pareto front solution set based on the current device status information and user demand information to determine the resource allocation strategy.

[0140] In this embodiment, the Pareto front solution set corresponds to the Pareto Front, which represents a set of candidate resource allocation strategies that represent a representative balance among multiple optimization objectives. By performing front screening on the non-dominated solution set, candidate resource allocation strategies that are more suitable for participating in the final strategy selection in terms of multi-objective performance can be further selected from the non-dominated solution set. This ensures that the determination of subsequent resource allocation strategies no longer faces all candidate results, but focuses on a balanced solution set that takes into account the model inference task processing performance, concurrent task response requirements, and equipment energy consumption control requirements.

[0141] Specifically, when performing front screening on the non-dominated solution set, the comprehensive performance of different candidate resource allocation strategies in the multi-objective optimization model can be identified based on the objective evaluation results corresponding to each candidate resource allocation strategy. Candidate resource allocation strategies with good coordination among multiple objectives are retained. This screening process can determine a set of candidate resource allocation strategies that balance model inference task processing performance, concurrent task response requirements, and equipment energy consumption control. This ensures that the obtained Pareto front solution set reflects the balance in the multi-objective solution results while avoiding the retention of candidate resource allocation strategies that are clearly detrimental to the current operating scenario. Through this processing method, the Pareto front solution set can serve as the basis for subsequent strategy evaluation based on the current equipment status information and user demand information, improving the targeting of subsequent strategy evaluations.

[0142] After determining the Pareto front solution set, the candidate resource allocation strategies within the set are evaluated. The current device state information characterizes the resource operating environment and execution conditions of the device at the current moment, while user demand information characterizes the user's focus on model inference task performance, concurrent task response, and device energy consumption control in the current business scenario. The device state information may change under different operating scenarios, such as varying resource occupancy levels, power supply status, or temperature. Simultaneously, user demand information may also differ; for example, some scenarios prioritize model inference task performance, others prioritize concurrent task response speed, and still others prioritize device energy consumption control. Based on this, the adaptability of each candidate resource allocation strategy in the Pareto front solution set can be evaluated, characterizing the applicability of each strategy under the constraints of the current device state information and user demand information, and thus determining the resource allocation strategy that best matches the current operating scenario.
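The screening and selection steps can be sketched as a non-dominated filter followed by a preference-weighted choice. The sketch assumes every objective is minimized and ordered as (inference latency, concurrent-task response latency, energy); the `preference` weights stand in for the current device state and user demand information. The weighted sum over min-max normalized objectives is an assumed selection rule, not one stated in the application.

```python
def dominates(a, b):
    """True if objective vector `a` is no worse than `b` in every objective
    and strictly better in at least one (all objectives are minimized)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(candidates):
    """candidates: dict mapping strategy name -> tuple of objective values."""
    return {name: vec for name, vec in candidates.items()
            if not any(dominates(other, vec)
                       for other_name, other in candidates.items()
                       if other_name != name)}

def select_strategy(front, preference):
    """Pick the front member with the lowest preference-weighted sum of
    min-max normalized objectives."""
    dims = range(len(preference))
    lo = [min(v[i] for v in front.values()) for i in dims]
    hi = [max(v[i] for v in front.values()) for i in dims]

    def score(vec):
        return sum(w * ((vec[i] - lo[i]) / (hi[i] - lo[i]) if hi[i] > lo[i] else 0.0)
                   for i, w in enumerate(preference))

    return min(front, key=lambda name: score(front[name]))
```

Changing only `preference`, for instance weighting energy more heavily when the battery is low, selects a different member of the same Pareto front without re-solving the model.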

[0143] Figure 4 is a schematic structural diagram of the device resource allocation system provided in an embodiment of this application. As shown in Figure 4, the device resource allocation system 40 provided in this embodiment includes:

[0144] The information acquisition module 401 is used to collect resource status information of the terminal device, and to collect task status information of the model inference task and at least one concurrent task respectively.

[0145] The information analysis module 402 is used to obtain the task priorities of the model inference task and at least one concurrent task, and to determine the resource requirement information of the model inference task and each concurrent task through resource requirement analysis.

[0146] The information processing module 403 is used to construct a multi-objective optimization model based on resource status information, task status information, task priority and resource demand information, and to solve the multi-objective optimization model using a preset evolutionary optimization algorithm to obtain a resource allocation strategy.

[0147] The allocation execution module 404 is used to dynamically allocate and adjust the resources of the terminal device according to the resource allocation strategy, so as to complete the resource allocation for the model inference task and each concurrent task.

[0148] The device resource allocation system 40 provided in this embodiment can execute the method provided in the above method embodiment. Its implementation principle and technical effect are similar, and will not be described in detail here.
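The four-module structure can be pictured as a simple pipeline. In the sketch below, each module is represented by a caller-supplied function; the class name, method name, and function signatures are hypothetical and serve only to show how data might flow from module 401 to module 404.

```python
class DeviceResourceAllocator:
    """Wires four callables that stand in for modules 401 to 404."""

    def __init__(self, acquire, analyze, solve, execute):
        self.acquire = acquire  # 401: returns (resource_state, task_state)
        self.analyze = analyze  # 402: returns (priorities, requirements)
        self.solve = solve      # 403: returns a resource allocation strategy
        self.execute = execute  # 404: applies the strategy to the device

    def run_once(self):
        """Run one pass: collect, analyze, solve, then apply the strategy."""
        resource_state, task_state = self.acquire()
        priorities, requirements = self.analyze(task_state, resource_state)
        strategy = self.solve(resource_state, task_state, priorities, requirements)
        self.execute(strategy)
        return strategy
```

Keeping the modules as injected functions makes it straightforward to swap the solver or rerun the pass whenever new status information is collected.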

[0149] Figure 5 is a schematic structural diagram of an electronic device provided in an embodiment of this application. As shown in Figure 5, the electronic device 50 provided in this embodiment includes at least one processor 501 and a memory 502. Optionally, the electronic device 50 further includes a communication component 503. The processor 501, memory 502, and communication component 503 are connected via a bus 504.

[0150] In a specific implementation, at least one processor 501 executes computer execution instructions stored in memory 502, causing at least one processor 501 to perform the above-described method.

[0151] The specific implementation process of processor 501 can be found in the above method embodiments, and its implementation principle and technical effect are similar. It will not be repeated here.

[0152] In the above embodiments, it should be understood that the processor can be a Central Processing Unit (CPU), or other general-purpose processors, digital signal processors (DSPs), application-specific integrated circuits (ASICs), etc. The general-purpose processor can be a microprocessor or any conventional processor. The steps of the method disclosed in this invention can be directly implemented by a hardware processor, or implemented by a combination of hardware and software modules within the processor.

[0153] The memory may include random access memory (RAM) and may also include non-volatile memory (NVM), such as at least one disk storage device.

[0154] The bus can be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, or an Extended Industry Standard Architecture (EISA) bus, etc. Buses can be categorized as address buses, data buses, control buses, etc. For ease of illustration, the buses shown in the accompanying drawings are not limited to a single bus or a single type of bus.

[0155] This application also provides a computer program product, including a computer program that, when executed by a processor, implements the above-described method.

[0156] This application provides a chip, which includes at least one processor. The processor is used to run program instructions to execute the model inference method involved in the above method embodiments.

[0157] This application provides a chip module on which a computer program is stored. When the computer program is executed by the chip module, it implements the model inference method involved in the above method embodiments.

[0158] This application also provides a computer-readable storage medium storing computer-executable instructions, which, when executed by a processor, implement the above-described method.

[0159] The aforementioned readable storage medium can be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic storage, flash memory, magnetic disk, or optical disk. The readable storage medium can be any available medium accessible to a general-purpose or special-purpose computer.

[0160] Those skilled in the art will understand that all or part of the steps of the above-described method embodiments can be implemented by hardware related to program instructions. The aforementioned program can be stored in a computer-readable storage medium. When executed, the program performs the steps of the above-described method embodiments; and the aforementioned storage medium includes various media capable of storing program code, such as ROM, RAM, magnetic disks, or optical disks.

[0161] Finally, it should be noted that other embodiments of the invention will readily occur to those skilled in the art upon consideration of the specification and practice of the invention disclosed herein. This invention is intended to cover any variations, uses, or adaptations of the invention that follow the general principles of the invention and include common knowledge or customary techniques in the art not disclosed herein, and is not limited to the precise structures described above and shown in the accompanying drawings, and various modifications and changes can be made without departing from its scope. The scope of the invention is limited only by the appended claims.

Claims

1. A resource allocation method for a terminal device, characterized in that the method comprises: collecting resource status information of the terminal device, and collecting task status information of a model inference task and at least one concurrent task respectively; obtaining task priorities of the model inference task and the at least one concurrent task, and determining resource requirement information of the model inference task and each of the concurrent tasks through resource requirement analysis; constructing a multi-objective optimization model based on the resource status information, the task status information, the task priorities, and the resource requirement information, and solving the multi-objective optimization model using a preset evolutionary optimization algorithm to obtain a resource allocation strategy; and dynamically allocating and adjusting the resources of the terminal device according to the resource allocation strategy, so as to complete the resource allocation for the model inference task and each of the concurrent tasks.

2. The method according to claim 1, characterized in that, The step of obtaining the task priorities of the model inference task and the at least one concurrent task, and determining the resource requirement information of the model inference task and each of the concurrent tasks through resource requirement analysis, includes: By parsing the task attribute information of the model inference task, the importance information and real-time information corresponding to the model inference task are obtained; and by reading the task attribute information of each concurrent task, the importance information and real-time information corresponding to each concurrent task are extracted. Based on the importance information and real-time information, a preset priority determination rule is used to obtain the task priorities corresponding to the model inference task and each concurrent task. Based on the task attribute information, obtain the initial resource requirement information of the model inference task and each of the concurrent tasks; Based on the task status information and the resource status information, the initial resource requirement information is corrected to determine the resource requirement information corresponding to the model inference task and each of the concurrent tasks; The resource requirement information includes at least one of computing resource requirements, memory resource requirements, and network resource requirements.

3. The method according to claim 1, characterized in that, The construction of a multi-objective optimization model based on the resource status information, the task status information, the task priority, and the resource requirement information includes: Based on the task priority and resource requirement information, determine the resource allocation variables corresponding to the model inference task and each of the concurrent tasks; Based on the task status information and the resource requirement information, construct the model inference performance objective function and the concurrent task response objective function; Based on the resource status information, a device energy consumption objective function is constructed, and based on the resource status information, the resource constraints of the multi-objective optimization model are determined. Based on the resource allocation variables, the model inference performance objective function, the concurrent task response objective function, the device energy consumption objective function, and the resource constraints, the multi-objective optimization model is constructed.

4. The method according to claim 3, characterized in that, The evolutionary optimization algorithm includes a genetic optimization algorithm or a non-dominated sorting genetic optimization algorithm; Accordingly, a preset evolutionary optimization algorithm is used to solve the multi-objective optimization model to obtain a resource allocation strategy, including: Based on the resource allocation variables in the multi-objective optimization model, multiple resource allocation candidate strategies are generated as an initial candidate strategy set. Based on each of the resource allocation candidate strategies, the target evaluation results corresponding to each resource allocation candidate strategy are calculated through the multi-objective optimization model, and the fitness of each resource allocation candidate strategy is determined based on the target evaluation results. Based on the fitness of each resource allocation candidate strategy, the initial candidate strategy set is sequentially subjected to selection, crossover, and mutation processes to obtain an updated candidate strategy set. If the evolutionary optimization algorithm is a non-dominated sorting genetic optimization algorithm, then the updated candidate strategy set is non-dominated sorted to determine the non-dominated solution set, and when the preset convergence condition is met or the preset number of iterations is reached, the resource allocation strategy is determined from the non-dominated solution set. Alternatively, if the evolutionary optimization algorithm is a genetic optimization algorithm, then when the preset convergence condition is met or the preset number of iterations is reached, the resource allocation strategy is determined from the updated candidate strategy set.

5. The method according to claim 1, characterized in that, The step of dynamically allocating and adjusting the resources of the terminal device according to the resource allocation strategy to complete the resource allocation for the model inference task and each of the concurrent tasks includes: Based on the resource allocation strategy, determine the device resource allocation parameters corresponding to the model inference task and each of the concurrent tasks; Based on the device resource allocation parameters, the heterogeneous computing resources of the terminal device are adjusted to adjust the computing resources of the model inference task and each of the concurrent tasks; And / or based on the device resource allocation parameters, allocate and reclaim the memory resources of the terminal device to adjust the memory resources of the model inference task and each of the concurrent tasks; And / or based on the device resource allocation parameters, allocate bandwidth to the network resources of the terminal device to adjust the network resources of the model inference task and each of the concurrent tasks.

6. The method according to claim 1, characterized in that the method further includes: based on the resource adjustments made to the model inference task and each of the concurrent tasks, continuing to collect the resource status information of the terminal device and the task status information of the model inference task and of the at least one concurrent task, respectively, to obtain resource allocation execution result information; generating a resource allocation effect evaluation result based on the resource allocation execution result information, wherein the resource allocation effect evaluation result includes at least one of the following: a task response effect evaluation result, a resource utilization effect evaluation result, and an energy consumption effect evaluation result; determining feedback adjustment parameters based on the resource allocation effect evaluation result and preset target conditions; adjusting, based on the feedback adjustment parameters, the model parameters of the multi-objective optimization model and/or the solution parameters of the evolutionary optimization algorithm to obtain an adjusted multi-objective optimization model and an adjusted evolutionary optimization algorithm; solving the adjusted multi-objective optimization model by using the adjusted evolutionary optimization algorithm to generate a new resource allocation strategy; and updating, based on the new resource allocation strategy, the resources of the model inference task and each of the concurrent tasks.
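
The feedback adjustment of claim 6 could, for example, be sketched as below. This is only one possible reading under stated assumptions: evaluation results and target conditions are taken to be scores in [0, 1] where higher is better, the model parameters are taken to be objective weights, and the solution parameter is taken to be a mutation rate. The function name `feedback_adjust`, the metric names, the `step` size, and the 1.5 growth factor capped at 0.5 are hypothetical.

```python
def feedback_adjust(evaluation, targets, weights, mutation_rate, step=0.1):
    """Derive adjusted objective weights and mutation rate from evaluation gaps.

    evaluation, targets: {metric: score in [0, 1]}, higher is better (assumed).
    weights: {metric: objective weight} used by the optimization model (assumed).
    Returns (new_weights, new_mutation_rate).
    """
    new_weights = dict(weights)
    missed = 0
    for metric, target in targets.items():
        gap = target - evaluation.get(metric, 0.0)  # feedback adjustment parameter
        if gap > 0:
            # Emphasize objectives that fell short of their target.
            new_weights[metric] = new_weights.get(metric, 0.0) + step * gap
            missed += 1
    total = sum(new_weights.values())
    if total > 0:
        # Renormalize so the weights again sum to 1.
        new_weights = {k: v / total for k, v in new_weights.items()}
    # Explore more aggressively when any target was missed (assumed heuristic).
    new_rate = min(0.5, mutation_rate * 1.5) if missed else mutation_rate
    return new_weights, new_rate

# Hypothetical usage: response time missed its target, energy met it.
adjusted_weights, adjusted_rate = feedback_adjust(
    evaluation={"response": 0.6, "energy": 0.9},
    targets={"response": 0.8, "energy": 0.8},
    weights={"response": 0.5, "energy": 0.5},
    mutation_rate=0.1,
)
```

The adjusted weights and mutation rate would then be passed to the next solving round to generate the new resource allocation strategy.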

7. A device resource allocation system, characterized by comprising: an information acquisition module, configured to collect resource status information of the terminal device, and to collect task status information of the model inference task and of at least one concurrent task, respectively; an information analysis module, configured to obtain the task priorities of the model inference task and the at least one concurrent task, and to determine the resource requirement information of the model inference task and each of the concurrent tasks through resource requirement analysis; an information processing module, configured to construct a multi-objective optimization model based on the resource status information, the task status information, the task priorities, and the resource requirement information, and to solve the multi-objective optimization model by using a preset evolutionary optimization algorithm to obtain a resource allocation strategy; and an allocation execution module, configured to dynamically allocate and adjust the resources of the terminal device according to the resource allocation strategy, so as to complete the resource allocation for the model inference task and each of the concurrent tasks.

8. An electronic device, characterized by comprising: a memory and a processor; the memory stores computer-executable instructions; the processor executes the computer-executable instructions stored in the memory, causing the processor to perform the method according to any one of claims 1 to 6.

9. A computer-readable storage medium, characterized in that the computer-readable storage medium stores computer-executable instructions which, when executed by a processor, implement the method according to any one of claims 1 to 6.

10. A computer program product, characterized by comprising a computer program that, when executed by a processor, implements the method according to any one of claims 1 to 6.