An inference task scheduling method, device, equipment and storage medium
By combining multi-dimensional data with decisions from a network status monitoring machine, the inference tasks of the in-vehicle intelligent system are scheduled rationally. This resolves the contradiction between the vehicle's limited computing power and its dependence on the cloud network, achieves optimal scheduling in complex driving environments, and improves system performance and resource utilization efficiency.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- VOYAH AUTOMOBILE TECH CO LTD
- Filing Date
- 2026-03-18
- Publication Date
- 2026-06-19
AI Technical Summary
In existing vehicle intelligent systems, the scheduling of inference tasks by artificial intelligence models is difficult to achieve optimal scheduling in complex and ever-changing driving environments. This makes it difficult to resolve the contradiction between limited computing power on the vehicle side and reliance on cloud networks, and makes it difficult to meet the accuracy and real-time requirements of complex scenarios.
By integrating network communication data, processor load data, and driving status data, the task scheduling score is determined. Combined with the current network status of the network status monitoring unit, differentiated decisions are made to reasonably schedule inference tasks to the vehicle-side computing unit or the cloud computing unit to generate the optimal inference result.
While ensuring the real-time performance of safety-critical tasks, it improves the utilization rate of cloud computing power, enhances the inference accuracy of non-safety-critical tasks and the overall performance of the vehicle-mounted intelligent system, and strengthens the intelligence level of inference task scheduling.
Figure CN122240311A_ABST
Description
Technical Field
[0001] This application relates to the field of vehicle task scheduling technology, and in particular to an inference task scheduling method, apparatus, device and storage medium.
Background Technology
[0002] With the widespread adoption of advanced driver assistance systems (ADAS) and autonomous driving systems in vehicles, higher demands are being placed on the inference capabilities of artificial intelligence (AI) models. Currently, AI models face a core contradiction when processing inference tasks: limited on-board (local) computing power and reliance on cloud networks. Specifically, on-board computing units have limited computing power, allowing them to run simplified models. While offering low latency, their inference accuracy is insufficient for complex scenarios. Cloud computing units, on the other hand, can run complete models and achieve high-precision inference, but their performance is highly dependent on network connection quality, making real-time performance difficult to guarantee. Therefore, how to rationally schedule inference tasks to either on-board or cloud computing units has become crucial for improving the performance of in-vehicle intelligent systems.
[0003] In related technologies, network parameters are typically monitored in real time to determine the vehicle's current network connectivity. When network quality is good, the inference task is scheduled to the cloud computing unit; otherwise, it falls back to the vehicle's on-board computing unit. However, scheduling inference tasks on network parameters alone struggles to achieve optimal scheduling in complex and ever-changing driving environments.
Summary of the Invention
[0004] This application provides a method, apparatus, device, and storage medium for scheduling inference tasks, which can be used to achieve optimal scheduling of vehicle-mounted inference tasks in complex and dynamic driving environments.
[0005] In a first aspect, this application provides a method for scheduling inference tasks, including:
[0006] The task scheduling score is determined based on the vehicle's network communication data, processor load data, and driving status data.
[0007] If the safety level of the inference task is a non-safety-critical level, a scheduling decision is made based on the current network status determined by the network status monitoring machine and the task scheduling score to determine the target computing unit to execute the inference task. The current network status reflects the network quality of the vehicle, and the target computing unit is a vehicle-side computing unit or a cloud computing unit.
[0008] The inference task is scheduled to be executed by the target computing entity to generate inference results.
[0009] In one possible implementation, determining the task scheduling score based on the vehicle's network communication data, processor load data, and driving status data includes:
[0010] Calculate the network quality value based on the network communication data;
[0011] Calculate the computing power margin value based on the processor load data;
[0012] Based on the driving status data, the urgency value of the driving scenario is determined;
[0013] The task scheduling score is obtained by weighting the network quality value, the computing power margin value, and the driving scenario urgency value.
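The weighting step can be illustrated with a minimal sketch. It assumes all three inputs are already scaled to [0, 1]. The function name, the weight values, and the plain weighted sum are illustrative assumptions: this section does not state the weights, nor whether a higher urgency or margin value should raise or lower the score, so the weights (including their signs) are left as parameters.

```python
def task_scheduling_score(network_quality, compute_margin, scene_urgency,
                          weights=(0.5, 0.3, 0.2)):
    """Weighted combination of the three component values.

    The weights are hypothetical placeholders; a deployment might use a
    negative weight for a term that should reduce cloud suitability.
    """
    w_net, w_compute, w_urgency = weights
    return (w_net * network_quality
            + w_compute * compute_margin
            + w_urgency * scene_urgency)


# Example call with made-up component values.
score = task_scheduling_score(0.72, 0.5, 0.2)
```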
[0014] In one possible implementation, the network communication data includes signal reception power and signal reception quality, and the calculation of the network quality value based on the network communication data includes:
[0015] The signal received power and the signal received quality are normalized to obtain normalized power values and normalized quality values;
[0016] Based on preset signal reception power weights and signal reception quality weights, the normalized power value and the normalized quality value are weighted and summed to obtain the network quality value.
[0017] In one possible implementation, the processor load data includes processor utilization and memory utilization, and the step of calculating the computing power margin value based on the processor load data includes:
[0018] Calculate the processor idle rate based on the processor utilization rate;
[0019] Calculate the memory idle rate based on the memory utilization rate;
[0020] Based on preset processor computing power weights and memory computing power weights, the processor idle rate and the memory idle rate are weighted and summed to obtain the computing power margin value.
[0021] In one possible implementation, the driving status data includes at least one of vehicle speed, gear position, and Advanced Driver Assistance System (ADAS) status. Determining the urgency value of the driving scenario based on the driving status data includes:
[0022] Retrieve predefined scene urgency mapping rules;
[0023] Based on the scenario urgency mapping rule, determine the driving scenario urgency value corresponding to the vehicle speed, gear position, and ADAS status.
[0024] In one possible implementation, the scenario urgency mapping rule includes:
[0025] When the ADAS status is a warning-triggered state, the urgency value of the driving scenario is the first urgency value;
[0026] When the vehicle speed is greater than or equal to the first speed threshold, the urgency value of the driving scenario is the second urgency value.
[0027] If the vehicle speed is less than the first speed threshold and greater than the second speed threshold, and the ADAS state is active, the urgency value of the driving scenario is the third urgency value.
[0028] When the vehicle speed is less than or equal to the second speed threshold and the gear is reverse or parking, the urgency value of the driving scenario is the fourth urgency value.
[0029] Wherein, the first speed threshold is greater than the second speed threshold, and the first urgency value, the second urgency value, the third urgency value, and the fourth urgency value decrease sequentially.
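The mapping rule above can be sketched as a small lookup function. The threshold values, the four urgency values, the string labels for gear and ADAS status, the rule priority (checked in the order listed), and the default returned when no rule matches are all assumptions made for illustration; the text only fixes the ordering of the thresholds and urgency values.

```python
def scene_urgency(speed_kmh, gear, adas_status,
                  v1=80.0, v2=10.0,
                  urgency_values=(1.0, 0.8, 0.5, 0.2),
                  default=0.5):
    """Map driving status to a scenario urgency value.

    v1 > v2 are the first and second speed thresholds (hypothetical values).
    urgency_values holds the first to fourth urgency values, decreasing.
    """
    u1, u2, u3, u4 = urgency_values
    if adas_status == "warning":                     # ADAS warning triggered
        return u1
    if speed_kmh >= v1:                              # high speed
        return u2
    if v2 < speed_kmh < v1 and adas_status == "active":
        return u3
    if speed_kmh <= v2 and gear in ("R", "P"):       # reversing or parked
        return u4
    return default                                   # case not covered by the rule
```

For example, `scene_urgency(120, "D", "active")` falls under the high-speed rule, while `scene_urgency(0, "P", "off")` falls under the low-speed parking rule.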
[0030] In one possible implementation, the non-safety-critical level includes a driving assistance level and an experience enhancement level. If the safety level of the inference task is a non-safety-critical level, making a scheduling decision based on the current network state determined by the network status monitoring machine and the task scheduling score to determine the target computing entity for executing the inference task includes:
[0031] If the task scheduling score is greater than or equal to the first score threshold, and the current network state is in full-function state, then the target computing entity for executing the inference task is determined to be the cloud computing unit.
[0032] If the task scheduling score is greater than or equal to the first score threshold, and the current network state is a restricted state, then the target computing entity for executing the inference task of the driving assistance level is determined to be the vehicle-side computing unit, and the target computing entity for executing the inference task of the experience enhancement level is determined to be the cloud computing unit.
[0033] In one possible implementation, the method further includes:
[0034] If the task scheduling score is less than the first score threshold and greater than the second score threshold, and the current network state is either the full-function state or the restricted state, then the target computing entity for executing the inference task of the driving assistance level is determined to be the vehicle-side computing unit, and the target computing entity for executing the inference task of the experience enhancement level is determined to be the cloud computing unit, wherein the first score threshold is greater than the second score threshold.
[0035] If the safety level is the safety-critical level, or the task scheduling score is less than or equal to the second score threshold, or the current network state is the local state, then the target computing entity for executing the inference task is determined to be the vehicle-side computing unit.
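The decision rules in [0031] to [0035] can be condensed into one function, shown here as a minimal sketch. The labels L1/L2/L3 (safety-critical, driving assistance, experience enhancement) and G1/G2/G3 (full-function, restricted, local) follow the labels used later in the description; the function name, the return strings, and the threshold values `t1` and `t2` are hypothetical.

```python
def choose_target(safety_level, score, network_state, t1=0.7, t2=0.4):
    """Return "vehicle" or "cloud" for one inference task.

    t1 > t2 are the first and second score thresholds (assumed values).
    """
    # [0035]: safety-critical task, low score, or local-only network.
    if safety_level == "L1" or score <= t2 or network_state == "G3":
        return "vehicle"
    # [0031]: high score on a full-function network, any non-critical task.
    if score >= t1 and network_state == "G1":
        return "cloud"
    # [0032] and [0034]: remaining cases split by task level.
    return "cloud" if safety_level == "L3" else "vehicle"
```

Written this way, every combination of level, score, and network state falls into exactly one branch, which makes it easy to check the rules for gaps.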
[0036] In one possible implementation, if the target computing entity is the cloud computing unit, scheduling the inference task to the target computing entity for execution and generating inference results includes:
[0037] Set a timeout duration for the inference task;
[0038] If no inference result is received from the cloud computing unit within the timeout duration, the inference task is rescheduled to the vehicle-side computing unit for execution.
[0039] In one possible implementation, setting a timeout duration for the inference task includes:
[0040] If the safety level of the inference task is the driving assistance level, then a first timeout duration is set for the inference task;
[0041] If the safety level of the inference task is the experience enhancement level, then a second timeout duration is set for the inference task, wherein the first timeout duration is less than the second timeout duration.
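The timeout-and-fallback behaviour can be sketched with Python's standard `concurrent.futures` module. The timeout values, the function names, and the callables `cloud_infer` and `vehicle_infer` are hypothetical stand-ins; the text only requires that the driving assistance timeout be shorter than the experience enhancement timeout.

```python
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout

# Assumed timeout durations in seconds (first < second).
TIMEOUTS_S = {"L2": 0.3, "L3": 2.0}


def run_with_fallback(task, safety_level, cloud_infer, vehicle_infer,
                      executor, timeouts=TIMEOUTS_S):
    """Try the cloud first; fall back to the vehicle side on timeout.

    Returns (result, source) where source is "cloud" or "vehicle".
    """
    future = executor.submit(cloud_infer, task)
    try:
        return future.result(timeout=timeouts[safety_level]), "cloud"
    except FutureTimeout:
        future.cancel()  # best effort; an already running call is not interrupted
        return vehicle_infer(task), "vehicle"
```

A caller would create one `ThreadPoolExecutor` up front and reuse it across tasks. Errors raised by `cloud_infer` other than a timeout propagate to the caller in this sketch; a real system would likely fall back on those as well.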
[0042] In one possible implementation, the network status monitoring machine is configured with status switching rules, which include:
[0043] When the network quality value is less than the first network threshold and the duration reaches the first preset duration, or when the network type is switched from 5G to 4G, the current network state is switched from full-function state to restricted state.
[0044] When the network quality value is less than the second network threshold and the duration reaches the first preset duration, or when the network type is switched from 4G to no service, the current network state is switched from the restricted state to the local state, and the first network threshold is greater than the second network threshold.
[0045] When the network quality value is greater than or equal to the second network threshold and the duration reaches the first preset duration, and the network type is switched from no service to 4G, then after waiting for the second preset duration, the current network state is switched from the local state to the restricted state.
[0046] When the network quality value is greater than or equal to the first network threshold and the duration reaches the first preset duration, and the network type is switched from 4G to 5G, then after waiting for the second preset duration, the current network state is switched from the restricted state to the full-function state.
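The switching rules in [0043] to [0046] describe a three-state machine with debounced transitions, sketched below. This is a simplified illustration under several assumptions: the class name, threshold and duration values, and network type strings are hypothetical; a network type change is approximated by checking the current type; and "waiting for the second preset duration" is modelled as requiring the recovery condition to hold for `d1 + d2` in total.

```python
from dataclasses import dataclass
from typing import Optional

FULL, RESTRICTED, LOCAL = "G1", "G2", "G3"


@dataclass
class NetworkStateMonitor:
    n1: float = 0.7   # first network threshold (assumed), n1 > n2
    n2: float = 0.4   # second network threshold (assumed)
    d1: float = 3.0   # first preset duration in seconds (assumed)
    d2: float = 5.0   # second preset duration in seconds (assumed)
    state: str = FULL
    _down_since: Optional[float] = None
    _up_since: Optional[float] = None

    def update(self, now: float, quality: float, net_type: str) -> str:
        """Feed one sample; net_type is "5G", "4G" or "none"."""
        if self.state == FULL:
            forced = net_type != "5G"            # e.g. 5G -> 4G
            down, up = quality < self.n1, False
        elif self.state == RESTRICTED:
            forced = net_type == "none"          # 4G -> no service
            down = quality < self.n2
            up = quality >= self.n1 and net_type == "5G"
        else:  # LOCAL
            forced, down = False, False
            up = quality >= self.n2 and net_type in ("4G", "5G")

        # Track how long each condition has held continuously.
        self._down_since = (now if self._down_since is None
                            else self._down_since) if down else None
        self._up_since = (now if self._up_since is None
                          else self._up_since) if up else None

        if forced or (down and now - self._down_since >= self.d1):
            self.state = RESTRICTED if self.state == FULL else LOCAL
            self._down_since = self._up_since = None
        elif up and now - self._up_since >= self.d1 + self.d2:
            self.state = RESTRICTED if self.state == LOCAL else FULL
            self._down_since = self._up_since = None
        return self.state
```

Downgrades react quickly (immediately on a network type drop), while upgrades need a longer stable period, which avoids rapid flapping between states when the signal hovers near a threshold.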
[0047] In a second aspect, this application provides an inference task scheduling apparatus, the apparatus comprising:
[0048] The first processing module is used to determine the task scheduling score based on the vehicle's network communication data, processor load data, and driving status data.
[0049] The scheduling decision module is used to make a scheduling decision based on the current network status determined by the network status monitoring machine and the task scheduling score if the safety level of the inference task is a non-safety-critical level, and to determine the target computing entity to execute the inference task. The current network status reflects the network quality of the vehicle, and the target computing entity is a vehicle-side computing unit or a cloud computing unit.
[0050] The second processing module is used to schedule the inference task to the target computing entity for execution and generate inference results.
[0051] In a third aspect, this application provides an electronic device, including: a memory and a processor;
[0052] The memory stores computer-executable instructions;
[0053] The processor executes the computer-executable instructions stored in the memory to implement the method described in any of the first aspects.
[0054] In a fourth aspect, this application provides a vehicle, including: a vehicle body, and the electronic device described in the third aspect.
[0055] In a fifth aspect, this application provides a computer-readable storage medium storing computer-executable instructions, which, when executed by a processor, are used to implement the method described in any of the first aspects above.
[0056] In a sixth aspect, this application provides a computer program product, including a computer program that, when executed by a processor, implements the method described in any of the first aspects above.
[0057] This application provides a method, apparatus, device, and storage medium for scheduling inference tasks. The method includes determining a task scheduling score based on vehicle network communication data, processor load data, and driving status data. If the inference task's safety level is a non-safety-critical level, a scheduling decision is made based on the current network status determined by a network status monitoring machine and the task scheduling score, determining whether the target computing entity for executing the inference task is an on-board computing unit or a cloud computing unit. The inference task is then scheduled to the target computing entity for execution, generating inference results. This method constructs the task scheduling score by integrating multi-dimensional dynamic data and combines it with the network status monitoring machine's hierarchical management of network quality. It implements differentiated vehicle-cloud scheduling strategies for inference tasks of different safety levels, matching each task to a suitable computing entity. While ensuring the real-time processing of safety-critical tasks, it makes full use of cloud computing power to improve the inference accuracy of non-safety-critical tasks, enhancing the overall performance and resource utilization efficiency of the in-vehicle intelligent system and strengthening the intelligence level of inference task scheduling.
Attached Figure Description
[0058] The accompanying drawings, which are incorporated in and form part of this specification, illustrate embodiments consistent with this application and, together with the description, serve to explain the principles of this application.
[0059] Figure 1 is a schematic diagram of an application scenario provided in an embodiment of this application;
[0060] Figure 2 is a flowchart of an embodiment of the inference task scheduling method provided in this application;
[0061] Figure 3 is a flowchart of Embodiment 2 of the inference task scheduling method provided in this application;
[0062] Figure 4 is a schematic diagram of network state switching provided in an embodiment of this application;
[0063] Figure 5 is a flowchart of Embodiment 3 of the inference task scheduling method provided in this application;
[0064] Figure 6 is a system architecture diagram for edge-cloud collaborative inference task scheduling provided in an embodiment of this application;
[0065] Figure 7 is a schematic diagram of the structure of the inference task scheduling apparatus provided in an embodiment of this application;
[0066] Figure 8 is a schematic diagram of the structure of an electronic device provided in an embodiment of this application.
[0067] The accompanying drawings illustrate specific embodiments of this application, which will be described in more detail below. These drawings and descriptions are not intended to limit the scope of the concept in any way, but rather to illustrate the concept of this application to those skilled in the art through reference to particular embodiments.
Detailed Implementation
[0068] Exemplary embodiments will now be described in detail, examples of which are illustrated in the accompanying drawings. When the following description relates to the drawings, unless otherwise indicated, the same numbers in different drawings denote the same or similar elements. The embodiments described in the following exemplary embodiments do not represent all embodiments consistent with this application. Rather, they are merely examples of apparatuses and methods consistent with some aspects of this application as detailed in the appended claims.
[0069] As intelligent vehicles evolve towards Advanced Driver Assistance Systems (ADAS) and smart cockpits, the demand for inference capabilities of AI models in in-vehicle AI systems is growing exponentially. For example, inference tasks such as forward collision warning and lane departure detection require real-time processing of sensor data and the generation of decisions, while inference tasks such as voice interaction, navigation route planning, and entertainment content generation rely on the complex inference capabilities of AI models.
[0070] Currently, AI models face a core contradiction when processing inference tasks: limited on-vehicle (local) computing power and reliance on cloud networks. The core hardware of the on-vehicle computing unit is the on-vehicle processor, which, constrained by size, power consumption, and cost in in-vehicle scenarios, typically maintains a computing power level between 10 and 100 trillion operations per second (TOPS). This only supports simplified models with relatively few parameters (e.g., 1-7B parameters). While offering significant advantages in low data processing latency, it has obvious limitations in inference accuracy. On the other hand, cloud computing units (such as servers or server clusters deployed in the cloud) can stably run complete models with 30-70B or more parameters, achieving high-precision inference operations. However, their performance is highly dependent on network connection quality, making real-time performance difficult to guarantee. Therefore, how to rationally schedule inference tasks to either on-vehicle or cloud computing units has become crucial for improving the performance of in-vehicle intelligent systems.
[0071] In related technologies, inference task scheduling often employs a single-dimensional decision-making approach, relying primarily on network connection quality as the core criterion. This involves monitoring network parameters such as signal strength and signal-to-interference-plus-noise ratio (SINR) to achieve simple switching between on-vehicle and cloud computing units. However, scheduling inference tasks on network parameters alone is too simplistic. It fails to consider the impact of other factors, such as processor load data and driving status data, on scheduling decisions, and it does not differentiate between inference tasks of different safety levels. Consequently, it struggles to achieve optimal scheduling of inference tasks in complex and ever-changing driving environments.
[0072] To address the aforementioned issues, the inventors considered establishing a joint scheduling mechanism that integrates network quality, computing power margin, and driving scenarios. This aims to improve cloud computing power utilization while ensuring the real-time performance of safety-critical tasks. Based on this, after numerous experiments, the inventors discovered that a task scheduling score can be determined based on vehicle network communication data, processor load data, and driving status data to comprehensively evaluate the suitability of scheduling inference tasks to cloud computing units at any given moment. Furthermore, for non-safety-critical inference tasks, the current network status determined by a network status monitoring machine is combined with the task scheduling score to comprehensively decide whether the target computing unit for the inference task is the vehicle-side computing unit or the cloud computing unit. The inference task is then scheduled to the determined target computing unit for execution, ultimately generating the inference result. This achieves optimal scheduling of inference tasks in complex and ever-changing driving environments. This application proposes an inference task scheduling method that achieves optimal scheduling of non-safety-critical inference tasks in complex and ever-changing driving environments through multi-dimensional data fusion and the combination of current network status determined by a network status monitoring machine.
[0073] Figure 1 is a schematic diagram illustrating an application scenario provided in an embodiment of this application. As shown in Figure 1, an electronic device deployed in the vehicle can make a joint decision based on the task scheduling score and the current network status determined by the network status monitoring machine, and allocate inference tasks to the vehicle-side computing unit or the cloud computing unit for execution.
[0074] The electronic device is the decision-making entity for allocating inference tasks on the vehicle side. It can be set up independently or integrated with the vehicle-side computing unit into the vehicle computing platform (VCP).
[0075] The vehicle-side computing unit, as the computing execution entity on the vehicle side, is scheduled and controlled by electronic devices. Its core hardware is the vehicle-side processor, including neural processing unit (NPU), graphics processing unit (GPU), etc.
[0076] A cloud computing unit refers to a computing node or server cluster deployed in a remote cloud. It establishes a wireless communication connection with electronic devices in a vehicle through a cellular network (5G or 4G) to provide remote computing power support for inference tasks.
[0077] The technical solution of this application and how it solves the above-mentioned technical problems will be described in detail below with specific embodiments. These specific embodiments can be combined with each other, and the same or similar concepts or processes may not be described again in some embodiments. The embodiments of this application will be described below with reference to the accompanying drawings.
[0078] Figure 2 is a flowchart illustrating an embodiment of the inference task scheduling method provided in this application. As shown in Figure 2, the method includes:
[0079] S201. Determine the task scheduling score based on the vehicle's network communication data, processor load data, and driving status data.
[0080] The execution subject in this application embodiment can be an electronic device or an inference task scheduling device within an electronic device. The inference task scheduling device can be implemented in software or a combination of software and hardware. The inference task scheduling device can be a processor within an electronic device. For ease of understanding, the following description will use an electronic device as the execution subject.
[0081] In this step, the electronic device can acquire network communication data, processor load data, and driving status data in each sampling period. It calculates the network quality value based on the network communication data, the computing power margin value based on the processor load data, and the driving scenario urgency value based on the driving status data. Then, it performs a weighted calculation on the network quality value, computing power margin value, and driving scenario urgency value to obtain the task scheduling score.
[0082] The sampling period refers to the time interval at which the electronic device acquires and updates the aforementioned data. It can be preset according to the actual application scenario or dynamically adjusted according to the vehicle's driving status. The electronic device can periodically acquire network communication data, processor load data, and driving status data according to this sampling period to ensure that the task scheduling score can reflect the vehicle's current operating status in real time.
[0083] In one specific implementation, network communication data may include Reference Signal Received Power (RSRP) and Reference Signal Received Quality (RSRQ); processor load data may include processor utilization and memory utilization; driving status data may include at least one of vehicle speed, gear position, and ADAS status.
[0084] Optionally, processor utilization can specifically refer to NPU load or GPU load.
[0085] In one alternative implementation, the network communication data may further include the signal-to-interference-plus-noise ratio (SINR), and the driving status data may further include the vehicle steering angle.
[0086] S202. If the safety level of the inference task is a non-safety-critical level, a scheduling decision is made based on the current network status determined by the network status monitoring machine and the task scheduling score to determine the target computing entity for executing the inference task.
[0087] In this step, for inference tasks whose safety level is non-safety-critical, the electronic device can make a scheduling decision by combining the current network status determined by the network status monitoring machine with the task scheduling score obtained in the previous step, in order to determine the target computing entity to perform the inference task. Here, the current network status reflects the vehicle's network quality, and the target computing entity is either the vehicle-side computing unit or the cloud computing unit.
[0088] Specifically, the network status monitoring machine can be implemented in software or a combination of software and hardware, and can be integrated into the processor or communication components of the electronic device.
[0089] In one specific implementation, inference tasks can be divided into three levels according to a predefined task safety level table. The first level is the safety-critical level (L1), which includes forward collision warning, lane departure detection, and driver fatigue monitoring; the second level is the driving assistance level (L2), which includes voice command understanding, navigation route planning, and traffic sign recognition; and the third level is the experience enhancement level (L3), which includes music recommendation, casual conversation, news summaries, and entertainment content generation.
[0090] Furthermore, non-safety-critical levels include driver assistance level (L2) and experience enhancement level (L3).
[0091] In one optional implementation, the current network state includes a full-function state (G1), a restricted state (G2), and a local state (G3); wherein, the full-function state (G1) indicates that the network communication quality is excellent and meets the requirements for inference tasks to be executed on the cloud computing unit; the restricted state (G2) indicates that the network communication quality is degraded and only some inference tasks can be executed on the cloud computing unit; the local state (G3) indicates that the network communication quality is poor or the network is down, and the inference tasks can only be executed on the vehicle-side computing unit.
[0092] In one specific implementation, after determining that the security level of the inference task is L2 or L3, the corresponding scheduling decision can be executed based on the task scheduling score and a preset score threshold, combined with the current network status.
[0093] For example, if the task scheduling score is greater than or equal to the first score threshold and the current network state is G1, then the target computing unit for executing the inference task is determined to be the cloud computing unit.
[0094] As another example, when the task scheduling score is greater than or equal to the first score threshold and the current network state is G2, the target computing unit for executing an inference task at the driving assistance level is determined to be the vehicle-side computing unit, and the target computing unit for executing an inference task at the experience enhancement level is determined to be the cloud computing unit.
[0095] S203. Schedule the inference task to the target computing entity for execution and generate the inference result.
[0096] In this step, the electronic device can send the inference task that meets the scheduling conditions to the designated target computing entity, which then processes the inference task and performs inference calculations to obtain the corresponding inference results.
[0097] Specifically, if the target computing entity is the vehicle-side computing unit, the vehicle-side computing unit can directly complete the loading, computation, and result output of the inference task. The entire execution process does not rely on an external network, thus ensuring low latency and high reliability of the inference execution. If the target computing entity is the cloud computing unit, the electronic device can upload the inference task to the cloud computing unit through a cellular network. The cloud computing unit can then use cloud computing power to complete the inference processing and send the inference results back to the electronic device.
[0098] Furthermore, after receiving the inference results from the target computing party, the electronic device can parse, verify, and store the inference results, and use the inference results for vehicle control, information display, human-computer interaction, or other functional modules according to the business scenario corresponding to the inference task, thereby completing the complete execution process of this inference task.
[0099] In this embodiment, a task scheduling score can be determined based on the vehicle's network communication data, processor load data, and driving status data. When the inference task is not at a safety-critical level, a scheduling decision is made by combining the current network status determined by the network status monitoring machine with the task scheduling score, identifying the target computing party, and scheduling the inference task to the target computing party for execution. Here, the current network status reflects the vehicle's network quality, the task safety level distinguishes different business importance, and the task scheduling score quantifies the comprehensive scheduling conditions. Through multi-dimensional data collaborative judgment and differentiated scheduling strategies based on hierarchical and state-based criteria, local and cloud computing power can be rationally allocated while ensuring driving safety and business experience. This effectively improves the execution efficiency of inference tasks, enhances the intelligence level of the in-vehicle intelligent system, and ultimately achieves optimal scheduling of inference tasks in complex and ever-changing driving environments.
[0100] On the basis of the embodiment shown in Figure 2, the above inference task scheduling method is explained in further detail below in conjunction with Figure 3.
[0101] Figure 3 is a flowchart of a second embodiment of the inference task scheduling method provided in this application. Referring to Figure 3, the method includes:
[0102] S301. Calculate the network quality value based on network communication data.
[0103] In this step, the signal reception power and signal reception quality included in the network communication data can be normalized first to obtain power normalized value and quality normalized value. Then, based on the preset signal reception power weight and signal reception quality weight, the power normalized value and quality normalized value are weighted and summed to obtain the network quality value.
[0104] Among them, the network quality value is used to comprehensively reflect the current network communication quality. The higher the value, the better the network quality, and the more suitable it is to schedule inference tasks to cloud computing units for execution.
[0105] For example, suppose the normalized signal reception power is 0.8 and the normalized signal reception quality is 0.6, both lying in the interval [0, 1]. If the preset signal reception power weight is 0.6 and the signal reception quality weight is 0.4, the network quality value is 0.6 × 0.8 + 0.4 × 0.6 = 0.72.
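The calculation in step S301 can be sketched as follows. This is a minimal illustration only: the source does not state how the normalization is performed, so the min-max normalization and the function names `normalize` and `network_quality` are assumptions; the weights are the example values from paragraph [0105].

```python
def normalize(value: float, lo: float, hi: float) -> float:
    """Min-max normalize a raw measurement into [0, 1] (assumed method).

    Values outside [lo, hi] are clamped to the interval ends.
    """
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    return min(1.0, max(0.0, (value - lo) / (hi - lo)))


def network_quality(power_norm: float, quality_norm: float,
                    w_power: float = 0.6, w_quality: float = 0.4) -> float:
    """Weighted sum of the normalized power and quality values."""
    return w_power * power_norm + w_quality * quality_norm


# Example values from paragraph [0105].
nq = network_quality(0.8, 0.6)
print(round(nq, 2))  # 0.72
```

The lower and upper bounds passed to `normalize` would be calibrated for the modem in use; they are left as parameters here because the source gives no concrete range.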
[0106] S302. Calculate the computing power margin value based on the processor load data.
[0107] In this step, the processor idle rate can first be calculated from the processor utilization rate included in the processor load data, and the memory idle rate can be calculated from the memory utilization rate included in the processor load data. Then, based on a preset processor computing power weight and memory computing power weight, the processor idle rate and memory idle rate are weighted and summed to obtain the computing power margin value.
[0108] Among them, the computing power margin value is used to comprehensively reflect the sufficiency of the available computing resources of the current vehicle-side computing unit. The larger the value, the more sufficient the vehicle-side computing power is, and the more suitable it is to schedule the inference task to be executed locally on the vehicle-side computing unit; the smaller the value, the more strained the vehicle-side computing power is, and the more necessary it is to consider scheduling the inference task to the cloud computing unit for execution.
[0109] In one specific implementation, processor utilization can be obtained by reading the real-time load registers of the NPU or GPU in the vehicle-side computing unit, typically expressed as a percentage of the current computing resources being used. Memory utilization can be obtained by reading statistics from the operating system's memory management module, reflecting the proportion of currently used memory to the total available memory. Processor idle rate is 100% minus processor utilization, and memory idle rate is 100% minus memory utilization.
[0110] For example, if the processor utilization rate is 30%, the processor idle rate is 70%; if the memory utilization rate is 40%, the memory idle rate is 60%. With a preset processor computing power weight of 0.5 and a memory computing power weight of 0.5, the computing power margin value is 0.5 × 70% + 0.5 × 60% = 65% (i.e., 0.65).
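Step S302 can be illustrated with the short sketch below. The function name `compute_margin` is hypothetical, and utilization is assumed to be supplied as a percentage, as described in paragraph [0109]; how the values are read from the NPU/GPU load registers or the operating system is outside the scope of the sketch.

```python
def compute_margin(cpu_util_pct: float, mem_util_pct: float,
                   w_cpu: float = 0.5, w_mem: float = 0.5) -> float:
    """Computing power margin value in [0, 1] from utilization percentages."""
    cpu_idle = (100.0 - cpu_util_pct) / 100.0   # processor idle rate
    mem_idle = (100.0 - mem_util_pct) / 100.0   # memory idle rate
    return w_cpu * cpu_idle + w_mem * mem_idle


# Example values from paragraph [0110]: 30% processor and 40% memory utilization.
print(round(compute_margin(30.0, 40.0), 2))  # 0.65
```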
[0111] S303. Determine the urgency value of the driving scenario based on the driving status data.
[0112] In this step, a predefined scenario urgency mapping rule can be obtained. Based on this mapping rule, the driving scenario urgency value corresponding to the driving state data is determined. The driving state data includes at least one of the following: vehicle speed, gear position, and ADAS status.
[0113] Among them, the driving scenario urgency value is used to comprehensively reflect the urgency of the current driving environment for the real-time requirements of the inference task. The larger the value, the more urgent the current driving scenario is, the higher the demand for low-latency processing of the inference task is, and the more suitable it is to be executed locally by the vehicle-side computing unit. The smaller the value, the more relaxed the current driving scenario is, the higher the tolerance for latency is, and it is advisable to schedule the inference task to the cloud computing unit for execution.
[0114] In one specific implementation, the scenario urgency mapping rule can be pre-calibrated based on the safety risk level of different driving scenarios. The specific rule is as follows:
[0115] When the ADAS is in a warning-triggered state, the driving scenario urgency value is the highest, the first urgency level. When the vehicle speed is greater than or equal to the first speed threshold, the driving scenario urgency value is a relatively high second urgency level. When the vehicle speed is less than the first speed threshold but greater than the second speed threshold, and the ADAS is active, the driving scenario urgency value is a moderate third urgency level. When the vehicle speed is less than or equal to the second speed threshold, and the gear is reverse or park, the driving scenario urgency value is the lowest, the fourth urgency level. The first speed threshold is greater than the second speed threshold, and the first, second, third, and fourth urgency levels decrease sequentially.
[0116] For example, if the current vehicle speed is 90 km/h and the first speed threshold is set to 80 km/h, the condition that the vehicle speed is greater than or equal to the first speed threshold is met. According to the mapping rule, the driving scenario urgency value is determined to be the second urgency level (e.g., 0.8), indicating that the vehicle is travelling at high speed and the inference task has high real-time requirements.
[0117] As another example, if the current ADAS status is that a forward collision warning has been triggered, the driving scenario urgency value is determined according to the mapping rule to be the first urgency level (e.g., 1.0), indicating that the current scenario has the highest urgency and the inference task must be processed locally on the vehicle in real time.
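The mapping rule of step S303 might be expressed as in the sketch below. Only the first speed threshold (80 km/h) and the values 1.0 and 0.8 come from the examples above; the second speed threshold of 20 km/h, the values 0.5 and 0.2 for the third and fourth levels, the `DrivingState` type, and the fallback for combinations the rule does not cover are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DrivingState:
    speed_kmh: float
    gear: str            # "P", "R", "N" or "D" (assumed encoding)
    adas_warning: bool   # an ADAS warning is currently triggered
    adas_active: bool    # ADAS is enabled and running


# Urgency value per level; levels 3 and 4 are assumed values.
URGENCY = {1: 1.0, 2: 0.8, 3: 0.5, 4: 0.2}


def urgency_value(s: DrivingState, v1: float = 80.0, v2: float = 20.0) -> float:
    """Map driving state data to a driving scenario urgency value."""
    if s.adas_warning:
        return URGENCY[1]
    if s.speed_kmh >= v1:
        return URGENCY[2]
    if v2 < s.speed_kmh < v1 and s.adas_active:
        return URGENCY[3]
    if s.speed_kmh <= v2 and s.gear in ("R", "P"):
        return URGENCY[4]
    # Not covered by the stated rule; a moderate level is assumed here.
    return URGENCY[3]


print(urgency_value(DrivingState(90.0, "D", False, True)))   # 0.8
print(urgency_value(DrivingState(50.0, "D", True, True)))    # 1.0
```

The rules are checked from the most urgent to the least urgent, so a triggered warning takes precedence over any speed condition.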
[0118] S304. The task scheduling score is obtained by weighting the network quality value, computing power margin value, and driving scenario urgency value.
[0119] In this step, the network quality value, computing power margin value, and driving scenario urgency value can be weighted and summed based on preset network quality weight, computing power margin weight, and driving scenario urgency weight to obtain the task scheduling score.
[0120] The task scheduling score is used to comprehensively evaluate the suitability of scheduling inference tasks to the cloud computing unit at the current moment. A higher task scheduling score indicates better network quality, more strained vehicle computing power, and a more relaxed driving scenario, making it more suitable to schedule inference tasks to the cloud computing unit for execution. Conversely, a lower task scheduling score indicates worse network quality, more abundant vehicle computing power, and a more urgent driving scenario, making it more suitable to schedule inference tasks to the vehicle computing unit for local execution.
[0121] In one specific implementation, the specific values of the network quality weight, computing power margin weight, and driving scenario urgency weight can be calibrated according to the actual application scenario to balance the contribution of the three dimensions to the comprehensive decision-making. For example, if more emphasis is placed on the impact of network quality on scheduling decisions, the value of the network quality weight can be appropriately increased; if more emphasis is placed on the constraints of the driving scenario urgency on real-time requirements, the value of the driving scenario urgency weight can be appropriately increased.
[0122] For example, assume a network quality value of 0.72, a computing power margin value of 0.65, and a driving scenario urgency value of 0.8, with a preset network quality weight of 0.3, computing power margin weight of 0.3, and driving scenario urgency weight of 0.4. The task scheduling score is then 0.3 × 0.72 + 0.3 × 0.65 + 0.4 × 0.8 = 0.731, i.e., approximately 0.73.
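The weighted summation of step S304 can be sketched as below. The sketch follows the arithmetic of the worked example in paragraph [0122], that is, a plain weighted sum of the three values; the source does not state whether any term is transformed before weighting, so none is applied here. The function name is hypothetical.

```python
def task_scheduling_score(nq: float, margin: float, urgency: float,
                          w_nq: float = 0.3, w_margin: float = 0.3,
                          w_urgency: float = 0.4) -> float:
    """Weighted sum of network quality, computing margin and scenario urgency."""
    return w_nq * nq + w_margin * margin + w_urgency * urgency


# Example values from paragraph [0122].
score = task_scheduling_score(0.72, 0.65, 0.8)
print(round(score, 2))  # 0.73
```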
[0123] S305. If the safety level of the inference task is a non-safety-critical level, make a scheduling decision based on the current network status determined by the network status monitoring machine and the task scheduling score, to determine the target computing party for executing the inference task.
[0124] In this step, after receiving the inference task, the electronic device can first determine the safety level of the inference task. If the inference task is at the safety-critical level, the target computing party for executing the inference task is directly determined to be the vehicle-side computing unit. If the inference task is at a non-safety-critical level, the current network status determined by the network status monitoring machine and the task scheduling score calculated in the previous steps are obtained, and a joint decision is made based on the preset scheduling decision rules to determine whether the target computing party for executing the inference task is the vehicle-side computing unit or the cloud computing unit.
[0125] The non-safety-critical levels include driver assistance levels and experience enhancement levels. The network status determined by the network status monitoring unit includes full-function status, restricted status, and local status.
[0126] In one specific implementation, the scheduling decision rules can be pre-configured and stored in the electronic device. The specific rules are as follows:
[0127] If the task scheduling score is greater than or equal to the first score threshold, and the current network status is the full-function state, then regardless of whether the inference task belongs to the driver assistance level or the experience enhancement level, the target computing party is determined to be the cloud computing unit.
[0128] If the task scheduling score is greater than or equal to the first score threshold, and the current network status is the restricted state, then differentiated decisions are made based on the specific level of the inference task: for inference tasks at the driver assistance level, the target computing party is determined to be the vehicle-side computing unit; for inference tasks at the experience enhancement level, the target computing party is determined to be the cloud computing unit.
[0129] If the task scheduling score is less than the first score threshold but greater than the second score threshold, and the current network status is either the full-function state or the restricted state, then the target computing party for inference tasks at the driver assistance level is determined to be the vehicle-side computing unit, and the target computing party for inference tasks at the experience enhancement level is determined to be the cloud computing unit.
[0130] If the task scheduling score is less than or equal to the second score threshold, or if the current network status is the local state, then regardless of whether the inference task belongs to the driver assistance level or the experience enhancement level, the target computing party is determined to be the vehicle-side computing unit.
[0131] For example, the first score threshold is 0.7 and the second score threshold is 0.3. If the current task scheduling score is 0.8, and the network status monitoring machine determines that the current network status is the full-function state, then the target computing party for this non-safety-critical inference task is determined to be the cloud computing unit.
[0132] For example, if the current task scheduling score is 0.8, the current network status is restricted, and the inference task is at the driver assistance level, then the target computing entity is determined to be the vehicle-side computing unit; if the task is at the experience enhancement level, then the target computing entity is determined to be the cloud computing unit.
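The scheduling decision rules of paragraphs [0124] and [0127] to [0130] can be summarized in a single function, sketched below. The enum and function names are hypothetical; the thresholds 0.7 and 0.3 are the example values from paragraph [0131].

```python
from enum import Enum


class NetState(Enum):
    FULL = "full-function"
    RESTRICTED = "restricted"
    LOCAL = "local"


class Level(Enum):
    SAFETY_CRITICAL = 0
    DRIVER_ASSISTANCE = 1
    EXPERIENCE_ENHANCEMENT = 2


class Target(Enum):
    VEHICLE = "vehicle-side computing unit"
    CLOUD = "cloud computing unit"


def decide(level: Level, score: float, state: NetState,
           t1: float = 0.7, t2: float = 0.3) -> Target:
    """Select the target computing party for one inference task."""
    # Safety-critical tasks always run on the vehicle.
    if level is Level.SAFETY_CRITICAL:
        return Target.VEHICLE
    # Local network state or a low score: everything runs on the vehicle.
    if state is NetState.LOCAL or score <= t2:
        return Target.VEHICLE
    # High score on a full-function network: everything goes to the cloud.
    if score >= t1 and state is NetState.FULL:
        return Target.CLOUD
    # Remaining cases (high score on a restricted network, or a mid-range
    # score on a full-function or restricted network) split by task level.
    if level is Level.DRIVER_ASSISTANCE:
        return Target.VEHICLE
    return Target.CLOUD


print(decide(Level.DRIVER_ASSISTANCE, 0.8, NetState.FULL).name)        # CLOUD
print(decide(Level.DRIVER_ASSISTANCE, 0.8, NetState.RESTRICTED).name)  # VEHICLE
```

Checking the vehicle-only conditions first keeps the function total: every combination of level, score and network state returns a target.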
[0133] It should be noted that the current network state upon which the above scheduling decisions are based is not static, but dynamically maintained and updated by the network state monitoring unit based on real-time network conditions. During actual driving, network quality values and network type will fluctuate dynamically with factors such as vehicle movement and environmental changes. Therefore, state switching rules can be preset in the network state monitoring unit to achieve smooth transitions and accurate switching of network states.
[0134] In one specific implementation, the state transition rules can be pre-configured and stored in the network state monitoring machine. The specific rules are as follows:
[0135] When the network quality value is less than the first network threshold and the duration reaches the first preset duration, or when the network type is switched from 5G to 4G, the current network status is switched from full-function status to restricted status.
[0136] When the network quality value is less than the second network threshold and the duration reaches the first preset duration, or when the network type switches from 4G to no service, the current network state is switched from restricted state to local state. The first network threshold is greater than the second network threshold.
[0137] When the network quality value is greater than or equal to the second network threshold and the duration reaches the first preset duration, and the network type is switched from no service to 4G, then after waiting for the second preset duration, the current network status is switched from local status to restricted status.
[0138] When the network quality value is greater than or equal to the first network threshold and the duration reaches the first preset duration, and the network type is switched from 4G to 5G, then after waiting for the second preset duration, the current network status is switched from the restricted state to the full-function state.
[0139] Optionally, the first preset duration and the second preset duration can be the same, or they can be configured differently according to the actual application scenario.
[0140] Figure 4 is a schematic diagram of network state switching provided in an embodiment of this application. Referring to Figure 4, the network status monitoring machine can monitor the network quality value (NQ) and network type in real time, and dynamically switch the current network status between the full-function state (G1), the restricted state (G2) and the local state (G3) according to the preset state switching rules.
[0141] For example, the first network threshold is set to 0.6, the second network threshold is set to 0.2, the first preset duration is set to 3 seconds, and the second preset duration is set to 3 seconds; the specific rules for state switching can be:
[0142] G1 to G2 switch: When NQ is less than 0.6 and lasts for 3 seconds, or when the network type is switched from 5G to 4G, the network status monitoring device can immediately switch the current network status from G1 to G2.
[0143] G2 to G3 switch: When NQ is less than 0.2 and lasts for 3 seconds, or when the network type changes from 4G to no service, the network status monitoring device can immediately switch the current network status from G2 to G3.
[0144] G3 to G2 switch: When NQ is greater than or equal to 0.2 and lasts for 3 seconds, and the network type changes from no service to 4G, the network status monitoring device can wait for 3 seconds before switching the current network status from G3 to G2.
[0145] G2 to G1 switch: When NQ is greater than or equal to 0.6 and lasts for 3 seconds, and the network type is switched from 4G to 5G, the network status monitoring device can wait for 3 seconds before switching the current network status from G2 to G1.
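The switching rules above can be sketched as a small state machine. Several points are assumptions rather than statements from the source: the network type is treated as a current level ("5G", "4G", "none") instead of a change event; at most one transition is made per `update` call; and an upgrade is only performed if the upgrade condition holds continuously for the first preset duration plus the second preset duration. The class and attribute names are hypothetical.

```python
from enum import Enum


class NetState(Enum):
    G1 = "full-function"
    G2 = "restricted"
    G3 = "local"


RANK = {"none": 0, "4G": 1, "5G": 2}   # assumed ordering of network types


class NetworkStateMonitor:
    def __init__(self, th1=0.6, th2=0.2, hold_s=3.0, lag_s=3.0):
        self.th1, self.th2 = th1, th2
        self.hold_s, self.lag_s = hold_s, lag_s   # first / second preset duration
        self.state = NetState.G1
        self._low_since = None   # time at which NQ first fell below the degrade threshold
        self._ok_since = None    # time at which the upgrade condition first held

    def _switch(self, new_state):
        self.state = new_state
        self._low_since = None
        self._ok_since = None
        return new_state

    def update(self, now: float, nq: float, net_type: str) -> NetState:
        rank = RANK[net_type]
        # (threshold, network rank, neighbouring state) for each direction.
        if self.state is NetState.G1:
            down, up = (self.th1, 2, NetState.G2), None
        elif self.state is NetState.G2:
            down, up = (self.th2, 1, NetState.G3), (self.th1, 2, NetState.G1)
        else:
            down, up = None, (self.th2, 1, NetState.G2)

        if down is not None:
            th, min_rank, lower = down
            if nq < th:
                if self._low_since is None:
                    self._low_since = now
            else:
                self._low_since = None
            held = self._low_since is not None and now - self._low_since >= self.hold_s
            if rank < min_rank or held:        # degradation is not delayed further
                return self._switch(lower)

        if up is not None:
            th, need_rank, upper = up
            if nq >= th and rank >= need_rank:
                if self._ok_since is None:
                    self._ok_since = now
                if now - self._ok_since >= self.hold_s + self.lag_s:
                    return self._switch(upper)  # upgrade only after the lag window
            else:
                self._ok_since = None           # condition lapsed, restart the wait
        return self.state


monitor = NetworkStateMonitor()
for t, nq, net in [(0.0, 0.9, "5G"), (1.0, 0.5, "5G"), (4.0, 0.5, "5G"), (5.0, 0.1, "none")]:
    print(t, monitor.update(t, nq, net).name)
```

In this run the monitor stays in G1 until NQ has been below 0.6 for 3 seconds (at t = 4.0), then drops to G3 as soon as service is lost.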
[0146] Through the aforementioned state switching rules, the network state monitoring machine can track changes in network conditions in real time and switch the current network state accurately when the corresponding conditions are met. By maintaining three network levels (full-function state, restricted state, and local state) and introducing a lag window mechanism (i.e., the second preset duration), the network state monitoring machine can achieve smooth network state switching: degradation is executed immediately, prioritizing driving safety, while upgrades wait for the lag window to avoid frequent state switching caused by short-term network fluctuations.
[0147] The network state monitoring machine can therefore provide reliable state input for the scheduling decisions of inference tasks, ensuring that the scheduling strategy for inference tasks dynamically adapts to complex driving network environments.
[0148] S306. Schedule the inference task to the target computing party for execution and generate inference results.
[0149] In this step, the electronic device schedules the inference task to the corresponding computing unit for execution based on the determined target computing party, in order to generate inference results.
[0150] Specifically, if the target computing party is the vehicle-side computing unit, the electronic device sends the inference task to the vehicle-side computing unit for execution. The vehicle-side computing unit can load the corresponding simplified model, perform inference computation, and return the inference result.
[0151] In one optional implementation, the vehicle-side computing unit may have the ability to dynamically switch between multi-precision quantization models to adapt to inference requirements under different network conditions. Here, quantization refers to the technique of converting model parameters from a high bit width to a low bit width for storage and computation.
[0152] In one specific implementation, a first model and a second model can be deployed on the vehicle-side computing unit. The inference precision of the first model is lower than that of the second model. The first model can be an INT4 quantization model, and the second model can be an INT8 quantization model. INT8 and INT4 represent 8-bit integer quantization precision and 4-bit integer quantization precision, respectively. The INT8 model can reduce computing resource consumption while ensuring high inference precision, while the INT4 model further compresses the computing load and is suitable for scenarios with limited computing power.
[0153] Based on the above model deployment, the vehicle-side computing unit can dynamically switch according to the current network status determined by the network status monitoring machine: when the current network status is a full-function state or a restricted state, the vehicle-side computing unit loads the INT8 quantization model (i.e., the second model) for inference; when the current network status is a local state, since all inference tasks are executed by the vehicle, in order to save computing resources and ensure the ability to process multiple tasks concurrently, the vehicle-side computing unit switches to the INT4 quantization model (i.e., the first model) for inference.
[0154] Furthermore, the vehicle-side computing unit can employ a time-division multiplexing scheduling mechanism for multiple tasks, enabling rational resource allocation based on the priority of inference tasks. When multiple inference tasks arrive simultaneously, the vehicle-side computing unit can process them according to their priority order: prioritizing safety-critical tasks, followed by driver assistance tasks, and finally experience-enhancing tasks. For tasks of the same priority, scheduling can be performed on a first-come, first-served basis.
[0155] For example, if the current network state is local, the vehicle-side computing unit simultaneously receives a forward collision warning task (safety critical level), an adaptive cruise control task (driver assistance level), and a voice interaction task (experience enhancement level). The vehicle-side computing unit can first load the INT4 quantization model to process the forward collision warning task. After that task is completed, it will process the adaptive cruise control task and finally the voice interaction task, thus ensuring the real-time priority of safety critical tasks.
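The model switching of paragraph [0153] and the priority ordering of paragraph [0154] can be sketched as follows. A heap keyed on (priority, arrival sequence) is one straightforward way to obtain priority ordering with first-come, first-served handling within a level; it is an illustrative choice, not necessarily the mechanism used in the embodiment, and all names are hypothetical.

```python
import heapq
import itertools

# Smaller number means higher priority.
PRIORITY = {"safety_critical": 0, "driver_assistance": 1, "experience_enhancement": 2}


def select_model(network_state: str) -> str:
    """INT4 in the local state, INT8 in the full-function or restricted state."""
    return "INT4" if network_state == "local" else "INT8"


class VehicleTaskQueue:
    def __init__(self):
        self._heap = []
        self._seq = itertools.count()   # arrival order, used as a tie-breaker

    def submit(self, name: str, level: str) -> None:
        heapq.heappush(self._heap, (PRIORITY[level], next(self._seq), name))

    def next_task(self):
        """Return the next task name, or None if the queue is empty."""
        return heapq.heappop(self._heap)[2] if self._heap else None


queue = VehicleTaskQueue()
queue.submit("voice interaction", "experience_enhancement")
queue.submit("adaptive cruise control", "driver_assistance")
queue.submit("forward collision warning", "safety_critical")

print(select_model("local"))  # INT4
while (task := queue.next_task()) is not None:
    print(task)
```

Even though the forward collision warning task is submitted last, it is dequeued first, followed by adaptive cruise control and then voice interaction.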
[0156] Specifically, if the target computing party is the cloud computing unit, the electronic device sends the inference task to the cloud computing unit for execution via the cellular network. The cloud computing unit loads the corresponding complete model, performs inference computation, and returns the inference result.
[0157] In one alternative implementation, a timeout period can be set for monitoring inference tasks scheduled to the cloud computing unit to address task execution delays caused by network fluctuations or cloud computing unit anomalies.
[0158] If the inference result is not received from the cloud within the timeout period, the electronic device can determine that the cloud execution has timed out and immediately trigger the task rescheduling mechanism to reschedule the inference task to the vehicle-side computing unit for execution, so as to ensure that the inference task can be completed in a timely manner.
[0159] In one optional implementation, the electronic device can also monitor the request queue depth of the cloud computing unit. Before sending the inference task to the cloud computing unit, it can check the current depth of the cloud request queue. If the current queue depth exceeds a preset maximum queue depth threshold (e.g., set to 20 by default), it is determined that the cloud computing unit is currently under high load or the network transmission is congested. Continuing to send the task to the cloud may result in request timeout or processing delay. In this case, the electronic device can trigger an overflow fallback mechanism to directly reschedule the inference task to the vehicle-side computing unit for execution, without waiting for a response from the cloud computing unit.
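The overflow fallback check can be sketched as a guard applied just before dispatch. The function name is hypothetical, and how the electronic device obtains the cloud request queue depth is not described in the source, so the depth is simply passed in as an argument.

```python
def apply_overflow_fallback(decided_target: str, cloud_queue_depth: int,
                            max_queue_depth: int = 20) -> str:
    """Redirect a cloud-bound task to the vehicle when the cloud queue is too deep."""
    if decided_target == "cloud" and cloud_queue_depth > max_queue_depth:
        return "vehicle"
    return decided_target


print(apply_overflow_fallback("cloud", 25))  # vehicle
print(apply_overflow_fallback("cloud", 20))  # cloud
```

The comparison is strict, matching the wording "exceeds a preset maximum queue depth threshold": a depth equal to the threshold still allows cloud dispatch.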
[0160] In another optional implementation, the electronic device can also record a decision log for each scheduling decision. This decision log can include the network quality value, computing power margin value, and driving scenario urgency value on which the scheduling decision was based, together with the decision result, the actual inference latency, and an inference quality score. Based on this historical data, lightweight linear regression or online gradient descent methods can be used to periodically (e.g., every hour or every time the vehicle is started) adjust the network quality weight, computing power margin weight, driving scenario urgency weight, and the first and second score thresholds. This allows the scheduling strategy to better adapt to the usage patterns of different vehicles and users, improving the accuracy and adaptability of scheduling decisions.
[0161] In this embodiment, the electronic device can sequentially determine the network quality value, computing power margin value, and driving scenario urgency value based on network communication data, processor load data, and driving status data, thereby obtaining a task scheduling score. When the inference task is not safety-critical, a scheduling decision can be made based on the current network status determined by the network status monitoring machine and the task scheduling score. The target computing entity for executing the inference task can be determined as either the vehicle-side computing unit or the cloud computing unit, and the inference task can be scheduled to be executed by that target computing entity to generate inference results. In the above process, by integrating data from three dimensions—network communication data, processor load data, and driving status data—and combining the dynamic classification of network quality levels by the network status monitoring machine, differentiated end-to-cloud scheduling strategies can be implemented for inference tasks of different safety levels. While ensuring the real-time processing of safety-critical tasks, the optimal target computing entity for non-safety-critical tasks can be dynamically selected, thereby achieving optimal scheduling of inference tasks in a complex and ever-changing driving environment.
[0162] Figure 5 is a flowchart of a third embodiment of the inference task scheduling method provided in this application. Referring to Figure 5, on the basis of the second embodiment above, if the target computing party is determined to be the cloud computing unit, step S306 of scheduling the inference task to the target computing party for execution and generating inference results further includes the following steps:
[0163] S501. Set the timeout duration for the inference task.
[0164] In this step, before sending the inference task to the cloud computing unit, the electronic device can set a corresponding timeout duration for the inference task based on its safety level.
[0165] The timeout period is used to limit the maximum waiting time for the cloud computing unit to return inference results, so as to avoid the task being suspended for a long time due to network fluctuations or cloud anomalies.
[0166] In one specific implementation, if the safety level of the inference task is the driver assistance level, a first timeout duration is set for it; if the safety level of the inference task is the experience enhancement level, a second timeout duration is set for it. The first timeout duration is shorter than the second timeout duration.
[0167] For example, the first timeout duration can be set to 1500ms, and the second timeout duration can be set to 3000ms.
[0168] S502. If the inference result is not received from the cloud computing unit within the timeout duration, reschedule the inference task to the vehicle-side computing unit for execution.
[0169] In this step, the electronic device can start a timer when sending the inference task to the cloud computing unit and wait for the cloud computing unit to return the inference result. If the inference result is received from the cloud computing unit within the preset timeout period, the electronic device can directly obtain the inference result and complete the scheduling. If the inference result is not received from the cloud computing unit within the timeout period, the electronic device determines that the cloud computing unit has timed out and immediately triggers the task rescheduling mechanism to reschedule the inference task to the vehicle-side computing unit for execution.
[0170] For example, a driver assistance level inference task is scheduled to be executed by a cloud computing unit, with a first timeout period set to 1500ms. After sending the task, the electronic device can start a timer. If the inference result is not received from the cloud computing unit after 1500ms, it can be determined that the cloud computing unit has timed out, and the task can be immediately rescheduled to the vehicle-side computing unit for execution. The vehicle-side computing unit will then complete the inference and return the result.
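Steps S501 and S502 can be illustrated with the sketch below, which uses a thread pool future as the timer. The callables `cloud_infer` and `vehicle_infer` are hypothetical stand-ins for the real dispatch paths, and the timeout table uses the example values of 1500 ms and 3000 ms. Errors raised by the cloud call itself are not handled in this sketch.

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout

# Timeout per safety level in seconds, from the example values above.
TIMEOUT_S = {"driver_assistance": 1.5, "experience_enhancement": 3.0}


def run_with_fallback(task, level, cloud_infer, vehicle_infer, pool,
                      timeouts=TIMEOUT_S):
    """Try the cloud first; reschedule to the vehicle if no result arrives in time."""
    future = pool.submit(cloud_infer, task)   # waiting starts once the task is sent
    try:
        return future.result(timeout=timeouts[level]), "cloud"
    except FutureTimeout:
        future.cancel()   # best effort; a call that is already running is not interrupted
        return vehicle_infer(task), "vehicle"


def slow_cloud(task):     # hypothetical cloud call that responds too late
    time.sleep(0.3)
    return f"cloud:{task}"


def vehicle(task):        # hypothetical local inference
    return f"vehicle:{task}"


with ThreadPoolExecutor(max_workers=1) as pool:
    # Shortened timeouts so that the demonstration finishes quickly.
    short = {"driver_assistance": 0.05, "experience_enhancement": 0.1}
    print(run_with_fallback("acc", "driver_assistance", slow_cloud, vehicle, pool, short))
```

A late cloud result is simply discarded here; a production design would also need to decide whether to cancel the request on the cloud side.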
[0171] In one alternative implementation, the electronic device can also acquire vehicle navigation path information and predict changes in the network environment the vehicle will enter based on the navigation path information. When it is predicted that the vehicle will enter a weak network signal area (such as a tunnel, underground parking lot, mountainous area, etc.) within a preset time, non-safety-critical inference tasks can be scheduled to the cloud computing unit in batches in advance, or the corresponding inference results can be obtained from the cloud computing unit in advance and cached, thereby completing data exchange before network degradation and avoiding scheduling failures or task delays caused by network interruptions.
[0172] In one optional implementation, when multiple available cloud computing nodes exist (public cloud, vehicle manufacturer private cloud, roadside edge computing nodes, etc.), the electronic device can add a node selection step after determining to schedule the inference task to the cloud computing unit. Specifically, the electronic device can comprehensively evaluate factors such as the response latency, request queue depth, and inference accuracy of each cloud computing node to select the optimal cloud node to execute the inference task. This multi-cloud node selection mechanism can further optimize the success rate and inference quality of cloud execution, improving the overall performance of edge-cloud collaborative scheduling.
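One possible form of the node selection step is a weighted cost over the three factors named above, sketched below. The source does not specify how the factors are combined, so the cost function, its weights, the normalization caps (500 ms, depth 20) and the sample node figures are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CloudNode:
    name: str
    latency_ms: float    # recent response latency
    queue_depth: int     # pending requests on the node
    accuracy: float      # inference accuracy in [0, 1]


def node_cost(n: CloudNode, max_latency_ms: float = 500.0, max_depth: int = 20,
              w_lat: float = 0.4, w_queue: float = 0.3, w_acc: float = 0.3) -> float:
    """Lower is better: penalize latency, queue depth and accuracy shortfall."""
    lat = min(n.latency_ms / max_latency_ms, 1.0)
    queue = min(n.queue_depth / max_depth, 1.0)
    return w_lat * lat + w_queue * queue + w_acc * (1.0 - n.accuracy)


def select_node(nodes):
    """Pick the node with the lowest cost."""
    return min(nodes, key=node_cost)


nodes = [
    CloudNode("public cloud", 120.0, 15, 0.95),
    CloudNode("manufacturer private cloud", 80.0, 5, 0.93),
    CloudNode("roadside edge node", 20.0, 2, 0.85),
]
print(select_node(nodes).name)  # roadside edge node
```

With these weights latency and queue depth dominate, so the nearby edge node wins despite its lower accuracy; raising `w_acc` would shift the choice toward the higher-accuracy nodes.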
[0173] In this embodiment, by setting a timeout duration for the inference task executed by the cloud computing unit and promptly reverting to the vehicle-side computing unit for execution when the timeout occurs, the uncertainty caused by network fluctuations and cloud computing unit anomalies can be effectively addressed, avoiding task processing delays caused by waiting for timeouts. This further improves the reliability and robustness of inference task scheduling, ensuring that all types of inference tasks can obtain processing results in a timely manner in complex and ever-changing driving environments.
[0174] Figure 6 is a system architecture diagram for edge-cloud collaborative inference task scheduling provided in an embodiment of this application. Referring to Figure 6, the architecture includes a business application layer, a decision scheduling layer, an execution engine layer, and a perception and acquisition layer.
[0175] The business application layer can include application modules such as voice interaction, visual perception, ADAS warning, and entertainment recommendation. These modules can generate corresponding inference tasks according to the needs of vehicle intelligent functions and send the inference tasks to the decision scheduling layer.
[0176] The decision-making and scheduling layer is responsible for receiving inference tasks from the business application layer and making comprehensive scheduling decisions based on real-time data reported by the perception and acquisition layer. Specifically, the decision-making and scheduling layer includes a task classification module, a network status management module, a task decision module, and a task scheduling module.
[0177] The task classification module is used to identify and label the safety level of inference tasks. The network status management module is used to determine the current network status (full-function state, restricted state, or local state) and to switch dynamically between states. The task decision module is used to calculate the task scheduling score based on the network quality value, computing power margin value, and driving scenario urgency value, and to make a joint decision with the current network status to determine the target computing party for the inference task. The task scheduling module then distributes the inference task to the target computing party based on the decision result; the target computing party can be the vehicle-side computing unit or the cloud computing unit.
[0178] The execution engine layer includes vehicle-side execution units and cloud-side execution units. The vehicle-side execution unit is responsible for executing inference tasks scheduled to the vehicle, supporting dynamic switching of multi-precision quantization models and multi-task priority scheduling; the cloud-side execution unit is responsible for executing inference tasks scheduled to the cloud, providing high-precision large-model inference capabilities, and supporting timeout monitoring and request queue management.
[0179] The perception and acquisition layer is responsible for collecting various dynamic data during vehicle operation in real time, providing input to the decision-making and scheduling layer. Specifically, it may include a network signal acquisition module for monitoring signal reception power, signal reception quality, and network type; a computing load monitoring module for collecting load data such as processor utilization and memory utilization; a vehicle status acquisition module that can access the Controller Area Network (CAN) bus to read data such as vehicle speed and gear in real time; and an ADAS status module for obtaining the current status of the advanced driver assistance system.
[0180] The edge-cloud collaborative inference task scheduling system architecture provided in this application forms an integrated process from data collection, state perception, decision scheduling to inference task execution through the coordinated cooperation of the perception and acquisition layer, decision scheduling layer, execution engine layer and business application layer. It can dynamically optimize edge-cloud resource configuration in complex and ever-changing driving environments, and provide better inference performance for non-safety-critical tasks while ensuring timely response to safety-critical tasks.
[0181] Figure 7 is a schematic diagram of the inference task scheduling device provided in an embodiment of this application. Referring to Figure 7, the inference task scheduling device 10 includes:
[0182] The first processing module 11 is used to determine the task scheduling score based on the vehicle's network communication data, processor load data, and driving status data.
[0183] The scheduling decision module 12 is used, if the safety level of the inference task is a non-safety-critical level, to make a scheduling decision based on the task scheduling score and the current network status determined by the network status monitoring machine, and thereby determine the target computing party that executes the inference task. The current network status reflects the network quality of the vehicle, and the target computing party is the vehicle-side computing unit or the cloud computing unit.
[0184] The second processing module 13 is used to schedule the inference task to the target computing party for execution and generate inference results.
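The cooperation of the three modules can be pictured as a small pipeline. The following is a minimal sketch only: the class name, the callable signatures, and the string labels "vehicle" and "cloud" are hypothetical and are not taken from this application.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class InferenceTaskSchedulingDevice:
    """Illustrative composition of modules 11, 12 and 13 (names are assumptions)."""

    # Module 11: (network data, load data, driving data) -> task scheduling score.
    score_fn: Callable[[dict, dict, dict], float]
    # Module 12: (safety level, network status, score) -> "vehicle" or "cloud".
    decide_fn: Callable[[str, str, float], str]
    # Module 13: one executor callable per target computing party.
    executors: dict[str, Callable[[Any], Any]]

    def handle(self, task, safety_level, network_status, net, load, driving):
        # Step 1: compute the score from the three data sources.
        score = self.score_fn(net, load, driving)
        # Step 2: choose the target computing party.
        target = self.decide_fn(safety_level, network_status, score)
        # Step 3: dispatch the task and return the inference result.
        return target, self.executors[target](task)
```

Injecting the three steps as callables keeps the sketch independent of any particular scoring formula or decision table.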
[0185] The inference task scheduling device provided in this application embodiment can execute the technical solution shown in the above method embodiment. Its implementation principle and beneficial effects are similar, and will not be described again here.
[0186] In one possible implementation, the first processing module 11 is specifically used for:
[0187] Calculate the network quality value based on network communication data;
[0188] Calculate the computing power margin value based on the processor load data;
[0189] Determine the urgency value of the driving scenario based on driving status data;
[0190] The task scheduling score is obtained by weighting the network quality value, computing power margin value, and driving scenario urgency value.
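As an illustration of the weighting step, the sketch below combines the three values linearly. The linear form, the function name, and the example weights are assumptions; this passage does not state the weights or whether a given term raises or lowers the score, so the caller supplies the weights (which could be negative, or applied to an inverted input).

```python
def task_scheduling_score(network_quality: float,
                          compute_margin: float,
                          urgency: float,
                          w_network: float,
                          w_compute: float,
                          w_urgency: float) -> float:
    """Weighted combination of the three input values.

    All inputs are assumed to be pre-normalized to [0, 1]; the weights are
    hypothetical parameters chosen by the integrator.
    """
    return (w_network * network_quality
            + w_compute * compute_margin
            + w_urgency * urgency)


# Example with made-up weights: 0.5*0.8 + 0.3*0.5 + 0.2*0.2
example_score = task_scheduling_score(0.8, 0.5, 0.2, 0.5, 0.3, 0.2)
```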
[0191] In one possible implementation, network communication data includes signal reception power and signal reception quality, and the first processing module 11 is specifically used for:
[0192] The signal reception power and signal reception quality are normalized to obtain a normalized power value and a normalized quality value.
[0193] Based on preset signal reception power weights and signal reception quality weights, the normalized power value and the normalized quality value are weighted and summed to obtain the network quality value.
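The two steps above can be sketched as follows. Min-max normalization with clamping is one possible reading of "normalized"; the measurement ranges (in dBm and dB) and the default weights are illustrative assumptions, not values from this application.

```python
def normalize(value: float, lo: float, hi: float) -> float:
    """Min-max normalize into [0, 1], clamping out-of-range readings."""
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    return min(1.0, max(0.0, (value - lo) / (hi - lo)))


# Assumed measurement ranges (hypothetical, for illustration only).
POWER_RANGE_DBM = (-140.0, -44.0)
QUALITY_RANGE_DB = (-20.0, -3.0)


def network_quality_value(power_dbm: float, quality_db: float,
                          w_power: float = 0.6, w_quality: float = 0.4) -> float:
    """Weighted sum of normalized reception power and reception quality."""
    p = normalize(power_dbm, *POWER_RANGE_DBM)
    q = normalize(quality_db, *QUALITY_RANGE_DB)
    return w_power * p + w_quality * q
```

With weights that sum to 1, the result stays in [0, 1], which makes it convenient to compare against fixed network thresholds.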
[0194] In one possible implementation, the processor load data includes processor utilization and memory utilization. The first processing module 11 is specifically used for:
[0195] Calculate the processor idle rate based on the processor utilization rate;
[0196] Calculate the memory idle rate based on the memory utilization;
[0197] Based on preset processor computing power weights and memory computing power weights, the processor idle rate and memory idle rate are weighted and summed to obtain the computing power margin value.
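A minimal sketch of this calculation follows, assuming that utilization is reported as a fraction in [0, 1] and that the idle rate is simply one minus the utilization. The default weights are hypothetical.

```python
def compute_margin_value(cpu_utilization: float, mem_utilization: float,
                         w_cpu: float = 0.6, w_mem: float = 0.4) -> float:
    """Weighted sum of processor idle rate and memory idle rate."""
    for name, u in (("cpu", cpu_utilization), ("mem", mem_utilization)):
        if not 0.0 <= u <= 1.0:
            raise ValueError(f"{name} utilization must be within [0, 1]")
    cpu_idle = 1.0 - cpu_utilization   # assumed definition of idle rate
    mem_idle = 1.0 - mem_utilization
    return w_cpu * cpu_idle + w_mem * mem_idle
```

For example, 25% processor utilization and 50% memory utilization give 0.6 × 0.75 + 0.4 × 0.5 = 0.65 under these assumed weights.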
[0198] In one possible implementation, the driving status data includes at least one of vehicle speed, gear position, and Advanced Driver Assistance System (ADAS) status; the first processing module 11 is specifically used for:
[0199] Retrieve predefined scene urgency mapping rules;
[0200] Based on the scenario urgency mapping rules, determine the driving scenario urgency value corresponding to the vehicle's driving speed, gear, and ADAS status.
[0201] In one possible implementation, the scenario urgency mapping rule includes:
[0202] When the ADAS status is a warning-triggered state, the driving scenario urgency value is the first urgency value;
[0203] When the vehicle speed is greater than or equal to the first speed threshold, the driving scenario urgency value is the second urgency value.
[0204] If the vehicle speed is less than the first speed threshold and greater than the second speed threshold, and the ADAS status is active, the driving scenario urgency value is the third urgency value.
[0205] When the vehicle speed is less than or equal to the second speed threshold and the gear is reverse or park, the driving scenario urgency value is the fourth urgency value.
[0206] Here, the first speed threshold is greater than the second speed threshold, and the first urgency value, the second urgency value, the third urgency value, and the fourth urgency value decrease in sequence.
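The mapping rules above can be read as an ordered rule table. The sketch below evaluates them in the order listed, which is an assumption, as are the concrete thresholds, the four urgency values, and the fallback used for combinations the rules do not cover (for example, low speed in a forward gear).

```python
from enum import Enum


class Adas(Enum):
    INACTIVE = 0
    ACTIVE = 1
    WARNING = 2


class Gear(Enum):
    PARK = "P"
    REVERSE = "R"
    NEUTRAL = "N"
    DRIVE = "D"


def driving_urgency(speed_kmh: float, gear: Gear, adas: Adas,
                    v1: float = 80.0, v2: float = 10.0,
                    levels=(1.0, 0.8, 0.5, 0.2),
                    default: float = 0.5) -> float:
    """Return the urgency value; all numeric defaults are hypothetical."""
    if v1 <= v2:
        raise ValueError("first speed threshold must exceed the second")
    u1, u2, u3, u4 = levels
    if not u1 > u2 > u3 > u4:
        raise ValueError("urgency values must decrease in sequence")
    if adas is Adas.WARNING:                      # rule 1
        return u1
    if speed_kmh >= v1:                           # rule 2
        return u2
    if v2 < speed_kmh < v1 and adas is Adas.ACTIVE:   # rule 3
        return u3
    if speed_kmh <= v2 and gear in (Gear.REVERSE, Gear.PARK):  # rule 4
        return u4
    return default  # not covered by the listed rules; fallback is an assumption
```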
[0207] In one possible implementation, the non-safety-critical levels include a driving assistance level and an experience enhancement level, and the scheduling decision module 12 is specifically used for:
[0208] If the task scheduling score is greater than or equal to the first score threshold, and the current network state is in full-function state, then the target computing unit for executing the inference task is determined to be the cloud computing unit.
[0209] If the task scheduling score is greater than or equal to the first score threshold, and the current network status is restricted, then the target computing unit for executing the inference task at the driving assistance level is determined to be the vehicle-side computing unit, and the target computing unit for executing the inference task at the experience enhancement level is determined to be the cloud computing unit.
[0210] In one possible implementation, the scheduling decision module 12 is further configured to:
[0211] If the task scheduling score is less than the first score threshold but greater than the second score threshold, and the current network status is either full-function or restricted, then the target computing unit for executing the inference task at the driving assistance level is determined to be the vehicle-side computing unit, and the target computing unit for executing the inference task at the experience enhancement level is determined to be the cloud computing unit, where the first score threshold is greater than the second score threshold.
[0212] If the safety level is the safety-critical level, or the task scheduling score is less than or equal to the second score threshold, or the current network status is the local status, then the target computing unit for executing the inference task is determined to be the vehicle-side computing unit.
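Taken together, the four decision rules form a small decision table. The following sketch encodes them as a single function; the enum names and the default thresholds 0.7 and 0.4 are hypothetical, and only the ordering of the two thresholds comes from the text.

```python
from enum import Enum


class SafetyLevel(Enum):
    SAFETY_CRITICAL = "safety_critical"
    DRIVING_ASSISTANCE = "driving_assistance"
    EXPERIENCE_ENHANCEMENT = "experience_enhancement"


class NetStatus(Enum):
    FULL = "full_function"
    RESTRICTED = "restricted"
    LOCAL = "local"


class Target(Enum):
    VEHICLE = "vehicle"
    CLOUD = "cloud"


def decide_target(level: SafetyLevel, status: NetStatus, score: float,
                  t1: float = 0.7, t2: float = 0.4) -> Target:
    if t1 <= t2:
        raise ValueError("first score threshold must exceed the second")
    # Always local: safety-critical task, low score, or local network status.
    if (level is SafetyLevel.SAFETY_CRITICAL
            or score <= t2
            or status is NetStatus.LOCAL):
        return Target.VEHICLE
    # High score with a full-function network: both non-critical levels go to cloud.
    if score >= t1 and status is NetStatus.FULL:
        return Target.CLOUD
    # Remaining cases (high score + restricted, or mid score + full/restricted):
    # driving assistance stays on the vehicle, experience enhancement goes to cloud.
    if level is SafetyLevel.DRIVING_ASSISTANCE:
        return Target.VEHICLE
    return Target.CLOUD
```

Checking the vehicle-only conditions first means the remaining branches only ever see non-safety-critical tasks with a score above the second threshold and a non-local network status.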
[0213] In one possible implementation, if the target computing entity is a cloud computing unit, the second processing module 13 is specifically used for:
[0214] Set a timeout duration for the inference task;
[0215] If the inference result is not received from the cloud computing unit within the timeout duration, the inference task is rescheduled to the vehicle-side computing unit for execution.
[0216] In one possible implementation, the second processing module 13 is specifically used for:
[0217] If the safety level of the inference task is the driving assistance level, then a first timeout duration is set for the inference task;
[0218] If the safety level of the inference task is the experience enhancement level, then a second timeout duration is set for the inference task, and the first timeout duration is shorter than the second timeout duration.
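The timeout and fallback behavior can be sketched with Python's standard `concurrent.futures` module. The timeout values, the level labels, and the function names below are hypothetical; the only constraint taken from the text is that the driving assistance timeout is shorter than the experience enhancement timeout.

```python
import concurrent.futures as cf

# Hypothetical per-level timeouts in seconds (first < second).
TIMEOUT_S = {"driving_assistance": 0.2, "experience_enhancement": 1.0}

# Shared worker pool standing in for the cloud request path.
_CLOUD_POOL = cf.ThreadPoolExecutor(max_workers=4)


def run_on_cloud_with_fallback(task, level, cloud_infer, vehicle_infer,
                               timeouts=TIMEOUT_S):
    """Try the cloud first; fall back to the vehicle side on timeout."""
    future = _CLOUD_POOL.submit(cloud_infer, task)
    try:
        return "cloud", future.result(timeout=timeouts[level])
    except cf.TimeoutError:
        # Best effort only: a call that is already running is not interrupted.
        future.cancel()
        return "vehicle", vehicle_infer(task)
```

A late cloud result is simply discarded in this sketch. How cloud-side errors other than a timeout are handled is not described in this passage and is left out here.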
[0219] In one possible implementation, the network status monitoring machine is configured with status switching rules, which include:
[0220] When the network quality value is less than the first network threshold and the duration reaches the first preset duration, or when the network type is switched from 5G to 4G, the current network status will be switched from full-function status to restricted status.
[0221] When the network quality value is less than the second network threshold and the duration reaches the first preset duration, or when the network type is switched from 4G to no service, the current network status is switched from restricted status to local status, and the first network threshold is greater than the second network threshold.
[0222] When the network quality value is greater than or equal to the second network threshold and the duration reaches the first preset duration, and the network type is switched from no service to 4G, then after waiting for the second preset duration, the current network status is switched from local status to restricted status.
[0223] When the network quality value is greater than or equal to the first network threshold and the duration reaches the first preset duration, and the network type is switched from 4G to 5G, then after waiting for the second preset duration, the current network state is switched from the restricted state to the full-function state.
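The switching rules amount to a three-state machine with fast downgrades and delayed upgrades. The sketch below is one simplified reading under stated assumptions: a network type switch is detected from the currently reported type rather than from the transition itself, the upgrade condition must keep holding during the second preset duration, and all thresholds and durations are made-up values.

```python
from enum import Enum


class NetStatus(Enum):
    FULL = "full_function"
    RESTRICTED = "restricted"
    LOCAL = "local"


class NetworkStatusMachine:
    def __init__(self, n1=0.6, n2=0.3, d1=3.0, d2=5.0):
        if n1 <= n2:
            raise ValueError("first network threshold must exceed the second")
        self.n1, self.n2, self.d1, self.d2 = n1, n2, d1, d2
        self.status = NetStatus.FULL
        self._low_since = None    # when quality first dropped below threshold
        self._good_since = None   # when the upgrade condition first held

    def _switch(self, new_status):
        self.status = new_status
        self._low_since = self._good_since = None

    def update(self, now: float, quality: float, net_type: str) -> NetStatus:
        """net_type is one of "5G", "4G", "NONE" (assumed labels)."""
        if self.status is not NetStatus.LOCAL:          # downgrade rules
            full = self.status is NetStatus.FULL
            threshold = self.n1 if full else self.n2
            type_drop = (net_type != "5G") if full else (net_type == "NONE")
            if quality < threshold:
                if self._low_since is None:
                    self._low_since = now
            else:
                self._low_since = None
            sustained = (self._low_since is not None
                         and now - self._low_since >= self.d1)
            if type_drop or sustained:
                self._switch(NetStatus.RESTRICTED if full else NetStatus.LOCAL)
                return self.status
        if self.status is not NetStatus.FULL:           # upgrade rules
            local = self.status is NetStatus.LOCAL
            threshold = self.n2 if local else self.n1
            type_ok = (net_type != "NONE") if local else (net_type == "5G")
            if quality >= threshold and type_ok:
                if self._good_since is None:
                    self._good_since = now
            else:
                self._good_since = None
            # Sustain for d1, then wait a further d2 before upgrading.
            if (self._good_since is not None
                    and now - self._good_since >= self.d1 + self.d2):
                self._switch(NetStatus.RESTRICTED if local else NetStatus.FULL)
        return self.status
```

Making upgrades slower than downgrades gives the hysteresis implied by the extra waiting period, so a briefly recovering signal does not cause the status to flap.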
[0224] The inference task scheduling device provided in this application embodiment can execute the technical solution shown in the above method embodiment. Its implementation principle and beneficial effects are similar, and will not be described again here.
[0225] Figure 8 is a schematic diagram of the structure of an electronic device provided in an embodiment of this application. Referring to Figure 8, the electronic device 20 provided in this embodiment can be a cockpit domain controller or a vehicle controller. The electronic device 20 includes at least one processor 21 and a memory 22. Optionally, the electronic device 20 also includes a communication component 24. The processor 21, memory 22, and communication component 24 are connected via a bus 23.
[0226] In a specific implementation, the at least one processor 21 executes computer-executable instructions stored in the memory 22, causing the at least one processor 21 to perform the above-described method.
[0227] The specific implementation process of processor 21 can be found in the above method embodiments, and its implementation principle and technical effect are similar. It will not be repeated here.
[0228] In the above embodiments, it should be understood that the processor can be a Central Processing Unit (CPU), or other general-purpose processors, digital signal processors (DSPs), application-specific integrated circuits (ASICs), etc. The general-purpose processor can be a microprocessor or any conventional processor. The steps of the method disclosed in this invention can be directly implemented by a hardware processor, or implemented by a combination of hardware and software modules within the processor.
[0229] The memory may include random access memory (RAM) and may also include non-volatile memory (NVM), such as at least one disk storage device.
[0230] The bus can be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, or an Extended Industry Standard Architecture (EISA) bus, etc. Buses can be categorized as address buses, data buses, control buses, etc. For ease of illustration, the buses shown in the accompanying drawings are not limited to a single bus or a single type of bus.
[0231] This application embodiment also provides a vehicle, including a vehicle body and the electronic device shown in Figure 8. Optionally, the vehicle may also include a cellular communication module for data transmission between the vehicle and the cloud computing unit, in order to implement the inference task scheduling method shown in the above method embodiments.
[0232] This application also provides a computer program product, including a computer program that, when executed by a processor, implements the above-described method.
[0233] This application also provides a computer-readable storage medium storing computer-executable instructions, which, when executed by a processor, implement the above-described method.
[0234] The aforementioned readable storage medium can be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic storage, flash memory, magnetic disk, or optical disk. The readable storage medium can be any available medium accessible to a general-purpose or special-purpose computer.
[0235] An exemplary readable storage medium is coupled to a processor, enabling the processor to read information from and write information to the readable storage medium. Of course, the readable storage medium can also be a component of the processor. The processor and the readable storage medium can reside in an Application Specific Integrated Circuit (ASIC). Alternatively, the processor and the readable storage medium can exist as discrete components in the device.
[0236] The division of units is merely a logical functional division; in actual implementation, there may be other division methods. For example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. Furthermore, the coupling or direct coupling or communication connection shown or discussed may be indirect coupling or communication connection through some interfaces, devices, or units, and may be electrical, mechanical, or other forms.
[0237] The units described as separate components may or may not be physically separate. The components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units can be selected to achieve the purpose of this embodiment according to actual needs.
[0238] In addition, the functional units in the various embodiments of the present invention can be integrated into one processing unit, or each unit can exist physically separately, or two or more units can be integrated into one unit.
[0239] If a function is implemented as a software functional unit and sold or used as an independent product, it can be stored in a computer-readable storage medium. Based on this understanding, the technical solution of this invention, or the part that contributes to the prior art, or a part of the technical solution, can be embodied in the form of a software product. This computer software product is stored in a storage medium and includes several instructions to cause a computer device (which may be a personal computer, server, or network device, etc.) to execute all or part of the steps of the methods of the various embodiments of this invention. The aforementioned storage medium includes various media capable of storing program code, such as USB flash drives, portable hard drives, read-only memory (ROM), random access memory (RAM), magnetic disks, or optical disks.
[0240] Those skilled in the art will understand that all or part of the steps of the above-described method embodiments can be implemented by hardware related to program instructions. The aforementioned program can be stored in a computer-readable storage medium. When executed, the program performs the steps of the above-described method embodiments; and the aforementioned storage medium includes various media capable of storing program code, such as ROM, RAM, magnetic disks, or optical disks.
[0241] Finally, it should be noted that other embodiments of the invention will readily occur to those skilled in the art upon consideration of the specification and practice of the invention disclosed herein. This invention is intended to cover any variations, uses, or adaptations of the invention that follow the general principles of the invention and include common knowledge or customary techniques in the art not disclosed herein, and is not limited to the precise structures described above and shown in the accompanying drawings, and various modifications and changes can be made without departing from its scope. The scope of the invention is limited only by the appended claims.
Claims
1. A method for scheduling inference tasks, characterized in that it includes: determining a task scheduling score based on the vehicle's network communication data, processor load data, and driving status data; if the safety level of the inference task is a non-safety-critical level, making a scheduling decision based on the current network state determined by the network state monitoring machine and the task scheduling score, to determine the target computing entity that executes the inference task, wherein the current network state reflects the network quality of the vehicle, and the target computing entity is a vehicle-side computing unit or a cloud computing unit; and scheduling the inference task to the target computing entity for execution to generate an inference result.
2. The method according to claim 1, characterized in that, The process of determining the task scheduling score based on the vehicle's network communication data, processor load data, and driving status data includes: Calculate the network quality value based on the network communication data; Calculate the computing power margin value based on the processor load data; Based on the driving status data, the urgency value of the driving scenario is determined; The task scheduling score is obtained by weighting the network quality value, the computing power margin value, and the driving scenario urgency value.
3. The method according to claim 2, characterized in that, The network communication data includes signal reception power and signal reception quality. The calculation of the network quality value based on the network communication data includes: The signal received power and the signal received quality are normalized to obtain normalized power values and normalized quality values; Based on preset signal reception power weights and signal reception quality weights, the normalized power value and the normalized quality value are weighted and summed to obtain the network quality value.
4. The method according to claim 2, characterized in that, The processor load data includes processor utilization and memory utilization. The step of calculating the computing power margin value based on the processor load data includes: Calculate the processor idle rate based on the processor utilization; Calculate the memory idle rate based on the memory utilization; Based on preset processor computing power weights and memory computing power weights, the processor idle rate and the memory idle rate are weighted and summed to obtain the computing power margin value.
5. The method according to claim 2, characterized in that, The driving status data includes at least one of vehicle speed, gear position, and Advanced Driver Assistance System (ADAS) status. Determining the urgency value of the driving scenario based on the driving status data includes: Retrieve predefined scene urgency mapping rules; Based on the scenario urgency mapping rule, determine the driving scenario urgency value corresponding to the vehicle speed, gear position, and ADAS status.
6. The method according to claim 5, characterized in that, The scenario urgency mapping rules include: When the ADAS status is a warning-triggered state, the urgency value of the driving scenario is the first urgency value; When the vehicle speed is greater than or equal to the first speed threshold, the urgency value of the driving scenario is the second urgency value; If the vehicle speed is less than the first speed threshold and greater than the second speed threshold, and the ADAS status is active, the urgency value of the driving scenario is the third urgency value; When the vehicle speed is less than or equal to the second speed threshold and the gear is reverse or park, the urgency value of the driving scenario is the fourth urgency value. Wherein, the first speed threshold is greater than the second speed threshold, and the first urgency value, the second urgency value, the third urgency value, and the fourth urgency value decrease sequentially.
7. The method according to any one of claims 1-6, characterized in that, The non-safety-critical levels include driving assistance levels and experience enhancement levels. If the safety level of the inference task is a non-safety-critical level, a scheduling decision is made based on the current network state determined by the network state monitoring machine and the task scheduling score to determine the target computing party to execute the inference task, including: If the task scheduling score is greater than or equal to the first score threshold, and the current network state is in full-function state, then the target computing entity for executing the inference task is determined to be the cloud computing unit. If the task scheduling score is greater than or equal to the first score threshold, and the current network state is a restricted state, then the target computing entity for executing the inference task of the driving assistance level is determined to be the vehicle-side computing unit, and the target computing entity for executing the inference task of the experience enhancement level is determined to be the cloud computing unit.
8. The method according to claim 7, characterized in that, The method further includes: If the task scheduling score is less than the first score threshold and greater than the second score threshold, and the current network state is either the full-function state or the restricted state, then the target computing entity for executing the inference task of the driving assistance level is determined to be the vehicle-side computing unit, and the target computing entity for executing the inference task of the experience enhancement level is determined to be the cloud computing unit, wherein the first score threshold is greater than the second score threshold. If the safety level is the safety-critical level, or the task scheduling score is less than or equal to the second score threshold, or the current network state is the local state, then the target computing entity for executing the inference task is determined to be the vehicle-side computing unit.
9. The method according to any one of claims 1-6, characterized in that, If the target computing entity is the cloud computing unit, the step of scheduling the inference task to the target computing entity for execution and generating inference results includes: Set a timeout duration for the inference task; If no inference result is received from the cloud computing unit within the timeout duration, the inference task is rescheduled to the vehicle-side computing unit for execution.
10. The method according to claim 9, characterized in that, Setting a timeout duration for the inference task includes: If the safety level of the inference task is the driving assistance level, then a first timeout duration is set for the inference task; If the safety level of the inference task is the experience enhancement level, then a second timeout duration is set for the inference task, wherein the first timeout duration is less than the second timeout duration.
11. The method according to any one of claims 2-6, characterized in that, The network status monitoring machine is configured with status switching rules, which include: When the network quality value is less than the first network threshold and the duration reaches the first preset duration, or when the network type is switched from 5G to 4G, the current network state is switched from full-function state to restricted state. When the network quality value is less than the second network threshold and the duration reaches the first preset duration, or when the network type is switched from 4G to no service, the current network state is switched from the restricted state to the local state, and the first network threshold is greater than the second network threshold. When the network quality value is greater than or equal to the second network threshold and the duration reaches the first preset duration, and the network type is switched from no service to 4G, then after waiting for the second preset duration, the current network state is switched from the local state to the restricted state. When the network quality value is greater than or equal to the first network threshold and the duration reaches the first preset duration, and the network type is switched from 4G to 5G, then after waiting for the second preset duration, the current network state is switched from the restricted state to the full-function state.
12. An inference task scheduling device, characterized in that, The device includes: The first processing module is used to determine the task scheduling score based on the vehicle's network communication data, processor load data, and driving status data. The scheduling decision module is used, if the safety level of the inference task is a non-safety-critical level, to make a scheduling decision based on the current network status determined by the network status monitoring machine and the task scheduling score, and to determine the target computing party to execute the inference task. The current network status reflects the network quality of the vehicle, and the target computing party is a vehicle-side computing unit or a cloud computing unit. The second processing module is used to schedule the inference task to the target computing party for execution and generate inference results.
13. An electronic device, characterized in that it includes: a memory and a processor; the memory stores computer-executable instructions; the processor executes the computer-executable instructions stored in the memory, causing the processor to perform the method as described in any one of claims 1-11.
14. A vehicle, characterized in that it includes: a vehicle body, and the electronic device as described in claim 13.
15. A computer-readable storage medium, characterized in that, The computer-readable storage medium stores computer-executable instructions, which, when executed by a processor, are used to implement the method as described in any one of claims 1-11.
16. A computer program product, characterized in that, Includes a computer program that, when executed by a processor, implements the method described in any one of claims 1-11.