A cloud-edge collaborative algorithm scheduling method and device
By collecting and weighting resource indicators through the edge resource monitoring module and combining them with a dual-mode reporting strategy, dynamic algorithm scheduling for edge-cloud collaboration is achieved. This solves the rigidity and lag problems of existing cloud-edge collaborative algorithm scheduling schemes, and improves resource utilization efficiency and user experience.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- CHINA UNICOM ONLINE INFORMATION TECHNOLOGY CO LTD
- Filing Date
- 2026-03-17
- Publication Date
- 2026-06-19
AI Technical Summary
Existing cloud-edge collaborative algorithm scheduling schemes suffer from problems such as rigid scheduling modes, incomplete resource monitoring, passive status reporting, and delayed decision response. They are difficult to adapt to the application requirements of multi-user concurrency and high-complexity algorithms, and cannot achieve efficient and balanced utilization of edge-cloud computing power.
The edge resource monitoring module collects multiple operational resource indicators, uses weighted fusion calculation to generate comprehensive resource evaluation indicators, and combines periodic reporting and event-triggered reporting strategies to achieve dynamic algorithm scheduling in a cloud-edge collaborative manner, supporting seamless switching between dynamic algorithm package distribution and cloud invocation.
It enables real-time monitoring and evaluation of the status of edge resources, shortens scheduling response time, improves algorithm execution efficiency and stability, and ensures service quality and operational reliability in multi-user concurrent scenarios.
Figure CN122248069A_ABST
Description
Technical Field
[0001] This invention relates to the field of cloud-edge collaborative computing technology, and in particular to a cloud-edge collaborative algorithm scheduling method and apparatus.
Background Technology
[0002] Cloud-edge collaborative computing architecture, with its advantages of low latency, high bandwidth, and distributed processing, has been widely used in various scenarios such as intelligent vision and IoT device management, playing a particularly important role in algorithm scheduling for AIoT terminal devices such as smart cameras. With the increasing prevalence of intelligent vision applications, the demands for multi-user concurrent calls and increased algorithm complexity are becoming more prominent, placing higher requirements on the flexibility, real-time performance, and reliability of cloud-edge collaborative scheduling.
[0003] However, current mainstream cloud-edge collaborative scheduling solutions still have many shortcomings and are difficult to adapt to complex and ever-changing real-world application scenarios. Regarding resource scheduling modes, traditional solutions mostly adopt static configuration methods, that is, fixing the algorithm's deployment location during the device initialization phase. This means either deploying the algorithm directly on the edge or uploading it to the cloud for execution. During operation, it cannot dynamically adjust based on the real-time resource usage of the edge device, resulting in underutilization of edge resources when they are idle, and problems such as algorithm execution stuttering and failures occurring when resources are scarce.
[0004] In terms of resource monitoring, most existing solutions focus only on single general resource metrics such as CPU utilization and network bandwidth, lacking comprehensive monitoring of critical resources on edge devices, especially ignoring memory usage, storage space, and the operational status of dedicated AI computing chips such as the NPU. Since the core capabilities of AIoT devices such as smart cameras rely on the inference performance of dedicated AI chips, evaluating resource status based on a single metric can easily lead to biased scheduling decisions and an inability to accurately match the resource requirements of algorithm execution.
[0005] Regarding status reporting and decision response, most existing systems passively query resource information from the endpoint. This means the cloud side only queries the endpoint resource status and makes scheduling decisions after the user triggers the algorithm call. This model introduces significant scheduling latency, failing to meet the needs of low-latency applications. Furthermore, some pre-scheduling schemes based on historical data, while attempting to predict resource load in advance, rely excessively on the accuracy of historical data. When faced with sudden multi-user concurrency, device hardware failures, or thermal throttling, their predictive performance is poor, hindering timely adjustments to scheduling strategies and consequently affecting the stability of algorithm execution.
[0006] Furthermore, in scenarios where multiple users share terminal devices such as smart cameras, existing solutions lack effective resource contention avoidance mechanisms. When multiple users call the algorithm simultaneously, resource contention and computing power overload can easily occur, leading to decreased algorithm execution efficiency, increased response latency, or even execution failure, which seriously affects the user experience.
[0007] In summary, current cloud-edge collaborative algorithm scheduling solutions generally suffer from rigid scheduling modes, incomplete resource monitoring, passive status reporting, and delayed decision-making responses. These issues make them ill-suited for applications with multi-user concurrency and high-complexity algorithms, and hinder efficient and balanced utilization of edge-cloud computing power. Therefore, there is an urgent need for a cloud-based algorithm scheduling method based on real-time status reporting from the edge SDK, multi-dimensional resource weighted evaluation, and adaptive switching of edge-cloud execution paths. This method aims to achieve stable deployment and execution of edge-cloud collaborative algorithms in low-latency, high-reliability, and high-concurrency scenarios.
Summary of the Invention
[0008] A first objective of this invention is to provide a cloud-edge collaborative algorithm scheduling method.
[0009] A second objective of this invention is to provide a cloud-edge collaborative algorithm scheduling device.
[0010] A third objective of this invention is to provide an electronic device.
[0011] A fourth objective of this invention is to provide a non-transitory computer-readable storage medium.
[0012] To achieve the above objectives, a first aspect of the present invention proposes a cloud-edge collaborative algorithm scheduling method, comprising:
[0013] When the smart camera algorithm starts, the edge resource monitoring module completes initialization, collects and calculates multiple operating resource usage indicators of the terminal device, takes these indicators as the raw resource indicators, and samples and updates them according to a preset period. Based on preset weight coefficients, the edge state calculation module performs a weighted fusion calculation on the raw resource indicators to generate a comprehensive resource evaluation index that represents the current level of resource sufficiency, and caches and updates that index. A dual-mode strategy combining periodic reporting and event-triggered reporting is adopted: the end-side reporting control module encapsulates the comprehensive resource evaluation index together with the corresponding raw resource indicators and uploads them to the cloud platform. The cloud-side information receiving module parses and caches the uploaded index and indicators, the status assessment module determines the terminal computing power status based on preset dual thresholds, and the algorithm scheduling module adaptively executes the corresponding algorithm scheduling according to the determination result, thereby achieving dynamic cloud-edge collaborative algorithm scheduling.
[0014] Optionally, performing the weighted fusion calculation on the raw resource indicators through the edge-side state calculation module based on preset weight coefficients, generating a comprehensive resource evaluation index characterizing the current level of resource sufficiency, and caching and updating that index, includes: presetting a memory weight coefficient α, a disk weight coefficient β, an NPU weight coefficient γ, and a CPU weight coefficient θ, which satisfy a preset constraint (the constraint expression is not reproduced in this text); and, after each sampling of the raw resource indicators, performing a weighted fusion calculation through the edge-side state calculation module according to a preset calculation formula (the formula is likewise not reproduced in this text).
[0015] Where D is the comprehensive evaluation index of edge computing resources, a is the memory utilization rate, b is the disk idle rate, c is the NPU utilization rate, and d is the CPU utilization rate. The calculated comprehensive evaluation index D is stored in the cache and updated after the next calculation.
[0016] Optionally, adopting the dual-mode strategy combining periodic reporting and event-triggered reporting, and encapsulating and uploading the comprehensive resource evaluation index and the corresponding raw resource indicators to the cloud platform via the end-side reporting control module, includes: After the smart camera is powered on, it first establishes a stable communication connection with the cloud platform. Once the connection is successfully established, the end-side reporting control module starts the periodic reporting mode and the event-triggered reporting mode at the same time. In the periodic reporting mode, the comprehensive resource evaluation index D and the raw resource indicators a, b, c, and d are read at a preset period Ts, encapsulated into a JSON data packet, and uploaded to the cloud platform via the MQTT communication protocol. In the event-triggered reporting mode, the current D value is compared with the historical D value in real time and the absolute difference Sn is calculated. When Sn exceeds a preset change threshold S (Sn > S), the relevant indicator data is immediately encapsulated and uploaded to the cloud platform, so that sudden changes in end-side resource status are synchronized in real time.
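By way of illustration only, the reporting decision described in paragraph [0016] might be sketched as follows. This is a minimal sketch under stated assumptions, not the implementation of the embodiment: the class name `DualModeReporter`, the JSON field names, the default device identifier, and the choice to compare the current D with the most recently reported D are all hypothetical. The transport is injected as a plain callable, so no particular MQTT client library is assumed.

```python
import json
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class DualModeReporter:
    """Decides when a sample is reported: every Ts seconds, or when D jumps by more than S."""
    publish: Callable[[str], None]   # assumed wrapper around the transport (e.g. an MQTT publish)
    period_ts: float                 # periodic reporting interval Ts, in seconds
    threshold_s: float               # change threshold S for event-triggered reporting
    device_id: str = "camera-001"    # hypothetical device identifier
    last_time: Optional[float] = None
    last_d: Optional[float] = None

    def on_sample(self, now: float, index_d: float, raw: Dict[str, float]) -> Optional[str]:
        """Report the sample if either mode fires; return the mode used, or None."""
        if self.last_time is None or self.last_d is None:
            mode = "periodic"                                 # first report after connecting
        elif abs(index_d - self.last_d) > self.threshold_s:
            mode = "event"                                    # Sn > S: sudden change in D
        elif now - self.last_time >= self.period_ts:
            mode = "periodic"                                 # Ts has elapsed
        else:
            return None                                       # nothing to report yet
        packet = json.dumps({
            "device_id": self.device_id,
            "mode": mode,
            "timestamp": now,
            "D": index_d,
            **raw,                                            # raw indicators a, b, c, d
        })
        self.publish(packet)
        self.last_time, self.last_d = now, index_d
        return mode
```

In this sketch an event-triggered report also restarts the periodic timer, which avoids sending a periodic packet immediately after an event packet; the source text does not specify this interaction.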
[0017] Optionally, parsing and caching the uploaded comprehensive resource evaluation index and raw resource indicators through the cloud-side information receiving module includes: The cloud-side information receiving module listens for data packets reported by the end side. After receiving a JSON-format data packet, it parses and decapsulates the packet to extract the device's unique identification number, the comprehensive resource evaluation index D, and the raw resource indicators a, b, c, and d. It then updates the device status cache table keyed by the device's unique identification number, configures a preset validity period Td for the cached data, and records the update timestamp. When an algorithm scheduling request is received, the module determines whether the timestamp of the cached data is within the validity period Td. If so, the cached data is used for state evaluation. If not, the cached data is marked as expired and the module waits for a new round of data reporting from the end side.
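As an illustrative sketch only, the device status cache table with validity period Td described in paragraph [0017] could look like the following. The names `DeviceStatusCache`, `CachedStatus`, and `get_valid` are hypothetical, and timestamps are passed in explicitly so the behaviour is easy to follow.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class CachedStatus:
    index_d: float            # comprehensive resource evaluation index D
    raw: Dict[str, float]     # raw indicators a, b, c, d
    updated_at: float         # update timestamp, in seconds
    expired: bool = False


class DeviceStatusCache:
    """Per-device status table; entries older than Td are treated as expired."""

    def __init__(self, validity_td: float) -> None:
        self.validity_td = validity_td
        self._table: Dict[str, CachedStatus] = {}

    def update(self, device_id: str, index_d: float, raw: Dict[str, float], now: float) -> None:
        # A new report overwrites the previous entry for the same device.
        self._table[device_id] = CachedStatus(index_d, dict(raw), now)

    def get_valid(self, device_id: str, now: float) -> Optional[CachedStatus]:
        """Return the entry if it is still within Td; otherwise mark it expired and return None."""
        entry = self._table.get(device_id)
        if entry is None:
            return None
        if now - entry.updated_at > self.validity_td:
            entry.expired = True      # the caller then waits for the next report
            return None
        return entry
```

When `get_valid` returns `None`, the scheduling request has no usable state and, as the paragraph above describes, must wait for the next report from the end side.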
[0018] Optionally, the terminal computing power status can be determined by the status assessment module based on preset dual thresholds, including: In the cloud-side state assessment module, a computing power sufficiency threshold Dh and a computing power shortage threshold Dl are pre-configured, and Dh > Dl is satisfied. The effective comprehensive resource evaluation index D is compared with Dh and Dl in turn; When D≥Dh, the edge computing power is considered sufficient; when Dl<D<Dh, the edge computing power is considered critical; when D≤Dl, the edge computing power is considered strained. The determination result is sent to the algorithm scheduling module in real time.
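The dual-threshold determination in paragraph [0018] can be sketched as below. This is a minimal illustration; the enum and function names are hypothetical, and the threshold values in the usage note are arbitrary examples rather than values from the source.

```python
from enum import Enum


class ComputeState(Enum):
    SUFFICIENT = "sufficient"   # D >= Dh
    CRITICAL = "critical"       # Dl < D < Dh
    STRAINED = "strained"       # D <= Dl


def classify(index_d: float, d_high: float, d_low: float) -> ComputeState:
    """Map the comprehensive index D to one of three states using thresholds Dh > Dl."""
    if not d_high > d_low:
        raise ValueError("Dh must be greater than Dl")
    if index_d >= d_high:
        return ComputeState.SUFFICIENT
    if index_d <= d_low:
        return ComputeState.STRAINED
    return ComputeState.CRITICAL
```

For example, with the illustrative thresholds Dh = 0.7 and Dl = 0.3, `classify(0.5, 0.7, 0.3)` returns `ComputeState.CRITICAL`.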
[0019] Optionally, the algorithm scheduling module adaptively executing the corresponding algorithm scheduling based on the determination result, to achieve dynamic cloud-edge collaborative algorithm scheduling, includes: When the algorithm scheduling module receives a determination result indicating sufficient computing power, it retrieves the complete algorithm plugin package and sends it to the terminal device; the terminal then loads the module and executes local inference, reporting inference events and displaying the results in real time. When it receives a determination result indicating critical computing power, it sends the lightweight algorithm plugin package to the terminal, and local inference is performed through the lightweight module to reduce the computing load on the terminal side. When it receives a determination result indicating strained computing power, it generates a cloud-side algorithm call token and binds it to the terminal video stream address; cloud computing power is then invoked to complete the inference, and the result is sent back to the user terminal.
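A possible shape for the three scheduling branches of paragraph [0019] is sketched below, purely as an illustration. The action and package labels, the dictionary layout, and the use of a random URL-safe string as the call token are assumptions; the source does not specify how the token is generated or how packages are identified.

```python
import secrets
from typing import Dict


def schedule(state: str, stream_url: str) -> Dict[str, str]:
    """Choose an execution path from the determined computing power state.

    `state` is one of "sufficient", "critical", or "strained".
    """
    if state == "sufficient":
        # Push the complete algorithm plugin package for local inference.
        return {"action": "push_package", "package": "full"}
    if state == "critical":
        # Push the lightweight package to reduce the load on the terminal.
        return {"action": "push_package", "package": "lightweight"}
    if state == "strained":
        # Run inference in the cloud: issue a call token bound to the video stream address.
        token = secrets.token_urlsafe(16)
        return {"action": "cloud_inference", "token": token, "stream_url": stream_url}
    raise ValueError(f"unknown state: {state!r}")
```

The stream address in any call, for instance `schedule("strained", "rtsp://example.invalid/stream")`, is a placeholder and not an address from the source.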
[0020] To achieve the above objectives, a second aspect of the present invention provides a cloud-edge collaborative algorithm scheduling device, comprising: The resource acquisition unit is used to complete initialization through the end-side resource monitoring module when the intelligent camera algorithm starts, collect and calculate multiple operating resource occupancy indicators of the terminal device, use the multiple operating resource occupancy indicators as raw resource indicators, and complete the sampling and updating of raw resource indicators according to a preset period. The indicator fusion unit is used to perform weighted fusion calculation on the original resource indicators based on preset weight coefficients through the edge state calculation module, generate a comprehensive resource evaluation indicator that represents the current resource abundance, and cache and update the comprehensive evaluation indicator. The data reporting unit is used to adopt a dual-mode strategy that combines periodic reporting and event-triggered reporting. It encapsulates the comprehensive resource evaluation index and the corresponding original resource index and uploads them to the cloud platform through the end-side reporting control module. The algorithm scheduling unit is used to parse and cache the uploaded comprehensive resource evaluation indicators and original resource indicators by the cloud-side information receiving module, determine the terminal computing power status by the status evaluation module based on preset dual thresholds, and then the algorithm scheduling module adaptively executes the corresponding algorithm scheduling according to the determination result to realize dynamic algorithm scheduling of end-cloud collaboration.
[0021] Regarding the apparatus in the above embodiments, the specific manner in which each module performs its operation has been described in detail in the embodiments related to the method, and will not be elaborated upon here.
[0022] To achieve the above objectives, a third aspect of this application provides an electronic device, including a processor and a memory; wherein the processor runs a program corresponding to the executable program code by reading executable program code stored in the memory, so as to implement a cloud-edge collaborative algorithm scheduling method as described in the first aspect embodiment.
[0023] To achieve the above objectives, a fourth aspect of this application provides a non-transitory computer-readable storage medium storing a computer program thereon, which, when executed by a processor, implements a cloud-edge collaborative algorithm scheduling method as described in the first aspect.
[0024] The embodiments of the present invention have the following beneficial effects: 1. Using the utilization rate of the dedicated AI chip on the edge as the core monitoring dimension, and assigning exclusive weights to the NPU utilization rate, it can accurately reflect the real inference performance of AIoT devices such as smart cameras, and improve the pertinence and accuracy of edge resource status assessment.
[0025] 2. By adopting a mechanism of active reporting by the client-side SDK and pre-caching of cloud-side status, resource data is synchronized in real time through dual-mode reporting. This allows the cloud to obtain the effective status before the business is triggered, shortening the scheduling response time. This is different from the traditional passive query mode and significantly reduces decision latency.
[0026] 3. Supports seamless switching between dynamic algorithm package delivery and cloud invocation. It can adaptively select local execution or cloud inference based on the comprehensive evaluation index D value, maximizing the advantages of end-to-cloud collaboration and improving algorithm execution efficiency and stability without affecting the user experience.
[0027] 4. A resource contention avoidance mechanism is established for devices shared by multiple users. The risk of resource contention is predicted in advance from the comprehensive evaluation index D, and tasks are dynamically distributed to the cloud for execution, which effectively avoids end-side resource overload and ensures service quality and operational reliability in multi-user concurrent scenarios.
Attached Figure Description
[0028] The above and/or additional aspects and advantages of the present invention will become apparent and readily understood from the following description of the embodiments taken in conjunction with the accompanying drawings, wherein:
Figure 1 is a flowchart of a cloud-edge collaborative algorithm scheduling method provided in an embodiment of the present invention;
Figure 2 is a layered architecture diagram of the edge-cloud collaborative state awareness and scheduling system provided in an embodiment of the present invention;
Figure 3 is a schematic diagram of the dual-mode reporting process for the status of the end-side device provided in an embodiment of the present invention;
Figure 4 is a flowchart of the cloud-side device status classification assessment provided in an embodiment of the present invention;
Figure 5 is a flowchart of the trigger-based cloud-side algorithm scheduling and inference mode selection provided in an embodiment of the present invention;
Figure 6 is a structural diagram of a cloud-edge collaborative algorithm scheduling device provided in an embodiment of the present invention.
Detailed Implementation
[0029] It should be noted that, unless otherwise specified, the embodiments and features described in the present invention can be combined with each other. The present invention will now be described in detail with reference to the accompanying drawings and embodiments.
[0030] To enable those skilled in the art to better understand the present invention, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings of the embodiments of the present invention. Obviously, the described embodiments are only some embodiments of the present invention, and not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort should fall within the scope of protection of the present invention.
[0031] The following description, with reference to the accompanying drawings, illustrates an algorithmic scheduling method and apparatus for cloud-edge collaboration according to an embodiment of the present invention.
[0032] Example 1
This invention provides a cloud-edge collaborative algorithm scheduling method. Figure 1 is a flowchart of the method provided in an embodiment of the present invention. As shown in Figure 1, the method includes the following steps:
Step S1: When the smart camera algorithm starts, the edge resource monitoring module completes initialization, collects and calculates multiple operating resource usage indicators of the terminal device, takes these indicators as the raw resource indicators, and samples and updates them according to a preset period.
[0033] The overall architecture of the edge-cloud collaborative system on which the embodiments of this application are based is shown in Figure 2. The architecture adopts a layered design, divided from top to bottom into three core functional layers: the edge-side state detection layer, the communication transmission layer, and the cloud-side logic judgment layer. The edge-side state detection layer, as the data perception source of the entire system, is mainly composed of three core sub-modules connected in series: a resource monitoring module, a state calculation module, and a state reporting module. These modules are connected through standardized data interfaces and provide the underlying support for real-time perception and accurate assessment of the resource status of edge devices. In the actual operation of this embodiment, the resource monitoring module deployed inside the device automatically triggers its initialization process as soon as the algorithm SDK on the smart camera device has started and finished loading. This module collects the occupancy status of the core hardware resources of the terminal device during operation, standardizes the collected raw data, and then continuously samples and dynamically updates these raw resource indicators according to a preset period. This provides a real-time data foundation for the subsequent calculation of edge-side resource status and for scheduling decisions on the cloud side.
[0034] Specifically, to comprehensively and objectively assess the end-side resource status of smart camera devices, the resource monitoring module employs a multi-dimensional, comprehensive data collection strategy. This embodiment of the application collects various core resource indicators from the end-side across different dimensions, ensuring the comprehensiveness and accuracy of the collected data. This provides reliable data support for subsequent resource fusion and evaluation, avoiding the one-sidedness of resource evaluation caused by collecting only a single indicator.
[0035] First, the module calls the native system resource access interface of the Linux operating system, for example, but not limited to, reading the /proc/meminfo file, to obtain the device's total physical memory capacity and the currently occupied memory capacity in real time at the kernel level. The memory utilization rate a is then calculated as the ratio of occupied memory to total memory. This indicator quantitatively reflects the current level of memory consumption on the edge. In this embodiment, the accurate calculation of the memory utilization rate a is particularly important because loading the edge algorithm module and caching inference data both require a large amount of memory, and the sufficiency of memory resources directly affects the operational stability of the edge algorithm.
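As a non-authoritative illustration of the calculation in paragraph [0035], the sketch below derives the memory utilization rate a from the text of /proc/meminfo. Treating occupied memory as `MemTotal - MemAvailable` (falling back to `MemFree` on kernels that do not report `MemAvailable`) is an assumption of this sketch; the source does not state which fields are used. The function names are hypothetical.

```python
from typing import Dict


def parse_meminfo(text: str) -> Dict[str, int]:
    """Parse /proc/meminfo content into {field name: integer value (kB for sized fields)}."""
    fields: Dict[str, int] = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        tokens = rest.split()
        if sep and tokens:
            fields[key.strip()] = int(tokens[0])
    return fields


def memory_utilization(text: str) -> float:
    """Memory utilization rate a = occupied / total, in the range [0, 1]."""
    info = parse_meminfo(text)
    total = info["MemTotal"]
    # Assumption: memory that is neither free nor reclaimable counts as occupied.
    available = info["MemAvailable"] if "MemAvailable" in info else info["MemFree"]
    return (total - available) / total
```

On a Linux device the input would typically be obtained by reading the file, for example `memory_utilization(open("/proc/meminfo").read())`.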
[0036] Secondly, the module utilizes standard file system statistics interfaces, referencing, but not limited to, the statvfs file system statistics structure interface, to perform a comprehensive scan of the device's storage disks. This scan yields detailed data on the total disk storage space and the current remaining available space. The disk idle rate, 'b', is then calculated. This metric effectively characterizes the sufficiency of remaining storage resources on the endpoint, ensuring the feasibility of data caching and module storage. In this embodiment, the local storage of the endpoint algorithm plugin package and the temporary data caching generated during inference both rely on disk resources. Therefore, the disk idle rate 'b' is a crucial reference indicator for determining whether the endpoint has the conditions for local algorithm deployment.
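The disk idle rate b of paragraph [0036] can be illustrated with Python's `os.statvfs`, which wraps the statvfs interface on Unix-like systems. This is a sketch only; the function name and the use of `f_bavail` (blocks available to unprivileged processes) rather than `f_bfree` are assumptions.

```python
import os


def disk_idle_rate(path: str = "/") -> float:
    """Disk idle rate b = remaining available space / total space, in the range [0, 1]."""
    st = os.statvfs(path)
    total_bytes = st.f_blocks * st.f_frsize
    free_bytes = st.f_bavail * st.f_frsize   # space available to unprivileged processes
    if total_bytes == 0:
        return 0.0                           # pseudo file systems can report zero size
    return free_bytes / total_bytes
```

The path argument would normally point at the partition that stores the algorithm plugin packages and inference caches.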
[0037] Thirdly, regarding the dedicated computing resources relied upon by edge-side intelligent inference tasks, the module acquires data by calling the dedicated software development kit (SDK) interface provided by the corresponding NPU (Neural Processing Unit) chip manufacturer, for example, but not limited to, the aclrtGetMemInfo interface in the ACL API provided for Huawei Ascend chips, or the cnrtQueryDevice interface in the CNToolkit provided for Cambricon chips. This allows the module to obtain the number of active tasks currently running on the NPU chip and the maximum number of concurrent tasks supported by the chip hardware. The NPU utilization rate c is calculated as the ratio of these two values. This metric is a core parameter for measuring the load of edge-side dedicated computing resources. In this embodiment, edge-side intelligent inference tasks mainly rely on the NPU for efficient computation. The NPU utilization rate c directly determines the number and efficiency of algorithm inference tasks that the edge can handle, and is one of the core bases for subsequent algorithm scheduling decisions.
[0038] Finally, to further improve the comprehensiveness of resource assessment, the module also synchronously calls the Linux system's common system resource call interface to obtain the real-time CPU utilization rate d as an auxiliary reference indicator, thereby supplementing the assessment of the busy usage of general computing resources on the edge side. In this embodiment, the CPU is mainly responsible for basic operations such as system scheduling and data transmission on the edge device. Its utilization rate indirectly affects the operating efficiency of dedicated computing resources such as the NPU. Therefore, using the CPU utilization rate d as an auxiliary indicator can further improve the comprehensiveness and accuracy of edge resource assessment.
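Paragraph [0038] does not name the interface used to obtain the CPU utilization rate d. One common approach on Linux, shown here only as an assumed example, is to compare two snapshots of the aggregate `cpu` line in /proc/stat and take the non-idle share of the elapsed time. The function names are hypothetical.

```python
from typing import Tuple


def parse_cpu_line(line: str) -> Tuple[int, int]:
    """Return (idle_ticks, total_ticks) from the aggregate 'cpu' line of /proc/stat."""
    parts = line.split()
    if not parts or parts[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    # Fields: user nice system idle iowait irq softirq steal (guest time is already in user).
    values = [int(v) for v in parts[1:9]]
    idle = values[3] + (values[4] if len(values) > 4 else 0)   # idle + iowait
    return idle, sum(values)


def cpu_utilization(prev_line: str, curr_line: str) -> float:
    """CPU utilization rate d over the interval between two snapshots, in the range [0, 1]."""
    idle0, total0 = parse_cpu_line(prev_line)
    idle1, total1 = parse_cpu_line(curr_line)
    elapsed = total1 - total0
    if elapsed <= 0:
        return 0.0                      # no time elapsed between the snapshots
    return 1.0 - (idle1 - idle0) / elapsed
```

The two snapshots would be taken one sampling period apart, each by reading the first line of /proc/stat.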
[0039] In this embodiment, to balance the real-time nature of data acquisition with the overhead of system operation, the technicians pre-set the resource sampling period to T seconds. That is, every T seconds, the resource monitoring module performs a complete polling and numerical calculation update of the four raw resource indicators: memory utilization, disk idle rate, NPU utilization, and CPU utilization. This design ensures that the resource status data reported from the edge to the cloud remains highly consistent with the device's current real-time operating status, effectively avoiding deviations in subsequent resource status assessments caused by delayed or outdated data acquisition. It also avoids excessive resource consumption on the edge devices due to excessively high sampling frequencies, further guaranteeing the decision-making accuracy and operational stability of the entire edge-cloud collaborative scheduling system.
[0040] Step S2: Based on preset weight coefficients, the original resource indicators are weighted and fused through the edge state calculation module to generate a comprehensive resource evaluation index that represents the current resource abundance level, and the comprehensive evaluation index is cached and updated.
[0041] In this embodiment, the status calculation module, as the core processing unit for edge resource assessment, undertakes the crucial responsibility of fusing, quantifying, and updating raw resource indicators. Its core function is to perform scientific and reasonable weighted fusion calculations on various raw resource indicators collected and organized by the resource monitoring module in step S1, based on pre-configured resource weight coefficients. This results in the generation of a comprehensive resource evaluation indicator that accurately and comprehensively represents the current resource sufficiency of the edge device. Simultaneously, it completes the real-time update operation of this indicator in the device's local cache, realizing the transformation of multi-dimensional and decentralized raw resource data into a unified and quantitative evaluation indicator. This provides a standardized core basis for subsequent status reporting and cloud-side scheduling decisions.
[0042] In this embodiment, the state calculation module pre-configures corresponding weight coefficients for the four types of raw resource indicators collected in step S1 (memory utilization rate a, disk idle rate b, NPU utilization rate c, and CPU utilization rate d), namely memory weight α, disk weight β, NPU weight γ, and CPU weight θ. To keep the weighted fusion calculation well defined, the four weight coefficients must satisfy a preset constraint (the constraint expression is not reproduced in this text). It should be noted that the weight coefficients are not fixed, but can be adjusted according to the computing power requirements and business priorities of the actual application scenario. For example, in a business scenario dominated by visual inference, edge algorithm inference relies mainly on the dedicated computing power of the NPU. In this case, the NPU weight γ can be increased to emphasize the impact of the NPU on the overall resource status of the edge, so that the comprehensive evaluation index better matches actual business needs.
[0043] After each sampling of the raw resource indicators in step S1, the state calculation module immediately triggers a weighted fusion calculation and generates a comprehensive resource evaluation index D using a preset linear fusion formula (the formula itself is not reproduced in this text). The D value is strictly limited to the range [0,1]. This normalization allows the cloud platform to perform unified state judgment and threshold comparison. Specifically, a larger D value indicates more abundant computing, storage, and other resources on the edge device, enabling it to execute algorithm inference tasks locally without relying on cloud computing power. Conversely, a smaller D value indicates more strained resources on the edge device, with insufficient local computing power to support stable algorithm operation. In this case, cloud computing resources should be considered to ensure normal business operation.
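Because the linear fusion formula is not reproduced in this text, the following is only one form that is consistent with the surrounding definitions, offered as an assumption rather than as the formula of the embodiment. It assumes that the weights are non-negative and sum to 1, and that the three utilization rates are converted to free fractions so that a larger D means more abundant resources: D = α(1 − a) + βb + γ(1 − c) + θ(1 − d). The function name is hypothetical.

```python
import math


def fuse_linear(a: float, b: float, c: float, d: float,
                alpha: float, beta: float, gamma: float, theta: float) -> float:
    """Assumed linear fusion: D = alpha*(1-a) + beta*b + gamma*(1-c) + theta*(1-d).

    a, c, d are utilization rates and b is the disk idle rate, all in [0, 1].
    """
    weights = (alpha, beta, gamma, theta)
    if any(w < 0 for w in weights) or not math.isclose(sum(weights), 1.0):
        raise ValueError("weights are assumed to be non-negative and to sum to 1")
    if any(not 0.0 <= r <= 1.0 for r in (a, b, c, d)):
        raise ValueError("indicators must lie in [0, 1]")
    # Utilization rates are inverted so that every term grows with free capacity.
    return alpha * (1 - a) + beta * b + gamma * (1 - c) + theta * (1 - d)
```

Under these assumptions D stays within [0,1]: a fully idle device with an empty disk yields a value of 1, and a saturated device with a full disk yields 0. The weights used in any call are illustrative; the source gives no numeric values.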
[0044] After calculating the comprehensive resource evaluation index D, the status calculation module promptly stores the D value in the local cache of the smart camera device, forming real-time updated resource status cache data. In this embodiment, the local cache adopts an overwrite update mechanism, that is, after each new D value is calculated, the old D value stored in the cache is overwritten with the new D value, thereby achieving real-time updates of the cache data. This ensures that the subsequent status reporting module can always obtain the latest comprehensive resource status of the current end device when performing reporting operations, avoiding problems such as inaccurate reporting information and scheduling decision errors caused by cache data lag.
[0045] As an optional implementation of this application, to further improve the representation accuracy and scenario adaptability of the comprehensive resource evaluation index, a nonlinear fusion function can be used instead of the above linear weighted calculation. For example, a nonlinear fusion formula in the form of a power function can be used (the formula is not reproduced in this text). A nonlinear transformation better reflects the nonlinear behavior of edge resource changes in actual application scenarios. For example, when the NPU utilization rate is close to saturation, its impact on the overall resource status increases nonlinearly. A nonlinear fusion function can capture this characteristic more accurately, further improving the accuracy of the comprehensive resource evaluation index and providing a more reliable basis for subsequent cloud-side adaptive scheduling.
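The power-function formula is likewise not reproduced in this text. The sketch below shows one hypothetical power-function form, chosen only because it exhibits the behaviour described in paragraph [0045]: each indicator is expressed as a utilization u, and its term is w·(1 − u^k) with an assumed exponent k ≥ 1. With k = 1 this reduces to the linear form assumed earlier; with k > 1, changes in utilization near saturation move D more than the same change at low load. The function name and the exponents are assumptions.

```python
from typing import Sequence


def fuse_power(a: float, b: float, c: float, d: float,
               weights: Sequence[float], exponents: Sequence[float]) -> float:
    """Assumed nonlinear fusion: D = sum_i w_i * (1 - u_i ** k_i).

    weights and exponents are ordered (memory, disk, NPU, CPU); weights are assumed
    to be non-negative and to sum to 1, and every exponent is assumed to be >= 1.
    """
    # Express all four indicators as utilizations; the disk idle rate b becomes 1 - b.
    utilizations = (a, 1.0 - b, c, d)
    return sum(w * (1.0 - u ** k) for w, u, k in zip(weights, utilizations, exponents))
```

For instance, with an NPU exponent of 3, a rise in NPU utilization from 0.8 to 0.9 lowers D far more than a rise from 0.1 to 0.2, which mirrors the saturation effect described above.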
[0046] Step S3: A dual-mode strategy combining periodic reporting and event-triggered reporting is adopted. The comprehensive resource evaluation index and the corresponding original resource index are encapsulated and uploaded to the cloud platform through the end-side reporting control module.
[0047] The dual-mode reporting strategy in this embodiment, shown in Figure 3, is the core technical means of achieving accurate synchronization of resource status between the edge and the cloud. The status reporting module adopts a hybrid reporting mode that combines periodic reporting and event-triggered reporting. After standardizing and encapsulating the comprehensive resource evaluation index D calculated in step S2 and the corresponding original resource indicators (memory utilization a, disk idle rate b, NPU utilization c, CPU utilization d), it uploads them to the cloud platform through the system's communication transmission layer. This dual-mode design balances the regularity and the timeliness of resource status synchronization: the cloud platform periodically obtains the baseline status of edge resources, and sudden changes in edge resources are quickly detected and reported. It avoids the lag or resource waste of a single reporting mode and provides timely, reliable data for the cloud's subsequent adaptive scheduling decisions.
[0048] In this embodiment, the startup of the dual-mode reporting strategy is tied to the establishment of the communication connection. When the smart camera device powers on, it first performs a handshake with the cloud platform through the communication transmission layer to establish a stable persistent connection. This embodiment preferably uses the lightweight MQTT communication protocol for this persistent connection. MQTT offers low bandwidth consumption, high reliability, and low latency, and suits the resource characteristics and remote communication needs of the smart camera device. Once the persistent connection between the device and the cloud is established and the communication status is stable, the status reporting module automatically starts both the periodic reporting mode and the event-triggered reporting mode at the same time. The two modes work together without interfering with each other and cover the full range of device resource status scenarios.
[0049] The periodic reporting mode is mainly used to achieve regular and systematic synchronization of the status of edge resources, ensuring that the cloud platform can continuously grasp the baseline operating status of edge resources. In this embodiment, the periodic reporting mode performs reporting operations according to a pre-set reporting period Ts, where the value of Ts is strictly greater than the resource sampling period T set in step S1. This parameter setting can effectively avoid problems such as wasted communication bandwidth and excessive resource consumption of edge devices due to excessive reporting frequency, while also ensuring that the data reported in each period is the latest collected and calculated valid data. Specifically, the periodic reporting mode will periodically read the comprehensive resource evaluation index D and various raw resource indicators stored in the device's local cache at time intervals of Ts, and encapsulate these data into standardized JSON format data packets according to preset format specifications. The JSON format has the characteristics of clear data structure, convenient parsing, and strong cross-platform compatibility, which can ensure that the cloud information receiving module can quickly complete data parsing. After the data packet is encapsulated, it is published to the pre-specified communication topic of the cloud platform through lightweight data transmission protocols such as MQTT, realizing the periodic synchronization of edge resource status, allowing the cloud platform to grasp the normal operating status of edge resources in real time.
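The periodic path described above can be sketched as follows. This is an illustration under stated assumptions, not the application's implementation: the topic name, the JSON field names, and the `read_cache`/`publish` callbacks are hypothetical, and `publish` stands in for whatever MQTT client call a real device would use.

```python
import json
import time
from typing import Callable, Dict

# Hypothetical topic; the text only says "pre-specified communication topic".
TOPIC = "edge/status/report"


def build_packet(device_id: str, status: Dict[str, float]) -> str:
    """Encapsulate D and the raw indicators a, b, c, d as a JSON string."""
    return json.dumps({
        "device_id": device_id,  # hypothetical field name
        "D": status["D"],
        "a": status["a"],  # memory utilization
        "b": status["b"],  # disk idle rate
        "c": status["c"],  # NPU utilization
        "d": status["d"],  # CPU utilization
    })


def run_periodic(device_id: str,
                 read_cache: Callable[[], Dict[str, float]],
                 publish: Callable[[str, str], None],
                 ts: float,
                 t: float,
                 cycles: int,
                 sleep: Callable[[float], None] = time.sleep) -> None:
    """Every Ts seconds, read the local cache and publish one packet."""
    if ts <= t:
        # The text requires Ts to be strictly greater than the sampling period T.
        raise ValueError("reporting period Ts must exceed sampling period T")
    for _ in range(cycles):
        sleep(ts)
        publish(TOPIC, build_packet(device_id, read_cache()))
```

Injecting `publish` and `sleep` keeps the sketch independent of any particular MQTT library and makes the loop easy to exercise without waiting in real time.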
[0050] The event-triggered reporting mode primarily addresses sudden changes in the state of end-side resources. It enables real-time perception and rapid reporting of such changes, avoiding delays in cloud-side scheduling decisions caused by the latency of periodic reporting. In this embodiment, the event-triggered reporting mode monitors, in real time, changes in the comprehensive resource evaluation index D output by the state calculation module. The status reporting module calculates the absolute difference Sn between the latest calculated D value and the historical D value stored in the local cache. This embodiment also presets a difference threshold S, which can be adjusted dynamically according to the resource sensitivity of the actual business scenario and is used to determine whether the state of end-side resources has changed significantly. When Sn is greater than S, a sudden change in the state of end-side resources is determined to have occurred. Examples include a large number of NPU tasks being released at once, causing a sharp drop in NPU utilization, or a large number of new algorithm inference tasks being added, causing a sharp increase in memory and NPU utilization. The status reporting module then immediately triggers the emergency reporting process: without waiting for the next periodic report, it encapsulates the current comprehensive resource evaluation index D and the corresponding original resource indicators into a JSON data packet and uploads it to the cloud platform through the communication transmission layer. This achieves real-time perception and rapid synchronization of sudden changes in end-side resource status, gives the cloud platform an accurate basis for adjusting its scheduling strategy in time, and avoids business interruption or resource waste caused by scheduling delays.
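The trigger condition Sn > S can be sketched as a small detector. The class name is hypothetical, and the sketch assumes that the "historical D value" is the previously cached value, which is then overwritten by the new one; the text does not state how the very first sample is treated, so it is not reported here.

```python
from typing import Optional


class ChangeDetector:
    """Flags a sudden change when |D_new - D_cached| exceeds threshold S."""

    def __init__(self, threshold_s: float) -> None:
        if threshold_s < 0:
            raise ValueError("threshold S must be non-negative")
        self.threshold_s = threshold_s
        self._cached_d: Optional[float] = None  # historical D from the local cache

    def update(self, d_new: float) -> bool:
        """Compare with the cached D, overwrite the cache, return the trigger flag."""
        previous = self._cached_d
        self._cached_d = d_new  # overwrite update, as described for the cache
        if previous is None:
            return False  # ASSUMPTION: the first sample has nothing to compare with
        sn = abs(d_new - previous)
        return sn > self.threshold_s
```

Because each comparison is against the immediately preceding value, a slow drift never fires the trigger; in the dual-mode design that case is covered by the periodic report.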
[0051] As an optional implementation of this application embodiment, considering the potential upload bandwidth constraints in real-world application scenarios, this application embodiment also designs a dynamic adaptation mechanism for the reporting strategy. When the status reporting module detects that the upload bandwidth between the end-side and the cloud side is under strain, it can automatically and dynamically switch the reporting strategy, disabling the periodic reporting mode and retaining only the event-triggered reporting mode, while simplifying the reported content. Specifically, detailed data such as the comprehensive resource evaluation index D and various raw resource indicators that originally needed to be reported are simplified into simple resource status identifiers, such as using an encoding method where 0 represents abundant resources, 1 represents critical resources, and 2 represents strained resources, significantly reducing the size of the reported data packets and reducing bandwidth consumption. This adaptation mechanism can effectively alleviate bandwidth pressure in scenarios with tight bandwidth, ensuring that key information about the resource status on the end-side can be stably and timely uploaded to the cloud platform, balancing the reliability and efficiency of data transmission, and further improving the scenario adaptability and practicality of the dual-mode reporting strategy in this application embodiment.
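A minimal sketch of the simplified report follows. The text does not say how the edge device maps its state to the codes 0, 1, and 2 or how bandwidth strain is detected, so this sketch assumes the device applies two thresholds to D locally and receives the bandwidth condition as a flag. The threshold values and field names are hypothetical.

```python
import json
from typing import Dict


def status_code(d: float, d_high: float, d_low: float) -> int:
    """Map D to 0 (abundant), 1 (critical), or 2 (strained)."""
    if d >= d_high:
        return 0
    if d > d_low:
        return 1
    return 2


def encode_report(device_id: str, d: float, raw: Dict[str, float],
                  bandwidth_tight: bool,
                  d_high: float = 0.7, d_low: float = 0.3) -> str:
    """Return the full packet normally, or a compact status identifier."""
    if bandwidth_tight:
        # Compact form: device id plus a one-digit status code, no whitespace.
        return json.dumps({"id": device_id, "s": status_code(d, d_high, d_low)},
                          separators=(",", ":"))
    return json.dumps({"id": device_id, "D": d, **raw})
```

The compact form drops the five numeric fields and replaces them with a single small integer, which is where the reduction in packet size comes from.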
[0052] Step S4: The cloud-side information receiving module parses and caches the uploaded comprehensive resource evaluation indicators and original resource indicators. The status assessment module determines the terminal computing power status based on preset dual thresholds. The algorithm scheduling module adaptively executes the corresponding algorithm scheduling based on the determination result, thereby realizing dynamic algorithm scheduling in a cloud-edge collaborative manner.
[0053] In this embodiment of the application, the cloud-side logic judgment layer, shown in Figure 2, mainly includes an information receiving module, a status assessment module, and an algorithm scheduling module. The cloud-side logic judgment layer receives the data reported by the end-side devices through the communication transmission layer and processes it in the order of data parsing, status assessment, and algorithm scheduling. This coordinates end-side resources with cloud-side computing power and completes adaptive, dynamic end-cloud collaborative algorithm scheduling.
[0054] The cloud-side platform continuously monitors the communication transmission layer through its information receiving module, receiving various status data uploaded by end-side devices in real time. When the information receiving module receives a data packet encapsulated in JSON format from the end-side device, it performs complete parsing and decapsulation, accurately extracting the device's unique identification number, comprehensive resource evaluation index D, and original resource indices a, b, c, and d. The unique device identification number includes, but is not limited to, unique hardware identifiers such as CUEI and SN, used to distinguish different terminal devices. After data parsing, the information receiving module updates the relevant data in the device status cache table in the cloud-side memory based on the device's unique identification number. Simultaneously, it configures a preset validity period Td for the cached data of each device and records the corresponding data update timestamp for subsequent data validity verification. When the cloud platform receives an algorithm scheduling request triggered by the user, it will first determine the validity of the cached data. If the difference between the update timestamp of the cached data and the current time is less than the preset valid duration Td, the data is determined to be valid and can be directly used for subsequent state evaluation. If the time difference exceeds the valid duration Td, the cached data corresponding to the device is marked as expired and expired data will no longer be used for scheduling decisions. The system will wait for the next new data reported by the end side to update the cache, thereby avoiding deviations or errors in scheduling decisions due to outdated or invalid data.
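The cache table and validity check can be sketched as below. The class, the field names, and the injected clock are assumptions for illustration. The text defines data as valid when its age is less than Td and as expired when its age exceeds Td; the sketch treats an age exactly equal to Td as expired, which is a choice the text leaves open.

```python
import json
import time
from typing import Callable, Dict, Optional


class DeviceStatusCache:
    """In-memory device status table keyed by the device's unique identifier."""

    def __init__(self, valid_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._td = valid_seconds  # preset validity period Td
        self._clock = clock
        self._table: Dict[str, dict] = {}

    def ingest(self, packet: str) -> None:
        """Parse a JSON packet and overwrite the entry for its device."""
        data = json.loads(packet)
        self._table[data["device_id"]] = {
            "D": data["D"],
            "raw": {k: data[k] for k in ("a", "b", "c", "d")},
            "updated_at": self._clock(),  # update timestamp
            "expired": False,
        }

    def get_valid(self, device_id: str) -> Optional[dict]:
        """Return the entry if it is still valid, otherwise mark it expired."""
        entry = self._table.get(device_id)
        if entry is None:
            return None
        if self._clock() - entry["updated_at"] >= self._td:
            entry["expired"] = True  # no longer used for scheduling decisions
            return None
        return entry
```

A scheduling request would call `get_valid` first and proceed to state assessment only when it returns an entry; otherwise it waits for the next report from the device.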
[0055] The cloud-side status assessment module, shown in Figure 4, is the core module for accurately determining the status of edge resources. In this embodiment, the status assessment module is pre-configured with two threshold parameters: an edge computing power sufficiency threshold Dh and an edge computing power shortage threshold Dl, which strictly satisfy Dh > Dl. The specific values of the two thresholds can be configured and adjusted dynamically according to the actual performance of the edge hardware, the computing power requirements of the business scenario, and the system operation strategy. Using the valid comprehensive resource evaluation index D, the status assessment module compares D with Dh and Dl in turn to determine the current computing power status of the edge device. When D ≥ Dh, the edge computing power is considered sufficient: the edge device has adequate memory, disk, NPU, and CPU resources and can stably execute local algorithm analysis and inference tasks. When Dl < D < Dh, the edge computing power is considered critical: the edge device resources are approaching a state of strain, and although a certain degree of local inference can still be supported, the algorithm complexity needs to be reduced appropriately to prevent resource exhaustion. When D ≤ Dl, the edge computing power is considered strained: the edge device resources are severely insufficient and cannot stably run edge algorithms, so the inference task must rely on cloud computing power. The status assessment module pushes the resulting state, one of the three above, to the algorithm scheduling module in real time, providing a key basis for subsequent adaptive scheduling decisions.
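The dual-threshold comparison maps directly to a three-way classification. In the sketch below the enum and function names are hypothetical; the boundary handling (D ≥ Dh sufficient, D ≤ Dl strained, otherwise critical) follows the inequalities stated above.

```python
from enum import Enum


class ComputeState(Enum):
    SUFFICIENT = "sufficient"
    CRITICAL = "critical"
    STRAINED = "strained"


def assess(d: float, d_high: float, d_low: float) -> ComputeState:
    """Classify edge computing power from D and the thresholds Dh > Dl."""
    if not d_high > d_low:
        raise ValueError("thresholds must satisfy Dh > Dl")
    if d >= d_high:
        return ComputeState.SUFFICIENT
    if d <= d_low:
        return ComputeState.STRAINED
    return ComputeState.CRITICAL  # Dl < D < Dh
```

For example, with illustrative thresholds Dh = 0.7 and Dl = 0.3, a device reporting D = 0.7 is classified as sufficient and one reporting D = 0.3 as strained, since both boundaries are inclusive on the outer side.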
[0056] The cloud-side algorithm scheduling module, shown in Figure 5, is the execution unit for dynamic scheduling between the edge and the cloud. When a user triggers an algorithm deployment operation in the APP, an algorithm scheduling request is sent to the cloud platform via an API interface. Upon receiving the request, the algorithm scheduling module first extracts the unique device identification number carried by the request to locate the resource status determination result of the corresponding device. The algorithm scheduling module then makes one of three adaptive scheduling decisions according to the edge resource status output by the status assessment module. When the result indicates that edge resources are sufficient, the algorithm scheduling module retrieves the full version of the algorithm plugin package from the cloud algorithm repository and sends it to the target edge device through the communication transmission layer. After receiving the package, the main process of the edge camera automatically loads the full model and starts local inference. Inference events are reported to the backend server in real time and visualized in the user's APP. When the result indicates that edge resources are critical, the algorithm scheduling module sends a lightweight algorithm plugin package to the edge device. After loading the lightweight model, the edge camera performs local inference, which reduces the edge computing load and resource consumption while keeping inference accuracy within business requirements. When the result indicates that edge resources are severely insufficient, the algorithm scheduling module generates a cloud-side algorithm call token and binds it to the real-time video stream address of the edge device. It then directly invokes the cloud-side computing power cluster to complete the algorithm inference, and the inference result is sent back to the user terminal through the API interface.
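The three scheduling branches can be sketched as a single dispatch function. Everything named here is hypothetical: the state strings, the package naming scheme, the action labels, and the use of a random URL-safe string as the call token are illustrative choices, since the text does not specify the token format or the repository layout.

```python
import secrets
from dataclasses import dataclass
from typing import Optional


@dataclass
class Decision:
    action: str                       # "push_package" or "cloud_inference"
    package: Optional[str] = None     # plugin package to send to the edge device
    token: Optional[str] = None       # cloud-side algorithm call token
    stream_url: Optional[str] = None  # video stream address bound to the token


def schedule(state: str, algorithm: str, stream_url: str) -> Decision:
    """Choose a scheduling decision from the assessed computing power state."""
    if state == "sufficient":
        # Full plugin package, loaded and run locally on the edge device.
        return Decision("push_package", package=f"{algorithm}-full")
    if state == "critical":
        # Lightweight package to reduce the edge computing load.
        return Decision("push_package", package=f"{algorithm}-lite")
    if state == "strained":
        # Inference runs in the cloud; the token is bound to the stream address.
        return Decision("cloud_inference",
                        token=secrets.token_urlsafe(16),
                        stream_url=stream_url)
    raise ValueError(f"unknown state: {state!r}")
```

Returning a decision object rather than performing the transfer keeps the policy separate from the transport, so the same function can be checked without a device or a cloud cluster.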
[0057] Through the above-mentioned hierarchical scheduling strategy, the embodiments of this application ultimately realize dynamic algorithm scheduling for end-to-cloud collaboration, which maximizes the utilization of computing resources on the end side and the cloud side while ensuring business performance and operational stability, thereby improving the overall system's operational efficiency and practicality.
[0058] Example 2: This invention provides a cloud-edge collaborative algorithm scheduling device. Figure 6 is a schematic diagram of a cloud-edge collaborative algorithm scheduling device provided in an embodiment of the present invention. As shown in Figure 6, the device includes: the resource acquisition unit 100, which is used to complete initialization through the end-side resource monitoring module when the smart camera algorithm starts, collect and calculate multiple operating resource occupancy indicators of the terminal device, use these indicators as original resource indicators, and sample and update the original resource indicators according to a preset period; the indicator fusion unit 200, which is used to perform, through the end-side state calculation module, a weighted fusion calculation on the original resource indicators based on preset weight coefficients to generate a comprehensive resource evaluation index that represents the current resource abundance level, and to cache and update the comprehensive resource evaluation index; the data reporting unit 300, which is used to adopt a dual-mode strategy combining periodic reporting and event-triggered reporting, and to encapsulate and upload the comprehensive resource evaluation index and the corresponding original resource indicators to the cloud platform through the end-side reporting control module; and the algorithm scheduling unit 400, which is used to have the cloud-side information receiving module parse and cache the uploaded comprehensive resource evaluation index and original resource indicators, have the status assessment module determine the terminal computing power status based on preset dual thresholds, and have the algorithm scheduling module adaptively execute the corresponding algorithm scheduling according to the determination result, thereby realizing dynamic end-cloud collaborative algorithm scheduling.
[0059] Regarding the apparatus in the above embodiments, the specific manner in which each module performs its operation has been described in detail in the embodiments related to the method, and will not be elaborated upon here.
[0060] Example 3: To implement the methods of the above embodiments, the present invention also provides an electronic device, which includes a memory and a processor, wherein the processor reads executable program code stored in the memory to run a program corresponding to the executable program code, so as to implement the steps of the methods described above.
[0061] Example 4: To implement the above embodiments, this application also provides a non-transitory computer-readable storage medium storing a computer program which, when executed by a processor, implements the method described in the foregoing embodiments.
[0062] The above description is merely a preferred embodiment of the present invention and is not intended to limit the invention. Various modifications and variations can be made to the present invention by those skilled in the art. Any modifications, equivalent substitutions, improvements, etc., made within the spirit and principles of the present invention should be included within the scope of protection of the present invention.
[0063] In the description of this specification, the references to terms such as "one embodiment," "some embodiments," "example," "specific example," or "some examples," etc., indicate that a specific feature, structure, material, or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of the present invention. In this specification, the illustrative expressions of the above terms do not necessarily refer to the same embodiment or example. Furthermore, the specific features, structures, materials, or characteristics described may be combined in any suitable manner in one or more embodiments or examples. Moreover, without contradiction, those skilled in the art can combine and integrate the different embodiments or examples described in this specification, as well as the features of different embodiments or examples.
[0064] Furthermore, the terms "first" and "second" are used for descriptive purposes only and should not be construed as indicating or implying relative importance or implicitly specifying the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one of that feature. In the description of this invention, "a plurality of" means at least two, such as two, three, etc., unless otherwise explicitly specified.
Claims
1. A cloud-edge collaborative algorithm scheduling method, characterized in that it comprises: completing initialization through the edge resource monitoring module when the smart camera algorithm starts, collecting and calculating multiple operating resource usage indicators of the terminal device, using these indicators as original resource indicators, and sampling and updating the original resource indicators according to a preset period; performing, through the edge state calculation module, a weighted fusion calculation on the original resource indicators based on preset weight coefficients to generate a comprehensive resource evaluation index that represents the current resource abundance level, and caching and updating the comprehensive resource evaluation index; adopting a dual-mode strategy combining periodic reporting and event-triggered reporting, and encapsulating and uploading the comprehensive resource evaluation index and the corresponding original resource indicators to the cloud platform through the end-side reporting control module; and parsing and caching, by the cloud-side information receiving module, the uploaded comprehensive resource evaluation index and original resource indicators, determining, by the status assessment module, the terminal computing power status based on preset dual thresholds, and adaptively executing, by the algorithm scheduling module, the corresponding algorithm scheduling according to the determination result, thereby realizing dynamic cloud-edge collaborative algorithm scheduling.
2. The method according to claim 1, characterized in that performing, through the edge-side state calculation module, the weighted fusion calculation on the original resource indicators based on preset weight coefficients to generate the comprehensive resource evaluation index representing the current resource abundance level, and caching and updating the comprehensive resource evaluation index, comprises: presetting a memory weight coefficient α, a disk weight coefficient β, an NPU weight coefficient γ, and a CPU weight coefficient θ that satisfy the following condition: [condition not reproduced in the source text]; after each sampling of the original resource indicators, performing a weighted fusion calculation through the edge-side state calculation module according to the following formula: [formula not reproduced in the source text], where D is the comprehensive evaluation index of edge computing resources, a is the memory utilization rate, b is the disk idle rate, c is the NPU utilization rate, and d is the CPU utilization rate; and storing the calculated comprehensive evaluation index D in the cache and updating it after the next calculation.
3. The method according to claim 2, characterized in that adopting the dual-mode strategy combining periodic reporting and event-triggered reporting, and encapsulating and uploading the comprehensive resource evaluation index and the corresponding original resource indicators to the cloud platform through the edge-side reporting control module, comprises: after the smart camera is powered on, first establishing a stable communication connection with the cloud platform; once the communication connection is successfully established, synchronously starting, by the end-side reporting control module, the periodic reporting mode and the event-triggered reporting mode; through the periodic reporting mechanism, reading the comprehensive resource evaluation index D and the original resource indicators a, b, c, and d according to the preset period Ts, encapsulating them into JSON data packets, and uploading them to the cloud platform via the MQTT communication protocol; and through the event-triggered reporting mechanism, comparing the current and historical D values in real time and calculating the absolute difference Sn, and when Sn > S, immediately encapsulating the relevant indicator data and uploading it to the cloud platform to achieve real-time synchronization of sudden changes in end-side resource status.
4. The method according to claim 3, characterized in that parsing and caching, by the cloud-side information receiving module, the uploaded comprehensive resource evaluation index and original resource indicators comprises: listening, by the cloud-side information receiving module, for the data packets reported by the terminal side, and after receiving a JSON data packet, parsing and decapsulating it to extract the device's unique identification number, the comprehensive resource evaluation index D, and the original resource indicators a, b, c, and d; updating the data in the device status cache table based on the device's unique identification number, configuring a preset validity period Td for the cached data, and recording the update timestamp; and when an algorithm scheduling request is received, determining whether the timestamp of the cached data is within the validity period Td, and if so, using the cached data for state evaluation, and if not, marking the cached data as expired and waiting for a new round of data reporting from the end side.
5. The method according to claim 4, characterized in that determining, by the status assessment module, the terminal computing power status based on preset dual thresholds comprises: pre-configuring, in the cloud-side status assessment module, a computing power sufficiency threshold Dh and a computing power shortage threshold Dl that satisfy Dh > Dl; comparing the valid comprehensive resource evaluation index D with Dh and Dl in turn; when D ≥ Dh, determining that the edge computing power is sufficient; when Dl < D < Dh, determining that the edge computing power is critical; when D ≤ Dl, determining that the edge computing power is strained; and sending the determination result to the algorithm scheduling module in real time.
6. The method according to claim 5, characterized in that adaptively executing, by the algorithm scheduling module, the corresponding algorithm scheduling according to the determination result comprises: when the algorithm scheduling module receives a determination result indicating sufficient computing power, retrieving the complete algorithm plugin package and sending it to the terminal device, whereupon the terminal loads the model and executes local inference, and inference events are reported and displayed in real time; when a determination result indicating critical computing power is received, sending the lightweight algorithm plugin package to the terminal, whereupon local inference is performed with the lightweight model to reduce the computing load on the terminal side; and when a determination result indicating strained computing power is received, generating a cloud-side algorithm call token, binding it to the terminal video stream address, invoking cloud computing power to complete the inference, and sending the result back to the user terminal, thereby realizing dynamic end-cloud collaborative algorithm scheduling.
7. A cloud-edge collaborative algorithm scheduling device, characterized in that it comprises: a resource acquisition unit, used to complete initialization through the end-side resource monitoring module when the smart camera algorithm starts, collect and calculate multiple operating resource occupancy indicators of the terminal device, use these indicators as original resource indicators, and sample and update the original resource indicators according to a preset period; an indicator fusion unit, used to perform, through the edge state calculation module, a weighted fusion calculation on the original resource indicators based on preset weight coefficients to generate a comprehensive resource evaluation index that represents the current resource abundance level, and to cache and update the comprehensive resource evaluation index; a data reporting unit, used to adopt a dual-mode strategy combining periodic reporting and event-triggered reporting, and to encapsulate and upload the comprehensive resource evaluation index and the corresponding original resource indicators to the cloud platform through the end-side reporting control module; and an algorithm scheduling unit, used to have the cloud-side information receiving module parse and cache the uploaded comprehensive resource evaluation index and original resource indicators, have the status assessment module determine the terminal computing power status based on preset dual thresholds, and have the algorithm scheduling module adaptively execute the corresponding algorithm scheduling according to the determination result, thereby realizing dynamic end-cloud collaborative algorithm scheduling.
8. The apparatus according to claim 7, characterized in that the indicator fusion unit is further used for: presetting a memory weight coefficient α, a disk weight coefficient β, an NPU weight coefficient γ, and a CPU weight coefficient θ that satisfy the following condition: [condition not reproduced in the source text]; after each sampling of the original resource indicators, performing a weighted fusion calculation through the edge-side state calculation module according to the following formula: [formula not reproduced in the source text], where D is the comprehensive evaluation index of edge computing resources, a is the memory utilization rate, b is the disk idle rate, c is the NPU utilization rate, and d is the CPU utilization rate; and storing the calculated comprehensive evaluation index D in the cache and updating it after the next calculation.
9. An electronic device, characterized in that it comprises a processor and a memory, wherein the processor reads executable program code stored in the memory to run a program corresponding to the executable program code, so as to implement the method according to any one of claims 1-6.
10. A non-transitory computer-readable storage medium having a computer program stored thereon, characterized in that, when the program is executed by a processor, it implements the method according to any one of claims 1-6.