Multi-performance balancing and pruning scheduling method for concurrent services of IVCPS under limited resources
By modeling multiple performance indicators and using two-level decoupled optimization scheduling, the method addresses a problem that existing scheduling methods struggle with: coordinating multiple performance objectives in resource-constrained scenarios. It aims at efficient resource utilization and minimal performance loss when scheduling in real-time systems.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- CHONGQING UNIV
- Filing Date
- 2026-02-04
- Publication Date
- 2026-06-19
Abstract
Description
Technical Field
[0001] This invention lies at the intersection of IVCPS and real-time embedded system scheduling, and specifically concerns a multi-performance balancing pruning scheduling method for concurrent IVCPS services under limited resources.
Background Technology
[0002] With the development of cloud computing, edge computing, and networked control systems, various complex cyber-physical systems (CPS) are gradually evolving into multi-source concurrent service systems. Typical systems include, but are not limited to, intelligent transportation and vehicle-road cooperative systems, industrial internet and intelligent manufacturing cloud platforms, smart city integrated management platforms, energy dispatch and power information systems, and large-scale Internet of Things (IoT) cloud service systems. In these systems, the cloud or central node needs to concurrently process a large number of service requests from different terminals and different business types under limited computing resources and strict time constraints.
[0003] These requests typically share the following common characteristics: 1. Significant differences in real-time requirements, with different requests having varying tolerances for response time; 2. Diverse performance objectives, requiring the system to simultaneously focus on multiple performance dimensions, such as efficiency, stability, accuracy, and reliability; 3. Graded processing quality, with different performance levels of processing modules used for the same type of request, and significant differences in computation time and performance contribution between different levels; 4. Burst and uncertainty in load, with request volume potentially increasing dramatically in a short period.
[0004] When the number of concurrent requests is low, the system can allocate the highest-performance processing module to all requests to achieve optimal system performance. However, when the concurrent load surges in a short period, forcibly maintaining optimal performance processing will lead to exceeding the total response time limit and compromising system real-time performance. If the lowest-performance processing method is uniformly adopted, although the time constraint can be met, it will cause a significant decrease in overall system performance and affect the quality of system operation. Therefore, how to schedule and allocate concurrent service requests with controllable performance and minimal loss under limited resources and strict time constraints has become a key technical problem with universal significance.
[0005] Existing concurrent task scheduling methods mainly include fixed priority scheduling, polling or fair queue-based scheduling, single-objective optimization load distribution methods, and resource scheduling strategies that only target response time or throughput. These methods have advantages such as simple implementation and predictable behavior in scenarios with sufficient resources or stable loads. However, in concurrent service scenarios with limited resources and diverse performance objectives, they generally reveal obvious theoretical and methodological limitations.
[0006] Existing scheduling methods struggle to achieve optimal coordination of multiple performance objectives under given resource constraints. OLAbraham et al., in their review of cloud computing task scheduling, pointed out that the task scheduling problem is essentially "the problem of optimizing one or more performance objectives under limited resource constraints," classifying it as a multi-objective optimization problem. This conclusion indicates that, under conditions of fixed or limited system resources, the core of the scheduling problem is not simply task ordering, but rather the trade-off and optimization of different performance objectives within the feasible resource constraint domain. However, most existing scheduling methods, in resource-constrained scenarios, typically assume that task resource requirements and performance indicators are rigid and unadjustable, and scheduling decisions only affect the execution order, allocation ratio, or preemption relationship of tasks. When resources are insufficient to simultaneously meet the highest performance requirements of all tasks, the system often maintains operation through queuing, delayed execution, task preemption, or rejection. While these strategies formally guarantee system operability, their essence remains scheduling on a fixed task model that does not explicitly characterize the performance-resource trade-off relationship, making it difficult to further explore the overall performance potential of the system within the feasible solution space.
[0007] To alleviate resource shortages, some studies have introduced ideas such as task pruning, task degradation, or approximate computation, for example, saving resources by reducing computational precision, decreasing functional modules, or reducing execution frequency. However, these methods still have significant shortcomings at the scheduling level. Existing pruning methods generally adopt the following pattern: first, a pruned task version is generated, and then the scheduler directly executes the pruned task without evaluating whether the pruning scheme is the optimal choice under the current system state. In other words, the pruning behavior itself is not included in the solution of the optimization problem, but is treated as a given premise. This leads to pruning strategies often relying on human experience or static rules, making it difficult to guarantee optimal performance under different load conditions.
[0008] Meanwhile, existing methods struggle to handle conflicts between multiple performance objectives. For the multi-performance-objective scheduling problem, previous research has proposed methods based on evolutionary algorithms and heuristic search that approximate the Pareto optimal solution through global search. These studies show that the multi-objective scheduling problem is theoretically a highly complex combinatorial optimization problem. Related reviews consistently point out that such methods typically require significant computational time and search cost, which makes them difficult to run in real time in systems with stringent response time requirements. This in turn indicates that directly performing global multi-objective optimization lacks engineering feasibility in real-time concurrent scheduling scenarios.
Summary of the Invention
[0009] In view of this, the purpose of this invention is to provide a multi-performance balanced pruning scheduling method for concurrent IVCPS services under limited resources. This invention aims to solve the following problems: existing scheduling methods struggle to achieve optimal coordination of multiple performance objectives under given resource constraints, and have difficulty handling conflicts between multiple performance objectives.
[0010] A method for performance balancing and pruning scheduling of concurrent IVCPS services under limited resources includes the following steps:
[0011] S1. A unified abstract model is used to model the performance adjustability of concurrent service tasks;
[0012] S2. Determine whether the current concurrent request load requires initiating the performance pruning and optimization scheduling process;
[0013] If performance pruning scheduling needs to be initiated, proceed to the performance pruning and optimization scheduling process in step S3.
[0014] If performance pruning scheduling is not required, requests are scheduled and allocated directly based on the current concurrent request load.
[0015] S3. Performance pruning and optimized scheduling;
[0016] S3.1 Dynamic performance weight calculation;
[0017] A request existence indication mechanism is introduced, which applies performance weight normalization and allocation only to the request categories that actually exist in the current period, so that the comprehensive performance evaluation function always reflects the "performance dimensions that the current system is truly concerned with".
[0018] S3.2 Major-category load balancing optimization;
[0019] Determine, for each category, the quota of requests that need to be "degraded" from the high-performance module to meet the overall time pruning requirement;
[0020] S3.3 Intra-class module scheduling;
[0021] Within each request category, the combination of high-performance and low-performance modules is enumerated, and the solution with the minimum performance loss is selected while satisfying the time constraint.
[0022] S4. Output the scheduling results.
[0023] Furthermore, the specific content of step S1 is as follows: introduce a multi-dimensional performance abstraction mechanism to uniformly map the system impact of any concurrent service request under different processing performance levels into a multi-dimensional performance index vector.
[0024] For each performance level j, define:
[0025] Processing time cost $t_{ij}$: the time resources consumed when processing the i-th type of request using the j-th performance level;
[0026] Multidimensional performance contribution vector: $P_{ij} = (P_{ij}^{1}, P_{ij}^{2}, \dots, P_{ij}^{K})$
[0027] where i represents the category index of concurrent requests, and each category i is configured with several performance-level processing modules, including high performance, medium performance and low performance; j represents the performance level index; $P_{ij}$ represents the multidimensional performance contribution vector, that is, the contribution values generated in the K performance dimensions that the system focuses on when processing the i-th type of request with the j-th performance level; K represents the total number of performance dimensions that the system focuses on.
[0028] Furthermore, the specific content of step S2 is as follows:
[0029] Let S be the set of concurrent requests received by the system in the current period, and define:
[0030] T_high: Total processing time when all requests are processed using the high-performance module;
[0031] T_low: Total processing time when all requests are processed using the low-performance module;
[0032] T_limit: The maximum allowed response time threshold of the system;
[0033] The decision rules are as follows:
[0034] If T_high≤T_limit, then the system resources are sufficient, and all requests will use high-performance modules;
[0035] If T_low > T_limit, the system enters an unsatisfiable state, triggering an overload alarm or system degradation.
[0036] If T_low≤T_limit<T_high, then proceed to the performance pruning and optimization scheduling process in step S3.
[0037] Furthermore, the specific content of step S3.1 is as follows:
[0038] Set K general performance metrics, and each type of request has a preset weight for each performance metric;
[0039] Introduce a request existence indicator function $\delta_i$ to prevent categories with no requests from interfering with performance evaluation;
[0040] $\delta_i = \begin{cases} 1, & n_i > 0 \\ 0, & n_i = 0 \end{cases}$
[0041] where $n_i$ represents the actual number of concurrent requests of type i within the current scheduling period;
[0042] Each performance metric is dynamically normalized, and weighted calculations are performed only on the request categories that actually exist in the current period, thereby obtaining a dynamic performance weight vector that reflects the current business load characteristics.
[0043] Furthermore, the specific content of step S3.2 is as follows:
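[0044] First, define the unit performance penalty coefficient $\beta_i$ for request category i:
[0045] $\beta_i = \sum_{k=1}^{K} w_k \left( P_{i,\text{high}}^{k} - P_{i,\text{low}}^{k} \right)$
[0046] where $w_k$ is the dynamic weight of the k-th performance metric; $P_{i,\text{high}}^{k}$ and $P_{i,\text{low}}^{k}$ represent the performance of the high-performance and low-performance modules on the k-th performance metric, respectively.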
[0047] Define the total time that must be pruned, $\Delta T$, as:
[0048] $\Delta T = T_{\text{high}} - T_{\text{limit}}$
[0049] By constructing an integer programming model, and with the goal of minimizing the total performance loss while satisfying the time pruning constraint, the degradation quota for each request category is solved.
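The integer programming model described in [0049] can be written out as a sketch. This is one plausible formulation under stated assumptions, not the patent's own formula: the unit performance penalty coefficient is written $\beta_i$, the total time to be pruned is written $\Delta T$, and the symbols $n_{ci}$ (degradation quota of category i), $n_i$ (number of category-i requests in the current period), $I$ (number of request categories), and $t_{i,\text{high}}$, $t_{i,\text{low}}$ (per-request time costs of the high- and low-performance modules) are notation assumed here. Each degraded request is assumed to move from the high-performance module to the low-performance module.

```latex
\begin{aligned}
\min_{n_{c1},\dots,n_{cI}} \quad & \sum_{i=1}^{I} \beta_i \, n_{ci} \\
\text{s.t.} \quad & \sum_{i=1}^{I} n_{ci}\,\bigl(t_{i,\text{high}} - t_{i,\text{low}}\bigr) \;\ge\; \Delta T, \\
& n_{ci} \in \{0, 1, \dots, n_i\}, \qquad i = 1, \dots, I .
\end{aligned}
```

The objective is the total performance loss and the first constraint is the time pruning constraint; the integrality bound keeps each quota within the number of requests that actually arrived.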
[0050] Furthermore, the specific content of step S3.3 is as follows:
[0051] Within each request category, the determined degradation quota is further subdivided to determine the specific number of medium-performance and low-performance modules. Among all feasible combinations that meet the time-saving constraint, the module allocation scheme with the least performance loss is selected by enumeration search, thus obtaining the final scheduling decision.
[0052] Beneficial effects:
[0053] This invention proposes a multi-performance balanced pruning scheduling method for concurrent services under limited resource constraints. The improvements are achieved through the following methods: 1. Introducing unified modeling of multiple performance indicators to overcome the difficulty of handling conflicts between multiple performance objectives in existing methods; 2. A performance level pruning mechanism that allows the same type of request to flexibly switch between different performance implementations, improving resource utilization efficiency; 3. A dynamic performance weight allocation mechanism that enables scheduling decisions to adapt to the current actual load structure; 4. A two-level optimized scheduling structure that decomposes the highly complex global multi-objective problem into sub-problems that can be solved in real time. These improvements demonstrate that this invention can effectively reduce overall performance loss while meeting the overall system time constraints, addressing the key technical deficiencies of existing concurrent scheduling methods in multi-performance constraint scenarios.
[0054] Other advantages, objectives, and features of the invention will be set forth in part in the description which follows, and in part will be apparent to those skilled in the art from the following examination, or may be learned from practice of the invention. The objectives and other advantages of the invention can be realized and obtained through the following description.
Attached Figure Description
[0055] Figure 1 is a flowchart of a multi-performance balancing pruning and scheduling method for concurrent IVCPS services under limited resources, according to the present invention.
Detailed Implementation
[0056] To make the technical solutions, advantages, and objectives of the present invention clearer, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those skilled in the art based on the described embodiments of the present invention without creative effort are within the protection scope of this application.
[0057] As shown in Figure 1, this invention proposes a multi-performance balancing pruning and scheduling method for concurrent IVCPS services under limited resources, including the following steps:
[0058] S1. A unified abstract model is used to model the performance adjustability of concurrent service tasks;
[0059] A multi-dimensional performance abstraction mechanism is introduced to uniformly map the system impact of any concurrent service request at different processing performance levels into a multi-dimensional performance index vector.
[0060] For each performance level j, define:
[0061] Processing time cost $t_{ij}$: the time resources consumed when processing the i-th type of request using the j-th performance level;
[0062] Multidimensional performance contribution vector: $P_{ij} = (P_{ij}^{1}, P_{ij}^{2}, \dots, P_{ij}^{K})$
[0063] where i represents the category index of concurrent requests, and each category i is configured with several optional performance-level processing modules (including at least high performance, medium performance and low performance); j represents the performance level index; $P_{ij}$ represents the multidimensional performance contribution vector, that is, the contribution values generated in the K performance dimensions that the system focuses on when processing the i-th type of request with the j-th performance level; K represents the total number of performance dimensions that the system focuses on.
[0064] This modeling approach abstracts different services and functional modules into optional implementations that contribute different multidimensional performance under different time costs, thus providing a unified mathematical foundation for scheduling problems across services.
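The abstraction in S1 can be pictured as a small data model. The following is a minimal sketch, assuming hypothetical class names (`PerformanceLevel`, `RequestCategory`) and purely illustrative numbers; none of these values come from Table 1.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class PerformanceLevel:
    """One optional processing module of a request category (hypothetical name)."""
    time_cost: float                 # time consumed per request at this level
    contribution: Tuple[float, ...]  # one value per performance dimension (length K)


@dataclass(frozen=True)
class RequestCategory:
    """A request category with its selectable performance levels."""
    name: str
    levels: Dict[str, PerformanceLevel]  # keyed by "high", "medium", "low"

    def total_time(self, level: str, count: int) -> float:
        """Time needed to process `count` requests at the given level."""
        return self.levels[level].time_cost * count


# Illustrative values only (assumed, K = 2 performance dimensions).
speed_guidance = RequestCategory(
    name="speed_guidance",
    levels={
        "high": PerformanceLevel(time_cost=8.0, contribution=(0.9, 0.8)),
        "medium": PerformanceLevel(time_cost=5.0, contribution=(0.7, 0.6)),
        "low": PerformanceLevel(time_cost=2.0, contribution=(0.4, 0.3)),
    },
)
```

Keeping the time cost and the contribution vector side by side for each level is what lets heterogeneous services be compared in the same performance-resource space.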
[0065] Taking concurrent speed guidance dispatching in urban road networks as an example, this service can be configured into three performance levels of processing modules as shown in the table below: high, medium, and low.
[0066] Table 1. Concurrency processing module configuration and performance indicators
[0067]
[0068] In Table 1, $n_i$ denotes the number of requests of type i. Based on the sum of the response times of all requests, the requests of each type must be allocated across the high-, medium-, and low-performance modules with minimal performance impact, which determines the counts $n_{i1}$, $n_{i2}$, $n_{i3}$.
[0069] This invention introduces a multi-dimensional performance abstraction mechanism at the scheduling modeling level. Instead of directly relying on specific business performance metrics, it maps the impact of concurrent service requests on the system at different processing performance levels into a standardized multi-dimensional performance metric vector. This vector only describes the relative level of a certain type of performance effect, without depending on specific business semantics, thereby achieving a unified characterization of heterogeneous business performance.
[0070] Through this abstract modeling approach, different services and functional modules are equivalently represented at the scheduling level as several performance levels that can be selected to produce different multi-dimensional performance contributions under certain time cost constraints. Thus, heterogeneous tasks that were originally difficult to compare and jointly optimize are uniformly incorporated into the same performance-resource allocation space for scheduling decisions. This allows the system to perform computable and searchable optimal configuration of multiple performance objectives, given resource constraints and the existence of feasible solutions. This modeling mechanism provides the foundation for subsequent pruning decisions and decoupled optimization scheduling, and is one of the fundamental technical features that distinguishes this invention from existing scheduling methods.
[0071] This invention does not shy away from multi-objective optimization problems. Instead, it introduces a two-level decoupled optimization scheduling structure to break down the originally difficult-to-solve global problem into structurally solvable sub-problems. This significantly reduces computational complexity while ensuring the consistency of optimization objectives, thus balancing optimality and real-time performance.
[0072] Two-level decoupled optimized scheduling structure
[0073] In scenarios with multiple performance levels and multiple concurrent requests, if global multi-objective optimization is performed directly on the performance level combination of all requests, the computational complexity increases exponentially with the number of requests, making it difficult to meet the scheduling requirements of real-time systems.
[0074] This invention starts with the problem structure and proposes a two-level decoupled optimized scheduling structure. Its core idea is not to find an approximate solution, but to structurally decompose the scheduling problem based on its inherent hierarchical nature.
[0075] The first layer focuses on load balancing across request categories: Under the constraint of total system time, determining which request categories should bear more performance pruning is essentially a resource allocation problem among different task categories.
[0076] The second layer focuses on the module combination problem within a single request category: given that the available time budget or pruning limit for the category has been determined, the optimal combination of high, medium and low performance modules within it is further determined.
[0077] Through this structure, the originally coupled high-dimensional combinatorial optimization problem is decomposed into: (1) a low-dimensional cross-class resource allocation problem; (2) several independent intra-class optimization problems with limited scale.
[0078] This decoupling is based on the inherent hierarchical differences between "inter-class trade-offs" and "intra-class choices" in terms of objective functions and constraints, thereby significantly reducing computational complexity without sacrificing global optimality, making complex pruning scheduling feasible in real-time systems.
[0079] S2. Determine whether the current concurrent request load requires initiating the performance pruning and optimization scheduling process;
[0080] Let S be the set of concurrent requests received by the system in the current period, and define:
[0081] T_high: Total processing time when all requests are processed using the high-performance module;
[0082] T_low: Total processing time when all requests are processed using the low-performance module;
[0083] T_limit: The maximum allowed response time threshold of the system;
[0084] The decision rules are as follows:
[0085] If T_high≤T_limit, then the system resources are sufficient, and all requests will use high-performance modules;
[0086] If T_low > T_limit, the system enters an unsatisfiable state, triggering an overload alarm or system degradation.
[0087] If T_low≤T_limit<T_high, then proceed to the performance pruning and optimization scheduling process in step S3.
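The three decision rules of S2 can be sketched as a small function. This is an illustration under an assumption: total processing time is taken to be the sum of per-request time costs over all categories, and the names `LoadState` and `classify_load` are hypothetical.

```python
from enum import Enum
from typing import Sequence


class LoadState(Enum):
    SUFFICIENT = "all requests use high-performance modules"
    PRUNE = "enter the S3 pruning and optimization scheduling process"
    OVERLOAD = "unsatisfiable state: overload alarm or system degradation"


def classify_load(
    counts: Sequence[int],
    t_high: Sequence[float],
    t_low: Sequence[float],
    t_limit: float,
) -> LoadState:
    """Apply the S2 rules; per-category times are assumed to add up linearly."""
    total_high = sum(n * t for n, t in zip(counts, t_high))  # T_high
    total_low = sum(n * t for n, t in zip(counts, t_low))    # T_low
    if total_high <= t_limit:
        return LoadState.SUFFICIENT
    if total_low > t_limit:
        return LoadState.OVERLOAD
    # Remaining case: T_low <= T_limit < T_high
    return LoadState.PRUNE
```

For example, with counts `[10, 5]`, high-level costs `[8.0, 6.0]` and low-level costs `[2.0, 1.0]`, the totals are T_high = 110 and T_low = 25, so a limit of 60 falls in the pruning range.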
[0088] S3. Performance pruning and optimized scheduling;
[0089] S3.1 Dynamic performance weight calculation;
[0090] In the comprehensive optimization of multiple performance indicators, traditional methods usually use fixed performance weights for weighted summation. However, in concurrent service systems, the distribution of request categories and their quantities often fluctuates significantly in different scheduling cycles. Fixed weights can lead to the following problems: (1) Some request categories that do not exist or have a very low proportion in this cycle still affect the performance evaluation; (2) The performance evaluation results cannot truly reflect the system's operating quality under the current load structure.
[0091] To address this, this invention proposes a dynamic performance weight adaptive mechanism based on request existence. The core idea is that the importance of performance metrics is not a static attribute, but rather should be determined by the actual business structure occurring within the current scheduling cycle. By introducing a request existence indication mechanism, performance weight normalization and allocation are only applied to the categories of requests actually existing within the current cycle, ensuring that the comprehensive performance evaluation function always reflects the "performance dimensions that the current system truly cares about." This avoids the performance evaluation distortion problem common in traditional fixed-weight methods, allowing pruning decisions to adaptively adjust with changes in load structure.
[0092] Set K general performance metrics, and each type of request has a preset weight for each performance metric;
[0093] Introduce a request existence indicator function $\delta_i$ to prevent categories with no requests from interfering with performance evaluation;
[0094] $\delta_i = \begin{cases} 1, & n_i > 0 \\ 0, & n_i = 0 \end{cases}$
[0095] where $n_i$ represents the actual number of concurrent requests of type i within the current scheduling period;
[0096] Each performance metric is dynamically normalized, and weighted calculations are performed only on the request categories that actually exist in the current period, thereby obtaining a dynamic performance weight vector that reflects the current business load characteristics.
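A minimal sketch of the dynamic weight calculation follows. The text does not give the normalization formula, so the one used here is an assumption: the preset weights of the categories that are present are summed per metric and then divided by their grand total. The function names are hypothetical.

```python
from typing import List, Sequence


def existence_indicator(count: int) -> int:
    """Return 1 if the category has at least one request this period, else 0."""
    return 1 if count > 0 else 0


def dynamic_weights(
    counts: Sequence[int],
    preset_weights: Sequence[Sequence[float]],
) -> List[float]:
    """Dynamic weight per metric, using only categories that are present.

    preset_weights[i][k] is the preset weight of category i for metric k.
    Assumed normalization: sum present categories per metric, divide by total.
    """
    k_dims = len(preset_weights[0])
    raw = [0.0] * k_dims
    for n_i, row in zip(counts, preset_weights):
        present = existence_indicator(n_i)
        for k, weight in enumerate(row):
            raw[k] += present * weight
    total = sum(raw)
    if total == 0:
        raise ValueError("no request category is present in this period")
    return [value / total for value in raw]
```

With two categories whose preset weights are `[0.6, 0.4]` and `[0.1, 0.9]`, the second category stops influencing the result as soon as its request count drops to zero, which is the behavior the indicator function is meant to provide.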
[0097] S3.2 Major-category load balancing optimization;
[0098] Determine, for each category, the quota of requests that need to be "degraded" from the high-performance module to meet the overall time pruning requirement;
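[0099] First, define the unit performance penalty coefficient $\beta_i$ for request category i:
[0100] $\beta_i = \sum_{k=1}^{K} w_k \left( P_{i,\text{high}}^{k} - P_{i,\text{low}}^{k} \right)$
[0101] where $w_k$ is the dynamic weight of the k-th performance metric; $P_{i,\text{high}}^{k}$ and $P_{i,\text{low}}^{k}$ represent the performance of the high-performance and low-performance modules on the k-th performance metric, respectively.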
[0102] Define the total time that must be pruned, $\Delta T$, as:
[0103] $\Delta T = T_{\text{high}} - T_{\text{limit}}$
[0104] By constructing an integer programming model, and with the goal of minimizing the total performance loss while satisfying the time pruning constraint, the degradation quota for each request category is solved.
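The first-level problem can be sketched in code. The example below is an assumption-laden illustration: it takes the penalty coefficient to be the weighted high-minus-low performance gap, assumes each degraded request moves from the high- to the low-performance module, and replaces a real integer programming solver with exhaustive search, which is exact but only practical for a handful of categories. The names `unit_penalty` and `solve_quotas` are hypothetical.

```python
from itertools import product
from math import inf
from typing import Optional, Sequence, Tuple


def unit_penalty(
    weights: Sequence[float],
    perf_high: Sequence[float],
    perf_low: Sequence[float],
) -> float:
    """Weighted performance lost when one request drops from high to low."""
    return sum(w * (h - l) for w, h, l in zip(weights, perf_high, perf_low))


def solve_quotas(
    counts: Sequence[int],
    time_saved: Sequence[float],
    penalties: Sequence[float],
    delta_t: float,
) -> Optional[Tuple[int, ...]]:
    """Pick per-category degradation quotas with minimum total penalty.

    time_saved[i] is the time saved per degraded request of category i.
    Brute-force stand-in for an integer programming solver.
    Returns None when no quota vector saves at least delta_t.
    """
    best: Optional[Tuple[int, ...]] = None
    best_loss = inf
    for quotas in product(*(range(n + 1) for n in counts)):
        saved = sum(q * s for q, s in zip(quotas, time_saved))
        if saved < delta_t:
            continue  # violates the time pruning constraint
        loss = sum(q * p for q, p in zip(quotas, penalties))
        if loss < best_loss:
            best, best_loss = quotas, loss
    return best
```

In a deployment with many categories, the same model would normally be handed to a dedicated integer programming solver; the search space here grows as the product of `(n_i + 1)` over categories.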
[0105] S3.3 Intra-class module scheduling;
[0106] Within each request category, the determined degradation quota is further subdivided to determine the specific number of medium-performance and low-performance modules. Among all feasible combinations that meet the time-saving constraint, the module allocation scheme with the least performance loss is selected by enumeration search, thus obtaining the final scheduling decision.
[0107] For example, in IVCPS concurrent scheduling, within each type of request, the specific allocation of the $n_{ci}$ degradation quota, that is, how many requests go to medium-performance modules ($n_{i2}$) and how many to low-performance modules ($n_{i3}$), is determined as a further refinement. Given the determined degradation quota $n_{ci}$, this module first needs to satisfy the time constraint: the time saved by using medium- and low-performance modules must be no less than the time that would be saved if all $n_{ci}$ requests were downgraded from high performance to low performance. Second, it needs to complete the performance optimization task: among all ($n_{i2}$, $n_{i3}$) combinations that satisfy the time constraint, the combination that minimizes the total performance loss for this type of request is selected. The performance loss coefficients for using medium-performance and low-performance modules are $\beta_{i2}$ and $\beta_{i3}$, respectively. Because the number of variables is small (only $n_{i2}$ and $n_{i3}$), this module uses enumeration to quickly obtain the optimal module allocation scheme ($n_{i1}$, $n_{i2}$, $n_{i3}$) within each type of request, thus completing the final scheduling decision.
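The intra-class enumeration can be sketched as follows. This is a minimal illustration, assuming that the medium and low counts together may not exceed the number of requests in the category and that the remaining requests stay on the high-performance module; the function name and the numbers in the usage note are hypothetical.

```python
from typing import Optional, Tuple


def allocate_within_category(
    n_i: int,
    quota: int,
    t_high: float,
    t_mid: float,
    t_low: float,
    beta_mid: float,
    beta_low: float,
) -> Optional[Tuple[int, int, int]]:
    """Enumerate (n_i2, n_i3) and return the best (n_i1, n_i2, n_i3).

    Time constraint: the time saved must be at least what degrading
    `quota` requests from high to low performance would save.
    """
    required = quota * (t_high - t_low)
    best_loss: Optional[float] = None
    best_plan: Optional[Tuple[int, int, int]] = None
    for n_mid in range(n_i + 1):
        for n_low in range(n_i - n_mid + 1):
            saved = n_mid * (t_high - t_mid) + n_low * (t_high - t_low)
            if saved < required:
                continue  # does not save enough time
            loss = n_mid * beta_mid + n_low * beta_low
            if best_loss is None or loss < best_loss:
                best_loss = loss
                best_plan = (n_i - n_mid - n_low, n_mid, n_low)
    return best_plan
```

With 5 requests, a quota of 2, time costs 8, 5, and 2, and loss coefficients 0.2 (medium) and 0.5 (low), moving 4 requests to the medium module saves the required 12 time units at a loss of 0.8, which beats moving 2 requests to the low module (loss 1.0). Raising the medium coefficient to 0.3 flips the choice.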
[0108] The performance pruning optimization scheduling process of this invention adopts a "time-performance balanced pruning solution strategy".
[0109] Unlike existing pruning methods that assume "pruning is reasonable by default", this invention explicitly models the pruning behavior as a performance optimization problem subject to rigid time constraints. Under this framework, the following hold: (1) the total system response time constraint is treated as an inviolable hard constraint; (2) all performance pruning behaviors are premised on satisfying the time constraint; (3) among all pruning schemes that satisfy the constraints, the goal is to minimize the overall system performance loss.
[0110] The essence of this strategy lies not in whether to prune, but in answering the core question of to what extent to prune, which tasks to prune, and what combination of performance levels to use, under current resource conditions, to minimize system performance loss. By incorporating time constraints and performance loss objectives into a unified optimization model, and combining mathematical programming with finite enumeration to solve the problem, this invention achieves provable optimality or near-optimality of pruning decisions while ensuring real-time performance.
[0111] S4. Output the scheduling results.
[0112] It is hereby declared that the above embodiments are only used to illustrate the technical solutions of the present invention and are not intended to limit it. Although the present invention has been described in detail with reference to preferred embodiments, those skilled in the art should understand that modifications or equivalent substitutions can be made to the technical solutions of the present invention without departing from the spirit and scope of the present invention, and all such modifications or substitutions should be covered within the scope of the claims of the present invention.
Claims
1. A multi-performance balancing pruning and scheduling method for concurrent IVCPS services under limited resources, characterized in that it includes the following steps: S1. A unified abstract model is used to model the performance adjustability of concurrent service tasks; S2. Determine whether the current concurrent request load requires initiating the performance pruning and optimization scheduling process; if performance pruning scheduling needs to be initiated, proceed to the performance pruning and optimization scheduling process in step S3; if performance pruning scheduling is not required, requests are scheduled and allocated based on the current concurrent request load; S3. Performance pruning and optimized scheduling; S3.1 Dynamic performance weight calculation: a request existence indication mechanism is introduced, which applies performance weight normalization and allocation only to the request categories that actually exist in the current period, so that the comprehensive performance evaluation function always reflects the "performance dimensions that the current system is truly concerned with"; S3.2 Major-category load balancing optimization: determine, for each category, the quota of requests that need to be "degraded" from the high-performance module to meet the overall time pruning requirement; S3.3 Intra-class module scheduling: within each request category, the combinations of high-performance and low-performance modules are enumerated, and the solution with the minimum performance loss is selected while satisfying the time constraint; S4. Output the scheduling results.
2. The method for multi-performance balancing pruning and scheduling of concurrent IVCPS services under limited resources according to claim 1, characterized in that the specific content of step S1 is as follows: introduce a multi-dimensional performance abstraction mechanism to uniformly map the system impact of any concurrent service request under different processing performance levels into a multi-dimensional performance index vector; and define for each performance level j: processing time cost $t_{ij}$, which refers to the time resources consumed when processing the i-th type of request using the j-th performance level; and multidimensional performance contribution vector $P_{ij} = (P_{ij}^{1}, P_{ij}^{2}, \dots, P_{ij}^{K})$; where i represents the category index of concurrent requests, and each category i is configured with several performance-level processing modules, including high performance, medium performance and low performance; j represents the performance level index; $P_{ij}$ represents the multidimensional performance contribution vector, that is, the contribution values generated in the K performance dimensions that the system focuses on when processing the i-th type of request with the j-th performance level; K represents the total number of performance dimensions that the system focuses on.
3. The method for multi-performance balancing pruning and scheduling of concurrent IVCPS services under limited resources according to claim 2, characterized in that, The specific content of step S2 is as follows: Let S be the set of concurrent requests received by the system in the current period, and define: T_high: Total processing time when all requests are processed using the high-performance module; T_low: Total processing time when all requests are processed using the low-performance module; T_limit: The maximum allowed response time threshold of the system; The discrimination rules are as follows: If T_high≤T_limit, then the system resources are sufficient, and all requests will use high-performance modules; If T_low > T_limit, the system enters an unsatisfiable state, triggering an overload alarm or system degradation. If T_low≤T_limit<T_high, then proceed to the performance pruning and optimization scheduling process in step S3.
4. The method for multi-performance balancing pruning and scheduling of concurrent IVCPS services under limited resources according to claim 3, characterized in that the specific content of step S3.1 is as follows: set K general performance metrics, where each type of request has a preset weight for each performance metric; introduce a request existence indicator function $\delta_i$ to prevent categories with no requests from interfering with performance evaluation: $\delta_i = 1$ if $n_i > 0$ and $\delta_i = 0$ if $n_i = 0$, where $n_i$ represents the actual number of concurrent requests of type i within the current scheduling period; each performance metric is dynamically normalized, and weighted calculations are performed only on the request categories that actually exist in the current period, thereby obtaining a dynamic performance weight vector that reflects the current business load characteristics.
5. The method for multi-performance balancing pruning and scheduling of concurrent IVCPS services under limited resources according to claim 4, characterized in that the specific content of step S3.2 is as follows: first, define the unit performance penalty coefficient $\beta_i$ for request category i: $\beta_i = \sum_{k=1}^{K} w_k \left( P_{i,\text{high}}^{k} - P_{i,\text{low}}^{k} \right)$, where $w_k$ is the dynamic weight of the k-th performance metric, and $P_{i,\text{high}}^{k}$ and $P_{i,\text{low}}^{k}$ represent the performance of the high-performance and low-performance modules on the k-th performance metric, respectively; define the total time that must be pruned, $\Delta T$, as $\Delta T = T_{\text{high}} - T_{\text{limit}}$; by constructing an integer programming model with the goal of minimizing the total performance loss while satisfying the time pruning constraint, the degradation quota for each request category is solved.
6. The method for multi-performance balancing pruning and scheduling of concurrent IVCPS services under limited resources according to claim 5, characterized in that, The specific content of step S3.3 is as follows: Within each request category, the determined degradation quota is further subdivided to determine the specific number of medium-performance and low-performance modules. Among all feasible combinations that meet the time-saving constraint, the module allocation scheme with the least performance loss is selected by enumeration search, thus obtaining the final scheduling decision.