An application instance capacity allocation optimization method, a terminal and a readable storage medium
By monitoring resource usage in a microservice architecture, allocating and storing remaining memory resources in a pre-defined pool to execute interruptible tasks, the problem of idle resources in a microservice architecture is solved, and flexible management and efficient utilization of resources are achieved.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- FUJIAN TQ ONLINE INTERACTIVE INC
- Filing Date
- 2026-01-21
- Publication Date
- 2026-06-12
AI Technical Summary
In a microservice architecture, instance scaling leads to idle and wasted resources, and existing technologies cannot effectively optimize resource utilization.
By monitoring the resource usage of application instances, total memory resources are allocated and the remaining memory resources are calculated. These resources are then stored in a preset resource pool to execute interruptible tasks. When the pool is expanded, the tasks are terminated and the resources are reclaimed and redistributed.
The method improves resource utilization, keeps resource scheduling dynamically responsive, makes resource management more flexible and efficient overall, and guarantees that the needs of critical applications are met.
Figure CN122195631A_ABST
Abstract
Description
Technical Field
[0001] This invention relates to the field of memory management technology, and in particular to an application instance capacity allocation optimization method, a terminal, and a readable storage medium.
Background Technology
[0002] With the widespread adoption of microservice architectures, systems typically rely on multi-instance deployments to improve performance and throughput, and each instance requires additional resources. To raise overall system throughput, instances are scaled, either by adding new instances or by expanding the capacity of existing ones. Each microservice subsystem requests a large amount of resources to meet its own business needs, and these resources are allocated as soon as the request is approved. In actual use, however, the system may not need all of these resources immediately, so much of the requested capacity sits idle and is wasted.
Summary of the Invention
[0003] The technical problem to be solved by the present invention is to provide an application instance capacity allocation optimization method, a terminal and a readable storage medium, which can realize the dynamic expansion of application instances and improve resource utilization.
[0004] To solve the above-mentioned technical problems, the technical solution adopted by the present invention is as follows: An application instance capacity allocation optimization method includes the following steps: Based on the expansion request of the application instance to be expanded, allocate the corresponding total memory resources to it, and calculate the remaining memory resources based on the total memory resources and the actual memory requirements of the application instance to be expanded. The remaining memory resources are stored in a preset resource pool, and the remaining memory resources are used to execute interruptible tasks; When the application instance to be expanded needs to be expanded again, the interruptible task is terminated, and the remaining memory resources are reclaimed from the preset resource pool and redistributed to the application instance to be expanded.
[0005] To solve the above-mentioned technical problems, another technical solution adopted by the present invention is as follows: An application instance capacity allocation optimization terminal includes a memory, a processor, and a computer program stored in the memory and executable on the processor. The processor executes the computer program to implement the various steps of the application instance capacity allocation optimization method described above.
[0006] To solve the above-mentioned technical problems, another technical solution adopted by the present invention is as follows: A computer-readable storage medium storing a computer program that, when executed by a processor, implements the steps of the above-described application instance capacity allocation optimization method.
[0007] The beneficial effects of this invention are as follows: This invention provides an application instance capacity allocation optimization method, terminal, and readable storage medium. Based on the expansion request of the application instance to be expanded, the total memory resources are allocated, and the remaining memory resources are calculated in combination with its actual memory needs. The remaining memory resources are stored in a preset resource pool for the execution of interruptible tasks, which can improve resource utilization. When the application instance needs to be expanded again, the interruptible tasks are terminated and the resources are reclaimed and redistributed, ensuring the dynamic response capability of resource scheduling, prioritizing the needs of critical applications, thereby optimizing the flexibility and overall efficiency of resource management.
Attached Figure Description
[0008] Figure 1 is a flowchart of an application instance capacity allocation optimization method according to an embodiment of the present invention; Figure 2 is a schematic diagram of an application instance capacity allocation optimization terminal according to an embodiment of the present invention. Label Explanation: 1. Application instance capacity allocation optimization terminal; 2. Memory; 3. Processor.
Detailed Implementation
[0009] To explain in detail the technical content, objectives, and effects of the present invention, the following description is provided in conjunction with the embodiments and accompanying drawings.
[0010] The following is an explanation of the technical terms used in this invention: (1) Interruptible task: refers to an external application or external instance task that has a short execution time and consumes few resources and can be terminated at any time.
[0011] (2) Candidate execution tasks: refers to all tasks that may be scheduled to run on the remaining resources in the resource pool.
[0012] (3) Ownership label: is a mark used to identify and associate idle resources with their original allocation objects, ensuring that each resource has a clear owner and avoiding ambiguity of ownership.
[0013] Currently, microservice architecture is widely adopted. Systems typically improve performance and throughput through multi-instance deployment, and instance scaling includes adding new instances and increasing the capacity of existing instances. In practical applications, each microservice subsystem often pre-applies for a large resource quota to meet its own business needs, and the resources are allocated immediately after approval. However, because business load growth is often gradual or fluctuating, the system does not immediately require all allocated resources during actual operation, resulting in a large amount of resources remaining idle for a considerable period, leading to low resource utilization and wasted costs.
[0014] Referring to Figure 1, this invention provides an application instance capacity allocation optimization method, including the following steps: Based on the expansion request of the application instance to be expanded, allocate the corresponding total memory resources to it, and calculate the remaining memory resources based on the total memory resources and the actual memory requirements of the application instance to be expanded. The remaining memory resources are stored in a preset resource pool, and the remaining memory resources are used to execute interruptible tasks; When the application instance to be expanded needs to be expanded again, the interruptible task is terminated, and the remaining memory resources are reclaimed from the preset resource pool and redistributed to the application instance to be expanded.
[0015] As can be seen from the above description, the beneficial effects of the present invention are as follows: by allocating total memory resources based on the expansion request of the application instance to be expanded and calculating the remaining memory resources beyond its actual memory requirements, the remaining memory resources are stored in a preset resource pool to execute interruptible tasks, so that idle resources can be temporarily put into use, effectively improving resource utilization. When the application instance to be expanded needs to be expanded again, the interruptible tasks are terminated, and the remaining memory resources are reclaimed from the resource pool and redistributed to the application instance, thereby ensuring that resources can be adjusted as needed, guaranteeing the immediacy and flexibility of application instance expansion, and enhancing the flexibility and overall efficiency of resource management.
[0016] Furthermore, based on the scaling request of the application instance to be scaled, the corresponding total memory resources are allocated to it, including: The resource usage of each application instance is monitored in real time. It is determined whether the resource usage meets the preset expansion conditions. If so, the corresponding application instance is selected as the application instance to be expanded, and an expansion request for the application instance to be expanded is generated.
[0017] As described above, by monitoring the resource usage of application instances in real time and determining whether they meet the preset expansion conditions based on the resource usage, application instances whose resource load has reached the critical point can be identified in a timely manner, effectively preventing performance degradation due to insufficient resources. When the preset expansion conditions are met, an expansion request is generated to ensure the timeliness and automation of the expansion operation and improve the response efficiency of resource management.
[0018] Furthermore, the resource usage includes CPU usage, memory usage, and instance request concurrency; Determining whether the resource usage meets the preset expansion conditions includes: If any one of the CPU usage, memory usage, and instance request concurrency reaches the corresponding preset expansion threshold, then the preset expansion conditions are met.
[0019] As described above, by comprehensively monitoring the resource usage of application instances, including CPU usage, memory usage, and instance request concurrency, the actual pressure status of application instances can be reflected more comprehensively and accurately. When any of the resource usage parameters reaches its corresponding preset expansion threshold, it is determined that the preset expansion conditions are met. This enables rapid response to various critical resource bottlenecks, improves the sensitivity and timeliness of expansion triggering, and effectively prevents service unavailability issues caused by resource overload in a single dimension.
[0020] Furthermore, based on the total memory resources and the actual memory requirements of the application instances to be expanded, the remaining memory resources are calculated, including: Determine whether the memory capacity of the application instance to be expanded has reached a preset capacity threshold. If not, expand the application instance to be expanded, calculate the difference between the actual memory requirement of the application instance to be expanded and the total memory resources, and obtain the remaining memory resources.
[0021] As described above, by determining whether the memory capacity of the application instance to be expanded has reached the preset capacity threshold, the expansion method is determined. The remaining memory resources are determined by calculating the difference between the actual memory required by the application instance after expansion and the total memory resources. This enables accurate quantification and effective identification of allocated but unused resources, providing an accurate basis for the subsequent execution of interruptible tasks.
[0022] Furthermore, based on the total memory resources and the actual memory requirements of the application instances to be expanded, the remaining memory resources are calculated, including: Determine whether the memory capacity of the application instance to be expanded has reached a preset capacity threshold. If so, add a new application instance, calculate the difference between the actual memory requirement of the new application instance and the total memory resources, and obtain the remaining memory resources.
[0023] As described above, by determining that the memory capacity of the instance to be expanded has reached the preset capacity threshold, the expansion scenario of the new application instance can be accurately identified, enabling the resource allocation strategy to cover different scenarios of instance expansion. The difference between the actual memory requirement of the new application instance and the total allocated memory resources is calculated to determine the remaining memory resources. This effectively identifies and quantifies the allocated but unused resources, providing an accurate basis for the subsequent execution of interruptible tasks, thereby improving resource utilization and resource management capabilities.
[0024] Furthermore, before the remaining memory resources are used to execute interruptible tasks, the method includes: Monitor all candidate execution tasks, obtain the execution time and resource consumption data of each candidate execution task, classify the tasks by execution time and resource consumption data according to the preset evaluation rules and add corresponding task tags, and filter out interruptible tasks from the candidate execution tasks based on the task tags.
[0025] As described above, by monitoring and acquiring the execution duration and resource consumption data of all candidate tasks, task operation characteristics can be collected, providing an accurate data foundation for subsequent analysis and screening. Based on the execution duration and resource consumption data, all candidate tasks are classified according to preset evaluation rules and corresponding task tags are added, realizing structured identification and marking of task characteristics, enhancing the precision and controllability of task management. Based on the task tags, interruptible tasks are screened from the candidate tasks, thereby accurately identifying task types that are suitable for execution using idle resources and can be safely interrupted, ensuring that temporary calls to the resource pool are both efficient and have no impact on core business.
[0026] Furthermore, real-time monitoring of resource usage for each application instance also includes: The real-time monitoring results of each application instance are stored in the same distributed cache cluster.
[0027] As described above, by centrally storing the real-time monitoring results of each application instance, the integrity and persistence of the data are ensured. Furthermore, by storing all data in the same distributed cache cluster, unified access and centralized management of monitoring data are achieved, effectively avoiding data dispersion and heterogeneity issues, and greatly improving the efficiency of data access and processing.
[0028] Furthermore, storing the remaining memory resources in a preset resource pool also includes: Configure corresponding ownership tags for the remaining memory resources based on the application instance to be expanded.
[0029] As described above, by configuring corresponding ownership tags for remaining memory resources, a clear ownership association is established between resources and application instances, ensuring resource traceability. Furthermore, resource pools can be managed based on ownership tags. When an instance needs to be expanded again, its dedicated idle resources can be quickly and accurately identified and reclaimed, avoiding resource mismatch, improving the accuracy and efficiency of resource rollback, and ensuring the timeliness of expansion operations.
[0030] Referring to Figure 2, another embodiment of the present invention provides an application instance capacity allocation optimization terminal, including a memory, a processor, and a computer program stored in the memory and executable on the processor. When the processor executes the computer program, it implements the various steps of the above-described application instance capacity allocation optimization method.
[0031] The application instance capacity allocation optimization method, terminal, and readable storage medium described above are applicable to application instance expansion scenarios, enabling real-time dynamic expansion of application instances and improving resource utilization. Specific embodiments are described below.
Referring to Figure 1, one embodiment of the present invention is as follows: An application instance capacity allocation optimization method includes the following steps:
S0. Monitor the resource usage of each application instance in real time, including CPU usage, memory usage, and instance request concurrency. If any one of the CPU usage, memory usage, and instance request concurrency reaches the corresponding preset expansion threshold, it is determined that the preset expansion conditions are met, the corresponding application instance is taken as the application instance to be expanded, and an expansion request for the application instance to be expanded is generated.
[0032] In this embodiment, the monitoring system within the overall microservice architecture monitors the CPU usage, memory usage, and instance request concurrency of all application instances, using the monitoring results as the basis for scaling up. When any one of these three metrics reaches a preset scaling up threshold, the system determines that the preset scaling up conditions are met, designates the corresponding application instance as the scaling up instance, and generates a scaling up request for the application instance to be scaled up. The preset scaling up threshold can be set according to specific business rules. For example, scaling up may be triggered when the memory usage of an application instance reaches 80% of the total memory; or, if the real-time request concurrency of an application instance reaches a certain level, scaling up may be performed in advance to ensure service quality even if CPU and memory usage are not high.
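The trigger rule in step S0 can be illustrated with a minimal Python sketch. The class names, field names, and the CPU and concurrency threshold values below are assumptions made for illustration; only the "any one metric reaches its threshold" rule and the 80% memory example come from the description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Usage:
    """One monitoring sample for an application instance (hypothetical shape)."""
    cpu_pct: float
    mem_pct: float
    concurrency: int


@dataclass(frozen=True)
class Thresholds:
    """Preset expansion thresholds; set by business rules in practice."""
    cpu_pct: float = 85.0    # assumed value, not from the description
    mem_pct: float = 80.0    # the 80% memory example from the description
    concurrency: int = 100   # assumed value, not from the description


def needs_expansion(usage: Usage, limits: Thresholds = Thresholds()) -> bool:
    """Return True if ANY single metric has reached its preset threshold."""
    return (
        usage.cpu_pct >= limits.cpu_pct
        or usage.mem_pct >= limits.mem_pct
        or usage.concurrency >= limits.concurrency
    )


# Memory alone at 80% is enough to trigger an expansion request.
sample = Usage(cpu_pct=10.0, mem_pct=80.0, concurrency=5)
trigger = needs_expansion(sample)
```

Using a logical OR across the three metrics reflects the stated intent that a bottleneck in a single dimension should be enough to trigger expansion.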
[0033] Furthermore, in this embodiment, real-time monitoring of the resource usage of each application instance also includes: storing the real-time monitoring results of each application instance in the same distributed cache cluster. The storage data format is: application ID, instance ID, instance CPU usage, memory usage, and instance request volume. This method ensures data integrity and persistence, and enables unified access and centralized management of monitoring data, effectively avoiding data dispersion and heterogeneity issues, and greatly improving the efficiency of data access and processing.
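As a rough illustration of the storage format listed above, the following Python sketch serializes one monitoring record and stores it under a per-instance key. A plain dictionary stands in for the distributed cache cluster, and the key layout, field names, and JSON encoding are assumptions; the description only lists the five fields.

```python
import json
from dataclasses import asdict, dataclass
from typing import Dict


@dataclass
class MonitorRecord:
    """The five fields named in the description (field names are hypothetical)."""
    app_id: str
    instance_id: str
    cpu_usage: float
    memory_usage: float
    request_volume: int


def cache_key(app_id: str, instance_id: str) -> str:
    # Hypothetical key layout: one entry per application instance.
    return f"monitor:{app_id}:{instance_id}"


def store(cache: Dict[str, str], record: MonitorRecord) -> None:
    """Write the latest sample; the dict stands in for the cache cluster."""
    cache[cache_key(record.app_id, record.instance_id)] = json.dumps(asdict(record))


def load(cache: Dict[str, str], app_id: str, instance_id: str) -> MonitorRecord:
    """Read a sample back through the same key layout."""
    return MonitorRecord(**json.loads(cache[cache_key(app_id, instance_id)]))


cache: Dict[str, str] = {}
store(cache, MonitorRecord("app-1", "inst-1", 35.0, 80.0, 100))
latest = load(cache, "app-1", "inst-1")
```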
[0034] S1. Allocate corresponding total memory resources to the application instance to be expanded based on its expansion request, and calculate the remaining memory resources based on the total memory resources and the actual memory requirements of the application instance to be expanded, specifically including S1.1-S1.2.
[0035] S1.1 Determine whether the memory capacity of the application instance to be expanded has reached the preset capacity threshold. If not, expand the application instance to be expanded, calculate the difference between the actual memory requirement of the application instance to be expanded and the total memory resources, and obtain the remaining memory resources.
[0036] In this embodiment, the expansion method is determined by judging whether the memory capacity of the application instance to be expanded has reached a preset capacity threshold.
[0037] When the application instance to be expanded has not reached the preset capacity threshold, expansion is performed by increasing the capacity of the application instance to be expanded. The difference between the total memory resources and the actual memory requirement of the application instance after expansion is calculated to determine the remaining memory resources. This achieves accurate quantification and effective identification of allocated but unused resources, providing an accurate basis for subsequent execution of interruptible tasks. Take as an example an existing instance with a maximum memory of 1 GB, a current concurrency of 100, and a current memory usage of 0.8 GB, which meets the preset expansion conditions. An expansion request is generated with an expansion ratio of 4 times the maximum memory. After expansion, the maximum memory of the existing instance is 4 GB, and the actual memory requirement is twice the currently used memory, i.e., 1.6 GB. The difference between the total memory resources and the actual memory requirement of the expanded instance gives the remaining memory resources: 4 - 1.6 = 2.4 GB.
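The arithmetic of the worked example above can be checked with a short Python sketch. The function names are hypothetical, and memory is expressed in whole megabytes (treating 1 GB as 1000 MB, an assumption made only to keep the arithmetic in integers).

```python
def expanded_total_mb(max_memory_mb: int, expansion_ratio: int) -> int:
    """Total memory granted by the expansion request."""
    return max_memory_mb * expansion_ratio


def remaining_memory_mb(total_mb: int, required_mb: int) -> int:
    """Allocated-but-unused memory that can be moved to the resource pool."""
    if required_mb > total_mb:
        raise ValueError("actual requirement exceeds the allocated total")
    return total_mb - required_mb


# Figures from the example: 1 GB maximum, 0.8 GB in use, 4x expansion,
# actual requirement assumed to be twice the memory currently in use.
total = expanded_total_mb(max_memory_mb=1000, expansion_ratio=4)
required = 2 * 800
remaining = remaining_memory_mb(total, required)
```

With these inputs, `total` is 4000 MB, `required` is 1600 MB, and `remaining` is 2400 MB, matching the 2.4 GB stated in the example.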
[0038] S1.2 Determine whether the memory capacity of the application instance to be expanded has reached the preset capacity threshold. If so, add a new application instance, calculate the difference between the actual memory requirement of the new application instance and the total memory resources, and obtain the remaining memory resources.
[0039] In this embodiment, when the memory capacity of the application instance to be expanded reaches a preset capacity threshold, it indicates that the expansion method is to add a new application instance. The remaining memory resources are determined by calculating the difference between the actual memory requirement of the new application instance and the total allocated memory resources. This effectively identifies and quantifies the allocated but unused resources, providing an accurate basis for the subsequent execution of interruptible tasks, thereby improving resource utilization and resource management capabilities. Taking an existing instance with 1GB of memory and a request for 4GB of total memory resources as an example, a brand new application instance (i.e., a new instance) is created, 1GB of memory is allocated to it, and the difference between the total memory resources and the actual memory requirement of the new instance is calculated, resulting in a remaining memory resource of 3GB.
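Steps S1.1 and S1.2 differ only in how the expansion is carried out, which the following Python sketch tries to make explicit. The type and function names, the mode labels, and the capacity threshold values are assumptions for illustration; the description does not state a concrete threshold.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExpansionPlan:
    mode: str         # "grow_existing" (S1.1) or "add_instance" (S1.2)
    total_mb: int     # total memory allocated for the expansion request
    required_mb: int  # actual memory requirement of the expanded or new instance

    @property
    def remaining_mb(self) -> int:
        """Memory that can be placed in the preset resource pool."""
        return self.total_mb - self.required_mb


def plan_expansion(instance_capacity_mb: int, capacity_threshold_mb: int,
                   total_mb: int, required_mb: int) -> ExpansionPlan:
    """Pick the expansion mode from the preset capacity threshold."""
    if required_mb > total_mb:
        raise ValueError("actual requirement exceeds the allocated total")
    reached = instance_capacity_mb >= capacity_threshold_mb
    mode = "add_instance" if reached else "grow_existing"
    return ExpansionPlan(mode, total_mb, required_mb)


# S1.1 example: instance below an assumed 8000 MB threshold, grown in place.
grow = plan_expansion(1000, 8000, total_mb=4000, required_mb=1600)
# S1.2 example: threshold reached, new 1 GB instance out of a 4 GB request.
add = plan_expansion(1000, 1000, total_mb=4000, required_mb=1000)
```

In both branches the remaining memory is the same subtraction; only the instance whose requirement is subtracted changes.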
[0040] S2. Monitor all candidate execution tasks, obtain the execution time and resource consumption data of each candidate execution task, classify the tasks by execution time and resource consumption data according to the preset evaluation rules and add corresponding task tags, and filter out interruptible tasks from the candidate execution tasks based on the task tags. The remaining memory resources are stored in a preset resource pool, and the remaining memory resources are used to execute interruptible tasks.
[0041] In this embodiment, a pre-defined resource pool serves as the unit for managing remaining memory resources. It aggregates idle memory stripped from various expansion operations. This idle memory can be used to execute non-critical, interruptible computing tasks, typically originating from external applications or instances. To identify which computing tasks are interruptible, the runtime metrics of all candidate tasks are monitored, including execution time and resource consumption data. Tasks are categorized based on pre-defined evaluation rules and tagged accordingly. This allows for the selection of suitable interruptible tasks for execution in the resource pool based on these tags. In this way, remaining memory resources are effectively utilized to support these tagged tasks, significantly improving overall resource utilization. Furthermore, because tasks can be interrupted, when memory in the resource pool is reclaimed due to the need for expansion of the source instance, the termination of the task will not affect the stability of the core business process, thus achieving elastic resource reuse and secure scheduling.
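The tagging and filtering described above might look like the following Python sketch. The evaluation rule used here (a task is interruptible if it is both short and light on memory) and its two limits are assumptions; the description leaves the preset evaluation rules open.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class TaskStats:
    """Observed runtime metrics for one candidate execution task."""
    name: str
    exec_seconds: float
    memory_mb: int


# Hypothetical preset evaluation rule: short and lightweight tasks only.
MAX_EXEC_SECONDS = 60.0
MAX_MEMORY_MB = 256


def tag_task(task: TaskStats) -> str:
    """Classify one task and return its task tag."""
    short = task.exec_seconds <= MAX_EXEC_SECONDS
    light = task.memory_mb <= MAX_MEMORY_MB
    return "interruptible" if short and light else "non_interruptible"


def select_interruptible(tasks: List[TaskStats]) -> List[TaskStats]:
    """Tag every candidate, then keep only those tagged interruptible."""
    tags: Dict[str, str] = {t.name: tag_task(t) for t in tasks}
    return [t for t in tasks if tags[t.name] == "interruptible"]


candidates = [
    TaskStats("thumbnail-render", exec_seconds=12.0, memory_mb=128),
    TaskStats("nightly-report", exec_seconds=3600.0, memory_mb=128),
    TaskStats("index-rebuild", exec_seconds=30.0, memory_mb=2048),
]
selected = [t.name for t in select_interruptible(candidates)]
```

Here only `thumbnail-render` passes both limits, so it is the only task eligible to run on pooled memory.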
[0042] Furthermore, in this embodiment, storing the remaining memory resources in a preset resource pool also includes: configuring corresponding ownership tags for the remaining memory resources based on the application instance to be expanded. This establishes a clear ownership association between the resources and the application instance, ensuring resource traceability.
[0043] S3. When the application instance to be expanded needs to be expanded again, the interruptible task is terminated, and the remaining memory resources are reclaimed from the preset resource pool and redistributed to the application instance to be expanded.
[0044] In this embodiment, when an instance needs to be expanded again (such as when the concurrent requests for the instance continue to increase), its dedicated idle resources are quickly and accurately identified and reclaimed from the preset resource pool based on the ownership tag, avoiding resource mismatch, improving the accuracy and efficiency of resource rollback, and ensuring the timeliness of the expansion operation.
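A minimal Python sketch of the pool behaviour in steps S2 and S3 follows, assuming the ownership tag is simply a string identifying the source instance. The class and method names are hypothetical, and terminating a task is reduced to returning its name to the caller.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class PooledMemory:
    owner_tag: str  # ownership tag: the instance the memory was allocated to
    size_mb: int
    running_tasks: List[str] = field(default_factory=list)


class ResourcePool:
    """Holds idle memory keyed by ownership tag (illustrative only)."""

    def __init__(self) -> None:
        self._entries: Dict[str, PooledMemory] = {}

    def deposit(self, owner_tag: str, size_mb: int) -> None:
        """Store remaining memory under the owning instance's tag."""
        entry = self._entries.setdefault(owner_tag, PooledMemory(owner_tag, 0))
        entry.size_mb += size_mb

    def run_task(self, owner_tag: str, task_name: str) -> None:
        """Record an interruptible task running on this owner's idle memory."""
        self._entries[owner_tag].running_tasks.append(task_name)

    def reclaim(self, owner_tag: str) -> Tuple[int, List[str]]:
        """Remove the owner's memory; return (size, tasks to terminate)."""
        entry = self._entries.pop(owner_tag, None)
        if entry is None:
            return 0, []
        return entry.size_mb, entry.running_tasks

    def idle_mb(self) -> int:
        return sum(e.size_mb for e in self._entries.values())


pool = ResourcePool()
pool.deposit("app-1/inst-1", 2400)
pool.deposit("app-2/inst-7", 3000)
pool.run_task("app-1/inst-1", "thumbnail-render")
reclaimed_mb, terminated = pool.reclaim("app-1/inst-1")
```

Keying the pool by ownership tag means a re-expansion touches only the requesting instance's own memory and tasks; memory deposited by other instances stays in the pool.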
[0045] According to another aspect of the invention, Figure 2 is a schematic diagram of an application instance capacity allocation optimization terminal according to an embodiment of the present invention. The terminal includes a memory, a processor, and a computer program stored in the memory and executable on the processor. When the processor executes the computer program, it implements the various steps of the application instance capacity allocation optimization method described above.
[0046] According to another aspect of the present invention, a computer-readable medium is provided. The computer-readable medium stores a computer program that is executed by a processor to implement an application instance capacity allocation optimization method as described above.
[0047] In summary, this invention provides an application instance capacity allocation optimization method, terminal, and readable storage medium. By allocating total memory resources based on the expansion request of the application instance to be expanded and calculating the remaining memory resources beyond its actual memory requirements, the remaining memory resources are stored in a preset resource pool to execute interruptible tasks. This allows idle resources to be temporarily put into use, effectively improving resource utilization. When the application instance to be expanded needs to be expanded again, the interruptible tasks are terminated, and the remaining memory resources are reclaimed from the resource pool and redistributed to the application instance. This ensures that resources can be adjusted as needed, guarantees the immediacy and elasticity of application instance expansion, and enhances the flexibility and overall efficiency of resource management.
[0048] The above description is merely an embodiment of the present invention and does not limit the patent scope of the present invention. Any equivalent modifications made based on the content of the present invention specification and drawings, or direct or indirect applications in related technical fields, are similarly included within the patent protection scope of the present invention.
Claims
1. A method for optimizing capacity allocation in application instances, characterized in that it includes the following steps: Based on the expansion request of the application instance to be expanded, allocate the corresponding total memory resources to it, and calculate the remaining memory resources based on the total memory resources and the actual memory requirements of the application instance to be expanded. The remaining memory resources are stored in a preset resource pool, and the remaining memory resources are used to execute interruptible tasks; When the application instance to be expanded needs to be expanded again, the interruptible task is terminated, and the remaining memory resources are reclaimed from the preset resource pool and redistributed to the application instance to be expanded.
2. The application instance capacity allocation optimization method according to claim 1, characterized in that, Based on the expansion request of the application instance to be expanded, allocate the corresponding total memory resources to it, which includes: The resource usage of each application instance is monitored in real time. It is determined whether the resource usage meets the preset expansion conditions. If so, the corresponding application instance is selected as the application instance to be expanded, and an expansion request for the application instance to be expanded is generated.
3. The application instance capacity allocation optimization method according to claim 2, characterized in that, The resource usage includes CPU usage, memory usage, and instance request concurrency. Determining whether the resource usage meets the preset expansion conditions includes: If any one of the CPU usage, memory usage, and instance request concurrency reaches the corresponding preset expansion threshold, then the preset expansion conditions are met.
4. The application instance capacity allocation optimization method according to claim 1, characterized in that, Based on the total memory resources and the actual memory requirements of the application instances to be expanded, the remaining memory resources are calculated, including: Determine whether the memory capacity of the application instance to be expanded has reached a preset capacity threshold. If not, expand the application instance to be expanded, calculate the difference between the actual memory requirement of the application instance to be expanded and the total memory resources, and obtain the remaining memory resources.
5. The application instance capacity allocation optimization method according to claim 1, characterized in that, Based on the total memory resources and the actual memory requirements of the application instances to be expanded, the remaining memory resources are calculated, including: Determine whether the memory capacity of the application instance to be expanded has reached a preset capacity threshold. If so, add a new application instance, calculate the difference between the actual memory requirement of the new application instance and the total memory resources, and obtain the remaining memory resources.
6. The application instance capacity allocation optimization method according to claim 1, characterized in that, before the remaining memory resources are used to execute an interruptible task, the method includes: Monitor all candidate execution tasks, obtain the execution time and resource consumption data of each candidate execution task, classify the tasks by execution time and resource consumption data according to the preset evaluation rules and add corresponding task tags, and filter out interruptible tasks from the candidate execution tasks based on the task tags.
7. The application instance capacity allocation optimization method according to claim 2, characterized in that, Real-time monitoring of resource usage for each application instance also includes: The real-time monitoring results of each application instance are stored in the same distributed cache cluster.
8. The application instance capacity allocation optimization method according to claim 1, characterized in that, Storing the remaining memory resources into a preset resource pool also includes: Configure corresponding ownership tags for the remaining memory resources based on the application instance to be expanded.
9. An application instance capacity allocation optimization terminal, comprising a memory, a processor, and a computer program stored in the memory and running on the processor, characterized in that, When the processor executes the computer program, it implements each step of the application instance capacity allocation optimization method according to any one of claims 1 to 8.
10. A computer-readable storage medium storing a computer program that is executed by a processor to implement the steps of the application instance capacity allocation optimization method according to any one of claims 1 to 8.