An auto-scaling system and method for cloud-native intelligent typesetting services

By combining a master resource scheduler and a slave resource scheduler with a planning search algorithm, the problem of unreasonable utilization of cloud resources is solved, achieving efficient service deployment and rational use of resources.

CN116263715B · Active · Publication Date: 2026-06-30 · HANGZHOU DIANZI UNIV


Patent Information

Authority / Receiving Office: CN · China
Patent Type: Patents(China)
Current Assignee / Owner: HANGZHOU DIANZI UNIV
Filing Date: 2022-12-22
Publication Date: 2026-06-30



Abstract

This invention discloses an auto-scaling system and method for cloud-native intelligent typesetting services. The system includes a master resource scheduler: for receiving service requests, determining whether current cluster resources are sufficient and automatically scaling, and for phased distribution of service requests; and slave resource schedulers: for receiving phased distributed service requests, determining whether the resources within the containers providing the service locally are sufficient and automatically scaling, and for feeding back request results and resource usage to the master resource scheduler. This invention utilizes a planning algorithm to predict service deployment methods and a search algorithm to predict whether service deployment methods will lead to excessively long service response times. While ensuring timely service completion, it rationally and efficiently utilizes cloud resources. Combined with deployed services, this invention performs targeted automated deployment of microservices under known microservice processes and estimated resource consumption and service duration, while also considering cloud resource utilization and timely scaling up or down cloud resources.


Description

Technical Field

[0001] This invention relates to the field of cloud computing technology, and in particular to an automatic scaling system and method for cloud-native intelligent typesetting services.

Background Technology

[0002] Cloud computing provides access to a broad collection of servers, storage, databases, and application services via the internet. Cloud service providers offer cloud services, such as web services and business applications, hosted on servers in one or more data centers and accessible to companies or individuals via the internet. Hyperscale cloud service providers typically have hundreds of thousands of servers. Programs are broken down into microservices running on cloud servers, deployed in containers such as Docker, and managed by container orchestrators such as Kubernetes. Kubernetes' deployment and autoscaling mechanisms are based on the deployment itself, not on the services being deployed. During service deployment, a randomly matched machine may have insufficient resources, causing the deployment to fail and be repeated. Service scheduling does not prioritize deployments, which often results in long server response times. Because it is service-independent, Kubernetes cannot efficiently deploy services or utilize cloud resources.

Summary of the Invention

[0003] The present invention aims to overcome the problems of existing technologies, such as service deployment failure due to insufficient randomly matched machine resources, resulting in multiple deployments and low efficiency due to unreasonable cloud resource utilization, and provides an automatic scaling system and method for cloud-native intelligent typesetting services.

[0004] To achieve the above objectives, the present invention adopts the following technical solution:

[0005] An auto-scaling system for cloud-native intelligent typesetting services includes a master resource scheduler: for receiving service requests, determining whether current cluster resources are sufficient and automatically scaling, and for phased distribution of service requests; and slave resource schedulers: for receiving phased distributed service requests, determining whether the resources within the containers providing the service locally are sufficient and automatically scaling, and for reporting request results and resource usage to the master resource scheduler. This invention utilizes a planning algorithm to predict service deployment methods and a search algorithm to predict whether service deployment methods will lead to excessively long service response times. While ensuring timely service completion, it rationally and efficiently utilizes cloud resources. This invention, combined with the deployed services, performs targeted automated deployment of microservices under the given DAG flowchart and estimated resource consumption and service duration, while also considering cloud resource utilization and timely scaling up or down cloud resources.

[0006] As a preferred embodiment of the present invention, the master resource scheduler includes: a service type determination module: used to determine the request type and service process, assign a unique ID to the request, initialize the service process record file, and send the request ID to the request distribution module; a service distribution module: after receiving the request ID, it sends a resource query request to the cluster resource management module to find the optimal slave resource scheduler in the cluster to distribute the service request; a cluster resource management module: records all resources in the cluster, updates the number of available and occupied resources based on the feedback from the slave resource schedulers; receives resource query requests and returns the number of available resources; receives requests to add cluster resources, sends a request to add resources to the cloud platform, and returns a message indicating whether the resource addition was successful or not after receiving feedback from the cloud platform; after a certain period of time without receiving resource query requests, it checks the number of available and occupied resources and sends a request to reduce resources to the cloud platform.

[0007] As a preferred embodiment of the present invention, the slave resource scheduler includes: a container service module: receiving service requests and checking whether there are containers on the local machine that can complete the corresponding service; if not, starting a new container; if so, sending a resource query request to the container resource management module to check whether the container has been allocated sufficient resources; if so, sending a task to the container and sending resource occupancy information to the container resource management module; and sending resource release information after receiving feedback from the container; and a container resource management module: recording all resources in the slave resource scheduler, receiving resource query requests, returning the number of resources allocated to the container, updating the total number of available and occupied resources in the slave resource scheduler based on feedback from the container service module, and sending statistical information to the master resource scheduler.

[0008] An autoscaling method for cloud-native intelligent typesetting services includes the following steps:

S1: The master resource scheduler receives a user service request, determines the service process based on the request type, and calculates the estimated resource consumption of the request and the idle resources of the cluster.

S2: If the estimated resource consumption is less than the idle resources of the cluster, proceed directly to S3; otherwise, send a request to increase cluster resources to the cluster resource management module, and then execute S3.

S3: The service type judgment module initializes the service process log file and selects the optimal slave resource scheduler based on the calculated estimated resource consumption; if the request to increase cluster resources succeeds, the added resources are allocated to the optimal slave resource scheduler. The slave resource scheduler receives the service request, and the container service module adds the request to the service request list.

S4: Check whether there is a container on the local machine that can complete the corresponding service. If so, send a resource query request to the container resource management module and execute S7 directly. If not, start a new container and execute S5.

S5: Check whether the container has free resources to meet the resource requirements of the various services in the service set. If so, execute S6 directly. If not, adjust the container resources, allocate the service set to each container, update the resource table of the container resource management module, and, after the statistics are compiled, request the cluster resource management module to update the resource usage of the slave resource scheduler.

S6: After the container completes the phased service, it writes the return value to the service process record file and sends the request ID to the container service module. The container service module updates the resource table of the container resource management module according to the container service type and container name, releases the resource usage, and returns to execute S4.

S7: The container service module completes the request and returns the execution result to the user, and the process ends.

This invention provides an automatic scaling method for cloud-native intelligent typesetting services. It uses a planning algorithm to predict service deployment methods and a search algorithm to predict whether the service deployment method will lead to excessively long service response times. While ensuring timely service completion, it makes reasonable and efficient use of cloud resources. This invention is combined with the deployed services. Given the DAG flowchart of the microservices and the estimated resource consumption and service duration, it performs targeted automatic deployment of microservices, while considering the utilization rate of cloud resources and reducing or expanding cloud resources in a timely manner.

[0009] As a preferred embodiment of the present invention, S1 specifically involves: after the main resource scheduler receives a user's service request, the service type judgment module determines the request type according to the request type dictionary, determines the service process required for this request, calculates the estimated resource consumption of this request according to the service resource consumption dictionary, and queries the cluster resource management module for cluster resource statistics to obtain the cluster's idle resources.

[0010] As a preferred embodiment of the present invention, after the request to increase cluster resources is sent to the cluster resource management module in step S2, step S3 continues to be executed in all three cases: no response is received within a certain period of time, the request is unsuccessful, or the request is successful.

[0011] As a preferred embodiment of the present invention, S3 specifically comprises the following. The service type judgment module generates the ID of this request, initializes the service process record file of this request, and sends the request ID to the request distribution module. Based on the estimated resource consumption, the request distribution module queries the cluster resource management module for the slave resource scheduler with the most available resources of each type, i.e. the optimal slave resource scheduler, and the cluster resource management module returns the address of that slave resource scheduler. If the cluster resource management module receives feedback from the cloud platform that the request to add resources succeeded, it allocates the added resources to the optimal slave resource scheduler; even if the request to add resources fails, it still returns the address of the slave resource scheduler. The slave resource scheduler receives the service request, and the container service module adds the request to the service request list. When the slave resource scheduler returns the request ID, the result of this request is returned to the client.

[0012] As a preferred embodiment of the present invention, the step in S4 of checking whether there is a container locally that can complete the corresponding service specifically involves: when allocating a task, finding the first service with no return value for each request in the service request list; using integer programming to derive a list of service sets for a single allocation, without exceeding the existing idle resources of the slave resource scheduler and using as many idle resources as possible; using a tree search method to predict whether each service set will cause any request to have an excessively long final response time; selecting the optimal service set; and checking, based on the container service dictionary, whether there is a container locally that can complete the corresponding service.

[0013] As a preferred embodiment of the present invention, after the resource table of the container resource management module completes the resource release operation, the module counts the available and occupied amounts of the various resources owned by the slave resource scheduler and requests an update of that slave resource scheduler's resource information in the cluster resource management module. When the container resource management module does not receive a request to release resource occupation within a set time, it counts the available and occupied amounts of the various resources owned by all slave resource schedulers. When the available amount of the various resources is greater than the set value, it requests the cluster resource management module to reduce the resources of the slave resource scheduler.

[0014] Therefore, the present invention has the following beneficial effects: The automatic scaling system and method for cloud-native intelligent typesetting services of the present invention uses a planning algorithm to predict the service deployment method and a search algorithm to predict whether the service deployment method will lead to excessively long service response time. While ensuring the timeliness of service completion, the present invention uses cloud resources reasonably and efficiently. The present invention is combined with the deployed service. Under the condition of knowing the DAG flowchart of the microservice and the estimated resource consumption and service duration, the microservice is deployed in a targeted and automated manner. At the same time, the utilization rate of cloud resources is considered, and cloud resources are reduced or expanded in a timely manner.

Attached Figure Description

[0015] Figure 1 is a schematic diagram of the system structure of the present invention;

[0016] Figure 2 is a flowchart of the method of the present invention;

[0017] Figure 3 is a flowchart of a method according to an embodiment of the present invention;

[0018] Figure 4 is a flowchart of a slave resource scheduler's request to reduce resources according to an embodiment of the present invention.

[0019] In the figures: 1. Master resource scheduler; 2. Slave resource scheduler; 3. Service type determination module; 4. Service distribution module; 5. Cluster resource management module; 6. Container service module; 7. Container resource management module.

Detailed Implementation

[0020] The present invention will now be further described with reference to the accompanying drawings and specific embodiments.

[0021] As shown in Figure 1, an auto-scaling system for cloud-native intelligent typesetting services includes a master resource scheduler 1: used to receive service requests, determine whether the current cluster resources are sufficient and automatically scale, and to distribute service requests in stages; and a slave resource scheduler 2: used to receive service requests distributed in stages, determine whether the resources in the container providing the service locally are sufficient and automatically scale, and to report the request results and resource usage to the master resource scheduler 1.

[0022] The master resource scheduler 1 includes: a service type determination module 3, used to determine the request type and service process, assign a unique ID to the request, initialize the service process record file, and send the request ID to the request distribution module; a service distribution module 4, which, after receiving the request ID, sends a resource query request to the cluster resource management module 5 to find the optimal slave resource scheduler 2 within the cluster to distribute the service request; and a cluster resource management module 5, which records all resources in the cluster, updates the number of available and occupied resources based on feedback from the slave resource scheduler 2, receives resource query requests, returns the number of available resources, receives requests to add cluster resources, sends the request to the cloud platform, and returns a message indicating whether the resource addition was successful or not, after receiving feedback from the cloud platform; and after a certain period of time without receiving resource query requests, checks the number of available and occupied resources and sends a request to reduce resources to the cloud platform.

[0023] The slave resource scheduler 2 includes: Container Service Module 6: Receives service requests and checks whether there are any containers on the local machine that can complete the corresponding service. If not, it starts a new container. If there are, it sends a resource query request to Container Resource Management Module 7 to check whether the container has sufficient resources. If sufficient, it sends a task to the container and sends resource occupancy information to Container Resource Management Module 7. After receiving feedback from the container, it sends resource release information. Container Resource Management Module 7: Records all resources in the slave resource scheduler 2, receives resource query requests, returns the number of resources allocated to the container, updates the total number of available and occupied resources in the slave resource scheduler 2 based on the feedback from Container Service Module 6, and sends statistical information to the master resource scheduler 1.

[0024] As shown in Figure 2, an automatic scaling method for cloud-native intelligent typesetting services includes the following steps:

[0025] S1: The master resource scheduler receives the user's service request, determines the service process based on the request type, and calculates the estimated resource consumption of the request and the cluster's idle resources. Specifically, after the master resource scheduler receives the user's service request, the service type judgment module determines the request type based on the request type dictionary, determines the service process required for this request, calculates the estimated resource consumption of this request based on the service resource consumption dictionary, and queries the cluster resource management module for cluster resource statistics to obtain the cluster's idle resources.

[0026] S2: If the estimated resource consumption is less than the cluster's idle resources, then execute S3 directly; otherwise, send a request to the cluster resource management module to increase cluster resources, and then execute S3. After the request is sent to the cluster resource management module in S2, S3 continues to be executed in all three cases: no response is received within a certain time, the request to increase cluster resources is unsuccessful, or the request to increase cluster resources is successful.

[0027] S3: The service type determination module initializes the service process log file and selects the optimal slave resource scheduler based on the estimated resource consumption; if the request to add cluster resources succeeds, the added resources are allocated to the optimal slave resource scheduler. The slave resource scheduler receives the service request, and the container service module adds the request to the service request list. Specifically, S3 is as follows: The service type determination module generates the ID of this request, initializes the service process log file for this request, and sends the request ID to the request distribution module. Based on the estimated resource consumption, the request distribution module queries the cluster resource management module for the slave resource scheduler with the most available resources of each type, i.e., the optimal slave resource scheduler. The cluster resource management module returns the address of the slave resource scheduler. If the cluster resource management module receives feedback from the cloud platform that the request to add resources succeeded, it allocates the added resources to the optimal slave resource scheduler. Even if the request to add resources fails, it still returns the address of the slave resource scheduler. The slave resource scheduler receives the service request, and the container service module adds the request to the service request list. When the slave resource scheduler returns the request ID, the result of this request is returned to the client.

[0028] S4: Check whether there is a container on the local machine that can complete the corresponding service. If so, send a resource query request to the container resource management module and execute S7 directly. If not, start a new container and execute S5. The specific steps for this check in S4 are as follows: When allocating a task, find the first service without a return value for each request in the service request list. Without exceeding the existing idle resources of the slave resource scheduler, and using as many idle resources as possible, use integer programming to obtain a list of service sets for a single allocation. Use a tree search method to predict whether each service set will cause any request to have an excessively long final response time. Select the optimal service set and check, according to the container service dictionary, whether there is a container on the local machine that can complete the corresponding service.

[0029] S5: Check whether the container has free resources to meet the resource requirements of this type of service in the service set. If it does, proceed directly to S6. If not, adjust the container resources, allocate this service set to each container, update the resource table of the container resource management module, and, after the statistics are compiled, request the cluster resource management module to update the resource usage of the slave resource scheduler.

[0030] S6: After the container completes the phased service, it writes the return value to the service process record file and sends the request ID to the container service module. The container service module updates the resource table of the container resource management module according to the container service type and container name, releases the resource occupation, and returns to execute S4.

[0031] S7: The container service module completes the request and returns the execution result to the user, and the process ends.

[0032] After the container resource management module completes the resource release operation, it calculates the available and occupied amounts of each type of resource owned by the slave resource scheduler and requests an update to the resource information of the corresponding slave resource scheduler in the cluster resource management module. If the container resource management module does not receive a request to release resource occupation within a set time, it calculates the available and occupied amounts of each type of resource owned by all the slave resource schedulers. When the available amount of each type of resource is greater than the set value, it requests the cluster resource management module to reduce the resources of the corresponding slave resource scheduler.

[0033] The present invention has the following beneficial effects: The automatic scaling method for cloud-native intelligent typesetting services of the present invention uses a planning algorithm to predict the service deployment mode and a search algorithm to predict whether the service deployment mode will lead to excessively long service response time. While ensuring the timeliness of service completion, the present invention uses cloud resources reasonably and efficiently. The present invention is combined with the deployed service. Under the condition of knowing the DAG flowchart of the microservice and the estimated resource consumption and service duration, the microservice is deployed in a targeted and automated manner. At the same time, the utilization rate of cloud resources is considered, and cloud resources are reduced or expanded in a timely manner.

[0034] In this embodiment, an automatic scaling system and method for cloud-native intelligent typesetting services according to the present invention are described in detail.

[0035] As shown in Figure 1, an auto-scaling system for cloud-native intelligent typesetting services includes a master resource scheduler for receiving service requests, determining whether current cluster resources are sufficient and automatically scaling, and distributing service requests in stages; and a slave resource scheduler for receiving service requests distributed in stages, determining whether the resources within the container providing the service locally are sufficient and automatically scaling, and feeding back the request results and resource usage to the master resource scheduler.

[0036] The master resource scheduler includes: a service type determination module, used to determine the request type and service flow, assign a unique ID to the request, initialize the service flow record file, and send the request ID to the request distribution module; a service distribution module, which, upon receiving the request ID, sends a resource query request to the cluster resource management module to find the optimal slave resource scheduler within the cluster to distribute the service request; and a cluster resource management module, which records all resources in the cluster and updates the number of available and occupied resources based on feedback from the slave resource schedulers. It receives resource query requests and returns the number of available resources. It receives requests to add cluster resources, sends the request to the cloud platform, and returns a message indicating whether the resource addition was successful or not. After a certain period without receiving resource query requests, it checks the number of available and occupied resources and sends a request to reduce resources to the cloud platform.

[0037] The slave resource scheduler includes: a container service module, which receives service requests, checks if there are any containers on the local machine that can complete the corresponding service, and if not, starts a new container; if so, it sends a resource query request to the container resource management module to check if the container has enough resources allocated; if so, it sends the task to the container and sends resource occupancy information to the container resource management module; and sends resource release information after receiving feedback from the container. The container resource management module records all resources within the slave resource scheduler, receives resource query requests, returns the number of resources allocated to the container, updates the total number of available and occupied resources within the slave resource scheduler based on feedback from the container service module, and sends statistical information to the master resource scheduler.

[0038] The specific steps of an auto-scaling method for cloud-native intelligent typesetting services, as shown in Figure 3 and Figure 4, are as follows:

[0039] Step 1: After the master resource scheduler receives a user's service request, the service type determination module determines the request type based on the request type dictionary, identifies which service processes are required for this request, calculates the estimated resource consumption for this request based on the service resource consumption dictionary, and queries the cluster resource management module for cluster resource statistics. If the available quantity of each type of resource is greater than the estimated resource consumption, proceed to the next step; otherwise, send a request to the cluster resource management module to increase cluster resources (the amount added for each type of cluster resource is the difference between the two plus the set redundancy value). If no response is received within a certain time, or if the response indicates either that resources cannot be added or that resource addition was successful, continue to the next step.

[0040] Examples of variables and formulas:

[0041] Request type dictionary: {Request 1: [Service A, Service B, Service C],

[0042] Request 2: [Service D, Service E], ...};

[0043] Service resource consumption dictionary: {Service A: {CPU: 2, Memory: 3, Runtime: 5}, Service B: {CPU: 1, Memory: 4, Runtime: 2}, Service C: {CPU: 3, Memory: 2, Runtime: 3}, ...};

[0044] Resource consumption statistics: Request 1: {CPU: 6, Memory: 9};

[0045] If the cluster's idle resource statistics are: {CPU: 8, memory: 6}, then the additional cluster resource request is {memory: 3 + Δ}, where Δ is the set redundancy value.
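The arithmetic of this example can be sketched in Python as follows. This is a minimal illustration rather than the implementation of the invention: the function names `estimate_consumption` and `scale_up_request` are hypothetical, and the redundancy value Δ = 1 in the usage lines is an arbitrary assumption.

```python
# Dictionaries copied from the worked example above.
REQUEST_TYPES = {"Request 1": ["Service A", "Service B", "Service C"]}
SERVICE_COST = {
    "Service A": {"CPU": 2, "Memory": 3, "Runtime": 5},
    "Service B": {"CPU": 1, "Memory": 4, "Runtime": 2},
    "Service C": {"CPU": 3, "Memory": 2, "Runtime": 3},
}
RESOURCES = ("CPU", "Memory")


def estimate_consumption(request_type):
    """Sum the per-service CPU and memory estimates for one request type."""
    total = {r: 0 for r in RESOURCES}
    for service in REQUEST_TYPES[request_type]:
        for r in RESOURCES:
            total[r] += SERVICE_COST[service][r]
    return total


def scale_up_request(estimate, idle, redundancy):
    """Amount to add per resource: shortfall plus the redundancy value.

    A resource is skipped only when its idle amount is strictly greater
    than the estimate, following the wording of Step 1.
    """
    return {
        r: estimate[r] - idle[r] + redundancy
        for r in RESOURCES
        if not idle[r] > estimate[r]
    }


estimate = estimate_consumption("Request 1")
print(estimate)  # {'CPU': 6, 'Memory': 9}
# Assumed redundancy value: 1.
print(scale_up_request(estimate, {"CPU": 8, "Memory": 6}, redundancy=1))  # {'Memory': 4}
```

With idle resources {CPU: 8, Memory: 6}, only memory falls short, so the request contains memory alone, matching the {memory: 3 + Δ} result above.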

[0046] Step 2: The service type determination module generates an ID for this request, initializes the service flow record file for this request, and sends the request ID to the request distribution module. Based on the estimated resource consumption, the request distribution module queries the cluster resource management module for the slave resource scheduler with the most available resources of each type. The cluster resource management module returns the address of that slave resource scheduler. If the cluster resource management module receives feedback from the cloud platform that the request for additional resources succeeded, it allocates the added resources to the optimal slave resource scheduler. Even if the request to add resources fails, it still returns the address of that slave resource scheduler. Once the slave resource scheduler returns the request ID, the result of this request is returned to the client.

[0047] Examples of variables and formulas:

[0048] Service flow log file: {Request id: XXX, Service flow: [{Service A: Return value, Service time: XX}, {Service B: Return value, Service time: XX}, ....]};

[0049] Service time = running time + waiting time;

[0050] Find the optimal slave resource scheduler. Resource consumption statistics: {CPU: 6, Memory: 9};

[0051] Available resources of slave resource scheduler 1: {CPU: 4, Memory: 8};

[0052] Available resources of slave resource scheduler 2: {CPU: 3, Memory: 10};

[0053] Score of slave resource scheduler 1: 6 * 4 / 9 + 8 = 10.67;

[0054] Score of slave resource scheduler 2: 6 * 3 / 9 + 10 = 12 > 10.67;

[0055] Slave resource scheduler 2 is therefore the optimal slave resource scheduler.
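The scoring step can be sketched in Python as below. The general formula (CPU demand × available CPU / memory demand + available memory) is inferred from the two numeric lines above and is not stated explicitly in the text; the function names and the keys `slave-1` and `slave-2` are hypothetical.

```python
def score(demand, available):
    """Score inferred from the worked example:
    CPU demand * available CPU / memory demand + available memory."""
    return demand["CPU"] * available["CPU"] / demand["Memory"] + available["Memory"]


def pick_scheduler(demand, schedulers):
    """Return the name of the slave resource scheduler with the highest score."""
    return max(schedulers, key=lambda name: score(demand, schedulers[name]))


demand = {"CPU": 6, "Memory": 9}
schedulers = {
    "slave-1": {"CPU": 4, "Memory": 8},
    "slave-2": {"CPU": 3, "Memory": 10},
}
print(round(score(demand, schedulers["slave-1"]), 2))  # 10.67
print(score(demand, schedulers["slave-2"]))            # 12.0
print(pick_scheduler(demand, schedulers))              # slave-2
```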

[0056] Step 3: After the slave resource scheduler receives a service request, the container service module adds it to the service request list for unified planning. Each time a task is allocated, it finds the first service without a return value for each request in the service request list. Without exceeding the existing idle resources of the slave resource scheduler, and using as many idle resources as possible, it uses integer programming to derive a list of service sets for a single allocation. It uses a tree search method to predict whether each service set will cause any request to have an excessively long final response time, selects the optimal service set, and checks, according to the container service dictionary, whether there are containers locally that can complete the corresponding service. If not, it starts a new container. If so, it sends a resource query request to the container resource management module to check whether the container has idle resources to meet the various resource requirements of this type of service in the service set. If not, it adjusts container resources, allocates this service set to each container, updates the resource table of the container resource management module (the resource usage of each container), and, after the statistics are compiled, requests the cluster resource management module to update the resource usage of the slave resource scheduler. After the container completes the service, it writes the return value to the service process log file and sends a request ID to the container service module. The container service module updates the resource table of the container resource management module according to the container service type and container name, releases the resource usage, and periodically performs a new round of task allocation.

[0057] Examples of variables and formulas:

[0058] Service request list: {request id1: [{service A: return value}, {service B:}, ...], request id2: [{service B:}, {service C:}, ...], ...};

[0059] Container service dictionary: {Service A: container name 1, Service B: container name 2, ...};

[0060] Service set list: [[2 services A, 3 services B, 4 services C, ...], [1 service A, 2 services B, 3 services C, ...], ...];

[0061] Integer programming:

[0062] x * A (CPU consumption) + y * B (CPU consumption) + ... <= number of idle CPUs of the slave resource scheduler;

[0063] x * A (memory consumption) + y * B (memory consumption) + ... <= number of idle memory units of the slave resource scheduler, where x, y, ... are the non-negative integer numbers of services A, B, ... in the service set, and the total CPU consumption and total memory consumption should be as large as possible.
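
The two constraints can be illustrated with a minimal sketch. It swaps an integer programming solver for brute-force enumeration, which is adequate only for small counts, and it interprets "as large as possible" as keeping every feasible service set to which no further pending service can be added. The per-service figures are borrowed from the service resource consumption dictionary given later in this description; the idle resources (CPU 9, memory 11), the pending counts, and all names are assumptions.

```python
from itertools import product

# Per-service consumption, taken from the service resource consumption dictionary.
CONSUMPTION = {
    "A": {"cpu": 2, "memory": 3},
    "B": {"cpu": 1, "memory": 4},
    "C": {"cpu": 3, "memory": 2},
}


def plan_service_sets(pending, idle):
    """Return every service set that fits in `idle` and cannot take one more pending service."""
    names = sorted(pending)
    feasible = set()
    # Enumerate every count vector bounded by the number of pending services.
    for counts in product(*(range(pending[n] + 1) for n in names)):
        cpu = sum(c * CONSUMPTION[n]["cpu"] for c, n in zip(counts, names))
        memory = sum(c * CONSUMPTION[n]["memory"] for c, n in zip(counts, names))
        if cpu <= idle["cpu"] and memory <= idle["memory"]:
            feasible.add(counts)

    def is_maximal(counts):
        # Maximal: adding any single pending service would break a constraint.
        for i in range(len(counts)):
            bumped = counts[:i] + (counts[i] + 1,) + counts[i + 1:]
            if bumped in feasible:
                return False
        return True

    maximal = sorted(c for c in feasible if is_maximal(c))
    return [dict(zip(names, c)) for c in maximal]


# Assumed inputs: 3 A, 3 B and 2 C services pending; idle CPU 9, idle memory 11.
service_sets = plan_service_sets({"A": 3, "B": 3, "C": 2}, {"cpu": 9, "memory": 11})
```

Under these assumed limits, [1 service A, 1 service B, 2 services C] and [2 services A, 1 service B, 0 services C] both appear in the result. A production system would more likely hand the same constraints to a dedicated integer programming solver.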

[0064] Tree search prediction: Assume the current allocation is [1 service A, 2 services B, 3 services C, ...];

[0065] The next allocation might be [0 services A, 4 services B, 2 services C, ...]...;

[0066] Request X: [Service A, Service B, Service C, ...] will complete all services after K allocations. The total service time = A (service time) + B (service time) + ... (each request has an expected total service time).

[0067] Iterate through all the planned service sets and find the service set with the fewest requests exceeding the expected total service time as the optimal set;

[0068] Service time calculation: Request X: [Service A, Service B, Service C, ...];

[0069] A (service time) = A completion time - time at which request X was added to the service request list;

[0070] B (service time) = B completion time - A completion time;

[0071] …
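
The service time calculation can be sketched as follows. This is an illustration only: the function name, the (service, completion time) pair layout, and the timestamps are assumptions.

```python
def service_times(added_at, completions):
    """Return the time attributed to each service of one request.

    `added_at` is when the request joined the service request list;
    `completions` lists (service, completion_time) pairs in execution order.
    """
    times = {}
    previous = added_at
    for service, completed_at in completions:
        # Each service is charged the time since the previous completion,
        # or since the request was added for the first service.
        times[service] = completed_at - previous
        previous = completed_at
    return times


# Hypothetical timestamps in seconds for request X: [Service A, Service B, Service C].
times = service_times(0, [("A", 7), ("B", 9), ("C", 14)])
total = sum(times.values())
```

Because the first service is measured from the moment the request was added, any queueing delay before service A starts counts toward A's service time.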

[0072] Dynamically adjust container resources: Assume the current allocation is [1 service A, 2 services B, 3 services C, ...];

[0073] {Container Name 1: {Idle CPU: 0, Idle Memory: 0}, ...};

[0074] Then add: {container name 1: {cpu: 2, memory: 3}, ...}.
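
A minimal sketch of this adjustment, assuming each service type maps to one container as in the container service dictionary. The consumption figures come from the service resource consumption dictionary given later in this description; the function name, the container for service C, and the idle values of containers 2 and 3 are hypothetical.

```python
CONSUMPTION = {
    "A": {"cpu": 2, "memory": 3},
    "B": {"cpu": 1, "memory": 4},
    "C": {"cpu": 3, "memory": 2},
}


def required_additions(allocation, container_of, idle):
    """Return the resources to add per container so the allocation fits."""
    additions = {}
    for service, count in allocation.items():
        container = container_of[service]
        for resource in ("cpu", "memory"):
            needed = count * CONSUMPTION[service][resource]
            shortfall = needed - idle[container][resource]
            if shortfall > 0:
                additions.setdefault(container, {})[resource] = shortfall
    return additions


allocation = {"A": 1, "B": 2, "C": 3}
container_of = {"A": "container name 1", "B": "container name 2", "C": "container name 3"}
idle = {
    "container name 1": {"cpu": 0, "memory": 0},  # as in the example above
    "container name 2": {"cpu": 2, "memory": 8},  # hypothetical: already sufficient
    "container name 3": {"cpu": 5, "memory": 6},  # hypothetical: short on CPU only
}
additions = required_additions(allocation, container_of, idle)
```

For container name 1 the result is {cpu: 2, memory: 3}, which matches the amount added in the example, since one service A needs 2 CPUs and 3 memory units and the container has none idle.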

[0075] Step 4: After the container resource management module completes a resource release operation, it calculates the available and occupied amounts of each type of resource owned by the slave resource scheduler and asks the cluster resource management module to update the resource information of that slave resource scheduler. If the container resource management module receives no request to release occupied resources within a set time, it calculates the available and occupied amounts of each type of resource owned by all slave resource schedulers. When the available amount of each type of resource exceeds a set value, it asks the cluster resource management module to reduce the resources of that slave resource scheduler.

[0076] Examples of variables and formulas:

[0077] Dynamically reduce resources: Let: {Service A: {CPU: 2, Memory: 3, Runtime: 5}, Service B: {CPU: 1, Memory: 4, Runtime: 2}, Service C: {CPU: 3, Memory: 2, Runtime: 3}};

[0078] Slave resource scheduler 1: {CPU: {Used: 6, Idle: 4}, Memory: {Used: 8, Idle: 6}};

[0079] If the number of idle CPUs and the number of idle memory units in slave resource scheduler 1 remain greater than 3 and 4, respectively, throughout a set time period, a request is sent to the cluster resource management module to reduce the resources of that slave resource scheduler, with the reduction amount being {CPU: 1, memory: 2}.
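
The scale-down check can be sketched as below, assuming idle resources are sampled periodically during the set time period. The function name and the sampling scheme are assumptions; the thresholds and reduction amount are those of the example.

```python
def reduction_request(idle_samples, thresholds, reduction):
    """Return `reduction` if every sample exceeds every threshold, otherwise None."""
    if not idle_samples:
        return None  # nothing observed yet, so do not scale down
    consistently_idle = all(
        sample[resource] > limit
        for sample in idle_samples
        for resource, limit in thresholds.items()
    )
    return dict(reduction) if consistently_idle else None


THRESHOLDS = {"cpu": 3, "memory": 4}
REDUCTION = {"cpu": 1, "memory": 2}

# Idle resources of slave resource scheduler 1 observed over the period.
steady = [{"cpu": 4, "memory": 6}] * 3
# Same period, but one sample dips to 2 idle CPUs.
busy = steady + [{"cpu": 2, "memory": 6}]
```

With the `steady` samples the function returns the reduction {cpu: 1, memory: 2}; a single sample at or below a threshold, as in `busy`, suppresses the request.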

[0080] Example of a tree search process:

[0081] Service resource consumption dictionary: {Service A: {CPU: 2, Memory: 3, Runtime: 5}, Service B: {CPU: 1, Memory: 4, Runtime: 2}, Service C: {CPU: 3, Memory: 2, Runtime: 3}, ...};

[0082] Service request list: {

[0083] Request 1: [{Service A: return value}, {Service B: return value}, {Service C:}, {Service D:}],

[0084] Request 2: [{Service A: return value}, {Service B:}, {Service C:}, {Service D:}],

[0085] Request 3: [{Service B:}, {Service C:}, {Service D:}],

[0086] Request 4: [{Service A:}, {Service B:}, {Service C:}, {Service D:}],

[0087] Request 5: [{Service C:}, {Service D:}],

[0088] Request 6: [{Service A:}, {Service B:}, {Service C:}],

[0089] Request 7: [{Service B:}, {Service C:}],

[0090] Request 8: [{Service A:}, {Service B:}, {Service C:}]}.

[0091] The services currently awaiting allocation are therefore 3 A services (requests 4, 6, and 8), 3 B services (requests 2, 3, and 7), and 2 C services (requests 1 and 5).
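
These counts follow from taking the first service without a return value in each request. A minimal sketch, with the service request list encoded as (service, return value) pairs and `None` standing for a missing return value (the encoding and the function name are assumptions):

```python
DONE = "return value"

SERVICE_REQUESTS = {
    1: [("A", DONE), ("B", DONE), ("C", None), ("D", None)],
    2: [("A", DONE), ("B", None), ("C", None), ("D", None)],
    3: [("B", None), ("C", None), ("D", None)],
    4: [("A", None), ("B", None), ("C", None), ("D", None)],
    5: [("C", None), ("D", None)],
    6: [("A", None), ("B", None), ("C", None)],
    7: [("B", None), ("C", None)],
    8: [("A", None), ("B", None), ("C", None)],
}


def pending_services(requests):
    """Map each service to the requests whose first unfinished service it is."""
    pending = {}
    for request_id, services in requests.items():
        for service, return_value in services:
            if return_value is None:
                pending.setdefault(service, []).append(request_id)
                break  # only the first service without a return value counts
    return pending


pending = pending_services(SERVICE_REQUESTS)
```

The result groups requests 4, 6 and 8 under service A, requests 2, 3 and 7 under service B, and requests 1 and 5 under service C.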

[0092] The planned service set list is as follows: [[1 service A, 1 service B, 2 services C], [2 services A, 1 service B, 0 services C], [3 services A, 0 services B, 0 services C]...].

[0093] Tree search process: Assume the service set [2 services A, 1 service B, 0 services C] is selected as the root of the search tree. The two A services can be taken from requests 4 and 6, 4 and 8, or 6 and 8. Combined with the three choices for service B (request 2, 3, or 7), this gives nine possible combinations and therefore nine new sets of current service requests. Each next possible service set (subtree) is drawn from one of these nine new sets of current service requests, subject to the constraint that resource consumption does not exceed the cluster's existing resources. With each layer added to the tree, the service time of each service selected in the service set is increased by the runtime recorded in the service resource consumption dictionary, and every other service without a return value is increased by a set average service time. This continues until every service without a return value in the service request list has been selected, at which point the search reaches a leaf node. The service time of each request is calculated at every layer; if the service time of any request exceeds the set maximum service time, exploration of that branch stops and the search returns to the parent node. Finally, the search tree rooted at the service set [2 services A, 1 service B, 0 services C] is examined to determine whether it has leaf nodes and how many.
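
The search can be sketched as follows. This is a simplified illustration, not the patented procedure: elapsed times start at zero for every request, the average service time is an assumed value that is also used for services missing from the runtime table (such as service D), and the planner `one_at_a_time` is a hypothetical stand-in for the integer programming step that would normally propose the next service sets.

```python
from itertools import combinations, product

RUNTIME = {"A": 5, "B": 2, "C": 3}  # from the service resource consumption dictionary
AVERAGE_SERVICE_TIME = 3            # assumed; the description only says it is "set"


def expand(state, service_set):
    """Yield each child state reached by serving `service_set` from `state`.

    `state` is a tuple of (remaining_services, elapsed_time) pairs, one per request.
    """
    waiting = {}
    for index, (remaining, _) in enumerate(state):
        if remaining:
            waiting.setdefault(remaining[0], []).append(index)
    choices = []
    for service, count in service_set.items():
        if count > 0:
            choices.append(list(combinations(waiting.get(service, []), count)))
    for picked in product(*choices):
        chosen = {index for group in picked for index in group}
        child = []
        for index, (remaining, elapsed) in enumerate(state):
            if not remaining:
                child.append((remaining, elapsed))  # request already finished
            elif index in chosen:
                runtime = RUNTIME.get(remaining[0], AVERAGE_SERVICE_TIME)
                child.append((remaining[1:], elapsed + runtime))
            else:
                child.append((remaining, elapsed + AVERAGE_SERVICE_TIME))
        yield tuple(child)


def count_leaves(state, plan, max_time):
    """Count leaf nodes below `state`, pruning branches that exceed `max_time`."""
    if any(elapsed > max_time for _, elapsed in state):
        return 0  # a request ran too long: stop and return to the parent
    if all(not remaining for remaining, _ in state):
        return 1  # every service has been selected: leaf node
    return sum(
        count_leaves(child, plan, max_time)
        for service_set in plan(state)
        for child in expand(state, service_set)
    )


def one_at_a_time(state):
    """Hypothetical planner: propose serving one pending service per layer."""
    pending = sorted({remaining[0] for remaining, _ in state if remaining})
    return [{service: 1} for service in pending]


# Requests 1 to 8 from the example, reduced to their services without return values.
ROOT = (
    (("C", "D"), 0), (("B", "C", "D"), 0), (("B", "C", "D"), 0),
    (("A", "B", "C", "D"), 0), (("C", "D"), 0), (("A", "B", "C"), 0),
    (("B", "C"), 0), (("A", "B", "C"), 0),
)
children = list(expand(ROOT, {"A": 2, "B": 1, "C": 0}))
```

Expanding the root with [2 services A, 1 service B, 0 services C] yields the nine combinations described above. A service set whose subtree returns a non-zero leaf count from `count_leaves` has at least one allocation order in which no request exceeds the maximum service time.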

[0094] The above description is merely a specific embodiment of the present invention, but the scope of protection of the present invention is not limited thereto. Any changes or substitutions conceived without creative effort should be included within the scope of protection of the present invention.

Claims

1. An automatic scaling system for cloud-native intelligent typesetting services, characterized in that it comprises: Master resource scheduler: used to receive service requests, determine whether the current cluster resources are sufficient and automatically scale, and distribute service requests in stages; the service type determination module generates the ID of the request, initializes the service process record file of the request, and sends the request ID to the request distribution module. Based on the estimated resource consumption, the request distribution module queries the cluster resource management module for the slave resource scheduler with the most available resources of each type, i.e. the optimal slave resource scheduler. The cluster resource management module returns the address of that slave resource scheduler. Slave resource scheduler: used to receive service requests distributed in stages, determine whether the resources within the local container providing the service are sufficient and automatically scale them, and report the request results and resource usage to the master resource scheduler. Each time a task is allocated, it finds the first service without a return value for each request in the service request list. Without exceeding the available idle resources of the slave resource scheduler and using as many idle resources as possible, it uses integer programming to derive a list of service sets for a single allocation. It then uses a tree search method to predict whether each service set will cause a certain request to have an excessively long final response time, and selects the optimal service set.

2. The automatic scaling system for cloud-native intelligent typesetting services according to claim 1, characterized in that the master resource scheduler includes: Service type determination module: used to determine the request type and service process, assign a unique ID to the request, initialize the service process record file and send the request ID to the request distribution module; Request distribution module: after receiving the request ID, it sends a resource query request to the cluster resource management module to find the optimal slave resource scheduler in the cluster to which the service request is distributed; Cluster resource management module: records all resources in the cluster and updates the number of available and occupied resources based on feedback from the slave resource schedulers; receives resource query requests and returns the number of available resources; receives requests to add cluster resources, sends the request to the cloud platform, and, after receiving feedback from the cloud platform, returns a message indicating whether the resource addition succeeded or failed; after a certain period of time without receiving resource query requests, checks the number of available and occupied resources and sends a request to the cloud platform to reduce resources.

3. An automatic scaling system for cloud-native intelligent typesetting services according to claim 1 or 2, characterized in that the slave resource scheduler includes: Container service module: receives service requests and checks whether there are containers on the local machine that can complete the corresponding service. If not, it starts a new container. If there are, it sends a resource query request to the container resource management module to check whether the container has enough resources. If so, it sends the task to the container and sends resource usage information to the container resource management module. After receiving feedback from the container, it sends resource release information. Container resource management module: records all resources within the slave resource scheduler, receives resource query requests, returns the number of resources allocated to containers, updates the total number of available and occupied resources within the slave resource scheduler based on feedback from the container service module, and sends statistical information to the master resource scheduler.

4. An automatic scaling method for cloud-native intelligent typesetting services, applicable to the automatic scaling system for cloud-native intelligent typesetting services as described in any one of claims 1-3, characterized in that it includes the following steps: S1: The master resource scheduler receives service requests from users, determines the service process based on the request type, and compiles the estimated resource consumption of the request and the idle resources of the cluster. S2: If the estimated resource consumption is less than the cluster's idle resources, execute S3 directly; otherwise, send a request to the cluster resource management module to increase cluster resources, and then execute S3. S3: The service type determination module initializes the service process record file, selects the optimal slave resource scheduler based on the estimated resource consumption, and, after the request to add cluster resources succeeds, allocates the added resources to the optimal slave resource scheduler. The slave resource scheduler receives the service request, and the container service module adds the request to the service request list for planning. S4: Check whether there is a local container that can complete the corresponding service. If there is, send a resource query request to the container resource management module and execute S7 directly. If not, start a new container and execute S5. S5: Check whether the container has free resources to meet the resource requirements of this type of service in the service set. If it does, proceed directly to S6. If not, adjust the container resources, allocate this service set to each container, update the resource table of the container resource management module, and, after aggregating the totals, request the cluster resource management module to update the resource usage of the slave resource scheduler. S6: After the container completes the phased service, it writes the return value to the service process record file and sends the request ID to the container service module. The container service module updates the resource table of the container resource management module according to the container service type and container name, releases the occupied resources, and returns to execute S4. S7: The container service module completes the request and returns the execution result to the user, and the process ends.

5. The automatic scaling method for cloud-native intelligent typesetting services according to claim 4, characterized in that S1 specifically comprises: after the master resource scheduler receives a user's service request, the service type determination module determines the request type based on the request type dictionary, determines the service process required for this request, calculates the estimated resource consumption for this request based on the service resource consumption dictionary, and queries the cluster resource management module for cluster resource statistics to obtain the cluster's idle resources.

6. The automatic scaling method for cloud-native intelligent typesetting services according to claim 4, characterized in that, after a request to add cluster resources is sent to the cluster resource management module in step S2, step S3 continues to be executed whether no response is received within a certain period of time, the request to add cluster resources is unsuccessful, or the request to add cluster resources is successful.

7. The automatic scaling method for cloud-native intelligent typesetting services according to claim 4, characterized in that S3 is specifically as follows: if the cluster resource management module receives feedback from the cloud platform on the request to add resources that it sent, it allocates the resources to the optimal slave resource scheduler. Even if the request to add resources fails, it still returns the address of the slave resource scheduler. The slave resource scheduler receives the service request, and the container service module adds the request to the service request list. When the slave resource scheduler returns the request ID, the result of this request is returned to the client.

8. The automatic scaling method for cloud-native intelligent typesetting services according to claim 4, characterized in that, in step S4, checking whether there is a local container that can complete the corresponding service is specifically performed based on the container service dictionary.

9. An automatic scaling method for cloud-native intelligent typesetting services according to any one of claims 4-8, characterized in that, after the container resource management module completes the resource release operation, it calculates the available and occupied amounts of each type of resource owned by the slave resource scheduler and requests the cluster resource management module to update the resource information of that slave resource scheduler. If the container resource management module does not receive a request to release occupied resources within a set time, it calculates the available and occupied amounts of each type of resource owned by all slave resource schedulers. When the available amount of each type of resource is greater than a set value, it requests the cluster resource management module to reduce the resources of the slave resource scheduler.

Citation Information

Patent Citations

CN115480900A
