Micro-service deployment method, device and equipment in cloud-native environment and medium

By acquiring multi-dimensional data from microservices to determine popularity metrics and making dynamic adjustments, the problem of unbalanced load in cloud-native microservices is solved, enabling efficient resource scheduling and low-latency deployment across regions, thereby improving system stability and user experience.

CN122248052A · Pending · Publication Date: 2026-06-19 · LIAONING MOBILE COMM +1

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Applications(China)
Current Assignee / Owner: LIAONING MOBILE COMM
Filing Date: 2026-02-24
Publication Date: 2026-06-19

Application Information

IPC: H04L67/51; H04L41/0896

AI Tagging

Application Domain

Transmission

AI Technical Summary

Technical Problem

In cloud-native microservice deployments, high-demand microservices require frequent instance expansion, leading to increased load pressure, while low-demand microservices suffer from insufficient resource utilization. Cross-regional deployments face complex differences in demand and latency requirements. How to dynamically adjust service deployments to balance load distribution and reduce latency and cost of cross-regional calls is a pressing technical challenge that needs to be addressed.

Method used

Multi-dimensional service data of the microservices is acquired to determine popularity indicators, and instances are dynamically scaled up or down when the adjustment conditions are met. Neural network models can also predict the timing and magnitude of adjustments, and load balancing and cluster analysis can optimize deployment strategies, thereby achieving precise elastic scheduling across regions and availability zones.

Benefits of technology

It enables precise and elastic scheduling of microservice resources across regions and availability zones, improving resource utilization efficiency, reducing cross-domain access latency and costs, while ensuring service performance and response time.

✦ Generated by Eureka AI based on patent content.

Abstract

This disclosure provides a method, apparatus, device, and medium for deploying microservices in a cloud-native environment. The method includes: acquiring multi-dimensional service data of the microservices; determining a popularity index of the microservices based on the multi-dimensional service data; wherein the popularity index is used to indicate the overall load pressure and / or resource usage of the microservices; and dynamically adjusting the instances of the microservices when the microservices are determined to meet the conditions for dynamic instance adjustment based on the popularity index; wherein the dynamic instance adjustment includes adding or reducing instances.

Description

Technical Field

[0001] This disclosure relates to the field of information technology, and more specifically, to a method, apparatus, device, and medium for deploying microservices in a cloud-native environment.

Background Technology

[0002] Cloud-native architecture is a modern IT architecture based on containerization technology, designed to enable efficient deployment and flexible management of applications through microservice architecture, dynamic scheduling and automated management.

[0003] In cloud-native microservice deployments, popularity reflects the activity and load of microservices. High-popularity microservices require frequent instance scaling to handle peak demand, which increases pressure on instances, lengthens response times, and degrades performance. Conversely, low-popularity microservices carry lower loads but may underutilize network resources, resulting in inefficient allocation of other resources. Cross-regional and cross-availability-zone deployments face even more complex differences in popularity and latency requirements, because user activity times and access frequencies differ significantly across regions. Dynamically adjusting service deployment based on regional popularity to balance load distribution, meet the access needs of users in different regions, and simultaneously ensure service availability and performance is a pressing technical challenge. Furthermore, cross-regional deployments incur network transmission costs and latency, so factors such as network quality and the distance between regions must be considered together when planning the deployment layout. Using models such as proximity access can reduce the latency and cost of cross-regional calls, thereby improving user experience.

Summary of the Invention

[0004] This disclosure provides at least one method, apparatus, device, and medium for deploying microservices in a cloud-native environment.

[0005] In a first aspect, embodiments of this disclosure provide a method for deploying microservices in a cloud-native environment, the method comprising: obtaining multi-dimensional service data of a microservice; determining a popularity index of the microservice based on the multi-dimensional service data, wherein the popularity index indicates the overall load pressure and / or resource usage of the microservice; and, when it is determined based on the popularity index that the microservice meets the conditions for dynamic instance adjustment, dynamically adjusting the instances of the microservice, wherein the dynamic instance adjustment includes adding or reducing instances.

[0006] In one optional implementation, the popularity index includes a real-time popularity index, which indicates the current resource usage of the microservice. Dynamically adjusting the instances of the microservice then includes: when the real-time popularity index of the microservice is determined to be higher than a first popularity index threshold, generating a first alarm message and scaling up the instances of the microservice; and, when the real-time popularity index of the microservice is determined to be lower than a second popularity index threshold, generating a second alarm message and scaling down the instances of the microservice.

[0007] In one optional implementation, the popularity index includes a comprehensive popularity index, which indicates the overall load pressure and / or resource usage of the microservice. Dynamically adjusting the instances of the microservice then includes: predicting the future popularity trend of the microservice based on the comprehensive popularity index; when the trend is determined to be an increase in the comprehensive popularity index, scaling up the instances of the microservice; and, when the trend is determined to be a decrease in the comprehensive popularity index, scaling down the instances of the microservice.

[0008] In one optional implementation, dynamically adjusting the instances of the microservice includes: determining the popularity level of the microservice based on the popularity index; determining a microservice deployment strategy that matches the popularity level, wherein the microservice deployment strategy indicates how the instances of the microservice are to be adjusted; and adjusting the instances of the microservice based on the microservice deployment strategy.

[0009] In one optional implementation, adjusting the instances of the microservice based on the microservice deployment strategy includes: when the microservice deployment strategy indicates that the number of instances of the microservice is to be increased, determining a comprehensive load score for each of the multiple servers on which the microservice has been deployed, wherein the comprehensive load score indicates the load status of the corresponding server; determining a scheduling priority score for the newly added target instance of the microservice; and scheduling and deploying the target instance by combining the comprehensive load score and the scheduling priority score.

[0010] In one optional implementation, scheduling and deploying the target instance by combining the comprehensive load score and the scheduling priority score includes: deploying the target instance on the server with the lowest comprehensive load score; or, if the comprehensive load scores of the servers are the same, deploying the target instance on the server with the highest priority.
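
The placement rule in [0010] can be sketched as a single ordered comparison. This is a minimal illustration, not the disclosed implementation: the `Server` type, its field names, and the choice to attach the priority to the server are assumptions, since the source does not say how the scheduling priority score is computed.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Server:
    name: str
    load_score: float  # comprehensive load score; lower means less loaded
    priority: int      # hypothetical priority; a larger value wins a tie


def pick_server(servers: Sequence[Server]) -> Server:
    """Pick the least-loaded server, breaking ties by highest priority."""
    if not servers:
        raise ValueError("no candidate servers")
    # Sort key: load ascending first, then priority descending.
    return min(servers, key=lambda s: (s.load_score, -s.priority))


candidates = [
    Server("node-a", load_score=0.30, priority=1),
    Server("node-b", load_score=0.30, priority=5),
    Server("node-c", load_score=0.70, priority=9),
]
print(pick_server(candidates).name)
```

Here `node-a` and `node-b` tie on load, so the higher priority decides. `node-c` is never chosen despite its priority, because the load score is compared first.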

[0011] In one optional implementation, dynamically adjusting the instances of the microservice includes: when it is determined, based on the popularity index of the microservice, that the microservice meets the conditions for dynamic instance adjustment, using a neural network model to predict the timing and magnitude of the instance adjustment; and dynamically adjusting the instances of the microservice according to the predicted timing and magnitude.

[0012] In one optional implementation, the method further includes: after receiving a target service request, obtaining the load status data of each instance of the microservice and the number of requests waiting to be processed by each instance; determining the load balancing weight of each instance based on the load status data; and assigning the target service request to the instance with the highest load balancing weight and the smallest number of requests waiting to be processed.
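
The request routing in [0012] might look like the following sketch. The weight formula (one minus the mean of CPU and memory utilization) is an assumption made for illustration, because the source does not define how the weight is derived from the load status data. The `Instance` fields are likewise hypothetical.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Instance:
    name: str
    cpu: float      # CPU utilization in the range 0..1
    memory: float   # memory utilization in the range 0..1
    pending: int    # requests waiting to be processed


def balancing_weight(instance: Instance) -> float:
    """Assumed weight: the idler the instance, the higher its weight."""
    return 1.0 - (instance.cpu + instance.memory) / 2.0


def route(instances: Sequence[Instance]) -> Instance:
    """Highest weight first; among equal weights, fewest pending requests."""
    if not instances:
        raise ValueError("no instances available")
    return min(instances, key=lambda i: (-balancing_weight(i), i.pending))


pool = [
    Instance("orders-1", cpu=0.5, memory=0.5, pending=10),
    Instance("orders-2", cpu=0.5, memory=0.5, pending=3),
    Instance("orders-3", cpu=0.9, memory=0.9, pending=0),
]
print(route(pool).name)
```

The two criteria are applied lexicographically, which is one reading of "highest weight and smallest number of requests". A combined score would be an equally plausible design.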

[0013] In one optional implementation, the method further includes: obtaining response time and network latency data for the instances belonging to each region and / or each availability zone; calculating the average response time and average network latency of the instances in each region and / or each availability zone; performing cluster analysis on the instances belonging to each region and / or each availability zone, based on the average response time and the average network latency, to obtain latency instance clusters, wherein a latency instance cluster includes instances of microservices deployed in different locations whose latency exceeds a predetermined latency threshold; and deploying the instances in the latency instance cluster to regions and / or availability zones that meet preset distance and latency requirements.
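
The first two steps of [0013] can be sketched as per-zone averaging followed by a threshold filter. The filter is a deliberate simplification that stands in for the cluster analysis, which the source does not specify. The sample layout, zone names, and threshold value are hypothetical.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, Iterable, List, Tuple

Sample = Tuple[str, float, float]  # (zone, response_time_ms, network_latency_ms)


def zone_averages(samples: Iterable[Sample]) -> Dict[str, Tuple[float, float]]:
    """Average response time and network latency per region or zone."""
    by_zone: Dict[str, List[Tuple[float, float]]] = defaultdict(list)
    for zone, response_ms, latency_ms in samples:
        by_zone[zone].append((response_ms, latency_ms))
    return {
        zone: (mean(r for r, _ in rows), mean(l for _, l in rows))
        for zone, rows in by_zone.items()
    }


def high_latency_zones(averages: Dict[str, Tuple[float, float]],
                       latency_threshold_ms: float) -> List[str]:
    """Zones whose average network latency exceeds the threshold."""
    return sorted(z for z, (_, lat) in averages.items() if lat > latency_threshold_ms)


samples = [
    ("az-a", 100.0, 10.0), ("az-a", 120.0, 20.0),
    ("az-b", 300.0, 150.0), ("az-b", 500.0, 250.0),
]
averages = zone_averages(samples)
print(high_latency_zones(averages, latency_threshold_ms=100.0))
```

Instances in the flagged zones are the candidates for redeployment. Choosing the target zone under the distance and latency requirements is left out of this sketch.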

[0014] In one optional implementation, the method further includes: clustering the microservices into a first cluster group and a second cluster group based on response time and request volume, wherein the response time of the first cluster group is shorter than that of the second cluster group, and the request volume of the first cluster group is higher than that of the second cluster group; and scheduling the microservices in the first cluster group to high-performance nodes and the microservices in the second cluster group to low-performance nodes.
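
One way to obtain the two groups in [0014] is a two-centroid k-means over response time and request volume. The source does not name a clustering algorithm, so k-means, the min-max scaling, and the seeding by request volume are all assumptions of this sketch. It also does not guarantee that the busier group has the shorter response time, which a real scheduler would need to check.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]  # (response_time_ms, requests_per_second)


def _scale(values: List[float]) -> List[float]:
    """Min-max scale to 0..1 so both features weigh equally."""
    lo, hi = min(values), max(values)
    return [0.0 if hi == lo else (v - lo) / (hi - lo) for v in values]


def _centroid(points: List[Point]) -> Point:
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def _dist2(a: Point, b: Point) -> float:
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2


def two_groups(services: Dict[str, Point],
               iterations: int = 20) -> Tuple[List[str], List[str]]:
    """Return (first_group, second_group); the first is seeded by the busiest service."""
    names = sorted(services)
    if len(names) < 2:
        raise ValueError("need at least two services")
    scaled_rt = _scale([services[n][0] for n in names])
    scaled_rps = _scale([services[n][1] for n in names])
    points = dict(zip(names, zip(scaled_rt, scaled_rps)))
    # Deterministic seeds: the busiest and the quietest service.
    first_c = points[max(names, key=lambda n: services[n][1])]
    second_c = points[min(names, key=lambda n: services[n][1])]
    first: List[str] = []
    second: List[str] = []
    for _ in range(iterations):
        first = [n for n in names
                 if _dist2(points[n], first_c) <= _dist2(points[n], second_c)]
        second = [n for n in names if n not in first]
        if not first or not second:
            break
        new_first = _centroid([points[n] for n in first])
        new_second = _centroid([points[n] for n in second])
        if (new_first, new_second) == (first_c, second_c):
            break  # assignments are stable
        first_c, second_c = new_first, new_second
    return first, second


demo = {
    "auth": (20.0, 900.0), "search": (25.0, 1000.0),
    "report": (400.0, 30.0), "export": (500.0, 10.0),
}
first_group, second_group = two_groups(demo)
print("high-performance nodes:", first_group)
print("low-performance nodes:", second_group)
```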

[0015] In a second aspect, embodiments of this disclosure provide a microservice deployment apparatus in a cloud-native environment, the apparatus comprising: an acquisition unit configured to acquire multi-dimensional service data of a microservice; a determining unit configured to determine the popularity index of the microservice based on the multi-dimensional service data, wherein the popularity index indicates the overall load pressure and / or resource usage of the microservice; and an adjustment unit configured to dynamically adjust the instances of the microservice when it is determined, based on the popularity index, that the microservice meets the conditions for dynamic instance adjustment, wherein the dynamic instance adjustment includes adding or reducing instances.

[0016] In a third aspect, embodiments of this disclosure also provide an electronic device, including a processor, a memory, and a bus. The memory stores machine-readable instructions executable by the processor. When the electronic device is running, the processor communicates with the memory via the bus, and when the machine-readable instructions are executed by the processor, the steps of the first aspect above, or of any possible implementation of the first aspect, are performed.

[0017] In a fourth aspect, embodiments of this disclosure also provide a computer-readable storage medium storing a computer program that, when executed by a processor, performs the steps of the first aspect or of any possible implementation of the first aspect.

[0018] In a fifth aspect, embodiments of this disclosure also provide a computer program product, including a computer program that, when executed by a processor, implements the steps of the first aspect described above, or of any possible implementation of the first aspect.

[0019] This disclosure provides a method, apparatus, device, and medium for deploying microservices in a cloud-native environment. First, multi-dimensional service data of the microservices is acquired. Then, a popularity index for the microservices is determined based on the multi-dimensional service data. The popularity index indicates the overall load pressure and / or resource usage of the microservices. Finally, when it is determined based on the popularity index that the microservices meet the conditions for dynamic instance adjustment, the instances of the microservices are dynamically adjusted. The dynamic instance adjustment includes adding or reducing instances.

[0020] In the above implementation, by acquiring and analyzing multi-dimensional service data of microservices in real time, a popularity index reflecting the overall load and resource usage status is constructed, and the dynamic scaling of instances is triggered accordingly. This enables precise and elastic scheduling of microservice resources in cross-region and cross-availability-zone environments, improving resource utilization efficiency and reducing cross-domain access latency and costs while ensuring service performance and response time.

[0021] To make the above-mentioned objects, features and advantages of this disclosure more apparent and understandable, preferred embodiments are described below in detail with reference to the accompanying drawings.

Brief Description of the Drawings

[0022] To more clearly illustrate the technical solutions of the embodiments of this disclosure, the accompanying drawings used in the embodiments will be briefly described below. These drawings are incorporated in and constitute a part of this specification. They illustrate embodiments conforming to this disclosure and, together with the specification, serve to explain the technical solutions of this disclosure. It should be understood that the following drawings only show some embodiments of this disclosure and should not be considered as limiting the scope. Those skilled in the art can obtain other related drawings based on these drawings without creative effort.

[0023] Figure 1 shows a flowchart of a microservice deployment method in a cloud-native environment provided by an embodiment of this disclosure; Figure 2 shows a schematic diagram of a microservice deployment apparatus in a cloud-native environment provided by an embodiment of this disclosure; Figure 3 shows a schematic diagram of an electronic device provided by an embodiment of this disclosure.

Detailed Implementation

[0024] To make the objectives, technical solutions, and advantages of the embodiments of this disclosure clearer, the technical solutions of the embodiments of this disclosure will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only a part of the embodiments of this disclosure, and not all of them. The components of the embodiments of this disclosure described and shown in the accompanying drawings can generally be arranged and designed in various different configurations. Therefore, the following detailed description of the embodiments of this disclosure provided in the accompanying drawings is not intended to limit the scope of the claimed disclosure, but merely represents selected embodiments of this disclosure. All other embodiments obtained by those skilled in the art based on the embodiments of this disclosure without inventive effort are within the scope of protection of this disclosure.

[0025] It should be noted that similar labels and letters in the following figures indicate similar items. Therefore, once an item is defined in one figure, it does not need to be further defined and explained in subsequent figures.

[0026] In this document, the term "and / or" merely describes an association relationship, indicating that three relationships can exist. For example, A and / or B can represent three cases: A alone, both A and B, and B alone. Furthermore, the term "at least one" in this document means any one of multiple elements or any combination of at least two of them. For example, including at least one of A, B, and C can mean including any one or more elements selected from the set consisting of A, B, and C.

[0027] Research has shown that cloud-native architecture is a modern IT architecture based on containerization technology, designed to enable efficient deployment and flexible management of applications through microservice architecture, dynamic scheduling, and automated management.

[0028] In cloud-native microservice deployments, popularity reflects the activity and load of microservices. High-popularity microservices require frequent instance scaling to handle peak demand, which increases pressure on instances, lengthens response times, and degrades performance. Conversely, low-popularity microservices carry lower loads but may underutilize network resources, resulting in inefficient allocation of other resources. Cross-regional and cross-availability-zone deployments face even more complex differences in popularity and latency requirements, because user activity times and access frequencies differ significantly across regions. Dynamically adjusting service deployment based on regional popularity to balance load distribution, meet the access needs of users in different regions, and simultaneously ensure service availability and performance is a pressing technical challenge. Furthermore, cross-regional deployments incur network transmission costs and latency, so factors such as network quality and the distance between regions must be considered together when planning the deployment layout. Using models such as proximity access can reduce the latency and cost of cross-regional calls, thereby improving user experience.

[0029] Based on the above research, this disclosure provides a microservice deployment method, apparatus, device, and medium in a cloud-native environment. First, multi-dimensional service data of the microservice is acquired. Then, a popularity index of the microservice is determined based on the multi-dimensional service data, wherein the popularity index indicates the overall load pressure and / or resource usage of the microservice. Finally, when it is determined based on the popularity index that the microservice meets the conditions for dynamic instance adjustment, the instances of the microservice are dynamically adjusted, wherein the dynamic instance adjustment includes adding or reducing instances.

[0030] In the above implementation, by acquiring and analyzing multi-dimensional service data of microservices in real time, a popularity index reflecting the overall load and resource usage status is constructed, and the dynamic scaling of instances is triggered accordingly. This enables precise and elastic scheduling of microservice resources in cross-region and cross-availability-zone environments, improving resource utilization efficiency and reducing cross-domain access latency and costs while ensuring service performance and response time.

[0031] To facilitate understanding of this embodiment, a microservice deployment method in a cloud-native environment disclosed in this disclosure will first be described in detail. The execution subject of the microservice deployment method in a cloud-native environment provided in this disclosure is generally an electronic device with a certain computing power. In some possible implementations, this microservice deployment method in a cloud-native environment can be implemented by a processor calling computer-readable instructions stored in memory.

[0032] Figure 1 shows a flowchart of a microservice deployment method in a cloud-native environment provided by an embodiment of this disclosure. The method includes steps S101 to S103, as follows. S101: Obtain multi-dimensional service data for microservices.

[0033] The multi-dimensional service data includes: request volume, CPU and memory usage, response time, throughput, user activity, network bandwidth, and error rate.

[0034] Specifically, multi-dimensional monitoring data from various microservices is collected, including metrics such as request volume, CPU utilization, memory utilization, response time, throughput, user activity, network bandwidth, and error rate. This data is then aggregated into a centralized monitoring platform. The collected data undergoes preprocessing and cleaning to remove outliers and invalid data, ensuring accuracy and completeness and providing a reliable data foundation for subsequent popularity calculations.

[0035] In a microservice architecture, the ELK stack (Elasticsearch, Logstash, Kibana) can be pre-deployed as a log collection component to collect and centrally store runtime logs from the microservice instances in real time. ELK is chosen because it provides rich log analysis and visualization capabilities, which make it easier to parse log content and extract key metrics such as request volume and error rate. Prometheus is used as the monitoring system to collect and aggregate runtime performance metrics of the microservice instances. The Prometheus Node Exporter collects host-level resource usage such as CPU and memory, while the JMX Exporter collects JVM metrics for Java applications.

[0036] S102: Determine the popularity index of the microservice based on the multi-dimensional service data; wherein, the popularity index is used to indicate the overall load pressure and / or resource usage of the microservice.

[0037] Here, a weighted average of the monitoring data across all dimensions can be used to calculate the popularity index of each microservice. The popularity index quantitatively reflects the load pressure and / or resource usage of the microservice.

[0038] The importance of the different indicators is assessed quantitatively, and sensitivity analysis and iterative optimization are used to determine the weight combination. Higher weights are assigned to key indicators such as request volume, CPU utilization, and memory utilization, while lower weights are assigned to secondary indicators such as response time and error rate. Each indicator value is multiplied by its weight coefficient, the products are summed, and the sum is divided by the total weight to obtain the popularity index. This index quantitatively reflects the overall load pressure and / or resource utilization of each microservice. Let the weights of the eight indicators (request volume, CPU utilization, memory utilization, response time, throughput, user activity, network bandwidth, and error rate) be w1, w2, w3, w4, w5, w6, w7, and w8, respectively, and the corresponding indicator data be x1, x2, x3, x4, x5, x6, x7, and x8, respectively. Then the popularity index H is calculated as: H = (w1×x1 + w2×x2 + w3×x3 + w4×x4 + w5×x5 + w6×x6 + w7×x7 + w8×x8) / (w1 + w2 + w3 + w4 + w5 + w6 + w7 + w8).
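
The weighted average H in [0038] translates directly into code. In this sketch the metric key names, the example weights, and the example readings are hypothetical. It also assumes each reading has already been normalized to the range 0..1, which the source does not state but which is needed for quantities with different units to be averaged meaningfully.

```python
from typing import Mapping

# The eight dimensions listed in the source; the key names are illustrative.
METRICS = (
    "request_volume", "cpu_utilization", "memory_utilization", "response_time",
    "throughput", "user_activity", "network_bandwidth", "error_rate",
)


def popularity_index(values: Mapping[str, float],
                     weights: Mapping[str, float]) -> float:
    """Return H = sum(w_i * x_i) / sum(w_i) over the eight metrics."""
    total_weight = sum(weights[m] for m in METRICS)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(weights[m] * values[m] for m in METRICS) / total_weight


# Hypothetical weights: key indicators weigh more than secondary ones.
example_weights = {
    "request_volume": 3.0, "cpu_utilization": 3.0, "memory_utilization": 3.0,
    "response_time": 1.0, "throughput": 2.0, "user_activity": 2.0,
    "network_bandwidth": 2.0, "error_rate": 1.0,
}
# Hypothetical readings, each assumed to be normalized to 0..1 beforehand.
example_values = {
    "request_volume": 0.9, "cpu_utilization": 0.8, "memory_utilization": 0.7,
    "response_time": 0.4, "throughput": 0.6, "user_activity": 0.5,
    "network_bandwidth": 0.3, "error_rate": 0.1,
}
print(round(popularity_index(example_values, example_weights), 3))
```

Because the sum is divided by the total weight, H stays in the same 0..1 range as the inputs, so one pair of thresholds can be applied regardless of how the weights are tuned.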

[0039] S103: When it is determined that the microservice meets the conditions for dynamic instance adjustment based on the microservice's popularity index, the instances of the microservice are dynamically adjusted; wherein, dynamic instance adjustment includes adding or reducing instances.

[0040] After determining the popularity metric for a microservice, it is possible to determine whether the microservice meets the conditions for dynamic instance adjustment based on the popularity metric. If the conditions for dynamic instance adjustment are met, the instance deployment of the microservice is dynamically adjusted, such as adding or reducing instances.

[0041] In this embodiment, when the popularity index includes a real-time popularity index, step S103 (dynamically adjusting the instances of the microservice when it is determined, based on the popularity index, that the microservice meets the conditions for dynamic instance adjustment) specifically includes the following steps. Step S11: When the real-time popularity index of the microservice is determined to be higher than the first popularity index threshold, a first alarm message is generated and the instances of the microservice are scaled up; the real-time popularity index indicates the current resource usage of the microservice. Step S12: When the real-time popularity index of the microservice is determined to be lower than the second popularity index threshold, a second alarm message is generated and the instances of the microservice are scaled down.

[0042] Here, multi-dimensional service data obtained from log collection and performance monitoring can be transmitted in real time to streaming computing engines such as Spark Streaming or Flink through message queue components such as Kafka. The data is then cleaned, transformed, and aggregated in real time to calculate the real-time popularity metrics of each microservice.

[0043] After determining the real-time popularity metric, the popularity level of microservices can be judged and identified in real time by setting threshold rules for the real-time popularity metric. When the real-time popularity metric of a microservice exceeds the preset upper threshold (i.e., the first popularity metric threshold) or falls below the lower threshold (i.e., the second popularity metric threshold), an alarm mechanism is triggered.

[0044] For example, when the real-time popularity metric is determined to be higher than the first popularity metric threshold, a first alarm message is generated; when the real-time popularity metric is determined to be lower than the second popularity metric threshold, a second alarm message is generated. Here, alarm notifications can be pushed to operations and maintenance personnel in real time via email, SMS, DingTalk messages, etc., to indicate the abnormal real-time popularity.

[0045] In this embodiment, different thresholds (i.e., the first popularity index threshold and the second popularity index threshold) are set for different microservices based on their business importance and priority. For example, core business microservices are given a higher scale-up threshold (the first popularity index threshold) and a lower scale-down threshold (the second popularity index threshold), while the threshold requirements for non-critical business microservices are appropriately relaxed.
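
Steps S11 and S12, together with the per-service thresholds of [0045], reduce to a two-sided comparison. The sketch below is illustrative only: the service names, threshold values, and alarm wording are hypothetical, and delivery of the alarm (email, SMS, DingTalk) is out of scope.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class Thresholds:
    scale_up: float    # first popularity index threshold
    scale_down: float  # second popularity index threshold


# Hypothetical per-service thresholds, set by business importance.
THRESHOLDS: Dict[str, Thresholds] = {
    "payment": Thresholds(scale_up=0.85, scale_down=0.15),    # core business
    "reporting": Thresholds(scale_up=0.75, scale_down=0.25),  # non-critical
}


def decide(service: str, real_time_index: float) -> Tuple[str, Optional[str]]:
    """Return (action, alarm message) for one real-time popularity reading."""
    limits = THRESHOLDS[service]
    if real_time_index > limits.scale_up:
        return "scale_up", f"first alarm: {service} popularity {real_time_index:.2f} is high"
    if real_time_index < limits.scale_down:
        return "scale_down", f"second alarm: {service} popularity {real_time_index:.2f} is low"
    return "hold", None


for reading in (0.90, 0.50, 0.10):
    print(decide("payment", reading))
```

Keeping a gap between the two thresholds gives a dead band in which no action is taken, which avoids flapping between scale-up and scale-down when the index hovers near a single cut-off.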

[0046] If the popularity remains abnormal for a certain period, the first alarm is automatically escalated to a high priority level so that the operations team handles the issue promptly. The real-time popularity indexes obtained through streaming computation are persisted and managed in time-series databases such as InfluxDB or OpenTSDB. Data visualization tools such as Grafana are then used to build real-time monitoring dashboards and provide historical trend analysis, displaying changes in the popularity index intuitively.
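
The escalation rule in [0046] can be approximated by counting consecutive readings above the scale-up threshold. The source only says "a certain period", so the sample-count window and the function name are assumptions of this sketch.

```python
from typing import Optional, Sequence


def alarm_priority(readings: Sequence[float], threshold: float,
                   persist: int) -> Optional[str]:
    """Return None, "normal", or "high" for the first (scale-up) alarm.

    "high" means the last `persist` readings all exceeded the threshold.
    """
    if persist < 1:
        raise ValueError("persist must be at least 1")
    if not readings or readings[-1] <= threshold:
        return None  # the latest reading is not abnormal
    recent = readings[-persist:]
    if len(recent) == persist and all(r > threshold for r in recent):
        return "high"
    return "normal"


print(alarm_priority([0.50, 0.90], threshold=0.8, persist=3))
print(alarm_priority([0.90, 0.90, 0.95], threshold=0.8, persist=3))
```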

[0047] In this embodiment, when the popularity index includes a comprehensive popularity index, step S103 (dynamically adjusting the instances of the microservice when it is determined, based on the popularity index, that the microservice meets the conditions for dynamic instance adjustment) specifically includes the following steps. Step S21: Predict the future popularity trend of the microservice based on the comprehensive popularity index. Step S22: When the trend is determined to be an increase in the comprehensive popularity index, the instances of the microservice are scaled up; the comprehensive popularity index indicates the overall load pressure and / or resource usage of the microservice. Step S23: When the trend is determined to be a decrease in the comprehensive popularity index, the instances of the microservice are scaled down.

[0048] Here, the comprehensive popularity index combines the real-time popularity indexes calculated from multi-dimensional service data at historical moments with the real-time popularity index calculated from multi-dimensional service data at the current moment. After the comprehensive popularity index is determined, machine learning algorithms such as LSTM (Long Short-Term Memory) or Prophet can be used to predict and analyze the popularity trend of the microservice.

[0049] When the trend of popularity indicates an upward trend in the overall popularity index, and it can be predicted that the popularity will continue to grow in the future, an automatic scaling-up mechanism is triggered to increase the number of instances of the microservice to cope with high concurrency requests. When the trend of popularity indicates a downward trend in the overall popularity index, and it can be predicted that the popularity will remain low in the future, an automatic scaling-down mechanism is triggered to reduce the number of instances of the microservice to save resource consumption.
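
The trend-driven decision of steps S21 to S23 can be illustrated with a deliberately simple forecaster. The source names LSTM and Prophet for the prediction; the sketch below swaps in an ordinary least-squares line so that it stays self-contained. The look-ahead horizon and the 5% tolerance band are assumptions.

```python
from typing import Sequence


def linear_forecast(history: Sequence[float], steps_ahead: int) -> float:
    """Extrapolate a least-squares line fitted to equally spaced samples."""
    n = len(history)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = (n - 1) / 2.0
    mean_y = sum(history) / n
    covariance = sum((t - mean_t) * (y - mean_y) for t, y in enumerate(history))
    variance = sum((t - mean_t) ** 2 for t in range(n))
    slope = covariance / variance
    intercept = mean_y - slope * mean_t
    return intercept + slope * (n - 1 + steps_ahead)


def trend_action(history: Sequence[float], steps_ahead: int = 3,
                 tolerance: float = 0.05) -> str:
    """Scale up or down when the forecast leaves a band around the latest value."""
    predicted = linear_forecast(history, steps_ahead)
    current = history[-1]
    if predicted > current * (1 + tolerance):
        return "scale_up"
    if predicted < current * (1 - tolerance):
        return "scale_down"
    return "hold"


print(trend_action([0.2, 0.3, 0.4, 0.5]))
print(trend_action([0.8, 0.7, 0.6, 0.5]))
print(trend_action([0.5, 0.5, 0.5, 0.5]))
```

A linear fit cannot capture daily or weekly seasonality, which is the main reason a production system would prefer the models named in the source.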

[0050] In practical implementation, instrumentation points can be embedded in the microservice code to collect core metrics such as request volume, response time, and error rate. This data is reported to a time-series database (such as Prometheus) at fixed intervals (e.g., every 30 seconds) and retained for one month, providing historical data for microservice popularity prediction. In addition, statistical methods such as the Z-score and GESD (generalized extreme Studentized deviate) tests are used to detect outliers in the time-series data of the core metrics, identifying abrupt changes in popularity. Non-parametric methods such as the Mann-Kendall trend test and Sen's slope estimator are used to determine the monotonic trend of microservice popularity across different time scales and to extract trend features. Combining the outliers and trend features, the current popularity "state" of the microservice is assessed comprehensively, generating risk warning signals and decision-making suggestions (e.g., "Popularity increases by more than 20% for three consecutive days; 30% scaling up is recommended") to guide and optimize scaling decisions.
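
Three of the statistics named in [0050] are short enough to sketch with the standard library: Z-score outlier detection, the Mann-Kendall S statistic, and Sen's slope. The sketch omits the Mann-Kendall significance test (variance and normal approximation) and the GESD test. The function names and the sample series are hypothetical.

```python
from itertools import combinations
from statistics import mean, median, pstdev
from typing import List, Sequence


def zscore_outliers(series: Sequence[float], threshold: float = 3.0) -> List[int]:
    """Indexes whose absolute Z-score exceeds the threshold."""
    mu, sigma = mean(series), pstdev(series)
    if sigma == 0:
        return []
    return [i for i, x in enumerate(series) if abs(x - mu) / sigma > threshold]


def mann_kendall_s(series: Sequence[float]) -> int:
    """S = sum of sign(x_j - x_i) over all pairs i < j; S > 0 suggests an upward trend."""
    return sum((later > earlier) - (later < earlier)
               for earlier, later in combinations(series, 2))


def sens_slope(series: Sequence[float]) -> float:
    """Median of the pairwise slopes (x_j - x_i) / (j - i) for i < j."""
    if len(series) < 2:
        raise ValueError("need at least two samples")
    return median((series[j] - series[i]) / (j - i)
                  for i, j in combinations(range(len(series)), 2))


daily_popularity = [0.30, 0.34, 0.33, 0.41, 0.45, 0.52]
print(mann_kendall_s(daily_popularity), round(sens_slope(daily_popularity), 3))
print(zscore_outliers([10.0] * 9 + [100.0], threshold=2.5))
```

Sen's slope uses the median of the pairwise slopes, so a single spike moves it far less than it would move a least-squares fit. That makes it a reasonable companion to the outlier check.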

[0051] Furthermore, the type of microservice metric changes can be categorized by comprehensive popularity metrics, such as consistently high or sudden increases. For microservices with consistently high popularity metrics, users can be prompted to perform code-level performance optimizations, such as algorithm improvements, data structure adjustments, and caching strategy optimizations, to improve the processing efficiency of individual instances and reduce popularity metrics. For microservices with sudden increases in popularity metrics, users can be prompted to analyze their business logic and data access patterns, optimize key aspects such as database queries and network communication, and adjust relevant configuration parameters, such as connection pool size and thread count, to cope with sudden traffic surges. Popularity metrics can be dynamically adjusted based on the business importance and priority of microservices. For critical business microservices, the weight of their popularity metrics can be appropriately increased to ensure they receive sufficient system resources and maintain high performance and high availability.

[0052] In the above implementation, by periodically reviewing and analyzing historical data of comprehensive microservice popularity metrics, hot microservices and potential performance bottlenecks are identified. Targeted performance optimizations and architectural improvements are then implemented to address business growth and traffic peaks, thereby enhancing the stability and scalability of the entire microservice system. Continuous monitoring and analysis of popularity metrics allows for the timely detection of abnormal states and potential risks in microservices, enabling corresponding optimization measures to ensure the healthy operation of the microservice architecture. Popularity metrics, as a comprehensive indicator of microservice load and resource utilization, fully reflect the operational status of microservices. When popularity metrics consistently exceed preset thresholds, it indicates that the microservice may be facing high load pressure and resource bottlenecks, requiring timely measures such as scaling and optimization to ensure service quality.

[0053] In this embodiment of the application, step S103 dynamically adjusts the instance of the microservice, specifically including the following steps: Step S31: Determine the popularity level of the microservice based on the popularity index; Step S32: Determine a microservice deployment strategy that matches the popularity level results; wherein, the microservice deployment strategy is used to indicate the strategy for adjusting the instances of the microservice; Step S33: Adjust the instances of the microservices based on the microservice deployment strategy.

[0054] In this embodiment of the application, microservices can also be classified into levels based on their popularity according to their load conditions, thereby obtaining popularity level results. Then, a microservice deployment strategy matching the popularity level results is formulated. The microservice deployment strategy is used to adjust the number of running instances of microservices on each server, including using multi-instance deployment for high-popularity microservices and single-instance deployment for low-popularity microservices.

[0055] Here, the popularity level can be divided into three levels: high popularity, medium popularity, and low popularity. Different deployment strategies and resource allocation schemes should be formulated for microservices with different popularity levels.

[0056] The popularity level of each microservice can be determined based on data values from its multi-dimensional service data. For example, CPU utilization, request response time, and error rate can be used to determine the popularity level. The popularity level of each microservice, such as high, medium, or low, can be updated in real time to provide a basis for subsequent elastic scaling decisions. Different elastic scaling strategies and rules can be set for microservices with different popularity levels.

[0057] For example, for high-popularity microservices, a multi-instance deployment approach is used. The number of running instances is dynamically adjusted based on load level and traffic surges to handle high concurrency and traffic spikes. A certain proportion of spare resources (such as CPU and memory) is reserved for high-popularity microservices to improve system elasticity and stability. For medium-popularity microservices, a stable multi-instance deployment approach is used. A fixed number of running instances is set based on the average load level, and reasonable alarm thresholds are set in the central monitoring system. When the load level exceeds the threshold (e.g., CPU utilization exceeds 80%), an automatic scaling mechanism is triggered to increase the number of running instances to cope with the load increase. For low-popularity microservices, a single-instance deployment approach is used to save resource overhead.
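The mapping from popularity level to deployment strategy can be sketched as a lookup table. The cut-off values, replica counts, and field names below are hypothetical; the application only specifies three levels, a spare-resource reservation for the high level, and a CPU-based trigger (80% in its example) for the medium level.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Strategy:
    mode: str                      # "elastic-multi", "fixed-multi", or "single"
    min_replicas: int              # hypothetical baseline replica count
    reserve_ratio: float           # share of resources kept spare
    cpu_trigger: Optional[float]   # CPU utilization that triggers scale-up, if any


# Hypothetical cut-offs on a popularity index normalized to [0, 1].
HIGH_CUTOFF = 0.7
LOW_CUTOFF = 0.3

STRATEGIES = {
    "high": Strategy("elastic-multi", min_replicas=3, reserve_ratio=0.2, cpu_trigger=None),
    "medium": Strategy("fixed-multi", min_replicas=2, reserve_ratio=0.0, cpu_trigger=0.8),
    "low": Strategy("single", min_replicas=1, reserve_ratio=0.0, cpu_trigger=None),
}


def popularity_level(index: float) -> str:
    """Classify a normalized popularity index as high, medium, or low."""
    if index >= HIGH_CUTOFF:
        return "high"
    if index >= LOW_CUTOFF:
        return "medium"
    return "low"


def strategy_for(index: float) -> Strategy:
    """Look up the deployment strategy that matches the popularity level."""
    return STRATEGIES[popularity_level(index)]
```

Keeping the strategies in a table makes it straightforward to retune thresholds or add a level without touching the classification logic.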

[0058] At the same time, it is also necessary to set strict alarm thresholds in the central monitoring system (such as request failure rate exceeding 0.1%). When an anomaly occurs, the on-duty operation and maintenance personnel should be notified in a timely manner via email, SMS, telephone, etc., informing them of key information such as the name of the abnormal microservice, instance ID, abnormal indicators, and time of occurrence, and providing relevant logs and monitoring data to facilitate operation and maintenance personnel in quickly locating and handling the problem.

[0059] In this embodiment of the application, adjusting the instances of the microservice based on the microservice deployment strategy specifically includes: first, when the microservice deployment strategy calls for increasing the number of microservice instances, determining the comprehensive load score (also referred to below as the overall load score) of each of the multiple servers on which the microservice has been deployed, where the comprehensive load score indicates the load status of the corresponding server; second, determining the scheduling priority score of the newly added target instance of the microservice; and third, scheduling and deploying the target instance by combining the comprehensive load score and the scheduling priority score.

[0060] For example, the target instance can be deployed on the server with the lowest overall load score; or, if the servers have the same overall load score, the target instance can be deployed on the server with the highest priority.

[0061] Here, when dynamically scheduling microservice instances based on microservice deployment strategies, it is necessary to comprehensively consider factors such as the resource usage of each server with deployed microservices, the popularity level of the deployed microservices, and the load of the instances of the deployed microservices, to calculate the overall load score for each server. A lower overall load score indicates a lighter server load, while a higher overall load score indicates a heavier server load.

[0062] Then, for the newly added microservice instance (i.e., the target instance), a scheduling priority score is calculated based on its popularity level and current load. A higher scheduling priority score indicates a higher scheduling priority for the instance, and a lower scheduling priority score indicates a lower scheduling priority for the instance.

[0063] Next, iterate through all servers to find the server with the lowest overall load score and sufficient resources to accommodate the newly added target instance, and schedule the new instance to that server. If multiple servers have the same overall load score, schedule the instance to the server with the highest priority based on the microservice instance's scheduling priority score. For high-traffic microservice instances, a resource reservation ratio (e.g., 20%) can be set to prioritize resource allocation during scheduling to handle traffic peaks. For low-traffic microservice instances, a resource reclamation threshold can be set (e.g., CPU utilization below 10%). When an instance's resource utilization remains below the threshold for an extended period, trigger instance scaling down or migration to save resource costs.
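A minimal sketch of the server selection step follows. The `Server` fields and the `priority` tie-break rank are assumptions: the application does not define how a server's priority is derived, so it is modeled here as a plain integer where higher wins. The optional `reserve_ratio` stands in for the resource reservation mentioned for high-popularity instances.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Server:
    name: str
    load_score: float   # overall load score; lower means a lighter load
    free_cpu: float     # CPU cores still available
    free_mem: float     # memory still available, in GiB
    priority: int = 0   # hypothetical tie-break rank; higher wins


def pick_server(servers: Sequence[Server], need_cpu: float, need_mem: float,
                reserve_ratio: float = 0.0) -> Optional[Server]:
    """Pick the least-loaded server that can host the new target instance."""
    cpu = need_cpu * (1 + reserve_ratio)   # include any reserved headroom
    mem = need_mem * (1 + reserve_ratio)
    candidates = [s for s in servers if s.free_cpu >= cpu and s.free_mem >= mem]
    if not candidates:
        return None
    # Lowest load score first; among equal scores, the highest priority.
    return min(candidates, key=lambda s: (s.load_score, -s.priority))
```

Returning `None` when no server has enough free resources leaves the caller to decide whether to queue the instance or provision a new server.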

[0064] In this embodiment of the application, step S103, when determining that the microservice meets the conditions for dynamic instance adjustment based on the microservice's popularity index, specifically includes the following steps: Step S41: When it is determined that the microservice meets the conditions for dynamic instance adjustment based on the microservice popularity index, the timing and magnitude of instance adjustment are predicted by a neural network model; Step S42: Dynamically adjust the microservice instances according to the timing and magnitude of instance adjustments.

[0065] In this embodiment, time series forecasting algorithms such as ARIMA and LSTM can be used to model historical load data of microservices and predict load trends over a future period. External factors such as holidays and promotional activities are introduced as auxiliary variables to improve forecast accuracy. The forecasting model can be built with Python's statsmodels library or the TensorFlow framework and is retrained periodically so that the forecasts stay up to date. The forecasts are used to determine the timing and magnitude of scale-up and scale-down adjustments, avoiding frequent adjustments that could destabilize the service while minimizing resource waste.
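One way to turn a load forecast into a timing and magnitude is sketched below. The forecast itself is assumed to come from a separately trained model (for example ARIMA or LSTM) and is not computed here; `plan_scaling`, `capacity_per_replica`, and `step_seconds` are hypothetical names, and sizing by peak load is an assumption rather than the application's stated rule.

```python
import math


def plan_scaling(forecast, current_replicas, capacity_per_replica, step_seconds=60):
    """Derive (seconds_until_adjustment, target_replicas) from a load forecast.

    forecast             -- predicted request rates, one value per future step
    capacity_per_replica -- request rate a single instance can serve (assumed known)
    Returns None when the current replica count already fits the forecast.
    """
    peak = max(forecast)
    target = max(1, math.ceil(peak / capacity_per_replica))
    if target == current_replicas:
        return None
    if target > current_replicas:
        # Scale up no later than the first step that exceeds current capacity.
        limit = current_replicas * capacity_per_replica
        first = next(i for i, load in enumerate(forecast) if load > limit)
        return first * step_seconds, target
    # The whole horizon fits in fewer replicas, so scaling down can start now.
    return 0, target
```

Sizing to the peak of the forecast horizon, rather than to each step, yields a single adjustment per planning cycle, which is in line with the goal of avoiding frequent changes.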

[0066] Once the timing and extent of instance adjustments are determined, the microservice instances can be dynamically adjusted accordingly.

[0067] In this embodiment of the application, the method further includes: first, after receiving a target business request, obtaining the load status data of each instance of the microservice and the number of requests being processed by each instance; second, determining the load balancing weight of each instance based on the load status data; and third, assigning the target business request to the instance with the highest load balancing weight and the smallest number of requests to be processed.

[0068] Here, a load balancing and traffic scheduling mechanism can be established between microservices. This includes dynamically distributing requests to different running instances of microservices based on their popularity and load status, thereby achieving balanced load distribution and preventing individual instances from being overloaded and affecting overall performance.

[0069] Specifically, in a microservice architecture, service gateway components, such as Kong and Zuul, are introduced as a unified traffic entry point for microservices. Through service registration and discovery mechanisms, the routing information of microservice instances is dynamically managed to achieve load balancing and traffic scheduling of requests.

[0070] The service gateway obtains real-time load status data for each instance of each microservice from the microservice runtime monitoring system. This load status data includes key metrics such as CPU utilization, memory usage, and request response time. Based on this load status data, it comprehensively evaluates the load and health status of each instance. In the load balancing algorithm design, a weighted round-robin algorithm can be used to assign different weight values to each instance of different microservices. These weight values are dynamically adjusted based on the instance's popularity and load level. Furthermore, a list of microservice instances can be maintained, recording the weight value of each instance and the number of requests currently being processed.

[0071] Upon receiving a target business request, the instance list is traversed to find the instance with the highest weight and the fewest requests currently being processed, and the request is assigned to that instance. After the assignment, the instance's count of requests being processed is incremented by 1, and its weight is recalculated for the weighted round-robin algorithm. The weight is calculated as weight = (1 - loadLevel) * heatFactor, where loadLevel represents the instance's load level (ranging from 0 to 1), and heatFactor represents the instance's heat level coefficient (also ranging from 0 to 1; higher heat levels result in larger heatFactor values). This process balances load while also accounting for the heat differences among microservice instances, so that requests are distributed more intelligently.
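The selection rule and the weight formula above can be sketched as follows. The class and function names are illustrative, and one interpretation is assumed: the highest weight is compared first, and the number of in-flight requests is used only to break ties between equal weights.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    name: str
    load_level: float    # loadLevel in the formula, between 0 and 1
    heat_factor: float   # heatFactor in the formula, between 0 and 1
    in_flight: int = 0   # requests currently being processed

    @property
    def weight(self) -> float:
        # weight = (1 - loadLevel) * heatFactor
        return (1 - self.load_level) * self.heat_factor


def assign(instances):
    """Assign one request and return the chosen instance."""
    chosen = max(instances, key=lambda i: (i.weight, -i.in_flight))
    chosen.in_flight += 1
    return chosen
```

Because `weight` is computed on access, it reflects the latest `load_level` and `heat_factor` whenever the monitoring system updates them. A fully loaded instance (`load_level` of 1) gets weight 0 and is chosen only when every other instance is also at weight 0.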

[0072] In this embodiment of the application, the method further includes the following steps: First, obtain the response time and network latency data for each instance belonging to each region and / or each availability zone; Secondly, calculate the average response time and average network latency data for instances in each region and / or each availability zone; Next, based on average response time and average network latency data, cluster analysis is performed on the instances belonging to each region and / or each availability zone to obtain latency instance clusters. The latency instance clusters contain instances of microservices deployed in different locations whose latency exceeds a predetermined latency threshold. Finally, deploy the instances in the latency instance cluster to regions and / or availability zones that meet the preset distance and latency requirements.

[0073] In a microservice runtime environment, distributed tracing systems such as Zipkin or Jaeger inject identification information such as TraceID, SpanID, and sampling flags into the call chain between microservice instances to perform full-link tracing and performance analysis of microservice calls across regions and availability zones. In cross-region and cross-availability zone scenarios, multiple Zipkin or Jaeger Collector nodes are deployed to collect and aggregate tracing data locally.

[0074] The collected trace data from microservices is aggregated to a central data processing platform. Data cleaning and transformation tools, such as Apache Spark or Flink, are used to perform preprocessing operations such as cleaning, filtering, and aggregation on the raw trace data using common operators such as map, flatMap, filter, and reduce. Key feature fields such as the geographical location information, request timestamps, response times, and network latency of microservice instances are extracted, ultimately forming a structured latency feature dataset. The latency feature dataset contains at least the response time and network latency data of instances belonging to each region and / or each availability zone.

[0075] Next, statistical analysis is performed on the response time and network latency data of microservice instances to calculate the average response time and average network latency data of microservice instances in different regions and availability zones within a certain time window (such as the most recent 10 minutes). Then, a regional distribution map of microservice performance is generated based on the average response time and average network latency data.
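The windowed averaging described here can be sketched in a few lines. The sample layout (timestamp, region, response time, network latency) is an assumption about what the latency feature dataset contains, and the function name is hypothetical.

```python
from collections import defaultdict
from statistics import mean


def regional_averages(samples, now, window_seconds=600):
    """Average response time and network latency per region in a time window.

    samples -- iterable of (timestamp, region, response_ms, network_ms) tuples
    now     -- end of the window, in the same units as the timestamps (seconds)
    Returns {region: (avg_response_ms, avg_network_ms)}.
    """
    grouped = defaultdict(list)
    for ts, region, response_ms, network_ms in samples:
        if now - window_seconds <= ts <= now:   # keep only recent samples
            grouped[region].append((response_ms, network_ms))
    return {
        region: (mean(r for r, _ in values), mean(n for _, n in values))
        for region, values in grouped.items()
    }
```

The same grouping works per availability zone if the zone identifier is used as the key instead of the region.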

[0076] The K-means clustering algorithm is used, with the average response time and average network latency of microservice instances as clustering features. Microservice instances across regions and availability zones are grouped according to latency features to obtain multiple latency instance clusters. First, k instances are randomly selected as initial cluster centers, and the following steps are repeated until the cluster centers no longer change or the maximum number of iterations is reached: For each instance, calculate its Euclidean distance to the k cluster centers and assign it to the nearest cluster. For each cluster, recalculate the cluster centers, which are the mean vectors of the p-dimensional features of all instances within the cluster. Output the final k clustering results. You can use the KMeans class from machine learning libraries such as scikit-learn, setting the n_clusters parameter to k, calling the fit method to train the clustering model, and then using the predict method to predict the clustering of new microservice instances.
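The iteration described above can be written out directly. The following is a standard-library sketch of the same assign-and-recompute loop rather than the scikit-learn implementation; the `seed` parameter and the handling of empty clusters are assumptions added to make the example deterministic and safe.

```python
import math
import random


def kmeans(points, k, max_iter=100, seed=0):
    """Cluster feature tuples, e.g. (avg_response_ms, avg_network_ms).

    Returns (labels, centers), where labels[i] is the cluster of points[i].
    """
    rng = random.Random(seed)
    centers = rng.sample(points, k)   # k randomly chosen initial centers
    labels = [0] * len(points)
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest center.
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c]))
                  for p in points]
        # Update step: move each center to the mean of its members.
        new_centers = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centers.append(
                    tuple(sum(dim) / len(members) for dim in zip(*members)))
            else:
                new_centers.append(centers[c])   # keep the center of an empty cluster
        if new_centers == centers:   # converged: centers no longer change
            break
        centers = new_centers
    return labels, centers
```

Because Euclidean distance is sensitive to scale, it is usually worth normalizing response time and network latency to comparable ranges before clustering.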

[0077] Analyze the clusters of latency-prone instances formed by clustering, focusing on anomalous clusters with significantly higher response times or network latency. Determine if microservice instances in these clusters are deployed across regions or availability zones, identifying geographically isolated microservice instances with poor latency performance. For the identified geographically isolated microservice instances with high network latency, further analyze their operational status data, such as business attributes, load characteristics, and resource utilization, to assess the impact of geographical isolation on microservice performance and develop targeted optimization solutions. These solutions include dynamically adjusting the number of replicas of microservice instances in different regions for closer deployment, directing traffic to local instances with lower latency; scaling up high-latency instances vertically (increasing CPU, memory, and other resource quotas for individual instances) or horizontally (increasing the number of replicas) to improve overall service capacity; and optimizing links through technologies such as communication protocol compression, connection pooling, and asynchronous calls to reduce network overhead and waiting time for cross-regional calls.

[0078] Here, the deployment layout of microservices can be planned based on the latency of the instance cluster and the differences in regional popularity. The principle of proximity access is adopted. Under the premise of meeting the latency requirements, the microservice running instances are deployed in areas with high popularity and dense user population, shortening the access path and improving the service response speed.

[0079] In this embodiment of the application, the method further includes the following steps: First, the microservices are clustered according to their response time and request volume to obtain a first cluster group and a second cluster group; wherein the response time of the first cluster group is shorter than that of the second cluster group, and the request volume of the first cluster group is higher than that of the second cluster group. Secondly, the microservices in the first cluster group are scheduled to high-performance nodes, and the microservices in the second cluster group are scheduled to low-performance nodes.

[0080] Here, the automatic scheduling function of the Kubernetes container orchestration platform can be used to dynamically schedule containers based on the load requirements and popularity levels of microservices. High-popularity microservices can be scheduled to high-performance nodes, and low-popularity microservices can be scheduled to low-performance nodes, thus achieving on-demand allocation of resources.

[0081] In practice, runtime information such as the number of replicas, container configuration, and service access volume of each microservice in Kubernetes can be obtained. Combined with the business module and priority of each microservice, key indicators such as service response time, request volume, and CPU utilization are selected as clustering features, and a clustering algorithm such as K-means is applied. The elbow method is used to determine the optimal number of clusters k, and the microservices are automatically grouped. The clustering results can serve as an important reference for classifying microservice levels. For example, microservices with shorter response times and higher request volumes can be clustered into the core level (i.e., the first cluster group), while microservices with longer response times and lower request volumes can be clustered into the ordinary level (i.e., the second cluster group).

[0082] At this point, microservices with shorter response times and higher request volumes can be scheduled to high-performance nodes, while microservices with longer response times and lower request volumes can be scheduled to low-performance nodes.

[0083] As described above, the technical solution of this application can dynamically adjust service deployment based on regional popularity, balance load distribution, meet the access needs of users in different regions, and take into account service availability and performance. It can also take into account factors such as network quality and distance between different regions, rationally plan the service deployment layout, and reduce the latency and cost of cross-regional calls through modes such as proximity access, thereby improving user experience.

[0084] Those skilled in the art will understand that, in the above-described method of the specific implementation, the order in which each step is written does not imply a strict execution order and does not constitute any limitation on the implementation process. The specific execution order of each step should be determined by its function and possible internal logic.

[0085] Based on the same inventive concept, this disclosure also provides a microservice deployment device in a cloud-native environment corresponding to the microservice deployment method in a cloud-native environment. Since the principle of the device in this disclosure for solving the problem is similar to the microservice deployment method in a cloud-native environment described above, the implementation of the device can refer to the implementation of the method, and the repeated parts will not be described again.

[0086] Figure 2 is a schematic diagram of a microservice deployment device in a cloud-native environment provided by an embodiment of this disclosure. The device includes an acquisition unit 10, a determination unit 20, and an adjustment unit 30, wherein: the acquisition unit 10 is used to acquire multi-dimensional service data of a microservice; the determination unit 20 is used to determine the popularity index of the microservice based on the multi-dimensional service data, the popularity index indicating the overall load pressure and/or resource usage of the microservice; and the adjustment unit 30 is used to dynamically adjust the instances of the microservice when the microservice is determined, based on its popularity index, to meet the conditions for dynamic instance adjustment, the dynamic instance adjustment including adding or reducing instances.

[0087] In one possible implementation, when the popularity index includes a real-time popularity index, the adjustment unit is further configured to: generate a first alarm message and scale up the instances of the microservice when the real-time popularity index of the microservice is determined to be higher than a first popularity index threshold, the real-time popularity index indicating the current resource usage of the microservice; and generate a second alarm message and scale down the instances of the microservice when the real-time popularity index of the microservice is determined to be lower than a second popularity index threshold.

[0088] In one possible implementation, when the popularity index includes a comprehensive popularity index, the adjustment unit is further configured to: predict the future popularity trend of the microservice based on the comprehensive popularity index, the comprehensive popularity index indicating the overall load pressure and/or resource usage of the microservice; scale up the instances of the microservice when the trend is determined to be an increase in the comprehensive popularity index; and scale down the instances of the microservice when the trend is determined to be a decrease in the comprehensive popularity index.

[0089] In one possible implementation, the adjustment unit is further configured to: determine the popularity level of the microservice based on the popularity index; determine a microservice deployment strategy that matches the popularity level result, wherein the microservice deployment strategy indicates the strategy for adjusting instances of the microservice; and adjust the instances of the microservice based on the microservice deployment strategy.

[0090] In one possible implementation, the adjustment unit is further configured to: when the microservice deployment strategy calls for increasing the number of instances of the microservice, determine a comprehensive load score for each of the multiple servers on which the microservice has been deployed, wherein the comprehensive load score indicates the load status of the corresponding server; determine the scheduling priority score of the newly added target instance of the microservice; and schedule and deploy the target instance by combining the comprehensive load score and the scheduling priority score.

[0091] In one possible implementation, the adjustment unit is further configured to: deploy the target instance on the server with the lowest overall load score; or, if the overall load scores of the servers are the same, deploy the target instance on the server with the highest priority.

[0092] In one possible implementation, the adjustment unit is further configured to: when it is determined, based on the microservice's popularity index, that the microservice meets the conditions for dynamic instance adjustment, predict the timing and magnitude of instance adjustment using a neural network model; and dynamically adjust the microservice instances according to the timing and magnitude of the instance adjustment.

[0093] In one possible implementation, the device is further used for: after receiving a target business request, obtaining the load status data of each instance of the microservice and the number of requests being processed by each instance; determining the load balancing weight of each instance based on the load status data; and assigning the target business request to the instance with the highest load balancing weight and the smallest number of requests to be processed.

[0094] In one possible implementation, the device is further used for: obtaining response time and network latency data for instances belonging to each region and/or each availability zone; calculating the average response time and average network latency of the instances in each region and/or each availability zone; performing cluster analysis on the instances belonging to each region and/or each availability zone, based on the average response time and the average network latency, to obtain latency instance clusters, wherein the latency instance clusters include instances of microservices deployed in different locations whose latency exceeds a predetermined latency threshold; and deploying the instances in the latency instance clusters to regions and/or availability zones that meet preset distance and latency requirements.

[0095] In one possible implementation, the device is further used for: clustering the microservices into a first cluster group and a second cluster group based on response time and request volume, wherein the response time of the first cluster group is shorter than that of the second cluster group, and the request volume of the first cluster group is higher than that of the second cluster group; and scheduling the microservices in the first cluster group to high-performance nodes and the microservices in the second cluster group to low-performance nodes.

[0096] For the processing flow of each module in the device and the interaction flow between modules, refer to the relevant descriptions in the above method embodiments; they are not detailed again here.

[0097] Corresponding to the microservice deployment method in a cloud-native environment shown in Figure 1, this disclosure also provides an electronic device 300. Figure 3 is a structural schematic of the electronic device 300 provided in an embodiment of this disclosure. The electronic device 300 includes a processor 31, a memory 32, and a bus 33. The memory 32 stores execution instructions and includes main memory 321 and external memory 322. The main memory 321, also called internal memory, temporarily stores computational data for the processor 31 as well as data exchanged with the external memory 322, such as a hard disk. The processor 31 exchanges data with the external memory 322 through the main memory 321. When the electronic device 300 is running, the processor 31 communicates with the memory 32 through the bus 33, causing the processor 31 to execute the following instructions: obtain multi-dimensional service data of a microservice; determine the popularity index of the microservice based on the multi-dimensional service data, wherein the popularity index indicates the overall load pressure and/or resource usage of the microservice; and, when the microservice is determined, based on its popularity index, to meet the conditions for dynamic instance adjustment, dynamically adjust the instances of the microservice, wherein the dynamic instance adjustment includes adding or reducing instances.

[0098] This disclosure also provides a computer-readable storage medium storing a computer program, which, when executed by a processor, performs the steps of the microservice deployment method in a cloud-native environment described in the above method embodiments. The storage medium can be a volatile or non-volatile computer-readable storage medium.

[0099] This disclosure also provides a computer program product carrying program code. The program code includes instructions that can be used to execute the steps of the microservice deployment method in a cloud-native environment described in the above method embodiments. For details, please refer to the above method embodiments, which will not be repeated here.

[0100] The aforementioned computer program product can be implemented through hardware, software, or a combination thereof. In one optional embodiment, the computer program product is specifically embodied in a computer storage medium; in another optional embodiment, the computer program product is specifically embodied in a software product, such as a software development kit (SDK), etc.

[0101] Those skilled in the art will clearly understand that, for convenience and brevity, the specific working processes of the systems and devices described above can be found in the corresponding processes of the foregoing method embodiments and are not repeated here. In the several embodiments provided in this disclosure, it should be understood that the disclosed systems, devices, and methods can be implemented in other ways. The device embodiments described above are merely illustrative. For example, the division of units is only a logical functional division; in actual implementation, there may be other division methods. Furthermore, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be implemented through communication interfaces, and the indirect coupling or communication connection between devices or units may be electrical, mechanical, or of another form.

[0102] The units described as separate components may or may not be physically separate. The components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units can be selected to achieve the purpose of this embodiment according to actual needs.

[0103] In addition, the functional units in the various embodiments of this disclosure can be integrated into one processing unit, or each unit can exist physically separately, or two or more units can be integrated into one unit.

[0104] If the aforementioned functions are implemented as software functional units and sold or used as independent products, they can be stored in a processor-executable, non-volatile, computer-readable storage medium. Based on this understanding, the technical solution of this disclosure in essence, or the part of it that contributes over the prior art, or a portion of the technical solution, can be embodied in the form of a software product. This computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or part of the steps of the methods described in the various embodiments of this disclosure. The aforementioned storage medium includes any medium capable of storing program code, such as a USB flash drive, a portable hard drive, read-only memory (ROM), random access memory (RAM), a magnetic disk, or an optical disk.

[0105] Finally, it should be noted that the above-described embodiments are merely specific implementations of this disclosure, used to illustrate its technical solutions and not to limit them, and the protection scope of this disclosure is not limited thereto. Although this disclosure has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that, within the technical scope disclosed herein, they can still modify the technical solutions described in the foregoing embodiments, readily conceive of changes to them, or make equivalent substitutions for some of their technical features. Such modifications, changes, or substitutions do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of this disclosure, and should all be covered by the protection scope of this disclosure. Therefore, the protection scope of this disclosure should be determined by the protection scope of the claims.

Claims

1. A method for deploying microservices in a cloud-native environment, characterized in that the method includes: obtaining multi-dimensional service data of a microservice; determining a popularity index of the microservice based on the multi-dimensional service data, wherein the popularity index indicates the overall load pressure and/or resource usage of the microservice; and when it is determined, based on the popularity index of the microservice, that the microservice meets a condition for dynamic instance adjustment, dynamically adjusting the instances of the microservice, wherein the dynamic instance adjustment includes adding or removing instances.
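The determining step of claim 1 can be pictured with a minimal sketch. The claim does not fix how the dimensions are combined, so the weighted average below, the metric names (`cpu`, `memory`, `request_rate`), and the weights are all assumptions chosen only for illustration; the inputs are assumed to be normalized to the range 0 to 1 beforehand.

```python
from typing import Mapping

# Hypothetical weights; the claims do not specify how dimensions are combined.
DEFAULT_WEIGHTS = {"cpu": 0.4, "memory": 0.3, "request_rate": 0.3}


def popularity_index(metrics: Mapping[str, float],
                     weights: Mapping[str, float] = DEFAULT_WEIGHTS) -> float:
    """Combine metrics already normalized to [0, 1] into one weighted score."""
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    # A dimension missing from the input contributes 0 to the score.
    score = sum(weights[name] * metrics.get(name, 0.0) for name in weights)
    return score / total_weight
```

A higher value would then stand for heavier overall load pressure or resource usage, which is the role the claim assigns to the popularity index.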

2. The method according to claim 1, characterized in that the popularity index includes a real-time popularity index, and the step of dynamically adjusting the instances of the microservice when it is determined, based on the popularity index of the microservice, that the microservice meets the condition for dynamic instance adjustment includes: when the real-time popularity index of the microservice is determined to be higher than a first popularity index threshold, generating a first alarm message and scaling up the instances of the microservice, wherein the real-time popularity index indicates the current resource usage of the microservice; and when the real-time popularity index of the microservice is determined to be lower than a second popularity index threshold, generating a second alarm message and scaling down the instances of the microservice.
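The two-threshold rule of claim 2 can be sketched as follows. The threshold values, the `ScalingDecision` type, and the alarm wording are hypothetical; the claim only requires a first (upper) and a second (lower) threshold and an alarm message in each case.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ScalingDecision:
    action: str            # "scale_up", "scale_down", or "hold"
    alarm: Optional[str]   # alarm text, or None when no threshold is crossed


def decide_scaling(realtime_index: float,
                   upper: float = 0.8,   # assumed first threshold
                   lower: float = 0.2    # assumed second threshold
                   ) -> ScalingDecision:
    """Map a real-time popularity index to a scaling action and an alarm."""
    if lower >= upper:
        raise ValueError("lower threshold must be below upper threshold")
    if realtime_index > upper:
        return ScalingDecision(
            "scale_up", f"first alarm: index {realtime_index:.2f} above {upper:.2f}")
    if realtime_index < lower:
        return ScalingDecision(
            "scale_down", f"second alarm: index {realtime_index:.2f} below {lower:.2f}")
    return ScalingDecision("hold", None)
```

Keeping a gap between the two thresholds is a common way to avoid oscillating between scale-up and scale-down; the claim itself does not state the size of that gap.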

3. The method according to claim 1, characterized in that the popularity index includes a comprehensive popularity index, and the step of dynamically adjusting the instances of the microservice when it is determined, based on the popularity index of the microservice, that the microservice meets the condition for dynamic instance adjustment includes: predicting a future popularity change trend of the microservice based on the comprehensive popularity index; when the popularity change trend is determined to be an increase in the comprehensive popularity index, scaling up the instances of the microservice, wherein the comprehensive popularity index indicates the overall load pressure and/or resource usage of the microservice; and when the popularity change trend is determined to be a decrease in the comprehensive popularity index, scaling down the instances of the microservice.
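Claim 3 does not name a prediction technique. As one possible illustration, the sketch below fits a least-squares line to recent values of the comprehensive popularity index and treats the sign of the slope as the trend. The function names and the `tolerance` dead band are assumptions, not part of the claim.

```python
from typing import Sequence


def trend_slope(history: Sequence[float]) -> float:
    """Least-squares slope of the index over equally spaced samples."""
    n = len(history)
    if n < 2:
        raise ValueError("need at least two samples to estimate a trend")
    mean_x = (n - 1) / 2
    mean_y = sum(history) / n
    num = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(history))
    den = sum((i - mean_x) ** 2 for i in range(n))
    return num / den


def decide_by_trend(history: Sequence[float], tolerance: float = 0.01) -> str:
    """Scale up on a rising trend, down on a falling one, else hold."""
    slope = trend_slope(history)
    if slope > tolerance:
        return "scale_up"
    if slope < -tolerance:
        return "scale_down"
    return "hold"
```

For example, `decide_by_trend([0.2, 0.4, 0.6, 0.8])` yields a slope of 0.2 per sample and therefore a scale-up decision.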

4. The method according to claim 1, characterized in that dynamically adjusting the instances of the microservice includes: determining a popularity level of the microservice based on the popularity index; determining a microservice deployment strategy that matches the popularity level, wherein the microservice deployment strategy indicates how the instances of the microservice are to be adjusted; and adjusting the instances of the microservice based on the microservice deployment strategy.

5. The method according to claim 4, characterized in that adjusting the instances of the microservice based on the microservice deployment strategy includes: when the microservice deployment strategy indicates that the number of instances of the microservice is to be increased, determining a comprehensive load score for each of multiple servers on which the microservice has been deployed, wherein the comprehensive load score indicates the load status of the corresponding server; determining a scheduling priority score for a newly added target instance of the microservice; and scheduling and deploying the target instance by combining the comprehensive load score and the scheduling priority score.

6. The method according to claim 5, characterized in that the step of scheduling and deploying the target instance by combining the comprehensive load score and the scheduling priority score includes: deploying the target instance on the server with the lowest comprehensive load score; or, if the comprehensive load scores of the servers are the same, deploying the target instance on the server with the highest priority.
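The selection rule of claim 6 reduces to a two-key comparison. In the sketch below, the `Server` type and its fields are hypothetical, and the priority is assumed to be a per-server value for the target instance; the claims do not spell out how the scheduling priority score is attached to servers.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Server:
    name: str
    load_score: float   # comprehensive load score; lower means less loaded
    priority: float     # assumed per-server priority; higher wins ties


def pick_server(servers: Iterable[Server]) -> Server:
    """Lowest load score first; equal load scores fall back to highest priority."""
    candidates = list(servers)
    if not candidates:
        raise ValueError("no candidate servers")
    return min(candidates, key=lambda s: (s.load_score, -s.priority))
```

With servers `a` (load 0.5, priority 1), `b` (load 0.5, priority 3), and `c` (load 0.9, priority 9), the sketch selects `b`: `c` is excluded by its load score, and `b` beats `a` on priority.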

7. The method according to claim 1, characterized in that the step of dynamically adjusting the instances of the microservice when it is determined, based on the popularity index of the microservice, that the microservice meets the condition for dynamic instance adjustment includes: when it is determined, based on the popularity index of the microservice, that the microservice meets the condition for dynamic instance adjustment, predicting the timing and magnitude of the instance adjustment using a neural network model; and dynamically adjusting the instances of the microservice according to the predicted timing and magnitude.

8. The method according to claim 1, characterized in that the method further includes: after receiving a target service request, obtaining load status data of each instance of the microservice and the number of requests to be processed by each instance; determining a load balancing weight for each instance based on the load status data; and assigning the target service request to the instance with the highest load balancing weight and the smallest number of requests to be processed.
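A minimal sketch of the routing rule in claim 8 follows. The claim does not say how the weight is derived from the load status data or how the two criteria are combined, so the weight formula (one minus a load value normalized to 0 to 1) and the ordering (weight first, pending requests as the tie-break) are assumptions, as are the type and function names.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Instance:
    name: str
    load: float    # assumed normalized load in [0, 1]
    pending: int   # requests waiting to be processed


def balancing_weight(instance: Instance) -> float:
    """Assumed weight: the less loaded an instance, the higher its weight."""
    return max(0.0, 1.0 - instance.load)


def route(instances: Iterable[Instance]) -> Instance:
    """Highest weight first; equal weights fall back to fewest pending requests."""
    candidates = list(instances)
    if not candidates:
        raise ValueError("no instances available")
    return max(candidates, key=lambda i: (balancing_weight(i), -i.pending))
```

A combined score, for example the weight divided by one plus the pending count, would be an equally plausible reading of the claim language.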

9. The method according to claim 1, characterized in that the method further includes: obtaining response time and network latency data for the instances belonging to each region and/or each availability zone; calculating the average response time and the average network latency of each instance in each region and/or each availability zone; performing cluster analysis, based on the average response time and the average network latency, on the instances belonging to each region and/or each availability zone to obtain a latency instance cluster, wherein the latency instance cluster includes instances of microservices deployed in different locations whose latency exceeds a predetermined latency threshold; and deploying the instances in the latency instance cluster to regions and/or availability zones that meet preset distance and latency requirements.
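The averaging and grouping steps of claim 9 can be illustrated as below. This sketch substitutes a plain threshold partition for the cluster analysis named in the claim, and it assumes that an instance's latency is the sum of its average response time and average network latency; both choices, and all names, are illustrative only.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, Iterable, List, Tuple

# (region, instance, response_ms, network_ms); field layout is an assumption.
Sample = Tuple[str, str, float, float]


def latency_cluster(samples: Iterable[Sample],
                    threshold_ms: float) -> List[Tuple[str, str]]:
    """Return (region, instance) pairs whose average latency exceeds the threshold."""
    grouped: Dict[Tuple[str, str], List[Tuple[float, float]]] = defaultdict(list)
    for region, instance, response_ms, network_ms in samples:
        grouped[(region, instance)].append((response_ms, network_ms))

    flagged = []
    for key, values in grouped.items():
        avg_response = mean(v[0] for v in values)
        avg_network = mean(v[1] for v in values)
        # Assumed latency measure: average response time plus average network latency.
        if avg_response + avg_network > threshold_ms:
            flagged.append(key)
    return sorted(flagged)
```

The flagged pairs would then be the candidates for redeployment to a region or availability zone that satisfies the preset distance and latency requirements.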

10. The method according to claim 1, characterized in that the method further includes: clustering the microservices into a first cluster group and a second cluster group based on response time and request volume, wherein the response time of the first cluster group is shorter than that of the second cluster group and the request volume of the first cluster group is higher than that of the second cluster group; and scheduling the microservices in the first cluster group to high-performance nodes and the microservices in the second cluster group to low-performance nodes.
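The two-group split of claim 10 can be sketched with a simple median rule in place of a real clustering algorithm; the claim does not name the algorithm, and a method such as k-means with two clusters would be another option. The `ServiceStats` type and its fields are hypothetical.

```python
from dataclasses import dataclass
from statistics import median
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class ServiceStats:
    name: str
    response_ms: float
    requests_per_s: float


def split_groups(services: Iterable[ServiceStats]) -> Tuple[List[str], List[str]]:
    """First group: fast and busy services; second group: everything else."""
    items = list(services)
    if not items:
        return [], []
    rt_median = median(s.response_ms for s in items)
    rv_median = median(s.requests_per_s for s in items)
    first = [s.name for s in items
             if s.response_ms <= rt_median and s.requests_per_s >= rv_median]
    second = [s.name for s in items if s.name not in first]
    return first, second
```

Services in the first list would be scheduled to high-performance nodes and those in the second list to low-performance nodes, as the claim describes.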

11. A microservice deployment device in a cloud-native environment, characterized in that the device includes: an acquisition unit configured to acquire multi-dimensional service data of a microservice; a determining unit configured to determine a popularity index of the microservice based on the multi-dimensional service data, wherein the popularity index indicates the overall load pressure and/or resource usage of the microservice; and an adjustment unit configured to dynamically adjust the instances of the microservice when it is determined, based on the popularity index of the microservice, that the microservice meets a condition for dynamic instance adjustment, wherein the dynamic instance adjustment includes adding or removing instances.

12. An electronic device, characterized in that the electronic device includes a processor, a memory, and a bus, wherein the memory stores machine-readable instructions executable by the processor; when the electronic device is running, the processor communicates with the memory via the bus; and when the machine-readable instructions are executed by the processor, the steps of the method for deploying microservices in a cloud-native environment according to any one of claims 1 to 10 are performed.

13. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program that, when executed by a processor, performs the steps of the method for deploying microservices in a cloud-native environment according to any one of claims 1 to 10.

14. A computer program product comprising a computer program, characterized in that, when executed by a processor, the computer program implements the steps of the method for deploying microservices in a cloud-native environment according to any one of claims 1 to 10.