Abnormality identification method and device, storage medium and electronic device
By dynamically calculating abnormal parameter thresholds and interactive processing, the problem that fixed thresholds cannot adapt to dynamic load environments is solved, enabling accurate identification and rapid closed-loop processing of abnormal containers, thus improving identification accuracy and operational efficiency.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- HANGZHOU NETEASE CLOUD MUSIC TECH CO LTD
- Filing Date
- 2026-02-10
- Publication Date
- 2026-06-19
AI Technical Summary
In existing anomaly identification methods, fixed thresholds are difficult to adapt to dynamic load environments, resulting in poor accuracy in identifying abnormal containers, low efficiency in manual investigation, and a high risk of secondary failures.
By acquiring resource usage parameters of multiple containers of the target application service, extracting parameter distribution feature information, dynamically calculating abnormal parameter thresholds, and combining preset deviation factors and resource usage parameters, abnormal containers are identified, and rapid closed-loop processing is achieved through interactive message cards.
It enables accurate identification of abnormal containers in dynamic load environments, reduces false alarm rates, shortens fault repair time, improves operational efficiency, and ensures high system availability and security.
Description
Technical Field
[0001] This application relates to the field of computer technology, and specifically to an anomaly identification method, apparatus, storage medium, and electronic device.
Background Technology
[0002] With the rapid development of internet technology, cloud computing is becoming increasingly widespread. For example, applications are run in containers, enabling the deployment and operation of application services in a containerized manner. During application deployment, due to resource limiting mechanisms, multiple containers of the target application service are highly susceptible to single-machine failures. Existing anomaly detection methods often involve configuring fixed rate-limiting thresholds for resources, triggering alerts for abnormal containers when resource usage exceeds these thresholds.
[0003] However, because traffic models differ significantly across businesses and time periods, fixed thresholds are difficult to adapt to dynamic load environments, resulting in poor accuracy in identifying anomalies in containers deployed for application services.
Summary of the Invention
[0004] This application provides an anomaly identification method, apparatus, storage medium, and electronic device, which can accurately identify target containers with anomalies among multiple containers of a target application service based on dynamic anomaly parameter thresholds, effectively improving the accuracy of anomaly container identification.
[0005] This application provides an anomaly identification method, including:
[0006] obtaining resource usage parameters of multiple containers corresponding to a target application service;
extracting, based on the resource usage parameters, parameter distribution feature information corresponding to the resource usage parameters;
calculating, based on the parameter distribution feature information, an abnormal parameter threshold corresponding to the resource usage parameters; and
identifying, based on the abnormal parameter threshold and the resource usage parameters, a target container with an anomaly among the multiple containers.
[0007] Accordingly, embodiments of this application provide an anomaly identification device, including:
an acquisition unit, used to acquire resource usage parameters of multiple containers corresponding to the target application service;
an extraction unit, used to extract parameter distribution feature information corresponding to the resource usage parameters based on the resource usage parameters;
a calculation unit, used to calculate the abnormal parameter threshold corresponding to the resource usage parameters based on the parameter distribution feature information; and
an identification unit, used to identify a target container with an anomaly among the multiple containers based on the abnormal parameter threshold and the resource usage parameters.
[0008] Furthermore, embodiments of this application also provide a computer-readable storage medium storing a computer program adapted for loading by a processor to execute steps in any of the anomaly identification methods provided in embodiments of this application.
[0009] Furthermore, this application also provides an electronic device, including a processor and a memory, wherein the memory stores an application program, and the processor is used to run the application program in the memory to implement the anomaly identification method provided in this application.
[0010] This application also provides a computer program product, which includes a computer program stored in a computer-readable storage medium. When the processor of an electronic device reads the computer program from the computer-readable storage medium, the processor executes the computer program, causing the electronic device to perform the steps in the anomaly identification method provided in this application.
[0011] The embodiments of this application obtain resource usage parameters of multiple containers corresponding to a target application service; extract parameter distribution feature information corresponding to the resource usage parameters based on the resource usage parameters; calculate the abnormal parameter threshold corresponding to the resource usage parameters based on the parameter distribution feature information; and identify the target container with an anomaly among the multiple containers based on the abnormal parameter threshold and the resource usage parameters. Therefore, by dynamically calculating the abnormal parameter threshold from the distribution of the resource usage parameters of the multiple containers corresponding to the target application service, the abnormal target container can be accurately identified among those containers, effectively improving the accuracy of abnormal container identification.
Attached Figure Description
[0012] To more clearly illustrate the technical solutions in the embodiments of this application, the accompanying drawings used in the description of the embodiments will be briefly introduced below. Obviously, the accompanying drawings described below are only some embodiments of this application. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.
[0013] Figure 1 is a schematic diagram of an implementation scenario of the anomaly identification method provided in an embodiment of this application;
Figure 2 is a flowchart of the anomaly identification method provided in an embodiment of this application;
Figure 3 is a schematic diagram of the target message of the anomaly identification method provided in an embodiment of this application;
Figure 4a is a schematic flowchart of the anomaly identification method provided in an embodiment of this application;
Figure 4b is another specific flowchart of the anomaly identification method provided in an embodiment of this application;
Figure 5 is a schematic structural diagram of the anomaly identification apparatus provided in an embodiment of this application;
Figure 6 is a schematic structural diagram of the electronic device provided in an embodiment of this application.
Detailed Implementation
[0014] The technical solutions of the embodiments of this application will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of this application, and not all embodiments. Based on the embodiments of this application, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of this application.
[0015] This application provides an anomaly identification method, apparatus, storage medium, and electronic device. The anomaly identification apparatus can be integrated into an electronic device, which may be a server, a terminal, or other similar device.
[0016] The server can be a standalone physical server, a server cluster or distributed system composed of multiple physical servers, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, content delivery network (CDN) acceleration services, and big data and artificial intelligence platforms. The terminal can include, but is not limited to, mobile phones, computers, smart voice interaction devices, smart home appliances, vehicle terminals, and aircraft. The terminal and server can be directly or indirectly connected via wired or wireless communication, which is not limited herein.
[0017] Referring to Figure 1, which is a schematic diagram of an implementation scenario of the anomaly identification method provided in an embodiment of this application, and taking the integration of the anomaly identification apparatus into an electronic device as an example, the electronic device can obtain resource usage parameters of multiple containers corresponding to the target application service; extract parameter distribution feature information corresponding to the resource usage parameters based on the resource usage parameters; calculate the abnormal parameter threshold corresponding to the resource usage parameters based on the parameter distribution feature information; and identify the target container with an anomaly among the multiple containers based on the abnormal parameter threshold and the resource usage parameters.
[0018] It should be noted that the scenario of the anomaly identification method illustrated in Figure 1 is merely an example. The implementation environment described in this application is intended to illustrate the technical solutions of this application more clearly and does not constitute a limitation on them. Those skilled in the art will recognize that, with the evolution of data processing and the emergence of new business scenarios, the technical solutions provided in this application are equally applicable to similar technical problems.
[0019] The solutions provided in this application are specifically illustrated through the following embodiments. It should be noted that the order of description of the following embodiments is not intended to limit the preferred order of the embodiments.
[0020] This embodiment is described from the perspective of an anomaly identification apparatus, which can be integrated into an electronic device. The electronic device can be a terminal and/or a server, which is not limited in this application.
[0021] Referring to Figure 2, which is a flowchart of the anomaly identification method provided in an embodiment of this application, the anomaly identification method includes:
In step 101, the resource usage parameters of multiple containers corresponding to the target application service are obtained.
[0022] The target application service can be an application service implemented based on multiple deployed containers. This application service can be a service provided by an application. The target application service can be configured with a corresponding deployment unit (Deployment) to implement the services provided by the target application service. A deployment unit can also be described as an application instance of the target application service. Each deployment unit can include multiple container groups (Pods), and each container group can include one or more containers. A container group can be the smallest deployment and management unit in a Kubernetes (K8s) container orchestration engine cluster. The resource usage parameters can be information measuring the container's use of relevant resources. These relevant resources can include Central Processing Unit (CPU) resources, storage resources, network resources, Graphics Processing Unit (GPU) resources, etc.
[0023] Optionally, the resource usage parameter may include at least one of the following: resource rate limiting rate and resource utilization rate for the target resource. For example, the target resource may include CPU resources, and correspondingly, the resource rate limiting rate may be the CPU rate limiting rate, and the resource utilization rate may be the CPU utilization rate, etc.
[0024] There are several ways to obtain the resource usage parameters of multiple containers corresponding to the target application service. For example, if the resource usage parameters include the resource rate limiting rate for the target resource, the statistical parameters of the restriction on the use of the target resource by multiple containers corresponding to the target application service under a preset period can be obtained. The statistical parameters include at least one of the restriction duration and the number of restrictions. Based on the statistical parameters and the preset period, the resource rate limiting rate of the container is calculated.
[0025] The preset period can be either a period for allocating the target resource or a period for scheduling the target resource. For example, the preset period can be a fixed-length time-slicing window used for time-slicing and quota management of CPU resources; the CPU time-slicing window can be a CPU time slice divided according to a period. The statistical parameter can be a parameter used to measure the extent to which a container's use of the target resource is restricted; this statistical parameter can include, but is not limited to, the restriction duration and the number of restrictions. The restriction duration can be the time during which a container's use of the target resource is restricted, and the number of restrictions can be the number of times a container's use of the target resource is restricted.
[0026] Optionally, the resource rate limiting rate can be calculated as the ratio of the number of times a container is limited across multiple scheduling cycles to the total number of cycles, or as the ratio of the total duration of the container being limited across multiple scheduling cycles to the total duration of those multiple scheduling cycles. The resource rate limiting rate can be used to characterize the frequency or intensity of container resource requests being limited by the system.
[0027] With the widespread adoption of microservice architectures and containerized deployments (such as Kubernetes), container technology utilizes control groups to isolate and limit resources, including container CPU allocation mechanisms and CPU throttling mechanisms. Container CPU allocation refers to Kubernetes managing CPU resources through requests and limits. Underlying this is the Completely Fair Scheduler (CFS). For containers with CPU limits set, the kernel sets a preset period (typically 100 milliseconds) and a quota. For example, `limit=0.5 cores` means that the container can only use a maximum of 50 milliseconds of CPU resources per 100-millisecond period. As for CPU throttling, when a container process exhausts its CPU quota within the current period, even if the physical machine's CPU is still idle, the kernel will forcibly suspend the process until the next period. This phenomenon is called CPU throttling. Common causes include sudden traffic surges exceeding the limit, excessively low limit settings, or kernel scheduling delays.
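The quota arithmetic described above can be sketched as follows. This is a minimal illustration rather than kernel behavior: the function names `cfs_quota_us` and `would_throttle` are hypothetical, and the 100-millisecond default period is taken from the typical value mentioned in the text.

```python
def cfs_quota_us(limit_cores: float, period_us: int = 100_000) -> int:
    """CPU time (microseconds) a container may use per period for a given limit."""
    return int(round(limit_cores * period_us))


def would_throttle(demand_us: int, quota_us: int) -> bool:
    """True if the CPU time demanded within one period exceeds the quota.

    Simplified model (an assumption): a container is throttled for the rest
    of the period as soon as its demand exceeds its quota.
    """
    return demand_us > quota_us


quota = cfs_quota_us(0.5)               # limit of 0.5 cores
burst = would_throttle(80_000, quota)   # 80 ms of demand in one 100 ms period
```

With a limit of 0.5 cores the quota is 50 milliseconds per period, so a burst that needs 80 milliseconds in one period is suspended until the next period even if the host CPU is idle.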
[0028] Optionally, when the statistical parameter is the number of restrictions, the number of times each of the multiple containers corresponding to the target application service is restricted from using the target resource within multiple preset periods can be counted to obtain the number of restrictions for that container. Based on the number of restrictions and the number of preset periods, the resource rate limiting rate corresponding to each container can be calculated.
[0029] In one embodiment, there are multiple ways to calculate the resource rate limiting rate for each container based on the number of restrictions and the number of preset periods. For example, the ratio of the number of restrictions to the number of preset periods can be calculated to obtain the resource rate limiting rate for each container, which can be expressed as:
rate_throttled = nr_throttled / nr_periods
[0030] Here, rate_throttled represents the resource rate limiting rate of the container, nr_throttled represents the number of times the container is restricted within the multiple preset periods, and nr_periods represents the number of preset periods.
[0031] Optionally, when the statistical parameter is the restriction duration, the time during which each of the multiple containers corresponding to the target application service is restricted from using the target resource within multiple preset periods can be obtained, yielding the restriction duration of that container. Based on the restriction duration and the total duration of the multiple preset periods, the resource rate limiting rate for each container can be calculated.
[0032] There are several ways to calculate the resource rate limiting rate for each container based on the limited duration and the total duration of multiple preset periods. For example, the ratio between the limited duration for each container and the total duration of multiple preset periods can be calculated to obtain the resource rate limiting rate for each container.
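The two ratios described above can be sketched as follows. The function and parameter names are illustrative assumptions, as is returning 0.0 when no periods were observed; the source does not specify how that case is handled.

```python
def rate_by_count(nr_throttled: int, nr_periods: int) -> float:
    """Share of preset periods in which the container was restricted."""
    if nr_periods <= 0:
        return 0.0  # assumption: no observed periods means no throttling signal
    return nr_throttled / nr_periods


def rate_by_duration(throttled_ms: float, period_ms: float, nr_periods: int) -> float:
    """Share of the total window duration during which the container was restricted."""
    total_ms = period_ms * nr_periods
    if total_ms <= 0:
        return 0.0  # same assumption as above
    return throttled_ms / total_ms
```

For example, a container restricted in 25 of 100 periods has a count-based rate of 0.25, and a container restricted for 1500 milliseconds across 100 periods of 100 milliseconds has a duration-based rate of 0.15.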
[0033] In one embodiment, the anomaly identification method provided in this application can be applied to an anomaly identification system. This system can periodically traverse the application service list configured for the application through a scheduled task (Job), automatically discovering all online application instance clusters corresponding to the application services under the application, and filtering out application instance clusters in unstable states (such as those being deployed or deleted). The application instance cluster can be a whole composed of multiple containers managed by a deployment unit corresponding to the application service. Then, by calling the application programming interface (API) of the underlying monitoring system, real-time monitoring data of all containers in the application instance cluster can be collected concurrently to obtain the resource usage parameters corresponding to each container.
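The collection step can be sketched as below. Everything specific here is an assumption for illustration: the cluster records with a `state` field, the state names, and the `fetch` callable standing in for the monitoring system's API, whose real interface the source does not describe.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable, List

UNSTABLE_STATES = {"deploying", "deleting"}  # hypothetical state names


def stable_clusters(clusters: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Drop application instance clusters that are being deployed or deleted."""
    return [c for c in clusters if c.get("state") not in UNSTABLE_STATES]


def collect_usage(
    container_ids: Iterable[str],
    fetch: Callable[[str], Dict[str, float]],
    max_workers: int = 8,
) -> Dict[str, Dict[str, float]]:
    """Fetch monitoring data for all containers concurrently.

    `fetch` is a placeholder for a call to the underlying monitoring API.
    """
    ids = list(container_ids)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(fetch, ids))  # map preserves input order
    return dict(zip(ids, results))
```

A thread pool is used here because the calls are I/O bound; a scheduled job could invoke `stable_clusters` first and then `collect_usage` once per remaining cluster.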
[0034] In step 102, parameter distribution feature information corresponding to the resource usage parameters is extracted based on the resource usage parameters.
[0035] The parameter distribution characteristic information can be information that characterizes the distribution of resource usage parameters of multiple containers.
[0036] There are several ways to extract parameter distribution feature information corresponding to resource usage parameters based on resource usage parameters. For example, the concentrated distribution parameter of resource usage parameters can be calculated to obtain the first parameter distribution feature information; the fluctuation range parameter of resource usage parameters can be calculated to obtain the second parameter distribution feature information; and the parameter distribution feature information corresponding to resource usage parameters can be obtained based on the first parameter distribution feature information and the second parameter distribution feature information.
[0037] The central distribution parameter can be a parameter characterizing the central distribution of the data, such as the mean, median, or mode. The first parameter distribution characteristic information can be information indicating the central distribution of resource usage parameters of multiple containers of the target application service, such as the calculated average of the resource usage parameters of the multiple containers of the target application service. The fluctuation range parameter can be a parameter characterizing the fluctuation range of the data, such as the standard deviation, variance, or mean absolute deviation. The second parameter distribution characteristic information can be information indicating the fluctuation range of resource usage parameters of multiple containers of the target application service, such as the calculated standard deviation of the resource usage parameters of the multiple containers of the target application service.
[0038] In one embodiment, the average value of resource usage parameters of multiple containers corresponding to the target application service can be calculated to obtain first parameter distribution feature information, and the standard deviation of resource usage parameters of multiple containers can be calculated to obtain second parameter distribution feature information. Thus, based on the first parameter distribution feature information and the second parameter distribution feature information, parameter distribution feature information corresponding to resource usage parameters can be obtained.
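A minimal sketch of this extraction step, using the mean as the first feature and the standard deviation as the second. The function name is hypothetical, and the choice of the population standard deviation (rather than the sample standard deviation) is an assumption, since the source does not say which is used.

```python
from statistics import fmean, pstdev
from typing import Sequence, Tuple


def distribution_features(values: Sequence[float]) -> Tuple[float, float]:
    """Return (mean, population standard deviation) of per-container parameters."""
    if not values:
        raise ValueError("at least one container value is required")
    return fmean(values), pstdev(values)


mean, std_dev = distribution_features([2, 4, 4, 4, 5, 5, 7, 9])
```

For the eight values shown, the mean is 5 and the population standard deviation is 2.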
[0039] In step 103, based on the parameter distribution characteristic information, the abnormal parameter thresholds corresponding to the resource usage parameters are calculated.
[0040] The abnormal parameter threshold can be determined based on the distribution of resource usage parameters of multiple containers of the target application service, and is used to determine whether the resource usage parameters of the container are abnormal.
[0041] There are several ways to calculate the abnormal parameter threshold corresponding to the resource usage parameter based on the parameter distribution feature information. For example, the product of the preset deviation factor and the second parameter distribution feature information can be calculated to obtain the product result. Based on the first parameter distribution feature information and the product result, the abnormal parameter threshold corresponding to the resource usage parameter can be determined.
[0042] The preset deviation factor can be a pre-set deviation factor, also known as a multiple factor or Z-score multiple, which can be used to control the sensitivity of the abnormal threshold.
[0043] For example, the mean (Mean) and standard deviation (stdDev) of resource usage parameters for multiple containers of a target application service can be calculated to obtain parameter distribution characteristics. Then, statistical principles can be used to dynamically calculate the abnormal parameter threshold corresponding to the resource usage parameters. The specific calculation formula can be expressed as follows:
threshold = Mean + n × stdDev
where n is the preset deviation factor (the default is 2, which covers about 95% of normal samples), and threshold is the abnormal parameter threshold corresponding to the resource usage parameters.
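The threshold formula can be written directly as a function. This is only a sketch; the function name is hypothetical, and the default of 2 for the deviation factor follows the default stated in the text.

```python
def anomaly_threshold(mean: float, std_dev: float, n: float = 2.0) -> float:
    """threshold = Mean + n * stdDev, with n the preset deviation factor."""
    return mean + n * std_dev


default_threshold = anomaly_threshold(5.0, 2.0)        # n = 2
stricter_threshold = anomaly_threshold(5.0, 2.0, n=3)  # less sensitive
```

Raising n moves the threshold further from the mean, so fewer containers are flagged; lowering it makes detection more sensitive.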
[0044] In step 104, based on the abnormal parameter threshold and resource usage parameters, the target container with anomalies is identified among multiple containers.
[0045] The target container can be one of the multiple containers in the target application service that contains an anomaly.
[0046] There are several ways to identify the target container with anomalies among multiple containers based on the abnormal parameter threshold and resource usage parameters. For example, the resource usage parameters of each container can be compared with the abnormal parameter threshold, and the container with the resource usage parameters greater than the abnormal parameter threshold can be identified as the target container with anomalies among the multiple containers of the target application service.
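The comparison step can be sketched as follows, assuming the per-container parameters are held in a dictionary keyed by a container name. The names and sample values are illustrative only.

```python
from typing import Dict, List


def find_abnormal(usage: Dict[str, float], threshold: float) -> List[str]:
    """Return names of containers whose parameter exceeds the threshold."""
    return sorted(name for name, value in usage.items() if value > threshold)


# Nine containers at a throttling rate of 0.02 and one at 0.60 (made-up values).
usage = {f"c{i}": 0.02 for i in range(9)}
usage["c9"] = 0.60
abnormal = find_abnormal(usage, threshold=0.426)
```

For these sample values the mean is 0.078 and the population standard deviation is 0.174, so a deviation factor of 2 gives a threshold of 0.426, and only `c9` exceeds it. Note that with very few containers a single outlier may not be able to exceed a mean-plus-two-standard-deviations threshold, because the outlier itself inflates both statistics.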
[0047] Therefore, the anomaly identification method provided in this application dynamically determines abnormal parameter thresholds based on the distribution of resource usage parameters of multiple containers of the target application service. These dynamically determined thresholds adapt to the current load level of the multiple containers within the target application service, enabling accurate identification of abnormal containers. If the overall load of the multiple containers is high, the average value increases and the threshold automatically rises; conversely, if the overall load is low, the average value decreases and the threshold automatically drops, thus avoiding false alarms. If only individual containers deviate from the overall distribution, they can be accurately identified using the anomaly identification method provided in this application, effectively improving the accuracy of anomaly identification for multiple containers of the target application service.
[0048] Optionally, to further improve the accuracy of anomaly identification, containers whose resource usage parameters are greater than the abnormal parameter threshold and whose CPU rate limiting rate or CPU utilization is greater than a preset threshold can be identified as the abnormal target containers among the multiple containers of the target application service.
[0049] There are several ways to identify a target container with an anomaly among multiple containers based on anomaly parameter thresholds and resource usage parameters. For example, a container whose resource usage parameter is greater than the anomaly parameter threshold and also greater than a preset resource usage parameter threshold can be identified as a target container with an anomaly.
[0050] The resource usage parameter threshold can be a threshold used to assist in determining whether a container is abnormal, or it can be a minimum fallback threshold corresponding to the resource usage parameter. When the resource usage parameter exceeds this threshold, it indicates that the container's resource usage may be abnormal. Setting a preset resource usage parameter threshold (or minimum fallback threshold) is to avoid situations where the dynamically calculated abnormal parameter threshold might be too low when the overall load of the application instance cluster is extremely low and the standard deviation is small, thus misjudging some containers with very low absolute resource utilization but relatively deviating resource usage as abnormal. Therefore, by combining dynamic thresholds and static fallback thresholds, the robustness of anomaly detection can be further improved.
[0051] For example, the resource usage parameters can include CPU rate limiting and CPU utilization. The threshold for these parameters can be the minimum fallback threshold corresponding to CPU utilization. When the CPU utilization of a container exceeds this threshold, it indicates a possible anomaly in the container's resource usage. Therefore, based on the CPU rate limiting of multiple containers of a target application service, parameter distribution characteristics can be statistically analyzed. Then, based on these characteristics, the abnormal parameter threshold corresponding to the CPU rate limiting of multiple containers of the target application service can be calculated. This allows containers among the multiple containers of the target application service whose CPU rate limiting is greater than the abnormal parameter threshold and whose CPU utilization is greater than the preset resource usage parameter threshold to be identified as abnormal target containers. It should be noted that the above explanation uses CPU resources as the target resource as an example; however, target resources are not limited to this and can also include various computing resources such as memory, network bandwidth, and GPUs.
[0052] For example, the resource usage parameter may include CPU rate limiting, and the threshold for this parameter can be the minimum fallback threshold corresponding to the CPU rate limiting. When the CPU rate limiting of a container exceeds this threshold, it indicates that the container's resource usage may be abnormal. Therefore, based on the CPU rate limiting of multiple containers of the target application service, parameter distribution characteristics can be statistically analyzed. Then, based on these characteristics, the abnormal parameter threshold corresponding to the CPU rate limiting of multiple containers of the target application service can be calculated. This allows containers among the multiple containers corresponding to the target application service whose CPU rate limiting is greater than both the abnormal parameter threshold and the resource usage parameter threshold to be identified as abnormal target containers.
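The combination of a dynamic threshold with a static fallback threshold can be sketched as below, following the variant in which the dynamic threshold is computed on the CPU rate limiting rate and the fallback is applied to CPU utilization. The function name, the fallback value of 0.5, and the use of the population standard deviation are assumptions for illustration.

```python
from statistics import fmean, pstdev
from typing import Dict, List


def find_abnormal_with_floor(
    throttle_rates: Dict[str, float],
    cpu_utils: Dict[str, float],
    n: float = 2.0,
    util_floor: float = 0.5,  # hypothetical minimum fallback threshold
) -> List[str]:
    """Flag containers above the dynamic threshold AND above the static floor."""
    values = list(throttle_rates.values())
    if not values:
        return []
    threshold = fmean(values) + n * pstdev(values)
    return sorted(
        name
        for name, rate in throttle_rates.items()
        if rate > threshold and cpu_utils.get(name, 0.0) > util_floor
    )
```

In a nearly idle cluster, a container with a rate of 0.01 among peers at 0.0 exceeds the dynamic threshold, but it is not flagged unless its CPU utilization also exceeds the floor, which is the misjudgment the fallback threshold is meant to prevent.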
[0053] Optionally, after identifying an abnormal target container, feedback and processing can be performed on the abnormal target container. For example, an exception handling prompt message can be generated for the target container, and the exception handling prompt message can be sent to the target object associated with the target application service; an exception handling request can be received from the target object in response to the exception handling prompt message, the exception handling request being used to request resource release processing for the target container; the target health rate of multiple containers corresponding to the target application service can be predicted if the target container is released; and the exception handling request can be responded to based on the target health rate.
[0054] The exception handling prompt information can be information that prompts handling of the anomaly in the target container. The target object can be the relevant personnel handling the target container, such as the operations and maintenance personnel of the target application service. The exception handling request can be a request triggered by the target object to handle the target container, such as requesting that the target container be taken offline or destroyed. Resource release processing can be an operation to release the resources of the target container, such as taking the target container offline or destroying it. The target health rate can be the predicted health rate of the multiple containers corresponding to the target application service after the target container is released, and can be used to characterize the health status of those containers after the release.
[0055] There are several ways to generate exception handling prompts for the target container. For example, the exception handling prompts include target messages, which can be generated based on the state attribute information associated with the target container. The target message includes at least one processing control, each of which corresponds to an exception handling request for the target container. The processing control is used to provide feedback on its corresponding exception handling request after being triggered. The target message is sent to the target object associated with the target application service through an instant messaging application.
[0056] The status attribute information can be information about the status and attributes associated with the target container. For example, it may include the application instance cluster name of the target application service to which the target container belongs, the name of the application instance, the Internet Protocol Address (IP address), anomaly indicators, and the overall health of the application instance cluster. The target message can be a message sent to the target object to prompt it to make a decision regarding the handling of the abnormal target container. For example, the target message can be a visual message card, which can be sent to the target object via an instant messaging application, allowing the target object to conveniently and intuitively obtain relevant information about the abnormal target container based on the visual message card, thus enabling faster and more accurate handling of the target container. The handling control can be a control that allows the target object to select the handling method for the target container. Each handling control corresponds to an anomaly handling request for the target container, which may include a request to take the target container offline or destroy it. For example, the handling control may include a offline control and a destroy control. The offline control can be used to trigger a request to take the target container offline, and the destroy control can be used to trigger a request to destroy the target container, etc.
[0057] For example, please refer to Figure 3, which is a schematic diagram of a target message in an anomaly identification method provided in an embodiment of this application. A visual message card, i.e., the target message, can be constructed based on the identified state attribute information related to the target container. This target message can then be sent to the application manager (i.e., the target object) via the robot interface of an instant messaging application. The message card content may include: the application instance cluster name, the instance name, the instance IP, anomaly indicators (CPU rate limiting, CPU utilization, etc.), and the overall health of the application instance cluster. Furthermore, the message card can directly embed processing controls such as "offline" and "destroy." These processing controls can be bound to a callback Uniform Resource Locator (URL) pointing to the security operation interface of the anomaly identification system to ensure secure handling of the containers corresponding to the application services.
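The message card described above can be sketched as a plain data structure. The following is a minimal illustration only: the field names, the `build_message_card` helper, and the callback address `https://ops.example.invalid/anomaly/callback` are hypothetical and do not come from this application, and authentication of the callback is omitted.

```python
from urllib.parse import urlencode

# Hypothetical callback endpoint of the anomaly identification system.
CALLBACK_URL = "https://ops.example.invalid/anomaly/callback"


def build_message_card(state: dict, callback_url: str = CALLBACK_URL) -> dict:
    """Build a card payload from state attribute information (illustrative field names)."""

    def action(name: str) -> dict:
        # Each processing control carries a callback URL identifying the request.
        query = urlencode(
            {"action": name, "cluster": state["cluster"], "instance": state["instance"]}
        )
        return {"label": name, "url": f"{callback_url}?{query}"}

    return {
        "title": f"Abnormal container in {state['cluster']}",
        "fields": {
            "cluster": state["cluster"],
            "instance": state["instance"],
            "ip": state["ip"],
            "indicators": state["indicators"],
            "cluster_health": state["cluster_health"],
        },
        # One processing control per exception handling request.
        "actions": [action("offline"), action("destroy")],
    }


card = build_message_card(
    {
        "cluster": "music-api",
        "instance": "music-api-7f9c",
        "ip": "10.0.0.12",
        "indicators": {"cpu_throttle_rate": 0.60, "cpu_utilization": 0.35},
        "cluster_health": "9/10 ready",
    }
)
```

Keeping the diagnosis fields and the controls in one payload is what lets the recipient act from the alert itself.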
[0058] Optionally, there are several ways to predict the target health rate of the multiple containers corresponding to the target application service when the target container is released. For example, the total number of containers corresponding to the target application service and the number of healthy containers among them can be obtained; based on the total number of containers and the number of healthy containers, the target health rate of the multiple containers when the target container is released can be calculated.
[0059] The total number of containers is the total number of containers corresponding to the target application service, and the number of healthy containers is the number of those containers that are in a healthy state. The healthy state can refer to the state in which a container is ready.
[0060] There are several ways to calculate the target health rate from the total number of containers and the number of healthy containers. For example, the number of healthy containers can be reduced by one, and the ratio of the reduced number to the total number of containers can be calculated to obtain the target health rate of the multiple containers corresponding to the target application service when the target container is released.
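[0061] For example, writing $N_{\text{total}}$ for the total number of containers and $N_{\text{healthy}}$ for the number of containers in a healthy state, the calculation of the target health rate $R_{\text{target}}$ described above can be expressed as:
$$R_{\text{target}} = \frac{N_{\text{healthy}} - 1}{N_{\text{total}}}$$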
[0062] After calculating the target health rate of multiple containers corresponding to the target application service under the condition of releasing the target container, based on the total number of containers and the number of containers, exception handling requests can be responded to based on the target health rate. There are several ways to respond to exception handling requests based on the target health rate. For example, if the target health rate is not less than a preset health rate threshold, resource release processing can be performed on the target container based on the exception handling request; if the target health rate is less than the preset health rate threshold, the exception handling request will not be responded to, and an exception message will be generated for the exception handling request.
[0063] The preset health rate threshold refers to a pre-set health rate threshold for the target application service. If, after taking the target container offline or destroying it among multiple containers corresponding to the target application service, the health rate of the remaining containers is less than this threshold, it indicates that the target application service may have problems after removing the target container, and taking it offline or destroying it is not recommended to avoid a service avalanche due to insufficient available nodes. If the health rate of the remaining containers is not less than the threshold, it indicates that the target application service can still operate normally after removing the target container, and the target container can be taken offline or destroyed. The exception message can indicate that the currently triggered exception handling request has encountered an error.
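The prediction and the threshold comparison can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, the default threshold of 0.8 mirrors the 80% example used elsewhere in this description, and clamping the reduced count at zero is an added guard that the source does not mention.

```python
def predict_health_rate(total: int, healthy: int) -> float:
    """Expected health rate after releasing one container currently counted as healthy."""
    if total <= 0:
        raise ValueError("total number of containers must be positive")
    # Reduce the healthy count by one, then divide by the total (clamp is an assumption).
    return max(healthy - 1, 0) / total


def may_release(total: int, healthy: int, min_health_rate: float = 0.8) -> bool:
    """Allow the release only if the predicted rate is not less than the threshold."""
    return predict_health_rate(total, healthy) >= min_health_rate


allowed = may_release(total=10, healthy=9)   # (9 - 1) / 10 = 0.8, not below 0.8
rejected = may_release(total=10, healthy=8)  # (8 - 1) / 10 = 0.7, below 0.8
```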
[0064] In one embodiment, please refer to Figure 4a, which is a schematic diagram of a specific process of an anomaly identification method provided in an embodiment of this application. The anomaly identification method can be applied to an anomaly identification system. This system can trigger a host diagnostic service through a scheduled task platform to execute a scheduled task that periodically traverses the list of application instance clusters corresponding to applications and application services and checks the status of each cluster. It automatically discovers all online application instance clusters corresponding to the application services under an application, and filters out clusters that are in an unstable state (such as those being published or deleted). An application instance cluster can be the set of containers corresponding to an application service. Next, by calling the API of the underlying monitoring system, real-time CPU monitoring data for all containers in the application instance cluster can be pulled concurrently to obtain the resource usage parameters of each container. Statistical analysis can then be performed on the resource usage parameters to calculate the mean and standard deviation, thereby obtaining the parameter distribution feature information. Based on the parameter distribution feature information, the abnormal parameter threshold corresponding to the resource usage parameters can be calculated. Based on the abnormal parameter threshold and the resource usage parameters, the target container with anomalies can be identified among the multiple containers. 
Based on the relevant information of the target container, an alarm message card (i.e., a target message) can be generated and sent to the target object through an instant messaging application for further processing.
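The statistical step of this process can be sketched in a few lines. This is a minimal illustration, not the applicant's implementation: the function names and sample values are hypothetical, the deviation factor `n = 2.0` is an arbitrary choice, and the use of the population standard deviation is an assumption because the source does not say which estimator is used.

```python
from statistics import fmean, pstdev


def dynamic_threshold(values: list[float], n: float = 2.0) -> float:
    """Abnormal parameter threshold: mean + n * standard deviation."""
    return fmean(values) + n * pstdev(values)


def find_abnormal(usage: dict[str, float], n: float = 2.0) -> list[str]:
    """Return the containers whose usage parameter exceeds the dynamic threshold."""
    threshold = dynamic_threshold(list(usage.values()), n)
    return [name for name, value in usage.items() if value > threshold]


# Hypothetical CPU throttling rates for ten containers of one cluster.
samples = [0.02, 0.03, 0.01, 0.02, 0.03, 0.02, 0.01, 0.02, 0.03, 0.60]
usage = {f"c{i}": v for i, v in enumerate(samples)}
abnormal = find_abnormal(usage)
```

One consequence of including the outlier in its own mean and standard deviation: with the population standard deviation, a single value among N containers can lie at most √(N−1) standard deviations above the mean, so a deviation factor of 3 can never fire in a cluster of ten or fewer containers. The deviation factor therefore needs to be chosen with the cluster size in mind.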
[0065] In one embodiment, please refer to Figure 4b, which is another specific flowchart of an anomaly identification method provided in an embodiment of this application. When the user (i.e., the target object) clicks the "Offline" or "Destroy" control, the anomaly identification system receives a callback request. For example, when the user initiates a container offline request, the host diagnostic service is triggered to request the application instance cluster status from the cloud-native container platform, in order to check the status of the application instance cluster composed of the containers corresponding to the target application service, for example, to confirm that the cluster is not currently undergoing a change such as deployment or scaling. When the application instance cluster status is abnormal, a rejection notification can be sent to the user through an instant messaging application to reject the user-triggered offline request. When the application instance cluster status is normal, the container status can be requested from the cloud-native container platform to confirm that the target container is currently in the Ready state (to prevent duplicate operations). If the container is not ready, a rejection notification can be sent to the user through an instant messaging application to reject the user-triggered offline request. If the container is ready, a replica health rate simulation can be performed: by obtaining the current total number of containers and the number of healthy containers in the application instance cluster, the expected replica health rate (i.e., the target health rate) of the cluster after taking the abnormal target container offline can be calculated. 
If the expected replica health rate of the application instance cluster is less than the preset health rate threshold (e.g., 80%) after taking down the abnormal target container, the system will reject the takedown / destruction request and issue an alert to prevent service avalanche caused by insufficient available nodes due to the operation. If the expected replica health rate is greater than or equal to the preset health rate threshold, a request can be made to the cloud-native container platform to perform container takedown operations. For example, the target container can be gracefully taken down, destroyed, or restarted by calling the interfaces exposed by the container management platform or directly calling the underlying container orchestration interfaces (such as the Kubernetes API or Docker exec).
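The order of checks in this callback flow can be sketched as a single guard function. This is a minimal illustration under stated assumptions: the `ClusterSnapshot` type, its field names, and the rejection strings are hypothetical, and fetching the state from the container platform is left out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ClusterSnapshot:
    changing: bool      # cluster is being deployed, scaled, or otherwise changed
    target_ready: bool  # target container is currently in the Ready state
    total: int          # total number of containers in the cluster
    healthy: int        # number of containers in a healthy (ready) state


def rejection_reason(snap: ClusterSnapshot, min_health_rate: float = 0.8) -> Optional[str]:
    """Return why an offline/destroy request is rejected, or None if it may proceed."""
    if snap.changing:
        return "application instance cluster is changing"
    if not snap.target_ready:
        return "target container is not Ready"
    # Replica health rate simulation: one fewer healthy container, same total.
    expected = (snap.healthy - 1) / snap.total
    if expected < min_health_rate:
        return "expected replica health rate below threshold"
    return None


ok = rejection_reason(ClusterSnapshot(False, True, total=10, healthy=10))
blocked = rejection_reason(ClusterSnapshot(False, True, total=10, healthy=8))
```

Returning a reason rather than a bare boolean makes it straightforward to put the cause into the rejection notification sent back to the user.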
[0066] In containerized deployments of application services, resource limiting mechanisms can easily lead to single-machine failures (e.g., excessive load on a physical machine or severe throttling of individual containers). CPU throttling can suspend container processes so that they cannot handle requests, significantly increasing interface response time and causing long-tail latency. Frequent throttling increases context switching overhead and reduces overall throughput, and severe throttling can cause upstream service calls to time out, triggering circuit breaking or degradation and harming user experience. For single-container anomalies, the usual solution is threshold alerting from the basic monitoring system, for example, collecting container CPU utilization and triggering a real-time incident notification (e.g., PagerDuty) or an email alert when it exceeds 80%. Upon receiving the alert, operations personnel log into the Kubernetes management platform (Dashboard) or use command-line tools to view container logs and status, manually delete the container after confirming the problem, and trigger its rebuild. However, because traffic models differ significantly across businesses and time periods, fixed thresholds are difficult to adapt to different business scenarios and easily produce numerous false positives and false negatives. Furthermore, there is no targeted monitoring for CPU throttling: traditional CPU utilization monitoring cannot directly reflect CPU throttling. Sometimes CPU utilization is not high, but because physical machines are oversold or because of scheduling contention, containers can still be severely throttled, causing service disruptions. In addition, the process from alert discovery to manual investigation and command execution is lengthy and time-consuming, impacting service availability. 
Moreover, without safety mechanisms, the overall health of the application instance cluster is easily overlooked when containers are taken offline manually. If the application instance cluster is already in an unhealthy state (e.g., multiple containers have crashed), blindly taking another container offline may lead to a cascading failure of the remaining nodes, causing a major incident.
[0067] To address this, this application provides an anomaly identification method that uses statistical methods instead of fixed thresholds to identify abnormal containers and achieves rapid closed-loop control through interactive message cards. It also introduces a replica health rate prediction mechanism to ensure safe container operation. This solves the problem of low anomaly identification accuracy caused by the inability of fixed threshold monitoring to adapt to dynamic load environments. Furthermore, it addresses the issues of low efficiency and slow response in manual troubleshooting and handling of single-machine faults, as well as the lack of systematic safety checks in manual maintenance operations, which easily leads to secondary faults.
[0068] Specifically, this application provides a container anomaly identification method based on the statistical characteristics of application instance clusters. It does not rely on preset absolute thresholds, but instead calculates, in real time, the mean and standard deviation of a monitoring metric (especially CPU rate limiting) across all containers in the application instance cluster. It dynamically determines the anomaly boundary as Mean + n × stdDev, thereby identifying outlier containers. Furthermore, it provides a safe container offline control mechanism based on replica health rate simulation. Before executing a container offline command, it automatically obtains the current health status of the application instance cluster, simulates and calculates the expected replica health rate after taking the target container offline, and allows the operation only when the expected replica health rate is not less than a preset safety threshold; otherwise, it automatically blocks the operation and triggers an alarm. In addition, this application provides an interactive operation and maintenance system integrating diagnosis, notification, and handling. Anomaly diagnosis results and operation and maintenance actions are encapsulated in the same message card and sent through an instant messaging application, so that users who receive an anomaly alarm can trigger callback interfaces through controls on the card. This triggers the system to automatically complete security verification and execution logic in the background, closing the loop for fault handling. Therefore, the embodiments of this application can achieve adaptive and accurate diagnosis. By using statistical algorithms instead of fixed thresholds, they automatically adapt to business peaks and troughs, accurately identify outlier containers, help reduce false alarm rates, and close the handling loop within seconds through interactive cards. 
This eliminates the need for manual login to multiple systems for troubleshooting, significantly shortening the Mean Time To Repair (MTTR) and improving operational efficiency. In addition, the embodiments of this application introduce a check mechanism based on the expected replica health rate, which is equivalent to adding a safety lock to the operation and maintenance process. This effectively prevents service avalanche problems caused by misoperation or blind operation, effectively ensuring high system availability and operational security. At the same time, it can specifically identify and handle the highly concealed CPU rate limiting problem, reduce long-tail latency caused by performance jitter of individual nodes, improve user experience, and thus effectively improve the efficiency of abnormal container identification.
[0069] As described above, this embodiment of the application obtains resource usage parameters of multiple containers corresponding to the target application service; extracts parameter distribution feature information corresponding to the resource usage parameters based on the resource usage parameters; calculates the abnormal parameter threshold corresponding to the resource usage parameters based on the parameter distribution feature information; and identifies the target container with anomalies among the multiple containers based on the abnormal parameter threshold and the resource usage parameters. Therefore, by dynamically calculating the abnormal parameter threshold corresponding to the resource usage parameters based on the parameter distribution of the resource usage parameters of the multiple containers corresponding to the target application service, the abnormal target container can be accurately identified among the multiple containers of the target application service based on the dynamic abnormal parameter threshold, effectively improving the accuracy of abnormal container identification.
[0070] To better implement the above method, embodiments of this application also provide an anomaly identification device, which can be integrated into an electronic device, such as a server.
[0071] For example, Figure 5 is a schematic structural diagram of an anomaly identification device provided in an embodiment of this application. The anomaly identification device may include an acquisition unit 201, an extraction unit 202, a calculation unit 203, and an identification unit 204, as follows: the acquisition unit 201 is used to acquire resource usage parameters of multiple containers corresponding to the target application service; the extraction unit 202 is used to extract parameter distribution feature information corresponding to the resource usage parameters based on the resource usage parameters; the calculation unit 203 is used to calculate the abnormal parameter threshold corresponding to the resource usage parameters based on the parameter distribution feature information; and the identification unit 204 is used to identify the target container with anomalies among the multiple containers based on the abnormal parameter threshold and the resource usage parameters.
[0072] In some embodiments, the resource usage parameters include at least one of a resource throttling rate for the target resource and a resource utilization rate; when the resource usage parameters include a resource throttling rate for the target resource, the acquisition unit 201 is configured to: obtain statistical parameters describing how the use of the target resource by the multiple containers of the target application service was restricted within a preset period, the statistical parameters including at least one of the restriction duration and the number of restrictions; and calculate the resource throttling rate of each container based on the statistical parameters and the preset period.
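One way to read this calculation is as a simple ratio of the statistic to the length of the preset period. The sketch below assumes exactly that; the source does not give the formula, and the function name, units, and sample values are hypothetical.

```python
def throttling_rate(statistic: float, period_seconds: float) -> float:
    """Ratio of a throttling statistic to the preset period (assumed formula)."""
    if period_seconds <= 0:
        raise ValueError("preset period must be positive")
    return statistic / period_seconds


# Restriction duration: 15 s throttled in a 60 s window gives the throttled fraction.
duration_rate = throttling_rate(15.0, 60.0)
# Number of restrictions: 120 throttle events in 60 s gives events per second.
count_rate = throttling_rate(120, 60.0)
```

The two statistics yield quantities with different units, so rates derived from durations and from counts should not be mixed within one cluster-wide distribution.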
[0073] In some embodiments, the extraction unit 202 is configured to: calculate a central tendency parameter of the resource usage parameters to obtain first parameter distribution feature information; calculate a fluctuation range parameter of the resource usage parameters to obtain second parameter distribution feature information; and obtain the parameter distribution feature information of the resource usage parameters based on the first parameter distribution feature information and the second parameter distribution feature information.
[0074] In some embodiments, the calculation unit 203 is configured to: calculate the product of the preset deviation factor and the second parameter distribution feature information to obtain a product result; and determine the abnormal parameter threshold corresponding to the resource usage parameters based on the first parameter distribution feature information and the product result.
[0075] In some embodiments, the identification unit 204 is configured to: Among multiple containers, those whose resource usage parameters exceed the abnormal parameter threshold and are also greater than the preset resource usage parameter threshold are identified as target containers with abnormalities.
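The two-condition check can be sketched as follows. It is a minimal illustration: the function name, the sample values, and the floor of 0.05 are hypothetical, and the dynamic threshold is passed in as a precomputed number.

```python
def identify_targets(
    usage: dict[str, float], dynamic_threshold: float, preset_floor: float
) -> list[str]:
    """Flag containers that exceed both the dynamic threshold and the preset floor."""
    return [
        name
        for name, value in usage.items()
        if value > dynamic_threshold and value > preset_floor
    ]


# A nearly idle cluster: c9 stands out statistically but is still negligible.
idle = {f"c{i}": 0.001 for i in range(9)} | {"c9": 0.004}
without_floor = identify_targets(idle, dynamic_threshold=0.0031, preset_floor=0.0)
with_floor = identify_targets(idle, dynamic_threshold=0.0031, preset_floor=0.05)
```

When every container is close to idle, the standard deviation is tiny and a harmless value can exceed the dynamic threshold; the preset floor keeps such cases from raising an alert.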
[0076] In some embodiments, the anomaly identification device is further configured to: generate exception handling prompt information for the target container and send the exception handling prompt information to the target object associated with the target application service; receive an exception handling request returned by the target object in response to the exception handling prompt information, the exception handling request being used to request resource release processing for the target container; predict the target health rate of the multiple containers corresponding to the target application service when the target container is released; and respond to the exception handling request based on the target health rate.
[0077] In some embodiments, when predicting the target health rate of the multiple containers corresponding to the target application service when the target container is released, the device is specifically configured to: obtain the total number of containers corresponding to the target application service and the number of containers in a healthy state among the multiple containers; and, based on the total number of containers and the number of healthy containers, calculate the target health rate of the multiple containers corresponding to the target application service when the target container is released.
[0078] In some embodiments, when responding to the exception handling request based on the target health rate, the device is specifically configured to: if the target health rate is not less than the preset health rate threshold, perform resource release processing on the target container based on the exception handling request; if the target health rate is less than the preset health rate threshold, not respond to the exception handling request, and generate an exception prompt message for the exception handling request.
[0079] In some embodiments, the exception handling prompt information includes a target message. When generating the exception handling prompt information for the target container, the device is specifically configured to: generate a target message based on the state attribute information associated with the target container, where the target message includes at least one processing control, each processing control corresponds to an exception handling request for the target container, and the processing control is used to return its corresponding exception handling request after being triggered; and send the target message to the target object associated with the target application service through an instant messaging application.
[0080] In practice, each of the above units can be implemented as an independent entity or can be arbitrarily combined to be implemented as the same or several entities. For the specific implementation of each of the above units, please refer to the previous method embodiments, which will not be repeated here.
[0081] As described above, this embodiment of the application obtains resource usage parameters of multiple containers corresponding to the target application service through the acquisition unit 201; the extraction unit 202 extracts parameter distribution feature information corresponding to the resource usage parameters based on the resource usage parameters; the calculation unit 203 calculates the abnormal parameter threshold corresponding to the resource usage parameters based on the parameter distribution feature information; and the identification unit 204 identifies the target container with anomalies among the multiple containers based on the abnormal parameter threshold and the resource usage parameters. Therefore, by dynamically calculating the abnormal parameter threshold corresponding to the resource usage parameters based on the parameter distribution of the resource usage parameters of the multiple containers corresponding to the target application service, the abnormal target container can be accurately identified among the multiple containers of the target application service based on the dynamic abnormal parameter threshold, effectively improving the accuracy of abnormal container identification.
[0082] This application also provides an electronic device. Figure 6 is a schematic structural diagram of an electronic device according to an embodiment of this application. This electronic device can be a terminal or a server. Specifically: the electronic device 300 includes a processor 301 with one or more processing cores, a memory 302 with one or more computer-readable storage media, and a computer program stored in the memory 302 and executable on the processor. The processor 301 and the memory 302 are electrically connected. Those skilled in the art will understand that the electronic device structure shown in the figure does not constitute a limitation on the electronic device, which may include more or fewer components than shown, combine certain components, or have a different component arrangement.
[0083] The processor 301 is the control center of the electronic device 300. It connects various parts of the electronic device 300 through various interfaces and lines. By running or loading software programs and / or modules stored in the memory 302, and calling data stored in the memory 302, it performs various functions of the electronic device 300 and processes data, thereby monitoring the electronic device 300 as a whole.
[0084] In this embodiment, the processor 301 in the electronic device 300 loads the instructions corresponding to the processes of one or more applications into the memory 302 according to the following steps, and the processor 301 runs the applications stored in the memory 302 to realize various functions: Obtain resource usage parameters for multiple containers corresponding to the target application service; Based on resource usage parameters, extract parameter distribution feature information corresponding to resource usage parameters; Based on the parameter distribution characteristics, calculate the abnormal parameter thresholds corresponding to the resource usage parameters; Based on abnormal parameter thresholds and resource usage parameters, the target container with anomalies is identified among multiple containers.
[0085] This solution obtains resource usage parameters from multiple containers corresponding to a target application service; extracts parameter distribution features based on these parameters; calculates abnormal parameter thresholds based on these features; and identifies the abnormal target container among the multiple containers based on both the abnormal parameter thresholds and the resource usage parameters. In this way, by dynamically calculating the abnormal parameter thresholds based on the parameter distribution of resource usage parameters across multiple containers of the target application service, the solution can accurately identify abnormal target containers among the multiple containers of the target application service, effectively improving the accuracy of abnormal container identification.
[0086] For details on the implementation of each of the above operations, please refer to the previous examples, which will not be repeated here.
[0087] Optionally, as shown in Figure 6, the electronic device 300 also includes: a touch display screen 303, a radio frequency circuit 304, an audio circuit 305, an input unit 306, and a power supply 307. The processor 301 is electrically connected to the touch display screen 303, the radio frequency circuit 304, the audio circuit 305, the input unit 306, and the power supply 307. Those skilled in the art will understand that the electronic device structure shown in Figure 6 does not constitute a limitation on the electronic device, which may include more or fewer components than shown, combine certain components, or have a different component arrangement.
[0088] The touch display screen 303 can be used to display a graphical user interface (GUI) and receive operation commands generated by the user interacting with the GUI. The touch display screen 303 may include a display panel and a touch panel. The display panel can be used to display information input by the user or information provided to the user, as well as various graphical user interfaces of the electronic device. These graphical user interfaces can be composed of graphics, text, icons, video, and any combination thereof. Optionally, the display panel can be configured using a liquid crystal display (LCD), organic light-emitting diode (OLED), or other similar technologies. The touch panel can be used to collect touch operations performed by the user on or near it (such as operations performed by the user using a finger, stylus, or any suitable object or accessory on or near the touch panel), generate corresponding operation commands, and execute the corresponding program according to the operation commands. Optionally, the touch panel may include two parts: a touch detection device and a touch controller. The touch detection device detects the user's touch location and the signal generated by the touch operation, transmitting the signal to the touch controller. The touch controller receives touch information from the touch detection device, converts it into touch point coordinates, and sends it to the processor 301. It can also receive and execute commands from the processor 301. The touch panel can cover the display panel. When the touch panel detects a touch operation on or near it, it transmits the information to the processor 301 to determine the type of touch event. Subsequently, the processor 301 provides corresponding visual output on the display panel based on the type of touch event. In this embodiment, the touch panel and the display panel can be integrated into the touch display screen 303 to achieve input and output functions. 
However, in some embodiments, the touch panel and the display panel can be implemented as two independent components to achieve input and output functions. That is, the touch display screen 303 can also be used as part of the input unit 306 to achieve input functions.
[0089] The radio frequency circuit 304 can be used to transmit and receive radio frequency signals to establish wireless communication with network devices or other electronic devices, and to transmit and receive signals with network devices or other electronic devices.
[0090] Audio circuitry 305 can be used to provide an audio interface between a user and an electronic device via a speaker and a microphone. Audio circuitry 305 converts received audio data into electrical signals, transmits them to the speaker, and the speaker converts them into sound signals for output. Conversely, the microphone converts collected sound signals into electrical signals, which are then received by audio circuitry 305, converted back into audio data, and then processed by processor 301 before being transmitted via radio frequency circuitry 304 to, for example, another electronic device, or output to memory 302 for further processing. Audio circuitry 305 may also include an earphone jack to facilitate communication between peripheral headphones and electronic devices.
[0091] The input unit 306 can be used to receive input numbers, characters, or user characteristic information (such as fingerprints, iris, facial information, etc.), and to generate keyboard, mouse, joystick, optical, or trackball signal inputs related to user settings and function control.
[0092] Power supply 307 is used to supply power to various components of electronic device 300. Optionally, power supply 307 can be logically connected to processor 301 through a power management system, thereby enabling functions such as charging, discharging, and power consumption management through the power management system. Power supply 307 may also include one or more DC or AC power supplies, recharging systems, power fault detection circuits, power converters or inverters, power status indicators, and other arbitrary components.
[0093] Although not shown in Figure 6, the electronic device 300 may also include a camera, a sensor, a wireless fidelity module, a Bluetooth module, etc., which will not be described in detail here.
[0094] In the above embodiments, the descriptions of each embodiment have different focuses. Parts not described in detail in a particular embodiment can be found in the relevant descriptions of other embodiments. It should be noted that the electronic device provided in this application's embodiments belongs to the same concept as the anomaly identification method described in the above embodiments, and its specific implementation process is detailed in the above method embodiments, and will not be repeated here.
[0095] As can be seen from the above, the electronic device provided in this application embodiment can obtain resource usage parameters of multiple containers corresponding to a target application service; extract parameter distribution feature information corresponding to the resource usage parameters based on the resource usage parameters; calculate the abnormal parameter threshold corresponding to the resource usage parameters based on the parameter distribution feature information; and identify the target container with abnormality among multiple containers based on the abnormal parameter threshold and the resource usage parameters. Therefore, by dynamically calculating the abnormal parameter threshold corresponding to the resource usage parameters based on the parameter distribution of the resource usage parameters of multiple containers corresponding to the target application service, the abnormal target container can be accurately identified among multiple containers of the target application service based on the dynamic abnormal parameter threshold, effectively improving the accuracy of abnormal container identification.
[0096] Those skilled in the art will understand that all or part of the steps in the various methods of the above embodiments can be performed by a computer program, or by a computer program controlling related hardware. The computer program can be stored in a computer-readable storage medium and loaded and executed by a processor.
[0097] Therefore, embodiments of this application provide a computer-readable storage medium storing a computer program that can be loaded by a processor to execute the steps of any of the anomaly identification methods provided in embodiments of this application. For example, the computer program can execute the following steps: obtaining resource usage parameters of multiple containers corresponding to the target application service; extracting, based on the resource usage parameters, parameter distribution feature information corresponding to the resource usage parameters; calculating, based on the parameter distribution feature information, the abnormal parameter threshold corresponding to the resource usage parameters; and identifying, based on the abnormal parameter threshold and the resource usage parameters, the target container with an abnormality among the multiple containers.
[0098] This solution obtains resource usage parameters from multiple containers corresponding to a target application service; extracts parameter distribution features based on these parameters; calculates abnormal parameter thresholds based on these features; and identifies the abnormal target container among the multiple containers based on both the abnormal parameter thresholds and the resource usage parameters. In this way, by dynamically calculating the abnormal parameter thresholds based on the parameter distribution of resource usage parameters across multiple containers of the target application service, the solution can accurately identify abnormal target containers among the multiple containers of the target application service, effectively improving the accuracy of abnormal container identification.
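The four steps summarized above can be illustrated with a minimal Python sketch. It is not the implementation of this application: the choice of the median as the central tendency statistic, the median absolute deviation as the fluctuation statistic, the additive form of the threshold, and the names `anomaly_threshold`, `find_abnormal_containers`, `deviation_factor` and `preset_floor` are all assumptions made only for illustration.

```python
from statistics import median


def anomaly_threshold(values, deviation_factor=3.0):
    """Dynamic threshold = center + deviation_factor * spread.

    Assumption: center is the median and spread is the median absolute
    deviation (MAD); the source does not fix these statistics here.
    """
    center = median(values)
    spread = median([abs(v - center) for v in values])
    return center + deviation_factor * spread


def find_abnormal_containers(usage, deviation_factor=3.0, preset_floor=0.2):
    """Return the names of containers whose usage parameter exceeds both
    the dynamic threshold and a preset fixed threshold (preset_floor).

    usage: dict mapping a hypothetical container name to its resource
    usage parameter (for example a throttling rate between 0 and 1).
    """
    threshold = anomaly_threshold(list(usage.values()), deviation_factor)
    return [
        name
        for name, value in usage.items()
        if value > threshold and value > preset_floor
    ]


if __name__ == "__main__":
    sample = {"c1": 0.02, "c2": 0.03, "c3": 0.025, "c4": 0.6}
    print(find_abnormal_containers(sample))
```

Because the threshold is recomputed from the current values of all containers of the same service, a uniform rise in load moves the threshold with it, while the preset floor keeps small deviations at very low usage from being flagged.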
[0099] For details on the implementation of each of the above operations, please refer to the previous examples, which will not be repeated here.
[0100] The computer-readable storage medium may include: read-only memory (ROM), random access memory (RAM), disk or optical disk, etc.
[0101] Since the computer program stored in the computer-readable storage medium can execute the steps in any of the anomaly identification methods provided in the embodiments of this application, the beneficial effects that any of the anomaly identification methods provided in the embodiments of this application can achieve can be realized, as detailed in the preceding embodiments, and will not be repeated here.
[0102] According to one aspect of this application, a computer program product is provided, comprising a computer program stored in a computer-readable storage medium; when a processor of an electronic device reads the computer program from the computer-readable storage medium, the processor executes the computer program, causing the electronic device to perform the methods provided in the various optional implementations of the above embodiments.
[0103] The above provides a detailed description of the anomaly identification method, apparatus, storage medium, and electronic device provided in the embodiments of this application. Specific examples have been used to illustrate the principles and implementations of this application, and the description of the above embodiments is intended only to help in understanding the method and core ideas of this application. Those skilled in the art may, based on the ideas of this application, make changes to the specific implementations and the scope of application. Therefore, the content of this specification should not be construed as a limitation of this application.
Claims
1. An anomaly identification method, characterized by comprising: obtaining resource usage parameters of multiple containers corresponding to a target application service; extracting, based on the resource usage parameters, parameter distribution feature information corresponding to the resource usage parameters; calculating, based on the parameter distribution feature information, an abnormal parameter threshold corresponding to the resource usage parameters; and identifying, based on the abnormal parameter threshold and the resource usage parameters, a target container with an abnormality among the multiple containers.
2. The anomaly identification method as described in claim 1, characterized in that the resource usage parameters include at least one of the following: a resource throttling rate and a resource utilization rate for a target resource; and when the resource usage parameters include the resource throttling rate for the target resource, obtaining the resource usage parameters of the multiple containers corresponding to the target application service comprises: obtaining statistical parameters describing how the multiple containers corresponding to the target application service are restricted from using the target resource within a preset period, the statistical parameters including at least one of a restriction duration and a number of restrictions; and calculating the resource throttling rate of each container based on the statistical parameters and the preset period.
3. The anomaly identification method as described in claim 1, characterized in that extracting, based on the resource usage parameters, the parameter distribution feature information corresponding to the resource usage parameters comprises: calculating a central tendency parameter of the resource usage parameters to obtain first parameter distribution feature information; calculating a fluctuation range parameter of the resource usage parameters to obtain second parameter distribution feature information; and obtaining, based on the first parameter distribution feature information and the second parameter distribution feature information, the parameter distribution feature information corresponding to the resource usage parameters.
4. The anomaly identification method as described in claim 3, characterized in that calculating, based on the parameter distribution feature information, the abnormal parameter threshold corresponding to the resource usage parameters comprises: calculating the product of a preset deviation factor and the second parameter distribution feature information to obtain a product result; and determining, based on the first parameter distribution feature information and the product result, the abnormal parameter threshold corresponding to the resource usage parameters.
5. The anomaly identification method as described in claim 1, characterized in that identifying, based on the abnormal parameter threshold and the resource usage parameters, the target container with an abnormality among the multiple containers comprises: identifying, among the multiple containers, a container whose resource usage parameter is greater than the abnormal parameter threshold and also greater than a preset resource usage parameter threshold as the target container with an abnormality.
6. The anomaly identification method according to any one of claims 1 to 5, characterized in that, after identifying the target container with an abnormality among the multiple containers based on the abnormal parameter threshold and the resource usage parameters, the method further comprises: generating exception handling prompt information for the target container, and sending the exception handling prompt information to a target object associated with the target application service; receiving an exception handling request sent by the target object in response to the exception handling prompt information, wherein the exception handling request is used to request resource release processing for the target container; predicting a target health rate of the multiple containers corresponding to the target application service when the target container is released; and responding to the exception handling request based on the target health rate.
7. The anomaly identification method as described in claim 6, characterized in that predicting the target health rate of the multiple containers corresponding to the target application service when the target container is released comprises: obtaining the total number of containers corresponding to the target application service and the number of containers in a healthy state among the multiple containers; and calculating, based on the total number of containers and the number of containers in a healthy state, the target health rate of the multiple containers corresponding to the target application service when the target container is released.
8. An anomaly identification apparatus, characterized by comprising: an acquisition unit, used to acquire resource usage parameters of multiple containers corresponding to a target application service; an extraction unit, used to extract, based on the resource usage parameters, parameter distribution feature information corresponding to the resource usage parameters; a calculation unit, used to calculate, based on the parameter distribution feature information, an abnormal parameter threshold corresponding to the resource usage parameters; and an identification unit, used to identify, based on the abnormal parameter threshold and the resource usage parameters, a target container with an abnormality among the multiple containers.
9. An electronic device, characterized by comprising a processor and a memory, wherein the memory stores a computer program that, when executed by the processor, causes the processor to perform the steps of the method described in any one of claims 1 to 7.
10. A computer-readable storage medium, characterized by comprising a computer program that, when run on an electronic device, causes the electronic device to perform the steps of the method described in any one of claims 1 to 7.
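As an informal illustration of the calculations recited in claims 2 and 7 (not part of the claims), the following Python sketch shows one possible reading. The formulas and names are assumptions: the throttling rate is taken as restricted duration divided by the preset period, the released target container is assumed to remain counted in the total until it is replaced, and `min_health_rate` is a hypothetical approval threshold.

```python
def throttle_rate(restricted_seconds, period_seconds):
    """Resource throttling rate of one container over a preset period.

    Assumption: rate = restriction duration / period length.
    """
    if period_seconds <= 0:
        raise ValueError("period_seconds must be positive")
    return restricted_seconds / period_seconds


def predicted_health_rate(total, healthy, target_is_healthy=False):
    """Predicted share of healthy containers once the target is released.

    Assumption: the released container still counts toward the total, and
    is subtracted from the healthy count only if it was counted as healthy.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    remaining_healthy = healthy - 1 if target_is_healthy else healthy
    return remaining_healthy / total


def may_release(total, healthy, min_health_rate=0.8, target_is_healthy=False):
    """Approve the release request only if the predicted health rate
    stays at or above the hypothetical minimum."""
    rate = predicted_health_rate(total, healthy, target_is_healthy)
    return rate >= min_health_rate
```

Under these assumptions, a service with 10 containers of which 9 are healthy would allow the release at a minimum health rate of 0.8, while one with only 7 healthy containers would reject it, which is the kind of guard that limits secondary failures during handling.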