Method and apparatus for anomaly detection and label generation

By encoding the sequence data of key performance indicators of a business system using a variational autoencoder, and utilizing relative entropy and anomaly thresholds to detect and generate anomaly labels, the problem of difficulty in timely and accurate detection of anomalies in key performance indicators of a business system in existing technologies is solved, achieving efficient anomaly detection and label generation.

CN115981903B · Active · Publication Date: 2026-06-30 · STATE GRID INFORMATION & TELECOMM BRANCH +1

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
STATE GRID INFORMATION & TELECOMM BRANCH
Filing Date
2022-12-21
Publication Date
2026-06-30

AI Technical Summary

Technical Problem

Existing technologies struggle to detect anomalies in key performance indicators within business systems in a timely and accurate manner and generate anomaly tags.

Method used

A variational autoencoder (VAE) is used to encode the sequence data of key performance indicators. By calculating the relative entropy between the encoded feature distribution and the multivariate Gaussian distribution, the anomaly threshold is determined and anomaly labels are generated. Anomaly detection is optimized using the category information of the training samples and the objective function.

Benefits of technology

It enables accurate anomaly detection and tag generation for key performance indicators, improving the timeliness and accuracy of anomaly detection in business systems.

✦ Generated by Eureka AI based on patent content.


Abstract

This application provides a method and apparatus for anomaly detection and label generation. The method includes: obtaining indicator sequence data of a key performance indicator to be detected in a business system, the indicator sequence data including indicator data of the key performance indicator at multiple time points; inputting the indicator sequence data into a trained variational autoencoder to obtain the encoding feature distribution of the encoder output in the variational autoencoder; determining a first relative entropy between the encoding feature distribution and a multivariate Gaussian distribution used as a prior distribution; if the first relative entropy is greater than a set anomaly threshold, determining that the key performance indicator is abnormal, and generating an anomaly label for the key performance indicator. The solution of this application can detect anomalies in key performance indicators in a business system relatively accurately, thereby accurately generating anomaly labels for key performance indicators.

Description

Technical Field

[0001] This application relates to the field of monitoring technology for key performance indicators, and in particular to a method and apparatus for detecting indicator anomalies and generating labels.

Background Technology

[0002] To ensure that the business system can provide stable and reliable services, it is necessary to promptly detect and address any operational anomalies during the system's operation.

[0003] To detect operational anomalies in business systems in a timely manner, key performance indicators (KPIs) related to these systems can be monitored. Therefore, after obtaining the KPI data, determining whether the KPI data contains anomalies and identifying the KPIs with anomalies is a technical problem that needs to be solved by those skilled in the art.

Summary of the Invention

[0004] This application provides a method and apparatus for anomaly detection and label generation for key performance indicators, which can accurately detect anomalies in key performance indicators in a business system and generate anomaly labels for the key performance indicators that have anomalies.

[0005] In one aspect, this application provides a method for anomaly detection and label generation, including:

[0006] Obtain the indicator sequence data of the key performance indicators to be detected in the business system, wherein the indicator sequence data includes: indicator data of the key performance indicators at multiple time points;

[0007] The index sequence data is input into a trained variational autoencoder to obtain the coding feature distribution of the encoder output in the variational autoencoder.

[0008] Determine the first relative entropy between the encoded feature distribution and the multivariate Gaussian distribution used as the prior distribution;

[0009] If the first relative entropy is greater than the set anomaly threshold, it is determined that the key performance indicator is abnormal, and an anomaly label is generated for the key performance indicator.

[0010] The variational autoencoder is trained using multiple labeled index sequence data samples corresponding to the key performance indicators, with the goal of minimizing the second relative entropy and sample error, and maximizing the third relative entropy of abnormal samples and the multivariate Gaussian distribution. The index sequence data samples belong to one of the categories of normal and abnormal.

[0011] The second relative entropy is the relative entropy between the normal indicator sequence data sample and the multivariate Gaussian distribution, wherein the normal indicator sequence data sample is an indicator data sequence sample whose category is normal;

[0012] The sample error is the error between the normal index sequence data sample and the reconstructed sequence data sample reconstructed by the variational autoencoder from the normal index sequence data sample.

[0013] The abnormal samples include: an indicator sequence data sample classified as abnormal; the reconstructed sequence data sample; and a random sequence data sample randomly extracted from the encoded feature distribution sample obtained by encoding the indicator sequence data sample by the encoder of the variational autoencoder.

[0014] Preferably, if the first relative entropy is greater than a set anomaly threshold, it is determined that the key performance indicator is abnormal, and an anomaly label is generated for the key performance indicator, including:

[0015] Determine the difference between the first relative entropy and the anomaly threshold;

[0016] Determine the percentage of the absolute value of the difference to the anomaly threshold;

[0017] Based on the percentage and multiple set percentage threshold ranges, the abnormality level of the key performance indicator is determined.

[0018] Generate anomaly labels corresponding to the anomaly levels for the key performance indicators.
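The level-based labeling steps above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `anomaly_label`, the three percentage ranges, and the label strings are hypothetical, since the text only specifies "multiple set percentage threshold ranges".

```python
from typing import Optional, Sequence, Tuple

# Hypothetical percentage ranges: (lower bound inclusive, upper bound exclusive, level).
# The source does not give concrete ranges; these values are placeholders.
DEFAULT_LEVELS: Sequence[Tuple[float, float, str]] = (
    (0.0, 10.0, "minor"),
    (10.0, 50.0, "major"),
    (50.0, float("inf"), "critical"),
)


def anomaly_label(
    first_kl: float,
    threshold: float,
    levels: Sequence[Tuple[float, float, str]] = DEFAULT_LEVELS,
) -> Optional[str]:
    """Return a level-specific anomaly label, or None if no anomaly is detected."""
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    if first_kl <= threshold:
        return None  # not greater than the threshold: no anomaly label
    # Difference between the first relative entropy and the anomaly threshold.
    difference = first_kl - threshold
    # Percentage of the absolute difference relative to the threshold.
    percentage = abs(difference) / threshold * 100.0
    # Map the percentage onto the configured ranges.
    for low, high, name in levels:
        if low <= percentage < high:
            return f"anomaly:{name}"
    return "anomaly:unclassified"  # percentage outside every configured range


if __name__ == "__main__":
    print(anomaly_label(2.5, 2.0))
```

With the placeholder ranges, a first relative entropy of 2.5 against a threshold of 2.0 exceeds the threshold by 25% and falls in the middle range.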

[0019] Preferably, the value of the abnormality threshold is obtained in the following manner:

[0020] For each labeled index sequence data sample, the index sequence data sample is input into a trained variational autoencoder to obtain the encoded feature distribution sample output by the variational autoencoder.

[0021] For each index sequence data sample, determine the fourth relative entropy between the encoding feature distribution sample corresponding to the index sequence data sample and the multivariate Gaussian distribution;

[0022] Set the initial value for the anomaly threshold;

[0023] Using the fourth relative entropy being greater than the anomaly threshold as the anomaly detection criterion for determining that the index sequence data sample has an anomaly, the anomaly detection result of the index sequence data sample is determined.

[0024] By combining the categories labeled in the data samples of each indicator sequence, the recall or precision corresponding to the anomaly detection results of multiple indicator data sequence samples can be determined.

[0025] If the recall or precision does not meet the conditions, the value of the anomaly threshold is adjusted, and the operation of determining the anomaly detection result of the indicator sequence data sample is performed based on the adjusted anomaly threshold.

[0026] If the recall rate or precision meets the condition, the current value of the anomaly threshold is determined as the value set as the anomaly threshold.
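The threshold-setting loop above can be sketched as follows. The text does not say how the threshold is adjusted or what condition the recall or precision must meet, so this sketch assumes a recall target and a multiplicative decrease from a deliberately high initial value; `recall_at`, `tune_threshold`, `target_recall`, and `shrink` are hypothetical names.

```python
from typing import Sequence


def recall_at(kl_values: Sequence[float], is_abnormal: Sequence[bool], threshold: float) -> float:
    """Recall of the rule 'relative entropy > threshold means abnormal'."""
    true_pos = sum(1 for kl, bad in zip(kl_values, is_abnormal) if bad and kl > threshold)
    positives = sum(1 for bad in is_abnormal if bad)
    return true_pos / positives if positives else 1.0


def tune_threshold(
    kl_values: Sequence[float],
    is_abnormal: Sequence[bool],
    initial: float,
    target_recall: float = 0.9,  # assumed stopping condition
    shrink: float = 0.9,         # assumed adjustment rule
    max_iter: int = 100,
) -> float:
    """Lower the threshold until the recall on labeled samples meets the target."""
    threshold = initial
    for _ in range(max_iter):
        if recall_at(kl_values, is_abnormal, threshold) >= target_recall:
            return threshold  # condition met: keep the current value
        threshold *= shrink   # condition not met: adjust and re-evaluate
    raise RuntimeError("no threshold met the target recall within max_iter steps")


if __name__ == "__main__":
    kls = [0.2, 0.3, 0.25, 2.0, 3.0]           # fourth relative entropy per sample
    labels = [False, False, False, True, True]  # labeled categories
    print(tune_threshold(kls, labels, initial=4.0, target_recall=1.0, shrink=0.5))
```

A precision-based condition would follow the same loop with a different scoring function and, typically, the opposite adjustment direction.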

[0027] Preferably, the variational autoencoder is obtained by training multiple labeled index sequence data samples corresponding to the key performance indicators based on an objective function.

[0028] The objective function includes:

[0029] The first objective function $J_E$ corresponding to the encoder in the variational autoencoder is as follows:

[0030]

[0031] And the second objective function $J_G$ corresponding to the decoder in the variational autoencoder is as follows:

[0032]

[0033] Where $y$ represents a normal indicator sequence data sample, and $y_S$ is any one of $y_r$, $y_p$ and $y_n$, where $y_r$ is the reconstructed sequence data sample, $y_p$ is the random sequence data sample, and $y_n$ is an indicator sequence data sample whose category is abnormal;

[0034] $\mathrm{Enc}(y)$ represents the encoded feature distribution sample obtained by passing $y$ through the encoder in the variational autoencoder;

[0035] $\mathrm{Enc}(y_S)$ represents the encoded feature distribution sample obtained by passing $y_S$ through the encoder in the variational autoencoder;

[0036] $\mathrm{KL}(\mathrm{Enc}(y))$ represents the relative entropy of $\mathrm{Enc}(y)$ with respect to the multivariate Gaussian distribution;

[0037] $\mathrm{KL}(\mathrm{Enc}(y_S))$ represents the relative entropy of $\mathrm{Enc}(y_S)$ with respect to the multivariate Gaussian distribution;

[0038] $\alpha$ and $\beta$ are different weighting coefficients, and $m$ is a set parameter value;

[0039] $\mathrm{Enc}(\mathrm{ng}(y_S))$ represents the encoded feature distribution sample obtained by passing $y_S$ through the encoder of the variational autoencoder while the trained encoder is held constant;

[0040] $\mathrm{KL}(\mathrm{Enc}(\mathrm{ng}(y_S)))$ represents the relative entropy of $\mathrm{Enc}(\mathrm{ng}(y_S))$ with respect to the multivariate Gaussian distribution.
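The bodies of the two objective functions are not reproduced in this text, so the following is only an illustrative sketch and not the patent's formulas. It shows one way the stated training goals (minimize the relative entropy of normal samples and the sample error, maximize the relative entropy of abnormal samples) could be combined on the encoder side. The hinge form, the use of m as a margin, the placement of α and β, and the name `encoder_objective_sketch` are all assumptions.

```python
from typing import Sequence


def encoder_objective_sketch(
    kl_normal: float,              # KL(Enc(y)) for a normal sample y
    sample_error: float,           # reconstruction error between y and y_r
    kl_abnormal: Sequence[float],  # KL values for the abnormal samples y_r, y_p, y_n
    alpha: float,
    beta: float,
    m: float,
) -> float:
    """Assumed encoder-side loss: lower is better.

    - kl_normal is minimized directly.
    - each abnormal KL is pushed up until it reaches the assumed margin m
      (the hinge term is zero once KL >= m).
    - sample_error is minimized with weight beta.
    """
    hinge = sum(max(0.0, m - kl) for kl in kl_abnormal)
    return kl_normal + alpha * hinge + beta * sample_error


if __name__ == "__main__":
    # One abnormal sample is still below the margin (1.0 < 3.0); the other is above it.
    print(encoder_objective_sketch(0.5, 0.2, [1.0, 5.0], alpha=0.5, beta=2.0, m=3.0))
```

Under this assumed form, abnormal samples whose relative entropy already exceeds m contribute nothing further to the loss, which keeps the maximization bounded.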

[0041] Preferably, the indicator sequence data sample is obtained in the following manner:

[0042] Obtain candidate indicator sequence data for the key indicator data, wherein the candidate indicator sequence data includes candidate indicator data from multiple different historical time points;

[0043] Determine the standard score of each candidate indicator data in the candidate indicator sequence data respectively, and replace the candidate indicator data in the candidate indicator sequence data with the standard score of the candidate indicator data to obtain the reconstructed candidate indicator sequence data;

[0044] The candidate index sequence data is sampled using a sliding window of a set length to obtain multiple sampled index sequence data samples.

[0045] Furthermore, this application also provides an indicator anomaly detection device, comprising:

[0046] The data acquisition unit is used to acquire the indicator sequence data of the key performance indicators to be detected in the business system. The indicator sequence data includes the indicator data of the key performance indicators at multiple time points.

[0047] A model processing unit is configured to input the indicator sequence data into a trained variational autoencoder (VAE) to obtain the encoding feature distribution output by the encoder in the VAE. The VAE is trained using multiple labeled indicator sequence data samples corresponding to the key performance indicators, with the objective of minimizing the second relative entropy and sample error, and maximizing the third relative entropy between abnormal samples and the multivariate Gaussian distribution. The indicator sequence data belongs to either normal or abnormal categories. The second relative entropy is the relative entropy between normal indicator sequence data samples and the multivariate Gaussian distribution, and the normal indicator sequence data samples are those with a normal category. The sample error is the error between the normal indicator sequence data samples and the reconstructed sequence data samples reconstructed by the VAE from the normal indicator sequence data samples. The abnormal samples include: indicator sequence data samples with an abnormal category; the reconstructed sequence data samples; and random sequence data samples randomly extracted from the encoding feature distribution samples obtained by the encoder of the VAE from the indicator sequence data samples.

[0048] An entropy determination unit is used to determine the first relative entropy between the encoded feature distribution and the multivariate Gaussian distribution used as a prior distribution;

[0049] An anomaly detection unit is used to determine that the key performance indicator is abnormal if the first relative entropy is greater than a set anomaly threshold, and to generate an anomaly label for the key performance indicator.

[0050] Preferably, the anomaly detection unit includes:

[0051] The difference determination unit is used to determine the difference between the first relative entropy and the abnormal threshold if the first relative entropy is greater than a set abnormal threshold.

[0052] A percentage determination unit is used to determine the percentage of the absolute value of the difference to the abnormal threshold.

[0053] An anomaly determination unit is used to determine the anomaly level of the key performance indicator based on the percentage and multiple set percentage threshold ranges.

[0054] The tag generation unit is used to generate anomaly tags corresponding to the anomaly level for the key performance indicators.

[0055] Preferably, the device further includes: an anomaly threshold determination unit, used to obtain the value of the anomaly threshold in the following manner:

[0056] For each labeled index sequence data sample, the index sequence data sample is input into a trained variational autoencoder to obtain the encoded feature distribution sample output by the variational autoencoder.

[0057] For each index sequence data sample, determine the fourth relative entropy between the encoding feature distribution sample corresponding to the index sequence data sample and the multivariate Gaussian distribution;

[0058] Set the initial value for the anomaly threshold;

[0059] Using the fourth relative entropy being greater than the anomaly threshold as the anomaly detection criterion for determining that the index sequence data sample has an anomaly, the anomaly detection result of the index sequence data sample is determined.

[0060] By combining the categories labeled in the data samples of each indicator sequence, the recall or precision corresponding to the anomaly detection results of multiple indicator data sequence samples can be determined.

[0061] If the recall or precision does not meet the conditions, the value of the anomaly threshold is adjusted, and the operation of determining the anomaly detection result of the indicator sequence data sample is performed based on the adjusted anomaly threshold.

[0062] If the recall rate or precision meets the condition, the current value of the anomaly threshold is determined as the value set as the anomaly threshold.

[0063] As can be seen from the above, in this embodiment, after the indicator sequence data of the key performance indicator to be detected in the business system is obtained, the encoder in the trained variational autoencoder can be used to obtain the encoded feature distribution of the indicator sequence data. Because the variational autoencoder is trained using multiple category-labeled indicator sequence data samples corresponding to the key performance indicator, with the goal of maximizing the relative entropy between abnormal samples and the multivariate Gaussian distribution, the relative entropy between the encoded feature distribution and the multivariate Gaussian distribution can accurately reflect whether the indicator sequence data is abnormal. Therefore, based on this relative entropy and the set anomaly threshold, it is possible to accurately determine whether the key performance indicator is abnormal, and thus to detect the anomaly and generate an anomaly label for the key performance indicator.

Attached Figure Description

[0064] To more clearly illustrate the technical solutions in the embodiments of this application, the accompanying drawings used in the description of the embodiments will be briefly introduced below. Obviously, the accompanying drawings described below are only embodiments of this application. For those skilled in the art, other drawings can be obtained based on the provided drawings without creative effort.

[0065] Figure 1 shows a flowchart of an indicator anomaly detection method provided in an embodiment of this application;

[0066] Figure 2 shows a flowchart of a process for obtaining indicator sequence data samples in an embodiment of this application;

[0067] Figure 3 shows a schematic diagram of an implementation process for determining an anomaly threshold in an embodiment of this application;

[0068] Figure 4 shows a schematic diagram of the composition structure of an indicator anomaly detection device provided in an embodiment of this application.

Detailed Implementation

[0069] The technical solutions of the embodiments of this application will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of this application, and not all embodiments. Based on the embodiments of this application, all other embodiments obtained by those of ordinary skill in the art without creative effort are within the scope of protection of this application.

[0070] Figure 1 shows a flowchart of an anomaly detection method provided in an embodiment of this application. The method of this embodiment can be applied to computer devices, such as servers or other computing nodes with data processing capabilities.

[0071] The method in this embodiment may include:

[0072] S101, obtain the indicator sequence data of the key performance indicators to be tested in the business system.

[0073] It is understandable that a business system can be a business application system that provides various application services, and the business system will also be different in different application scenarios.

[0074] This business system can take many forms, such as a cloud business system based on a microservices architecture. Microservices are a software architectural style that focuses on small functional blocks with a single responsibility and function; these small functional blocks are also called services. Each service is built around a specific business function and can be deployed independently. Services communicate with each other using lightweight mechanisms and have clearly defined interfaces. This allows them to coordinate and cooperate with each other to provide end-user value. Of course, the business system can also take other forms without restriction.

[0075] Key Performance Indicators (KPIs) of a business system are metrics that reflect the system's operational status and fault conditions. Monitoring KPIs helps to promptly identify the system's operational status. Therefore, based on the detection results of any anomalies in KPIs, operations and maintenance personnel can intelligently assess or diagnose the system's health status, the degree of abnormality in business metrics, and maintenance risks, enabling them to more rationally plan and develop operations and maintenance solutions and schedules.

[0076] It is understandable that there may be multiple key performance indicators for a business system, but the solution proposed in this application can be used for anomaly detection for each key performance indicator.

[0077] The indicator sequence data of a key performance indicator includes the indicator data of the key performance indicator at multiple time points. Specifically, the indicator sequence data may include the multiple time points at which the key performance indicator was collected and the indicator data collected at each time point.

[0078] S102, input the index sequence data into the trained variational autoencoder to obtain the coding feature distribution of the encoder output in the variational autoencoder.

[0079] Variational Auto Encoder (VAE) is an autoencoder network model that uses variational ideas. VAE includes an encoder and a decoder. The encoder is also called the inference model, and the decoder is also called the generative model.

[0080] In this application, the variational autoencoder is trained using multiple labeled index sequence data samples corresponding to key performance indicators, with the goal of minimizing the relative entropy between normal index sequence data samples and the multivariate Gaussian distribution used as the prior distribution, minimizing sample error, and maximizing the relative entropy between abnormal samples and the multivariate Gaussian distribution used as the prior distribution.

[0081] Relative entropy, also known as KL divergence, measures how much one probability distribution differs from another; it is zero when the two distributions are identical and grows as they diverge. In training a VAE, the relative entropy is the KL divergence between the encoded feature distribution of the indicator sequence data samples and the prior distribution, where the prior distribution is assumed to be the multivariate Gaussian distribution N(0, I).
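When the encoder outputs a mean and a variance per latent dimension (a diagonal Gaussian, which is an assumption here, although it matches the later description of the encoder output), the relative entropy with respect to N(0, I) has a well-known closed form. The sketch below computes it; the function name `kl_to_standard_normal` is hypothetical.

```python
import math
from typing import Sequence


def kl_to_standard_normal(mean: Sequence[float], variance: Sequence[float]) -> float:
    """KL( N(mean, diag(variance)) || N(0, I) ), in nats.

    Closed form per dimension: 0.5 * (variance + mean^2 - 1 - ln(variance)).
    """
    if len(mean) != len(variance):
        raise ValueError("mean and variance must have the same length")
    if any(v <= 0 for v in variance):
        raise ValueError("variances must be positive")
    return 0.5 * sum(v + mu * mu - 1.0 - math.log(v) for mu, v in zip(mean, variance))


if __name__ == "__main__":
    print(kl_to_standard_normal([0.0, 0.0], [1.0, 1.0]))  # identical to the prior
    print(kl_to_standard_normal([1.0], [1.0]))            # mean shifted by one
```

A distribution identical to the prior gives zero, and the value grows as the mean moves away from zero or the variance departs from one.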

[0082] The indicator sequence data sample refers to indicator sequence data of the key performance indicator that is used for training the VAE. Similar to the indicator sequence data, each indicator sequence data sample can include indicator data samples of the key performance indicator at multiple historical time points. For ease of distinction, the indicator sequence data used for training the VAE is called the indicator sequence data sample, the time points involved in the indicator sequence data sample are called historical time points, and the indicator data corresponding to the historical time points are called indicator data samples.

[0083] The category of an indicator sequence data sample indicates whether it belongs to abnormal data samples. For example, the category can be divided into two types: abnormal and normal. Therefore, for an indicator sequence data sample, its labeled category can be either normal or abnormal. If the category of the indicator sequence data sample is normal, it means that the indicator sequence data sample belongs to normal data under normal conditions for key performance indicators; conversely, if the category of the indicator sequence data sample is abnormal, it means that the indicator sequence data sample belongs to abnormal data under abnormal conditions for key performance indicators.

[0084] Accordingly, indicator sequence data samples can be divided into normal indicator sequence data samples and abnormal indicator sequence data samples. Normal indicator sequence data samples are those with a normal category, while abnormal indicator sequence data samples are those with an abnormal category.

[0085] The sample error is the error between the normal index sequence data sample and the reconstructed sequence data sample reconstructed by the variational autoencoder from the normal index sequence data sample.

[0086] In this application, the indicator sequence data samples can be obtained by sampling indicator sequence data collected in the history of big data or business systems, and there are no restrictions on the specific process of obtaining the indicator sequence data.

[0087] Abnormal samples are all samples involved in training the variational autoencoder other than the indicator sequence data samples labeled normal, including both labeled training samples and samples generated during training. Specifically, abnormal samples can include: indicator sequence data samples whose category is abnormal; reconstructed sequence data samples corresponding to normal indicator sequence data samples; and random sequence data samples randomly drawn from the encoded feature distribution samples obtained when the encoder of the variational autoencoder encodes the indicator sequence data samples.

[0088] The random sequence data samples can be obtained by the VAE encoder based on random noise.

[0089] Understandably, the VAE encoder outputs a series of means and variances corresponding to the indicator sequence data, based on the inherent nature of the indicator sequence data. The same applies to indicator sequence data samples. Therefore, random sequence data samples are also constructed by extracting data from the means and variances output by the VAE encoder.
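Drawing a random sample from the means and variances output by the encoder can be sketched as follows. This is a minimal illustration that assumes independent Gaussian dimensions; the name `sample_latent` is hypothetical, and whether the drawn vector is then passed through the decoder to form the random sequence data sample is not stated in this passage.

```python
import math
import random
from typing import List, Sequence


def sample_latent(mean: Sequence[float], variance: Sequence[float], rng: random.Random) -> List[float]:
    """Draw one vector from N(mean, diag(variance)) as mean + sqrt(variance) * noise."""
    if len(mean) != len(variance):
        raise ValueError("mean and variance must have the same length")
    return [mu + math.sqrt(v) * rng.gauss(0.0, 1.0) for mu, v in zip(mean, variance)]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the draw is repeatable
    print(sample_latent([0.0, 1.0, -1.0], [1.0, 0.25, 4.0], rng))
```

Writing the draw as the mean plus scaled standard noise is the usual way to keep the sampling step differentiable with respect to the encoder outputs during training.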

[0090] Based on the training objectives of the VAE in this application, the VAE is required to minimize the relative entropy between the encoded feature distribution samples of normal indicator sequence data samples and the multivariate Gaussian distribution, and to minimize the error between normal indicator sequence data samples and their corresponding reconstructed sequence data samples. Therefore, when the trained VAE is applied, the relative entropy for indicator sequence data collected under normal conditions is small, while the relative entropy for abnormal indicator sequence data is large, which helps to accurately screen out abnormal indicator sequence data.

[0091] Furthermore, to detect abnormal data of key performance indicators more accurately, this application also evaluates the quality of VAE-generated samples when training the VAE, that is, it applies a penalty on the KL divergence corresponding to abnormal samples. As a result, the trained VAE significantly increases the relative entropy associated with the encoded feature distribution of abnormal indicator sequence data, thereby further improving the accuracy of anomaly detection for key performance indicators.

[0092] S103, determine the first relative entropy between the encoded feature distribution and the multivariate Gaussian distribution used as the prior distribution.

[0093] As mentioned earlier, relative entropy is also known as KL divergence. In this application, for ease of distinction, the relative entropy between the coding feature distribution corresponding to the index data sequence obtained based on VAE and the multivariate Gaussian distribution is referred to as the first relative entropy.

[0094] The specific method for calculating relative entropy is not restricted.

[0095] S104, if the first relative entropy is greater than the set abnormal threshold, it is determined that the key performance indicator is abnormal, and an abnormal label is generated for the key performance indicator.

[0096] Of course, if the first relative entropy is not greater than the set abnormal threshold, it is determined that there is no abnormality in the key performance indicator, and a normal label can be generated for the key performance indicator.

[0097] It is understandable that in practical applications, when key performance indicators (KPIs) exhibit anomalies, the degree of anomaly will vary depending on the specific anomalies in the KPI sequence data. Based on this, this application can determine the anomaly level of a KPI and generate an anomaly label corresponding to that level. An implementation example will be provided later, and will not be elaborated upon here.

[0098] As can be seen from the above, in this embodiment, after the indicator sequence data of the key performance indicator to be detected in the business system is obtained, the encoder in the trained variational autoencoder can be used to obtain the encoded feature distribution of the indicator sequence data. Because the variational autoencoder is trained using multiple category-labeled indicator sequence data samples corresponding to the key performance indicator, with the goal of maximizing the relative entropy between abnormal samples and the multivariate Gaussian distribution, the relative entropy between the encoded feature distribution and the multivariate Gaussian distribution can accurately reflect whether the indicator sequence data is abnormal. Therefore, based on this relative entropy and the set anomaly threshold, it is possible to accurately determine whether the key performance indicator is abnormal, and thus to accurately detect the anomaly and generate an anomaly label for the key performance indicator.

[0099] It is understood that, in this application, the index sequence data samples used to train the VAE can be obtained in a variety of ways, and there is no limitation on this.

[0100] To facilitate understanding, the following example illustrates one method for obtaining indicator sequence data samples. Figure 2 illustrates a process for obtaining indicator sequence data samples for training a VAE in this application. This embodiment may include:

[0101] S201, obtain candidate indicator sequence data for key indicator data.

[0102] The candidate indicator sequence data includes candidate indicator data from multiple different historical time points.

[0103] Candidate indicator data is a sequence of indicator data suitable for generating indicator sequence data samples for training.

[0104] Understandably, in order to reduce the abnormal impact on VAE training, after obtaining the candidate indicator sequence data, the candidate indicator data can be preprocessed first, such as removing candidate indicator data with obvious errors from the candidate indicator sequence data.

[0105] S202, determine the standard score of each candidate indicator data in the candidate indicator sequence data respectively, replace the candidate indicator data in the candidate indicator sequence data with the standard score of the candidate indicator data, and obtain the reconstructed candidate indicator sequence data.

[0106] The standardized score of the candidate indicator data is also called the Z-score.

[0107] For example, the time series composed of candidate indicator data in the candidate indicator sequence data can be represented as $X = (x'_1, x'_2, \ldots, x'_{N-1})$, where N is the total number of candidate indicator data in the candidate indicator sequence data, that is, the total number of historical time points.

[0108] Based on this, for any candidate indicator data $x'_i$, where i is any natural number from 1 to N-1, the Z-score $x_i$ of $x'_i$ can be expressed as Formula 1:

[0109] $$x_i = \frac{x'_i - \mu_x}{\sigma_x} \qquad \text{(Formula 1)}$$

[0110] Where $\mu_x$ is the mean of all candidate indicator data in the candidate indicator sequence data, and $\sigma_x$ is the standard deviation of all candidate indicator data in the candidate indicator sequence data.

[0111] Replacing candidate indicator data in the candidate indicator sequence data with standard scores of the candidate indicator data is beneficial to improving the training effect of subsequent VAE training.
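The standard score replacement in S202 can be sketched as follows. The text does not say whether the population or sample standard deviation is used; this sketch assumes the population form, and the name `to_z_scores` is hypothetical.

```python
import statistics
from typing import List, Sequence


def to_z_scores(candidates: Sequence[float]) -> List[float]:
    """Replace each candidate indicator value with (value - mean) / standard deviation."""
    mu = statistics.fmean(candidates)
    sigma = statistics.pstdev(candidates)  # population standard deviation (assumed)
    if sigma == 0:
        raise ValueError("all values are identical; the standard score is undefined")
    return [(x - mu) / sigma for x in candidates]


if __name__ == "__main__":
    print(to_z_scores([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]))
```

After the replacement the sequence has zero mean and unit standard deviation, so indicators measured on very different scales present comparable inputs to the VAE.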

[0112] S203, a sliding window of a set length is used to sample the candidate indicator sequence data to obtain multiple sampled indicator sequence data samples.

[0113] In this application, the candidate indicator sequence data may also include the historical time points at which the candidate indicator data was collected. These historical time points can be specified at a granularity of hours, minutes, seconds, etc., as needed, without limitation.

[0114] To facilitate computer recognition, each historical time point can be represented in encoded form, for example by converting it into a one-hot encoding. Based on this, the encoded information of a historical time point can be represented as t_i, where i is any natural number from 1 to N-1. Correspondingly, the candidate indicator sequence data can be represented as a matrix Y, as follows:

[0115]

[0116] Based on this, the sliding window length can be set to L, so that each time an indicator data sequence sample of length L is selected from the candidate indicator sequence data. The j-th indicator data sequence sample is denoted Y^(j); the sample and the values that j can take can be expressed as follows:

[0117]

[0118] It is understandable that there may be one or more candidate indicator sequence data, but the process of extracting indicator sequence data samples from each candidate indicator sequence data is similar and will not be elaborated further.
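
The sliding-window sampling of S203 can be sketched as follows. This is only an illustration: the stride of one time point, the function name `sliding_window_samples`, and the sample values are assumptions, since the text only fixes the window length L.

```python
def sliding_window_samples(sequence, window_length, stride=1):
    """Return every contiguous window of `window_length` items from `sequence`."""
    if window_length <= 0 or stride <= 0:
        raise ValueError("window_length and stride must be positive")
    last_start = len(sequence) - window_length
    # An empty list is returned when the sequence is shorter than the window.
    return [sequence[start:start + window_length]
            for start in range(0, last_start + 1, stride)]

# Hypothetical standard-score sequence with six time points, window length L = 4.
series = [0.1, -0.4, 1.2, 0.3, -0.9, 0.5]
samples = sliding_window_samples(series, window_length=4)
```

Each element of `sequence` could equally be a (standard score, time encoding) pair, in which case every window corresponds to L consecutive columns of the matrix Y.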

[0119] After obtaining multiple indicator sequence data samples, this application can label the indicator sequence data samples with categories based on whether the corresponding candidate indicator sequence data samples are normal or abnormal. Of course, the categories of indicator sequence data samples can also be labeled manually according to the actual situation, without any restrictions.

[0120] In this application, the process of training a VAE based on multiple indicator sequence data samples can employ any existing VAE training method, without any restrictions. However, the training of the VAE in this application must meet the training objectives mentioned above.

[0121] In one possible implementation, the variational autoencoder is trained using multiple labeled index sequence data samples corresponding to key performance indicators and based on an objective function.

[0122] The objective function includes:

[0123] The first objective function J_E corresponding to the encoder in the variational autoencoder, as shown in Formula 2 below:

[0124]

[0125] And the second objective loss function J_G corresponding to the decoder in the variational autoencoder, as shown in Formula 3 below:

[0126]

[0127] Where y denotes an indicator sequence data sample of the normal category, i.e., a normal indicator sequence data sample; y_S belongs to y_r, y_p and y_n, where y_r is the reconstructed sequence data sample corresponding to the normal indicator sequence data sample, y_p is the random sequence data sample, and y_n is an indicator sequence data sample of the abnormal category, i.e., an abnormal indicator sequence data sample.

[0128] Enc(y) denotes the encoded feature distribution sample obtained by passing the normal indicator sequence data sample y through the encoder in the variational autoencoder;

[0129] Enc(y_S) denotes the encoded feature distribution sample obtained by passing y_S through the encoder in the variational autoencoder;

[0130] KL(Enc(y)) denotes the relative entropy between Enc(y) and the multivariate Gaussian distribution;

[0131] KL(Enc(y_S)) denotes the relative entropy between Enc(y_S) and the multivariate Gaussian distribution;

[0132] α and β are different weighting coefficients, and their specific values can be set as needed.

[0133] m is a set parameter value, and its specific value can be preset.

[0134] Enc(ng(y_S)) denotes the encoded feature distribution sample obtained by passing y_S through the encoder in the variational autoencoder while the trained encoder is kept unchanged. It is understood that the encoder referred to here is the encoder of this same variational autoencoder.

[0135] KL(Enc(ng(y_S))) denotes the relative entropy between Enc(ng(y_S)) and the multivariate Gaussian distribution.

[0136] [m - KL(Enc(y_S))]^+ denotes that the result is zero when m - KL(Enc(y_S)) is less than zero, and is m - KL(Enc(y_S)) when m - KL(Enc(y_S)) is greater than zero.
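
Written compactly, the bracket notation described in [0136] is the usual positive-part (hinge) operation. The restatement below adds nothing beyond that paragraph; the case where the difference is exactly zero, which the paragraph does not mention, gives zero under either branch.

```latex
\[
\bigl[\,m - \mathrm{KL}(\mathrm{Enc}(y_S))\,\bigr]^{+}
  = \max\bigl(0,\; m - \mathrm{KL}(\mathrm{Enc}(y_S))\bigr)
\]
```

For instance, with a hypothetical m = 10, a sample whose relative entropy is 4 contributes 6, while a sample whose relative entropy is 12 contributes 0.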

[0137] The way the relative entropy, i.e., the KL divergence, is calculated is not limited. For ease of understanding, take one example: suppose that, for any indicator sequence data sample y_i, the mean and standard deviation obtained by the encoder in the variational autoencoder are μ_i and σ_i, where the mean and the standard deviation may each be a matrix. Then the KL divergence KL(Enc(y_i)) between the encoded feature distribution Enc(y_i) output by the encoder for the indicator sequence data sample y_i and the multivariate Gaussian distribution can be obtained using the following formula:

[0138]

[0139] Where M is the dimension of μ_i or σ_i.
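
As an illustrative sketch only: if the multivariate Gaussian prior is assumed to be the standard normal N(0, I) and the encoder is assumed to output a diagonal Gaussian with mean μ_i and standard deviation σ_i (both are assumptions; this text does not fix either), the KL divergence has the well-known closed form 0.5 * Σ_{j=1..M} (μ_ij² + σ_ij² - ln σ_ij² - 1). The function name `kl_to_standard_normal` and the input values are hypothetical.

```python
import math

def kl_to_standard_normal(mu, sigma):
    """KL( N(mu, diag(sigma^2)) || N(0, I) ) for M-dimensional mu and sigma."""
    if len(mu) != len(sigma):
        raise ValueError("mu and sigma must have the same dimension M")
    total = 0.0
    for m, s in zip(mu, sigma):
        if s <= 0:
            raise ValueError("standard deviations must be positive")
        var = s * s
        total += m * m + var - math.log(var) - 1.0
    return 0.5 * total

# An encoding that matches the prior exactly has zero divergence.
kl_match = kl_to_standard_normal([0.0, 0.0], [1.0, 1.0])
# Shifting one mean away from zero increases the divergence.
kl_shifted = kl_to_standard_normal([3.0, 0.0], [1.0, 1.0])
```

Under these assumptions the value grows as the encoded distribution moves away from the prior, which is the property the threshold comparison relies on.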

[0140] Based on the above, after training the VAE, in order to more accurately identify abnormal data in key indicator data, the VAE can be used to test multiple indicator sequence data samples, and the aforementioned abnormal threshold can be reasonably set in combination with recall or precision.

[0141] Figure 3 illustrates one implementation flow for determining the value of the anomaly threshold in this application. This embodiment may include:

[0142] S301, for each index sequence data sample labeled with a category, the index sequence data sample is input into the trained variational autoencoder to obtain the encoded feature distribution sample output by the variational autoencoder.

[0143] S302, for each index sequence data sample, determine the fourth relative entropy between the encoding feature distribution sample and the multivariate Gaussian distribution corresponding to the index sequence data sample.

[0144] For ease of distinction, the KL divergence between the coding feature distribution corresponding to the index sequence data samples in this embodiment and the multivariate Gaussian distribution is referred to as the fourth relative entropy. The process of calculating the fourth relative entropy can be found in the previous related introduction, and will not be repeated here.

[0145] S303, set the initial value of the anomaly threshold.

[0146] The initial value can be set according to the actual situation.

[0147] S304, taking the fourth relative entropy being greater than the anomaly threshold as the criterion for determining that an indicator sequence data sample is anomalous, determine the anomaly detection result of the indicator sequence data sample.

[0148] The anomaly detection result for the indicator sequence data sample is a prediction of whether the indicator sequence data sample is normal, based on the encoded feature distribution obtained from the encoder in the VAE and the set anomaly threshold. Accordingly, the anomaly detection result characterizes whether the indicator sequence data sample has been detected as an anomalous indicator sequence data sample.

[0149] For example, let the fourth relative entropy corresponding to the indicator sequence data sample y_i be KL(Enc(y_i)). If KL(Enc(y_i)) is less than or equal to the anomaly threshold λ, the indicator sequence data sample y_i is detected as normal; conversely, if KL(Enc(y_i)) is greater than the anomaly threshold λ, the indicator sequence data sample y_i is detected as abnormal.

[0150] S305, based on the category labeled for each indicator sequence data sample, determine the recall or precision corresponding to the anomaly detection results of the multiple indicator sequence data samples.

[0151] Understandably, the category labeled on an indicator sequence data sample characterizes whether the sample is actually anomalous. Based on this category, it can be determined whether the anomaly detection result is correct. Furthermore, by combining the categories of multiple indicator sequence data samples with the information on whether their anomaly detection results are correct, the recall and precision for the multiple indicator sequence data samples can be obtained.

[0152] The recall rate is the ratio of the number of normal indicator sequence data samples that are predicted to be normal to the total number of normal indicator sequence data samples.

[0153] Precision rate refers to the proportion of indicator sequence data samples that are predicted to be normal, and which are actually also normal indicator sequence data samples.

[0154] S306, determine whether the recall rate or precision meets the conditions. If yes, take the current value of the anomaly threshold as the final value of the anomaly threshold. If no, proceed to step S307.

[0155] The conditions that recall and precision must each meet can be set as needed. For example, recall can be greater than a first threshold, and similarly, precision can be greater than a second threshold. There are no restrictions on this.

[0156] Understandably, if the recall or precision meets the requirements, the current anomaly threshold setting can be considered reasonable, and there is no need to adjust the value of the anomaly threshold. Conversely, if the recall or precision does not meet the requirements, the value of the anomaly threshold can be adjusted, and the recall and precision can continue to be tested to see if they meet the requirements.

[0157] S307, adjust the value of the anomaly threshold, and return to S304 using the adjusted anomaly threshold.
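
The loop formed by S303 to S307 can be sketched as below. Several choices here are assumptions made only for illustration: the threshold is adjusted by adding a fixed step, both recall and precision must exceed their minimums, and the function names and sample values are hypothetical. Recall and precision are computed for the normal category, following [0152] and [0153].

```python
def normal_recall_precision(kl_values, is_abnormal, threshold):
    """Recall and precision of the 'normal' prediction (KL <= threshold)."""
    predicted_normal = [kl <= threshold for kl in kl_values]
    correct = sum(1 for p, a in zip(predicted_normal, is_abnormal) if p and not a)
    total_normal = sum(1 for a in is_abnormal if not a)
    total_predicted = sum(predicted_normal)
    recall = correct / total_normal if total_normal else 0.0
    precision = correct / total_predicted if total_predicted else 0.0
    return recall, precision

def tune_threshold(kl_values, is_abnormal, initial, step,
                   min_recall, min_precision, max_iters=1000):
    """Adjust the threshold until both conditions hold; None if never met."""
    threshold = initial                      # S303: initial value
    for _ in range(max_iters):
        recall, precision = normal_recall_precision(
            kl_values, is_abnormal, threshold)   # S304 and S305
        if recall > min_recall and precision > min_precision:  # S306
            return threshold
        threshold += step                    # S307: adjust and retry
    return None

# Hypothetical fourth relative entropies and category labels.
kl_values = [0.4, 0.8, 1.2, 1.4, 3.0, 4.2]
is_abnormal = [False, False, False, False, True, True]
tuned = tune_threshold(kl_values, is_abnormal, initial=0.5, step=0.5,
                       min_recall=0.9, min_precision=0.9)
```

Other adjustment strategies, such as a search over candidate values, fit the same flow; the text places no restriction on how the threshold is adjusted.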

[0158] In this application, after determining whether there is an anomaly in the key performance indicators based on the indicator sequence data, in order to more accurately analyze the degree of anomaly of the key performance indicators, this application can also determine the anomaly level of the key performance indicators.

[0159] Specifically, the difference between the first relative entropy corresponding to the indicator sequence data and the anomaly threshold can be determined. Based on this, the percentage dp of the absolute value of the difference relative to the anomaly threshold can be determined. Accordingly, based on the percentage dp and multiple set percentage threshold ranges, the anomaly level of the key performance indicator can be determined. Correspondingly, an anomaly label corresponding to the determined anomaly level can be generated for the key performance indicator.

[0160] In this application, the number and classification method of abnormal levels can be set as needed. Correspondingly, the percentage threshold range can be set as needed without restriction.

[0161] For example, three thresholds can be preset: a1, a2, and a3, where a1 is the smallest and a3 is the largest. These three thresholds can be used to divide the data into three percentage threshold ranges. Correspondingly, for the key performance indicator (KPI) sequence data, if the percentage dp determined in the above manner is greater than a1 but less than or equal to a2, then the KPI is determined to be in a general anomaly state; if the percentage dp is greater than a2 but less than or equal to a3, then the KPI has a significant anomaly, falling into the significant anomaly category; similarly, if the percentage dp is greater than a3, then the KPI is in the severe anomaly category.
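
The grading in the example above can be sketched as follows. The function name `anomaly_level`, the label strings, and the numeric values of a1, a2 and a3 are hypothetical. The text does not say how a percentage dp at or below a1 is labeled, so the sketch returns "unclassified" for that case as an explicit assumption.

```python
def anomaly_level(first_relative_entropy, threshold, a1, a2, a3):
    """Grade an anomaly by how far the relative entropy exceeds the threshold."""
    if threshold <= 0:
        raise ValueError("the anomaly threshold must be positive")
    if first_relative_entropy <= threshold:
        return "normal"
    # dp: absolute difference as a percentage of the anomaly threshold.
    dp = abs(first_relative_entropy - threshold) / threshold * 100.0
    if dp > a3:
        return "severe"
    if dp > a2:
        return "significant"
    if dp > a1:
        return "general"
    return "unclassified"  # dp <= a1 is not covered by the text (assumption)

# Hypothetical threshold of 2.0 with percentage bounds a1=10, a2=50, a3=100.
labels = [anomaly_level(kl, 2.0, 10, 50, 100) for kl in (1.0, 2.5, 3.5, 5.0)]
```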

[0162] Corresponding to the indicator anomaly detection method provided in the embodiments of this application, this application also provides an indicator anomaly detection device. For example, Figure 4 illustrates a structural composition of an anomaly detection device provided in this embodiment of the application. The device in this embodiment may include:

[0163] The data acquisition unit 401 is used to acquire the indicator sequence data of the key performance indicators to be detected in the business system. The indicator sequence data includes the indicator data of the key performance indicators at multiple time points.

[0164] Model processing unit 402 is used to input the indicator sequence data into a trained variational autoencoder to obtain the encoding feature distribution output by the encoder in the variational autoencoder. The variational autoencoder is trained using multiple labeled indicator sequence data samples corresponding to the key performance indicators, with the goal of minimizing the second relative entropy and sample error, and maximizing the third relative entropy between abnormal samples and the multivariate Gaussian distribution. The indicator sequence data samples belong to either normal or abnormal categories. The second relative entropy is the relative entropy between normal indicator sequence data samples and the multivariate Gaussian distribution. The normal indicator sequence data samples are those with a normal category. The sample error is the error between the normal indicator sequence data samples and the reconstructed sequence data samples reconstructed by the variational autoencoder from the normal indicator sequence data samples. The abnormal samples include: indicator sequence data samples with an abnormal category; the reconstructed sequence data samples; and random sequence data samples randomly extracted from the encoding feature distribution samples obtained by the encoder of the variational autoencoder from the indicator sequence data samples.

[0165] Entropy determination unit 403 is used to determine the first relative entropy between the encoded feature distribution and the multivariate Gaussian distribution used as a prior distribution;

[0166] Anomaly detection unit 404 is used to determine that the key performance indicator is abnormal if the first relative entropy is greater than a set anomaly threshold, and to generate an anomaly label for the key performance indicator.

[0167] In one possible implementation, the anomaly detection unit includes:

[0168] The difference determination unit is used to determine the difference between the first relative entropy and the abnormal threshold if the first relative entropy is greater than a set abnormal threshold.

[0169] A percentage determination unit is used to determine the percentage of the absolute value of the difference to the abnormal threshold.

[0170] An anomaly determination unit is used to determine the anomaly level of the key performance indicator based on the percentage and multiple set percentage threshold ranges.

[0171] The tag generation unit is used to generate anomaly tags corresponding to the anomaly level for the key performance indicators.

[0172] In another possible implementation, it further includes: an anomaly threshold determination unit, used to obtain the value of the anomaly threshold in the following manner:

[0173] For each labeled index sequence data sample, the index sequence data sample is input into a trained variational autoencoder to obtain the encoded feature distribution sample output by the variational autoencoder.

[0174] For each index sequence data sample, determine the fourth relative entropy between the encoding feature distribution sample corresponding to the index sequence data sample and the multivariate Gaussian distribution;

[0175] Set the initial value for the anomaly threshold;

[0176] Taking the fourth relative entropy being greater than the anomaly threshold as the criterion for determining that an indicator sequence data sample is anomalous, determine the anomaly detection result of the indicator sequence data sample;

[0177] Based on the category labeled for each indicator sequence data sample, determine the recall or precision corresponding to the anomaly detection results of the multiple indicator sequence data samples;

[0178] If the recall or precision does not meet the condition, adjust the value of the anomaly threshold, and return to the operation of determining the anomaly detection result of the indicator sequence data sample based on the adjusted anomaly threshold;

[0179] If the recall or precision meets the condition, take the current value of the anomaly threshold as the set value of the anomaly threshold.

[0180] In another possible implementation, the variational autoencoder in this application is obtained by training based on an objective function using multiple labeled index sequence data samples corresponding to the key performance indicators.

[0181] The objective function includes:

[0182] The first objective function J_E corresponding to the encoder in the variational autoencoder:

[0183]

[0184] And the second objective loss function J_G corresponding to the decoder in the variational autoencoder:

[0185]

[0186] Where y denotes a normal indicator sequence data sample; y_S belongs to y_r, y_p and y_n, where y_r is the reconstructed sequence data sample, y_p is the random sequence data sample, and y_n is an indicator sequence data sample whose category is abnormal;

[0187] Enc(y) denotes the encoded feature distribution sample obtained by passing y through the encoder in the variational autoencoder;

[0188] Enc(y_S) denotes the encoded feature distribution sample obtained by passing y_S through the encoder in the variational autoencoder;

[0189] KL(Enc(y)) denotes the relative entropy between Enc(y) and the multivariate Gaussian distribution;

[0190] KL(Enc(y_S)) denotes the relative entropy between Enc(y_S) and the multivariate Gaussian distribution;

[0191] α and β are different weighting coefficients, and m is a set parameter value;

[0192] Enc(ng(y_S)) denotes the encoded feature distribution sample obtained by passing y_S through the encoder in the variational autoencoder while the trained encoder is kept unchanged;

[0193] KL(Enc(ng(y_S))) denotes the relative entropy between Enc(ng(y_S)) and the multivariate Gaussian distribution.

[0194] In another possible implementation, the device further includes a sample acquisition unit for obtaining the index sequence data sample in the following manner:

[0195] Obtain candidate indicator sequence data for the key indicator data, wherein the candidate indicator sequence data includes candidate indicator data from multiple different historical time points;

[0196] Determine the standard score of each candidate indicator data in the candidate indicator sequence data respectively, and replace the candidate indicator data in the candidate indicator sequence data with the standard score of the candidate indicator data to obtain the reconstructed candidate indicator sequence data;

[0197] The candidate index sequence data is sampled using a sliding window of a set length to obtain multiple sampled index sequence data samples.

[0198] It should be noted that the various embodiments in this specification are described in a progressive manner, with each embodiment focusing on its differences from other embodiments. Similar or identical parts between embodiments can be referred to interchangeably. Furthermore, the features described in the various embodiments of this specification can be substituted or combined with each other, enabling those skilled in the art to implement or use this application. For apparatus embodiments, since they are basically similar to method embodiments, the description is relatively simple; relevant parts can be referred to the descriptions of the method embodiments.

[0199] Finally, it should be noted that in this document, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any such actual relationship or order between these entities or operations. Furthermore, the terms "comprising," "including," or any other variations thereof are intended to cover non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or apparatus. Without further limitations, an element defined by the phrase "comprising one..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that includes the element.

[0200] The above description of the disclosed embodiments enables those skilled in the art to make or use this application. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the general principles defined herein may be implemented in other embodiments without departing from the spirit or scope of this application. Therefore, this application is not to be limited to the embodiments shown herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

[0201] The above are merely preferred embodiments of this application. It should be noted that those skilled in the art can make various improvements and modifications without departing from the principles of this application, and these improvements and modifications should also be considered within the scope of protection of this application.

Claims

1. A method for detecting anomalies in indicators and generating labels, characterized in that it includes: obtaining indicator sequence data of a key performance indicator to be detected in a business system, wherein the indicator sequence data includes indicator data of the key performance indicator at multiple time points; inputting the indicator sequence data into a trained variational autoencoder to obtain the encoding feature distribution output by the encoder in the variational autoencoder; determining a first relative entropy between the encoding feature distribution and a multivariate Gaussian distribution used as a prior distribution, wherein the first relative entropy is a metric used to measure the similarity between the encoding feature distribution and the multivariate Gaussian distribution, so as to determine whether the key performance indicator is abnormal; and, if the first relative entropy is greater than a set anomaly threshold, determining that the key performance indicator is abnormal and generating an anomaly label for the key performance indicator. The variational autoencoder is obtained by training, based on an objective function, with multiple category-labeled indicator sequence data samples corresponding to the key performance indicator, with the goal of minimizing a second relative entropy and a sample error and maximizing a third relative entropy between abnormal samples and the multivariate Gaussian distribution. Each indicator sequence data sample belongs to one of the categories normal and abnormal. The abnormal samples include: indicator sequence data samples whose category is abnormal; reconstructed sequence data samples; and random sequence data samples constructed by randomly drawing data according to the mean and variance corresponding to the encoded feature distribution samples obtained by the encoder of the variational autoencoder encoding the indicator sequence data samples.
The second relative entropy is the relative entropy between the normal indicator sequence data samples and the multivariate Gaussian distribution, wherein the normal indicator sequence data samples are indicator sequence data samples whose category is normal. The sample error is the error between a normal indicator sequence data sample and the reconstructed sequence data sample reconstructed by the variational autoencoder from that normal indicator sequence data sample. The objective function includes a first objective function J_E corresponding to the encoder in the variational autoencoder and a second objective loss function J_G corresponding to the decoder in the variational autoencoder, whose symbols are defined as follows: y denotes an indicator sequence data sample whose category is normal; y_S belongs to y_r, y_p and y_n, where y_r is the reconstructed sequence data sample, y_p is the random sequence data sample, and y_n is an indicator sequence data sample whose category is abnormal; Enc(y) denotes the encoded feature distribution sample obtained by passing y through the encoder in the variational autoencoder; Enc(y_S) denotes the encoded feature distribution sample obtained by passing y_S through the encoder in the variational autoencoder; KL(Enc(y)) denotes the relative entropy between Enc(y) and the multivariate Gaussian distribution; KL(Enc(y_S)) denotes the relative entropy between Enc(y_S) and the multivariate Gaussian distribution; α and β are different set weighting coefficients, and m is a set parameter value; Enc(ng(y_S)) denotes the encoded feature distribution sample obtained by passing y_S through the encoder in the variational autoencoder while the trained encoder is kept unchanged; and KL(Enc(ng(y_S))) denotes the relative entropy between Enc(ng(y_S)) and the multivariate Gaussian distribution.

2. The method according to claim 1, characterized in that, if the first relative entropy is greater than the set anomaly threshold, determining that the key performance indicator is abnormal and generating an anomaly label for the key performance indicator includes: determining the difference between the first relative entropy and the anomaly threshold; determining the percentage of the absolute value of the difference relative to the anomaly threshold; determining the anomaly level of the key performance indicator based on the percentage and multiple set percentage threshold ranges; and generating, for the key performance indicator, an anomaly label corresponding to the anomaly level.

3. The method according to claim 1, characterized in that the value of the anomaly threshold is obtained in the following way: for each category-labeled indicator sequence data sample, inputting the indicator sequence data sample into the trained variational autoencoder to obtain the encoded feature distribution sample output by the encoder of the variational autoencoder; for each indicator sequence data sample, determining a fourth relative entropy between the encoded feature distribution sample corresponding to the indicator sequence data sample and the multivariate Gaussian distribution; setting an initial value for the anomaly threshold; taking the fourth relative entropy being greater than the anomaly threshold as the criterion for determining that an indicator sequence data sample is anomalous, determining the anomaly detection result of the indicator sequence data sample; determining, based on the category labeled for each indicator sequence data sample, the recall or precision corresponding to the anomaly detection results of the multiple indicator sequence data samples; if the recall or precision does not meet the condition, adjusting the value of the anomaly threshold and returning to the operation of determining the anomaly detection result of the indicator sequence data sample based on the adjusted anomaly threshold; and, if the recall or precision meets the condition, taking the current value of the anomaly threshold as the set value of the anomaly threshold.

4. The method according to claim 1, characterized in that the indicator sequence data samples are obtained in the following manner: obtaining candidate indicator sequence data of the key performance indicator, wherein the candidate indicator sequence data includes candidate indicator data from multiple different historical time points; determining the standard score of each candidate indicator data in the candidate indicator sequence data, and replacing the candidate indicator data in the candidate indicator sequence data with the standard score of the candidate indicator data to obtain reconstructed candidate indicator sequence data; and sampling the reconstructed candidate indicator sequence data using a sliding window of a set length to obtain multiple indicator sequence data samples.

5. An indicator anomaly detection device, characterized in that it includes: a data acquisition unit, used to acquire indicator sequence data of a key performance indicator to be detected in a business system, wherein the indicator sequence data includes indicator data of the key performance indicator at multiple time points; a model processing unit, used to input the indicator sequence data into a trained variational autoencoder to obtain the encoding feature distribution output by the encoder in the variational autoencoder; an entropy determination unit, used to determine a first relative entropy between the encoding feature distribution and a multivariate Gaussian distribution used as a prior distribution, wherein the first relative entropy is a metric used to measure the similarity between the encoding feature distribution and the multivariate Gaussian distribution, so as to determine whether the key performance indicator is abnormal; and an anomaly detection unit, used to determine that the key performance indicator is abnormal if the first relative entropy is greater than a set anomaly threshold, and to generate an anomaly label for the key performance indicator. The variational autoencoder is trained using multiple category-labeled indicator sequence data samples corresponding to the key performance indicator, with the goal of minimizing a second relative entropy and a sample error and maximizing a third relative entropy between abnormal samples and the multivariate Gaussian distribution. Each indicator sequence data sample belongs to one of the categories normal and abnormal. The abnormal samples include: indicator sequence data samples whose category is abnormal; reconstructed sequence data samples; and random sequence data samples constructed by randomly drawing data according to the mean and variance of the encoded feature distribution samples obtained by the encoder of the variational autoencoder.
The second relative entropy is the relative entropy between the normal indicator sequence data samples and the multivariate Gaussian distribution, and the normal indicator sequence data samples are those whose category is normal. The sample error is the error between a normal indicator sequence data sample and the reconstructed sequence data sample reconstructed by the variational autoencoder from that normal indicator sequence data sample. The objective function includes a first objective function J_E corresponding to the encoder in the variational autoencoder and a second objective loss function J_G corresponding to the decoder in the variational autoencoder, whose symbols are defined as follows: y denotes an indicator sequence data sample whose category is normal; y_S belongs to y_r, y_p and y_n, where y_r is the reconstructed sequence data sample, y_p is the random sequence data sample, and y_n is an indicator sequence data sample whose category is abnormal; Enc(y) denotes the encoded feature distribution sample obtained by passing y through the encoder in the variational autoencoder; Enc(y_S) denotes the encoded feature distribution sample obtained by passing y_S through the encoder in the variational autoencoder; KL(Enc(y)) denotes the relative entropy between Enc(y) and the multivariate Gaussian distribution; KL(Enc(y_S)) denotes the relative entropy between Enc(y_S) and the multivariate Gaussian distribution; α and β are different set weighting coefficients, and m is a set parameter value; Enc(ng(y_S)) denotes the encoded feature distribution sample obtained by passing y_S through the encoder in the variational autoencoder while the trained encoder is kept unchanged; and KL(Enc(ng(y_S))) denotes the relative entropy between Enc(ng(y_S)) and the multivariate Gaussian distribution.

6. The apparatus according to claim 5, characterized in that the anomaly detection unit includes: a difference determination unit, used to determine the difference between the first relative entropy and the anomaly threshold if the first relative entropy is greater than the set anomaly threshold; a percentage determination unit, used to determine the percentage of the absolute value of the difference relative to the anomaly threshold; an anomaly determination unit, used to determine the anomaly level of the key performance indicator based on the percentage and multiple set percentage threshold ranges; and a label generation unit, used to generate, for the key performance indicator, an anomaly label corresponding to the anomaly level.

7. The apparatus according to claim 5, characterized in that it further includes an anomaly threshold determination unit, used to obtain the value of the anomaly threshold in the following manner: for each category-labeled indicator sequence data sample, inputting the indicator sequence data sample into the trained variational autoencoder to obtain the encoded feature distribution sample output by the encoder of the variational autoencoder; for each indicator sequence data sample, determining a fourth relative entropy between the encoded feature distribution sample corresponding to the indicator sequence data sample and the multivariate Gaussian distribution; setting an initial value for the anomaly threshold; taking the fourth relative entropy being greater than the anomaly threshold as the criterion for determining that an indicator sequence data sample is anomalous, determining the anomaly detection result of the indicator sequence data sample; determining, based on the category labeled for each indicator sequence data sample, the recall or precision corresponding to the anomaly detection results of the multiple indicator sequence data samples; if the recall or precision does not meet the condition, adjusting the value of the anomaly threshold and returning to the operation of determining the anomaly detection result of the indicator sequence data sample based on the adjusted anomaly threshold; and, if the recall or precision meets the condition, taking the current value of the anomaly threshold as the set value of the anomaly threshold.