Abnormal information detection model construction method and grayscale environment anomaly detection method
By constructing an anomaly detection model and utilizing the prediction accuracy of training samples and the initial model, combined with detection deviation and alarm information, the problems of low accuracy and poor timeliness of anomaly detection in grayscale environments are solved, achieving fast and accurate system anomaly detection.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- INDUSTRIAL AND COMMERCIAL BANK OF CHINA
- Filing Date
- 2022-02-11
- Publication Date
- 2026-06-16
AI Technical Summary
In existing technologies, the detection system for abnormal situations in grayscale environments, which relies on acquired operational data and pre-set detection thresholds, has a low level of intelligence, low detection accuracy, and cannot detect abnormal situations in a timely manner.
An anomaly detection model is constructed by acquiring training samples and initial models, training multiple initial models using training data, constructing an anomaly detection model based on prediction accuracy, and combining detection deviation and alarm information for anomaly detection.
It improves the accuracy of anomaly detection in grayscale environments, enables rapid location of system malfunctions, reduces manual analysis time, and enhances the timeliness and accuracy of detection.
Description
Technical Field
[0001] This disclosure relates to the field of artificial intelligence, specifically to a method for constructing an anomaly detection model, a method for detecting anomalies in grayscale environments, and related devices, equipment, media, and program products.
Background Technology
[0002] To keep pace with rapid business launches, system function upgrade cycles have been further shortened. Deploying a system to a grayscale environment as a stage in its release to production allows abnormal situations in its operation to be detected and targeted adjustments to be made, thereby effectively reducing the risks of the system going live.
[0003] In realizing the inventive concept of this disclosure, the inventors found that related technologies, which detect abnormal situations in a system running in a grayscale environment by acquiring operational data and applying preset detection thresholds, have a low level of intelligence and low detection accuracy.
Summary of the Invention
[0004] In view of the above problems, this disclosure provides a method for constructing an anomaly detection model, a method for detecting anomalies in grayscale environments, an apparatus, equipment, media, and program products.
[0005] According to the first aspect of this disclosure, a method for constructing an anomaly information detection model is provided, comprising:
[0006] Obtain training samples, wherein the training samples include training data and sample labels corresponding to the training data, the training data includes a first historical detection deviation, the first historical detection deviation is generated based on the first historical detection data of the system under test running in a grayscale environment and the second historical detection data of the system under test running in a formal environment, the sample labels include a second historical deviation, and the time sequence label of the training data is earlier than that of the sample labels;
[0007] Obtain N initial models, where each of the N initial models has a different network structure, and N≥2;
[0008] Using the training data described above, train each of the N initial models mentioned above to obtain N prediction models after training, where each prediction model corresponds to one of the N initial models mentioned above.
[0009] Based on the above sample labels, determine the prediction accuracy of each of the N prediction models mentioned above;
[0010] Based on the prediction accuracy of each of the N prediction models mentioned above, an anomaly detection model is constructed.
[0011] According to embodiments of this disclosure, the prediction accuracy of each of the N prediction models is determined based on the aforementioned sample labels, including:
[0012] The training data is input into N prediction models to obtain the prediction results output by each prediction model.
[0013] Using the aforementioned sample labels, process the prediction results output by each of the N prediction models to determine the prediction accuracy of each of the N prediction models.
[0014] According to embodiments of this disclosure, constructing an anomaly detection model based on the prediction accuracies of each of the N aforementioned prediction models includes:
[0015] Based on the prediction accuracy of each of the N prediction models described above, determine the detection weight corresponding to each prediction model.
[0016] Based on N prediction models and the corresponding detection weights for each prediction model, the above-mentioned anomaly detection model is constructed.
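The mapping from prediction accuracy to detection weight is not specified above. The following is a minimal sketch, assuming the weights are simply the accuracies normalized to sum to 1 and that the combined model outputs the weighted sum of the individual predictions. The function names `detection_weights` and `weighted_prediction` are hypothetical and do not come from the disclosure.

```python
from typing import List, Sequence


def detection_weights(accuracies: Sequence[float]) -> List[float]:
    """Turn per-model accuracies into weights that sum to 1.

    Assumption: weights are proportional to accuracy. The disclosure only
    says the weights are determined "based on" the prediction accuracy.
    """
    total = sum(accuracies)
    if total <= 0:
        raise ValueError("at least one model must have a positive accuracy")
    return [accuracy / total for accuracy in accuracies]


def weighted_prediction(predictions: Sequence[float],
                        weights: Sequence[float]) -> float:
    """Combine the N model outputs into one predicted deviation."""
    if len(predictions) != len(weights):
        raise ValueError("one weight is required per prediction")
    return sum(p * w for p, w in zip(predictions, weights))


# Hypothetical accuracies and predictions for three prediction models.
weights = detection_weights([0.5, 0.25, 0.25])
combined = weighted_prediction([10.0, 20.0, 40.0], weights)
```

Other mappings (for example, discarding models below a minimum accuracy before normalizing) would fit the same description equally well.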
[0017] According to embodiments of this disclosure, the initial model described above includes a model constructed based on a time series algorithm.
[0018] According to embodiments of this disclosure, the model constructed based on the time series algorithm described above includes at least one of the following:
[0019] Exponential smoothing model, autoregressive moving average model, Prophet model, autoregressive model, moving average model.
[0020] According to embodiments of this disclosure, the first historical detection data or the second historical detection data mentioned above includes at least one of the following:
[0021] The response time of the system under test, the system capacity of the system under test, the frequency of operational errors of the system under test, and the request frequency of the system under test.
[0022] The second aspect of this disclosure provides a method for detecting anomalies in grayscale environments, including:
[0023] The detection deviation is obtained based on the first detection data of the system under test running in a grayscale environment and the second detection data of the system running in a formal environment.
[0024] The above-mentioned detection deviation is input into the anomaly detection model, and the anomaly detection result is output. The above-mentioned anomaly detection result represents the abnormal operation of the above-mentioned system under test. The above-mentioned anomaly detection model is constructed by the above-mentioned anomaly detection model construction method.
[0025] According to embodiments of this disclosure, the above-described grayscale environment anomaly detection method further includes:
[0026] Based on the above abnormal information detection results, and / or based on the first matching result between the first alarm information generated by the system under test running in the grayscale environment and the target alarm information in the target alarm information database, the abnormal operation of the above-mentioned system under test is determined, wherein the above-mentioned target alarm information database is constructed based on the second alarm information generated by the above-mentioned system under test running in the formal environment.
[0027] According to embodiments of this disclosure, the above-described grayscale environment anomaly detection method further includes:
[0028] Obtain the second alarm information generated by the system under test running in the aforementioned formal environment;
[0029] The second alarm information is matched with each target alarm information in the target alarm information database to obtain a second matching result corresponding to each target alarm information.
[0030] If the second matching result corresponding to each of the above-mentioned second alarm information and each of the above-mentioned target alarm information indicates a mismatch, the above-mentioned second alarm information is added to the above-mentioned target alarm information database as a new target alarm information, thus obtaining an updated target alarm information database.
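The database update described in the preceding steps can be sketched as follows. The matching rule is not defined in the text, so `exact_match` (case-insensitive equality after trimming whitespace) is a hypothetical stand-in, as is the choice to represent alarm information as plain strings.

```python
from typing import Callable, List


def exact_match(alarm: str, target: str) -> bool:
    """Hypothetical matching rule: trimmed, case-insensitive equality."""
    return alarm.strip().lower() == target.strip().lower()


def update_target_alarms(second_alarm: str,
                         target_alarms: List[str],
                         matches: Callable[[str, str], bool] = exact_match
                         ) -> List[str]:
    """Return the (possibly updated) target alarm information database.

    The second alarm information is compared with every target entry; it is
    appended as a new target entry only if every comparison is a mismatch.
    """
    results = [matches(second_alarm, target) for target in target_alarms]
    if not any(results):
        return list(target_alarms) + [second_alarm]
    return list(target_alarms)


# Hypothetical alarm texts from the formal environment.
database = ["disk usage above 90%"]
database = update_target_alarms("Connection pool exhausted", database)
```

A real system would likely match on structured fields (alarm code, component, severity) rather than raw text; the callable parameter leaves room for that.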
[0031] A third aspect of this disclosure provides an apparatus for constructing an anomaly detection model, comprising:
[0032] The sample acquisition module is used to acquire training samples, wherein the training samples include training data and sample labels corresponding to the training data, the training data includes a first historical detection deviation, the first historical detection deviation is generated based on the first historical detection data of the system under test running in a grayscale environment and the second historical detection data of the system under test running in a formal environment, the sample labels include a second historical deviation, and the time sequence label of the training data is earlier than that of the sample labels.
[0033] The initial model acquisition module is used to acquire N initial models, each of which has a different network structure.
[0034] The training module is used to train each of the N initial models mentioned above using the training data to obtain N prediction models after training. Each prediction model corresponds to one of the N initial models mentioned above, and N≥2.
[0035] The determination module is used to determine the prediction accuracy of each of the N prediction models mentioned above, based on the sample labels; and
[0036] The construction module is used to construct an anomaly detection model based on the prediction accuracy of each of the N prediction models mentioned above.
[0037] The fourth aspect of this disclosure provides a grayscale environment anomaly detection device, comprising:
[0038] The acquisition module is used to acquire the detection deviation, which is generated based on the first detection data of the system under test running in a grayscale environment and the second detection data of the system running in a formal environment; and
[0039] The detection module is used to input the above-mentioned detection deviation into the anomaly detection model and output the anomaly detection result. The above-mentioned anomaly detection result represents the abnormal operation of the above-mentioned system under test. The above-mentioned anomaly detection model is constructed by the above-mentioned anomaly detection model construction method.
[0040] The fifth aspect of this disclosure provides an electronic device, comprising: one or more processors; and a memory for storing one or more programs, wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to perform the above-described method for constructing an anomaly detection model or the above-described method for detecting anomalies in grayscale environments.
[0041] A sixth aspect of this disclosure also provides a computer-readable storage medium having executable instructions stored thereon, which, when executed by a processor, cause the processor to perform the above-described method for constructing an anomaly detection model or the above-described method for detecting anomalies in grayscale environments.
[0042] The seventh aspect of this disclosure also provides a computer program product, including a computer program that, when executed by a processor, implements the above-described method for constructing an anomaly detection model or the above-described method for detecting anomalies in grayscale environments.
Attached Figure Description
[0043] The foregoing contents, as well as other objects, features, and advantages of this disclosure, will become clearer from the following description of embodiments with reference to the accompanying drawings, in which:
[0044] Figure 1 schematically shows an application scenario of the method for constructing an anomaly detection model, the grayscale environment anomaly detection method, and the apparatus according to embodiments of the present disclosure;
[0045] Figure 2 schematically shows a flowchart of a method for constructing an anomaly detection model according to an embodiment of the present disclosure;
[0046] Figure 3 schematically shows a flowchart of determining the prediction accuracy of each of N prediction models based on sample labels according to an embodiment of the present disclosure;
[0047] Figure 4 schematically shows a flowchart of constructing an anomaly detection model based on the prediction accuracy of each of N prediction models according to an embodiment of the present disclosure;
[0048] Figure 5 schematically shows a flowchart of a grayscale environment anomaly detection method according to an embodiment of the present disclosure;
[0049] Figure 6 shows an application scenario of the grayscale environment anomaly detection method according to an embodiment of the present disclosure;
[0050] Figure 7 schematically shows a flowchart of a grayscale environment anomaly detection method according to another embodiment of the present disclosure;
[0051] Figure 8 schematically shows a structural block diagram of an apparatus for constructing an anomaly detection model according to an embodiment of the present disclosure;
[0052] Figure 9 schematically shows a structural block diagram of a grayscale environment anomaly detection device according to an embodiment of the present disclosure; and
[0053] Figure 10 schematically shows a block diagram of an electronic device suitable for implementing a method for constructing an anomaly detection model and a grayscale environment anomaly detection method according to embodiments of the present disclosure.
Detailed Implementation
[0054] The embodiments of the present disclosure will now be described with reference to the accompanying drawings. However, it should be understood that these descriptions are exemplary only and are not intended to limit the scope of the disclosure. In the following detailed description, numerous specific details are set forth to provide a thorough understanding of the embodiments of the present disclosure for ease of explanation. However, it will be apparent that one or more embodiments may be practiced without these specific details. Furthermore, descriptions of well-known structures and techniques are omitted in the following description to avoid unnecessarily obscuring the concepts of the present disclosure.
[0055] The terminology used herein is for the purpose of describing particular embodiments only and is not intended to limit this disclosure. The terms “comprising,” “including,” etc., as used herein indicate the presence of the stated features, steps, operations, and / or components, but do not exclude the presence or addition of one or more other features, steps, operations, or components.
[0056] All terms used herein (including technical and scientific terms) have the meanings commonly understood by those skilled in the art, unless otherwise defined. It should be noted that the terms used herein are to be interpreted in a manner consistent with the context of this specification, and not in an idealized or overly rigid way.
[0057] When using expressions such as "at least one of A, B, and C", they should generally be interpreted in accordance with the meaning that is commonly understood by a person skilled in the art (e.g., "a system having at least one of A, B, and C" should include, but is not limited to, a system having A alone, a system having B alone, a system having C alone, a system having A and B, a system having A and C, a system having B and C, and / or a system having A, B, and C, etc.).
[0058] A grayscale environment can refer to an online simulation environment or a pre-release environment. A system can be deployed in a grayscale environment to test its operational status and to promptly identify and address problems that arise during operation, enabling a smooth transition to a full release. The formal environment (that is, the production environment) can include the runtime environment in which the system officially runs.
[0059] In the process of realizing the inventive concept of this disclosure, the inventors found that the accuracy of detecting abnormal situations when the system is running in a grayscale environment is low, and abnormal situations cannot be detected in a timely manner.
[0060] Embodiments of this disclosure provide a method for constructing an anomaly information detection model, including:
[0061] Obtain training samples, which include training data and corresponding sample labels. The training data includes a first historical detection deviation, generated based on the first historical detection data of the system under test running in a grayscale environment and the second historical detection data of the system running in a formal environment. The sample labels include a second historical deviation, and the temporal label of the training data is earlier than the sample label. Obtain N initial models, each with a different network structure, where N ≥ 2. Train each of the N initial models using the training data to obtain N trained prediction models, each prediction model corresponding to one of the N initial models. Determine the prediction accuracy of each of the N prediction models based on the sample labels. Construct an anomaly detection model based on the prediction accuracy of each of the N prediction models.
[0062] According to embodiments of this disclosure, a first historical detection deviation is generated using first historical detection data from the system under test running in a grayscale environment and second historical detection data from the system under test running in a formal environment. N initial models with different network structures are then trained using the first historical detection deviation to obtain N prediction models for predicting the detection deviation of the system under test. The prediction accuracy of each of the N prediction models is determined based on the sample labels, and an anomaly detection model is constructed based on the prediction accuracy of each of the N prediction models. This allows the constructed anomaly detection model to improve the accuracy of detecting anomalies in the system under test.
[0063] Embodiments of this disclosure also provide a method for detecting anomalies in grayscale environments, including:
[0064] The detection deviation is obtained based on the first detection data of the system under test running in a grayscale environment and the second detection data of the system under test running in a formal environment. The detection deviation is input into the anomaly detection model and the anomaly detection result is output. The anomaly detection result represents the abnormal operation of the system under test. The anomaly detection model is constructed by the above-mentioned anomaly detection model construction method.
[0065] According to embodiments of this disclosure, a detection deviation is generated based on first detection data from the system under test running in a grayscale environment and second detection data from the system under test running in a formal environment. This allows for real-time acquisition of the system under test's operation in both grayscale and formal environments. By processing the detection deviation, the system under test's operation in both environments can be analyzed. The anomaly detection model constructed using the above method processes the detection deviation, enabling the obtained anomaly detection results to quickly identify anomalies in the system under test's operation in the grayscale environment. This allows relevant maintenance personnel to quickly locate anomalies in the system under test, and compared to manual analysis of anomaly information, it effectively improves the accuracy of anomaly detection.
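How the anomaly detection model turns a detection deviation into an anomaly detection result is not spelled out at this point. One plausible reading, offered only as an assumption, is that the model predicts the expected deviation from recent history and the observed deviation is flagged when it departs from that prediction by more than a tolerance. In the sketch below, `detect_anomaly`, `mean_predictor`, and `tolerance` are all hypothetical names, and `mean_predictor` merely stands in for the constructed model.

```python
from typing import Callable, Sequence


def mean_predictor(history: Sequence[float]) -> float:
    """Stand-in for the constructed model: predicts the mean past deviation."""
    return sum(history) / len(history)


def detect_anomaly(gray_value: float,
                   formal_value: float,
                   history: Sequence[float],
                   predict: Callable[[Sequence[float]], float],
                   tolerance: float) -> bool:
    """Return True if the current detection deviation looks abnormal.

    Assumption: "abnormal" means the observed deviation differs from the
    model's predicted deviation by more than `tolerance`.
    """
    observed = gray_value - formal_value   # current detection deviation
    expected = predict(history)            # deviation the model expects
    return abs(observed - expected) > tolerance


# Hypothetical response times (ms) and past deviations.
past_deviations = [2.0, 4.0, 3.0]
is_abnormal = detect_anomaly(180.0, 119.0, past_deviations, mean_predictor, 10.0)
```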
[0066] In the technical solution disclosed herein, the collection, storage, use, processing, transmission, provision, disclosure, and application of user personal information comply with the provisions of relevant laws and regulations, necessary confidentiality measures have been taken, and there is no violation of public order and good morals.
[0067] In the technical solution disclosed herein, the user's authorization or consent is obtained before acquiring or collecting the user's personal information.
[0068] Figure 1 schematically shows an application scenario of the method for constructing an anomaly detection model, the grayscale environment anomaly detection method, and the apparatus according to embodiments of the present disclosure.
[0069] As shown in Figure 1, application scenario 100 according to this embodiment may include terminal devices 101, 102, and 103, network 104, and server 105. Network 104 serves as a medium providing communication links between terminal devices 101, 102, and 103 and server 105. Network 104 may include various connection types, such as wired or wireless communication links or fiber optic cables.
[0070] Users can use terminal devices 101, 102, and 103 to interact with server 105 via network 104 to receive or send messages, etc. Various communication client applications can be installed on terminal devices 101, 102, and 103, such as shopping applications, web browser applications, search applications, instant messaging tools, email clients, social media platform software, etc. (for example only).
[0071] Terminal devices 101, 102, and 103 can be various electronic devices with displays and web browsing capabilities, including but not limited to smartphones, tablets, laptops, and desktop computers.
[0072] Server 105 can be a server that provides various services, such as a backend management server that supports websites browsed by users using terminal devices 101, 102, and 103 (for example only). The backend management server can analyze and process data such as received user requests, and feed back the processing results (such as web pages, information, or data obtained or generated according to user requests) to the terminal devices.
[0073] It should be noted that the anomaly detection model construction method and grayscale environment anomaly detection method provided in this embodiment can generally be executed by server 105. Correspondingly, the anomaly detection model construction device and grayscale environment anomaly detection device provided in this embodiment can generally be located in server 105. The anomaly detection model construction method and grayscale environment anomaly detection method provided in this embodiment can also be executed by a server or server cluster that is different from server 105 and capable of communicating with terminal devices 101, 102, 103 and / or server 105. Correspondingly, the anomaly detection model construction device and grayscale environment anomaly detection device provided in this embodiment can also be located in a server or server cluster that is different from server 105 and capable of communicating with terminal devices 101, 102, 103 and / or server 105.
[0074] The method for constructing the anomaly detection model and the grayscale environment anomaly detection method provided in this embodiment can also be executed by terminal devices 101, 102, and 103. Correspondingly, the apparatus for constructing the anomaly detection model and the grayscale environment anomaly detection apparatus provided in this embodiment can generally be located in terminal devices 101, 102, and 103. The method for constructing the anomaly detection model and the grayscale environment anomaly detection method provided in this embodiment can also be executed by terminal devices different from terminal devices 101, 102, and 103 and / or server 105. Correspondingly, the apparatus for constructing the anomaly detection model and the grayscale environment anomaly detection apparatus provided in this embodiment can also be located in terminal devices different from terminal devices 101, 102, and 103 and / or server 105.
[0075] It should be understood that the numbers of terminal devices, networks, and servers shown in Figure 1 are merely illustrative. Depending on implementation needs, any number of terminal devices, networks, and servers can be included.
[0076] Based on the scenario described in Figure 1, the method for constructing the anomaly detection model of the disclosed embodiments is described in detail below with reference to Figures 2-4.
[0077] Figure 2 schematically shows a flowchart of a method for constructing an anomaly detection model according to an embodiment of the present disclosure.
[0078] As shown in Figure 2, the method for constructing the anomaly detection model may include operations S210 to S250.
[0079] In operation S210, training samples are acquired, wherein the training samples include training data and sample labels corresponding to the training data. The training data includes a first historical detection deviation, which is generated based on the first historical detection data of the system under test running in a grayscale environment and the second historical detection data of the system under test running in a formal environment. The sample labels include a second historical deviation, and the time sequence label of the training data is earlier than that of the sample labels.
[0080] According to embodiments of this disclosure, the first historical detection data may include detection index data generated by the system under test running in a grayscale environment within a historical time period, such as the response time of the system under test in a grayscale environment. Correspondingly, the second historical detection data may include detection index data generated by the system under test running in a formal environment within a historical time period.
[0081] According to embodiments of this disclosure, the first historical detection deviation can be generated based on the first historical detection data and the second historical detection data. For example, the first historical detection deviation can be generated based on the difference between the first historical detection data and the second historical detection data.
[0082] It should be noted that the first historical detection data and the second historical detection data can be generated at the same time or at different times, and those skilled in the art can set them according to actual needs.
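As a minimal sketch of the difference-based example above, assuming both series hold the same detection index (here, response time) sampled at the same instants. The function name `detection_deviation` and the sample values are hypothetical.

```python
from typing import List, Sequence


def detection_deviation(grayscale: Sequence[float],
                        formal: Sequence[float]) -> List[float]:
    """Element-wise difference between grayscale and formal readings."""
    if len(grayscale) != len(formal):
        raise ValueError("both series must cover the same sampling points")
    return [g - f for g, f in zip(grayscale, formal)]


# Hypothetical response times in milliseconds.
gray_ms = [120.0, 125.0, 180.0]
formal_ms = [118.0, 121.0, 119.0]
deviation = detection_deviation(gray_ms, formal_ms)
```

If the two series are collected at different times, they would first need to be aligned (for example, by time bucket) before taking the difference; that step is omitted here.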
[0083] According to embodiments of this disclosure, the sample label includes a second historical deviation, and the temporal label of the training data is earlier than the sample label, so the second historical deviation is later in time than the first historical deviation.
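The temporal ordering between training data and sample labels can be illustrated with a sliding window over a deviation series: each window of earlier deviations serves as training data, and the deviation that immediately follows serves as its label. This is a sketch under the assumption of a fixed window length; `make_training_samples` and `window` are hypothetical names.

```python
from typing import List, Sequence, Tuple


def make_training_samples(deviations: Sequence[float],
                          window: int) -> List[Tuple[List[float], float]]:
    """Pair each run of `window` earlier deviations with the next deviation."""
    if window < 1:
        raise ValueError("window must be at least 1")
    samples = []
    for end in range(window, len(deviations)):
        history = list(deviations[end - window:end])  # earlier: training data
        label = deviations[end]                       # later: sample label
        samples.append((history, label))
    return samples


# Hypothetical deviation series, oldest first.
samples = make_training_samples([2.0, 4.0, 3.0, 5.0, 61.0], window=3)
```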
[0084] In operation S220, N initial models are obtained, where each of the N initial models has a different network structure, and N≥2.
[0085] According to embodiments of this disclosure, the initial model may be constructed based on a time series prediction algorithm, but is not limited thereto; it may also be constructed based on a machine learning model.
[0086] In operation S230, each of the N initial models is trained using the training data to obtain N trained prediction models, where each prediction model corresponds to one of the N initial models.
[0087] According to embodiments of this disclosure, training of the initial model can be completed by testing its residuals. For example, when the initial model is an autoregressive moving average model (ARMA model) built on a time series prediction algorithm, the model residuals are tested, and if they are found to be normally distributed white noise, the model is taken as the trained prediction model.
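The specific residual test is not named in the text. As a rough illustration only, the sketch below applies a common heuristic: residuals are treated as white noise if their sample autocorrelations at the first few lags stay inside the band ±z/√n. It does not check normality, and a formal test such as Ljung-Box would normally be preferred. The function names and default parameters are assumptions.

```python
import math
import random
from typing import Sequence


def autocorrelation(values: Sequence[float], lag: int) -> float:
    """Sample autocorrelation of `values` at the given positive lag."""
    n = len(values)
    mean = sum(values) / n
    denom = sum((v - mean) ** 2 for v in values)
    if denom == 0.0:
        return 0.0  # constant series: no variation to correlate
    num = sum((values[t] - mean) * (values[t - lag] - mean)
              for t in range(lag, n))
    return num / denom


def residuals_look_like_white_noise(residuals: Sequence[float],
                                    max_lag: int = 10,
                                    z: float = 1.96) -> bool:
    """Heuristic check: all autocorrelations within +/- z / sqrt(n)."""
    bound = z / math.sqrt(len(residuals))
    return all(abs(autocorrelation(residuals, lag)) <= bound
               for lag in range(1, max_lag + 1))


# A linear trend left in the residuals means the model is not yet adequate.
trend = [float(i) for i in range(100)]
trend_ok = residuals_look_like_white_noise(trend)

# Simulated Gaussian noise, for comparison (seeded for repeatability).
rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(2000)]
```

With the default 95% band, roughly one lag in twenty is expected to fall outside the band even for true white noise, so a single marginal violation should not be over-interpreted.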
[0088] It should be noted that the embodiments disclosed herein do not limit the specific training process of the initial model, and those skilled in the art can make selections based on the network structure of the initial model.
[0089] In operation S240, the prediction accuracy of each of the N prediction models is determined based on the sample labels.
[0090] In operation S250, an anomaly detection model is constructed based on the prediction accuracy of each of the N prediction models.
[0091] According to embodiments of this disclosure, the prediction accuracy of the prediction model can be generated based on the comparison between the prediction results output by the prediction model and the sample labels. It should be understood that different prediction models may have the same or different prediction accuracies.
[0092] According to embodiments of this disclosure, constructing an anomaly detection model based on the prediction accuracy of each of the N prediction models may include selecting the prediction model with the highest prediction accuracy from the N prediction models as the anomaly detection model, or combining M of the N prediction models into the anomaly detection model, where N≥M≥2.
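Both options can be sketched in a few lines. The text does not say how the M models are chosen; the sketch assumes they are the M most accurate ones. The model names and accuracy values are hypothetical.

```python
from typing import Dict, List


def select_best(accuracies: Dict[str, float]) -> str:
    """Name of the single most accurate prediction model."""
    return max(accuracies, key=lambda name: accuracies[name])


def select_top_m(accuracies: Dict[str, float], m: int) -> List[str]:
    """Names of the M most accurate models (assumed selection rule)."""
    if not 2 <= m <= len(accuracies):
        raise ValueError("M must satisfy 2 <= M <= N")
    ranked = sorted(accuracies, key=lambda name: accuracies[name], reverse=True)
    return ranked[:m]


# Hypothetical accuracies for N = 3 prediction models.
accuracies = {"exponential_smoothing": 0.82, "arma": 0.91, "prophet": 0.88}
best = select_best(accuracies)
top_two = select_top_m(accuracies, 2)
```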
[0093] Since the detection index data generated by the system under test in both grayscale and formal environments are of various types, an anomaly detection model can be constructed by comprehensively considering the prediction accuracy of each of the N prediction models. This model can effectively improve the accuracy of anomaly detection during the operation of the system under test.
[0094] According to embodiments of this disclosure, a first historical detection deviation is generated using first historical detection data from the system under test running in a grayscale environment and second historical detection data from the system under test running in a formal environment. N initial models with different network structures are then trained using the first historical detection deviation to obtain N prediction models for predicting the detection deviation of the system under test. The prediction accuracy of each of the N prediction models is determined based on the sample labels, and an anomaly detection model is constructed based on the prediction accuracy of each of the N prediction models. This allows the constructed anomaly detection model to improve the accuracy of detecting anomalies in the system under test.
[0095] According to embodiments of this disclosure, the initial model includes a model constructed based on a time series algorithm.
[0096] According to embodiments of this disclosure, the model constructed based on time series algorithms includes at least one of the following:
[0097] Exponential smoothing model, autoregressive moving average model, Prophet model, autoregressive model, moving average model.
[0098] According to embodiments of this disclosure, the initial model may also include other models built based on time series forecasting algorithms, such as autoregressive moving average models (ARMA models), etc.
[0099] In this embodiment, the exponential smoothing model, the autoregressive moving average model, and the Prophet model can be selected as the initial models.
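To make one of these initial models concrete, here is a minimal sketch of simple exponential smoothing, where the level is updated as `level = alpha * value + (1 - alpha) * level` and the one-step forecast is the final level. "Training" is reduced to picking the smoothing factor that minimizes one-step-ahead squared error over a small grid, which is an assumption made for illustration; the names `ses_forecast`, `fit_alpha`, and the candidate grid are hypothetical.

```python
from typing import Sequence


def ses_forecast(series: Sequence[float], alpha: float) -> float:
    """One-step-ahead forecast from simple exponential smoothing."""
    level = series[0]
    for value in series[1:]:
        level = alpha * value + (1 - alpha) * level
    return level


def fit_alpha(series: Sequence[float],
              candidates: Sequence[float] = (0.1, 0.3, 0.5, 0.7, 0.9)) -> float:
    """Pick the candidate alpha with the lowest one-step squared error."""
    def sum_squared_error(alpha: float) -> float:
        level = series[0]
        total = 0.0
        for value in series[1:]:
            total += (value - level) ** 2          # error before seeing value
            level = alpha * value + (1 - alpha) * level
        return total
    return min(candidates, key=sum_squared_error)


# Hypothetical deviation series with a level shift.
history = [0.0, 0.0, 0.0, 10.0, 10.0, 10.0]
alpha = fit_alpha(history)
next_deviation = ses_forecast(history, alpha)
```

ARMA and Prophet models would in practice come from an established library rather than being hand-written; they are omitted here to keep the sketch dependency-free.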
[0100] According to embodiments of this disclosure, the first historical detection data or the second historical detection data includes at least one of the following:
[0101] The response time of the system under test, the system capacity of the system under test, the frequency of operational errors of the system under test, and the request frequency of the system under test.
[0102] Figure 3 schematically shows a flowchart of determining the prediction accuracy of each of N prediction models based on sample labels according to an embodiment of the present disclosure.
[0103] As shown in Figure 3, operation S240, which determines the prediction accuracy of each of the N prediction models based on the sample labels, may include operations S310 to S320.
[0104] In operation S310, the training data is input into N prediction models respectively, and the prediction results output by each prediction model are obtained.
[0105] In operation S320, the prediction results output by each of the N prediction models are processed using the sample labels to determine the prediction accuracy of each of the N prediction models.
[0106] According to embodiments of this disclosure, the prediction result output by the prediction model can be a prediction result for the second historical deviation, and the prediction results output by different prediction models can be the same or different.
[0107] According to embodiments of this disclosure, the prediction results output by each of the N prediction models are processed using sample labels to determine the prediction accuracy of each of the N prediction models. This can include determining the prediction accuracy of each of the N prediction models based on a comparison between the sample labels and each prediction result.
[0108] In this embodiment, for example, for a prediction model, the prediction error can be obtained by using the difference between the sample label and the prediction result, and then the ratio of the prediction error to the sample label can be determined. If the ratio of the prediction error to the sample label is less than or equal to a preset threshold, the prediction result is determined to be correct. If the ratio of the prediction error to the sample label is greater than the preset threshold, the prediction result is determined to be incorrect. By statistically analyzing the ratio of correct predictions to the total number of prediction results, the prediction accuracy of the prediction model can be determined. Using the same or similar methods, the prediction accuracy of each of N prediction models can be further determined.
[0109] It should be noted that the preset threshold can be designed according to actual needs, and may include, for example, 1%, 3%, 5%, etc. The embodiments of this disclosure do not limit the specific setting of the preset threshold.
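The accuracy calculation described above can be sketched as follows. This is a minimal illustration, not the disclosed implementation: the function name, the use of the absolute difference as the prediction error, the 5% default threshold, and the handling of zero-valued labels are all assumptions.

```python
from typing import Sequence


def prediction_accuracy(
    labels: Sequence[float],
    predictions: Sequence[float],
    threshold: float = 0.05,
) -> float:
    """Share of predictions whose relative error is within `threshold`."""
    if len(labels) != len(predictions) or not labels:
        raise ValueError("labels and predictions must be non-empty and equal length")
    correct = 0
    for label, predicted in zip(labels, predictions):
        error = abs(label - predicted)
        if label == 0:
            # Assumption: the ratio is undefined for a zero label,
            # so only an exact match counts as correct.
            is_correct = error == 0
        else:
            is_correct = error / abs(label) <= threshold
        correct += is_correct
    return correct / len(labels)
```

For example, `prediction_accuracy([100, 200, 50, 10], [103, 230, 50, 10.4])` returns 0.75: the second prediction has a relative error of 15% and is counted as incorrect, while the other three fall within the 5% threshold.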
[0110] According to embodiments of this disclosure, by processing the prediction results output by each of the N prediction models using sample labels, the prediction accuracy of each of the N prediction models can be determined. This can further determine the detection accuracy of each prediction model for abnormal information in the system under test during operation, providing a valid basis for the subsequent construction of abnormal information detection models.
[0111] Figure 4 shows a flowchart for constructing an anomaly detection model based on the prediction accuracy of each of the N prediction models according to an embodiment of the present disclosure.
[0112] As shown in Figure 4, operation S250, which constructs an anomaly detection model based on the prediction accuracy of each of the N prediction models, may include operations S410 to S420.
[0113] In operation S410, the detection weight corresponding to each prediction model is determined based on the prediction accuracy of each of the N prediction models.
[0114] In operation S420, an anomaly detection model is constructed based on N prediction models and the corresponding detection weights for each prediction model.
[0115] According to embodiments of this disclosure, the prediction accuracy of a prediction model can characterize the accuracy of the prediction model in detecting abnormal information. By comprehensively considering the prediction accuracy of each of the N prediction models, the prediction accuracy of each of the N prediction models is converted into the corresponding detection weight of each prediction model according to the same proportional relationship. Thus, the detection weight can be used to characterize the accuracy of each of the N prediction models in detecting abnormal information.
[0116] According to embodiments of this disclosure, an anomaly detection model is constructed based on N prediction models and their corresponding detection weights. Specifically, the weighted prediction result for each prediction model is obtained by multiplying the prediction results of each of the N prediction models by its corresponding detection weight. The anomaly detection result of the anomaly detection model is then obtained by averaging the weighted prediction results of the N prediction models. This allows the obtained anomaly detection result to comprehensively consider the prediction accuracy of the N prediction models, quickly identifying anomalies in the system under test operating in a grayscale environment. This enables relevant maintenance personnel to quickly locate operational anomalies in the system under test, effectively improving the accuracy of anomaly detection.
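Operations S410 and S420 can be illustrated with the short sketch below. It assumes that "the same proportional relationship" means dividing every accuracy by the sum of all accuracies, so that the weights add up to 1; the disclosure does not state the exact conversion. The function names are hypothetical.

```python
from typing import List, Sequence


def detection_weights(accuracies: Sequence[float]) -> List[float]:
    """Scale all accuracies by one common factor so the weights sum to 1."""
    total = sum(accuracies)
    if total <= 0:
        raise ValueError("at least one model needs a positive accuracy")
    return [accuracy / total for accuracy in accuracies]


def ensemble_result(predictions: Sequence[float], weights: Sequence[float]) -> float:
    """Weight each model's prediction, then average the weighted values."""
    if len(predictions) != len(weights) or not predictions:
        raise ValueError("need one weight per prediction")
    weighted = [p * w for p, w in zip(predictions, weights)]
    return sum(weighted) / len(weighted)
```

Because the weights in this sketch already sum to 1, averaging the weighted values divides the conventional weighted mean by N. Whichever form is used, the detection threshold applied to the result needs to be on the same scale.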
[0117] The embodiments of this disclosure also provide a grayscale environment anomaly detection method, which is described in detail below with reference to Figures 5 to 7.
[0118] Figure 5 schematically shows a flowchart of a grayscale environment anomaly detection method according to an embodiment of the present disclosure.
[0119] As shown in Figure 5, the grayscale environment anomaly detection method may include operations S510 to S520.
[0120] In operation S510, the detection deviation is obtained. The detection deviation is generated based on the first detection data of the system under test running in the grayscale environment and the second detection data running in the formal environment.
[0121] According to embodiments of this disclosure, the first detection data may include detection index data generated in real time by the system under test running in a grayscale environment, such as the response time of the system under test in a grayscale environment. Correspondingly, the second detection data may include detection index data generated in real time by the system under test running in a formal environment.
[0122] According to embodiments of this disclosure, the detection deviation can be generated based on the first detection data and the second detection data, for example, the detection deviation can be generated based on the difference between the first detection data and the second detection data.
[0123] It should be noted that the first detection data and the second detection data can be generated at the same time or at different times, and those skilled in the art can set them according to actual needs.
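A minimal sketch of generating the detection deviation as a difference is given below. It assumes the two series hold the same metric (for example, response time) sampled at aligned points, and that the deviation is taken as grayscale minus formal; both the alignment and the sign convention are assumptions.

```python
from typing import List, Sequence


def detection_deviation(
    grayscale: Sequence[float], formal: Sequence[float]
) -> List[float]:
    """Element-wise difference: grayscale metric minus formal metric."""
    if len(grayscale) != len(formal):
        raise ValueError("both series must cover the same sampling points")
    return [g - f for g, f in zip(grayscale, formal)]
```

For instance, response times of 120, 135, and 150 in the grayscale environment against 100, 105, and 110 in the formal environment give deviations of 20, 30, and 40.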
[0124] In operation S520, the detected deviation is input into the anomaly detection model, and the anomaly detection result is output. The anomaly detection result represents the abnormal operation of the system under test. The anomaly detection model is constructed by the above-mentioned anomaly detection model construction method.
[0125] According to the embodiments of this disclosure, the anomaly detection model is constructed by the above-described method for constructing the anomaly detection model. Therefore, the output anomaly detection result can be obtained by comprehensively considering the prediction accuracy of multiple prediction models.
[0126] According to embodiments of this disclosure, a detection deviation is generated based on first detection data from the system under test running in a grayscale environment and second detection data from the system under test running in a formal environment. This allows for real-time acquisition of the system under test's operation in both grayscale and formal environments. By processing the detection deviation, the system under test's operation in both environments can be analyzed. The anomaly detection model constructed using the above method processes the detection deviation, enabling the obtained anomaly detection results to quickly identify anomalies in the system under test's operation in the grayscale environment. This allows relevant maintenance personnel to quickly locate anomalies in the system under test, effectively improving the accuracy of anomaly detection.
[0127] According to embodiments of this disclosure, the grayscale environment anomaly detection method may further include the following operations.
[0128] Based on the abnormal information detection results, and / or based on the first matching result between the first alarm information generated by the system under test running in the grayscale environment and the target alarm information in the target alarm information database, the abnormal operation of the system under test is determined, wherein the target alarm information database is constructed based on the second alarm information generated by the system under test running in the formal environment.
[0129] According to embodiments of this disclosure, an abnormal operation of the system under test is determined based on the anomaly detection results. For example, this may include determining that the system under test is abnormal when the anomaly detection result exceeds a preset detection threshold. It should be understood that the preset detection threshold can be set according to actual needs. For example, it may be the average value of 3 sigma representing the deviation of real-time detection index data from the same time yesterday, the deviation of real-time detection index data from the same time last week, the deviation of real-time detection index data from the same time two weeks ago, and the deviation of real-time detection index data from the same time three weeks ago.
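The threshold example above is open to more than one reading. The sketch below follows one possible interpretation: for each reference period (same time yesterday, last week, two weeks ago, three weeks ago) there is a series of deviations, the 3 sigma value of each series is three times its standard deviation, and the threshold is the average of those values. The use of the population standard deviation and the function names are assumptions.

```python
from statistics import mean, pstdev
from typing import Sequence


def three_sigma_threshold(reference_deviations: Sequence[Sequence[float]]) -> float:
    """Average of 3 * standard deviation over the reference deviation series."""
    return mean([3 * pstdev(series) for series in reference_deviations])


def is_abnormal(result: float, threshold: float) -> bool:
    """The system is flagged when the detection result exceeds the threshold."""
    return result > threshold
```

In use, one deviation series per reference period would be passed to `three_sigma_threshold`, and the returned value compared against the anomaly detection result with `is_abnormal`.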
[0130] According to embodiments of this disclosure, the first matching result between the first alarm information generated by the system under test running in the grayscale environment and the target alarm information in the target alarm information database characterizes whether the first alarm information matches the target alarm information. If the first matching result indicates a mismatch, the target alarm information database contains no target alarm information identical to the first alarm information, which means the first alarm information is new alarm information generated by the system under test while running in the grayscale environment. When the system under test generates new alarm information in the grayscale environment, its operation can be determined to be abnormal. This lets relevant personnel obtain important new alarm information promptly, without losing time sifting through duplicate alarms to find it.
[0131] According to embodiments of this disclosure, the first matching result can be obtained by performing whole-word matching between fields of the first alarm information and fields of the target alarm information, but it is not limited to this. Alternatively, the first matching result can be obtained based on the cosine similarity between the first alarm information and the target alarm information. Embodiments of this disclosure do not limit the specific method for obtaining the first matching result.
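The two matching options can be sketched as follows. This is an illustrative simplification: alarms are treated as plain strings split into lowercase word tokens, whole-word matching is taken to mean identical token sequences, cosine similarity is computed over word counts, and the 0.9 similarity cutoff is a hypothetical value. None of these details are specified in the disclosure.

```python
import math
import re
from collections import Counter
from typing import Iterable, List


def tokenize(alarm: str) -> List[str]:
    """Split an alarm message into lowercase word tokens."""
    return re.findall(r"\w+", alarm.lower())


def whole_word_match(first: str, target: str) -> bool:
    """True when both alarms consist of exactly the same word sequence."""
    return tokenize(first) == tokenize(target)


def cosine_similarity(first: str, target: str) -> float:
    """Cosine similarity between the word-count vectors of two alarms."""
    a, b = Counter(tokenize(first)), Counter(tokenize(target))
    dot = sum(a[word] * b[word] for word in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def is_new_alarm(
    first: str, targets: Iterable[str], min_similarity: float = 0.9
) -> bool:
    """An alarm is new when no target alarm is similar enough to it."""
    return all(cosine_similarity(first, t) < min_similarity for t in targets)
```

An alarm for which `is_new_alarm` returns `True` corresponds to a first matching result that indicates a mismatch against every entry in the target alarm information database.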
[0132] Figure 6 shows an application scenario of the grayscale environment anomaly detection method according to an embodiment of the present disclosure.
[0133] As shown in Figure 6, after the detection deviation 610 of the system under test is obtained, the detection deviation 610 can be input into the anomaly detection model 620. The anomaly detection model 620 can be constructed based on prediction model A 621, prediction model B 622, and prediction model C 623. Each prediction model in the anomaly detection model 620 has a corresponding detection weight.
[0134] The detection deviation 610 can be processed in parallel using prediction models A 621, B 622, and C 623, and the anomaly detection result 630 can be output according to the detection weights of each prediction model. If the anomaly detection result 630 is greater than the preset detection threshold, it can be determined that the system under test is malfunctioning, and the anomaly point is marked so that relevant personnel can handle the anomaly.
[0135] In this embodiment, for example, if the anomaly detection result is greater than the historical average value of 3 sigma, it can be determined that the system under test is malfunctioning.
[0136] Figure 7 shows a flowchart of a grayscale environment anomaly detection method according to another embodiment of the present disclosure.
[0137] As shown in Figure 7, the above-mentioned grayscale environment anomaly detection method may also include operations S710 to S730.
[0138] In operation S710, the second alarm information generated by the system under test running in the formal environment is obtained.
[0139] In operation S720, the second alarm information is matched with each target alarm information in the target alarm information database to obtain the second matching result corresponding to each target alarm information.
[0140] In operation S730, if the second matching results between the second alarm information and each target alarm information all indicate a mismatch, the second alarm information is added to the target alarm information database as new target alarm information, thus obtaining an updated target alarm information database.
[0141] According to embodiments of this disclosure, the matching method between the second alarm information and the target alarm information may include whole-word matching, or the second matching result may be obtained through the cosine similarity between the second alarm information and the target alarm information.
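Operations S710 to S730 can be sketched as below. The database is modeled as a plain list of strings and the default matcher is exact string equality; both are simplifying assumptions, and any matcher (whole-word or cosine-similarity based) could be passed in instead. The function name is hypothetical.

```python
from typing import Callable, List


def update_target_alarms(
    second_alarm: str,
    target_alarms: List[str],
    matches: Callable[[str, str], bool] = lambda a, b: a == b,
) -> List[str]:
    """Return the alarm database, extended when no existing entry matches."""
    # S720: one matching result per target alarm in the database.
    results = [matches(second_alarm, target) for target in target_alarms]
    # S730: add the alarm only if every result indicates a mismatch.
    if not any(results):
        return target_alarms + [second_alarm]
    return target_alarms
```

The function returns a new list when an alarm is added and leaves the input list unmodified, which keeps the previous database state available for comparison.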
[0142] In this embodiment, stop words and alarm type words can also be extracted from the second alarm information and concatenated into second concatenated alarm information corresponding to the second alarm information. Correspondingly, each target alarm information in the target alarm information database is target concatenated alarm information obtained by the same or a similar method. Matching the second concatenated alarm information against the target concatenated alarm information yields the second matching result. Representing the second alarm information and the target alarm information by their concatenated stop words and alarm type words effectively reduces the number of characters, which reduces the computation needed to match the second alarm information against each target alarm information in the target alarm information database and improves the efficiency of obtaining the second matching result.
[0143] According to an embodiment of the present disclosure, the stop words may include, for example, "de", "di", "de", "application name", "number", "is", "ErrMsg", "apptime", "java", "timestamp", "is", etc. The alarm type words may include the alarm type field in the alarm information, and may include, for example, "mysql exception", etc.
[0144] Based on the above method for constructing an abnormal information detection model, the present disclosure also provides a device for constructing an abnormal information detection model. This device is described in detail below with reference to Figure 8.
[0145] Figure 8 schematically shows a structural block diagram of a device for constructing an abnormal information detection model according to an embodiment of the present disclosure.
[0146] As shown in Figure 8, the device 800 for constructing an abnormal information detection model includes a sample acquisition module 810, an initial model acquisition module 820, a training module 830, a determination module 840, and a construction module 850.
[0147] The sample acquisition module 810 is used to acquire training samples. The training samples include training data and sample labels corresponding to the training data. The training data includes a first historical detection deviation, which is generated based on the first historical detection data of the system under test running in a grayscale environment and the second historical detection data of the system under test running in a formal environment. The sample labels include the second historical deviation, and the time-series marker of the training data is earlier than that of the sample labels.
[0148] The initial model acquisition module 820 is used to acquire N initial models, where the network structures of the N initial models are different from each other.
[0149] The training module 830 is used to train each of the N initial models using the training data to obtain N trained prediction models. Each prediction model corresponds to one of the N initial models, and N ≥ 2.
[0150] The determination module 840 is used to determine the prediction accuracy of each of the N prediction models based on the sample labels.
[0151] The construction module 850 is used to construct an anomaly detection model based on the prediction accuracy of each of the N prediction models.
[0152] According to embodiments of this disclosure, the determining module may include a prediction unit and a first determining unit.
[0153] The prediction unit is used to input the training data into N prediction models respectively, and obtain the prediction results output by each prediction model.
[0154] The first determining unit is used to process the prediction results output by each of the N prediction models using the sample labels, and to determine the prediction accuracy of each of the N prediction models.
[0155] According to embodiments of this disclosure, the construction module may include a second determining unit and a construction unit.
[0156] The second determining unit is used to determine the detection weight corresponding to each prediction model based on the prediction accuracy of each of the N prediction models.
[0157] The construction unit is used to construct an anomaly detection model based on the N prediction models and the corresponding detection weight of each prediction model.
[0158] According to embodiments of this disclosure, the initial model includes a model constructed based on a time series algorithm.
[0159] According to embodiments of this disclosure, the model constructed based on time series algorithms includes at least one of the following:
[0160] Exponential smoothing model, autoregressive moving average model, Prophet model, autoregressive model, moving average model.
[0161] According to embodiments of this disclosure, the first historical detection data or the second historical detection data includes at least one of the following:
[0162] The response time of the system under test, the system capacity of the system under test, the frequency of operational errors of the system under test, and the request frequency of the system under test.
[0163] Figure 9 shows a schematic block diagram of a grayscale environment anomaly detection device according to an embodiment of the present disclosure.
[0164] As shown in Figure 9, the grayscale environment anomaly detection device 900 of this embodiment includes an acquisition module 910 and a detection module 920.
[0165] The acquisition module 910 is used to acquire the detection deviation, which is generated based on the first detection data of the system under test running in the grayscale environment and the second detection data running in the formal environment.
[0166] The detection module 920 is used to input the detection deviation into the anomaly detection model and output the anomaly detection result. The anomaly detection result represents the abnormal operation of the system under test. The anomaly detection model is constructed by the above-mentioned anomaly detection model construction method.
[0167] According to embodiments of this disclosure, the above-described anomaly detection device may further include an anomaly determination module.
[0168] The anomaly determination module is used to determine the abnormal operation of the system under test based on the anomaly information detection results and / or based on the first matching result between the first alarm information generated by the system under test running in the grayscale environment and the target alarm information in the target alarm information database. The target alarm information database is constructed based on the second alarm information generated by the system under test running in the formal environment.
[0169] According to embodiments of this disclosure, the above-mentioned anomaly detection device may further include: an alarm information acquisition module, a matching module, and an update module.
[0170] The alarm information acquisition module is used to acquire the second alarm information generated by the system under test when it is running in a formal environment.
[0171] The matching module is used to match the second alarm information with each target alarm information in the target alarm information database to obtain the second matching result corresponding to each target alarm information.
[0172] The update module is used to add the second alarm information to the target alarm information database as new target alarm information when the second matching results between the second alarm information and each target alarm information all indicate a mismatch, thus obtaining the updated target alarm information database.
[0173] According to embodiments of this disclosure, any multiple modules among the sample acquisition module 810, initial model acquisition module 820, training module 830, determination module 840, construction module 850, acquisition module 910, and detection module 920 can be combined into one module, or any one of these modules can be split into multiple modules. Alternatively, at least some of the functions of one or more of these modules can be combined with at least some of the functions of other modules and implemented in one module. According to embodiments of this disclosure, at least one of the sample acquisition module 810, initial model acquisition module 820, training module 830, determination module 840, construction module 850, acquisition module 910, and detection module 920 can be at least partially implemented as hardware circuitry, such as a field-programmable gate array (FPGA), a programmable logic array (PLA), a system-on-a-chip, a system-on-a-substrate, a system-on-package, an application-specific integrated circuit (ASIC), or implemented in hardware or firmware by any other reasonable means of integrating or packaging the circuitry, or implemented in software, hardware, or firmware, or in any suitable combination of any of these three implementation methods. Alternatively, at least one of the sample acquisition module 810, the initial model acquisition module 820, the training module 830, the determination module 840, the construction module 850, the acquisition module 910, and the detection module 920 may be implemented at least partially as a computer program module, which can perform corresponding functions when the computer program module is run.
[0174] Figure 10 schematically shows a block diagram of an electronic device suitable for implementing a method for constructing an anomaly information detection model and a grayscale environment anomaly detection method according to embodiments of the present disclosure.
[0175] As shown in Figure 10, an electronic device 1000 according to an embodiment of the present disclosure includes a processor 1001, which can perform various appropriate actions and processes according to a program stored in a read-only memory (ROM) 1002 or a program loaded from a storage portion 1008 into a random access memory (RAM) 1003. The processor 1001 may include, for example, a general-purpose microprocessor (e.g., a CPU), an instruction set processor and / or an associated chipset and / or a special-purpose microprocessor (e.g., an application-specific integrated circuit (ASIC)), etc. The processor 1001 may also include onboard memory for caching purposes. The processor 1001 may include a single processing unit or multiple processing units for performing different actions of the method flow according to an embodiment of the present disclosure.
[0176] RAM 1003 stores various programs and data required for the operation of electronic device 1000. Processor 1001, ROM 1002, and RAM 1003 are interconnected via bus 1004. Processor 1001 performs various operations of the method flow according to embodiments of the present disclosure by executing programs in ROM 1002 and / or RAM 1003. It should be noted that programs may also be stored in one or more memories other than ROM 1002 and RAM 1003. Processor 1001 may also perform various operations of the method flow according to embodiments of the present disclosure by executing programs stored in one or more memories.
[0177] According to embodiments of this disclosure, the electronic device 1000 may further include an input / output (I / O) interface 1005, which is also connected to a bus 1004. The electronic device 1000 may also include one or more of the following components connected to the I / O interface 1005: an input section 1006 including a keyboard, mouse, etc.; an output section 1007 including a cathode ray tube (CRT), liquid crystal display (LCD), etc., and a speaker, etc.; a storage section 1008 including a hard disk, etc.; and a communication section 1009 including a network interface card such as a LAN card, modem, etc. The communication section 1009 performs communication processing via a network such as the Internet. A drive 1010 is also connected to the I / O interface 1005 as needed. A removable medium 1011, such as a disk, optical disk, magneto-optical disk, semiconductor memory, etc., is installed on the drive 1010 as needed so that computer programs read from it can be installed into the storage section 1008 as needed.
[0178] This disclosure also provides a computer-readable storage medium, which may be included in the device / apparatus / system described in the above embodiments; or it may exist independently and not assembled into the device / apparatus / system. The computer-readable storage medium carries one or more programs that, when executed, implement the method according to the embodiments of this disclosure.
[0179] According to embodiments of this disclosure, the computer-readable storage medium may be a non-volatile computer-readable storage medium, such as including, but not limited to: portable computer disks, hard disks, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), portable compact disk read-only memory (CD-ROM), optical storage devices, magnetic storage devices, or any suitable combination thereof. In this disclosure, the computer-readable storage medium may be any tangible medium that contains or stores a program that can be used by or in conjunction with an instruction execution system, apparatus, or device. For example, according to embodiments of this disclosure, the computer-readable storage medium may include ROM 1002 and / or RAM 1003 and / or one or more memories other than ROM 1002 and RAM 1003 described above.
[0180] Embodiments of this disclosure also include a computer program product comprising a computer program containing program code for performing the methods shown in the flowchart. When the computer program product is run on a computer system, the program code is used to cause the computer system to implement the methods provided in the embodiments of this disclosure.
[0181] When the computer program is executed by the processor 1001, it performs the functions defined in the system / apparatus of this disclosure embodiments. According to embodiments of this disclosure, the systems, apparatuses, modules, units, etc., described above can be implemented by computer program modules.
[0182] In one embodiment, the computer program may rely on a tangible storage medium such as an optical storage device or a magnetic storage device. In another embodiment, the computer program may also be transmitted and distributed in the form of signals over a network medium, and may be downloaded and installed via the communication section 1009, and / or installed from a removable medium 1011. The program code contained in the computer program can be transmitted using any suitable network medium, including but not limited to: wireless, wired, etc., or any suitable combination thereof.
[0183] In such an embodiment, the computer program can be downloaded and installed from a network via communication section 1009, and / or installed from removable medium 1011. When the computer program is executed by processor 1001, it performs the functions defined in the system of this disclosure embodiment. According to embodiments of this disclosure, the systems, devices, apparatuses, modules, units, etc., described above can be implemented by computer program modules.
[0184] According to embodiments of this disclosure, program code for executing the computer programs provided in embodiments of this disclosure can be written in any combination of one or more programming languages. Specifically, these computer programs can be implemented using high-level procedural and / or object-oriented programming languages, and / or assembly / machine languages. Programming languages include, but are not limited to, languages such as Java, C++, Python, "C", or similar programming languages. The program code can execute entirely on the user's computing device, partially on the user's device, partially on a remote computing device, or entirely on a remote computing device or server. In cases involving remote computing devices, the remote computing device can be connected to the user's computing device via any type of network, including a local area network (LAN) or a wide area network (WAN), or it can be connected to an external computing device (e.g., via the Internet using an Internet service provider).
[0185] The flowcharts and block diagrams in the accompanying drawings illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of this disclosure. In this regard, each block in a flowchart or block diagram may represent a module, segment, or portion of code containing one or more executable instructions for implementing a specified logical function. It should also be noted that in some alternative implementations, the functions indicated in the blocks may occur in a different order than those indicated in the drawings. For example, two consecutively indicated blocks may actually be executed substantially in parallel, and they may sometimes be executed in reverse order, depending on the functions involved. It should also be noted that each block in a block diagram or flowchart, and combinations of blocks in a block diagram or flowchart, may be implemented using a dedicated hardware-based system that performs the specified function or operation, or using a combination of dedicated hardware and computer instructions.
[0186] Those skilled in the art will understand that the features described in the various embodiments and / or claims of this disclosure can be combined and / or integrated in various ways, even if such combinations or integrations are not explicitly described in this disclosure. In particular, the features described in the various embodiments and / or claims of this disclosure can be combined and / or integrated in various ways without departing from the spirit and teachings of this disclosure. All such combinations and / or integrations fall within the scope of this disclosure.
[0187] The embodiments of this disclosure have been described above. However, these embodiments are for illustrative purposes only and are not intended to limit the scope of this disclosure. Although various embodiments have been described above, this does not mean that the measures in the various embodiments cannot be used advantageously in combination. The scope of this disclosure is defined by the appended claims and their equivalents. Various substitutions and modifications can be made by those skilled in the art without departing from the scope of this disclosure, and all such substitutions and modifications should fall within the scope of this disclosure.
Claims
1. A method for constructing an anomaly information detection model, comprising: acquiring training samples, wherein the training samples include training data and sample labels corresponding to the training data, the training data includes a first historical deviation, the first historical deviation is generated based on first historical detection data of a system under test running in a grayscale environment and second historical detection data of the system under test running in a formal environment, the sample labels include a second historical deviation, the time stamp of the training data is earlier than that of the sample labels, and the second historical deviation is later in time than the first historical deviation; obtaining N initial models, wherein each of the N initial models has a different network structure, and N≥2; training each of the N initial models using the training data to obtain N trained prediction models, wherein each prediction model corresponds to one of the N initial models; determining, based on the sample labels, the prediction accuracy of each of the N prediction models; determining, based on the prediction accuracy of each of the N prediction models, a detection weight corresponding to each prediction model; and constructing the anomaly information detection model based on the N prediction models and the detection weight corresponding to each prediction model.
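By way of illustration only, the construction steps of claim 1 can be sketched as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: the two model classes, the helper names (`make_samples`, `prediction_accuracy`, `build_detection_model`, `ensemble_predict`), the accuracy score 1 / (1 + mean absolute error), and the normalization of accuracies into weights are all hypothetical choices, since the claim does not fix any of them.

```python
from typing import List, Sequence, Tuple

# One training sample: (earlier deviations used as training data,
# later deviation used as the sample label).
Sample = Tuple[Sequence[float], float]


class MovingAverage:
    """Hypothetical initial model: mean of the last `window` deviations."""

    def __init__(self, window: int = 3) -> None:
        self.window = window

    def fit(self, samples: Sequence[Sample]) -> "MovingAverage":
        return self  # a fixed window has nothing to learn

    def predict(self, history: Sequence[float]) -> float:
        tail = list(history)[-self.window:]
        return sum(tail) / len(tail)


class ExponentialSmoothing:
    """Hypothetical initial model: simple exponential smoothing."""

    def __init__(self, candidates: Sequence[float] = (0.2, 0.5, 0.8)) -> None:
        self.candidates = candidates
        self.alpha = candidates[0]

    def _smooth(self, history: Sequence[float], alpha: float) -> float:
        level = history[0]
        for value in history[1:]:
            level = alpha * value + (1 - alpha) * level
        return level

    def fit(self, samples: Sequence[Sample]) -> "ExponentialSmoothing":
        # "Training" here is a grid search for the smoothing factor.
        def error(alpha: float) -> float:
            return sum(abs(self._smooth(h, alpha) - y) for h, y in samples)

        self.alpha = min(self.candidates, key=error)
        return self

    def predict(self, history: Sequence[float]) -> float:
        return self._smooth(history, self.alpha)


def make_samples(deviations: Sequence[float], width: int) -> List[Sample]:
    """Slide a window so every label is later in time than its training data."""
    return [(deviations[i:i + width], deviations[i + width])
            for i in range(len(deviations) - width)]


def prediction_accuracy(model, samples: Sequence[Sample]) -> float:
    """Assumed score in (0, 1]: 1 / (1 + mean absolute error)."""
    mae = sum(abs(model.predict(h) - y) for h, y in samples) / len(samples)
    return 1.0 / (1.0 + mae)


def build_detection_model(initial_models, samples: Sequence[Sample]):
    """Train each model, score it, and return (model, detection weight) pairs."""
    trained = [m.fit(samples) for m in initial_models]
    scores = [prediction_accuracy(m, samples) for m in trained]
    total = sum(scores)
    return [(m, s / total) for m, s in zip(trained, scores)]


def ensemble_predict(weighted_models, history: Sequence[float]) -> float:
    """Weighted sum of the individual model predictions."""
    return sum(w * m.predict(history) for m, w in weighted_models)
```

In this sketch a model with a lower error on the sample labels receives a larger detection weight, and the weights sum to one, so the combined prediction stays on the same scale as the individual deviations.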
2. The construction method according to claim 1, wherein determining, based on the sample labels, the prediction accuracy of each of the N prediction models comprises: inputting the training data into the N prediction models respectively to obtain the prediction result output by each prediction model; and processing the prediction results output by each of the N prediction models using the sample labels to determine the prediction accuracy of each of the N prediction models.
3. The construction method according to claim 1, wherein the initial models include models constructed based on time series algorithms.
4. The construction method according to claim 3, wherein the models constructed based on time series algorithms include at least one of the following: an exponential smoothing model, an autoregressive moving average model, a Prophet model, an autoregressive model, and a moving average model.
5. The construction method according to claim 1, wherein the first historical detection data or the second historical detection data includes at least one of the following: the response time of the system under test, the system capacity of the system under test, the frequency of operational errors of the system under test, and the request frequency of the system under test.
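As an illustration of how a deviation might be derived from the kinds of detection data listed in claim 5, the sketch below compares one grayscale snapshot with one formal-environment snapshot. The claims do not state a deviation formula, so the relative difference used here, the `DetectionData` field names, and the `relative_deviation` helper are assumptions made for this example only.

```python
from dataclasses import dataclass, fields
from typing import Dict


@dataclass(frozen=True)
class DetectionData:
    """Hypothetical snapshot of the metrics named in claim 5."""
    response_time_ms: float
    system_capacity: float
    error_frequency: float
    request_frequency: float


def relative_deviation(gray: DetectionData, formal: DetectionData,
                       eps: float = 1e-9) -> Dict[str, float]:
    """Assumed deviation: (gray - formal) / |formal| for each metric.

    `eps` only guards against division by zero when a formal value is 0.
    """
    result = {}
    for field in fields(DetectionData):
        g = getattr(gray, field.name)
        f = getattr(formal, field.name)
        result[field.name] = (g - f) / max(abs(f), eps)
    return result
```

Computed at successive sampling times, such per-metric deviations would form the series from which earlier values serve as training data and later values serve as sample labels.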
6. A method for detecting grayscale environment anomalies, comprising: obtaining a detection deviation based on first detection data of a system under test running in a grayscale environment and second detection data of the system under test running in a formal environment; and inputting the detection deviation into an anomaly information detection model and outputting an anomaly detection result, wherein the anomaly detection result represents abnormal operation of the system under test, and the anomaly information detection model is constructed by the method for constructing an anomaly information detection model according to any one of claims 1 to 5.
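The detection step of claim 6 can be sketched as follows. The claim does not specify how the model turns a detection deviation into an anomaly detection result, so this example assumes one plausible reading: the weighted models predict the expected deviation from recent history, and the observation is flagged when it differs from that expectation by more than a tolerance. The predictors, the weights 0.6 and 0.4, and the `tolerance` value are hypothetical.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Predictor = Callable[[Sequence[float]], float]


def last_value(history: Sequence[float]) -> float:
    """Hypothetical prediction model: repeat the most recent deviation."""
    return history[-1]


def mean_value(history: Sequence[float]) -> float:
    """Hypothetical prediction model: average of the recent deviations."""
    return sum(history) / len(history)


# Assumed output of the construction step: (prediction model, detection weight).
WEIGHTED_MODELS: List[Tuple[Predictor, float]] = [
    (last_value, 0.6),
    (mean_value, 0.4),
]


def detect_anomaly(history: Sequence[float], observed: float,
                   weighted_models: Sequence[Tuple[Predictor, float]] = WEIGHTED_MODELS,
                   tolerance: float = 0.05) -> Dict[str, object]:
    """Flag the observed deviation if it strays from the weighted prediction."""
    expected = sum(weight * predict(history) for predict, weight in weighted_models)
    residual = abs(observed - expected)
    return {
        "expected": expected,
        "residual": residual,
        "is_anomalous": residual > tolerance,
    }
```

A fixed tolerance keeps the sketch short; in practice the threshold would more likely be tuned per metric or derived from the spread of historical residuals.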
7. The method according to claim 6, further comprising: determining the abnormal operation of the system under test based on the anomaly detection result and/or based on a first matching result between first alarm information generated by the system under test running in the grayscale environment and target alarm information in a target alarm information database, wherein the target alarm information database is constructed based on second alarm information generated by the system under test running in the formal environment.
8. The method according to claim 7, further comprising: acquiring the second alarm information generated by the system under test during operation in the formal environment; matching the second alarm information against each item of target alarm information in the target alarm information database to obtain a second matching result corresponding to each item of target alarm information; and if the second matching results corresponding to all items of target alarm information indicate a mismatch, adding the second alarm information to the target alarm information database as new target alarm information, thereby obtaining an updated target alarm information database.
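The alarm matching of claims 7 and 8 can be illustrated with a small in-memory database. This is a sketch under assumptions: the claims do not define the matching rule or how a match is interpreted, so the example normalizes alarm text (lower case, digits masked) and compares it exactly, and it treats a grayscale alarm with no match as a candidate abnormality. The class and function names are hypothetical.

```python
import re
from typing import Iterable, List, Set


def normalize(alarm: str) -> str:
    """Assumed matching key: lower-cased text with digit runs masked."""
    return re.sub(r"\d+", "<n>", alarm.strip().lower())


class TargetAlarmDatabase:
    """Alarm signatures collected from the formal environment."""

    def __init__(self, formal_alarms: Iterable[str] = ()) -> None:
        self._signatures: Set[str] = set()
        for alarm in formal_alarms:
            self.update_from_formal(alarm)

    def __len__(self) -> int:
        return len(self._signatures)

    def matches(self, alarm: str) -> bool:
        """True if the alarm matches any stored target alarm."""
        return normalize(alarm) in self._signatures

    def update_from_formal(self, alarm: str) -> bool:
        """Add a formal-environment alarm only if nothing matches it (claim 8).

        Returns True when the database was extended.
        """
        if self.matches(alarm):
            return False
        self._signatures.add(normalize(alarm))
        return True


def unmatched_gray_alarms(db: TargetAlarmDatabase,
                          gray_alarms: Iterable[str]) -> List[str]:
    """Grayscale alarms with no counterpart in the formal environment."""
    return [alarm for alarm in gray_alarms if not db.matches(alarm)]
```

Masking digits lets alarms that differ only in latency values or node numbers share one signature, which keeps the database from growing with every numeric variation of the same alarm.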
9. An apparatus for constructing an anomaly information detection model, comprising: a sample acquisition module configured to acquire training samples, wherein the training samples include training data and sample labels corresponding to the training data, the training data includes a first historical deviation, the first historical deviation is generated based on first historical detection data of a system under test running in a grayscale environment and second historical detection data of the system under test running in a formal environment, the sample labels include a second historical deviation, the time stamp of the training data is earlier than that of the sample labels, and the second historical deviation is later in time than the first historical deviation; an initial model acquisition module configured to acquire N initial models, wherein each of the N initial models has a different network structure; a training module configured to train each of the N initial models using the training data to obtain N trained prediction models, wherein each prediction model corresponds to one of the N initial models, and N≥2; a determination module configured to determine the prediction accuracy of each of the N prediction models based on the sample labels; and a construction module configured to construct the anomaly information detection model based on the prediction accuracy of each of the N prediction models, wherein the construction module includes: a second determining unit configured to determine the detection weight corresponding to each of the N prediction models based on their respective prediction accuracies; and a construction unit configured to construct the anomaly information detection model based on the N prediction models and the detection weight corresponding to each prediction model.
10. A grayscale environment anomaly detection device, comprising: an acquisition module configured to acquire a detection deviation, wherein the detection deviation is generated based on first detection data of a system under test running in a grayscale environment and second detection data of the system under test running in a formal environment; and a detection module configured to input the detection deviation into an anomaly information detection model and output an anomaly detection result, wherein the anomaly detection result characterizes abnormal operation of the system under test, and the anomaly information detection model is constructed by the method for constructing an anomaly information detection model according to any one of claims 1 to 5.
11. An electronic device, comprising: one or more processors; and a storage device for storing one or more programs, wherein, when the one or more programs are executed by the one or more processors, the one or more processors perform the method according to any one of claims 1 to 8.
12. A computer-readable storage medium having stored thereon executable instructions that, when executed by a processor, cause the processor to perform the method according to any one of claims 1 to 8.
13. A computer program product comprising a computer program that, when executed by a processor, implements the method according to any one of claims 1 to 8.