A framework for human-understandable interpretation of detected anomalies

By calculating an explanation vector that quantifies how much each component of the input vector contributes to the detection of an abnormal state, this method addresses the lack of explanations for machine learning algorithms in anomaly detection. It provides a transparent and interpretable account of the cause of an anomaly and is applicable to complex data structures.

CN122249819A · Pending · Publication Date: 2026-06-19 · SIEMENS AG

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Applications (China)
Current Assignee / Owner: SIEMENS AG
Filing Date: 2024-10-29
Publication Date: 2026-06-19

Abstract

The invention relates to a method and a device for determining the cause of a detected abnormal system state. To this end, a normalized system state is determined as the state closest to the abnormal system state that is not itself determined to be abnormal.

Description

[0001] The present invention relates to an anomaly determination and interpretation method and an anomaly determination and interpretation device for determining an interpretation vector, wherein the interpretation vector is used to interpret the determination of an anomaly state represented by a state vector comprising n components.

[0002] Sensors are ubiquitous in all kinds of machines and equipment, and are therefore highly relevant to many different business sectors. One particularly important application area for sensors is monitoring the function of machinery and equipment, such as die-casting machines, operating equipment, and PCB board measurements. To this end, sensors are installed on these machines and equipment to measure various physical conditions, such as current, temperature, pressure, and position, enabling the monitoring of the system's status.

[0003] If a machine is damaged or malfunctioning, sensor values will often show suspicious patterns and anomalies in the measurement data. Machine learning algorithms can be trained to detect these anomalies.

[0004] Therefore, a wide variety of anomaly detection devices and methods are based on machine learning algorithms. Data obtained from sensors often includes complex time-series data, represented, for example, as complex data structures. The analysis methods used for such complex time series are often sophisticated enough to no longer be understandable to humans, effectively forming a black box. This leads to various inconveniences: humans often cannot understand how the underlying model behaves and therefore cannot infer its properties and / or behavior. This creates an obstacle for developers to build better and more robust models. It also leads to problems in detecting the root causes of anomalies, and thus a lack of trust and acceptance from users.

[0005] When only normal state data is known and abnormal state data is unavailable, the interpretation and reasoning of detected anomalies and underlying anomaly detection become further complicated. Furthermore, anomaly detection is performed to derive actions to respond to abnormal states, among other objectives. However, many anomaly detection results lack an inherent interpretation from which such actions can be derived.

[0006] In light of the above, there is a need for an improved method for interpreting which components of the input data are relevant to the detection of any particular anomaly.

[0007] This problem is solved by the method according to claim 1 and the apparatus according to claim 5. Advantageous embodiments are the subject of the dependent claims.

[0008] To address this problem, a computer-implemented anomaly determination and interpretation method is provided for determining an interpretation vector that explains why an input vector representing the state of a machine has been determined to indicate an abnormal state, wherein the state is obtained at least partially from sensor data, and the interpretation vector serves to examine the sensor values that cause the abnormal state. The method includes the following steps: determining an anomaly metric from the input vector using a state evaluation model; determining from the anomaly metric whether the input vector represents an abnormal or a normal state; estimating a normalized input vector from the input vector, wherein the normalized input vector is the vector closest to the input vector that is not determined to be abnormal based on the anomaly metric calculated from it; and determining an interpretation vector from the normalized input vector, the input vector, and the state evaluation model, such that the value of each component of the interpretation vector corresponds to a measure of the contribution of one or more associated components of the input vector to the determination that the input vector represents an abnormal state.

[0009] This method can handle complex data structures, such as high-dimensional datasets, time-series data (including their dynamics), and interdependencies presented as input vectors. It remains computationally efficient even for high-dimensional data.

[0010] Because the resulting explanatory vector contains information about which components of the state vector contribute to the determination of whether the state vector represents an abnormal or a normal state, the method is transparent and interpretable in the sense that it is understood which features trigger a given anomaly.

[0011] Furthermore, it does not require information about the actual anomalies at hand. Rather, that information is included, for example, in anomaly metrics. Therefore, it is reliable in the sense that the interpretation is unbiased and should be well consistent with the actual causes leading to the predicted anomalies. Moreover, its interpretability does not depend on additional data that needs to be accessed during deployment.

[0012] In some embodiments, the step of calculating the explanatory vector includes the following sub-steps: determining an anomaly metric of the input vector as a baseline metric, and for each component of the explanatory vector, determining a partially normalized input vector by replacing the component of the state vector corresponding to the component of the explanatory vector with the corresponding associated component of the normalized input vector; and determining the component of the explanatory vector by subtracting the anomaly metric of the partially normalized input vector from the baseline metric.

[0013] This method can be easily implemented with very low CPU and memory requirements.

[0014] In some embodiments, the step of computing the explanatory vector includes a method for determining the explanatory vector relative to a baseline (such as Shapley values, LIME, DeepLIFT, or integrated gradients), wherein the normalized input vector is selected as the baseline.
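As an illustration of a baseline-relative attribution, the following minimal sketch computes integrated gradients with the normalized input vector used as the baseline. The toy anomaly metric (Euclidean distance from the origin), the finite-difference gradient, the step count, and all function names are assumptions made for this example and are not prescribed by the description.

```python
import math

def anomaly_metric(x):
    """Assumed toy state evaluation model: Euclidean distance from the origin."""
    return math.sqrt(sum(v * v for v in x))

def partial_derivative(f, x, i, h=1e-6):
    """Central finite-difference estimate of df/dx_i at the point x."""
    up = list(x)
    up[i] += h
    down = list(x)
    down[i] -= h
    return (f(up) - f(down)) / (2.0 * h)

def integrated_gradients(f, x, baseline, steps=100):
    """Integrated gradients of f at x relative to `baseline`.

    The path integral along the straight line baseline -> x is
    approximated with a midpoint Riemann sum.
    """
    n = len(x)
    mean_gradient = [0.0] * n
    for k in range(steps):
        alpha = (k + 0.5) / steps
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        for i in range(n):
            mean_gradient[i] += partial_derivative(f, point, i) / steps
    return [(xi - b) * g for xi, b, g in zip(x, baseline, mean_gradient)]

x = [0.0, 3.0]   # input vector determined to be abnormal
N = [0.0, 0.0]   # normalized input vector, used as the baseline
E = integrated_gradients(anomaly_metric, x, N)
```

With this choice of baseline, the attributions sum (up to numerical error) to the difference between the anomaly metric of the input vector and that of the normalized input vector.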

[0015] Therefore, this method can also be integrated into more complex models.

[0016] In some embodiments, the state is obtained at least in part from sensor data.

[0017] One of the main applications of this invention is to determine which sensor values contribute to the determination that the state vector x represents an abnormal state.

[0018] The problem is further addressed by an anomaly determination and interpretation device for determining an interpretation vector (E) that explains why an input vector (x) representing the state of a machine has been determined to indicate an anomalous state, wherein the state is at least partially obtained from sensor data, and the interpretation vector serves to examine the sensor values that cause the anomalous state. The anomaly determination and interpretation device includes: an anomaly metric calculation device configured to calculate an anomaly metric from the input vector representing the state using a state evaluation model; an anomaly detection device configured to receive the anomaly metric from the anomaly metric calculation device and to determine from the anomaly metric whether the input vector represents an anomalous or a normal state; a state normalization device configured to estimate a normalized input vector from the input vector, wherein the normalized input vector is the vector closest to the input vector that is not determined to be anomalous based on the anomaly metric calculated from it; and an interpretation vector computation device configured to determine the interpretation vector from the normalized input vector, the input vector, and the state evaluation model, such that the value of each component of the interpretation vector corresponds to a measure of the contribution of one or more associated components of the input vector to the determination that the input vector represents an anomalous state.

[0019] This device is capable of handling complex data structures, such as high-dimensional datasets, time-series data (including their dynamics), and interdependencies presented as input vectors. Even with high-dimensional data, it remains computationally efficient.

[0020] Because the resulting explanatory vector contains information about which components of the state vector contribute to the determination of whether the state vector represents an abnormal or a normal state, the device is transparent and interpretable in the sense that it is understood which features trigger a given anomaly.

[0021] Furthermore, it does not require information about the actual anomalies at hand. Rather, that information is included, for example, in anomaly metrics. Therefore, it is reliable in the sense that the interpretation is unbiased and should be well consistent with the actual causes leading to the predicted anomalies. Moreover, its interpretability does not depend on additional data that needs to be accessed during deployment.

[0022] In some embodiments, the interpretation vector computing device is configured such that the step of determining the interpretation vector includes the following sub-steps: determining the anomaly metric of the input vector as the baseline metric; and, for each component of the interpretation vector, determining a partially normalized input vector by replacing the one or more components of the state vector that correspond to that component of the interpretation vector with the corresponding associated one or more components of the normalized input vector, and determining that component of the interpretation vector by subtracting the anomaly metric of the partially normalized input vector from the baseline metric.

[0023] This method can be easily implemented with very low CPU and memory requirements.

[0024] In some embodiments, the explanatory vector computing device is configured such that the step of determining the explanatory vector includes a method for determining the explanatory vector relative to a baseline (such as Shapley values, LIME, DeepLIFT, or integrated gradients), wherein the normalized input vector is selected as the baseline.

[0025] Therefore, this method can also be integrated into more complex models.

[0026] In some embodiments, the input vector is a state vector obtained at least in part from sensor data.

[0027] In some embodiments, the state is obtained at least in part from sensor data.

[0028] One of the main applications of this invention is to determine which sensor values contribute to the determination that the state vector x represents an abnormal state.

[0029] The problem is also addressed by a computer program product that can be directly loaded into the internal memory of a digital computer, the computer program product including software code portions for performing the steps of the methods mentioned above when the product is run on the digital computer.

[0030] This achieves the advantages of each method as stated above.

[0031] Further features and variations of the invention will be apparent to those skilled in the art, particularly from the accompanying drawings illustrating exemplary embodiments of the invention. The drawings show: Figure 1, a schematic representation of an anomaly determination and interpretation device according to an embodiment of the present invention; Figure 2, an example distribution of sensor data including an abnormal state; and Figure 3, a flowchart illustrating an embodiment of the method according to the present invention.

[0032] In particular, time-series data from sensors that describe the state of a device can be represented as an input vector in the form of a state vector x, where the components x1, x2, ..., xi of the state vector x each include sensor data, for example, each from a different sensor.

[0033] An abnormal device state can be detected, for example, by the sensor anomaly determination and interpretation device 10 shown in Figure 1, which is configured to perform the steps shown in Figure 3. To this end, the sensor anomaly determination and interpretation device 10 includes multiple inputs 12 for receiving sensor data and is configured to represent the state described by the sensor data as a state vector x comprising n components. Typically, the number of sensors equals the number of components n. However, this is not strictly necessary.

[0034] The sensor anomaly determination and interpretation device 10 also includes an anomaly metric calculation device 14, which may also be referred to as an anomaly detection system, including models of the sensors and devices, such as functions F(x) that provide anomaly scores or other anomaly metrics f. In particular, the anomaly metric calculation device 14 is configured to determine anomalies by, for example, calculating the anomaly metric f from the state vector x in the first step S1 using a state evaluation model that may have been derived from the sensors and devices.

[0035] The anomaly detection device 16 is configured to receive the anomaly metric f from the anomaly metric calculation device 14 and, in the second step S2, to determine whether the state vector x represents an anomalous or a normal state based on the anomaly metric f. If the anomaly metric f is a scalar, the simplest implementation of the anomaly detection device 16 compares the anomaly metric f with a threshold. Depending on the definition of the state evaluation model, a value of the anomaly metric f either above or below the threshold can indicate an anomalous state.
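The threshold comparison of step S2 can be sketched as follows for a scalar anomaly metric. The function name, the flag that selects the direction of the comparison, and the example values are hypothetical and serve only to show that either side of the threshold may indicate an anomaly, depending on the model.

```python
def is_abnormal(metric_value, threshold, higher_is_abnormal=True):
    """Step S2 for a scalar anomaly metric f.

    higher_is_abnormal=True suits distance-like scores (large = unusual);
    higher_is_abnormal=False suits likelihood-like scores (small = unusual).
    """
    if higher_is_abnormal:
        return metric_value > threshold
    return metric_value < threshold

# Distance-like score: values above the (assumed) threshold are abnormal.
distance_case = is_abnormal(3.0, threshold=2.0)

# Likelihood-like score: values below the (assumed) threshold are abnormal.
likelihood_case = is_abnormal(0.01, threshold=0.05, higher_is_abnormal=False)
```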

[0036] The state normalization device 18 is configured, in the third step S3, to determine from the state vector x a normalized input vector N comprising n components, wherein the normalized input vector N is the vector closest to the state vector x that is not determined to be abnormal based on the anomaly metric f calculated from it. In other words, the state normalization device 18 is configured to search for a normalized input vector N that is as close as possible to the state vector x while still being determined by the anomaly detection device 16 to represent a normal state.

[0037] In some embodiments, the state normalization device 18 can be configured to perform this determination step if the state vector x has been determined to represent an abnormal state.

[0038] The interpretation vector computing device 20 is configured, in the fourth step S4, to determine an interpretation vector E comprising n components from the normalized input vector N, the state vector x, and the state evaluation model, such that the value of each component of the interpretation vector E corresponds to a measure of the contribution of the associated component of the state vector x to the determination of an anomalous state. Each component of the interpretation vector E is thus associated with one or more components of the state vector x and represents how strongly those components are involved in the determination of an anomalous state. The interpretation vector E therefore makes it possible, for example, to specifically examine the sensor values that lead to the anomalous state in order to determine, for example, the origin of a fault.

[0039] Figure 2 shows an example curve 24 displaying a time series of a two-dimensional state vector x. For this case, the previously known method can be implemented as follows. To form the two-dimensional state vector x, we assume that we have two components or features, $x_1$ and $x_2$. Most state vectors x follow a common pattern, with a single outlier 22, x = (0,3), describing the anomalous state; all other state vectors x vary only along the $x_1$ axis. We assume a very simple outlier metric f, namely the Euclidean distance from the origin, i.e., $f(x) = \sqrt{x_1^2 + x_2^2}$. This mimics the idea of density-based estimators, since the origin is the center of all normal points. In this construction, we also ensure that the outlier has the greatest distance from the origin and will therefore be classified as an outlier.

[0040] The goal of any method for anomaly determination and interpretation would be to decompose the anomaly score such that the contributions to the anomaly metric f can be attributed to the two features $x_1$ and $x_2$. By construction, we know that the anomaly score $f(x) = 3$ of the outlier can be attributed entirely to feature $x_2$, because from the perspective of $x_1$ the point is as normal as it could be. Therefore, the expected outcome of a good anomaly determination and interpretation method is that feature $x_2$ contributes 3 to the anomaly metric, while feature $x_1$ contributes 0.

[0041] A basic known method relies on the idea of simulating feature removal. In this case, we assume that $x_1$ and $x_2$ are independent. Therefore, for a point [...], we can simulate the removal of one feature as follows, for example: [...]. For the other feature we have: [...]. Then, a simple measure of the feature attribution can be: [...]. In numerical studies, we set [...]. The following results were obtained: effect of removing one axis: [...]; effect of removing the other axis: [...].

[0042] Therefore, in practice, this method estimates the effect of removing the $x_2$ axis quite well: the expected explanatory vector component would be 3, and the estimated component is 2.996. However, the explanatory vector E also has a component for $x_1$. This component has in reality no impact at all, yet its estimated impact is a reduction of the outlier metric by approximately 0.1. We also found that the impact of component 2 is always underestimated (i.e., less than 3), converging to the correct value as the number of data points approaches infinity, whereas the error of component 1 does not vanish with more data points.
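The bias described above can be illustrated with a small sketch. It assumes that feature removal is simulated by marginal averaging, i.e. the removed feature is replaced in turn by each value observed in the data and the resulting anomaly metrics are averaged. This averaging form, the data set (749 normal points spread evenly along the x1 axis plus the outlier), and all names are assumptions made for illustration; the size of 750 points is chosen only because 3 - 3/750 = 2.996.

```python
import math

def anomaly_metric(x):
    """Toy outlier metric: Euclidean distance from the origin."""
    return math.hypot(x[0], x[1])

def removal_attribution(f, x, data, i):
    """f(x) minus the mean of f when component i of x is replaced
    by the value of component i of every point in `data`."""
    total = 0.0
    for sample in data:
        replaced = list(x)
        replaced[i] = sample[i]
        total += f(replaced)
    return f(x) - total / len(data)

# Assumed data: normal points vary only along x1 in [-1, 1]; one outlier at (0, 3).
normal_points = [(-1.0 + 2.0 * k / 748, 0.0) for k in range(749)]
outlier = (0.0, 3.0)
data = normal_points + [outlier]

effect_x1 = removal_attribution(anomaly_metric, outlier, data, 0)
effect_x2 = removal_attribution(anomaly_metric, outlier, data, 1)
print(effect_x1, effect_x2)
```

Under these assumptions the x2 effect is 3 - 3/750 and approaches 3 as the number of points grows, whereas the x1 effect stays negative regardless of the number of points, because replacing x1 = 0 with any other observed value can only increase the distance from the origin.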

[0043] The invention proposes a concept that is better suited than feature removal for the purpose of explaining anomaly determinations: feature normalization, based on the state normalization device 18.

[0044] According to Figure 1, the anomaly determination and interpretation device 10 analyzes the input vector x obtained from the inputs 12 and outputs an anomaly determination indicating the degree to which the input vector x appears anomalous. To understand which specific features of the input vector x contribute to the anomaly metric f, the state normalization device 18 derives or estimates, from the anomaly metric calculation device 14 and/or the anomaly detection device 16, the point closest to x whose anomaly metric f is considered normal and, in step S3, identifies that point as the normalized input vector N or N(x). Finally, in step S4, the interpretation vector calculation device 20 analyzes the anomaly metric calculation device 14 and/or the anomaly detection device 16 on the basis of the normalized input vector N(x) to determine which features or components of the state vector x (represented by x1, x2, ...) are responsible for the anomaly detection device 16 determining the state x to be anomalous.

[0045] The state normalization device 18 allows users to derive appropriate interpretations, where the simplified concept of feature removal is replaced by feature normalization based on the normalized input vector N.

[0046] Typically, finding an exact normalized input vector N that actually represents the closest point to the input vector x that the anomaly detection device 16 considers normal would be a mathematically difficult problem. Therefore, for example, a normalized input vector N would be estimated for most input vectors x. For this estimation, mathematical methods such as approximations could be used. This could be achieved, for example, by explicitly searching for new states near x that the anomaly detection device 16 considers normal via a dedicated sampling or optimization procedure. If the anomaly detection device 16 is differentiable, such as a neural network, backpropagation could be used for this purpose. For other model types, gradient-free techniques such as particle swarm optimization are suitable. An alternative approach involves providing a finite set of predetermined state vectors to the state normalization device 18, all of which have been a priori identified as normal. These predefined state vectors can be directly queried to determine the closest matching normal state.
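The alternative based on a finite set of predetermined normal state vectors can be sketched as a nearest-neighbour lookup. The function name, the use of Euclidean distance as the notion of closeness, and the example reference states are assumptions for illustration.

```python
import math

def nearest_normal(x, normal_states):
    """Estimate the normalized input vector N(x) as the closest of a finite
    set of state vectors that were identified as normal beforehand."""
    if not normal_states:
        raise ValueError("at least one predetermined normal state is required")
    return min(normal_states, key=lambda s: math.dist(x, s))

# Hypothetical reference states lying along the x1 axis.
normal_states = [(-1.0, 0.0), (-0.5, 0.0), (0.0, 0.0), (0.5, 0.0), (1.0, 0.0)]
x = (0.0, 3.0)
N = nearest_normal(x, normal_states)
print(N)  # (0.0, 0.0)
```

The quality of this estimate depends on how densely the predetermined states cover the normal region; a sampling or optimization procedure as described above could refine the result.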

[0047] Applied to the example described above with reference to Figure 2, the state normalization device 18 maps the outlier state vector x = (0,3) to the closest point that is normal from the perspective of the detector, namely the normalized input vector N(x) = (0,0). The corresponding explanatory vector E can then be derived not by incorrectly averaging over the features, but by normalizing them, i.e. replacing each feature with the corresponding value of N(x). This results in the expected E(x) = (0,3) and correctly indicates that the x2 value alone is responsible for the observed outlier score.

[0048] Therefore, in step S5, the anomaly metric f of the input vector x is determined as the baseline metric. Furthermore, to determine the components of the explanatory vector E, in step S6, a partially normalized input vector Np is calculated for each component of the explanatory vector E, wherein the one or more components of the input vector x that correspond to that component of the explanatory vector E are replaced with the corresponding one or more components of the normalized input vector N.

[0049] In another step S7, each of the components of the explanatory vector E is determined by subtracting the anomaly metric f of its associated partially normalized input vector Np from the baseline metric.

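[0050] In the example case, this means calculating in steps S5 and S6 the baseline metric $f(x) = f(0,3) = 3$ and the anomaly metrics of the partially normalized input vectors, $f(N_1, x_2) = f(0,3) = 3$ as well as $f(x_1, N_2) = f(0,0) = 0$. In step S7, the corresponding contribution metrics are calculated as $E_1 = 3 - 3 = 0$ as well as $E_2 = 3 - 0 = 3$. This perfectly matches the expected explanatory vector E(x) = (0, 3).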
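Steps S5 to S7 can be sketched in a few lines for the example case. The function names are hypothetical, and the anomaly metric is the Euclidean distance from the origin used in the example; any other state evaluation model could be passed in its place.

```python
import math

def anomaly_metric(x):
    """Example outlier metric f: Euclidean distance from the origin."""
    return math.sqrt(sum(v * v for v in x))

def explanation_vector(f, x, normalized):
    """Explanatory vector E from input x and normalized input vector N."""
    baseline = f(x)                                   # S5: baseline metric
    explanation = []
    for i in range(len(x)):
        partially_normalized = list(x)                # S6: copy x and replace
        partially_normalized[i] = normalized[i]       #     component i by N_i
        explanation.append(baseline - f(partially_normalized))  # S7
    return explanation

x = (0.0, 3.0)   # outlier state vector
N = (0.0, 0.0)   # normalized input vector N(x)
E = explanation_vector(anomaly_metric, x, N)
print(E)  # [0.0, 3.0]
```

Each component requires only one additional evaluation of the anomaly metric, so the cost grows linearly with the number of components.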

[0051] Note that the logic of normalizing features or components, rather than removing them, in order to explain which components are relevant to the anomaly determination can also be combined with other computational methods for deriving the interpretation. Techniques that explicitly consider baselines (such as Shapley values, LIME, DeepLIFT, or integrated gradients) can be reformulated by setting the normalized input vector N(x) as the baseline value when interpreting the anomaly determination of the state vector x. Therefore, the interpretation vector computation device 20 can also be implemented using any of these methods.
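As one possible instance of such a reformulation, the sketch below computes exact Shapley values in which "absent" features take the value of the normalized input vector N(x). It enumerates all coalitions and is therefore only practical for a small number of components; the function names and the toy anomaly metric are assumptions for illustration.

```python
import math
from itertools import combinations

def anomaly_metric(x):
    """Assumed toy metric: Euclidean distance from the origin."""
    return math.sqrt(sum(v * v for v in x))

def shapley_values(f, x, baseline):
    """Exact Shapley values of f at x with `baseline` as the reference point."""
    n = len(x)

    def value(coalition):
        # Components in the coalition keep their observed value;
        # all other components are set to the baseline (normalized) value.
        point = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return f(point)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            weight = (math.factorial(size) * math.factorial(n - size - 1)
                      / math.factorial(n))
            for subset in combinations(others, size):
                members = set(subset)
                phi[i] += weight * (value(members | {i}) - value(members))
    return phi

x = (0.0, 3.0)
N = (0.0, 0.0)
E = shapley_values(anomaly_metric, x, N)
print(E)  # [0.0, 3.0]
```

By construction, the Shapley values sum to the difference between the anomaly metric of x and that of the baseline N(x).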

[0052] Furthermore, the same logic applies when the anomaly metric calculation device 14 uses a more complex model (such as SOM, LOF, kNN, a type of neural network, or a type of SVM) rather than a simple distance metric to compute the anomaly metric.

[0053] In various embodiments, the input vector x can be constructed to represent various states. For example, in some embodiments, the input vector x can be constructed from sensor data, such as position, orientation, acceleration, light, electric field, magnetic field, humidity, pressure, temperature, optical, acoustic, and/or other sensor data. Therefore, the input vector x can represent, for example, the current state of a machine. Such an input vector x can be, for example, one-dimensional, including components x1, x2, ..., xi. In some embodiments, the input vector x may include an image or other two-dimensional data. In such cases, the input vector x may include components x1,1, x1,2, ..., x1,k, x2,1, ..., xi,k. Each component can then include a scalar, vector, matrix, or tensor. Higher-dimensional input vectors x may also be advantageous for other tasks.

[0054] The interpretation vector E may also include various components. In some embodiments, the interpretation vector E may include a component associated with a component of the input vector x. In some embodiments, each component of the interpretation vector E may represent multiple components of the input vector x.
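Where one component of the interpretation vector represents several components of the input vector, the partial normalization can replace the whole group at once. The following sketch assumes a hypothetical layout with two samples per sensor and a Euclidean-distance anomaly metric; the grouping and all names are illustrative.

```python
import math

def anomaly_metric(x):
    """Assumed toy metric: Euclidean distance from the origin."""
    return math.sqrt(sum(v * v for v in x))

def grouped_explanation(f, x, normalized, groups):
    """One explanation component per group of input-vector indices."""
    baseline = f(x)
    explanation = []
    for indices in groups:
        partially_normalized = list(x)
        for i in indices:
            # Replace every component of the group by its normalized value.
            partially_normalized[i] = normalized[i]
        explanation.append(baseline - f(partially_normalized))
    return explanation

# Hypothetical layout: indices 0-1 come from sensor A, indices 2-3 from sensor B.
x = (0.0, 0.0, 3.0, 4.0)
N = (0.0, 0.0, 0.0, 0.0)
groups = [(0, 1), (2, 3)]
E = grouped_explanation(anomaly_metric, x, N, groups)
print(E)  # [0.0, 5.0]
```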

[0055] In some embodiments, the anomaly determination device may include a data receiving device that includes a plurality of inputs 12 for receiving data and is configured to represent the state described by the data as an input vector x.

[0056] It should be understood that, generally, the anomaly determination and interpretation device 10, and in particular any of the anomaly metric calculation device 14, the anomaly detection device 16, the state normalization device 18, and/or the interpretation vector calculation device 20, can be implemented using software modules for execution on, for example, the CPU of a general-purpose computer. For this purpose, the software modules can be stored in a non-transitory computer-readable storage medium and loaded into memory for execution by the CPU. The software modules can be written in various programming languages and can include machine-readable instructions to guide the operation of a general-purpose computer.

[0057] Although the invention has been described in detail with the aid of embodiments, the invention is not limited to the embodiments described. Other modifications will be readily derived by those skilled in the art without departing from the spirit of the invention or the scope of the invention as defined by the following claims.

[0058] Optional features of the invention are specified using the word "may". Therefore, there are also other embodiments of the invention that additionally or alternatively have one or more of the corresponding features.

[0059] Based on the currently disclosed combination of features, isolated features may be selected and used in combination with other features to define the subject matter of the claims, while removing any structural and / or functional relationships that may exist between the features.

[0060] Individuals with male, female, or other gender identities are included within the term, independent of grammatical usage.

Claims

1. A computer-implemented anomaly determination and interpretation method for determining an interpretation vector (E), the interpretation vector (E) being used to explain the determination that an input vector (x) representing the state of a machine indicates an anomalous state, wherein the state is at least partially obtained from sensor data, and for examining sensor values that cause the anomalous state, the method comprising performing the following steps: determining (S1) an anomaly metric (f) from the input vector (x) using a state evaluation model; determining (S2), based on the anomaly metric (f), whether the input vector (x) represents an abnormal state or a normal state; estimating (S3) a normalized input vector (N) from the input vector (x), wherein the normalized input vector (N) is the vector closest to the input vector (x) that is estimated not to be determined as an anomaly according to the anomaly metric (f) calculated therefrom; and determining (S4) the interpretation vector (E) from the normalized input vector (N), the input vector (x), and the state evaluation model, such that the value of each component of the interpretation vector (E) corresponds to a measure of the contribution of one or more associated components of the input vector (x) to the determination that the input vector (x) represents an anomalous state.

2. The anomaly determination and interpretation method according to claim 1, wherein the step of determining (S4) the interpretation vector (E) comprises the following sub-steps: determining (S5) the anomaly metric (f) of the input vector (x) as the baseline metric; and, for each component of the interpretation vector (E): determining (S6) a partially normalized input vector (Np) by replacing the one or more components of the input vector (x) that correspond to that component of the interpretation vector (E) with the corresponding associated one or more components of the normalized input vector (N); and determining (S7) that component of the interpretation vector (E) by subtracting the anomaly metric (f) of the partially normalized input vector (Np) from the baseline metric.

3. The anomaly determination and interpretation method according to claim 1, wherein the step of determining (S4) the interpretation vector (E) includes a method for determining the interpretation vector (E) relative to a baseline, such as Shapley values, LIME, DeepLIFT, or integrated gradients, wherein the normalized input vector (N) is selected as the baseline.

4. The anomaly determination and interpretation method according to any one of the preceding claims, wherein the input vector (x) is a state vector obtained at least in part from sensor data.

5. An anomaly determination and interpretation device (10) for determining an interpretation vector (E), the interpretation vector (E) being used to explain the determination that an input vector (x) representing the state of a machine indicates an anomalous state, wherein the state is at least partially obtained from sensor data, and for examining sensor values that cause the anomalous state, the anomaly determination and interpretation device (10) comprising: an anomaly metric calculation device (14) configured to determine an anomaly metric (f) from the input vector (x) representing the state by means of a state evaluation model; an anomaly detection device (16) configured to receive the anomaly metric (f) from the anomaly metric calculation device (14) and to determine whether the input vector (x) represents an abnormal state or a normal state based on the anomaly metric (f); a state normalization device (18) configured to estimate a normalized input vector (N) based on the input vector (x), wherein the normalized input vector (N) is the vector closest to the input vector (x) that is not determined to be an anomaly based on the anomaly metric (f) calculated from it; and an interpretation vector computation device (20) configured to determine the interpretation vector (E) from the normalized input vector (N), the input vector (x), and the state evaluation model, such that the value of each component of the interpretation vector (E) corresponds to a measure of the contribution of one or more associated components of the input vector (x) to the determination that the input vector (x) represents an abnormal state.

6. The anomaly determination and interpretation device (10) according to claim 5, wherein the interpretation vector calculation device (20) is configured such that the step of determining the interpretation vector (E) includes the following sub-steps: determining the anomaly metric (f) of the input vector (x) as the baseline metric; and, for each component of the interpretation vector (E): determining a partially normalized input vector (Np) by replacing the one or more components of the input vector (x) that correspond to that component of the interpretation vector (E) with the one or more corresponding associated components of the normalized input vector (N); and determining that component of the interpretation vector (E) by subtracting the anomaly metric (f) of the partially normalized input vector (Np) from the baseline metric.

7. The anomaly determination and interpretation device (10) according to claim 5, wherein the interpretation vector calculation device (20) is configured such that the step of determining the interpretation vector (E) includes a method for determining the interpretation vector (E) relative to a baseline, such as Shapley values, LIME, DeepLIFT, or integrated gradients, wherein the normalized input vector (N) is selected as the baseline.

8. The anomaly determination and interpretation device (10) according to any one of claims 5 to 7, wherein the input vector (x) is a state vector obtained at least in part from sensor data.

9. A computer program product that can be directly loaded into the internal memory of a digital computer, comprising software code portions for performing the steps of the method according to any one of claims 1 to 4 when the product is run on the digital computer.