Method, apparatus and related device for determining anomaly detection algorithm

CN116049257B · Active · Publication Date: 2026-06-23 · SHENZHEN HUAWEI CLOUD COMPUTING TECHNOLOGIES CO LTD

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
SHENZHEN HUAWEI CLOUD COMPUTING TECHNOLOGIES CO LTD
Filing Date
2023-01-03
Publication Date
2026-06-23


Abstract

The application provides a method for determining an anomaly detection algorithm, which is used for automatically determining an anomaly detection algorithm for performing anomaly detection on target time series data. The method comprises: determining a first target algorithm set, the first target algorithm set comprising a plurality of anomaly detection algorithms, and the anomaly detection algorithms in the first target algorithm set having the capability of performing anomaly detection on target time series data; determining a second target algorithm set according to a first target anomaly type, the second target algorithm set comprising a plurality of anomaly detection algorithms, and the anomaly detection algorithms in the second target algorithm set having the capability of detecting anomalies of the first target anomaly type; and determining a first target anomaly detection algorithm according to the first target algorithm set and the second target algorithm set. The application also provides corresponding apparatuses, computing device clusters, chips, computer readable storage media, and computer program products.

Description

Technical Field

[0001] This application relates to the field of data processing technology, and in particular to a method, apparatus and related equipment for determining anomaly detection algorithms.

Background Technology

[0002] Time-series data is an important data type generated by numerous applications. Examples include cloud service machine metrics, application programming interface (API) status, and financial trend data. Monitoring such time-series data allows anomalies, and the underlying problems they indicate, to be detected in a timely manner.

[0003] Because time series data covers a wide range of fields, anomaly monitoring algorithms for time series data are currently mostly designed by technical personnel specializing in that area. For example, if monitoring anomalies in financial time series data is required, technical personnel need to understand the data characteristics of financial time series data and the potential anomalies in order to select the appropriate anomaly monitoring algorithm. Furthermore, different monitoring algorithms may have different functions; only by selecting the appropriate algorithm can anomaly monitoring of time series data be effectively performed.

[0004] In other words, traditional methods for monitoring time-series data require technicians to understand the characteristics of the time-series data itself and the characteristics of anomaly monitoring algorithms. This increases the difficulty of monitoring anomalies in time-series data.

Summary of the Invention

[0005] In view of this, this application provides a method for determining an anomaly detection algorithm, used to automatically determine an anomaly detection algorithm for detecting anomalies in target time-series data. This application also provides corresponding apparatus, computing device clusters, chips, computer-readable storage media, and computer program products.

[0006] Firstly, this application provides a method for determining anomaly detection algorithms. This method can be used to determine anomaly detection algorithms for detecting anomalies in target time-series data, and can be executed by an anomaly detection algorithm determination device. Specifically, when executing the method, the anomaly detection algorithm determination device can first determine a first target algorithm set. The first target algorithm set includes at least one anomaly detection algorithm, and the anomaly detection algorithms in the first target algorithm set have the ability to detect anomalies in the target time-series data. Next, the anomaly detection algorithm determination device can determine a second target algorithm set based on a first target anomaly type. The second target algorithm set also includes at least one anomaly detection algorithm, and the anomaly detection algorithms in the second target algorithm set have the ability to detect anomalies of the first target anomaly type. After determining the first target algorithm set and the second target algorithm set, the anomaly detection algorithm determination device can determine a first target anomaly detection algorithm based on the two sets. The first target anomaly detection algorithm belongs to both the first target algorithm set and the second target algorithm set. Therefore, the first target anomaly detection algorithm can both detect anomalies in the target time-series data and detect anomalies of the first target anomaly type. As can be seen, the anomaly detection algorithm determination device can determine a first target anomaly detection algorithm that meets the conditions based on the target time-series data to be detected and the first target anomaly type to be detected. This eliminates the need for technicians to thoroughly understand the characteristics of the target time-series data or to study anomaly detection algorithms extensively, and it enables the rapid determination of the anomaly detection algorithm for the target time-series data. Thus, it improves the efficiency of determining the target anomaly detection algorithm and lowers the barrier to entry for technical personnel.

[0007] In some possible implementations, the anomaly detection algorithm determination device can determine the first target anomaly detection algorithm by finding the intersection of the first and second target algorithm sets. Specifically, the device can first find the intersection of the first and second target algorithm sets to determine a third target algorithm set. Then, the device can select the algorithm with the highest detection efficiency from the third target algorithm set as the first target anomaly detection algorithm. This ensures that the first target anomaly detection algorithm can detect anomalies of the first target anomaly type in the target time series data while improving the detection efficiency. It is understood that the first target anomaly detection algorithm can also be selected from the third target algorithm set based on other principles.
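The selection described in this implementation can be sketched as follows. This is a minimal illustration only, not the patented implementation: the algorithm identifiers and the `efficiency` scores are hypothetical, and the source does not say how detection efficiency is measured.

```python
def select_target_algorithm(first_set, second_set, efficiency):
    """Return the most efficient algorithm present in both candidate sets."""
    # Third target algorithm set: intersection of the first and second sets.
    third_set = set(first_set) & set(second_set)
    if not third_set:
        return None  # no algorithm satisfies both conditions
    # Sorting first makes the result deterministic when scores are tied.
    return max(sorted(third_set), key=lambda name: efficiency.get(name, 0.0))


# Hypothetical identifiers and efficiency scores, for illustration only.
first_set = {"three_sigma", "fixed_threshold", "periodic_detection"}
second_set = {"three_sigma", "fixed_threshold"}
efficiency = {"three_sigma": 0.8, "fixed_threshold": 0.95, "periodic_detection": 0.6}

chosen = select_target_algorithm(first_set, second_set, efficiency)
print(chosen)
```

Because the text notes that other selection principles are possible, the `max` key could be swapped for any other ranking, such as accuracy.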

[0008] In some possible implementations, it may be necessary to detect multiple types of anomalies in the target time-series data. Accordingly, the anomaly detection algorithm determination device can also acquire a second target anomaly type and determine a fourth target algorithm set based on the second target anomaly type. Similar to the second target algorithm set, the fourth target algorithm set may include at least one anomaly detection algorithm, and the anomaly detection algorithms in the fourth target algorithm set have the ability to detect anomalies of the second target anomaly type. Next, the anomaly detection algorithm determination device can determine a second target anomaly detection algorithm based on the first target algorithm set and the fourth target algorithm set. Since the second target anomaly detection algorithm belongs to both the first and fourth target algorithm sets, it is capable of detecting anomalies of the second target anomaly type in the target time-series data.

[0009] In some possible implementations, the anomaly detection algorithm determination device can generate a corresponding anomaly detection procedure based on multiple target anomaly detection algorithms, and then use the anomaly detection procedure to detect the target time-series data. Specifically, if the anomaly detection algorithm determination device determines a first target anomaly detection algorithm and a second target anomaly detection algorithm based on a first target anomaly type and a second target anomaly type, respectively, it can determine the anomaly detection procedure based on the first target anomaly detection algorithm and the second target anomaly detection algorithm.
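One way to picture how several target anomaly types lead to one detection procedure is the sketch below. It assumes, purely for illustration, that the procedure is an ordered list of (anomaly type, algorithm) pairs; the type names, algorithm identifiers, and efficiency scores are hypothetical.

```python
def build_detection_procedure(first_set, candidates_by_type, efficiency):
    """Pick one algorithm per anomaly type and return them as an ordered list."""
    procedure = []
    for anomaly_type, candidates in candidates_by_type.items():
        # Algorithms that suit the data AND can detect this anomaly type.
        shared = sorted(set(first_set) & set(candidates))
        if not shared:
            continue  # nothing covers both the data and this anomaly type
        best = max(shared, key=lambda name: efficiency.get(name, 0.0))
        procedure.append((anomaly_type, best))
    return procedure


# Hypothetical inputs: the sets keyed by anomaly type play the role of the
# second and fourth target algorithm sets.
first_set = {"three_sigma", "fixed_threshold", "periodic_detection"}
candidates_by_type = {
    "sudden_change": {"three_sigma"},
    "threshold_breach": {"fixed_threshold", "three_sigma"},
}
efficiency = {"three_sigma": 0.8, "fixed_threshold": 0.95}

procedure = build_detection_procedure(first_set, candidates_by_type, efficiency)
print(procedure)
```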

[0010] In some possible implementations, the anomaly detection algorithm determination device can determine a first target algorithm set based on the data characteristics of the target time series data. Specifically, the anomaly detection algorithm determination device can first acquire historical time series data corresponding to the target time series data. The historical time series data can be, for example, historical data of the target time series data, or time series data of the same type as the target time series data. Next, the anomaly detection algorithm determination device can extract features from the historical time series data to determine its data characteristics. Since the historical time series data corresponds to the target time series data, and the data characteristics of the historical time series data have a high degree of similarity to those of the target time series data, the data characteristics of the historical time series data can be used as the data characteristics of the target time series data. Then, the anomaly detection algorithm determination device can determine the first target algorithm set based on the data characteristics of the target time series data. In this way, without requiring a technician to understand the characteristics of the target time series data, the anomaly detection algorithm determination device can determine the first target algorithm set corresponding to the target time series data by analyzing the historical time series data.

[0011] Secondly, this application provides an apparatus for determining anomaly detection algorithms. The apparatus is used to determine anomaly detection algorithms for detecting anomalies in target time-series data. The apparatus includes: a first determining module for determining a first target algorithm set, the first target algorithm set including multiple anomaly detection algorithms, wherein the anomaly detection algorithms in the first target algorithm set have the ability to detect anomalies in the target time-series data; a second determining module for determining a second target algorithm set based on a first target anomaly type, the second target algorithm set including multiple anomaly detection algorithms, wherein the anomaly detection algorithms in the second target algorithm set have the ability to detect anomalies of the first target anomaly type; and a third determining module for determining a first target anomaly detection algorithm based on the first target algorithm set and the second target algorithm set, wherein the first target anomaly detection algorithm belongs to both the first target algorithm set and the second target algorithm set.

[0012] In some possible implementations, the third determining module is specifically used to determine a third target algorithm set, which is the intersection of the first target algorithm set and the second target algorithm set; and select the algorithm with the highest detection efficiency from the third target algorithm set as the first target anomaly detection algorithm.

[0013] In some possible implementations, the apparatus further includes a first acquisition module; the first acquisition module is configured to acquire a second target anomaly type; the second determination module is further configured to determine a fourth target algorithm set based on the second target anomaly type, the fourth target algorithm set including multiple anomaly detection algorithms, the anomaly detection algorithms in the fourth target algorithm set having the ability to detect anomalies of the second target anomaly type; the third determination module is further configured to determine a second target anomaly detection algorithm based on the first target algorithm set and the fourth target algorithm set, the second target anomaly detection algorithm belonging to both the first target algorithm set and the fourth target algorithm set.

[0014] In some possible implementations, the third determining module is further configured to determine an anomaly detection process based on the first target anomaly detection algorithm and the second target anomaly detection algorithm; and to detect the target time-series data according to the anomaly detection process.

[0015] In some possible implementations, the apparatus further includes a second acquisition module; the second acquisition module is used to acquire historical time-series data corresponding to the target time-series data; the first determination module is specifically used to extract features from the historical time-series data to determine the data features of the target time-series data; and to determine the first target algorithm set based on the data features of the target time-series data.

[0016] Thirdly, this application provides a computing device cluster, the computing device cluster including at least one computing device, the at least one computing device including at least one processor and at least one memory; the at least one memory is used to store instructions, and the at least one processor executes the instructions stored in the at least one memory to cause the computing device cluster to perform the method of determining the anomaly detection algorithm in the first aspect or any possible implementation of the first aspect. It should be noted that the memory can be integrated into the processor or can be independent of the processor. The at least one computing device may also include a bus. The processor is connected to the memory via the bus. The memory may include read-only memory and random access memory.

[0017] Fourthly, this application provides a computer-readable storage medium storing instructions that, when executed on at least one computing device, cause the at least one computing device to perform the method for determining the anomaly detection algorithm described in the first aspect or any implementation thereof.

[0018] Fifthly, this application provides a computer program product containing instructions that, when run on at least one computing device, cause the at least one computing device to execute the method for determining the anomaly detection algorithm described in the first aspect or any implementation thereof.

Attached Figure Description

[0019] Figure 1 is a schematic diagram of an application scenario for the anomaly detection algorithm determination device provided in the embodiments of this application;

[0020] Figure 2 is a signaling interaction diagram for a method of determining anomaly detection algorithms provided in embodiments of this application;

[0021] Figure 3 is a schematic diagram of the structure of a computing device provided in an embodiment of this application;

[0022] Figure 4 is a schematic diagram of the structure of a computing device cluster provided in an embodiment of this application.

Detailed Implementation

[0023] The solutions in the embodiments provided in this application will now be described with reference to the accompanying drawings.

[0024] The terms "first," "second," etc., used in the specification, claims, and accompanying drawings of this application are used to distinguish similar objects and are not necessarily used to describe a specific order or sequence. It should be understood that such terms can be used interchangeably where appropriate; this is merely a way of distinguishing objects with the same attributes in the embodiments of this application.

[0025] Time-series data refers to data generated in chronological order. Therefore, time-series data is characterized by large volume but low information content. However, valuable information can still be extracted from large amounts of time-series data through analysis. Anomaly detection in time-series data is a crucial approach to unlocking its value. Because time-series data is generated sequentially, anomaly detection is generally performed as the data is generated. Therefore, unlike traditional anomaly detection algorithms, the event-driven nature of time-series data necessitates different anomaly detection algorithms. This requires technical personnel to select appropriate anomaly detection algorithms based on the characteristics of time-series data.

[0026] Specifically, when determining an anomaly detection algorithm for a certain type of time series data, technical personnel need to fully understand the characteristics of the time series data and the characteristics of the anomaly detection algorithm, so as to select the corresponding anomaly detection algorithm from a variety of anomaly detection algorithms based on the characteristics of the time series data.

[0027] In reality, the accuracy of anomaly detection results is strongly correlated with the data context. For example, consider two scenarios that both involve a continuously rising metric. In a memory detection scenario, this phenomenon corresponds to a memory leak. However, in a process startup time monitoring scenario, since the process startup time itself inherently increases continuously, a continuous increase in process startup time is not considered an anomaly. In other words, the same phenomenon may have different interpretations in different detection scenarios. Therefore, it is necessary to select specific anomaly detection algorithms based on the data context.

[0028] Furthermore, due to the unique characteristics of time-series data, anomaly detection algorithms used for anomaly detection in time-series data are mostly highly specific. That is, a single anomaly detection algorithm may only be used to detect specific phenomena, and there is a lack of anomaly detection algorithms capable of detecting all anomalies. For example, the 3-sigma algorithm can be used to detect sudden increases or decreases in time-series data; the fixed threshold algorithm can be used to detect whether time-series data exceeds a preset threshold; and the periodic detection algorithm can be used to detect periodic changes in time-series data.
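To make the specificity of these algorithms concrete, the sketch below gives textbook-style versions of two of the algorithms named above. These are simplified illustrations, not the implementations used in the application; the function names and the choice of a population standard deviation are assumptions.

```python
from statistics import mean, pstdev


def three_sigma_anomalies(values, k=3.0):
    """Return indices of points more than k standard deviations from the mean."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation (an assumption)
    if sigma == 0:
        return []  # a constant series has no sudden increases or decreases
    return [i for i, v in enumerate(values) if abs(v - mu) > k * sigma]


def fixed_threshold_anomalies(values, upper):
    """Return indices of points that exceed a preset upper threshold."""
    return [i for i, v in enumerate(values) if v > upper]


# Hypothetical metric: a flat series with one sudden spike at the end.
series = [10.0] * 20 + [100.0]
spikes = three_sigma_anomalies(series)
breaches = fixed_threshold_anomalies([1, 5, 9, 3], upper=4)
print(spikes, breaches)
```

Each function answers a different question about the same kind of data, which is why no single algorithm covers every anomaly type.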

[0029] In other words, to select a suitable anomaly detection algorithm for time-series data, technical personnel need to have a thorough understanding of the data characteristics of the anomaly being detected, and also need to be familiar with various alternative anomaly detection algorithms. Clearly, this places high demands on technical personnel and creates a significant learning curve for users of anomaly detection algorithms.

[0030] Furthermore, even different anomaly detection algorithms can vary in accuracy and efficiency when detecting the same anomaly. Therefore, in order to select the most suitable anomaly detection algorithm, technicians may need to choose one from several similar algorithms based on the specific circumstances. This undoubtedly increases the amount of learning required from technicians and makes anomaly detection on time-series data more difficult.

[0031] Based on this, embodiments of this application provide a method for determining anomaly detection algorithms, which can be executed by a computing device or a cluster of computing devices. Specifically, it can be executed by an anomaly detection algorithm determination device running on a computing device or a cluster of computing devices to automatically select anomaly detection algorithms for time-series data. When selecting anomaly detection algorithms for target time-series data, the anomaly detection algorithm determination device can first determine a first target algorithm set. The first target algorithm set includes multiple anomaly detection algorithms, and all anomaly detection algorithms in the first target algorithm set have the ability to detect anomalies in the target time-series data. Next, the anomaly detection algorithm determination device can determine a second target algorithm set based on a first target anomaly type. The second target algorithm set includes multiple anomaly detection algorithms, and the anomaly detection algorithms in the second target algorithm set have the ability to detect anomalies of the first target anomaly type. In other words, the anomaly detection algorithm determination device can determine a first target algorithm set capable of detecting the target time-series data based on the target time-series data to be detected, and can also determine a usable second target algorithm set based on the item to be detected, i.e., the first target anomaly type. Then, the anomaly detection algorithm determination device can determine a first target anomaly detection algorithm based on the first target algorithm set and the second target algorithm set. Therefore, the selected first target anomaly detection algorithm can both detect anomalies in the target time-series data and detect anomalies of the first target anomaly type. It is evident that the anomaly detection algorithm determination device can determine a first target anomaly detection algorithm that meets the conditions based on the target time-series data to be detected and the first target anomaly type to be detected. This allows for the rapid determination of the anomaly detection algorithm for the target time-series data without requiring technicians to have a thorough understanding of the characteristics of the target time-series data or to learn extensively about anomaly detection algorithms. Thus, the efficiency of determining the target anomaly detection algorithm is improved, and the skill threshold required of technical personnel is lowered.

[0032] As an example, the anomaly detection algorithm determination device described above can be deployed on a single computing device, such as a computer or server. As another example, the anomaly detection algorithm determination device can be deployed on multiple computing devices, such as one or more servers in a distributed processing system. Optionally, if the anomaly detection algorithm determination device is deployed on multiple computing devices, different computing devices can be used to perform different steps. Optionally, the computing device or cluster of computing devices deploying the anomaly detection algorithm determination device can be the same as or different from the computing device or cluster of computing devices that generates the target time-series data. In some possible implementations, after the anomaly detection algorithm determination device determines the first target anomaly detection algorithm, it can send the identifier of the first target anomaly detection algorithm to other computing devices (or computing device clusters). The other computing devices (or computing device clusters) can then use the first target anomaly detection algorithm to perform anomaly detection on the target time-series data based on the identifier of the first target anomaly detection algorithm.

[0033] For example, in the application scenario shown in Figure 1, the anomaly detection algorithm determination device 100 may specifically include a first determination module 110, a second determination module 120, and a third determination module 130. The third determination module 130 is connected to both the first determination module 110 and the second determination module 120.

[0034] Specifically, the first determining module 110 is used to determine a first target algorithm set, the second determining module 120 is used to determine a second target algorithm set according to the first target anomaly type, and the third determining module 130 is used to combine the first target algorithm set and the second target algorithm set to determine a first target anomaly detection algorithm.
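The division of work among the three modules might be wired together as in the sketch below. The class name, the callable signatures, and the placeholder rule used by the third module are all assumptions made for illustration; the application does not prescribe a software interface.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Set


@dataclass
class AnomalyDetectionAlgorithmDeterminer:
    """Hypothetical stand-in for device 100, composed of three modules."""
    first_module: Callable[[], Set[str]]                          # module 110
    second_module: Callable[[str], Set[str]]                      # module 120
    third_module: Callable[[Set[str], Set[str]], Optional[str]]   # module 130

    def determine(self, anomaly_type: str) -> Optional[str]:
        first_set = self.first_module()                # first target algorithm set
        second_set = self.second_module(anomaly_type)  # second target algorithm set
        return self.third_module(first_set, second_set)


# Hypothetical module behaviours. The third module here simply returns the
# alphabetically first shared identifier as a placeholder selection rule.
device = AnomalyDetectionAlgorithmDeterminer(
    first_module=lambda: {"three_sigma", "periodic_detection"},
    second_module=lambda t: {"sudden_change": {"three_sigma"}}.get(t, set()),
    third_module=lambda a, b: min(a & b) if a & b else None,
)

result = device.determine("sudden_change")
print(result)
```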

[0035] In practical applications, the above-mentioned anomaly detection algorithm determination device 100 can be implemented in software or in hardware.

[0036] The anomaly detection algorithm determination device 100, as an example of a software functional unit, may include code running on a computing instance. The computing instance may include at least one of a physical host (computing device), a virtual machine, and a container. Further, the aforementioned computing instance may be one or more. For example, the anomaly detection algorithm determination device 100 may include code running on multiple hosts / virtual machines / containers. It should be noted that the multiple hosts / virtual machines / containers used to run the code may be distributed in the same region or in different regions. Further, the multiple hosts / virtual machines / containers used to run the code may be distributed in the same availability zone (AZ) or in different AZs, each AZ including one or more geographically proximate data centers. Typically, a region may include multiple AZs.

[0037] Similarly, multiple hosts / virtual machines / containers used to run this code can be distributed within the same Virtual Private Cloud (VPC) or across multiple VPCs. Typically, a VPC is set up within a region. Communication between two VPCs within the same region, as well as between VPCs in different regions, requires a communication gateway to be set up within each VPC to enable interconnection between VPCs.

[0038] As an example of a hardware functional unit, the anomaly detection algorithm determination device 100 may include at least one computing device, such as a server. Alternatively, the anomaly detection algorithm determination device 100 may also be a device implemented using an application-specific integrated circuit (ASIC) or a programmable logic device (PLD). The PLD may be implemented using a complex programmable logical device (CPLD), a field-programmable gate array (FPGA), generic array logic (GAL), or any combination thereof.

[0039] The anomaly detection algorithm determination device 100 includes multiple computing devices that can be distributed in the same region or in different regions. Similarly, the multiple computing devices included in the anomaly detection algorithm determination device 100 can be distributed in the same Availability Zone (AZ) or in different AZs. Likewise, the multiple computing devices included in the anomaly detection algorithm determination device 100 can be distributed in the same Virtual Private Cloud (VPC) or in multiple VPCs. These multiple computing devices can be any combination of computing devices such as servers, ASICs, PLDs, CPLDs, FPGAs, and GALs.

[0040] Next, various non-limiting specific implementations of the process for determining the anomaly detection algorithm will be described in detail.

[0041] Refer to Figure 2, which is a flowchart illustrating a method for determining an anomaly detection algorithm in an embodiment of this application. This method can be applied to the application scenario shown in Figure 1, and can also be applied to other applicable scenarios. The following description uses the application scenario shown in Figure 1 as an example. It should be noted that, in the application scenario shown in Figure 1, the anomaly detection algorithm determination device 100 not only includes the first determination module 110, the second determination module 120, and the third determination module 130, but may further include an acquisition module 140 and a feature extraction module 150. Since the acquisition module 140 and the feature extraction module 150 are optional, they are indicated by dashed lines in Figure 1 and Figure 2 (if present). Furthermore, the functions of each module are described in detail in the following embodiments.

[0042] The method for determining the anomaly detection algorithm shown in Figure 2 may specifically include:

[0043] S201: The first determining module 110 determines the first target algorithm set.

[0044] To determine the target anomaly detection algorithm for the target time series data, the first determining module 110 in the anomaly detection algorithm determining device 100 first determines a first set of target algorithms based on the target time series data to be detected. Here, the target time series data refers to the time series data to be detected. It can be understood that the target time series data may be time series data that has not yet been generated.

[0045] The first target algorithm set includes multiple anomaly detection algorithms, and any one of the anomaly detection algorithms in the first target algorithm set has the ability to detect anomalies in the target time series data. That is, before determining the final anomaly detection algorithm, the first determining module 110 can first select anomaly detection algorithms with the ability to detect anomalies in the target time series data from multiple anomaly detection algorithms to obtain the first target algorithm set.

[0046] In some possible implementations, the first determining module 110 can determine the first target algorithm set based on the data characteristics of the target time series data. Specifically, the first determining module 110 can first determine the data characteristics of the target time series data. These data characteristics are relevant information indicating the features possessed by the time series data. For example, they may include data periodicity, trend, and random peak values. Optionally, the data characteristics of the target time series data can be obtained by feature extraction from the target time series data, or by feature extraction from historical time series data of the same type as the target time series data. For example, the first determining module 110 can acquire historical time series data, then extract features from the historical time series data, thereby using the data characteristics of the historical time series data as the data characteristics of the target time series data to determine the corresponding first target algorithm set.

[0047] As mentioned earlier, target time-series data can be time-series data that has not yet been generated at the current moment. Historical time-series data, on the other hand, can be data generated before the current moment that is of the same type or origin as the target time-series data. For example, suppose we need to detect anomalies in the memory usage of a distributed system. The target time-series data could be the memory usage of the distributed system after the current moment. Historical time-series data, however, could be the memory usage of the distributed system before the current moment, such as memory usage over a past period. Thus, by extracting features from historical time-series data, we can determine the data characteristics of the target time-series data, thereby selecting the corresponding first set of target algorithms.
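As a rough illustration of extracting data characteristics such as trend and periodicity from historical time-series data, consider the sketch below. The source does not specify an extraction method, so the statistics (linear correlation for trend, autocorrelation of the detrended series for periodicity), the thresholds, and the candidate periods are all assumptions.

```python
from statistics import mean


def extract_features(history, candidate_periods=(2, 3, 4, 6, 12, 24),
                     trend_threshold=0.8, acf_threshold=0.7):
    """Return a dict with a trend flag and the smallest detected period (or None)."""
    n = len(history)
    t_mean = (n - 1) / 2
    y_mean = mean(history)
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(history))
    syy = sum((y - y_mean) ** 2 for y in history)

    # Trend: strong linear correlation between the time index and the value.
    corr = sxy / (sxx * syy) ** 0.5 if sxx > 0 and syy > 0 else 0.0
    has_trend = abs(corr) >= trend_threshold

    # Subtract the fitted line so a trend is not mistaken for periodicity.
    slope = sxy / sxx if sxx > 0 else 0.0
    residual = [y - (y_mean + slope * (t - t_mean)) for t, y in enumerate(history)]
    energy = sum(r * r for r in residual)

    period = None
    if energy > 1e-9:
        for lag in candidate_periods:
            if lag >= n:
                break
            # Autocorrelation of the detrended series at this lag.
            acf = sum(residual[t] * residual[t + lag] for t in range(n - lag)) / energy
            if acf >= acf_threshold:
                period = lag
                break
    return {"trend": has_trend, "period": period}


# Hypothetical histories: one periodic metric and one steadily rising metric.
periodic_features = extract_features([0, 1, 0, -1] * 10)
rising_features = extract_features(list(range(30)))
print(periodic_features, rising_features)
```

The resulting features of the historical data would then stand in for the features of the target time-series data, as the paragraph above describes.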

[0048] Optionally, the process of feature extraction from historical time-series data described above can be performed by the feature extraction module 150 in the anomaly detection algorithm determination device 100. The historical time-series data can be acquired by the acquisition module 140 in the anomaly detection algorithm determination device 100.

[0049] In this embodiment, mapping relationships between different data features and anomaly detection algorithms can be pre-set. This mapping relationship can be referred to as the first mapping relationship. Thus, after determining the data features of the target time-series data to be detected, a first set of target algorithms corresponding to the data features can be determined based on the first mapping relationship. The first mapping relationship can be pre-set in the anomaly detection algorithm determination device 100 by a person skilled in the art who understands anomaly detection algorithms. It indicates, for each anomaly detection algorithm, the data features of the time-series data in which that algorithm can detect anomalies.
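A pre-set mapping of this kind could be represented as a simple lookup table, as sketched below. The feature names, the algorithm identifiers, and the rule for combining several features (here, keeping only algorithms that suit every feature) are assumptions; the source does not state how multiple features are combined.

```python
# Hypothetical first mapping relationship: data feature -> suitable algorithms.
FIRST_MAPPING = {
    "periodicity": {"periodic_detection", "three_sigma"},
    "trend": {"fixed_threshold", "three_sigma"},
    "random_peaks": {"three_sigma"},
}


def first_target_algorithm_set(data_features, mapping=FIRST_MAPPING):
    """Return the algorithms mapped to every one of the given data features."""
    candidate_sets = [set(mapping.get(f, set())) for f in data_features]
    if not candidate_sets:
        return set()
    # Assumption: an algorithm must suit all features (intersection).
    return set.intersection(*candidate_sets)


both = first_target_algorithm_set(["periodicity", "trend"])
print(both)
```

If a union of the mapped sets is preferred instead, only the final line of the function changes.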

[0050] After determining the first target algorithm set, the first determining module 110 can send the first target algorithm set to the third determining module 130. The first target algorithm set may include the anomaly detection algorithm itself, or it may include the identifier of the anomaly detection algorithm. The third determining module 130 can determine the anomaly detection algorithm in the first target algorithm set based on the identifier of the anomaly detection algorithm.

[0051] S202: The second determining module 120 determines the second target algorithm set based on the first target anomaly type.

[0052] In step S201, the first determining module 110 can determine a first target algorithm set based on the target time-series data. Conversely, the second determining module 120 can determine a second target algorithm set based on the first target anomaly type. The second target algorithm set includes multiple anomaly detection algorithms, and these algorithms are capable of detecting anomalies of the first target anomaly type. In other words, the second determining module 120 can select anomaly detection algorithms capable of detecting the first target anomaly type from a variety of algorithms, thereby obtaining the second target algorithm set.

[0053] In this embodiment, the first target anomaly type is the item that needs to be detected in the target time series data. That is, the anomaly detection algorithm determination device 100 finally selects the anomaly algorithm to detect the target time series data, and the detected anomaly items include anomalies of the first target anomaly type. Therefore, in order to ensure that the selected first target anomaly detection algorithm can detect anomalies of the first target anomaly type, the second determination module 120 needs to select a second set of target algorithms that can detect anomalies of the first target anomaly type, so as to further determine the first target anomaly detection algorithm from the second set of target algorithms. In other words, the first target anomaly detection algorithm belongs to the second set of target algorithms.

[0054] In some possible implementations, the first target anomaly type can be set by a technician in the anomaly detection algorithm determination device 100. For example, after determining the detection requirements for the target time-series data, the technician can analyze which types of anomalies need to be detected, thereby determining the first target anomaly type. Then, the technician can set the first target anomaly type through the client module of the anomaly detection algorithm determination device 100 (not shown in Figure 1). Correspondingly, the second determining module 120 can obtain the first target anomaly type sent by the client module, thereby determining the second target algorithm set.

[0055] Optionally, the first target anomaly type can be obtained by the acquisition module 140 in the anomaly detection algorithm determination device 100. In this embodiment, if both the first target anomaly type and the historical time-series data are obtained through an acquisition module, then the acquisition module for obtaining the first target anomaly type and the acquisition module for obtaining the historical time-series data can be the same module. For example, in the implementation shown in Figure 1, both the first target anomaly type and the historical time-series data are acquired by the acquisition module 140. However, in some other possible implementations, the acquisition module for acquiring the first target anomaly type and the acquisition module for acquiring the historical time-series data can be different modules. For example, the anomaly detection algorithm determination device may include a first acquisition module and a second acquisition module. The first acquisition module is used to acquire the first target anomaly type (and the second target anomaly type described later), and the second acquisition module is used to acquire the historical time-series data.

[0056] In some other possible implementations, the first target anomaly type can be obtained by the anomaly detection algorithm determination device 100 or other devices through analysis of historical anomaly data. This historical anomaly data includes information about anomalies that occurred in the historical time-series data. Based on the historical anomaly data, the anomalies that occurred in the historical time-series data in the past can be determined. Thus, based on the anomalies that occurred in the historical time-series data, the anomaly type of the target time-series data that may occur in the future can be predicted, thereby determining the first target anomaly type. For example, the anomaly detection algorithm determination device 100 can use the anomaly type of the most frequently occurring anomaly in the historical anomaly data as the first target anomaly type. Alternatively, the anomaly detection algorithm determination device 100 can also use the anomaly type of the most frequently occurring or most severe anomaly in the historical anomaly data as the first target anomaly type.
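The "most frequently occurring" rule can be sketched as a simple count over historical anomaly records. The record layout (a dictionary with a "type" key) and the function name are assumptions made for illustration; the source does not describe how historical anomaly data is stored.

```python
from collections import Counter


def most_frequent_anomaly_type(historical_anomalies):
    """Return the anomaly type that occurs most often, or None if no records.

    Assumes each record is a dict with a 'type' key (hypothetical layout).
    """
    counts = Counter(record["type"] for record in historical_anomalies)
    if not counts:
        return None
    # most_common(1) returns a one-element list of (type, count) pairs.
    return counts.most_common(1)[0][0]
```

A severity-based variant would replace the count with a per-record severity score, which this sketch does not attempt to model.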

[0057] In this embodiment, after determining the first target anomaly type, the second determining module 120 can determine the corresponding second target algorithm set based on the first target anomaly type and the second mapping relationship. The second mapping relationship indicates the anomaly detection algorithm's ability to detect anomalies. That is, based on the second mapping relationship, it can be determined which type of anomaly a particular anomaly detection algorithm can detect. Therefore, after determining the first target anomaly type, the second determining module 120 can use the second mapping relationship to search for anomaly detection algorithms from a variety of anomaly detection algorithms that are capable of detecting anomalies of the first target anomaly type, thereby determining the second target algorithm set.
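A minimal sketch of this lookup, assuming the second mapping relationship is stored as a dictionary from an algorithm identifier to the set of anomaly types it can detect. The algorithm identifiers and anomaly type names are hypothetical.

```python
# Hypothetical second mapping relationship: algorithm -> detectable anomaly types.
SECOND_MAPPING = {
    "three_sigma": {"spike", "dip"},
    "stl_residual": {"spike", "level_shift"},
    "lstm_forecast": {"spike", "level_shift", "trend_change"},
    "iqr": {"spike"},
}


def second_target_algorithm_set(anomaly_type, mapping=SECOND_MAPPING):
    """Return identifiers of all algorithms able to detect anomaly_type."""
    return {algo for algo, types in mapping.items() if anomaly_type in types}
```

Storing the mapping per algorithm keeps it easy to extend when a new algorithm is registered; an inverted index keyed by anomaly type would make the lookup itself cheaper.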

[0058] After determining the second target algorithm set, the second determining module 120 can send the second target algorithm set to the third determining module 130. The second target algorithm set may include the anomaly detection algorithm itself, or it may include the identifier of the anomaly detection algorithm. The third determining module 130 can determine the anomaly detection algorithm in the second target algorithm set based on the identifier of the anomaly detection algorithm.

[0059] It should be noted that steps S201 and S202 may not have a clear sequential relationship. Step S201 can be executed before or after step S202. For example, in one possible implementation, the anomaly detection algorithm determination device 100 can first acquire a first target anomaly type, and then determine a second target algorithm set based on the first target anomaly type. In this process, the anomaly detection algorithm determination device 100 can record the time-series data to be detected, thereby analyzing the time-series data, determining the data characteristics of the target time-series data, and then determining the first target algorithm set based on the data characteristics of the target time-series data.

[0060] Optionally, the second determining module 120 can also be used to determine the fourth target algorithm set based on the second target anomaly type. A description of this part can be found later and will not be repeated here.

[0061] S203: The third determining module 130 determines the first target anomaly detection algorithm based on the first target algorithm set and the second target algorithm set.

[0062] After the first target algorithm set and the second target algorithm set are determined, the third determining module 130 can determine the first target anomaly detection algorithm based on the two sets. The first target anomaly detection algorithm belongs to both the first target algorithm set and the second target algorithm set. Thus, the first target anomaly detection algorithm can detect anomalies in the target time-series data, and can also detect anomalies of the first target anomaly type. Therefore, the selected first target anomaly detection algorithm can detect whether anomalies of the first target anomaly type have occurred in the target time-series data, thus meeting the requirements for anomaly detection on the target time-series data.

[0063] In some possible implementations, the third determining module 130 can first find the intersection of the first target algorithm set and the second target algorithm set to determine the third target algorithm set. Since the third target algorithm set is the intersection of the first and second target algorithm sets, the anomaly detection algorithms in the third target algorithm set can both perform anomaly detection on the target time-series data and detect anomalies of the first target anomaly type. After determining the third target algorithm set, the third determining module 130 can determine the first target anomaly detection algorithm from the third target algorithm set.
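When the two sets are exchanged as collections of algorithm identifiers, the intersection step reduces to a set operation. The sketch below assumes identifiers are hashable values such as strings; the function name is hypothetical.

```python
def third_target_algorithm_set(first_set, second_set):
    """Return the algorithms present in both target algorithm sets."""
    return set(first_set) & set(second_set)
```

An empty result means no known algorithm satisfies both conditions; how that case is handled is not described in the source.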

[0064] Optionally, the third determining module 130 can select a first target anomaly detection algorithm from a third set of target algorithms based on the efficiency of the anomaly detection algorithm. Specifically, efficiency indicators for different anomaly detection algorithms can be preset in the anomaly detection algorithm determining device 100. The efficiency indicator is used to indicate the efficiency of the anomaly detection algorithm in detecting anomalies, for example, to indicate how long it takes for the anomaly detection algorithm to detect the anomaly after it occurs.

[0065] Next, after determining the third target algorithm set, if the third target algorithm set includes only one anomaly detection algorithm, then the third determining module 130 can determine the unique anomaly detection algorithm in the third target algorithm set as the first target anomaly detection algorithm. If the third target algorithm set includes multiple anomaly detection algorithms, then the third determining module 130 can select the most efficient algorithm from the multiple anomaly detection algorithms as the first target anomaly detection algorithm based on the efficiency index of each anomaly detection algorithm.

[0066] Optionally, a mapping relationship between algorithms and efficiency metrics can be pre-defined; this mapping relationship can be referred to as the third mapping relationship. Thus, after determining the third target algorithm set, the efficiency metric corresponding to each anomaly detection algorithm in the third target algorithm set can be determined based on the third mapping relationship, thereby identifying the most efficient anomaly detection algorithm as the first target anomaly detection algorithm.
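The selection in the two paragraphs above might look like the following sketch. It assumes the third mapping relationship stores, for each algorithm, a detection delay in seconds, so that a lower value means higher efficiency; the source only says that an efficiency indicator exists, so the numeric values, the unit, and the names are illustrative.

```python
# Hypothetical third mapping relationship: algorithm -> detection delay (seconds).
# Assumption: a lower delay means a more efficient algorithm.
THIRD_MAPPING = {"three_sigma": 0.5, "stl_residual": 2.0, "lstm_forecast": 8.0}


def select_first_target_algorithm(third_set, efficiency=THIRD_MAPPING):
    """Pick the most efficient algorithm from the third target algorithm set."""
    candidates = sorted(third_set)  # sorted so that ties resolve deterministically
    if not candidates:
        return None
    if len(candidates) == 1:
        # A single candidate is selected without consulting the mapping.
        return candidates[0]
    # Algorithms missing from the mapping are treated as least efficient.
    return min(candidates, key=lambda algo: efficiency.get(algo, float("inf")))
```

Swapping the key function for an accuracy or throughput indicator would give the alternative selection principles mentioned in the text.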

[0067] It is understandable that in some other possible implementations, if the third target algorithm set includes multiple anomaly detection algorithms, the third determining module 130 can also select the first target anomaly detection algorithm based on other indicators. For example, if the anomaly detection of the target time series data requires high accuracy, the third determining module 130 can determine the anomaly detection algorithm with the highest accuracy in the third target algorithm set as the first target anomaly detection algorithm. Alternatively, if the target time series data has a large amount of data, the third determining module 130 can determine the anomaly detection algorithm in the third target algorithm set that can perform anomaly detection on a large amount of data as the first target anomaly detection algorithm. It should be noted that the above two examples are for illustrative purposes only. In practical application scenarios, the first target anomaly detection algorithm can also be selected based on other principles.

[0068] Alternatively, in some possible implementations, the anomaly detection algorithm determination device 100 can also, via a client module (not shown in Figure 1 or Figure 2), show the technician the anomaly detection algorithms in the third target algorithm set, for example, by displaying the names of the anomaly detection algorithms in the third target algorithm set on the technician's computer device. In this way, the technician can manually select the first target anomaly detection algorithm from the third target algorithm set.

[0069] After determining the first target anomaly detection algorithm, the anomaly detection algorithm determination device 100 can utilize the first...

[0070] As can be seen, in the implementation described above, the anomaly detection algorithm determination device 100 can determine a first target anomaly detection algorithm that meets the conditions based on the target time-series data to be detected and the first target anomaly type to be detected. This way, technicians do not need a full understanding of the characteristics of the target time-series data, nor do they need to learn extensively about anomaly detection algorithms; they can quickly determine the anomaly detection algorithm for detecting the target time-series data. Thus, the efficiency of determining the target anomaly detection algorithm is improved, and the skill threshold required of technicians is lowered.

[0071] In some possible implementations, there may be a need to detect different types of anomalies in the same time-series data. That is, the anomaly detection algorithm determination device 100 is used not only to determine the first target anomaly detection algorithm, but also to determine the second target anomaly detection algorithm.

[0072] For example, in some possible implementations, the acquisition module 140 in the anomaly detection algorithm determination device 100 can also acquire a second target anomaly type and send the second target anomaly type to the second determination module 120. Next, the second determination module 120 can determine a fourth target algorithm set based on the second target anomaly type. The fourth target algorithm set includes at least one anomaly detection algorithm, and the anomaly detection algorithms in the fourth target algorithm set have the ability to detect anomalies of the second target anomaly type. Then, the second determination module 120 sends the fourth target algorithm set to the third determination module 130. The third determination module 130 determines a second target anomaly detection algorithm based on the fourth target algorithm set and the first target algorithm set. The second target anomaly detection algorithm belongs to both the fourth target algorithm set and the first target algorithm set. Thus, the second target anomaly detection algorithm can both detect anomalies in the target time-series data and detect anomalies of the second target anomaly type. For a detailed description of determining the second target anomaly detection algorithm, please refer to the process of determining the first target anomaly detection algorithm in S203, which will not be repeated here.

[0073] Alternatively, in some other possible implementations, the anomaly detection algorithm determination device 100 can also determine multiple target anomaly types. For example, in the implementation described in step S202 above, multiple target anomaly types can be determined based on historical anomaly data. These multiple target anomaly types include a first target anomaly type and a second target anomaly type. Next, the second determination module 120 can determine a fourth target algorithm set based on the second target anomaly type and send the fourth target algorithm set to the third determination module 130, so that the third determination module 130 can determine a second target anomaly detection algorithm based on the fourth target algorithm set.

[0074] After multiple target anomaly detection algorithms are determined, a corresponding detection process can be generated based on these algorithms. Specifically, the anomaly detection algorithm determining device 100 can determine the anomaly detection process based on the multiple target anomaly detection algorithms. The anomaly detection process includes the multiple target anomaly detection algorithms, and these algorithms can be executed in parallel or sequentially. For example, if there are no dependencies between the multiple target anomaly detection algorithms in the anomaly detection process, they can be executed in parallel. Alternatively, if there are dependencies between them, for example, if the second target anomaly detection algorithm depends on the result of the first target anomaly detection algorithm, then the dependent target anomaly detection algorithms can be executed sequentially according to the dependencies, while the independent target anomaly detection algorithms can be executed in parallel.
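One possible way to derive such an execution order is to group the algorithms into stages with a topological sort: algorithms in the same stage have no unmet dependencies and could run in parallel, while later stages wait for earlier ones. This sketch uses Python's standard `graphlib` module; the dependency dictionary format and the function name are assumptions for illustration, not the source's method.

```python
from graphlib import TopologicalSorter


def execution_stages(dependencies):
    """Group algorithms into stages that respect their dependencies.

    dependencies maps each algorithm to the set of algorithms whose results
    it needs (hypothetical format). Algorithms within one stage can run in
    parallel; stages run one after another.
    """
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError if dependencies are circular
    stages = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # all algorithms runnable right now
        stages.append(ready)
        sorter.done(*ready)  # mark them finished to unlock their dependents
    return stages
```

For instance, if algorithm "b" depends on "a" and algorithm "c" depends on nothing, "a" and "c" land in the first stage and "b" in the second.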

[0075] After determining the anomaly detection process, the anomaly detection algorithm determination device 100 can perform anomaly detection on the target time-series data according to the anomaly detection process. Alternatively, the anomaly detection algorithm determination device 100 can also send the anomaly detection process to a detection device (not shown in Figure 1). The sent anomaly detection process includes the multiple target anomaly detection algorithms and the execution order of these algorithms. The detection device can then perform anomaly detection on the target time-series data according to the anomaly detection process.

[0076] It should be noted that the division and functional description of the various modules within the anomaly detection algorithm determination device in this embodiment are merely examples. For instance, in other embodiments, the first determination module 110 can be used to execute any step in the method for determining the anomaly detection algorithm described above. Similarly, the second determination module 120, the third determination module 130, the acquisition module 140, and the feature extraction module 150 can all be used to execute any step in the method for determining the anomaly detection algorithm described above. Furthermore, the steps that the first determination module 110, the second determination module 120, the third determination module 130, the acquisition module 140, and the feature extraction module 150 are each responsible for implementing can be specified as needed. The anomaly detection algorithm determination device achieves all its functions by having these modules respectively implement different steps in the method for determining the anomaly detection algorithm.

[0077] In the embodiment shown in Figure 2 above, the anomaly detection algorithm determination device (including the first determination module 110, the second determination module 120, the third determination module 130, the acquisition module 140, and the feature extraction module 150) involved in determining the anomaly detection algorithm can be software configured on a computing device or a computing device cluster. By running this software on the computing device or the computing device cluster, the computing device or the computing device cluster can implement the functions of the aforementioned anomaly detection algorithm determination device. Below, the anomaly detection algorithm determination device involved in determining the anomaly detection algorithm is described in detail from the perspective of hardware implementation.

[0078] Figure 3 shows a schematic diagram of a computing device. The aforementioned anomaly detection algorithm determination device can be deployed on this computing device. This computing device can be a computing device in a cloud environment (such as a server), a computing device in an edge environment, or a terminal device, etc., and is specifically used to implement the functions of the first determining module 110, the second determining module 120, the third determining module 130, the acquisition module 140, and the feature extraction module 150 in the embodiment shown in Figure 2.

[0079] As shown in Figure 3, the computing device 300 includes a processor 310, a memory 320, a communication interface 330, and a bus 340. The processor 310, memory 320, and communication interface 330 communicate via the bus 340. The bus 340 can be a peripheral component interconnect (PCI) bus or an extended industry standard architecture (EISA) bus, etc. The bus can be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, the bus is represented by a single thick line in Figure 3, but this does not indicate that there is only one bus or one type of bus. Communication interface 330 is used for communication with external systems, such as for acquiring a first dataset and acquiring a second dataset.

[0080] The processor 310 can be a central processing unit (CPU), an application-specific integrated circuit (ASIC), a graphics processing unit (GPU), or one or more integrated circuits. The processor 310 can also be an integrated circuit chip with signal processing capabilities. During implementation, the functions of each module in the anomaly detection algorithm determination device can be completed by integrated logic circuits in the hardware of the processor 310 or by software instructions. The processor 310 can also be a general-purpose processor, a digital signal processor (DSP), a field-programmable gate array (FPGA), or other programmable logic devices, discrete gate or transistor logic devices, or discrete hardware components, capable of implementing or executing the methods, steps, and logic block diagrams disclosed in the embodiments of this application. The general-purpose processor can be a microprocessor or any conventional processor, etc. The methods disclosed in the embodiments of this application can be directly embodied as execution by a hardware decoding processor, or as execution by a combination of hardware and software modules in the decoding processor. The software modules can reside in random access memory, flash memory, read-only memory, programmable read-only memory, electrically erasable programmable memory, registers, or other mature storage media in the art. The storage medium is located in memory 320. Processor 310 reads the information in memory 320 and, in combination with its hardware, completes some or all of the functions of the anomaly detection algorithm determination device.

[0081] Memory 320 may include volatile memory, such as random access memory (RAM). Memory 320 may also include non-volatile memory, such as read-only memory (ROM), flash memory, a hard disk drive (HDD), or a solid-state drive (SSD).

[0082] The memory 320 stores executable code, and the processor 310 executes the executable code to perform the method executed by the aforementioned anomaly detection algorithm determination device.

[0083] Specifically, when the embodiment shown in Figure 2 is implemented, and the first determining module 110, the second determining module 120, the third determining module 130, the acquisition module 140, and the feature extraction module 150 described in the embodiment shown in Figure 2 are implemented in software, the software or program code required to perform the functions of these modules in Figure 2 is stored in the memory 320. The acquisition module 140 interacts with other devices through the communication interface 330. The processor 310 is used to execute the instructions in the memory 320 to implement the method executed by the anomaly detection algorithm determining device.

[0084] Figure 4 shows a schematic structural diagram of a computing device cluster. The computing device cluster 40 shown in Figure 4 includes multiple computing devices, and the aforementioned anomaly detection algorithm determination device can be distributed and deployed across multiple computing devices within the computing device cluster 40. For example, as shown in Figure 4, the computing device cluster 40 includes multiple computing devices 400. Each computing device 400 includes a memory 420, a processor 410, a communication interface 430, and a bus 440. The memory 420, the processor 410, and the communication interface 430 communicate with each other through the bus 440.

[0085] Processor 410 can be a CPU, GPU, ASIC, or one or more integrated circuits. Processor 410 can also be an integrated circuit chip with signal processing capabilities. In implementation, some functions of the anomaly detection algorithm determination device can be completed through integrated logic circuits in the hardware of processor 410 or through software instructions. Processor 410 can also be a DSP, FPGA, general-purpose processor, other programmable logic devices, discrete gate or transistor logic devices, or discrete hardware components, capable of implementing or executing some of the methods, steps, and logic block diagrams disclosed in the embodiments of this application. The general-purpose processor can be a microprocessor or any conventional processor, etc. The steps of the methods disclosed in the embodiments of this application can be directly implemented by a hardware decoding processor, or by a combination of hardware and software modules in the decoding processor. The software modules can be located in random access memory, flash memory, read-only memory, programmable read-only memory, electrically erasable programmable memory, registers, or other mature storage media in the art. This storage medium is located in memory 420. In each computing device 400, processor 410 reads information from memory 420, and in conjunction with its hardware, can complete some functions of the anomaly detection algorithm determination device.

[0086] The memory 420 may include ROM, RAM, static storage devices, dynamic storage devices, hard disks (e.g., SSDs, HDDs), etc. The memory 420 may store program code, such as partial or complete program code for implementing the first determining module 110, partial or complete program code for implementing the second determining module 120, partial or complete program code for implementing the third determining module 130, partial or complete program code for implementing the acquisition module 140, partial or complete program code for implementing the feature extraction module 150, etc. For each computing device 400, when the program code stored in the memory 420 is executed by the processor 410, the processor 410 executes, based on the communication interface 430, a portion of the methods executed by the anomaly detection algorithm determining device. For example, some of the computing devices 400 may be used to execute the method executed by the first determining module 110, and others of the computing devices 400 may be used to execute the method executed by the second determining module 120. The memory 420 may also store data, such as intermediate data or result data generated by the processor 410 during execution, for example the aforementioned first target algorithm set and second target algorithm set.

[0087] The communication interface 430 in each computing device 400 is used to communicate with the outside world, such as interacting with other computing devices 400.

[0088] Bus 440 can be a peripheral component interconnect (PCI) bus or an extended industry standard architecture (EISA) bus, etc. For ease of representation, the bus 440 within each computing device 400 is represented by a single thick line in Figure 4, but this does not mean that there is only one bus or one type of bus.

[0089] The aforementioned multiple computing devices 400 establish communication channels through a communication network to realize the function of the anomaly detection algorithm determination device. Any computing device can be a computing device in a cloud environment (e.g., a server), a computing device in an edge environment, or a terminal device.

[0090] Furthermore, embodiments of this application also provide a computer-readable storage medium storing instructions that, when executed on one or more computing devices, cause the one or more computing devices to perform the methods executed by the various modules of the anomaly detection algorithm determination device provided in the above embodiments.

[0091] Furthermore, this application also provides a computer program product, which, when executed by one or more computing devices, allows the computing devices to execute any of the methods described above for determining the anomaly detection algorithm. This computer program product can be a software installation package; when any of the methods described above for determining the anomaly detection algorithm is required, the computer program product can be downloaded and executed on a computer.

[0092] It should also be noted that the device embodiments described above are merely illustrative. The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the modules can be selected to achieve the purpose of this embodiment according to actual needs. In addition, in the device embodiment drawings provided in this application, the connection relationship between modules indicates that they have a communication connection, which can be implemented as one or more communication buses or signal lines.

[0093] Through the above description of the embodiments, those skilled in the art can clearly understand that this application can be implemented by means of software plus necessary general-purpose hardware, or it can be implemented by special-purpose hardware including application-specific integrated circuits, special-purpose CPUs, special-purpose memory, special-purpose components, etc. Generally, any function performed by a computer program can be easily implemented by corresponding hardware, and the specific hardware structure used to implement the same function can be diverse, such as analog circuits, digital circuits, or special-purpose circuits. However, for this application, software program implementation is more often the preferred implementation method. Based on this understanding, the technical solution of this application, in essence, or the part that contributes to the prior art, can be embodied in the form of a software product. This computer software product is stored in a readable storage medium, such as a computer floppy disk, USB flash drive, mobile hard disk, ROM, RAM, magnetic disk, or optical disk, etc., and includes several instructions to cause a computer device (which may be a personal computer, training equipment, or network device, etc.) to execute the methods described in the various embodiments of this application.

[0094] In the above embodiments, implementation can be achieved, in whole or in part, through software, hardware, firmware, or any combination thereof. When implemented in software, it can be implemented, in whole or in part, as a computer program product.

[0095] The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, all or part of the processes or functions described in the embodiments of this application are generated. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or other programmable device. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another. For example, the computer instructions may be transmitted from one website, computer, training device, or data center to another website, computer, training device, or data center via wired (e.g., coaxial cable, fiber optic, digital subscriber line (DSL)) or wireless (e.g., infrared, radio, microwave, etc.) means. The computer-readable storage medium may be any available medium that a computer can access, or a data storage device, such as a training device or data center, that integrates one or more available media. The available media may be magnetic media (e.g., floppy disks, hard disks, magnetic tapes), optical media (e.g., DVDs), or semiconductor media (e.g., solid-state drives (SSDs)).

Claims

1. A method for determining an anomaly detection algorithm, characterized in that the method is used to determine an anomaly detection algorithm for performing anomaly detection on target time series data, and the method includes: determining a first target algorithm set, wherein the first target algorithm set includes multiple anomaly detection algorithms, the anomaly detection algorithms in the first target algorithm set have the ability to detect anomalies in the target time series data, and the first target algorithm set is determined based on the data characteristics of the target time series data; determining a second target algorithm set based on a first target anomaly type, wherein the second target algorithm set includes multiple anomaly detection algorithms, and the anomaly detection algorithms in the second target algorithm set have the ability to detect anomalies of the first target anomaly type; and determining a first target anomaly detection algorithm based on the first target algorithm set and the second target algorithm set, wherein the first target anomaly detection algorithm belongs to both the first target algorithm set and the second target algorithm set; wherein determining the first target anomaly detection algorithm based on the first target algorithm set and the second target algorithm set includes: determining a third target algorithm set, wherein the third target algorithm set is the intersection of the first target algorithm set and the second target algorithm set; and selecting, from the third target algorithm set, the algorithm with the highest detection efficiency as the first target anomaly detection algorithm.

2. The method according to claim 1, characterized in that the method further includes: obtaining a second target anomaly type; determining a fourth target algorithm set based on the second target anomaly type, wherein the fourth target algorithm set includes multiple anomaly detection algorithms, and the anomaly detection algorithms in the fourth target algorithm set have the ability to detect anomalies of the second target anomaly type; and determining a second target anomaly detection algorithm based on the first target algorithm set and the fourth target algorithm set, wherein the second target anomaly detection algorithm belongs to both the first target algorithm set and the fourth target algorithm set.

3. The method according to claim 2, characterized in that the method further includes: determining an anomaly detection process based on the first target anomaly detection algorithm and the second target anomaly detection algorithm; and performing detection on the target time series data according to the anomaly detection process.

4. The method according to any one of claims 1-3, characterized in that determining the first target algorithm set includes: obtaining historical time series data corresponding to the target time series data; performing feature extraction on the historical time series data to determine the data characteristics of the target time series data; and determining the first target algorithm set based on the data characteristics of the target time series data.

5. An apparatus for determining an anomaly detection algorithm, characterized in that the apparatus is used to determine an anomaly detection algorithm for performing anomaly detection on target time series data, and the apparatus includes: a first determining module, used to determine a first target algorithm set, wherein the first target algorithm set includes multiple anomaly detection algorithms, the anomaly detection algorithms in the first target algorithm set have the ability to detect anomalies in the target time series data, and the first target algorithm set is determined based on the data characteristics of the target time series data; a second determining module, used to determine a second target algorithm set based on a first target anomaly type, wherein the second target algorithm set includes multiple anomaly detection algorithms, and the anomaly detection algorithms in the second target algorithm set have the ability to detect anomalies of the first target anomaly type; and a third determining module, used to determine a first target anomaly detection algorithm based on the first target algorithm set and the second target algorithm set, wherein the first target anomaly detection algorithm belongs to both the first target algorithm set and the second target algorithm set; wherein the third determining module is specifically used to determine a third target algorithm set, which is the intersection of the first target algorithm set and the second target algorithm set, and to select, from the third target algorithm set, the algorithm with the highest detection efficiency as the first target anomaly detection algorithm.

6. The apparatus according to claim 5, characterized in that the apparatus further includes a first acquisition module; the first acquisition module is used to acquire a second target anomaly type; the second determining module is further configured to determine a fourth target algorithm set based on the second target anomaly type, the fourth target algorithm set including multiple anomaly detection algorithms, and the anomaly detection algorithms in the fourth target algorithm set having the ability to detect anomalies of the second target anomaly type; and the third determining module is further configured to determine a second target anomaly detection algorithm based on the first target algorithm set and the fourth target algorithm set, wherein the second target anomaly detection algorithm belongs to both the first target algorithm set and the fourth target algorithm set.

7. The apparatus according to claim 6, characterized in that the third determining module is further configured to determine an anomaly detection process based on the first target anomaly detection algorithm and the second target anomaly detection algorithm, and to detect the target time series data according to the anomaly detection process.

8. The apparatus according to any one of claims 5-7, characterized in that the apparatus further includes a second acquisition module; the second acquisition module is used to acquire historical time series data corresponding to the target time series data; and the first determining module is specifically used to perform feature extraction on the historical time series data to determine the data characteristics of the target time series data, and to determine the first target algorithm set based on the data characteristics of the target time series data.

9. A computing device cluster, characterized in that the computing device cluster includes at least one computing device, each computing device including a processor and a memory; the memory is used to store instructions; and the processor is configured to, according to the instructions, cause the computing device cluster to perform the method of any one of claims 1-4.

10. A computer-readable storage medium, characterized in that, The computer-readable storage medium stores instructions that, when executed on a computing device, cause the computing device to perform the method as described in any one of claims 1-4.

11. A computer program product comprising instructions that, when run on a computing device, cause the computing device to perform the method as described in any one of claims 1-4.
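
For illustration only, the selection steps recited in claims 1 to 3 might be sketched in Python as follows. This is a minimal, non-limiting sketch under stated assumptions, not the implementation of this application: the `Detector` record, the catalog entries, the feature and anomaly-type labels, and the numeric `efficiency` score are all hypothetical, and the claims do not specify how detection efficiency is measured, how an empty intersection is handled, or what form the anomaly detection process takes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Detector:
    """Hypothetical catalog entry describing one anomaly detection algorithm."""
    name: str
    data_features: frozenset  # data characteristics the algorithm can handle
    anomaly_types: frozenset  # anomaly types the algorithm can detect
    efficiency: float         # assumed efficiency score, higher is better


def select_detector(catalog, series_features, anomaly_type):
    """Return the most efficient detector in the intersection, or None."""
    # First target algorithm set: chosen from the data characteristics.
    first_set = {d for d in catalog if series_features <= d.data_features}
    # Second target algorithm set: chosen from the target anomaly type.
    second_set = {d for d in catalog if anomaly_type in d.anomaly_types}
    # Third target algorithm set: intersection of the two sets.
    third_set = first_set & second_set
    if not third_set:
        return None  # assumption: the claims do not cover an empty intersection
    # The name is used only to make ties deterministic (an assumption).
    return max(third_set, key=lambda d: (d.efficiency, d.name))


def build_detection_process(catalog, series_features, anomaly_types):
    """Assumed form of the process: one (anomaly type, detector) step per type."""
    process = []
    for anomaly_type in anomaly_types:
        detector = select_detector(catalog, series_features, anomaly_type)
        if detector is not None:
            process.append((anomaly_type, detector.name))
    return process


# Hypothetical catalog; names, labels, and scores are illustrative only.
CATALOG = [
    Detector("three_sigma", frozenset({"stationary"}),
             frozenset({"spike"}), 0.9),
    Detector("stl_residual", frozenset({"stationary", "seasonal"}),
             frozenset({"spike", "level_shift"}), 0.6),
    Detector("cusum", frozenset({"stationary", "seasonal"}),
             frozenset({"level_shift"}), 0.8),
    Detector("isolation_forest", frozenset({"stationary", "seasonal", "trend"}),
             frozenset({"spike"}), 0.7),
]
```

With this hypothetical catalog, calling `build_detection_process(CATALOG, {"seasonal"}, ["spike", "level_shift"])` selects one detector per anomaly type from the algorithms that can handle seasonal data. In this sketch the data characteristics are passed in directly; deriving them from historical time series data by feature extraction, as in claim 4, is not shown.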