An abnormality attribution method, device, storage medium, electronic device and product

By constructing an anomaly attribution system based on multi-layer decision trees and language models in a financial trading system, the problem of low efficiency in fault location in existing technologies is solved, enabling accurate location of anomaly sources and root cause analysis, thereby improving system stability and response speed.

CN122285352A · Pending · Publication Date: 2026-06-26 · ALIPAY (HANGZHOU) INFORMATION TECH CO LTD

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
ALIPAY (HANGZHOU) INFORMATION TECH CO LTD
Filing Date
2026-03-26
Publication Date
2026-06-26

AI Technical Summary

Technical Problem

Existing fault detection and root cause analysis technologies struggle to uncover the root causes of anomalies in financial trading systems, resulting in inefficient localization and an inability to meet the demands for rapid response and processing.

Method used

By acquiring and parsing anomaly logs, a multi-layer decision tree is constructed to determine feature distribution information. This information, combined with business change information, is input into a language model-based anomaly attribution system to achieve accurate location of anomaly sources and root cause analysis.

Benefits of technology

It significantly improves the efficiency and response speed of business anomaly location, breaks through the limitations of traditional rule engines, and realizes the leap from locating the source of anomalies to tracing the root cause, thereby improving the accuracy and efficiency of analysis.

✦ Generated by Eureka AI based on patent content.


Abstract

This specification provides an anomaly attribution method, apparatus, storage medium, electronic device, and product. In this method, anomaly logs of a target service are acquired and parsed to determine the characteristic distribution information of the target service; service change information of the target service is acquired, and the characteristic distribution information and service change information are input into a language model-based anomaly attribution system to determine the anomaly attribution information of the target service; wherein, the anomaly attribution information includes: anomaly source features determined based on the characteristic distribution information, and service change information associated with the anomaly source features, the anomaly source features being used to characterize the source of the service anomaly. This solution effectively improves the efficiency and response speed of service anomaly location.

Description

Technical Field

[0001] One or more embodiments of this specification relate to the field of computer technology, and more particularly to an anomaly attribution method, apparatus, storage medium, electronic device, and product.

Background Technology

[0002] As digital transformation deepens, the demands on business systems are constantly increasing. Taking financial trading scenarios as an example, the business system is the trading system. As the core infrastructure supporting users' investment decisions and the execution of trading strategies, the stability of the trading system directly affects users' trading experience and fund security. Therefore, accurate detection and root cause analysis of business system failures have become a key technical requirement for ensuring the reliable operation of the system.

[0003] However, existing fault detection and root cause analysis technologies still have significant limitations. The rule-based fault detection and log analysis tools currently in use can only locate the source of a problem and struggle to uncover the root cause of the anomaly. They often require manual intervention or rely on preset rules to assist the analysis, which makes problem localization inefficient and fails to meet the actual needs of business systems for rapid fault response and handling.

Summary of the Invention

[0004] In view of the above, one or more embodiments of this specification provide the following technical solutions:

According to a first aspect of one or more embodiments of this specification, an anomaly attribution method is proposed, comprising:

obtaining the anomaly logs of the target business and parsing the anomaly logs to determine the feature distribution information of the target business;

obtaining the business change information of the target business, and inputting the feature distribution information and the business change information into a language model-based anomaly attribution system to determine the anomaly attribution information of the target business through the anomaly attribution system; wherein the anomaly attribution information includes: anomaly source features determined based on the feature distribution information, and business change information associated with the anomaly source features, wherein the anomaly source features are used to characterize the source of the business anomaly.

[0005] According to a second aspect of one or more embodiments of this specification, an anomaly attribution apparatus is provided, comprising:

a parsing module, used to obtain the anomaly logs of the target business and parse the anomaly logs to determine the feature distribution information of the target business;

an attribution module, used to obtain the business change information of the target business and input the feature distribution information and the business change information into a language model-based anomaly attribution system to determine the anomaly attribution information of the target business through the anomaly attribution system; wherein the anomaly attribution information includes: anomaly source features determined based on the feature distribution information, and business change information associated with the anomaly source features, wherein the anomaly source features are used to characterize the source of the business anomaly.

[0006] According to a third aspect of one or more embodiments of this specification, an electronic device is provided, comprising: a processor; and a memory for storing processor-executable instructions; wherein the processor executes the executable instructions to implement the steps of the above-described anomaly attribution method.

[0007] According to a fourth aspect of one or more embodiments of this specification, a computer-readable storage medium is provided that stores computer instructions thereon, which, when executed by a processor, implement the steps of the above-described anomaly attribution method.

[0008] According to a fifth aspect of one or more embodiments of this specification, a computer program product is provided, comprising a computer program / instructions that, when executed by a processor, implement the steps of the above-described anomaly attribution method.

[0009] As can be seen from the above embodiments, this specification obtains the anomaly logs of the target service and parses the anomaly logs to determine the feature distribution information of the target service; obtains the service change information of the target service, and inputs the feature distribution information and the service change information into a language model-based anomaly attribution system to determine the anomaly attribution information of the target service through the anomaly attribution system; wherein, the anomaly attribution information includes: anomaly source features determined based on feature distribution information, and service change information associated with anomaly source features, the anomaly source features being used to characterize the source of the service anomaly.

[0010] This method acquires business anomaly logs and business change information. When a business anomaly occurs, it first accurately locates the feature distribution information causing the anomaly based on the anomaly logs. Then, using an anomaly attribution system, it performs correlation analysis between the input feature distribution information and the business change information, ultimately identifying the anomaly source features and mining the business change information related to them. Compared with traditional solutions, this solution, after completing the initial anomaly source location, relies on the semantic understanding and correlation reasoning capabilities of the anomaly attribution system to parse the feature distribution information and establish an association between it and the business change information. This overcomes the limitations of traditional rule engines and moves from locating the anomaly source to tracing the root cause, so that business anomaly attribution analysis is completed with higher accuracy and efficiency, and the efficiency and response speed of business anomaly location are improved.

Attached Figure Description

[0011] Figure 1 is a flowchart of an anomaly attribution method provided in an exemplary embodiment;

Figure 2 is a schematic diagram of a multi-layer decision tree structure provided in an exemplary embodiment;

Figure 3 is a flowchart of the construction process of a multi-level decision tree provided in an exemplary embodiment;

Figure 4 is a schematic diagram of a multi-level decision tree based on principal component analysis provided in an exemplary embodiment;

Figure 5 is a schematic diagram of a multi-layer decision tree based on anomaly analysis provided in an exemplary embodiment;

Figure 6 is a schematic diagram of an anomaly attribution system architecture provided in an exemplary embodiment;

Figure 7 is a schematic diagram of the structure of a device provided in an exemplary embodiment;

Figure 8 is a block diagram of an anomaly attribution apparatus provided in an exemplary embodiment.

Detailed Implementation

[0012] To make the objectives, technical solutions, and advantages of this specification clearer, the technical solutions of this specification will be clearly and completely described below in conjunction with specific embodiments and corresponding drawings. Obviously, the described embodiments are only a part of the embodiments of this specification, and not all of them. Based on the embodiments in this specification, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of this specification.

[0013] As digital transformation continues to accelerate, business systems face significant challenges. Particularly in financial trading scenarios, the stability and responsiveness of trading systems not only affect user experience but also directly impact the order and security of the financial market. Any anomalies in the trading system can severely damage the user's trading experience, disrupt investment decisions, and even trigger systemic risks.

[0014] Against this backdrop, achieving rapid fault analysis and accurate root cause localization has become a core issue in ensuring the stable operation of trading systems. However, despite the industry's increasing focus on system stability, current fault detection and root cause analysis technologies still have significant shortcomings. Most technologies currently rely on fixed detection rules or simple log analysis. Fixed detection rules typically only identify anomalies that perfectly match preset patterns. However, financial trading systems have complex architectures and diverse business scenarios, and the causes of faults often exhibit "combinatorial" and "variant" characteristics. Once new anomalies occur, fixed rules cannot flexibly adapt to newly emerging fault patterns, leading to a large number of "non-standard" faults being missed or misjudged. Furthermore, simple log analysis techniques can only uncover surface information and cannot delve into the potential relationships between system components, resulting in one-sided analysis and difficulty in accurately locating the root cause of the fault. This fails to meet the stringent requirements of trading systems for timeliness and accuracy.

[0015] Based on this, this specification provides an anomaly attribution method. This method integrates machine learning and natural language processing techniques to mine the feature distribution information of business anomalies in anomaly logs, determine the source of business anomalies and change information with obvious correlation, and further ensure the stable and continuous operation of business systems.

[0016] The technical solutions provided in the various embodiments of this specification are described in detail below with reference to the accompanying drawings.

[0017] Figure 1 is a flowchart of an anomaly attribution method provided in an exemplary embodiment, including:

S100: Obtain the anomaly log of the target service and parse the anomaly log to determine the feature distribution information of the target service.

[0018] In this specification, the execution subject for implementing an anomaly attribution method can be a designated device such as a server, or it can be a computing platform, an edge computing device, or a cluster of intelligent terminals that integrates relevant computing and analysis capabilities. For ease of description, the following will use a server as the execution subject to illustrate an anomaly attribution method provided in this specification.

[0019] When a business operation experiences an anomaly, the server can retrieve the anomaly log for that business. This log records the anomaly events that occurred during the actual operation of the business system, including key information such as the timestamp of the anomaly and the triggering scenario. Simultaneously, the log also records corresponding candidate anomaly characteristics, including the client version number, operating system type, and identifier of the interfacing organization. This data can be structured using unique identifiers (such as globally unique IDs or serial numbers).

[0020] The anomaly logs acquired by the server can include the current log, corresponding to the current business anomaly, and the historical log, corresponding to previous business anomalies. In this specification, the server can acquire anomaly logs at various times. For example, when the server detects that a business anomaly has occurred N times within a specified time window, it can acquire the anomaly logs related to that business anomaly. In this case, the current log consists of the anomaly logs from the first through the Nth occurrence of the business anomaly within the current time window, and the historical log consists of the anomaly logs within the historical time window up to the time of day at which the Nth occurrence took place in the current time window.

[0021] Taking a 24-hour time window as an example, the historical time window is the previous time window of the current time window. The current time window can be from 0:00 to 24:00 of the current day, and the historical time window can be from 0:00 to 24:00 of the previous day. If the time corresponding to the Nth occurrence of a business anomaly within the current time window is 11:00, then the historical log will be the anomaly log corresponding to the business anomaly from 00:00 to 11:00 of the previous day.
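The early-trigger case described above can be sketched as follows. This is a minimal illustration, assuming 24-hour windows aligned to midnight; the function names `nth_occurrence` and `log_ranges` and the sample timestamps are hypothetical and do not come from the specification.

```python
from datetime import datetime, timedelta

WINDOW = timedelta(hours=24)  # assumed window length, as in the 24-hour example


def nth_occurrence(timestamps, window_start, n):
    """Return the time of the Nth anomaly inside the window, or None."""
    window_end = window_start + WINDOW
    hits = sorted(t for t in timestamps if window_start <= t < window_end)
    return hits[n - 1] if len(hits) >= n else None


def log_ranges(window_start, trigger_time):
    """Return (current_range, historical_range) as (start, end) pairs.

    The historical range covers the same span of the previous window,
    ending at the time of day of the Nth occurrence.
    """
    current = (window_start, trigger_time)
    historical = (window_start - WINDOW, trigger_time - WINDOW)
    return current, historical


# Hypothetical anomaly timestamps for one business anomaly.
today = datetime(2026, 3, 26)
anomalies = [today.replace(hour=h) for h in (9, 10, 11, 15)]

trigger = nth_occurrence(anomalies, today, n=3)  # third occurrence, at 11:00
current_range, historical_range = log_ranges(today, trigger)
print(historical_range)
```

With N = 3, the trigger falls at 11:00, so the historical range runs from 00:00 to 11:00 of the previous day, matching the example in the text.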

[0022] As another example, when the server detects a business anomaly and the end time of the time window has been reached, it can retrieve the anomaly logs related to the business anomaly. In this case, the current log contains all anomaly logs corresponding to the business anomaly within the current time window, and the historical log contains all anomaly logs corresponding to the business anomaly within the historical time window.

[0023] Taking a 24-hour time window as an example, the historical time window is the previous time window of the current time window. The current time window can be from 0:00 to 24:00 of the current day, and the historical time window can be from 0:00 to 24:00 of the previous day. If the end time of the time window is 24:00, then the current log contains all logs of the service anomalies within the period from 0:00 to 24:00 of the current day, and the historical log contains all logs of the service anomalies within the period from 0:00 to 24:00 of the previous day.

[0024] In this specification, the aforementioned business anomalies may include page loading failure, transaction timeout, data verification failure, API call anomaly, system crash, service circuit breaker, network connection interruption, data transmission error, permission verification failure, business process blockage, resource loading anomaly, etc.

[0025] The aforementioned candidate anomaly features can be understood as characteristics of suspected anomalies in the anomaly events recorded in the anomaly log. These candidate anomaly features can be categorized according to different anomaly feature dimensions. Taking client version, operating system, and interfacing organization as examples, each of these corresponds to an anomaly feature dimension. A specific version number of the client version corresponds to a candidate anomaly feature under the client version dimension, a specific type identifier of the operating system corresponds to a candidate anomaly feature under the operating system dimension, and a specific organization identifier of the interfacing organization corresponds to a candidate anomaly feature under the interfacing organization dimension. Of course, in practical applications, other anomaly feature dimensions can also be used to categorize the candidate anomaly features in the anomaly log, such as business operation dimension, system status dimension, data interaction dimension, security attribute dimension, and geographical and time dimension.
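The relationship between anomaly feature dimensions and candidate anomaly features can be illustrated with a small sketch. It assumes each parsed log entry is a flat record keyed by dimension; the field names (`client_version`, `os`, `institution`) and the sample values are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical parsed anomaly log records; one key per anomaly feature dimension.
logs = [
    {"client_version": "10.3.1", "os": "Android", "institution": "ORG_A"},
    {"client_version": "10.3.1", "os": "iOS", "institution": "ORG_A"},
    {"client_version": "10.2.0", "os": "Android", "institution": "ORG_B"},
]


def count_features(records):
    """Map each dimension to a Counter of its candidate anomaly features."""
    counts = defaultdict(Counter)
    for record in records:
        for dimension, value in record.items():
            counts[dimension][value] += 1
    return counts


feature_counts = count_features(logs)
for dimension, counter in feature_counts.items():
    print(dimension, dict(counter))
```

Each key of `feature_counts` is a dimension, and each value counted under it is one candidate anomaly feature with its number of occurrences.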

[0026] The aforementioned target business may refer to financial transaction business, or it may be other businesses such as logistics management, intelligent manufacturing, cloud computing services, autonomous driving, etc. This specification does not make specific limitations on the above-mentioned abnormal feature dimensions, abnormal features, business abnormalities, and the contents included in the target business.

[0027] After obtaining the exception logs of the target service, the server can parse the exception logs to determine the characteristic distribution information of the target service.

[0028] The server can segment features based on log information in the anomaly log to construct a feature space, obtain each candidate anomaly feature, and determine the anomaly contribution score corresponding to each candidate anomaly feature. The anomaly contribution score is used to characterize the degree of correlation between the candidate anomaly feature and the business anomaly. Then, based on the anomaly contribution score, each candidate anomaly feature is arranged to determine the feature distribution information, wherein the feature distribution information contains at least some of the arranged candidate anomaly features.

[0029] In this specification, there are several ways to determine the anomaly contribution score corresponding to each candidate anomaly feature, including: Method 1: For each candidate anomaly feature, the server can determine the anomaly contribution score corresponding to that candidate anomaly feature based on the proportion of its occurrences in the current log, and thus determine the anomaly contribution score for each candidate anomaly feature. The proportion mentioned above is positively correlated with the anomaly contribution score.

[0030] Method 2: For each candidate anomaly feature, the server can determine the rate of change of the number of occurrences of the candidate anomaly feature based on the number of times the candidate anomaly feature appears in the current log and the number of times the candidate anomaly feature appears in the historical log. For example, if the candidate anomaly feature appears X times in the current log and Y times in the historical log, then its corresponding rate of change is X / Y.

[0031] The server can then determine the anomaly contribution score corresponding to the candidate anomaly feature based on the rate of change and the proportion of the candidate anomaly feature's occurrences in the current log. The rate of change is positively correlated with the anomaly contribution score; the server can determine the anomaly contribution score by multiplying the proportion by the rate of change.

[0032] Method 3: The server can determine the first contribution score of the candidate anomaly feature based on the proportion of its occurrences in the current log, and determine the second contribution score based on the aforementioned rate of change and the proportion of its occurrences in the current log. This results in an anomaly contribution score that includes both the first and second contribution scores.

[0033] Method 3 is the preferred option. The method for determining the above proportions will be described in detail below, and will not be elaborated on here.
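The three scoring methods can be sketched as below. This is an illustration only: it uses the rate of change X / Y as stated for Method 2 (the worked example given later for Figure 5 instead divides the difference between the two counts by the historical count), and the guard against a zero historical count, the function names, and the sample counts are assumptions of this sketch.

```python
def method_one(current_count, current_total):
    """Method 1: proportion of the feature's occurrences in the current log."""
    return current_count / current_total


def method_two(current_count, current_total, historical_count):
    """Method 2: proportion multiplied by the rate of change X / Y."""
    # Guarding against a zero historical count is an assumption of this sketch.
    rate_of_change = current_count / max(historical_count, 1)
    return method_one(current_count, current_total) * rate_of_change


def method_three(current_count, current_total, historical_count):
    """Method 3: keep both the first and the second contribution score."""
    return (
        method_one(current_count, current_total),
        method_two(current_count, current_total, historical_count),
    )


# Hypothetical occurrence counts under the client version dimension.
current = {"10.3.1": 80, "10.2.0": 20}
historical = {"10.3.1": 10, "10.2.0": 20}
total = sum(current.values())

scores = {
    feature: method_three(count, total, historical.get(feature, 0))
    for feature, count in current.items()
}
# Rank candidate anomaly features by the second contribution score.
ranked = sorted(scores, key=lambda f: scores[f][1], reverse=True)
print(ranked)
```

Here version 10.3.1 ranks first under both scores, but the second score separates the two versions far more sharply because it also reflects the jump relative to the historical log.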

[0034] Furthermore, the server can determine the aforementioned feature distribution information by constructing a multi-layer decision tree.

[0035] Specifically, the server can identify candidate anomaly features under different anomaly feature dimensions from the anomaly log. Then, based on the candidate anomaly features under each anomaly feature dimension, a multi-layer decision tree is constructed to determine the feature distribution information. For ease of understanding, this specification provides a schematic diagram of a multi-layer decision tree structure, as shown in Figure 2.

[0036] Figure 2 is a schematic diagram of a multi-layer decision tree structure provided in an exemplary embodiment.

[0037] In this multi-layer decision tree, each decision layer corresponds to an anomaly feature dimension, a decision node in each decision layer corresponds to a candidate anomaly feature under the corresponding anomaly feature dimension of that decision layer, and a decision node in each decision layer is associated with at least one decision node in its next decision layer.

[0038] In the process of constructing the multi-layer decision tree, the server can construct each decision layer in sequence. When constructing a decision layer, the server can select, from the remaining anomaly feature dimensions, the one with the largest information gain value as the anomaly feature dimension corresponding to that decision layer. The information gain value characterizes the degree of influence of the anomaly feature dimension on the anomaly of the target business. The remaining anomaly feature dimensions are those that were not selected when constructing the preceding decision layers.

[0039] In practical applications, the server can calculate the information gain value corresponding to each anomaly feature dimension using a preset random forest (RF) classifier. The server can input the anomaly information of the business anomaly (such as the content of the anomaly alarm) and the anomaly log into the RF classifier, so that the RF classifier determines, from the remaining anomaly feature dimensions that have not yet been selected, the one with the greatest impact on the business anomaly. The decision layer is then constructed from the candidate anomaly features under that dimension.

[0040] As shown in Figure 2, when constructing decision layer 2, the interfacing organization dimension has already been selected for the previous decision layer (decision layer 1), so the remaining anomaly feature dimensions are the client version dimension and the operating system dimension. The RF classifier then determines that the client version dimension has the largest information gain value, so it is used as the anomaly feature dimension corresponding to decision layer 2.

[0041] In addition, the server can select the N anomaly feature dimensions with the largest information gain values to construct the multi-layer decision tree. The depth (number of decision layers) of this multi-layer decision tree is equal to N.
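Selecting dimensions by information gain can be illustrated with a short sketch. Note the substitution: the specification obtains the information gain value from a preset RF classifier, whereas this sketch computes the textbook Shannon-entropy information gain directly. It also assumes each record carries a boolean `is_anomaly` label; that label, the field names, and the sample records are hypothetical.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """Shannon entropy (in bits) of a list of labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(records, dimension, label_key="is_anomaly"):
    """Entropy reduction obtained by splitting the records on one dimension."""
    groups = defaultdict(list)
    for record in records:
        groups[record[dimension]].append(record[label_key])
    remainder = sum(len(g) / len(records) * entropy(g) for g in groups.values())
    return entropy([r[label_key] for r in records]) - remainder


def top_dimensions(records, dimensions, n):
    """Return the n dimensions with the largest information gain, best first."""
    ranked = sorted(dimensions, key=lambda d: information_gain(records, d), reverse=True)
    return ranked[:n]


# Hypothetical labelled records: anomalies follow the institution, not the OS.
records = [
    {"institution": "ORG_A", "os": "Android", "is_anomaly": True},
    {"institution": "ORG_A", "os": "iOS", "is_anomaly": True},
    {"institution": "ORG_B", "os": "Android", "is_anomaly": False},
    {"institution": "ORG_B", "os": "iOS", "is_anomaly": False},
]

print(top_dimensions(records, ["os", "institution"], n=1))
```

In this toy data the institution dimension separates anomalous from normal records completely, so it is chosen for the first decision layer, while the operating system dimension carries no information.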

[0042] Furthermore, when constructing each decision layer, the server can determine the anomaly contribution score corresponding to each candidate anomaly feature under the anomaly feature dimension of that decision layer. For each decision layer, if multiple decision nodes in that decision layer belong to the same subtree in the multi-layer decision tree, then the multiple decision nodes are arranged in descending order of the anomaly contribution score corresponding to each decision node.

[0043] Taking Figure 2 as an example, decision node 1-1, decision node 2-1, and decision node 2-2 belong to the same subtree (hereinafter referred to as subtree A), and decision node 2-1 and decision node 2-2 are located in the same decision layer. In this case, the server can determine the anomaly contribution scores corresponding to decision node 2-1 and decision node 2-2 respectively, and arrange the positions of decision node 2-1 and decision node 2-2 in decision layer 2 according to their respective contribution scores.

[0044] Decision nodes 1-2, 2-3, and 2-4 belong to the same subtree (hereinafter referred to as subtree B), and decision nodes 2-3 and 2-4 are located in the same decision layer. At this time, the server can determine the abnormal contribution scores of decision nodes 2-3 and 2-4 respectively, and arrange the positions of decision nodes 2-3 and 2-4 in decision layer 2 according to their respective contribution scores.

[0045] The server can arrange decision nodes with higher anomaly contribution scores on the same side of their respective subtrees (e.g., the leftmost or rightmost side). In Figure 2, the anomaly contribution score of decision node 2-1 is greater than that of decision node 2-2, so decision node 2-1 is placed on the leftmost side of subtree A. Similarly, the anomaly contribution score of decision node 2-3 is greater than that of decision node 2-4, so decision node 2-3 is placed on the leftmost side of subtree B.

[0046] Therefore, the candidate anomaly features corresponding to each decision node located on the far left of the entire decision tree are the candidate anomaly features with the largest anomaly contribution scores under each anomaly feature dimension.

[0047] Furthermore, during the construction of each decision layer, the server can prune the decision nodes ranked after a specified position in the sorted order and construct the decision layer from the remaining decision nodes. This removes redundant candidate anomaly features, improving the accuracy and efficiency of the analysis.
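The sort-then-prune step for sibling decision nodes can be sketched as follows. The function name `rank_and_prune`, the `keep` parameter standing in for the specified position, and the sample scores are hypothetical.

```python
def rank_and_prune(siblings, keep):
    """Sort sibling nodes by anomaly contribution score and keep the top ones.

    `siblings` is a list of (feature, score) pairs that share one parent node;
    `keep` is the specified position after which nodes are pruned.
    """
    ranked = sorted(siblings, key=lambda item: item[1], reverse=True)
    return ranked[:keep]


# Hypothetical sibling nodes under one parent in the client version layer.
siblings = [("10.2.0", 0.10), ("10.3.1", 0.85), ("10.1.0", 0.05)]
kept = rank_and_prune(siblings, keep=2)
print(kept)  # the highest-scoring node comes first (the "leftmost" position)
```

The first element of the result corresponds to the leftmost node of the subtree; everything after position `keep` is dropped before the layer is built.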

[0048] To facilitate understanding, this specification provides a flowchart of the construction of a multi-level decision tree, as shown in Figure 3.

[0049] Figure 3 is a flowchart of the construction process of a multi-level decision tree provided in an exemplary embodiment.

[0050] In this process, when constructing each decision layer of the multi-layer decision tree, the server can use a classifier to calculate the information gain value corresponding to each anomaly feature dimension. It then selects the anomaly feature dimension with the highest information gain value and, based on the anomaly contribution score of each candidate anomaly feature under that dimension, sorts and prunes the decision nodes in that layer. This process is repeated for the N anomaly feature dimensions with the highest information gain values until a preset depth M is reached or no anomaly feature dimensions remain, at which point the multi-layer decision tree is output. The classifier can be a random forest (RF) classifier or another classifier; this specification does not limit the classifier used.
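The construction loop can be sketched as a short recursive function. This is an illustration under several assumptions: the dimensions are passed in already ordered by information gain, nodes are scored by their current proportion only, and the node layout (a dictionary with `feature`, `count`, and `children` keys), the `keep` parameter, and the sample records are hypothetical.

```python
from collections import Counter


def build_tree(records, dimensions, total, max_depth, keep=2):
    """Build one decision layer per dimension and return its nodes, best first.

    `dimensions` is assumed to be ordered by information gain already;
    `max_depth` plays the role of the preset depth M.
    """
    if not dimensions or max_depth == 0 or not records:
        return []
    dimension, rest = dimensions[0], dimensions[1:]
    counts = Counter(record[dimension] for record in records)
    nodes = []
    # most_common sorts by count, which orders siblings by current proportion;
    # taking only `keep` entries prunes the lower-ranked siblings.
    for feature, count in counts.most_common(keep):
        subset = [r for r in records if r[dimension] == feature]
        nodes.append({
            "dimension": dimension,
            "feature": feature,
            "count": count,
            "current_proportion": count / len(records),
            "overall_proportion": count / total,
            "children": build_tree(subset, rest, total, max_depth - 1, keep),
        })
    return nodes


# Hypothetical current-log records.
records = (
    [{"institution": "ORG_A", "client_version": "10.3.1"}] * 3
    + [{"institution": "ORG_A", "client_version": "10.2.0"}]
    + [{"institution": "ORG_B", "client_version": "10.3.1"}]
)
tree = build_tree(records, ["institution", "client_version"], len(records), max_depth=2)
leftmost = tree[0]["children"][0]
print(tree[0]["feature"], leftmost["feature"], leftmost["current_proportion"])
```

Following the first node at every layer gives the leftmost path of the tree, here ORG_A followed by client version 10.3.1.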

[0051] Furthermore, the multi-layer decision trees obtained from the anomaly contribution scores determined by Method 1 and Method 2 also differ. For ease of understanding, building on Figure 2, this specification provides a schematic diagram of a multi-level decision tree based on principal component analysis, corresponding to Method 1, and a schematic diagram of a multi-layer decision tree based on anomaly analysis, corresponding to Method 2, as shown in Figure 4 and Figure 5.

[0052] Figure 4 is a schematic diagram of a multi-level decision tree based on principal component analysis provided in an exemplary embodiment.

[0053] For each decision node, its proportion in the anomaly log can include the current proportion and the overall proportion. For each decision node, the overall proportion corresponding to the decision node (candidate anomaly feature) can be calculated based on the number of times the candidate anomaly feature corresponding to the decision node appears in the current log and the total number of anomaly logs. The current proportion corresponding to the decision node can be determined based on the number of times the candidate anomaly feature corresponding to the decision node appears in the current log and the number of times the candidate anomaly feature corresponding to the decision node in the previous decision layer appears in the current log.

[0054] Taking decision node 2-1 as an example, its occurrence count is 97, and the total number of abnormal logs is 191. Therefore, its overall proportion is 97 / 191=50.79%. The decision node 1-1 in the previous decision layer has an occurrence count of 114, so its current proportion is 97 / 114=85.09%.
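The two proportions in this example can be reproduced with a few lines of code. The function name `proportions` is hypothetical; the counts are the ones given in the text.

```python
def proportions(node_count, parent_count, total_logs):
    """Return (overall proportion, current proportion) for one decision node."""
    overall = node_count / total_logs    # share of all anomaly logs
    current = node_count / parent_count  # share within the parent node
    return overall, current


# Decision node 2-1: 97 occurrences, parent node 1-1 has 114, 191 logs in total.
overall, current = proportions(97, 114, 191)
print(f"{overall:.2%} {current:.2%}")  # 50.79% 85.09%
```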

[0055] In this specification, the server can determine the abnormal contribution score corresponding to the decision node based on the current proportion, or it can determine the abnormal contribution score corresponding to the decision node based on the overall proportion.

[0056] Preferably, the server can determine the anomaly contribution score based on the current proportion of each decision node (i.e., only the current proportion is considered when sorting the decision nodes in each layer). In that case, the candidate anomaly features corresponding to the leftmost decision nodes of the multi-level decision tree are the candidate anomaly features most strongly correlated with the business anomaly.

[0057] In addition, the server can return the constructed multi-level decision tree to the user. The information in the multi-level decision tree can include the current proportion and the overall proportion of each decision node, thus providing a reference for the user.

[0058] Figure 5 is a schematic diagram of a multi-layer decision tree based on anomaly analysis provided in an exemplary embodiment.

[0059] Here, the overall proportion and the current proportion of each decision node are determined in the same way as in Figure 4. In addition, for each decision node, the comparison value is the number of times the candidate anomaly feature corresponding to that decision node appears in the historical log, the difference is the number of times it appears in the current log minus the number of times it appears in the historical log, and the rate of change is the ratio of the difference to the comparison value. The anomaly contribution score of the decision node is the product of its current proportion and its rate of change. The candidate anomaly features corresponding to the leftmost decision nodes of the multi-level decision tree are the candidate anomaly features most strongly correlated with the business anomaly.

[0060] Taking decision node 2-1 as an example, its occurrence count in the current log is 97 and its occurrence count in the historical log is 15. The difference between the two is 82, the rate of change is 82 / 15 ≈ 546.67%, and, with a current proportion of 85.09%, the anomaly contribution score is 546.67% × 85.09% ≈ 4.65.
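
The arithmetic for decision node 2-1 can be reproduced with a short sketch. The function name and the handling of a zero historical count are assumptions made for illustration; the specification does not say how a feature that is absent from the historical log is treated.

```python
def anomaly_contribution_score(current_count: int,
                               historical_count: int,
                               current_proportion: float) -> float:
    """Method-two score: rate of change multiplied by the current proportion."""
    if historical_count <= 0:
        # Assumption: the rate of change is undefined without a comparison value.
        raise ValueError("historical_count must be positive")
    difference = current_count - historical_count   # 97 - 15 = 82
    change_rate = difference / historical_count     # 82 / 15, about 5.4667
    return change_rate * current_proportion


score = anomaly_contribution_score(97, 15, 0.8509)
print(round(score, 2))  # 4.65
```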

[0061] In addition, the server can return the constructed multi-level decision tree to the user. The information in the multi-level decision tree can include the current proportion, overall proportion, comparison value, difference, and rate of change of each decision node, thereby providing a reference for the user.

[0062] Furthermore, for method three, the server can determine the first distribution information based on the first contribution score corresponding to each candidate anomaly feature, and the second distribution information based on the second contribution score corresponding to each candidate anomaly feature. Since the first contribution score is determined in the same way as in method one, and the second contribution score is determined in the same way as in method two, the multi-layer decision tree corresponding to the first distribution information can be seen in Figure 4, and the multi-layer decision tree corresponding to the second distribution information can be seen in Figure 5.

S102: Obtain the business change information of the target business, and input the feature distribution information and the business change information into a language model-based anomaly attribution system to determine the anomaly attribution information of the target business through the anomaly attribution system; wherein, the anomaly attribution information includes: anomaly source features determined based on the feature distribution information, and business change information associated with the anomaly source features, wherein the anomaly source features are used to characterize the source of the business anomaly.

[0063] After determining the feature distribution information, the server can integrate the feature distribution information to make it conform to the input format of the model.

[0064] It should be noted that when the feature distribution information includes the first distribution information and the second distribution information determined by method three, the server can concatenate the data corresponding to the first distribution information and the second distribution information, and then input them into the anomaly attribution system based on the language model. The language model mentioned above can be a large language model (LLM).

[0065] In addition, the server can also obtain business change information of the target business. This business change information records the changes of the target business within a certain period of time. These changes include, but are not limited to, adjustments to performance parameters, updates to interface protocols, modifications to configuration files, upgrades to dependent components, fixes to security vulnerabilities, optimization and refactoring of code logic, additions or deletions of functional modules, and iterations of the user interface.

[0066] In practical applications, target business change information can be in the form of a change list, where each change corresponds to a timestamp when the change occurred, and the changes are arranged in chronological order in the change list.

[0067] In this scenario, the business change information obtained by the server can be the business change information within a preset time range corresponding to the anomaly log. This preset time range can be set according to actual circumstances, and this specification does not impose specific limitations on it. For example, if the anomaly log covers the 24 hours of the current day and the time range is set to 24 hours, then the obtained business change information can include the business change information from the 24 hours of the previous day as well as the business change information from the 24 hours of the current day.
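
A change list with timestamps and the preset time range can be sketched as follows. The `Change` type, the function name, and the sample entries are hypothetical; the only behavior taken from the text is that changes carry timestamps, are ordered chronologically, and are selected from a window that extends back from the period covered by the anomaly log.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Change:
    timestamp: datetime   # when the change occurred
    description: str      # what was changed


def changes_in_window(changes, log_start, log_end, lookback):
    """Keep changes from (log_start - lookback) up to log_end, oldest first."""
    window_start = log_start - lookback
    selected = [c for c in changes if window_start <= c.timestamp <= log_end]
    return sorted(selected, key=lambda c: c.timestamp)


# Hypothetical change list; the anomaly log covers 2026-03-02.
changes = [
    Change(datetime(2026, 2, 28, 10, 0), "interface protocol updated"),
    Change(datetime(2026, 3, 1, 9, 0), "configuration file modified"),
    Change(datetime(2026, 3, 2, 8, 0), "dependent component upgraded"),
]
selected = changes_in_window(
    changes,
    log_start=datetime(2026, 3, 2, 0, 0),
    log_end=datetime(2026, 3, 3, 0, 0),
    lookback=timedelta(hours=24),
)
for change in selected:
    print(change.timestamp.isoformat(), change.description)
```

With a 24-hour lookback, the change from 28 February falls outside the window, while the changes from the previous day and the current day are kept.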

[0068] In addition, the server can also construct a corresponding prompt, so that the anomaly attribution system determines the anomaly source features and the associated business change information based on the prompt.

[0069] For example, the content of the prompt may include: ## Role {indicates the role and tasks performed by the anomaly attribution system}; ## Flow {a detailed breakdown of the process for completing the anomaly attribution task}; ## Background Information {indicates the background of the transaction and a brief overview of the decision tree results}; ## Constraints {indicates the constraints that the process and output results must satisfy}; ## User input {feature distribution information, business change information}.

[0070] The server can then include the feature distribution information and the business change information in the prompt and input the prompt into the anomaly attribution system.
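
One way to assemble such a prompt is sketched below. The section headings follow the example above; the template text, the function name `build_prompt`, the JSON serialization of the inputs, and all sample values are assumptions for illustration, not a format prescribed by this specification.

```python
import json

PROMPT_TEMPLATE = """## Role
{role}

## Flow
{flow}

## Background Information
{background}

## Constraints
{constraints}

## User input
Feature distribution information:
{feature_distribution}

Business change information:
{business_changes}
"""


def build_prompt(feature_distribution, business_changes, *,
                 role, flow, background, constraints):
    """Fill the fixed sections and append the two inputs as JSON text."""
    return PROMPT_TEMPLATE.format(
        role=role,
        flow=flow,
        background=background,
        constraints=constraints,
        feature_distribution=json.dumps(feature_distribution, ensure_ascii=False, indent=2),
        business_changes=json.dumps(business_changes, ensure_ascii=False, indent=2),
    )


# Hypothetical inputs.
prompt = build_prompt(
    [{"node": "1-1", "feature": "organization A"}],
    [{"timestamp": "2026-03-02T08:00:00", "change": "dependent component upgraded"}],
    role="You attribute business anomalies to their source and root cause.",
    flow="1. Identify the anomaly source features. 2. Match them to changes.",
    background="Decision tree built from the anomaly logs of the target business.",
    constraints="Only cite changes that appear in the user input.",
)
print(prompt)
```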

[0071] After receiving feature distribution information, the anomaly attribution system can first analyze the feature distribution information to determine the anomaly source features used to characterize the source of business anomalies.

[0072] Specifically, when the feature distribution information is determined solely based on method one or method two as described in S100, only one type of feature distribution information is determined. When this feature distribution information is a multi-layer decision tree, the candidate anomaly features corresponding to the leftmost decision nodes of the decision tree have the highest anomaly contribution scores, so these candidate anomaly features are the anomaly source features. For example, the decision nodes 1-1, 2-1, and 3-1 in Figure 4 or Figure 5 correspond to the anomaly source features organization A, client version 10.6.70.6000, and operating system iOS, respectively. This indicates that the source of the business anomaly is client version 10.6.70.6000 running on the iOS operating system and connected to organization A.
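
Reading the anomaly source features off the leftmost path can be sketched as follows, assuming each decision layer already stores its nodes in descending order of anomaly contribution score. The `DecisionNode` type and the sibling nodes (organization B, Android, and the second client version) are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class DecisionNode:
    dimension: str
    feature: str
    children: list["DecisionNode"] = field(default_factory=list)


def leftmost_path(roots):
    """Follow the first (highest-scoring) node of each layer down the tree."""
    path = []
    level = roots
    while level:
        node = level[0]
        path.append((node.dimension, node.feature))
        level = node.children
    return path


tree = [
    DecisionNode("organization", "organization A", [
        DecisionNode("client version", "10.6.70.6000", [
            DecisionNode("operating system", "iOS"),
            DecisionNode("operating system", "Android"),   # hypothetical sibling
        ]),
        DecisionNode("client version", "10.6.60.0000"),     # hypothetical sibling
    ]),
    DecisionNode("organization", "organization B"),          # hypothetical sibling
]
for dimension, feature in leftmost_path(tree):
    print(f"{dimension}: {feature}")
```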

[0073] When the feature distribution information is determined based on method three in S100, the input feature distribution information includes two types: first distribution information and second distribution information. In this case, if there is a discrepancy between the anomaly source features determined based on the first distribution information and the anomaly source features determined based on the second distribution information, the server can integrate the first and second distribution information by combining the first and second contribution scores (e.g., by weighted summation or by product) to re-rank the candidate anomaly features, thereby obtaining comprehensive distribution information, and then determine the anomaly source features based on the comprehensive distribution information. For ease of understanding, this specification provides a schematic diagram of the anomaly attribution system architecture, as shown in Figure 6.

[0074] Figure 6 is a schematic diagram of an anomaly attribution system architecture provided in an exemplary embodiment.

[0075] After the server obtains the anomaly logs, it can perform principal component analysis based on the current logs (see Method 1) to construct a decision tree corresponding to the first anomaly distribution information. At the same time, it can perform anomaly analysis based on the current logs and historical logs (see Method 2) to construct a decision tree corresponding to the second anomaly distribution information. Then, the decision trees constructed by principal component analysis and anomaly analysis, along with the business change information, are input into a language model-based anomaly attribution system for intelligent change analysis, so that the anomaly attribution system outputs anomaly attributions that include anomaly source characteristics and business change information.

[0076] It should be noted that, since the first and second contribution scores are of different orders of magnitude, a weight can be applied to the first contribution score as a standardization factor when a new score is determined from the two. Taking decision node 2-1 as an example, if its current proportion is used as the first contribution score, its first contribution score is 0.85 and its second contribution score is 4.65. The weight of the first contribution score can therefore be set to 10, and the redetermined contribution score is 0.85 × 10 + 4.65 = 13.15.
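
The weighted summation for decision node 2-1, and the re-ranking it enables, can be sketched as follows. The function names and the scores of the second node are hypothetical; only the weight of 10 and the scores 0.85 and 4.65 come from the example above.

```python
def combined_score(first_score, second_score, first_weight=10.0):
    """Weighted sum; the weight brings the first score to a comparable magnitude."""
    return first_score * first_weight + second_score


def rerank(scores, first_weight=10.0):
    """scores maps a feature to (first_score, second_score); highest combined first."""
    combined = {
        feature: combined_score(first, second, first_weight)
        for feature, (first, second) in scores.items()
    }
    return sorted(combined.items(), key=lambda item: item[1], reverse=True)


print(round(combined_score(0.85, 4.65), 2))  # 13.15

ranking = rerank({
    "node 2-2 (hypothetical)": (0.15, 0.40),
    "node 2-1": (0.85, 4.65),
})
print([feature for feature, _ in ranking])
```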

[0077] After identifying the anomaly source characteristics, the server can further perform correlation analysis between the anomaly source characteristics and various business change information to determine the correlation degree between each business change information and the anomaly source characteristics. Based on this correlation degree, one or more business change information with the highest correlation degree with the anomaly source characteristics are selected from the various business change information. This business change information is the root cause of the anomaly caused by the anomaly source characteristics.

[0078] Thus, the server can obtain anomaly attribution information output by the anomaly attribution system, which includes anomaly source characteristics and business change information associated with the anomaly source characteristics.

[0079] Once the anomaly attribution information is determined, the server can adopt a corresponding processing strategy to handle the business anomaly based on this information, such as rolling back to a historical version of the program, undoing configuration parameter modifications, rolling back data change operations, or performing rework, adjustment, or an upgrade.

[0080] As can be seen from the above, the solution relies on a language model-based anomaly attribution system to achieve a deep integration of semantic understanding and logical reasoning in complex fault location scenarios. By integrating the results of principal component analysis and anomaly analysis with business change information, it promotes intelligent fault location. This solution significantly reduces the reliance on professional skills in operation and maintenance work, effectively improves the efficiency and accuracy of fault location, and provides more intelligent and efficient technical support for operation and maintenance management.

[0081] Furthermore, this solution innovatively proposes a custom decision tree construction algorithm based on the RF classifier. By performing deep feature segmentation on abnormal logs and using information gain as the core selection criterion, a hierarchical and logically rigorous decision tree model is constructed. This method can not only efficiently identify the key factors triggering anomalies, but also eliminate redundant feature dimensions through pruning strategies, significantly improving the accuracy and timeliness of the analysis.
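
Selecting the dimension for a decision layer by information gain can be illustrated with the standard entropy-based definition. This is a sketch under stated assumptions, not the custom algorithm itself: it assumes each log record carries a binary label (here the hypothetical key `is_anomalous`), which this passage does not specify, and it does not reproduce the RF classifier.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """Shannon entropy, in bits, of a list of labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def information_gain(records, dimension, label_key="is_anomalous"):
    """Entropy of the labels minus the weighted entropy after splitting on dimension."""
    labels = [r[label_key] for r in records]
    groups = defaultdict(list)
    for r in records:
        groups[r[dimension]].append(r[label_key])
    weighted = sum(len(g) / len(records) * entropy(g) for g in groups.values())
    return entropy(labels) - weighted


def pick_dimension(records, remaining_dimensions, label_key="is_anomalous"):
    """Choose the not-yet-used dimension with the largest information gain."""
    return max(remaining_dimensions,
               key=lambda d: information_gain(records, d, label_key))


# Hypothetical records: the organization separates the labels, the OS does not.
records = [
    {"organization": "A", "os": "iOS", "is_anomalous": True},
    {"organization": "A", "os": "Android", "is_anomalous": True},
    {"organization": "B", "os": "iOS", "is_anomalous": False},
    {"organization": "B", "os": "Android", "is_anomalous": False},
]
print(pick_dimension(records, ["organization", "os"]))
```

In this toy data the organization dimension yields a gain of 1 bit and the operating system dimension a gain of 0, so the organization would be chosen for the first decision layer.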

[0082] At the anomaly analysis level, this solution continues the core approach of principal component analysis, introducing a joint contribution index that integrates year-on-year change rate and proportion, opening up a new perspective for anomaly log analysis. By dynamically comparing current logs with historical data, it accurately captures the dimensions of abnormal fluctuations, thereby more quickly and accurately locating the root cause of the fault. This method integrates time series analysis into the traditional analysis framework, strengthening the ability to mine differences in logs throughout the fault's entire lifecycle, and significantly enhancing the stability and reliability of fault location.

[0083] Figure 7 is a schematic structural diagram of a device provided in an exemplary embodiment. Referring to Figure 7, at the hardware level, the device includes a processor, an internal bus, a network interface, memory, and non-volatile memory, and may also include other hardware required for its functions. One or more embodiments of this specification can be implemented in software, such as by the processor reading the corresponding computer program from non-volatile memory into memory and then running it. Of course, besides software implementation, one or more embodiments of this specification do not exclude other implementation methods, such as logic devices or a combination of hardware and software. That is to say, the execution entity of the following processing flow is not limited to individual logic units, but can also be hardware or logic devices.

[0084] Referring to Figure 8, the anomaly attribution apparatus can be applied to, for example, the device shown in Figure 7 to implement the technical solution of this specification. The anomaly attribution apparatus may include: a parsing module 800, used to obtain the anomaly logs of the target business and parse the anomaly logs to determine the feature distribution information of the target business; and an attribution module 802, used to acquire the business change information of the target business and input the feature distribution information and the business change information into a language model-based anomaly attribution system to determine the anomaly attribution information of the target business through the anomaly attribution system; wherein, the anomaly attribution information includes: anomaly source features determined based on the feature distribution information, and business change information associated with the anomaly source features, wherein the anomaly source features are used to characterize the source of the business anomaly.

[0085] Optionally, the parsing module 800 is specifically used to: determine each candidate anomaly feature in the anomaly log, and determine the anomaly contribution score corresponding to each candidate anomaly feature, wherein the anomaly contribution score is used to characterize the degree of correlation between the candidate anomaly feature and the business anomaly; and arrange each candidate anomaly feature based on the anomaly contribution score to determine the feature distribution information; wherein the feature distribution information includes at least some of the arranged candidate anomaly features.

[0086] Optionally, the anomaly log includes: the current log corresponding to the current business anomaly; the parsing module 800 is specifically used to, for each candidate anomaly feature, determine the anomaly contribution score corresponding to the candidate anomaly feature based on the proportion of the number of times the candidate anomaly feature appears in the current log.

[0087] Optionally, the exception log includes: the current log corresponding to the current business exception and the historical log corresponding to the previous business exception; The parsing module 800 is specifically used to, for each candidate anomaly feature, determine the rate of change of the number of occurrences of the candidate anomaly feature based on the number of occurrences of the candidate anomaly feature in the current log and the number of occurrences of the candidate anomaly feature in the historical log; and determine the anomaly contribution score corresponding to the candidate anomaly feature based on the rate of change and the proportion of the number of occurrences of the candidate anomaly feature in the current log.

[0088] Optionally, the parsing module 800 is specifically used to: determine candidate abnormal features under different abnormal feature dimensions in the abnormal log; construct a multi-level decision tree based on the candidate abnormal features under each abnormal feature dimension, so as to determine the feature distribution information using the multi-level decision tree; wherein: each decision layer of the multi-level decision tree corresponds to an abnormal feature dimension, a decision node in each decision layer corresponds to a candidate abnormal feature under the corresponding abnormal feature dimension of the decision layer, and a decision node in each decision layer is associated with at least one decision node in its next decision layer.
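
The layered structure described here, with one anomaly feature dimension per decision layer and one candidate anomaly feature per decision node, can be sketched as a recursive grouping of log records. The function name and sample logs are hypothetical, and two definitions are assumptions not given in this passage: the current proportion is taken as a node's share of its parent's records, and the overall proportion as its share of all records.

```python
from collections import Counter


def build_tree(records, dimensions, total=None):
    """Return the sibling nodes for dimensions[0], each with its subtree."""
    if not dimensions or not records:
        return []
    total = len(records) if total is None else total
    dimension, rest = dimensions[0], dimensions[1:]
    counts = Counter(r[dimension] for r in records)
    nodes = []
    for feature, count in counts.items():
        subset = [r for r in records if r[dimension] == feature]
        nodes.append({
            "dimension": dimension,
            "feature": feature,
            "count": count,
            "current_proportion": count / len(records),  # share within the parent
            "overall_proportion": count / total,         # share of all records
            "children": build_tree(subset, rest, total),
        })
    # Place the node with the largest current proportion on the left.
    nodes.sort(key=lambda n: n["current_proportion"], reverse=True)
    return nodes


# Hypothetical anomaly logs with two feature dimensions.
logs = (
    [{"organization": "A", "os": "iOS"}] * 3
    + [{"organization": "A", "os": "Android"}]
    + [{"organization": "B", "os": "iOS"}]
)
tree = build_tree(logs, ["organization", "os"])
top = tree[0]
print(top["feature"], top["current_proportion"], top["children"][0]["feature"])
```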

[0089] Optionally, the parsing module 800 is specifically used to sequentially construct each decision layer of the multi-layer decision tree to determine the multi-layer decision tree based on each sequentially constructed decision layer; wherein, for each decision layer, when constructing the decision layer, the abnormal feature dimension with the largest corresponding information gain value is selected from the remaining abnormal feature dimensions as the abnormal feature dimension corresponding to the decision layer; the information gain value is used to characterize the degree of influence of the abnormal feature dimension on the abnormality of the target business, and the remaining abnormal dimensions are the abnormal feature dimensions that were not selected when constructing the previous decision layer.

[0090] Optionally, the parsing module 800 is specifically used to, for each decision layer, determine the anomaly contribution score corresponding to each candidate anomaly feature under the anomaly feature dimension of the decision layer when constructing the decision layer; the anomaly contribution score is used to characterize the degree of correlation between the candidate anomaly feature and the business anomaly; if multiple decision nodes in the decision layer belong to the same subtree in the multi-layer decision tree, then arrange the multiple decision nodes in descending order of the anomaly contribution score corresponding to each decision node; and construct the decision layer based on the arranged decision nodes.

[0091] Optionally, the parsing module 800 is specifically used to prune the decision nodes that are arranged after a specified position in the arranged decision nodes, and to construct the decision layer based on the pruned decision nodes.
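
Arranging the decision nodes of one subtree in descending order of score and pruning those after a specified position can be sketched as follows. The function name, the `keep` parameter, and the sample scores are hypothetical.

```python
def arrange_and_prune(siblings, keep):
    """Sort sibling nodes by descending score and drop those after position `keep`."""
    ordered = sorted(siblings, key=lambda node: node["score"], reverse=True)
    return ordered[:keep]


# Hypothetical sibling nodes that belong to the same subtree.
siblings = [
    {"feature": "client version X", "score": 0.10},
    {"feature": "client version Y", "score": 4.65},
    {"feature": "client version Z", "score": 0.02},
]
kept = arrange_and_prune(siblings, keep=2)
print([node["feature"] for node in kept])
```

Sorting only within a subtree keeps each child next to its parent, so the leftmost path of the finished tree still traces the highest-scoring feature at every layer.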

[0092] Optionally, the exception log includes: the current log corresponding to the current business exception and the historical log corresponding to the historical business exception, and the exception contribution score includes: a first contribution score and a second contribution score. The parsing module 800 is specifically used to determine the rate of change of the occurrence count of each candidate anomaly feature based on the number of times the candidate anomaly feature appears in the current log and the number of times the candidate anomaly feature appears in the historical log. Based on the proportion of the number of times the candidate anomaly feature appears in the current log, a first contribution score is determined for the candidate anomaly feature. Based on the rate of change and the proportion of the number of times the candidate anomaly feature appears in the current log, a second contribution score is determined for the candidate anomaly feature.

[0093] Optionally, the feature distribution information includes: first distribution information determined based on the first contribution score corresponding to each candidate abnormal feature, and second distribution information determined based on the second contribution score corresponding to each candidate abnormal feature; The attribution module is specifically used to: input the first distribution information, the second distribution information, and the business change information into the anomaly attribution system; if there is a deviation between the anomaly source characteristics determined based on the first distribution information and the anomaly source characteristics determined based on the second distribution information, then integrate the first distribution information and the second distribution information to obtain comprehensive distribution information, and determine the anomaly source characteristics based on the comprehensive distribution information.

[0094] Based on the same concept as the methods described above, this specification also provides a computer-readable storage medium having computer instructions stored thereon that, when executed by a processor, implement the steps of the methods as described in any of the above embodiments.

[0095] Based on the same concept as the methods described above, this specification also provides a computer program product, including a computer program / instructions that, when executed by a processor, implement the steps of the methods as described in any of the above embodiments.

Claims

1. An anomaly attribution method, comprising: obtaining anomaly logs of a target business and parsing the anomaly logs to determine feature distribution information of the target business; and obtaining business change information of the target business, and inputting the feature distribution information and the business change information into a language model-based anomaly attribution system to determine anomaly attribution information of the target business through the anomaly attribution system; wherein the anomaly attribution information includes: anomaly source features determined based on the feature distribution information, and business change information associated with the anomaly source features, wherein the anomaly source features are used to characterize the source of the business anomaly.

2. The method as described in claim 1, wherein determining the feature distribution information of the target service specifically includes: Each candidate anomaly feature is identified in the anomaly log, and the anomaly contribution score corresponding to each candidate anomaly feature is determined. The anomaly contribution score is used to characterize the degree of correlation between the candidate anomaly feature and the business anomaly. Based on the anomaly contribution score, each candidate anomaly feature is arranged to determine the feature distribution information; wherein, the feature distribution information includes at least some of the arranged candidate anomaly features.

3. The method as described in claim 2, wherein the exception log includes: The current log corresponding to this business anomaly; Determine the anomaly contribution score for each candidate anomaly feature, specifically including: For each candidate anomaly feature, the anomaly contribution score is determined based on the proportion of the number of times the candidate anomaly feature appears in the current log.

4. The method as described in claim 2, wherein the exception log comprises: The current log corresponding to this business exception and the historical log corresponding to previous business exceptions; Determine the anomaly contribution score for each candidate anomaly feature, specifically including: For each candidate anomaly feature, the rate of change of the number of occurrences of the candidate anomaly feature is determined based on the number of occurrences of the candidate anomaly feature in the current log and the number of occurrences of the candidate anomaly feature in the historical log. Based on the rate of change and the proportion of the number of times the candidate anomaly feature appears in the current log, the anomaly contribution score corresponding to the candidate anomaly feature is determined.

5. The method as described in claim 1, wherein determining the feature distribution information of the target service specifically includes: Candidate anomaly features under different anomaly feature dimensions are identified in the anomaly log; Based on the candidate anomaly features under each anomaly feature dimension, a multi-layer decision tree is constructed to determine the feature distribution information; wherein: Each decision layer of the multi-layer decision tree corresponds to an anomaly feature dimension. A decision node in each decision layer corresponds to a candidate anomaly feature under the anomaly feature dimension of that decision layer. A decision node in each decision layer is associated with at least one decision node in the next decision layer.

6. The method as described in claim 5, wherein a multi-layer decision tree is constructed based on candidate anomaly features under each anomaly feature dimension, specifically including: Each decision layer of the multi-layer decision tree is constructed sequentially to determine the multi-layer decision tree based on the sequentially constructed decision layers. For each decision layer, when constructing the decision layer, the anomaly feature dimension with the largest corresponding information gain value is selected from the remaining anomaly feature dimensions and used as the anomaly feature dimension corresponding to that decision layer. The information gain value is used to characterize the degree of influence of the anomaly feature dimension on the anomaly occurring in the target business. The remaining anomaly dimensions are the anomaly feature dimensions that were not selected when constructing the previous decision layer.

7. The method of claim 6, wherein each decision layer of the multi-layer decision tree is constructed, specifically comprising: For each decision layer, when constructing the decision layer, the abnormal contribution score corresponding to each candidate abnormal feature under the abnormal feature dimension of the decision layer is determined. The anomaly contribution score is used to characterize the degree of correlation between candidate anomaly features and business anomalies; If multiple decision nodes in the decision layer belong to the same subtree in the multi-level decision tree, then the multiple decision nodes are arranged in descending order of the abnormal contribution score corresponding to each decision node. Based on the arranged decision nodes, the decision layer is constructed.

8. The method as described in claim 7, wherein the decision layer is constructed based on the arranged decision nodes, specifically including: In the arranged decision nodes, the decision nodes that are ranked after the specified position are pruned, and the decision layer is constructed based on the pruned decision nodes.

9. The method as described in claim 2, wherein the exception log comprises: The current log corresponding to this business anomaly and the historical log corresponding to historical business anomalies, wherein the anomaly contribution score includes: a first contribution score and a second contribution score; Determine the anomaly contribution score for each candidate anomaly feature, specifically including: For each candidate anomaly feature, the rate of change of the number of occurrences of the candidate anomaly feature is determined based on the number of occurrences of the candidate anomaly feature in the current log and the number of occurrences of the candidate anomaly feature in the historical log. Based on the proportion of the number of times the candidate anomaly feature appears in the current log, a first contribution score is determined for the candidate anomaly feature. Based on the rate of change and the proportion of the number of times the candidate anomaly feature appears in the current log, a second contribution score is determined for the candidate anomaly feature.

10. The method of claim 9, wherein the feature distribution information includes: The first distribution information is determined based on the first contribution score corresponding to each candidate anomaly feature, and the second distribution information is determined based on the second contribution score corresponding to each candidate anomaly feature. The feature distribution information and the business change information are input into a language model-based anomaly attribution system to determine the anomaly attribution information of the target business. Specifically, this includes: The first distribution information, the second distribution information, and the business change information are input into the anomaly attribution system. If there is a discrepancy between the anomaly source characteristics determined based on the first distribution information and the anomaly source characteristics determined based on the second distribution information, the first distribution information and the second distribution information are integrated to obtain comprehensive distribution information, and the anomaly source characteristics are determined based on the comprehensive distribution information.

11. An anomaly attribution device, comprising: The parsing module is used to obtain the abnormal logs of the target service and parse the abnormal logs to determine the characteristic distribution information of the target service; The attribution module is used to acquire business change information of the target business and input the feature distribution information and the business change information into a language model-based anomaly attribution system to determine the anomaly attribution information of the target business through the anomaly attribution system; wherein, the anomaly attribution information includes: anomaly source features determined based on the feature distribution information, and business change information associated with the anomaly source features, wherein the anomaly source features are used to characterize the source of the business anomaly.

12. An electronic device, comprising: a processor; and a memory for storing processor-executable instructions; wherein the processor implements the steps of the method as described in any one of claims 1-10 by executing the executable instructions.

13. A computer-readable storage medium having stored thereon computer instructions that, when executed by a processor, implement the steps of the method as claimed in any one of claims 1-10.

14. A computer program product comprising a computer program / instructions that, when executed by a processor, implement the steps of the method as claimed in any one of claims 1-10.