Abnormality troubleshooting method and device, electronic equipment, storage medium and program product

The intelligent question-answering system acquires business data and contextual information in real time and applies semantic structured processing and multi-tool collaborative scheduling. This removes the reliance of existing systems on manual intervention and improves the efficiency of anomaly troubleshooting and the speed of problem location.

CN122308909A · Pending · Publication Date: 2026-06-30 · SHANGHAI SHIZHUANG INFORMATION TECHNOLOGY CO LTD

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
SHANGHAI SHIZHUANG INFORMATION TECHNOLOGY CO LTD
Filing Date
2026-03-31
Publication Date
2026-06-30

AI Technical Summary

Technical Problem

Existing intelligent question-answering systems rely on manual intervention when dealing with complex anomaly troubleshooting scenarios, resulting in a heavy workload for maintenance personnel and long problem-solving times, making it difficult to meet the needs of rapid discovery and automatic location in large-scale distributed systems.

Method used

The intelligent question-and-answer system acquires business data, middleware status, and runtime context information in real time, and applies semantic structured processing and automated collaborative scheduling of multiple tools. This replaces manual switching between multiple systems and manual data collection with a single automated anomaly troubleshooting process.

Benefits of technology

It significantly shortens the average time to resolve anomalies, reduces the workload of operations and maintenance personnel, improves the efficiency of problem investigation, and enables rapid location of code-level root causes of problems.



Abstract

This application provides an anomaly troubleshooting method, apparatus, electronic device, storage medium, and program product, relating to the field of computer technology. The method, by invoking tools and relying on underlying client dependencies to obtain business data, middleware status, and runtime context information in real time, enables the system to possess dynamic data access and deep technical information retrieval capabilities. Based on this, semantic structured processing is used to uniformly transform user anomaly queries or system alarms into executable instructions, and the execution chain enables automated collaborative scheduling of multiple tools, replacing the traditional mode of manually switching between multiple systems, manually collecting data, and inferring root causes step by step. This mechanism integrates the originally scattered data entry points and complex troubleshooting processes into an automated system, enabling the system to quickly locate code-level root causes of problems, significantly shortening the average anomaly resolution time, reducing the workload of maintenance personnel, and improving the efficiency of problem troubleshooting.

Description

Technical Field

[0001] This application relates to the field of computer technology, and more specifically, to an anomaly troubleshooting method, apparatus, electronic device, storage medium, and program product.

Background Technology

[0002] In the current field of system operation and maintenance and business support, intelligent question-and-answer systems have been gradually applied to internal anomaly investigation and consulting services. Existing technical solutions generally adopt a core architecture of unified entry point and intent distribution: first, natural language queries from users are received through a single service entry point; then, the request content is parsed based on the intent understanding module; after identifying the user's intent, the request is distributed to the corresponding business domain's intelligent agent for processing.

[0003] Regarding business data access capabilities, the existing system's technical architecture has limitations, failing to effectively connect to and integrate dynamic business data sources. This results in responses being limited to general, static knowledge-based questions, making it difficult to support real-time business status queries and data-driven decision analysis needs. These technical limitations mean that existing intelligent question-answering systems often require manual intervention for multi-system switching, manual data collection, and root cause inference when handling complex anomaly troubleshooting scenarios. This not only increases the workload of operations and maintenance personnel but also significantly prolongs the average problem resolution time, making it difficult to meet the needs of rapid anomaly detection and automatic location in large-scale distributed systems.

Summary of the Invention

[0004] The purpose of this application is to provide an anomaly troubleshooting method, apparatus, electronic device, storage medium, and program product that address the low troubleshooting efficiency caused by the reliance of existing intelligent question-and-answer systems on manual intervention.

[0005] In a first aspect, embodiments of this application provide an anomaly troubleshooting method applied to an intelligent question-answering system, the method comprising: in response to a triggered service request, performing semantic structuring on the service request to generate a structured parsing result, wherein the service request includes a user anomaly query request or a system alarm event; determining a matching execution chain based on the parsing result, wherein the execution chain is used to call at least one tool to work collaboratively; invoking the corresponding tools according to the execution chain and obtaining the execution result of each tool, wherein, during invocation, middleware information and system runtime context information are obtained through underlying client dependencies; and integrating the execution results returned by the tools to generate and feed back a response result for the service request.

[0006] In the aforementioned implementation process, by using the underlying client dependencies to obtain business data, middleware status, and runtime context information in real time through the calling tools, the system acquires dynamic data access and deep technical information retrieval capabilities. Based on this, semantic structured processing transforms user anomaly queries or system alarms into executable instructions, and the execution chain enables automated collaborative scheduling of multiple tools, replacing the traditional model of manually switching between multiple systems, manually collecting data, and inferring root causes step by step. This mechanism integrates the previously scattered data entry points and complex troubleshooting processes into an automated system, enabling the system to quickly locate code-level root causes of problems, significantly shortening the average time to resolve anomalies, reducing the workload of operations and maintenance personnel, and improving the efficiency of problem troubleshooting.

[0007] Optionally, the step of semantically structuring the service request to generate a structured parsing result includes: obtaining multi-source context information, which includes extraction rules for business domains, field dictionaries, and parameter templates; extracting key parameters from the service request based on the multi-source context information to obtain extraction results, which include business scenarios, functional scenarios, and common parameters used for tool invocation; and processing the extraction results into a structured parsing result.

[0008] In the above implementation process, by acquiring multi-source context information, the accurate transformation from the original service request to the structured execution instruction is achieved, laying the data foundation for the automated anomaly investigation process of the entire intelligent question answering system, thereby reducing multiple rounds of follow-up questions and improving the overall processing efficiency.

[0009] Optionally, if the service request includes a user anomaly query request, the multi-source context information further includes the user's historical preferences, commonly used query criteria, and parameters from the most recent session. This information enables personalized and accurate parsing of user queries, spares users from re-entering information they have already provided, and improves question-and-answer efficiency.

[0010] Optionally, determining the matching execution chain based on the parsing result includes: if the complexity of the scenario corresponding to the parsing result exceeds a preset threshold, generating an execution chain through a large model; and if the complexity of the scenario corresponding to the parsing result is lower than the preset threshold, using the matched standardized execution template as the execution chain.

[0011] In the above implementation process, the dual-mode mechanism enables rapid response to high-frequency and simple scenarios, while maintaining deep reasoning capabilities for complex and unknown scenarios, ensuring the efficiency and effectiveness of the intelligent question-answering system in anomaly investigation tasks.
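The dual-mode selection described in [0010] can be sketched as follows. This is only an illustration under stated assumptions: the numeric complexity score, the function name `select_execution_chain`, the template table, and the `generate_with_model` callback are hypothetical. The text does not say what happens when the complexity equals the threshold or when no template matches; the sketch takes the template path on equality and falls back to the model when no template exists.

```python
from typing import Callable, Dict, List

Chain = List[str]  # an ordered list of tool names


def select_execution_chain(
    scenario: str,
    complexity: float,
    threshold: float,
    templates: Dict[str, Chain],
    generate_with_model: Callable[[str], Chain],
) -> Chain:
    """Pick a chain from a template or from a large model, by complexity."""
    if complexity > threshold:
        # Complex or unknown scenario: let the large model plan the chain.
        return generate_with_model(scenario)
    template = templates.get(scenario)
    if template is None:
        # Assumption: with no matching template, fall back to the model.
        return generate_with_model(scenario)
    # Simple, high-frequency scenario: reuse the standardized template.
    return list(template)
```

A template lookup avoids a model call for routine scenarios, which is where the fast-response benefit mentioned above would come from.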

[0012] Optionally, the process of integrating the execution results returned by each tool to generate and feed back a response result for the service request includes: The execution results returned by each tool are processed at the field level to generate standardized output results. The field-level processing includes at least one of data format conversion, semantic standardization conversion, and sensitive information desensitization. The standardized output results are integrated to generate and feed back a response result for the service request.

[0013] In the above implementation process, the field-level processing mechanism ensures that the output information is safe and compliant, readable in business terms, and credible in its conclusions.

[0014] Optionally, the step of performing field-level processing on the execution results returned by each tool to generate standardized output results includes: The execution results returned by each tool are processed according to the field-level processing rules corresponding to the field type to generate standardized processing results.

[0015] In the above implementation process, through the mechanism of classifying and processing by field type, fields of the same type can adopt consistent rules, avoiding the need to write separate processing code for each field. Furthermore, when adding a new field type, only the type and processing rules need to be defined, without modifying the core processing flow.
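One way to picture processing by field type is a rule registry keyed by type, so that adding a type means registering one more rule while the core loop stays unchanged. The sketch below is an assumption-laden illustration: the type names (`timestamp_ms`, `duration_ms`, `phone`), the masking format, and the function names are hypothetical and do not come from the application.

```python
from datetime import datetime, timezone
from typing import Any, Callable, Dict, Tuple

# Registry: field type -> processing rule (hypothetical type names).
RULES: Dict[str, Callable[[Any], Any]] = {}


def rule(field_type: str):
    """Register a processing rule for one field type."""
    def register(fn: Callable[[Any], Any]) -> Callable[[Any], Any]:
        RULES[field_type] = fn
        return fn
    return register


@rule("timestamp_ms")
def format_timestamp(value: int) -> str:
    # Data format conversion: epoch milliseconds to a UTC string.
    moment = datetime.fromtimestamp(value / 1000, tz=timezone.utc)
    return moment.strftime("%Y-%m-%d %H:%M:%S")


@rule("duration_ms")
def format_duration(value: float) -> str:
    # Semantic conversion: show long durations in seconds.
    return f"{value / 1000:.1f} s" if value >= 1000 else f"{value} ms"


@rule("phone")
def mask_phone(value: str) -> str:
    # Desensitization: keep the first 3 and last 4 characters.
    return value[:3] + "****" + value[-4:]


def standardize(result: Dict[str, Tuple[str, Any]]) -> Dict[str, Any]:
    """Apply the rule for each field's type; unknown types pass through."""
    return {
        name: RULES.get(field_type, lambda v: v)(value)
        for name, (field_type, value) in result.items()
    }
```

For example, `standardize({"elapsed": ("duration_ms", 5200)})` yields `{"elapsed": "5.2 s"}`.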

[0016] Optionally, after invoking the corresponding tool according to the execution chain, the method further includes: The input parameters, intermediate results, and interaction logic of each tool call in the execution chain are obtained to generate a full-chain record.

[0017] In the above implementation process, the end-to-end recording mechanism not only automates the anomaly investigation process, but also ensures the interpretability and auditability of the process, making the system's decision-making logic completely transparent to users, and significantly improving the trust and controllability of handling complex problems.
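A minimal sketch of such a record is a wrapper that stores the inputs and output of every tool call in order. The class and field names below are hypothetical, and the sketch captures only inputs, outputs, and timing; how the application records "interaction logic" is not specified in the text.

```python
import time
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class CallRecord:
    step: int
    tool: str
    inputs: Dict[str, Any]
    output: Any
    elapsed_ms: float


@dataclass
class ChainRecorder:
    """Collects one CallRecord per tool call, in call order."""
    records: List[CallRecord] = field(default_factory=list)

    def call(self, tool_name: str, tool: Callable[..., Any], **inputs: Any) -> Any:
        started = time.perf_counter()
        output = tool(**inputs)
        elapsed_ms = (time.perf_counter() - started) * 1000
        self.records.append(
            CallRecord(len(self.records) + 1, tool_name, dict(inputs), output, elapsed_ms)
        )
        return output
```

The accumulated `records` list could then be rendered for users or stored for audit.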

[0018] Optionally, the process of invoking the corresponding tool according to the execution chain further includes: A global context object is maintained, which is used to pass shared parameters and intermediate calculation results between different tools. This object serves as a unified runtime data hub, continuously accumulating and passing shared parameters and intermediate calculation results during tool calls. This allows subsequent tools to perform in-depth processing based on the output of previous steps without repeatedly acquiring the same data or relying on manual intervention to pass information, ensuring the continuity and consistency of complex investigation tasks in the multi-step execution process.

[0019] In a second aspect, embodiments of this application provide an anomaly troubleshooting device applied to an intelligent question-answering system, the device comprising: a perception module, used to respond to a triggered service request, perform semantic structuring on the service request, and generate a structured parsing result, wherein the service request includes a user anomaly query request or a system alarm event; a planning module, used to determine a matching execution chain based on the parsing result, wherein the execution chain is used to call at least one tool to work collaboratively; an execution module, used to invoke the corresponding tools according to the execution chain and obtain the execution result of each tool, wherein, during invocation, middleware information and system runtime context information are obtained through underlying client dependencies; and a mapping module, used to integrate the execution results returned by the tools and to generate and feed back a response result for the service request.

[0020] In a third aspect, embodiments of this application provide an electronic device, including a processor and a memory, wherein the memory stores computer-readable instructions, and when the computer-readable instructions are executed by the processor, the steps of the method provided in the first aspect above are performed.

[0021] In a fourth aspect, embodiments of this application provide a computer-readable storage medium having a computer program stored thereon, which, when executed by a processor, performs the steps of the method provided in the first aspect above.

[0022] In a fifth aspect, embodiments of this application provide a computer program product, including computer program instructions, which, when read and executed by a processor, perform the steps of the method provided in the first aspect above.

[0023] Other features and advantages of this application will be set forth in the following description and will be apparent in part from the description or may be learned by practicing embodiments of this application. The objectives and other advantages of this application may be realized and obtained by means of the structures particularly pointed out in the written description, claims, and drawings.

Attached Figure Description

[0024] To more clearly illustrate the technical solutions of the embodiments of this application, the accompanying drawings used in the embodiments of this application will be briefly introduced below. It should be understood that the following drawings only show some embodiments of this application and should not be regarded as a limitation of the scope. For those skilled in the art, other related drawings can be obtained based on these drawings without creative effort.

[0025] Figure 1 is a structural block diagram of an intelligent question-answering system provided in an embodiment of this application;
Figure 2 is a flowchart of an anomaly troubleshooting method provided in an embodiment of this application;
Figure 3 is an execution flowchart of a perception module provided in an embodiment of this application;
Figure 4 is an execution flowchart of an execution module and a mapping module provided in an embodiment of this application;
Figure 5 is a structural block diagram of an anomaly troubleshooting device provided in an embodiment of this application;
Figure 6 is a schematic diagram of the structure of an electronic device for performing an anomaly troubleshooting method, provided in an embodiment of this application.

Detailed Implementation

[0026] The technical solutions in the embodiments of this application will be clearly and completely described below with reference to the accompanying drawings.

[0027] It should be noted that the terms "system" and "network" in the embodiments of this invention can be used interchangeably. "Multiple" refers to two or more; therefore, in the embodiments of this invention, "multiple" can also be understood as "at least two". "And / or" describes the relationship between related objects, indicating that three relationships can exist. For example, A and / or B can represent: A existing alone, A and B existing simultaneously, and B existing alone. Additionally, the character " / ", unless otherwise specified, generally indicates that the preceding and following related objects have an "or" relationship.

[0028] It should also be noted that all actions involving the acquisition of signals, information, or data in this application are carried out in compliance with the relevant data protection laws and policies of the country where the application is located, and with the authorization granted by the owner of the relevant device.

[0029] This application provides an anomaly troubleshooting method applied to an intelligent question-and-answer system. In this method, the invoked tools acquire business data, middleware status, and runtime context information in real time through underlying client dependencies, giving the system dynamic data access and deep technical information retrieval capabilities. Based on this, semantic structuring transforms user anomaly queries or system alarms into executable instructions, and automated collaborative scheduling of multiple tools is achieved through the execution chain, replacing the traditional model of manually switching between multiple systems, manually collecting data, and inferring root causes step by step. This mechanism integrates previously scattered data entry points and complex troubleshooting processes into an automated system, enabling the system to quickly locate code-level root causes of problems, significantly shortening the average anomaly resolution time, reducing the workload of maintenance personnel, and improving the efficiency of problem troubleshooting.

[0030] Figure 1 is a schematic diagram of the architecture of an intelligent question-answering system. As shown in Figure 1, the system includes an external interaction layer and internal modules. The external interaction layer may include application robots, page assistants, web page conversations, etc., which can be used for information input. The internal modules may include a perception module, planning module, execution module, memory module, feedback module, desensitization module, tracking module, stability module, etc. Some of these modules are used in this solution, and their functions are described in the subsequent implementation flow. The functions of the internal modules are as shown in Figure 1.

[0031] The intelligent question-answering system can also integrate some underlying tools, which can be called to obtain relevant information during anomaly investigation.

[0032] Please refer to Figure 2, which is a flowchart of an anomaly troubleshooting method provided in an embodiment of this application. The method is applied to an intelligent question-answering system and includes the following steps:

Step S110: In response to the triggered service request, perform semantic structuring on the service request to generate a structured parsing result.

[0033] Service requests include user anomaly query requests or system alarm events.

[0034] When the intelligent question-answering system receives a service request, the perception module first performs semantic structuring processing on the request. These service requests fall into two categories: first, user-initiated exception queries using natural language, such as "Why did order DW2024001 payment time out?"; and second, automatically triggered alarm events, such as an alarm pushed by the monitoring system indicating that "the payment interface response time exceeded 5 seconds." The goal of semantic structuring processing is to transform unstructured input into machine-readable structured parsing results.

[0035] Step S120: Determine the matching execution chain based on the parsing result.

[0036] This step can be executed by the planning module. After receiving the structured parsed results, the planning module can determine the corresponding execution chain based on its content. The execution chain is used to call at least one tool to work collaboratively. It can be understood as an ordered sequence of tool calls used to collaboratively complete the anomaly investigation task.

[0037] The planning module can combine structured parsing results, a list of available tools (each tool includes a name, function description, input / output format, etc.), and a few historical examples into a prompt, and then call the large model to generate the execution chain. For example, regarding "payment timeout diagnosis" in the above example, the chain output by the large model might be: (1) Call the order query tool (Tool_OrderQuery), input the parameter order_no, and obtain the payment order number payment_id; (2) Call the link tracing tool (Tool_TraceQuery), input the payment_id and time range, and obtain the complete call chain that the request went through and the time consumed by each node; (3) Call the log retrieval tool (Tool_LogQuery), input the abnormal node identifier and time range, and obtain the relevant error logs; (4) Call the middleware monitoring tool (Tool_MiddlewareMonitor), input the database connection pool identifier corresponding to the abnormal node, and obtain the connection pool status during the period.
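The prompt assembly described above can be sketched as plain string building. Everything here is illustrative: the function name `build_planning_prompt`, the dictionary keys, and the wording of the prompt are assumptions, and the call to the large model itself is left out because the text names no specific model or API.

```python
import json
from typing import Any, Dict, List


def build_planning_prompt(
    parsed: Dict[str, Any],
    tools: List[Dict[str, Any]],
    examples: List[Dict[str, Any]],
) -> str:
    """Combine the parsing result, tool list, and examples into one prompt."""
    lines = [
        "Plan an ordered chain of tool calls for the request below.",
        "",
        "Structured request:",
        json.dumps(parsed, ensure_ascii=False, sort_keys=True),
        "",
        "Available tools:",
    ]
    for tool in tools:
        lines.append(
            f"- {tool['name']}: {tool['description']} "
            f"(input: {', '.join(tool['inputs'])}; output: {', '.join(tool['outputs'])})"
        )
    lines += ["", "Historical examples:"]
    for example in examples:
        lines.append(f"- {example['scenario']}: {' -> '.join(example['chain'])}")
    lines += ["", "Return the chain as a JSON list of tool names."]
    return "\n".join(lines)
```

Keeping the tool list in the prompt, rather than hard-coded in the model, means a newly registered tool becomes available to planning without retraining.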

[0038] Subsequently, the planning module can verify the generated links to ensure that there are no circular dependencies and that the parameter passing paths are clear, and then deliver them to the execution module.
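The two checks mentioned here, no circular dependencies and a clear source for every parameter, could be implemented along these lines. This is a sketch, not the application's verification logic: the step format (`tool`, `needs`, `provides`), the parameter key names such as `abnormal_node`, and the assumption that tool names are unique are all hypothetical.

```python
from typing import Dict, List, Set

# Hypothetical encoding of the four-step payment timeout chain.
CHAIN = [
    {"tool": "Tool_OrderQuery", "needs": ["order_no"], "provides": ["payment_id"]},
    {"tool": "Tool_TraceQuery", "needs": ["payment_id", "time_range"],
     "provides": ["call_chain", "abnormal_node"]},
    {"tool": "Tool_LogQuery", "needs": ["abnormal_node", "time_range"],
     "provides": ["error_logs"]},
    {"tool": "Tool_MiddlewareMonitor", "needs": ["abnormal_node", "time_range"],
     "provides": ["pool_status"]},
]


def validate_chain(steps: List[dict], request_params: Set[str]) -> List[str]:
    """Return a valid call order, or raise ValueError if the chain is broken."""
    producers: Dict[str, str] = {}
    for step in steps:
        for key in step["provides"]:
            producers[key] = step["tool"]

    # For each tool, collect the tools whose output it depends on.
    pending: Dict[str, Set[str]] = {}
    for step in steps:
        upstream: Set[str] = set()
        for key in step["needs"]:
            if key in request_params:
                continue  # supplied by the parsed request itself
            if key not in producers:
                raise ValueError(f"{step['tool']}: no source for parameter '{key}'")
            upstream.add(producers[key])
        pending[step["tool"]] = upstream

    # Repeatedly release tools whose dependencies are all satisfied.
    order: List[str] = []
    while pending:
        ready = [tool for tool, upstream in pending.items() if not upstream]
        if not ready:
            raise ValueError("circular dependency among: " + ", ".join(sorted(pending)))
        for tool in ready:
            order.append(tool)
            del pending[tool]
        for upstream in pending.values():
            upstream.difference_update(ready)
    return order
```

Calling `validate_chain(CHAIN, {"order_no", "time_range"})` returns the four tools in a callable order, while a chain in which two tools need each other's output raises `ValueError`.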

[0039] Step S130: Call the corresponding tools according to the execution chain and obtain the execution results of each tool.

[0040] This step can be performed by the execution module, which calls each tool sequentially according to the execution chain issued by the planning module. Each tool is a pre-packaged business capability unit that can interface with various data sources or service interfaces, including HTTP interfaces, Dubbo services, database queries, log file reading, etc.

[0041] In some implementations, during the invocation process, the execution module maintains a global context object. This global context object is used to pass shared parameters and intermediate calculation results between different tools. The lifecycle of the global context object is bound to the current service request, created when the execution chain starts, and destroyed after all tool invocations are completed and response results are generated. It stores data in key-value pairs and provides a unified read / write interface for different tools. When multiple tools write data to the same key, the default strategy is to overwrite previous writes with later writes. Conflicts can also be avoided by configuring namespaces or field prefixes.
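A minimal sketch of such a context object follows, assuming a Python context manager for the request-bound lifecycle and a `tool.key` prefix as the namespace format. The class name, method names, and prefix format are hypothetical; only the key-value storage, the last-write-wins default, and the optional namespacing come from the paragraph above.

```python
from typing import Any, Dict


class GlobalContext:
    """Request-scoped key-value store shared by the tools of one chain."""

    def __init__(self, request_id: str, use_namespaces: bool = False) -> None:
        self.request_id = request_id
        self.use_namespaces = use_namespaces
        self._data: Dict[str, Any] = {}

    def __enter__(self) -> "GlobalContext":
        # Created when the execution chain starts.
        return self

    def __exit__(self, exc_type, exc, tb) -> None:
        # Destroyed once the response has been generated.
        self._data.clear()

    def write(self, tool: str, key: str, value: Any) -> None:
        # Default: a later write to the same key overwrites the earlier one.
        # With namespaces, each tool writes under its own prefix instead.
        full_key = f"{tool}.{key}" if self.use_namespaces else key
        self._data[full_key] = value

    def read(self, key: str, default: Any = None) -> Any:
        return self._data.get(key, default)

    def snapshot(self) -> Dict[str, Any]:
        return dict(self._data)
```

With `use_namespaces=True`, two tools that both write a `status` key are stored as `Tool_TraceQuery.status` and `Tool_LogQuery.status`, so neither value is lost.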

[0042] During the invocation process, middleware information and system runtime context information are obtained through underlying client dependencies. These underlying client dependencies refer to the various middleware clients, database drivers, service invocation SDKs (Software Development Kits), or infrastructure access interfaces that the intelligent question-answering robot directly relies on when invoking the tool to obtain in-depth, real-time internal system information. They serve as a bridge connecting the tool to the underlying technical system, enabling the tool to penetrate to the code level and runtime environment. When the execution module invokes the tool according to the chain, the tool directly initiates requests to the underlying system through these client dependencies to obtain real-time, dynamic technical data that traditional question-answering systems cannot access. For example, it obtains the real-time connection pool level through the database driver, the complete call trajectory of a request through the tracing client, and the JVM thread stack through JMX. This data ultimately converges into the global context object, providing in-depth technical evidence for root cause analysis.

[0043] Taking the above example, the order query tool is first invoked: the execution module reads the order number from the global context, constructs a request, obtains the payment order number "P001" through the HTTP client of the order service, and writes this payment order number into the global context object. Next, the distributed tracing tool is invoked: this tool reads the payment order number and time range from the global context object, and sends a request to the distributed tracing server through the distributed tracing system's client (an underlying client dependency), obtaining the call chain data related to P001 within that time period. The distributed tracing tool encapsulates a client that interacts with middleware (such as SkyWalking or Zipkin), enabling it to pull distributed tracing data in real time. This data belongs to the system's runtime context information, reflecting the actual flow path and time consumption of requests across multiple services. The call chain data returned by the tool is stored in the global context object.

[0044] Then, a log retrieval tool is invoked: this tool retrieves relevant error logs through a log service client (such as the ELK API) based on the anomaly node identifier (e.g., the name of the service with the longest execution time in the call chain) and time range in the global context object, and stores the log content in the global context object. Finally, a middleware monitoring tool is invoked: this tool uses a database connection pool monitoring client (e.g., directly querying MySQL's information_schema or obtaining the connection pool MBean via JMX) to obtain, in real time, information such as the number of active connections and waiting threads in the connection pool during the anomaly period. This information is also runtime context information that can reveal the underlying resource status. The execution results of all tools are returned in a standardized format and accumulated in the global context object.

[0045] It should be noted that if the execution result returned by a tool in the execution chain can resolve the service request issue, then subsequent tool calls can be stopped, and the execution results returned by the previous tools can be retrieved.
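Sequential invocation with this early stop can be sketched as a simple loop. The sketch makes several assumptions that the text does not specify: each tool is a callable that takes the shared context and returns a dictionary, the tool signals that the issue is resolved through a hypothetical `resolved` flag, and its data is merged into the context under a hypothetical `data` key.

```python
from typing import Any, Callable, Dict, List, Tuple

Tool = Callable[[Dict[str, Any]], Dict[str, Any]]


def run_chain(
    tools: List[Tuple[str, Tool]], context: Dict[str, Any]
) -> List[Tuple[str, Dict[str, Any]]]:
    """Call tools in order; stop once a result resolves the request."""
    results: List[Tuple[str, Dict[str, Any]]] = []
    for name, tool in tools:
        result = tool(context)
        results.append((name, result))
        # Make this tool's output available to the tools that follow.
        context.update(result.get("data", {}))
        if result.get("resolved"):
            break  # later tools are skipped
    return results
```

The results collected up to the stopping point are what would be handed to the mapping module.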

[0046] Step S140: Integrate the execution results returned by each tool, generate and feed back the response results for the service request.

[0047] After the various tools are invoked, the global context object contains multi-source data, including order information, call chain data, error logs, and connection pool status. At this point, the mapping module is responsible for integrating these execution results to generate the final response. The core function of the mapping module is to perform field-level processing on the raw technical data, including data format conversion, semantic standardization conversion, and sensitive information anonymization.

[0048] For example, the raw data returned by the tracing tool may contain timestamps, service names, IP addresses, etc., which the mapping module transforms into a business-readable description: "The request took 5.2 seconds at the payment gateway node (exceeding the normal value of 200ms)". The connection pool metric returned by the middleware monitoring tool is transformed into "The number of active connections in the database connection pool has reached the upper limit (100 / 100)".
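The two transformations in this example can be sketched as small formatting functions. The function names, parameter names, and the wording used for the non-anomalous cases are hypothetical; only the two output sentences quoted above come from the text.

```python
def describe_node_latency(node: str, elapsed_ms: float, normal_ms: float) -> str:
    """Turn a raw span duration into a business-readable sentence."""
    seconds = f"{elapsed_ms / 1000:.1f}"
    if elapsed_ms > normal_ms:
        return (
            f"The request took {seconds} seconds at the {node} node "
            f"(exceeding the normal value of {normal_ms:g}ms)"
        )
    return f"The request took {seconds} seconds at the {node} node"


def describe_pool_usage(active: int, maximum: int) -> str:
    """Turn raw connection pool counters into a business-readable sentence."""
    if active >= maximum:
        return (
            "The number of active connections in the database connection pool "
            f"has reached the upper limit ({active} / {maximum})"
        )
    return (
        "The number of active connections in the database connection pool "
        f"is {active} / {maximum}"
    )
```

With the values from the example, `describe_node_latency("payment gateway", 5200, 200)` and `describe_pool_usage(100, 100)` reproduce the two descriptions quoted above.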

[0049] The mapping module aggregates all the transformed information and combines it with the original question from the parsing results to generate a natural language conclusion, such as: "Order DW2024001 payment timed out because the connection pool was exhausted when the payment gateway accessed the database (100 / 100 active connections). It is recommended to check for database connection leaks or expand the connection pool. See the attachment for detailed call chain and logs." Simultaneously, the system can visualize the input and output parameters of each step of the execution chain for user traceability.

[0050] Ultimately, the intelligent question-and-answer system can provide feedback to users or maintenance personnel through a dialogue interface or alarm channel.

[0051] In the aforementioned implementation process, by using the underlying client dependencies to obtain business data, middleware status, and runtime context information in real time through the calling tools, the system acquires dynamic data access and deep technical information retrieval capabilities. Based on this, semantic structured processing transforms user anomaly queries or system alarms into executable instructions, and the execution chain enables automated collaborative scheduling of multiple tools, replacing the traditional model of manually switching between multiple systems, manually collecting data, and inferring root causes step by step. This mechanism integrates the previously scattered data entry points and complex troubleshooting processes into an automated system, enabling the system to quickly locate code-level root causes of problems, significantly shortening the average time to resolve anomalies, reducing the workload of operations and maintenance personnel, and improving the efficiency of problem troubleshooting.

[0052] Based on the above embodiments, when parsing a service request, the perception module can obtain multi-source context information, which may include: extraction rules of the business domain, field dictionary and parameter template. Then, based on the multi-source context information, the key parameters of the service request are extracted to obtain the extraction results. The extraction results may include business scenarios, functional scenarios and common parameters used for tool calls. Subsequently, the extraction results are processed in a structured manner to obtain a structured parsing result.

[0053] As the entry point of the intelligent question-answering system, the perception module's core objective is to accurately convert user-initiated natural language queries or system-triggered alarm events into structured parsing results that can be directly consumed by downstream modules during the first round of interaction, thereby reducing multiple rounds of follow-up questions and improving overall processing efficiency.

[0054] The execution flow of the perception module is shown in Figure 3. When the perception module receives a service request, it first loads multi-source context information related to the current business scenario from the knowledge base or configuration center. This information is pre-configured or accumulated through historical learning and is used to guide subsequent parameter extraction. The multi-source context information mainly includes three categories: (1) Extraction rules for business domains: Pre-defined intent recognition and parameter extraction strategies for different business domains (such as payment domain, logistics domain, and finance domain). For example, the payment domain focuses on order number, payment order number, channel ID, etc., while the logistics domain focuses on waybill number, logistics provider code, etc. The extraction rules exist in the form of configuration files or DSL (domain-specific language) scripts, defining regular expressions, keyword weights, and semantic parsing templates for recognizing these fields from natural language.

[0055] (2) Field dictionary: A list of parameter fields supported by each business domain, along with their aliases and synonym mappings. For example, the field dictionary defines aliases for "order number" including "order_no", "orderId", and "transaction number", while also indicating the field type (string, integer, time range) and whether it is required. The field dictionary is used to normalize synonyms during parsing, ensuring that different user expressions are mapped to the same standard field.

[0056] (3) Parameter templates: Pre-set parameter structure templates for different functional scenarios define which field combinations are required to execute the function. For example, the parameter template for the "Payment Order Query" function requires the payment order number or order number, time range, and environment (test / production); the parameter template for the "Timeout Diagnosis" function additionally requires the interface name, timeout threshold, etc.
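The field dictionary and parameter templates described above can be sketched as plain configuration data plus two lookups. This is a minimal illustration, not the configuration format of the embodiment: the dictionary layout, the function identifier `payment_order_query`, the field name `pay_order_no`, and the helper names are all assumptions.

```python
# Hypothetical field dictionary for the payment domain (layout is an assumption).
FIELD_DICTIONARY = {
    "order_no": {
        "aliases": ["order number", "order_no", "orderId", "transaction number"],
        "type": "string",
        "required": True,
    },
    "env": {"aliases": ["environment", "env"], "type": "string", "required": False},
}

# Hypothetical parameter templates. Each inner list is a group of alternatives:
# at least one field from every group must be present to execute the function.
PARAMETER_TEMPLATES = {
    "payment_order_query": [["pay_order_no", "order_no"], ["time_range"], ["env"]],
}


def normalize_field(name):
    """Map a user-facing alias to its standard field name, or None if unknown."""
    wanted = name.strip().lower()
    for standard, spec in FIELD_DICTIONARY.items():
        if wanted in (alias.lower() for alias in spec["aliases"]):
            return standard
    return None


def missing_groups(function, params):
    """Return the template groups for which no field has been supplied yet."""
    return [
        group
        for group in PARAMETER_TEMPLATES[function]
        if not any(field in params for field in group)
    ]
```

With this layout, `normalize_field("orderId")` yields the standard field `order_no`, and `missing_groups` reports which required groups still need to be completed from context or defaults.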

[0057] Based on the acquired multi-source context information, the perception module performs layered extraction of service requests, refining the parsing results layer by layer. Layered extraction includes three sub-layers: (1) Business Scenario Extraction: First, determine the major business domain to which the service request belongs. The perception module matches user input (e.g., "Why did order DW2024001 fail to pay?") or system alarms (e.g., "Payment interface timeout alarm") with the extraction rules of the business domain. For example, by matching the rule set of the payment domain through the keywords "payment failure" and "payment interface", the business scenario is determined to be "payment". This layer determines which business domain's agent (i.e., Agent) will be called subsequently.

[0058] (2) Functional Scenario Extraction: Within the defined business domain, further identify the specific operation the user wants to perform. The perception module combines the field dictionary and functional templates to analyze the semantic focus of the request. For example, for the statement "payment failed", combined with the context (such as containing the keyword "timeout") and the functional template library, the functional scenario "diagnose_payment_timeout" is matched. The identification of functional scenarios relies on a predefined intent classifier (such as a BERT-based text classification model) or few-shot reasoning from a large model.

[0059] (3) Common Parameter Extraction: This is the finest-grained extraction layer, aiming to extract key fields from service requests that can be directly used for tool calls. The perception module traverses all fields supported by the business domain in the field dictionary and extracts values from the service request using the corresponding extraction rules (such as regular expression matching or an entity recognition model). For example, the order number "DW2024001" is extracted from "order DW2024001". If the time range is not directly provided in the request, it is automatically completed according to the context or default rules (for example, alarm events carry their own timestamps, and user queries default to the most recent 30 minutes). If environment parameters are involved and not specified, the default is "prod" (production environment). Basic validation can also be performed during the extraction process, such as checking whether the order number format conforms to the specifications and whether the time range is reasonable.

[0060] After the layered extraction is complete, the perception module assembles the three layers of results into a structured parsing result. This result can use structured data formats such as JSON or Protocol Buffers, containing a clear field hierarchy to facilitate parsing by subsequent modules. The structured parsing result can include at least three parts: scenario information (identifiers of business scenarios used for routing to the correct processing unit), function points (identifiers of functional scenarios used for matching specific execution modules or orchestration logic), and parameter key-value pairs (a collection of key-value pairs of common parameters).
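The three extraction layers and the assembled result can be sketched as follows. This is a simplified illustration under stated assumptions: keyword matching stands in for the intent classifier or large model, the order number pattern (two letters followed by seven digits) is inferred only from the example "DW2024001", and the key names `scenario`, `function`, and `params` as well as the `"30m"` notation are hypothetical.

```python
import re

# Assumed keyword rules per business domain (a stand-in for the extraction rules).
DOMAIN_KEYWORDS = {"payment": ["payment", "pay"], "logistics": ["waybill", "logistics"]}
# Assumed order number format, inferred from the single example "DW2024001".
ORDER_NO_PATTERN = re.compile(r"\b[A-Z]{2}\d{7}\b")


def extract_business_scenario(text):
    """Layer 1: pick the business domain whose keywords appear in the request."""
    lowered = text.lower()
    for domain, keywords in DOMAIN_KEYWORDS.items():
        if any(keyword in lowered for keyword in keywords):
            return domain
    return None


def extract_function(text, scenario):
    """Layer 2: keyword stand-in for the intent classifier or large model."""
    if scenario == "payment" and "timeout" in text.lower():
        return "diagnose_payment_timeout"
    return None  # a real classifier would resolve other intents


def extract_params(text):
    """Layer 3: explicit fields plus the defaults described in the text."""
    params = {"time_range": "30m", "env": "prod"}  # most recent 30 minutes, production
    match = ORDER_NO_PATTERN.search(text)
    if match:
        params["order_no"] = match.group(0)
    return params


def parse_request(text):
    """Assemble the three layers into one structured parsing result."""
    scenario = extract_business_scenario(text)
    return {
        "scenario": scenario,
        "function": extract_function(text, scenario),
        "params": extract_params(text),
    }
```

The returned dictionary serializes directly with `json.dumps`, which matches the JSON option mentioned for the structured parsing result.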

[0061] The structured parsing result is then passed to the planning module as the basis for subsequent execution chain orchestration.

[0062] In the above implementation process, by acquiring multi-source context information, the accurate transformation from the original service request to the structured execution instruction is achieved, laying the data foundation for the automated anomaly investigation process of the entire intelligent question answering system, thereby reducing multiple rounds of follow-up questions and improving the overall processing efficiency.

[0063] In some implementations, if the service request includes an anomaly query request initiated by a user, the aforementioned multi-source context information may also include the user's historical preferences, commonly used query criteria, and parameters from the most recent session.

[0064] When a service request is an anomaly query request initiated by a user, the perception module, in addition to loading the extraction rules, field dictionary, and parameter templates preset for the business domain, also obtains personalized context information related to the current user. This information originates from the session context storage and is used to parse the user's query accurately, avoid asking the user to re-enter previously provided information, and improve the efficiency of the first round of interaction.

[0065] Upon receiving a user query, the perception module first retrieves the user's personalized context information from the distributed cache or session storage based on the user identifier (such as user ID or session ID). This information mainly includes three categories: User historical preferences: Records long-term preferences exhibited by users in their historical interactions. For example, if an operations and maintenance engineer frequently focuses on the "Payment Center" business, the system automatically marks that user's default business domain as "payment"; if 80% of a user's historical queries specify the "production environment," then "env:prod" becomes that user's default environment preference. Historical preferences are stored as key-value pairs.

[0066] Commonly Used Query Templates: The system compiles frequently used query condition combinations to create "query templates." For example, if a user frequently queries "order timeout status within the last hour," the system can abstract this pattern into a template: {"function": "query_timeout_orders", "params": {"time_range": "1h", "status": "timeout"}}. When a user's new input contains ambiguous intent (such as "check the timeout status again"), the system can quickly complete the parameters based on this template.

[0067] Recent Session Parameters: Records parameters confirmed in the previous rounds of interaction during the current session to maintain session continuity. For example, if the user provided order number "DW2024001" in the previous round, and in this round they only say "check its payment logs," the system does not need to retrieve the order number again and can reuse it directly. This information exists in a temporary cache and automatically expires when the session ends.

[0068] After loading multi-source context information, the perception module performs hierarchical extraction of user queries, dynamically integrating personalized information during the extraction process to achieve intent calibration and parameter completion.

[0069] For example, if a user's query is vague (such as "there's been another problem"), the perception module first refers to the business scenario in the parameters of the most recent session, and falls back to the default scenario in the user's historical preferences (such as "payment") if none is available. Similarly, if the user only enters "timeout", the system combines this with the high-frequency scenario "payment" in the historical preferences and determines the business scenario as the payment domain and the functional scenario as "diagnose_timeout".

[0070] During the parameter extraction phase, the perception module traverses the field dictionary to attempt to extract explicit parameters from the query, while automatically filling in missing required parameters by referring to personalized information: Environment parameters: If the query does not specify an environment (such as "test" or "production"), and the user's history preferences contain the default environment "prod", then env: prod will be automatically completed.

[0071] Time range: If the query does not provide a specific time point (such as "the recent failure"), the time range parameter time_range will be automatically generated by combining the timestamp of the most recent session or the default time window in the user's commonly used query criteria (such as "the last 30 minutes"). For example, "just now" will be parsed as "15 minutes before the current time".

[0072] Order number / user ID reuse: If the current query only mentions "its payment order", the perception module will look up the extracted order number "DW2024001" or user ID from the parameters of the most recent session and automatically associate them to generate common parameters.

[0073] Query criteria matching: If the user input is highly similar to a commonly used query template (such as "the same old problem"), the perception module directly invokes the matching template and merges the preset parameters in the template (such as the function point "query_timeout_orders" and the parameter "status: timeout") with the current input to generate a structured extraction result.

[0074] After the layered extraction is completed, the perception module assembles the extraction results, which incorporate personalized contextual information, into a structured parsing result. This result not only includes information directly extracted from the current query, but also explicitly labels which parameters are completed through personalized context (optional for subsequent interpretability display).
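The parameter completion and provenance labelling described above can be sketched as below. The function name, the provenance labels, and the lookup order (current query, then most recent session, then historical preferences) are assumptions chosen to mirror the fallback order in the text.

```python
def complete_params(explicit, session_params, preferences, required):
    """Fill missing required parameters and record where each value came from.

    Lookup order (assumed): current query, most recent session, historical preferences.
    """
    params = dict(explicit)
    provenance = {name: "query" for name in explicit}
    for name in required:
        if name in params:
            continue
        if name in session_params:
            params[name] = session_params[name]
            provenance[name] = "recent_session"
        elif name in preferences:
            params[name] = preferences[name]
            provenance[name] = "historical_preference"
    return params, provenance
```

For a query such as "check its payment logs", an empty `explicit` dictionary combined with a session holding `order_no` and a preference holding `env` yields both values, each labelled with its source, so that a later display step can explain which parameters were inferred.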

[0075] Through the above mechanism, the perception module achieves personalized and accurate parsing of user queries, significantly reducing the cost of multi-round interactions and enabling the intelligent question-answering system to adapt more naturally to user habits.

[0076] Based on the above embodiments, in determining the matching execution link according to the parsing results, the generation path of the execution link can be selected according to the scenario complexity. Specifically, if the scenario complexity corresponding to the parsing results exceeds a preset threshold, the execution link is generated through a large model; if the scenario complexity corresponding to the parsing results is lower than the preset threshold, the matched standardized execution template is used as the execution link.

[0077] This approach introduces a dual-mode generation mechanism based on scenario complexity. The complexity estimator quantifies and scores the parsing results, and depending on whether the score exceeds a preset threshold, the system adopts either dynamic generation by a large model or standardized template matching, balancing processing efficiency against problem-solving depth.

[0078] After receiving the structured parsing results, the planning module first inputs them into the complexity estimator. This estimator calculates the complexity score of the current scenario from multiple dimensions, with the score range normalized to between 0 and 1. Evaluation dimensions may include: Functionality Baseline Weight: Presets the baseline complexity for each type of functionality. For example, the complexity of "Single Order Query" is 0.2, "Multi-Data Source Aggregation Query" is 0.5, "Root Cause Diagnosis" is 0.8, and "Unknown Intent Handling" is 0.9. If the functionality (Payment Timeout Diagnosis) belongs to the Root Cause Diagnosis category, its baseline weight is set to 0.8.

[0079] Number of parameters and constraints: The more parameters the common parameters `params` contain, or the more advanced constraints such as time range, pagination, and multiple condition combinations exist, the higher the complexity. For example, if `params` only contains `order_no`, the complexity increases by 0.05; if it contains multiple parameters such as `time_range`, `env`, and `interface`, the complexity increases by 0.15.

[0080] Historical execution statistics: Query the average number of tool calls for this function in the knowledge base during historical executions. If the average number of calls exceeds 5, the complexity increases by 0.1; if exceptions or timeouts occur frequently in historical executions, the complexity increases by an additional 0.05.

[0081] Semantic confidence: If the perception module adds a confidence indicator (such as confidence: 0.7) to the parsing result, when the confidence is lower than 0.8, it is considered that the intent may be ambiguous, and the complexity increases by 0.1.

[0082] The complexity score can be calculated based on the above dimensions and then compared with a preset threshold. For example, with the preset threshold set to 0.5, a score that exceeds the threshold indicates a complex scenario, which enters the large model dynamic generation mode; a score that does not exceed it indicates a simple scenario, which enters the standardized template matching mode.
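The scoring dimensions and the threshold comparison can be sketched as follows, using the example weights from the text. The category keys, the clamping of the sum to 1.0, and the treatment of a score exactly equal to the threshold as a simple scenario are assumptions, since the text only states that the score is normalized to the range 0 to 1.

```python
# Baseline weights taken from the examples in the text; the keys are hypothetical.
BASELINE_WEIGHTS = {
    "single_order_query": 0.2,
    "multi_source_aggregation": 0.5,
    "root_cause_diagnosis": 0.8,
    "unknown_intent": 0.9,
}


def complexity_score(category, params, avg_tool_calls=0.0,
                     unstable_history=False, confidence=None):
    """Sum the per-dimension increments and clamp the result to [0, 1]."""
    score = BASELINE_WEIGHTS[category]
    if len(params) == 1:
        score += 0.05          # a single parameter such as order_no
    elif len(params) > 1:
        score += 0.15          # several parameters or constraints
    if avg_tool_calls > 5:
        score += 0.1           # historically many tool calls
    if unstable_history:
        score += 0.05          # frequent exceptions or timeouts in history
    if confidence is not None and confidence < 0.8:
        score += 0.1           # possibly ambiguous intent
    return round(min(score, 1.0), 2)


def choose_mode(score, threshold=0.5):
    """Route to large model generation only when the score exceeds the threshold."""
    return "llm_generation" if score > threshold else "template_matching"
```

For instance, a single order query carrying only `order_no` scores 0.25 and is routed to template matching, while a root cause diagnosis with three parameters and a confidence of 0.7 is clamped to 1.0 and routed to large model generation.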

[0083] In some implementations, the complexity of a scenario can also be identified based on the aforementioned business and functional scenarios. For example, if only one functional point is involved, meaning only one tool needs to be called to return a result, it can be identified as a simple scenario; the rest are complex scenarios. The definitions and examples of each scenario are shown in the table below:

[0084] For standardized template matching, the planning module can use the function points in the parsing results as the primary key and retrieve matching templates from the standardized execution template library. The template library is pre-built through manual configuration or from historical links. Each template may include: template ID, applicable function points, tool call sequence, parameter mapping rules (defining how to map common parameters in the parsing results to the inputs of each tool, with support for static values, variable references, and simple expressions), and exception handling strategies (such as the number of retries when a call fails, degradation schemes, etc.).

[0085] The planning module iterates through the template library, searching for templates whose applicable function point fields completely match the function points in the current parsing result. If found, the template is instantiated directly: the common parameters in the parsing result are filled into the tool call sequence according to the mapping rules to generate an executable link object.

[0086] If no perfectly matching template is found, the templates are sorted by similarity (e.g., based on text similarity of function descriptions) and the template with the highest similarity is selected for adaptation. The tool order or parameter mapping may be adjusted if necessary. If all similarities are below the threshold (e.g., 0.7), the system is downgraded to dynamic generation mode.
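The lookup order described above (exact match, then most similar template, then fallback to dynamic generation) can be sketched as follows. The template entries are hypothetical, and `difflib.SequenceMatcher` from the Python standard library is used here only as one possible text similarity measure; the text does not name a specific one.

```python
from difflib import SequenceMatcher

# Hypothetical template library entries.
TEMPLATES = [
    {"id": "T1", "function": "query_payment_order",
     "description": "query a payment order by order number",
     "tools": ["order_query"]},
    {"id": "T2", "function": "query_timeout_orders",
     "description": "list orders that timed out in a time range",
     "tools": ["order_query", "log_search"]},
]


def similarity(a, b):
    """Text similarity in [0, 1] between two function descriptions."""
    return SequenceMatcher(None, a, b).ratio()


def select_template(function, description, threshold=0.7):
    """Return (template, mode): exact match, adapted nearest match, or fallback."""
    for template in TEMPLATES:
        if template["function"] == function:
            return template, "exact"
    best = max(TEMPLATES, key=lambda t: similarity(description, t["description"]),
               default=None)
    if best is not None and similarity(description, best["description"]) >= threshold:
        return best, "adapted"
    return None, "dynamic_generation"
```

A caller that receives the `"adapted"` mode would still need to adjust the tool order or parameter mapping before instantiating the template, which this sketch leaves out.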

[0087] For complex scenarios where the complexity exceeds the threshold, the planning module dynamically generates the execution chain by calling the large model. The planning module can assemble the following information into the input prompt for the large model: (1) the current structured parsing results; (2) a list of available tools with a detailed description of each tool, including tool name, function description, input parameter format, output result format, and usage examples; (3) few-shot examples that demonstrate how chains were generated for similar complex problems in the past, such as "For payment timeout diagnosis, it is usually necessary to call the order query tool, the link tracing tool, the log retrieval tool, and the middleware monitoring tool in sequence"; and (4) constraints, such as "The generated link must be output in JSON array format, with each element containing tool_name, input_params, and error_handler".

[0088] The large model can perform inference based on prompts and output a multi-step execution chain. The planning module can perform syntax validation, circular dependency detection, and resource estimation on the chain output by the large model. If a missing step, incorrect parameter reference, or potential infinite loop is found, it will trigger regeneration (e.g., retrying after adjusting the prompts) or request manual intervention. After the validation passes, the chain is marked as "dynamically generated" and delivered to the execution module.
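The syntax validation step can be sketched as a function that returns a list of problems for a generated chain. It checks only the JSON array format and the three element keys named in the constraint above, plus whether each tool is registered; circular dependency detection and resource estimation are omitted. The function name and message wording are assumptions.

```python
import json

# Keys required in every chain element, per the prompt constraint in the text.
REQUIRED_KEYS = {"tool_name", "input_params", "error_handler"}


def validate_chain(raw, available_tools):
    """Return a list of problems; an empty list means the checks passed."""
    try:
        chain = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg}"]
    if not isinstance(chain, list) or not chain:
        return ["chain must be a non-empty JSON array"]
    problems = []
    for index, step in enumerate(chain):
        if not isinstance(step, dict):
            problems.append(f"step {index}: not a JSON object")
            continue
        missing = REQUIRED_KEYS - step.keys()
        if missing:
            problems.append(f"step {index}: missing {sorted(missing)}")
        tool_name = step.get("tool_name")
        if tool_name is not None and tool_name not in available_tools:
            problems.append(f"step {index}: unknown tool {tool_name!r}")
    return problems
```

A non-empty result could be fed back into an adjusted prompt for regeneration, or escalated for manual intervention, as the paragraph describes.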

[0089] For dynamically generated links, if their effectiveness and versatility are subsequently verified, they can be stored in a standardized execution template library after manual review for reuse in similar scenarios.

[0090] In the above implementation process, through the dual-mode mechanism, the planning module can not only respond quickly to high-frequency and simple scenarios, but also maintain deep reasoning ability for complex and unknown scenarios, ensuring the efficiency and effectiveness of the intelligent question answering system in anomaly investigation tasks.

[0091] In some implementations, the execution status of each tool call can be monitored during the process of calling each tool. If the execution status indicates that the call has failed or timed out, a preset exception handling strategy is triggered. The exception handling strategy may include retrying, downgrading, or calling alternative tools.

[0092] When the execution module begins executing a tool call, it can create an execution status object for each tool call instance to record the lifecycle status of the call in real time. The execution status is mainly divided into the following categories: pending, executing, success, failure, timeout, and circuit-broken.

[0093] The execution module can initiate tool calls asynchronously and non-blockingly, registering callback functions for each call or using the Future pattern to wait for results. Simultaneously, a separate timeout monitoring thread is started to maintain a timer for each executing call. When a call does not return within a preset timeout threshold (e.g., 5 seconds), the timeout monitoring thread actively interrupts the call and marks its status as "timeout".
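The timeout behaviour can be sketched with Python's `asyncio`. Note that this swaps the separate timeout monitoring thread described above for `asyncio.wait_for`, which cancels the awaited call when the deadline passes; it is an illustration of the status transitions, not the embodiment's threading design. The status dictionary layout and the function name are assumptions.

```python
import asyncio


async def call_tool(tool, params, timeout_s=5.0):
    """Await one asynchronous tool call and record its final execution status."""
    state = {"status": "executing", "result": None, "error": None}
    try:
        # wait_for cancels the underlying call if it exceeds timeout_s.
        state["result"] = await asyncio.wait_for(tool(params), timeout=timeout_s)
        state["status"] = "success"
    except asyncio.TimeoutError:
        state["status"] = "timeout"
    except Exception as exc:  # any other error counts as a failed call
        state["status"] = "failure"
        state["error"] = repr(exc)
    return state
```

Here `tool` is any coroutine function taking the parameter dictionary; several calls can be awaited concurrently with `asyncio.gather` when the chain allows it.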

[0094] Before the execution chain begins, the planning module passes the exception handling strategy associated with each tool call as metadata to the execution module. These strategies are defined in a configurable manner, supporting settings at the tool level, the functional scenario level, or the global default rules. Specifically, they include: Retry: When a tool call fails or times out, the call is automatically re-initiated. This mechanism can be configured with parameters such as the maximum number of retries, retry interval, and retry conditions.

[0095] Degradation: When a tool call fails and no retry is configured, or still fails after all retries have been exhausted, a pre-defined degradation result is returned to ensure the process can continue. The degradation result can be a static default value (such as an empty list or the default string "Data is temporarily unavailable"), cached data (reading the result of the last successful call from the local cache or Redis), or simplified logic (calling a lighter-weight alternative interface to retrieve only the core fields).

[0096] Using Alternative Tools: When the primary tool is unavailable, other tools with similar functionality are invoked as alternatives. For example, if the standard order query tool (based on MySQL) times out, a backup order query tool (based on Elasticsearch or a read-only replica) can be switched to. Alternative tools must be declared in advance during tool registration, and their input / output formats must be compatible with the primary tool.

[0097] The execution module calls each tool sequentially according to the execution chain. During each tool call, the execution module continuously monitors its execution status and triggers corresponding exception handling strategies based on status changes. If a tool call returns a clear failure status (such as HTTP 500), or the timeout monitoring thread marks it as "timeout," the execution module immediately interrupts the current call (e.g., closes the HTTP connection, cancels the Future task) and enters the exception handling process.

[0098] The execution module reads the tool's pre-configured exception handling strategy and makes decisions based on the following priorities: Check the retry strategy: If the current number of failures (including this one) does not exceed the maximum number of retries, then a retry is performed. The execution module waits for the specified retry interval, then re-initiates the call and increments the retry count by 1. During the retry process, the parameters in the global context remain unchanged.

[0099] After retries are exhausted, an alternative tool strategy is checked: If the retry count is exhausted and the call still fails, or if no retry strategy is configured, an alternative configuration is checked. If one exists, the execution module dynamically modifies the execution chain, replacing the current tool with the alternative tool, and re-initiates the call using the same input parameters. The invocation of the alternative tool is also subject to its own timeout and retry strategy.

[0100] Final Degradation: If both retries and alternatives fail, or if neither retries nor alternatives are configured, the degradation strategy is executed. The execution module generates degradation results based on the degradation configuration, writes them to the global context (possibly with error flags), and continues executing subsequent tools instead of interrupting the entire process. This design ensures that even if some tools fail, the process can still collect as much information as possible, providing some basis for the final comprehensive judgment.
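The priority order of retry, alternative tool, and final degradation can be sketched as one function. For brevity the alternative tool reuses the primary tool's retry count, whereas the text allows it its own timeout and retry strategy. The policy keys (`max_retries`, `retry_interval_s`, `alternative`, `degraded_result`) are hypothetical names.

```python
import time


def run_with_policy(primary, params, policy, sleep=time.sleep):
    """Try the primary tool with retries, then the alternative, then degrade."""
    events = []
    candidates = [("primary", primary)]
    if policy.get("alternative") is not None:
        candidates.append(("alternative", policy["alternative"]))
    attempts = 1 + policy.get("max_retries", 0)
    for label, tool in candidates:
        for attempt in range(attempts):
            try:
                result = tool(params)  # input parameters stay unchanged on retry
                return {"result": result, "degraded": False, "events": events}
            except Exception as exc:
                events.append({"tool": label, "attempt": attempt + 1,
                               "error": repr(exc)})
                if attempt + 1 < attempts:
                    sleep(policy.get("retry_interval_s", 0.0))
    # Retries and alternatives exhausted: return the configured degradation result.
    return {"result": policy.get("degraded_result"), "degraded": True,
            "events": events}
```

The `events` list plays the role of the exception event records, so a later step can report that some data sources failed but were covered by a backup or a degraded result.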

[0101] After any exception handling strategy is triggered, the execution module generates an exception event record and appends it to the execution status of that step, ultimately merging it into the end-to-end record. The exception event record contains the following fields: exception type, exception details, trigger strategy, timestamp, and duration.

[0102] This record serves two purposes: firstly, it facilitates subsequent auditing and link analysis; secondly, it informs users appropriately when providing final feedback that "some data sources experienced anomalies but have been restored through backup solutions," thereby enhancing the transparency of the conclusions.

[0103] Through the aforementioned execution status monitoring and anomaly handling mechanisms, the fault tolerance of the intelligent question-answering system in complex distributed environments is enhanced, ensuring that even when some services are unstable, the system can still collect as much valid information as possible and provide users with meaningful feedback or clear error messages.

[0104] The execution flow of the execution module and the mapping module can be as shown in Figure 4. In some implementations, in the above-described method of integrating the execution results returned by various tools, the execution results returned by each tool can first be processed at the field level to generate standardized output results. Field-level processing includes at least one of data format conversion, semantic standardization, and sensitive information desensitization. Then, the standardized output results are integrated to generate and feed back the response results for the service request.

[0105] In the process of integrating the execution results returned by various tools to generate the final response, the mapping module assumes the core responsibility of data governance and semantic unification. Located at the end of the execution chain, this module receives the raw execution results returned from various tools, and through multi-type processing at the field level, transforms the technology-driven raw data into standardized output that is business-readable, consistent in definition, and secure and compliant. Finally, the output module integrates and feeds the results back to the user.

[0106] Before performing field-level processing, the mapping module first loads pre-configured field-level processing rules for the current business scenario. These rules are defined through configuration files or a visual interface and specify the processing type for each field that a tool may return. This embodiment supports the following three core processing types, which can be combined and configured per field: (1) Data format conversion: Converting raw data from a technical format to a business-friendly display format. For example, converting the timestamp "1742572800000" to "2025-03-21 16:00:00"; mapping enumeration values "0" and "1" to "success" and "failure"; converting the byte size "1048576" to "1MB". The format conversion rules are based on the data type of the field and the business display requirements.

[0107] (2) Semantic standardization transformation: Transforming obscure technical terms or code-level information in the original data into natural language descriptions that business personnel can understand. For example, transforming the database error code "ORA-00942" into "table or view does not exist". Semantic standardization relies on a predefined terminology mapping library or the real-time rewriting capability of large models.

[0108] (3) Sensitive Information Desensitization: Sensitive fields that match preset rules in the execution results returned by each tool are replaced or hidden to ensure that the output results meet data security compliance requirements. For example, the ID number "320101100001011234" is desensitized to "320101********1234"; the mobile phone number "16600138000" is desensitized to "166****8000"; and the internal IP address "10.23.45.67" is desensitized to "10.23.*.*". Desensitization rules support multiple methods such as regular expression matching, fixed-length masks, and keyword replacement.

[0109] In some implementations, different tools may be configured with different desensitization rules. Therefore, based on the execution results of each tool, the desensitization rules configured by that tool can be obtained first, and then desensitization processing can be performed according to those rules.

[0110] If a field is not configured with any processing type, the mapping module will not expose the field by default to avoid the spread of irrelevant information or the accidental output of sensitive data.
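The three processing types and the default of not exposing unconfigured fields can be sketched as a per-field rule table. The field names and helper functions are hypothetical. The timestamp is rendered in UTC, which is the assumption under which "1742572800000" yields "2025-03-21 16:00:00", and the masked phone number and IP address are assumed to take the forms "166****8000" and "10.23.*.*".

```python
from datetime import datetime, timezone


def format_epoch_ms(value):
    """Render an epoch-millisecond timestamp as 'YYYY-MM-DD HH:MM:SS' in UTC."""
    moment = datetime.fromtimestamp(int(value) / 1000, tz=timezone.utc)
    return moment.strftime("%Y-%m-%d %H:%M:%S")


def mask_middle(value, keep_head, keep_tail):
    """Keep the first and last characters and replace the middle with asterisks."""
    hidden = len(value) - keep_head - keep_tail
    if hidden <= 0:
        return "*" * len(value)  # too short to keep anything safely
    return value[:keep_head] + "*" * hidden + value[len(value) - keep_tail:]


def mask_ip(value):
    """Keep the first two octets of an IPv4 address."""
    return ".".join(value.split(".")[:2] + ["*", "*"])


# Hypothetical per-field rules; fields without a rule are not exposed.
FIELD_RULES = {
    "created_at": format_epoch_ms,
    "status": lambda v: {"0": "success", "1": "failure"}[v],
    "id_card": lambda v: mask_middle(v, 6, 4),
    "mobile": lambda v: mask_middle(v, 3, 4),
    "host_ip": mask_ip,
}


def process_record(record):
    """Apply the configured rule to each field and drop unconfigured fields."""
    return {name: FIELD_RULES[name](value)
            for name, value in record.items() if name in FIELD_RULES}
```

Per-tool desensitization rules could be modelled by keeping one such rule table per tool and selecting it by tool name before calling `process_record`.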

[0111] After the execution module completes all tool calls, the raw execution results returned by each tool are aggregated in the global context object. The mapping module iterates through each tool output in the global context and processes them one by one according to the field-level processing rules.

[0112] After completing field-level processing of all tool outputs, the mapping module aggregates the standardized outputs of each tool and combines them with the original question from the structured parsing results to generate the final response. The aggregation process includes: Information filtering and sorting: Based on the functional scenario, determine which outputs are the core conclusions and which are supporting evidence. For example, in a timeout diagnosis scenario, the root cause "connection pool exhaustion" should be the primary conclusion, call chain timeout data should be used as procedural evidence, and the original logs should be used as supplementary details.

[0113] Natural language generation: It splices the filtered information into coherent natural language paragraphs, and calls large models to summarize or polish the information when necessary.

[0114] Additional traceability information: Optionally, an execution chain summary can be included in the response, showing an overview of the input and output of each tool call, allowing users to trace the conclusions.

[0115] Finally, the output module feeds back the response results to the user or maintenance personnel through a dialog interface, alarm channel, or work order system.

[0116] Through the above field-level processing mechanism, the mapping module ensures that the output information is not only safe and compliant, but also has business readability and credibility.

[0117] In the above-described method of processing the execution results returned by each tool at the field level, the execution results returned by each tool can be processed according to the field-level processing rules corresponding to the field type to generate standardized processing results.

[0118] In this approach, differentiated processing strategies can be predefined for fields with different business meanings and data structure characteristics. Through an automated process of type recognition and strategy matching, standardized outputs that conform to both business semantics and security and compliance requirements can be generated.

[0119] Before performing field-level processing, the mapping module first loads a predefined field type system. This system is a categorized directory that classifies the various fields that the tool may return into several types according to their business meaning and data structure characteristics. In this embodiment, the field type system includes, but is not limited to, the following categories: Direct data types include strings, integers, and floating-point numbers. These types represent the original technical form of fields.

[0120] Enumerated status types (enum): such as order status, payment channel. These fields need to map code values to descriptive text that business users can read.

[0121] Date and time data types (time, date): such as creation time, update time, log timestamp. These fields require a standardized format conversion (e.g., converting ISO8601 to "YYYY-MM-DD HH:MM:SS").

[0122] Sensitive information types (masks): such as ID card numbers, mobile phone numbers, bank card numbers, and internal IP addresses. These fields must be masked according to the masking rules.

[0123] Composite structure types (json, jsonArray): such as JSON objects, arrays, and lists of key-value pairs. These types of fields may require recursive processing of their inner subfields.

[0124] In some implementations, field types may also include numerical measurement types, such as amount, duration in milliseconds, and number of connections, which require unit conversion, formatted display (for example, retaining a fixed number of decimal places), or threshold alarm coloring; and technical diagnostic types, such as exception stack traces, error codes, and SQL statements, which require semantic simplification or terminology explanation to transform obscure technical information into business-readable descriptions.

[0125] The field type system is stored in the form of configuration files or database tables, with each type associated with a set of default processing rules. For example, "enumerated state type" is associated with the "code-description mapping table," and "sensitive information type" is associated with the "regular expression desensitization rule library." The specific correspondence between field types and their field-level processing rules is shown in the table below:

[0126] After the execution module completes all tool calls, the raw execution results returned by each tool are aggregated in the global context object. The mapping module iterates through each tool output in the global context, performing type identification and rule matching for each field. Then, based on the matched rules, the mapping module performs specific processing operations on each field.

[0127] Through the above mechanism of classifying and processing by field type, fields of the same type can use consistent rules, avoiding the need to write separate processing code for each field. Furthermore, when adding a new field type, only the type and processing rules need to be defined, without modifying the core processing flow.
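The extensibility claim above can be illustrated with a small handler registry keyed by field type, where adding a type means registering one more handler and leaving the core loop untouched. The registry, decorator, and schema layout are assumptions; only the type names `enum`, `json`, and `jsonArray` come from the text.

```python
HANDLERS = {}


def handler(field_type):
    """Register a processing function for one field type."""
    def register(func):
        HANDLERS[field_type] = func
        return func
    return register


@handler("string")
def _passthrough(value, spec):
    return value


@handler("enum")
def _enum(value, spec):
    # Map a code value to its description; unknown codes are shown unchanged.
    return spec["mapping"].get(str(value), str(value))


@handler("json")
def _json(value, spec):
    return process_fields(value, spec["fields"])  # recurse into sub-fields


@handler("jsonArray")
def _json_array(value, spec):
    return [process_fields(item, spec["fields"]) for item in value]


def process_fields(record, schema):
    """Core loop: look up each field's type and apply the registered handler."""
    return {name: HANDLERS[spec["type"]](record[name], spec)
            for name, spec in schema.items() if name in record}
```

Supporting a new type, for example a masking type, would only require another `@handler(...)` function; `process_fields` itself does not change.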

[0128] Based on the above embodiments, after calling the corresponding tools according to the execution chain, the input parameters, intermediate results and interaction logic of each tool call in the execution chain can also be obtained to generate a full-chain record.

[0129] When the planning module delivers the execution chain to the execution module, the execution module first creates a unique execution instance ID for this execution and initializes a tracing context. This context is a structured data container used throughout the entire execution process to accumulate and record detailed information about each step. The tracing context pre-contains basic information about the execution chain, including the chain ID, the associated service request ID, the start timestamp, and the complete tool call sequence template.

[0130] For example, in a payment timeout diagnostic scenario, the execution chain issued by the planning module includes four tool call steps: order query tool, tracing tool, log retrieval tool, and middleware monitoring tool. The execution module expands this into an executable task queue and reserves a recording slot for each step.
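
As a rough illustration of this initialization step, the sketch below creates an execution instance ID and a tracing context with one reserved slot per tool. The class names, field names, and tool identifiers are assumptions made for the example, not structures defined in this application.

```python
import time
import uuid
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class StepSlot:
    """Recording slot reserved for one tool call in the chain."""
    tool: str
    status: str = "pending"
    record: Optional[Dict[str, Any]] = None


@dataclass
class TracingContext:
    """Structured container that accumulates details of every step."""
    chain_id: str
    request_id: str
    execution_id: str = field(default_factory=lambda: uuid.uuid4().hex)
    started_at: float = field(default_factory=time.time)
    slots: List[StepSlot] = field(default_factory=list)


def init_tracing(chain_id: str, request_id: str,
                 tool_sequence: List[str]) -> TracingContext:
    """Create the context and reserve one slot per step, in chain order."""
    ctx = TracingContext(chain_id=chain_id, request_id=request_id)
    ctx.slots = [StepSlot(tool=name) for name in tool_sequence]
    return ctx


ctx = init_tracing(
    "chain-payment-timeout", "req-1",
    ["order_query", "tracing", "log_retrieval", "middleware_monitoring"],
)
```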

[0131] The execution module calls each tool sequentially in the order defined by the execution chain, capturing full-chain information synchronously with each tool call. The captured information is divided into three categories. Input parameters: Records the actual input parameters passed when calling the tool. These parameters may come from the common parameter fields of the structured parsing result, or from intermediate results written into the global context by previous steps. Before initiating the call, the execution module serializes the final assembled request parameters (such as an HTTP request body or a Dubbo request object) into JSON format and stores them in the tracing context along with the call timestamp.

[0132] Intermediate results: Records the raw results returned by the tool after execution. These results are stored in the global context for use by subsequent tools, and a complete copy is also written into the tracing context. Intermediate results may include order information, call chain data, log lists, or monitoring metrics.

[0133] Interaction logic: Records the data flow and dependencies between tools. Specifically, this includes which key-value pairs this step read from the global context, the key under which the output of this step was written, and the execution status of this step (success, failure, or timeout). If an exception occurs, the exception type, error code, and stack trace can also be recorded.
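
The three categories of captured information can be sketched as a single wrapper around a tool call. This is a simplified illustration: the function name `run_step`, the record keys, and the use of a plain dictionary as the global context are assumptions, and each tool is assumed to be a Python callable that accepts keyword arguments whose keys are already present in the global context.

```python
import copy
import json
import time
import traceback
from typing import Any, Callable, Dict, List


def run_step(tool_name: str, tool: Callable[..., Any], reads: List[str],
             write_key: str, global_ctx: Dict[str, Any],
             trace: List[Dict[str, Any]]) -> None:
    """Call one tool and record inputs, result, and interaction logic."""
    # Input parameters: assembled from the global context, stored as JSON.
    params = {key: global_ctx[key] for key in reads}
    record: Dict[str, Any] = {
        "tool": tool_name,
        "input": json.dumps(params, sort_keys=True),
        "called_at": time.time(),
        # Interaction logic: which keys were read and which key is written.
        "reads": list(reads),
        "writes": write_key,
    }
    try:
        result = tool(**params)
        global_ctx[write_key] = result            # visible to later tools
        record["result"] = copy.deepcopy(result)  # independent copy for tracing
        record["status"] = "success"
    except TimeoutError as exc:
        record.update(status="timeout", error_type=type(exc).__name__)
    except Exception as exc:
        record.update(status="failure", error_type=type(exc).__name__,
                      stack=traceback.format_exc())
    trace.append(record)
```

Because the tracing copy is made with `copy.deepcopy`, later tools that mutate the shared intermediate result do not alter what was recorded for this step.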

[0134] Once all tool calls in the execution chain are completed (regardless of success or failure), the execution module triggers the aggregation and persistence of the full-chain record. At this point, the input parameters, intermediate results, and interaction logic for each step have accumulated in the tracing context. The execution module integrates this information into a complete full-chain record object. This record object is persistently stored in a dedicated tracing archive (such as Elasticsearch or a time-series database) and associated with the identifier of this service request for subsequent retrieval and auditing.
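
A minimal sketch of the aggregation and persistence step follows. The record layout and the rule for deriving an overall status are assumptions made for the example, and a text stream stands in for the dedicated tracing archive so that no particular storage API is implied.

```python
import io
import json
import time
from typing import Any, Dict, List, TextIO


def build_full_chain_record(execution_id: str, request_id: str,
                            started_at: float,
                            steps: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Merge the per-step records into one full-chain record object."""
    all_ok = all(step["status"] == "success" for step in steps)
    return {
        "execution_id": execution_id,
        "request_id": request_id,  # links the record to the service request
        "started_at": started_at,
        "finished_at": time.time(),
        "overall_status": "success" if all_ok else "failure",
        "steps": steps,
    }


def persist(record: Dict[str, Any], sink: TextIO) -> None:
    """Append the record as one JSON line; the sink stands in for the archive."""
    sink.write(json.dumps(record, sort_keys=True) + "\n")


archive = io.StringIO()
record = build_full_chain_record(
    "exec-1", "req-1", time.time(),
    [{"tool": "order_query", "status": "success"},
     {"tool": "log_retrieval", "status": "timeout"}],
)
persist(record, archive)
```

The record is written even though one step timed out, matching the requirement that aggregation is triggered regardless of success or failure.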

[0135] When providing the final response to users, a summary or visual link to the execution chain can be attached. Users can click "View Execution Chain" to expand a tree-like or flowchart-style interface, intuitively seeing the inputs and outputs, time consumption, and data flow of each tool call. For example, below the response result of the payment timeout diagnosis, a simplified list of execution steps is displayed; clicking on a step expands to show detailed input parameters and return results.

[0136] The historical records in the tracing archive can be used for subsequent analysis. For example, by statistically analyzing the average execution time and failure rate of each tool, performance bottlenecks or error-prone steps in the execution chain can be identified; by reviewing the exception records of failed execution chains, the stability of the tools themselves or the orchestration strategy of the planning module can be optimized; and when similar problems occur, historical execution chains can be retrieved directly for user reference, or used to train large models to generate better execution chains.
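
The per-tool statistics mentioned above could be computed along the following lines. The sketch assumes that each archived record carries a `steps` list and that each step stores `tool`, `duration_ms`, and `status`; these key names are illustrative and not specified in this application.

```python
from collections import defaultdict
from typing import Any, Dict, List


def tool_stats(records: List[Dict[str, Any]]) -> Dict[str, Dict[str, float]]:
    """Compute average duration and failure rate per tool across records."""
    durations: Dict[str, List[float]] = defaultdict(list)
    failures: Dict[str, int] = defaultdict(int)
    for record in records:
        for step in record["steps"]:
            durations[step["tool"]].append(step["duration_ms"])
            if step["status"] != "success":  # failure or timeout
                failures[step["tool"]] += 1
    return {
        tool: {
            "avg_ms": sum(values) / len(values),
            "failure_rate": failures[tool] / len(values),
        }
        for tool, values in durations.items()
    }


history = [
    {"steps": [
        {"tool": "log_retrieval", "duration_ms": 100, "status": "success"},
        {"tool": "tracing", "duration_ms": 40, "status": "success"},
    ]},
    {"steps": [
        {"tool": "log_retrieval", "duration_ms": 300, "status": "timeout"},
    ]},
]
stats = tool_stats(history)
```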

[0137] Through the aforementioned end-to-end recording mechanism, this solution not only automates the anomaly investigation process but also ensures the interpretability and auditability of the process, making the decision-making logic of the intelligent question-and-answer system completely transparent to users and significantly improving the trustworthiness and controllability of handling complex issues.

[0138] Please refer to Figure 5, which is a structural block diagram of an anomaly detection device 200 provided in an embodiment of this application. The device 200 may be a module, program segment, or code on an electronic device. It should be understood that the device 200 corresponds to the above method embodiment and is capable of performing the steps involved in the method embodiment. The specific functions of the device 200 can be found in the description above; to avoid repetition, detailed descriptions are omitted here where appropriate.

[0139] Optionally, the device 200 includes: a perception module 210, used to respond to a triggered service request, perform semantic structuring processing on the service request, and generate a structured parsing result, wherein the service request includes a user anomaly query request or a system alarm event; a planning module 220, used to determine a matching execution chain based on the parsing result, the execution chain being used to call at least one tool to work collaboratively; an execution module 230, used to call the corresponding tools according to the execution chain and obtain the execution result of each tool, wherein, during the calling process, middleware information and system runtime context information are obtained through underlying client dependencies; and a mapping module 240, used to integrate the execution results returned by the tools, and to generate and feed back a response result for the service request.

[0140] Optionally, the perception module 210 is used to acquire multi-source context information, which includes: extraction rules for the business domain, field dictionary, and parameter template; extract key parameters from the service request based on the multi-source context information to obtain extraction results, which include business scenarios, functional scenarios, and common parameters for tool invocation; and perform structured processing on the extraction results to obtain structured parsing results.
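
To make the extraction step concrete, the following sketch derives a structured parsing result from a free-text request. The extraction rule, the field dictionary entries, the parameter template, and the order ID format are all invented for the example; a real system might rely on a large model rather than regular expressions for this step.

```python
import re
from typing import Any, Dict

# Hypothetical multi-source context for a payment business domain.
EXTRACTION_RULES = {"order_id": re.compile(r"\border\s+([A-Z]\d{6})\b")}
FIELD_DICTIONARY = {"payment timeout": ("payment", "timeout_diagnosis")}
PARAMETER_TEMPLATE = {"order_id": None, "time_range": "last_1h"}


def parse_request(text: str) -> Dict[str, Any]:
    """Extract scenarios and common parameters into a structured result."""
    lowered = text.lower()
    business, function = next(
        (scenario for phrase, scenario in FIELD_DICTIONARY.items()
         if phrase in lowered),
        ("unknown", "unknown"),
    )
    params = dict(PARAMETER_TEMPLATE)  # start from the template defaults
    for name, pattern in EXTRACTION_RULES.items():
        match = pattern.search(text)
        if match:
            params[name] = match.group(1)
    return {
        "business_scenario": business,
        "functional_scenario": function,
        "common_parameters": params,
    }


parsed = parse_request("Why did order A123456 hit a payment timeout?")
```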

[0141] Optionally, if the service request includes a user anomaly query request, the multi-source context information may also include the user's historical preferences, commonly used query criteria, and parameters of the most recent session.

[0142] Optionally, the planning module 220 is used to generate an execution chain through a large model if the scenario complexity corresponding to the parsing result exceeds a preset threshold, and to use the matched standardized execution template as the execution chain if the scenario complexity corresponding to the parsing result is lower than the preset threshold.
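
The threshold-based choice between a large model and a standardized template can be sketched as below. The template key format, the tool names, and the `generate_with_model` callback are hypothetical. The text does not say what happens when the complexity equals the threshold or when no template matches; this sketch assumes the template is used in the first case and the model in the second.

```python
from typing import Any, Callable, Dict, List

# Hypothetical standardized execution templates, keyed by scenario.
TEMPLATES: Dict[str, List[str]] = {
    "payment/timeout_diagnosis": [
        "order_query", "tracing", "log_retrieval", "middleware_monitoring",
    ],
}


def plan_chain(parsed: Dict[str, Any], complexity: float, threshold: float,
               generate_with_model: Callable[[Dict[str, Any]], List[str]]
               ) -> List[str]:
    """Return the ordered list of tools that forms the execution chain."""
    key = f'{parsed["business_scenario"]}/{parsed["functional_scenario"]}'
    # Assumption: fall back to the model when no template matches.
    if complexity > threshold or key not in TEMPLATES:
        return generate_with_model(parsed)
    return list(TEMPLATES[key])  # copy so callers cannot mutate the template


def fake_model(parsed: Dict[str, Any]) -> List[str]:
    """Stand-in for a large-model planner; returns a fixed chain."""
    return ["tracing", "log_retrieval"]


request = {"business_scenario": "payment",
           "functional_scenario": "timeout_diagnosis"}
simple_chain = plan_chain(request, complexity=0.3, threshold=0.6,
                          generate_with_model=fake_model)
complex_chain = plan_chain(request, complexity=0.9, threshold=0.6,
                           generate_with_model=fake_model)
```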

[0143] Optionally, the mapping module 240 is used to perform field-level processing on the execution results returned by each tool to generate standardized output results. The field-level processing includes at least one of data format conversion, semantic standardization conversion, and sensitive information desensitization. The standardized output results are then integrated to generate and feed back a response result for the service request.

[0144] Optionally, the mapping module 240 is used to process the execution results returned by each tool according to the field-level processing rules corresponding to the field type, and generate standardized processing results.

[0145] Optionally, the mapping module 240 is further configured to obtain the input parameters, intermediate results and interaction logic of each tool call in the execution chain, and generate a full-chain record.

[0146] Optionally, the execution module 230 is used to maintain a global context object, which is used to transfer shared parameters and intermediate calculation results between different tools.
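
One possible shape for such a global context object is sketched below. The class name, the method names, and the idea of remembering which tool produced each value are illustrative assumptions rather than details given in this application.

```python
from typing import Any, Dict


class GlobalContext:
    """Shared store for request parameters and intermediate tool results."""

    def __init__(self, shared_params: Dict[str, Any]) -> None:
        self._values: Dict[str, Any] = dict(shared_params)
        # Remember where each value came from; "request" marks initial params.
        self._producers: Dict[str, str] = {k: "request" for k in shared_params}

    def put(self, key: str, value: Any, producer: str) -> None:
        """Store an intermediate result written by the named tool."""
        self._values[key] = value
        self._producers[key] = producer

    def require(self, *keys: str) -> Dict[str, Any]:
        """Return the requested values, failing fast if any is missing."""
        missing = [k for k in keys if k not in self._values]
        if missing:
            raise KeyError(f"missing context keys: {missing}")
        return {k: self._values[k] for k in keys}

    def producer_of(self, key: str) -> str:
        """Name of the tool (or 'request') that supplied the value."""
        return self._producers[key]


context = GlobalContext({"order_id": "A1"})
context.put("trace_id", "t-9", producer="order_query")
inputs_for_tracing = context.require("order_id", "trace_id")
```

Failing fast in `require` surfaces a broken dependency between tools at the step that needs the value, rather than inside the downstream tool.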

[0147] It should be noted that those skilled in the art will clearly understand that, for convenience and brevity, the specific working process of the device described above may be understood by reference to the corresponding process in the foregoing method embodiments, and is not repeated here.

[0148] Please refer to Figure 6, which is a schematic structural diagram of an electronic device, provided in an embodiment of this application, for performing the anomaly troubleshooting method. The electronic device may include: at least one processor 310, such as a CPU; at least one communication interface 320; at least one memory 330; and at least one communication bus 340. The communication bus 340 is used to establish communication between these components. In this embodiment, the communication interface 320 is used for signaling or data communication with other node devices. The memory 330 may be high-speed RAM or non-volatile memory, such as at least one disk storage device. Optionally, the memory 330 may also be at least one storage device located remotely from the processor. The memory 330 stores computer-readable instructions; when these instructions are executed by the processor 310, the electronic device performs the aforementioned method process.

[0149] It can be understood that the structure shown in Figure 6 is for illustrative purposes only; the electronic device may include more or fewer components than those shown in Figure 6, or have a configuration different from that shown in Figure 6. The components shown in Figure 6 can be implemented using hardware, software, or a combination thereof.

[0150] This application provides a computer-readable storage medium storing a computer program thereon. When the computer program is executed by a processor, it performs the method process executed by the electronic device in the above method embodiments.

[0151] This embodiment discloses a computer program product, which includes a computer program stored on a non-transitory computer-readable storage medium. The computer program includes program instructions, and when the program instructions are executed by a computer, the computer can perform the methods provided in the above-described method embodiments, for example: in response to a triggered service request, performing semantic structuring processing on the service request to generate a structured parsing result, wherein the service request includes a user anomaly query request or a system alarm event; determining a matching execution chain based on the parsing result, the execution chain being used to call at least one tool to work collaboratively; invoking the corresponding tools according to the execution chain and obtaining the execution result of each tool, wherein, during invocation, middleware information and system runtime context information are obtained through underlying client dependencies; and integrating the execution results returned by the tools to generate and feed back a response result for the service request.

[0152] In summary, this application provides an anomaly troubleshooting method, apparatus, electronic device, storage medium, and program product. By calling tools that obtain business data, middleware status, and runtime context information in real time through underlying client dependencies, the method gives the system dynamic data access and deep technical information retrieval capabilities. On this basis, semantic structuring processing transforms user anomaly queries or system alarms into executable instructions, and automated collaborative scheduling of multiple tools is achieved through the execution chain, replacing the traditional mode of manually switching between multiple systems, manually collecting data, and inferring root causes step by step. This mechanism integrates the originally scattered data entry points and complex troubleshooting processes into an automated workflow, enabling the system to quickly locate code-level root causes of problems, shortening the average anomaly resolution time, reducing the workload of maintenance personnel, and improving troubleshooting efficiency.

[0153] In the embodiments provided in this application, it should be understood that the disclosed apparatus and methods can be implemented in other ways. The apparatus embodiments described above are merely illustrative. For example, the division of units is only a logical functional division, and in actual implementation there may be other division methods. Furthermore, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. Additionally, the mutual couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through some communication interfaces, devices, or units, and may be electrical, mechanical, or in other forms.

[0154] Furthermore, the units described as separate components may or may not be physically separate. The components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units can be selected to achieve the purpose of this embodiment according to actual needs.

[0155] Furthermore, the functional modules in the various embodiments of this application can be integrated together to form an independent part, or each module can exist independently, or two or more modules can be integrated to form an independent part.

[0156] In this document, relational terms such as first and second are used only to distinguish one entity or operation from another entity or operation, without necessarily requiring or implying any such actual relationship or order between these entities or operations.

[0157] The above description is merely an embodiment of this application and is not intended to limit the scope of protection of this application. Various modifications and variations can be made to this application by those skilled in the art. Any modifications, equivalent substitutions, improvements, etc., made within the spirit and principles of this application should be included within the scope of protection of this application.

Claims

1. An anomaly detection method, characterized in that it is applied to an intelligent question-answering system and includes: in response to a triggered service request, performing semantic structuring processing on the service request to generate a structured parsing result, wherein the service request includes a user anomaly query request or a system alarm event; determining a matching execution chain based on the parsing result, the execution chain being used to call at least one tool to work collaboratively; invoking the corresponding tools according to the execution chain and obtaining the execution result of each tool, wherein, during invocation, middleware information and system runtime context information are obtained through underlying client dependencies; and integrating the execution results returned by the tools to generate and feed back a response result for the service request.

2. The method according to claim 1, characterized in that the semantic structuring processing of the service request to generate a structured parsing result includes: obtaining multi-source context information, which includes extraction rules for business domains, field dictionaries, and parameter templates; extracting key parameters from the service request based on the multi-source context information to obtain extraction results, which include business scenarios, functional scenarios, and common parameters used for tool invocation; and performing structured processing on the extraction results to obtain the structured parsing result.

3. The method according to claim 2, characterized in that, if the service request includes a user anomaly query request, the multi-source context information also includes user historical preferences, commonly used query criteria, and parameters of the most recent session.

4. The method according to claim 1, characterized in that the step of determining the matching execution chain based on the parsing result includes: if the scenario complexity corresponding to the parsing result exceeds a preset threshold, generating the execution chain through a large model; and if the scenario complexity corresponding to the parsing result is lower than the preset threshold, using the matched standardized execution template as the execution chain.

5. The method according to claim 1, characterized in that the process of integrating the execution results returned by the tools to generate and feed back a response result for the service request includes: performing field-level processing on the execution results returned by each tool to generate standardized output results, wherein the field-level processing includes at least one of data format conversion, semantic standardization conversion, and sensitive information desensitization; and integrating the standardized output results to generate and feed back the response result for the service request.

6. The method according to claim 5, characterized in that the process of performing field-level processing on the execution results returned by each tool to generate standardized output results includes: processing the execution results returned by each tool according to the field-level processing rules corresponding to the field type, to generate standardized processing results.

7. The method according to claim 1, characterized in that, after invoking the corresponding tools according to the execution chain, the method further includes: obtaining the input parameters, intermediate results, and interaction logic of each tool call in the execution chain to generate a full-chain record.

8. The method according to claim 1, characterized in that the process of invoking the corresponding tools according to the execution chain also includes: maintaining a global context object, which is used to pass shared parameters and intermediate calculation results between different tools.

9. An anomaly detection device, characterized in that it is used in an intelligent question-answering system and includes: a perception module, used to respond to a triggered service request, perform semantic structuring processing on the service request, and generate a structured parsing result, wherein the service request includes a user anomaly query request or a system alarm event; a planning module, used to determine a matching execution chain based on the parsing result, the execution chain being used to call at least one tool to work collaboratively; an execution module, used to call the corresponding tools according to the execution chain and obtain the execution result of each tool, wherein, during the calling process, middleware information and system runtime context information are obtained through underlying client dependencies; and a mapping module, used to integrate the execution results returned by the tools, and to generate and feed back a response result for the service request.

10. An electronic device, characterized in that it includes a processor and a memory, the memory storing computer-readable instructions that, when executed by the processor, perform the method as described in any one of claims 1-8.

11. A computer-readable storage medium having a computer program stored thereon, characterized in that, when the computer program is executed by a processor, it performs the method as described in any one of claims 1-8.

12. A computer program product, characterized in that it includes computer program instructions which, when read and executed by a processor, perform the method as described in any one of claims 1-8.