An AI agent-based operation and maintenance log analysis method and system
An AI agent-based operation and maintenance log analysis method addresses the lack of overall consistency and execution continuity in operation and maintenance log analysis when multiple anomalies are coupled. The method continuously transforms the impact of operation and maintenance anomalies into optimized decisions, and supports operation and maintenance resource allocation and responsibility assignment.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications (China)
- Current Assignee / Owner
- BEIJING HOLYSTONE TECH CO LTD
- Filing Date
- 2026-05-18
- Publication Date
- 2026-06-19
AI Technical Summary
Existing operation and maintenance log analysis methods cannot organize the connections between anomalies, so the operation and maintenance decision-making process lacks overall consistency and execution coherence in scenarios with multiple coupled anomalies.
In the AI agent-based operation and maintenance log analysis method, raw operation and maintenance logs are collected, object ownership is identified and the logs are semantically parsed, service status change information and operation status change information are extracted, impact connection relationships are established, impact transmission sequences are formed, operation and maintenance paths are organized, priority and succession relationships are identified, and the operation and maintenance management decision-making process is optimized.
The method continuously transforms the impact of operation and maintenance anomalies into operation and maintenance management decisions. By combining impact transmission sequence construction with operation and maintenance management decision reasoning, it forms anomaly propagation path data with sequential connections, supports operation and maintenance resource allocation decisions and responsibility assignment, and ensures the continuity of operation and maintenance management decisions and the consistency of execution.
Figure CN122240422A_ABST
Description
Technical Field
[0001] This invention relates to the field of log data management technology, and in particular to an operation and maintenance log analysis method and system based on AI intelligent agents.
Background Technology
[0002] In recent years, with the continuous evolution of cloud computing architecture and distributed service systems, operation and maintenance scenarios have gradually shifted from single device management to a collaborative management model oriented towards multiple services and multiple resource objects. As a key record carrier, operation and maintenance logs carry information on service status changes, resource operation status, and operation and maintenance behavior. The structured parsing, status identification, and correlation analysis methods of operation and maintenance logs have been continuously developed, gradually forming a processing system with log template parsing, status change extraction, and correlation modeling as its core. This system has been widely applied in the fields of intelligent operation and maintenance and fault diagnosis, and AI intelligent agent-related methods have begun to be used for operation and maintenance decision-making reasoning and automated handling process construction.
[0003] However, existing methods for anomaly analysis and handling decisions based on operation and maintenance logs generally focus on single-point anomaly identification or simple correlation analysis. They lack a structured expression of the continuous propagation relationship between objects affected by anomalies, resulting in the inability to effectively organize the connection paths between anomalies. Consequently, it is difficult to form a basis for handling decisions with sequential constraints and succession relationships, making the operation and maintenance decision-making process lack overall consistency and execution coherence in scenarios with multiple anomalies.
Summary of the Invention
[0004] In view of the aforementioned existing problems, the present invention is proposed.
[0005] Therefore, this invention provides an AI-based operation and maintenance log analysis method to solve the problem of lack of overall consistency and execution coherence in the operation and maintenance decision-making process under multiple anomaly coupling scenarios.
[0006] To solve the above technical problems, the present invention provides the following technical solution. In a first aspect, this invention provides an AI agent-based operation and maintenance log analysis method, comprising:
collecting raw operation and maintenance logs;
performing object ownership identification and semantic parsing on the raw operation and maintenance logs to output an operation and maintenance management log set;
extracting service status change information and operation status change information from the operation and maintenance management log set and performing correlation mapping;
performing operation and maintenance anomaly impact identification and management object classification on the correlation mapping results to generate anomaly management event data;
establishing impact connection relationships according to the anomaly impact objects and impact manifestations in the anomaly management event data, and continuously connecting the anomaly impact objects to form an impact transmission sequence;
organizing operation and maintenance paths and expanding the business scope on the impact transmission sequence to output anomaly propagation path data;
performing operation and maintenance management decision reasoning on the anomaly propagation path data, identifying handling priority relationships and handling succession relationships as decision constraints, optimizing the operation and maintenance management decision-making process, and generating operation and maintenance management reasoning data; and
based on the operation and maintenance management reasoning data, making operation and maintenance resource allocation decisions and assigning responsibilities for the anomaly impact objects, outputting responsibility object assignment information, and performing operation and maintenance process orchestration to output operation and maintenance management decision data.
[0007] As a preferred embodiment of the AI agent-based operation and maintenance log analysis method of the present invention, the operation and maintenance management log set is output as follows: ownership relationships are established between each log record in the raw operation and maintenance logs and its business object and resource object, forming operation and maintenance object ownership data; semantic determination and behavior type classification are performed on the operation and maintenance behaviors and results in the raw operation and maintenance logs, outputting operation and maintenance activity semantic data; and the operation and maintenance object ownership data and the operation and maintenance activity semantic data are associated and arranged to output the operation and maintenance management log set.
[0008] As a preferred embodiment of the AI agent-based operation and maintenance log analysis method of the present invention, the service status change information and operation status change information are extracted as follows: a log template parsing method is used to perform template matching on each log record in the operation and maintenance management log set, and log records with the same semantic structure are grouped under a unified log template to form a log template set; and status features are identified and change types are determined from the log template set, outputting the service status change information and the operation status change information.
[0009] As a preferred embodiment of the AI agent-based operation and maintenance log analysis method of the present invention, the anomaly management event data is generated as follows: the service status change information is matched with the operation status change information in time order to determine operation and maintenance relationships, outputting the association mapping result; operation and maintenance anomaly impacts are identified from the changes of the operation and maintenance relationship combinations in the association mapping result, determining the anomaly impact information generated by each change manifestation; and the anomaly impact information is classified into management objects by business object and resource object, generating the anomaly management event data.
[0010] As a preferred embodiment of the AI agent-based operation and maintenance log analysis method of the present invention, the impact transmission sequence is formed as follows: based on the anomaly impact objects and impact manifestations in the anomaly management event data, change source information and change result information are extracted from the impact manifestations, and the anomaly impact objects in the change source information are sequentially associated with the anomaly impact objects in the change result information to form impact connection relationships; anomaly impact objects that share an impact connection relationship are written into the same connection structure to form impact connection data; and the path connectivity of the impact connection data is confirmed, and impact connection data belonging to the same impact chain are merged to form the impact transmission sequence.
[0011] As a preferred embodiment of the AI agent-based operation and maintenance log analysis method of the present invention, the anomaly propagation path data is output as follows: the path structure of the impact transmission sequence is reconstructed to output path structure data, and the path position and impact level of each anomaly impact object are marked in the path structure data, outputting operation and maintenance path structure data; and based on the operation and maintenance path structure data, the business objects corresponding to each anomaly impact object are extended outward, and the extended business objects are written into the operation and maintenance path structure data, outputting the anomaly propagation path data.
[0012] As a preferred embodiment of the AI agent-based operation and maintenance log analysis method of the present invention, the operation and maintenance management reasoning data is generated as follows: the AI agent matches each anomaly impact object in the anomaly propagation path data against preset handling rules, marks the anomaly impact objects that satisfy the rules as handleable objects, and collects them into a handling candidate set; based on the path position and impact level in the anomaly propagation path data, handling priority relationships and handling succession relationships are determined among the handling candidates; based on these relationships, the AI agent performs operation and maintenance management decision reasoning on the anomaly propagation path data, identifies the constraints that the handling priority and succession relationships impose on operation and maintenance management decisions, and outputs decision constraints; the anomaly propagation path data is sorted and scheduled according to the decision constraints, outputting optimized handling order data; and the optimized handling order data is mapped to the anomaly impact objects in the operation and maintenance logs, and the anomaly impact objects are rearranged according to the order in the optimized handling order data to output the operation and maintenance management reasoning data.
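The sorting and scheduling step above can be illustrated with a small sketch. This is not the patent's implementation: it assumes that handling succession relationships can be represented as "handle A before B" pairs, that a larger impact level means earlier handling among candidates with no pending predecessor, and the function name `order_handling` is hypothetical.

```python
import heapq

def order_handling(candidates: dict, succession: list) -> list:
    """Order handling candidates under succession and priority constraints.

    candidates: {object_name: impact_level} (assumed: larger is more urgent).
    succession: (before, after) pairs (assumed form of succession relationships).
    """
    pending = {name: 0 for name in candidates}       # unmet predecessors
    followers = {name: [] for name in candidates}
    for before, after in succession:
        if before in candidates and after in candidates:
            followers[before].append(after)
            pending[after] += 1

    # Max-heap on impact level among objects whose predecessors are handled.
    ready = [(-level, name) for name, level in candidates.items()
             if pending[name] == 0]
    heapq.heapify(ready)

    order = []
    while ready:
        _, name = heapq.heappop(ready)
        order.append(name)
        for nxt in followers[name]:
            pending[nxt] -= 1
            if pending[nxt] == 0:
                heapq.heappush(ready, (-candidates[nxt], nxt))

    if len(order) != len(candidates):
        raise ValueError("succession relationships contain a cycle")
    return order
```

For example, with candidates `{"db": 3, "cache": 1, "api": 2, "web": 5}` and succession pairs `[("db", "api"), ("api", "web")]`, the sketch handles `db`, then `api`, then `web`, and only then `cache`, because succession constraints take precedence over the impact level of an object that is not yet ready.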
[0013] As a preferred embodiment of the AI agent-based operation and maintenance log analysis method of the present invention, the responsibility object assignment information is output as follows: based on the handling order and handling relationships in the operation and maintenance management reasoning data, resource demand matching and resource usage order determination are performed for the anomaly impact objects, outputting resource allocation decision data; and based on the resource allocation decision data, a responsibility object is assigned to each anomaly impact object, the correspondence between anomaly impact objects and responsibility objects is established, and the responsibility object assignment information is output.
[0014] As a preferred embodiment of the AI agent-based operation and maintenance log analysis method of the present invention, the operation and maintenance process orchestration organizes the anomaly impact objects in sequence according to the responsibility object assignment information, the handling order, and the handling relationships, and outputs the operation and maintenance management decision data.
[0015] In a second aspect, this invention provides an operation and maintenance log analysis system based on an AI agent, comprising:
a data acquisition module, used to collect raw operation and maintenance logs, perform object ownership identification and semantic parsing on the raw operation and maintenance logs, and output an operation and maintenance management log set;
an anomaly identification module, used to extract service status change information and operation status change information from the operation and maintenance management log set, perform correlation mapping, identify operation and maintenance anomaly impacts and classify management objects based on the correlation mapping results, and generate anomaly management event data;
an impact propagation module, used to establish impact connection relationships based on the anomaly impact objects and impact manifestations in the anomaly management event data, and to continuously connect the anomaly impact objects to form an impact transmission sequence;
an operation and maintenance decision module, used to organize operation and maintenance paths and expand the business scope on the impact transmission sequence, output anomaly propagation path data, perform operation and maintenance management decision reasoning on the anomaly propagation path data, identify handling priority relationships and handling succession relationships as decision constraints, optimize the operation and maintenance management decision-making process, and generate operation and maintenance management reasoning data; and
an object assignment module, used to make operation and maintenance resource allocation decisions and assign responsibilities for the anomaly impact objects based on the operation and maintenance management reasoning data, output responsibility object assignment information, perform operation and maintenance process orchestration, and output operation and maintenance management decision data.
[0016] The beneficial effects of this invention are as follows. By coordinating the construction of the impact transmission sequence with operation and maintenance management decision reasoning, a continuous transformation from the impact of operation and maintenance anomalies to operation and maintenance management decisions is achieved. By establishing impact connection relationships according to the anomaly impact objects and impact manifestations in the anomaly management event data, the impact of anomalies is expressed as a continuous, sequentially connected chain across business objects and resource objects, which provides a stable structural foundation for the anomaly propagation path data. The AI agent performs operation and maintenance management decision reasoning on the anomaly propagation path data, transforming it into operation and maintenance management reasoning data with constraints. This in turn supports operation and maintenance resource allocation decisions and responsibility assignment, so that the operation and maintenance management decision data can directly serve operation and maintenance process orchestration and execution.
Description of the Drawings
[0017] To more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings used in the following description of the embodiments will be briefly introduced. Obviously, the drawings described below are only some embodiments of the present invention. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.
[0018] Figure 1 is a flowchart of an operation and maintenance log analysis method based on AI agents.
[0019] Figure 2 is a schematic diagram of an operation and maintenance log analysis system based on AI agents.
[0020] Figure 3 is a flowchart of forming the impact transmission sequence.
[0021] Figure 4 is a flowchart of generating operation and maintenance management reasoning data.
Detailed Implementation
[0022] To make the above-mentioned objects, features and advantages of the present invention more apparent and understandable, the specific embodiments of the present invention will be described in detail below with reference to the accompanying drawings.
[0023] Many specific details are set forth in the following description in order to provide a full understanding of the invention. However, the invention may also be practiced in other ways different from those described herein, and those skilled in the art can make similar extensions without departing from the spirit of the invention. Therefore, the invention is not limited to the specific embodiments disclosed below.
[0024] Furthermore, the term "one embodiment" or "embodiment" as used herein refers to a specific feature, structure, or characteristic that may be included in at least one implementation of the present invention. The phrase "in one embodiment" appearing in different places in this specification does not necessarily refer to the same embodiment, nor does it refer to separate or alternative embodiments that are mutually exclusive with other embodiments.
[0025] Referring to Figures 1, 3, and 4, one embodiment of the present invention provides a method for analyzing operation and maintenance logs based on AI agents, including the following steps: S1. Collect raw operation and maintenance logs, perform object ownership identification and semantic parsing on the raw operation and maintenance logs, and output a set of operation and maintenance management logs.
[0026] Establish ownership relationships between each log record, business object, and resource object in the original operation and maintenance logs to form operation and maintenance object ownership data.
[0027] Specifically, collecting raw operation and maintenance logs involves uniformly accessing the log records continuously generated during business operations and arranging them sequentially according to their time position to form a continuous log sequence. Based on the continuous log sequence, the field content of each log record is structurally divided to locate business identification information and resource identification information. The business objects corresponding to the business identification information and the resource objects corresponding to the resource identification information are bound together in the same log record, establishing a clear ownership relationship between business objects and resource objects in the same log record. The ownership relationships of business objects and resource objects formed in each log record are written into a unified record structure in chronological order to form operation and maintenance object ownership data.
[0028] Furthermore, the field content is structurally divided by splitting the log record into multiple field fragments based on the field separators, key-value identifiers, and fixed-format tags in the log record. Field fragments containing business name, service name, or business identifier fields are matched as business identifier information, and field fragments containing resource name, node identifier, or resource identifier fields are matched as resource identifier information, so that the source of business identifier information and resource identifier information is clear and locatable.
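The field splitting and ownership binding described above might look like the following minimal sketch. It assumes `key=value` log records with a `ts` timestamp field; the key names in `BUSINESS_KEYS` and `RESOURCE_KEYS` and all function names are hypothetical stand-ins for the business and resource identifier fields the text mentions.

```python
import re

# Hypothetical field keys: the text only names "business name, service name,
# business identifier" and "resource name, node identifier, resource identifier".
BUSINESS_KEYS = ("business", "service", "biz_id")
RESOURCE_KEYS = ("resource", "node", "res_id")

def split_fields(record: str) -> dict:
    """Split a record into key=value field fragments (assumed log format)."""
    return dict(re.findall(r"(\w+)=(\S+)", record))

def first_match(fields: dict, keys: tuple):
    """Return the value of the first identifier key present in the record."""
    return next((fields[k] for k in keys if k in fields), None)

def build_ownership_data(records: list) -> list:
    """Bind business and resource objects per record, ordered by time."""
    rows = []
    for record in records:
        fields = split_fields(record)
        rows.append({
            "time": fields.get("ts", ""),
            "business_object": first_match(fields, BUSINESS_KEYS),
            "resource_object": first_match(fields, RESOURCE_KEYS),
        })
    # ISO-8601 timestamps sort correctly as plain strings.
    return sorted(rows, key=lambda row: row["time"])
```

Keeping the business object and resource object in the same row is what makes the ownership relationship explicit per log record, as the paragraph above requires.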
[0029] Semantic determination and behavior type classification are performed on the operation and maintenance behaviors and results in the original operation and maintenance logs, and semantic data of operation and maintenance activities are output.
[0030] Specifically, the behavior description and result description content of each log record in the continuous log sequence are located one by one. The operation name, target object, and state change words in the behavior description content are broken down and combined to form the corresponding operation and maintenance behavior expression. Based on the action characteristics in the operation and maintenance behavior expression, each log record is divided into different operation and maintenance behavior categories. The state feedback words and result description words in the result description content are extracted and combined with the corresponding context content to form the operation result expression. Based on the state change characteristics in the operation result expression, each log record is divided into different operation result categories. The operation and maintenance behavior category and operation result category corresponding to each log record are written into the same record position according to the time position, so that the operation and maintenance behavior category and operation result category are arranged continuously in time order, and the operation and maintenance activity semantic data is output.
[0031] Furthermore, status feedback terms are words used to indicate the current state of an object or the direction of state change, such as "success, failure, exception, normal, timeout, interruption, completion, no response, recovered, unavailable, and available," etc.; result descriptive terms are words used to describe the specific results or impacts on performance after an operation is executed, such as "startup complete, connection failed, service stopped, resource released, request rejected, task completed, interface call failed, and node offline," etc.
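As a rough illustration of the behavior and result classification, the sketch below uses plain substring matching against small word lists. The word lists, category names, and function names are assumptions made for the example; the source does not specify how the classification is implemented.

```python
# Hypothetical vocabularies, loosely based on the example words in the text.
FAILURE_WORDS = ("fail", "timeout", "exception", "unavailable",
                 "no response", "rejected", "offline", "interrupt")
SUCCESS_WORDS = ("success", "complete", "normal", "recovered",
                 "available", "released")
BEHAVIOR_WORDS = {
    "start": ("start", "launch"),
    "stop": ("stop", "shutdown"),
    "connect": ("connect",),
    "call": ("call", "request"),
}

def classify_result(description: str) -> str:
    """Map a result description to an operation result category."""
    text = description.lower()
    # Failure words are checked first because "unavailable" contains "available".
    if any(word in text for word in FAILURE_WORDS):
        return "failure"
    if any(word in text for word in SUCCESS_WORDS):
        return "success"
    return "unknown"

def classify_behavior(description: str) -> str:
    """Map a behavior description to an operation and maintenance behavior category."""
    text = description.lower()
    for category, words in BEHAVIOR_WORDS.items():
        if any(word in text for word in words):
            return category
    return "other"
```

Substring matching is deliberately naive here; a production system would likely tokenize the description and handle overlapping words more carefully.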
[0032] The operation and maintenance object attribution data and operation and maintenance activity semantic data are associated and arranged to output a set of operation and maintenance management logs.
[0033] Specifically, a correspondence is established based on the temporal order of the operation and maintenance object attribution data and the temporal order of the operation and maintenance activity semantic data. That is, business objects and resource objects at the same time position are written to the same record position along with their corresponding operation and maintenance behavior categories and operation result categories, so that business objects, resource objects, operation and maintenance behavior categories, and operation result categories form a combined expression in the same record structure. The combined results in each record position are arranged continuously according to the temporal order, and the sequential relationship between adjacent record positions is uniformly organized to form a sequential connection between each record, outputting an operation and maintenance management log set. The operation and maintenance management log set is a data set composed of log records as basic units, and each record contains the correspondence between business objects, resource objects, operation and maintenance behavior categories, and operation result categories.
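The association of ownership data with activity semantic data can be sketched as a join on time position. The example assumes that each time position is unique and appears in both inputs; the `prev` and `next` indices are one hypothetical way to record the sequential connection between adjacent records.

```python
def build_management_log_set(ownership_rows: list, semantic_rows: list) -> list:
    """Join ownership data and activity semantics on their time position."""
    semantics_by_time = {row["time"]: row for row in semantic_rows}
    merged = []
    for own in sorted(ownership_rows, key=lambda row: row["time"]):
        sem = semantics_by_time.get(own["time"])
        if sem is None:
            continue  # no semantic data at this time position
        merged.append({**own,
                       "behavior": sem["behavior"],
                       "result": sem["result"]})
    # Record the sequential connection between adjacent record positions.
    for index, record in enumerate(merged):
        record["prev"] = index - 1 if index > 0 else None
        record["next"] = index + 1 if index < len(merged) - 1 else None
    return merged
```

Each output record then carries the four elements named above: business object, resource object, behavior category, and result category.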
[0034] S2. Extract service status change information and running status change information from the operation and maintenance management log set, perform correlation mapping, identify the impact of operation and maintenance anomalies and classify management objects based on the correlation mapping results, and generate anomaly management event data.
[0035] The log template parsing method is used to perform template matching on each log record in the operation and maintenance management log set, and log records with the same semantic structure are grouped into a unified log template to form a log template set.
[0036] Specifically, the text expressions in the operation and maintenance management log set are read one by one by the log template parsing method. Each text expression is split into a term sequence at character separators and field delimiters. The terms in the sequence are classified by part of speech into action words, object words, and state representation words, and the order of the terms is marked to form a semantic structure sequence. The semantic structure sequences of different log records are compared item by item for consistency of term category and position. Semantic structure sequences with the same term categories in the same positions are grouped into the same semantic structure category, and the log records belonging to the same semantic structure category are written under the same log template identifier. Log records with the same semantic structure are thereby collected under one log template identifier, forming the log template set.
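A minimal sketch of this grouping is shown below. It replaces part-of-speech tagging with fixed word lists and treats every unlisted term as an object word; both choices, and all names in the code, are simplifying assumptions rather than the method the source describes.

```python
import re
from collections import defaultdict

# Hypothetical word lists standing in for part-of-speech classification.
ACTION_WORDS = {"start", "stop", "restart", "connect"}
STATE_WORDS = {"running", "failed", "timeout", "recovered"}

def term_category(term: str) -> str:
    if term in ACTION_WORDS:
        return "ACTION"
    if term in STATE_WORDS:
        return "STATE"
    return "OBJECT"  # assumption: any other term names an object

def semantic_structure(text: str) -> tuple:
    """Ordered term categories of one log text."""
    terms = [t for t in re.split(r"[\s,;|=:]+", text.lower()) if t]
    return tuple(term_category(t) for t in terms)

def build_template_set(texts: list) -> dict:
    """Group log texts whose category sequences match position by position."""
    templates = defaultdict(list)
    for text in texts:
        templates[semantic_structure(text)].append(text)
    return dict(templates)
```

Using the category tuple as the dictionary key makes "same term category in the same position" an exact equality check, and the key itself serves as the log template identifier.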
[0037] The system identifies status features and determines change types from the log template set, and outputs service status change information and running status change information.
[0038] Specifically, the log records corresponding to each log template are read one by one from the log template set, and the status features in the log records are located and extracted. Each status feature is marked according to whether it belongs to a service object or a resource object. Based on the order of the status features in time, the changes between adjacent status features are extracted. Changes that reflect the status of service objects are marked as service status change information, and changes that reflect the status of resource objects are marked as operation status change information.
[0039] Furthermore, state representation words are the words in log records that describe the state attributes and state change processes of service objects or resource objects during operation. They include expressions that describe state results as well as expressions that reflect the direction, stage, and process of state change, such as "running, loading, initializing, connecting, switching, recovering, degrading, blocking, active, and idle", which reflect how the state evolves over time.
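The extraction of changes between adjacent status features can be sketched as follows. The record layout (`time`, `kind`, `object`, `state`) and the function name are hypothetical; the sketch assumes status features have already been located and tagged as belonging to a service object or a resource object.

```python
def extract_status_changes(records: list) -> tuple:
    """Return (service_changes, operation_changes) from tagged status features.

    Each record is assumed to be a dict with keys:
    time, kind ("service" or "resource"), object, state.
    """
    last_state = {}
    service_changes, operation_changes = [], []
    for rec in sorted(records, key=lambda r: r["time"]):
        key = (rec["kind"], rec["object"])
        previous = last_state.get(key)
        # A change exists only when the adjacent states of one object differ.
        if previous is not None and previous != rec["state"]:
            change = {"time": rec["time"], "object": rec["object"],
                      "from": previous, "to": rec["state"]}
            if rec["kind"] == "service":
                service_changes.append(change)
            else:
                operation_changes.append(change)
        last_state[key] = rec["state"]
    return service_changes, operation_changes
```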
[0040] The service status change information and the operation status change information are matched according to the time sequence to determine the operation and maintenance relationship, and the association mapping result is output.
[0041] Specifically, based on the time position in the service status change information, the records at the corresponding time positions in the operation status change information are located, and the service status change information and operation status change information within the same time position range are aligned according to time markers; based on the association relationship between service objects and resource objects, a corresponding connection relationship is established between the aligned service status change information and operation status change information; the service status change information and operation status change information with corresponding connection relationships are written into the same associated record position, so that the service status change information and operation status change information form a one-to-one correspondence in time sequence, which serves as the association mapping result.
[0042] Furthermore, aligning the service status change information and operation status change information within the same time position range by time marker means establishing a one-to-one correspondence between the time positions in the service status change information and the time positions in the operation status change information. Records with the same time position are matched, and the matched service status change information and operation status change information are written under the same time position index, forming a unified alignment between the two in the time dimension.
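The alignment and association mapping might be sketched as below. The sketch assumes exact equality of time positions and a hypothetical `depends_on` table that maps each service object to the resource object it runs on; the source does not say how that association is stored.

```python
def align_changes(service_changes: list, operation_changes: list,
                  depends_on: dict) -> list:
    """Pair service and operation status changes at the same time position.

    depends_on: {service_object: resource_object}, a hypothetical table of
    the association between service objects and resource objects.
    """
    ops_by_key = {(c["time"], c["object"]): c for c in operation_changes}
    mapping = []
    for svc in service_changes:
        resource = depends_on.get(svc["object"])
        op = ops_by_key.get((svc["time"], resource))
        if op is not None:
            mapping.append({"time": svc["time"],
                            "service_change": svc,
                            "operation_change": op})
    return sorted(mapping, key=lambda m: m["time"])
```

A real log stream would probably need a tolerance window rather than exact time equality; that extension is left out to keep the one-to-one correspondence explicit.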
[0043] The changes in the operation and maintenance relationship combinations in the association mapping results are used to identify the abnormal impact of operation and maintenance, and to determine the abnormal impact information generated by each change.
[0044] Specifically, the service status change information and operation status change information are extracted from each associated record of the association mapping results in chronological order, and the two pieces of information from the same associated record are written to the same comparison position. They are then compared in terms of whether the directions of status change are opposite, the magnitude of status change, and the order of status change. The comparison results at adjacent time positions are connected sequentially to form the change trajectory of the corresponding operation and maintenance relationship combination. Positions of status change, continuous offset, and abnormal interruption are located on the change trajectory, and any operation and maintenance relationship combination containing a status change, continuous offset, or abnormal interruption is marked as an abnormal change manifestation. The corresponding service object and resource object are extracted from the abnormal change manifestation, and the manifestation is written to the same anomaly record position together with those objects, which determines the anomaly impact information generated by each change manifestation.
[0045] The information on the impact of anomalies is categorized into management objects based on business objects and resource objects, and anomaly management event data is generated.
[0046] Specifically, the system reads the corresponding business object and resource object from each exception record in the exception impact information. Exception impact information for business objects with the same business object is written to the same business object record location, thus forming a centralized expression of exception impact information for the same business object. For exception impact information for resource objects with the same resource object, a corresponding resource object record location is established, thus forming an independent collection of exception impact information for the same resource object. In the business object record location and resource object record location, each exception impact information is arranged in chronological order, and adjacent exception impact information is continuously connected based on the changes in the exception impact information. Exception impact information with temporal continuity is combined into the same event record, so that each event record simultaneously contains the corresponding business object, resource object, and change performance information, outputting exception management event data.
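Combining temporally continuous impact information into event records could be sketched as follows. For brevity the sketch groups by the (business object, resource object) pair rather than keeping separate business and resource record positions, uses integer time positions, and introduces a hypothetical `max_gap` threshold to decide what counts as temporally continuous; none of these details come from the source.

```python
def build_anomaly_events(impacts: list, max_gap: int = 1) -> list:
    """Merge temporally continuous impact records into event records.

    impacts: dicts with time (int), business_object, resource_object,
    manifestation. max_gap is a hypothetical continuity threshold.
    """
    events = []
    open_event = {}  # (business_object, resource_object) -> latest event
    for imp in sorted(impacts, key=lambda i: i["time"]):
        key = (imp["business_object"], imp["resource_object"])
        event = open_event.get(key)
        if event is not None and imp["time"] - event["end"] <= max_gap:
            # Continuous with the open event: extend it.
            event["end"] = imp["time"]
            event["manifestations"].append(imp["manifestation"])
        else:
            event = {"business_object": key[0], "resource_object": key[1],
                     "start": imp["time"], "end": imp["time"],
                     "manifestations": [imp["manifestation"]]}
            events.append(event)
            open_event[key] = event
    return events
```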
[0047] S3. Based on the abnormal impact objects and impact manifestations in the abnormal management event data, establish the impact connection relationship, and continuously connect the abnormal impact objects to form an impact transmission sequence.
[0048] Based on the abnormal impact objects and impact manifestations in the anomaly management event data, change source information and change result information are extracted from the impact manifestations. The abnormal impact objects in the change source information and the abnormal impact objects in the change result information are sequentially associated to form an impact connection relationship.
[0049] Specifically, based on the affected objects and their manifestations in the anomaly management event data, the manifestation field is read from each anomaly management event data entry. The state change description in the manifestation field is structurally decomposed to locate the source information indicating the start position of the state and the result information indicating the end position of the state. The affected object corresponding to the source information is used as the starting object marker, and the affected object corresponding to the result information is used as the arriving object marker. The starting object marker and the arriving object marker are arranged sequentially according to their time positions, and a corresponding connection is established between the starting object marker with the earlier time position and the arriving object marker with the later time position, forming an impact connection relationship. The impact connection relationship refers to the directed connection relationship established between the affected object corresponding to the source information and the affected object corresponding to the result information in the anomaly management event data, which is used to indicate the direction of the transmission of the anomaly impact between different objects.
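The construction of directed impact connections can be illustrated roughly as follows. The manifestation format `source_object:state -> result_object:state` is an assumption made for this sketch; the description does not specify how the manifestation field is encoded, and here each event yields at most one connection.

```python
def parse_manifestation(text):
    """Split 'src_obj:state -> dst_obj:state' into (start, arrival) objects."""
    source, result = (part.strip() for part in text.split("->"))
    return source.split(":")[0], result.split(":")[0]

def impact_connections(events):
    """Return directed (start_object, arrival_object, time) connections."""
    connections = []
    for event in sorted(events, key=lambda e: e["t"]):
        start, arrival = parse_manifestation(event["manifestation"])
        if start != arrival:  # a change confined to one object transmits nothing
            connections.append((start, arrival, event["t"]))
    return connections

events = [
    {"t": 5, "manifestation": "order-api:slow -> checkout:timeout"},
    {"t": 3, "manifestation": "db-1:overloaded -> order-api:slow"},
]
connections = impact_connections(events)
```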
[0050] Based on the influence connection relationship, abnormal influence objects that have influence connection relationships are written into the same connection structure to form influence connection data.
[0051] Specifically, based on the influence connection relationship, the starting object marker and the arriving object marker of each group of established connections are combined and recorded. Abnormal influence objects in the same influence connection relationship are written into the same record position in the order of the starting object and the arriving object, so that each record contains the starting object marker, the arriving object marker and the corresponding time and position information. Multiple influence connection relationships are arranged, and abnormal influence objects with continuous time and position relationships and connection continuation relationships are added to the same record position, so that abnormal influence objects in the same record position form a continuous connection expression, forming influence connection data.
[0052] The path connectivity of the impact connection data is confirmed, and the impact connection data belonging to the same impact chain are merged to form an impact transmission sequence.
[0053] Specifically, in the impact connection data, the starting object marker and the arriving object marker in each record position are read in chronological order. The arriving object marker in the current record position is compared with the starting object marker in the subsequent record positions one by one to determine whether the marker content is consistent. For record positions with consistent marker content, a connection relationship is established, and the record positions with connection relationship are sequentially connected in chronological order to form a continuous object connection chain. The record positions that form a continuous connection chain are merged and organized, and the abnormal impact objects in the same continuous connection chain are written into the same sequence position according to the connection order, so that the abnormal impact objects present a complete connection path in the same sequence position, and the impact transmission sequence is output.
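The marker comparison and chain merging above might look like the following sketch. It assumes each record is a `(start, arrival, time)` tuple and greedily extends one linear chain at a time; branching chains would need a graph traversal instead, which is not shown.

```python
def build_sequences(connections):
    """Merge (start, arrival, t) records into continuous object chains."""
    ordered = sorted(connections, key=lambda c: c[2])
    used = [False] * len(ordered)
    sequences = []
    for i, (start, arrival, t) in enumerate(ordered):
        if used[i]:
            continue
        used[i] = True
        chain, tail, tail_time = [start, arrival], arrival, t
        for j in range(i + 1, len(ordered)):
            nxt_start, nxt_arrival, nxt_t = ordered[j]
            # Link when the current arriving marker equals a later starting marker.
            if not used[j] and nxt_start == tail and nxt_t >= tail_time:
                chain.append(nxt_arrival)
                tail, tail_time = nxt_arrival, nxt_t
                used[j] = True
        sequences.append(chain)
    return sequences

connections = [
    ("db-1", "order-api", 3),
    ("order-api", "checkout", 5),
    ("cache-1", "search", 4),
]
sequences = build_sequences(connections)
```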
[0054] Furthermore, when constructing the impact transmission sequence, a time synchronization mechanism (such as NTP) is used to correct the timestamps of log records so that the clocks of all nodes are consistent. To handle missing logs and duplicate alarms, missing logs are supplemented through a log retransmission mechanism or on the basis of a unique event identifier (such as a transaction ID), ensuring the integrity of each abnormal event.
[0055] When confirming the path continuity of the impact connection data, the arriving object marker in the current record position is compared for consistency with the starting object marker in the subsequent record positions, and the impact manifestation of the corresponding abnormal impact object in the anomaly management event data is also verified. That is, the change result information in the previous record position is read first, then the change source information in the next record position is read, to determine whether the change result information and the change source information are continuous in time position order and whether the corresponding abnormal impact objects maintain the sequential relationship. A record position is confirmed as a connection position in the same impact chain only when the arriving object marker is consistent with the starting object marker and the change result information and the change source information are continuous in time position order. Record positions that satisfy only the time position order but not the sequential relationship are not confirmed as path-continuous, so that the resulting impact transmission sequence has a sequential relationship in both its connections and its impact manifestations.
S4. Organize the operation and maintenance path and expand the business scope of the impact transmission sequence, output the anomaly propagation path data, perform operation and maintenance management decision reasoning on the anomaly propagation path data, identify the handling priority relationship and the handling succession relationship as decision constraints, optimize the operation and maintenance management decision-making process of the operation and maintenance log, and generate operation and maintenance management reasoning data.
[0056] The path structure of the impact propagation sequence is reconstructed, and the path structure data is output. The path structure data is then used to mark the path location and impact level of each abnormal impact object, and the operation and maintenance path structure data is output.
[0057] Specifically, the sequence of abnormal impact objects is read one by one in the impact propagation sequence. Abnormal impact objects in the same sequence are written into the path record positions according to their order of appearance, and the order of appearance is used as the path position number. After the path position number is written, each abnormal impact object is compared with its adjacent preceding abnormal impact object. The number of preceding abnormal impact objects directly connected to the current abnormal impact object is counted, and the result is written into the corresponding record position. The impact level value of each abnormal impact object is calculated based on the path position number and the number of preceding abnormal impact objects. The calculated impact level value and the path position number are written into the same record position, so that each abnormal impact object has a corresponding expression of path position and impact level in the path structure data. All record positions are arranged in order of path position number, and the operation and maintenance path structure data is output.
[0058] The impact level value is calculated from the path location number and the number of preceding abnormally affected objects, $L_i = f(P_i, N_i)$ (the explicit expression of $f$ is missing from this text). Here, $i$ is the path location number identifier, i.e. the sequential number of the abnormally affected object in the path record position; $L_i$ is the impact level value of the abnormally affected object at the $i$-th path location number; $P_i$ is the path location number of that object; and $N_i$ is the number of preceding abnormally affected objects of that object.
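The path numbering and level calculation can be sketched as below. Because the explicit formula does not appear in this text, the level calculation is passed in as a parameter, and the default `level_fn` is purely a placeholder assumption; the `edges` set of direct connections is likewise a hypothetical input.

```python
def path_structure(sequence, edges,
                   level_fn=lambda pos, preds: preds + 1 / pos):
    """Number each object on a path and attach an impact level value.

    level_fn is a PLACEHOLDER: the source formula is not available, so any
    function of (path position number, predecessor count) can be supplied.
    """
    records = []
    for pos, obj in enumerate(sequence, start=1):
        # Count preceding objects on the path that connect directly to obj.
        preds = sum(1 for prev in sequence[:pos - 1] if (prev, obj) in edges)
        records.append({"object": obj, "position": pos,
                        "predecessors": preds, "level": level_fn(pos, preds)})
    return records

edges = {("db-1", "order-api"), ("order-api", "checkout"), ("db-1", "checkout")}
structure = path_structure(["db-1", "order-api", "checkout"], edges)
```

Keeping the formula pluggable means the numbering and predecessor counting, which the description does specify, stay separate from the weighting, which it does not.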
[0059] Based on the operation and maintenance path structure data, the business objects corresponding to each object affected by the anomaly are extended outward, and the extended business objects are written into the operation and maintenance path structure data, outputting the anomaly propagation path data.
[0060] Specifically, the system reads the abnormally affected objects and their corresponding business objects one by one from the path record positions in the operation and maintenance path structure data. Based on the call and access records between business objects, it locates the associated business objects that have an interaction relationship with the current business object. The associated business objects are written into the corresponding path record positions according to the interaction direction, and the associated business objects and business objects are arranged in the same record position. The business objects in the same path record position are extended and written in the order of path position number. After the business object extension is completed, the path record positions containing the abnormally affected objects and extended business objects are rearranged in the order of path position number, and the abnormal propagation path data is output. The abnormal propagation path data refers to the path structure data formed by combining the impact transmission sequence with the business object association relationship. Each path record contains multiple abnormally affected objects and their corresponding business objects arranged in sequence.
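As a rough illustration of the business scope expansion, the sketch below attaches upstream and downstream business objects to each path record. The `owner` mapping from affected object to business object and the `calls` list of `(caller, callee)` pairs are assumed inputs standing in for the call and access records.

```python
def expand_business_scope(path_records, owner, calls):
    """Attach business objects that interact with each record's business object."""
    expanded = []
    for rec in sorted(path_records, key=lambda r: r["position"]):
        business = owner[rec["object"]]
        # Interaction direction: who this business object calls, and who calls it.
        downstream = sorted(callee for caller, callee in calls if caller == business)
        upstream = sorted(caller for caller, callee in calls if callee == business)
        expanded.append({**rec, "business": business,
                         "upstream": upstream, "downstream": downstream})
    return expanded

path_records = [{"object": "order-api", "position": 2},
                {"object": "db-1", "position": 1}]
owner = {"db-1": "storage", "order-api": "ordering"}
calls = [("ordering", "storage"), ("checkout", "ordering"), ("ordering", "billing")]
propagation_path = expand_business_scope(path_records, owner, calls)
```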
[0061] The AI agent performs rule matching on each abnormally affected object in the abnormal propagation path data according to the preset handling rules. Abnormally affected objects that meet the preset handling rules are marked as handleable objects and collected into a set of handling candidates.
[0062] Specifically, the AI agent, a computational execution subject with autonomous decision-making and reasoning capabilities, sequentially extracts the abnormally affected objects and their corresponding path location numbers and impact level values from the abnormal propagation path data. The attribute information of each abnormally affected object is mapped to the condition fields in the preset handling rules, and each matching item in the condition fields is verified. Abnormally affected objects that satisfy all matching items are written into the handleable identifier position, while those that do not are kept in their original record state. After the handleable identifier positions are formed, all abnormally affected objects written into them are collected into the same set of record positions in the order of their path location numbers, so that the handleable objects form an ordered arrangement, generating a set of handling candidates.
[0063] Furthermore, the preset handling rules are a set of rules obtained by organizing the anomaly handling records in the historical operation and maintenance logs. The path location number, impact level value, business object association, and anomaly impact manifestation corresponding to each anomaly handling record in the historical operation and maintenance logs are extracted and mapped to the corresponding actual handling order, handling level, handling scope, and handling method. The mapping relationships are summarized and organized in multiple historical operation and maintenance logs, and the repeated mapping relationships are written into a unified rule record position, so that the same combination of conditions corresponds to a fixed handling result, forming preset handling rules that can be directly used for rule matching.
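The rule matching can be sketched as a simple condition check. The rule schema (`kind`, `min_level`, `action`) and the example rules are hypothetical; the description only states that rules map condition combinations to fixed handling results.

```python
RULES = [  # hypothetical preset handling rules
    {"when": {"kind": "database", "min_level": 1.0}, "action": "failover"},
    {"when": {"kind": "service", "min_level": 1.5}, "action": "restart"},
]

def match_rule(obj, rules=RULES):
    """Return the first rule whose every condition field is satisfied, else None."""
    for rule in rules:
        cond = rule["when"]
        if obj["kind"] == cond["kind"] and obj["level"] >= cond["min_level"]:
            return rule
    return None

def handling_candidates(path_records, rules=RULES):
    """Collect handleable objects, ordered by path location number."""
    candidates = []
    for rec in path_records:
        rule = match_rule(rec, rules)
        if rule is not None:  # unmatched objects keep their original state
            candidates.append({**rec, "action": rule["action"]})
    return sorted(candidates, key=lambda r: r["position"])

path_records = [
    {"object": "checkout", "kind": "service", "position": 3, "level": 2.3},
    {"object": "db-1", "kind": "database", "position": 1, "level": 1.0},
    {"object": "order-api", "kind": "service", "position": 2, "level": 1.2},
]
candidates = handling_candidates(path_records)
```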
[0064] It should be noted that the AI agent includes a state-aware processing structure, a decision-making and reasoning processing structure, and a feedback and update processing structure. The state-aware processing structure extracts the path location number, impact level value, business object association, and change behavior information of the affected objects from the anomaly propagation path data, and combines these information to form the agent's input state data. The decision-making and reasoning processing structure constructs a candidate set of actions based on the matching relationship between the agent's input state data and preset handling rules. It then performs sequential filtering and relational constraint combination on each action according to its execution priority and constraints, generating a handling decision result that meets the constraints. The feedback and update processing structure records the execution state information corresponding to the handling result after the decision is formed, associates this information with the original preset handling rules, and marks the execution state information that meets the recurrence condition for rule correction, enabling the preset handling rules to dynamically adjust during multiple executions.
[0065] Based on the path location and impact level in the abnormal propagation path data, handling priority relationships and handling succession relationships are established among the handling candidates.
[0066] Specifically, the handleable objects in the set of handling candidates are arranged in a unified order by path location number, with earlier path location numbers first and later ones after, forming a path order sequence. The impact level values of adjacent handleable objects in the path order sequence are compared; the handleable object with the earlier path location number, together with its impact level value, is marked as the priority object, and the handleable object immediately following it is marked as the successor object. Each priority object and its successor object are written into the same relationship record position according to their order in the path order sequence, and the handling priority relationship and handling succession relationship are output.
[0067] Based on the handling priority relationship and handling succession relationship, the AI agent performs operation and maintenance management decision analysis on the anomaly propagation path data, identifies the constraining effect of these two relationships on operation and maintenance management decisions, and outputs decision constraints.
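A minimal sketch of pairing adjacent candidates into priority and successor objects follows. The candidate fields are assumptions carried over for illustration, and the impact level values are simply recorded alongside each pair rather than used to reorder it.

```python
def priority_succession(candidates):
    """Pair each handleable object with the one that follows it on the path."""
    ordered = sorted(candidates, key=lambda c: c["position"])
    return [{"priority": a["object"], "successor": b["object"],
             "levels": (a["level"], b["level"])}
            for a, b in zip(ordered, ordered[1:])]

candidates = [
    {"object": "checkout", "position": 3, "level": 2.3},
    {"object": "db-1", "position": 1, "level": 1.0},
    {"object": "order-api", "position": 2, "level": 1.5},
]
relations = priority_succession(candidates)
```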
[0068] Specifically, a corresponding constraint mapping is established in the anomaly propagation path data between the priority objects in the priority relationship and the successor objects in the succession relationship. The path position number and influence level value corresponding to the priority object are written as pre-constraints in the corresponding record positions, and the path position number and influence level value corresponding to the successor object are written as subsequent constraints in the corresponding record positions. In the same record position, the pre-constraints and subsequent constraints are combined and expressed, with the pre-constraints limited to first-execution conditions and the subsequent constraints limited to last-execution conditions, and the first-execution conditions and last-execution conditions are made to form a front-back correspondence in the path sequence. The constraint relationships are uniformly organized in all path record positions, and the sequential restriction relationship formed between the same priority object and successor object is written into the constraint record position, outputting the decision constraint conditions. The decision constraint conditions refer to the set of sequential restriction relationships formed by the combination of the priority relationship and the succession relationship, which are used to constrain the order of handling between the objects affected by the anomaly.
[0069] The abnormal propagation path data is sorted and scheduled according to decision constraints, and the optimized handling order data is output.
[0070] Specifically, the path location number and impact level value corresponding to each abnormally affected object are extracted from the abnormal propagation path data. Sequential constraints are established based on the first and last execution conditions in the decision constraints. Abnormally affected objects that meet the first execution condition are written to the first position of the sequence record, and those that meet the last execution condition are written to the last position. The path location numbers of the abnormally affected objects are aligned and corrected between the first and last position, ensuring that abnormally affected objects with earlier path location numbers remain in the previous position and those with later path location numbers are placed in the next position. After all abnormally affected objects are sorted, the sorting results are written into a unified sequence record structure according to the path order, outputting optimized handling sequence data.
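One way to realize constraint-driven ordering is a topological sort that breaks ties by path location number, sketched below. Using Kahn's algorithm with a heap is a technique substituted here for illustration; the description itself only requires that first-execution objects precede last-execution objects and that path order is otherwise preserved.

```python
import heapq

def optimized_order(positions, constraints):
    """Order objects so each priority object precedes its successor.

    positions: {object: path location number}
    constraints: iterable of (priority_object, successor_object) pairs
    """
    successors = {obj: [] for obj in positions}
    pending = {obj: 0 for obj in positions}
    for first, later in constraints:
        successors[first].append(later)
        pending[later] += 1
    # Objects with no unmet pre-constraint are ready; the earliest position wins.
    ready = [(positions[o], o) for o in positions if pending[o] == 0]
    heapq.heapify(ready)
    order = []
    while ready:
        _, obj = heapq.heappop(ready)
        order.append(obj)
        for nxt in successors[obj]:
            pending[nxt] -= 1
            if pending[nxt] == 0:
                heapq.heappush(ready, (positions[nxt], nxt))
    if len(order) != len(positions):
        raise ValueError("constraints contain a cycle")
    return order

positions = {"db-1": 1, "order-api": 2, "checkout": 3, "cache-1": 2}
constraints = [("db-1", "order-api"), ("order-api", "checkout")]
handling_order = optimized_order(positions, constraints)
```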
[0071] The optimized handling sequence data is mapped to the abnormal impact objects in the operation and maintenance logs, and the abnormal impact objects are rearranged according to the order relationship in the optimized handling sequence data to output operation and maintenance management reasoning data.
[0072] Specifically, a sequence number is written to each abnormal impact object in the operation and maintenance log, giving the object a sequence identifier consistent with the optimized handling sequence data. After the sequence identifiers are written, the record positions in the operation and maintenance log are replaced: each original record position is re-mapped to the record position that corresponds to its sequence number, so that the record position of each abnormal impact object in the operation and maintenance log matches its sequence number. After the record positions are adjusted, the adjusted records are written to a unified record structure, and the operation and maintenance management reasoning data is output.
[0073] S5. Based on the operation and maintenance management reasoning data, make operation and maintenance resource allocation decisions and assign responsibilities to the objects affected by anomalies, output the responsibility object assignment information, perform operation and maintenance process orchestration, and output operation and maintenance management decision data.
[0074] Based on the handling order and handling correlation in the operation and maintenance management reasoning data, resource demand matching and resource usage order determination are performed on the objects affected by the anomaly, and resource allocation decision data is output.
[0075] Specifically, the resource requirement type and quantity for each affected object are recorded in the same record location. A sequential arrangement structure is established for the affected objects according to the disposal order. Within this structure, resource allocation priority is determined based on disposal dependencies. Resources corresponding to affected objects with earlier disposal orders and earlier positions in their disposal relationships are marked as priority resources, while those corresponding to later disposal orders and later positions in their disposal relationships are marked as subsequent resources. After resource marking, resource allocation is performed for affected objects of the same resource type according to the sequential arrangement structure. Affected objects with priority resources are recorded in the preceding resource record location, and those with subsequent resources are recorded in the following resource record location. This ensures a correspondence between resource requirement type, resource quantity, and resource usage order within the same record structure, outputting resource allocation decision data.
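The resource allocation step can be illustrated with the sketch below. The `demand` mapping and, in particular, the finite `capacity` per resource type are assumptions added to make the priority ordering observable; the description does not state how resource limits are modeled.

```python
def allocate_resources(handling_order, demand, capacity):
    """Grant resources in handling order; earlier objects are served first."""
    remaining = dict(capacity)
    decisions = []
    for rank, obj in enumerate(handling_order, start=1):
        rtype, qty = demand[obj]
        granted = min(qty, remaining.get(rtype, 0))
        remaining[rtype] = remaining.get(rtype, 0) - granted
        decisions.append({"object": obj, "type": rtype, "requested": qty,
                          "granted": granted, "use_order": rank})
    return decisions

handling_order = ["db-1", "order-api", "checkout"]
demand = {"db-1": ("dba", 1), "order-api": ("sre", 2), "checkout": ("sre", 2)}
capacity = {"dba": 1, "sre": 3}  # hypothetical available staff per resource type
decisions = allocate_resources(handling_order, demand, capacity)
```

In this example, `order-api` precedes `checkout` in the handling order, so it receives its full request and `checkout` receives only what remains of the same resource type.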
[0076] Based on the resource allocation decision data, assign responsibility objects to each abnormally affected object, establish the correspondence between abnormally affected objects and responsibility objects, and output responsibility object assignment information.
[0077] Specifically, in the resource allocation decision data, the resource occupancy relationship corresponding to each abnormally affected object is identified based on the resource demand type and resource usage order. The responsible objects with corresponding resource processing capabilities are written into the same record location, and a correspondence is established between the processing scope of the responsible object and the resource demand type of the abnormally affected object. After the responsible object is written, the correspondence between the abnormally affected objects and the responsible object under the same resource type is organized. The abnormally affected objects that can be covered by the same responsible object are written into the same responsible object record location, and the correspondence between the abnormally affected objects and the responsible object is written into the same record structure according to the resource usage order, and the responsible object assignment information is output.
[0078] Furthermore, when matching responsible parties to each affected object based on resource allocation decision data, the affected objects are arranged sequentially according to the resource demand type, quantity, and usage order in the resource allocation decision data. Each responsible object is matched item by item with its corresponding resource demand type. When the same responsible object corresponds to multiple affected objects, the order of handling is determined by the order of disposal and the disposal relationship. Affected objects at the beginning of the sequence are written first into the responsible object record, and subsequent affected objects are appended sequentially to the same responsible object record. After the responsible objects are written, they are organized sequentially according to the disposal order and disposal relationship based on the responsible object assignment information. This ensures that the responsible object assignment information is consistent with the operation and maintenance process orchestration, thus creating a continuous correspondence between resource allocation decisions, responsible object assignment, and operation and maintenance process orchestration under the same sequential relationship.
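Responsibility assignment can be sketched as a lookup from resource type to responsible object, with affected objects appended in resource usage order. The `owners` mapping and the team names are hypothetical.

```python
def assign_responsibility(decisions, owners):
    """Map each affected object to a responsible party covering its resource type."""
    assignment = {}
    for dec in sorted(decisions, key=lambda d: d["use_order"]):
        owner = owners.get(dec["type"])
        if owner is None:
            raise KeyError(f"no responsible object handles {dec['type']!r}")
        # Objects covered by the same responsible object share one record.
        assignment.setdefault(owner, []).append(dec["object"])
    return assignment

decisions = [
    {"object": "checkout", "type": "sre", "use_order": 3},
    {"object": "db-1", "type": "dba", "use_order": 1},
    {"object": "order-api", "type": "sre", "use_order": 2},
]
owners = {"dba": "team-data", "sre": "team-platform"}
assignment = assign_responsibility(decisions, owners)
```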
[0079] The system processes and orchestrates the operation and maintenance workflow for the information on the assigned responsibilities, and outputs operation and maintenance management decision data.
[0080] Specifically, in the responsibility assignment information, a process arrangement structure is established for each abnormal impact object and its corresponding responsible object according to the handling order. Based on the handling relationship, a process connection relationship is established for abnormal impact objects and their responsible objects that have a sequential dependency relationship. Abnormal impact objects and their corresponding responsible objects that are in the preceding position in the handling relationship are arranged in the preceding position of the process, and abnormal impact objects and their corresponding responsible objects that are in the subsequent position in the handling relationship are arranged in the following position of the process. After the process arrangement structure is formed, abnormal impact objects and their responsible objects with continuous handling relationships are connected in the path according to the handling order, so that the responsible objects corresponding to each abnormal impact object form a continuous execution path in the process, and output operation and maintenance management decision data.
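The process orchestration above can be sketched as a list of workflow steps, each naming its responsible object and the steps it depends on. The step schema and the example inputs are assumptions for illustration only.

```python
def orchestrate(handling_order, constraints, responsible):
    """Turn an ordered handling list into workflow steps with dependencies."""
    step_of = {obj: i for i, obj in enumerate(handling_order, start=1)}
    steps = []
    for obj in handling_order:
        # A step depends on every step whose object must be handled before it.
        depends_on = sorted(step_of[a] for a, b in constraints if b == obj)
        steps.append({"step": step_of[obj], "object": obj,
                      "owner": responsible[obj], "depends_on": depends_on})
    return steps

handling_order = ["db-1", "order-api", "checkout"]
constraints = [("db-1", "order-api"), ("order-api", "checkout")]
responsible = {"db-1": "team-data", "order-api": "team-platform",
               "checkout": "team-platform"}
workflow = orchestrate(handling_order, constraints, responsible)
```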
[0081] Referring to Figure 2, this embodiment also provides an operation and maintenance log analysis system based on AI intelligent agents, including:
a data acquisition module, used to collect raw operation and maintenance logs, perform object attribution identification and semantic parsing on the raw operation and maintenance logs, and output a set of operation and maintenance management logs;
an anomaly identification module, used to extract service status change information and running status change information from the operation and maintenance management log set, perform correlation mapping, identify the impact of operation and maintenance anomalies and classify management objects based on the correlation mapping results, and generate anomaly management event data;
an impact propagation module, used to establish impact connection relationships based on the abnormal impact objects and impact manifestations in the anomaly management event data, and to continuously connect the abnormal impact objects to form an impact transmission sequence;
an operation and maintenance decision module, used to organize the operation and maintenance path and expand the business scope of the impact transmission sequence, output anomaly propagation path data, perform operation and maintenance management decision reasoning on the anomaly propagation path data, identify the handling priority relationship and the handling succession relationship as decision constraints, optimize the operation and maintenance management decision process of the operation and maintenance logs, and generate operation and maintenance management reasoning data;
an object assignment module, used to make operation and maintenance resource allocation decisions and assign responsibilities to the abnormal impact objects based on the operation and maintenance management reasoning data, output responsibility object assignment information, perform operation and maintenance process orchestration, and output operation and maintenance management decision data.
[0082] In summary, this invention achieves a continuous transformation from the impact of operational anomalies to operational management decisions through the synergy of impact transmission sequence construction and operational management decision reasoning. By establishing impact connections based on the impact objects and manifestations in the anomaly management event data, the impact of anomalies is expressed continuously between business objects and resource objects with sequential connections, thus providing a stable structural foundation for anomaly propagation path data. Through AI agents performing operational management decision reasoning on the anomaly propagation path data, the anomaly propagation path data is transformed into operational management reasoning data with constraints, further supporting operational resource allocation decisions and responsibility assignment, enabling operational management decision data to directly serve operational process orchestration and execution.
[0083] It should be noted that the above embodiments are only used to illustrate the technical solutions of the present invention and are not intended to limit it. Although the present invention has been described in detail with reference to preferred embodiments, those skilled in the art should understand that modifications or equivalent substitutions can be made to the technical solutions of the present invention without departing from the spirit and scope of the technical solutions of the present invention, and all such modifications or substitutions should be covered within the scope of the claims of the present invention.
Claims
1. A method for analyzing operation and maintenance logs based on AI intelligent agents, characterized in that it includes:
collecting raw operation and maintenance logs, performing object attribution identification and semantic parsing on the raw operation and maintenance logs, and outputting a set of operation and maintenance management logs;
extracting service status change information and running status change information from the operation and maintenance management log set, performing correlation mapping, identifying the impact of operation and maintenance anomalies and classifying management objects based on the correlation mapping results, and generating anomaly management event data;
establishing impact connection relationships based on the abnormal impact objects and impact manifestations in the anomaly management event data, and continuously connecting the abnormal impact objects to form an impact transmission sequence;
organizing the operation and maintenance path and expanding the business scope of the impact transmission sequence, outputting anomaly propagation path data, performing operation and maintenance management decision reasoning on the anomaly propagation path data, identifying the handling priority relationship and the handling succession relationship as decision constraints, optimizing the operation and maintenance management decision-making process of the operation and maintenance logs, and generating operation and maintenance management reasoning data;
making operation and maintenance resource allocation decisions and assigning responsibilities to the abnormal impact objects based on the operation and maintenance management reasoning data, outputting responsibility object assignment information, performing operation and maintenance process orchestration, and outputting operation and maintenance management decision data.
2. The operation and maintenance log analysis method based on AI intelligent agents as described in claim 1, characterized in that: The specific steps for outputting the operation and maintenance management log set are as follows: Establish ownership relationships between each log record, business object, and resource object in the original operation and maintenance logs to form operation and maintenance object ownership data; Perform semantic determination and behavior type classification on the operation and maintenance behaviors and results in the original operation and maintenance logs, and output semantic data of operation and maintenance activities; The operation and maintenance object attribution data and operation and maintenance activity semantic data are associated and arranged to output a set of operation and maintenance management logs.
3. The operation and maintenance log analysis method based on AI intelligent agents as described in claim 1, characterized in that: The specific steps for extracting service status change information and operation status change information are as follows: The log template parsing method is used to perform template matching on each log record in the operation and maintenance management log set, and log records with the same semantic structure are grouped into a unified log template to form a log template set. The system identifies status features and determines change types from the log template set, and outputs service status change information and running status change information.
4. The operation and maintenance log analysis method based on AI intelligent agents as described in claim 1, characterized in that: The specific steps for generating abnormal management event data are as follows: Match service status change information with operation status change information according to time sequence to determine the operation and maintenance relationship, and output the association mapping result; The abnormal impact of changes in the combination of operation and maintenance relationships in the association mapping results is identified, and the abnormal impact information of each change is determined. The information on the impact of anomalies is categorized into management objects based on business objects and resource objects, and anomaly management event data is generated.
5. The operation and maintenance log analysis method based on AI intelligent agents according to claim 1, characterized in that the specific steps for forming the impact transmission sequence are as follows: based on the anomaly-affected objects and impact manifestations in the anomaly management event data, extracting change source information and change result information from the impact manifestations, and sequentially associating the anomaly-affected objects in the change source information with the anomaly-affected objects in the change result information to form impact connection relationships; based on the impact connection relationships, writing the anomaly-affected objects that share an impact connection relationship into the same connection structure to form impact connection data; and confirming the path connectivity of the impact connection data, and merging the impact connection data belonging to the same impact chain to form the impact transmission sequence.
6. The operation and maintenance log analysis method based on AI intelligent agents according to claim 1, characterized in that the specific steps for outputting the anomaly propagation path data are as follows: reconstructing the path structure of the impact transmission sequence to output path structure data, and marking the path position and impact level of each anomaly-affected object in the path structure data to output operation and maintenance path structure data; and based on the operation and maintenance path structure data, extending outward to the business objects corresponding to each anomaly-affected object, writing the extended business objects into the operation and maintenance path structure data, and outputting the anomaly propagation path data.
7. The operation and maintenance log analysis method based on AI intelligent agents according to claim 1, characterized in that the specific steps for generating the operation and maintenance management reasoning data are as follows: the AI agent performing rule matching on each anomaly-affected object in the anomaly propagation path data according to preset handling rules, marking the anomaly-affected objects that satisfy the preset handling rules as handleable objects, and collecting them into a handling candidate set; dividing the handling candidate set into handling priority relationships and handling succession relationships based on the path position and impact level in the anomaly propagation path data; based on the handling priority relationships and handling succession relationships, using the AI agent to perform operation and maintenance management decision reasoning on the anomaly propagation path data, identifying the constraints that the handling priority relationships and handling succession relationships impose on operation and maintenance management decisions, and outputting decision constraints; sorting and scheduling the anomaly propagation path data according to the decision constraints, and outputting optimized handling order data; and mapping the optimized handling order data to the anomaly-affected objects in the operation and maintenance logs, rearranging the anomaly-affected objects according to the order relationships in the optimized handling order data, and outputting the operation and maintenance management reasoning data.
8. The operation and maintenance log analysis method based on AI intelligent agents according to claim 1, characterized in that the specific steps for outputting the responsible object assignment information are as follows: based on the handling order and handling correlation in the operation and maintenance management reasoning data, performing resource demand matching and resource usage order determination for the anomaly-affected objects, and outputting resource allocation decision data; and based on the resource allocation decision data, assigning a responsible object to each anomaly-affected object, establishing the correspondence between anomaly-affected objects and responsible objects, and outputting the responsible object assignment information.
9. The operation and maintenance log analysis method based on AI intelligent agents according to claim 1, characterized in that the operation and maintenance process orchestration comprises: according to the responsible object assignment information, organizing the anomaly-affected objects in sequence based on the handling order and handling correlation, and outputting operation and maintenance management decision data.
10. An operation and maintenance log analysis system based on AI intelligent agents, which implements the operation and maintenance log analysis method based on AI intelligent agents according to any one of claims 1 to 9, characterized in that it comprises: a data acquisition module, configured to collect raw operation and maintenance logs, perform object ownership identification and semantic parsing on the raw operation and maintenance logs, and output an operation and maintenance management log set; an anomaly identification module, configured to extract service status change information and operation status change information from the operation and maintenance management log set, perform association mapping, identify the impact of operation and maintenance anomalies and classify management objects based on the association mapping result, and generate anomaly management event data; an impact propagation module, configured to establish impact connection relationships based on the anomaly-affected objects and impact manifestations in the anomaly management event data, and to connect the anomaly-affected objects in succession to form an impact transmission sequence; an operation and maintenance decision module, configured to organize the operation and maintenance path and expand the business scope of the impact transmission sequence, output anomaly propagation path data, perform operation and maintenance management decision reasoning on the anomaly propagation path data, identify the handling priority relationships and handling succession relationships as decision constraints, optimize the operation and maintenance management decision process for the operation and maintenance logs, and generate operation and maintenance management reasoning data; and an object assignment module, configured to make operation and maintenance resource allocation decisions and assign responsible objects to the anomaly-affected objects based on the operation and maintenance management reasoning data, output responsible object assignment information, perform operation and maintenance process orchestration, and output operation and maintenance management decision data.
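The following is an illustrative, non-limiting sketch of how the impact connection relationships of claim 5, the path position and impact level marking of claim 6, and the rule-based handling candidates of claim 7 might be realized. The claims do not prescribe any data structure or algorithm, so everything here is an assumption made for illustration: impact connections are modeled as (source, target) pairs, chains are merged by graph connectivity, the impact level is taken as the longest distance from a chain root computed with Kahn's topological sort, and the function names, object names, and handling rule are hypothetical.

```python
from collections import defaultdict, deque


def build_transmission_sequences(links):
    """Merge (source, target) impact links into ordered impact chains.

    Each link means "a change on `source` led to a change on `target`".
    Returns one list per impact chain, holding (object, level) pairs sorted
    upstream-first, where level 0 marks a chain root. Assumes the links are
    acyclic; a cycle raises ValueError.
    """
    nodes = list(dict.fromkeys(obj for link in links for obj in link))
    downstream = defaultdict(list)
    neighbors = defaultdict(set)
    indegree = {obj: 0 for obj in nodes}
    for source, target in links:
        downstream[source].append(target)
        neighbors[source].add(target)
        neighbors[target].add(source)
        indegree[target] += 1

    # Kahn's topological sort; level = longest distance from any root.
    level = {obj: 0 for obj in nodes}
    queue = deque(obj for obj in nodes if indegree[obj] == 0)
    order = []
    while queue:
        obj = queue.popleft()
        order.append(obj)
        for nxt in downstream[obj]:
            level[nxt] = max(level[nxt], level[obj] + 1)
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                queue.append(nxt)
    if len(order) != len(nodes):
        raise ValueError("impact links contain a cycle")

    # Objects reachable from one another (ignoring direction) share a chain.
    chain_of = {}
    chain_count = 0
    for start in nodes:
        if start in chain_of:
            continue
        chain_of[start] = chain_count
        pending = deque([start])
        while pending:
            obj = pending.popleft()
            for other in neighbors[obj]:
                if other not in chain_of:
                    chain_of[other] = chain_count
                    pending.append(other)
        chain_count += 1

    chains = [[] for _ in range(chain_count)]
    for obj in order:
        chains[chain_of[obj]].append((obj, level[obj]))
    for chain in chains:
        chain.sort(key=lambda pair: pair[1])  # stable: upstream objects first
    return chains


def handling_candidates(chain, is_handleable):
    """Keep the objects that satisfy a preset handling rule, upstream-first."""
    return [obj for obj, _level in chain if is_handleable(obj)]


# Hypothetical impact links extracted from anomaly management event data.
example_links = [
    ("db-01", "order-service"),
    ("order-service", "checkout-api"),
    ("cache-02", "search-service"),
    ("db-01", "billing-service"),
]
example_chains = build_transmission_sequences(example_links)
```

In this sketch the two unrelated anomalies (`db-01` and `cache-02`) end up in separate chains, and sorting by level keeps upstream objects ahead of the objects they affect, which is one simple way to respect a handling succession relationship. A real embodiment could equally use a different connectivity or ordering scheme.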