An artificial intelligence-based communication network full-time domain operation and maintenance guarantee method and system

By collecting device and user behavior characteristics in the communication network, a two-way time-series correlation fault rule base is constructed, which solves the problems of inaccurate identification of fault causality and low operation and maintenance efficiency in the existing technology, realizes full-time domain operation and maintenance closed loop, and improves the accuracy and efficiency of fault handling.

CN122247830A · Pending · Publication Date: 2026-06-19 · HANGZHOU ZHANYAO NETWORK TECH CO LTD

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
HANGZHOU ZHANYAO NETWORK TECH CO LTD
Filing Date
2026-04-10
Publication Date
2026-06-19


Abstract

This invention discloses an AI-based all-time-domain operation and maintenance (O&M) support method and system for communication networks, relating to the field of network O&M technology. This invention integrates the collection of device and user behavior characteristics, performs pre-judgment of anomaly tags and temporal relationships, and combines historical O&M data with dedicated logic to verify the authenticity of temporal relationships. It constructs a bidirectional temporal-related fault rule base based on three types of temporal relationships, then performs rule base matching and targeted O&M, followed by verification of O&M effectiveness and secondary handling. Continuous rule base iteration and strategy optimization achieve end-to-end all-time-domain O&M, from fault early warning, real-time identification, differentiated handling to effect verification and strategy iteration. It adapts to the cascading transmission characteristics of communication network faults, significantly improving O&M accuracy, efficiency, and practicality.

Description

Technical Field

[0001] This invention relates to the field of network operation and maintenance technology, specifically to a method and system for all-time-domain operation and maintenance support of communication networks based on artificial intelligence.

Background Technology

[0002] With the evolution of technologies such as 5G, computing networks, and the Industrial Internet, communication networks are characterized by ultra-large scale, heterogeneous structure, dynamic services, and explosive traffic growth. All-time-domain operations and maintenance emphasize uninterrupted, full lifecycle, and multi-dimensional network assurance to adapt to new network requirements.

[0003] Existing technologies, such as the Vue-based deep fusion system and method for communication network operation and maintenance analysis (CN120692174A), achieve real-time display and interaction of operation and maintenance analysis results by encapsulating the large model as a Vue component and deeply integrating it with the front-end page. Utilizing the intelligent analysis and early warning capabilities of the large model, it provides more accurate and reliable decision support for operation and maintenance analysts. By optimizing the rendering logic and data processing flow of the front-end page and leveraging the parallel processing capabilities of the large model, it improves the system's response speed and performance. Another existing technology, such as the BERT-based intelligent operation and maintenance alarm causal relationship analysis method (CN121435174A), improves the semantic similarity between core switch link interruption and access layer device disconnection by 40% through semantic modeling upgrades and joint encoding of device type and alarm content, significantly reducing the false positive rate compared to independent encoding schemes. Through the organic combination of deep semantic modeling and dynamic temporal analysis, it achieves a leap from rule-driven to data-intelligent-driven intelligent operation and maintenance alarm causal relationships, providing an efficient analysis tool for the intelligent operation and maintenance of complex systems. By using semantic similarity calculation under time constraints and dynamic fusion of multi-dimensional weights, accurate causal modeling of alarm data in complex systems can be achieved. This method is widely applicable to large-scale distributed system fault analysis and root cause localization scenarios in industries such as power systems, communication networks, Internet platforms, cloud computing infrastructure, and industrial IoT.

[0004] Existing technologies focus on optimizing front-end display interaction and system response speed, and use BERT for semantic modeling to locate the causal relationship of alarm texts. However, they still have the following shortcomings: 1. Communication network faults have strict cascading transmission characteristics. Only by dividing them into time-series scenarios and combining nanosecond-level timestamps can the fault transmission path be accurately restored to avoid misjudgment. However, existing technologies only cover the semantics of front-end display or alarm texts and do not design three exclusive time-series scenarios for communication network fault transmission: device first, user first, and synchronous association. They cannot distinguish the causal relationship of the occurrence of anomalies, which makes time-series analysis disconnected from actual operation and maintenance scenarios. In addition, the operation and maintenance handling strategies for different time-series relationships are completely mutually exclusive. Without dividing the time-series relationships, it is easy to have ineffective handling and secondary faults, which increases the probability of incorrect handling direction and anomalies that cannot be eliminated.

[0005] 2. Relying solely on the semantics of alarm text for single-dimensional judgment lacks a dual-dimensional quantitative analysis system that combines device behavior characteristics and user behavior characteristics. This results in incomplete fault characterization, making it easy for root cause localization to lead to misjudgments and omissions, thus reducing the accuracy of fault diagnosis.

[0006] 3. Existing technologies only focus on alarm analysis, result display, or root cause localization, without forming a full-time-domain operation and maintenance closed loop encompassing perception, analysis, handling, verification, and iteration. They lack implementable differentiated operation and maintenance handling, quantitative verification of operation and maintenance effects, and a continuous iteration mechanism for the rule base. As a result, intelligent analysis results cannot be transformed into actual operation and maintenance execution capabilities.

[0007] 4. The lack of a quantitative extraction and multi-dimensional evaluation system for effective operation and maintenance parameters, the lack of quantitative judgment standards for various operation and maintenance indicators, and the inability to select the optimal operation and maintenance parameters result in blind operation and maintenance and low efficiency.

[0008] 5. Existing technologies fail to design scenario-specific, refined operation and maintenance strategies for different time-series scenarios and fault types, and instead adopt a generalized handling method. This approach cannot adapt to the cascading propagation characteristics of communication network faults, and is prone to problems such as incomplete handling, secondary faults, and expanded business impact.

Summary of the Invention

[0009] In view of the above-mentioned technical shortcomings, the purpose of this invention is to provide a method and system for all-time-domain operation and maintenance support of communication networks based on artificial intelligence.

[0010] To solve the above-mentioned technical problems, the present invention adopts the following technical solution. In a first aspect, the present invention provides an artificial intelligence-based all-time-domain operation and maintenance guarantee method for communication networks, including the following steps: S1, deploying integrated acquisition nodes to collect device behavior characteristics and user behavior characteristics within complex communication networks, adding nanosecond-level timestamps, standardizing the collected dual behavior characteristics, labeling real-time device anomaly tags and user anomaly tags, simultaneously performing temporal relationship pre-determination, and outputting the temporal causal relationship tags of the real-time dual behavior characteristics and the anomaly feature fitting curves.

[0011] S2. Extract all operation and maintenance samples from historical operation and maintenance related data, extract device behavior features and user behavior features from the operation and maintenance samples, and then perform labeling and time sequence relationship pre-judgment in the manner of S1, and verify the time sequence relationship. Then, construct a bidirectional time sequence association fault rule base, including device-first rule set, user-first rule set and synchronization association rule set. The rule set includes multiple abnormal labels, abnormal feature fitting feature curves of each abnormal label and suggested operation and maintenance parameter set.

[0012] S3. Match the time-series causal relationship labels and anomaly feature fitting curves of the real-time dual-behavioral features in S1 with the bidirectional time-series correlation fault rule base, output the corresponding anomaly labels, effective operation and maintenance handling parameters and fault types, and then carry out targeted operation and maintenance.

[0013] S4. After targeted maintenance, collect relevant maintenance parameters and verify whether the anomaly has been eliminated. If not, trigger a secondary handling process until the fault is eliminated.

[0014] S5. Collect new operation and maintenance cases during the operation and maintenance process, repeat S2 to complete the full iteration of the rule base, and generate an operation and maintenance strategy optimization report based on the operation and maintenance verification results.

[0015] In a second aspect, the present invention provides an AI-based all-time-domain operation and maintenance support system for communication networks, comprising: an anomaly monitoring module, which deploys integrated acquisition nodes to collect device behavior characteristics and user behavior characteristics within a complex communication network, adds nanosecond-level timestamps, standardizes the collected dual behavior characteristics, labels real-time device anomaly tags and user anomaly tags, simultaneously performs time-series relationship pre-determination, and outputs the time-series causal relationship tags of the real-time dual behavior characteristics and the anomaly feature fitting curves.

[0016] The rule building module extracts all operation and maintenance samples from historical operation and maintenance related data, extracts device behavior features and user behavior features from the operation and maintenance samples, and then performs labeling and temporal relationship pre-judgment according to the S1 method, and verifies the temporal relationship. Then, it builds a bidirectional temporal correlation fault rule base, including device-first rule set, user-first rule set and synchronous correlation rule set. The rule set includes multiple abnormal labels, abnormal feature fitting feature curves for each abnormal label and suggested operation and maintenance parameter set.

[0017] The anomaly operation and maintenance module is used to match the time-series causal relationship labels and anomaly feature fitting feature curves of real-time dual-behavioral features in the anomaly monitoring module with the bidirectional time-series correlation fault rule base, output the corresponding anomaly labels, effective operation and maintenance handling parameters and fault types, and then perform targeted operation and maintenance.

[0018] The operation and maintenance monitoring module is used to collect operation and maintenance-related parameters after targeted operation and maintenance, verify whether the anomaly has been eliminated, and if not, trigger a secondary handling process until the fault is eliminated.

[0019] The operation and maintenance optimization module is used to collect new operation and maintenance cases during the operation and maintenance process, re-run the rule building module to complete the full iteration of the rule base, and generate an operation and maintenance strategy optimization report based on the operation and maintenance verification results.

[0020] The beneficial effects of this invention are as follows: This invention provides a method and system for all-time-domain operation and maintenance assurance of communication networks based on artificial intelligence. It integrates the collection of device behavior characteristics and user behavior characteristics, performs pre-judgment of anomaly tags and temporal relationships, and combines historical operation and maintenance data with dedicated logic to verify the authenticity of the temporal relationships. A bidirectional temporal correlation fault rule base is constructed according to three types of temporal relationships, after which rule base matching and targeted operation and maintenance are performed. The operation and maintenance effect is then verified, secondary handling is carried out where needed, the rule base is continuously iterated, and the strategy is optimized. This realizes end-to-end all-time-domain operation and maintenance, from fault early warning, real-time identification, and differentiated handling to effect verification and strategy iteration. It adapts to the cascading transmission characteristics of communication network faults, improving the accuracy, efficiency, and practicality of operation and maintenance.

Attached Figure Description

[0021] To more clearly illustrate the technical solutions in the embodiments of the present invention or the prior art, the drawings used in the description of the embodiments or the prior art will be briefly introduced below. Obviously, the drawings described below are only some embodiments of the present invention. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.

[0022] Figure 1 is a schematic diagram of the implementation steps of the method of the present invention.

[0023] Figure 2 is a schematic diagram of the system structure connection of the present invention.

Detailed Implementation

[0024] The technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of the present invention, and not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.

[0025] As shown in Figure 1, an AI-based all-time-domain operation and maintenance guarantee method for communication networks includes the following steps: S1, deploying integrated acquisition nodes to collect device behavior features and user behavior features within complex communication networks, adding nanosecond-level timestamps, standardizing the collected dual behavior features, labeling real-time device anomaly tags and user anomaly tags, and simultaneously performing time-series relationship pre-determination, outputting the time-series causal relationship tags of the real-time dual behavior features and the anomaly feature fitting curves.

[0026] In a specific embodiment, step S1 is as follows: S101, deploy integrated end-network-cloud collection nodes to collect all-dimensional device behavior features and user behavior features within a complex communication network, uniformly add nanosecond-level timestamps, and standardize the dual behavior features using Z-score.
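The Z-score standardization named in S101 can be sketched as follows. This is a minimal illustration rather than the patented implementation: the function name `z_score`, the use of the population standard deviation, and the sample values are all assumptions made for the example.

```python
from statistics import mean, pstdev

def z_score(values):
    """Standardize a feature series to zero mean and unit variance."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation (an assumption)
    if sigma == 0:
        # A constant series carries no variation; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]

# Hypothetical samples: (timestamp in ns, CPU utilization in percent).
cpu_samples = [(1_000, 40.0), (2_000, 50.0), (3_000, 60.0)]
timestamps = [t for t, _ in cpu_samples]
standardized = list(zip(timestamps, z_score([v for _, v in cpu_samples])))
```

The timestamps are carried through unchanged, so the standardized series can still be used for the later time-difference calculations.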

[0027] It should be noted that the edge-side data acquisition node deployment adopts a distributed deployment, covering user terminals and edge devices. Lightweight data acquisition plugins are embedded in user terminals, edge gateways, and edge servers. The data acquisition plugins are connected to the operating systems and business interfaces of the terminals and edge devices, and data acquisition parameters are configured, including user behavior characteristics such as user traffic, concurrent connection count, and terminal access status, as well as basic equipment characteristics such as latency and bandwidth utilization of edge devices. Each edge-side node is configured with a nanosecond-level timestamp module, and time synchronization with network-side and cloud-side nodes is achieved through the PTP protocol to ensure that the time accuracy error of the acquired data is ≤1μs, providing support for subsequent time difference calculations.

[0028] Network-side data collection node deployment: Network-side data collection nodes are deployed in zones, focusing on collecting behavioral characteristics of network devices in each domain to achieve full coverage of fault domains: Independent data collection nodes are deployed in the access domain (switches and access routers near user terminals), core domain (core switches, routers, firewalls), and transmission domain (transmission link nodes, optical transceivers). Each fault domain has at least one primary data collection node and one backup node to avoid single points of failure. The data collection nodes are connected to the network elements in each domain via protocols such as SNMP and NetFlow, and the collection scope is configured to include the collection of device behavioral characteristics such as latency, packet loss rate, CPU, memory utilization, number of sessions, device operating status, and link redundancy. The nanosecond-level timestamp collection function is enabled simultaneously to synchronize with the end-side and cloud-side nodes to ensure that the device and user characteristics collection time is consistent for the same fault event. A dedicated data transmission link is built for the network-side data collection nodes to connect with the cloud-side aggregation nodes to ensure that the collected data does not occupy core business bandwidth and avoid affecting the normal operation of the network.

[0029] Cloud-side data acquisition node deployment: Cloud-side data acquisition nodes are deployed on the communication network operation and maintenance cloud platform, focusing on data aggregation, pre-processing, and collaborative scheduling. High-performance data acquisition and aggregation nodes are deployed on the operation and maintenance cloud platform, configured with data receiving interfaces, pre-processing modules, and storage modules. They establish communication with end-side and network-side data acquisition nodes through encrypted links; data receiving rules are configured to receive dual-behavioral feature data collected from the end-side and network-side according to time series type; nanosecond-level timestamps are retained synchronously to provide a basis for subsequent feature standardization processing and time series causal pre-determination; at the same time, a data caching mechanism is configured to avoid data loss and ensure the traceability of collected data.

[0030] Among them, device behavior characteristics refer to the performance, status, load and link transmission characteristics generated by various network devices and network element nodes in the communication network during operation, including but not limited to: device port status, link connectivity status, device CPU utilization, memory utilization, bandwidth utilization, data forwarding latency, packet loss rate, jitter value, number of concurrent sessions, number of device restarts, firmware running status and optical power.

[0031] User behavior characteristics refer to the abnormal characteristics of traffic, access, service experience, and behavior generated by user terminals and service objects in the process of using network services, including but not limited to: user uplink and downlink traffic volume, traffic burst amplitude, service access frequency, number of concurrent connections, number of online users, user access location, service type, user-side experience quality (QoE), service lag, number of interruptions, abnormal access behavior, and user access authentication status.

[0032] It should be noted that in this embodiment, "dual behavior features" is an abbreviation for "device behavior features" and "user behavior features".

[0033] S102. Label real-time device anomaly tags and user anomaly tags based on threshold and AI mutation detection algorithms, and obtain the timestamps of the device anomaly and the user anomaly, denoted t_de and t_ue respectively. Calculate the time difference Δt = t_ue − t_de: if Δt > 0, the timing relationship is device first; if Δt < 0, the timing relationship is user first; if Δt = 0, the timing relationship is synchronous association.
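The pre-determination rule in S102 reduces to a sign test on the time difference between the two anomaly timestamps. The sketch below assumes the difference is taken as user timestamp minus device timestamp, so that a positive value means the device anomaly came first. The function name and the optional `sync_tolerance_ns` parameter are hypothetical additions; with the default tolerance of 0, only exactly equal timestamps count as synchronous.

```python
def classify_temporal_relation(t_device_ns: int, t_user_ns: int,
                               sync_tolerance_ns: int = 0) -> str:
    """Classify the timing relationship from two anomaly timestamps (ns)."""
    delta = t_user_ns - t_device_ns  # assumed sign convention
    if abs(delta) <= sync_tolerance_ns:
        return "synchronous"
    return "device_first" if delta > 0 else "user_first"
```

Integer nanosecond timestamps are used so that the comparison is exact and not affected by floating-point rounding.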

[0034] In the above process, for device behavior characteristics, a threshold corresponding to each characteristic is first preset. Real-time device behavior characteristics are then compared to the corresponding thresholds. If a value exceeds the preset threshold, a preliminary candidate label for device anomaly is generated. Simultaneously, the nanosecond-level timestamp of the first time the characteristic exceeds the threshold is recorded as a candidate value for the device anomaly's start timestamp. Next, an improved AI mutation detection algorithm (based on a fusion model of Isolation Forest and LSTM) is introduced. The preprocessed time series of device behavior characteristics is input. The model learns the characteristic change patterns under normal device operation, identifies instantaneous mutations and gradual anomalies in the feature sequence, filters out false threshold exceedance signals caused by instantaneous network fluctuations, and verifies the initially labeled candidate labels. After successful verification, the device anomaly label is confirmed, and the timestamp of the device anomaly is extracted. The labeling method for user anomaly labels is the same as that for device anomaly labels.

[0035] The improved AI mutation detection algorithm is based on the fusion of Isolation Forest and LSTM, which is an existing technology. The thresholds corresponding to the features are set by engineers according to the operating status of the communication network; no numerical limits are imposed here.
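The two-stage labeling described above (threshold candidate, then model verification) can be sketched as below. The Isolation Forest and LSTM fusion model is not reproduced here. Instead, a deliberately simple stand-in verifier that requires several consecutive exceedances is plugged in, purely to show where the model's verdict would enter. All function names and the `min_consecutive` parameter are assumptions.

```python
from typing import Callable, Optional, Sequence, Tuple

Sample = Tuple[int, float]  # (timestamp in ns, feature value)

def first_exceedance(samples: Sequence[Sample], threshold: float) -> Optional[int]:
    """Timestamp of the first sample above the threshold, or None."""
    for ts, value in samples:
        if value > threshold:
            return ts
    return None

def sustained_verifier(threshold: float, min_consecutive: int = 3):
    """Stand-in for the detection model: accept only sustained exceedances."""
    def verify(samples: Sequence[Sample]) -> bool:
        run = 0
        for _, value in samples:
            run = run + 1 if value > threshold else 0
            if run >= min_consecutive:
                return True
        return False
    return verify

def label_anomaly(samples: Sequence[Sample], threshold: float,
                  verifier: Callable[[Sequence[Sample]], bool]) -> Optional[int]:
    """Return the anomaly start timestamp if the candidate is verified."""
    candidate = first_exceedance(samples, threshold)
    if candidate is None:
        return None
    return candidate if verifier(samples) else None
```

With this structure, a transient spike still produces a candidate timestamp but is discarded by the verifier, which mirrors the filtering role the text assigns to the detection model.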

[0036] S103. Use polynomial fitting to generate abnormal feature fitting curves, including curves showing the change of abnormal equipment behavior characteristics over time and curves showing the change of abnormal user behavior characteristics over time.

[0037] In the above process, for the labeled device anomaly tags, the time series of the single device behavior characteristic corresponding to each tag is extracted. A two-dimensional time-series data point set is constructed with nanosecond-level timestamps as the horizontal axis and the quantified values of the device anomaly behavior characteristic as the vertical axis. This data point set is then substituted into a polynomial fitting model, and the polynomial coefficients are solved using the least squares method to obtain the device anomaly feature fitting function. Based on this fitting function, a curve representing the change of the device anomaly behavior characteristic over time is generated, characterizing the core change pattern of the device anomaly. The generation process for the curve representing the change of user anomaly behavior characteristics over time is the same.
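A least-squares polynomial fit of the kind described in S103 can be sketched in plain Python by solving the normal equations. The function names, the default degree of 2, and the rescaling of timestamps to seconds since the first sample (done here only to keep the equations well conditioned) are assumptions of this sketch. The source does not specify the polynomial degree or any rescaling.

```python
def polyfit(xs, ys, degree):
    """Least-squares polynomial coefficients, lowest order first."""
    n = degree + 1
    # Normal equations A c = b, with A[i][j] = sum(x^(i+j)), b[i] = sum(y * x^i).
    a = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n):
                a[row][k] -= factor * a[col][k]
            b[row] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(a[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (b[i] - tail) / a[i][i]
    return coeffs

def evaluate(coeffs, x):
    """Evaluate the fitted polynomial at x."""
    return sum(c * x ** k for k, c in enumerate(coeffs))

def fit_anomaly_curve(samples, degree=2):
    """samples: list of (timestamp_ns, value). Returns (t0_ns, coefficients)."""
    t0 = samples[0][0]
    xs = [(t - t0) / 1e9 for t, _ in samples]  # seconds since first sample
    ys = [v for _, v in samples]
    return t0, polyfit(xs, ys, degree)
```

The fit needs at least `degree + 1` samples with distinct timestamps; with fewer, the normal equations are singular and the elimination step divides by zero.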

[0038] S2. Extract all operation and maintenance samples from historical operation and maintenance related data, extract device behavior features and user behavior features from the operation and maintenance samples, and then perform labeling and time sequence relationship pre-judgment in the manner of S1, and verify the time sequence relationship. Then, construct a bidirectional time sequence association fault rule base, including device-first rule set, user-first rule set and synchronization association rule set. The rule set includes multiple abnormal labels, abnormal feature fitting feature curves of each abnormal label and suggested operation and maintenance parameter set.
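One possible in-memory layout for the bidirectional time-series association fault rule base is sketched below: three rule sets keyed by temporal relationship, each rule holding anomaly labels, fitted curves, and a suggested parameter set. The class and field names are hypothetical, and representing a curve as a list of polynomial coefficients is an assumption carried over from the fitting step.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List

class TemporalRelation(Enum):
    DEVICE_FIRST = "device_first"
    USER_FIRST = "user_first"
    SYNCHRONOUS = "synchronous"

@dataclass
class Rule:
    device_label: str
    user_label: str
    device_curve: List[float]          # fitted curve coefficients (assumed form)
    user_curve: List[float]
    suggested_parameters: Dict[str, str]

@dataclass
class FaultRuleBase:
    # One rule set per temporal relationship.
    rule_sets: Dict[TemporalRelation, List[Rule]] = field(
        default_factory=lambda: {rel: [] for rel in TemporalRelation}
    )

    def add(self, relation: TemporalRelation, rule: Rule) -> None:
        self.rule_sets[relation].append(rule)

    def candidates(self, relation: TemporalRelation,
                   device_label: str, user_label: str) -> List[Rule]:
        """Rules in the given rule set whose labels match exactly."""
        return [r for r in self.rule_sets[relation]
                if r.device_label == device_label and r.user_label == user_label]
```

Keying the rule sets by temporal relationship means a lookup never crosses from, say, the device-first set into the user-first set, which reflects the text's point that handling strategies for different relationships are mutually exclusive.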

[0039] In a specific embodiment, the process of verifying the temporal relationship is as follows: S201, device-first verification: for samples pre-judged as device-first in the historical operation and maintenance related data, verify whether the user anomaly disappears synchronously within a preset time after the device anomaly has been handled and resolved. If so, it is a true causal relationship; otherwise, it is a false association and is removed.

[0040] In the above, sample preparation involves selecting a pre-judgment sample set from historical operation and maintenance data in the cloud that is pre-judged as device-first by S1 time series. Each pre-judgment sample is associated with complete core data dimensions, including: device anomaly tags, user anomaly tags, nanosecond-level timestamps of device and user anomalies, device anomaly handling records (including handling execution time, handling parameters, executing entity, handling completion status, etc.), anomaly resolution duration (device-first anomaly resolution duration, associated user anomaly resolution duration), and effective quantitative evaluation parameters in the operation and maintenance process.

[0041] Data extraction: for each pre-judgment sample, extract key information along the logical chain of device anomaly, handling action, and user anomaly: the device anomaly start timestamp t_de_start; the timestamp t_de_dis at which the device anomaly handling instruction was issued; the timestamp t_de_fin at which the device handling was completed and confirmed effective; the timestamp t_de_end at which the device anomaly was completely resolved; the start timestamps t_ue_start of all user anomalies that are time-series related to this device anomaly; the user anomaly peak timestamp t_ue_peak; the user anomaly resolution timestamp t_ue_end; whether any independent handling actions were taken on the user side; and whether there were any third-party network events.

[0042] Verification steps: take the timestamp t_de_end at which the device anomaly is completely resolved (that is, the device behavior features no longer exceed their preset thresholds) as the baseline. Within the verification time window [t_de_end, t_de_end + T], continuously collect the related user behavior features, where T is a preset duration. The criteria are: all user behavior features with temporal correlation do not exceed the preset threshold for any of the features within three consecutive sampling periods; no independent user-side actions are performed within the verification time window; there are no third-party network events affecting the process; and the elimination of user anomalies is solely due to the effective handling of device anomalies. If all the above conditions are met, it is determined to be a genuine causal relationship; otherwise, it is considered a false correlation and is removed.

[0043] The preset duration is set by the engineer based on the scale and configuration of the communication network.
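The device-first verification criteria can be expressed as a single predicate over the samples collected in the window that starts when the device anomaly is resolved and lasts for the preset duration T. The sketch below reads "three consecutive sampling periods" as a run of three consecutive in-window samples in which every related user feature is at or below its threshold. That reading, the function name, and the data layout are assumptions.

```python
def verify_device_first(t_de_end_ns, window_ns, user_samples, thresholds,
                        independent_user_action, third_party_event,
                        required_periods=3):
    """Return True if the sample qualifies as a genuine device-first causality.

    user_samples: time-ordered list of (timestamp_ns, {feature_name: value}),
                  one entry per sampling period.
    thresholds:   {feature_name: preset threshold}.
    """
    # Any independent user-side handling or third-party event disqualifies
    # the sample, since the recovery could then have another cause.
    if independent_user_action or third_party_event:
        return False
    run = 0
    for ts, features in user_samples:
        if not (t_de_end_ns <= ts <= t_de_end_ns + window_ns):
            continue  # outside the verification window
        within = all(features[name] <= limit for name, limit in thresholds.items())
        run = run + 1 if within else 0
        if run >= required_periods:
            return True
    return False
```

The same predicate could serve user-first verification with the roles of device and user swapped, since the text states that the verification steps are identical.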

[0044] S202. User-first verification: for samples pre-judged as user-first in the historical operation and maintenance related data, verify whether the device anomaly is eliminated synchronously within a preset time after the user's abnormal behavior has been brought under control. If so, it is a true causal relationship; otherwise, it is a false association and is removed.

[0045] The verification steps for user-first verification are the same as those for device-first verification.

[0046] S203. Synchronous association verification: for samples pre-judged as synchronous association in the historical operation and maintenance related data, verify whether there is a third-party triggering factor, and whether the device and user anomalies disappear synchronously after that factor is eliminated. If so, it is a true causal relationship; otherwise, it is a false association and is removed.

[0047] Specifically, sample preparation and data extraction follow the same process as in device-first verification. Data extraction additionally covers: the type of the third-party factor; the timestamp t_third_start at which the factor began; the timestamp t_third_peak at which its influence peaked; the timestamp t_third_dis at which handling of the factor began; the timestamp t_third_end at which the factor was completely eliminated; whether the scope of the factor's influence covers the devices and users corresponding to the dual anomalies; and whether there were independent handling actions on the device side or the user side.

[0048] The verification steps are the same as those of device-first verification, with the following additions. Authenticity verification: the third-party factor must have objective records, including but not limited to attack detection logs, data center monitoring records, network cutover notifications, public network status alarms, and device firmware update logs. Correlation verification: verify the temporal correlation and impact correlation between the third-party factor and the dual anomalies: (1) temporal correlation: the occurrence time t_third_start of the third-party factor must be earlier than or equal to the start timestamp of the dual anomalies, that is, t_third_start ≤ max(t_de_start, t_ue_start), and the time difference is ≤5μs, ensuring that the dual anomalies are triggered by the third-party factor rather than by other causes; (2) impact correlation: the scope of the third-party factor's influence must fully cover the devices and users corresponding to the dual anomalies; (3) there are no independent handling actions on the device side or the user side during the operation and maintenance process. If all the above conditions are met, it is a true causal relationship; otherwise, it is a false correlation and is removed.
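The additional checks for synchronous association can be sketched as a predicate. The text does not say which two timestamps the 5 μs limit applies to. This sketch assumes it is the gap between the third-party factor's start and the later of the two anomaly start timestamps, and exposes the limit as a parameter so that another reading can be substituted. The function and argument names are hypothetical.

```python
def verify_synchronous(t_third_start_ns, t_de_start_ns, t_ue_start_ns,
                       affected_devices, affected_users,
                       anomaly_devices, anomaly_users,
                       has_objective_record, independent_action,
                       max_gap_ns=5_000):
    """Return True if a synchronous-association sample passes the added checks."""
    latest_start = max(t_de_start_ns, t_ue_start_ns)
    # (1) Temporal correlation: factor starts no later than the anomalies,
    #     and within the allowed gap (assumed interpretation of "≤5μs").
    temporal_ok = (t_third_start_ns <= latest_start
                   and latest_start - t_third_start_ns <= max_gap_ns)
    # (2) Impact correlation: the factor's scope covers every anomalous
    #     device and user.
    impact_ok = (set(anomaly_devices) <= set(affected_devices)
                 and set(anomaly_users) <= set(affected_users))
    # Authenticity (objective record) and (3) no independent handling.
    return bool(has_objective_record and temporal_ok and impact_ok
                and not independent_action)
```

This covers only the checks that S203 adds; the window-based recovery check from device-first verification would be applied alongside it.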

[0049] S204. For samples verified as genuine causal relationships, add the final temporal causal relationship label in the form: device anomaly label - user anomaly label - final temporal relationship.

[0050] In a specific embodiment, the process of constructing the bidirectional time-series associated fault rule base is as follows: S211, from the historical operation and maintenance related data, extract all samples that carry a final time-series causal relationship label after time-series relationship verification. Each sample contains dual behavioral features, anomaly labels, the final time-series relationship, the fault type, and operation and maintenance handling records.

[0051] S212. Standardize and calibrate the dual-behavioral features in each sample to generate anomaly feature fitting curve templates corresponding to each anomaly label; extract the effective operation and maintenance parameters in each sample and organize them into a standardized set of recommended operation and maintenance parameters.

[0052] In the above process, the abnormal labels are classified and organized to ensure that similar abnormal labels are grouped together. The Z-score standardization method is used to standardize the dual-behavioral features separately to eliminate differences in numerical magnitude and dimension. Each standardized feature value is verified, and outliers that exceed the [0,1] interval after calibration are removed. At the same time, the nanosecond-level timestamp corresponding to each feature value is retained to ensure that the calibrated data not only eliminates the difference in magnitude but also completely retains the change trend of the original abnormal features without changing the core change law of the features.

[0053] Based on standardized and calibrated dual-behavioral feature data, typical variation patterns of similar anomalies are aggregated. The specific steps are as follows: Using anomaly labels as the core grouping criterion, all samples in the standardized dual-behavioral feature dataset are aggregated, with all samples sharing the same anomaly label grouped together. This ensures that each group corresponds to a unique anomaly label. For each group of samples with the same anomaly label, polynomial fitting is performed on both device and user anomaly behavior features to obtain the anomaly feature fitting curve for each sample within that group. Subsequently, the fitting curves of all samples within the same group are aggregated, and the curve averaging fusion method is used to extract the standardized feature mean of all fitting curves at the same time sampling points. This constructs the average fitting curve corresponding to the anomaly label, serving as the initial curve template for that anomaly label. The initial curve template is then optimized by removing sampling points with abnormal fluctuations in the curve and smoothing the curve trend to ensure that the template accurately represents the typical variation patterns of the anomaly label. This completes the generation of the anomaly feature fitting curve template, and each template is labeled with complete association information, including but not limited to anomaly labels and time-series scene types.
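The curve averaging fusion and smoothing can be sketched as follows. This is a simplified illustration under stated assumptions: each sample's polynomial fit is assumed to have been evaluated already at the same sampling points, and a centered moving average stands in for the unspecified optimization step that removes abnormal fluctuations. The function names are hypothetical.

```python
def average_curves(curves):
    """Point-wise mean of fitted curves sampled at the same time points."""
    if not curves:
        raise ValueError("at least one curve is required")
    length = len(curves[0])
    if any(len(curve) != length for curve in curves):
        raise ValueError("all curves must share the same sampling points")
    return [sum(points) / len(curves) for points in zip(*curves)]


def smooth(curve, window=3):
    """Centered moving average; the window is truncated at both ends."""
    half = window // 2
    smoothed = []
    for i in range(len(curve)):
        segment = curve[max(0, i - half):min(len(curve), i + half + 1)]
        smoothed.append(sum(segment) / len(segment))
    return smoothed


def build_template(curves, window=3):
    """Initial template = averaged curve, then smoothed."""
    return smooth(average_curves(curves), window)
```

One template would be built per anomaly label group and then tagged with its anomaly label and time-series scene type.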

[0054] Preferably, the extraction process of effective operation and maintenance (O&M) parameters in each sample is as follows: obtain the prior anomaly resolution time and associated anomaly resolution time after the execution of each O&M parameter from the O&M record of each sample; at the same time, extract the corresponding effective quantitative evaluation parameters, and then perform normalization processing to evaluate the O&M effectiveness coefficient of each O&M parameter; select each O&M parameter whose O&M effectiveness coefficient is greater than the preset O&M effectiveness coefficient threshold as each target parameter; and then aggregate and organize each target parameter to form standardized effective O&M parameters.

[0055] In the above, the effective quantitative evaluation parameters are quantitative parameters that reflect the effectiveness of the operation and maintenance handling results, including but not limited to memory utilization increment, the proportion of affected users who have not recovered, and the interval between anomaly recurrences. They are processed with the min-max normalization method. All of the effective quantitative evaluation parameters are then positively oriented, and the mean of the positively oriented parameters is taken as the effective quantitative evaluation parameter value, denoted as A. The operation and maintenance effectiveness coefficient = $(1 - T_{\text{prior}}) \times 0.5 + (1 - T_{\text{assoc}}) \times 0.3 + A \times 0.2$, where $T_{\text{prior}}$ and $T_{\text{assoc}}$ represent the normalized duration for resolving the prior anomaly and the normalized duration for resolving the associated anomaly, respectively.

[0056] It should be noted that positive transformation adjusts each normalized parameter according to its attribute so that a larger value always represents a better handling outcome. For example, a larger memory utilization increment indicates a worse handling outcome, so it is a negative parameter, and its positively transformed value is 1 minus its normalized value. A larger interval between anomaly recurrences indicates a better handling outcome, so it is a positive parameter that requires no further transformation; its normalized value is used directly.
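The calculation of the operation and maintenance effectiveness coefficient can be sketched as below. The weights 0.5, 0.3 and 0.2 come from the text; the function names and the handling of a degenerate min-max range are assumptions made for illustration.

```python
def min_max(value, low, high):
    """Min-max normalization of one value into [0, 1]."""
    if high == low:
        return 0.0  # assumption: a degenerate range maps to 0
    return (value - low) / (high - low)


def positive_orient(normalized, higher_is_better):
    """Negative parameters become 1 - normalized; positive ones are unchanged."""
    return normalized if higher_is_better else 1.0 - normalized


def effectiveness_coefficient(t_prior_norm, t_assoc_norm, oriented_params):
    """(1 - T_prior) * 0.5 + (1 - T_assoc) * 0.3 + A * 0.2."""
    a = sum(oriented_params) / len(oriented_params)  # A: mean of oriented values
    return (1 - t_prior_norm) * 0.5 + (1 - t_assoc_norm) * 0.3 + a * 0.2
```

For example, with normalized resolution durations of 0.2 and 0.4 and positively oriented evaluation values of 0.9, 0.7 and 0.5, A is 0.7 and the coefficient is 0.4 + 0.18 + 0.14 = 0.72.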

[0057] The preset effective maintenance coefficient threshold is set and adjusted by the engineer according to the communication network maintenance needs, and no specific numerical limit is specified here.

[0058] It should be noted that operation and maintenance (O&M) parameters refer to the quantifiable, configurable, and executable instructions, thresholds, strategies, and constraints used in the actual execution of O&M operations to eliminate equipment malfunctions, user anomalies, or third-party triggering factors when an anomaly occurs in the communication network. These are the specific execution parameters for repair. They include, but are not limited to, device port restart instructions, link switching thresholds, CPU utilization rate limiting thresholds, device restart latency, optical power calibration parameters, fault port shielding parameters, firmware repair parameters, user traffic rate limiting thresholds, concurrent connection limits, abnormal access blocking strategies, user bandwidth convergence ratios, and terminal access authentication restriction parameters.

[0059] S213. Using a time-series correlation algorithm, for the three types of time-series relationships, the correlation between anomaly labels, anomaly feature fitting curves, effective operation and maintenance parameters and fault types in each time-series relationship sample is mined and used as rules to generate an initial rule set.

[0060] Preferably, the process of mining the association relationship is as follows: Equipment-first scenario: Using a time-series association algorithm, the abnormal feature fitting curve corresponding to the same equipment abnormality label is extracted to form an equipment abnormality label-curve mapping, resulting in multiple equipment abnormality label-curve combinations; the user abnormality features and fault types corresponding to the equipment abnormality label-curve combinations are matched, and the effective operation and maintenance handling parameters of the fault type are extracted to form a complete association chain.

[0061] User-first scenario: Using a time-series correlation algorithm, the abnormal feature fitting curve corresponding to the same user abnormal label is extracted to form a user abnormal label-curve mapping, resulting in multiple user abnormal label-curve combinations. The device abnormal features and fault types corresponding to the user abnormal label-curve combinations are matched, and the effective operation and maintenance handling parameters of the fault type are extracted to form a complete correlation chain.

[0062] Synchronous association scenario: Using a time-series association algorithm, the device anomaly labels and user anomaly labels that occur synchronously are extracted, together with their corresponding anomaly feature fitting curves, forming a dual-label to dual-curve mapping. Multiple dual-label to dual-curve mapping combinations are obtained, the fault type of each combination is matched, and the effective operation and maintenance handling parameters of that fault type are extracted to form a complete association chain.

[0063] S214. Filter each rule in the initial rule set, select the valid rules, add a unique identifier and corresponding time sequence relationship label, and set the priority of each valid rule. Then, according to the time sequence relationship, divide them into device-first rule set, user-first rule set and synchronization association rule set. Each type of rule base is further subdivided and stored according to the fault type to form a bidirectional time sequence association fault rule base.
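The storage layout described in S214 (three rule sets, each subdivided by fault type, with prioritized rules) can be sketched as a small data structure. The class and field names below are hypothetical and only illustrate one possible organization.

```python
from collections import defaultdict
from dataclasses import dataclass
from enum import Enum


class TimingRelation(Enum):
    DEVICE_FIRST = "device_first"
    USER_FIRST = "user_first"
    SYNCHRONOUS = "synchronous"


@dataclass
class Rule:
    rule_id: str              # unique identifier
    relation: TimingRelation  # time sequence relationship label
    fault_type: str
    anomaly_labels: tuple
    curve_template: list
    handling_params: dict
    priority: float = 0.0


class RuleBase:
    """Rules are stored per timing relation, then per fault type."""

    def __init__(self):
        self._rules = defaultdict(lambda: defaultdict(list))

    def add(self, rule):
        bucket = self._rules[rule.relation][rule.fault_type]
        bucket.append(rule)
        # Keep the highest-priority rule first within each fault type.
        bucket.sort(key=lambda r: r.priority, reverse=True)

    def lookup(self, relation, fault_type=None):
        by_fault = self._rules.get(relation, {})
        if fault_type is not None:
            return list(by_fault.get(fault_type, []))
        return [rule for rules in by_fault.values() for rule in rules]
```

Locating the rule set by timing relation first keeps later curve matching restricted to rules with the same causal direction.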

[0064] Preferably, the valid rule screening process is as follows: extract the sample size of each rule and the operation and maintenance effectiveness coefficient of each sample; take the mean of the per-sample coefficients as the rule's average operation and maintenance effectiveness coefficient; normalize the sample size and the average operation and maintenance effectiveness coefficient of each rule; take the mean of the two normalized values as the effective coefficient of the rule; and keep the rules whose effective coefficient is > 0.6 as valid rules.

[0065] For each valid rule, the effective coefficient is obtained, and the number of devices and the number of users involved in each sample are obtained from the operation and maintenance records of each sample. These counts are averaged to give the average number of devices and the average number of users involved in each valid rule. After normalization, the effective coefficient, the average number of devices, and the average number of users of each valid rule are averaged, and the result is used as the priority of that rule.
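The screening and priority calculation can be sketched as follows. The 0.6 cut-off comes from the text; min-max normalization across rules is an assumption, since the normalization method is not named for this step, and the dictionary keys are hypothetical.

```python
def min_max_normalize(values):
    """Min-max normalize a list of values across rules."""
    low, high = min(values), max(values)
    if high == low:
        return [0.0] * len(values)  # assumption for a degenerate range
    return [(v - low) / (high - low) for v in values]


def screen_rules(rules, threshold=0.6):
    """Keep rules whose effective coefficient exceeds the threshold."""
    sizes = min_max_normalize([r["sample_size"] for r in rules])
    effs = min_max_normalize([r["avg_effectiveness"] for r in rules])
    valid = []
    for rule, size, eff in zip(rules, sizes, effs):
        coeff = (size + eff) / 2  # mean of the two normalized values
        if coeff > threshold:
            valid.append({**rule, "effective_coeff": coeff})
    return valid


def assign_priority(valid_rules):
    """Priority = mean of normalized coefficient, device count and user count."""
    coeffs = min_max_normalize([r["effective_coeff"] for r in valid_rules])
    devices = min_max_normalize([r["avg_devices"] for r in valid_rules])
    users = min_max_normalize([r["avg_users"] for r in valid_rules])
    return [{**r, "priority": (c + d + u) / 3}
            for r, c, d, u in zip(valid_rules, coeffs, devices, users)]
```

Rules that affect more devices and users on average end up with a higher priority, all else being equal.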

[0066] S3. Match the time-series causal relationship labels and anomaly feature fitting curves of the real-time dual-behavioral features in S1 with the bidirectional time-series correlation fault rule base, output the corresponding anomaly labels, effective operation and maintenance handling parameters and fault types, and then carry out targeted operation and maintenance.

[0067] In a specific embodiment, the specific process of S3 is as follows: the temporal causal relationship label and anomaly feature fitting curve of the real-time dual behavior feature in S1 are matched hierarchically with the bidirectional temporal correlation fault rule base. First, the device-first rule set, user-first rule set or synchronization correlation rule set corresponding to the bidirectional temporal correlation fault rule base are located according to the temporal causal relationship label. Then, the cosine similarity and fitting deviation value between the real-time anomaly feature fitting curve and the curve template of the corresponding anomaly label in the rule base are calculated.

[0068] The calculated cosine similarity is compared with a preset similarity threshold, and the fitting deviation value is compared with a preset deviation threshold. Only valid candidate rules with a cosine similarity greater than the preset similarity threshold and a fitting deviation value less than the preset deviation threshold are retained. Then, a comprehensive matching score is calculated by weighting the cosine similarity and the fitting deviation value, and the rule with the largest comprehensive matching score is selected as the optimal association rule for this fault identification.

[0069] Extract the anomaly label, effective operation and maintenance parameters, and fault type corresponding to the optimal association rule from the bidirectional time-series associated fault rule base.

[0070] It should be noted that cosine similarity and fitting deviation values are existing technologies and will not be elaborated upon here. The preset similarity threshold and preset deviation threshold are set by staff according to network security requirements, and no specific numerical limits are imposed here.

[0071] The overall matching score is calculated as: cosine similarity × W1 + (1 - fitting deviation value) × W2, where W1 and W2 are the weights of cosine similarity and fitting deviation value, respectively, satisfying W1 + W2 = 1, W1 = 0.6, and W2 = 0.4.
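The threshold filtering and weighted scoring can be sketched as below. The weights W1 = 0.6 and W2 = 0.4 are taken from the text. The fitting deviation is not defined in the source, so root-mean-square error between curves scaled to [0, 1] is assumed here; the function names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def fitting_deviation(a, b):
    """Assumed definition: root-mean-square error between the two curves."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


def best_rule(curve, templates, sim_threshold, dev_threshold, w1=0.6, w2=0.4):
    """Return (rule_id, score) of the best candidate, or None if none qualifies."""
    best = None
    for rule_id, template in templates.items():
        sim = cosine_similarity(curve, template)
        dev = fitting_deviation(curve, template)
        # Keep only candidates that pass both preset thresholds.
        if sim > sim_threshold and dev < dev_threshold:
            score = sim * w1 + (1 - dev) * w2
            if best is None or score > best[1]:
                best = (rule_id, score)
    return best
```

Sorting the surviving candidates by score instead of keeping only the maximum would also give the top three rules used in secondary handling.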

[0072] It should be noted that the values of W1 and W2 can also be set by staff according to network security requirements.

[0073] S4. After targeted maintenance, collect relevant maintenance parameters and verify whether the anomaly has been eliminated. If not, trigger a secondary handling process until the fault is eliminated.

[0074] In a specific embodiment, the specific process of S4 is as follows: S401, after targeted operation and maintenance, collect operation and maintenance related parameters after handling, including the time for initial anomaly resolution, the time for associated anomaly resolution, and effective quantitative evaluation parameters, and then perform normalization processing to evaluate the operation and maintenance effectiveness coefficient of this operation and maintenance. If the operation and maintenance effectiveness coefficient is greater than the preset operation and maintenance effectiveness coefficient threshold, the operation and maintenance is determined to be effective and the anomaly is eliminated; otherwise, the operation and maintenance is determined to be ineffective and the anomaly is not eliminated.

[0075] It should be noted that effective quantitative evaluation parameters refer to objective evaluation indicators used to quantitatively reflect the effectiveness of targeted operation and maintenance and whether the anomalies have been truly and effectively eliminated. These are all collectable, calculable, and normalizable values, including but not limited to: memory utilization increment, the proportion of affected users who have not recovered, and the interval between anomaly recurrences.

[0076] The calculation process for the operational effectiveness coefficient is as follows: Normalized effective quantitative evaluation parameters are positively oriented: For negative parameters, such as memory utilization increment and the proportion of affected users not yet recovered, the positively oriented value = 1 - the normalized value; for positive parameters, such as the interval between abnormal recurrences, the normalized value is directly used. The arithmetic mean of all positively oriented effective quantitative evaluation parameters is taken as the processed effective quantitative evaluation parameter value.

[0077] Operation and maintenance effectiveness coefficient = (1 - normalized value of time for resolving prior anomalies) × 0.5 + (1 - normalized value of time for resolving associated anomalies) × 0.3 + effective quantitative evaluation parameter processing value × 0.2.

[0078] The preset operation and maintenance effectiveness threshold is the critical value for determining whether operation and maintenance is effective. It is set by the staff themselves according to network security needs.

[0079] S402. For secondary handling, select the rules with the top three comprehensive matching scores as the preferred association rules for this fault identification. Extract the anomaly labels, effective operation and maintenance parameters, and fault types corresponding to the three preferred association rules from the bidirectional time-series association fault rule base. If the anomaly labels and fault types corresponding to the three preferred association rules are consistent, merge and deduplicate the effective operation and maintenance parameters of the three rules to obtain the effective operation and maintenance parameters for secondary handling.

[0080] S403. If the anomaly labels or fault types corresponding to the three preferred association rules are inconsistent, the preferred association rule with the highest comprehensive matching score shall be used as the benchmark rule. The anomaly labels and fault types corresponding to the benchmark rule shall be used as the core judgment criteria to select the remaining preferred association rules that are consistent with the anomaly labels and fault types of the benchmark rule. Then, they shall be merged and deduplicated to obtain the benchmark handling parameter set. For rules with inconsistent anomaly labels or fault types, only the handling parameters that are compatible with the current time-series causal relationship shall be retained. After the compatibility verification is passed, they shall be added to the benchmark handling parameter set to form effective operation and maintenance handling parameters for secondary handling.
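The merging logic of S402 and S403 can be sketched as one function. This is an illustrative simplification: handling parameters are represented as plain strings, the adaptability verification is abstracted into a caller-supplied predicate `is_compatible`, and all names are hypothetical.

```python
def merge_params(rules):
    """Merge and deduplicate handling parameters, preserving first-seen order."""
    merged = []
    for rule in rules:
        for param in rule["params"]:
            if param not in merged:
                merged.append(param)
    return merged


def secondary_params(top_rules, is_compatible):
    """top_rules must be sorted by comprehensive matching score, best first."""
    benchmark = top_rules[0]
    key = (benchmark["label"], benchmark["fault_type"])
    consistent = [r for r in top_rules
                  if (r["label"], r["fault_type"]) == key]
    # Benchmark handling parameter set: rules that agree with the benchmark.
    result = merge_params(consistent)
    for rule in top_rules:
        if (rule["label"], rule["fault_type"]) != key:
            for param in rule["params"]:
                # Inconsistent rules contribute only verified parameters.
                if is_compatible(param) and param not in result:
                    result.append(param)
    return result
```

When all three rules share the same anomaly label and fault type, the function reduces to the plain merge and deduplication of S402.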

[0081] It should be noted that handling parameters adapted to the current causal relationship refer to operational handling parameters whose direction and object of handling are consistent with the causal relationship and can directly eliminate the root cause of the anomaly. For example, if the current causal relationship is device-first and the anomaly originates from the device side, while user anomalies are passively propagated, then the appropriate handling parameters are those addressing device-side issues such as those related to devices, ports, links, CPUs, memory, optical power, and firmware. Incompatible parameters are those related to the control of user traffic, user access, and user behavior.

[0082] The aforementioned adaptability verification is a dedicated scenario verification performed during the secondary processing on the processing parameters extracted from rules with inconsistent anomaly labels or fault types. Specifically, it includes: temporal causality adaptation verification: verifying whether the object and direction of the processing parameter match the current device-first, user-first, or synchronously associated temporal causality, ensuring that the parameter acts on the root cause of the anomaly, rather than a related anomaly resulting from propagation; baseline rule consistency verification: verifying whether the parameter is compatible with the anomaly labels, fault types, and processing logic of the baseline rules, avoiding conflicts with the baseline processing parameter set; execution scenario legality verification: verifying whether the parameter value, instruction format, and execution conditions conform to the requirements of the current fault domain, device type, and user scenario; and security without secondary risk verification: verifying that executing the parameter will not cause device overload, service interruption, network fluctuations, or other secondary anomalies.

[0083] The timing adaptation check obtains the timing causal relationship type of the current fault and determines whether the target and direction of the supplementary handling parameters match the timing type: if the device comes first, only the handling parameters acting on the device side are retained.

[0084] If the user takes precedence, only the processing parameters that apply to the user side will be retained.

[0085] If it is a synchronous association, only the processing parameters that apply to third-party triggering factors will be retained; if they do not match, they will be discarded directly.

[0086] The parameter range verification relies on legal value ranges and reasonable threshold ranges that staff preset for the various operation and maintenance parameters. It determines whether the value of the parameter to be added falls within the preset legal range; if it exceeds the range, the verification fails.

[0087] The conflict check compares the parameter to be supplemented with all parameters in the baseline parameter set one by one: if there are parameters with opposite effects, mutually exclusive execution, or logical contradictions, then a conflict is determined and the parameter is discarded; if there is no conflict, then it is retained.

[0088] The execution condition verification checks whether the current device status, network status, user status, and fault scenario meet the execution prerequisites for the handling parameter. If not, the parameter will not be used.

[0089] The security check determines whether executing the processing parameters will cause equipment overload, service interruption, link oscillation or other secondary anomalies. If there is a risk, the check will fail.

[0090] Only processing parameters that simultaneously meet all the above verification conditions are deemed to have passed the adaptability verification and can be added to the baseline processing parameter set; if any condition is not met, the parameter is discarded. The specific verification process is a conventional technique in this field and will not be elaborated here.
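The chain of checks in the adaptability verification can be sketched as a sequence of early exits. The source treats the individual checks as conventional techniques, so everything below is an assumption made for illustration: the `HandlingParam` fields, the mapping from timing relation to target side, and the caller-supplied `preconditions_met` and `is_risky` predicates.

```python
from dataclasses import dataclass

# Timing relation -> side on which a supplementary parameter may act.
ALLOWED_TARGET = {
    "device_first": "device",
    "user_first": "user",
    "synchronous": "third_party",
}


@dataclass(frozen=True)
class HandlingParam:
    name: str
    target: str                            # "device", "user" or "third_party"
    value: float
    conflicts_with: frozenset = frozenset()  # names of mutually exclusive params


def passes_adaptability(param, relation, legal_ranges, baseline_names,
                        preconditions_met, is_risky):
    """Return True only if the parameter passes every check."""
    # Timing adaptation check: the parameter must act on the root-cause side.
    if ALLOWED_TARGET.get(relation) != param.target:
        return False
    # Parameter range check against staff-preset legal ranges.
    bounds = legal_ranges.get(param.name)
    if bounds is None or not (bounds[0] <= param.value <= bounds[1]):
        return False
    # Conflict check against the baseline handling parameter set.
    if param.conflicts_with & baseline_names:
        return False
    # Execution condition check and security check (caller-supplied).
    if not preconditions_met(param) or is_risky(param):
        return False
    return True
```

A parameter that fails any single check is discarded rather than added to the baseline handling parameter set.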

[0091] S5. Collect new operation and maintenance cases during the operation and maintenance process, repeat S2 to complete the full iteration of the rule base, and generate an operation and maintenance strategy optimization report based on the operation and maintenance verification results.

[0092] In a specific embodiment, the process of S5 is as follows: S501, continuously collect the newly added real operation and maintenance cases generated throughout the operation and maintenance process, and fully collect all of their data, including time-series causal relationship labels, anomaly feature fitting curves, fault types, device and user anomaly labels, operation and maintenance handling parameters, operation and maintenance effectiveness coefficients, and effective quantitative evaluation parameters. Then, following the rule base construction process of S2, complete the full iterative update of the bidirectional time-series associated fault rule base.

[0093] S502. Based on the results of multiple rounds of operation and maintenance verification, classify, statistically analyze and quantify the results according to time sequence relationships and fault types, analyze the improvement indicators of operation and maintenance efficiency and fault handling effect of each time sequence relationship, determine the optimization direction of operation and maintenance strategy for each fault type in each time sequence relationship, and form an operation and maintenance strategy optimization report.

[0094] Preferably, the operation and maintenance efficiency analysis process is as follows: Statistics are performed based on three time-series relationships: equipment-first, user-first, and synchronously related, as well as various fault types. The success rate of first-time handling and the average operation and maintenance effectiveness coefficient are calculated, and then normalized. Operation and maintenance efficiency = First-time handling success rate × 0.5 + Average operation and maintenance effectiveness coefficient × 0.5. Wherein, the first-time handling success rate is: (Number of cases where one operation and maintenance is effective (no secondary handling) / Total number of cases) × 100%; the average operation and maintenance effectiveness coefficient is the average of the operation and maintenance effectiveness coefficients of all cases of the same type of fault.
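The efficiency statistic can be sketched per timing relation and fault type as follows. The equal 0.5 weights come from the text; the success rate is kept as a fraction in [0, 1] rather than a percentage so that both terms are on the same scale, and the dictionary keys are hypothetical.

```python
from collections import defaultdict


def om_efficiency(cases):
    """First-time success rate * 0.5 + average effectiveness coefficient * 0.5."""
    first_time_ok = sum(1 for c in cases if c["first_time_ok"])
    success_rate = first_time_ok / len(cases)
    avg_effectiveness = sum(c["effectiveness"] for c in cases) / len(cases)
    return success_rate * 0.5 + avg_effectiveness * 0.5


def efficiency_by_group(cases):
    """Group cases by (timing relation, fault type) and score each group."""
    groups = defaultdict(list)
    for case in cases:
        groups[(case["relation"], case["fault_type"])].append(case)
    return {key: om_efficiency(group) for key, group in groups.items()}
```

A case counts as a first-time success when the initial targeted handling was effective and no secondary handling was triggered.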

[0095] Indicators for improving fault handling effectiveness: Based on the rule base before iteration, extract the improvement value of the success rate of handling and the improvement of the operation and maintenance effectiveness coefficient for each type of fault in the three types of time-series relationships.

[0096] Specifically, the improvement in the first-time handling success rate = the first-time handling success rate after iteration - the first-time handling success rate before iteration. The improvement in the operational efficiency coefficient = the average operational efficiency coefficient after iteration - the average operational efficiency coefficient before iteration.

[0097] Optimization directions for operation and maintenance strategies: if the first-time handling success rate is less than the first-time handling success rate threshold, expand the sample set; if the average operation and maintenance effectiveness coefficient is less than its threshold, refine the curve template and optimize the handling response speed; if the improvement value of the first-time handling success rate is less than its threshold, optimize the parameter fusion strategy for secondary handling; if the improvement magnitude of the operation and maintenance effectiveness coefficient is less than its threshold, optimize the operation and maintenance handling parameter strategy.
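The four threshold comparisons can be sketched as a small decision function for one timing relation and fault type. The improvement values are computed as the after-iteration value minus the before-iteration value, as defined in the text; the dictionary keys and the action strings are hypothetical.

```python
def optimization_directions(before, after, thresholds):
    """Return the list of optimization directions triggered by the statistics."""
    actions = []
    rate_gain = after["first_time_rate"] - before["first_time_rate"]
    eff_gain = after["avg_effectiveness"] - before["avg_effectiveness"]
    if after["first_time_rate"] < thresholds["first_time_rate"]:
        actions.append("expand the sample set")
    if after["avg_effectiveness"] < thresholds["avg_effectiveness"]:
        actions.append("refine the curve template and handling response speed")
    if rate_gain < thresholds["rate_improvement"]:
        actions.append("optimize the secondary handling parameter fusion strategy")
    if eff_gain < thresholds["effectiveness_improvement"]:
        actions.append("optimize the handling parameter strategy")
    return actions
```

The resulting list could feed directly into the per-fault-type section of the strategy optimization report.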

[0098] The above-mentioned thresholds for first-time handling success rate, average maintenance effectiveness coefficient, first-time handling success rate improvement value, and maintenance effectiveness coefficient improvement magnitude are set in the same way as the preset maintenance effectiveness coefficient thresholds, and will not be repeated here.

[0099] As shown in Figure 2, an AI-based all-time-domain operation and maintenance support system for communication networks includes: an anomaly monitoring module, which deploys integrated acquisition nodes to collect device behavior characteristics and user behavior characteristics within complex communication networks, adds nanosecond-level timestamps, standardizes the collected dual-behavioral characteristics, labels real-time device anomaly tags and user anomaly tags, performs temporal relationship pre-determination, and outputs the temporal causal relationship tags and anomaly feature fitting curves of the real-time dual-behavioral characteristics.

[0100] The rule building module extracts all operation and maintenance samples from historical operation and maintenance related data, extracts device behavior features and user behavior features from the operation and maintenance samples, and then performs labeling and temporal relationship pre-judgment according to the S1 method, and verifies the temporal relationship. Then, it builds a bidirectional temporal correlation fault rule base, including device-first rule set, user-first rule set and synchronous correlation rule set. The rule set includes multiple abnormal labels, abnormal feature fitting feature curves for each abnormal label and suggested operation and maintenance parameter set.

[0101] The anomaly operation and maintenance module is used to match the time-series causal relationship labels and anomaly feature fitting feature curves of real-time dual-behavioral features in the anomaly monitoring module with the bidirectional time-series correlation fault rule base, output the corresponding anomaly labels, effective operation and maintenance handling parameters and fault types, and then perform targeted operation and maintenance.

[0102] The operation and maintenance monitoring module is used to collect operation and maintenance-related parameters after targeted operation and maintenance, verify whether the anomaly has been eliminated, and if not, trigger a secondary handling process until the fault is eliminated.

[0103] The operation and maintenance optimization module is used to collect new operation and maintenance cases during the operation and maintenance process, re-invoke the rule building module to complete the full iteration of the rule base, and generate an operation and maintenance strategy optimization report based on the operation and maintenance verification results.

[0104] The examples described in this invention are not limited to the specific embodiments listed above. The examples are merely illustrative to facilitate understanding of the invention and do not constitute a limitation on the scope of protection of this invention. Any modifications, equivalent substitutions, etc., made within the spirit and principles of this invention should be included within the scope of protection.

[0105] The above description is merely an example and illustration of the concept of the present invention. Those skilled in the art can make various modifications or additions to the specific embodiments described or use similar methods to replace them, as long as they do not deviate from the concept of the invention or exceed the scope defined in this specification, they should all fall within the protection scope of the present invention.

Claims

1. A method for all-time-domain operation and maintenance support of communication networks based on artificial intelligence, characterized in that it includes the following steps: S1. Deploy integrated acquisition nodes to collect device behavior characteristics and user behavior characteristics within complex communication networks, add nanosecond-level timestamps, standardize the collected dual behavior characteristics, label real-time device anomaly tags and user anomaly tags, and simultaneously perform temporal relationship pre-determination, outputting the temporal causal relationship tags and anomaly feature fitting curves of the real-time dual behavior characteristics. S2. Extract all operation and maintenance samples from historical operation and maintenance related data, extract device behavior features and user behavior features from the operation and maintenance samples, perform labeling and temporal relationship pre-determination in the manner of S1, and verify the temporal relationship. Then, construct a bidirectional time-series correlation fault rule base, including a device-first rule set, a user-first rule set, and a synchronous association rule set. Each rule set includes multiple anomaly labels, the anomaly feature fitting curve of each anomaly label, and a suggested operation and maintenance parameter set. S3. Match the temporal causal relationship tags and anomaly feature fitting curves of the real-time dual behavior characteristics in S1 with the bidirectional time-series correlation fault rule base, output the corresponding anomaly labels, effective operation and maintenance handling parameters, and fault types, and then carry out targeted operation and maintenance. S4. After targeted operation and maintenance, collect the relevant operation and maintenance parameters to verify whether the anomaly has been eliminated. If not, trigger a secondary handling process until the fault is eliminated. S5. Collect new operation and maintenance cases during the operation and maintenance process, repeat S2 to complete the full iteration of the rule base, and generate an operation and maintenance strategy optimization report based on the operation and maintenance verification results.

2. The method for all-time-domain operation and maintenance support of communication networks based on artificial intelligence according to claim 1, characterized in that the steps in S1 are as follows: S101. Deploy an integrated end-network-cloud data collection node to collect all-dimensional device behavior characteristics and user behavior characteristics in complex communication networks, uniformly add nanosecond-level timestamps, and complete the dual behavior feature standardization through Z-score; S102. Label real-time device anomaly tags and user anomaly tags based on threshold and AI mutation detection algorithms; obtain the timestamps of the device anomaly and the user anomaly, and calculate the time difference between them: if the time difference is > 0, the timing relationship is device first; if the time difference is < 0, the timing relationship is user first; otherwise, the timing relationship is synchronous association; S103. Use polynomial fitting to generate anomaly feature fitting curves, including curves showing the change of anomalous device behavior characteristics over time and curves showing the change of anomalous user behavior characteristics over time.

3. The method for all-time-domain operation and maintenance support of communication networks based on artificial intelligence according to claim 1, characterized in that, The process of verifying the temporal relationship is as follows: S201. Equipment Pre-Verification: For samples of equipment pre-judgment in historical operation and maintenance data, verify whether the user's abnormality disappears synchronously within a preset time after the equipment abnormality handling is resolved. If so, it is a true cause and effect; otherwise, it is a false correlation and is removed. S202. User Pre-Verification: For user pre-judgment samples in historical operation and maintenance related data, verify whether the device abnormality is eliminated synchronously within a preset time after the user's abnormal behavior control is implemented. If so, it is a true cause and effect; otherwise, it is a false association and is removed. S203. Synchronization Association Verification: For the pre-judgment samples of synchronization association in historical operation and maintenance related data, verify whether there are third-party triggering factors, and whether the abnormal synchronization of devices and users disappears after the factor is eliminated. If so, it is a true cause and effect; otherwise, it is a false association and is removed. S204. For verified real-world correlation samples, add the final temporal causal relationship label: Device anomaly label - User anomaly label - Final temporal relationship.

4. The method for all-time-domain operation and maintenance support of communication networks based on artificial intelligence according to claim 1, characterized in that the process of constructing the bidirectional time-series correlation fault rule base is as follows:
S211. Extract all samples that carry a final temporal causal relationship label after the temporal relationship verification. Each sample contains dual behavior features, anomaly labels, the final temporal relationship, the fault type, and operation and maintenance records.
S212. Standardize and calibrate the dual behavior features in each sample to generate an anomaly feature fitting curve template for each anomaly label; extract the effective operation and maintenance handling parameters of each sample and organize them into a standardized set of recommended operation and maintenance parameters.
S213. Using a time-series correlation algorithm, for each of the three types of temporal relationship, mine the correlations among the anomaly labels, anomaly feature fitting curves, effective operation and maintenance handling parameters, and fault types in the samples of that relationship, and use the resulting rules to generate an initial rule set.
S214. Filter the rules in the initial rule set and select the valid rules; add a unique identifier and the corresponding temporal relationship label to each valid rule, and set its priority. Then, according to the temporal relationship, divide the valid rules into a device-first rule set, a user-first rule set, and a synchronous association rule set. Each rule set is further subdivided and stored by fault type to form the bidirectional time-series correlation fault rule base.
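
One possible in-memory layout for the rule base described in S214 is sketched below. Everything named here is an assumption made for illustration: the `Rule` fields, the `R0001`-style identifiers, the caller-supplied `is_valid` filter, and the convention that a lower priority number means higher priority. The claim does not prescribe a storage format or a filtering criterion.

```python
from collections import defaultdict
from dataclasses import dataclass
from itertools import count

RELATIONSHIPS = ("device_first", "user_first", "synchronous")


@dataclass
class Rule:
    rule_id: str
    relationship: str      # temporal relationship label
    fault_type: str
    anomaly_labels: tuple
    curve_template: list   # sampled anomaly feature fitting curve
    om_parameters: list    # recommended handling parameters
    priority: int          # assumed: lower number = higher priority


def build_rule_base(candidates, is_valid):
    """Filter candidate rules, assign ids, and store them by relationship, then fault type."""
    rule_base = {rel: defaultdict(list) for rel in RELATIONSHIPS}
    ids = count(1)
    for cand in candidates:
        if not is_valid(cand):
            continue
        rule = Rule(rule_id=f"R{next(ids):04d}", **cand)
        rule_base[rule.relationship][rule.fault_type].append(rule)
    for by_fault in rule_base.values():
        for rules in by_fault.values():
            rules.sort(key=lambda r: r.priority)
    return rule_base
```

Keying first by temporal relationship and then by fault type mirrors the two-level subdivision in S214 and lets a matcher narrow the search to one rule set before comparing curves.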

5. The method for all-time-domain operation and maintenance support of communication networks based on artificial intelligence according to claim 4, characterized in that the process of extracting the effective operation and maintenance handling parameters of each sample is as follows: from the operation and maintenance records of each sample, retrieve the leading-anomaly resolution time and the associated-anomaly resolution time after each operation and maintenance handling parameter is executed, and extract the corresponding effective quantitative evaluation parameters; normalize these values and evaluate the operation and maintenance effectiveness coefficient of each handling parameter; select each handling parameter whose effectiveness coefficient is greater than the preset effectiveness coefficient threshold as a target parameter; and aggregate and organize the target parameters to form the standardized effective operation and maintenance handling parameters.
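
The claim does not give a formula for the operation and maintenance effectiveness coefficient. The sketch below assumes one plausible form purely for illustration: min-max normalization of each quantity, followed by a weighted sum in which shorter resolution times and a higher evaluation score both raise the coefficient. The weights, the 0.6 threshold, and the record keys are all hypothetical.

```python
def min_max(values):
    """Scale values to [0, 1]; a constant series maps to 0.5."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.5 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def effectiveness_coefficients(records, weights=(0.4, 0.3, 0.3)):
    """Assumed form: weighted sum of normalized speed and quality terms."""
    lead = min_max([r["lead_time"] for r in records])
    assoc = min_max([r["assoc_time"] for r in records])
    quality = min_max([r["quality"] for r in records])
    w_lead, w_assoc, w_quality = weights
    return [
        w_lead * (1 - l) + w_assoc * (1 - a) + w_quality * q
        for l, a, q in zip(lead, assoc, quality)
    ]


def select_effective(records, threshold=0.6):
    """Return the handling parameters whose coefficient exceeds the threshold."""
    coeffs = effectiveness_coefficients(records)
    return [r["param"] for r, c in zip(records, coeffs) if c > threshold]
```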

6. The method for all-time-domain operation and maintenance support of communication networks based on artificial intelligence according to claim 4, characterized in that the process of mining the correlations is as follows:
Device-first scenario: using a time-series correlation algorithm, extract the anomaly feature fitting curves corresponding to the same device anomaly label to form a device anomaly label-curve mapping, yielding multiple device anomaly label-curve combinations; match the user anomaly features and fault types corresponding to each device anomaly label-curve combination, and extract the effective operation and maintenance handling parameters of the fault type to form a complete correlation chain.
User-first scenario: using a time-series correlation algorithm, extract the anomaly feature fitting curves corresponding to the same user anomaly label to form a user anomaly label-curve mapping, yielding multiple user anomaly label-curve combinations; match the device anomaly features and fault types corresponding to each user anomaly label-curve combination, and extract the effective operation and maintenance handling parameters of the fault type to form a complete correlation chain.
Synchronous association scenario: using a time-series correlation algorithm, extract the device anomaly labels and user anomaly labels that occur synchronously, together with their corresponding anomaly feature fitting curves, to form a dual-label to dual-curve mapping, yielding multiple dual-label to dual-curve combinations; match the fault type of each combination, and extract the effective operation and maintenance handling parameters of the fault type to form a complete correlation chain.

7. The method for all-time-domain operation and maintenance support of communication networks based on artificial intelligence according to claim 1, characterized in that the specific process of S3 is as follows: the temporal causal relationship label and the anomaly feature fitting curves of the real-time dual behavior features from S1 are matched hierarchically against the bidirectional time-series correlation fault rule base. First, the corresponding device-first rule set, user-first rule set, or synchronous association rule set in the rule base is located according to the temporal causal relationship label. Then, the cosine similarity and the fitting deviation value between the real-time anomaly feature fitting curve and the curve template of the corresponding anomaly label in the rule base are calculated. The cosine similarity is compared with a preset similarity threshold and the fitting deviation value is compared with a preset deviation threshold; only candidate rules whose cosine similarity is greater than the similarity threshold and whose fitting deviation value is less than the deviation threshold are retained as valid. A comprehensive matching score is then calculated for each valid candidate by weighting the cosine similarity and the fitting deviation value, and the rule with the largest comprehensive matching score is selected as the optimal association rule for this fault identification. Finally, the anomaly label, effective operation and maintenance handling parameters, and fault type corresponding to the optimal association rule are extracted from the bidirectional time-series correlation fault rule base.
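
The curve-matching stage of S3 can be sketched as follows, once the rule set for the current temporal relationship has been selected. Several details are assumptions, since the claim does not fix them: curves are represented as equal-length lists sampled at the same time points, the fitting deviation is taken to be the root-mean-square difference, and the comprehensive score rewards high similarity and low deviation with hypothetical weights `w_sim` and `w_dev`.

```python
import math


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def fitting_deviation(a, b):
    """Assumed definition: root-mean-square difference between two sampled curves."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


def match_rules(live_curve, candidates, sim_min=0.9, dev_max=0.5,
                w_sim=0.7, w_dev=0.3):
    """Return (score, rule_id) pairs for valid candidates, best first.

    candidates: iterable of (rule_id, curve_template) pairs.
    """
    scored = []
    for rule_id, template in candidates:
        sim = cosine_similarity(live_curve, template)
        dev = fitting_deviation(live_curve, template)
        # Keep only candidates that pass both thresholds.
        if sim > sim_min and dev < dev_max:
            score = w_sim * sim + w_dev * (1 - dev / dev_max)
            scored.append((score, rule_id))
    scored.sort(reverse=True)
    return scored
```

The first element of the returned list plays the role of the optimal association rule. Cosine similarity alone ignores amplitude, which is presumably why the claim pairs it with a deviation check.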

8. The method for all-time-domain operation and maintenance support of communication networks based on artificial intelligence according to claim 7, characterized in that the specific process of S4 is as follows:
S401. After targeted operation and maintenance, collect the relevant operation and maintenance parameters, including the leading-anomaly resolution time, the associated-anomaly resolution time, and the effective quantitative evaluation parameters. Normalize them and evaluate the operation and maintenance effectiveness coefficient of this handling. If the effectiveness coefficient is greater than the preset effectiveness coefficient threshold, the handling is deemed effective and the anomaly eliminated; otherwise, the handling is deemed ineffective and the anomaly not eliminated.
S402. Select the three rules with the highest comprehensive matching scores as the preferred association rules for this fault identification. Extract the anomaly labels, effective operation and maintenance handling parameters, and fault types corresponding to the three preferred association rules from the bidirectional time-series correlation fault rule base. If the anomaly labels and fault types of the three preferred association rules are consistent, merge and deduplicate their effective operation and maintenance handling parameters to serve as the effective operation and maintenance handling parameters for secondary handling.
S403. If the anomaly labels or fault types of the three preferred association rules are inconsistent, take the preferred association rule with the highest comprehensive matching score as the benchmark rule. Using the anomaly labels and fault type of the benchmark rule as the core judgment criteria, select the remaining preferred association rules whose anomaly labels and fault types are consistent with those of the benchmark rule, then merge and deduplicate their handling parameters to obtain the benchmark handling parameter set. For rules whose anomaly labels or fault types are inconsistent with the benchmark rule, retain only the handling parameters that are compatible with the current temporal causal relationship; after passing the compatibility verification, these are added to the benchmark handling parameter set to form the effective operation and maintenance handling parameters for secondary handling.
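
The parameter merging in S402 and S403 can be sketched in one function, because the all-consistent case of S402 is simply the case in which every top-three rule agrees with the benchmark. The rule dictionaries, parameter names, and the `compatible` predicate (standing in for the compatibility check against the current temporal causal relationship) are hypothetical.

```python
def secondary_parameters(ranked_rules, compatible):
    """Build the handling parameter set for secondary handling.

    ranked_rules: rule dicts sorted by comprehensive matching score, best first.
    compatible:   predicate deciding whether a parameter from an inconsistent
                  rule fits the current temporal causal relationship.
    """
    top3 = ranked_rules[:3]
    if not top3:
        return []
    benchmark = top3[0]
    merged = []

    def add(params):
        # Merge while preserving order and dropping duplicates.
        for p in params:
            if p not in merged:
                merged.append(p)

    def consistent(rule):
        return (rule["labels"] == benchmark["labels"]
                and rule["fault_type"] == benchmark["fault_type"])

    # Benchmark handling parameter set: rules that agree with the benchmark.
    for rule in top3:
        if consistent(rule):
            add(rule["params"])
    # Inconsistent rules contribute only compatible parameters.
    for rule in top3:
        if not consistent(rule):
            add(p for p in rule["params"] if compatible(p))
    return merged
```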

9. The method for all-time-domain operation and maintenance support of communication networks based on artificial intelligence according to claim 1, characterized in that the process of S5 is as follows:
S501. Continuously collect the new real operation and maintenance cases generated throughout the operation and maintenance process, capturing the full-process data of each case, including temporal causal relationship labels, anomaly feature fitting curves, fault types, device and user anomaly labels, operation and maintenance handling parameters, operation and maintenance effectiveness coefficients, and effective quantitative evaluation parameters. Then, following the rule base construction process of S2, complete a full iterative update of the bidirectional time-series correlation fault rule base.
S502. Based on the results of multiple rounds of operation and maintenance verification, classify and quantitatively analyze the results by temporal relationship and fault type, analyze the improvement in operation and maintenance efficiency and fault handling effect for each temporal relationship, determine the direction in which to optimize the operation and maintenance strategy for each fault type under each temporal relationship, and produce an operation and maintenance strategy optimization report.
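
The grouping step in S502 can be sketched as a simple aggregation keyed by temporal relationship and fault type. The case fields, the fault type names, and the two summary indicators (share of effective handlings and mean effectiveness coefficient) are assumptions chosen for illustration; the claim does not enumerate the indicators in the report.

```python
from collections import defaultdict


def summarize_cases(cases, threshold=0.6):
    """Aggregate verification results per (relationship, fault_type)."""
    groups = defaultdict(list)
    for case in cases:
        key = (case["relationship"], case["fault_type"])
        groups[key].append(case["coefficient"])

    report = {}
    for key, coeffs in groups.items():
        effective = sum(1 for c in coeffs if c > threshold)
        report[key] = {
            "cases": len(coeffs),
            "effective_rate": effective / len(coeffs),
            "mean_coefficient": sum(coeffs) / len(coeffs),
        }
    return report
```

Groups with a low effective rate would be natural candidates for strategy optimization in the next iteration of the rule base.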

10. A system for executing the AI-based all-time-domain operation and maintenance support method for communication networks according to any one of claims 1-9, characterized in that it comprises:
An anomaly monitoring module, which deploys an integrated collection node to collect device and user behavior characteristics within a complex communication network, adds nanosecond-level timestamps, standardizes the collected dual behavior characteristics, labels real-time device anomaly tags and user anomaly tags, performs temporal relationship pre-judgment, and outputs the temporal causal relationship labels and anomaly feature fitting curves of the real-time dual behavior characteristics.
A rule building module, which extracts all operation and maintenance samples from the historical operation and maintenance data, extracts device behavior features and user behavior features from those samples, performs labeling and temporal relationship pre-judgment according to the method of S1, and verifies the temporal relationship. It then builds the bidirectional time-series correlation fault rule base, comprising a device-first rule set, a user-first rule set, and a synchronous association rule set. Each rule set includes multiple anomaly labels, the anomaly feature fitting curve of each anomaly label, and a recommended operation and maintenance parameter set.
An anomaly operation and maintenance module, which matches the temporal causal relationship labels and anomaly feature fitting curves of the real-time dual behavior features from the anomaly monitoring module against the bidirectional time-series correlation fault rule base, outputs the corresponding anomaly labels, effective operation and maintenance handling parameters, and fault types, and then performs targeted operation and maintenance.
An operation and maintenance monitoring module, which collects the operation and maintenance parameters after targeted operation and maintenance, verifies whether the anomaly has been eliminated, and, if not, triggers a secondary handling process until the fault is eliminated.
An operation and maintenance optimization module, which collects new operation and maintenance cases during the operation and maintenance process, re-invokes the rule building module to complete a full iteration of the rule base, and generates an operation and maintenance strategy optimization report based on the operation and maintenance verification results.