Intelligent standardization method and system for substation monitoring information based on BERT-CRF and rule constraints
By combining a BERT-CRF model with rule constraints, the invention achieves intelligent standardization and automated testing of substation monitoring information. This addresses non-standard monitoring information point tables and the reliance on manual rule configuration, improving the reliability and efficiency of the system.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- CHENGDU DENGLU ELECTRIC POWER TECH CO LTD
- Filing Date
- 2026-04-20
- Publication Date
- 2026-06-19
AI Technical Summary
The descriptions of substation monitoring information points are inconsistent and non-standardized. The configuration of event-based rules relies on manual experience, and testing and acceptance methods are lacking. This leads to rule misconfiguration, omissions, and insufficient verification, affecting the reliability and efficiency of the system.
The BERT-CRF generative language model is used for semantic recognition and standardization conversion. A reusable rule template library is built, test cases are generated, and a simulation environment is constructed through the IEC104 standard for closed-loop simulation testing and optimization.
The method achieves full-process automation and standardization of substation monitoring information, improves the accuracy of signal mapping and the efficiency of rule configuration, reduces the risk of false alarms and missed alarms, and strengthens system reliability and the trust of operation and maintenance personnel.
[Figure CN122064930B_ABST: abstract drawing]
Description
Technical Field
[0001] This invention relates to the fields of power system automation and artificial intelligence technology, specifically to a method and system for intelligent standardization of substation monitoring information based on BERT-CRF and rule constraints.
Background Technology
[0002] With the deepening of smart grid construction, substation monitoring information event-based technology has become a key means to quickly identify power grid anomalies and improve fault handling efficiency. This technology aims to automatically aggregate massive, discrete monitoring signals into equipment faults or anomaly events with clear semantics based on predefined rules and logic, thereby directly guiding maintenance personnel to perform precise handling and significantly reducing reliance on human experience. However, in the process of large-scale promotion and practical application, this technology faces long-standing technical bottlenecks that severely restrict its reliability and application effectiveness, specifically in the following three aspects:
[0003] First, the description of the substation monitoring information point table, as the signal source for event-based rules, has long been inconsistent and non-standard. The same equipment or signal may have multiple abbreviations, acronyms, or habitual expressions in different substations or even different systems. The format is chaotic and lacks a unified standard. This heterogeneity of the underlying data makes it impossible to establish an accurate and stable mapping relationship between the abstract features defined in the upper-level event-based rules (such as line protection outputs and main transformer oil level anomalies) and the actual signals on site. This has become the root cause of rule mismatch and omission.
[0004] Secondly, event-based rules themselves involve complex combinations of features, logical operations, and timing requirements. Currently, the on-site configuration and instantiation of rules relies heavily on manual experience, requiring technicians to interpret each rule one by one and manually search for and associate the corresponding signals in the point table of a specific substation. This process is not only labor-intensive and inefficient, but also error-prone when dealing with hundreds or thousands of signals. Furthermore, rule templates generalize poorly, making it difficult to adapt them to substations with different voltage levels, wiring methods (such as 3/2 wiring versus single busbar), and equipment models. This results in high rule migration costs and leads to logical conflicts or incomplete coverage.
[0005] Finally, because substations must operate continuously, it is difficult to inject test signals into the production environment, so effective and comprehensive acceptance methods are lacking once the event-based rules are configured. Traditional sampling verification cannot simulate all fault scenarios, especially complex events (such as multi-device linkage or protection failure to operate) and scenarios with strict signal timing requirements. This leads to two serious problems after the event-based application is put into operation: first, "missed reports," in which an event cannot be synthesized when an actual fault occurs because of signal matching errors or logical defects; second, "false reports," in which events are erroneously triggered in non-fault situations because of signal interference or rule constraints. Both situations seriously undermine maintenance personnel's trust in the automation system, forcing a return to the traditional mode of manual monitoring and preventing the true value of event-based technology from being realized.
Summary of the Invention
[0006] The purpose of this invention is to provide an intelligent standardization method and system for substation monitoring information based on BERT-CRF and rule constraints, which realizes full-process automation, high-reliability acceptance and continuous self-optimization of substation event-based technology.
[0007] The objective of this invention can be achieved through the following technical solutions:
[0008] This application provides an intelligent standardization method for substation monitoring information based on BERT-CRF and rule constraints, including the following steps:
[0009] S1. Standardization of monitoring information point table: The BERT-CRF generative language model is used to perform semantic recognition and standardization transformation on the original point table description, and output a structured standardized signal;
[0010] S2. Event-based rule template construction: Parse rule definition files and build a reusable rule template library through keyword matching and AI assistance;
[0011] S3. Event triggering condition determination: A semantic matching algorithm associates rule features with standardized signals, and the triggering conditions are instantiated in combination with substation configuration parameters;
[0012] S4. Automatic test instance generation: A signal depth traversal algorithm is used to generate test instances containing positive and negative logic signal groups and timing requirements;
[0013] S5. Closed-loop simulation test and acceptance: Construct a simulation environment based on the IEC104 standard, inject test signals, and collect the event generation results of the master station;
[0014] S6. Automatic configuration defect analysis: Compare the master station results with the expected results to identify and locate misconfigurations and omissions in the event-based configuration;
[0015] S7. Feedback Optimization Closed Loop: Update the event-based rule template library based on configuration defect information in the acceptance data, and improve the relationship between signal standardization and feature mapping by incrementally learning the BERT-CRF model and optimizing the semantic matching algorithm.
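The comparison in step S6 can be illustrated with a minimal sketch. The names `AcceptanceCase` and `classify_defects` and the defect labels are hypothetical and do not come from the patent. The sketch only assumes that each test instance carries an expected event (or none, for negative-logic cases) and that the master station result is available per test instance.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AcceptanceCase:
    """One test instance; expected_event is None for negative-logic cases."""
    case_id: str
    expected_event: Optional[str]


def classify_defects(cases, observed):
    """Compare expected events with master station results.

    observed maps case_id to the event name reported by the master
    station, or None if no event was generated.
    """
    defects = []
    for case in cases:
        actual = observed.get(case.case_id)
        if case.expected_event is not None and actual is None:
            defects.append((case.case_id, "missed_report"))
        elif case.expected_event is None and actual is not None:
            defects.append((case.case_id, "false_report"))
        elif case.expected_event != actual:
            defects.append((case.case_id, "wrong_event"))
    return defects


cases = [
    AcceptanceCase("c1", "main_transformer_fault"),  # positive logic
    AcceptanceCase("c2", None),                      # negative logic
    AcceptanceCase("c3", "switch_trip"),
]
observed = {"c1": None, "c2": "switch_trip", "c3": "switch_trip"}
defects = classify_defects(cases, observed)
```

A real acceptance tool would presumably also compare timing and signal-level details; this sketch covers only the missed-report and false-report distinction drawn in the background section.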
[0016] This application also provides an intelligent standardization system for substation monitoring information based on BERT-CRF and rule constraints, which implements the above intelligent standardization method and includes:
[0017] The intelligent semantic standardization engine module, based on the domain-pre-trained BERT-CRF model, uses deep semantic understanding and sequence labeling, combined with rigid regular expression rules, to accurately convert unstructured original descriptions into standard formats and extract key features to form structured and standardized signal representations.
[0018] The hybrid rule template building module parses the rule definition file and uses a hybrid mode of keyword matching and AI inference to automatically establish the mapping relationship between abstract features and standardized signals, and intelligently completes the default parameters to generate a single event rule template.
[0019] The feature matching and trigger judgment module uses a semantic matching algorithm to screen reliable feature-signal correspondences. Combined with the specific configuration of the substation, it binds and instantiates the calculation formulas in the rules and verifies the triggering conditions of the event through calculation.
[0020] The positive and negative logic test instance generation module uses a signal keyword depth traversal algorithm to filter and arrange signal timing in a multi-dimensional associated signal pool, generating a set of test instances containing positive and negative logic.
[0021] The IEC104 simulation test module simulates the monitoring information forwarding model based on the IEC104 protocol. It serializes test instances into standard communication messages, simulates substation behavior to inject signals into the master station through a dedicated acceptance device, and collects the master station response results in real time, performing a fully automated simulation test from signal injection to result collection.
[0022] The automatic acceptance and defect analysis module automatically compares the actual event results generated by the master station with the test expectations from multiple dimensions, identifies inconsistencies, locates the type and location of configuration defects, and generates structured acceptance data and preliminary analysis reports containing detailed defect information.
[0023] The self-evolutionary closed-loop optimization module utilizes the defect information discovered during acceptance testing to optimize the rule template library in a targeted manner and drive the AI standardized model to perform incremental learning. Subsequently, a regression verification process is initiated to confirm the optimization effect.
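As an illustration of how the IEC104 simulation test module might serialize one test signal into a communication message, the sketch below builds a single I-format frame. It assumes the field sizes commonly used with IEC 60870-5-104 (2-octet cause of transmission, 2-octet common address, 3-octet information object address) and the type M_SP_NA_1 (single-point information without time tag). The patent does not state which ASDU types, addresses, or sequence handling are used, and the function name `build_single_point_frame` is hypothetical.

```python
import struct

START_BYTE = 0x68      # start octet of every IEC 104 APDU
M_SP_NA_1 = 1          # type ID: single-point information, no time tag
COT_SPONTANEOUS = 3    # cause of transmission: spontaneous


def build_single_point_frame(ioa, value, common_address=1,
                             send_seq=0, recv_seq=0):
    """Build one I-format APDU carrying a single-point signal (sketch)."""
    # ASDU header: type ID, variable structure qualifier (one object),
    # cause of transmission (2 octets), common address (2 octets).
    asdu = struct.pack("<BBHH", M_SP_NA_1, 1, COT_SPONTANEOUS, common_address)
    # Information object: 3-octet address followed by the SIQ octet.
    asdu += ioa.to_bytes(3, "little") + bytes([1 if value else 0])
    # Control field: sequence numbers are shifted left by one bit;
    # a zero low bit in the first octet marks the I-format.
    control = struct.pack("<HH", (send_seq << 1) & 0xFFFF,
                          (recv_seq << 1) & 0xFFFF)
    return bytes([START_BYTE, len(control) + len(asdu)]) + control + asdu


frame = build_single_point_frame(ioa=100, value=True)
```

A complete simulator would also need connection setup (STARTDT), acknowledgements, and time-tagged types for timing-sensitive cases; those are outside this sketch.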
[0024] The beneficial effects of this invention are as follows:
[0025] To address the problem of inaccurate mapping between rules and signals caused by non-standard point table descriptions, this invention adopts a dual guarantee mechanism based on deep semantic understanding and rigid rule constraints. By constructing a domain-pre-trained generative language model, it accurately parses the implicit semantics of power texts and uses conditional random fields and regular expressions to ensure the absolute compliance of the output format. This automatically converts heterogeneous original descriptions into unified structured standard signals, fundamentally eliminating ambiguity at the data level, providing a high-quality signal source for event-based rules, and solving the root cause problems of rule mismatch and omission.
[0026] To address the issues of complex rule configuration and high reliance on human experience, this invention designs an intelligent configuration system that combines rule template construction with AI-assisted parameter reasoning. By combining a framework defined by human experience with a mode of AI filling in details, reusable and adaptable rule templates are quickly generated. Semantic matching algorithms and contextual reasoning are used to automatically associate rule features with specific signals and complete parameters, significantly reducing reliance on human intervention and improving configuration efficiency. Furthermore, template standardization and intelligent parameter reasoning enhance the adaptability of rules to different substations, effectively avoiding logical conflicts and incomplete coverage.
[0027] To address the issue of unreliable functionality due to a lack of testing and acceptance methods, this invention constructs a full-process verification system that automatically generates positive and negative logic test cases and performs acceptance in a securely isolated simulation environment. It systematically generates test cases covering both normal and abnormal scenarios using a signal depth traversal algorithm, and performs automated testing and result comparison in a high-fidelity isolated simulation environment based on standard communication protocols. This achieves comprehensive and rigorous verification of event-based functions, exposing and locating configuration defects in advance. At the same time, feedback from acceptance data drives continuous optimization of the rule base and artificial intelligence model, ensuring high system reliability before operation, significantly reducing the risk of missed and false alarms, and ultimately establishing trust among maintenance personnel in automated events.
Attached Figure Description
[0028] To better understand and implement this application, the technical solution is described in detail below with reference to the accompanying drawings.
[0029] Figure 1 is a flowchart of the intelligent standardization method for substation monitoring information based on BERT-CRF and rule constraints provided in Embodiment 1 of this application;
[0030] Figure 2 is a schematic diagram of the structure of the intelligent standardization system for substation monitoring information based on BERT-CRF and rule constraints provided in Embodiment 2 of this application.
Detailed Implementation
[0031] To further illustrate the technical means and effects adopted by the present invention to achieve its intended purpose, exemplary embodiments will be described in detail below, examples of which are illustrated in the accompanying drawings. In the following description, when referring to the drawings, unless otherwise indicated, the same numbers in different drawings represent the same or similar elements. The embodiments described in the following exemplary embodiments do not represent all embodiments consistent with this application.
[0032] The specific implementation methods, features, and effects of the present invention are described in detail below in conjunction with the accompanying drawings and preferred embodiments.
[0033] Example 1
[0034] Referring to Figure 1, this embodiment provides an intelligent standardization method for substation monitoring information based on BERT-CRF and rule constraints, including the following steps:
[0035] Here, BERT-CRF is a hybrid model that combines Bidirectional Encoder Representations from Transformers (BERT) with a Conditional Random Field (CRF), designed for text semantic understanding and standardized output tasks. In this embodiment, it is responsible for the automated standardization of the substation monitoring information point table.
[0036] S1. Collect the original point table description of substation monitoring information, adopt a generative language model architecture that combines a BERT bidirectional encoder with a conditional random field, pre-train and fine-tune it in conjunction with the substation equipment monitoring information specifications, optimize the model through cross-layer parameter sharing and embedding layer decomposition, introduce a conditional random field layer and regular expressions to constrain the output format, perform semantic recognition and standardization transformation on the original point table description, and form a standardized signal representation containing key information such as voltage level, bay name, and equipment type.
[0037] Further, step S1 specifically includes:
[0038] A generative language model for the power industry is built based on the BERT-CRF architecture. The original point table description text of substation monitoring information is collected as input. The BERT bidirectional encoder is used to perform contextual representation of the text to obtain deep semantic vectors. The standardized corpus of substation equipment monitoring information is used, and domain pre-training is completed through masked language model to form initial model parameters with power semantic understanding.
[0039] The initial model is fine-tuned using manually labeled standardized samples. The model architecture is optimized using cross-layer parameter sharing and embedding layer decomposition techniques. The fine-tuned BERT output vectors are input into the CRF layer, and a preliminary standardized semantic label sequence is generated through sequence labeling and label transition constraints.
[0040] The CRF decoding output is format-checked and forcibly corrected using regular expressions to obtain standardized description text that conforms to the specifications. Key information fields such as voltage level, bay name, and equipment type are extracted from the standardized description text to form a structured standardized signal representation.
[0041] The text is represented using a BERT bidirectional encoder to obtain a deep semantic vector. This process includes: segmenting the original point table description text into words and sub-words, and adding special markers to construct the input sequence; converting each token into a vector through an embedding layer and superimposing positional encoding to inject sequence order information; feeding this sequence into a BERT model composed of multiple Transformer encoder layers, which uses a self-attention mechanism to perform bidirectional parallel computation over all tokens in the sequence, thereby dynamically capturing long-distance dependencies between arbitrary tokens; finally, the context representation of each token is the attention-weighted sum over itself and all tokens before and after it, and the output is a deep vector representation containing complete contextual semantics.
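The attention-weighted sum described above can be sketched in a few lines. This is a deliberately reduced illustration: it uses a single head and reuses the input vectors as queries, keys, and values, whereas a real BERT layer applies learned projection matrices, multiple heads, residual connections, and layer normalization. The function names and the toy vectors are assumptions.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(vectors):
    """Single-head scaled dot-product attention without learned projections.

    Each output vector is the attention-weighted sum of all input
    vectors, so every token sees tokens both before and after it.
    """
    dim = len(vectors[0])
    outputs, all_weights = [], []
    for query in vectors:
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
                  for key in vectors]
        weights = softmax(scores)
        outputs.append([sum(w * v[i] for w, v in zip(weights, vectors))
                        for i in range(dim)])
        all_weights.append(weights)
    return outputs, all_weights


# Toy 2-dimensional "embeddings" for three tokens.
token_vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
contextual, attention = self_attention(token_vectors)
```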
[0042] Domain pre-training was performed using a masked language model to form initial model parameters with power semantic understanding. This included: constructing a training dataset from massive amounts of power industry text corpora such as power industry operation procedures, protection setting sheets, typical point table descriptions, and historical alarm logs; randomly masking approximately 15% of the words in the input sentence (prioritizing professional terms such as circuit breaker, differential protection, and GOOSE) and replacing them with special mask tags; the training task required the model to predict the original words at the masked locations based on the power domain context surrounding the masked words; through this process, the model parameters were continuously adjusted over hundreds of thousands of iterations, ultimately learning the distributed representations and specific combination patterns of professional concepts such as power equipment, signal types, and logical relationships, forming an initial semantic understanding capability deeply adapted to the power domain.
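A minimal sketch of the masking step follows. It assumes whitespace-separated tokens and a hypothetical set of domain terms, and it always substitutes the mask tag as the paragraph describes (the original BERT recipe also sometimes keeps or randomly replaces the selected token, which is omitted here). The function name `mask_tokens` is illustrative only.

```python
import random

MASK = "[MASK]"


def mask_tokens(tokens, domain_terms, ratio=0.15, rng=None):
    """Mask roughly `ratio` of the tokens, choosing domain terms first.

    Returns the masked token list and a dict mapping each masked
    position to its original token (the prediction target).
    """
    rng = rng or random.Random()
    n_mask = max(1, round(len(tokens) * ratio))
    domain_idx = [i for i, t in enumerate(tokens) if t in domain_terms]
    other_idx = [i for i, t in enumerate(tokens) if t not in domain_terms]
    rng.shuffle(domain_idx)
    rng.shuffle(other_idx)
    chosen = (domain_idx + other_idx)[:n_mask]  # domain terms take priority
    masked = list(tokens)
    targets = {}
    for i in chosen:
        targets[i] = tokens[i]
        masked[i] = MASK
    return masked, targets


tokens = "110kV bus voltage abnormal alarm circuit breaker trip".split()
masked, targets = mask_tokens(tokens, domain_terms={"breaker"},
                              rng=random.Random(0))
```

With eight tokens and a 15% ratio, one position is masked, and the domain term is selected ahead of ordinary words.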
[0043] The model architecture is optimized using cross-layer parameter sharing and embedding layer decomposition techniques. These include: in the Transformer encoder, the attention and feedforward network parameters of several adjacent layers (e.g., every 2 or 3 layers) are shared rather than kept independent for each layer, thereby significantly reducing the total number of parameters and enhancing the consistency of features at different levels; in the embedding layer, the originally large embedding matrix of size vocabulary × hidden dimension (e.g., a vocabulary of 30,000 × a dimension of 768) is decomposed into the product of two low-rank matrices (e.g., 30,000 × 256 and 256 × 768), greatly reducing storage and computational complexity; the optimized model maintains or even improves the ability to represent power text while reducing computational load and memory usage by about 30-40%, making it more suitable for deployment in edge environments with limited computing resources, such as central control stations.
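The arithmetic behind the embedding decomposition example can be checked directly. The helper names below are hypothetical, and the 12-layer encoder used for the sharing example is an assumption (the paragraph does not state the layer count). Note that the figures computed here concern the embedding matrix and the number of distinct layer parameter sets only; they are not the overall 30-40% figure quoted above.

```python
import math


def embedding_params(vocab_size, hidden_dim, bottleneck=None):
    """Parameter count of a full or factorized embedding matrix."""
    if bottleneck is None:
        return vocab_size * hidden_dim
    # Factorized: (vocab x bottleneck) followed by (bottleneck x hidden).
    return vocab_size * bottleneck + bottleneck * hidden_dim


def distinct_layer_sets(num_layers, share_every):
    """Number of independent parameter sets when adjacent layers share."""
    return math.ceil(num_layers / share_every)


full = embedding_params(30_000, 768)            # 23,040,000
factored = embedding_params(30_000, 768, 256)   # 7,876,608
saving = 1 - factored / full                    # about 0.66

# Assumed 12-layer encoder with parameters shared every 2 layers.
layer_sets = distinct_layer_sets(12, 2)         # 6
```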
[0044] The CRF decoding output is format-checked and forcibly corrected using regular expressions. This includes: pre-writing a set of precise regular expression matching rules based on the standard naming and format templates commonly used in the power industry (such as voltage level + bay name + equipment type + signal description); after the system decodes the tag sequence output by the CRF layer into text, it immediately performs a rule-by-rule matching and verification; if a format deviation is found (such as the voltage level 10kV being mistakenly written as 10KV, or the necessary bay identifier being missing), the correction engine is triggered to automatically rewrite the text according to the replacement templates defined in the rules (such as uniformly replacing KV with kV, or inserting the standard bay name before the equipment name); this step serves as a rigid rule guarantee after semantic standardization based on deep learning, ensuring that the final output text is strictly compliant in structure.
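The rule-by-rule check and forced correction can be sketched with Python's `re` module. The two patterns below (unit normalization to `kV` and a check that the text starts with a voltage level) are illustrative assumptions, not the patent's rule set; inserting a missing bay name would need substation context and is only flagged here.

```python
import re

# Any spelling of the unit after a number (KV, Kv, kv, "10 kV") -> "kV".
UNIT_PATTERN = re.compile(r"(\d+(?:\.\d+)?)\s*[kK][vV]\b")
# Standard descriptions are assumed to begin with the voltage level.
VOLTAGE_PREFIX = re.compile(r"^\d+(?:\.\d+)?kV\b")


def normalize(text):
    """Apply replacement rules and collapse repeated whitespace."""
    text = UNIT_PATTERN.sub(r"\1kV", text)
    return re.sub(r"\s+", " ", text).strip()


def check_format(text):
    """Return a list of format deviations found in normalized text."""
    issues = []
    if not VOLTAGE_PREFIX.match(text):
        issues.append("missing voltage level")
    return issues


corrected = normalize("10KV  #1 line breaker trip")
remaining_issues = check_format(corrected)
flagged = check_format(normalize("#1 line breaker trip"))
```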
[0045] Specifically, when processing the original point table description text of substation monitoring information, the text is first analyzed. Taking the description text "110kV bus voltage abnormal alarm" as an example, the text contains voltage level, equipment type and status information, but the expression format is not uniform. Through the BERT bidirectional encoder, the model can capture the contextual dependency relationship between 110kV and the bus, generate a deep semantic vector, and thus accurately understand the voltage abnormal state of the specific power equipment pointed to by the description.
[0046] During the pre-training phase, a masked language model is trained on standardized corpora from the power sector. For example, after the word "bus" is masked, the model can infer from the context "110kV ... voltage abnormal alarm" that the masked word should be a term related to the equipment type. This process enables the model to learn the specific semantic associations of the power sector, forming initial model parameters with domain understanding capabilities.
[0047] In the supervised fine-tuning phase, manually labeled samples are used to optimize the model. For example, a sample such as "220kV switch abnormal state" is labeled in the structured form voltage level-equipment-state. Fine-tuning then adapts the model to the requirements of the monitoring information specifications. After fine-tuning, the model's recognition accuracy for similar descriptions is significantly improved. Cross-layer parameter sharing reduces the number of model parameters, while embedding layer decomposition maps high-dimensional vectors to low-dimensional subspaces, thereby improving computational efficiency and reducing memory usage, which supports deployment on edge devices.
[0048] In the sequence labeling and standardized output stage, the semantic vectors output by BERT are input into the CRF layer for processing. Taking the description "35kV transformer overheating" as an example, the CRF uses label transition constraints to ensure that "35kV" is correctly labeled as the voltage level, outputting a sequence that conforms to the specification structure. Regular expressions are then used to validate and forcibly correct the output text, for example by uniformly correcting an inconsistent form such as 35KV to 35kV, ensuring that the text fully complies with the specification. Finally, key fields are extracted from the standardized text; for example, the fields 110kV, bus, and voltage abnormal are extracted from "110kV bus voltage abnormal", forming a structured feature set.
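The effect of label transition constraints can be shown with a small Viterbi decoder. This sketch uses hard constraints only (an inside tag must follow a begin or inside tag of the same type), whereas a trained CRF learns real-valued transition scores. The label set, the emission scores, and the function names are invented for illustration.

```python
import math

LABELS = ["O", "B-VOLT", "I-VOLT", "B-EQUIP", "I-EQUIP"]


def allowed(prev, cur):
    """BIO constraint: I-X may only follow B-X or I-X."""
    if cur.startswith("I-"):
        return prev in ("B-" + cur[2:], "I-" + cur[2:])
    return True


def viterbi(emissions):
    """Return the best label path under the transition constraints.

    emissions is a list (one entry per token) of dicts mapping a label
    to its score; labels missing from a dict score 0.0.
    """
    best = {}
    for lab in LABELS:
        start_ok = not lab.startswith("I-")  # a sequence cannot open with I-
        score = emissions[0].get(lab, 0.0) if start_ok else -math.inf
        best[lab] = (score, [lab])
    for scores in emissions[1:]:
        step = {}
        for cur in LABELS:
            candidates = [(best[prev][0] + scores.get(cur, 0.0), best[prev][1])
                          for prev in LABELS if allowed(prev, cur)]
            score, path = max(candidates, key=lambda c: c[0])
            step[cur] = (score, path + [cur])
        best = step
    return max(best.values(), key=lambda c: c[0])[1]


# Tokens: "35kV", "transformer", "overheating". For the second token the
# highest raw score is I-EQUIP, which is invalid directly after B-VOLT.
emissions = [
    {"B-VOLT": 2.0},
    {"I-EQUIP": 1.5, "B-EQUIP": 1.4},
    {"O": 1.0},
]
path = viterbi(emissions)
```

A per-token argmax would emit the invalid sequence B-VOLT, I-EQUIP, O; the constrained decoder picks B-EQUIP for the second token instead.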
[0049] Specifically, by constructing a generative language model based on BERT-CRF and rule constraints, the fundamental problem of inaccurate signal matching caused by the non-standard description and inconsistent format of substation monitoring information point tables is effectively solved. Domain pre-training and fine-tuning give the model a deep understanding of power semantics. The dual constraints of the CRF layer and regular expressions realize automated, high-precision conversion from the original point table to standard signals, forming a structured signal representation containing key information such as voltage level and bay name. This lays a high-quality data foundation for the reliable application of event-based technology and significantly improves the efficiency and accuracy of data standardization processing.
[0050] S2. Parse the substation event-based rule definition file, extract elements such as event type, feature name, and calculation formula, construct an event-based rule template library, define core fields such as the expression and participating quantities, establish the mapping relationship between feature names and standardized signal representations through keyword matching, extract default parameters using a pre-trained model and fill them into standard templates, forming a reusable rule template system covering typical events.
[0051] Furthermore, step S2 specifically includes:
[0052] By using a predefined set of professional keywords, the feature names and standardized signal descriptions in the event rule definition file are automatically matched to determine their one-to-one correspondence and establish a mapping table from feature names to standard signals.
[0053] Based on the mapping table, the voltage level, equipment type, and bay name associated with each feature name are automatically extracted to form a basic parameter set. At the same time, a pre-trained language model is called to perform semantic extraction and contextual reasoning on the numerical default parameters that are not explicitly defined in the rule description, so as to fill in the complete parameter set.
[0054] The event type, feature name, calculation formula, logical relationship, list of participating quantities, and supplemented complete parameter set are encapsulated according to a preset standardized structure to generate an independent, complete single event rule template that can be directly applied to specific substation configurations.
[0055] The system iterates through all event types in the rule definition file to generate a set of rule templates covering typical scenarios such as transformer faults, switch tripping, and GOOSE link interruptions. These templates are then categorized, coded, and stored, ultimately establishing a rule template library that supports version management, query maintenance, and continuous expansion.
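The encapsulated single event rule template and the versioned library described in the steps above could be represented roughly as follows. The field names mirror the elements listed in the text, but the concrete layout, the class names `RuleTemplate` and `TemplateLibrary`, and the version-numbering scheme are assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class RuleTemplate:
    """One event rule template with the fields named in the text."""
    event_type: str
    feature_names: list
    formula: str
    logic: str
    participants: list
    parameters: dict = field(default_factory=dict)
    version: int = 1


class TemplateLibrary:
    """Stores templates per event type and keeps every version."""

    def __init__(self):
        self._store = {}

    def add(self, template):
        versions = self._store.setdefault(template.event_type, [])
        template.version = len(versions) + 1
        versions.append(template)
        return template.version

    def latest(self, event_type):
        return self._store[event_type][-1]

    def event_types(self):
        return sorted(self._store)


library = TemplateLibrary()
library.add(RuleTemplate(
    event_type="main_transformer_fault",
    feature_names=["differential_protection_action", "breaker_trip"],
    formula="current > setting",
    logic="differential_protection_action and breaker_trip",
    participants=["current", "setting"],
    parameters={"voltage_level": "110kV"},
))
```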
[0056] Furthermore, a mapping table is established between feature names and standard signals. This includes: after initial keyword matching between feature names and standardized signal descriptions, a multi-level confirmation and disambiguation process is initiated. First, the matching results are reviewed using a semantic similarity model to identify and flag possibly ambiguous or polysemous matches, such as the feature "protection action", which could correspond to either a line protection output or a bus protection output. Then, these items to be confirmed are pushed to domain experts through a graphical interface for manual review and adjudication to ensure the accuracy of the mapping relationship. Finally, the correspondence resulting from manual confirmation or correction, along with its matching confidence level, reviewer, and timestamp, is recorded in structured form in the mapping table to form traceable and reliable mapping knowledge.
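The keyword matching and ambiguity flagging described above might look like the following sketch. It uses the number of candidate signals as a simple stand-in for the semantic similarity review, which is an assumption; the function name, status labels, and keyword set are likewise hypothetical.

```python
def match_feature(feature, signals, keywords):
    """Match a feature name to standardized signals by shared keywords.

    A single candidate is recorded directly; several candidates are
    flagged for expert review instead of being resolved automatically.
    """
    hits = [k for k in sorted(keywords) if k in feature.lower()]
    if hits:
        candidates = [s for s in signals
                      if all(k in s.lower() for k in hits)]
    else:
        candidates = []
    if len(candidates) == 1:
        status = "confirmed"
    elif candidates:
        status = "needs_review"
    else:
        status = "unmatched"
    return {"feature": feature, "candidates": candidates, "status": status}


signals = ["110kV line protection output", "110kV bus protection output",
           "110kV main transformer oil level abnormal"]
ambiguous = match_feature("protection action", signals,
                          {"protection", "oil level"})
unique = match_feature("oil level abnormal", signals,
                       {"protection", "oil level"})
```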
[0057] The pre-trained language model is invoked to perform semantic extraction and contextual reasoning on undefined numerical default parameters in the rule description. This includes: when parameters such as overcurrent setting or delay threshold are not given specific values in the rule description, the contextual understanding and reasoning engine based on the pre-trained language model is activated. This engine locates the rule text segment containing the parameter and extracts its surrounding contextual information, such as the associated equipment type (main transformer protection), voltage level (110kV), and operation description (overcurrent stage). Then, the model combines knowledge learned from a large number of unstructured documents such as protection setting sheets and equipment technical manuals to infer the most likely parameter value or value range in that context. For example, it infers that the typical value of the overcurrent setting for a 110kV main transformer is 1200A. At the same time, existing configuration parameters of similar equipment are searched from a standardized point table for cross-validation. Finally, the parameter values obtained from the reasoning and validation are automatically populated into the parameter set, and their source is marked as AI reasoning or cross-reference for subsequent review.
[0058] Specifically, by building an intelligent rule template library, the problems of event-based rule configuration relying on manual intervention, low efficiency, and difficulty in cross-site reuse are solved. Keyword matching and semantic verification are used to establish accurate feature mapping, and AI contextual reasoning is combined to automatically complete default parameters, realizing the automated generation and standardized encapsulation of rule templates, forming a traceable and scalable rule knowledge base, which greatly improves configuration efficiency, accuracy, and cross-site adaptability.
[0059] S3. Based on the mapping relationship between abstract features and standardized signal representations in the rule template library, a semantic matching algorithm is used to calculate the similarity. For correspondences whose similarity meets the preset requirements, the correlation between the calculation formula in the rule and the source of the standardized signal is verified. Combined with configuration parameters such as bay type and wiring method, the reliable triggering conditions of the event are determined.
[0060] Furthermore, step S3 specifically includes:
[0061] Abstract features to be processed are obtained from the rule template library, and the similarity value between them and the standardized signal representation is calculated using a semantic matching algorithm. The calculation results are filtered according to a preset threshold, retaining the feature-signal correspondence that meets the similarity standard and eliminating the matching items that do not meet the requirements.
[0062] For the correspondences that pass the screening, predefined calculation formulas and logical relationships are extracted from the rule template; at the same time, the corresponding original signal data is located and extracted from the standardized signal sources based on the correspondences, which serve as the input basis for formula calculation.
[0063] Based on the specific bay type and wiring configuration parameters of the target substation, the extracted calculation formula is instantiated, and the abstract variables are replaced with specific signal data sources or set values to generate specific event trigger expressions that can be directly calculated.
[0064] The instantiated trigger expression is applied to the associated original signal data, and logical and numerical calculations are performed. Based on the calculation result of the expression and the trigger threshold or condition defined in the rule, it is determined whether the current event meets the reliable trigger condition, and the final Boolean judgment result is output.
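The instantiation and evaluation steps above can be sketched for a single comparison such as "current > setting". The tuple representation of the formula, the signal identifier, and the numeric values are assumptions made for illustration.

```python
import operator

OPS = {">": operator.gt, "<": operator.lt, ">=": operator.ge,
       "<=": operator.le, "==": operator.eq}


def instantiate(template, bindings):
    """Replace abstract variables with signal IDs or set values."""
    left, op, right = template
    return (bindings[left], op, bindings[right])


def resolve(operand, signal_values):
    """A string operand names a signal; anything else is a constant."""
    if isinstance(operand, str):
        return signal_values[operand]
    return operand


def evaluate(expression, signal_values):
    """Compute the Boolean trigger result for an instantiated expression."""
    left, op, right = expression
    return OPS[op](resolve(left, signal_values),
                   resolve(right, signal_values))


template = ("current", ">", "setting")
bindings = {"current": "220kV_1_main_transformer.Ia",  # hypothetical ID
            "setting": 1200.0}                         # set value
expression = instantiate(template, bindings)
triggered = evaluate(expression, {"220kV_1_main_transformer.Ia": 1350.0})
```

Compound logic such as feature A AND (feature B OR feature C) would combine several such Boolean results; that composition is left out here.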
[0065] Furthermore, calculating the similarity value between the abstract features and the standardized signal representation with a semantic matching algorithm includes the following. A semantic matching algorithm based on a pre-trained language model (such as Sentence-BERT or SimCSE) computes the text embedding vector of an abstract feature in the rule template (such as "main transformer differential protection action") and the text embedding vector of a standardized signal description output by step S1 (such as "220kV #1 main transformer differential protection action"). The cosine similarity of the two vectors in the high-dimensional semantic space yields a quantified similarity value between 0 and 1. To improve matching accuracy, the system also weights keywords by term frequency-inverse document frequency (TF-IDF) and consults a historical mapping relationship library for auxiliary verification, ultimately forming a dynamic matching confidence score that integrates semantic similarity and domain prior knowledge.
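As an illustrative, non-limiting sketch of the confidence score described above, the following Python example combines a cosine similarity, an IDF-weighted keyword score, and a historical mapping check. The bag-of-words `embed` function is only a stand-in for a Sentence-BERT or SimCSE encoder, and the weights `w_sem`, `w_kw`, `w_hist`, the threshold of 0.7, and the sample signal descriptions are assumptions chosen for illustration, not values from this embodiment.

```python
import math
from collections import Counter

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(u[k] * v.get(k, 0.0) for k in u)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def embed(text):
    # Stand-in for a pre-trained sentence encoder: a bag-of-words count vector.
    return Counter(text.lower().split())

def idf_weights(corpus):
    """Smoothed inverse document frequency for each token in the corpus."""
    n = len(corpus)
    df = Counter(tok for doc in corpus for tok in set(doc.lower().split()))
    return {tok: math.log((1 + n) / (1 + c)) + 1.0 for tok, c in df.items()}

def keyword_score(feature, signal, idf):
    """Share of the feature's IDF mass that also appears in the signal."""
    f_toks, s_toks = set(feature.lower().split()), set(signal.lower().split())
    total = sum(idf.get(t, 1.0) for t in f_toks)
    shared = sum(idf.get(t, 1.0) for t in f_toks & s_toks)
    return shared / total if total else 0.0

def match_confidence(feature, signal, idf, history, w_sem=0.6, w_kw=0.3, w_hist=0.1):
    """Weighted blend of semantic similarity, keyword score, and historical mapping."""
    sem = cosine(embed(feature), embed(signal))
    kw = keyword_score(feature, signal, idf)
    hist = 1.0 if history.get(feature) == signal else 0.0
    return w_sem * sem + w_kw * kw + w_hist * hist

signals = [
    "220kV #1 main transformer differential protection action",
    "220kV #1 main transformer oil temperature high alarm",
    "110kV line 102 circuit breaker position",
]
idf = idf_weights(signals)
feature = "main transformer differential protection action"
scores = {s: match_confidence(feature, s, idf, history={}) for s in signals}
THRESHOLD = 0.7  # hypothetical preset threshold
matches = [s for s, c in scores.items() if c >= THRESHOLD]
```

With these sample inputs only the first signal description passes the threshold; in practice the weights and the threshold would have to be calibrated on labeled mappings.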
[0066] Extracting the predefined calculation formulas and logical relationships from the rule template includes: based on the structured field definition of the rule template, accurately extracting the mathematical or logical calculation formulas defined in the expression field (such as current > setting value) and the feature combination conditions defined in the logical relationship field (such as feature A AND (feature B OR feature C)); at the same time, parsing the participant list to determine the abstract feature corresponding to each variable in the formula. The process not only performs syntax parsing but also uses a lightweight inference engine to preliminarily verify the completeness of the formula and detect potential logical conflicts, ensuring that the extracted calculation logic is well-structured and executable at both the mathematical and business levels.
[0067] Instantiating the extracted calculation formulas based on the specific bay type and wiring configuration parameters of the target substation includes: obtaining the specific bay type (e.g., line bay, transformer bay) and electrical wiring method (e.g., 3/2 connection, sectionalized single busbar) from the target substation's configuration file (SCD file or bay parameter table); using this configuration information, the system binds the abstract variables in the formulas. For example, the abstract feature "bay switch position" is instantiated as the specific signal point "QF102 circuit breaker position", and the differential current setting is instantiated as the specific value 1.2 A according to the equipment model and voltage level. For complex events involving multiple bays, the system also derives the associated signals of the relevant bays from the main wiring topology and completes multi-point synchronous instantiation.
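The variable binding described above can be illustrated by the following minimal Python sketch. The template layout, the placeholder syntax `{name}`, the point identifiers `TM_1T_IDIFF` and `QF102_POS`, and the `bay_config` table are hypothetical and serve only to show how abstract variables might be replaced by signal points or set values.

```python
import re

# Hypothetical rule template with abstract variables in braces
rule_template = {
    "event": "main transformer differential protection trip",
    "expression": "{diff_current} > {diff_setting} AND {breaker_position} == 0",
    "participants": ["diff_current", "diff_setting", "breaker_position"],
}

# Hypothetical bay parameter table, keyed by (bay type, wiring method)
bay_config = {
    ("transformer bay", "3/2 connection"): {
        "diff_current": "TM_1T_IDIFF",     # telemetry point (assumed name)
        "diff_setting": 1.2,               # differential current setting in A
        "breaker_position": "QF102_POS",   # QF102 circuit breaker position (assumed name)
    }
}

def instantiate(template, bay_type, wiring, config):
    """Replace every abstract variable with its bound signal point or set value."""
    bindings = config[(bay_type, wiring)]
    missing = [p for p in template["participants"] if p not in bindings]
    if missing:
        raise KeyError(f"unbound abstract variables: {missing}")
    return re.sub(r"\{(\w+)\}",
                  lambda m: str(bindings[m.group(1)]),
                  template["expression"])

expr = instantiate(rule_template, "transformer bay", "3/2 connection", bay_config)
print(expr)
```

Rejecting a template whose participants are not fully bound keeps an incomplete configuration from silently producing an expression that cannot be calculated.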
[0068] Applying the instantiated trigger expression to the associated raw signal data to perform logical and numerical calculations includes: obtaining the current or specified time-series values of all raw signals (remote signaling, telemetry) associated with the instantiated expression from the real-time database or simulation data source, and calculating them item by item according to the order of operations of the expression: first resolving the groups within parentheses, then performing relational comparisons (such as >, =), and finally performing Boolean algebra operations (such as AND, OR). For logic involving time sequence (such as signal B not returning within 200 ms after signal A is activated), a time window is introduced to compare and judge the signal sequence. After the calculation is completed, the final Boolean result is compared with the trigger condition defined in the rule (usually TRUE) to determine whether the event meets the reliable trigger condition, and a detailed judgment log is generated that includes the trigger time, the actual values of the participating signals, and the intermediate calculation results.
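By way of a hedged example, the evaluation order and the time-window judgment described above might be sketched in Python as follows. The nested-tuple expression format, the signal names, and the event tuple layout `(t_ms, signal, state)` are assumptions made for this illustration; a real implementation would read values from the real-time database or simulation data source.

```python
import operator

OPS = {">": operator.gt, "<": operator.lt, "==": operator.eq}

def evaluate(node, values, log):
    """Evaluate a nested-tuple expression; inner groups are resolved first."""
    kind = node[0]
    if kind in OPS:                          # relational comparison: (op, signal, constant)
        _, name, const = node
        result = OPS[kind](values[name], const)
        log.append((node, result))
    elif kind in ("AND", "OR"):
        # Evaluate every child (no short-circuit) so the judgment log stays complete.
        children = [evaluate(child, values, log) for child in node[1:]]
        result = all(children) if kind == "AND" else any(children)
        log.append((kind, result))
    else:
        raise ValueError(f"unknown operator: {kind}")
    return result

def not_returned_within(events, a, b, window_ms):
    """True if `a` activates and `b` does not reset within window_ms afterwards."""
    t_a = next((t for t, name, state in events if name == a and state == 1), None)
    if t_a is None:
        return False
    return not any(name == b and state == 0 and t_a <= t <= t_a + window_ms
                   for t, name, state in events)

expr = ("AND", (">", "I_DIFF", 1.2),
               ("OR", ("==", "QF102_POS", 0), ("==", "QF103_POS", 0)))
values = {"I_DIFF": 1.5, "QF102_POS": 0, "QF103_POS": 1}
log = []
triggered = evaluate(expr, values, log)      # Boolean judgment result

late_reset = [(0, "A", 1), (350, "B", 0)]    # B resets after the 200 ms window
timely_reset = [(0, "A", 1), (120, "B", 0)]  # B resets inside the window
```

The `log` list records each comparison and each logical combination with its intermediate result, which corresponds to the judgment log mentioned in the paragraph above.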
[0069] Specifically, by using semantic matching and parameter instantiation, the unreliability of event triggering conditions caused by inaccurate feature matching and formulas not being combined with actual configurations is solved. Based on semantic similarity, the rule features are accurately associated with field signals. Combined with the specific configuration of the substation, the abstract formula is bound into a computable expression. Logical operations are performed through real-time data, thereby achieving accurate and automated determination of event triggering conditions. This significantly improves the reliability of event synthesis decision-making and the credibility of the system's autonomous judgment.
[0070] S4. Based on reliable triggering conditions, a signal keyword depth traversal algorithm is used to perform multi-dimensional correlation search on standardized signals, filter related signals and arrange timing logic to generate test instances containing positive logic signal groups and negative logic signal groups, covering primary equipment failures, secondary equipment anomalies and edge scenarios.
[0071] Furthermore, step S4 specifically includes:
[0072] The standardized signal description set is matched against multi-level keywords by the signal keyword depth traversal algorithm. A hierarchical signal association tree is constructed based on voltage level, bay name and equipment type. The association tree is traversed by breadth-first search to obtain the associated signal sets layer by layer, forming a candidate signal pool.
[0073] Based on the event type and feature name, candidate signals that conform to the calculation formula are selected from the set of associated signals. Combination operations are performed on these candidate signals according to the logical relationship defined in the rules to obtain a positive logic signal group that meets the event triggering condition. Based on the negation form of the same logical relationship, reverse constraint operations are performed on the positive logic signal group to generate a negative logic signal group for verifying event suppression.
[0074] The signals in the positive logic signal group and the negative logic signal group are arranged sequentially along the time axis according to the timing logic and time interval requirements of the actual events, forming a set of test instances with clear timing dependencies. This set has the basic structure to verify the triggering and suppression of events.
[0075] Based on the unique signal feature combination rules in typical scenarios such as primary equipment failure and secondary equipment anomaly, the existing test instance set is supplemented and generated by adding edge and abnormal scenario test cases such as abnormal signal combinations, timing deviations, and signal missing, ultimately forming a comprehensive and scenario-complete extended test instance set.
[0076] Furthermore, a breadth-first search (BFS) approach is used to traverse the association tree, acquiring the set of related signals layer by layer. This includes: using the core equipment or bay involved in the event rule as the root node, initiating a breadth-first search (BFS) in the pre-constructed voltage level-bay-equipment hierarchical signal association tree. The traversal process starts from the root node, first visiting all its direct child nodes (such as all primary equipment signals within the same bay), forming the first layer of related signal set; then expanding outward layer by layer, visiting sibling nodes (such as other bays under the same bus) and functionally related nodes (such as secondary signals of corresponding protection devices); each layer of traversal is filtered according to signal type (such as remote signaling, telemetry, alarm) and functional description (such as action, alarm, location), ensuring that the collected signal set not only has topological correlation, but also logical and functional relevance, thereby constructing a comprehensive and focused candidate signal pool.
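The layer-by-layer traversal with type filtering can be illustrated by the following minimal Python sketch. The `links` adjacency table, the node naming scheme (`bay:`, `dev:`, `sig:` prefixes), and the `max_depth` limit are hypothetical and stand in for the pre-constructed voltage level-bay-equipment association tree.

```python
from collections import deque

# Hypothetical association graph: node -> children, sibling bays, related devices
links = {
    "bay:#1 main transformer": ["dev:QF102", "dev:differential protection", "bay:line 201"],
    "dev:QF102": ["sig:QF102 position"],
    "dev:differential protection": ["sig:differential protection action",
                                    "sig:device fault alarm"],
    "bay:line 201": ["sig:line 201 active power"],
}
signal_types = {
    "sig:QF102 position": "remote signaling",
    "sig:differential protection action": "remote signaling",
    "sig:device fault alarm": "alarm",
    "sig:line 201 active power": "telemetry",
}

def candidate_pool(root, allowed_types, max_depth=3):
    """Breadth-first traversal that keeps only signals of the allowed types."""
    seen, pool = {root}, []
    queue = deque([(root, 0)])
    while queue:
        node, depth = queue.popleft()
        if node in signal_types:             # leaf: a signal point
            if signal_types[node] in allowed_types:
                pool.append((depth, node))
            continue
        if depth == max_depth:               # do not expand beyond the depth limit
            continue
        for nxt in links.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, depth + 1))
    return pool

pool = candidate_pool("bay:#1 main transformer", {"remote signaling", "alarm"})
```

Here the telemetry point of the sibling bay is reached by the traversal but removed by the type filter, so the pool stays focused on signals relevant to the event.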
[0077] Generating the negative logic signal group for verifying event suppression, by performing a reverse constraint operation on the positive logic signal group based on the negation form of the same logical relationship, includes: analyzing the complete logical relationship satisfied by the positive logic signal group (e.g., signal A and signal B act simultaneously). The negative logic signal group is not generated by simply inverting the entire logic; instead, a series of test scenarios in which the event should be suppressed is constructed systematically. Specific methods include:
Missing critical signals: removing one or more necessary signals from the positive logic group (e.g., keeping only signal A and omitting signal B);
[0078] Signal state error: setting the state of a critical signal to the opposite, invalid state (e.g., changing the action of signal A to the reset of signal A);
[0079] Timing logic violation: breaking the timing constraints between signals (e.g., triggering signal A to reset before signal B acts);
[0080] Introducing contradictory or interlocking signals: adding to the positive logic signals the interlocking condition signals defined in the rules (such as withdrawal of the protection function pressure plate). Through this systematic reverse construction, the negative logic signal group can effectively verify the suppression capability of event-based rules in abnormal, boundary and interference scenarios, ensuring no false alarms.
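As a non-limiting sketch of the four construction methods listed above, the following Python example derives negative logic cases from one positive logic group. The tuple layout `(t_ms, signal, state)`, the case labels, and the simplification of the timing violation to a reversal of the signal order are assumptions made for illustration.

```python
def negative_groups(positive, interlocks):
    """positive: list of (t_ms, signal, state) tuples ordered by time."""
    cases = []
    for i, (t, sig, state) in enumerate(positive):
        # 1. Missing critical signal: drop one required signal at a time.
        cases.append((f"missing:{sig}", positive[:i] + positive[i + 1:]))
        # 2. Signal state error: flip one signal to the opposite state.
        flipped = positive[:i] + [(t, sig, 1 - state)] + positive[i + 1:]
        cases.append((f"state_error:{sig}", flipped))
    # 3. Timing logic violation: keep the time slots but reverse the signal order.
    if len(positive) > 1:
        times = [t for t, _, _ in positive]
        swapped = [(t, sig, state)
                   for t, (_, sig, state) in zip(times, reversed(positive))]
        cases.append(("timing_violation", swapped))
    # 4. Interlocking signal: assert an interlock before the positive sequence.
    for lock in interlocks:
        cases.append((f"interlock:{lock}", [(0, lock, 1)] + positive))
    return cases

positive = [(0, "A", 1), (50, "B", 1)]
cases = negative_groups(positive, ["protection pressure plate withdrawn"])
names = [name for name, _ in cases]
```

Each generated case carries a label naming the construction method, so that a failed suppression can later be traced back to the kind of disturbance that caused it.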
[0081] Specifically, by using a signal depth traversal algorithm and a systematic construction of positive and negative logic systems, the verification problem of test cases relying on manual writing, being inefficient, and failing to fully cover various fault and abnormal scenarios is solved. Based on a hierarchical signal association tree, relevant signals are automatically retrieved and filtered. This not only generates positive logic test sequences that simulate the correct triggering of events, but also constructs negative logic test scenarios to verify the rule suppression capability. Based on this, precise timing sequences and supplementary edge test cases are arranged, thereby achieving fully automated and high-coverage generation of test instances. This provides complete and reliable verification materials for subsequent closed-loop acceptance, significantly improving the efficiency and depth of event-based functional testing.
[0082] S5. Based on the power industry standard IEC104 protocol, a monitoring information forwarding model is constructed and a simulation test environment is built. An event-based acceptance device simulates the station-end data gateway so that the simulation signals are isolated from the actual data flow. The positive and negative logic signals in the test instances are sent to the front-end computer of the centralized control station, and the event generation results of the master station are collected in real time.
[0083] Furthermore, step S5 specifically includes:
[0084] Load the IEC104 protocol specification, parse its message format and application service data unit type; based on the actual communication characteristics of the station-end data gateway, establish a monitoring information forwarding model, and form an independent simulation communication link that is physically isolated from the actual operation data;
[0085] Establishing the monitoring information forwarding model includes: loading the IEC104 protocol specification and analyzing in detail its message format, including the application service data unit (ASDU) type definitions (such as single-point information, double-point information, normalized value, and short floating-point number), the information object address structure, and the transmission procedures. Based on in-depth analysis of the communication behavior of actual station-end data gateways (such as measurement and control devices and protection devices), including link establishment, active data transmission, spontaneous (burst) transmission, and heartbeat maintenance, a high-fidelity monitoring information forwarding behavior model is constructed. This model runs on independent hardware devices or in a virtualized environment. Through physical network isolation technology (such as VLAN segmentation or independent network cards), it is completely isolated from the actual data flow of the production control area (security zones I/II), forming a simulated communication link used only for testing and ensuring that the testing process does not affect the real-time monitoring service of the power grid.
[0086] Obtain instantiated positive and negative logic test instances from the rule template library; parse the signal sequence and its precise timing required by each instance using the signal traversal algorithm; and use the IEC104 simulation protocol stack to encapsulate the signal list frame by frame into standard IEC104 application layer messages, generating an injectable simulation message sequence.
[0087] Parsing the signal sequence and its precise timing required by each instance with the signal traversal algorithm includes: performing structured parsing on the test instance and identifying each defined signal point and its target state; for timing logic, constructing a dependency graph between signals that distinguishes concurrent signal groups, sequential signal groups, and conditional triggering relationships; and, by traversing this graph, calculating the absolute timestamp of each signal and arranging the injection timetable of the entire signal sequence according to the fixed delay, random jitter, or event-driven relative delay (such as triggering after a change in the state of a preceding signal) defined in the rules, thereby generating an executable list of signal actions with millisecond-level timing precision.
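The calculation of absolute timestamps from a dependency graph can be illustrated by the following minimal Python sketch, which uses the standard-library `graphlib.TopologicalSorter`. The `actions` table, its signal names, and the restriction to a single predecessor with a fixed relative delay are assumptions for illustration; random jitter and conditional triggering are not modeled here.

```python
from graphlib import TopologicalSorter

# Hypothetical test instance: signal -> (predecessor or None, relative delay in ms)
actions = {
    "A: protection start": (None, 0),
    "B: differential action": ("A: protection start", 30),
    "C: breaker open": ("B: differential action", 60),
    "D: fault recorder start": ("A: protection start", 30),
}

def schedule(actions, t0_ms=0):
    """Resolve relative delays into absolute injection timestamps."""
    # graphlib expects a mapping from each node to the set of its predecessors.
    graph = {sig: ({pred} if pred else set()) for sig, (pred, _) in actions.items()}
    timestamps = {}
    for sig in TopologicalSorter(graph).static_order():
        pred, delay = actions[sig]
        base = timestamps[pred] if pred else t0_ms
        timestamps[sig] = base + delay
    # Sort by time, then by name, so concurrent signals have a stable order.
    return sorted(timestamps.items(), key=lambda kv: (kv[1], kv[0]))

timetable = schedule(actions)
```

Signals B and D share the same timestamp and therefore form a concurrent group, while C follows B as part of a sequential group.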
[0088] Encapsulating the signal list frame by frame in time sequence into standard IEC104 application layer messages with the IEC104 simulation protocol stack includes: matching each signal in the signal action list with the corresponding IEC104 type identifier (TI) (such as single-point remote signaling, double-point remote signaling, or normalized value) and quality descriptor (QDS); assigning a unique common address (COA) and information object address (IOA) to each information object; and setting the corresponding cause of transmission (COT) according to whether the signal is transmitted periodically, transmitted spontaneously on a change, or sent in response to a general interrogation. Finally, the protocol stack combines all these elements with precise time stamps and assembles them, according to the IEC104 APDU (Application Protocol Data Unit) frame format, into a complete communication message frame containing a start character, length, control field, address field, and ASDU (Application Service Data Unit), forming a raw data packet sequence that can be sent directly at the network layer.
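As a hedged illustration of the frame assembly, the following Python sketch builds a single I-format APDU carrying one single-point information object. It is a simplified example and not the protocol stack of this embodiment: it uses type identifier 1 (single-point information without time tag) with a spontaneous cause of transmission, omits the time tag, sets only the "invalid" quality flag, and the function name `build_i_frame` and the sample addresses are assumptions.

```python
import struct

M_SP_NA_1 = 1          # type identifier: single-point information without time tag
COT_SPONTANEOUS = 3    # cause of transmission: spontaneous

def build_i_frame(send_seq, recv_seq, common_addr, ioa, state, invalid=False):
    """Assemble one I-format APDU with a single information object."""
    # SIQ octet: single-point state in bit 0, "invalid" quality flag in bit 7
    siq = (state & 0x01) | (0x80 if invalid else 0x00)
    # ASDU header: type id, variable structure qualifier (1 object),
    # cause of transmission, originator address, 2-octet common address
    asdu = struct.pack("<BBBBH", M_SP_NA_1, 0x01, COT_SPONTANEOUS, 0x00, common_addr)
    # 3-octet information object address followed by the information element
    asdu += ioa.to_bytes(3, "little") + bytes([siq])
    # Control field: sequence numbers shifted left by one bit, least significant octet first
    control = struct.pack("<HH", (send_seq << 1) & 0xFFFF, (recv_seq << 1) & 0xFFFF)
    # APCI: start character 0x68 and the length of everything after the length octet
    apci = bytes([0x68, len(control) + len(asdu)]) + control
    return apci + asdu

frame = build_i_frame(send_seq=0, recv_seq=0, common_addr=1, ioa=1001, state=1)
print(frame.hex(" "))
```

A time-tagged variant would use a different type identifier and append a seven-octet time tag after the information element; that extension is left out to keep the sketch short.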
[0089] By loading the monitoring information forwarding model through the event-based acceptance device, the behavior of the station-end data gateway is simulated, and the simulated message sequence is sent to the front-end machine of the central control master station. At the same time, according to the output interface specification of the master station event processing module, the event records generated after its response are collected in real time to form an actual event generation result dataset.
[0090] The system automatically compares the preset expected events (positive logic should trigger, negative logic should suppress) in the test instance with the actual event generation results to obtain preliminary matching records. Then, it calculates key indicators: statistically analyzes the success rate of positive logic event synthesis, analyzes the response time of each event from signal transmission to completion of generation, summarizes all comparison records, synthesis rate and response time series, and generates a complete event-based intelligent acceptance report.
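The comparison and indicator calculation described above might be sketched in Python as follows. The record fields (`id`, `logic`, `sent_ms`), the representation of the master station results as a mapping from test case identifier to generation time, and the sample values are assumptions made for this illustration.

```python
from statistics import mean

def acceptance_metrics(cases, actual_events):
    """cases: dicts with id, logic ('positive' or 'negative') and sent_ms.
    actual_events: {case id: generated_ms} for events the master station produced."""
    records = []
    for c in cases:
        fired = c["id"] in actual_events
        expected = c["logic"] == "positive"   # positive should trigger, negative should not
        records.append({
            "id": c["id"],
            "expected": expected,
            "fired": fired,
            "passed": fired == expected,
            "response_ms": actual_events[c["id"]] - c["sent_ms"] if fired else None,
        })
    positives = [r for r in records if r["expected"]]
    success_rate = (sum(r["passed"] for r in positives) / len(positives)
                    if positives else 0.0)
    latencies = [r["response_ms"] for r in positives if r["passed"]]
    return records, success_rate, (mean(latencies) if latencies else None)

cases = [
    {"id": "P1", "logic": "positive", "sent_ms": 0},
    {"id": "P2", "logic": "positive", "sent_ms": 1000},
    {"id": "N1", "logic": "negative", "sent_ms": 2000},
]
actual = {"P1": 180, "N1": 2300}   # P2 was missed, N1 fired although it should not
records, rate, avg_ms = acceptance_metrics(cases, actual)
```

In this sample, one of two positive cases synthesizes correctly and the negative case produces a false alarm, both of which would appear in the comparison records of the acceptance report.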
[0091] Specifically, by constructing a high-fidelity simulation test environment based on the IEC104 protocol, the problem of traditional manual testing being unable to securely and comprehensively verify event-driven functions was solved. A forwarding model physically isolated from the real data stream was established according to the protocol specification. Test instances were precisely encapsulated into time-sequenced communication messages, and the behavior of station-end equipment was simulated to inject signals into the master station. At the same time, the event generation results were collected in real time for automated comparison and index statistics. This enabled efficient, accurate, and comprehensive closed-loop acceptance of event-driven functions without affecting the real-time operation of the power grid, and automatically generated a structured intelligent acceptance report, greatly improving the security and objectivity of test verification.
[0092] S6. Extract the expected event results corresponding to the test instance, automatically compare the master station event generation results with the expected results, detect misconfiguration or omission issues in the event-based configuration, determine the validity of the event, and generate acceptance data containing the location and type of configuration defects.
[0093] Furthermore, step S6 specifically includes:
[0094] Extract the preset expected event results from the test instance and automatically compare them with the event records actually generated by the master station; use the matching algorithm to detect inconsistencies such as missing events, incorrect triggering, and misaligned timing, and identify misconfiguration and omission issues in the event-based configuration.
[0095] Misconfiguration and omission issues in the event-based configuration are identified across three core dimensions. In the event content dimension, missing events (omission) and unexpected triggering of event types and levels (misconfiguration) are detected. In the signal logic dimension, substantive misconfigurations such as incorrect signal point mapping, discrepancies between signal states and logical relationships, and deviations in telemetry parameter calculations are verified, as well as the omission of key feature signals. In the timing dimension, the sequence of signal actions and the time intervals are checked against the constraints defined by the rules, identifying configuration defects such as timing misalignment or excessive delay. In this way the various inconsistencies in the rule configuration are located systematically.
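As an illustrative, non-limiting sketch of the three-dimension check, the following Python function compares one expected event with the record produced by the master station. The field names (`should_trigger`, `level`, `signals`, `delay_ms`), the defect labels, and the 500 ms delay limit are hypothetical.

```python
def classify(expected, actual, max_delay_ms=500):
    """Compare one expected event with the master station record (or None)."""
    # Event content dimension: missing or unexpected events
    if actual is None:
        return [("event content", "omission")] if expected["should_trigger"] else []
    if not expected["should_trigger"]:
        return [("event content", "misconfiguration: unexpected event")]
    defects = []
    if actual["level"] != expected["level"]:
        defects.append(("event content", "misconfiguration: wrong event level"))
    # Signal logic dimension: missing key signals or wrongly mapped extra signals
    missing = set(expected["signals"]) - set(actual["signals"])
    extra = set(actual["signals"]) - set(expected["signals"])
    if missing:
        defects.append(("signal logic", f"omission: {sorted(missing)}"))
    if extra:
        defects.append(("signal logic", f"misconfiguration: {sorted(extra)}"))
    # Timing dimension: order of the shared signals and overall delay
    common = [s for s in expected["signals"] if s in actual["signals"]]
    if [s for s in actual["signals"] if s in common] != common:
        defects.append(("timing", "sequence misalignment"))
    if actual["delay_ms"] > max_delay_ms:
        defects.append(("timing", "excessive delay"))
    return defects

expected = {"should_trigger": True, "level": "accident", "signals": ["A", "B", "C"]}
actual = {"level": "accident", "signals": ["B", "A"], "delay_ms": 800}
defects = classify(expected, actual)
```

For the sample record the function reports an omitted key signal, a sequence misalignment, and an excessive delay, each tagged with the dimension in which it was found.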
[0096] For the identified inconsistencies, analyze their corresponding signal configurations, rule logic, and timing relationships to determine whether the defect substantially affects the validity of the event; clarify whether the defect is a logical error, signal mapping deviation, or improper parameter setting, and locate the specific rule entry or signal point number.
[0097] The results of defect location, type, and impact analysis are fed back to the rule template library and instantiation module, triggering the rule tuning process; based on expert experience or automatic correction strategies, updated rule instances or signal mapping relationships are generated to form a corrected dataset for re-verification;
[0098] The test is re-executed in the simulation environment using the corrected dataset. The event generation results produced by the master station after the update are collected and compared with the expected results. The results of the initial test and the regression verification are combined to generate structured acceptance data containing defect details, repair status, event validity judgment and an overall compliance conclusion.
[0099] Specifically, through a multi-dimensional automatic comparison and intelligent analysis mechanism, the system solves the problem of systematically discovering, locating, and closing-loop repairing event-based configuration defects. It automatically identifies misconfiguration and omission issues through three dimensions: event content, signal logic, and timing, and accurately locates defects to specific rule entries or signal points. At the same time, a feedback mechanism is established to drive rule optimization and regression verification, ultimately generating structured acceptance data. This achieves automated and accurate diagnosis and closed-loop repair of configuration defects, significantly improving the objectivity, comprehensiveness, and iterative optimization efficiency of event-based acceptance.
[0100] S7. Based on the configuration defect information in the acceptance data, update the event-based rule template library; optimize, through incremental learning, the semantic matching algorithm and the parameters of the model that combines Bidirectional Encoder Representations from Transformers with a conditional random field (BERT-CRF); improve the standardized conversion logic and feature mapping relationships; and form a closed-loop mechanism of standardization, templating, instantiation, acceptance, and optimization.
[0101] Furthermore, step S7 specifically includes:
[0102] Based on the clearly defined location and type of configuration defects in the structured acceptance data, the event-based rule template library is updated in a targeted manner. For issues such as misconfiguration, omission, and logical conflicts, the feature mapping relationships, calculation formulas, or parameter settings in the corresponding rule templates are corrected to form an optimized version of the rule knowledge base.
[0103] Defect samples related to semantic understanding and signal mapping in the acceptance data are converted into labeled data, and the BERT-CRF standardized model is incrementally trained to optimize its semantic representation and sequence labeling capabilities; the feature weights and similarity calculation logic of the semantic matching algorithm are optimized simultaneously to improve the accuracy of rule features and standardized signal mapping.
[0104] Based on the updated rule template library and optimized AI model, a new round of standardization and acceptance processes is executed to verify the effectiveness of defect repair and overall system performance improvement. Finally, the optimized rule templates, model parameters, and verification results from this iteration are archived to form a complete closed-loop knowledge evolution mechanism.
[0105] Furthermore, the feature mapping relationships, calculation formulas, or parameter settings in the corresponding rule templates are corrected. This includes: performing structured corrections based on the defect types identified by the acceptance data; for feature mapping relationships, associating mismatched abstract features with correct standardized signal points in the participant list or mapping table of the rule template; for calculation formulas, correcting logical operators, mathematical expressions, or threshold comparison relationships in the expression fields, such as changing incorrect logical OR to logical AND, or calibrating the threshold formula for exceeding limits; and for parameter settings, recalibrating the setpoints and delay parameters involved in the formulas. All corrections are completed under a version control system, generating a new version of the rule template and recording the root cause of the defect, the correction content, and the associated test cases, achieving traceable and reproducible updates to the knowledge base.
[0106] Incremental training of the BERT-CRF standardization model optimizes its semantic representation and sequence labeling capabilities. This includes converting erroneous samples identified during acceptance testing that are directly related to the semantic understanding of point table descriptions and standardization outputs (such as standardization errors caused by model ambiguity) into structured training data and labeling them with correct standardization sequence labels. Using this data as an incremental training set, multiple rounds of iterative training are conducted on the already trained BERT-CRF model with a low learning rate. This process simultaneously optimizes the semantic representation capabilities of the BERT encoder, enabling it to form more robust vector representations of easily confused power domain terms (such as abbreviations and alternative names). It also optimizes the label transition probability of the CRF layer to correct common label sequence errors (such as misalignment of voltage level and equipment type labels). Through this targeted and continuous incremental learning, the model's accuracy and generalization ability on power text standardization tasks are specifically improved.
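The conversion of a corrected defect sample into sequence-labeled training data can be illustrated by the following minimal Python sketch. The BIO tagging scheme, the field names `VOLT`, `EQUIP`, and `SIGNAL`, and the whitespace tokenization are assumptions; the embodiment does not specify the label set, and the incremental training itself is not shown.

```python
def to_bio(tokens, spans):
    """spans: list of (start, end_exclusive, field) over token indices."""
    labels = ["O"] * len(tokens)
    for start, end, field in spans:
        labels[start] = f"B-{field}"          # first token of the field
        for i in range(start + 1, end):
            labels[i] = f"I-{field}"          # continuation tokens of the field
    return list(zip(tokens, labels))

# A description the model standardized incorrectly, with the corrected field spans
tokens = "220kV #1 main transformer differential protection action".split()
corrected_spans = [(0, 1, "VOLT"), (1, 4, "EQUIP"), (4, 7, "SIGNAL")]
sample = to_bio(tokens, corrected_spans)
```

Samples in this form could then be appended to the incremental training set used to continue training the already trained model with a low learning rate.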
[0107] Specifically, by using a dual optimization of rules and models driven by defect feedback, the problem of traditional event-based systems being unable to continuously learn and improve from practical applications is solved. The mapping relationships and calculation logic in the rule templates are accurately corrected based on acceptance data. At the same time, defect samples are transformed into training data to incrementally learn the BERT-CRF model, optimizing semantic understanding and sequence labeling capabilities. The improvement effect is confirmed through regression verification, forming a complete knowledge loop of problem discovery, analysis and repair, and verification and optimization. This enables the system to have the ability to continuously evolve and significantly improves the long-term accuracy, robustness, and practicality of event-based configuration.
[0108] Example 2
[0109] Please see Figure 2. This embodiment provides an intelligent standardization system for substation monitoring information based on BERT-CRF and rule constraints, which is applied to the intelligent standardization method for substation monitoring information based on BERT-CRF and rule constraints and includes:
[0110] The intelligent semantic standardization engine module, based on the domain pre-trained BERT-CRF model, uses deep semantic understanding and sequence labeling, combined with rigid regular expression rules, to accurately convert unstructured original descriptions into standard formats and extract key features such as voltage levels and interval names to form structured and standardized signal representations, laying a unified and high-quality data foundation for all subsequent processes.
[0111] The hybrid rule template building module parses the rule definition file and uses a hybrid mode of keyword matching and AI inference to automatically establish the mapping relationship between abstract features and standardized signals, and intelligently completes the default parameters. Finally, it encapsulates and generates standardized, reusable single-event rule templates that cover various typical scenarios and integrates them into a rule template library that supports full lifecycle management.
[0112] The feature matching and trigger judgment module uses a semantic matching algorithm to filter reliable feature-signal correspondences. Combined with the specific configuration of the substation (such as bay type and wiring method), it binds and instantiates the calculation formulas in the rules. By calculating and verifying the reliable triggering conditions of the event, it provides an accurate logical basis for test case generation.
[0113] The positive and negative logic test instance generation module uses a signal keyword depth traversal algorithm to filter signals and arrange signal timing in a multi-dimensional associated signal pool, generating a set of test instances containing positive logic (verifying correct triggering) and negative logic (verifying reliable suppression), and supplementing edge and abnormal scenario test cases to ensure the completeness of test coverage.
[0114] The IEC104 simulation test module simulates the monitoring information forwarding model based on the IEC104 protocol. It serializes test instances into standard communication messages, simulates station behavior to inject signals into the master station through a dedicated acceptance device, and collects the master station response results in real time, performing a fully automated simulation test from signal injection to result collection.
[0115] The automatic acceptance and defect analysis module automatically compares the actual event results generated by the master station with the test expectations from multiple dimensions, accurately identifies inconsistencies such as missing events, false triggers, and misaligned timing, locates the type and location of configuration defects (misconfiguration, omission), and generates structured acceptance data and preliminary analysis reports containing detailed defect information.
[0116] The self-evolving closed-loop optimization module utilizes defect information discovered during acceptance testing to optimize the rule template library in a targeted manner and drive the AI standardized model for incremental learning. A regression verification process is then initiated to confirm the optimization effect. By solidifying the mechanism of problem discovery, analysis, correction, and verification within a closed loop, the system's knowledge, rules, and models achieve self-evolution.
[0117] The above description is merely a preferred embodiment of the present invention and is not intended to limit the present invention in any way. Although the present invention has been disclosed above with reference to preferred embodiments, it is not intended to limit the present invention. Any person skilled in the art can make some modifications or alterations to the above-disclosed technical content to create equivalent embodiments without departing from the scope of the present invention. Any simple modifications, equivalent changes and alterations made to the above embodiments based on the technical essence of the present invention without departing from the scope of the present invention are within the scope of the present invention.
Claims
1. A method for intelligent standardization of substation monitoring information based on BERT-CRF and rule constraints, characterized in that it includes the following steps:
S1. Standardization of the monitoring information point table: Based on the BERT-CRF architecture, a generative language model for the power industry is constructed. The original point table description text of substation monitoring information is collected as input. The BERT bidirectional encoder performs contextual representation of the text to obtain deep semantic vectors. Using a standardized corpus of substation equipment monitoring information, domain pre-training is completed through the masked language model to form initial model parameters with power semantic understanding. The initial model is fine-tuned using manually labeled standardized samples. The model architecture is optimized using cross-layer parameter sharing and embedding layer decomposition techniques. The fine-tuned BERT output vectors are input into the CRF (conditional random field) layer, and a preliminary standardized semantic label sequence is generated through sequence labeling and label transition constraints. The CRF decoding output is format-checked and forcibly corrected using regular expressions to obtain standardized description text that conforms to the specifications. Key information fields are extracted from the standardized description text to form a structured standardized signal representation.
S2. Event-based rule template construction: Parse the substation event-based rule definition file, extract event types, feature names, and calculation formulas, build an event-based rule template library, and define expression parameters. Establish the mapping relationship between feature names and standardized signal representations through keyword matching, extract default parameters using a pre-trained model, and fill them into standard templates to form reusable rule templates.
S3. Event triggering condition determination: Rule features are associated with standardized signals using a semantic matching algorithm, and triggering conditions are instantiated in combination with substation configuration parameters.
S4. Automatic generation of test cases: Multi-level keyword matching is performed on the standardized signal description set through the signal keyword depth traversal algorithm. A hierarchical signal association tree structure is constructed based on voltage level, interval name and equipment type. The association tree is traversed in a breadth-first search manner to obtain the set of associated signals layer by layer, forming a candidate signal pool. Based on the event type and feature name, candidate signals that conform to the calculation formula are selected from the set of associated signals. These candidate signals are combined according to the logical relationship defined in the rules to obtain a positive logic signal group that meets the event triggering condition. Based on the negation of the same logical relationship, reverse constraint operations are performed on the positive logic signal group to generate a negative logic signal group for verifying event suppression. The signals in the positive logic signal group and the negative logic signal group are arranged sequentially along the time axis according to the timing logic and time interval requirements of the actual events, forming a set of test instances with clear timing dependencies. Based on the signal feature combination rules specific to typical scenarios of primary equipment failure and secondary equipment anomaly, the existing test instance set is supplemented with boundary and abnormal scenario test cases, ultimately forming a comprehensive, scenario-complete extended test instance set.
S5. Closed-loop simulation test and acceptance: Construct a simulation environment based on the IEC104 standard, inject test signals, and collect the event generation results of the master station.
S6. Automatic configuration defect analysis: Compare the results from the master station with the expected results to identify and locate misconfiguration and omission issues in the event-based configuration.
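The test case generation in step S4 can be illustrated with a minimal Python sketch. It is not the claimed implementation: the signal tuples, the AND-only rule logic, and the function names `build_tree`, `bfs_candidates`, `logic_groups` and `schedule` are assumptions introduced here for illustration.

```python
from collections import deque
from itertools import product

# Hypothetical standardized signals: (voltage level, interval name, equipment type, signal name).
SIGNALS = [
    ("220kV", "Line 1", "circuit breaker", "breaker open"),
    ("220kV", "Line 1", "protection", "protection trip"),
    ("220kV", "Line 2", "circuit breaker", "breaker open"),
    ("110kV", "Line 3", "protection", "protection trip"),
]

def build_tree(signals):
    """Nest signals as voltage level -> interval -> equipment type -> [signal names]."""
    tree = {}
    for voltage, interval, equipment, name in signals:
        tree.setdefault(voltage, {}).setdefault(interval, {}).setdefault(equipment, []).append(name)
    return tree

def bfs_candidates(tree):
    """Traverse the association tree breadth-first; return each signal with its full path."""
    pool = []
    queue = deque([((), tree)])
    while queue:
        path, node = queue.popleft()
        if isinstance(node, dict):
            for key, child in node.items():
                queue.append((path + (key,), child))
        else:  # leaf: a list of signal names
            pool.extend(path + (name,) for name in node)
    return pool

def logic_groups(required):
    """For an AND rule, return the positive group and every negated combination."""
    positive = {name: True for name in required}
    negatives = [
        dict(zip(required, states))
        for states in product([True, False], repeat=len(required))
        if not all(states)  # any combination that breaks the AND condition
    ]
    return positive, negatives

def schedule(group, start_ms=0, step_ms=200):
    """Place the signals of one group on a time axis with a fixed spacing (assumed 200 ms)."""
    return [(start_ms + i * step_ms, name, state) for i, (name, state) in enumerate(group.items())]

pool = bfs_candidates(build_tree(SIGNALS))
line1 = [p for p in pool if p[1] == "Line 1"]          # candidates for one interval
positive, negatives = logic_groups(["protection trip", "breaker open"])
timeline = schedule(positive)
```

Here the positive group should trigger the event, while each of the three negative groups should leave it suppressed. Rules with OR or timing-window logic would need a different negation step.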
2. The intelligent standardization method for substation monitoring information based on BERT-CRF and rule constraints according to claim 1, characterized in that step S2 specifically includes: Using a predefined set of professional keywords, the feature names in the event rule definition file are automatically matched against the standardized signal descriptions to determine their one-to-one correspondence and establish a mapping table from feature names to standard signals. Based on the mapping table, the voltage level, equipment type and interval name associated with each feature name are automatically extracted to form a basic parameter set. At the same time, a pre-trained language model is called to perform semantic extraction and contextual reasoning on the numerical default parameters that are not explicitly defined in the rule description, thereby filling in the complete parameter set. The event type, feature name, calculation formula, logical relationship, list of participating quantities, and the completed parameter set are encapsulated according to a preset standardized structure to generate a single event rule template for a specific substation configuration. All event types in the rule definition file are traversed to generate a set of rule templates for typical scenarios, which are then categorized, encoded, and stored. Finally, a rule template library that supports version management, query maintenance, and continuous expansion is established.
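The keyword matching and template encapsulation in this claim might be sketched in Python as follows. The keyword list, the signal records, the `RuleTemplate` fields and the helper names are hypothetical, and a plain dictionary of inferred defaults stands in for the pre-trained language model.

```python
from dataclasses import dataclass, field

# Hypothetical professional keyword set; a real one would follow the utility's naming specification.
KEYWORDS = ["protection trip", "breaker open", "reclosing"]

def build_mapping(feature_names, signals, keywords=KEYWORDS):
    """Map each feature name to the single standardized signal sharing a keyword with it."""
    mapping = {}
    for feature in feature_names:
        shared = [k for k in keywords if k in feature.lower()]
        hits = [s for s in signals if any(k in s["name"].lower() for k in shared)]
        if len(hits) == 1:  # keep only unambiguous one-to-one matches
            mapping[feature] = hits[0]
    return mapping

@dataclass
class RuleTemplate:
    event_type: str
    formula: str
    logic: str
    features: dict                      # feature name -> standardized signal id
    parameters: dict = field(default_factory=dict)
    version: int = 1

def make_template(event_type, formula, logic, mapping, explicit, inferred):
    """Encapsulate one rule; explicit values from the rule file override inferred defaults."""
    basic = {
        f: {k: s[k] for k in ("voltage", "interval", "equipment")}
        for f, s in mapping.items()
    }
    return RuleTemplate(
        event_type, formula, logic,
        features={f: s["id"] for f, s in mapping.items()},
        parameters={"basic": basic, **inferred, **explicit},
    )

signals = [
    {"id": "YX1001", "name": "line protection trip", "voltage": "220kV",
     "interval": "Line 1", "equipment": "protection"},
    {"id": "YX1002", "name": "circuit breaker open", "voltage": "220kV",
     "interval": "Line 1", "equipment": "circuit breaker"},
]
mapping = build_mapping(["Protection trip", "Breaker open", "Unknown feature"], signals)
template = make_template("line trip", "A and B", "AND", mapping,
                         explicit={"window_s": 5},
                         inferred={"window_s": 3, "delay_ms": 100})
```

Features that match no keyword, or more than one signal, are left out of the mapping table so that they can be resolved manually or by the semantic matching of step S3.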
3. The intelligent standardization method for substation monitoring information based on BERT-CRF and rule constraints according to claim 1, characterized in that step S3 specifically includes: Abstract features to be processed are obtained from the rule template library, and the similarity between each feature and the standardized signal representations is calculated using a semantic matching algorithm. The calculation results are filtered according to a preset threshold, retaining the feature-signal correspondences that meet the similarity standard and eliminating the matches that do not. For the correspondences that pass the screening, the predefined calculation formulas and logical relationships are extracted from the rule template. Based on the correspondences, the corresponding original signal data is located and extracted from the standardized signal sources as the input for formula calculation. Based on the configuration parameters of the target substation, the extracted calculation formula is instantiated, and the abstract variables are replaced with specific signal data sources or set values to generate specific event triggering expressions. The instantiated trigger expression is applied to the associated original signal data, and logical and numerical calculations are performed. Based on the calculation result of the expression and the trigger threshold or condition defined in the rule, it is determined whether the current event meets the reliable trigger condition, and the final Boolean judgment result is output.
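A minimal Python sketch of the threshold screening and trigger evaluation in this claim is given below. It makes several assumptions: `difflib.SequenceMatcher` is only a character-level stand-in for the semantic matching algorithm, the 0.7 threshold is arbitrary, and the point identifiers and dictionary layout are hypothetical.

```python
from difflib import SequenceMatcher

def similarity(a, b):
    """Character-level similarity in [0, 1]; a stand-in for a semantic similarity score."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def match_features(features, signals, threshold=0.7):
    """Keep, for each feature, its best-matching signal if the score reaches the threshold."""
    matches = {}
    for feature in features:
        point_id, name = max(signals.items(), key=lambda kv: similarity(feature, kv[1]))
        if similarity(feature, name) >= threshold:
            matches[feature] = point_id
    return matches

def instantiate(template, matches):
    """Replace the abstract features of a rule template with concrete signal points."""
    missing = [f for f in template["features"] if f not in matches]
    if missing:
        raise ValueError(f"no reliable signal for features: {missing}")
    return {"logic": template["logic"], "points": [matches[f] for f in template["features"]]}

def is_triggered(expression, values):
    """Evaluate the instantiated expression on signal states and return a Boolean result."""
    states = [values[p] for p in expression["points"]]
    return all(states) if expression["logic"] == "AND" else any(states)

signals = {"YX1001": "line protection trip", "YX1002": "circuit breaker open"}
matches = match_features(["protection trip", "breaker open", "gas pressure low"], signals)
template = {"logic": "AND", "features": ["protection trip", "breaker open"]}
expression = instantiate(template, matches)
fired = is_triggered(expression, {"YX1001": True, "YX1002": True})
suppressed = is_triggered(expression, {"YX1001": True, "YX1002": False})
```

In this example "gas pressure low" falls below the threshold and is discarded, which mirrors the elimination of matches that do not meet the similarity standard.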
4. The intelligent standardization method for substation monitoring information based on BERT-CRF and rule constraints according to claim 1, characterized in that step S5 specifically includes: Load the IEC104 protocol specification and parse its message format and application service data unit types; based on the actual communication characteristics of the station-end data gateway, establish a monitoring information forwarding model and form an independent simulation communication link that is physically isolated from actual operation data. Obtain instantiated positive and negative logic test instances from the rule template library; parse the required signal sequence and its precise timing for each instance using the signal traversal algorithm; and use the IEC104 simulation protocol stack to encapsulate the signal list frame by frame into standard IEC104 application layer messages, generating a simulation message sequence. The event-based acceptance device loads the monitoring information forwarding model to simulate the behavior of the station-end data gateway and sends the simulated message sequence to the front-end machine of the central control master station. At the same time, according to the output interface specification of the master station event processing module, the event records generated in response are collected in real time to form an actual event generation result dataset. The preset expected events in the test instances are automatically compared with the actual event generation results to obtain preliminary matching records. Key indicators are then calculated: the success rate of positive logic event synthesis is computed, and the response time of each event from signal transmission to completion of generation is analyzed. All comparison records, synthesis rates and response time series are summarized to generate a complete event-based intelligent acceptance report.
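The frame-by-frame encapsulation mentioned in this claim can be illustrated with a simplified Python sketch of an IEC 60870-5-104 I-format APDU carrying one single-point information object (type M_SP_NA_1, no time tag). The function names and the step format are assumptions; a real simulation stack would also handle connection setup (STARTDT), acknowledgements, timers and time-tagged types, none of which are shown here.

```python
import struct

START = 0x68          # APCI start octet
M_SP_NA_1 = 1         # type identification: single-point information without time tag
COT_SPONTANEOUS = 3   # cause of transmission: spontaneous

def build_single_point_apdu(ioa, value, send_seq, recv_seq, common_addr=1):
    """Encode one single-point signal as an I-format APDU (simplified sketch)."""
    # ASDU header: type, VSQ (one object), COT, originator address, common address.
    asdu = struct.pack("<BBBBH", M_SP_NA_1, 0x01, COT_SPONTANEOUS, 0x00, common_addr)
    asdu += ioa.to_bytes(3, "little")          # information object address (3 octets)
    asdu += bytes([1 if value else 0])         # SIQ: bit 0 is the state, quality bits left at 0
    # Control field: sequence numbers are shifted left by one bit in I-format frames.
    control = struct.pack("<HH", (send_seq << 1) & 0xFFFF, (recv_seq << 1) & 0xFFFF)
    return bytes([START, len(control) + len(asdu)]) + control + asdu

def encode_sequence(steps, common_addr=1):
    """Turn (delay_ms, ioa, value) steps into (delay_ms, frame) pairs with rising send numbers."""
    return [
        (delay_ms, build_single_point_apdu(ioa, value, seq, 0, common_addr))
        for seq, (delay_ms, ioa, value) in enumerate(steps)
    ]

frames = encode_sequence([(0, 1001, True), (200, 1002, True)])
```

Each pair gives the delay at which the acceptance device would send the frame. The receive sequence number is fixed at 0 only because this sketch never processes frames coming back from the master station.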
5. The intelligent standardization method for substation monitoring information based on BERT-CRF and rule constraints according to claim 1, characterized in that step S6 specifically includes: Extract the preset expected event results from the test instances and automatically compare them with the event records actually generated by the master station; detect inconsistencies through matching algorithms and identify misconfiguration and omission issues in the event-based configuration. For the identified inconsistencies, analyze their corresponding signal configurations, rule logic, and timing relationships to determine whether the defect substantially affects the validity of the event; clarify whether the defect is a logical error, signal mapping deviation, or improper parameter setting, and locate the specific rule entry or signal point number. The results of defect location, type, and impact analysis are fed back to the rule template library and instantiation module, triggering the rule tuning process; based on expert experience or automatic correction strategies, updated rule instances or signal mapping relationships are generated to form a corrected dataset for re-verification. The test is re-executed in the simulation environment using the corrected dataset, and the event generation results of the updated master station are collected and subjected to a final comparison. The results of the initial test and the regression verification are combined to generate structured acceptance data.
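The comparison and regression bookkeeping in this claim might look like the following Python sketch. The case identifiers, the three defect labels and the record layout are assumptions made for illustration; locating the faulty rule entry or signal point would need information that is not modeled here.

```python
def compare(expected, actual):
    """Compare expected and actual events per test case and label each inconsistency."""
    defects = []
    for case_id, want in expected.items():
        got = actual.get(case_id)
        if want == got:
            continue
        if want is not None and got is None:
            kind = "omission"            # event should have been generated but was not
        elif want is None and got is not None:
            kind = "false generation"    # negative logic case produced an event
        else:
            kind = "mismatch"            # a different event was generated
        defects.append({"case": case_id, "expected": want, "actual": got, "type": kind})
    return defects

def acceptance_data(expected, initial_actual, regression_actual):
    """Combine the initial test and the regression run into one structured record."""
    initial = compare(expected, initial_actual)
    remaining = compare(expected, regression_actual)
    fixed = [d["case"] for d in initial if d["case"] not in {r["case"] for r in remaining}]
    return {"initial_defects": initial, "remaining_defects": remaining, "fixed_cases": fixed}

expected = {"pos-1": "line trip", "neg-1": None, "pos-2": "bus fault"}
initial_actual = {"pos-1": "line trip", "neg-1": "line trip"}      # pos-2 missing
regression_actual = {"pos-1": "line trip", "neg-1": None, "pos-2": "bus fault"}
report = acceptance_data(expected, initial_actual, regression_actual)
```

In this example the initial run reports one false generation and one omission, and both are cleared after the corrected dataset is re-tested.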
6. The intelligent standardization method for substation monitoring information based on BERT-CRF and rule constraints according to claim 1, characterized in that it further includes: S7. Feedback optimization closed loop: Update the event-based rule template library based on the configuration defect information in the acceptance data, and improve signal standardization and the feature mapping relationship through incremental learning of the BERT-CRF model and optimization of the semantic matching algorithm.
7. The intelligent standardization method for substation monitoring information based on BERT-CRF and rule constraints according to claim 6, characterized in that step S7 specifically includes: Based on the location and type of configuration defects identified in the structured acceptance data, the event-based rule template library is updated in a targeted manner, and the feature mapping relationship, calculation formula or parameter setting in the corresponding rule template is corrected to form an optimized version of the rule knowledge base. Defect samples related to semantic understanding and signal mapping in the acceptance data are converted into labeled data, and the BERT-CRF standardization model is incrementally trained to optimize its semantic representation and sequence labeling capabilities; at the same time, the feature weights and similarity calculation logic of the semantic matching algorithm are optimized to improve the accuracy of the mapping between rule features and standardized signals. Based on the updated rule template library and the optimized AI model, a new round of the standardization and acceptance process is executed to verify the effectiveness of defect repair and the overall performance improvement of the system.
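One way to picture the conversion of defect samples into labeled data is the Python sketch below, which turns a corrected description into token-level BIO tags of the kind commonly used to train sequence labeling models. The label names (`VOLT`, `BAY`, `SIG`), whitespace tokenization and the `to_bio` helper are assumptions; the patent text does not specify the labeling scheme.

```python
def to_bio(text, fields):
    """Tag whitespace tokens with B-/I- labels for each corrected field, O elsewhere."""
    tokens = text.split()
    tags = ["O"] * len(tokens)
    for label, value in fields.items():
        span = value.split()
        if not span:
            continue  # ignore empty field values
        for i in range(len(tokens) - len(span) + 1):
            if tokens[i:i + len(span)] == span:
                tags[i] = f"B-{label}"
                for j in range(i + 1, i + len(span)):
                    tags[j] = f"I-{label}"
                break  # tag only the first occurrence of the field
    return list(zip(tokens, tags))

# Hypothetical defect sample: the raw description plus the fields confirmed during acceptance.
sample = to_bio(
    "220kV Line 1 breaker open",
    {"VOLT": "220kV", "BAY": "Line 1", "SIG": "breaker open"},
)
```

Samples in this form could be appended to the fine-tuning set for an incremental training round. Chinese point table text would need character-level or subword tokenization instead of whitespace splitting.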
8. A substation monitoring information intelligent standardization system based on BERT-CRF and rule constraints, applied to the substation monitoring information intelligent standardization method based on BERT-CRF and rule constraints as described in any one of claims 1-7, characterized in that it includes:
The intelligent semantic standardization engine module, based on the domain-pre-trained BERT-CRF model, uses deep semantic understanding and sequence labeling, combined with rigid regular expression rules, to accurately convert unstructured original descriptions into standard formats and extract key features to form structured and standardized signal representations.
The hybrid rule template building module parses the rule definition file and uses a hybrid mode of keyword matching and AI inference to automatically establish the mapping relationship between abstract features and standardized signals, and intelligently completes the default parameters to generate a single event rule template.
The feature matching and trigger judgment module uses a semantic matching algorithm to screen reliable feature-signal correspondences. Combined with the specific configuration of the substation, it binds and instantiates the calculation formulas in the rules and verifies the triggering conditions of the event through calculation.
The positive and negative logic test instance generation module uses a signal keyword depth traversal algorithm to filter signals in a multi-dimensional associated signal pool and arrange their timing, generating a set of test instances containing positive and negative logic.
The IEC104 simulation test module simulates the monitoring information forwarding model based on the IEC104 protocol. It serializes test instances into standard communication messages, simulates station-end behavior to inject signals into the master station through a dedicated acceptance device, and collects the master station response results in real time, performing a fully automated simulation test from signal injection to result collection.
The automatic acceptance and defect analysis module automatically compares the actual event results generated by the master station with the test expectations from multiple dimensions, identifies inconsistencies, locates the type and location of configuration defects, and generates structured acceptance data and preliminary analysis reports containing detailed defect information.
The self-evolutionary closed-loop optimization module uses the defect information discovered during acceptance testing to optimize the rule template library in a targeted manner and drive the AI standardization model to perform incremental learning. Subsequently, a regression verification process is initiated to confirm the optimization effect.
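The combination of model output with rigid regular expression rules in the semantic standardization engine can be sketched in Python as follows. The target format `<voltage>kV <interval>/<equipment>/<signal>` is a hypothetical specification chosen for this example, as are the helper names; the actual standard format is not given in the claims.

```python
import re

# Hypothetical target format: "<voltage>kV <interval>/<equipment>/<signal>".
STANDARD = re.compile(
    r"^(?P<voltage>\d+(?:\.\d+)?kV) (?P<interval>[^/]+)/(?P<equipment>[^/]+)/(?P<signal>[^/]+)$"
)

def force_correct(text):
    """Apply rigid corrections: whitespace, separators, and the voltage unit spelling."""
    text = re.sub(r"\s+", " ", text.strip())          # collapse repeated whitespace
    text = re.sub(r"\s*/\s*", "/", text)              # no spaces around separators
    text = re.sub(r"(\d+(?:\.\d+)?)\s*kv\b", r"\1kV", text, flags=re.IGNORECASE)
    return text

def standardize(decoded):
    """Check the decoded description; correct it if needed and extract the key fields."""
    match = STANDARD.match(decoded)
    if match is None:
        match = STANDARD.match(force_correct(decoded))
    return match.groupdict() if match else None

fields = standardize("220 KV  Line 1 / circuit breaker /breaker open")
rejected = standardize("breaker open")
```

Text that still fails the check after correction returns `None`, so it can be routed to manual review rather than entering the structured signal set.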