Energy internet hidden abnormal data identification and repair method based on space-time logic
Hidden abnormal data are identified by constructing time-series and spatial correlation relationships and combining them with physical constraint operators, and a spatiotemporal attention compensation network generates repair values that conform to physical constraints. This addresses the identification and repair of hidden abnormal data in the energy internet and improves the reliability and decision-making accuracy of the system.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- BEIJING SMART CHINA ENERGY INTERNET RES INST CO
- Filing Date
- 2026-04-22
- Publication Date
- 2026-06-12
AI Technical Summary
Existing technologies struggle to accurately identify hidden anomalies and generate repair values that conform to physical constraints in the energy internet. Conventional methods may introduce biases, affecting system reliability and economy.
Time-series trend relationships and spatial correlations are constructed and combined with physical constraint operators to identify hidden abnormal data. A spatiotemporal attention compensation network then generates compensation values that conform to energy conservation and mass balance. The threshold and weights are adaptively adjusted, and a physical information loss function is embedded to optimize the compensation process.
It enables accurate identification and repair of hidden abnormal data, ensuring that the repaired data follows physical laws, improving the reliability and decision-making accuracy of the system, and reducing the impact of errors.
Figure CN122196837A_ABST
Abstract
Description
Technical Field
[0001] This invention relates to the field of energy internet data processing technology, and in particular to a method for identifying and repairing hidden abnormal data in the energy internet based on spatiotemporal logic.
Background Technology
[0002] In the field of the energy internet, anomaly detection and repair are crucial for ensuring the safe and stable operation of the system. Existing technologies typically rely on statistical methods or machine learning models to process the collected data. These conventional approaches mainly involve statistically analyzing data sequences, setting fixed thresholds to identify anomalies that deviate from the normal range, or using historical data to train predictive models and discovering anomalies by comparing the differences between predicted and actual measurements.
[0003] However, these conventional methods are significantly inadequate when dealing with the complex, high-dimensional spatiotemporal data in the energy internet. Data in the energy internet not only exhibits temporal continuity but also possesses close physical connections in space.
[0004] Furthermore, after detecting abnormal data, conventional repair or compensation strategies are often based on purely data-driven approaches, such as using interpolation or simple regression models to generate alternative values. The repair values generated by these strategies may only mathematically smooth the data curve, but cannot guarantee that they conform to the actual physical operating constraints of the system. Directly using such data, which violates physical laws, for subsequent system state analysis, load forecasting, or control decisions can introduce subtle biases, potentially misleading operators' judgments and even triggering cascading failures, severely impacting the reliability and economic efficiency of the entire energy internet.
[0005] Therefore, how to design a method that can deeply integrate spatiotemporal correlation and physical laws to more accurately identify hidden anomalies and generate repair values that conform to physical constraints has become a technical problem that urgently needs to be solved in this field.
Summary of the Invention
[0006] This invention provides a method for identifying and repairing hidden abnormal data in the energy internet based on spatiotemporal logic, which can solve the problems in the prior art.
[0007] A first aspect of this invention provides a method for identifying and repairing hidden anomaly data in the energy internet based on spatiotemporal logic, comprising:
[0008] The original data set is collected, and the time series trend relationship and spatial correlation relationship are constructed. The deviation between the current sampled value and the theoretical expected value is calculated through physical constraint operators. When the deviation exceeds the adaptive threshold, it is determined to be hidden abnormal data. Data combinations that violate the energy conservation or mass balance relationship are determined to be combinational logic abnormal data.
[0009] The process involves calling a spatiotemporal attention compensation network to generate compensation values for the hidden anomaly data and the combinational logic anomaly data, wherein the spatiotemporal attention compensation network embeds a physical information loss function to inject energy conservation and mass balance logic into the process of generating compensation values in real time.
[0010] A low-confidence flag is added to the data points corresponding to the compensation value. When training the prediction algorithm, the data points with the low-confidence flag are assigned a weight lower than that of normal data, and a physical logic penalty term is added to the loss function. The parameters of the spatiotemporal attention compensation network are adjusted according to the verification results of the prediction algorithm.
[0011] The deviation between the current sampled value and the theoretical expected value is calculated using a physical constraint operator. When the deviation exceeds an adaptive threshold, it is determined to be hidden abnormal data, including:
[0012] Historical data from the same period are extracted as a reference benchmark based on the time series trend relationship, and adjacent sensor data are extracted as a spatial reference based on the spatial correlation relationship. The theoretical expected value of the current sampled value is calculated by fusing the reference benchmark and the spatial reference through a physical constraint operator. The physical constraint operator transforms the physical inertial characteristics of the equipment into data change rate constraints and transforms the causal correlation of upstream and downstream parameters into coupling relationship constraints.
[0013] The deviation between the current sampled value and the theoretical expected value is calculated, and the adaptive threshold is dynamically adjusted based on the statistical distribution of the data within the sliding time window. The adaptive threshold is automatically adjusted according to the load level changes of the system operating conditions.
[0014] When the deviation exceeds the adaptive threshold, it is determined to be hidden abnormal data. The value of the hidden abnormal data is within the statistically normal range but violates the physical logic constraints defined by the physical constraint operator. The hidden abnormal data includes sensor systematic drift data and progressive offset data.
[0015] Data combinations that violate the energy conservation or mass balance relationship are determined to be combinational logic abnormal data, including:
[0016] Multiple parameters with physical relationships are extracted from the original dataset to form a data combination to be verified. The data combination to be verified includes node input and output parameter combinations and upstream and downstream relationship parameter combinations.
[0017] The network topology connection relationship in the spatial correlation relationship is transformed into a node energy conservation constraint equation, which expresses the balance that the energy injected into a node equals the energy flowing out plus the energy loss. The node flow transmission relationship in the spatial correlation relationship is transformed into a mass balance constraint equation, which expresses the conservation relationship that the mass flowing into a node equals the mass flowing out.
[0018] The combination of data to be verified is substituted into the node energy conservation constraint equation and the mass balance constraint equation for verification. When each parameter in the combination of data to be verified is within the normal range individually, but violates the balance relationship after being substituted into the constraint equation, the combination of data to be verified is determined to be combinational logic abnormal data.
[0019] The spatiotemporal attention compensation network is invoked to generate compensation values for the hidden anomaly data and the combinational logic anomaly data, including:
[0020] Based on the type of abnormal data, a spatiotemporal attention weight allocation strategy is selected. When the data to be compensated is the hidden abnormal data, the attention weight of the time dimension is increased, and historical periodic data is extracted from the time series trend relationship to calculate the time attention weight. When the data to be compensated is the combinational logic abnormal data, the attention weight of the spatial dimension is increased, and adjacent sensor data is extracted from the spatial correlation relationship to calculate the spatial attention weight.
[0021] A multi-head attention mechanism is used to construct short-term attention heads and long-term attention heads respectively. The short-term attention head captures instantaneous change features within a preset short-term time window, while the long-term attention head captures periodic pattern features of historical data from the same period. The outputs of the short-term and long-term attention heads are weighted by the time attention weight to obtain time-compensated features, and adjacent sensor data are weighted by the spatial attention weight to obtain spatial-compensated features. The time-compensated features and the spatial-compensated features are fused to generate a compensation value. During the training of the spatiotemporal attention compensation network, a physical information loss function is embedded to constrain the compensation value to satisfy the energy conservation and mass balance constraint equations.
[0022] A multi-head attention mechanism is used to construct short-term and long-term attention heads, including:
[0023] Multiple short-term attention heads are constructed to capture instantaneous change features at different time scales. Different short-term attention heads use preset short-term time windows of different lengths, ranging from minutes to hours. Multiple long-term attention heads are constructed to capture features of different cycle patterns. Different long-term attention heads extract historical data from the same period of the daily cycle, weekly cycle, and seasonal cycle.
[0024] A relevance score is calculated between the output of each attention head and the current data point to be compensated. The relevance score is calculated based on the degree of matching between the features extracted by the attention head and the historical evolution pattern of the data point to be compensated.
[0025] The outputs of each attention head are weighted and fused based on the relevance score. The relevance score is converted into the fusion weight of each attention head through normalization. The outputs of all short-term attention heads are weighted and fused to obtain short-term comprehensive features. The outputs of all long-term attention heads are weighted and fused to obtain long-term comprehensive features. The short-term comprehensive features and the long-term comprehensive features are weighted by the time attention weight to obtain time-compensated features.
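The weighted fusion of attention heads described above can be sketched as follows. This is a minimal illustration under stated assumptions: the source only says the relevance scores are normalized, so the use of softmax is an assumption, and the names `softmax`, `fuse_heads`, and `time_compensated` are hypothetical.

```python
import math

def softmax(scores):
    """Normalize relevance scores into fusion weights that sum to 1 (assumed normalization)."""
    m = max(scores)                               # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def fuse_heads(head_outputs, relevance_scores):
    """Weighted fusion of per-head feature vectors using normalized relevance scores."""
    weights = softmax(relevance_scores)
    dim = len(head_outputs[0])
    return [sum(w * h[i] for w, h in zip(weights, head_outputs)) for i in range(dim)]

def time_compensated(short_feat, long_feat, w_short, w_long):
    """Combine short-term and long-term comprehensive features with time attention weights."""
    return [w_short * s + w_long * l for s, l in zip(short_feat, long_feat)]

# Two short-term heads and two long-term heads with toy 2-dimensional outputs.
short = fuse_heads([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0])
long_ = fuse_heads([[2.0, 2.0], [4.0, 4.0]], [0.0, 0.0])
features = time_compensated(short, long_, w_short=0.5, w_long=0.5)
```

In a trained network the relevance scores and time attention weights would be learned; here they are fixed inputs so that the fusion arithmetic is visible.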
[0026] A low-confidence flag is added to the data points corresponding to the compensation value. When training the prediction algorithm, the data points with the low-confidence flag are assigned a weight lower than normal data, and a physical-logic penalty term is added to the loss function, including:
[0027] The confidence score of the compensation value is calculated based on the spatiotemporal data integrity and the physical constraint satisfaction during the compensation process. The spatiotemporal data integrity is calculated based on the availability of reference historical data and the redundancy of adjacent sensor data. The physical constraint satisfaction is calculated based on the residual obtained after substituting the compensation value into the node energy conservation constraint equation and the mass balance constraint equation. When the confidence score is lower than the preset confidence threshold, a low-confidence flag is added to the corresponding data point.
[0028] When training the prediction algorithm, the confidence scores of all data points are extracted. For data points with the low-confidence flag, the confidence scores are mapped to sample weights using the sigmoid function, so that a lower confidence score yields a lower sample weight, consistent with assigning flagged data points a weight lower than that of normal data.
[0029] A physical logic penalty term is added to the loss function of the prediction algorithm. The physical logic penalty term calculates the degree of violation between the prediction output of the prediction algorithm and the node energy conservation constraint equation and the mass balance constraint equation. When the prediction output violates the constraint equation, the loss value is increased.
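The sample weighting and the physical logic penalty term can be sketched as below. This is an illustration only, assuming that lower confidence leads to lower weight, that normal data keep a weight of 1, and that the penalty is the mean squared constraint residual. The threshold 0.6, the steepness 10, and the function names are hypothetical and not taken from the source.

```python
import math

def sample_weight(confidence, low_conf_threshold=0.6, steepness=10.0):
    """Weight 1.0 for normal points; sigmoid-mapped weight below 1 for low-confidence points."""
    if confidence >= low_conf_threshold:
        return 1.0
    return 1.0 / (1.0 + math.exp(-steepness * (confidence - low_conf_threshold)))

def penalized_loss(errors, weights, balance_residuals, penalty=1.0):
    """Weighted squared prediction error plus a penalty on constraint-equation residuals."""
    data_term = sum(w * e * e for w, e in zip(weights, errors)) / sum(weights)
    physics_term = sum(r * r for r in balance_residuals) / len(balance_residuals)
    return data_term + penalty * physics_term

confidences = [0.9, 0.5, 0.3]
weights = [sample_weight(c) for c in confidences]
loss = penalized_loss(errors=[0.2, 0.4, 0.1], weights=weights, balance_residuals=[0.0, 0.5])
```

The residuals passed to `penalized_loss` stand for the amounts by which the prediction output fails the node energy conservation and mass balance equations, so a prediction that satisfies both contributes no penalty.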
[0030] Adjusting the parameters of the spatiotemporal attention compensation network based on the verification results of the prediction algorithm includes:
[0031] The prediction accuracy of the prediction algorithm is evaluated using a validation dataset. The local prediction error for the time period containing the compensation value and the baseline prediction error for the time period containing only the original data are calculated respectively. When the local prediction error is greater than a preset multiple of the baseline prediction error, the compensation quality is determined to be insufficient.
[0032] The deviation distribution between the compensation value and its corresponding abnormal data value before replacement is calculated to identify temporal continuity deviation and spatial logical consistency deviation. The temporal continuity deviation is quantified by calculating the trend deviation between the compensation value and historical periodic data. The spatial logical consistency deviation is quantified by substituting the compensation value into the node energy conservation constraint equation and the mass balance constraint equation.
[0033] Based on the identified deviation type, the allocation ratio of temporal attention weights and spatial attention weights in the spatiotemporal attention compensation network is adjusted. When the proportion of temporal continuity deviation to total deviation exceeds a preset temporal threshold, the temporal dimension weight is increased. When the proportion of spatial logical consistency deviation to total deviation exceeds a preset spatial threshold, the spatial dimension weight is increased. The spatiotemporal attention compensation network is retrained and the verification and adjustment are iteratively performed until the local prediction error converges.
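One iteration of this closed-loop adjustment can be sketched as follows. The preset multiple (1.5), the two proportion thresholds (0.6), and the step size (0.1) are placeholder assumptions, and the function names are hypothetical. Retraining of the network is outside the scope of the sketch.

```python
def compensation_insufficient(local_error, baseline_error, multiple=1.5):
    """True when the error on compensated periods exceeds a preset multiple of the baseline."""
    return local_error > multiple * baseline_error

def adjust_attention_ratio(w_time, w_space, temporal_dev, spatial_dev,
                           temporal_threshold=0.6, spatial_threshold=0.6, step=0.1):
    """Raise the weight of the dimension whose deviation dominates, then renormalize."""
    total = temporal_dev + spatial_dev
    if total > 0:
        if temporal_dev / total > temporal_threshold:
            w_time += step
        if spatial_dev / total > spatial_threshold:
            w_space += step
    norm = w_time + w_space
    return w_time / norm, w_space / norm

w_time, w_space = 0.5, 0.5
if compensation_insufficient(local_error=3.0, baseline_error=1.0):
    # Temporal continuity deviation dominates in this toy case (8 of 10).
    w_time, w_space = adjust_attention_ratio(w_time, w_space, temporal_dev=8.0, spatial_dev=2.0)
```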
[0034] A second aspect of the present invention provides an electronic device, comprising:
[0035] A processor;
[0036] A memory for storing processor-executable instructions;
[0037] wherein the processor is configured to invoke the instructions stored in the memory to execute the aforementioned method.
[0038] A third aspect of the present invention provides a computer-readable storage medium having stored thereon computer program instructions that, when executed by a processor, implement the aforementioned method.
[0039] This method effectively identifies hidden anomalies in the energy internet, which are often difficult to detect using conventional threshold detection methods. By constructing time-series trend relationships and spatial correlations, and combining them with physical constraint operators to calculate deviations, it can accurately capture anomalies that deviate from theoretically expected values. The introduction of adaptive thresholds avoids the misjudgment or missed detection problems that may be caused by fixed thresholds, improving the flexibility of detection. For data combinations that violate energy conservation or mass balance relationships, the method can directly determine them as combinatorial logic anomalies, thereby identifying data where individual data points seem normal but the combination violates physical laws.
[0040] By using a spatiotemporal attention compensation network to generate compensation values for anomalous data, the system achieves data repair rather than simple removal. Core physical principles such as energy conservation and mass balance are injected into the compensation value generation process in real time, ensuring that the repaired data strictly adheres to physical laws and significantly improving the physical consistency and reliability of the repair results.
[0041] By adding low-confidence flags to the compensated data points and assigning them lower weights in subsequent prediction model training, the negative impact of potential errors in the repaired data on model performance is effectively reduced.
[0042] By using the verification results of the prediction algorithm to adjust the parameters of the spatiotemporal attention compensation network, a closed-loop optimization system is formed. This allows the anomaly detection and repair module to iterate and improve itself based on feedback from downstream tasks, thereby continuously improving the accuracy of detection and repair.
Attached Figure Description
[0043] Figure 1 is a flowchart of a method for identifying and repairing hidden anomaly data in the energy internet based on spatiotemporal logic.
[0044] Figure 2 is a flowchart of the process of adjusting the parameters of the spatiotemporal attention compensation network based on the validation results of the prediction algorithm.
Detailed Implementation
[0045] To make the objectives, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of the present invention, and not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.
[0046] The technical solution of the present invention will be described in detail below with reference to specific embodiments. These specific embodiments can be combined with each other, and the same or similar concepts or processes may not be described again in some embodiments.
[0047] Figure 1 is a flowchart of the method for identifying and repairing hidden anomaly data in the energy internet based on spatiotemporal logic according to an embodiment of the present invention. As shown in Figure 1, the method includes:
[0048] The original data set is collected, and the time series trend relationship and spatial correlation relationship are constructed. The deviation between the current sampled value and the theoretical expected value is calculated through physical constraint operators. When the deviation exceeds the adaptive threshold, it is determined to be hidden abnormal data. Data combinations that violate the energy conservation or mass balance relationship are determined to be combinational logic abnormal data.
[0049] The process involves calling a spatiotemporal attention compensation network to generate compensation values for the hidden anomaly data and the combinational logic anomaly data, wherein the spatiotemporal attention compensation network embeds a physical information loss function to inject energy conservation and mass balance logic into the process of generating compensation values in real time.
[0050] A low-confidence flag is added to the data points corresponding to the compensation value. When training the prediction algorithm, the data points with the low-confidence flag are assigned a weight lower than that of normal data, and a physical logic penalty term is added to the loss function.
[0051] The parameters of the spatiotemporal attention compensation network are adjusted based on the verification results of the prediction algorithm.
[0052] In one optional implementation, the deviation between the current sampled value and the theoretical expected value is calculated using a physical constraint operator. When the deviation exceeds an adaptive threshold, it is determined to be hidden abnormal data, including:
[0053] Collect raw data sets and construct time-series trend relationships and spatial correlations;
[0054] Historical data from the same period are extracted as a reference benchmark based on the time series trend relationship, and adjacent sensor data are extracted as a spatial reference based on the spatial correlation relationship. The theoretical expected value of the current sampled value is calculated by fusing the reference benchmark and the spatial reference through a physical constraint operator. The physical constraint operator transforms the physical inertial characteristics of the equipment into data change rate constraints and transforms the causal correlation of upstream and downstream parameters into coupling relationship constraints.
[0055] The deviation between the current sampled value and the theoretical expected value is calculated, and the adaptive threshold is dynamically adjusted based on the statistical distribution of the data within the sliding time window. The adaptive threshold is automatically adjusted according to the load level changes of the system operating conditions.
[0056] When the deviation exceeds the adaptive threshold, it is determined to be hidden abnormal data. The value of the hidden abnormal data is within the statistically normal range but violates the physical logic constraints defined by the physical constraint operator. The hidden abnormal data includes sensor systematic drift data and progressive offset data.
[0057] For example, in the actual operation of the energy internet, the data acquisition system needs to continuously monitor the operating parameters of various types of equipment, such as distributed photovoltaic power stations, energy storage systems, and smart distribution networks. The collection of raw data sets covers physical quantities such as voltage, current, power, temperature, and flow rate, with sampling frequencies ranging from 1 second to 15 minutes depending on the equipment characteristics. For instance, in a regional integrated energy system, the output power of 50 photovoltaic inverters, the charging and discharging current of 30 energy storage converters, and the voltage phasor data of 20 key nodes are collected simultaneously, forming a multi-dimensional data set containing timestamps, equipment identifiers, measurement point numbers, numerical values, and quality codes.
[0058] The construction of time series trend relationships relies on in-depth mining of historical data. For photovoltaic power generation data, output curves at the same time within the past 30 days are extracted, and the median curve is calculated after removing extreme weather days as a typical daily trend template. For the state-of-charge data of energy storage systems, a gradual trend model based on charge-discharge cycles is constructed to capture the decay law of battery capacity with the number of cycles. The time series analysis adopts a seasonal decomposition method, splitting the original series into a trend term, a periodic term, and a residual term. The trend term reflects the long-term evolution of equipment performance, the periodic term reflects daily cycle and seasonal changes, and the residual term is used for benchmark noise level assessment for anomaly detection.
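The typical daily trend template can be illustrated with a small sketch. It assumes each historical day is a list of equally spaced samples and that extreme-weather days have already been identified by some other means. The function name `daily_template` and the `excluded_days` parameter are hypothetical.

```python
from statistics import median

def daily_template(days, excluded_days=()):
    """Pointwise median curve over historical days, skipping excluded (extreme-weather) days."""
    excluded = set(excluded_days)
    kept = [curve for idx, curve in enumerate(days) if idx not in excluded]
    if not kept:
        raise ValueError("no days left after exclusion")
    n = len(kept[0])
    if any(len(curve) != n for curve in kept):
        raise ValueError("all daily curves must have the same length")
    return [median(curve[i] for curve in kept) for i in range(n)]

# Four toy days with four samples each; day index 2 is treated as an extreme-weather day.
history = [
    [0.0, 2.0, 4.0, 2.0],
    [0.0, 3.0, 5.0, 3.0],
    [0.0, 0.5, 1.0, 0.5],
    [0.0, 2.5, 4.5, 2.5],
]
template = daily_template(history, excluded_days=[2])
```

The median is used rather than the mean so that a few unusual days that were not excluded have limited influence on the template.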
[0059] Establishing spatial correlations requires utilizing the topology and physical coupling characteristics of energy systems. In distribution network scenarios, the voltage amplitudes of adjacent nodes on the same feeder are strongly correlated, and voltage drop relationships can be established using line impedance parameters. For combined cooling, heating, and power (CCHP) systems, there is a proportional relationship between gas input flow, power generation, and hot water flow determined by the rated parameters of the equipment. The spatial correlation matrix is constructed using a combination of mutual information and Pearson correlation coefficient. The former captures nonlinear correlations, while the latter quantifies the degree of linear dependence. Strong correlation edges are established for measurement point pairs with an absolute correlation coefficient greater than 0.7 and a mutual information value exceeding 0.5.
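The edge selection rule can be sketched as follows. The source does not state how mutual information is estimated or in which unit the 0.5 threshold is expressed, so the equal-width histogram estimator, the choice of 4 bins, and the use of bits are all assumptions. The function names are hypothetical.

```python
import math
from collections import Counter

def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)

def _bin(values, bins):
    """Assign each value to one of `bins` equal-width bins."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0
    return [min(int((v - lo) / width), bins - 1) for v in values]

def mutual_information(x, y, bins=4):
    """Histogram estimate of mutual information in bits (assumed estimator)."""
    bx, by = _bin(x, bins), _bin(y, bins)
    n = len(x)
    pxy, px, py = Counter(zip(bx, by)), Counter(bx), Counter(by)
    return sum((c / n) * math.log2(c * n / (px[i] * py[j])) for (i, j), c in pxy.items())

def strong_edge(x, y, r_min=0.7, mi_min=0.5):
    """Edge rule from the text: |Pearson r| > 0.7 and mutual information > 0.5."""
    return abs(pearson(x, y)) > r_min and mutual_information(x, y) > mi_min

v_node_a = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
v_node_b = [2 * v + 1 for v in v_node_a]             # linearly coupled measurement point
unrelated = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
edges = {"a-b": strong_edge(v_node_a, v_node_b), "a-u": strong_edge(v_node_a, unrelated)}
```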
[0060] The design of the physical constraint operator integrates the dynamic response characteristics of the equipment and system-level balance constraints. For the power output data of a gas turbine, the physical inertia of the equipment means that the power change rate cannot exceed 5% of the rated capacity per minute. This is translated into a data change rate constraint between adjacent sampling points: $|P_t - P_{t-1}| / \Delta t \le 0.05\, P_{\mathrm{rated}}$, where $P_t$ is the power at the current moment, $P_{t-1}$ is the power at the previous moment, $\Delta t$ is the sampling interval in minutes, and $P_{\mathrm{rated}}$ is the rated power of the equipment. The causal relationship between upstream and downstream parameters is illustrated by a heat pump system, in which the compressor power consumption $W$, the heating capacity $Q_h$, the evaporator-side temperature $T_e$, and the condensation-side temperature $T_c$ are subject to a coupling constraint based on the Carnot cycle, $Q_h / W \le T_c / (T_c - T_e)$ with temperatures in kelvin. In actual calculations, a comprehensive performance coefficient correction term is introduced.
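The two constraints in this paragraph can be checked with a few lines of code. This is a sketch assuming power values in consistent units, a sampling interval in minutes, and temperatures in kelvin. The Carnot check uses the ideal upper bound on the heating coefficient of performance and omits the correction term, and the function names are hypothetical.

```python
def violates_ramp_limit(p_now, p_prev, dt_minutes, p_rated, max_rate_per_min=0.05):
    """True if the change between adjacent samples exceeds 5% of rated power per minute."""
    rate = abs(p_now - p_prev) / dt_minutes        # power change per minute
    return rate > max_rate_per_min * p_rated

def violates_carnot_bound(q_heat, w_compressor, t_evap_k, t_cond_k):
    """True if the implied heating COP exceeds the ideal Carnot limit (kelvin inputs)."""
    carnot_cop = t_cond_k / (t_cond_k - t_evap_k)
    return q_heat / w_compressor > carnot_cop

# Gas turbine rated at 100 MW sampled once per minute: a 7 MW jump breaks the 5 MW/min limit.
ramp_flag = violates_ramp_limit(p_now=47.0, p_prev=40.0, dt_minutes=1.0, p_rated=100.0)

# Heat pump between 0 degC and 50 degC: a reported COP of 7.5 exceeds the Carnot limit.
carnot_flag = violates_carnot_bound(q_heat=30.0, w_compressor=4.0, t_evap_k=273.15, t_cond_k=323.15)
```

Both readings could individually sit inside a statistically normal range, which is why the check is expressed on the relationship between values rather than on each value alone.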
[0061] The theoretical expected value is calculated using a multi-source information fusion strategy. Historical data from the same period are extracted using a time alignment algorithm. For photovoltaic power data at 10:00 AM on a given day, valid data within a 30-minute window before and after the same time on each of the past 15 days are extracted. Abnormal days with irradiance deviating from the normal range by more than 20% are removed, and the remaining data are assigned an exponential decay weight based on time distance, $w_d = e^{-\lambda d}$, where $d$ is the number of days from the current date and $\lambda = 0.1$ is the attenuation coefficient. A weighted average then gives the reference benchmark. The spatial reference from adjacent sensor data is obtained through a topology propagation algorithm. For a node voltage $V_i$ that is missing or suspected of being abnormal, a spatial estimate is computed from the voltage $V_j$ of an adjacent node and the line parameters: $\hat{V}_i = V_j - P_i (R + X \tan\varphi) / V_j$, where $R$ and $X$ are the line resistance and reactance, $P_i$ is the node active power, and $\varphi$ is the power factor angle.
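A minimal sketch of the two reference calculations follows. The exponential weight uses the attenuation coefficient 0.1 from the text. The voltage estimate assumes the common approximate voltage-drop relation (active power times resistance plus reactive power times reactance, divided by voltage) with the adjacent node upstream, which may differ in detail from the formula in the original filing. Function names and units (V, W, ohm, rad) are assumptions.

```python
import math

def reference_benchmark(samples, decay=0.1):
    """Exponentially weighted average of (days_ago, value) pairs; recent days weigh more."""
    weights = [math.exp(-decay * days_ago) for days_ago, _ in samples]
    return sum(w * v for w, (_, v) in zip(weights, samples)) / sum(weights)

def estimate_voltage(v_adjacent, p_active, r_line, x_line, phi):
    """Approximate node voltage from an upstream neighbour using the line voltage drop."""
    return v_adjacent - p_active * (r_line + x_line * math.tan(phi)) / v_adjacent

# Same-time PV power on earlier days (days ago, kW); abnormal days are assumed already removed.
benchmark = reference_benchmark([(1, 10.0), (2, 11.0), (10, 20.0)])

# 10 kV neighbour, 1 MW load at unity power factor, 0.5 ohm line resistance.
v_hat = estimate_voltage(v_adjacent=10000.0, p_active=1.0e6, r_line=0.5, x_line=0.5, phi=0.0)
```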
[0062] When the physical constraint operator fuses the reference benchmark and the spatial reference, a constrained optimization framework is used to solve for the theoretical expected value. Taking the output power of a power generation device as an example, the fused estimate is $\hat{P}_t = \alpha P_t^{\mathrm{ref}} + \beta P_t^{\mathrm{sp}}$, where $P_t^{\mathrm{ref}}$ is the reference benchmark, $P_t^{\mathrm{sp}}$ is the spatial reference, and $\alpha$ and $\beta$ are dynamic weighting coefficients. The coefficients are solved by minimizing the fusion error subject to $\alpha + \beta = 1$ and $|\hat{P}_t - P_{t-1}| \le k\, P_{\mathrm{rated}}\, \Delta t$, where $k$ is a rate-of-change coefficient specific to the device. For the state of charge (SOC) data of an energy storage system, the theoretical expected value must simultaneously satisfy the energy balance constraint $SOC_t = SOC_{t-1} + \eta I \Delta t / C$ and the voltage-SOC mapping constraint, where $\eta$ is the charging efficiency, $I$ is the charging current, and $C$ is the battery capacity.
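The fusion step can be approximated with a simple sketch. It assumes the two weights sum to 1 and that the rate-of-change constraint is enforced by clipping the fused value to the interval reachable from the previous sample, which is the least-squares projection onto that constraint. The SOC update assumes current in amperes, time in hours, capacity in ampere-hours, and SOC as a fraction. All function and parameter names are hypothetical.

```python
def fuse_expected_power(p_ref, p_spatial, alpha, p_prev, p_rated, k, dt_minutes):
    """Weighted fusion of the two references, clipped to the device ramp limit."""
    beta = 1.0 - alpha                               # assumed: weights sum to 1
    fused = alpha * p_ref + beta * p_spatial
    max_step = k * p_rated * dt_minutes              # largest physically allowed change
    return min(max(fused, p_prev - max_step), p_prev + max_step)

def expected_soc(soc_prev, current_a, dt_hours, capacity_ah, efficiency):
    """Energy balance (coulomb counting) update of the state of charge during charging."""
    return soc_prev + efficiency * current_a * dt_hours / capacity_ah

# References of 50 and 60 fuse to 55, but the ramp limit only allows 45 from a previous 40.
p_expected = fuse_expected_power(p_ref=50.0, p_spatial=60.0, alpha=0.5,
                                 p_prev=40.0, p_rated=100.0, k=0.05, dt_minutes=1.0)

# Charging at 20 A for half an hour into a 100 Ah battery at 95% efficiency.
soc_expected = expected_soc(soc_prev=0.5, current_a=20.0, dt_hours=0.5,
                            capacity_ah=100.0, efficiency=0.95)
```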
[0063] The adaptive threshold is dynamically adjusted based on the statistical characteristics of the data within a sliding time window. Valid data from the past 72 hours are selected as the sliding window, and the mean $\mu$ and standard deviation $\sigma$ of the deviation sequence are calculated. The initial threshold $\theta_0$ is set from $\mu$ and $\sigma$. To account for changes in system operating conditions, a load level correction factor $\gamma$ is introduced. When the current load factor is below 30%, $\gamma$ takes a value that relaxes the threshold to tolerate low-load fluctuations; when the load factor is between 30% and 70%, $\gamma$ takes an intermediate value; when the load factor exceeds 70%, $\gamma$ takes a value that tightens the threshold to improve detection sensitivity. The final adaptive threshold is $\theta = \gamma\, \theta_0$. For data with obvious diurnal cycle characteristics, thresholds are calculated separately for each time period, with different statistical benchmarks used for low-load periods at night and high-load periods during the day.
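The threshold logic can be sketched as follows. The numeric values are not recoverable from the text, so the multiplier `k = 3` and the correction factors 1.2, 1.0, and 0.8 are placeholder assumptions chosen only to show the direction of the adjustment (relaxed at low load, tightened at high load). The function names are hypothetical.

```python
from statistics import mean, pstdev

def load_correction(load_factor, low=1.2, mid=1.0, high=0.8):
    """Placeholder correction factors: relax below 30% load, tighten above 70% load."""
    if load_factor < 0.30:
        return low
    if load_factor <= 0.70:
        return mid
    return high

def adaptive_threshold(deviations, load_factor, k=3.0):
    """Initial threshold from window statistics, scaled by the load-level correction."""
    mu, sigma = mean(deviations), pstdev(deviations)
    return load_correction(load_factor) * (mu + k * sigma)

# Deviation sequence from the sliding window (toy values).
window = [1.0, 1.0, 3.0, 3.0]
thresholds = {load: adaptive_threshold(window, load) for load in (0.2, 0.5, 0.9)}
is_hidden_anomaly = 5.5 > thresholds[0.9]
```

For data with a strong diurnal cycle, the same function would be called with separate windows for night and day periods.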
[0064] Identifying hidden anomalies requires simultaneous verification of statistical characteristics and physical constraints. For systematic sensor drift, a typical manifestation is a slow shift of 2 degrees Celsius in temperature sensor readings over a week. Although the deviation at a single sampling point is only 0.3 degrees Celsius and falls within the normal fluctuation range, the cumulative shift violates the seasonal trend constraint of ambient temperature. The detection method calculates the deviation of the 7-day moving average from the historical baseline for the same period. When the deviation of the moving average exceeds a preset multiple of $\sigma_s$ for five consecutive days, the data are marked as systematic drift, where $\sigma_s$ is the seasonal standard deviation. Progressive offset data are commonly found in the accumulation of metering errors in electricity meters, and manifest as a linearly increasing trend in the deviation between the daily accumulated electricity consumption and the theoretically calculated value. Linear regression is performed on the deviation sequence, and sequences whose slope significance test yields a p-value less than 0.05 and whose slope exceeds 1% of the daily average are identified as progressive anomalies. The numerical range of these hidden anomalies conforms to statistical norms. However, they violate the balance between the cumulative quantities of upstream and downstream measuring points defined by the law of conservation of energy, or the operating boundary constraints defined by the equipment nameplate parameters. Therefore, physical constraint operators are necessary for effective identification.
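The two detection rules can be sketched with the standard library. The multiple of the seasonal standard deviation is not recoverable from the text, so `k = 2` is a placeholder assumption. The slope significance test is omitted because it needs a t-distribution, so only the slope magnitude criterion is shown. Function names are hypothetical.

```python
def has_systematic_drift(ma_deviation, sigma_seasonal, k=2.0, run_length=5):
    """True if the moving-average deviation exceeds k * sigma for run_length consecutive days."""
    run = 0
    for d in ma_deviation:
        run = run + 1 if abs(d) > k * sigma_seasonal else 0
        if run >= run_length:
            return True
    return False

def regression_slope(values):
    """Ordinary least-squares slope of values against their index (one sample per day)."""
    n = len(values)
    mx = (n - 1) / 2
    my = sum(values) / n
    sxx = sum((i - mx) ** 2 for i in range(n))
    sxy = sum((i - mx) * (v - my) for i, v in enumerate(values))
    return sxy / sxx

def slope_exceeds_limit(deviations, daily_average, ratio=0.01):
    """Magnitude criterion from the text: slope above 1% of the daily average."""
    return regression_slope(deviations) > ratio * daily_average

# Daily deviation of the 7-day moving average from the seasonal baseline (degrees Celsius).
drift = has_systematic_drift([0.1, 1.2, 1.3, 1.4, 1.5, 1.6], sigma_seasonal=0.5)

# Meter deviation growing by 0.5 kWh per day against a daily average of 40 kWh.
progressive = slope_exceeds_limit([0.0, 0.5, 1.0, 1.5], daily_average=40.0)
```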
[0065] In one optional implementation, data combinations that violate the energy conservation or mass balance relationship are determined to be combinational logic abnormal data, including:
[0066] Multiple parameters with physical relationships are extracted from the original dataset to form a data combination to be verified. The data combination to be verified includes node input and output parameter combinations and upstream and downstream relationship parameter combinations.
[0067] The network topology connection relationship in the spatial correlation relationship is transformed into a node energy conservation constraint equation, which expresses the balance that the energy injected into a node equals the energy flowing out plus the energy loss. The node flow transmission relationship in the spatial correlation relationship is transformed into a mass balance constraint equation, which expresses the conservation relationship that the mass flowing into a node equals the mass flowing out.
[0068] The combination of data to be verified is substituted into the node energy conservation constraint equation and the mass balance constraint equation for verification. When each parameter in the combination of data to be verified is within the normal range individually, but violates the balance relationship after being substituted into the constraint equation, the combination of data to be verified is determined to be combinational logic abnormal data.
[0069] For example, deep data validation requires identifying and extracting multiple parameter combinations with inherent physical relationships from the original dataset. Unlike the isolated verification of single-point data, verification of these combinations focuses on the physical constraints that multiple measurements should jointly satisfy. Specifically, the data combinations to be validated take two typical forms: node input-output parameter combinations and upstream-downstream correlation parameter combinations. Node input-output parameter combinations typically involve the inflow and outflow flow, power, or mass data at a specific energy node, such as the combination of input and output current at a substation in a power grid, or the combination of inlet and outlet temperature and flow at a heat exchange station in a heating network. Upstream-downstream correlation parameter combinations focus on the data correspondence between adjacent or related nodes along the energy transmission path, such as the consistency between the flow rate at an upstream measuring point and the flow rate at a downstream measuring point in a pipeline without branches, or the correspondence between the outlet power of a generator unit and the injected power of the connected bus in a power grid. By establishing these parameter combination mappings, dispersed measurement point data can be incorporated into a unified physical logic framework for validation.
[0070] To transform spatial relationships into computable constraints, it is necessary to establish node energy conservation constraint equations using network topology connections. In the energy internet, every node should satisfy the law of energy conservation, meaning the total energy injected into a node should equal the sum of the total energy flowing out of that node and the energy lost within the node. Taking a power grid node as an example, the injected power includes generator input and external power supply, while the outflowing power includes load consumption and power transmitted to adjacent nodes. Energy losses are manifested as line resistance losses, transformer iron losses, and copper losses. For a heat pipe network node, the injected energy is the heat power supplied by the heat source, the outflowing energy includes user-side heat load and heat power transmitted downstream, and the energy losses are the heat dissipation losses in the pipeline. The general form of the node energy conservation constraint equation can be expressed as $\sum_i E_{in,i} = \sum_j E_{out,j} + E_{loss}$, where $E_{in,i}$ represents the energy injected through the $i$-th path, $E_{out,j}$ represents the energy flowing out through the $j$-th path, and $E_{loss}$ represents the energy loss at the node. Establishing this equation requires clarifying the connections between each node based on the network topology diagram, identifying all energy input and output branches connected to that node, and determining the calculation method for energy loss based on the device characteristics. For complex nodes, there may be multiple inputs and outputs; in this case, it is necessary to enumerate all branches one by one and calculate the cumulative energy loss.
[0071] Simultaneously, the flow transmission relationships at nodes need to be transformed into mass balance constraint equations. Mass balance is a fundamental physical law of fluid transport systems. For incompressible fluids or steady-state gases, the inflow mass flow rate at a node should strictly equal the outflow mass flow rate. In a natural gas pipeline network, the gas mass flow rate delivered from the upstream pipeline of a distribution node should equal the sum of the mass flow rates allocated by that node to all downstream users; without considering leakage, there should be no mass accumulation or deficit. In a water supply network, the total inflow and the total outflow (summed over all branches) at a pumping station or pressure regulating valve should remain balanced. The typical form of the mass balance constraint equation is $\sum_i m_{in,i} = \sum_j m_{out,j}$, where $m_{in,i}$ indicates the mass flow rate of the $i$-th inflow path and $m_{out,j}$ indicates the mass flow rate of the $j$-th outflow path. For gaseous systems, the mass flow rate can be obtained by multiplying the volumetric flow rate by the density, and the effects of temperature and pressure on the density need to be considered. For liquid systems, the density change is usually small and can be considered constant; in this case, the volumetric flow rate balance can reflect the mass balance. In practical applications, the consistency of measurement units must also be considered to ensure that all flow parameters are calculated using a uniform dimension.
[0072] After establishing the constraint equations, the data to be verified is substituted into them for logical verification. This process first requires confirming whether the value of each parameter in the data combination is within a reasonable physical range. For example, the power and current values of an energy node should conform to the rated parameter range of the equipment, the temperature should be within the allowable operating range, and the flow rate should be non-negative. The range check of a single parameter can be quickly completed using upper and lower threshold values; anomalies of this kind are explicit and easily identified by traditional methods. However, even if all parameters are normal individually, substituting them into the physical constraint equations may still expose hidden logical contradictions. Taking a distribution node in an integrated energy system as an example, the measured input power is [value missing], the output power is [value missing], and the power loss is [value missing]. Individually, all three values are within the normal range, but when they are substituted into the energy conservation equation (input power equals output power plus power loss), an imbalance is found between the left and right sides, with the difference reaching [value missing]. If this difference exceeds the allowable measurement error, the data combination is determined to have a logical anomaly.
[0073] The specific verification and judgment logic is as follows: calculate the difference between the left-hand side and the right-hand side of the constraint equation, and denote it as the residual. For the energy conservation constraint, the residual is $r_E = \sum_i E_{in,i} - \sum_j E_{out,j} - E_{loss}$ (injected energy minus outflowing energy minus energy loss). For the mass balance constraint, the residual is $r_m = \sum_i m_{in,i} - \sum_j m_{out,j}$ (inflow mass flow rate minus outflow mass flow rate). Ideally, the residual should be zero, but considering measurement error, computational accuracy, and modeling simplification, in practice the residual is allowed to fluctuate within a certain tolerance range. Define the residual tolerance thresholds $\varepsilon_E$ and $\varepsilon_m$. When $|r_E| > \varepsilon_E$ or $|r_m| > \varepsilon_m$, the data combination to be verified violates the physical balance relationship and is marked as combinational logic abnormal data. The setting of the thresholds needs to comprehensively consider the accuracy level of the measuring instrument, the data sampling frequency, and the stability of the system operating conditions. A threshold can usually be taken as the combined uncertainty of the measurement parameters involved, or a reasonable range can be determined through statistical analysis of historical normal data.
[0074] Furthermore, for complex multi-node networks, a systematic approach is needed to traverse all nodes for combinational logic verification. First, a node list is generated based on the network topology. For each node, all associated measurement parameters are extracted to construct the data combination to be verified for that node. Then, the energy conservation constraint equation and mass balance constraint equation corresponding to that node are substituted sequentially for calculation, and the residual value for each node is recorded. For nodes with multiple constraint equations (e.g., simultaneously having electrical energy balance and thermal energy balance), the residuals of each constraint need to be calculated separately. If any constraint is violated, the data combination for that node is determined to be abnormal. During the determination process, time synchronization must also be considered to ensure that the data of each parameter used for combinational verification is collected at the same or sufficiently close timestamps to avoid false logical conflicts caused by time differences. For parameters with inconsistent sampling frequencies, interpolation or resampling methods can be used to align the data to a unified time reference.
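The residual test and node traversal described above can be sketched as follows. This is an illustrative sketch under stated assumptions: the node names, the dictionary layout, the numeric readings, and the tolerance values are all hypothetical, and timestamp alignment is assumed to have been done beforehand.

```python
def energy_residual(e_in, e_out, e_loss):
    """Injected energy minus outflowing energy minus loss; zero when balanced."""
    return sum(e_in) - sum(e_out) - e_loss

def mass_residual(m_in, m_out):
    """Inflow mass flow rate minus outflow mass flow rate; zero when balanced."""
    return sum(m_in) - sum(m_out)

def check_nodes(nodes, eps_energy, eps_mass):
    """Traverse all nodes and flag those whose residual exceeds a tolerance.

    `nodes` maps a node name to its time-aligned measurements (assumed layout).
    A node is abnormal if ANY of its constraints is violated.
    """
    results = {}
    for name, m in nodes.items():
        r_e = energy_residual(m["e_in"], m["e_out"], m["e_loss"])
        r_m = mass_residual(m["m_in"], m["m_out"])
        results[name] = {
            "r_energy": r_e,
            "r_mass": r_m,
            "abnormal": abs(r_e) > eps_energy or abs(r_m) > eps_mass,
        }
    return results

# Hypothetical readings: every value is individually plausible,
# but node_A does not close its energy balance.
nodes = {
    "node_A": {"e_in": [100.0], "e_out": [60.0, 30.0], "e_loss": 5.0,
               "m_in": [10.0], "m_out": [4.0, 6.0]},
    "node_B": {"e_in": [100.0], "e_out": [60.0, 35.0], "e_loss": 5.0,
               "m_in": [10.0], "m_out": [4.0, 6.0]},
}
for name, result in check_nodes(nodes, eps_energy=2.0, eps_mass=0.5).items():
    print(name, result)
```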
[0075] Identifying combinational logic anomalies is crucial for improving data quality. Compared to single-point threshold checks, combinational logic verification can uncover more subtle data quality issues. For example, multiple sensors may simultaneously exhibit slight drifts without exceeding limits when checked individually, or data from different measurement points may be mixed up due to data transmission misalignment. These anomalies are difficult to detect without physical constraint verification but can severely impact subsequent applications such as state estimation and optimization scheduling. By introducing combinational logic verification based on energy conservation and mass balance during the data preprocessing stage, quality control can be performed before data enters the business system, preventing erroneous data from contaminating the analysis model and improving the overall system reliability and decision accuracy. Simultaneously, data combinations marked as combinational logic anomalies will be passed to the subsequent compensation and repair stage. A spatiotemporal attention compensation network will regenerate repair values that conform to physical constraints, ensuring the inherent consistency of the dataset.
[0076] In one optional implementation, a spatiotemporal attention compensation network is invoked to generate compensation values for the hidden anomaly data and the combinational logic anomaly data, including:
[0077] Based on the type of abnormal data, a spatiotemporal attention weight allocation strategy is selected. When the data to be compensated is the hidden abnormal data, the attention weight of the time dimension is increased, and historical periodic data is extracted from the time series trend relationship to calculate the time attention weight. When the data to be compensated is the combinational logic abnormal data, the attention weight of the spatial dimension is increased, and adjacent sensor data is extracted from the spatial correlation relationship to calculate the spatial attention weight.
[0078] A multi-head attention mechanism is used to construct short-term attention heads and long-term attention heads respectively. The short-term attention head captures instantaneous change features within a preset short-term time window, while the long-term attention head captures periodic pattern features of historical data from the same period. The outputs of the short-term and long-term attention heads are weighted by the time attention weight to obtain time-compensated features, and adjacent sensor data are weighted by the spatial attention weight to obtain spatial-compensated features. The time-compensated features and the spatial-compensated features are fused to generate a compensation value. During the training of the spatiotemporal attention compensation network, a physical information loss function is embedded to constrain the compensation value to satisfy the energy conservation and mass balance constraint equations.
[0079] For example, during the operation of the energy internet, for identified hidden anomaly data and combinational logic anomaly data, a spatiotemporal attention compensation network is needed to generate accurate compensation values. The core of this compensation network lies in dynamically adjusting the spatiotemporal attention weight allocation strategy according to the type of anomaly data, thereby selectively mining effective compensation information.
[0080] Before generating compensation values, the data to be compensated is first classified. If a data point is identified as a hidden anomaly, the data deviates from historical trends in the time series but remains consistent with adjacent nodes in its spatial relationships. In this case, the time dimension attention weight $w_t$ is increased to the range of 0.7 to 0.85, while the spatial dimension attention weight $w_s$ is correspondingly reduced to the range of 0.15 to 0.3, and the two satisfy the normalization condition $w_t + w_s = 1$. The time series trend relationship corresponding to this data point is extracted, and historical periodic data from the past 7 to 30 days is selected. For each historical data point $k$, the time attention weight $\alpha_k$ measures the similarity between that historical data point and the current time. Specifically, the calculation takes the cosine similarity between the auxiliary feature vector (operating conditions, ambient temperature, load level, and so on) at the time of the historical data point and the feature vector at the current time. The higher the similarity, the greater the corresponding weight. The weights of all historical data points are normalized so that their sum is 1.
[0081] If the data to be compensated is determined to be combinational logic abnormal data, the data point violates the energy conservation or mass balance relationship with spatially adjacent sensor data, but may exhibit normal continuity in the time series. In this case, the spatial dimension attention weight $w_s$ is increased to the range of 0.7 to 0.85, and the time dimension attention weight $w_t$ is reduced to the range of 0.15 to 0.3. The data of neighboring sensors physically connected to the data point is extracted from the spatial correlation relationship. For a node in a power system, neighboring sensors include upstream power supply nodes, downstream load nodes, and parallel nodes on the same bus. The spatial attention weight $\beta_j$ corresponding to the data of each neighboring sensor $j$ is determined by the physical coupling strength between that sensor and the data point to be compensated. For example, the coupling strength can be characterized by physical parameters such as electrical distance, pipe flow resistance, and thermal conductivity. The tighter the coupling, the higher the weight.
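The similarity-based time attention weights described above can be sketched as follows. This is a minimal sketch: the source states only that weights grow with cosine similarity and sum to 1, so dividing by the total similarity and clipping negative similarities to zero are assumptions of this example, as are the function names and the toy feature vectors.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def time_attention_weights(current_features, history_features):
    """One weight per historical point, proportional to similarity, summing to 1."""
    # Assumption: negative similarities contribute no weight.
    sims = [max(cosine_similarity(current_features, h), 0.0) for h in history_features]
    total = sum(sims)
    if total == 0.0:
        # Fall back to uniform weights when nothing is similar.
        return [1.0 / len(history_features)] * len(history_features)
    return [s / total for s in sims]

# Hypothetical auxiliary features, e.g. (load level, ambient temperature), pre-scaled.
weights = time_attention_weights([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
print(weights)
```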
[0082] After the spatiotemporal attention weight allocation strategy is determined, a multi-head attention mechanism is used to construct short-term and long-term attention heads respectively. The short-term attention head uses a preset short-term time window of 15 minutes to 2 hours, covering several sampling points before the current moment. For a system with a sampling interval of 1 minute, the short-term window contains 15 to 120 historical data points. The short-term attention head captures the instantaneous change characteristics of the data within the window through self-attention: the data vector $x_t$ at each moment within the window is mapped to a query vector $Q$, a key vector $K$, and a value vector $V$, and the similarity score between the query and the key is calculated as $QK^{\top}/\sqrt{d_k}$, where $d_k$ is the vector dimension. The attention weights are obtained by softmax normalization of the scores, and the final short-term attention head output is $H_{short} = \mathrm{softmax}(QK^{\top}/\sqrt{d_k})\,V$. This output captures data fluctuation patterns caused by instantaneous events such as load changes and equipment switching.
[0083] Long-term attention heads extract periodic pattern features from historical data from the same period. They select data from the time period corresponding to the current moment within a certain number of past cycles. For example, for load data with obvious daily cycle characteristics, they extract data points from the same hour of each day over the past 7 to 14 days; for weekly cycle characteristics, they extract data points from the same day of the week and the same hour over the past 4 to 8 weeks. Long-term attention heads likewise use a self-attention mechanism to calculate the correlation among the historical data vectors from the same period, which yields the long-term attention head output $H_{long}$. This output reflects the periodic patterns of electricity consumption habits and production rhythms.
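The self-attention calculation described above follows the standard scaled dot-product form, which can be sketched in plain Python as follows. This sketch assumes the query, key, and value vectors have already been produced by learned projections (not shown), and the toy numbers are hypothetical.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def scaled_dot_product_attention(queries, keys, values):
    """softmax(Q K^T / sqrt(d_k)) V for small lists of vectors."""
    d_k = len(keys[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in keys]
        weights = softmax(scores)
        # Weighted sum of the value vectors, component by component.
        outputs.append([
            sum(w * v[c] for w, v in zip(weights, values))
            for c in range(len(values[0]))
        ])
    return outputs

# Toy window of two time steps with 2-dimensional keys and 1-dimensional values.
out = scaled_dot_product_attention([[0.0, 0.0]], [[1.0, 0.0], [0.0, 1.0]], [[2.0], [4.0]])
print(out)
```

A zero query scores every key equally, so the output is the plain average of the values; a query aligned with one key shifts the weight toward that key's value.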
[0084] The short-term attention output $H_{short}$ and the long-term attention output $H_{long}$ are fused by the time attention weights to obtain the time-compensated feature $F_t = \lambda_s H_{short} + \lambda_l H_{long}$, where $\lambda_s$ and $\lambda_l$ are learnable fusion coefficients satisfying $\lambda_s + \lambda_l = 1$. For data points with drastic fluctuations, $\lambda_s$ is adaptively increased to strengthen the capture of instantaneous features; during steady-state operation, $\lambda_l$ is increased to take advantage of periodic patterns.
[0085] At the same time, in the spatial dimension, the current sampling data $x_j$ of each adjacent sensor $j$ is weighted by the spatial attention weight $\beta_j$ to obtain the spatial compensation feature $F_s = \sum_j \beta_j x_j$. This feature aggregates real-time measurements of physically connected nodes, implicitly containing spatial constraint information on energy flow or matter transport.
[0086] The time compensation feature $F_t$ and the spatial compensation feature $F_s$ are then fused to generate the final compensation value $\hat{y}$. The fusion process employs a gating mechanism to adaptively adjust the contribution of the two types of features. The gating coefficient is calculated as $g = \sigma(W_g [F_t ; F_s] + b_g)$, where $[\,\cdot\,;\,\cdot\,]$ indicates feature concatenation, $W_g$ and $b_g$ are learnable parameters, and $\sigma$ is the sigmoid activation function, so that $g$ ranges from 0 to 1. The compensation value is calculated as $\hat{y} = g \odot F_t + (1 - g) \odot F_s$. This formula ensures that $g$ increases when the time features are more reliable and decreases when the spatial features are more reliable.
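The gated fusion described above can be sketched as follows. This is a minimal sketch of a sigmoid gate over the concatenated features; the parameter shapes, the function names, and the hand-picked weights are assumptions for illustration, whereas in the described network the gate parameters would be learned.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def gated_fusion(f_time, f_space, gate_weights, gate_bias):
    """Blend temporal and spatial features with a per-component sigmoid gate.

    f_time, f_space: feature vectors of equal length d.
    gate_weights: d rows, each of length 2*d (applied to the concatenation).
    gate_bias: d bias terms.
    """
    concat = f_time + f_space  # list concatenation: [F_t ; F_s]
    gate = [
        sigmoid(sum(w * x for w, x in zip(row, concat)) + b)
        for row, b in zip(gate_weights, gate_bias)
    ]
    # A gate near 1 trusts the temporal feature, near 0 the spatial feature.
    fused = [g * t + (1.0 - g) * s for g, t, s in zip(gate, f_time, f_space)]
    return fused, gate

# Hypothetical one-dimensional features with a neutral gate (zero weights and bias).
fused, gate = gated_fusion([2.0], [4.0], [[0.0, 0.0]], [0.0])
print(fused, gate)
```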
[0087] During the training phase of the spatiotemporal attention compensation network, an embedded physical information loss function is used to constrain the compensation value, ensuring that the generated compensation value satisfies the energy conservation and mass balance constraint equations. For a node in the energy internet, the energy conservation constraint requires that the difference between the inflow power and the outflow power equal the node's injected power. Let the compensation value of node $i$ be $\hat{y}_i$, its upstream node set be $\mathcal{U}_i$, and its downstream node set be $\mathcal{D}_i$. Then the constraint equation is $\sum_{u \in \mathcal{U}_i} P_{u,i} - \sum_{d \in \mathcal{D}_i} P_{i,d} = P_i$, evaluated with $\hat{y}_i$ in place of the repaired measurement, where $P_{u,i}$ is the power flowing in from upstream node $u$, $P_{i,d}$ is the power flowing out to downstream node $d$, and $P_i$ is the injected power of node $i$. The physical information loss function consists of two parts: the data fitting loss $L_{data}$, which measures the deviation between the compensated value and the true value, and the physical constraint loss $L_{phys}$, which penalizes the degree of violation of the energy conservation or mass balance equations. The balance coefficient $\lambda$ ranges from 0.1 to 1.0 and is adjusted so that physical interpretability is enhanced while data fitting accuracy is maintained. The total loss function is $L = L_{data} + \lambda L_{phys}$. The network parameters are trained by optimizing the total loss using the backpropagation algorithm. During training, the physical consistency of the compensation values is periodically evaluated on the validation set. If the physical constraint violation rate exceeds a set threshold of 3% to 5%, $\lambda$ is increased to strengthen the constraint; if the data fitting error is too large, resulting in insufficient compensation accuracy, $\lambda$ is appropriately reduced. This dynamic adjustment mechanism integrates the data-driven and physics-driven approaches, ultimately generating a high-quality compensation value that conforms to both historical statistical patterns and physical laws.
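The combined loss and the adjustment of the balance coefficient described above can be sketched as follows. This is an illustrative sketch only: the mean-squared form of both loss terms, the step size, and the fitting-error limit are assumptions not stated in the source, and the symbol `lam` stands for the balance coefficient.

```python
def total_loss(predictions, targets, constraint_residuals, lam):
    """Data fitting loss plus lam times the physical constraint loss.

    Both terms use a mean-squared form here, which is an assumption.
    constraint_residuals holds the balance-equation residual of each sample.
    """
    n = len(predictions)
    data_loss = sum((p - t) ** 2 for p, t in zip(predictions, targets)) / n
    phys_loss = sum(r ** 2 for r in constraint_residuals) / len(constraint_residuals)
    return data_loss + lam * phys_loss

def adjust_balance(lam, violation_rate, fit_error,
                   max_violation=0.05, max_fit_error=1.0, step=0.1):
    """Raise lam when constraints are violated too often, lower it when fitting suffers.

    lam is kept within the 0.1 to 1.0 range given in the text;
    step and max_fit_error are hypothetical tuning values.
    """
    if violation_rate > max_violation:
        return min(1.0, lam + step)
    if fit_error > max_fit_error:
        return max(0.1, lam - step)
    return lam

print(total_loss([1.0, 2.0], [1.0, 4.0], [0.0, 2.0], lam=0.5))
print(adjust_balance(0.5, violation_rate=0.08, fit_error=0.2))
```

Checking the violation rate first gives physical consistency priority over fitting accuracy when both limits are exceeded; the source does not specify this ordering.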
[0088] In one optional implementation, a multi-head attention mechanism is used to construct short-term attention heads and long-term attention heads separately, including:
[0089] Multiple short-term attention heads are constructed to capture instantaneous change features at different time scales. Different short-term attention heads use preset short-term time windows of different lengths, ranging from minutes to hours.
[0090] Multiple long-term attention heads are constructed to capture features of different cycle patterns, and different long-term attention heads extract historical data from the same period of the day, week, and season respectively.
[0091] Calculate the correlation score between the output of each attention head and the current data point to be compensated. The correlation score is calculated based on the matching degree between the features extracted by the attention head and the historical evolution pattern of the data point to be compensated.
[0092] The outputs of each attention head are weighted and fused based on the relevance score. The relevance score is converted into the fusion weight of each attention head through normalization. The outputs of all short-term attention heads are weighted and fused to obtain short-term comprehensive features. The outputs of all long-term attention heads are weighted and fused to obtain long-term comprehensive features. The short-term comprehensive features and the long-term comprehensive features are weighted by the time attention weight to obtain time-compensated features.
[0093] For example, in the spatiotemporal attention compensation network, the temporal dimension feature extraction adopts a multi-head attention mechanism, which constructs short-term attention heads and long-term attention heads to capture the changing patterns at different time scales. This hierarchical design can simultaneously focus on the instantaneous fluctuation characteristics and long-term evolution patterns of the data, thereby improving the accuracy of identifying and compensating for hidden abnormal data.
[0094] The design of short-term attention heads targets the rapidly changing characteristics of the energy internet, such as sudden load changes and instantaneous phenomena like power jumps caused by equipment start-up and shutdown. Multiple short-term attention heads are constructed, each using a preset short-term time window of different lengths. Specifically, the first short-term attention head uses a 15-minute time window to capture minute-level rapid fluctuations, suitable for identifying anomalies caused by sudden load increases or instantaneous equipment failures; the second short-term attention head uses a 1-hour time window to capture hourly trends, suitable for analyzing the gradual change process of electricity load; and the third short-term attention head uses a 4-hour time window to capture transitional characteristics across time periods. The length of different short-term time windows ranges from 5 minutes to 6 hours, configured according to the data sampling frequency and business requirements in the actual application scenario. For power data with a sampling interval of 5 minutes, the short-term time window can contain 3, 12, or 48 sampling points, corresponding to time spans of 15 minutes, 1 hour, and 4 hours, respectively.
[0095] The design of long-term attention heads targets the periodic patterns in energy data, including daily, weekly, and seasonal cycles. The first long-term attention head extracts daily cycle features by constructing daily cyclical patterns from historical data at the same time point. For example, for the current data point to be compensated, data from the same time point over the past 7, 14, and 21 days is extracted as reference samples to capture the repetitive characteristics of the daily load curve. The second long-term attention head extracts weekly cycle features, targeting the differences between working days and non-working days by extracting historical data from the same day of the week. For example, for Monday data points, data from the past 4, 8, and 12 Mondays is extracted to capture fixed weekly load patterns. The third long-term attention head extracts seasonal cycle features, targeting changes in energy consumption patterns caused by seasonal climate variation by extracting historical data from the same season or month. For example, for data points during the hot summer months, data from the same period last year is extracted as a reference to capture seasonal electricity consumption characteristics.
[0096] In the specific implementation of the attention head, each attention head contains a module for calculating the query vector, key vector, and value vector. For the data point to be compensated, its temporal features are first encoded into a query vector $q_i$, where the subscript $i$ indicates the $i$-th attention head. Each historical data point within the time window is encoded as a key vector $k_j$ and a value vector $v_j$, where the subscript $j$ indicates the $j$-th historical point within the time window. The original attention score is obtained by calculating the dot product of the query vector and the key vector, $s_{ij} = q_i^{\top} k_j$. The attention weights are then obtained through scaling and softmax normalization, $a_{ij} = \mathrm{softmax}_j(s_{ij}/\sqrt{d})$, where $d$ is the vector dimension. The final output of this attention head is $h_i = \sum_j a_{ij} v_j$, which represents the weighted aggregation of the features of all historical points within the time window.
[0097] When calculating the correlation score between the output of each attention head and the current data point to be compensated, a feature matching metric is used. For the output $h^{s}_i$ of each short-term attention head, the instantaneous change characteristics it contains are extracted, including statistics such as rate of change, fluctuation amplitude, and direction of change, and compared with the evolution pattern of the data point to be compensated in the historical time series. Specifically, the first-order difference sequence of the data point to be compensated over a previous period is calculated to obtain the historical rate-of-change pattern, and the cosine similarity between this pattern and the short-term attention head output features is taken as the correlation score $r^{s}_i$. For the output $h^{l}_i$ of each long-term attention head, the periodic features are extracted and their frequency domain components are analyzed using the Fourier transform. The periodic features of the time period containing the data point to be compensated are matched with the historical periodic patterns captured by the long-term attention head, and the Pearson correlation coefficient of the frequency domain features is taken as the correlation score $r^{l}_i$.
[0098] When weighting and fusing the outputs of each attention head based on relevance scores, the relevance scores are first converted into fusion weights through a normalization operation. For all short-term attention heads, the relevance scores $r^{s}_i$ are normalized using a softmax function to obtain the fusion weight of each short-term attention head, $w^{s}_i = \exp(r^{s}_i) / \sum_{k=1}^{N_s} \exp(r^{s}_k)$, where $N_s$ represents the total number of short-term attention heads. The outputs $h^{s}_i$ of all short-term attention heads are weighted and fused to obtain the short-term comprehensive feature $H_{short} = \sum_{i=1}^{N_s} w^{s}_i h^{s}_i$. Similarly, for all long-term attention heads, the fusion weights are calculated as $w^{l}_i = \exp(r^{l}_i) / \sum_{k=1}^{N_l} \exp(r^{l}_k)$, where $N_l$ is the total number of long-term attention heads, and the outputs $h^{l}_i$ are weighted and fused to obtain the long-term comprehensive feature $H_{long} = \sum_{i=1}^{N_l} w^{l}_i h^{l}_i$.
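The conversion of relevance scores into fusion weights and the weighted fusion of head outputs can be sketched as follows. This is a minimal sketch: the head outputs and scores are hypothetical numbers, and the same function would be applied once to the short-term heads and once to the long-term heads.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def fuse_heads(head_outputs, relevance_scores):
    """Softmax the relevance scores, then take the weighted sum of head outputs."""
    weights = softmax(relevance_scores)
    dim = len(head_outputs[0])
    fused = [
        sum(w * h[c] for w, h in zip(weights, head_outputs))
        for c in range(dim)
    ]
    return fused, weights

# Three hypothetical short-term heads (e.g. 15-minute, 1-hour, 4-hour windows).
outputs = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
scores = [0.2, 0.9, 0.4]  # assumed relevance scores
fused, weights = fuse_heads(outputs, scores)
print(fused, weights)
```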
[0099] When finally fusing the short-term and long-term comprehensive features, time attention weights $\lambda_s$ and $\lambda_l$ are introduced, satisfying $\lambda_s + \lambda_l = 1$. The time attention weights are dynamically adjusted based on the time context of the current data point. For example, during periods of rapid load change, the weight $\lambda_s$ of the short-term features is increased; during stable operation, the weight $\lambda_l$ of the long-term features is increased. The time compensation feature is calculated as $F_t = \lambda_s H_{short} + \lambda_l H_{long}$, where $H_{short}$ and $H_{long}$ are the short-term and long-term comprehensive features. The time attention weights are determined through an adaptive mechanism, which calculates the variance $\sigma^2$ of the data point to be compensated within the previous time window. A larger $\sigma^2$ indicates drastic data fluctuations, and $\lambda_s$ and $\lambda_l$ are set to [values missing]; a smaller $\sigma^2$ indicates stable data, and $\lambda_s$ and $\lambda_l$ are set to [values missing].
[0100] In practical applications, for a hidden anomalous data point detected at 10:15 AM, the short-term attention heads extract a 30-minute window of data from 9:45 AM to 10:15 AM, a 1-hour window of data from 9:15 AM to 10:15 AM, and a 4-hour window of data from 6:15 AM to 10:15 AM, respectively capturing the instantaneous changes in the last half hour, the gradual trend in the last hour, and the overall evolution of the morning peak period. The long-term attention heads extract the 10:15 AM data points from the past 7 days to construct a daily cycle reference, the data points from the same time of day in the past 4 weeks to construct a weekly cycle reference, and the data points from the same time of day in the same month of last year to construct a seasonal cycle reference. In this example, the correlation scoring shows that the anomalous data point has the highest correlation with the short-term features of the 1-hour time window, followed by the long-term features of the daily cycle. These two attention heads are therefore given high weights during fusion, and the resulting time compensation feature reflects the normal value range of the data point, providing a reliable time dimension basis for the subsequent generation of compensation values.
[0101] In one optional implementation, a low-confidence flag is added to the data points corresponding to the compensation value. When training the prediction algorithm, the data points with the low-confidence flag are assigned a weight lower than that of normal data, and a physical-logic penalty term is added to the loss function, including:
[0102] The confidence score of the compensation value is calculated based on the spatiotemporal data integrity and physical constraint satisfaction during the compensation process. The spatiotemporal data integrity is calculated based on the availability of reference historical data and the redundancy of adjacent sensor data. The physical constraint satisfaction is calculated based on the residual after substituting the compensation value into the node energy conservation constraint equation and the mass balance constraint equation. When the confidence score is lower than the preset confidence threshold, a low confidence marker is added to the corresponding data point.
[0103] When training the prediction algorithm, the confidence scores of all data points are extracted. For data points with the low confidence marker, the confidence scores are mapped to sample weights using the sigmoid function. The sample weights increase with the confidence scores, so that a lower confidence score yields a smaller sample weight.
[0104] A physical logic penalty term is added to the loss function of the prediction algorithm. The physical logic penalty term calculates the degree of violation between the prediction output of the prediction algorithm and the node energy conservation constraint equation and the mass balance constraint equation. When the prediction output violates the constraint equation, the loss value is increased.
[0105] For example, after obtaining the compensation values, the reliability of these repaired data needs to be assessed. The confidence score of a compensation value is determined by two dimensions: spatiotemporal data completeness and physical constraint satisfaction. Spatiotemporal data completeness reflects the sufficiency of contextual information available when generating the compensation value, and is quantified by the availability of reference historical data and the redundancy of adjacent sensor data. For a data point that needs repair, the number of historical data records within its corresponding time window is first retrieved, denoted as $n_{hist}$. Simultaneously, the number of valid data points of spatially adjacent nodes at the same time is counted, denoted as $n_{nbr}$. The spatiotemporal data completeness $C_{st}$ is a weighted combination of the two, expressed as $C_{st} = \omega_1 \, n_{hist}/N_{hist} + \omega_2 \, n_{nbr}/N_{nbr}$, where $N_{hist}$ is the ideal number of historical records that should exist, $N_{nbr}$ is the total number of spatial neighbors of this node, and $\omega_1$ and $\omega_2$ are weighting coefficients satisfying $\omega_1 + \omega_2 = 1$. When historical data or spatial neighbor data is severely lacking, $C_{st}$ decreases significantly, reflecting the weakened information base on which the compensation process depends.
[0106] The physical constraint satisfaction assesses the extent to which the compensation value conforms to the fundamental physical laws of the energy system. The compensation value is substituted into the node energy conservation constraint equation, and the difference between the sum of energy inputs and the sum of energy outputs at the node is calculated as the energy residual. For distribution network nodes, the energy conservation constraint requires that the sum of the power flowing into the node minus the sum of the power flowing out equal the node load, expressed as $\sum P_{in} - \sum P_{out} = P_{load}$, so that the energy residual is $r_E = \sum P_{in} - \sum P_{out} - P_{load}$. Ideally $r_E$ should be close to zero, but deviations within the measurement error range are allowed in actual calculations. Similarly, the compensation value is substituted into the mass balance constraint equation. For a gas pipeline node, mass balance requires that the difference between the inflow mass flow rate and the outflow mass flow rate equal the node's consumption; the deviation from this balance is denoted as the mass residual $r_m$. The physical constraint satisfaction $S_{phys}$ is calculated using the normalized reciprocal of the residual, expressed in the form $S_{phys} = 1/(1 + \gamma\,|r|)$, where $r$ is the residual and $\gamma$ is a scaling factor that maps the residual to a reasonable numerical range. A larger residual indicates that the compensation value deviates more from physical laws, and $S_{phys}$ is correspondingly reduced.
[0107] Based on the above two dimensions, the final confidence score $C$ of the compensation value is calculated as $C = \omega_1 C_{st} + \omega_2 S_{phy}$, where $C_{st}$ is the spatiotemporal data completeness, $S_{phy}$ is the physical constraint satisfaction, and $\omega_1$ and $\omega_2$ are the weighting coefficients for each dimension. The confidence score ranges from 0 to 1; the closer the value is to 1, the more reliable the compensation value. A preset confidence threshold $\theta$ is set, typically between 0.6 and 0.8, with the specific value determined by the data quality requirements of the application scenario. When $C < \theta$, a low-confidence flag is added to the data point. The flag is implemented as a Boolean field added to the data record, where True indicates low confidence and False indicates normal confidence. The low-confidence flag allows subsequent algorithms to distinguish between data of different quality, so that questionable repaired data is not treated the same as the original reliable data.
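The two-dimension confidence score described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names, the default weights of 0.5, the clipping of ratios at 1, the reciprocal form `1 / (1 + scale * residual)`, and the threshold of 0.7 are all assumptions chosen to match the verbal description.

```python
def completeness(n_hist, n_hist_ideal, n_valid_neighbors, n_neighbors,
                 alpha=0.5, beta=0.5):
    """Weighted mix of historical availability and neighbor redundancy.

    alpha + beta is assumed to equal 1; ratios are clipped to [0, 1].
    """
    hist_ratio = min(n_hist / n_hist_ideal, 1.0) if n_hist_ideal > 0 else 0.0
    neigh_ratio = min(n_valid_neighbors / n_neighbors, 1.0) if n_neighbors > 0 else 0.0
    return alpha * hist_ratio + beta * neigh_ratio


def constraint_satisfaction(energy_residual, mass_residual, scale=1.0):
    """Normalized reciprocal of the residuals (assumed form): 1 at zero residual."""
    return 1.0 / (1.0 + scale * (abs(energy_residual) + abs(mass_residual)))


def confidence_score(c_st, s_phy, w_st=0.5, w_phy=0.5):
    """Weighted combination of the two dimensions, in [0, 1] when weights sum to 1."""
    return w_st * c_st + w_phy * s_phy


def annotate(record, score, threshold=0.7):
    """Return a copy of the record with the score and a Boolean low-confidence flag."""
    flagged = dict(record)
    flagged["confidence"] = score
    flagged["low_confidence"] = score < threshold
    return flagged


c_st = completeness(n_hist=12, n_hist_ideal=24, n_valid_neighbors=3, n_neighbors=4)
s_phy = constraint_satisfaction(energy_residual=0.5, mass_residual=0.5)
record = annotate({"node": "N7", "value": 41.2}, confidence_score(c_st, s_phy))
```

With half of the ideal history and three of four neighbors available, and a combined residual of 1.0, the score falls below the assumed threshold and the record is flagged.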
[0108] When a prediction algorithm is trained on the repaired dataset, the confidence differences between data points must be taken into account. Before training begins, the entire dataset is traversed and the confidence score of each data point is extracted. Data points with a low-confidence flag cannot simply be excluded, because they may still contain useful information despite their lower quality; discarding them entirely would reduce the sample size and waste information. A more reasonable approach is to adjust the influence weight of each data point during training according to its confidence score. Specifically, the confidence score $C$ is mapped to a sample weight $w$ using the sigmoid function: $w = \frac{1}{1 + e^{-k (C - \theta)}}$, where $\theta$ is the preset confidence threshold and the parameter $k$ controls the steepness of the mapping curve. This mapping makes the sample weight increase monotonically with the confidence score; that is, the lower the confidence score, the smaller the corresponding sample weight. When the confidence score is exactly equal to the threshold $\theta$, the sample weight is 0.5, meaning the influence of the data point is halved. For data points with confidence scores far below the threshold, the sample weight is close to 0 but not exactly zero, so a weak training contribution is retained.
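A small sketch of the sigmoid mapping from confidence score to sample weight follows. The function name, the default threshold of 0.7, and the steepness of 10 are illustrative assumptions; the text only states that a sigmoid is used, that the weight is about 0.5 at the threshold, and that it approaches but never reaches 0 far below it.

```python
import math


def sample_weight(confidence, threshold=0.7, steepness=10.0):
    """Map a confidence score to a training weight with a sigmoid centered at the threshold."""
    return 1.0 / (1.0 + math.exp(-steepness * (confidence - threshold)))


weights = [sample_weight(c) for c in (0.2, 0.7, 1.0)]
```

A larger `steepness` makes the transition around the threshold sharper, which pushes the weights of clearly reliable points closer to 1 and those of clearly unreliable points closer to 0.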
[0109] The calculated sample weights are applied to the training process of the prediction algorithm. In a gradient descent-based training framework, each sample weight is multiplied directly by the loss value of the corresponding data point. Assume that the predicted output of the prediction algorithm for the $i$-th data point is $\hat{y}_i$ and the real label is $y_i$. If the base loss function is the mean squared error, the weighted loss is calculated as: $L_w = \frac{1}{N} \sum_{i=1}^{N} w_i (\hat{y}_i - y_i)^2$, where $N$ is the total number of samples and $w_i$ is the weight of the $i$-th sample. For data points with normal confidence, $w_i$ is close to 1, preserving their original contribution to the loss function; for low-confidence data points, $w_i$ is significantly less than 1, reducing their impact on gradient updates. This weighting mechanism lets the optimization of model parameters rely more on high-quality data and reduces the noise introduced by low-quality data, thereby improving the generalization performance of the prediction algorithm.
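The confidence-weighted mean squared error can be illustrated in a few lines. This sketch assumes normalization by the sample count rather than by the sum of the weights, which the description does not specify; `weighted_mse` is a hypothetical name.

```python
def weighted_mse(predictions, labels, weights):
    """Mean of per-sample squared errors, each scaled by its sample weight."""
    if not (len(predictions) == len(labels) == len(weights)):
        raise ValueError("predictions, labels and weights must have equal length")
    n = len(predictions)
    return sum(w * (p - y) ** 2 for p, y, w in zip(predictions, labels, weights)) / n


# The second sample has the larger error but a halved weight.
loss = weighted_mse([1.0, 2.0], [0.0, 0.0], [1.0, 0.5])
```

In this example the low-weight sample contributes 2.0 instead of 4.0 to the sum, so its pull on the gradient is halved.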
[0110] Sample weighting alone is not enough to constrain the prediction algorithm; a physical logic penalty term must also be explicitly introduced into the loss function. The design goal of the physical logic penalty term is to ensure that the output of the prediction algorithm conforms to the fundamental physical laws of the energy system and does not violate constraints such as energy conservation and mass balance, even under data-driven optimization. To calculate the penalty term, the output of the prediction algorithm is first substituted into the node energy conservation constraint equations and the mass balance constraint equations to check whether the predicted values satisfy these constraints. For energy conservation constraints, the predicted power values of each branch are substituted into the node power balance equations to calculate the degree of violation $V_E = \max\left(0,\ \left| \sum \hat{P}_{in} - \sum \hat{P}_{out} - P_{load} \right| - \epsilon_E\right)$, where $\epsilon_E$ is the allowed energy balance error tolerance. The $\max(0, \cdot)$ operation ensures that only violations exceeding the tolerance incur a penalty; deviations smaller than the tolerance are treated as acceptable measurement error.
[0111] For mass balance constraints, the predicted flow rates are likewise substituted into the constraint equations to calculate the degree of violation $V_M = \max\left(0,\ \left| \hat{\dot{m}}_{in} - \hat{\dot{m}}_{out} - m_c \right| - \epsilon_M\right)$, where $\hat{\dot{m}}_{in}$ and $\hat{\dot{m}}_{out}$ are the predicted inflow and outflow mass flows, respectively, $m_c$ is the node consumption, and $\epsilon_M$ is the mass balance error tolerance. The physical logic penalty term $L_{phy}$ aggregates the degree of violation over all nodes and is expressed as: $L_{phy} = \sum_{j=1}^{N_n} \left( \eta_E V_{E,j}^2 + \eta_M V_{M,j}^2 \right)$, where $N_n$ is the total number of system nodes, and $\eta_E$ and $\eta_M$ are the weighting coefficients of the two penalty terms. The squared form amplifies the penalty for serious violations, prompting the model to prioritize eliminating large physical-logic deviations.
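The tolerance-gated, squared penalty described in the two preceding paragraphs might look like the sketch below. The dictionary keys, tolerances, and coefficient names are hypothetical, and the hinge form `max(0, |residual| - tolerance)` is an assumption consistent with the statement that only violations beyond the tolerance are penalized.

```python
def hinge_violation(residual, tolerance):
    """Violation beyond the tolerance; zero when the residual is within tolerance."""
    return max(0.0, abs(residual) - tolerance)


def physics_penalty(nodes, tol_energy=0.01, tol_mass=0.01,
                    coef_energy=1.0, coef_mass=1.0):
    """Sum of squared energy and mass violations over all nodes."""
    total = 0.0
    for node in nodes:
        # Energy: inflow power minus outflow power should equal the node load.
        e_res = sum(node["p_in"]) - sum(node["p_out"]) - node["load"]
        # Mass: inflow minus outflow should equal the node consumption.
        m_res = node["m_in"] - node["m_out"] - node["consumption"]
        total += coef_energy * hinge_violation(e_res, tol_energy) ** 2
        total += coef_mass * hinge_violation(m_res, tol_mass) ** 2
    return total


nodes = [{
    "p_in": [5.0, 3.0], "p_out": [2.0], "load": 6.0,   # energy balanced
    "m_in": 2.0, "m_out": 1.0, "consumption": 0.5,     # mass residual 0.5
}]
penalty = physics_penalty(nodes, tol_mass=0.1)
```

In a training loop this value would be added to the data loss with its own coefficient; how the two are balanced is not specified in the description.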
[0112] In one optional implementation, adjusting the parameters of the spatiotemporal attention compensation network based on the verification results of the prediction algorithm includes:
[0113] The prediction accuracy of the prediction algorithm is evaluated using a validation dataset. The local prediction error for the time period containing the compensation value and the baseline prediction error for the time period containing only the original data are calculated respectively. When the local prediction error is greater than a preset multiple of the baseline prediction error, the compensation quality is determined to be insufficient.
[0114] The deviation distribution between the compensation value and its corresponding abnormal data value before replacement is calculated to identify temporal continuity deviation and spatial logical consistency deviation. The temporal continuity deviation is quantified by calculating the trend deviation between the compensation value and historical periodic data. The spatial logical consistency deviation is quantified by substituting the compensation value into the node energy conservation constraint equation and the mass balance constraint equation.
[0115] Based on the identified deviation type, the allocation ratio of temporal attention weights and spatial attention weights in the spatiotemporal attention compensation network is adjusted. When the proportion of temporal continuity deviation to total deviation exceeds a preset temporal threshold, the temporal dimension weight is increased. When the proportion of spatial logical consistency deviation to total deviation exceeds a preset spatial threshold, the spatial dimension weight is increased. The spatiotemporal attention compensation network is retrained and the verification and adjustment are iteratively performed until the local prediction error converges.
[0116] For example, with reference to Figure 2, which shows the flowchart for adjusting the parameters of the spatiotemporal attention compensation network based on the verification results of the prediction algorithm: after the spatiotemporal attention compensation network has repaired the hidden abnormal data and the combinational logic abnormal data, a verification feedback mechanism needs to be established to evaluate the compensation effect and dynamically adjust the network parameters based on the evaluation results, so as to improve the repair quality. This process is carried out by constructing a validation dataset, quantifying the compensation error, identifying the sources of deviation, and optimizing the attention weight allocation accordingly.
[0117] A validation dataset is extracted from historical operating data that is independent of the training process. The validation dataset must contain raw data segments with known operating states and no repairs, as well as data segments containing compensation values produced by the spatiotemporal attention compensation network. The trained prediction algorithm is applied to each of these two types of data segments, and a prediction accuracy index is calculated for each. For time periods containing only raw data, the root mean square error between the predicted and actual values is calculated as the baseline prediction error. For time periods that include compensation values, the root mean square error between the predicted and actual values is calculated in the same way as the local prediction error. The actual values here are the real measurements at subsequent times in the validation set for that time period, and are used to verify whether the compensated data can support accurate future predictions.
[0118] When the local prediction error $E_{local}$ and the baseline prediction error $E_{base}$ satisfy $E_{local} > \kappa \cdot E_{base}$, where $\kappa$ is the preset multiple, typically between 1.2 and 1.5, the compensation quality is determined to be insufficient. This criterion rests on the following logic: if the compensation values accurately reflect the true physical state, the prediction accuracy based on the compensated data should be comparable to, or even better than, that based on the original data, because the compensation process removes the interference of abnormal data; if the compensation introduces new biases, subsequent prediction performance will decline. This verification method avoids the difficulty of directly comparing the compensation value with the unknown true value, and instead evaluates the compensation quality indirectly through the performance of the downstream prediction task.
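The quality check reduces to comparing two root mean square errors. The sketch below uses hypothetical function names and an illustrative multiple of 1.3 taken from within the stated 1.2 to 1.5 range.

```python
import math


def rmse(predicted, actual):
    """Root mean square error between two equal-length sequences."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("sequences must be non-empty and of equal length")
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(predicted))


def compensation_insufficient(local_error, baseline_error, multiple=1.3):
    """True when the error on compensated periods exceeds the preset multiple of the baseline."""
    return local_error > multiple * baseline_error


baseline = rmse([10.0, 11.0, 12.0], [10.0, 12.0, 12.0])   # raw-data period
local = rmse([10.0, 11.0, 12.0], [12.0, 12.0, 14.0])      # period with compensation values
needs_adjustment = compensation_insufficient(local, baseline)
```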
[0119] After the compensation quality is determined to be insufficient, the sources of deviation need to be analyzed in depth. For each compensated data point, the deviation between the compensation value and the abnormal data value before replacement is calculated as $d_i = \tilde{x}_i - x_i^{abn}$, where $\tilde{x}_i$ is the $i$-th compensation value and $x_i^{abn}$ is the corresponding original abnormal data value. Although the abnormal data itself is unreliable, the statistical distribution of the deviations can reveal patterns in the compensation behavior. For example, if all deviations are skewed in the same direction, this may indicate a systematic bias in the compensation network; if the deviations fluctuate periodically over time, this suggests that temporal features are being extracted insufficiently.
[0120] To verify temporal continuity, historical periodic data before and after the time period containing the compensation value is extracted as a reference. For energy data with obvious periodicity, such as electricity load or renewable energy output, the trend deviation between the compensation value and historical data for the same period is calculated. Specifically, statistical modeling is performed on same-period data from multiple historical cycles to obtain the expected trend curve $\mu(t)$ and a confidence interval around it characterized by the standard deviation $\sigma(t)$. The degree to which the compensation value $\tilde{x}(t)$ deviates from the expected value $\mu(t)$ is used as the temporal continuity deviation $D_t$, calculated as $D_t = \frac{|\tilde{x}(t) - \mu(t)|}{\sigma(t)}$. This indicator reflects whether the compensation value conforms to the historical evolution pattern. The larger the value, the more seriously the compensation result deviates from the historical pattern in the time dimension.
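One way to quantify the temporal continuity deviation is as a distance from the historical same-period mean, measured in standard deviations. Normalizing by the standard deviation is an assumption made for this sketch; the description only says that the deviation from the expected value is used and that a standard deviation defines the confidence interval.

```python
from statistics import mean, stdev


def temporal_continuity_deviation(compensated, same_period_history):
    """Distance of a compensation value from the historical mean, in standard deviations."""
    if len(same_period_history) < 2:
        raise ValueError("at least two historical values are needed")
    mu = mean(same_period_history)
    sigma = stdev(same_period_history)  # sample standard deviation
    if sigma == 0:
        return 0.0 if compensated == mu else float("inf")
    return abs(compensated - mu) / sigma


# Load at the same hour on three previous days, and a compensated value for today.
deviation = temporal_continuity_deviation(16.0, [10.0, 12.0, 14.0])
```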
[0121] To verify spatial logical consistency, the compensation value is substituted into the physical constraint equations for testing. Under the energy conservation constraint for nodes in the energy internet, the energy input and output of a node should be in balance. Let the input power of node $j$ be $P_{in,j}$, the output power be $P_{out,j}$, and the internal power consumption be $P_{loss,j}$; in theory, $P_{in,j} = P_{out,j} + P_{loss,j}$. When one of these parameters is replaced by a compensation value, the residual of the constraint equation is calculated as $r_{E,j} = \left| P_{in,j} - P_{out,j} - P_{loss,j} \right|$. Similarly, for mass balance constraints, such as flow conservation at nodes in a heating network, let the total flow into the node be $Q_{in,j}$ and the total outflow be $Q_{out,j}$; the residual is $r_{M,j} = \left| Q_{in,j} - Q_{out,j} \right|$. The average or maximum of the residuals over all nodes involving compensation values is taken as the spatial logical consistency deviation $D_s$. This deviation directly reflects whether the compensation value satisfies the constraints of physical laws and is a key indicator for assessing whether the compensation is reasonable.
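The residual check for spatial logical consistency can be sketched as below. The function names are hypothetical, and the choice between the mean and the maximum is exposed as a flag because the description allows either.

```python
def energy_residual(p_in, p_out, p_loss):
    """Absolute imbalance of the node energy conservation equation."""
    return abs(p_in - p_out - p_loss)


def mass_residual(q_in, q_out):
    """Absolute imbalance of the node flow conservation equation."""
    return abs(q_in - q_out)


def spatial_consistency_deviation(residuals, use_max=False):
    """Aggregate node residuals by mean (default) or by maximum."""
    if not residuals:
        return 0.0
    return max(residuals) if use_max else sum(residuals) / len(residuals)


residuals = [
    energy_residual(p_in=10.0, p_out=9.0, p_loss=0.5),  # 0.5 imbalance
    mass_residual(q_in=4.0, q_out=2.5),                  # 1.5 imbalance
]
mean_dev = spatial_consistency_deviation(residuals)
max_dev = spatial_consistency_deviation(residuals, use_max=True)
```

Using the maximum makes the indicator sensitive to a single badly violated node, while the mean reflects the overall level of imbalance.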
[0122] After the two types of deviation are quantified, their proportions of the total deviation are calculated. The total deviation is defined as $D = D_t + D_s$, where $D_t$ is the temporal continuity deviation and $D_s$ is the spatial logical consistency deviation. The temporal continuity deviation ratio is $r_t = D_t / D$, and the spatial logical consistency deviation ratio is $r_s = D_s / D$. The preset temporal threshold $\tau_t$ is typically set to 0.6, and the preset spatial threshold $\tau_s$ is also set to 0.6. When $r_t > \tau_t$, the current compensation network is not capturing temporal evolution patterns adequately, and its ability to extract features in the temporal dimension needs to be enhanced; when $r_s > \tau_s$, the compensation results do not fully comply with spatial physical constraints, and the modeling of spatial relationships needs to be strengthened.
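The ratio test can be written compactly. This sketch assumes the total deviation is the simple sum of the two deviations and that both have already been put on comparable scales; the names are illustrative.

```python
def deviation_shares(temporal_dev, spatial_dev):
    """Return each deviation's share of the total (both zero if there is no deviation)."""
    total = temporal_dev + spatial_dev
    if total == 0:
        return 0.0, 0.0
    return temporal_dev / total, spatial_dev / total


def dominant_deviations(temporal_dev, spatial_dev,
                        temporal_threshold=0.6, spatial_threshold=0.6):
    """List which dimensions exceed their preset share thresholds."""
    t_share, s_share = deviation_shares(temporal_dev, spatial_dev)
    dominant = []
    if t_share > temporal_threshold:
        dominant.append("temporal")
    if s_share > spatial_threshold:
        dominant.append("spatial")
    return dominant


shares = deviation_shares(3.0, 1.0)
diagnosis = dominant_deviations(3.0, 1.0)
```

Under this sum-based definition the two shares add up to 1, so with both thresholds at 0.6 at most one dimension can be flagged in a given round, and neither is flagged when the deviations are roughly balanced.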
[0123] Based on the deviation analysis results, the allocation ratio of temporal attention and spatial attention weights in the spatiotemporal attention compensation network is adjusted. In the network architecture, the temporal attention mechanism and the spatial attention mechanism are integrated through learnable weight parameters $\alpha_t$ and $\alpha_s$. Their respective feature representations are fused into the final feature vector: $F = \alpha_t F_t + \alpha_s F_s$, where $F_t$ and $F_s$ are the temporal and spatial features, respectively. During initial training, these two weights may be set to equal values or preset based on experience, but they need to be adjusted dynamically during the validation and feedback phase. When the temporal continuity deviation ratio $r_t$ is detected to exceed the preset temporal threshold $\tau_t$, the initial value of $\alpha_t$ is increased, or the weight coefficient of the temporal consistency loss term in the loss function is increased, to encourage the network to pay more attention to errors in the time dimension during backpropagation. Specifically, $\alpha_t$ is multiplied by the adjustment factor $1 + \gamma (r_t - \tau_t)$, where $\gamma$ is the sensitivity coefficient, typically set between 0.5 and 1.0, so that the amount by which the deviation ratio exceeds the threshold directly determines the weight increase.
[0124] Similarly, when the spatial logical consistency deviation ratio $r_s$ exceeds the preset spatial threshold $\tau_s$, the spatial attention weight $\alpha_s$ is multiplied by the adjustment factor $1 + \gamma (r_s - \tau_s)$, where $\gamma$ is the sensitivity coefficient. At the same time, the weight coefficients of the corresponding energy conservation loss term and mass balance loss term in the network's physical information loss function also need to be increased, so that the training process places greater emphasis on satisfying physical constraints. The adjusted weight parameters are used as new initial values, and the spatiotemporal attention compensation network is retrained on the original training dataset. During retraining, the other hyperparameters remain unchanged and only the attention weight allocation is updated, so that the network improves specifically on its weak points.
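The weight adjustment can be sketched as a multiplicative update that scales with how far a deviation ratio exceeds its threshold. The factor `1 + sensitivity * (share - threshold)` is an assumed form consistent with the statement that the excess over the threshold directly affects the weight increase; the names and default values are illustrative.

```python
def adjust_attention_weight(weight, share, threshold=0.6, sensitivity=0.8):
    """Scale an attention weight up when its deviation share exceeds the threshold."""
    excess = share - threshold
    if excess <= 0:
        return weight  # no adjustment when the share is within the threshold
    return weight * (1.0 + sensitivity * excess)


# Temporal share of 0.75 exceeds 0.6, so the temporal weight grows; spatial stays put.
new_temporal = adjust_attention_weight(0.5, share=0.75)
new_spatial = adjust_attention_weight(0.5, share=0.25)
```

The adjusted values would then serve as the new initial fusion weights for retraining. Whether the weights are renormalized afterwards is not stated in the description and is left out here.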
[0125] After retraining, the compensation effect of the new network is evaluated again on the validation dataset, and the new local prediction error $E'_{local}$ is calculated and compared with the baseline prediction error $E_{base}$. If $E'_{local}$ is still greater than the preset multiple of $E_{base}$, the above deviation analysis and weight adjustment process is repeated, forming an iterative optimization loop.
[0126] A second aspect of the present invention provides an electronic device, comprising:
[0127] a processor; and
[0128] a memory for storing processor-executable instructions;
[0129] wherein the processor is configured to invoke the instructions stored in the memory to execute the aforementioned method.
[0130] A third aspect of the present invention provides a computer-readable storage medium having stored thereon computer program instructions that, when executed by a processor, implement the aforementioned method.
[0131] This invention can be a method, apparatus, system, and / or computer program product. The computer program product may include a computer-readable storage medium having computer-readable program instructions loaded thereon for performing various aspects of the invention.
[0132] Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of the present invention, and not to limit them; although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that modifications can still be made to the technical solutions described in the foregoing embodiments, or equivalent substitutions can be made to some or all of the technical features; and these modifications or substitutions do not cause the essence of the corresponding technical solutions to deviate from the scope of the technical solutions of the embodiments of the present invention.
Claims
1. A method for identifying and repairing hidden abnormal data in the energy internet based on spatiotemporal logic, characterized in that the method comprises: An original data set is collected, and time series trend relationships and spatial correlation relationships are constructed. The deviation between the current sampled value and the theoretical expected value is calculated through physical constraint operators. When the deviation exceeds the adaptive threshold, the data is determined to be hidden abnormal data. Data combinations that violate the energy conservation or mass balance relationship are determined to be combinational logic abnormal data. A spatiotemporal attention compensation network is called to generate compensation values for the hidden abnormal data and the combinational logic abnormal data, wherein the spatiotemporal attention compensation network embeds a physical information loss function to inject energy conservation and mass balance logic into the process of generating compensation values in real time. A low-confidence flag is added to the data points corresponding to the compensation values. When training the prediction algorithm, the data points with the low-confidence flag are assigned a weight lower than that of normal data, and a physical logic penalty term is added to the loss function. The parameters of the spatiotemporal attention compensation network are adjusted according to the verification results of the prediction algorithm.
2. The method according to claim 1, characterized in that, The deviation between the current sampled value and the theoretical expected value is calculated using a physical constraint operator. When the deviation exceeds an adaptive threshold, it is determined to be hidden abnormal data, including: Historical data from the same period are extracted as a reference benchmark based on the time series trend relationship, and adjacent sensor data are extracted as a spatial reference based on the spatial correlation relationship. The theoretical expected value of the current sampled value is calculated by fusing the reference benchmark and the spatial reference through a physical constraint operator. The physical constraint operator transforms the physical inertial characteristics of the equipment into data change rate constraints and transforms the causal correlation of upstream and downstream parameters into coupling relationship constraints. The deviation between the current sampled value and the theoretical expected value is calculated, and the adaptive threshold is dynamically adjusted based on the statistical distribution of the data within the sliding time window. The adaptive threshold is automatically adjusted according to the load level changes of the system operating conditions. When the deviation exceeds the adaptive threshold, it is determined to be hidden abnormal data. The value of the hidden abnormal data is within the statistically normal range but violates the physical logic constraints defined by the physical constraint operator. The hidden abnormal data includes sensor systematic drift data and progressive offset data.
3. The method according to claim 1, characterized in that, Determining data combinations that violate the energy conservation or mass balance relationship to be combinational logic abnormal data includes: Multiple parameters with physical relationships are extracted from the original data set to form a data combination to be verified. The data combination to be verified includes node input and output parameter combinations and upstream and downstream relationship parameter combinations. The network topology connection relationship in the spatial correlation relationship is transformed into a node energy conservation constraint equation, which represents the balance relationship that the energy injected into the node is equal to the energy outflow plus the energy loss. The node flow transmission relationship in the spatial correlation relationship is transformed into a mass balance constraint equation, which represents the conservation relationship that the inflow mass of the node is equal to the outflow mass. The data combination to be verified is substituted into the node energy conservation constraint equation and the mass balance constraint equation for verification. When each parameter in the data combination to be verified is individually within the normal range but the combination violates the balance relationship after being substituted into the constraint equations, the data combination to be verified is determined to be combinational logic abnormal data.
4. The method according to claim 1, characterized in that, The spatiotemporal attention compensation network is invoked to generate compensation values for the hidden anomaly data and the combinational logic anomaly data, including: Based on the type of abnormal data, a spatiotemporal attention weight allocation strategy is selected. When the data to be compensated is the hidden abnormal data, the attention weight of the time dimension is increased, and historical periodic data is extracted from the time series trend relationship to calculate the time attention weight. When the data to be compensated is the combinational logic abnormal data, the attention weight of the spatial dimension is increased, and adjacent sensor data is extracted from the spatial correlation relationship to calculate the spatial attention weight. A multi-head attention mechanism is used to construct short-term attention heads and long-term attention heads respectively. The short-term attention head captures instantaneous change features within a preset short-term time window, and the long-term attention head captures periodic pattern features of historical data from the same period. The outputs of the short-term and long-term attention heads are weighted by the time attention weight to obtain time compensation features, and adjacent sensor data are weighted by the spatial attention weight to obtain spatial compensation features. The time compensation features and the spatial compensation features are fused to generate compensation values.
5. The method according to claim 4, characterized in that, A multi-head attention mechanism is used to construct short-term and long-term attention heads, including: Multiple short-term attention heads are constructed to capture instantaneous change features at different time scales, and different short-term attention heads use preset short-term time windows of different lengths; multiple long-term attention heads are constructed to capture features of different cycle patterns, and different long-term attention heads extract historical data from the same period of the daily cycle, weekly cycle and seasonal cycle respectively. Calculate the correlation score between the output of each attention head and the current data point to be compensated. The correlation score is calculated based on the matching degree between the features extracted by the attention head and the historical evolution pattern of the data point to be compensated. The outputs of each attention head are weighted and fused based on the relevance score. The relevance score is then converted into a fusion weight for each attention head through a normalization operation. The outputs of all short-term attention heads are weighted and fused to obtain short-term comprehensive features, and the outputs of all long-term attention heads are weighted and fused to obtain long-term comprehensive features.
6. The method according to claim 3, characterized in that, A low-confidence flag is added to the data points corresponding to the compensation value. When training the prediction algorithm, the data points with the low-confidence flag are assigned a weight lower than normal data, and a physical logic penalty term is added to the loss function, including: The confidence score of the compensation value is calculated based on the spatiotemporal data completeness and physical constraint satisfaction during the compensation process. The spatiotemporal data completeness is calculated based on the availability of reference historical data and the redundancy of adjacent sensor data. The physical constraint satisfaction is calculated based on the residual after substituting the compensation value into the node energy conservation constraint equation and the mass balance constraint equation. When the confidence score is lower than the preset confidence threshold, a low-confidence flag is added to the corresponding data point. When training the prediction algorithm, the confidence scores of all data points are extracted. For data points with the low-confidence flag, the confidence scores are mapped to sample weights using the sigmoid function, such that the lower the confidence score, the smaller the sample weight. A physical logic penalty term is added to the loss function of the prediction algorithm. The physical logic penalty term calculates the degree to which the prediction output of the prediction algorithm violates the node energy conservation constraint equation and the mass balance constraint equation. When the prediction output violates a constraint equation, the loss value is increased.
7. The method according to claim 1, characterized in that, Adjusting the parameters of the spatiotemporal attention compensation network based on the verification results of the prediction algorithm includes: The prediction accuracy of the prediction algorithm is evaluated using a validation dataset. The local prediction error for the time period containing the compensation value and the baseline prediction error for the time period containing only the original data are calculated respectively. When the local prediction error is greater than a preset multiple of the baseline prediction error, the compensation quality is determined to be insufficient. The deviation distribution between the compensation value and its corresponding abnormal data value before replacement is calculated to identify temporal continuity deviation and spatial logic consistency deviation. The temporal continuity deviation is quantified by calculating the trend deviation between the compensation value and historical periodic data. Based on the identified deviation type, the allocation ratio of temporal attention weights and spatial attention weights in the spatiotemporal attention compensation network is adjusted. When the proportion of temporal continuity deviation to total deviation exceeds a preset temporal threshold, the temporal dimension weight is increased. When the proportion of spatial logical consistency deviation to total deviation exceeds a preset spatial threshold, the spatial dimension weight is increased. The spatiotemporal attention compensation network is retrained and the verification and adjustment are iteratively performed until the local prediction error converges.
8. An electronic device, characterized in that it comprises: a processor; and a memory for storing processor-executable instructions; wherein the processor is configured to invoke the instructions stored in the memory to execute the method according to any one of claims 1 to 7.
9. A computer-readable storage medium having computer program instructions stored thereon, characterized in that, When the computer program instructions are executed by the processor, they implement the method described in any one of claims 1 to 7.