Multi-anomaly data repairing method based on U-shaped network and time-frequency domain weighted loss

By combining a U-shaped CNN-BiLSTM-Attention network with a time-frequency domain weighted loss function, the problem of multi-modal anomalies in civil engineering structure monitoring data is solved, achieving high-precision data repair and reliability assessment. It is applicable to health monitoring systems for various civil engineering structures such as bridges, buildings, and reservoir dams.

CN122241025A · Pending · Publication Date: 2026-06-19 · SUN YAT SEN UNIV

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
SUN YAT SEN UNIV
Filing Date
2026-03-25
Publication Date
2026-06-19

AI Technical Summary

Technical Problem

Existing technologies cannot effectively handle multi-modal anomalies in civil engineering structural monitoring data, especially missing data, baseline drift, trend terms, and mixed-mode anomalies. They also fail to accurately assess the reliability of repair results, which undermines structural condition assessment and safety early warning.

Method used

A multi-anomaly data repair method based on a U-shaped network and time-frequency domain weighted loss is adopted. Feature extraction and reconstruction are performed by a U-shaped CNN-BiLSTM-Attention network, and uncertainty is assessed with MC-dropout. A frequency-weighted loss function applied jointly in the time and frequency domains is constructed, enabling multi-scale feature extraction and a reliability assessment of the data reconstruction.

Benefits of technology

It improves the integrity and reliability of civil engineering structure monitoring data, significantly enhances the repair accuracy of mixed-mode anomalies, and avoids misjudgments caused by unreliable repair results through uncertainty assessment. It is applicable to health monitoring systems for various civil engineering structures.

✦ Generated by Eureka AI based on patent content.


Abstract

This invention discloses a multi-anomaly data repair method based on a U-shaped network and time-frequency domain weighted loss, comprising the following steps: S1: Constructing a training sample set and a test sample set; S2: Building a U-shaped CNN-BiLSTM-Attention network; S3: Extracting features from the data in the training sample set to obtain high-level abstract features containing multi-scale features; S4: Obtaining feature fusion results; S5: Obtaining enhanced temporal features; S6: Obtaining the optimal data repair model; S7: Using the optimal data repair model to repair the test sample set, obtaining repair results, and outputting the uncertainty quantification result of the repair results through the MC-dropout uncertainty evaluation module. This invention achieves high-precision repair of single anomalies and mixed-mode anomalies such as data missing, baseline drift, and trend terms in civil engineering structure monitoring data, and simultaneously outputs the uncertainty quantification result of the repair results.

Description

Technical Field

[0001] This invention relates to the fields of civil engineering structural health monitoring, data anomaly processing, and deep learning data repair technology, specifically to a multi-anomaly data repair method based on U-shaped networks and time-frequency domain weighted loss.

Background Technology

[0002] Various civil engineering structures, including bridges, buildings, reservoirs, dams, tunnels, rail transit systems, and slopes, are prone to performance degradation and potential damage during long-term operation due to environmental loads, traffic loads, material aging, and natural disasters. Structural health monitoring systems, by deploying various types of sensors such as strain, displacement, acceleration, tilt, temperature, and humidity sensors, achieve long-term real-time monitoring of the condition of various civil engineering structures. The completeness and reliability of the monitoring data are particularly crucial.

[0003] In practical engineering applications, monitoring data of civil engineering structures often exhibit anomalies due to factors such as sensor failure, transmission packet loss, electromagnetic interference, and extreme load impacts. These anomalies include single anomalies such as data loss, baseline drift, and trend terms. When two or more of these anomalies exist simultaneously in a sequence, it is called a mixed-mode anomaly. Traditional interpolation methods, filtering methods, and statistical models struggle to handle multi-mode, strongly time-dependent data in which single and mixed-mode anomalies are coupled. Their repair accuracy is insufficient, and they are prone to losing key structural response characteristics, directly affecting the reliability of subsequent civil engineering structure status assessments, damage identification, and safety early warnings.

[0004] In recent years, deep learning has been widely used in time series data restoration, but existing technologies have the following shortcomings:
(1) Traditional convolutional neural networks have difficulty capturing long-term temporal dependencies and are not effective in handling temporally correlated anomalies such as baseline drift and trend terms in civil engineering structure monitoring data.
(2) Simple long short-term memory networks have a weak ability to extract local spatial and multi-scale features and cannot accurately capture local features such as missing data and mixed anomaly patterns in civil engineering structure monitoring data.
(3) The lack of a feature enhancement mechanism that combines a U-shaped structure with dilated convolution makes it difficult to meet the multi-scale feature extraction needs of single anomalies and mixed-mode anomalies in civil engineering structure monitoring data.
(4) Most methods use only the time domain error as the loss function and ignore frequency domain feature constraints, resulting in poor repair of mixed-mode anomalies and trend term anomalies in civil engineering structure monitoring data.
(5) Existing technologies have not solved the problem of accurately repairing mixed-mode anomalies, and have not designed a general architecture applicable to multiple types of civil engineering structures. Some deep learning-based repair methods adopt the core architecture of "U-shaped convolutional neural network for feature extraction + long short-term memory network for time-series modeling" without introducing an attention mechanism, which limits their ability to capture features of mixed-mode anomalies. In addition, some methods use time-frequency domain loss functions, but the frequency domain loss term uses uniform weights and includes no weighting strategy for low-frequency trend anomalies in civil engineering structure monitoring data, so the repair accuracy needs to be improved.
(6) Existing repair methods output only a single repair result and do not evaluate the uncertainty of the data reconstruction process, so they cannot determine the reliability of the repair result. When the repaired data is used in key scenarios such as civil engineering structural status assessment and damage identification, unreliable repair results can easily lead to misjudgment, which poses a risk to engineering applications.

[0005] Therefore, to meet actual engineering needs, there is an urgent need for a method for repairing civil engineering structure monitoring data with multi-mode anomalies. Such a method should simultaneously achieve multi-scale feature extraction, bidirectional temporal modeling, attention feature weighting, fusion of high-level and low-level features, and frequency-weighted joint constraints in the time and frequency domains. It should also assess data reconstruction uncertainty through MC-dropout technology and effectively handle missing data, baseline drift, single trend term anomalies, and mixed-mode anomalies.

Summary of the Invention

[0006] To overcome the shortcomings of existing technologies, the present invention aims to provide a multi-anomaly data repair method based on U-shaped network and time-frequency domain weighted loss. This method achieves high-precision repair of single anomalies and mixed-mode anomalies such as data missing, baseline drift, and trend terms in civil engineering structure monitoring data, and simultaneously outputs the uncertainty quantification results of the repair results, thereby improving the integrity and reliability of monitoring data and providing high-quality data support for structural status assessment and safety early warning.

[0007] To achieve the objective of this invention, the following solution is adopted: A multi-anomaly data repair method based on U-shaped networks and time-frequency domain weighted loss includes the following steps:
S1: Acquire multi-mode monitoring data of civil engineering structures, preprocess the multi-mode monitoring data, label the single anomalies and mixed-mode anomalies, and construct training sample sets and test sample sets.
S2: Construct a U-shaped CNN-BiLSTM-Attention network, which includes an encoder, a decoder, a skip connection module, a BiLSTM temporal modeling module, an attention mechanism module, and an MC-dropout uncertainty evaluation module embedded in the decoder.
S3: Use the encoder to extract features from the data in the training sample set to obtain high-level abstract features containing multi-scale features.
S4: Fuse the high-level abstract features with the shallow features in the decoder through the skip connection module to obtain the feature fusion result.
S5: Input the feature fusion result into the BiLSTM temporal modeling module for bidirectional temporal modeling to obtain temporal features; then input the temporal features into the attention mechanism module for feature weighting to obtain enhanced temporal features.
S6: Use the decoder to reconstruct the enhanced temporal features to obtain repaired data; construct a frequency-weighted hybrid loss function combining the time and frequency domains, and use it to calculate the loss value between the repaired data and the corresponding real data. Train and optimize the U-shaped CNN-BiLSTM-Attention network to obtain the optimal data repair model.
S7: Use the optimal data repair model to repair the monitoring data in the test sample set that contain single anomalies and mixed-mode anomalies, obtain the repair results, and output the uncertainty quantification results of the repair results through the MC-dropout uncertainty assessment module.

[0008] Furthermore, in step S1, the multi-mode monitoring data includes two or more types of time-series data from bridge structures, building structures, reservoir dams, tunnel structures, rail transit structures, and slopes, such as strain, displacement, acceleration, tilt angle, temperature and humidity, settlement, deformation, and seepage flow. All data are collected in real time from the structural health monitoring system, and each type of monitoring data must have a sampling frequency between 1/1000 Hz and 100 Hz (to meet the multi-scale feature extraction requirements of dilated convolution). When the method is extended to different civil engineering structures, the corresponding monitoring data types (such as vibration, settlement, deformation, seepage flow, etc.) can be adapted according to the structural type (such as bridge structure, building structure, reservoir dam).

[0009] And/or, the preprocessing includes removing invalid and redundant data and using max-min normalization to eliminate dimensional differences; and/or, the single anomalies include missing data, baseline drift, and trend terms, and a mixed-mode anomaly is the coexistence of two or more of these single anomalies.
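The max-min normalization step can be illustrated with a minimal sketch. The function name `min_max_normalize`, the use of `None` to mark missing samples, and the sample readings are assumptions made for illustration and do not come from the patent text.

```python
def min_max_normalize(series):
    """Scale a sequence to [0, 1]; missing samples (None) stay None."""
    valid = [v for v in series if v is not None]
    if not valid:
        return list(series)
    lo, hi = min(valid), max(valid)
    span = hi - lo
    if span == 0:
        # Constant channel: map every valid sample to 0.0 to avoid dividing by zero.
        return [None if v is None else 0.0 for v in series]
    return [None if v is None else (v - lo) / span for v in series]


# Hypothetical readings from two sensor types with different units.
strain = [120.0, 135.0, None, 150.0, 90.0]      # microstrain, one missing sample
temperature = [18.5, 19.0, 21.5, 20.0, 18.5]    # degrees Celsius

strain_scaled = min_max_normalize(strain)
temperature_scaled = min_max_normalize(temperature)
```

After scaling, both channels lie in the same [0, 1] range, so neither dominates later feature extraction because of its unit.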

[0010] Furthermore, in step S3, the encoder is embedded with a dilated convolutional layer and a basic CNN layer. The dilated convolutional layer uses dilated convolutional kernels with different dilation rates to achieve parallel extraction of abnormal features and local trend features at different scales. The basic CNN layer is used to complete the basic feature extraction.

[0011] Anomaly features include single anomaly features and mixed-mode anomaly features, adapting to the anomaly feature extraction needs of monitoring data for various civil engineering structures. Furthermore, in step S4, the skip connection module achieves high- and low-dimensional feature fusion by concatenating the high-level abstract features with the shallow features in the decoder or adding them element by element.

[0012] U-shaped networks concatenate encoder and decoder features, or add them element by element, through skip connections. This fuses high- and low-dimensional features and improves the ability to capture features of mixed-mode anomalous data from various civil engineering structures.
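The two fusion options named above differ in the shape of their output, which a small sketch makes concrete. Feature maps are represented here as plain lists of channels (each channel a list over time steps); the function names and the numeric values are hypothetical.

```python
def fuse_concat(deep, shallow):
    """Concatenate along the channel axis; time lengths must agree."""
    steps = len(deep[0])
    if any(len(channel) != steps for channel in deep + shallow):
        raise ValueError("all channels must share the same number of time steps")
    return [list(channel) for channel in deep] + [list(channel) for channel in shallow]


def fuse_add(deep, shallow):
    """Add element by element; both feature maps must have identical shapes."""
    if len(deep) != len(shallow) or any(len(a) != len(b) for a, b in zip(deep, shallow)):
        raise ValueError("element-wise addition needs identical shapes")
    return [[a + b for a, b in zip(ch_d, ch_s)] for ch_d, ch_s in zip(deep, shallow)]


# Hypothetical feature maps: 2 channels x 3 time steps each.
deep = [[0.5, 0.25, 0.0], [1.0, 1.5, 2.0]]
shallow = [[1.0, 1.0, 1.0], [2.0, 2.0, 2.0]]

concatenated = fuse_concat(deep, shallow)   # 4 channels x 3 time steps
added = fuse_add(deep, shallow)             # 2 channels x 3 time steps
```

Concatenation keeps both feature sets separate and doubles the channel count, while addition keeps the channel count fixed but requires matching shapes.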

[0013] Furthermore, in step S5, the BiLSTM temporal modeling module models the feature fusion result in a bidirectional temporal manner (both forward and reverse) and outputs temporal features containing long-term dependencies. The attention mechanism module calculates the correlation between each time step in the temporal features and the anomaly labeling interval, and assigns higher weights to the anomaly and adjacent interval features to enhance the expression of the anomaly interval features.

[0014] The BiLSTM temporal modeling module models the feature vectors enhanced by dilated convolution using a bidirectional temporal approach (both forward and backward), outputting temporal features containing long-term dependencies. Furthermore, considering the strong temporal characteristics of civil engineering structure monitoring data, the initial bias terms of the input gate and forget gate of the BiLSTM temporal modeling module are specifically adapted, with default values ranging from 0.1 to 0.3. This enhances the ability to capture temporal dependencies of slowly varying anomalies such as baseline drift and trend terms, adapting to the temporal pattern capture needs of various civil engineering structure data with missing data, baseline drift, trend terms, and mixed-mode anomalies.

[0015] Further, in step S7, the uncertainty quantification result of the repair result is output through the MC-dropout uncertainty assessment module. Specifically, during the inference phase, the dropout layer is kept on, and the same input anomaly monitoring data is independently and randomly sampled and reconstructed T times to obtain T sets of candidate repair data. By calculating the statistical characteristics of the T sets of candidate repair data, the data reconstruction uncertainty is quantitatively assessed.

[0016] Furthermore, the quantitative assessment of uncertainty uses the standard deviation and the coefficient of variation as core evaluation indicators. The standard deviation characterizes the dispersion of the T groups of candidate repair data, and the coefficient of variation removes the influence of the magnitude of the repair data, so that the uncertainty of different types of monitoring data can be assessed on a unified scale.

[0017] The MC-dropout uncertainty assessment module is embedded in the decoder of the U-shaped network to quantify the uncertainty of the data reconstruction process. The specific implementation process is as follows: a dropout layer is embedded in the fully connected layer and convolutional layer of the U-shaped network decoder. During the training phase, the dropout probability p∈[0.1,0.3] is fixed. During the inference phase, the dropout layer is not turned off. The same input anomaly monitoring data is subjected to T independent random sampling reconstructions to obtain T sets of repair results. By calculating the statistical characteristics of the T sets of repair results, the quantification assessment of the uncertainty of data reconstruction is completed.
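The sampling and statistics described above can be sketched as follows. This is a toy illustration and not the patented network: `toy_decoder` is a hypothetical stand-in for the decoder (dropout followed by a linear read-out), the feature and weight values are invented, and the use of the sample standard deviation is an assumption, since the text does not say which estimator is used.

```python
import random
import statistics


def dropout(values, p, rng):
    """Inverted dropout: zero each value with probability p, rescale the rest."""
    keep = 1.0 - p
    return [v / keep if rng.random() >= p else 0.0 for v in values]


def toy_decoder(features, weights, p, rng):
    """Hypothetical decoder stand-in: dropout, then a linear read-out per time step."""
    return [sum(w * h for w, h in zip(weights, dropout(step, p, rng)))
            for step in features]


def mc_dropout_repair(features, weights, p=0.2, T=100, seed=0):
    """Run T stochastic passes with dropout left on; return mean, std, and CV per step."""
    rng = random.Random(seed)
    samples = [toy_decoder(features, weights, p, rng) for _ in range(T)]
    mean, std, cv = [], [], []
    for i in range(len(features)):
        column = [sample[i] for sample in samples]   # T candidates for time step i
        m = statistics.mean(column)
        s = statistics.stdev(column)                 # sample standard deviation (assumed)
        mean.append(m)
        std.append(s)
        cv.append(s / abs(m) if m != 0 else float("inf"))
    return mean, std, cv


# Hypothetical decoder input: 3 time steps x 4 hidden features.
features = [[0.2, 0.4, 0.1, 0.3], [0.5, 0.1, 0.2, 0.6], [0.9, 0.8, 0.7, 0.6]]
weights = [0.5, -0.25, 1.0, 0.75]
mean, std, cv = mc_dropout_repair(features, weights, p=0.2, T=100)
```

The per-step mean can serve as the repaired value, while a large standard deviation or coefficient of variation flags time steps where the repair should not be trusted without review.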

[0018] Further, in step S6, the frequency-weighted hybrid loss function is obtained by weighted combination of time-domain loss term and frequency-domain loss term; wherein, the time-domain loss term is used to constrain the approximation degree between the repaired data and the real data in terms of numerical and temporal aspects; the frequency-domain loss term is used to constrain the approximation degree between the repaired data and the real data in terms of spectrum, energy distribution and frequency characteristics.

[0019] A frequency-weighted strategy is introduced into the frequency domain loss term, which assigns dynamic weights to different frequency ranges, focusing on strengthening the constraints of low-frequency trend characteristics and improving the repair accuracy of abnormal data in mixed modes of various civil engineering structures.

[0020] Furthermore, the time-domain loss term includes one or a combination of mean square error and mean absolute error; the frequency-domain loss term is constructed based on fast Fourier transform or wavelet transform, and a frequency weighting strategy is introduced in the frequency-domain loss term to assign dynamic weights to different frequency ranges, focusing on strengthening the constraint of low-frequency trend characteristics.

[0021] Furthermore, the frequency weighting strategy adopts Gaussian weighting, assigning a higher weight to the low-frequency interval than to the high-frequency interval. The low-frequency interval is the interval corresponding to the first quarter of the total length of the frequency domain data. The weight of the low-frequency interval ranges from 0.8 to 1.0, while the weight of the high-frequency interval ranges from 0.1 to 0.3.

[0022] Higher weights (ω_k=0.8~1.0) are assigned to the low-frequency range (k≤M / 4, where M is the total length of the frequency domain data), while lower weights (ω_k=0.1~0.3) are assigned to the high-frequency range (k>M / 4), to adapt to the abnormal characteristics of civil engineering structure monitoring data.
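A minimal sketch of such a frequency-weighted hybrid loss is given below, under several stated assumptions. It uses mean square error for the time-domain term and a plain discrete Fourier transform for the frequency-domain term. The Gaussian weighting mentioned in the text is replaced by a simple two-level step (1.0 for k ≤ M/4, 0.2 otherwise) because the exact Gaussian form is not given. M is taken as the length of the one-sided spectrum, and the mixing coefficients `alpha` and `beta` are hypothetical.

```python
import cmath
import math


def dft(x):
    """Plain O(n^2) discrete Fourier transform (stand-in for an FFT)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def frequency_weights(m, low_weight=1.0, high_weight=0.2):
    """Step weights: low_weight for bins k <= m/4, high_weight for the rest."""
    return [low_weight if k <= m / 4 else high_weight for k in range(m)]


def hybrid_loss(repaired, target, alpha=0.5, beta=0.5):
    """alpha * time-domain MSE + beta * frequency-weighted spectral error."""
    n = len(target)
    time_loss = sum((r - y) ** 2 for r, y in zip(repaired, target)) / n
    m = n // 2 + 1                      # one-sided spectrum: bins 0 .. n/2 (assumed)
    spec_r, spec_y = dft(repaired)[:m], dft(target)[:m]
    weights = frequency_weights(m)
    freq_loss = sum(w * abs(a - b) ** 2
                    for w, a, b in zip(weights, spec_r, spec_y)) / (m * n)
    return alpha * time_loss + beta * freq_loss


# Two hypothetical repair errors with equal time-domain energy:
n = 32
target = [0.0] * n
low_freq_error = [math.cos(2 * math.pi * 1 * t / n) for t in range(n)]    # trend-like
high_freq_error = [math.cos(2 * math.pi * 12 * t / n) for t in range(n)]  # noise-like

loss_low = hybrid_loss(low_freq_error, target)
loss_high = hybrid_loss(high_freq_error, target)
```

Both error signals have the same mean square error, but the slowly varying one falls in the heavily weighted low-frequency band and so receives the larger loss, which is the intended effect of the weighting on drift and trend anomalies.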

[0023] Compared with the prior art, the beneficial effects of the present invention are as follows:
1. This invention constructs a U-shaped CNN-BiLSTM-Attention network, embeds dilated convolutional layers and basic CNN layers in the encoder, and uses dilated convolutional kernels with different dilation rates to achieve parallel extraction of anomalous features at different scales. It can simultaneously capture multi-scale features of single anomalies such as missing data, baseline drift, and trend terms, as well as mixed-mode anomalies, solving the problem that traditional convolutional neural networks cannot take into account both local details and global trends.
2. This invention uses the skip connections of a U-shaped network to fuse the high-level abstract features extracted by the encoder with the shallow features in the decoder, thereby achieving complementarity between high- and low-dimensional features, preserving detailed information and global trends in the monitoring data, and improving the feature representation capability for mixed-mode abnormal data.
3. This invention uses a BiLSTM temporal modeling module to model the feature fusion results in a bidirectional temporal manner (forward and backward), accurately capturing the long-term temporal dependencies of monitoring data and effectively solving the problem of handling slowly changing anomalies such as baseline drift and trend terms. At the same time, an attention mechanism module is introduced, which calculates the correlation between each time step in the temporal features and the anomaly labeling interval, assigning higher weights to the anomaly and adjacent interval features and strengthening the feature expression of the anomaly interval. Compared with the traditional U-shaped CNN-LSTM architecture, the accuracy of mixed-mode anomaly repair is significantly improved.
4. This invention constructs a frequency-weighted hybrid loss function that combines time and frequency domains, simultaneously constraining the approximation of the repaired data to the real data in both time-domain values and frequency-domain spectra. This overcomes the shortcomings of existing methods that only use time-domain errors as the loss function and ignore frequency-domain feature constraints. A frequency-weighting strategy is introduced into the frequency-domain loss term, assigning higher weights to the low-frequency range and lower weights to the high-frequency range. This specifically strengthens the constraints on low-frequency trend features, effectively improving the repair accuracy of low-frequency anomalies such as baseline drift and trend terms, as well as mixed-mode anomalies.
5. This invention embeds an MC-dropout uncertainty assessment module into the decoder of a U-shaped network. During the inference phase, the dropout layer is kept on. By performing multiple independent random sampling reconstructions on the same input anomaly monitoring data, the standard deviation and coefficient of variation of T groups of candidate repair data are calculated, realizing the quantitative assessment of uncertainty in the data reconstruction process. This solves the problem that existing repair methods only output a single repair result and cannot judge the reliability of the repair result. It provides a reliability criterion for key scenarios such as structural state assessment and damage identification, and avoids engineering misjudgments caused by unreliable repair data.
6. This invention can be directly connected to existing health monitoring systems for various civil engineering structures. It is applicable to a variety of civil engineering structures such as bridge structures, building structures, reservoir dams, tunnel structures, rail transit structures, and slopes. It can handle common anomalies in engineering practice, such as missing data, baseline drift, trend terms, and mixed-mode anomalies, and has good versatility and engineering applicability.

Attached Figure Description

[0024] Figure 1 is a flowchart of a multi-anomaly data repair method based on U-shaped network and time-frequency domain weighted loss in an embodiment of the present invention; Figure 2 is a schematic diagram of the structure of the U-shaped CNN-BiLSTM-Attention network in an embodiment of the present invention; Figure 3 is a schematic diagram of single anomaly and mixed-mode anomaly data in an embodiment of the present invention, where the vertical axis represents strain (με) and the horizontal axis represents time (days); Figure 4 is a schematic diagram of the results of reconstructing single anomaly and mixed-mode anomaly data in an embodiment of the present invention, where the vertical axis represents strain (με) and the horizontal axis represents time (days).

Detailed Implementation

[0025] The present invention will now be further described in conjunction with the accompanying drawings and specific embodiments. It should be noted that, without conflict, the various embodiments or technical features described below can be arbitrarily combined to form new embodiments.

[0026] This invention provides a multi-anomaly data repair method based on U-shaped networks and time-frequency domain weighted loss. It improves the reliability of repair results by introducing MC-dropout technology to realize uncertainty assessment and analysis of data reconstruction, thereby enhancing the integrity and accuracy of monitoring data of various civil engineering structures under single anomaly and mixed-mode anomaly conditions. It has good versatility and scalability.

[0027] As shown in Figures 1-4, the multi-anomaly data repair method based on U-shaped network and time-frequency domain weighted loss according to an embodiment of the present invention includes the following steps: S1: Obtain multi-mode monitoring data of civil engineering structures, preprocess the multi-mode monitoring data, and label the single anomalies and mixed-mode anomalies to construct training sample sets and test sample sets.

[0028] S2: Construct a U-shaped CNN-BiLSTM-Attention network, which includes an encoder, a decoder, a skip connection module, a BiLSTM temporal modeling module, an attention mechanism module, and an MC-dropout uncertainty evaluation module embedded in the decoder.

[0029] S3: Use the encoder to extract features from the data in the training sample set to obtain high-level abstract features containing multi-scale features.

[0030] S4: The high-level abstract features are fused with the shallow features in the decoder through the skip connection module to obtain the feature fusion result.

[0031] S5: Input the feature fusion result into the BiLSTM temporal modeling module for bidirectional temporal modeling to obtain temporal features; then input the temporal features into the attention mechanism module for feature weighting to obtain enhanced temporal features.

[0032] S6: The enhanced temporal features are reconstructed using the decoder to obtain repaired data; a frequency-weighted hybrid loss function combining the time and frequency domains is constructed, and the loss value between the repaired data and the corresponding real data is calculated using the frequency-weighted hybrid loss function. The U-shaped CNN-BiLSTM-Attention network is trained and optimized to obtain the optimal data repair model.

[0033] S7: Use the optimal data repair model to repair the monitoring data in the test sample set that contain single anomalies and mixed-mode anomalies, obtain the repair results, and output the uncertainty quantification results of the repair results through the MC-dropout uncertainty assessment module.

[0034] The following will further elaborate on the multi-anomaly data repair method based on U-shaped network and time-frequency domain weighted loss in the embodiments of the present invention.

[0035] In this embodiment, the specific content of multi-mode monitoring data acquisition and preprocessing is as follows: Collect multi-mode monitoring data of various civil engineering structures, including time-series data such as strain, displacement, acceleration, tilt angle, temperature, and humidity of bridge structures; time-series data such as strain, settlement, and vibration of building structures; and time-series data such as displacement, seepage flow, and strain of reservoir dams. Perform targeted preprocessing on the collected multi-mode monitoring data, including: removing invalid and redundant data; using maximum-minimum normalization to eliminate dimensional differences between different types of monitoring data (strain, displacement, etc.); and combining manual annotation with intelligent recognition (based on a simple threshold method to initially screen abnormal intervals) to accurately annotate single anomalies such as missing data, baseline drift, and trend terms, as well as mixed-mode anomalies involving two or more coupled anomalies, ensuring that the annotation accuracy meets the subsequent multi-scale feature extraction requirements. Construct training and testing sets containing the single anomalies and mixed-mode anomalies.

[0036] In this embodiment, the specific content of constructing the U-shaped CNN-BiLSTM-Attention network is as follows: the main body of the network has a U-shaped symmetrical structure, including an encoder, decoder, skip connections, a BiLSTM temporal modeling module, an attention mechanism module, and an MC-dropout uncertainty assessment module. This adapts to the repair needs of various civil engineering structural single-anomaly and mixed-mode anomaly data, while simultaneously achieving data reconstruction uncertainty assessment, possessing good versatility and scalability. The MC-dropout uncertainty assessment module is embedded in the decoder of the U-shaped network, specifically located between the fully connected layer and the convolutional layer of the decoder. It is used to quantitatively assess the uncertainty of the data reconstruction process, providing a basis for judging the reliability of the repair results and avoiding the use of unreliable repaired data in critical scenarios such as structural state assessment.

[0037] In this embodiment, the specific content of multi-scale feature extraction using the encoder is as follows: dilated convolutional layers and basic CNN layers are embedded in the encoder.
1. Dilated convolution uses different dilation rates to expand the receptive field, enabling the capture of anomalous features at different scales (including single anomalous features and mixed-mode anomalous features) and laying the foundation for subsequent feature fusion. Its core calculation formula is:

$$y(i,j) = \sum_{m=0}^{K-1} \sum_{n=0}^{K-1} x(i + r \cdot m,\; j + r \cdot n)\, w(m,n)$$

where $y(i,j)$ is the pixel value of the output feature map of the dilated convolution; $x$ is the input feature map, i.e. the feature map of the civil engineering structure monitoring data initially extracted by the basic CNN layer; $w$ is the convolution kernel of size $K \times K$; $r$ is the dilation rate of the dilated convolution; $(m,n)$ is the two-dimensional index of the convolution kernel (with values ranging from 0 to $K-1$); and $K$ is the side length of the convolution kernel (in this invention, $K$ is either 3 or 5, an odd number, to ensure feature map size matching). This invention sets $r$ to 1, 2, and 4, which respectively capture different scale features of data missing, baseline drift, single trend term anomaly, and mixed-mode anomaly in civil engineering structure monitoring data, solving the problem that traditional CNNs cannot take into account both local details and global trends.

[0038] 2. The basic CNN layer completes the extraction of local basic features, providing a foundation for subsequent feature fusion and temporal modeling, and adapting to the feature extraction needs of various civil engineering structure monitoring data.
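How the dilation rate widens the receptive field can be seen in a small sketch. It is a one-dimensional simplification of the dilated convolution described in this embodiment, without padding. The kernel, the test signal, and the function names are assumptions chosen for illustration; only the dilation rates 1, 2, and 4 and the kernel length K = 3 come from the text.

```python
def dilated_conv1d(x, kernel, rate):
    """y[i] = sum_m x[i + rate * m] * kernel[m], valid positions only."""
    k = len(kernel)
    span = rate * (k - 1)               # receptive field is span + 1 samples
    return [sum(x[i + rate * m] * kernel[m] for m in range(k))
            for i in range(len(x) - span)]


def multi_scale_features(x, kernel, rates=(1, 2, 4)):
    """Apply the same kernel at several dilation rates in parallel."""
    return {rate: dilated_conv1d(x, kernel, rate) for rate in rates}


# Hypothetical signal with a step at t = 6, loosely mimicking a baseline shift.
signal = [0.0] * 6 + [1.0] * 6
kernel = [-1.0, 0.0, 1.0]               # simple difference kernel, K = 3 (assumed)

features = multi_scale_features(signal, kernel)
# rate 1 responds only right at the step; rate 4 responds across its whole window.
```

With rate 1 the kernel covers 3 samples and reacts only at the jump itself, while with rate 4 it covers 9 samples and registers the shift over a much wider window, which is why several rates are combined.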

[0039] In this embodiment, the feature fusion based on skip connections is as follows: through U-shaped skip connections, the high-level abstract features output by the encoder (including deep features of single anomalies and mixed-mode anomalies) are directly fused with the shallow features of the decoder, preserving detailed information and global trends, and improving the feature representation capability of mixed-mode anomaly data of various civil engineering structures.

[0040] In this embodiment, the specific content of BiLSTM bidirectional temporal modeling and attention feature weighting is as follows: the fused features are input into BiLSTM to model temporal dependencies from both forward and reverse directions, accurately capturing the evolution patterns of monitoring data (including data containing single anomalies and mixed-mode anomalies), adapting to the modeling needs of temporal anomalies such as baseline drift and trend terms in various civil engineering structure monitoring data, and ensuring the temporal accuracy of subsequent data repair. The core calculation formula is as follows:

Forward LSTM cell state update:

$$i_t = \sigma(W_{xi} x_t + W_{hi} h_{t-1} + b_i)$$
$$f_t = \sigma(W_{xf} x_t + W_{hf} h_{t-1} + b_f)$$
$$o_t = \sigma(W_{xo} x_t + W_{ho} h_{t-1} + b_o)$$
$$\tilde{c}_t = \tanh(W_{xc} x_t + W_{hc} h_{t-1} + b_c)$$
$$c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t$$
$$\overrightarrow{h}_t = o_t \odot \tanh(c_t)$$

Reverse LSTM cell state update: the time series is reversed, the calculation process is the same as above, and the output is $\overleftarrow{h}_t$.

BiLSTM final output (concatenated forward and reverse hidden states):

$$h_t^{\mathrm{Bi}} = \overrightarrow{h}_t \oplus \overleftarrow{h}_t$$

where $t$ is the time step index of the time series; $i_t$, $f_t$, and $o_t$ are the output values of the input gate, forget gate, and output gate (ranging from 0 to 1), respectively; $c_t$ is the cell state at the current time step (used to store long-term temporal information); $\tilde{c}_t$ is the candidate cell state at the current time step; $\overrightarrow{h}_t$ is the hidden state (output feature) of the forward LSTM at the current time step; $\overleftarrow{h}_t$ is the hidden state of the reverse LSTM at the current time step; $x_t$ is the feature vector input to the BiLSTM at the current time step (i.e., the features of civil engineering structure monitoring data after skip connections); $h_{t-1}$ is the hidden state from the previous time step; $W_{x\cdot}$ and $W_{h\cdot}$ are the weight matrices corresponding to the input and the hidden state; $b_i$, $b_f$, $b_o$, and $b_c$ are the bias terms of the corresponding gates; $\sigma$ is the sigmoid activation function (used for gate switch control); $\tanh$ is the hyperbolic tangent activation function (used for feature normalization); $\odot$ is the element-wise multiplication operation; and $\oplus$ is the feature concatenation operation (concatenating the forward and reverse hidden states along the channel dimension to enhance the representation of temporal features).

[0041] The above BiLSTM cell state update formula is specifically adapted to the strong temporal characteristics of the civil engineering structure monitoring data processed by this invention. The initial bias terms of the input gate and forget gate are adjusted (b_i and b_f are initialized to values in the range 0.1 to 0.3), which enhances the ability to capture the temporal dependence of slowly varying anomalies such as baseline drift and trend terms. This differs from the settings used when a general BiLSTM is applied to other time series data (such as meteorological or power data).
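As a rough illustration of the bidirectional pass and the biased gate initialization described above, the following Python sketch runs a toy LSTM forward and backward over a short sequence and concatenates the two hidden states per time step. It uses only the standard library. The helper names (`make_params`, `lstm_step`, `run_lstm`, `bilstm`), the weight naming and the random initialization are assumptions made for illustration; only the gate structure and the 0.1 to 0.3 bias range come from the description.

```python
import math
import random

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def affine(Wx, Wh, b, x, h):
    # Row-wise Wx @ x + Wh @ h + b on plain Python lists.
    return [
        sum(w * xv for w, xv in zip(Wx[r], x))
        + sum(u * hv for u, hv in zip(Wh[r], h))
        + b[r]
        for r in range(len(b))
    ]

def make_params(input_dim, hidden, gate_bias=0.2, seed=0):
    # Hypothetical initialization: small random weights; the input and forget
    # gate biases start at gate_bias (the description gives 0.1 to 0.3).
    rng = random.Random(seed)
    p = {}
    for g in ("i", "f", "o", "c"):
        p["Wx_" + g] = [[rng.uniform(-0.1, 0.1) for _ in range(input_dim)]
                        for _ in range(hidden)]
        p["Wh_" + g] = [[rng.uniform(-0.1, 0.1) for _ in range(hidden)]
                        for _ in range(hidden)]
        p["b_" + g] = [gate_bias if g in ("i", "f") else 0.0] * hidden
    return p

def lstm_step(x, h_prev, c_prev, p):
    def gate(g, act):
        pre = affine(p["Wx_" + g], p["Wh_" + g], p["b_" + g], x, h_prev)
        return [act(v) for v in pre]
    i, f, o = gate("i", sigmoid), gate("f", sigmoid), gate("o", sigmoid)
    c_cand = gate("c", math.tanh)
    # Cell state: keep part of the old state, add part of the candidate.
    c = [fv * cp + iv * cc for fv, cp, iv, cc in zip(f, c_prev, i, c_cand)]
    h = [ov * math.tanh(cv) for ov, cv in zip(o, c)]
    return h, c

def run_lstm(xs, p, hidden):
    h, c, out = [0.0] * hidden, [0.0] * hidden, []
    for x in xs:
        h, c = lstm_step(x, h, c, p)
        out.append(h)
    return out

def bilstm(xs, p_fwd, p_bwd, hidden):
    fwd = run_lstm(xs, p_fwd, hidden)
    # Run on the reversed sequence, then flip back to the original time order.
    bwd = run_lstm(xs[::-1], p_bwd, hidden)[::-1]
    # Concatenate forward and reverse hidden states at each time step.
    return [hf + hb for hf, hb in zip(fwd, bwd)]

xs = [[0.1 * t, 1.0 - 0.1 * t] for t in range(5)]  # 5 steps, 2 input features
features = bilstm(xs, make_params(2, 3, seed=1), make_params(2, 3, seed=2), 3)
```

With a hidden size of 3, each entry of `features` has 6 values: 3 from the forward pass and 3 from the reverse pass. In practice a framework layer such as a bidirectional LSTM would replace these hand-written loops.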

[0042] The temporal features output by the BiLSTM are input into the attention mechanism module. The attention mechanism calculates the correlation between the features at each time step and the anomaly labeling interval, and assigns higher weights to the features of the anomaly interval and its adjacent intervals in order to improve the repair accuracy of mixed-mode anomalies. Its core logic is as follows: for the hidden state output by the BiLSTM at each time step, an attention weight is calculated in combination with the anomaly labeling information, and the weight increases with the correlation between the features and the anomaly interval. The hidden states are then weighted by the attention weights and summed to output the enhanced temporal features, which provide more accurate feature support for subsequent data repair.
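The description does not give the attention score formula, so the following Python sketch is only one possible reading of it. It assumes a dot-product score against a hypothetical `query` vector, adds a fixed `bonus` to time steps inside or next to a labeled anomaly interval, and normalizes with a softmax. The names `anomaly_proximity`, `attention_weights`, `weighted_sum`, `margin` and `bonus` are all illustrative assumptions.

```python
import math

def anomaly_proximity(mask, margin):
    # 1.0 for steps inside a labeled anomaly or within `margin` steps of one.
    n = len(mask)
    return [
        1.0 if any(mask[j] for j in range(max(0, t - margin), min(n, t + margin + 1)))
        else 0.0
        for t in range(n)
    ]

def attention_weights(hidden, query, mask, margin=1, bonus=2.0):
    # Assumed score: dot product with a query vector plus a bonus near anomalies.
    prox = anomaly_proximity(mask, margin)
    scores = [
        sum(h * q for h, q in zip(h_t, query)) + bonus * p
        for h_t, p in zip(hidden, prox)
    ]
    top = max(scores)                      # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def weighted_sum(hidden, weights):
    # Weighted sum of the hidden states over time.
    dim = len(hidden[0])
    return [sum(w * h_t[d] for w, h_t in zip(weights, hidden)) for d in range(dim)]

hidden = [[0.1, 0.2] for _ in range(6)]    # toy BiLSTM outputs, 6 time steps
mask = [0, 0, 0, 1, 0, 0]                  # step 3 is labeled anomalous
weights = attention_weights(hidden, query=[1.0, 1.0], mask=mask)
context = weighted_sum(hidden, weights)
```

Because the toy hidden states are identical, the weights differ only through the anomaly bonus: steps 2 to 4 receive more weight than the remaining steps.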

[0043] In this embodiment, the uncertainty assessment and analysis of MC-dropout data reconstruction is carried out as follows: the MC-dropout uncertainty assessment module embedded in the U-shaped network decoder quantitatively assesses the uncertainty of the data reconstruction process. The specific implementation process is as follows:

1. MC-dropout module settings: dropout layers are embedded in the fully connected layers and convolutional layers of the U-shaped network decoder, and the dropout probability p is set to 0.1~0.3. This range suits the feature complexity of civil engineering structure monitoring data: too high a dropout probability causes feature loss, while too low a probability prevents effective uncertainty assessment.

2. Training phase: the dropout layers work normally, randomly discarding some neurons to regularize the network, improve its generalization ability, and lay the foundation for subsequent uncertainty assessment.

3. Inference phase: the dropout layers remain enabled, with the dropout probability p kept the same as in the training phase. The same input anomalous monitoring data is reconstructed in T independent random sampling passes (T≥50; the more samples, the more accurate the uncertainty assessment; this invention preferably uses T=100), giving T different repair results $\hat{y}_i^{(t)}$ (i is the time-domain data index, t=1,2,...,T).

4. Uncertainty quantification: the uncertainty is assessed quantitatively from the statistical characteristics of the T groups of repair results. The core indicators are the standard deviation and the coefficient of variation. The standard deviation characterizes the dispersion of the repair results, while the coefficient of variation removes the influence of the magnitude of the repaired data, enabling a unified assessment across different types of monitoring data.

The specific calculation formulas are as follows:

1) Mean of the repair results (used as the final repair output value):

$$\mu_i = \frac{1}{T}\sum_{t=1}^{T}\hat{y}_i^{(t)}$$

2) Standard deviation (characterizes the degree of dispersion; the larger the value, the higher the uncertainty):

$$\sigma_i = \sqrt{\frac{1}{T}\sum_{t=1}^{T}\left(\hat{y}_i^{(t)} - \mu_i\right)^2}$$

3) Coefficient of variation (normalized uncertainty index):

$$CV_i = \frac{\sigma_i}{\left|\mu_i\right| + \varepsilon}$$

where ε is a very small constant (ε=1e-8) used to avoid a zero denominator.

4) Reliability determination: the coefficient of variation threshold is set to CV_th=0.1. When CV_i≤0.1, the repair result at that time step is judged to have low uncertainty and to be reliable, and it can be used directly for subsequent structural status assessment. When CV_i>0.1, the repair result at that time step is judged to have high uncertainty and must be further verified against the original monitoring data and the repair results of adjacent time steps to avoid misjudgment.
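The statistics in steps 3 and 4 can be sketched in a few lines of Python. The functions `mc_statistics` and `reliable` below follow the mean, standard deviation, coefficient of variation and CV_th=0.1 rule from the description. Two details are assumptions: the standard deviation divides by T (not T-1), and the absolute value of the mean is used in the denominator. The function `toy_dropout_pass` is a purely hypothetical stand-in for one dropout-enabled forward pass of the trained network.

```python
import math
import random

def mc_statistics(samples, eps=1e-8):
    """samples: T repair results, each a list of N values."""
    T, N = len(samples), len(samples[0])
    mean = [sum(s[i] for s in samples) / T for i in range(N)]
    # Assumption: population standard deviation (divide by T).
    std = [math.sqrt(sum((s[i] - mean[i]) ** 2 for s in samples) / T)
           for i in range(N)]
    # eps keeps the denominator away from zero.
    cv = [sd / (abs(m) + eps) for sd, m in zip(std, mean)]
    return mean, std, cv

def reliable(cv, cv_th=0.1):
    # True where the repair at that time step counts as low-uncertainty.
    return [c <= cv_th for c in cv]

def toy_dropout_pass(x, p, rng, units=200):
    # Hypothetical stand-in for the network: each value is carried by `units`
    # identical neurons with inverted dropout, so outputs fluctuate around x.
    out = []
    for v in x:
        kept = sum(1 for _ in range(units) if rng.random() >= p)
        out.append(v * kept / (units * (1.0 - p)))
    return out

rng = random.Random(0)
x = [0.2, 0.5, 0.9]                                   # normalized toy input
samples = [toy_dropout_pass(x, p=0.2, rng=rng) for _ in range(100)]  # T = 100
mean, std, cv = mc_statistics(samples)
flags = reliable(cv)
```

Time steps whose flag is `False` would be passed on to the secondary verification that the description mentions. That verification procedure is not specified in the text and is not sketched here.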

[0044] This invention uses MC-dropout technology to assess the uncertainty of data reconstruction, solving the problem that existing repair methods cannot determine the reliability of repair results. It is especially suitable for scenarios with high reliability requirements, such as civil engineering structural monitoring data, thus improving the engineering applicability of the method.

[0045] In this embodiment, the frequency-weighted time-frequency domain hybrid loss function is optimized as follows: a frequency-weighted hybrid loss function composed of a time-domain loss and a frequency-domain loss is constructed. By constraining numerical accuracy and spectral characteristics at the same time, it improves the repair accuracy for single-anomaly data of various civil engineering structures and specifically optimizes the repair of mixed-mode anomaly data, addressing the shortcomings of traditional loss functions defined in the time domain alone when repairing mixed-mode anomalies. The core formula is:

$$Loss_{total} = \alpha \cdot Loss_{time} + \beta \cdot Loss_{freq}$$

where $\alpha$ and $\beta$ are weighting coefficients (each in the range 0.1 to 0.9) that balance the contributions of the time-domain and frequency-domain losses and can be adjusted dynamically according to the repair needs of missing data, baseline drift, single trend term anomalies, and mixed-mode anomalies in civil engineering structure monitoring data; $Loss_{total}$ is the total network loss (used for network parameter optimization); $Loss_{time}$ is the time-domain loss term; and $Loss_{freq}$ is the frequency-weighted frequency-domain loss term.

[0046] The time-domain loss term (mean squared error, MSE, which can be replaced by mean absolute error, MAE) constrains the deviation between the repaired data and the actual monitoring data in the time domain, thereby improving the numerical accuracy of data repair for various civil engineering structures:

$$Loss_{time} = \frac{1}{N}\sum_{i=1}^{N}\left(y_i - \hat{y}_i\right)^2$$

where $N$ is the total length of the time-domain data (i.e., the total number of time steps in a single monitoring data sample); $i$ is the index of the time-domain data (ranging from 1 to N); $y_i$ is the actual civil engineering structure monitoring data at the i-th time step; and $\hat{y}_i$ is the data repaired by the model at the i-th time step.

[0047] The frequency-weighted frequency-domain loss term (constructed from the Fast Fourier Transform, FFT) constrains the consistency of the repaired data with the actual monitoring data in terms of spectrum and energy distribution. It specifically addresses the spectral distortion that arises when repairing trend terms and mixed-mode anomalies in civil engineering structure monitoring data. Its core formula is:

$$Loss_{freq} = \frac{1}{M}\sum_{k=1}^{M}\omega_k\sqrt{\left(\mathrm{Re}(Y_k) - \mathrm{Re}(\hat{Y}_k)\right)^2 + \left(\mathrm{Im}(Y_k) - \mathrm{Im}(\hat{Y}_k)\right)^2}, \qquad Y = \mathrm{FFT}(y),\quad \hat{Y} = \mathrm{FFT}(\hat{y})$$

where $\mathrm{FFT}(\cdot)$ is the Fast Fourier Transform, which converts time-domain data into frequency-domain data; $M$ is the total length of the frequency-domain data (equal to the length N of the time-domain data); $k$ is the frequency index of the frequency-domain data (ranging from 1 to M); $Y_k$ is the complex value at the k-th frequency point of the FFT of the real data $y$; $\hat{Y}_k$ is the complex value at the k-th frequency point of the FFT of the repaired data $\hat{y}$; $\mathrm{Re}(\cdot)$ extracts the real part of a complex number; $\mathrm{Im}(\cdot)$ extracts the imaginary part of a complex number; the square root gives the Euclidean distance between the two complex numbers, characterizing the spectral difference; and $\omega_k$ is the weighting coefficient for frequency point k. The weights are determined using a Gaussian weighting strategy that assigns higher weights (ω_k = 0.8~1.0) to the low-frequency range (k ≤ M / 4) and lower weights (ω_k = 0.1~0.3) to the high-frequency range (k > M / 4). Because anomalies in civil engineering structure monitoring data (especially baseline drift and trend terms) are mainly reflected in low-frequency trends, suppressing the high-frequency weights reduces the impact of noise on repair accuracy. This differs from the uniformly weighted frequency-domain loss used in existing technologies.

Through joint constraints in the time and frequency domains, the repaired data conforms to the numerical characteristics of the original data while retaining its spectral characteristics, meeting the repair needs of the various types of anomalous data found in civil engineering structures.
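To make the combined loss concrete, here is a minimal Python sketch of a time-domain term, a frequency-weighted spectral term and their weighted sum. Several choices are assumptions rather than the patented formula: the spectral term is averaged over M, the weights are piecewise constants (1.0 for k ≤ M/4, 0.2 otherwise) picked from the stated ranges because the exact Gaussian profile is not given, and a naive DFT replaces the FFT so that the example needs only the standard library (`numpy.fft.fft` would be the usual choice in practice). The 1-based index k in the description is mapped to DFT bin k-1.

```python
import cmath
import math

def dft(x):
    # Naive O(N^2) discrete Fourier transform, used here in place of an FFT.
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]

def time_loss(y, y_hat, kind="mse"):
    if kind == "mse":
        return sum((a - b) ** 2 for a, b in zip(y, y_hat)) / len(y)
    return sum(abs(a - b) for a, b in zip(y, y_hat)) / len(y)  # MAE

def frequency_weights(m, low=1.0, high=0.2):
    # Assumed piecewise weights: k <= m/4 counts as low frequency (k is 1-based).
    return [low if k <= m / 4 else high for k in range(1, m + 1)]

def freq_loss(y, y_hat, weights=None):
    Y, Y_hat = dft(y), dft(y_hat)
    m = len(Y)
    if weights is None:
        weights = frequency_weights(m)
    # Weighted Euclidean distance between the complex spectra, averaged over m.
    return sum(
        w * math.sqrt((a.real - b.real) ** 2 + (a.imag - b.imag) ** 2)
        for w, a, b in zip(weights, Y, Y_hat)
    ) / m

def total_loss(y, y_hat, alpha=0.4, beta=0.6):
    return alpha * time_loss(y, y_hat) + beta * freq_loss(y, y_hat)

y = [0.0, 1.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0]   # toy "true" signal
y_hat = [v + 0.5 for v in y]                      # repair with a constant offset
loss = total_loss(y, y_hat)
```

In this toy case the constant offset appears only in the lowest frequency bin, which carries the highest weight, so the spectral term penalizes it strongly. That is the behaviour the description aims for with baseline drift.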

[0048] In this embodiment, model inference and data repair are carried out as follows: the trained model is used to repair the various types of multi-mode monitoring data of civil engineering structures, including single anomalies such as missing data, baseline drift, and trend anomalies, as well as mixed-mode anomalies. The mean of the T groups of repair results is output as the final repaired data, and the coefficient of variation (CV_i) at each time step is output as the uncertainty assessment result. Based on the CV_i reliability criterion, reliable repaired data is selected for subsequent structural status assessment, damage identification and safety early warning, thereby improving the reliability of engineering applications. The model can be adapted to various civil engineering structures without significant changes to the network structure: efficient repair is achieved simply by adjusting the input feature dimensions to match the monitoring data types of each structure, which gives the method strong versatility.

[0049] The multi-anomaly data repair method based on U-shaped network and time-frequency domain weighted loss in this invention has the following advantages:

1. By combining the U-shaped structure with dilated convolution, multi-scale fusion of high- and low-dimensional features is achieved. The network accurately captures single-anomaly features such as missing data, baseline drift, and trend terms in the monitoring data of various civil engineering structures, and also effectively extracts the complex features of mixed-mode anomalies, significantly improving the ability to capture local mutations and global trends.

2. BiLSTM bidirectional temporal modeling is combined with an attention mechanism: the BiLSTM accurately captures the strong temporal characteristics and long-term dependencies of civil engineering structural monitoring data, while the attention mechanism enhances the representation of anomaly interval features, meeting the need to capture the temporal patterns of baseline drift, trend terms, and mixed-mode anomaly data. Compared to the traditional U-shaped CNN-LSTM architecture, this invention adds an attention mechanism, improving the accuracy of mixed-mode anomaly repair by more than 15%.

3. The frequency-weighted time-frequency domain hybrid loss function constrains both time-domain numerical values and frequency-domain features. By strengthening the constraint on low-frequency trend features through the frequency-weighting strategy, it significantly improves the repair accuracy for monitoring data of various civil engineering structures containing single anomalies such as missing data, baseline drift, and trend terms, as well as mixed-mode anomalies. Compared to a frequency-domain loss function using uniform weights, the frequency-weighting strategy of this invention improves the spectral correlation coefficient of baseline drift anomaly repair by more than 0.08.

4. The MC-dropout technique is introduced to quantitatively assess the uncertainty of data reconstruction. The reliability of the repair result is judged by two indicators: standard deviation and coefficient of variation. This solves the problem that the existing technology outputs only a single repair result and cannot assess its reliability. It avoids the risk of engineering misjudgment caused by unreliable repair data and improves the engineering practicality of the method.

5. This invention can be connected directly to the existing health monitoring systems of various civil engineering structures to achieve automated, real-time data repair. It handles the single and mixed anomalies commonly seen in engineering practice and is compatible with multiple types of civil engineering structures, making it highly practical and versatile.

[0050] The following section uses health monitoring data of a certain bridge structure as an example to provide a detailed explanation of the multi-anomaly data repair method based on U-shaped network and time-frequency domain weighted loss in this embodiment of the invention.

[0051] 1. Collect and preprocess multi-mode monitoring data: Taking various civil engineering structures (such as bridge structures like beam bridges, arch bridges, and cable-stayed bridges; building structures like frame structures and shear wall structures; and reservoir dam structures like gravity dams and arch dams) as examples, corresponding monitoring data are collected: strain, acceleration, and displacement are collected for bridge structures; strain and settlement data are collected for building structures; and displacement and seepage flow data are collected for reservoir dams. The sampling frequency is 1Hz to 100Hz (to meet the needs of multi-scale feature extraction of dilated convolution).

[0052] The raw data is processed as follows: invalid and redundant data are removed; combined with manual annotation and intelligent identification (based on a simple threshold method to initially screen out abnormal intervals), the missing data, baseline drift, trend term single anomalies and mixed-mode anomalies are identified; maximum-minimum normalization is performed to eliminate the influence of different dimensions of monitoring data; and a sample set containing single anomalies and mixed-mode anomalies is constructed by combining artificial synthesis and real anomaly annotation to adapt to the monitoring data characteristics of various civil engineering structures.
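A minimal Python sketch of two of the preprocessing steps is given below: max-min normalization and a simple threshold screen for candidate anomaly intervals. The function names, the use of `None` or NaN to mark missing samples, and the threshold values are illustrative assumptions. The description does not specify how the thresholds are chosen.

```python
import math

def is_missing(v):
    return v is None or math.isnan(v)

def min_max_normalize(series):
    # Scale valid samples to [0, 1]; missing samples stay as None.
    valid = [v for v in series if not is_missing(v)]
    lo, hi = min(valid), max(valid)
    span = hi - lo
    return [
        None if is_missing(v) else ((v - lo) / span if span > 0 else 0.0)
        for v in series
    ]

def screen_intervals(series, lower, upper):
    # Return inclusive (start, end) index pairs where samples are missing
    # or fall outside the assumed plausible range [lower, upper].
    intervals, start = [], None
    for i, v in enumerate(series):
        bad = is_missing(v) or v < lower or v > upper
        if bad and start is None:
            start = i
        elif not bad and start is not None:
            intervals.append((start, i - 1))
            start = None
    if start is not None:
        intervals.append((start, len(series) - 1))
    return intervals

raw = [10.0, 12.0, None, None, 14.0, 50.0, 11.0]      # toy strain record
candidates = screen_intervals(raw, lower=5.0, upper=20.0)
normalized = min_max_normalize(raw)
```

The screened intervals are only candidates. In the described workflow they are combined with manual annotation before the labeled sample set is built.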

[0053] 2. Construct a U-shaped CNN-BiLSTM-Attention network: Encoder: Convolutional layer + dilated convolutional layer + pooling layer, used to extract multi-scale features of single anomalies (data missing, baseline drift, trend terms) and mixed-mode anomalies in various civil engineering structures.

[0054] Decoder: Deconvolution / upsampling layer + dropout layer + fully connected layer, where the dropout layer is embedded between the deconvolution layer and the fully connected layer, forming the MC-dropout uncertainty assessment module, which is used to realize the uncertainty assessment of data reconstruction.

[0055] Skip connection: The corresponding layer features of the encoder are concatenated with the features of the decoder to achieve the fusion of high and low layer features.

[0056] The end layer sequentially connects to a BiLSTM layer for bidirectional temporal modeling and an attention mechanism module for feature weighting, capturing temporal dependencies, strengthening the feature expression of abnormal intervals, and adapting to the temporal characteristics of various civil engineering structure monitoring data.

[0057] 3. Perform multi-scale feature extraction: The dilation rate of the dilated convolution is set to 1, 2, 3, and 4 to achieve different receptive field feature extraction, thereby capturing the features of missing data, baseline drift, single anomaly of trend terms, and mixed mode anomaly in various types of civil engineering structure monitoring data. The basic CNN layer completes local filtering and basic feature extraction, providing support for subsequent feature fusion and adapting to the feature extraction needs of monitoring data of different types of civil engineering structures.
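The effect of the dilation rates 1, 2, 3 and 4 can be illustrated with a small one-dimensional dilated convolution in Python. This is a sketch under assumptions: a kernel size of 3, zero padding that keeps the output length equal to the input length, and parallel branches (one per rate). The function names are hypothetical.

```python
def receptive_field(kernel_size, dilation):
    # Number of input steps spanned by one dilated kernel.
    return (kernel_size - 1) * dilation + 1

def dilated_conv1d(x, kernel, dilation):
    # Cross-correlation with zero padding so the output has len(x) samples.
    left = (len(kernel) - 1) * dilation // 2
    out = []
    for t in range(len(x)):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = t - left + j * dilation
            if 0 <= idx < len(x):
                acc += w * x[idx]
        out.append(acc)
    return out

def multi_scale_features(x, kernel, rates=(1, 2, 3, 4)):
    # One output channel per dilation rate, computed in parallel branches.
    return [dilated_conv1d(x, kernel, d) for d in rates]

impulse = [0, 0, 0, 0, 1, 0, 0, 0, 0]
channels = multi_scale_features(impulse, kernel=[1.0, 1.0, 1.0])
fields = [receptive_field(3, d) for d in (1, 2, 3, 4)]   # [3, 5, 7, 9]
```

With a kernel size of 3, the four rates see 3, 5, 7 and 9 consecutive samples respectively, so the same layer depth covers both short local mutations and slower trends.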

[0058] 4. Perform feature fusion: By using skip connections, high-level semantic features (including deep features of various civil engineering structural data missing, baseline drift, single anomalies of trend terms and mixed-mode anomalies) are fused with shallow detailed features to improve feature representation capabilities and ensure that key anomaly-related features are not lost during the repair process.
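The two fusion options named in the claims, channel concatenation and element-wise addition, can be sketched as follows. Feature maps are represented as lists of time steps, each holding a list of channel values. This layout and the function names are assumptions for illustration, and the encoder and decoder features are assumed to have already been brought to the same length.

```python
def fuse_concat(enc, dec):
    # Concatenate encoder and decoder channels at each time step.
    if len(enc) != len(dec):
        raise ValueError("encoder and decoder features must have equal length")
    return [e + d for e, d in zip(enc, dec)]

def fuse_add(enc, dec):
    # Element-wise addition; requires the same number of channels.
    if len(enc) != len(dec) or any(len(e) != len(d) for e, d in zip(enc, dec)):
        raise ValueError("encoder and decoder features must have equal shape")
    return [[a + b for a, b in zip(e, d)] for e, d in zip(enc, dec)]

enc = [[1.0, 2.0], [3.0, 4.0]]   # 2 time steps, 2 channels (encoder)
dec = [[0.5, 0.5], [1.0, 1.0]]   # 2 time steps, 2 channels (decoder)
fused_cat = fuse_concat(enc, dec)   # 4 channels per step
fused_add = fuse_add(enc, dec)      # 2 channels per step
```

Concatenation keeps both feature sets intact at the cost of a wider tensor, while addition keeps the channel count unchanged.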

[0059] 5. Bidirectional temporal modeling and attention feature weighting: the BiLSTM traverses the time series in both the forward and backward directions and outputs deep temporal features that capture dependencies on both preceding and subsequent samples. The initial bias terms of the input gate and forget gate of the BiLSTM are set to 0.2 (to capture slowly varying anomalies in civil engineering structures), so that the temporal evolution patterns of baseline drift, single trend term anomalies, and mixed-mode anomalies in the monitoring data of various civil engineering structures are captured accurately. The attention mechanism module calculates the correlation between the features at each time step and the labeled anomaly intervals, assigns higher weights (0.8~1.0) to the features of the anomaly interval and its adjacent intervals, and assigns lower weights (0.1~0.3) to normal interval features, thereby strengthening the expression of anomaly interval features and improving the accuracy of mixed-mode anomaly repair.

[0060] 6. Uncertainty assessment and analysis of MC-dropout data reconstruction: during the inference phase, the dropout layer is kept on (probability p=0.2). The same input anomalous monitoring data is reconstructed in 100 independent random sampling passes, giving 100 sets of repair results. The mean, standard deviation, and coefficient of variation CV_i of the repair results are calculated at each time step. With CV_th=0.1, repair data with CV_i≤0.1 is selected as reliable. For time steps with CV_i>0.1, the repair results undergo a secondary verification against the normal data of adjacent time steps to ensure their reliability.

[0061] 7. Frequency-weighted time-frequency domain hybrid loss function: the loss is set as Loss_total = α·Loss_time + β·Loss_freq, where Loss_time is the MSE (mean squared error) or MAE (mean absolute error), used to constrain how closely the repaired data approximates the true data in the time domain. It suits the numerical repair of data with missing values, baseline drift, single trend term anomalies, and mixed-mode anomalies for various civil engineering structures. Its core formulas are:

MSE (mean squared error):

$$Loss_{time} = \frac{1}{N}\sum_{i=1}^{N}\left(y_i - \hat{y}_i\right)^2$$

MAE (mean absolute error):

$$Loss_{time} = \frac{1}{N}\sum_{i=1}^{N}\left|y_i - \hat{y}_i\right|$$

where $N$ is the total length of the time-domain data; $i$ is the index of the time-domain data; $y_i$ is the actual civil engineering structure monitoring data at the i-th time step; $\hat{y}_i$ is the data repaired by the model at the i-th time step; $(y_i - \hat{y}_i)^2$ is the squared difference between the real data and the repaired data; and $|y_i - \hat{y}_i|$ is the absolute difference between the real data and the repaired data.

[0062] Loss_freq is constructed from the FFT (Fast Fourier Transform) spectral error and introduces a frequency weighting strategy to constrain the consistency of the spectrum and energy distribution of the repaired data with those of the real data in the frequency domain. It specifically improves the repair of single anomalies and mixed-mode anomalies in the monitoring data of various civil engineering structures and avoids spectral distortion in the repaired data. Its core formula is:

$$Loss_{freq} = \frac{1}{M}\sum_{k=1}^{M}\omega_k\sqrt{\left(\mathrm{Re}(Y_k) - \mathrm{Re}(\hat{Y}_k)\right)^2 + \left(\mathrm{Im}(Y_k) - \mathrm{Im}(\hat{Y}_k)\right)^2}, \qquad Y = \mathrm{FFT}(y),\quad \hat{Y} = \mathrm{FFT}(\hat{y})$$

where $\mathrm{FFT}(\cdot)$ is the Fast Fourier Transform operation; $M$ is the total length of the frequency-domain data; $k$ is the frequency index; $Y_k$ and $\hat{Y}_k$ are the complex values at the k-th frequency point of the FFT of the real data and of the repaired data (monitoring data of various civil engineering structures), respectively; $\mathrm{Re}(\cdot)$ and $\mathrm{Im}(\cdot)$ extract the real and imaginary parts of a complex number, respectively; the square root gives the Euclidean distance between the two complex numbers, characterizing the spectral difference; and $\omega_k$ is the weighting coefficient for frequency point k, determined by a Gaussian weighting strategy that assigns higher weights (ω_k=0.8~1.0) to the low-frequency range (k≤M / 4) and lower weights (ω_k=0.1~0.3) to the high-frequency range (k>M / 4). The weighting coefficients α and β are adjusted dynamically according to the repair needs of missing data, baseline drift, single trend term anomalies, and mixed-mode anomalies in the monitoring data of various civil engineering structures. Each takes a value in the range 0.1–0.9, and they satisfy α + β = 1. In this embodiment, for the repair of mixed-mode anomaly data, α=0.4 and β=0.6 can be selected to strengthen the frequency-domain feature constraint and improve the repair accuracy of mixed-mode anomaly data; for the repair of single-anomaly data, α=0.6 and β=0.4 can be selected to balance the repair of time-domain values and frequency-domain features.

This parameter setting is suitable for the repair of monitoring data of various civil engineering structures.
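The two weight settings from this embodiment can be captured in a small lookup, sketched below in Python. The dictionary keys `"mixed"` and `"single"` and the function names are hypothetical. The range check reflects the stated 0.1 to 0.9 bounds.

```python
# Weight presets taken from this embodiment: (alpha, beta).
PRESETS = {
    "mixed": (0.4, 0.6),    # stronger frequency-domain constraint
    "single": (0.6, 0.4),   # stronger time-domain constraint
}

def loss_weights(anomaly_kind):
    alpha, beta = PRESETS[anomaly_kind]
    for w in (alpha, beta):
        if not 0.1 <= w <= 0.9:
            raise ValueError("weights must lie in [0.1, 0.9]")
    return alpha, beta

def combine_losses(loss_time, loss_freq, anomaly_kind):
    # Loss_total = alpha * Loss_time + beta * Loss_freq
    alpha, beta = loss_weights(anomaly_kind)
    return alpha * loss_time + beta * loss_freq

mixed_total = combine_losses(0.25, 0.5, "mixed")
single_total = combine_losses(0.25, 0.5, "single")
```

For the same pair of loss terms, the mixed-mode preset gives more influence to the spectral error than the single-anomaly preset does.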

[0063] 8. Model Training and Repair: The Adam optimizer was used, with a learning rate of 1e-4 to 1e-3, and training continued until the loss function converged. The model repairs various civil engineering structural data, including those with missing data, baseline drift, single trend term anomalies, and mixed-mode anomalies, within the test set. It also performs MC-dropout uncertainty assessment and outputs the final repaired data and uncertainty indicators. Evaluation metrics include RMSE (Root Mean Square Error), MAE (Mean Absolute Error), R² (Coefficient of Determination), spectral correlation coefficient, and uncertainty assessment accuracy. The model's effectiveness in repairing various types of anomaly data in civil engineering structures and its reliability in uncertainty assessment are verified to be at a high level. Specifically, the accuracy of mixed-mode anomaly repair is improved by more than 15% compared to existing technologies, the spectral correlation coefficient for baseline drift anomaly repair is improved by more than 0.08, and the uncertainty assessment accuracy is ≥95%.
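The evaluation metrics listed above can be sketched in Python as follows. RMSE, MAE and R² use their standard definitions. The description does not define the spectral correlation coefficient, so the sketch assumes it is the Pearson correlation between the amplitude spectra of the true and repaired signals. A naive DFT is used so that the example depends only on the standard library. The uncertainty assessment accuracy is not defined in the text and is not sketched.

```python
import cmath
import math

def rmse(y, y_hat):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y, y_hat)) / len(y))

def mae(y, y_hat):
    return sum(abs(a - b) for a, b in zip(y, y_hat)) / len(y)

def r2(y, y_hat):
    mean_y = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, y_hat))
    ss_tot = sum((a - mean_y) ** 2 for a in y)
    return 1.0 - ss_res / ss_tot

def amplitude_spectrum(x):
    # Magnitudes of a naive O(N^2) DFT (stand-in for an FFT).
    n = len(x)
    return [
        abs(sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n)))
        for k in range(n)
    ]

def pearson(a, b):
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    var_a = sum((u - ma) ** 2 for u in a)
    var_b = sum((v - mb) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)

def spectral_correlation(y, y_hat):
    # Assumed definition: correlation between the two amplitude spectra.
    return pearson(amplitude_spectrum(y), amplitude_spectrum(y_hat))

y_true = [1.0, 2.0, 3.0, 4.0]
y_rep = [1.0, 2.0, 3.0, 5.0]
scores = {
    "rmse": rmse(y_true, y_rep),
    "mae": mae(y_true, y_rep),
    "r2": r2(y_true, y_rep),
    "spectral_corr": spectral_correlation(y_true, y_rep),
}
```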

[0064] The multi-anomaly data repair method based on U-shaped network and time-frequency domain weighted loss in this invention can be directly applied to the repair of monitoring data for various civil engineering structures such as bridge structures, building structures, reservoir dams, tunnel structures, rail transit structures, and slopes. It does not require changing the core network structure; efficient repair can be achieved simply by adjusting the input feature dimensions according to the monitoring data types of different structures. At the same time, the sampling number and dropout probability of MC-dropout can be adjusted according to engineering needs to adapt to uncertainty assessment scenarios with different accuracy requirements.

[0065] The above is a detailed description of the preferred embodiments of the present invention. However, the present invention is not limited to the embodiments described. Those skilled in the art can make various equivalent modifications or substitutions without departing from the spirit of the present invention. All such equivalent modifications or substitutions are included within the scope defined by the claims of this application.

Claims

1. A method for repairing multi-anomaly data based on a U-shaped network and a time-frequency domain weighted loss, characterized in that it includes the following steps:
S1: Acquire multi-mode monitoring data of civil engineering structures, preprocess the multi-mode monitoring data, label the single anomalies and mixed-mode anomalies, and construct training sample sets and test sample sets;
S2: Construct a U-shaped CNN-BiLSTM-Attention network, which includes an encoder, a decoder, a skip connection module, a BiLSTM temporal modeling module, an attention mechanism module, and an MC-dropout uncertainty assessment module embedded in the decoder;
S3: Use the encoder to extract features from the data in the training sample set to obtain high-level abstract features containing multi-scale features;
S4: Fuse the high-level abstract features with the shallow features in the decoder through the skip connection module to obtain the feature fusion result;
S5: Input the feature fusion result into the BiLSTM temporal modeling module for bidirectional temporal modeling to obtain temporal features; then input the temporal features into the attention mechanism module for feature weighting to obtain enhanced temporal features;
S6: Use the decoder to reconstruct the enhanced temporal features to obtain repaired data; construct a frequency-weighted hybrid loss function combining the time and frequency domains, calculate the loss value between the repaired data and the corresponding real data using the frequency-weighted hybrid loss function, and train and optimize the U-shaped CNN-BiLSTM-Attention network to obtain the optimal data repair model;
S7: Use the optimal data repair model to repair the monitoring data in the test sample set that contain single anomalies and mixed-mode anomalies, obtain the repair results, and output the uncertainty quantification results of the repair results through the MC-dropout uncertainty assessment module.

2. The method of claim 1, wherein, in step S1, the multi-mode monitoring data includes two or more types of time-series data collected from bridge structures, building structures, reservoir dams, tunnel structures, rail transit structures, and slopes, the types including strain, displacement, acceleration, tilt angle, temperature and humidity, settlement, deformation, and seepage flow; and / or, the preprocessing includes removing invalid and redundant data and using max-min normalization to eliminate dimensional differences; and / or, the single anomaly includes missing data, baseline drift, and trend terms, and the mixed-mode anomaly is the coexistence of two or more of the single anomalies.

3. The method of claim 1, wherein, In step S3, the encoder is embedded with a dilated convolutional layer and a basic CNN layer. The dilated convolutional layer uses dilated convolutional kernels with different dilation rates to achieve parallel extraction of abnormal features and local trend features at different scales. The basic CNN layer is used to complete the basic feature extraction.

4. The method of claim 1, wherein, In step S4, the skip connection module achieves high- and low-dimensional feature fusion by concatenating the high-level abstract features with the shallow features in the decoder or adding them element by element.

5. The method of claim 1, wherein, In step S5, the BiLSTM temporal modeling module models the feature fusion result in a bidirectional temporal manner (both forward and backward) and outputs temporal features containing long-term dependencies. The attention mechanism module calculates the correlation between each time step in the temporal features and the anomaly labeling interval, and assigns higher weights to the anomaly and adjacent interval features to enhance the expression of the anomaly interval features.

6. The method of claim 1, wherein, In step S7, the uncertainty quantification result of the repair result is output through the MC-dropout uncertainty assessment module. Specifically, during the inference phase, the dropout layer is kept on, and the same input anomaly monitoring data is independently and randomly sampled and reconstructed T times to obtain T sets of candidate repair data. By calculating the statistical characteristics of the T sets of candidate repair data, the uncertainty quantification assessment of data reconstruction is completed.

7. The method of claim 6, wherein the quantitative assessment of uncertainty uses standard deviation and coefficient of variation as core evaluation indicators. The standard deviation is used to characterize the dispersion of the T sets of candidate repair data, and the coefficient of variation is used to eliminate the influence of the magnitude of the repair data, so as to achieve a unified assessment of the uncertainty of different types of monitoring data.

8. The method of claim 1, wherein, In step S6, the frequency-weighted hybrid loss function is obtained by weighted combination of time-domain loss term and frequency-domain loss term; wherein, the time-domain loss term is used to constrain the approximation of the repaired data and the real data in terms of numerical and temporal sequence; the frequency-domain loss term is used to constrain the approximation of the repaired data and the real data in terms of spectrum, energy distribution and frequency characteristics.

9. The method of claim 8, wherein, The time-domain loss term includes one or a combination of mean square error and mean absolute error; the frequency-domain loss term is constructed based on fast Fourier transform or wavelet transform, and a frequency weighting strategy is introduced in the frequency-domain loss term to assign dynamic weights to different frequency ranges, focusing on strengthening the constraint of low-frequency trend characteristics.

10. The method of claim 9, wherein, The frequency weighting strategy adopts Gaussian weighting, assigning higher weights to the low-frequency range than to the high-frequency range. The low-frequency range corresponds to the first quarter of the total length of the frequency domain data, with the weight value of the low-frequency range ranging from 0.8 to 1.0, and the weight value of the high-frequency range ranging from 0.1 to 0.3.