A probabilistic outlier diagnosis method for traffic infrastructure monitoring data based on a conditional diffusion model
A conditional embedding network and a denoising network are constructed on the basis of a conditional diffusion model, and a time series prediction model is trained to calculate the anomaly probability of each data point. This addresses the problem of identifying outliers in traffic infrastructure monitoring data and improves the reliability and accuracy of the monitoring data.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- HARBIN INST OF TECH
- Filing Date
- 2024-12-17
- Publication Date
- 2026-06-12
Figure CN119862510B_ABST
Description
Technical Field
[0001] This invention belongs to the field of structural health monitoring technology, and in particular relates to a method for diagnosing probabilistic outliers in traffic infrastructure monitoring data based on a conditional diffusion model.
Background Technology
[0002] Anomaly diagnosis is crucial for ensuring clean and reliable datasets for structural health monitoring of transportation infrastructure. Because of instability in data transmission systems, measurement noise, and sensor malfunctions, anomalies in monitoring data are inevitable. Outliers can severely degrade the reliability of the monitoring system and the accuracy of subsequent damage diagnosis, condition assessment, and early warning. Existing methods widely employ techniques based on computer vision, density clustering, and statistical regression. However, traditional computer vision methods typically require large labeled datasets, and they identify anomalous data segments rather than specific data points. Density clustering methods treat low-density regions as outliers, but they usually require hyperparameters such as the number of clusters to be set in advance, which results in poor robustness. Statistical regression methods are themselves affected by outliers during model training, which leads to inaccurate predictions, and they lack a clear criterion for how large the difference between actual and predicted values must be for a point to count as an outlier.
[0003] The diffusion model, based on diffusion process theory, captures the dynamic characteristics of data by simulating the diffusion of data over time. Compared with traditional models, the conditional diffusion model handles high-dimensional and nonlinear data better. Through conditioning, it can mitigate the impact of outliers during model training, so that predictions remain reliable even when the training data contain outliers. By defining an anomaly probability for each data point, the degree of anomaly can be determined quantitatively. Considering the shortcomings of existing anomaly diagnosis methods, and combining practical needs with the advantages of the conditional diffusion model, a probabilistic outlier diagnosis method for traffic infrastructure monitoring data based on the conditional diffusion model is established.
Summary of the Invention
[0004] The purpose of this invention is to solve the problem of outlier diagnosis in existing traffic infrastructure monitoring data, and to propose a probabilistic outlier diagnosis method for traffic infrastructure monitoring data based on a conditional diffusion model.
[0005] This invention is achieved through the following technical solution: This invention proposes a method for diagnosing probabilistic outliers in traffic infrastructure monitoring data based on a conditional diffusion model. The method includes the following steps:
[0006] Step 1: Construct a traffic infrastructure monitoring dataset from which outliers need to be removed. Determine the time step of each type of time series monitoring data based on the sampling frequency in order to train a time series prediction model for outlier diagnosis.
[0007] Step 2: Establish a conditional diffusion model, determine the network architecture of the conditional embedding network module and the denoising network module, and train a time series prediction model based on the monitoring dataset;
[0008] Step 3: Based on the time series prediction model, predict the monitoring data points at each time point, and calculate the residuals using the actual values and predicted values to estimate the variance parameter of the prediction error;
[0009] Step 4: Define the anomalous probability of data points based on the probability distribution of predicted data points, calculate the anomalous probability of each data point, and identify data points with an anomalous probability greater than 0.5 as outliers.
[0010] Furthermore, in step two, the network architecture selected for the conditional embedding network module is an RNN model composed of GRU gated units, with four hidden layers of 30 hidden neurons each. The time-related features at the prediction time t and multiple observations before time t, used as a context window, jointly form the covariate $c_t$. The covariate $c_t$ and the hidden state $h_{t-2}$ from the previous time step are the input to the conditional embedding network module, whose output is $h_{t-1}$:
[0011] $h_{t-1} = \mathrm{GRU}(c_t, h_{t-2})$
[0012] The specific expression is:
[0013] $r_{t-1} = \sigma[W_r \cdot (h_{t-2}, c_t) + b_r]$
[0014] $z_{t-1} = \sigma[W_z \cdot (h_{t-2}, c_t) + b_z]$
[0015]
[0016] where $\sigma$ is the sigmoid activation function, $\tanh$ is the Tanh activation function, ⊙ represents element-wise multiplication, and $C_\theta = \{W_r, W_z, W_h, b_r, b_z, b_h\}$ represents the model parameters of the conditional embedding network module. The parameters are randomly initialized from a standard normal distribution and then optimized through subsequent model training, with $h_0 = 0$.
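The gated update described above can be sketched as a single GRU step. This is a minimal pure-Python sketch under stated assumptions, not the patent's implementation: the reset and update gates follow [0013] and [0014], while the candidate state and the final blending of old and new state are not legible in this text and are assumed here to take one common GRU form. Only one layer is shown rather than the four stacked layers. The names `gru_step` and `init_params`, the toy input size, and the seed are illustrative.

```python
import math
import random

def sigmoid(x):
    # Numerically stable form of 1 / (1 + exp(-x)).
    return 0.5 * (1.0 + math.tanh(0.5 * x))

def affine(W, b, v):
    """Return W @ v + b for a list-of-rows matrix W."""
    return [sum(w * x for w, x in zip(row, v)) + bi for row, bi in zip(W, b)]

def init_params(hidden, n_in, seed=0):
    """Draw W_r, W_z, W_h, b_r, b_z, b_h from a standard normal distribution."""
    rng = random.Random(seed)
    def mat():
        return [[rng.gauss(0.0, 1.0) for _ in range(hidden + n_in)]
                for _ in range(hidden)]
    def vec():
        return [rng.gauss(0.0, 1.0) for _ in range(hidden)]
    return {"W_r": mat(), "W_z": mat(), "W_h": mat(),
            "b_r": vec(), "b_z": vec(), "b_h": vec()}

def gru_step(p, c_t, h_prev):
    """One GRU update: (c_t, h_{t-2}) -> h_{t-1}."""
    concat = h_prev + c_t  # the pair (h_{t-2}, c_t) as one vector
    r = [sigmoid(a) for a in affine(p["W_r"], p["b_r"], concat)]
    z = [sigmoid(a) for a in affine(p["W_z"], p["b_z"], concat)]
    # ASSUMPTION: the candidate state sees the reset-gated previous state.
    gated = [ri * hi for ri, hi in zip(r, h_prev)] + c_t
    h_tilde = [math.tanh(a) for a in affine(p["W_h"], p["b_h"], gated)]
    # ASSUMPTION: the update gate blends old state and candidate element-wise.
    return [(1.0 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h_prev, h_tilde)]

params = init_params(hidden=30, n_in=3)
h0 = [0.0] * 30                      # h_0 = 0, as stated in the text
h1 = gru_step(params, [0.5, -1.0, 2.0], h0)
```

Other GRU variants swap the roles of $z$ and $1 - z$ in the last line; the choice does not change the expressive power of the cell.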
[0017] Furthermore, in step two, Gaussian white noise is gradually added to the raw monitoring data at time t. Assuming that this process satisfies the Markov property, after T diffusion steps the original monitoring data becomes pure Gaussian white noise, i.e.:
[0018]
[0019] The reverse process starts from pure Gaussian white noise and, conditioned on the input $h_{t-1}$, generates the original monitoring data by gradually removing the noise. According to Bayes' theorem, given the original monitoring data, the conditional probability distribution of each reverse step satisfies the following relationship:
[0020]
[0021] A neural network is used to fit the above reverse process, and the fitted distribution is set as:
[0022]
[0023] The KL divergence between the two is used as the loss function:
[0024]
[0025] where C is a constant independent of the network model parameters θ. Following the form of the mean of the conditional distribution above, the mean of the fitted distribution is chosen as:
[0026]
[0027] Substituting this into the KL divergence and simplifying gives:
[0028]
[0029] where $\varepsilon_\theta$ represents the model parameters of the denoising network module.
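For orientation, the relations described in words above take the following standard form in the denoising diffusion literature. The notation is an assumption, since the formulas themselves are not legible in this text and the patent's exact expressions may differ: $x_t^m$ is the monitoring value at time t after m diffusion steps, $\beta_m$ is the noise coefficient of step m, $\alpha_m = 1 - \beta_m$, $\bar\alpha_m = \prod_{i \le m} \alpha_i$, and $\sigma_m^2$ is the variance of the fitted reverse step.

```latex
\begin{aligned}
q(x_t^m \mid x_t^{m-1}) &= \mathcal{N}\!\left(x_t^m;\ \sqrt{1-\beta_m}\,x_t^{m-1},\ \beta_m I\right), \\
q(x_t^m \mid x_t^0) &= \mathcal{N}\!\left(x_t^m;\ \sqrt{\bar\alpha_m}\,x_t^0,\ (1-\bar\alpha_m) I\right), \\
q(x_t^{m-1} \mid x_t^m, x_t^0) &= \mathcal{N}\!\left(x_t^{m-1};\ \tilde\mu_m(x_t^m, x_t^0),\ \tilde\beta_m I\right), \\
\tilde\mu_m &= \frac{\sqrt{\bar\alpha_{m-1}}\,\beta_m}{1-\bar\alpha_m}\,x_t^0
             + \frac{\sqrt{\alpha_m}\,(1-\bar\alpha_{m-1})}{1-\bar\alpha_m}\,x_t^m,
\qquad \tilde\beta_m = \frac{1-\bar\alpha_{m-1}}{1-\bar\alpha_m}\,\beta_m, \\
p_\theta(x_t^{m-1} \mid x_t^m, h_{t-1}) &= \mathcal{N}\!\left(x_t^{m-1};\ \mu_\theta(x_t^m, h_{t-1}, m),\ \sigma_m^2 I\right), \\
\mu_\theta(x_t^m, h_{t-1}, m) &= \frac{1}{\sqrt{\alpha_m}}\left(x_t^m - \frac{\beta_m}{\sqrt{1-\bar\alpha_m}}\,\varepsilon_\theta(x_t^m, h_{t-1}, m)\right), \\
L_m &= \mathbb{E}_{x_t^0,\,\varepsilon}\left[\frac{\beta_m^2}{2\sigma_m^2\,\alpha_m\,(1-\bar\alpha_m)}
       \left\| \varepsilon - \varepsilon_\theta\!\left(\sqrt{\bar\alpha_m}\,x_t^0 + \sqrt{1-\bar\alpha_m}\,\varepsilon,\ h_{t-1},\ m\right) \right\|^2\right] + C.
\end{aligned}
```

Under these assumptions the loss reduces to a weighted squared error between the noise that was actually added and the noise predicted by the denoising network.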
[0030] Furthermore, in step two, when the hidden state $h_{t-1}$ corresponding to time t is used as input, the loss function of the time series prediction model is:
[0031]
[0032] After the order is shuffled, a step m is randomly selected from (1, 2, …, T), T times in total, and the gradient descent algorithm is used to update the network model parameters, which comprise two parts: the model parameters $\varepsilon_\theta$ of the denoising network module and the model parameters $C_\theta$ of the conditional embedding network module:
[0033]
[0034] The update stops when the error loss is less than the threshold Δ = 0.01, and the model training is complete.
[0035] Furthermore, in step three, after the model has been trained, the prediction is sampled from the learned distribution conditioned on the hidden state $h_{t-1}$:
[0036]
[0037] where z is Gaussian white noise when m = 2, 3, ..., T, and z = 0 when m = 1. After the sampling step is repeated T times, the predicted output of the model at time t under the given condition is obtained.
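The sampling procedure above can be sketched as a loop over T reverse steps. This is a hedged illustration rather than the patent's exact update: the reverse-step formula is the standard one from the denoising diffusion literature, the linear noise schedule and the step variance $\beta_m$ are assumptions, and `make_toy_denoiser` is a hypothetical stand-in for the trained denoising network (it is exact only when every training value equals `x0`, and it ignores the hidden state `h`).

```python
import math
import random

def linear_schedule(T, beta_min, beta_max):
    """Return (betas, alpha_bars) for an assumed linear noise schedule."""
    betas = [beta_min + (beta_max - beta_min) * i / (T - 1) for i in range(T)]
    alpha_bars, prod = [], 1.0
    for b in betas:
        prod *= 1.0 - b
        alpha_bars.append(prod)
    return betas, alpha_bars

def sample_prediction(eps_model, h, betas, alpha_bars, rng):
    """Run T reverse steps, starting from pure Gaussian white noise."""
    x = rng.gauss(0.0, 1.0)
    for m in range(len(betas), 0, -1):
        beta, abar = betas[m - 1], alpha_bars[m - 1]
        z = rng.gauss(0.0, 1.0) if m > 1 else 0.0  # z = 0 when m = 1
        eps_hat = eps_model(x, h, m)
        mean = (x - beta / math.sqrt(1.0 - abar) * eps_hat) / math.sqrt(1.0 - beta)
        x = mean + math.sqrt(beta) * z  # ASSUMPTION: step variance equals beta_m
    return x

def make_toy_denoiser(x0, alpha_bars):
    """Hypothetical denoiser that recovers the added noise when all data equal x0."""
    def eps_model(x, h, m):
        abar = alpha_bars[m - 1]
        return (x - math.sqrt(abar) * x0) / math.sqrt(1.0 - abar)
    return eps_model

betas, alpha_bars = linear_schedule(T=100, beta_min=1e-4, beta_max=0.1)
toy = make_toy_denoiser(x0=3.0, alpha_bars=alpha_bars)
prediction = sample_prediction(toy, h=None, betas=betas,
                               alpha_bars=alpha_bars, rng=random.Random(0))
```

With this idealized denoiser the final step, where z = 0, lands on `x0` up to floating-point error, which is a convenient sanity check of the loop. A real trained network would instead return a different draw from the predictive distribution on each call.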
[0038] Furthermore, in step three, the number of samples is set to N, and the predicted mean and standard deviation at a certain time are calculated as follows:
[0039]
[0040]
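The two statistics named in [0038] can be computed from the N sampled predictions as sketched below. The formulas themselves are not legible in this text, so the use of the sample standard deviation (divisor N − 1) is an assumption; the function name and the five example values are illustrative only.

```python
import statistics

def predictive_mean_std(samples):
    """Mean and standard deviation of N sampled predictions for one time point."""
    mean = statistics.fmean(samples)
    std = statistics.stdev(samples)  # ASSUMPTION: divisor N - 1, not N
    return mean, std

# Five hypothetical wind-speed predictions drawn for the same time t.
samples = [4.8, 5.1, 5.0, 4.9, 5.2]
mu, sigma = predictive_mean_std(samples)
```

If the population form is intended instead, `statistics.pstdev` is the drop-in replacement.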
[0041] Furthermore, in step three, the variance of the prediction error is estimated using a resampling method. In each of L rounds, a fixed proportion of the original data is randomly sampled and the variance is calculated using the unbiased estimator; the L variances are then averaged.
[0042]
[0043] Furthermore, in step four, given the predicted mean and variance, as well as the variance of the prediction error, the prediction distribution at a certain moment approximately satisfies a Gaussian distribution:
[0044]
[0045] The probability of an anomaly at that moment is:
[0046]
[0047] When the calculated probability of an anomaly is greater than 0.5, it is identified as an outlier.
[0048] The present invention also proposes an electronic device, including a memory and a processor, wherein the memory stores a computer program, and the processor executes the computer program to implement the steps of the method for diagnosing probabilistic outliers in traffic infrastructure monitoring data based on a conditional diffusion model.
[0049] The present invention also proposes a computer-readable storage medium for storing computer instructions, which, when executed by a processor, implement the steps of the method for diagnosing probabilistic outliers in traffic infrastructure monitoring data based on a conditional diffusion model.
[0050] The beneficial effects of this invention are:
[0051] The probabilistic outlier diagnosis method for traffic infrastructure monitoring data based on a conditional diffusion model described in this invention can learn the implicit pattern features of the monitoring data more accurately and is more robust to outlier data. Furthermore, by calculating the anomaly probability, it quantifies the degree of anomaly of each data point while identifying outliers.
Description of the Drawings
[0052] To more clearly illustrate the technical solutions in the embodiments of the present invention or the prior art, the drawings used in the description of the embodiments or the prior art will be briefly introduced below. Obviously, the drawings described below are only embodiments of the present invention. For those skilled in the art, other drawings can be obtained based on the provided drawings without creative effort.
[0053] Figure 1 is a flowchart of the method for diagnosing probabilistic outliers in traffic infrastructure monitoring data based on the conditional diffusion model described in this invention;
[0054] Figure 2 shows the diagnostic results of the probabilistic outlier diagnosis method for traffic infrastructure monitoring data based on the conditional diffusion model described in this invention; the two subplots show the diagnostic results for two different types of monitoring data.
Detailed Implementation
[0055] The technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of the present invention, and not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.
[0056] The purpose of this invention is to address the shortcomings of existing methods for diagnosing outliers in transportation infrastructure monitoring data by proposing a probabilistic outlier diagnosis method based on a conditional diffusion model. The method described in this invention can effectively diagnose potential outliers in transportation infrastructure monitoring data.
[0057] With reference to Figure 1 and Figure 2, this invention proposes a method for diagnosing probabilistic outliers in traffic infrastructure monitoring data based on a conditional diffusion model, specifically including the following steps:
[0058] Step 1: Construct a traffic infrastructure monitoring dataset from which outliers need to be removed. Determine parameters such as the time step size for each type of time series monitoring data based on the sampling frequency in order to train a time series prediction model for outlier diagnosis.
[0059] Step 2: Establish a conditional diffusion model, determine the network architecture of the conditional embedding network module and the denoising network module, and train a time series prediction model based on the monitoring dataset;
[0060] Step 3: Based on the time series prediction model, predict the monitoring data points at each time point, and calculate the residuals using the actual values and predicted values to estimate the variance parameter of the prediction error;
[0061] Step 4: Define the anomalous probability of data points based on the probability distribution of predicted data points, calculate the anomalous probability of each data point, and identify data points with an anomalous probability greater than 0.5 as outliers.
[0062] Step two specifically involves:
[0063] Step 2.1: The network architecture selected for the conditional embedding network module is an RNN model composed of GRU gated units, with 4 hidden layers of 30 hidden neurons each. The time-related features at the prediction time t and the 60 observations before time t, used as a context window, form the covariate $c_t$. The covariate $c_t$ and the hidden state $h_{t-2}$ from the previous time step are the input to the conditional embedding network module, whose output is $h_{t-1}$:
[0064] $h_{t-1} = \mathrm{GRU}(c_t, h_{t-2})$
[0065] The specific expression is:
[0066] $r_{t-1} = \sigma[W_r \cdot (h_{t-2}, c_t) + b_r]$
[0067] $z_{t-1} = \sigma[W_z \cdot (h_{t-2}, c_t) + b_z]$
[0068]
[0069] where $\sigma$ is the sigmoid activation function, $\tanh$ is the Tanh activation function, ⊙ represents element-wise multiplication, and $C_\theta = \{W_r, W_z, W_h, b_r, b_z, b_h\}$ represents the model parameters of the conditional embedding network module. The parameters are randomly initialized from a standard normal distribution and then optimized through subsequent model training, with $h_0 = 0$.
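One way the covariate of step 2.1 might be assembled is sketched below. The time-related features are taken to be the day of the week and the hour of the day, as in the worked example later in this document; scaling them to [0, 1] and placing them ahead of the 60 lagged observations are assumptions, and `build_covariate` is a hypothetical helper name.

```python
from datetime import datetime, timedelta

CONTEXT = 60  # number of past observations in the context window

def build_covariate(series, timestamps, t):
    """Return c_t: time features of index t followed by the 60 previous values."""
    if t < CONTEXT:
        raise ValueError("need at least 60 observations before index t")
    ts = timestamps[t]
    # ASSUMPTION: calendar features are scaled to the range [0, 1].
    time_features = [ts.weekday() / 6.0, ts.hour / 23.0]
    lags = list(series[t - CONTEXT:t])
    return time_features + lags

# Toy 10 Hz series starting on Monday 2024-01-01 at midnight.
start = datetime(2024, 1, 1)
timestamps = [start + timedelta(milliseconds=100 * i) for i in range(100)]
series = [float(i) for i in range(100)]
c_60 = build_covariate(series, timestamps, 60)
```

The resulting vector has 62 entries, so a GRU consuming it would need an input size of 62 under these assumptions.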
[0070] Step 2.2: Gaussian white noise is gradually added to the raw monitoring data at time t. Assuming that this process satisfies the Markov property, after T diffusion steps the original monitoring data becomes pure Gaussian white noise, i.e.:
[0071]
[0072] The reverse process starts from pure Gaussian white noise and, conditioned on the input $h_{t-1}$, generates the original monitoring data by gradually removing the noise. According to Bayes' theorem, given the original monitoring data, the conditional probability distribution of each reverse step satisfies the following relationship:
[0073]
[0074] A neural network (consisting of eight residual layers with skip connections) is used to fit the above reverse process, and the fitted distribution is set as:
[0075]
[0076] The KL divergence between the two (the conditional probability distribution and the fitted distribution) is used as the loss function:
[0077]
[0078] where C is a constant independent of the network model parameters θ. Following the form of the mean of the conditional distribution above, the mean of the fitted distribution is chosen as:
[0079]
[0080] Substituting this into the KL divergence and simplifying gives:
[0081]
[0082] where $\varepsilon_\theta$ represents the model parameters of the denoising network module;
[0083] When the hidden state $h_{t-1}$ corresponding to time t is used as input, the loss function of the time series prediction model is:
[0084]
[0085] To further ensure the robustness of the model, after the order is shuffled, a step m is randomly selected from (1, 2, …, T), T times in total, and the gradient descent algorithm is used to update the network model parameters. Here, the network model parameters comprise two parts: the model parameters $\varepsilon_\theta$ of the denoising network module and the model parameters $C_\theta$ of the conditional embedding network module:
[0086]
[0087] The update stops when the error loss is less than the threshold Δ = 0.01, and the model training is complete.
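The loss evaluation and stopping rule can be sketched as follows. This covers only how the loss is computed for randomly drawn diffusion steps and compared with Δ; the gradient update itself is omitted. The unweighted squared-error loss and the linear noise schedule are assumptions, and `oracle` and `untrained` are hypothetical stand-ins for a well-trained and an untrained denoising network.

```python
import math
import random

DELTA = 0.01  # stopping threshold quoted in the text

def alpha_bars_linear(T, beta_min, beta_max):
    """Cumulative products of (1 - beta_m) for an assumed linear schedule."""
    out, prod = [], 1.0
    for i in range(T):
        prod *= 1.0 - (beta_min + (beta_max - beta_min) * i / (T - 1))
        out.append(prod)
    return out

def diffuse(x0, abar, eps):
    """Jump directly from the clean value x0 to its noisy version at step m."""
    return math.sqrt(abar) * x0 + math.sqrt(1.0 - abar) * eps

def mean_loss(eps_model, x0, h, alpha_bars, rng):
    """Average squared noise-prediction error over T random draws of m."""
    T = len(alpha_bars)
    total = 0.0
    for _ in range(T):
        m = rng.randint(1, T)          # random diffusion step in 1..T
        eps = rng.gauss(0.0, 1.0)      # the noise actually added
        x_m = diffuse(x0, alpha_bars[m - 1], eps)
        total += (eps - eps_model(x_m, h, m)) ** 2
    return total / T

alpha_bars = alpha_bars_linear(100, 1e-4, 0.1)

def oracle(x_m, h, m):
    """Hypothetical perfect denoiser for data fixed at 3.0."""
    abar = alpha_bars[m - 1]
    return (x_m - math.sqrt(abar) * 3.0) / math.sqrt(1.0 - abar)

def untrained(x_m, h, m):
    """Hypothetical untrained denoiser that always predicts zero noise."""
    return 0.0

stop_oracle = mean_loss(oracle, 3.0, None, alpha_bars, random.Random(1)) < DELTA
stop_untrained = mean_loss(untrained, 3.0, None, alpha_bars, random.Random(1)) < DELTA
```

In a full implementation the loss would be differentiated with respect to both the denoising network parameters and the conditional embedding parameters, and the loop would repeat until the stopping condition holds.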
[0088] Step three specifically involves:
[0089] After the model has been trained, the prediction is sampled from the learned distribution conditioned on the hidden state $h_{t-1}$:
[0090]
[0091] where z is Gaussian white noise when m = 2, 3, ..., T, and z = 0 when m = 1. After the sampling step is repeated T times, the predicted output of the model at time t under the given condition is obtained. With the number of samples set to N, the predicted mean and standard deviation at this time point are calculated as follows:
[0092]
[0093] The variance of the prediction error is estimated using a resampling method. In each of L = 10000 rounds, a fixed proportion of the original data is randomly sampled, G = 0.8N points per round, and the variance is calculated using the unbiased estimator; the L variances are then averaged.
[0094]
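The resampling estimate described in [0093] can be sketched as below. The formula is not legible in this text, so two details are assumptions: the quantity being resampled is taken to be the residuals (actual minus predicted values), and each round samples without replacement. The function name, the synthetic residuals, and the reduced number of rounds in the demonstration are illustrative.

```python
import random
import statistics

def resampled_error_variance(residuals, rounds=10000, proportion=0.8, seed=0):
    """Average the unbiased variance of `rounds` random subsets of the residuals."""
    rng = random.Random(seed)
    g = int(proportion * len(residuals))  # G = 0.8 N points per round
    # ASSUMPTION: subsets are drawn without replacement.
    variances = [statistics.variance(rng.sample(residuals, g))
                 for _ in range(rounds)]
    return statistics.fmean(variances)

# Synthetic residuals standing in for (actual - predicted) values.
gen = random.Random(42)
residuals = [gen.gauss(0.0, 0.5) for _ in range(200)]
# Fewer rounds than L = 10000 keep this demonstration fast.
sigma_err_sq = resampled_error_variance(residuals, rounds=200)
full_variance = statistics.variance(residuals)
```

Because `statistics.variance` uses the N − 1 divisor, the averaged subset variances converge to the unbiased variance of the full residual set as the number of rounds grows.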
[0095] Step four specifically involves:
[0096] Given the prediction mean and variance, as well as the variance of the prediction error, the prediction distribution at a certain time approximately follows a Gaussian distribution:
[0097]
[0098] The probability of an anomaly at that moment is:
[0099]
[0100] When the calculated probability of an anomaly is greater than 0.5, it is identified as a possible outlier.
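The thresholding in step four can be illustrated with the sketch below, but the anomaly probability formula itself is not legible in this text, so the definition used here is purely an assumption: the predictive variance and the prediction error variance are added, and the anomaly probability is taken as the central probability mass of the Gaussian that lies closer to the mean than the observed value. The patent's actual definition may differ. Only the 0.5 threshold comes from the text.

```python
import math

def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def anomaly_probability(y, mu, sigma_pred, sigma_err):
    """Hypothetical anomaly probability of observation y under a Gaussian."""
    # ASSUMPTION: total variance is predictive variance plus error variance.
    total_std = math.sqrt(sigma_pred ** 2 + sigma_err ** 2)
    zscore = abs(y - mu) / total_std
    # ASSUMPTION: two-sided central mass, P(|Y - mu| < |y - mu|).
    return 2.0 * normal_cdf(zscore) - 1.0

def is_outlier(probability, threshold=0.5):
    """Apply the 0.5 threshold stated in the text."""
    return probability > threshold

p_center = anomaly_probability(5.0, mu=5.0, sigma_pred=0.3, sigma_err=0.4)
p_far = anomaly_probability(6.5, mu=5.0, sigma_pred=0.3, sigma_err=0.4)
```

Under this assumed definition a point exactly at the predicted mean has probability 0, and the 0.5 threshold corresponds to a deviation of roughly 0.67 combined standard deviations.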
[0101] Example
[0102] Figure 2 presents the application of this invention to a real monitoring dataset. For a specific bridge monitoring database, bridge wind speed monitoring data is used as both the input and the output of the conditional diffusion prediction model in order to diagnose outliers in the wind speed data. The specific operation steps are as follows:
[0103] Step one specifically involves using a data matrix composed of wind speeds from two different measuring points in a bridge monitoring database as input-output. The sampling frequency of this monitoring data is 10Hz, therefore the time step of the time series prediction model is set to 100ms.
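The relation between sampling frequency and time step used here is simply the reciprocal, as the short sketch below shows; `time_step_ms` is an illustrative helper name, not part of the patent.

```python
def time_step_ms(sampling_hz):
    """Time between consecutive samples, in milliseconds."""
    if sampling_hz <= 0:
        raise ValueError("sampling frequency must be positive")
    return 1000.0 / sampling_hz

step = time_step_ms(10)  # 10 Hz wind speed data from the example
```

For the 10 Hz data in this example the result is 100 ms, matching the time step stated above.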
[0104] Step two specifically involves: determining the day of the week and the hour of the day for the predicted data point from its collection time, and using the monitoring data from the previous 60 time points as lag features, to form the covariate $c_t$. The hidden state $h_{t-1}$ at the current time is generated by a GRU conditional network with 4 hidden layers and 30 nodes in each hidden layer. The conditional diffusion prediction model is then trained with a total number of diffusion steps T = 100 and noise coefficients $\beta_{\min} = 0.0001$ and $\beta_{\max} = 0.1$, and Gaussian white noise is gradually added to complete the forward process. The denoising network mainly consists of 8 conditional residual blocks, each composed of a one-dimensional convolutional neural network and a ReLU activation function. Fourier position embedding is used to encode the noise level. The network model parameters are updated using gradient descent according to the aforementioned loss function. Training stops when the error loss is less than Δ = 0.01.
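The effect of these diffusion settings can be checked numerically. The sketch below assumes that the noise coefficients increase linearly from 0.0001 to 0.1 over the 100 steps, which the text does not state explicitly, and computes the fraction of the original signal variance that survives after each step; the function names are illustrative.

```python
import math

def linear_beta_schedule(T=100, beta_min=1e-4, beta_max=0.1):
    """Noise coefficients for steps 1..T (ASSUMPTION: linear spacing)."""
    return [beta_min + (beta_max - beta_min) * i / (T - 1) for i in range(T)]

def cumulative_alpha_bar(betas):
    """Running product of (1 - beta_m): signal variance fraction left at step m."""
    out, prod = [], 1.0
    for b in betas:
        prod *= 1.0 - b
        out.append(prod)
    return out

betas = linear_beta_schedule()
alpha_bars = cumulative_alpha_bar(betas)
# Share of the original signal amplitude remaining after all T steps.
remaining_amplitude = math.sqrt(alpha_bars[-1])
```

Under this assumed schedule less than 1% of the signal variance remains after step 100, which is consistent with the statement that the forward process ends in nearly pure Gaussian white noise.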
[0105] Step three specifically involves: after the model has been trained, given the hidden state $h_{t-1}$, the predicted output of the model at a certain time t is obtained by sampling multiple times from the trained model. The mean and variance of the predicted values at the current time are calculated, and the standard deviation of the residuals is estimated.
[0106] Step four specifically involves: calculating the anomaly probability of data points based on the predicted distribution of data points, and identifying data points with an anomaly probability greater than 0.5 as abnormal data.
[0107] The present invention also proposes an electronic device, including a memory and a processor, wherein the memory stores a computer program, and the processor executes the computer program to implement the steps of the method for diagnosing probabilistic outliers in traffic infrastructure monitoring data based on a conditional diffusion model.
[0108] The present invention also proposes a computer-readable storage medium for storing computer instructions, which, when executed by a processor, implement the steps of the method for diagnosing probabilistic outliers in traffic infrastructure monitoring data based on a conditional diffusion model.
[0109] The memory in this application embodiment can be volatile memory or non-volatile memory, or it can include both volatile and non-volatile memory. The non-volatile memory can be read-only memory (ROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), or flash memory. The volatile memory can be random access memory (RAM), which is used as an external cache. By way of example, but not limitation, many forms of RAM are available, such as static random access memory (SRAM), dynamic random access memory (DRAM), synchronous dynamic random access memory (SDRAM), double data rate synchronous dynamic random access memory (DDRSDRAM), enhanced synchronous dynamic random access memory (ESDRAM), synchronous linked dynamic random access memory (SLDRAM), and direct rambus RAM (DRRAM). It should be noted that the memory used in the methods described in this invention is intended to include, but is not limited to, these and any other suitable types of memory.
[0110] In the above embodiments, implementation can be achieved, in whole or in part, through software, hardware, firmware, or any combination thereof. When implemented in software, it can be implemented, in whole or in part, as a computer program product. The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, all or part of the processes or functions described in the embodiments of this application are generated. The computer can be a general-purpose computer, a special-purpose computer, a computer network, or other programmable device. The computer instructions can be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another. For example, the computer instructions can be transmitted from one website, computer, server, or data center to another via wired (e.g., coaxial cable, fiber optic, digital subscriber line (DSL)) or wireless (e.g., infrared, wireless, microwave, etc.) means. The computer-readable storage medium can be any available medium accessible to a computer or a data storage device such as a server or data center that integrates one or more available media. The available media may be magnetic media (e.g., floppy disks, hard disks, magnetic tapes), optical media (e.g., high-density digital video discs (DVDs)), or semiconductor media (e.g., solid-state disks (SSDs)).
[0111] In implementation, each step of the above method can be completed by integrated logic circuits in the processor's hardware or by instructions in software. The steps of the method disclosed in the embodiments of this application can be directly implemented by a hardware processor, or by a combination of hardware and software modules in the processor. The software modules can reside in random access memory, flash memory, read-only memory, programmable read-only memory, electrically erasable programmable memory, registers, or other mature storage media in the art. This storage medium is located in memory, and the processor reads information from the memory and, in conjunction with its hardware, completes the steps of the above method. To avoid repetition, detailed descriptions are omitted here.
[0112] It should be noted that the processor in the embodiments of this application can be an integrated circuit chip with signal processing capabilities. During implementation, each step of the above method embodiments can be completed by the integrated logic circuitry in the processor's hardware or by instructions in software form. The processor can be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), or other programmable logic devices, discrete gate or transistor logic devices, or discrete hardware components. It can implement or execute the methods, steps, and logic block diagrams disclosed in the embodiments of this application. The general-purpose processor can be a microprocessor or any conventional processor. The steps of the methods disclosed in the embodiments of this application can be directly embodied as being executed by a hardware decoding processor, or executed by a combination of hardware and software modules in the decoding processor. The software modules can be located in random access memory, flash memory, read-only memory, programmable read-only memory, electrically erasable programmable memory, registers, or other mature storage media in the art. This storage medium is located in memory, and the processor reads the information in the memory and, in conjunction with its hardware, completes the steps of the above methods.
[0113] The above provides a detailed description of the probabilistic outlier diagnosis method for traffic infrastructure monitoring data based on a conditional diffusion model proposed in this invention. Specific examples have been used to illustrate the principles and implementation methods of this invention. The descriptions of the above embodiments are only for the purpose of helping to understand the method and core ideas of this invention. At the same time, for those skilled in the art, there will be changes in the specific implementation methods and application scope based on the ideas of this invention. Therefore, the content of this specification should not be construed as a limitation of this invention.
Claims
1. A method for diagnosing probabilistic outliers in traffic infrastructure monitoring data based on a conditional diffusion model, characterized in that the method includes the following steps:
Step 1: Construct a traffic infrastructure monitoring dataset from which outliers need to be removed. Determine the time step of each type of time series monitoring data based on the sampling frequency in order to train a time series prediction model for outlier diagnosis;
Step 2: Establish a conditional diffusion model, determine the network architecture of the conditional embedding network module and the denoising network module, and train a time series prediction model based on the monitoring dataset;
Step 3: Based on the time series prediction model, predict the monitoring data points at each time point, and calculate the residuals using the actual values and predicted values to estimate the variance parameter of the prediction error;
Step 4: Define the anomalous probability of data points based on the probability distribution of predicted data points, calculate the anomalous probability of each data point, and identify data points with an anomalous probability greater than 0.5 as outliers;
In step three, the number of samples is set to N, and the predicted mean and standard deviation at a certain moment are calculated as follows:
In step three, the variance of the prediction error is estimated using a resampling method. A certain proportion of data is randomly sampled from the original data L times, and the variance is calculated using an unbiased estimator before taking the mean;
In step four, given the predicted mean and variance, as well as the variance of the prediction error, the predicted distribution at a certain moment approximately follows a Gaussian distribution:
The probability of an anomaly at that moment is:
When the calculated probability of an anomaly is greater than 0.5, it is identified as an outlier.
2. The method according to claim 1, characterized in that, in step two, the network architecture selected for the conditional embedding network module is an RNN model composed of GRU gated units, with four hidden layers of 30 hidden neurons each. The time-related features at the prediction time t and multiple observations before time t, used as a context window, form the covariate $c_t$; the covariate $c_t$ and the hidden state $h_{t-2}$ of the previous moment are the input to the conditional embedding network module, whose output is $h_{t-1}$: $h_{t-1} = \mathrm{GRU}(c_t, h_{t-2})$. The specific expression is: $r_{t-1} = \sigma[W_r \cdot (h_{t-2}, c_t) + b_r]$, $z_{t-1} = \sigma[W_z \cdot (h_{t-2}, c_t) + b_z]$, where $\sigma$ is the sigmoid activation function, $\tanh$ is the Tanh activation function, ⊙ represents element-wise multiplication, and $C_\theta = \{W_r, W_z, W_h, b_r, b_z, b_h\}$ represents the model parameters of the conditional embedding network module. The parameters are randomly initialized from a standard normal distribution and then optimized during subsequent model training, with $h_0 = 0$.
3. An electronic device comprising a memory and a processor, wherein the memory stores a computer program, characterized in that, when the processor executes the computer program, it implements the steps of the method according to any one of claims 1-2.
4. A computer-readable storage medium for storing computer instructions, characterized in that, when the computer instructions are executed by a processor, they implement the steps of the method according to any one of claims 1-2.