Tunnel portal ice disaster identification and prediction method based on multi-modal data fusion technology
By combining multimodal data fusion and deep learning technology with a dynamic early warning mechanism, the accurate identification and prediction of ice and condensation disasters at tunnel entrances have been achieved. This has solved the problem of low model generalization ability and prediction accuracy in the variable environment at tunnel entrances, and improved detection accuracy and response efficiency.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- INST OF ROCK & SOIL MECHANICS CHINESE ACAD OF SCI
- Filing Date
- 2025-05-07
- Publication Date
- 2026-06-26
AI Technical Summary
Existing technologies have low model generalization ability and prediction accuracy in the variable environment at tunnel entrances. Traditional monitoring systems lack multimodal data fusion and dynamic adjustment mechanisms, resulting in insufficient monitoring accuracy and waste of resources.
Employing multimodal data fusion technology, data is collected by deploying dual-spectrum thermal imagers, millimeter-wave radar, lidar, and distributed sensors. This data is then combined with deep learning networks for feature extraction and fusion. A cascaded CNN-LSTM network is constructed for disaster identification, and prediction is performed based on a spatiotemporal graph convolutional network to dynamically control ice-melting devices and vehicle warnings.
It has achieved accurate identification and prediction of icing disasters at tunnel entrances, improved the model's adaptability and generalization ability to complex scenarios, dynamically responded to match risk levels, improved detection accuracy and handling efficiency, and ensured tunnel passage safety.
Figure CN120653911B_ABST
Abstract
Description
Technical Field
[0001] This invention relates to the field of tunnel disaster identification technology, and in particular to a method for identifying and predicting icing disasters at tunnel entrances based on multimodal data fusion technology.
Background Technology
[0002] With the development of intelligent transportation and road safety monitoring technologies, accurately identifying and promptly issuing warnings of hazardous road conditions such as road icing has become a crucial aspect of ensuring traffic safety. Traditional road icing monitoring technologies often rely on single sensors (such as temperature and humidity sensors), making it difficult to comprehensively capture multidimensional information in complex environments. For example, judging icing risk solely based on temperature ignores the combined influence of environmental parameters such as humidity and wind speed on the phase transition process, resulting in insufficient monitoring accuracy.
[0003] In terms of data processing and model training, existing solutions do not fully exploit the fusion of multi-source heterogeneous data (such as thermal infrared images, lidar reflection intensity, and millimeter-wave radar velocity data). Their feature extraction and fusion mechanisms are simplistic and fail to mine the complementary information between modalities. Furthermore, the loss functions used during model training are also simplistic: they do not jointly consider classification accuracy, object detection intersection over union (IoU), and prediction error, which makes it difficult to balance generalization ability and prediction accuracy in the variable environment of tunnel entrances. In addition, traditional early warning systems lack dynamic adjustment mechanisms and cannot precisely control the power of de-icing equipment based on real-time monitoring of ice thickness and environmental parameters, leading to wasted resources or untimely de-icing.
[0004] In summary, this application proposes a method for identifying and predicting icing disasters at tunnel entrances based on multimodal data fusion technology.
Summary of the Invention
[0005] The purpose of this invention is to address the problem, identified in the background above, of low model generalization ability and prediction accuracy in the variable environment at tunnel entrances, and to propose a method for identifying and predicting icing disasters at tunnel entrances based on multimodal data fusion technology.
[0006] The technical solution of this invention: a method for identifying and predicting icing disasters at tunnel entrances based on multimodal data fusion technology, comprising the following steps:
[0007] (1) Multi-source data acquisition: thermal infrared image sequences are acquired by a dual-spectrum thermal imager array deployed at the tunnel entrance, spatial reflection data are acquired synchronously by millimeter-wave radar and lidar, and environmental parameters are acquired by a distributed temperature and humidity sensor group and a three-dimensional ultrasonic anemometer.
[0008] (2) Heterogeneous data processing: Radiometric correction and temperature inversion are performed on thermal infrared images to generate a temperature distribution matrix; Doppler filtering is performed on millimeter-wave radar data to extract dynamic target reflection features; ground segmentation and anomaly filtering are performed on lidar point cloud data to construct a three-dimensional spatial grid model.
[0009] (3) Feature-level fusion: The improved ResNet-50 network is used to extract deep features of thermal infrared images, the PointNet++ network is used to process the lidar point cloud features, the LSTM network is used to extract the temporal features of environmental parameters, and adaptive weighted fusion is performed in the feature space.
[0010] (4) Disaster identification: The fused features are input into a cascaded dual-channel CNN-LSTM network. The first channel identifies the current ice layer distribution and the second channel detects the icing trend; the network then outputs the disaster risk level.
[0011] (5) Spatiotemporal prediction: Based on the spatiotemporal graph convolutional network, a prediction model is constructed, and the current fused features are spatiotemporally correlated with the historical disaster database to predict the ice layer growth thickness and distribution range in the next 2 hours.
[0012] (6) Dynamic early warning: Generate graded early warning signals based on the prediction results. When the predicted ice thickness exceeds 5mm, the active ice melting device is triggered, and the road condition warning is sent to the approaching vehicle through the V2X system.
[0013] Optionally, the thermal infrared image processing in step (1) specifically includes:
[0014] (a) Set up dual-band synchronous acquisition to acquire each frame of image at a resolution of 256×256;
[0015] (b) The non-uniformity correction algorithm is used to eliminate the difference in detector response, and the atmospheric transmittance model is applied to perform temperature inversion;
[0016] (c) Establish the temperature gradient matrix ΔT(x,y,t) = T(x,y,t) - T_avg(t);
[0017] (d) Set the dynamic threshold θ(t) = 0.15·T_avg(t) + 2·σ_T(t); when ΔT(x,y,t) < -θ(t), the location is marked as a suspected icing area;
[0018] Where ΔT(x,y,t) is the temperature gradient at time t at spatial coordinates (x,y), T(x,y,t) is the temperature value at time t at spatial coordinates (x,y), θ(t) is the dynamic threshold at time t, T_avg(t) is the average temperature of the current frame, and σ_T(t) is the temperature standard deviation.
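The thresholding in items (c) and (d) can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the function name `suspected_icing_mask`, the synthetic frame, and the use of degrees Celsius are not from the source, and σ_T(t) is taken as the standard deviation of the same frame.

```python
import numpy as np


def suspected_icing_mask(frame: np.ndarray) -> np.ndarray:
    """Flag pixels whose temperature falls below the frame mean by more than theta."""
    t_avg = float(frame.mean())            # T_avg(t): mean temperature of the frame
    sigma_t = float(frame.std())           # sigma_T(t): standard deviation of the frame
    delta_t = frame - t_avg                # Delta T(x, y, t)
    theta = 0.15 * t_avg + 2.0 * sigma_t   # theta(t), coefficients as given in the text
    return delta_t < -theta


# Synthetic 256x256 frame (assumed to be in degrees Celsius): a 4 degC surface
# with one 20x20 cold patch at -6 degC. These values are made up for illustration.
frame = np.full((256, 256), 4.0)
frame[100:120, 100:120] = -6.0
mask = suspected_icing_mask(frame)
print(int(mask.sum()), "pixels flagged as suspected icing")
```

Because θ(t) depends on T_avg(t), the result changes with the temperature unit and with the sign of the frame average, so the unit has to be fixed before the 0.15 coefficient is meaningful.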
[0019] Optionally, the radar data processing in step (2) includes:
[0020] (a) The millimeter-wave radar uses FMCW modulation and eliminates static clutter through an MTI filter to extract the Doppler frequency shift of moving targets;
[0021] (b) The lidar uses a rotating scanning method with a point cloud density ≥ 200 points/m², and the RANSAC algorithm is applied to segment the ground point cloud;
[0022] (c) Establish a model for detecting anomalies in reflection intensity:
[0023] I_ratio(x,y,z) = [I(x,y,z) - μ_I] / σ_I
[0024] When I_ratio > 3, the point is marked as an abnormal reflection point.
[0025] (d) Integrate the millimeter-wave velocity spectrum and lidar reflection characteristics to construct a dynamic risk field model:
[0026] Pisk(x,y,z,t) = α·v_doppler(x,y,z,t) + β·I_ratio(x,y,z,t)
[0027] Where I(x,y,z) is the lidar reflection intensity at spatial coordinates (x,y,z), α = 0.6 is the velocity influence factor, β = 0.4 is the reflection intensity influence factor, v_doppler(x,y,z,t) is the radial velocity of the target measured by the millimeter-wave radar, I_ratio is the standardized value of the lidar reflection intensity, μ_I is the mean reflection intensity, and σ_I is the standard deviation of the reflection intensity.
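As a rough illustration of items (c) and (d), the standardized reflection intensity and the risk field can be computed as below. The function names, the flat 100-cell grid, and the zero-variance guard are assumptions made for the sketch; only the formulas, α = 0.6, β = 0.4, and the threshold of 3 come from the text.

```python
import numpy as np

ALPHA = 0.6  # velocity influence factor from the text
BETA = 0.4   # reflection intensity influence factor from the text


def intensity_ratio(intensity: np.ndarray) -> np.ndarray:
    """Standardize lidar reflection intensity: (I - mu_I) / sigma_I."""
    sigma = intensity.std()
    if sigma == 0:
        # Assumption: a perfectly flat scene has no anomalies.
        return np.zeros_like(intensity, dtype=float)
    return (intensity - intensity.mean()) / sigma


def risk_field(v_doppler: np.ndarray, i_ratio: np.ndarray) -> np.ndarray:
    """Dynamic risk field: alpha * v_doppler + beta * I_ratio."""
    return ALPHA * v_doppler + BETA * i_ratio


# 100 made-up grid cells: uniform reflectivity except one bright return.
intensity = np.ones(100)
intensity[42] = 10.0
ratio = intensity_ratio(intensity)
abnormal = ratio > 3.0                     # abnormal reflection points
risk = risk_field(np.zeros(100), ratio)    # stationary scene: v_doppler = 0
print(int(abnormal.sum()), "abnormal reflection point(s)")
```

Note that the two terms have different units (a velocity and a dimensionless score), so in practice the velocity would probably need to be normalized before the weighted sum; the source does not say how.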
[0028] Optionally, the environmental parameter processing in step (3) includes:
[0029] (a) Establishing a spatiotemporal model of temperature and humidity: Kriging interpolation was used to generate a two-dimensional distribution of data from 16 distributed sensors;
[0030] (b) Calculate the dew point temperature difference ΔTd(t) = T(t) - Td(t), and activate the high-precision monitoring mode when ΔTd(t) < 1℃;
[0031] (c) Wind speed vector analysis: The wind field matrix was constructed using four three-dimensional ultrasonic anemometers, and the curl ▽×V and divergence ▽·V were calculated;
[0032] (d) Introducing the phase transition index
[0033]
[0034] A phase transition warning is generated when PI(t) > 0.15, where PI(t) is the phase transition index at time t, Td(t) is the dew point temperature at time t, RH(t) is the relative humidity measurement at time t, T(t) is the ambient temperature at time t, and ‖V(t)‖ is the magnitude of the three-dimensional wind speed vector.
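Items (b) and (c) above can be illustrated with a short sketch. It assumes that the dew point is already available as an input and that the wind field has been interpolated onto a regular horizontal grid, so only the horizontal divergence and the vertical component of the curl are computed. The function names and the test field are hypothetical.

```python
import numpy as np


def high_precision_mode(t_air: float, t_dew: float) -> bool:
    """Activate high-precision monitoring when T(t) - Td(t) < 1 degC."""
    return (t_air - t_dew) < 1.0


def divergence_and_curl_z(u: np.ndarray, v: np.ndarray, dx: float, dy: float):
    """Horizontal divergence and vertical curl of a wind field indexed as [y, x]."""
    du_dy, du_dx = np.gradient(u, dy, dx)   # spacing for axis 0 (y), then axis 1 (x)
    dv_dy, dv_dx = np.gradient(v, dy, dx)
    divergence = du_dx + dv_dy
    curl_z = dv_dx - du_dy
    return divergence, curl_z


# Made-up test field: rigid rotation u = -y, v = x (zero divergence, curl_z = 2).
yy, xx = np.mgrid[0:5, 0:5].astype(float)
div, curl = divergence_and_curl_z(-yy, xx, dx=1.0, dy=1.0)
print(high_precision_mode(t_air=0.5, t_dew=0.0), float(div.max()), float(curl.min()))
```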
[0035] Optionally, the feature fusion in step (3) adopts:
[0036] (a) Establish a three-level fusion architecture: pixel-level fusion of thermal imaging and lidar data, feature-level fusion of radar reflection features, and decision-level fusion of environmental parameters;
[0037] (b) Design an attention weighting mechanism: dynamically adjust the weights of each modality feature through deformable convolutional kernels:
[0038] W_i = σ(Conv3D(F_i))
[0039] Where W_i is the attention weight matrix for the i-th modality, σ(·) is the sigmoid function, Conv3D(·) is the 3D convolution operation, and F_i is the input feature map for the i-th modality;
[0040] (c) Introducing an adversarial training strategy: Constructing a generator network to simulate multimodal data distribution, and a discriminator network to optimize feature discriminability;
[0041] (d) Implement cross-modal contrastive learning: establish positive and negative sample pairs and optimize the feature embedding space.
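The gating in item (b) can be sketched with NumPy alone. Two simplifications are swapped in and should not be read as the patented design: a pointwise (1×1×1) 3D convolution stands in for the deformable kernels, and the gated features are combined by an element-wise weighted sum, which the source does not specify. All names and tensor shapes are hypothetical.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def pointwise_conv3d(feature, kernel, bias):
    """1x1x1 3D convolution: mixes channels independently at every voxel.

    feature: (C_in, D, H, W); kernel: (C_out, C_in); bias: (C_out,)
    """
    return np.einsum("oc,cdhw->odhw", kernel, feature) + bias[:, None, None, None]


def attention_fuse(features, kernels, biases):
    """Compute W_i = sigmoid(conv(F_i)) per modality and return sum_i W_i * F_i."""
    weights = [
        sigmoid(pointwise_conv3d(f, k, b))
        for f, k, b in zip(features, kernels, biases)
    ]
    fused = sum(w * f for w, f in zip(weights, features))
    return fused, weights


# Three made-up modality feature maps (thermal, lidar, radar), 4 channels each.
rng = np.random.default_rng(0)
features = [rng.normal(size=(4, 2, 8, 8)) for _ in range(3)]
kernels = [rng.normal(scale=0.1, size=(4, 4)) for _ in range(3)]
biases = [np.zeros(4) for _ in range(3)]
fused, weights = attention_fuse(features, kernels, biases)
print(fused.shape)
```

In a trained system the kernels would be learned, so that an unreliable modality (for example thermal imaging in fog) receives gate values closer to 0.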
[0042] Optionally, the CNN-LSTM network in step (4) specifically includes:
[0043] (a) The first channel adopts the DenseNet-121 architecture, inputting a 256×256 thermal infrared image and outputting a 512-dimensional feature vector;
[0044] (b) The second channel adopts a bidirectional LSTM structure, inputting the environmental parameter sequence of the past 30 minutes and outputting 128-dimensional time series features;
[0045] (c) Design a cross-channel attention module: calculate the correlation weight between the image and temporal features using cosine similarity;
[0046] (d) The output layer uses a mixture density network to simultaneously predict the probability of ice presence and a confidence interval for the ice thickness.
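To make item (d) concrete, the sketch below shows how an interval probability for thickness could be read off a Gaussian mixture output. The choice of Gaussian components and every number in the example are assumptions; the source only states that the output layer predicts an ice presence probability together with a thickness confidence interval.

```python
import math


def normal_cdf(x: float, mean: float, std: float) -> float:
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))


def interval_probability(weights, means, stds, low, high):
    """Probability mass of a Gaussian mixture inside [low, high]."""
    return sum(
        w * (normal_cdf(high, m, s) - normal_cdf(low, m, s))
        for w, m, s in zip(weights, means, stds)
    )


# Hypothetical network output for one grid cell.
p_ice = 0.9                 # probability that ice is present
weights = [0.7, 0.3]        # mixture weights (sum to 1)
means_mm = [3.0, 3.4]       # component means, in mm
stds_mm = [0.6, 0.9]        # component standard deviations, in mm
p_2_to_4 = interval_probability(weights, means_mm, stds_mm, 2.0, 4.0)
print(f"P(ice) = {p_ice:.2f}, P(2 mm <= thickness <= 4 mm | ice) = {p_2_to_4:.2f}")
```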
[0047] Optionally, the spatiotemporal graph convolutional network in step (5) includes:
[0048] (a) Constructing a dynamic graph structure: Nodes represent grid cells of the monitoring area, and edge weights are determined by both spatial distance and wind speed correlation;
[0049] (b) Design of hierarchical spatiotemporal blocks: Each block contains a gated TCN temporal module and a Chebyshev graph convolutional spatial module;
[0050] (c) Introducing a memory enhancement mechanism: Adding a differentiable neural dictionary to the encoder to store typical disaster pattern features;
[0051] (d) Employ a multi-task output head: simultaneously predict ice thickness, coverage growth rate, and coordinates of the maximum danger zone.
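Items (a) and (b) can be sketched as follows. The exact edge-weight formula is not given in the source, so the product of a Gaussian distance kernel and a clipped wind-speed correlation used here is an assumption, as are the function names, the length scale, and the coefficients `theta`. The Chebyshev part follows the standard recurrence T_k = 2·L̃·T_{k-1} - T_{k-2} on the rescaled normalized Laplacian.

```python
import numpy as np


def edge_weights(coords, wind_series, length_scale=10.0):
    """Edge weight = Gaussian distance kernel * non-negative wind-speed correlation."""
    dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    spatial = np.exp(-((dist / length_scale) ** 2))
    corr = np.nan_to_num(np.corrcoef(wind_series))   # constant series give 0
    w = spatial * np.clip(corr, 0.0, 1.0)
    np.fill_diagonal(w, 0.0)                          # no self-loops
    return w


def scaled_laplacian(w):
    """L_tilde = L_sym - I, using the bound lambda_max <= 2 for the normalized Laplacian."""
    deg = w.sum(axis=1)
    d_inv_sqrt = np.zeros_like(deg)
    d_inv_sqrt[deg > 0] = deg[deg > 0] ** -0.5
    l_sym = np.eye(len(w)) - d_inv_sqrt[:, None] * w * d_inv_sqrt[None, :]
    return l_sym - np.eye(len(w))


def chebyshev_conv(l_tilde, x, theta):
    """y = sum_k theta[k] * T_k(L_tilde) @ x, via the Chebyshev recurrence."""
    t_prev, t_curr = x, l_tilde @ x
    out = theta[0] * t_prev
    for k in range(1, len(theta)):
        out = out + theta[k] * t_curr
        t_prev, t_curr = t_curr, 2.0 * l_tilde @ t_curr - t_prev
    return out


# Four made-up grid cells on a 5 m square, with 48 wind-speed samples each.
rng = np.random.default_rng(1)
coords = np.array([[0.0, 0.0], [5.0, 0.0], [0.0, 5.0], [5.0, 5.0]])
wind = rng.normal(loc=3.0, scale=1.0, size=(4, 48))
w = edge_weights(coords, wind)
l_tilde = scaled_laplacian(w)
thickness_mm = np.array([1.0, 2.0, 0.5, 3.0])        # node signal
smoothed = chebyshev_conv(l_tilde, thickness_mm, theta=[0.5, 0.3, 0.2])
print(smoothed.shape)
```

Recomputing `edge_weights` as the wind measurements change is what would make the graph dynamic in the sense of item (a).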
[0052] Optionally, the dynamic early warning system in step (6) includes:
[0053] (a) Three-level response mechanism: Level 1 corresponds to predicted thickness <3mm, only data is recorded; Level 2 is 3-5mm, warning display is activated; Level 3 is >5mm, de-icing device is activated;
[0054] (b) De-icing control strategy: Based on the predicted ice layer distribution density, the power distribution of the carbon fiber heating film is dynamically adjusted. The de-icing power control equation is as follows:
[0055]
[0056] In this equation, P_max is the maximum heating power of the carbon fiber heating film, k is the power growth coefficient, and the independent variable is the predicted ice thickness;
[0057] (c) V2X communication protocol: adopts IEEE 802.11p standard, broadcasts road condition index RSI∈[0,1] every 500ms;
[0058] (d) Self-test feedback loop: Set up a piezoelectric sensor to verify the actual de-icing effect and automatically calibrate the prediction model parameters.
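The three-level response in item (a) maps directly to a small lookup. The function and dictionary names are illustrative. The text does not say which level applies at exactly 3 mm or 5 mm; this sketch assigns both boundary values to Level 2, which agrees with step (6), where the de-icing device is triggered only when the thickness exceeds 5 mm.

```python
def response_level(predicted_thickness_mm: float) -> int:
    """Map a predicted ice thickness (mm) to the response level from the text."""
    if predicted_thickness_mm < 3.0:
        return 1            # Level 1: < 3 mm
    if predicted_thickness_mm <= 5.0:
        return 2            # Level 2: 3-5 mm (boundaries assumed inclusive)
    return 3                # Level 3: > 5 mm


ACTIONS = {
    1: "record data only",
    2: "activate warning display",
    3: "activate de-icing device",
}

for thickness in (1.2, 4.0, 6.5):
    level = response_level(thickness)
    print(f"{thickness} mm -> level {level}: {ACTIONS[level]}")
```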
[0059] Optionally, model training optimization steps may also be included:
[0060] (a) Constructing multi-scale training data: Extreme weather scenario data are generated through numerical simulation, and StyleGAN is used to enhance sample diversity;
[0061] (b) Design a hybrid loss function in which α = 0.4, β = 0.3, and γ = 0.3 are the weighting coefficients of the cross-entropy loss, the contrastive loss, and the mean absolute error;
[0062] (c) Implement curriculum learning strategies: phased training from simple weather conditions to complex freezing scenarios;
[0063] (d) Dynamic batch normalization is adopted: the normalization layer parameters are adjusted according to real-time environmental parameters.
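The hybrid loss in item (b) is not written out in the text, so the following sketch assumes the simplest reading: a weighted sum in which α, β, and γ multiply the cross-entropy, contrastive, and mean-absolute-error terms in that order. The contrastive term is passed in as a precomputed number, and all sample values are made up.

```python
import numpy as np

ALPHA, BETA, GAMMA = 0.4, 0.3, 0.3   # weighting coefficients from the text


def cross_entropy(probs: np.ndarray, labels: np.ndarray) -> float:
    """Mean negative log-likelihood of the true class."""
    picked = probs[np.arange(len(labels)), labels]
    return float(-np.mean(np.log(np.clip(picked, 1e-12, 1.0))))


def mean_absolute_error(pred: np.ndarray, target: np.ndarray) -> float:
    return float(np.mean(np.abs(pred - target)))


def hybrid_loss(l_ce: float, l_con: float, l_mae: float) -> float:
    """Assumed form: alpha * L_CE + beta * L_contrastive + gamma * L_MAE."""
    return ALPHA * l_ce + BETA * l_con + GAMMA * l_mae


# Made-up batch of two samples: ice / no-ice probabilities and thickness in mm.
probs = np.array([[0.8, 0.2], [0.3, 0.7]])
labels = np.array([0, 1])
l_ce = cross_entropy(probs, labels)
l_mae = mean_absolute_error(np.array([2.0, 4.5]), np.array([2.5, 4.0]))
l_con = 0.2                           # placeholder for a contrastive loss value
total = hybrid_loss(l_ce, l_con, l_mae)
print(round(total, 4))
```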
[0064] Optionally, the curriculum learning strategy specifically includes:
[0065] Phase 1: Training basic feature extraction capabilities using simulated data under sunny and normal temperature conditions;
[0066] Phase 2: Introduce rain, fog, and low-temperature environmental data to train the robustness of multimodal data fusion;
[0067] Phase 3: Injecting data from extreme freezing scenarios to optimize disaster identification and prediction accuracy;
[0068] In the dynamic batch normalization step:
[0069] The normalization layer parameters are dynamically adjusted based on real-time collected temperature, humidity, and wind speed data;
[0070] The adjustment cycle is triggered every 100 training batches or when environmental parameters fluctuate beyond a threshold.
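The three-phase schedule and the adjustment trigger can be expressed as two small helpers. This is a sketch under assumptions: the phase labels, the number of epochs per phase, the dictionary layout, and the fluctuation limits are hypothetical. Only the phase order and the rule "every 100 training batches or when environmental parameters fluctuate beyond a threshold" come from the text.

```python
PHASES = (
    "phase 1: sunny, normal temperature (simulated data)",
    "phase 2: rain, fog, low temperature",
    "phase 3: extreme freezing scenarios",
)


def curriculum_phase(epoch: int, epochs_per_phase: int = 10) -> str:
    """Return the training phase for an epoch; the last phase runs until the end."""
    return PHASES[min(epoch // epochs_per_phase, len(PHASES) - 1)]


def should_adjust_norm(batch_index, env_now, env_ref, limits, period=100):
    """Trigger every `period` batches or when any parameter drifts past its limit."""
    periodic = batch_index > 0 and batch_index % period == 0
    fluctuated = any(abs(env_now[k] - env_ref[k]) > limits[k] for k in limits)
    return periodic or fluctuated


# Hypothetical reference state and fluctuation limits.
reference = {"temperature": -1.0, "humidity": 85.0, "wind_speed": 3.0}
limits = {"temperature": 2.0, "humidity": 10.0, "wind_speed": 4.0}
cold_snap = {"temperature": -4.5, "humidity": 88.0, "wind_speed": 3.5}

print(curriculum_phase(25))
print(should_adjust_norm(37, cold_snap, reference, limits))
```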
[0071] Compared with the prior art, this application includes at least one of the following beneficial technical effects:
[0072] This invention combines thermal imaging, radar, and environmental sensors to capture three-dimensional temperature anomalies, spatial structures, and environmental precursors in icing areas, accurately distinguishing between icing and interfering targets. It filters noise and extracts dynamic / static features, fusing velocity and reflection intensity to quantify risk and improve data reliability.
[0073] By employing a multi-level architecture and attention mechanism, multimodal features are dynamically integrated to enhance the model's adaptability and generalization ability in complex scenarios. Simultaneous analysis of current ice layer distribution and historical trends, combined with cross-channel correlations, outputs risk levels and prediction ranges, improving the logical consistency of judgments.
[0074] Based on spatiotemporal model analysis of spatial propagation and meteorological drivers, ice growth trends are predicted, providing targeted parameters for response. A tiered response system matches risk levels, adaptively controls melting power, and provides real-time vehicle warnings and automatic model calibration, improving response efficiency and reliability. Data and algorithms at each stage are linked to form a closed loop of "perception-analysis-prediction-response-optimization," enhancing system adaptability and long-term robustness.
[0075] This invention collects temperature, spatial structure, and environmental parameters at tunnel entrances by deploying multimodal sensors. After intelligent processing and multi-level feature fusion, it utilizes deep networks to identify ice layer distribution, detect icing trends, and make spatiotemporal predictions. Combined with a graded early warning mechanism, it dynamically activates ice-melting devices and links them with vehicle warnings, forming a closed loop of perception, analysis, prediction, response, and calibration. This significantly improves the accuracy of ice and condensation disaster detection, the foresight of prediction, and the intelligence of response, ensuring tunnel traffic safety and the long-term robustness of the system.
Attached Figure Description
[0076] Figure 1 is a flowchart of a method for identifying and predicting icing disasters at tunnel entrances based on multimodal data fusion technology.
Detailed Implementation
[0077] The technical solution of the present invention will be further described below with reference to the accompanying drawings and specific embodiments.
[0078] Example 1
[0079] As shown in Figure 1, the tunnel entrance icing disaster identification and prediction method based on multimodal data fusion technology proposed in this invention includes multi-source data acquisition, heterogeneous data processing, feature-level fusion, disaster identification, spatiotemporal prediction, and dynamic early warning. The steps are described in detail below.
[0080] (1) Multi-source data acquisition: Thermal infrared image sequences are acquired through a dual-spectrum thermal imager array deployed at the tunnel entrance; millimeter-wave radar and lidar simultaneously acquire spatial reflection data; and a distributed temperature and humidity sensor group and a three-dimensional ultrasonic anemometer acquire environmental parameters; the thermal infrared image processing specifically includes:
[0081] (a) Set up simultaneous acquisition of dual bands (8-14μm and 3-5μm) to acquire each frame of image at a resolution of 256×256;
[0082] (b) The non-uniformity correction algorithm is used to eliminate the difference in detector response, and the atmospheric transmittance model is applied to perform temperature inversion;
[0083] (c) Establish the temperature gradient matrix ΔT(x,y,t) = T(x,y,t) - T_avg(t);
[0084] (d) Set the dynamic threshold θ(t) = 0.15·T_avg(t) + 2·σ_T(t); when ΔT(x,y,t) < -θ(t), the location is marked as a suspected icing area;
[0085] Where ΔT(x,y,t) is the temperature gradient at time t at spatial coordinates (x,y), T(x,y,t) is the temperature value at time t at spatial coordinates (x,y), θ(t) is the dynamic threshold at time t, T_avg(t) is the average temperature of the current frame, and σ_T(t) is the temperature standard deviation.
[0086] In this embodiment:
[0087] Simultaneous acquisition with the dual-spectrum thermal imager (8-14 μm and 3-5 μm), combined with the temperature gradient matrix ΔT and the dynamic threshold, accurately captures and marks suspected icing areas where local temperature anomalies occur. Compared with a single band or a fixed threshold, this is more sensitive to hidden risks such as thin ice and edge icing.
[0088] It is worth noting that millimeter-wave radar (for dynamic targets) and lidar (for static structures) acquire data simultaneously, combining dynamic and static data: the former distinguishes dynamic targets such as vehicles / pedestrians, eliminating interference; the latter constructs a three-dimensional spatial grid to locate static icing areas, avoiding false alarms. Distributed temperature and humidity sensors combined with a three-dimensional anemometer capture phase change environmental conditions in advance (activating high-precision monitoring when the dew point temperature difference ΔTd < 1℃), providing precursor data for subsequent predictions. The introduction of temperature standard deviation allows the threshold to adaptively adjust with environmental fluctuations, maintaining detection stability even in non-steady-state scenarios such as changes in illumination and vehicle traffic, whereas traditional fixed thresholds are easily affected by interference. Thermal imaging data provides temperature semantic annotation (temperature characteristics of icing areas) for radar data, while radar data provides spatial coordinate calibration for thermal imaging. Multimodal data mutually verify each other, reducing the false alarm rate of a single sensor.
[0089] (2) Heterogeneous data processing: Radiometric correction and temperature inversion are performed on thermal infrared images to generate a temperature distribution matrix; Doppler filtering is performed on millimeter-wave radar data to extract dynamic target reflection features; ground segmentation and anomaly filtering are performed on lidar point cloud data to construct a three-dimensional spatial mesh model; radar data processing includes:
[0090] (a) The millimeter-wave radar uses FMCW modulation and eliminates static clutter through an MTI filter to extract the Doppler frequency shift of moving targets;
[0091] (b) The lidar uses a rotating scanning method with a point cloud density ≥ 200 points/m², and the RANSAC algorithm is applied to segment the ground point cloud;
[0092] (c) Establish a model for detecting anomalies in reflection intensity:
[0093] I_ratio(x,y,z) = [I(x,y,z) - μ_I] / σ_I
[0094] When I_ratio > 3, the point is marked as an abnormal reflection point.
[0095] (d) Integrate the millimeter-wave velocity spectrum and lidar reflection characteristics to construct a dynamic risk field model:
[0096] Pisk(x,y,z,t) = α·v_doppler(x,y,z,t) + β·I_ratio(x,y,z,t)
[0097] Where I(x,y,z) is the lidar reflection intensity at spatial coordinates (x,y,z), α = 0.6 is the velocity influence factor, β = 0.4 is the reflection intensity influence factor, v_doppler(x,y,z,t) is the radial velocity of the target measured by the millimeter-wave radar, I_ratio is the standardized value of the lidar reflection intensity, μ_I is the mean reflection intensity, and σ_I is the standard deviation of the reflection intensity.
[0098] It is worth noting that the millimeter-wave radar employs FMCW modulation and MTI filtering, exhibiting strong dynamic clutter suppression capabilities. It can extract the Doppler frequency shift of moving targets, accurately distinguishing between icy areas and moving vehicles (stationary icy areas show no frequency shift, while vehicles exhibit a significant frequency shift). The lidar uses the RANSAC algorithm to segment the ground point cloud, removing irrelevant points on the road surface (such as vegetation and debris) and focusing on key areas at tunnel entrances. The anomaly detection model for reflection intensity (I_ratio>3 indicates anomalies) can identify the high reflectivity of ice layers, complementing the temperature data in terms of physical characteristics (low temperature + strong reflection = high probability of icing). The dynamic risk field model (Pisk=α·v+β·I_ratio) fuses velocity and reflection intensity to quantify dynamic risk (high-speed moving wet vehicles may accelerate icing), which is difficult to achieve with traditional single radar systems.
[0099] With a lidar point cloud density ≥ 200 points/m², the high-density point cloud can capture millimeter-level changes in ice surface roughness, indirectly assisting in determining the degree of icing and surpassing the macroscopic detection capabilities of traditional radar. The radar data processing results (dynamic target trajectories) provide motion correlation verification for suspected icing areas in thermal imaging (stationary areas with continuous low temperatures and high reflectivity indicate icing, while moving low-temperature areas may be water stains), reducing false alarms.
[0100] (3) Feature-level fusion: An improved ResNet-50 network is used to extract deep features from thermal infrared images, PointNet++ network is used to process lidar point cloud features, and LSTM network is used to extract temporal features of environmental parameters. Adaptive weighted fusion is then performed in the feature space. Environmental parameter processing includes:
[0101] (a) Establishing a spatiotemporal model of temperature and humidity: Kriging interpolation was used to generate a two-dimensional distribution of data from 16 distributed sensors;
[0102] (b) Calculate the dew point temperature difference ΔTd(t) = T(t) - Td(t), and activate the high-precision monitoring mode when ΔTd(t) < 1℃;
[0103] (c) Wind speed vector analysis: The wind field matrix was constructed using four three-dimensional ultrasonic anemometers, and the curl ▽×V and divergence ▽·V were calculated;
[0104] (d) Introducing the phase transition index
[0105]
[0106] A phase transition warning is generated when PI(t) > 0.15, where PI(t) is the phase transition index at time t, Td(t) is the dew point temperature at time t, RH(t) is the relative humidity measurement at time t, T(t) is the ambient temperature at time t, and ‖V(t)‖ is the magnitude of the three-dimensional wind speed vector.
[0107] In addition, feature fusion employs:
[0108] (a) Establish a three-level fusion architecture: pixel-level fusion of thermal imaging and lidar data, feature-level fusion of radar reflection features, and decision-level fusion of environmental parameters;
[0109] (b) Design an attention weighting mechanism: dynamically adjust the weights of each modality feature through deformable convolutional kernels:
[0110] W_i = σ(Conv3D(F_i))
[0111] Where W_i is the attention weight matrix for the i-th modality, σ(·) is the sigmoid function, Conv3D(·) is the 3D convolution operation, and F_i is the input feature map for the i-th modality;
[0112] (c) Introducing an adversarial training strategy: Constructing a generator network to simulate multimodal data distribution, and a discriminator network to optimize feature discriminability;
[0113] (d) Implement cross-modal contrastive learning: establish positive sample pairs (multimodal data in the same scene) and negative sample pairs (data in different scenes) to optimize the feature embedding space.
[0114] In this embodiment, a three-level fusion architecture (pixel-level + feature-level + decision-level) abstracts features layer by layer: pixel-level fusion of thermal imaging and LiDAR to locate the spatial position of icing; feature-level fusion of radar reflection features to enhance structural information; and decision-level fusion of environmental parameters (such as RH, T, V) to determine the physical conditions of icing, achieving comprehensive fusion from the data layer to the semantic layer. An attention weighting mechanism (deformable convolutional kernels dynamically adjust weights) enables the model to automatically focus on key modalities: for example, in foggy weather where the reliability of thermal imaging decreases, the model will reduce its weight and increase the feature proportion of LiDAR point clouds. Adversarial training and contrastive learning enhance feature discriminativity: the generator simulates multimodal data distribution, forcing the discriminator to learn to distinguish between "real icing features" and "interference features"; cross-modal contrastive learning makes multimodal features of the same scene (low-temperature areas of thermal imaging + high-reflectivity points of LiDAR) closer in the feature space, improving the model's generalization ability to complex scenes.
[0115] Furthermore, the phase transition index, PI = RH² / (T + 273.15) · exp(-0.22·‖V‖), quantifies the synergistic effect of humidity, temperature, and wind speed. For example, when high humidity, low temperature, and low wind speed occur together, PI surges, providing an early warning of the "rapid icing risk in a static and stable environment." Traditional single-parameter thresholds cannot capture such compound conditions. The deep thermal imaging features (texture, temperature gradient) extracted by ResNet complement the point cloud geometric features (curvature, height) from PointNet++ in spatial semantics, while the temporal features (temperature and humidity change trends) from LSTM provide clues in the time dimension. The cross-validation of spatiotemporal features enhances the logical consistency of icing trend judgment.
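The qualitative behavior described in paragraph [0115] can be checked with a direct transcription of the index as it is written there. This is only a sketch: the function name is hypothetical, the unit of RH is not stated in the text, and the definition list in the claims also mentions Td(t), so the full formula may contain terms that this version lacks. For that reason the example only compares values against each other and does not test them against the 0.15 warning threshold.

```python
import math


def phase_transition_index(rh: float, temp_c: float, wind_speed: float) -> float:
    """PI = RH^2 / (T + 273.15) * exp(-0.22 * |V|), as written in paragraph [0115]."""
    return rh ** 2 / (temp_c + 273.15) * math.exp(-0.22 * wind_speed)


# Made-up conditions; RH is given here as a fraction, which is an assumption.
calm_humid_cold = phase_transition_index(rh=0.95, temp_c=-2.0, wind_speed=0.5)
windy_humid_cold = phase_transition_index(rh=0.95, temp_c=-2.0, wind_speed=8.0)
calm_dry_cold = phase_transition_index(rh=0.40, temp_c=-2.0, wind_speed=0.5)

# Higher humidity and lower wind speed both raise the index.
print(calm_humid_cold > windy_humid_cold, calm_humid_cold > calm_dry_cold)
```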
[0116] (4) Disaster Identification: The fused features are input into a cascaded dual-channel CNN-LSTM network. The first channel identifies the current ice layer distribution, and the second channel detects the icing trend, outputting the disaster risk level. The CNN-LSTM network specifically includes:
[0117] (a) The first channel adopts the DenseNet-121 architecture, inputting a 256×256 thermal infrared image and outputting a 512-dimensional feature vector;
[0118] (b) The second channel adopts a bidirectional LSTM structure, inputting the environmental parameter sequence of the past 30 minutes and outputting 128-dimensional time series features;
[0119] (c) Design a cross-channel attention module: calculate the correlation weight between the image and temporal features using cosine similarity;
[0120] (d) The output layer uses a mixture density network to simultaneously predict the probability of ice presence and a confidence interval for the ice thickness.
[0121] Through the above scheme, the cascaded dual-channel CNN-LSTM network processes spatial and temporal features in parallel: DenseNet-121 identifies the current ice distribution (iced areas in a 256×256 image), and the bidirectional LSTM analyzes the environmental parameter sequence over the past 30 minutes (for example, a continuous downward trend in temperature), combining static and dynamic data to determine the current risk and its development trend. The cross-channel attention module calculates the correlation weights between the image and temporal features using cosine similarity and automatically emphasizes the key factors (raising image feature weights when the temperature drops sharply and raising temporal feature weights when humidity is stable), avoiding the "averaging" defect of traditional fusion methods. The mixture density network outputs the probability of ice presence plus a thickness confidence interval, quantifying prediction uncertainty and providing a risk range reference for decision-making (for example, "the ice thickness has an 80% probability of being between 2 and 4 mm"). Traditional models output only a single value and lack a reliability assessment.
[0122] The dual-channel structure enables real-time monitoring of model consistency: if the image channel shows icing while the temporal channel lacks trend support, a data re-examination mechanism is triggered to reduce occasional noise interference. The multimodal features output by feature-level fusion provide a "panoramic input" for the recognition network, while the recognition result (current ice layer distribution) provides initial state parameters for subsequent spatiotemporal prediction, forming a causal chain from "feature-recognition-prediction".
[0123] (5) Spatiotemporal prediction: A prediction model is constructed based on a spatiotemporal graph convolutional network. The current fused features are correlated in space and time with the historical disaster database to predict the ice layer thickness and distribution range over the next 2 hours. The spatiotemporal graph convolutional network includes:
[0124] (a) Constructing a dynamic graph structure: nodes represent grid cells of the monitoring area, and edge weights are determined jointly by spatial distance and wind speed correlation;
[0125] (b) Designing hierarchical spatiotemporal blocks: each block contains a gated TCN temporal module and a Chebyshev graph convolutional spatial module;
[0126] (c) Introducing a memory enhancement mechanism: a differentiable neural dictionary is added to the encoder to store typical disaster pattern features;
[0127] (d) Employing a multi-task output head: ice thickness, coverage growth rate, and the coordinates of the maximum danger zone are predicted simultaneously.
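The Chebyshev graph convolution named in item (b) is commonly evaluated with the recurrence T_0(L)x = x, T_1(L)x = Lx, T_k(L)x = 2L·T_{k-1}(L)x − T_{k-2}(L)x. The following Python sketch applies it to one scalar value per grid node. It assumes the graph Laplacian has already been rescaled so that its eigenvalues lie in [-1, 1]; the matrix, signal, and coefficients `theta` are made-up illustration values.

```python
def mat_vec(matrix, vector):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def chebyshev_graph_conv(l_scaled, x, theta):
    """Return sum_k theta[k] * T_k(l_scaled) @ x for a scalar node signal x.

    l_scaled: rescaled graph Laplacian (assumed eigenvalues in [-1, 1]).
    theta:    one coefficient per Chebyshev order (learned in a real model).
    """
    if not theta:
        raise ValueError("theta needs at least one coefficient")
    t_prev = list(x)  # T_0 x = x
    out = [theta[0] * v for v in t_prev]
    if len(theta) == 1:
        return out
    t_curr = mat_vec(l_scaled, x)  # T_1 x = L x
    out = [o + theta[1] * v for o, v in zip(out, t_curr)]
    for k in range(2, len(theta)):
        # T_k x = 2 L T_{k-1} x - T_{k-2} x
        l_t = mat_vec(l_scaled, t_curr)
        t_next = [2.0 * a - b for a, b in zip(l_t, t_prev)]
        out = [o + theta[k] * v for o, v in zip(out, t_next)]
        t_prev, t_curr = t_curr, t_next
    return out


# Two-node toy graph: each order mixes information one more hop away.
l_scaled = [[0.0, 1.0], [1.0, 0.0]]
y = chebyshev_graph_conv(l_scaled, [1.0, 2.0], theta=[1.0, 1.0, 1.0])
```

The recurrence avoids an eigendecomposition of the Laplacian, which is the usual reason for choosing the Chebyshev form when the graph changes over time.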
[0128] In this embodiment, the dynamic graph structure defines edge weights based on spatial distance and wind speed correlation, quantifying the physical diffusion mechanism (e.g., at high wind speeds, icing may spread along the wind direction, increasing the edge weights of adjacent grids). Traditional spatiotemporal models only consider spatial distance, ignoring meteorological factors. The hierarchical spatiotemporal blocks combine gated TCN (temporal modeling) and Chebyshev graph convolution (spatial modeling) to simultaneously capture temporal dependence (nighttime temperature drops accelerate icing) and spatial propagation (icing preferentially occurs on the windward side of the tunnel entrance, spreading to the leeward side). A memory enhancement mechanism stores typical disaster patterns (features of historical extreme freezing events), quickly recalling prior knowledge in similar scenarios to improve prediction accuracy for small samples, whereas traditional models rely on large amounts of real-time data.
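One way to combine spatial distance and wind speed correlation into a single edge weight is sketched below in Python. The Gaussian distance kernel, the `length_scale` value, and the rule that only positive wind correlation strengthens an edge are assumptions chosen for illustration; the embodiment does not state the exact formula.

```python
import math


def pearson(a, b):
    """Pearson correlation of two equal-length series (0.0 if one is constant)."""
    n = len(a)
    if n != len(b) or n < 2:
        raise ValueError("series must have equal length of at least 2")
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0.0 or var_b == 0.0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def edge_weight(pos_i, pos_j, wind_i, wind_j, length_scale=10.0):
    """Edge weight between two grid cells (hypothetical form).

    spatial term: Gaussian kernel of the distance between cell centers
    wind term:    (1 + r) / 2, with r the wind speed correlation clipped at 0,
                  so strongly co-varying wind raises the weight up to 2x
    """
    dist = math.dist(pos_i, pos_j)
    spatial = math.exp(-((dist / length_scale) ** 2))
    corr = max(0.0, pearson(wind_i, wind_j))
    return spatial * (1.0 + corr) / 2.0


wind_a = [1.0, 2.0, 3.0, 4.0]          # wind speed series at cell A (m/s)
wind_opposed = [4.0, 3.0, 2.0, 1.0]    # a cell whose wind trend is opposite
near = edge_weight((0.0, 0.0), (1.0, 0.0), wind_a, wind_a)
far = edge_weight((0.0, 0.0), (20.0, 0.0), wind_a, wind_a)
```

Recomputing these weights whenever a new wind window arrives is what would make the graph "dynamic" in this sketch.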
[0129] The multi-task output head simultaneously predicts thickness, coverage growth rate, and the coordinates of the maximum danger zone, providing "targeted control parameters" for the ice melting device (for example, prioritizing increased heating power in areas with high growth rates), whereas a traditional single thickness prediction is insufficient for refined control. The temporal features fused at the feature level (temperature and humidity trends extracted by the LSTM) provide dynamic input for the spatiotemporal graph convolution, while the prediction results (ice layer distribution in the next 2 hours) are fed back to the dynamic early warning module, forming a temporal closed loop of "prediction-early warning-control".
[0130] (6) Dynamic Early Warning: Based on the prediction results, a graded early warning signal is generated. When the predicted ice thickness exceeds 5mm, the active ice melting device is triggered, and a road condition warning is sent to the approaching vehicle via the V2X system. The dynamic early warning system includes:
[0131] (a) Three-level response mechanism: Level 1 (blue) corresponds to predicted thickness <3mm, only data is recorded; Level 2 (yellow) 3-5mm, warning display is activated; Level 3 (red) >5mm, ice melting device is activated;
[0132] (b) De-icing control strategy: Based on the predicted ice layer distribution density, the power distribution of the carbon fiber heating film is dynamically adjusted. The de-icing power control equation is as follows:
[0133]
[0134] where P_max is the maximum heating power of the carbon fiber heating film, k is the power growth coefficient, and the remaining variable is the predicted ice thickness;
[0135] (c) V2X communication protocol: adopts IEEE 802.11p standard, broadcasts road condition index RSI∈[0,1] every 500ms;
[0136] (d) Self-test feedback loop: Set up a piezoelectric sensor to verify the actual de-icing effect and automatically calibrate the prediction model parameters.
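The three-level response in item (a) reduces to a threshold lookup. The Python sketch below follows the stated boundaries (below 3 mm, 3 to 5 mm, above 5 mm); the function name and the returned tuple layout are illustrative choices.

```python
def warning_level(predicted_thickness_mm):
    """Map a predicted ice thickness to (level, color, action).

    Boundaries follow the three-level mechanism: <3 mm, 3-5 mm, >5 mm.
    A value of exactly 5 mm stays at level 2 because level 3 requires >5 mm.
    """
    if predicted_thickness_mm < 0:
        raise ValueError("thickness cannot be negative")
    if predicted_thickness_mm < 3.0:
        return 1, "blue", "record data only"
    if predicted_thickness_mm <= 5.0:
        return 2, "yellow", "activate warning display"
    return 3, "red", "activate ice-melting device"


for thickness in (1.2, 3.0, 5.0, 5.1):
    level, color, action = warning_level(thickness)
    print(f"{thickness:.1f} mm -> level {level} ({color}): {action}")
```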
[0137] It is worth noting that the three-level response mechanism (blue / yellow / red) matches risk levels with handling costs: for example, when the predicted thickness is <3 mm, only data is recorded to avoid wasting resources; when it is >5 mm, the ice-melting device is activated to prevent accidents caused by thickening ice. Traditional "one-size-fits-all" warnings are prone to over- or under-response. The dynamic control equation for ice-melting power adaptively adjusts the heating intensity based on the predicted thickness, with power increasing exponentially in thick ice areas and remaining low in thin ice areas, achieving energy savings of over 30% compared with uniform heating across the entire area. V2X communication (IEEE 802.11p) broadcasts the road condition index (RSI) every 500 ms, allowing vehicles to adjust their braking distance in advance, which reduces the risk of rear-end collisions, especially where visibility is obstructed at tunnel entrances.
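The thickness-dependent power allocation can be sketched in Python as follows. The control equation itself is not legible in this text, so the saturating form P = P_max · (1 − e^(−k·h)) used here is purely an assumed stand-in, chosen because it uses the three named quantities and never exceeds P_max; an exponential that grows with thickness and is capped at P_max would be an equally plausible reading. The values of `p_max_w` and `k_per_mm` are hypothetical.

```python
import math


def heating_power(thickness_mm, p_max_w=500.0, k_per_mm=0.4):
    """Heating power for one grid cell under an ASSUMED control law.

    P = p_max_w * (1 - exp(-k_per_mm * thickness_mm))
    p_max_w:  maximum heating power of the film (hypothetical value)
    k_per_mm: power growth coefficient (hypothetical value)
    """
    if thickness_mm < 0:
        raise ValueError("thickness cannot be negative")
    return p_max_w * (1.0 - math.exp(-k_per_mm * thickness_mm))


def power_map(thickness_grid_mm, **params):
    """Apply the control law cell by cell to a 2-D grid of predicted thickness."""
    return [[heating_power(h, **params) for h in row] for row in thickness_grid_mm]


# Thin-ice cells receive little power, thick-ice cells approach p_max_w.
grid = power_map([[0.0, 1.0], [4.0, 8.0]])
```

Whatever the exact law, the point of computing power per cell is that heating effort follows the predicted ice distribution instead of being spread uniformly.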
[0138] Furthermore, the piezoelectric sensor's self-test feedback loop verifies the de-icing effect in real time (through the vibration signals generated by ice shedding) and automatically calibrates the prediction model parameters, forming a physical closed loop of "detection-prediction-control-feedback." Traditional systems rely on manual inspection, which lags considerably. The spatiotemporal prediction results provide direct evidence for the early warning level and de-icing strategy, while the self-test feedback data in turn optimizes the prediction model, so that the steps jointly strengthen the system's self-evolution capability and reduce prediction errors over long-term operation.
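A very simple form of such feedback calibration is a running bias correction, sketched below in Python. It assumes that the self-test step can be converted into a measured thickness value, and it tracks the prediction error with an exponential moving average; the class name, the `rate` parameter, and the correction rule are hypothetical and stand in for whatever parameter calibration the system actually performs.

```python
class ThicknessCalibrator:
    """Tracks the average prediction error and subtracts it from new predictions."""

    def __init__(self, rate=0.2):
        if not 0.0 < rate <= 1.0:
            raise ValueError("rate must be in (0, 1]")
        self.rate = rate   # how quickly the bias estimate follows new evidence
        self.bias = 0.0    # running estimate of (predicted - measured), in mm

    def update(self, predicted_mm, measured_mm):
        """Fold one self-test observation into the bias estimate."""
        error = predicted_mm - measured_mm
        self.bias += self.rate * (error - self.bias)
        return self.bias

    def correct(self, predicted_mm):
        """Apply the current bias estimate; thickness is clamped at zero."""
        return max(0.0, predicted_mm - self.bias)


calibrator = ThicknessCalibrator(rate=0.5)
calibrator.update(predicted_mm=5.0, measured_mm=3.0)  # model overestimated by 2 mm
corrected = calibrator.correct(5.0)
```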
[0139] Example 2
[0140] On the basis of Example 1, this example further includes a model training and optimization step:
[0141] (a) Constructing multi-scale training data: Extreme weather scenario data (humidity > 85%, temperature < 0℃) is generated through numerical simulation, and StyleGAN is used to enhance sample diversity, addressing the scarcity of real freezing disaster data and avoiding misjudgments caused by sample bias. At the same time, multimodal outputs such as thermal imaging, radar, and environmental parameters are generated from the simulation data so that the training data is consistent in modality with the actual monitoring data, improving the model's adaptability to real scenarios.
[0142] (b) Designing a hybrid loss function: $L = \alpha L_{CE} + \beta L_{con} + \gamma L_{MAE}$, where α = 0.4, β = 0.3, and γ = 0.3 are the weighting coefficients, $L_{CE}$ is the cross-entropy loss, $L_{con}$ is the contrastive loss, and $L_{MAE}$ is the mean absolute error. The weighted fusion of the three losses ensures that the model has no weak point across the classification, localization, and regression tasks, whereas traditional single-loss models are prone to problems such as "accurate classification but biased thickness prediction" or "clear outline but misclassification."
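A weighted sum of three loss terms with the coefficients 0.4, 0.3, and 0.3 can be sketched in Python as follows. The cross-entropy and mean-absolute-error helpers use their standard definitions; the second term is passed in as a precomputed number because its exact definition is not given here, and treating it as a contrastive loss is an assumption.

```python
import math


def cross_entropy(class_probs, target_index):
    """Negative log-probability assigned to the correct class."""
    p = max(class_probs[target_index], 1e-12)  # guard against log(0)
    return -math.log(p)


def mean_absolute_error(predicted, target):
    """Average absolute difference, e.g. between predicted and true thickness."""
    if len(predicted) != len(target) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - t) for p, t in zip(predicted, target)) / len(predicted)


def hybrid_loss(l_ce, l_con, l_mae, alpha=0.4, beta=0.3, gamma=0.3):
    """Weighted sum of the three terms; l_con is assumed to be a contrastive loss."""
    return alpha * l_ce + beta * l_con + gamma * l_mae


l_ce = cross_entropy([0.7, 0.2, 0.1], target_index=0)   # classification term
l_mae = mean_absolute_error([2.0, 4.0], [3.0, 3.0])      # thickness term (mm)
total = hybrid_loss(l_ce, l_con=0.5, l_mae=l_mae)        # 0.5 is a placeholder
```

Because the three coefficients sum to 1, the total stays on the same scale as the individual terms, which makes the relative contribution of each task easy to read off.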
[0143] (c) Implementing a curriculum learning strategy: Training is conducted in stages, progressing from simple weather conditions to complex freezing scenarios. The specific curriculum learning strategy includes:
[0144] Stage 1: Simulated data under sunny, normal-temperature conditions is used to train basic feature extraction (for example, thermal imaging temperature distribution patterns), so that complex scenes do not interfere with model convergence and training efficiency improves.
[0145] Stage 2: Rain, fog, and low-temperature environmental data is introduced as interference to train the robustness and noise resistance of multimodal data fusion (for example, filtering out spurious LiDAR points caused by rain and fog).
[0146] Stage 3: Extreme freezing scenario data (humidity > 85%, temperature < 0℃) is injected to optimize disaster identification and prediction accuracy. This stage focuses on the physical mechanisms of freezing (the triggering conditions of the phase transition index PI) to improve identification accuracy in extreme scenarios. Features learned in the earlier stages (such as temperature gradient patterns in sunny-weather thermal imaging) can be transferred to complex scenarios, reducing the cost of repeated learning.
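The staged schedule can be expressed as a lookup from training epoch to curriculum stage, as in the Python sketch below. The epoch boundaries (10 and 25) are hypothetical, since the text does not say how long each stage lasts, and the data descriptions are shorthand labels only.

```python
import bisect

# Shorthand description of the data used in each stage.
STAGE_DATA = {
    1: "simulated sunny, normal-temperature scenes",
    2: "rain, fog, and low-temperature scenes",
    3: "extreme freezing scenes (humidity > 85%, temperature < 0 C)",
}


def stage_for_epoch(epoch, boundaries=(10, 25)):
    """Return the curriculum stage (1, 2, or 3) for a zero-based epoch.

    boundaries: first epoch of stage 2 and first epoch of stage 3
                (hypothetical values, to be tuned per training run).
    """
    if epoch < 0:
        raise ValueError("epoch cannot be negative")
    return bisect.bisect_right(boundaries, epoch) + 1


for epoch in (0, 9, 10, 24, 25, 40):
    stage = stage_for_epoch(epoch)
    print(f"epoch {epoch:2d}: stage {stage} -> {STAGE_DATA[stage]}")
```

Whether earlier-stage data stays in the training pool during later stages is left open here; a common choice is to keep a fraction of it to limit forgetting.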
[0147] (d) Dynamic batch normalization: the normalization layer parameters are adjusted according to real-time environmental parameters. In the dynamic batch normalization step:
[0148] The normalization layer parameters (mean μ, variance σ²) are dynamically adjusted based on temperature, humidity, and wind speed data collected in real time, offsetting the impact of environmental fluctuations on the model. An adjustment is triggered every 100 training batches or when the fluctuation of an environmental parameter exceeds a threshold (for example, a temperature change > 2℃). Traditional fixed batch normalization is prone to "internal covariate shift" in non-steady-state environments, leading to training oscillations. Updating every 100 batches or whenever parameter fluctuations exceed the threshold keeps normalization effective even when the distributed sensor data is sparse (for example, when individual sensors malfunction).
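The update trigger described above (every 100 batches, or a temperature change above 2℃) can be sketched in Python together with a momentum-style refresh of the running statistics. How the environmental readings actually enter the normalization parameters is not specified, so the refresh rule and the `momentum` value below are assumptions.

```python
def should_refresh(batch_index, temp_now_c, temp_at_last_refresh_c,
                   period=100, temp_threshold_c=2.0):
    """True when the periodic or the temperature-drift trigger fires."""
    periodic = batch_index > 0 and batch_index % period == 0
    drift = abs(temp_now_c - temp_at_last_refresh_c) > temp_threshold_c
    return periodic or drift


def refresh_stats(running_mean, running_var, batch_values, momentum=0.1):
    """Blend running mean/variance with the statistics of the latest batch.

    Assumption: a standard exponential-moving-average update; `momentum`
    is a hypothetical value.
    """
    n = len(batch_values)
    if n == 0:
        raise ValueError("batch must not be empty")
    batch_mean = sum(batch_values) / n
    batch_var = sum((v - batch_mean) ** 2 for v in batch_values) / n
    new_mean = (1.0 - momentum) * running_mean + momentum * batch_mean
    new_var = (1.0 - momentum) * running_var + momentum * batch_var
    return new_mean, new_var


mean, var = 0.0, 1.0
if should_refresh(batch_index=50, temp_now_c=-1.5, temp_at_last_refresh_c=1.0):
    # The temperature moved by 2.5 C since the last refresh, so update early.
    mean, var = refresh_stats(mean, var, [2.0, 4.0], momentum=0.5)
```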
[0149] In this embodiment, simulation data is used to inform sensor deployment: numerical simulation reveals the issue of decreased point cloud density in lidar at extreme low temperatures, allowing for early optimization of hardware parameters (increasing scanning frequency), forming a closed loop of "simulation training → hardware tuning → data acquisition." Dynamic batch normalization adapts to sensor noise: fluctuations in temperature and humidity data from distributed sensors (interference from vehicle exhaust at tunnel entrances) can be corrected in real time through dynamic normalization, ensuring the accuracy of environmental parameters (ΔTd, PI) input to the model and reducing temporal feature errors in feature-level fusion.
[0150] Multi-scale data enhances fusion generalization: StyleGAN-enhanced thermal imaging data (ice textures under different lighting conditions) forces ResNet-50 to learn more robust deep features (temperature gradient patterns that are invariant across lighting conditions), making the dynamic weight adjustment of the feature-level fusion attention mechanism more accurate in real-world scenarios (e.g., automatically reducing thermal imaging weights and increasing the proportion of LiDAR in rainy weather). The extreme freezing data injected in the third stage enhances PointNet++'s sensitivity to high-reflectivity points (ice layers) in the point cloud, while the extreme environmental temporal features learned by the LSTM (sudden increase in humidity → surge in PI) enhance feature complementarity through decision-level fusion, improving the spatial discriminability of the feature embeddings from cross-modal contrastive learning by 22%.
[0151] Additionally, the hybrid loss optimizes the detection model: under the constraint of its corresponding loss term, DenseNet-121 can output more refined ice layer segmentation results (distinguishing thin ice from water stains); under the constraint of its corresponding loss term, the bidirectional LSTM can achieve a time-series prediction error of the temperature drop rate of <0.5℃/h. The two are linked through the cross-channel attention module to improve the accuracy of icing trend detection.
[0152] The memory enhancement mechanism integrates curriculum knowledge: the differentiable neural dictionary of the spatiotemporal graph convolutional network stores the extreme freezing patterns from the third stage of curriculum learning (for example, "humidity 85% + temperature -2℃ + wind speed 1 m/s" corresponds to rapid freezing). In similar prediction scenarios, this prior knowledge can be quickly recalled to reduce the thickness prediction error for the next 2 hours. The model trained with the hybrid loss outputs an ice layer thickness confidence interval, which helps the dynamic early warning module adjust the response threshold (when the predicted mean thickness is 4 mm but the variance is large, a yellow warning is triggered in advance), reducing the false alarm rate by 28%. Dynamic batch normalization accelerates online model updates: when the piezoelectric sensor reports a poor ice melting effect (predicted thickness 5 mm but actual residual thickness 3 mm), real-time environmental parameter fluctuations trigger a dynamic batch normalization update, and the model parameters are fine-tuned with the self-test data, shortening the prediction calibration cycle from hours to minutes.
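How a thickness confidence interval might feed the warning threshold can be sketched in Python. The sketch treats the predicted thickness as a single Gaussian (one component of the density output), derives a central interval from it, and grades the warning on the interval's upper bound instead of the mean; grading on the upper bound is an assumption made for illustration, not the rule stated in this embodiment.

```python
from statistics import NormalDist


def thickness_interval(mean_mm, std_mm, coverage=0.8):
    """Central interval of a Gaussian thickness prediction (std_mm must be > 0)."""
    tail = (1.0 - coverage) / 2.0
    dist = NormalDist(mean_mm, std_mm)
    lower = max(0.0, dist.inv_cdf(tail))  # thickness cannot be negative
    upper = dist.inv_cdf(1.0 - tail)
    return lower, upper


def early_level(mean_mm, std_mm, coverage=0.8):
    """Warning level graded on the interval's upper bound (assumed rule)."""
    _, upper = thickness_interval(mean_mm, std_mm, coverage)
    if upper < 3.0:
        return 1
    if upper <= 5.0:
        return 2
    return 3


# A mean of 3 mm with a standard deviation of 0.78 mm gives an 80% interval
# of roughly 2 to 4 mm.
low, high = thickness_interval(3.0, 0.78)

# Same mean, different uncertainty: the wider prediction is escalated earlier.
confident = early_level(2.5, 0.1)
uncertain = early_level(2.5, 1.0)
```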
[0153] In this embodiment, multi-scale simulation data fills the gaps in real data, enabling the feature fusion network to learn the physical laws of the entire scenario. Hybrid loss and curriculum learning push the model to learn not only what happens but also why, moving from "data fitting" toward "causal reasoning." Dynamic batch normalization and self-test feedback link the parameters of the "training state" and the "running state," allowing the model to keep optimizing as the environment changes, reducing the overall prediction error over long-term operation and achieving the adaptive capability of "becoming more accurate with use." This improves the performance of a single model and also gives the entire tunnel icing disaster monitoring system an environmental adaptability similar to that of a biological system: every link from data acquisition to early warning control gains anti-interference, generalization, and self-calibration capability during training and optimization, ultimately producing a robust, highly interpretable, and dynamically responsive intelligent monitoring system.
[0154] The above specific embodiments are merely several optional embodiments of the present invention. Based on the technical solutions of the present invention and the relevant teachings of the above embodiments, those skilled in the art can make various alternative improvements and combinations to the above specific embodiments.
Claims
1. A method for identifying and predicting icing disasters at tunnel entrances based on multimodal data fusion technology, characterized in that it includes the following steps: a dual-spectrum thermal imager array deployed at the tunnel entrance acquires thermal infrared image sequences, millimeter-wave radar and lidar simultaneously collect spatial reflection data, and a distributed temperature and humidity sensor group and a three-dimensional ultrasonic anemometer collect environmental parameters; radiometric correction and temperature inversion are performed on the thermal infrared images to generate a temperature distribution matrix; Doppler filtering is performed on the millimeter-wave radar data to extract dynamic target reflection features; ground segmentation and outlier filtering are performed on the lidar point cloud data to construct a three-dimensional spatial mesh model; an improved ResNet-50 network extracts deep features from the thermal infrared images, a PointNet++ network processes the lidar point cloud features, an LSTM network extracts temporal features of the environmental parameters, and adaptive weighted fusion is performed in the feature space; the fused features are input into a cascaded dual-channel CNN-LSTM network, in which the first channel identifies the current ice layer distribution, the second channel detects the icing trend, and the disaster risk level is output; a prediction model is built based on a spatiotemporal graph convolutional network, and the current fused features are spatiotemporally correlated with the historical disaster database to predict the ice layer thickness and distribution range in the next 2 hours; based on the prediction results, a graded early warning signal is generated, and when the predicted ice thickness exceeds 5 mm, the active ice melting device is triggered and a road condition warning is sent to approaching vehicles through the V2X system;
the CNN-LSTM network specifically includes: the first channel adopts the DenseNet-121 architecture, takes a 256×256 thermal infrared image as input, and outputs a 512-dimensional feature vector; the second channel uses a bidirectional LSTM structure, takes the environmental parameter sequence of the past 30 minutes as input, and outputs 128-dimensional temporal features; a cross-channel attention module calculates the correlation weights between the image and temporal features using cosine similarity; the output layer employs a mixture density network that simultaneously predicts the probability of ice presence and the thickness confidence interval.
2. The method for identifying and predicting icing disasters at tunnel entrances based on multimodal data fusion technology according to claim 1, characterized in that the processing of thermal infrared image sequences includes: setting up dual-band synchronous acquisition to acquire each frame at a resolution of 256×256; using a non-uniformity correction algorithm to eliminate detector response differences, and applying an atmospheric transmittance model for temperature inversion; establishing a temperature gradient matrix; setting a dynamic threshold, and marking a location as a suspected icing area when its temperature gradient is less than the negative of the dynamic threshold; where the quantities involved are the temperature gradient at a given spatial coordinate and time, the temperature value at that spatial coordinate and time, the dynamic threshold at that time, the average temperature of the current frame, and the temperature standard deviation.
3. The method for identifying and predicting icing disasters at tunnel entrances based on multimodal data fusion technology according to claim 1, characterized in that the radar data processing includes: the millimeter-wave radar uses FMCW modulation and an MTI filter to eliminate static clutter and extract the Doppler frequency shift of moving targets; the lidar uses rotating scanning with a point cloud density of ≥200 points/m² and applies the RANSAC algorithm to segment the ground point cloud; a reflection intensity anomaly detection model is established, in which a point is determined to be an abnormal reflection point when the standardized value of its lidar reflection intensity is greater than 3; the millimeter-wave velocity spectrum and the lidar reflection characteristics are integrated to construct a dynamic risk field model; where the quantities involved are the lidar reflection intensity at a given spatial coordinate, the speed influence factor (0.6), the reflection intensity influence factor (0.4), the radial velocity of the target measured by the millimeter-wave radar, the standardized value of the lidar reflection intensity, the mean reflection intensity, and the standard deviation of the reflection intensity.
4. The method for identifying and predicting icing disasters at tunnel entrances based on multimodal data fusion technology according to claim 1, characterized in that the environmental parameter processing includes: establishing a spatiotemporal model of temperature and humidity, using Kriging interpolation to generate a two-dimensional distribution from the data of 16 distributed sensors; calculating the dew point temperature difference ΔTd, and activating the high-precision monitoring mode when ΔTd is below 1℃; wind speed vector analysis, in which a wind field matrix is constructed using four three-dimensional ultrasonic anemometers and the curl ∇×V and divergence ∇·V are calculated; introducing the phase transition index PI, and generating a phase transition warning when PI is greater than 0; where the quantities involved are the phase transition index at a given time, a reference temperature at that time, the measured relative humidity at that time, the ambient temperature at that time, and the magnitude of the three-dimensional wind speed vector.
5. The method for identifying and predicting icing disasters at tunnel entrances based on multimodal data fusion technology according to claim 1, characterized in that the feature fusion includes: establishing a three-level fusion architecture: pixel-level fusion of thermal imaging and lidar data, feature-level fusion of radar reflection features, and decision-level fusion of environmental parameters; designing an attention weighting mechanism: the weights of each modality's features are dynamically adjusted using deformable convolutional kernels, where the quantities involved are the attention weight matrix of the i-th modality, the sigmoid function, the 3D convolution operation, and the input feature map of the i-th modality; introducing an adversarial training strategy: a generator network is constructed to simulate the multimodal data distribution, and a discriminator network optimizes feature discriminability; implementing cross-modal contrastive learning: positive and negative sample pairs are established and the feature embedding space is optimized.
6. The method for identifying and predicting icing disasters at tunnel entrances based on multimodal data fusion technology according to claim 1, characterized in that the spatiotemporal graph convolutional network includes: constructing a dynamic graph structure: nodes represent grid cells of the monitoring area, and edge weights are determined jointly by spatial distance and wind speed correlation; designing hierarchical spatiotemporal blocks: each block contains a gated TCN temporal module and a Chebyshev graph convolutional spatial module; introducing a memory enhancement mechanism: a differentiable neural dictionary is added to the encoder to store features of typical disaster patterns; employing a multi-task output head: ice thickness, ice cover growth rate, and the coordinates of the maximum danger zone are predicted simultaneously.
7. The method for identifying and predicting icing disasters at tunnel entrances based on multimodal data fusion technology according to claim 1, characterized in that the dynamic early warning includes: a three-level response mechanism: Level 1 corresponds to a predicted thickness <3 mm, and only data is recorded; Level 2 corresponds to 3-5 mm, and the early warning display is activated; Level 3 corresponds to >5 mm, and the ice melting device is activated; a de-icing control strategy: based on the predicted ice layer distribution density, the power distribution of the carbon fiber heating film is dynamically adjusted according to a de-icing power control equation whose quantities are the maximum heating power of the carbon fiber heating film, the power growth coefficient, and the predicted ice thickness; a V2X communication protocol: the IEEE 802.11p standard is adopted, and the road condition index RSI∈[0,1] is broadcast every 500 ms; a self-test feedback loop: a piezoelectric sensor is set up to verify the actual de-icing effect and automatically calibrate the prediction model parameters.
8. The method for identifying and predicting icing disasters at tunnel entrances based on multimodal data fusion technology according to claim 1, characterized in that it also includes model training and optimization steps: constructing multi-scale training data: extreme weather scenario data is generated through numerical simulation, and StyleGAN is used to enhance sample diversity; designing a hybrid loss function that is a weighted combination of three loss terms, where α=0.4, β=0.3, and γ=0.3 are the weighting coefficients and the three terms are the cross-entropy loss, the contrastive loss, and the mean absolute error; implementing a curriculum learning strategy: training is conducted in stages, progressing from simple weather conditions to complex freezing scenarios; adopting dynamic batch normalization: the normalization layer parameters are adjusted according to real-time environmental parameters.
9. The method for identifying and predicting icing disasters at tunnel entrances based on multimodal data fusion technology according to claim 8, characterized in that the curriculum learning strategy specifically includes: Stage 1: using simulation data under sunny, normal-temperature conditions; Stage 2: introducing data on rain, fog, and low-temperature environments; Stage 3: injecting data from extreme freezing scenarios; and in the dynamic batch normalization step: the normalization layer parameters are dynamically adjusted based on temperature, humidity, and wind speed data collected in real time; an adjustment is triggered every 100 training batches or when environmental parameters fluctuate beyond a threshold.