A lightweight robust traffic flow prediction method based on double-path divergence fusion

By decomposing traffic flow observation data into smooth baseline patterns, learnable structured variations, and non-learnable perturbations, a dual-path complementary coding and differential perception dynamic fusion method is designed to solve the instability problem of existing traffic flow prediction methods under heterogeneous sensor damage, achieving lightweight robustness and efficient prediction.

CN122223970A · Pending · Publication Date: 2026-06-16 · UNIV OF ELECTRONICS SCI & TECH OF CHINA

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
UNIV OF ELECTRONICS SCI & TECH OF CHINA
Filing Date
2026-04-28
Publication Date
2026-06-16

AI Technical Summary

Technical Problem

Existing traffic flow prediction methods struggle to distinguish between learnable traffic dynamics and non-learnable sensor disturbances when faced with heterogeneous sensor failures, leading to unstable prediction results. Furthermore, existing robust models suffer from high computational overhead and inference latency, making them unsuitable for resource-constrained scenarios.

Method used

We adopt a robust prediction approach based on observation data composition modeling, decomposing the data into three parts: smooth baseline pattern, learnable structured variation, and non-learnable perturbation. We design structure-oriented and consistency-oriented paths and achieve adaptive fusion through difference-aware dynamic reliability gates to improve the robustness and efficiency of the model.

Benefits of technology

While ensuring prediction accuracy, it significantly improves the model's robustness to damage from heterogeneous sensors, maintains its lightweight characteristics, is suitable for resource-constrained scenarios, and has efficient traffic flow prediction capabilities.

✦ Generated by Eureka AI based on patent content.


Abstract

The application relates to a lightweight, robust traffic flow prediction method based on dual-path divergence fusion, and belongs to the field of intelligent transportation systems. The method treats observation data as consisting of three parts: a smooth baseline pattern, learnable structured variations, and non-learnable perturbations. By modeling the composition of the observation data, it represents reliable signals effectively and reduces the influence of interference components on the prediction result. Because a single coding path cannot ensure both prediction accuracy and robustness, a structure-oriented path and a consistency-oriented path are designed to ensure prediction accuracy in clean-data scenarios and prediction stability in damaged scenarios. Finally, a difference-aware dynamic reliability gate realizes adaptive fusion at the node level and the sample level, so that heterogeneous damage can be sensed and responded to without fault labels. While ensuring prediction accuracy, the method significantly improves the robustness of the model to heterogeneous sensor damage and remains lightweight, meeting practical deployment requirements.

Description

Technical Field

[0001] This invention belongs to the field of intelligent transportation systems, and particularly relates to a lightweight and robust traffic flow prediction method based on dual-path divergence fusion.

Background Technology

[0002] In the current era of rapid development of Intelligent Transportation Systems (ITS), accurate traffic flow prediction is a core foundation for signal control, route planning, and congestion management, and is crucial for improving transportation efficiency and travel experience. With the evolution of deep learning technology, deep spatiotemporal models such as spatiotemporal graph neural networks and Transformer-like architectures have demonstrated excellent predictive performance on public benchmark datasets such as PeMS, promoting the large-scale application of traffic flow prediction technology.

[0003] However, sensor networks in real-world deployments often face complex and non-ideal conditions: sensor data is susceptible to Gaussian noise, random missing values, and heterogeneous damage such as long-term node failures. These unlabeled sensor failures can lead to significant prediction biases in existing models, creating a generalization performance gap during deployment and making it difficult to adapt to the needs of real-world traffic scenarios.

[0004] Current traffic flow prediction methods are mainly trained based on the Empirical Risk Minimization (ERM) criterion. This paradigm treats all biases in the observed data as learnable signals, which easily leads to overfitting of disturbances caused by sensor malfunctions. This results in models being overly sensitive to unreliable inputs, leading to unstable prediction results. To alleviate this problem, related research focuses on improving spatiotemporal prediction backbone models, non-stationarity and mechanism adaptation, learning methods with defective observations, and large language model-assisted pipelines. Existing technical approaches can be summarized into the following three categories, and each approach has its corresponding technical limitations:

[0005] Data repair-oriented strategies: These methods repair damaged data through preprocessing so that the model receives cleaner input. They focus on mask reconstruction and global pattern extraction to complete and correct defective data. For example, some studies fill missing values using masked autoencoding or Transformers constrained by decreasing-mask reconstruction, or use global temporal pattern extraction to mitigate the impact of data corruption. However, these methods are designed only for specific types of corruption (such as random missing values) and adapt poorly to heterogeneous corruption (noise + missing values + node failures). Furthermore, the assumptions introduced during repair may conflict with real traffic dynamics and introduce new biases. They also fail to distinguish between learnable signals and disturbances at the model learning level.

[0006] Robust model architecture design: This approach improves resistance to interference by optimizing the model architecture. On one hand, it upgrades the spatiotemporal prediction backbone, for example by using Graph WaveNet and spatiotemporal graph neural networks that refine key nodes to capture complex spatial dependencies, or by using Transformer-like models based on spatiotemporal embeddings and spatiotemporal graph neural networks built on spectral operators to model long-range temporal correlations. On the other hand, it designs robust architectures to handle distribution shifts, such as multi-scale spatiotemporal graph learning models focused on overload scenarios and graph convolutional networks that adapt the graph at test time. However, these models typically incur high computational overhead and inference latency, which makes it difficult to meet the deployment requirements of resource-constrained scenarios. Furthermore, they do not clearly distinguish between learnable traffic dynamics and non-learnable sensor disturbances, so they still risk overfitting to disturbances and do not fundamentally solve the prediction instability caused by heterogeneous damage.

[0007] Self-supervised learning enhancement: This approach optimizes learning paradigms. One type utilizes self-supervised tasks to enable models to learn more robust representations, such as improving the model's tolerance to data bias through bias-aware self-supervised learning. Another type focuses on the non-stationarity and dynamic mechanisms of traffic flow, designing trend-seasonal decomposition networks and heterogeneity-aware metaparameter learning models to achieve adaptation to different traffic mechanisms. Research also explores large language model-assisted test-time selection and inference pipelines to improve model adaptability. While these methods can improve the model's tolerance to partial damage to some extent, they lack specific design for heterogeneous damage and do not establish reliable fusion mechanisms to dynamically adapt to different damage scenarios. Furthermore, additional self-supervised tasks or complex adaptive modules increase the complexity of model training and inference, reducing overall efficiency. Moreover, most methods fail to decouple learnable biases from non-learnable sensor perturbations under the premise of traffic-only supervision.

[0008] Overall, while existing research has made progress in the accuracy of traffic flow prediction and in robustness for some scenarios, it has not yet formed a unified framework to distinguish between learnable traffic dynamics and non-learnable sensor disturbances. It also lacks a sample-level and node-level adaptive mechanism to switch dynamically between structural features and spatiotemporal consensus features. Furthermore, it has not formed a unified protocol that covers noise, missing data, and node failures in robustness assessment, making it difficult to handle heterogeneous, unlabeled sensor damage in real-world scenarios.

Summary of the Invention

[0009] This invention proposes a lightweight, reliable traffic flow forecasting method (LUNA-T) for heterogeneous, unlabeled sensor disturbances, which mainly includes the following three core technologies:

[0010] Robust prediction approach based on observation data composition modeling: Traditional methods fail to clearly distinguish between effective signals and interference components in traffic flow observations, making it difficult for models to focus on core prediction information. This invention proposes to decompose observation data into three parts: smooth baseline patterns, learnable structured variations, and non-learnable perturbations. By modeling the composition of observation data, it achieves effective characterization of reliable signals and reduces the impact of interference components on prediction results.

[0011] Dual-path complementary coding architecture: To address the issue that a single coding path cannot simultaneously ensure both prediction accuracy and robustness, a structure-oriented path and a consistency-oriented path are designed: The structure-oriented path captures stable, low-variance traffic patterns through a residual MLP architecture, ensuring prediction accuracy in clean data scenarios; the consistency-oriented path utilizes local smoothing and global spatiotemporal consensus mechanisms to effectively recover unreliable signals, improving prediction stability in damaged scenarios.

[0012] Difference-Aware Dynamic Reliability Gating: Existing fusion mechanisms lack adaptive judgment of input reliability and cannot dynamically adjust the contribution weights of the two paths. This invention proposes a difference-aware dynamic reliability gate (DiffGate), which infers input reliability by the degree of divergence represented by the two paths, achieving adaptive fusion at the node and sample levels. It can achieve sensitive perception and accurate response to heterogeneous damage without fault labeling.

[0013] Thanks to the above three designs, this invention significantly improves the robustness of the model to damage from heterogeneous sensors while ensuring prediction accuracy, and maintains lightweight characteristics to meet practical deployment needs.

[0014] To solve the above-mentioned technical problems, the specific technical solution of the present invention is as follows:

[0015] A lightweight and robust traffic flow prediction method based on dual-path divergence fusion includes the following steps:

[0016] Step 1: Acquire historical traffic observation data and construct a shared latent representation, wherein the traffic flow observation data consists of a smoothed baseline pattern, learnable structured variations, and non-learnable perturbations;

[0017] Step 2: Extract structural and consistency features based on complementary coding using structure-oriented and consistency-oriented dual paths; the structure-oriented path extracts stable patterns and low-variance structural information from historical traffic observations, while the consistency-oriented path extracts consistency context information across nodes and across time when local observations are unreliable.

[0018] Step 3: Generate dynamic reliability coefficients based on the differences between the two path features, and complete the adaptive fusion of the two path features based on the dynamic reliability coefficients;

[0019] Step 4: Output the future traffic flow prediction results based on the fused features and complete the model training.

[0020] Furthermore, step 1 is detailed as follows:

[0021] Assume the traffic sensor network includes $N$ sensor nodes. Taking the current prediction time $t$ as the reference time, obtain a historical observation window of length $T$, and use $X \in \mathbb{R}^{T \times N \times C}$ to denote the historical traffic observation tensor from time step $t-T+1$ to time step $t$, where $C$ is the input feature dimension of each node at a single time step;

[0022] The historical traffic observation tensor $X$ is flattened along the time and feature dimensions, and a convolution followed by a nonlinear activation function then performs a lightweight shared mapping to generate the shared latent representation $H_0$.

[0023] Furthermore, the extraction of structural and consistency features based on the dual-path complementary encoding of structure-oriented and consistency-oriented methods is as follows:

[0024] Let the hidden representation of the $l$-th layer be $H^{(l)}$. The update process of the structure-oriented path is represented as follows:

[0025] $$H^{(l)} = H^{(l-1)} + \mathrm{MLP}_l\big(\mathrm{Norm}(H^{(l-1)})\big), \quad l = 1, \dots, L$$

[0026] $$Z_s = H^{(L)}$$

[0027] Here, the hidden representation of the initial layer is the shared latent representation, $H^{(0)} = H_0$; $\mathrm{Norm}(\cdot)$ indicates a normalization operation; $\mathrm{MLP}_l(\cdot)$ indicates the location-independent multilayer perceptron mapping of the $l$-th layer; $L$ indicates the number of layers in the structure-oriented path; $Z_s$ represents the structural features output by the structure-oriented path;

[0028] The consistency-oriented path first uses a lightweight hybrid operator to perform local smoothing and cross-node feature mixing on the shared latent representation to obtain a hybrid feature representation; then, it uses a global attention mapping to calculate a global consensus feature representation based on the shared latent representation; finally, it concatenates the hybrid feature representation with the global consensus feature representation and obtains the consistency-oriented path output, i.e., the consistency feature, through lightweight projection.

[0029] Furthermore, step 3 is detailed as follows:

[0030] First, the difference tensor between the structural features and the consistency features is calculated; then, the consistency features and the difference tensor are concatenated along the feature dimension, and a convolution followed by the Sigmoid activation function generates the dynamic reliability coefficients; based on the dynamic reliability coefficients, the structural features and the consistency features are fused element-wise to obtain the fused feature representation.

[0031] Furthermore, step 4 is detailed as follows:

[0032] The fused feature representation is input into the lightweight prediction head to obtain the traffic flow prediction results for the future $F$ time steps, and the model is then trained end-to-end using the masked mean absolute error loss function.

[0033] Furthermore, node identification information and time identification information are introduced in the process of generating the shared potential representation.

[0034] Furthermore, the specific expression for the masked mean absolute error loss function is as follows:

[0035] $$\mathcal{L}_{\mathrm{MAE}} = \frac{1}{\sum_{f=1}^{F}\sum_{n=1}^{N} M_{f,n}} \sum_{f=1}^{F}\sum_{n=1}^{N} M_{f,n}\,\big|\hat{y}_{f,n} - y_{f,n}\big|$$

[0036] where $\mathcal{L}_{\mathrm{MAE}}$ represents the masked mean absolute error loss; $\hat{y}_{f,n}$ indicates the predicted value of the $n$-th node at the $f$-th prediction time step; $y_{f,n}$ represents the corresponding true value; $F$ represents the prediction horizon and $N$ the number of sensor nodes; $M$ is the prediction mask matrix: when the ground-truth label has a valid observation at the $f$-th time step and the $n$-th node, $M_{f,n} = 1$; when the corresponding position is a null value or an invalid observation, $M_{f,n} = 0$.

[0037] Compared with existing technologies, this invention focuses on the collaborative design of three key technical solutions: observation data composition modeling, dual-path complementary coding, and differential perception dynamic fusion. This approach effectively mitigates the impact of heterogeneous sensor disturbances on traffic flow prediction without the need for fault labeling, while also ensuring prediction accuracy, robustness, and deployment efficiency. Specific advantages are as follows:

[0038] 1. Balancing prediction accuracy and robustness. Existing traffic flow prediction methods typically treat all deviations in the observed data as a learnable signal for fitting, which can easily lead to oversensitivity to unreliable inputs in scenarios with noise, missing values, and node failures, resulting in unstable prediction results.

[0039] This invention views traffic flow observations as consisting of a smooth baseline pattern, learnable structured variations, and non-learnable perturbations, and designs two complementary encoding paths based on this: a structure-oriented path and a consistency-oriented path. The structure-oriented path extracts stable, low-variance structural features, while the consistency-oriented path provides cross-node consensus compensation when local observations are unreliable. Furthermore, a difference-aware dynamic reliability gate adaptively adjusts the fusion weights based on the degree of divergence in the dual-path outputs, enabling the model to maintain high prediction accuracy in clean data scenarios and suppress performance degradation in corrupted data scenarios.

[0040] According to experimental results, in severely damaged scenarios, the present invention can reduce the error degradation by more than 35% on average compared with existing advanced methods, and shows a stability improvement of up to 4 times in high-intensity noise scenarios, demonstrating good accuracy and robustness synergy.

[0041] 2. It possesses lightweight and high efficiency characteristics. Existing robust traffic flow prediction models typically rely on complex graph propagation structures, deeply stacked modules, or heavy attention mechanisms, resulting in large parameter sizes and high computational overhead, which are not conducive to deployment in resource-constrained scenarios.

[0042] This invention employs a lightweight shared context projection, shallow location-independent residual MLP, lightweight hybrid operators, and a lightweight prediction head to form the overall network, reducing the dependence on large-scale graph propagation and complex spatiotemporal modeling structures. At the same time, it achieves adaptive fusion with minimal additional overhead through a difference-aware dynamic reliability gate, thus maintaining robustness while achieving high operating efficiency.

[0043] Based on experimental results, on the PeMS04 dataset, the model provided by this invention has only 0.24M parameters, a single forward propagation computation of 136.08MFLOPs, a training memory footprint of 2.05GB, an inference memory footprint as low as 0.03GB, and an end-to-end runtime of 0.26 hours. Compared with related comparative methods, the computational cost can be reduced by 9.5 to 35.8 times, and the running speed can be improved by 2.7 to 14.1 times, making it suitable for traffic prediction scenarios with limited resources or high real-time requirements.

[0044] 3. Capable of adaptive response to heterogeneous and unlabeled disturbances. Sensor damage in real-world traffic scenarios is typically heterogeneous and unlabeled, with different types and intensities of noise, missing data, or faults occurring at different nodes and time periods. Existing methods often rely on data repair assumptions specific to certain damage types or lack a unified adaptive fusion mechanism, thus limiting their adaptability to complex real-world scenarios.

[0045] This invention eliminates the need for fault category labels, fault location labels, or fault intensity labels. Instead, it utilizes the degree of divergence between the outputs of the structure-oriented path and the consistency-oriented path as a surrogate quantity for input reliability, generating dynamic reliability coefficients at the node and sample levels. This enables adaptive fusion responses to disturbances of different types and intensities. This design avoids dependence on a single failure mode and improves the model's generalization ability and stability in heterogeneous disturbance scenarios.

[0046] According to the experimental results, this invention exhibits low performance degradation under scenarios such as high-intensity random missing values and node failures. For example, in the high-intensity random missing value scenario of the PeMS07 dataset, the MAE degradation of this invention is 37.80, lower than PDFormer's 59.11; in the high-intensity noise scenario of the PeMS03 dataset, the degradation of this invention is 4.27, while GWNet's degradation reaches 20.77, indicating that this invention adapts well to heterogeneous perturbations.

[0047] 4. Strong deployment applicability. The invention has a simple overall structure and relies neither on additional data pre-repair processes nor on fault labeling supervision, so the training and inference processes are relatively direct. Because it builds the prediction framework from lightweight modules and achieves dual-path fusion through dynamic reliability gates, it is easy to deploy on electronic devices with data processing capabilities, such as servers, edge computing devices, and traffic control terminals. It can be used in scenarios such as real-time traffic flow prediction, road operation status assessment, and intelligent traffic control, demonstrating significant engineering application value.

Description of the Drawings

[0048] Figure 1 is a model architecture diagram of the present invention.

Detailed Implementation

[0049] To better understand the purpose, structure, and function of this invention, the invention will be described in further detail below with reference to the accompanying drawings.

[0050] This invention proposes a lightweight, robust traffic flow prediction method based on dual-path fusion, applicable to road monitoring networks composed of multiple traffic sensor nodes. The method takes historical traffic observation sequences as input and, without relying on fault category labels, fault location labels, or fault intensity labels, extracts complementary features through a structure-oriented path and a consistency-oriented path. It then uses a difference-aware dynamic reliability gate to adaptively fuse the dual-path features, and finally outputs the future traffic flow prediction result. The model architecture is shown in Figure 1.

[0051] In this invention, the “triple decomposition” of observation data refers to a modeling perspective that considers traffic flow observations as being composed of smooth baseline patterns, learnable structured variations, and non-learnable perturbations, in order to guide the design of subsequent feature extraction and fusion mechanisms, rather than performing explicit pre-decomposition processing on the original input data.

[0052] Specifically, the method includes the following steps:

[0053] Step 1: Obtain historical traffic observation data and construct a shared potential representation.

[0054] Assume the traffic sensor network includes $N$ sensor nodes. Taking the current prediction time $t$ as the reference time, a historical observation window of length $T$ is obtained, denoted as:

[0055] $$X = [x_{t-T+1}, \dots, x_t] \in \mathbb{R}^{T \times N \times C}$$

[0056] where $T$ indicates the number of historical observation time steps; $N$ indicates the number of sensor nodes; $C$ represents the dimension of the input features of each node at a single time step; and $X$ indicates the historical traffic observation tensor from time step $t-T+1$ to time step $t$.

[0057] The prediction objective of this invention is to output, based on the historical observation window, the traffic flow prediction results for the future $F$ time steps, denoted as:

[0058] $$\hat{Y} = [\hat{y}_1, \dots, \hat{y}_F] \in \mathbb{R}^{F \times N}$$

[0059] where $F$ indicates the prediction horizon, and $\hat{y}_f$ indicates the predicted traffic flow of the $N$ nodes at the $f$-th future time step, denoted as:

[0060] $$\hat{y}_f = [\hat{y}_{f,1}, \dots, \hat{y}_{f,N}] \in \mathbb{R}^{N}$$

[0061] To characterize the different components in historical observations, this invention models the observation $x_{\tau,n}$ of any node $n$ at any time step $\tau$ in the historical traffic observation tensor as follows:

[0062] $$x_{\tau,n} = b_{\tau,n} + s_{\tau,n} + \epsilon_{\tau,n}$$

[0063] where $b_{\tau,n}$ represents the smooth baseline pattern, used to characterize low-frequency, stable trends in traffic flow; $s_{\tau,n}$ represents the learnable structured variation, used to characterize dynamic change components that are relevant to future traffic conditions and can be learned by the model; $\epsilon_{\tau,n}$ represents the unlearnable perturbation, used to characterize irregular and unreliable perturbation components caused by Gaussian noise, random missing values, long-term node failures, etc. The above expression describes the composition of the observed data and serves as the modeling basis for the subsequent dual-path complementary coding and dynamic fusion.
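The three-part view of an observation, and the damage types named above, can be illustrated on synthetic data. The following is a minimal sketch only: the function names, the sinusoidal baseline and variation, and the convention of writing missing or failed readings as 0 are all assumptions made for illustration and are not taken from the patent.

```python
import numpy as np

def synth_observation(T, N, rng, noise_std=0.0):
    """Build x = b + s + eps for T time steps and N nodes (every array is (T, N))."""
    steps = np.arange(T)[:, None]
    # Smooth baseline: a slow daily cycle (288 five-minute steps per day), shared by all nodes.
    b = 100.0 + 20.0 * np.sin(2 * np.pi * steps / 288) * np.ones((1, N))
    # Structured variation: a faster cycle with a node-specific phase (illustrative choice).
    s = 10.0 * np.sin(2 * np.pi * steps / 12 + np.arange(N)[None, :])
    # Unlearnable perturbation: zero-mean Gaussian noise.
    eps = rng.normal(scale=noise_std, size=(T, N))
    return b + s + eps, b, s, eps

def corrupt(x, rng, missing_rate=0.0, failed_nodes=()):
    """Return a damaged copy of x: random missing entries and whole-node failures set to 0."""
    out = x.copy()
    out[rng.random(x.shape) < missing_rate] = 0.0          # random missing values
    out[:, np.asarray(failed_nodes, dtype=int)] = 0.0      # long-term node failures
    return out

rng = np.random.default_rng(0)
x, b, s, eps = synth_observation(T=48, N=4, rng=rng, noise_std=1.0)
damaged = corrupt(x, rng, missing_rate=0.3, failed_nodes=[2])
print(damaged.shape)
```

Nothing in this sketch is fed to the model as a label: the method described here never sees which entries were damaged, which is why reliability has to be inferred from the features themselves.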

[0064] Subsequently, the input tensor $X$ is flattened along the time dimension and the feature dimension, and a convolution followed by a nonlinear activation function performs a lightweight shared mapping to generate the shared latent representation:

[0065] $$H_0 = \phi\big(\mathrm{Conv}(\mathrm{Flatten}(X))\big) \in \mathbb{R}^{N \times d}$$

[0066] where $\mathrm{Flatten}(\cdot)$ indicates a flattening operation, used to flatten the $T$ historical time steps and $C$-dimensional input features of each node into a feature vector of length $T \cdot C$; $\mathrm{Conv}(\cdot)$ expresses the convolution mapping; $\phi(\cdot)$ represents a nonlinear activation function; $H_0$ indicates the shared latent representation; and $d$ represents the dimension of the latent features.
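A minimal NumPy sketch of this shared mapping is given below. It assumes the convolution acts pointwise on each node (so it reduces to one linear map shared by all nodes) and that the activation is ReLU; the names `shared_latent`, `weight`, and `bias` are hypothetical, and the source does not state the kernel size or the activation.

```python
import numpy as np

def shared_latent(x, weight, bias):
    """Flatten each node's history and apply one shared linear map plus ReLU.

    x:      (T, N, C) historical observations
    weight: (T * C, d) projection shared by all nodes (hypothetical parameter)
    bias:   (d,)
    returns (N, d) shared latent representation
    """
    T, N, C = x.shape
    flat = x.transpose(1, 0, 2).reshape(N, T * C)   # one length-T*C vector per node
    return np.maximum(flat @ weight + bias, 0.0)    # ReLU is an assumed activation

rng = np.random.default_rng(0)
T, N, C, d = 12, 5, 3, 8
x = rng.normal(size=(T, N, C))
w = rng.normal(size=(T * C, d)) * 0.1
b = np.zeros(d)
h0 = shared_latent(x, w, b)
print(h0.shape)  # (5, 8)
```

Because the same weights are applied to every node independently, the parameter count does not grow with the number of sensors, which fits the lightweight design goal stated in the text.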

[0067] Preferably, node identification information and time identification information can also be introduced when generating the shared latent representation, to enhance the distinguishability of different nodes and different time positions.

[0068] Step 2: Extract structural and consistency features based on dual-path complementary coding.

[0069] To balance prediction accuracy and robustness, this invention sets up two complementary coding paths on top of the shared latent representation: a structure-oriented path and a consistency-oriented path.

[0070] The structure-oriented path is used to extract stable patterns and low-variance structural information from historical traffic observations. Let the hidden representation of the $l$-th layer be $H^{(l)}$. The update process of the structure-oriented path is then expressed as:

[0071] $$H^{(l)} = H^{(l-1)} + \mathrm{MLP}_l\big(\mathrm{Norm}(H^{(l-1)})\big), \quad l = 1, \dots, L; \qquad Z_s = H^{(L)}$$

[0072] where the hidden representation of the initial layer is the shared latent representation, $H^{(0)} = H_0$; $\mathrm{Norm}(\cdot)$ indicates a normalization operation; $\mathrm{MLP}_l(\cdot)$ indicates the location-independent multilayer perceptron mapping of the $l$-th layer; $L$ indicates the number of layers in the structure-oriented path; and $Z_s$ represents the structural feature representation output by the structure-oriented path. The structure-oriented path preserves the original stable pattern information through residual connections, thereby enhancing its ability to characterize the smooth baseline pattern and the stable components of the learnable structured variation.
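The residual update of the structure path can be sketched as follows. This is an illustration under stated assumptions: layer normalization over the feature axis, a two-layer MLP with ReLU, and normalization applied before the MLP are all choices made here, and `layer_norm` and `structure_path` are hypothetical names.

```python
import numpy as np

def layer_norm(h, eps=1e-5):
    """Normalize each node's feature vector to zero mean and unit variance."""
    mean = h.mean(axis=-1, keepdims=True)
    var = h.var(axis=-1, keepdims=True)
    return (h - mean) / np.sqrt(var + eps)

def structure_path(h0, layers):
    """Apply residual MLP layers; the same weights are used for every node.

    h0:     (N, d) shared latent representation
    layers: list of (w1, w2) with shapes (d, hidden) and (hidden, d)
    returns (N, d) structural features
    """
    h = h0
    for w1, w2 in layers:
        hidden = np.maximum(layer_norm(h) @ w1, 0.0)  # position-independent MLP on normalized input
        h = h + hidden @ w2                           # residual connection keeps the previous state
    return h

rng = np.random.default_rng(0)
N, d, hidden_dim = 5, 8, 16
h0 = rng.normal(size=(N, d))
layers = [(rng.normal(size=(d, hidden_dim)) * 0.1,
           rng.normal(size=(hidden_dim, d)) * 0.1) for _ in range(2)]
z_s = structure_path(h0, layers)
print(z_s.shape)  # (5, 8)
```

With the second weight matrix of every layer set to zero the path returns its input unchanged, which is the sense in which the residual connection preserves stable pattern information.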

[0073] The consistency-oriented path is used to extract consistent contextual information across nodes and time when local observations are unreliable. First, a lightweight hybrid operator performs local smoothing and cross-node feature mixing on the shared latent representation $H_0$, resulting in a hybrid feature representation:

[0074] $$Z_{\mathrm{mix}} = \mathrm{Mix}(H_0)$$

[0075] where $\mathrm{Mix}(\cdot)$ represents the lightweight hybrid operator, and $Z_{\mathrm{mix}}$ represents the blended features after local smoothing.

[0076] Then, a global spatiotemporal consensus representation is computed based on the shared latent representation $H_0$:

[0077] $$Z_{\mathrm{glob}} = \mathrm{Attn}(H_0)$$

[0078] where $\mathrm{Attn}(\cdot)$ represents the global attention mapping, and $Z_{\mathrm{glob}}$ represents the global consensus feature.

[0079] Furthermore, the hybrid feature representation is concatenated with the global consensus feature representation, and the output of the consistency-oriented path is obtained through a lightweight projection:

[0080] $$Z_c = \mathrm{Proj}\big(\mathrm{Concat}(Z_{\mathrm{mix}}, Z_{\mathrm{glob}})\big)$$

[0081] where $\mathrm{Concat}(\cdot)$ indicates concatenation along the feature dimension; $\mathrm{Proj}(\cdot)$ represents the lightweight projection function; and $Z_c$ represents the consistency feature representation output by the consistency-oriented path. When the input contains noise, missing values, or node failures, the consistency-oriented path can compensate for unreliable local observations using cross-node consensus information.
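The three stages of the consistency path (mixing, global attention, projection) can be sketched as below. The source does not specify the hybrid operator or the attention form, so this sketch substitutes simple stand-ins: a row-stochastic node-mixing matrix for the hybrid operator and single-head dot-product self-attention over nodes for the global attention map. All names and parameters are hypothetical.

```python
import numpy as np

def softmax(a, axis=-1):
    a = a - a.max(axis=axis, keepdims=True)   # subtract the max for numerical stability
    e = np.exp(a)
    return e / e.sum(axis=axis, keepdims=True)

def consistency_path(h0, mix, w_proj):
    """h0: (N, d); mix: (N, N) row-stochastic mixing matrix; w_proj: (2d, d)."""
    z_mix = mix @ h0                                    # smoothing / cross-node feature mixing
    attn = softmax(h0 @ h0.T / np.sqrt(h0.shape[1]))    # (N, N) global attention weights
    z_glob = attn @ h0                                  # global consensus features
    return np.concatenate([z_mix, z_glob], axis=-1) @ w_proj  # lightweight projection

rng = np.random.default_rng(0)
N, d = 5, 8
h0 = rng.normal(size=(N, d))
mix = np.full((N, N), 1.0 / N)                 # uniform averaging over nodes (assumed)
w_proj = rng.normal(size=(2 * d, d)) * 0.1
z_c = consistency_path(h0, mix, w_proj)
print(z_c.shape)  # (5, 8)
```

Each output row is built from weighted averages over all nodes, so a single corrupted node has a diluted effect on the result; this is the compensation role the text assigns to this path.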

[0082] Step 3: Generate dynamic reliability coefficients based on dual-path differences and complete adaptive fusion.

[0083] To achieve adaptive fusion of dual-path features across different nodes and samples, this invention constructs a difference-aware dynamic reliability gate to model the difference between the output of the structure-guided path and the output of the consistency-guided path.

[0084] First, compute the difference tensor between the two path feature representations:

[0085] $$D = Z_s - Z_c$$

[0086] where $D$ is the difference tensor between the structural feature representation $Z_s$ and the consistency feature representation $Z_c$, used to reflect the degree of divergence between the two paths under the current input conditions.

[0087] Subsequently, the structure-oriented path output $Z_s$ is concatenated with the difference tensor $D$ along the feature dimension, and a convolution followed by the Sigmoid activation function generates the dynamic reliability coefficients:

[0088] $$g = \sigma\big(\mathrm{Conv}(\mathrm{Concat}(Z_s, D))\big)$$

[0089] where $g$ represents the dynamic reliability coefficient tensor; $\sigma(\cdot)$ represents the Sigmoid activation function; and each element of $g$ takes a value in the range $(0, 1)$.

[0090] The dynamic reliability coefficient characterizes the reliability of the structural information of the current node in the current feature dimension. When the divergence between the two paths is small, the outputs of the structure-oriented path and the consistency-oriented path are relatively consistent, so the fusion weight of the structure-oriented path should be increased; when the divergence is large, the current local observation may contain strong unreliable disturbances, so the fusion weight of the structure-oriented path should be reduced and the compensating effect of the consistency-oriented path should be enhanced.

[0091] Based on the dynamic reliability coefficient, the outputs of the two paths are weighted and fused element by element to obtain the fused feature representation:

[0092] $$Z = g \odot Z_s + (1 - g) \odot Z_c$$

[0093] where $Z$ represents the fused feature representation; $g$ is the dynamic reliability coefficient tensor; $Z_s$ and $Z_c$ are the outputs of the structure-oriented path and the consistency-oriented path; and $\odot$ represents element-wise multiplication.
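The gate and the element-wise fusion can be sketched together. The sketch assumes a pointwise convolution (one linear map per node) inside the gate, a signed difference between the two path outputs, and a convex combination in which the gate weights the structure path; it follows the description's choice of concatenating the structure-path output with the difference. `diff_gate`, `w_gate`, and `b_gate` are hypothetical names.

```python
import numpy as np

def diff_gate(z_s, z_c, w_gate, b_gate):
    """Fuse two (N, d) feature maps with a per-node, per-feature reliability gate.

    w_gate: (2d, d), b_gate: (d,)  (hypothetical gate parameters)
    returns (fused, g), both of shape (N, d)
    """
    diff = z_s - z_c                                   # divergence between the two paths
    gate_in = np.concatenate([z_s, diff], axis=-1)     # (N, 2d)
    g = 1.0 / (1.0 + np.exp(-(gate_in @ w_gate + b_gate)))  # sigmoid, values in (0, 1)
    fused = g * z_s + (1.0 - g) * z_c                  # element-wise convex combination
    return fused, g

rng = np.random.default_rng(0)
N, d = 5, 8
z_s = rng.normal(size=(N, d))
z_c = rng.normal(size=(N, d))
w_gate = rng.normal(size=(2 * d, d)) * 0.1
b_gate = np.zeros(d)
fused, g = diff_gate(z_s, z_c, w_gate, b_gate)
print(fused.shape, float(g.min()) > 0.0, float(g.max()) < 1.0)
```

Since the gate has one value per node and per feature, and is recomputed for every input sample, the fusion adapts at both the node level and the sample level without any fault label. Whether a large divergence actually lowers the gate depends on the learned weights; the sketch only fixes the functional form.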

[0094] Step 4: Output the future traffic flow prediction results and complete the model training.

[0095] The fused feature representation $Z$ is input into a lightweight prediction head to obtain the traffic flow prediction results for the future $F$ time steps:

[0096] $$\hat{Y} = \mathrm{Head}(Z)$$

[0097] where $\mathrm{Head}(\cdot)$ is the prediction head mapping function, which can preferably be implemented by a convolution or a linear mapping that projects from the latent feature space to the future prediction space; the output is $\hat{Y} \in \mathbb{R}^{F \times N}$.

[0098] This invention can use a masked mean absolute error loss function to perform end-to-end training of the model. Let the prediction mask matrix be... When the real label is in the 1st The time step, the first When there are valid observations at each node location When the corresponding position is a null value or an invalid observation, The training objective can then be expressed as:

[0099]

[0100] Here, the loss is the masked mean absolute error; it compares the predicted value at each prediction time step and each node with the corresponding actual value.
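The masked loss in [0098] to [0100] can be illustrated with a short NumPy sketch. It assumes the loss is normalized by the number of valid entries, which is a common convention but is not confirmed by the text above; the function name `masked_mae` and the sample values are hypothetical.

```python
import numpy as np


def masked_mae(pred, target, mask):
    """Mean absolute error over positions where mask is True.

    pred, target, and mask all have shape (F, N): prediction time steps
    by nodes. Invalid targets may be NaN; they never reach the sum.
    """
    valid = int(mask.sum())
    if valid == 0:
        return 0.0  # no valid labels, so there is nothing to penalize
    # np.where discards the error (including NaN) wherever the mask is False.
    abs_err = np.where(mask, np.abs(pred - target), 0.0)
    # Assumption: normalize by the count of valid entries.
    return float(abs_err.sum() / valid)


# Two prediction steps, two nodes; one label is missing (NaN).
target = np.array([[10.0, np.nan],
                   [30.0, 40.0]])
mask = ~np.isnan(target)  # True for valid observations, False otherwise
pred = np.array([[12.0, 99.0],
                 [27.0, 40.0]])

loss = masked_mae(pred, target, mask)
print(loss)  # (2 + 3 + 0) / 3
```

The prediction of 99.0 at the missing position has no effect on the loss, which is the behavior the mask is meant to provide.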

[0101] Through the above steps, this invention follows the technical chain "historical observation composition modeling - dual-path complementary coding - difference-aware dynamic fusion - lightweight prediction output", and achieves robust traffic flow prediction under heterogeneous, unlabeled sensor disturbances without explicit data pre-decomposition or additional fault-label supervision.

[0102] To verify the effectiveness of this invention, the following experiment was conducted:

[0103] Datasets: All datasets are from the Caltrans Performance Measurement System (PeMS) and contain traffic flow data aggregated at 5-minute intervals from sensor networks of different sizes. The dataset parameters are shown in Table 1.

[0104] Table 1. Dataset Parameters

[0105]
Dataset   Number of nodes   Time steps   Period
PeMS03    358               26208        2018/09/01–2018/11/30
PeMS04    307               16992        2018/01/01–2018/02/28
PeMS07    883               28224        2017/05/01–2017/08/31
PeMS08    170               17856        2016/07/01–2016/08/31

[0106] Evaluation Metrics: The masked mean absolute error (Masked MAE) was used as the core evaluation metric to measure the deviation between the predicted and actual values. In addition, the composite robustness metric Expected MAE was used to evaluate performance across both the clean data scenario and severe perturbation scenarios. It is calculated as follows:

[0107]

[0108] Here, the first term is the mean absolute error in the clean data scenario, and the three degradation terms represent the performance degradation under high-intensity Gaussian noise, random missing data, and node failure, respectively. The metric simulates application scenarios in which the system operates under non-ideal conditions 20% of the time.

[0109] Comparison methods: The comparison methods include the classic temporal model Long Short-Term Memory Network (LSTM); the spatiotemporal graph neural network models Graph WaveNet (GWNet), Spatiotemporal Key Node Graph Neural Network (STPGNN), and Spatiotemporal Graph Filtering Network (STSGNN); and the Transformer-like models Propagation Delay Aware Dynamic Long-Range Transformer (PDFormer), Spatiotemporal Adaptive Embedded Transformer (STAEformer), Heterogeneity Aware Metaparameter Learning Model (HimNet), Spatiotemporal Aware Trend-Season Decomposition Network (STDN), and Spatiotemporal Attribute Pre-trained Model (STDPLM).

[0110] Experimental Results: The proposed method, LUNA-T, achieves excellent Expected MAE on all four datasets, as shown in Tables 2 and 3. Across corruption scenarios of different types and severities, LUNA-T maintains low performance degradation. For example, in the high-intensity random missing data scenario on PeMS07, LUNA-T's MAE degradation is 37.80, significantly lower than PDFormer's 59.11. In the high-intensity noise scenario on PeMS03, LUNA-T's degradation is only 4.27, while GWNet's is as high as 20.77, which validates its robustness advantage.

[0111] Table 2. Expected MAE of the model on four datasets.

[0112]

[0113] Table 3. Detailed robustness evaluation results of the model on four datasets.

[0114]

[0115] It is understood that the present invention has been described through some embodiments, and those skilled in the art will recognize that various changes or equivalent substitutions can be made to these features and embodiments without departing from the spirit and scope of the invention. Furthermore, under the teachings of the present invention, these features and embodiments can be modified to adapt to specific situations and materials without departing from the spirit and scope of the invention. Therefore, the present invention is not limited to the specific embodiments disclosed herein, and all embodiments falling within the scope of the claims of this application are within the protection scope of the present invention.

Claims

1. A lightweight robust traffic flow prediction method based on dual-path divergence fusion, characterized in that it includes the following steps: Step 1: acquire historical traffic observation data and construct a shared latent representation, wherein the traffic flow observation data consists of a smooth baseline pattern, learnable structured variations, and non-learnable perturbations; Step 2: extract structural features and consistency features through complementary dual-path encoding with a structure-oriented path and a consistency-oriented path, wherein the structure-oriented path extracts stable patterns and low-variance structural information from the historical traffic observations, and the consistency-oriented path extracts cross-node and cross-time consistency context information when local observations are unreliable; Step 3: generate dynamic reliability coefficients based on the differences between the features of the two paths, and adaptively fuse the features of the two paths based on the dynamic reliability coefficients; Step 4: output the future traffic flow prediction results based on the fused features and complete the model training.

2. The lightweight robust traffic flow prediction method based on dual-path divergence fusion according to claim 1, characterized in that Step 1 is specifically as follows: the traffic sensor network includes a number of sensor nodes; taking the current prediction time as the reference time, a historical observation window of a given length is obtained, and the historical traffic observation tensor covers the time steps of that window; the historical traffic observation tensor is flattened along the time and feature dimensions, and a lightweight shared mapping is then performed through a convolution and a nonlinear activation function to generate the shared latent representation.

3. The lightweight robust traffic flow prediction method based on dual-path divergence fusion according to claim 2, characterized in that the extraction of structural features and consistency features through complementary dual-path encoding is specifically as follows: the structure-oriented path updates its hidden representation layer by layer, wherein the hidden representation of the initial layer is the shared latent representation, each layer uses a normalization operation and a position-independent multilayer perceptron mapping, the path has a given number of layers, and the output of the structure-oriented path is the structural feature; the consistency-oriented path first uses a lightweight hybrid operator to perform local smoothing and cross-node feature mixing on the shared latent representation to obtain a hybrid feature representation; it then uses a global attention mapping to calculate a global consensus feature representation from the shared latent representation; finally, it concatenates the hybrid feature representation with the global consensus feature representation and obtains the output of the consistency-oriented path, i.e., the consistency feature, through a lightweight projection.

4. The lightweight robust traffic flow prediction method based on dual-path divergence fusion according to claim 3, characterized in that Step 3 is specifically as follows: first, the difference tensor between the structural features and the consistency features is calculated; then, the consistency features and the difference tensor are concatenated along the feature dimension, and the dynamic reliability coefficients are generated through a convolution and the Sigmoid activation function; based on the dynamic reliability coefficients, the structural features and the consistency features are fused element-wise to obtain the fused feature representation.

5. The lightweight robust traffic flow prediction method based on dual-path divergence fusion according to claim 4, characterized in that Step 4 is specifically as follows: the fused feature representation is input to the lightweight prediction head to obtain the traffic flow prediction results for each future time step, and the model is then trained end to end using the masked mean absolute error loss function.

6. The lightweight robust traffic flow prediction method based on dual-path divergence fusion according to claim 5, characterized in that node identification information and time identification information are introduced in the process of generating the shared latent representation.

7. The lightweight robust traffic flow prediction method based on dual-path divergence fusion according to claim 6, characterized in that the masked mean absolute error loss function is expressed in terms of the following quantities: the predicted value at each prediction time step and each node; the corresponding true value; the prediction time span F; and the prediction mask matrix, whose entry is 1 when the real label has a valid observation at the given time step and node position, and 0 when the corresponding position is a null value or an invalid observation.