A non-paired multi-fidelity data fusion method based on structured panel coding
By employing an unpaired multi-fidelity data fusion method using structured panel coding and multi-scale dilated convolution modules, pseudo-high-fidelity data is generated and features are extracted. This method overcomes the limitations of traditional methods in unpaired data processing, achieving high-precision and robust multi-fidelity data fusion, and is suitable for complex engineering problems.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- HUNAN UNIV
- Filing Date
- 2026-02-06
- Publication Date
- 2026-06-12
AI Technical Summary
Traditional multifidelity data fusion methods struggle to effectively capture nonlinear, high-dimensional, and weakly correlated multifidelity relationships when dealing with unpaired or structurally disconnected data, leading to a sharp decline in performance in complex engineering problems.
A non-paired multi-fidelity data fusion method based on structured panel coding is adopted. A pseudo-high-fidelity output is generated by training a benchmark neural network with low-fidelity data, a paired information augmentation dataset is constructed, and feature extraction and fusion are performed using a multi-scale dilated convolution module and a parallel multilayer perceptron to output high-precision prediction results.
It enables the effective integration of multi-fidelity datasets without relying on sample alignment or strong fidelity correlation, significantly improving the accuracy and robustness of prediction models, reducing data acquisition costs, and making it suitable for complex engineering tasks.
Figure CN122196869A_ABST
Abstract
Description
Technical Field
[0001] This invention relates to the application of computer vision and deep learning technologies in multi-fidelity data fusion, and more particularly to a non-paired multi-fidelity data fusion method based on structured panel coding.
Background Technology
[0002] Multi-Fidelity Data Fusion (MDF) technology plays a crucial role in modern science and engineering as a cost-effective surrogate modeling method. It combines scarce but highly accurate high-fidelity (HF) data with abundant but less accurate low-fidelity (LF) data to construct reliable and efficient predictive models, thereby enabling a wide range of applications such as simulation-based optimization, uncertainty quantification, digital twin systems, and real-time decision-making.
[0003] However, traditional multifidelity data fusion methods often face a key bottleneck: they typically require data to be paired or strongly correlated. This assumption is often difficult to meet in real-world applications because, in practical engineering scenarios, HF and LF datasets are often collected independently, with varying environmental conditions, sampling times, or modeling tools, resulting in unpaired and misaligned structures. This unpairing characteristic causes a sharp decline in the efficiency of traditional methods when dealing with complex engineering problems, as they cannot effectively capture nonlinear, high-dimensional, and weakly correlated multifidelity relationships.
[0004] To overcome this challenge, deep learning-based multi-fidelity data fusion methods have emerged in recent years, demonstrating greater flexibility to some extent. However, the performance of these methods remains limited when dealing with truly unpaired or structurally disconnected data. The problem of unpaired multi-fidelity data fusion is particularly prominent in complex engineering tasks such as computational fluid dynamics (CFD), aerodynamic optimization, and structural mechanics, where independent data acquisition across fidelity levels is required for comprehensive exploration and cost reduction.
[0005] Therefore, developing a computational framework that can effectively integrate and utilize multi-fidelity datasets without relying on sample alignment or strong fidelity correlation has become a key problem that urgently needs to be solved in the field of multi-fidelity data fusion.
Summary of the Invention
[0006] To address the aforementioned problems, this invention aims to provide a non-paired multi-fidelity data fusion method based on structured panel coding, which is particularly suitable for complex engineering prediction modeling scenarios with non-paired and misaligned data structures.
[0007] The technical solution adopted in this invention is a non-paired multi-fidelity data fusion method based on structured panel coding, comprising the following steps:
- A baseline neural network is trained using low-fidelity (LF) data to capture global trends;
- The trained LF model is used to generate pseudo-high-fidelity (Pseudo-HF) outputs at high-fidelity (HF) input locations, constructing an HF dataset augmented with pairing information;
- Each HF sample is transformed into a structured image-like feature panel, integrating the HF input, true response value, corresponding pseudo-HF output, and the complete LF dataset;
- Hierarchical feature extraction of the feature panels is performed using a multi-scale dilated convolution module to capture local interactions and long-range cross-fidelity dependencies;
- High-precision prediction results are output by fusing the learned feature representations through a parallel multilayer perceptron (MLP).
[0008] Preferably, the multi-scale dilated convolution module includes four parallel one-dimensional dilated convolution branches, each with a different kernel size, to expand the effective receptive field and aggregate multi-scale spatiotemporal information.
[0009] Preferably, the output of the multi-scale dilated convolution module achieves dimensionality consistency through temporal alignment and channel concatenation operations. Subsequently, the sequence dimension is restored through two layers of MLP and pooling operations are performed. Finally, the output is reshaped into a one-dimensional vector for prediction.
[0010] Preferably, the method for constructing the feature panel includes: The high-fidelity response corresponding to each HF input and the synthesized pseudo-HF output are concatenated with the complete LF dataset along the channel axis; A unified high-dimensional tensor is formed, preserving spatial distribution information and specific fidelity information.
[0011] Preferably, the parallel multilayer perceptron (MLP) includes a linear part ($\mathrm{MLP}_{L}$) and a nonlinear part ($\mathrm{MLP}_{NL}$). The outputs of both are fused using a learnable weight parameter $\alpha$ to obtain the final prediction result, where $Z$ is the feature map of the multi-scale convolution and $\alpha$ is the learnable weight parameter.
[0012] Preferably, the training process adopts a two-stage procedure: Phase 1: Minimize the LF loss function using LF data and optimize the LF network parameters; The second stage involves fine-tuning the entire network by minimizing the HF loss function using the mini-batch-based Adam optimizer.
[0013] Preferably, both the LF loss function and the HF loss function include a regularization term to prevent overfitting, where $N_{LF}$ and $N_{HF}$ represent the numbers of LF and HF samples, respectively, $\lambda_{LF}$ and $\lambda_{HF}$ are the regularization coefficients, and $\theta_{LF}$ and $\theta_{HF}$ are the model parameters.
[0014] Preferably, the method further includes a pseudo-high-fidelity data generation step, specifically: mapping the HF input to the LF domain using the pre-trained LF model to generate a pseudo-high-fidelity (Pseudo-HF) output, with the LF parameters $\theta_{LF}$ held fixed at their optimized values.
[0015] Preferably, the method further includes a feature aggregation step, specifically: the true response value, pseudo-HF output, and complete LF dataset corresponding to each HF input are concatenated along the feature dimension to form a feature panel, where $D_x$ and $D_y$ correspond to the input and output dimensions, respectively, and Concat denotes concatenation along the feature dimension.
[0016] Preferably, the method is applied to multi-fidelity data fusion tasks in the fields of finite element analysis, structural health monitoring, or materials science, and the evaluation indicators include root mean square error (RMSE), normalized root mean square error (NRMSE), coefficient of determination (R²), and maximum absolute error (MAE).
[0017] This invention proposes an unpaired multi-fidelity data fusion method based on structured panel coding, namely Multi-fidelity Sequential Convolutional Neural Network (MF-SCNN). This method, through an innovative "point-to-domain" strategy, successfully overcomes the limitations of traditional multi-fidelity data fusion methods when processing unpaired data, achieving significant technical and beneficial effects. This invention expands the dataset by generating pseudo-high-fidelity samples and utilizes a novel encoding method to transform all unpaired data into structured image-like tensors, thereby achieving effective fusion of multi-fidelity data without relying on sample alignment or strong fidelity correlation. This innovative strategy significantly broadens the application scope of multi-fidelity data fusion technology, enabling it to handle more complex and practical engineering problems.
[0018] This invention processes structured image tensors using a multi-scale dilated convolutional network, enabling the extraction of multi-level spatial and cross-fidelity features, thereby significantly improving the accuracy and robustness of the prediction model. In benchmark tests and engineering case studies, MF-SCNN reduces the root mean square error (RMSE) by 28-31% compared to existing state-of-the-art (SOTA) methods, demonstrating its superior performance under unpaired conditions.
[0019] The modular design of this invention decouples low-fidelity and high-fidelity trend learning, allowing each component to focus on a specific task. This design not only improves the interpretability of the model but also enhances its generalization ability, enabling it to exhibit excellent performance even when faced with sparse, high-dimensional, and structurally misaligned data.
[0020] Because this invention does not rely on strict sample alignment or strong fidelity correlation, it can significantly reduce the cost and time of data acquisition in practical applications. By effectively utilizing existing unpaired multifidelity data, this invention provides a cost-effective solution for engineering fields where data is scarce or data acquisition costs are high.
[0021] This invention not only provides a new approach and method for unpaired multi-fidelity data fusion, but also lays a solid foundation for the further development of multi-fidelity modeling technology. Through continuous optimization and improvement of the MF-SCNN framework, it is expected to be extended to more complex engineering and scientific fields in the future, unlocking new application scenarios and value.
Attached Figure Description
[0022] Figure 1 is a flowchart of the MF-SCNN workflow in this invention;
Figure 2 is a diagram of the MF-SCNN framework architecture in this invention;
Figure 3 is a schematic diagram comparing traditional image coding strategies and point-to-domain image coding strategies in multi-fidelity data augmentation;
Figure 4 is a comparative diagram of traditional one-dimensional convolution and multi-scale one-dimensional dilated convolution in this invention;
Figure 5 is a visual illustration of the feature maps before and after convolution in this invention;
Figure 6 is a schematic diagram of principal component analysis of feature representations before and after convolution in this invention;
Figure 7 is a schematic diagram comparing the predictions of the complete MF-SCNN and the ablation variants in this invention;
Figure 8 is a schematic diagram illustrating the quantitative impact of the ablation experiment on the performance of MF-SCNN in this invention;
Figure 9 is a schematic diagram illustrating the performance stability of different models in this invention under different HF data volumes;
Figure 10 is a schematic diagram of the simulation results of a 2D / 3D plate with elliptical holes in this invention;
Figure 11 is a schematic diagram of the spatial distribution of LF (750) and HF (200) samples in this invention;
Figure 12 is a schematic diagram of the LF neural network prediction versus the HF true value in this invention;
Figure 13 is a schematic diagram of the MF-SCNN prediction versus the HF true value in this invention;
Figure 14 is a schematic diagram comparing the prediction residuals of all models in this invention;
Figure 15 is a schematic diagram illustrating the difference in fidelity and output caused by mesh granularity in the truss bridge case of this invention;
Figure 16 is a stress and displacement cloud diagram of the bridge components in this invention;
Figure 17 is a schematic diagram comparing the actual output and the predicted output in the test set of this invention;
Figure 18 is a box plot comparing the absolute residuals of all competing models in this invention.
Detailed Implementation
[0023] This invention proposes an unpaired multifidelity data fusion method based on structured panel coding. This method reconstructs the traditional hierarchical regression paradigm using a deep learning framework to solve the problem of fusion of independently sampled, unaligned, and structurally unpaired multifidelity data. The specific embodiments of this invention are described in detail below with reference to the accompanying drawings.
[0024] The system architecture of this invention, MF-SCNN (Multi-fidelity Sequential Convolutional Neural Network), adopts a modular design. As shown in Figure 1, its core architecture includes three key stages: Pseudo-sample augmentation stage: A baseline neural network is trained using low-fidelity (LF) data and is then used to generate pseudo-high-fidelity (pseudo-HF) samples that augment the dataset. This stage evaluates the LF model at high-fidelity input locations, generating pairing information and alleviating the data scarcity problem. It corresponds to the "Pseudo-sample generation" module in the Figure 1 flowchart.
[0025] Point-to-domain spatial encoding stage: Each HF sample is encoded into a structured image-like feature panel, integrating the HF input, true response, pseudo-HF output, and the complete LF dataset (see Figure 3(b)). This encoding method transforms the unpaired fusion problem into a standard supervised learning task, eliminating the reliance on traditional point-to-point pairing.
[0026] Hierarchical feature extraction and prediction stage: Cross-fidelity dependencies are extracted using a multi-scale dilated convolution module, and the prediction results are output using a parallel multilayer perceptron (MLP). This stage corresponds to the "Multi-scale Feature Extraction" and "MLP Fusion" modules in the Figure 2 architecture diagram.
[0027] The pseudo-sample augmentation stage in this invention includes the following steps: Step 1: LF Pre-training: Train a lightweight neural network using abundant LF data. The loss function penalizes the error between the LF true values and the model's predicted values and adds a regularization term, where $\theta_{LF}$ denotes the model parameters and $\lambda_{LF}$ is the regularization coefficient. This step enables the model to efficiently encode the global trend of LF, laying the foundation for subsequent pseudo-sample generation.
[0028] Step 2: Pseudo-HF Sample Generation: Fix the parameters of the pre-trained LF model at their optimized values and evaluate the model at the HF input positions to generate pseudo-HF outputs. This strategy expands the data space and enhances model robustness without requiring additional HF samples.
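The two steps above can be illustrated with a minimal sketch. It is not the patented network: a degree-7 polynomial fitted by least squares stands in for the LF neural network, and the toy sine response, sample counts, and all variable names are hypothetical choices made only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy data: LF and HF inputs are sampled independently (unpaired).
x_lf = rng.uniform(0.0, 1.0, size=200)
y_lf = np.sin(2 * np.pi * x_lf)        # cheap, abundant LF response
x_hf = rng.uniform(0.0, 1.0, size=15)  # scarce HF input locations

# Step 1 stand-in: fit an LF surrogate (a polynomial instead of a neural network).
lf_coeffs = np.polyfit(x_lf, y_lf, deg=7)


def lf_model(x: np.ndarray) -> np.ndarray:
    """Evaluate the frozen LF surrogate; lf_coeffs is never updated after fitting."""
    return np.polyval(lf_coeffs, x)


# Step 2: evaluate the frozen LF surrogate at the HF inputs to get pseudo-HF outputs.
y_pseudo_hf = lf_model(x_hf)
```

Each HF input now has an LF-based companion value even though no LF sample was taken at that location, which is the pairing information the later stages rely on.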
[0029] The point-to-domain spatial encoding stage includes the following steps: Step 1: Feature panel construction: LF, HF, and pseudo-HF data are integrated into a high-dimensional tensor and concatenated along the channel axis to form an image-like representation, where $D_x$ and $D_y$ correspond to the input and output dimensions, respectively, and Concat represents concatenation along the feature dimension. This encoding method preserves spatial distribution and fidelity information, providing a consistent foundation for subsequent feature extraction.
[0030] Step 2: Multi-scale feature extraction: The feature panel is processed using four parallel one-dimensional dilated convolution branches, with kernel sizes of [sizes to be filled in]. For the output of the i-th branch, $L_i$ denotes the length of the sequence obtained after convolution and $C'$ is the number of output channels.
[0031] To ensure dimensionality consistency when concatenating branches, temporal alignment is achieved by truncating all outputs to their minimum length. Subsequently, the multi-scale features are concatenated along the channel dimension.
The hierarchical feature extraction and prediction stage in this invention includes the following steps: Step 1: MLP Fusion: The sequence dimension is restored and information is aggregated through a two-layer MLP. The final output is reshaped into a one-dimensional vector through a flattening operation and is used to predict the maximum stress and displacement.
[0032] Step 2: Loss Function and Optimization: The mean squared error (MSE) is used as the loss function, and the network parameters are optimized through backpropagation to improve the prediction accuracy.
[0033] This invention successfully achieves efficient fusion of unpaired multi-fidelity data through structured panel coding and multi-scale feature extraction. Specific embodiments are as follows: As shown in Figures 1-8, this invention proposes a non-paired multi-fidelity data fusion method based on structured panel coding. Its MF-SCNN deep learning framework is built upon a hierarchical regression framework used for multi-fidelity modeling. The hierarchical regression framework employs a structured, step-by-step fusion strategy, formally separating "knowledge transfer for fidelity scaling" from "input-dependent correction." While this principle-based decomposition method enhances robustness in cases of data scarcity, its reliance on linear operators limits its expressive power, making it difficult to handle complex nonlinear relationships in non-paired datasets.
[0034] To retain the structured approach and overcome the aforementioned limitations, this invention reconstructs the paradigm using a deep learning framework, achieving a leap from traditional analytical workflows to neural network-based implementations. Its core breakthrough lies in replacing predefined transformations with trainable, hierarchical neural modules, enabling the model to adaptively learn features from data, thus laying the foundational concepts for subsequent specific architectures.
[0035] In this embodiment, the MF-SCNN architecture is specifically designed to address the challenging scenario of multi-fidelity data fusion where data is independently sampled, unaligned, and exhibits non-paired features, a scenario where traditional methods often fail. As shown in Figure 1, its workflow comprises three key stages: Pseudo-sample augmentation: First, train a baseline neural network using only low-fidelity (LF) data to capture its global trends. Then, evaluate the trained LF model at high-fidelity (HF) input locations to generate "pseudo" high-fidelity (HF) outputs, synthesizing pairing information. This expands the HF dataset without additional expensive simulations.
[0036] Point-to-domain spatial encoding: Each HF sample is transformed into a structured, image-like feature panel that integrates HF input, true response value, corresponding pseudo-HF output, and the complete LF dataset, transforming the unpaired fusion problem into a standard supervised learning task.
[0037] Hierarchical Feature Extraction and Prediction: Multi-scale dilated convolutional modules are used to process feature panels, extracting hierarchical features and capturing local interactions within the panels and long-range cross-fidelity dependencies. Finally, the learned representations are fused through a parallel multilayer perceptron (MLP) to output high-precision prediction results.
[0038] This modular design offers two major advantages: first, it decouples low-fidelity and high-fidelity trend learning, allowing each component to focus on a specific task; second, it builds effective information bridges between unpaired datasets through pseudo-sample synthesis and spatial encoding techniques. Therefore, MF-SCNN provides a robust and flexible solution for surrogate modeling, exhibiting excellent performance when dealing with sparse, high-dimensional, and structurally misaligned data.
[0039] As shown in Figure 2, the MF-SCNN proposed in this embodiment begins with the LF pre-training stage. In this stage, a lightweight neural network is trained using abundant low-fidelity (LF) data to extract global features. The LF loss function includes a regularization term, where $\theta_{LF}$ represents the model parameters and $\lambda_{LF}$ represents the regularization coefficient used to prevent overfitting. This pre-training step enables the model to efficiently encode low-fidelity information, thus laying the foundation for the subsequent generation of pseudo-high-fidelity samples.
[0040] Next, MF-SCNN generates pseudo-high-fidelity (Pseudo-HF) output by mapping the HF input to the LF domain using the pre-trained LF model, whose parameters $\theta_{LF}$ are held fixed at their optimized values. This strategy expands the HF data space without requiring additional HF samples, thereby enhancing the model's robustness in data-scarce scenarios.
[0041] To effectively bridge the gap between independently sampled high-fidelity (HF) and low-fidelity (LF) data, the MF-SCNN proposed in this invention no longer relies on the traditional point-to-point correspondence, but instead reconstructs the fusion process into a point-to-domain learning task.
[0042] As shown in Figure 3(a), traditional multifidelity frameworks typically require datasets to be in paired form, meaning that each HF sample must have a corresponding LF observation at the same input location. However, in practical applications, this assumption is often difficult to achieve due to the separation of sampling mechanisms and structural misalignment.
[0043] MF-SCNN employs a point-to-domain fusion strategy, utilizing the complete LF dataset as contextual support for each HF sample to construct a domain-level representation. Figure 3(b) illustrates this transformation process using image-like feature panel encoding. Specifically, MF-SCNN integrates LF, HF, and pseudo-HF data into a unified high-dimensional tensor, which preserves both spatial distribution information and fidelity-specific information. For each HF input, its corresponding high-fidelity response and the synthesized pseudo-HF output are concatenated along the channel axis with the complete LF dataset to form an image-like representation, where $D_x$ and $D_y$ correspond to the input and output dimensions, respectively, and Concat represents concatenation along the feature dimension. This encoding method ensures that all fidelity levels are uniformly represented within the same convolutional domain, thus providing a consistent and information-rich foundation for subsequent feature extraction.
[0044] Subsequently, MF-SCNN employs a multi-scale dilated convolution module to extract cross-fidelity dependencies from long, information-rich encoded sequences across multiple spatial and temporal scales. Let the input feature tensor have shape $B \times L \times C$, where $B$ is the batch size, $L$ is the sequence length, and $C$ is the channel dimension. Because low-fidelity (LF) inputs generated by image encoding often produce long sequences, traditional convolutions with fixed receptive fields struggle to aggregate local and global information simultaneously. To address this limitation, MF-SCNN employs four parallel one-dimensional dilated convolution branches with kernel sizes of [missing information]. For the output of the i-th convolutional branch, $L_i$ denotes the length of the sequence obtained after convolution and $C'$ is the number of output channels. To ensure dimensionality consistency when concatenating branches, temporal alignment is achieved by truncating all outputs to the minimum length. Subsequently, the multi-scale features are concatenated along the channel dimension. Next, the sequence dimension is recovered using a two-layer multilayer perceptron (MLP), followed by pooling to aggregate information along the sequence direction and reduce its length. Finally, a flattening step reshapes the output into a one-dimensional vector for final prediction. As shown in Figure 4, the multi-scale dilated convolution module is the core component of multi-fidelity data fusion in MF-SCNN and can effectively capture local and global dependencies in long and complex sequences.
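The point-to-domain encoding can be sketched as follows. The exact panel layout is not recoverable from the text, so this is only one plausible arrangement: the HF input and its pseudo-HF output are tiled across every LF row and joined along the feature axis. As an assumption of this sketch, the true HF response is kept outside the panel as the regression target rather than fed to the network. The function name `build_feature_panel` and the toy shapes are hypothetical.

```python
import numpy as np


def build_feature_panel(x_hf_i, y_pseudo_i, x_lf, y_lf):
    """Encode one HF sample against the whole LF dataset.

    x_hf_i:     (D_x,)       one HF input
    y_pseudo_i: (D_y,)       pseudo-HF output at that input
    x_lf:       (N_LF, D_x)  all LF inputs
    y_lf:       (N_LF, D_y)  all LF outputs
    returns:    (N_LF, 2*D_x + 2*D_y) image-like panel
    """
    n_lf = x_lf.shape[0]
    hf_block = np.concatenate([x_hf_i, y_pseudo_i])        # (D_x + D_y,)
    hf_tiled = np.tile(hf_block, (n_lf, 1))                # repeat for every LF row
    return np.concatenate([hf_tiled, x_lf, y_lf], axis=1)  # join along features


# Toy example with D_x = 2, D_y = 1, N_LF = 5.
x_lf = np.arange(10.0).reshape(5, 2)
y_lf = np.arange(5.0).reshape(5, 1)
x_hf_i = np.array([0.5, 1.5])
y_pseudo_i = np.array([2.0])
panel = build_feature_panel(x_hf_i, y_pseudo_i, x_lf, y_lf)
```

Because every panel carries the full LF dataset, no LF observation is required at the HF input location, which is what removes the point-to-point pairing requirement.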
[0045] This module takes feature sequences derived from original high-fidelity (HF) input, pseudo-high-fidelity (pseudo-HF), and low-fidelity (LF) data, and processes them through parallel one-dimensional dilated convolution branches with different receptive fields, thereby achieving information aggregation across multiple spatiotemporal scales. This design overcomes the locality limitations of standard one-dimensional convolution and enhances the fusion effect of heterogeneous fidelity features.
[0046] This module expands the effective receptive field without significantly increasing the number of parameters, generating more comprehensive, robust, and generalizable feature representations, thereby improving the prediction accuracy and adaptability of the model in challenging unpaired multifidelity scenarios.
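A minimal NumPy sketch of the multi-branch step described above: four parallel dilated one-dimensional convolutions, truncation of all branch outputs to the shortest length, and concatenation along the channel axis. The kernel sizes `[3, 5, 7, 9]`, the dilation of 2, the channel counts, and the absence of padding are all assumptions, since the source does not state them; bias terms and activations are omitted for brevity.

```python
import numpy as np


def dilated_conv1d(x, weight, dilation):
    """Unpadded, stride-1 dilated 1D convolution (cross-correlation).

    x:      (B, L, C) input sequence
    weight: (K, C, C_out) kernel
    returns (B, L_out, C_out) with L_out = L - dilation * (K - 1)
    """
    k = weight.shape[0]
    l_out = x.shape[1] - dilation * (k - 1)
    out = np.zeros((x.shape[0], l_out, weight.shape[2]))
    for j in range(k):
        # Tap j reads the input shifted by j * dilation positions.
        window = x[:, j * dilation : j * dilation + l_out, :]
        out += window @ weight[j]
    return out


def multi_scale_block(x, weights, dilation=2):
    """Run all branches, align their lengths, and concatenate channels."""
    branches = [dilated_conv1d(x, w, dilation) for w in weights]
    l_min = min(b.shape[1] for b in branches)       # shortest branch output
    aligned = [b[:, :l_min, :] for b in branches]   # truncate to common length
    return np.concatenate(aligned, axis=-1)         # (B, l_min, 4 * C_out)


rng = np.random.default_rng(0)
x = rng.normal(size=(2, 50, 6))   # B = 2, L = 50, C = 6
kernel_sizes = [3, 5, 7, 9]       # hypothetical; not given in the source
weights = [rng.normal(size=(k, 6, 4)) for k in kernel_sizes]
features = multi_scale_block(x, weights, dilation=2)
```

With these assumed settings the branch lengths are 46, 42, 38, and 34, so the aligned output has shape `(2, 34, 16)`. Larger kernels see a wider span of the panel but yield shorter sequences, which is why the truncation step is needed before concatenation.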
[0047] After feature extraction, MF-SCNN combines the outputs of linear and nonlinear MLPs to generate the final prediction. The linear part ($\mathrm{MLP}_{L}$) computes a linear output, while the nonlinear part ($\mathrm{MLP}_{NL}$) uses an activation function to capture complex relationships. The two outputs are fused through a learnable weight parameter $\alpha$ to give the final model prediction. This hybrid fusion strategy can adaptively balance simple and complex relationships to achieve optimal prediction.
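The fusion formula itself is not reproduced in this text. As a hedged sketch, the block below assumes a convex combination, `alpha * MLP_L(Z) + (1 - alpha) * MLP_NL(Z)`, with a single-hidden-layer `tanh` network for the nonlinear part; both the combination form and the layer sizes are assumptions, and `alpha` is passed in as a plain number rather than learned.

```python
import numpy as np


def parallel_mlp_fusion(z, params, alpha):
    """Fuse a linear head and a nonlinear head with weight alpha.

    z: (B, F) flattened features from the convolution module.
    The convex combination used here is an assumed form.
    """
    y_linear = z @ params["w_l"] + params["b_l"]        # MLP_L: no activation
    hidden = np.tanh(z @ params["w1"] + params["b1"])   # MLP_NL hidden layer
    y_nonlinear = hidden @ params["w2"] + params["b2"]  # MLP_NL output layer
    return alpha * y_linear + (1.0 - alpha) * y_nonlinear


rng = np.random.default_rng(0)
z = rng.normal(size=(4, 8))
params = {
    "w_l": rng.normal(size=(8, 1)), "b_l": np.zeros(1),
    "w1": rng.normal(size=(8, 16)), "b1": np.zeros(16),
    "w2": rng.normal(size=(16, 1)), "b2": np.zeros(1),
}
y_hat = parallel_mlp_fusion(z, params, alpha=0.3)
```

Under this assumed form, `alpha = 1` reduces the model to a purely linear predictor and `alpha = 0` to a purely nonlinear one, so training `alpha` lets the data decide how much of each to use.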
[0048] Table 1. Computation process and input-output mapping of MF-SCNN. Table 1 summarizes the computational flow and input-output mapping of MF-SCNN, detailing all key modules and transformation processes from the original input to the final prediction. This staged design, integrating pseudo-high-fidelity enhancement and multi-branch dilated convolution, lays a robust foundation for the subsequent prediction module.
[0049] In this embodiment, the training of MF-SCNN follows a two-stage process. In the first stage, the LF network is optimized by minimizing the LF loss function using LF data. In the second stage, the entire network is fine-tuned by minimizing the HF loss function, where $\theta_{HF}$ represents the model parameters that participate in the final prediction path and the accompanying regularization term is the one used for HF learning.
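As a sketch only, one form of the two losses that is consistent with the symbols defined in [0013] and with the mean squared error named in [0032] is given below. The squared L2 penalty, the predictor symbols $f_{LF}$ and $f_{HF}$, and the panel symbol $P_i$ are assumptions introduced here, not taken from the source.

```latex
\mathcal{L}_{LF}(\theta_{LF}) =
  \frac{1}{N_{LF}} \sum_{i=1}^{N_{LF}}
  \left( y_i^{LF} - f_{LF}\!\left(x_i^{LF}; \theta_{LF}\right) \right)^2
  + \lambda_{LF} \left\lVert \theta_{LF} \right\rVert_2^2

\mathcal{L}_{HF}(\theta_{HF}) =
  \frac{1}{N_{HF}} \sum_{i=1}^{N_{HF}}
  \left( y_i^{HF} - f_{HF}\!\left(P_i; \theta_{HF}\right) \right)^2
  + \lambda_{HF} \left\lVert \theta_{HF} \right\rVert_2^2
```

Here $P_i$ stands for the feature panel built for the i-th HF sample. The first loss is minimized in stage one; the second is minimized in stage two.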
[0050] To ensure computational efficiency, MF-SCNN employs the Adam optimizer based on minibatches. Table 2 summarizes the specific steps of the MF-SCNN training protocol, detailing procedures including LF pre-training, pseudo-HF data generation, feature aggregation, HF training, and new sample prediction. This step-by-step training scheme ensures stable convergence of the model and effective knowledge transfer across fidelity levels.
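The two-stage schedule with a mini-batch Adam optimizer can be sketched on a deliberately small problem. Linear models replace the neural networks, the toy LF and HF responses are hypothetical, and the learning rate, batch size, step count, and L2 penalty are illustrative defaults rather than the settings of the invention.

```python
import numpy as np


def adam_fit(features, targets, lam=1e-3, lr=0.05, steps=500, batch=16, seed=0):
    """Minimise MSE + lam * ||w||^2 for a linear model with mini-batch Adam."""
    rng = np.random.default_rng(seed)
    w = np.zeros(features.shape[1])
    m, v = np.zeros_like(w), np.zeros_like(w)
    beta1, beta2, eps = 0.9, 0.999, 1e-8
    for t in range(1, steps + 1):
        idx = rng.choice(len(targets), size=min(batch, len(targets)), replace=False)
        xb, yb = features[idx], targets[idx]
        grad = 2 * xb.T @ (xb @ w - yb) / len(yb) + 2 * lam * w
        m = beta1 * m + (1 - beta1) * grad        # first-moment estimate
        v = beta2 * v + (1 - beta2) * grad**2     # second-moment estimate
        m_hat, v_hat = m / (1 - beta1**t), v / (1 - beta2**t)  # bias correction
        w -= lr * m_hat / (np.sqrt(v_hat) + eps)
    return w


def mse(features, targets, w):
    return float(np.mean((features @ w - targets) ** 2))


def with_bias(x):
    return np.hstack([x, np.ones((len(x), 1))])


rng = np.random.default_rng(1)
# Hypothetical unpaired toy data: LF is a biased, cheap version of HF.
x_lf = rng.uniform(-1, 1, size=(200, 1))
y_lf = 1.5 * x_lf[:, 0] + 0.2
x_hf = rng.uniform(-1, 1, size=(20, 1))
y_hf = 2.0 * x_hf[:, 0] + 0.5

# Stage 1: fit the LF model on LF data only, then freeze it.
w_lf = adam_fit(with_bias(x_lf), y_lf)
# Stage 2: fit the HF model on HF inputs augmented with the pseudo-HF output.
pseudo_hf = with_bias(x_hf) @ w_lf
hf_features = np.hstack([with_bias(x_hf), pseudo_hf[:, None]])
w_hf = adam_fit(hf_features, y_hf)
```

The LF weights are never touched in stage 2; only the HF model sees the 20 HF samples, which mirrors the decoupling of LF and HF trend learning described for the framework.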
[0051] Table 2. Step-by-step training process of the MF-SCNN model
To ensure the comprehensiveness and fairness of the evaluation, this embodiment employs four widely used quantitative indicators: root mean square error (RMSE), normalized root mean square error (NRMSE), coefficient of determination (R²), and maximum absolute error (MAE). RMSE, NRMSE, R², and MAE are used to evaluate prediction accuracy, scale-normalized performance, trend consistency, and local error, respectively.
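The four indicators can be written out as a short sketch using their standard definitions. Two points are assumptions: NRMSE is normalized here by the range of the true values (other conventions divide by the mean or standard deviation), and MAE follows the expansion given in this paragraph, maximum absolute error, rather than the more common mean absolute error.

```python
import numpy as np


def rmse(y_true, y_pred):
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def nrmse(y_true, y_pred):
    # Normalised by the range of the true values; an assumed convention.
    return rmse(y_true, y_pred) / float(np.max(y_true) - np.min(y_true))


def r2(y_true, y_pred):
    ss_res = np.sum((y_true - y_pred) ** 2)          # residual sum of squares
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2) # total sum of squares
    return float(1.0 - ss_res / ss_tot)


def max_abs_error(y_true, y_pred):
    return float(np.max(np.abs(y_true - y_pred)))


y_true = np.array([1.0, 2.0, 3.0, 4.0])
y_pred = np.array([1.0, 2.0, 3.0, 6.0])
scores = {
    "RMSE": rmse(y_true, y_pred),            # 1.0
    "NRMSE": nrmse(y_true, y_pred),          # 1/3
    "R2": r2(y_true, y_pred),                # 0.2
    "MAE": max_abs_error(y_true, y_pred),    # 2.0
}
```

In this toy case a single poor prediction dominates every metric, and the maximum absolute error isolates it directly, which matches its role as the local-error indicator.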
[0052] To comprehensively evaluate the effectiveness and robustness of the proposed MF-SCNN framework, a series of multi-fidelity experiments were conducted, covering both classical benchmark functions and practical finite element analysis (FEA) scenarios. The experimental setup simulated real-world conditions, namely significant sample misalignment and weak correlation between low-fidelity (LF) and high-fidelity (HF) data. This condition poses a significant challenge to traditional methods that rely on strictly paired data or strong cross-fidelity relationships. To ensure fairness, necessary interpolation was performed on the benchmark methods where required. The performance of MF-SCNN was compared with several advanced methods, including MF-TLNN, MF-C-NN, and HF-NN.
[0053] In the field of surrogate modeling, accurate modeling of multi-fidelity data with sample misalignment remains a key challenge. To address this, nine analytical benchmark functions with varying complexities and cross-fidelity correlations were evaluated; details are summarized in Table 4. Samples were uniformly drawn from a shared input domain, and most benchmarks intentionally employed unpaired sampling to rigorously test model adaptability. Model performance was evaluated using root mean square error (RMSE), normalized root mean square error (NRMSE), coefficient of determination (R²), maximum absolute error (MAE), and computation time. To ensure fairness, all models were trained using optimized hyperparameters; the specific hyperparameter settings for MF-SCNN are shown in Table 3.
[0054] Table 3. Training hyperparameters for MF-SCNN based on the baseline MF function.
Table 4. Training the hyperparameters of MF-SCNN for use as a baseline MF function
Table 5. Quantitative performance metrics for all models on benchmark 1.
Table 6. Quantitative performance metrics for all models on benchmark 2.
Table 7. Quantitative performance metrics for all models on benchmark 3.
Table 8. Quantitative performance metrics for all models on benchmark 4.
Table 9. Quantitative performance metrics for all models on benchmark 5.
Table 10. Quantitative performance metrics for all models on benchmark 6.
Table 11. Quantitative performance metrics for all models on benchmark 7.
Table 12. Quantitative performance metrics for all models on benchmark 8.
Table 13. Quantitative performance metrics for all models on benchmark 9.
As summarized in Tables 5 to 13, MF-SCNN consistently achieves the lowest error metrics and the highest R² values in almost all benchmark tests, demonstrating its superior prediction accuracy and robustness. Specifically, in benchmark 3, MF-SCNN achieves an RMSE of 0.3572 and an R² of 0.9991, significantly outperforming HF-NN (RMSE 0.8583, R² 0.8147), MF-TLNN (RMSE 0.5736, R² 0.9643), and MF-C-NN (RMSE 0.5281, R² 0.9589).
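To put the benchmark 3 figures on a common scale, the relative RMSE reduction of MF-SCNN against each competitor can be computed from the values quoted above. The helper name `relative_reduction` is hypothetical; the numbers are those reported in the text.

```python
# RMSE values reported for benchmark 3.
rmse_benchmark3 = {
    "MF-SCNN": 0.3572,
    "HF-NN": 0.8583,
    "MF-TLNN": 0.5736,
    "MF-C-NN": 0.5281,
}


def relative_reduction(baseline: float, proposed: float) -> float:
    """Fraction by which `proposed` lowers the error relative to `baseline`."""
    return (baseline - proposed) / baseline


reductions = {
    name: relative_reduction(value, rmse_benchmark3["MF-SCNN"])
    for name, value in rmse_benchmark3.items()
    if name != "MF-SCNN"
}
```

On this benchmark the reduction is roughly 58% against HF-NN, 38% against MF-TLNN, and 32% against MF-C-NN.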
[0055] To demonstrate the feature extraction capabilities of MF-SCNN, we visualized the feature maps before and after the convolutional layers using benchmark 2. For example... Figure 5 As shown in (a), the features before convolution are smooth and homogeneous, indicating limited spatial expressive power. In contrast, Figure 5 The post-convolutional features in (b) demonstrate clear structural diversity and multi-channel granularity, revealing that the convolutional kernel enhances the detection capability of local patterns through spatially constrained filtering and nonlinear activation.
[0056] To perform quantitative evaluation, we applied principal component analysis (PCA) to the feature space. Figure 6 The results show that before convolution, the first two principal components explain 95.6% and 4.4% of the variance, respectively, reflecting high redundancy. After convolution, these values become 73.9% and 20.9%, indicating a wider variance distribution and increased intrinsic dimensionality. The expanded confidence ellipse further supports enhanced feature separability. The reduction in explained variance is consistent with higher feature entropy, which strengthens the representational power of regression modeling. In summary, these results confirm that MF-SCNN, through its convolutional design, transforms homogeneous inputs into discriminative multi-scale features, providing an interpretable basis for its performance gains on unstructured multifidelity data.
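The explained-variance figures quoted above are the standard PCA variance ratios. A minimal sketch of how such ratios are obtained, via the singular values of the centered feature matrix, is shown below; the two synthetic feature sets are hypothetical stand-ins for the pre- and post-convolution features and do not reproduce the reported percentages.

```python
import numpy as np


def explained_variance_ratio(features: np.ndarray) -> np.ndarray:
    """features: (n_samples, n_features). Returns the variance share per component."""
    centered = features - features.mean(axis=0, keepdims=True)
    singular_values = np.linalg.svd(centered, compute_uv=False)
    variances = singular_values ** 2   # proportional to per-component variance
    return variances / variances.sum()


rng = np.random.default_rng(0)
# Redundant features: three columns driven by one latent factor plus small noise.
latent = rng.normal(size=(200, 1))
redundant = latent @ np.array([[1.0, 0.9, 1.1]]) + 0.05 * rng.normal(size=(200, 3))
# Diverse features: three independent columns.
diverse = rng.normal(size=(200, 3))

ratio_redundant = explained_variance_ratio(redundant)
ratio_diverse = explained_variance_ratio(diverse)
```

A first ratio close to 1, as in the redundant set, signals that nearly all variation lies along one direction; a flatter profile, as in the diverse set, indicates higher intrinsic dimensionality, which is the reading applied to the before and after convolution comparison.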
[0057] To evaluate the contribution of each core component in the MF-SCNN architecture, we conducted a rigorous ablation study. Figure 7 visually compares the predictions of the full model with those of variants built by removing a single module, and Figure 8 provides a quantitative summary of performance.
[0058] The full MF-SCNN model exhibits the best performance, with an RMSE of 0.3139 and an R² of 0.9993. Removing the pseudo-high-fidelity (Pseudo-HF) data generation component leads to the most significant performance drop, increasing the RMSE to 1.2467 and decreasing the R² to 0.9613. This result confirms that generating synthetic high-fidelity samples is crucial for addressing the inherent problem of data scarcity.
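The pseudo-HF generation step can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: a closed-form least-squares line (`fit_linear`, a hypothetical helper) stands in for the LF neural network, all data values are made up, and the flattened panel layout is one simple choice among many, since the exact layout is not given in this passage.

```python
def fit_linear(xs, ys):
    """Least-squares line y = a*x + b; a stand-in for the trained LF network."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    b = my - a * mx
    return lambda x: a * x + b

# Unpaired data: the LF and HF input locations do not coincide.
lf_x = [0.0, 0.25, 0.5, 0.75, 1.0]
lf_y = [1.0, 1.5, 2.0, 2.5, 3.0]   # hypothetical LF responses (y = 2x + 1)
hf_x = [0.1, 0.6, 0.9]
hf_y = [1.3, 2.5, 3.2]             # hypothetical HF responses

# Step 1: train the LF surrogate on LF data only.
lf_model = fit_linear(lf_x, lf_y)

# Step 2: evaluate it at the HF inputs to obtain pseudo-HF outputs.
pseudo_hf = [lf_model(x) for x in hf_x]

# Step 3: pair each HF sample with its pseudo-HF output (augmented HF dataset).
augmented = list(zip(hf_x, hf_y, pseudo_hf))

# Step 4 (assumed layout): append the complete LF dataset to every HF sample.
lf_flat = [v for pair in zip(lf_x, lf_y) for v in pair]
panels = [[x, y, p] + lf_flat for x, y, p in augmented]
print(panels[0])
```

Each panel row then carries the HF input, its true response, the pseudo-HF output, and the full LF dataset, so that downstream layers see cross-fidelity information without requiring paired samples.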
[0059] Similarly, replacing the multi-scale dilated convolution module with a standard convolution operation increases the RMSE to 1.1917 and decreases the R² to 0.9640, indicating that capturing hierarchical spatial features is fundamental and necessary for modeling complex data structures. Replacing the hybrid fusion mechanism with a standard MLP also leads to a significant decrease in accuracy (RMSE: 0.9082, R²: 0.9762), suggesting that a specialized strategy must be employed to integrate heterogeneous feature representations to achieve optimal performance.
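To illustrate why dilation enlarges the receptive field without adding weights, the following sketch implements a one-dimensional dilated convolution in plain Python. The kernel, signal, and dilation rates are illustrative assumptions; the actual branch configuration of MF-SCNN is not specified in this passage.

```python
def receptive_field(kernel_size, dilation):
    """Number of input positions spanned by one output of a dilated kernel."""
    return (kernel_size - 1) * dilation + 1

def dilated_conv1d(signal, kernel, dilation=1):
    """'Valid' 1-D dilated convolution (cross-correlation form, no padding)."""
    span = receptive_field(len(kernel), dilation)
    return [
        sum(w * signal[start + i * dilation] for i, w in enumerate(kernel))
        for start in range(len(signal) - span + 1)
    ]

signal = [1, 2, 3, 4, 5, 6, 7, 8, 9]
kernel = [1, 0, -1]                      # same three weights in both cases

standard = dilated_conv1d(signal, kernel, dilation=1)  # each output sees 3 inputs
dilated = dilated_conv1d(signal, kernel, dilation=3)   # each output sees 7 inputs
print(standard, dilated)
```

With the same three weights, the dilated kernel relates samples six positions apart rather than two, which is the mechanism by which parallel branches with different settings can aggregate local and long-range information.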
[0060] These results confirm that each module plays a distinct and essential role. The pseudo-high-fidelity module alleviates the limitations of sparse data, the multi-scale convolution extracts information-rich spatial patterns, and the fusion mechanism effectively combines cross-fidelity information. The consistent performance degradation observed after removing any single component not only highlights the necessity of each but, more importantly, demonstrates the synergistic advantage of the integrated MF-SCNN design.
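The fusion mechanism combines a linear head and a nonlinear head through a learnable weight α. The sketch below assumes a convex combination, α times the linear output plus (1 − α) times the nonlinear output, with a single tanh hidden layer; both the blending form and all weights are assumptions for illustration, since the exact fusion formula is not reproduced in this text.

```python
import math

def fuse(z, w_lin, w_hidden, w_out, alpha):
    """Blend a linear head and a one-hidden-layer tanh head with weight alpha."""
    linear_out = sum(w * zi for w, zi in zip(w_lin, z))
    hidden = [math.tanh(sum(w * zi for w, zi in zip(row, z))) for row in w_hidden]
    nonlinear_out = sum(w * h for w, h in zip(w_out, hidden))
    # Assumption: convex combination; alpha would be learned during training.
    return alpha * linear_out + (1.0 - alpha) * nonlinear_out

z = [1.0, 2.0]                       # hypothetical flattened feature vector
w_lin = [0.5, 0.25]                  # linear head weights
w_hidden = [[0.5, 0.0], [0.0, 0.5]]  # nonlinear head, hidden layer
w_out = [1.0, 1.0]                   # nonlinear head, output layer

prediction = fuse(z, w_lin, w_hidden, w_out, alpha=0.5)
print(prediction)
```

Setting `alpha` to 1 recovers the purely linear head and setting it to 0 the purely nonlinear head, so a learned value in between lets the model weigh simple global trends against nonlinear corrections.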
[0061] Table 14. Average model performance and computational cost.
To evaluate the robustness and practical efficiency of the multi-fidelity models under different levels of data availability, we examined their performance stability across six benchmarks as the amount of high-fidelity data increases. The performance evolution in Figure 9 reveals strikingly different behavioral patterns. The proposed MF-SCNN model and the MF-TLNN variant exhibit a pronounced data threshold effect: their prediction accuracy shows a significant nonlinear improvement only after a critical amount of high-fidelity data becomes available, suggesting that their architectural capacity requires a sufficient amount of high-quality data for effective training. In contrast, the MF-C-NN model converges more smoothly and consistently across the entire data range, implying greater robustness under data-constrained conditions. The HF-NN baseline shows rapidly diminishing returns, with its performance saturating after only a small data increment, which supports the fundamental motivation behind multi-fidelity learning.
[0062] Table 14 lists the overall computational costs, quantifying the resource consumption associated with these performance characteristics. The results confirm a consistent trade-off between accuracy and efficiency. The proposed MF-SCNN model achieves the highest average prediction accuracy (ln(NRMSE) = -2.61, i.e. an NRMSE of about 0.074), validating the effectiveness of its design. This performance advantage comes at the cost of greater computational overhead: approximately 65% more parameters than the lightweight HF-NN baseline and a training time 2.7 times longer. The MF-C-NN model offers a balanced compromise, providing significantly better accuracy than the baseline with a moderate increase in resource consumption.
[0063] In practice, model selection should align with the priorities of the specific application. For tasks where maximizing prediction accuracy is the primary objective, the proposed MF-SCNN framework is the preferred choice. For applications requiring a practical balance between performance and computational efficiency, the MF-C-NN model is recommended. The HF-NN baseline remains viable only where reduced accuracy is acceptable and resources are extremely limited. This analysis, based on comprehensive performance and cost metrics, establishes the effectiveness of the proposed framework and provides clear guidance for its deployment.
[0064] Plates with elliptical holes are a classic benchmark widely used in stress concentration analysis and provide an ideal scenario for evaluating multi-fidelity modeling methods. In this embodiment, we consider a quarter-plate model with an elliptical outer boundary (defined by ...) and a central elliptical hole (described by ...), on which two-dimensional (2D) and three-dimensional (3D) finite element simulations were performed; x and y denote in-plane spatial coordinates measured in meters. The material was assumed to be isotropic and linearly elastic, with a Young's modulus of ... and a Poisson's ratio of 0.3.
[0065] For the 2D finite element simulation, a plane stress assumption with a thickness of 0.1 m was adopted. Symmetric boundary conditions were applied at the edges lying on the coordinate planes x=0 and y=0. On these symmetry boundaries, the normal displacement was constrained to zero, while tangential displacement was left free. To simulate realistic loading conditions, a uniform tensile force was applied along the boundary of the elliptical hole. In contrast, the 3D simulation employed a full spatial representation with the thickness increased to 0.6 m, specifically designed to capture significant stress variations in the thickness direction. Consistent with the 2D model, symmetric boundary conditions were enforced on the planes of symmetry (x=0 and y=0), restricting normal displacement and allowing tangential motion. Furthermore, to reproduce realistic thickness-direction loading, a uniformly distributed normal pressure was applied to the top surface of the plate (upper Z-plane). To ensure numerical stability and prevent rigid body motion, displacement constraints in the thickness direction were applied only to the boundary regions, allowing natural deformation in the central region of the plate.
[0066] As shown in Figure 10(a), the 2D simulation reveals significant stress concentration in the mid-surface region near the hole. Conversely, Figure 10(b) shows that the 3D simulation not only captures in-plane effects but also reveals significant stress and displacement variations along the thickness direction, especially near the hole boundary. The introduction of Z-plane loading and spatial constraints leads to a more complex and realistic stress distribution, which underlines the limitations of purely 2D models and confirms the need to incorporate high-fidelity 3D data into multi-fidelity modeling benchmarks.
[0067] The model's input consists of the in-plane spatial coordinates (x, y), and the output comprises the von Mises equivalent stress and the total displacement at each location. Low-fidelity (LF) data are obtained from 2D simulations using a simplified physical model, while high-fidelity (HF) data are extracted from the top surface (Z-plane) of the 3D simulation with the same mesh size configuration. This setup ensures that the two fidelity levels share a consistent spatial resolution, enabling fair comparison and stable learning across datasets. All samples are generated independently and remain unpaired, reflecting the heterogeneous and unstructured nature of multi-fidelity data in real-world engineering scenarios. For consistency, the 3D top-surface results are used as the reference standard for all subsequent model training and evaluation.
[0068] Figure 11 shows the spatial distribution of the input samples for the 2D/3D plate case. A total of 750 low-fidelity samples (blue circles) and 200 high-fidelity samples (red triangles) are independently distributed throughout the domain. These samples are unaligned and randomly positioned, reflecting a typical unpaired, weakly correlated multi-fidelity scenario. This sampling structure poses a significant challenge to traditional surrogate modeling techniques that rely on data pairing or strong cross-fidelity relationships.
[0069] Figure 12 compares the predictions of the LF neural network with the true HF values. Panels (a) and (b) show the true von Mises stress and displacement fields obtained from the 3D HF simulation (top surface), while (c) and (d) show the predictions of the LF model. The corresponding relative error plots (e) and (f) reveal significant deviations. Notably, the LF model fails to capture local variations near the hole boundary, with stress errors exceeding 250% and displacement errors increasing sharply in high-gradient regions. These results highlight the limitations of relying on the LF model alone in heterogeneous domains.
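A pointwise relative error field of the kind plotted in panels (e) and (f) can be computed as sketched below. The definition used here, absolute deviation divided by the magnitude of the HF reference, is an assumption, as are the function name and the sample values.

```python
def relative_error_percent(reference, predicted, eps=1e-12):
    """Pointwise relative error in percent, measured against the reference."""
    # eps guards against division by zero where the reference vanishes.
    return [
        100.0 * abs(p - r) / max(abs(r), eps)
        for r, p in zip(reference, predicted)
    ]

hf_stress = [100.0, 200.0, 400.0]    # hypothetical HF reference values
lf_stress = [90.0, 260.0, 1500.0]    # hypothetical LF predictions
errors = relative_error_percent(hf_stress, lf_stress)
print(errors)
```

Values above 100% arise whenever the prediction deviates from the reference by more than the reference magnitude itself, which is how errors exceeding 250% can occur in regions where the LF model strongly misestimates the stress.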
[0070] Figure 13 presents the results obtained with the proposed MF-SCNN framework. Panels (c) and (d) show that MF-SCNN achieves excellent agreement with the HF reference values in (a) and (b), while panels (e) and (f) show a significant reduction in relative error. Stress and displacement errors are below 5% in most regions, demonstrating the ability of MF-SCNN to accurately reconstruct high-fidelity fields from unpaired multi-fidelity data.
[0071] To further evaluate the prediction accuracy, Figure 14 shows scatter plots comparing the predicted values of all four models with the actual high-fidelity values. The predictions of the MF-SCNN model (a) lie close to the diagonal, yielding an R² of 0.9941, which indicates excellent agreement with the reference data. In contrast, the HF-only model (b), MF-TLNN (c), and MF-C-NN (d) show greater dispersion and tend to underestimate the results, especially in regions with higher stress values.
[0072] Table 15. Comparison of quantitative performance of all models.
Table 15 summarizes the key performance metrics for all models. MF-SCNN achieves the lowest RMSE and MAE and the highest R², outperforming all baseline models. Although its training time is slightly longer than that of the standard HF network, the improved prediction accuracy under data scarcity and misalignment validates the effectiveness of the proposed architecture.
[0073] The results demonstrate that MF-SCNN successfully bridges the fidelity gap under spatially unstructured and unpaired conditions. By integrating low-fidelity learning, pseudo-HF generation, and deep feature encoding, MF-SCNN provides robust and high-precision surrogate modeling performance for solving complex multi-fidelity problems.
[0074] This embodiment focuses on a truss bridge with a 40-meter span, a 7-meter bridge deck width, and a 5-meter main truss spacing, conforming to standard design specifications. This bridge is a typical example of the large-span load-bearing structures commonly used in civil and mechanical engineering, which are widely adopted for their high efficiency and stability. All structural components are modeled as steel with conventional elastic properties. The left support is fully fixed in all translational directions, while the right support restricts vertical and lateral displacements but allows longitudinal movement. The loading conditions include a uniformly distributed gravity load (bridge self-weight) and a concentrated vertical force of 500 kN at the mid-span of the bridge deck. The multi-fidelity dataset was generated from finite element analyses with different mesh densities: the low-fidelity (LF) model was discretized using a coarse mesh, while the high-fidelity (HF) model used a refined mesh. The difference in mesh granularity leads to heterogeneity in the multi-fidelity data. The correspondence between LF and HF samples is imperfect and weakly correlated, reflecting the unpaired nature of practical multi-fidelity data. The von Mises stress and displacement contour maps of the bridge deck and truss members in Figure 16 highlight the differences in the stress and displacement distribution patterns captured by the LF and HF models, emphasizing the importance of accurate, high-fidelity data. In the supervised learning task, the input variables are the thickness (0.5 to 1.5 m) and the Young's modulus (1.6×10¹¹ to 2.2×10¹¹), and the outputs are the maximum stress and displacement. A total of 125 independently distributed, unpaired LF samples and 27 HF samples were collected, giving an LF/HF ratio of roughly 4.6:1, which closely reflects the uncertainty and heterogeneity of real-world engineering multi-fidelity datasets.
[0075] Table 16. Quantitative performance metrics for all models.
To ensure fairness, this embodiment compares the proposed MF-SCNN framework with three mainstream baseline models under optimized hyperparameter settings. Figure 17 gives a point-by-point comparison of the actual and predicted values of the maximum stress and displacement on the test set, showing that the MF-SCNN predictions are highly consistent with the actual values for both responses. The box plots in Figure 18 show that the absolute residuals produced by MF-SCNN are significantly lower and more consistent than those of the other baseline models. Table 16 summarizes the quantitative results. MF-SCNN achieves the lowest RMSE (0.0201), NRMSE (0.0408), and MAE (0.0826), and the highest R² (0.9865), significantly outperforming the baseline methods. Although the computation time is slightly higher, the substantial improvement in prediction accuracy and robustness compensates for this cost, which is of great significance for safety-critical or resource-constrained engineering applications. These results collectively demonstrate that MF-SCNN handles unpaired, weakly correlated multi-fidelity data well and generalizes strongly. Its ability to fuse information from heterogeneous simulation sources enables highly reliable structural response prediction, laying the foundation for extending multi-fidelity modeling to more complex structures.
This embodiment proposes MF-SCNN, a novel stepwise convolutional neural network for unpaired multi-fidelity data fusion that avoids the requirements of existing methods for sample alignment and strong cross-fidelity correlation. Comprehensive experiments demonstrate that it is more accurate and robust in scenarios with misaligned and weakly correlated data.
[0076] This embodiment has been extensively evaluated on nine benchmark functions and two engineering case studies, validating the superiority of the proposed method. Relative to traditional pure high-fidelity models, MF-SCNN reduces the root mean square error (RMSE) by an average of 67%; relative to the state-of-the-art multi-fidelity methods MF-TLNN and MF-C-NN, it reduces the RMSE by 31% and 28%, respectively. This highlights the model's superior accuracy and robustness in real-world scenarios characterized by unpaired, misaligned, and weakly correlated data. The ablation experiments demonstrate that pseudo-high-fidelity (Pseudo-HF) augmentation and the multi-scale feature extractor are indispensable contributors to this performance improvement.
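As a worked example of how such a relative reduction is obtained, the sketch below applies the usual percentage-reduction formula to the benchmark 3 RMSE values quoted earlier in this description. The function name is illustrative. Because these figures cover a single benchmark, they are not expected to match the averages reported over all test cases.

```python
def rmse_reduction_percent(baseline_rmse, model_rmse):
    """Relative RMSE reduction of a model with respect to a baseline, in percent."""
    return 100.0 * (baseline_rmse - model_rmse) / baseline_rmse

mf_scnn_rmse = 0.3572   # MF-SCNN on benchmark 3
baseline_rmse = {"HF-NN": 0.8583, "MF-TLNN": 0.5736, "MF-C-NN": 0.5281}

reductions = {
    name: rmse_reduction_percent(rmse, mf_scnn_rmse)
    for name, rmse in baseline_rmse.items()
}
for name, value in reductions.items():
    print(f"{name}: {value:.1f}% lower RMSE with MF-SCNN")
```

On this single benchmark the reductions come out at roughly 58%, 38%, and 32% with respect to HF-NN, MF-TLNN, and MF-C-NN.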
[0077] Although the current work lays a solid foundation for static regression tasks, several promising directions for future research remain: first, extending the MF-SCNN framework to dynamic or time-dependent problems; second, incorporating physics-informed constraints to enhance generalization; and third, optimizing the architecture for large-scale industrial applications. Extending the framework to problems with more than two fidelity levels is also promising. Although this embodiment focuses on dual-fidelity scenarios, the core "point-to-domain" encoding concept is naturally suited to this extension. For a problem with N fidelity levels (LF_1, LF_2, ..., LF_{N-1}, HF), N-1 intermediate surrogate models can be trained first; N-1 pseudo-outputs can then be generated for each HF input and concatenated with all available data to form a comprehensive feature panel. Studying the performance and computational trade-offs of this multi-level fusion approach is expected to open up new applications in more complex engineering and scientific fields.
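The multi-level extension outlined above can be sketched as follows for N = 3 (two lower-fidelity levels plus HF). This is an illustrative simplification: a least-squares line (`fit_linear`, a hypothetical helper) stands in for each intermediate surrogate network, the data are made up, and only the pseudo-outputs are concatenated, whereas the described panel would also include the remaining available data.

```python
def fit_linear(xs, ys):
    """Least-squares line y = a*x + b; a stand-in for one surrogate network."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    b = my - a * mx
    return lambda x: a * x + b

def build_multilevel_features(lf_levels, hf_x):
    """Train one surrogate per lower-fidelity level and stack their outputs.

    lf_levels: list of (inputs, responses) pairs, one per level LF_1..LF_{N-1}.
    Returns, for each HF input, the row [x, pseudo_1(x), ..., pseudo_{N-1}(x)].
    """
    surrogates = [fit_linear(xs, ys) for xs, ys in lf_levels]
    return [[x] + [s(x) for s in surrogates] for x in hf_x]

lf_levels = [
    ([0.0, 0.5, 1.0], [0.0, 0.5, 1.0]),            # LF_1: y = x (hypothetical)
    ([0.0, 0.4, 0.8, 1.0], [1.0, 1.8, 2.6, 3.0]),  # LF_2: y = 2x + 1 (hypothetical)
]
hf_x = [0.2, 0.7]                                  # unpaired HF input locations

features = build_multilevel_features(lf_levels, hf_x)
print(features)
```

Each additional fidelity level adds one surrogate to train and one column to the feature row, which is the source of the computational trade-off mentioned above.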
Claims
1. A method for unpaired multi-fidelity data fusion based on structured panel coding, characterized in that it comprises the following steps:
training a baseline neural network on low-fidelity (LF) data to capture global trends;
generating pseudo-high-fidelity (Pseudo-HF) outputs at the high-fidelity (HF) input locations using the trained LF model, thereby constructing a pairing-information-augmented HF dataset;
transforming each HF sample into a structured, image-like feature panel that integrates the HF input, the true response value, the corresponding Pseudo-HF output, and the complete LF dataset;
performing hierarchical feature extraction on the feature panels using a multi-scale dilated convolution module to capture local interactions and long-range cross-fidelity dependencies; and
outputting high-precision prediction results by fusing the learned feature representations through a parallel multilayer perceptron (MLP).
2. The method for unpaired multi-fidelity data fusion based on structured panel coding according to claim 1, characterized in that the multi-scale dilated convolution module includes four parallel one-dimensional dilated convolution branches, each with a different kernel size, to expand the effective receptive field and aggregate multi-scale spatiotemporal information.
3. The method for unpaired multi-fidelity data fusion based on structured panel coding according to claim 2, characterized in that the outputs of the multi-scale dilated convolution module are made dimensionally consistent through temporal alignment and channel concatenation operations; the sequence dimension is then restored through a two-layer MLP, followed by a pooling operation; and the output is finally reshaped into a one-dimensional vector for prediction.
4. The method for unpaired multi-fidelity data fusion based on structured panel coding according to claim 1, characterized in that constructing the feature panel comprises: concatenating the high-fidelity response corresponding to each HF input and the synthesized Pseudo-HF output with the complete LF dataset along the channel axis; and forming a unified high-dimensional tensor that preserves spatial distribution information and fidelity-specific information.
5. The method for unpaired multi-fidelity data fusion based on structured panel coding according to claim 1, characterized in that the parallel multilayer perceptron (MLP) includes a linear part (MLP_L) and a nonlinear part (MLP_NL), the outputs of which are fused using a learnable weight parameter α to obtain the final prediction result, where Z is the feature map produced by the multi-scale convolution and α is the learnable weight parameter.
6. The method for unpaired multi-fidelity data fusion based on structured panel coding according to claim 1, characterized in that the training process adopts a two-stage procedure: a first stage, in which the LF loss function is minimized using the LF data to optimize the LF network parameters; and a second stage, in which the entire network is fine-tuned by minimizing the HF loss function using the mini-batch-based Adam optimizer.
7. The method for unpaired multi-fidelity data fusion based on structured panel coding according to claim 6, characterized in that both the LF loss function and the HF loss function include a regularization term to prevent overfitting, where N_LF and N_HF denote the numbers of LF and HF samples, respectively, λ_LF and λ_HF are the regularization coefficients, and θ_LF and θ_HF are the model parameters.
8. The method for unpaired multi-fidelity data fusion based on structured panel coding according to claim 1, characterized in that the method further includes a pseudo-high-fidelity data generation step, specifically: mapping the HF inputs to the LF domain using the pre-trained LF model to generate pseudo-high-fidelity (Pseudo-HF) outputs, where the LF model uses its optimized parameters, which are held fixed.
9. The method for unpaired multi-fidelity data fusion based on structured panel coding according to claim 1, characterized in that the method further includes a feature aggregation step, specifically: concatenating the true response value, the Pseudo-HF output, and the complete LF dataset corresponding to each HF input along the feature dimension to form a feature panel, where D_x and D_y denote the input and output dimensions, respectively, and Concat denotes concatenation along the feature dimension.
10. The method for unpaired multi-fidelity data fusion based on structured panel coding according to any one of claims 1 to 9, characterized in that the method is applied to multi-fidelity data fusion tasks in the fields of finite element analysis, structural health monitoring, or materials science, and the evaluation metrics include root mean square error (RMSE), normalized root mean square error (NRMSE), coefficient of determination (R²), and mean absolute error (MAE).