A method for predicting loss thickness of rare earth steel based on machine learning

By establishing a predictive model for the loss thickness of rare earth steel using machine learning methods, the problem of correlating laboratory data with real atmospheric corrosion data was solved, enabling high-precision prediction and rapid composition design of rare earth steel, thus improving R&D efficiency and cost-effectiveness.

CN122201489A · Pending · Publication Date: 2026-06-12 · SHANGHAI UNIV


Patent Information

Authority / Receiving Office: CN · China
Patent Type: Applications(China)
Current Assignee / Owner: SHANGHAI UNIV
Filing Date: 2026-03-10
Publication Date: 2026-06-12

IPC: G16C20/20; G16C20/30; G16C20/70; G06F18/2135; G06N20/00

AI Tagging

Application Domain

Chemical property prediction; Molecular entity identification

AI Technical Summary

Technical Problem

Existing technologies lack a systematic approach that can simultaneously achieve high-precision prediction of rare earth steel loss thickness and reliable laboratory-to-field data conversion, making it difficult to achieve rapid and accurate assessment of the corrosion resistance of rare earth steel.

Method used

A phased or end-to-end prediction method based on machine learning is adopted. Through data acquisition, preprocessing, feature engineering and model building, the correlation between laboratory and real atmospheric corrosion data is established. Machine learning algorithms such as K-nearest neighbor regression model are used to achieve accurate prediction of rare earth steel loss thickness.

Benefits of technology

It achieves high-precision prediction of rare earth steel loss thickness. The high prediction accuracy can quickly guide composition design, significantly improve R&D efficiency and reduce costs.

✦ Generated by Eureka AI based on patent content.

Abstract

The application discloses a rare earth steel loss thickness prediction method based on machine learning, and belongs to the technical field of metal material corrosion and protection. The method includes two parallel technical schemes: the first scheme adopts a staged modeling strategy, a first model is first constructed to predict the laboratory accelerated corrosion loss thickness, a second model is then constructed to establish the mapping relationship between the laboratory and the real atmosphere corrosion, and finally the two models are jointly optimized; the second scheme adopts an end-to-end integrated modeling strategy, and a single model is used to directly realize the prediction from the input features to the real atmosphere loss thickness. In the two schemes, the rare earth element content is taken as the core feature, key parameters are selected through feature engineering, and real atmosphere data are used for model training and optimization. The application can accurately quantify the influence of the rare earth content on corrosion resistance, solve the accurate mapping problem between the laboratory data and the atmosphere data, and significantly improve the prediction accuracy.

Description

Technical Field

[0001] This invention relates to the field of metal material corrosion and protection technology, specifically to a machine learning-based method for predicting the loss thickness of rare earth steel.

Background Technology

[0002] Rare earth elements in steel play a role in purifying molten steel, removing impurities, and refining grains, effectively improving the mechanical properties and corrosion resistance of steel. However, there is an optimal range for the amount of rare earth elements added to achieve their effect; too much or too little may fail to achieve the desired corrosion resistance. Therefore, accurately assessing the corrosion loss thickness of rare earth steels with different rare earth contents is crucial for optimizing their composition design and improving service life.

[0003] Traditional corrosion assessment methods primarily rely on accelerated corrosion tests in laboratories and long-term exposure to natural atmospheres. While accelerated corrosion tests can yield results quickly, the corrosive environment differs significantly from the real atmospheric environment, making it difficult for the results to directly and accurately reflect corrosion behavior under actual service conditions. Real atmospheric exposure tests, while reliable, are time-consuming (typically requiring years or even decades) and costly, failing to meet the urgent needs of rapid material development and selection in modern applications. In recent years, machine learning technology has shown great potential in material performance prediction. However, applying it to rare earth steel corrosion prediction faces the core challenge of effectively integrating the key factor of "rare earth content" and accurately correlating and mapping laboratory accelerated corrosion data with real atmospheric corrosion data.

[0004] Current technologies lack a systematic approach capable of simultaneously achieving high-precision loss thickness prediction and reliable laboratory-to-field data conversion, which has become a technical bottleneck restricting the rapid and accurate assessment of the corrosion resistance of rare earth steels. Therefore, developing a new method to solve these problems is of significant practical importance.

Summary of the Invention

[0005] The purpose of this invention is to overcome the shortcomings of existing technologies by providing a machine learning-based method that quickly and accurately predicts the loss thickness of rare earth steel and establishes a correlation between laboratory and real atmospheric corrosion data.

[0006] The present invention aims to overcome the shortcomings of the prior art and provide a solution.

[0007] To achieve the above objectives, the technical solution adopted by the present invention is as follows:

Scheme 1: This invention provides a machine learning-based method for predicting the loss thickness of rare earth steel. This method decomposes the complex corrosion prediction problem into two relatively independent stages and builds the model step by step, thus improving the interpretability and stability of the model. The method includes the following steps:

S1. Data Acquisition and Preprocessing: Collect chemical composition data of rare earth steel samples with different rare earth contents, laboratory corrosion datasets obtained through accelerated corrosion tests, and real atmospheric corrosion datasets obtained through real atmospheric exposure tests. Clean and standardize all datasets.

S2. Feature Engineering: Determine the model input features from the preprocessed data. The input features include at least the content of rare earth elements. Use feature selection methods to screen out key features that have a significant impact on the loss thickness.

S3. First Model Construction and Training: Construct a first machine learning model, using the selected key features as input and the loss thickness in the laboratory corrosion dataset as output, and train the model.

S4. Second Model Construction and Training: Construct a second machine learning model, using the predicted value of laboratory accelerated corrosion loss thickness output by the first machine learning model as input and the loss thickness in the real atmospheric corrosion dataset as output, and train the model.

S5. Joint Model Optimization: The trained first and second machine learning models are jointly optimized using a portion of the data in the real atmospheric corrosion dataset to minimize the prediction error of the real atmospheric loss thickness.

S6. Corrosion Prediction and Application: Input the key features of the rare earth steel to be predicted into the optimized first machine learning model to obtain the predicted value of accelerated corrosion loss thickness in the laboratory. Then input this predicted value into the optimized second machine learning model to obtain the final predicted value of loss thickness under real atmospheric conditions.

[0008] Scheme 2: This invention provides a machine learning-based method for predicting the loss thickness of rare earth steel. This method employs an end-to-end prediction approach, achieving direct mapping through a single model, resulting in a compact process and high prediction efficiency. The method includes the following steps:

S1. Data Acquisition and Preprocessing: Collect chemical composition data of rare earth steel samples with different rare earth contents, laboratory corrosion datasets obtained through accelerated corrosion tests, and real atmospheric corrosion datasets obtained through real atmospheric exposure tests. Clean and standardize all datasets.

S2. Feature Engineering: Determine the model input features from the preprocessed data. The input features include at least the content of rare earth elements, other chemical components of steel, and laboratory accelerated corrosion test parameters. Screen out the key features.

S3. Integrated Model Construction and Training: Construct an integrated machine learning model, using the selected key features as input and the loss thickness in the real atmospheric corrosion dataset as the direct output, and train the model.

S4. Model Optimization: Optimize the trained integrated machine learning model using the real atmospheric corrosion dataset.

S5. Corrosion Prediction and Application: Input the key features of the rare earth steel to be predicted into the optimized integrated machine learning model, and directly output its predicted loss thickness under real atmospheric conditions.

[0009] As a further optimization of the technologies in Scheme 1 and Scheme 2, the present invention also includes the following preferred embodiments: In one specific implementation scheme, in S1, the laboratory accelerated corrosion test includes at least one of the following: neutral salt spray test, cyclic wet-dry corrosion test, and immersion corrosion test.

[0010] In one specific implementation scheme, in S2, the feature selection method includes at least one of correlation analysis, principal component analysis, and recursive feature elimination.

[0011] In one specific implementation scheme, the machine learning model is a model that uses loss thickness as the prediction target, including one or more of the following: linear regression model, support vector machine regression model, K-nearest neighbor regression model, decision tree model, random forest model, gradient boosting model, and artificial neural network model.

[0012] In one specific implementation, the method further includes: obtaining predicted loss thickness values for different rare earth additions by changing the rare earth addition content parameters of the input model, and determining the optimal content range of rare earth addition based on the predicted values.
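The content scan described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: `predict_loss` stands for any trained model wrapped as a callable, and the function name, candidate grid, and limit are hypothetical.

```python
def acceptable_re_contents(predict_loss, candidates, limit_um):
    """Return the candidate rare earth contents whose predicted loss
    thickness does not exceed limit_um, in ascending order.

    predict_loss: hypothetical callable mapping an RE content (mass %)
                  to a predicted loss thickness in micrometres.
    """
    return sorted(c for c in candidates if predict_loss(c) <= limit_um)


def content_range(passing):
    """Summarise passing contents as (min, max), or None if none pass."""
    return (passing[0], passing[-1]) if passing else None
```

The returned range is only meaningful if the passing contents are contiguous on the candidate grid, so the full list should be inspected before a range is reported.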

[0013] In one specific implementation scheme, the method further includes: establishing an equivalent conversion relationship between laboratory accelerated corrosion time and real atmospheric exposure time; the equivalent conversion relationship is obtained by constructing corrosion kinetic models in laboratory environment and real atmospheric environment respectively, and linking the two based on the principle of equal loss thickness.
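The equal-loss-thickness conversion can be sketched as below. The source does not state the form of the corrosion kinetic models; this sketch assumes a power law D = a·t^n for both environments, fitted by least squares in log-log space. The function names are hypothetical.

```python
import math


def fit_power_law(times, thicknesses):
    """Fit D = a * t**n by linear least squares on (ln t, ln D).

    Returns (a, n). The power-law form is an assumption of this sketch.
    """
    xs = [math.log(t) for t in times]
    ys = [math.log(d) for d in thicknesses]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    n = sxy / sxx
    a = math.exp(y_mean - n * x_mean)
    return a, n


def equivalent_exposure_time(t_lab, lab_model, atm_model):
    """Atmospheric exposure time that yields the same loss thickness
    as t_lab of laboratory testing (equal-loss-thickness principle)."""
    a_lab, n_lab = lab_model
    a_atm, n_atm = atm_model
    d = a_lab * t_lab ** n_lab            # thickness reached in the lab
    return (d / a_atm) ** (1.0 / n_atm)   # solve a_atm * t**n_atm = d
```

The result is expressed in whatever time unit the atmospheric model was fitted in (for example years), while `t_lab` uses the laboratory model's unit (for example hours).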

[0014] In one specific implementation scheme, when the real atmospheric corrosion data is insufficient to construct a complete corrosion kinetic model, a data extrapolation method is used to predict the corrosion loss thickness at the target time point, thereby improving the real atmospheric corrosion kinetic model. The predicted values are then calibrated based on the real atmospheric corrosion dataset to correct prediction errors.

[0015] In one specific implementation, the rare earth steel is rare earth microalloyed low alloy steel, rare earth stainless steel, or rare earth heat-resistant steel.

[0016] In Scheme 1, the joint optimization of the models in S5 can further be an iterative process, in which new real atmospheric exposure data are continuously fed back to update and optimize the parameters of the first machine learning model and the second machine learning model.

[0017] The present invention provides a machine learning-based method for predicting the loss thickness of rare earth steel, which has the following significant advantages compared with existing technologies: High prediction accuracy: This invention fully utilizes the powerful nonlinear fitting capabilities of machine learning algorithms to accurately characterize the complex relationship between rare earth content and corrosion loss thickness. Examples show that the prediction model based on the K-nearest neighbor regression model achieves a determination coefficient (R²) of over 0.90 for laboratory data, and the relative error in predicting annual corrosion loss thickness in the real atmospheric environment is as low as 1.09% and 2.78%, respectively, far superior to traditional empirical formulas or simple linear models.

[0018] Successfully establishing a data bridge: This invention jointly optimizes the model by introducing real atmospheric data and creatively establishes an equivalent conversion relationship between laboratory and real atmospheric corrosion time based on the principle of equal loss thickness. Supplemented by grey prediction and extrapolation, it effectively solves the problem of correlating laboratory accelerated test results with corrosion behavior in real service environments. For example, the embodiments clearly conclude that a 480-hour laboratory immersion test is equivalent to 10 years of exposure to a certain industrial atmosphere, providing a scientific basis for the formulation of accelerated testing standards.

[0019] Highly practical and significantly improves R&D efficiency: This method can quickly predict corrosion resistance based on changes in rare earth content, enabling rapid component screening. Application cases show that companies can directly guide component design based on model prediction results (such as meeting the requirement of an annual corrosion rate ≤5 μm when the rare earth content is in the range of 0.009%~0.017%), transforming lengthy natural exposure tests into short-term laboratory verification, greatly shortening the R&D cycle and reducing costs.

Attached Figure Description

[0020] Figure 1 is a flowchart of the method described in Embodiment 1 of the present invention; Figure 2 is a flowchart of the method described in Embodiment 2 of the present invention.

Detailed Implementation

[0021] To make the objectives, technical solutions, and advantages of this invention clearer, the invention will be further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative and not intended to limit the invention. Furthermore, the technical features involved in the various embodiments of this invention described below can be combined with each other as long as they do not conflict with each other.

[0022] Embodiment 1: Prediction Method Based on Staged Machine Learning Model. This embodiment provides a machine learning-based method for predicting the loss thickness of rare earth steel, including the following steps:

S1. Data Acquisition and Preprocessing: Collect chemical composition data of rare earth steel samples with different rare earth contents, laboratory corrosion datasets obtained through accelerated corrosion tests, and real atmospheric corrosion datasets obtained through real atmospheric exposure tests. Clean and standardize all datasets.

S2. Feature Engineering: Determine the model input features from the preprocessed data. The input features include at least the content of rare earth elements. Use feature selection methods to screen out key features that have a significant impact on the loss thickness.

S3. First Model Construction and Training: Construct a first machine learning model, using the selected key features as input and the loss thickness in the laboratory corrosion dataset as output, and train the model.

S4. Second Model Construction and Training: Construct a second machine learning model, using the predicted value of laboratory accelerated corrosion loss thickness output by the first machine learning model as input and the loss thickness in the real atmospheric corrosion dataset as output, and train the model.

S5. Joint Model Optimization: The trained first and second machine learning models are jointly optimized using a portion of the data in the real atmospheric corrosion dataset to minimize the prediction error of the real atmospheric loss thickness.

S6. Corrosion Prediction and Application: Input the key features of the rare earth steel to be predicted into the optimized first machine learning model to obtain the predicted value of accelerated corrosion loss thickness in the laboratory. Then input this predicted value into the optimized second machine learning model to obtain the final predicted value of loss thickness under real atmospheric conditions.

[0023] The above process is explained in detail below.

[0024] 1. Data Acquisition and Preprocessing: Chemical composition data of rare earth steel samples with different rare earth contents, laboratory corrosion datasets obtained through accelerated corrosion tests, and real atmospheric corrosion datasets obtained through real atmospheric exposure tests were collected. All datasets were cleaned and standardized. The accelerated corrosion tests included at least one of the following: neutral salt spray test, cyclic wet-dry corrosion test, and immersion corrosion test.

[0025] This process aims to systematically acquire the basic data required to build machine learning models and to perform rigorous preprocessing to ensure data quality, thus laying the foundation for the subsequent establishment of high-precision prediction models.

[0026] 1.1 Sample Preparation and Data Sources: First, a series of steel samples with different rare earth contents were prepared. The method is applicable to various rare earth alloyed steels, including rare earth microalloyed low-alloy steels, rare earth stainless steels, and rare earth heat-resistant steels. This embodiment takes the most representative rare earth microalloyed low-alloy steel (such as Q355 steel) as an example, selecting four groups of lanthanum-cerium (La-Ce) mixed rare earth microalloyed Q355 low-alloy steel samples with rare earth contents (mass fraction) of 0% (labeled RE0), 0.009% (RE90), 0.017% (RE170), and 0.028% (RE280), respectively. The main chemical composition range of the base steel is: C: 0.12%~0.20%, Si: 0.12%~0.30%, Mn: 0.30%~0.60%, P≤0.045%, S≤0.050%. Two parallel specimens measuring 60 mm × 40 mm × 3 mm were prepared for each sample group. All specimens underwent uniform surface treatment (including polishing, degreasing, and rust removal) in preparation for subsequent tests.

[0027] 1.2 Laboratory Accelerated Corrosion Test: The above samples underwent accelerated corrosion testing in the laboratory. This embodiment preferably uses a cyclic immersion wet-dry corrosion test (referencing GB/T 19746-2005) to simulate the industrial atmospheric environment. Specific test parameters are as follows: the corrosion solution was a mixed solution of 0.01% […] and 0.001% NaCl, with the ambient temperature controlled at (45±2)°C. Each cycle consisted of 12 minutes of immersion and 48 minutes of drying. Samples were taken at five time points: 120 hours, 240 hours, 480 hours, 720 hours, and 960 hours.

[0028] The corrosion loss thickness D (μm) is calculated using the weight loss method. The calculation formula is as follows:

$$D = \frac{(m_0 - m_1) \times 10^4}{\rho A}$$

where $m_0$ is the initial mass of the test piece (g), $m_1$ is the mass after corrosion (g), $\rho$ is the density of the steel (7.85 g/cm³), and $A$ is the surface area of the specimen (cm²). The factor $10^4$ converts centimetres to micrometres. The loss thickness at each time point is taken as the arithmetic mean of two parallel specimens, thus forming the laboratory corrosion dataset.
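The weight-loss calculation can be sketched as follows. This is a minimal illustration that assumes the usual conversion from mass loss to thickness (mass loss divided by density and exposed area, then centimetres to micrometres) and assumes that all six faces of the specimen are exposed; the function names are hypothetical.

```python
import statistics


def specimen_area_cm2(length_mm, width_mm, thickness_mm):
    """Total surface area of a rectangular specimen in cm^2
    (assumes all six faces are exposed)."""
    l, w, h = (x / 10.0 for x in (length_mm, width_mm, thickness_mm))
    return 2.0 * (l * w + l * h + w * h)


def loss_thickness_um(m0_g, m1_g, area_cm2, density_g_cm3=7.85):
    """Corrosion loss thickness in micrometres from the weight loss."""
    if area_cm2 <= 0 or density_g_cm3 <= 0:
        raise ValueError("area and density must be positive")
    thickness_cm = (m0_g - m1_g) / (density_g_cm3 * area_cm2)
    return thickness_cm * 1e4  # 1 cm = 10^4 micrometres


def mean_of_parallel(thicknesses_um):
    """Arithmetic mean over parallel specimens at one time point."""
    return statistics.fmean(thicknesses_um)
```

For the 60 mm × 40 mm × 3 mm specimens of this embodiment, `specimen_area_cm2(60, 40, 3)` gives 54 cm² under the all-faces assumption.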

[0029] 1.3 Real Atmospheric Exposure Test: Identical rare earth steel samples were placed in a typical industrial atmospheric environment (in this case, an industrial atmospheric environment with an SO₂ concentration of 0.05–0.15 mg/m³ and a Cl⁻ deposition rate of 1.5–3.0 g/m²·d) for long-term natural exposure tests. Samples were collected annually, and the loss thickness was measured using the weight loss method. Key environmental parameters during the exposure period (such as temperature, humidity, rainfall, and pollutant concentration) were recorded simultaneously. These data constituted a real atmospheric corrosion dataset, which was used for subsequent model validation and optimization.

[0030] 1.4 Data Preprocessing Flow: Due to the long experimental period and complex environment, the original dataset may contain noise, outliers, or missing records. Therefore, rigorous data preprocessing is required before modeling.

Data cleaning and processing: Rare earth content (RE) and laboratory accelerated corrosion test time (t) were used as model input features, and laboratory corrosion loss thickness was used as the prediction label. Obviously invalid data were removed, and the average of the test results of parallel samples under the same conditions was taken to form a standardized laboratory average dataset.

[0031] Outlier and missing value handling: Box plots were used to identify and remove outlier observations. For potentially missing data, imputation methods such as mean imputation, median imputation, or K-nearest neighbor (KNN) interpolation were used to complete the data based on the data distribution characteristics of the features.
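A minimal sketch of the box-plot rule and median imputation is given below. It assumes the conventional 1.5 × IQR whisker length, which the source does not specify, and the function names are hypothetical.

```python
import statistics


def iqr_bounds(values, k=1.5):
    """Box-plot whisker limits: [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def remove_outliers(values, k=1.5):
    """Drop observations that fall outside the box-plot whiskers."""
    lo, hi = iqr_bounds(values, k)
    return [v for v in values if lo <= v <= hi]


def impute_median(values):
    """Replace missing entries (None) with the median of the observed ones."""
    observed = [v for v in values if v is not None]
    fill = statistics.median(observed)
    return [fill if v is None else v for v in values]
```

Median imputation is shown because it is less sensitive to skewed measurements than the mean; mean or KNN imputation would slot into the same place.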

[0032] Data standardization: To eliminate the impact of differences in feature dimensions on model training, the cleaned continuous numerical features are scaled. This embodiment uses the Z-score standardization method to transform all feature data into a distribution with a mean of 0 and a standard deviation of 1, thereby improving the stability and convergence speed of model training.
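Z-score standardization can be sketched as follows. The split into a fit step and an apply step is a design choice of this sketch, so that statistics computed on the training data can be reused for validation data; the function names are hypothetical.

```python
import statistics


def zscore_fit(column):
    """Return (mean, population standard deviation) of a feature column."""
    mean = statistics.fmean(column)
    std = statistics.pstdev(column)
    if std == 0:
        raise ValueError("constant feature cannot be standardized")
    return mean, std


def zscore_apply(column, mean, std):
    """Scale values to zero mean and unit standard deviation."""
    return [(x - mean) / std for x in column]
```

Applied to the four rare earth contents of this embodiment, `zscore_apply(re, *zscore_fit(re))` with `re = [0.0, 0.009, 0.017, 0.028]` yields a column with mean 0 and standard deviation 1.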

[0033] Data partitioning: The preprocessed laboratory dataset was grouped based on rare earth content, and appropriately divided into a training set for training the model and a validation set for adjusting model parameters and preventing overfitting. Real atmospheric corrosion data was not used for model training at this stage; it was only used for the final model's correlation validation and parameter correction.

[0034] Through standardized experimental design and systematic data preprocessing, a high-quality, unbiased, and standardized dataset was obtained, which effectively improved the reliability and consistency of the data and provided a solid guarantee for the subsequent construction of high-precision, highly generalizable machine learning prediction models.

[0035] 2. Feature Engineering: Determine the model input features from the preprocessed data. The input features include at least the content of rare earth elements. Use feature selection methods to screen out key features that have a significant impact on the loss thickness.

[0036] Feature engineering is a key step in building high-performance prediction models. It aims to scientifically determine the input features of the model from preprocessed data and select the key feature subset that has the greatest predictive power for loss thickness in order to optimize model performance.

[0037] 2.1 Determination of Input Features: The core of this invention lies in establishing a mapping relationship between the rare earth element content and the corrosion loss thickness. Therefore, the rare earth element content (RE%) is the most crucial input feature of the model. Furthermore, to comprehensively and accurately characterize corrosion behavior, other relevant factors should be considered as candidate input features. These factors mainly include:

Other chemical components of steel: the contents of elements such as carbon (C), silicon (Si), manganese (Mn), phosphorus (P), sulfur (S), chromium (Cr), nickel (Ni), and copper (Cu), which have a significant impact on the microstructure and electrochemical behavior of steel.

[0038] Laboratory accelerated corrosion test parameters: For models that predict laboratory corrosion behavior, the test time t is a crucial feature. Other candidate parameters include the concentration of the corrosive medium, temperature, pH value, etc.

[0039] Correlation parameters: For models that establish a mapping relationship between laboratory and real atmospheric corrosion, the predicted value of laboratory loss thickness can be used as a key input, while environmental parameters of the real atmosphere (such as temperature, humidity, and pollutant concentration) can also be used as auxiliary inputs.

[0040] 2.2 Feature Selection Methods: To select the most significant and effective features from the numerous candidate features, thereby reducing model complexity, preventing overfitting, and improving model interpretability, feature selection methods are required. This invention can employ one or more of the following methods:

Correlation analysis: By calculating the Pearson correlation coefficient (applicable to linear relationships) or Spearman rank correlation coefficient (applicable to monotonic nonlinear relationships) between each candidate feature and the target variable of loss thickness, the strength of the correlation between the feature and the target is quantitatively assessed, and features with high importance are initially screened out.
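The two correlation coefficients can be sketched in plain Python as follows. Spearman's coefficient is computed as the Pearson coefficient of the average ranks, which also handles ties; the function names are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))
```

A feature that is monotonically but not linearly related to the loss thickness scores 1 under Spearman while scoring below 1 under Pearson, which is why the two are used together for screening.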

[0041] Recursive Feature Elimination (RFE): This method combines a base model (such as a decision tree or linear model) for iterative training. In each iteration, the least important features are eliminated, and the optimal feature subset is determined based on model performance (such as cross-validation scores). This method effectively evaluates the contribution of feature combinations.

[0042] Principal Component Analysis (PCA): PCA can be used for feature extraction when multicollinearity exists among features. PCA linearly combines the original features into a set of new, uncorrelated features (principal components), and uses a few principal components that contain the majority of the information as new input features, thereby achieving dimensionality reduction.

[0043] 2.3 Determination of Key Features: By combining the above methods, the set of key features that have the most significant impact on the loss thickness of rare earth steel can be screened. In this embodiment, correlation analysis may reveal that the rare earth content (RE%) and test time t have the most significant correlation with the loss thickness, thus identifying them as the core key features.

[0044] Through systematic feature engineering, key features are accurately identified from numerous influencing factors. This not only simplifies the model structure and improves computational efficiency and generalization ability, but also deepens the understanding of the mechanism of rare earth influence on corrosion behavior, laying a solid foundation for building a high-precision and highly robust prediction model.

[0045] 3. First Model Construction and Training: Construct a first machine learning model, using the selected key features as input and the loss thickness from the laboratory corrosion dataset as output, and train the model. The machine learning model is a model that uses loss thickness as the prediction target, including one or more of the following: linear regression model, support vector machine regression model, K-nearest neighbor regression model, decision tree model, random forest model, gradient boosting model, and artificial neural network model.

[0046] The first machine learning model was constructed, using the selected key features (mainly including the content of rare earth elements, experimental time and related chemical composition) as input and the loss thickness in the laboratory corrosion dataset as the output target, for model training and optimization.

[0047] 3.1 Model Selection and Performance Comparison: In the process of predicting corrosion loss in the laboratory, a systematic comparative analysis of the predictive performance of various machine learning regression models was conducted based on a unified data partitioning method. The models compared included mainstream algorithms such as linear regression, ridge regression, lasso regression, support vector regression, K-nearest neighbor regression, decision trees, random forests, extremely randomized trees, gradient boosting decision trees, AdaBoost, and artificial neural networks.

[0048] The coefficient of determination (R²), root mean square error (RMSE), and mean absolute error (MAE) were used as performance evaluation metrics for the models. Comparative results show that the tree-based ensemble model performs well overall, but the K-nearest neighbor regression (KNN) model achieves the best overall prediction performance on the validation set (R²≈0.90, RMSE≈14.79μm, MAE≈12.77μm), demonstrating its suitability for modeling small sample data.
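The three evaluation metrics can be computed as in the sketch below, using their standard definitions; the function name and the dictionary keys are hypothetical.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return R^2, RMSE and MAE for paired true and predicted values."""
    n = len(y_true)
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    mean_true = sum(y_true) / n
    ss_res = sum(r * r for r in residuals)                 # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)     # total sum of squares
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(r) for r in residuals) / n,
    }
```

RMSE and MAE carry the unit of the target (μm here), whereas R² is dimensionless, so the three are best read together when comparing candidate models.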

[0049] 3.2 Model Construction and Training: Based on the performance comparison results, KNN was selected as the primary machine learning model. This model uses rare earth content (RE) and test time (t) as core input features and laboratory loss thickness as output, constructing a nonlinear mapping from (RE, t) to laboratory loss thickness. By adjusting the number of nearest neighbors k (optimized to k=5) and the distance weight parameter, the model can effectively characterize the complex nonlinear effect of rare earth content and test time on laboratory corrosion loss thickness.
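A K-nearest neighbor regressor with k=5 and optional inverse-distance weighting can be sketched as below. This is a plain-Python illustration rather than the patent's implementation; it assumes Euclidean distance and that the (RE, t) features have already been standardized so that neither dominates the distance. The function name is hypothetical.

```python
import math


def knn_predict(train_X, train_y, query, k=5, weighted=True):
    """Predict a target value as the (optionally distance-weighted)
    mean of the k nearest training samples.

    train_X: list of feature tuples, e.g. standardized (RE, t)
    train_y: list of target values, e.g. laboratory loss thickness
    """
    neighbours = sorted(
        (math.dist(x, query), y) for x, y in zip(train_X, train_y)
    )[:k]
    if not weighted:
        return sum(y for _, y in neighbours) / len(neighbours)
    # An exact match would give an infinite weight; return its target directly.
    exact = [y for d, y in neighbours if d == 0.0]
    if exact:
        return sum(exact) / len(exact)
    weights = [1.0 / d for d, _ in neighbours]
    return sum(w * y for w, (_, y) in zip(weights, neighbours)) / sum(weights)
```

With only four rare earth levels and five sampling times, k cannot be large relative to the dataset, which is one reason the neighbor count and the weighting scheme are treated as tunable parameters.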

[0050] This step, through systematic model selection and performance comparison, determined the KNN algorithm best suited for corrosion prediction under conditions of small sample data, demonstrating high prediction accuracy and good stability. A phased modeling strategy was adopted to decompose the complex corrosion prediction problem into relatively independent sub-problems, effectively reducing modeling difficulty and laying a solid foundation for subsequent accurate predictions.

[0051] 4. Second model construction and training: Construct a second machine learning model, using the predicted value of laboratory accelerated corrosion loss thickness output by the first machine learning model as input and the loss thickness in the real atmospheric corrosion dataset as output, and train the model.

[0052] A second machine learning model is constructed, using the predicted value of accelerated corrosion loss thickness in the laboratory output by the first machine learning model as the main input and the loss thickness in the real atmospheric corrosion dataset as the output target, to establish a mapping relationship between laboratory and atmospheric environmental corrosion data.

[0053] 4.1 Model Input / Output Design The core objective of the second machine learning model is to build a bridge between laboratory accelerated corrosion data and real atmospheric corrosion data. The primary input feature is the laboratory loss thickness prediction from the first model; laboratory experimental parameters and real atmospheric environment parameters can be selectively fused as auxiliary input features based on the results of feature importance analysis. The model output is the loss thickness under real atmospheric conditions.

[0054] 4.2 Algorithm Selection and Hyperparameter Optimization Based on the characteristics of the mapping relationship between laboratory and atmospheric corrosion data, a suitable machine learning algorithm is selected: For data pairs with a strong linear relationship, linear regression or ridge regression or other linear models are preferred. For complex nonlinear relationships, ensemble algorithms such as decision trees, random forests, and gradient boosting machines (e.g., XGBoost, LightGBM) can be used. When there is sufficient data and the relationships between features are complex, artificial neural networks or deep learning models may provide better predictive performance.

[0055] For different model types, optimize by setting corresponding hyperparameter adjustment ranges: For tree-based ensemble models, the focus is on optimizing the number of trees, maximum depth, minimum number of samples per leaf node, and regularization parameters. For artificial neural network models, adjust parameters such as the number of network layers, the number of neurons in each layer, the type of activation function, the optimizer algorithm, and the batch size.

[0056] 4.3 Model Training and Evaluation Strategies The preprocessed and feature-engineered dataset is divided into training, validation, and test sets according to a preset ratio (usually 7:1.5:1.5). The training set is used for model parameter training, the validation set is used for hyperparameter tuning and model selection, and the test set is used to independently evaluate the generalization ability of the finally selected model.
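The 7:1.5:1.5 partition described above can be sketched as follows. The function name `split_dataset`, the fixed random seed, and the rounding rule for fractional sample counts are illustrative assumptions.

```python
import random

def split_dataset(samples, ratios=(0.7, 0.15, 0.15), seed=0):
    """Shuffle samples and split them into training, validation, and test sets."""
    items = list(samples)
    random.Random(seed).shuffle(items)  # seeded so the split is reproducible
    n = len(items)
    n_train = round(n * ratios[0])
    n_val = round(n * ratios[1])
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]      # the remainder goes to the test set
    return train, val, test
```

With 20 samples this yields 14 training, 3 validation, and 3 test samples. For very small datasets the validation and test sets become tiny, which is one reason the method also relies on cross-validation.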

[0057] Based on the training set data, multiple candidate machine learning models are constructed and trained in parallel, and then compared fairly under the same data partitioning conditions. The final model is selected based on a comprehensive score of prediction accuracy and stability, using validation set performance evaluation (primarily based on R², RMSE, and MAE).

[0058] 4.4 Overfitting Risk Control During model training and parameter optimization, k-fold cross-validation (usually k=5 or 10) is used to evaluate the predictive stability of the model on different data subsets. This effectively reduces the model's sensitivity to a single data split and significantly improves the model's generalization ability under small sample conditions, thereby reducing the risk of overfitting.
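A minimal sketch of k-fold cross-validation is given below. The helper names `kfold_indices` and `cross_val_rmse`, and the `fit` callable interface, are hypothetical; the folds are contiguous, so the data should be shuffled beforehand if its order is meaningful.

```python
def kfold_indices(n_samples, k=5):
    """Yield (train_indices, validation_indices) for k contiguous folds."""
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)  # spread the remainder
        val = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, val
        start += size

def cross_val_rmse(X, y, fit, k=5):
    """Average validation RMSE over k folds.

    `fit(train_X, train_y)` is assumed to return a callable predictor.
    """
    scores = []
    for train, val in kfold_indices(len(X), k):
        model = fit([X[i] for i in train], [y[i] for i in train])
        sq_err = [(model(X[i]) - y[i]) ** 2 for i in val]
        scores.append((sum(sq_err) / len(sq_err)) ** 0.5)
    return sum(scores) / len(scores), scores
```

The spread of the per-fold scores, not only their mean, indicates how sensitive the model is to a particular data split.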

[0059] This step establishes an accurate mapping between laboratory and atmospheric corrosion data, enabling the accelerated extrapolation of experimental results to real-world service environments. The system's hyperparameter optimization and cross-validation strategies ensure the model's robustness and generalization ability, providing crucial technical support for accurately predicting the long-term corrosion behavior of rare earth steel in real atmospheric environments.

[0060] 5. Joint Model Optimization: The trained first and second machine learning models are jointly optimized using a subset of data from the real atmospheric corrosion dataset to minimize the prediction error of the actual atmospheric loss thickness. This joint model optimization is an iterative process, involving continuous feedback of new real atmospheric exposure data to update and optimize the parameters of both models.

[0061] To accurately establish a quantitative mapping relationship between laboratory accelerated corrosion and real atmospheric corrosion, and to ensure that the model can accurately reflect real atmospheric corrosion behavior, it is necessary to jointly optimize the trained first and second machine learning models using real atmospheric exposure test data. The goal is to minimize the prediction error of the real atmospheric loss thickness.

[0062] For the phased modeling architecture, a first machine learning model (laboratory corrosion prediction model) and a second machine learning model (laboratory-atmospheric corrosion mapping model) are first trained independently using a training set. Then, laboratory corrosion data are input into the first model to obtain the predicted laboratory loss thickness. This predicted value is then passed as input to the second model to obtain the final predicted value of the actual atmospheric loss thickness. This predicted value is compared with the measured value from a real atmospheric exposure experiment, and the prediction error is calculated (e.g., using mean squared error (MSE) as the loss function).

[0063] Based on the calculated error signal, the parameters of the first and second models are jointly adjusted using optimization algorithms such as backpropagation. The optimization strategy can focus on optimizing the parameters of the second model or perform coordinated parameter adjustments between the two models. The goal is to make the predicted atmospheric loss thickness as close as possible to the measured value, thereby accurately capturing the complex correspondence between the two corrosive environments.

[0064] Model optimization is a continuous iterative process. As new real atmospheric exposure data (especially long-term exposure data) accumulates, this new data is periodically incorporated into the training set to retrain the model and update its parameters. This incremental learning mechanism enables the model to adapt to dynamic changes in environmental conditions and potential drift in data distribution, ensuring the accuracy and timeliness of long-term predictions.

[0065] 6. Corrosion Prediction and Application: The key characteristics of the rare earth steel to be predicted are input into the optimized first machine learning model to obtain the predicted value of the laboratory accelerated corrosion loss thickness. This predicted value is then input into the optimized second machine learning model to obtain the final predicted value of the loss thickness under real atmospheric conditions.

[0066] The jointly optimized machine learning model can be put into practical application to predict the loss thickness of rare earth steel with unknown rare earth content and guide material design and optimization.

[0067] For a rare earth steel sample to be predicted, its key characteristic parameters (including rare earth content, known chemical composition, and preset laboratory accelerated corrosion test parameters) are input into the optimized first machine learning model, and the expected loss thickness prediction value of the steel under specific laboratory conditions can be quickly obtained.

[0068] Then, the laboratory loss thickness prediction value output by the first model is input into the optimized second machine learning model, which can infer the loss thickness of the rare earth steel in the target real atmospheric environment.
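The chained prediction (first model, then second model) can be illustrated with the sketch below. It assumes, purely for illustration, that the second model is a simple linear map fitted by ordinary least squares on the first model's predictions; the specification allows other algorithms, and the names `fit_linear` and `make_two_stage` are hypothetical.

```python
def fit_linear(x, y):
    """Closed-form ordinary least squares for y = a * x + b."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    a = sxy / sxx
    return a, my - a * mx

def make_two_stage(first_model, lab_features, atmos_loss):
    """Build a predictor: features -> lab loss (stage 1) -> atmospheric loss (stage 2).

    `first_model` is any callable returning the predicted laboratory loss
    thickness; `atmos_loss` holds the measured atmospheric loss thickness for
    the same samples.
    """
    lab_pred = [first_model(f) for f in lab_features]
    a, b = fit_linear(lab_pred, atmos_loss)  # stage 2 is trained on stage-1 output
    return lambda features: a * first_model(features) + b
```

Training the second stage on the first stage's predictions rather than on measured laboratory values means that the second stage also absorbs any systematic bias of the first model.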

[0069] One of the core applications of this predictive model is guiding the composition design of rare earth steel. By systematically changing the rare earth content parameter input into the model (keeping other components and experimental conditions constant), the corrosion resistance performance predictions under different ratios can be quickly obtained, thus determining the optimal addition range. Application examples show that the model prediction results clearly reveal a pattern: when the rare earth content increases from 0 to 0.009 wt%, the thickness loss decreases significantly, and the corrosion resistance improves markedly; within the range of 0.009 wt% to 0.017 wt%, the decreasing trend of the thickness loss slows down and tends to stabilize; and when the rare earth content exceeds 0.017 wt%, the thickness loss no longer decreases and may even increase slightly. This quantitative pattern directly indicates that the optimal rare earth addition range is 0.009 wt% to 0.017 wt%, providing a scientific basis for material composition design and process optimization.

[0070] This process combines advanced machine learning techniques with knowledge of corrosion science, enabling the rapid and accurate prediction of the long-term corrosion behavior and service life of rare earth steel in real atmospheric environments using short-cycle laboratory test data. It not only significantly improves materials research and development efficiency and reduces research and development costs, but also provides a powerful decision support tool for the composition design and performance optimization of high-performance rare earth steel.

[0071] 7. Establish the equivalent conversion relationship between laboratory accelerated corrosion time and real atmospheric exposure time.

[0072] In one specific embodiment, the method of the present invention further includes: establishing an equivalent conversion relationship between laboratory accelerated corrosion time and real atmospheric exposure time; the equivalent conversion relationship is obtained by constructing corrosion kinetic models under laboratory and real atmospheric environments respectively, and linking the two based on the principle of equal loss thickness. When the real atmospheric corrosion data is insufficient to construct a complete corrosion kinetic model, a data extrapolation method is used to predict the corrosion loss thickness at the target time point, thereby improving the real atmospheric corrosion kinetic model, and the predicted values are calibrated based on the real atmospheric corrosion dataset to correct prediction errors. After completing the above model training and joint optimization, in order to establish a quantitative correlation between laboratory accelerated corrosion and real atmospheric corrosion, and to perform final calibration of the prediction results, the present invention introduces a time conversion relationship and correction factor system based on corrosion kinetics.

[0073] (1) Establishment of corrosion kinetic model and derivation of time conversion relationship Based on corrosion kinetics theory, the metal corrosion process in both accelerated laboratory environments and real atmospheric environments typically follows a power function law. Therefore, we established corrosion kinetic equations for each rare earth content group, i.e., time-power function models for both indoor and outdoor environments.

[0074] Obtain laboratory accelerated corrosion test data, average the corrosion loss thickness of parallel samples at the same test time, and establish the power function relationship between the laboratory corrosion loss thickness $D_{lab}$ and the laboratory exposure time $t_{lab}$: $D_{lab} = A_{lab} \cdot t_{lab}^{n_{lab}}$, where $A_{lab}$ and $n_{lab}$ are the model parameters obtained by fitting the data using the log-linear regression method. $A_{lab}$ and $n_{lab}$ are corrosion kinetic parameters in a laboratory environment, describing the evolution of material loss thickness over time under accelerated corrosion conditions: $A_{lab}$ reflects the initial corrosion rate trend, and $n_{lab}$ reflects the change in the corrosion process rate over time (acceleration or deceleration). $t_{lab}$ is the test time for which the material is subjected to accelerated corrosion in a controlled laboratory environment (such as a salt spray chamber or periodic immersion equipment).

[0075] Corrosion loss thickness data for different exposure times under real atmospheric conditions were obtained and organized in chronological order to establish the initial power function relationship between the outdoor corrosion loss thickness $D_{out}$ and the atmospheric exposure time $t_{out}$: $D_{out} = A_{out} \cdot t_{out}^{n_{out}}$, where $A_{out}$ and $n_{out}$ are the model parameters obtained by fitting the data using the log-linear regression method. $A_{out}$ and $n_{out}$ are corrosion kinetic parameters for a real atmospheric environment, describing the evolution of material loss thickness with exposure time under a real atmospheric environment.
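The log-linear fitting of a power function can be sketched as follows: taking logarithms of D = A·t^n gives ln D = ln A + n·ln t, which is a straight line in (ln t, ln D). The function name `fit_power_law` and the symbols A and n are illustrative; all times and loss values must be positive.

```python
import math

def fit_power_law(times, losses):
    """Fit D = A * t**n by least squares on ln D = ln A + n * ln t."""
    lx = [math.log(t) for t in times]
    ly = [math.log(d) for d in losses]
    m = len(lx)
    mx, my = sum(lx) / m, sum(ly) / m
    # Slope of the log-log line is the exponent n
    n = sum((a - mx) * (b - my) for a, b in zip(lx, ly)) / sum(
        (a - mx) ** 2 for a in lx
    )
    A = math.exp(my - n * mx)  # intercept of the log-log line is ln A
    return A, n
```

The same routine applies to both the laboratory series and the atmospheric series, yielding one (A, n) pair per environment and per rare earth content group.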

[0076] (2) Extending the real atmospheric timescale based on the prediction model When the number of actual atmospheric corrosion data points is insufficient to cover the target time range, a grey prediction model GM(1,1) is introduced into the existing outdoor corrosion loss data to extrapolate and predict the corrosion loss thickness under the target atmospheric exposure time. The original outdoor corrosion data and the predicted data are used together to refit the outdoor power function model to obtain an outdoor corrosion loss function that covers a longer time scale (within the service life range).
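A compact sketch of the standard GM(1,1) grey model is shown below for reference. It assumes an equally spaced, positive series with a non-zero development coefficient, and the function name `gm11_forecast` is hypothetical. The source does not give its exact formulation, so this follows the textbook version (accumulated series, mean-generated background values, least-squares estimate of the two parameters).

```python
import math

def gm11_forecast(x0, steps):
    """Forecast the next `steps` values of an equally spaced positive series."""
    n = len(x0)
    # 1-AGO: accumulated generating sequence
    x1, total = [], 0.0
    for v in x0:
        total += v
        x1.append(total)
    # Background values: means of consecutive accumulated values
    z = [0.5 * (x1[i] + x1[i - 1]) for i in range(1, n)]
    y = x0[1:]
    # Least squares for the grey equation x0[k] = -a * z[k] + b
    mz, my = sum(z) / len(z), sum(y) / len(y)
    slope = sum((zi - mz) * (yi - my) for zi, yi in zip(z, y)) / sum(
        (zi - mz) ** 2 for zi in z
    )
    a, b = -slope, my - slope * mz

    def x1_hat(k):
        # Time response of the accumulated series (k is a 0-based index)
        return (x0[0] - b / a) * math.exp(-a * k) + b / a

    # Restore forecasts of the original series by differencing
    return [x1_hat(k) - x1_hat(k - 1) for k in range(n, n + steps)]
```

Because GM(1,1) extrapolates an exponential trend, forecasts far beyond the observed period should be treated with caution, which is consistent with the calibration against real atmospheric data described in the method.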

[0077] (3) Establish the indoor-outdoor time conversion relationship based on the principle of equal loss thickness. Let $D_{lab} = A_{lab} \cdot t_{lab}^{n_{lab}}$ and $D_{out} = A_{out} \cdot t_{out}^{n_{out}}$ denote the fitted laboratory and real atmospheric corrosion models. The correspondence between indoor and outdoor time is established according to the principle of equal loss thickness: $A_{lab} \cdot t_{lab}^{n_{lab}} = A_{out} \cdot t_{out}^{n_{out}}$. This leads to the conversion function from outdoor time to equivalent laboratory time: $t_{lab} = \left( A_{out} \cdot t_{out}^{n_{out}} / A_{lab} \right)^{1/n_{lab}}$, and the conversion function from laboratory time to equivalent outdoor time: $t_{out} = \left( A_{lab} \cdot t_{lab}^{n_{lab}} / A_{out} \right)^{1/n_{out}}$.
(4) Output of correlation verification and conversion results. Grey relational analysis is performed on the real atmospheric corrosion loss sequence and the laboratory corrosion loss sequence obtained by equivalent time conversion to calculate the grey relational degree γ, so as to verify the correlation between laboratory corrosion behavior and real atmospheric corrosion behavior. When the relational degree meets the preset threshold condition (γ>0.6), the laboratory time, the corresponding outdoor equivalent time, and the corresponding corrosion loss thickness are output, realizing the mapping of laboratory accelerated corrosion results to real atmospheric corrosion behavior.
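The equal-loss-thickness time conversion and the grey relational check can be sketched as below. The power-law parameter names (`A_lab`, `n_lab`, `A_out`, `n_out`) are illustrative labels for the fitted coefficients and exponents. The grey relational degree follows Deng's common formulation with initial-value normalization and a distinguishing coefficient of 0.5; both choices are assumptions, since the source does not specify them.

```python
def outdoor_to_lab_time(t_out, A_lab, n_lab, A_out, n_out):
    """Laboratory time producing the same loss thickness as outdoor time t_out."""
    return (A_out * t_out ** n_out / A_lab) ** (1.0 / n_lab)

def lab_to_outdoor_time(t_lab, A_lab, n_lab, A_out, n_out):
    """Outdoor time producing the same loss thickness as laboratory time t_lab."""
    return (A_lab * t_lab ** n_lab / A_out) ** (1.0 / n_out)

def grey_relational_degree(reference, comparison, rho=0.5):
    """Grey relational degree between two equal-length positive sequences."""
    # Initial-value normalization removes the scale difference
    ref = [v / reference[0] for v in reference]
    cmp_seq = [v / comparison[0] for v in comparison]
    delta = [abs(r - c) for r, c in zip(ref, cmp_seq)]
    d_min, d_max = min(delta), max(delta)
    if d_max == 0.0:
        return 1.0  # identical shapes after normalization
    coeffs = [(d_min + rho * d_max) / (d + rho * d_max) for d in delta]
    return sum(coeffs) / len(coeffs)
```

The two conversion functions are exact inverses of each other, so converting an outdoor time to laboratory time and back returns the original value; this makes a convenient sanity check on fitted parameters.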

[0078] (5) Establishment of correction factors In practical engineering, it is often difficult to obtain complete atmospheric corrosion data covering the entire service life. To address this problem, this invention employs a data extrapolation method (such as the grey prediction model GM(1,1)) to extrapolate real atmospheric corrosion data over a limited period of time, thereby improving the long-term corrosion kinetics model.

[0079] At the equivalent time point in the laboratory, the corresponding predicted value of laboratory corrosion loss thickness $\hat{D}_{lab}$ is calculated using the established indoor corrosion loss thickness prediction model. The ratio of the measured real atmospheric corrosion loss thickness $D_{out}$ of the sample with the corresponding rare earth content to this predicted laboratory value gives the single-point correction factor k: $k = D_{out} / \hat{D}_{lab}$. The single-point correction factors obtained from multiple rare earth content samples and multiple real atmospheric exposure time conditions are statistically summarized to determine a global correction factor that characterizes the scale difference between laboratory accelerated corrosion conditions and real atmospheric corrosion conditions. Taking RE0 in this training as an example, the k values of the two groups of samples at 1 year and 3 years were calculated to be 0.29 and 0.69, respectively. The average value k=0.49 was taken as the conversion coefficient between laboratory and real atmospheric loss thickness.

[0080] (6) Application in forecasting. The above results are incorporated into the forecasting process when making the final prediction of the actual atmospheric loss thickness.

[0081] Based on the global correction coefficient, the output of the laboratory corrosion loss prediction model is corrected to obtain the corresponding predicted value of real atmospheric corrosion loss $\hat{D}_{out}$, given by the following formula: $\hat{D}_{out} = k \cdot \hat{D}_{lab}$. That is, the rare earth content and indoor time series points are input, the laboratory accelerated loss thickness $\hat{D}_{lab}$ is predicted by the KNN model, and the result is multiplied by the correction factor k to obtain the loss thickness prediction under real atmospheric conditions at the corresponding equivalent outdoor time.
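The correction-factor arithmetic can be sketched as follows, using the two single-point values reported for RE0 (0.29 and 0.69). The function names are hypothetical, and the simple arithmetic mean is used as the statistical summary because that is what the RE0 example reports.

```python
def single_point_factor(d_out_measured, d_lab_predicted):
    """k = measured atmospheric loss / predicted laboratory loss at the equivalent time."""
    return d_out_measured / d_lab_predicted

def global_factor(factors):
    """Summarize single-point factors by their arithmetic mean."""
    return sum(factors) / len(factors)

def atmospheric_loss(d_lab_predicted, k):
    """Scale a laboratory loss prediction to real atmospheric conditions."""
    return k * d_lab_predicted

# Values reported for RE0 at 1 year and 3 years of exposure
k_global = global_factor([0.29, 0.69])
```

The two RE0 factors differ by more than a factor of two, so a single averaged coefficient is a coarse calibration; adding more exposure times or rare earth groups would show whether k depends on time.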

[0082] By introducing time conversion and correction factors, this invention not only achieves quantitative correlation between laboratory data and atmospheric data, but also significantly reduces the systematic error of the prediction results, making the final prediction value closer to the actual situation, and greatly improving the engineering practicality and reliability of the method.

[0083] 8. Industrial Application Validation Cases In one specific embodiment, the method of the present invention further includes: obtaining predicted values of loss thickness under different rare earth content by changing the rare earth content parameter of the input model, and determining the optimal content range of rare earth addition based on the predicted values. To verify the actual effectiveness of the method described in this embodiment, it was applied to the composition design of rare earth corrosion-resistant steel for industrial atmospheric environments of a certain enterprise. The technical requirement of this enterprise is: under the target industrial atmospheric environment, the actual atmospheric corrosion rate of the material is ≤5μm / year.

[0084] Therefore, the prediction model established by the method of this invention is used to evaluate the annual loss thickness under different rare earth contents. Specifically, the candidate rare earth content RE and the indoor accelerated corrosion test time t are input, and the indoor prediction model outputs the predicted value of the indoor loss thickness. The indoor / outdoor time conversion is then applied to obtain the equivalent indoor time $t_{eq}$, and the predicted laboratory corrosion loss thickness is calculated at $t_{eq}$. Then, a correction factor k is introduced to obtain the predicted value of the actual atmospheric loss thickness. This predicted value corresponds to a specific equivalent exposure time (e.g., 1 year). In this application case, several candidate rare earth contents were input at a unified time series point, and the prediction results are shown in Table 1.

[0085] When RE=0%, the predicted loss thickness is 6.5 μm (not satisfied); when RE=0.009%, it is 5.3 μm (not satisfied); when RE=0.017%, it is 4.4 μm (satisfied); when RE=0.028%, it is 5 μm (not satisfied).

[0086] It was found that the predicted thickness loss first decreased and then increased as the rare earth content rose from 0.009% to 0.017% and then to 0.028%, with the minimum at RE=0.017%. Increasing the content beyond this value resulted in increased thickness loss.
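The composition screening step amounts to evaluating a predictor over candidate rare earth contents and selecting the minimum predicted loss. The sketch below uses the four predicted values reported in this application case as a lookup table in place of a trained model; the function name `scan_rare_earth` is hypothetical.

```python
def scan_rare_earth(predict_loss, candidates):
    """Evaluate predicted loss for each candidate content and return the best one."""
    results = [(content, predict_loss(content)) for content in candidates]
    best_content, best_loss = min(results, key=lambda pair: pair[1])
    return best_content, best_loss, results

# Predicted loss thickness (um) by rare earth content (%), as reported above
REPORTED = {0.0: 6.5, 0.009: 5.3, 0.017: 4.4, 0.028: 5.0}

best_content, best_loss, table = scan_rare_earth(
    REPORTED.__getitem__, sorted(REPORTED)
)
```

In practice `predict_loss` would wrap the trained model with the test time and other components held fixed, and the candidate grid could be made finer around the minimum to narrow the recommended range.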

[0087] Based on the above predictions, it is recommended that the rare earth content of the steel be designed to be around 0.017%. Subsequent real-world exposure or equivalent verification tests were conducted on the steel grades within this range. The results showed that their corrosion resistance met the requirements of the target industrial atmospheric environment, thus enabling rapid screening and optimization of the composition scheme during the R&D stage. This case demonstrates that, using the method of this invention, the optimal composition formula can be quickly and accurately screened during the R&D stage without long-term atmospheric exposure, significantly shortening the traditional material R&D cycle from 18 months to 3-4 months, significantly improving R&D efficiency and reducing costs.

[0088] Example 2: Prediction Method Based on Ensemble Machine Learning Model This embodiment provides a machine learning-based method for predicting the loss thickness of rare earth steel, including the following steps:
S1. Data Acquisition and Preprocessing: Collect chemical composition data of rare earth steel samples with different rare earth contents, laboratory corrosion datasets obtained through accelerated corrosion tests, and real atmospheric corrosion datasets obtained through real atmospheric exposure tests. Clean and standardize all datasets.
S2. Feature Engineering: Determine the model input features from the preprocessed data. The input features include at least the content of rare earth elements, other chemical components of steel, and laboratory accelerated corrosion test parameters. Screen out the key features.
S3. Ensemble Model Construction and Training: Construct an ensemble machine learning model, using the selected key features as input and the loss thickness in the real atmospheric corrosion dataset as the direct output, and train the model.
S4. Model Optimization: Optimize the trained ensemble machine learning model using the real atmospheric corrosion dataset.
S5. Corrosion Prediction and Application: Input the key features of the rare earth steel to be predicted into the optimized ensemble machine learning model, and directly output its predicted loss thickness under real atmospheric conditions.

[0089] The above method process will be explained below by comparing it with Example 1.

[0090] Step S1 (data acquisition and preprocessing) is exactly the same as in Example 1.

[0091] In step S2 (feature engineering), the determined input features are more comprehensive, including rare earth content (RE%), other chemical compositions of the steel (C, Si, Mn, etc.), and laboratory accelerated corrosion test parameters (time t, medium concentration, etc.). The most influential subset of key features is selected using the recursive feature elimination (RFE) method combined with cross-validation. In this embodiment, five features—rare earth content (RE%), carbon content (C%), test time (t), etc.—are ultimately selected as key input features.

[0092] In step S3 (model construction and training), an end-to-end ensemble modeling strategy is adopted. Instead of distinguishing between the first and second models described in Example 1, an end-to-end ensemble machine learning model is directly constructed. This model uses all the key features selected in step S2 as input and directly maps them to the predicted values of the actual atmospheric loss thickness. That is, a direct mapping from the selected key features to the real atmospheric loss thickness is established.

[0093] In this embodiment, the gradient boosting tree model XGBoost is preferably used as the ensemble model, with hyperparameters such as a learning rate of 0.1 and a maximum depth of 6 for training. This end-to-end ensemble scheme has higher prediction efficiency compared to the staged modeling scheme, although its model interpretability may be slightly inferior.

[0094] Step S4 (Model Optimization) uses all real atmospheric exposure data to train and optimize the ensemble model, adjusting its hyperparameters (such as learning rate, maximum tree depth, subsampling ratio, etc.) to minimize prediction error.

[0095] In step S5 (corrosion prediction and application), all key features of the rare earth steel to be predicted are input into the optimized ensemble model at once to directly and quickly obtain the predicted value of its loss thickness under the real atmospheric environment. This approach has a more compact process and higher prediction efficiency.

[0096] Furthermore, to verify and highlight the technical superiority of the methods described in Embodiment 1 (staged machine learning model) and Embodiment 2 (ensemble machine learning model) of the present invention, this experimental example designs a systematic comparative experiment to quantitatively compare the method of the present invention with the traditional prediction methods mentioned in the background art. The empirical formula method based on power functions, commonly used by those skilled in the art, is selected as the comparison benchmark.

[0097] The progress of the method of this invention is objectively evaluated by comparing, on the same dataset, the key performance indicators (prediction accuracy, efficiency, and cost) of the traditional empirical formula method, the method of Embodiment 1, and the method of Embodiment 2. The comparison objects are: (1) Traditional empirical formula method: the power function empirical formula widely used in industry, $D = k \cdot t^{n}$, serves as the benchmark; this model uses only time t as input and obtains parameters k and n by fitting historical data through linear regression. (2) Embodiment 1 of the present invention: the staged K-nearest neighbor (KNN) regression model. (3) Embodiment 2 of the present invention: the end-to-end XGBoost ensemble model.

[0098] The dataset used is the preprocessed complete dataset described in Example 1 (containing laboratory accelerated corrosion data with different rare earth contents and real atmospheric exposure data). Evaluation metrics include the coefficient of determination (R²), root mean square error (RMSE), and mean absolute error (MAE) to evaluate prediction accuracy; model training time and prediction time are recorded to evaluate computational efficiency.

[0099] Experimental Results and Analysis (1) Comparison of prediction accuracy. Table 1 shows the comparison of prediction accuracy of the three methods on the same test set. As can be seen from Table 1, the prediction accuracy of the two embodiments of this invention (R²>0.90) is significantly better than the traditional empirical formula method (R² = 0.68). This indicates that the machine learning model can effectively capture the complex nonlinear relationship between rare earth content, chemical composition, and corrosion loss, while the traditional formula cannot utilize these key features. Furthermore, Embodiment 2 (ensemble model) has slightly higher accuracy than Embodiment 1 (staged model), demonstrating the advantages of the end-to-end learning model.

[0100] Table 1 Comparison of prediction accuracy of different methods

[0101] (2) Comparison of prediction efficiency and application cost Table 2 shows a comparison of the efficiency and cost of various methods for the complete process of new material development.

[0102] Table 2 Comparison of prediction efficiency and application cost of different methods

[0103] As shown in Table 2, in terms of efficiency, the two methods of this invention are significantly superior to traditional methods that rely on long-term real-world exposure, shortening the R&D cycle from "years" to "months". In terms of cost, although the methods of this invention are slightly more expensive than simply running an empirical formula, the resulting improvement in accuracy is orders of magnitude, offering extremely high cost-effectiveness. This avoids the risk of R&D failure due to inaccurate predictions from traditional formulas. Example 2 (ensemble model) demonstrates the fastest prediction speed, showcasing its enormous potential in scenarios requiring high-throughput screening.

[0104] The comparative experiment yields the following conclusions: (1) Significantly improved accuracy: The methods described in Examples 1 and 2 of this invention surpass the traditional empirical formula method in prediction accuracy, demonstrating the necessity of introducing machine learning to handle complex feature relationships. (2) Substantially improved efficiency: Compared to traditional long-term exposure tests, the method of this invention shortens the material corrosion resistance evaluation cycle by more than 75%. (3) A preferred solution for each scenario: Example 2 exhibits superior overall performance in prediction accuracy and speed and is suitable for most scenarios requiring rapid and accurate prediction, while the model of Example 1 is simpler and demonstrates good stability and interpretability under small sample conditions. Through objective data, the comparative experiment demonstrates that the two machine learning prediction methods provided by this invention have outstanding substantive features and represent significant progress over traditional technologies in predicting the loss thickness of rare earth steel.

[0105] In summary, this invention, by constructing a machine learning model, enables rapid prediction of the loss thickness of rare earth steel under indoor (laboratory) conditions. After conversion calibrated with real atmospheric data, inputting the rare earth content and time series points yields the real atmospheric loss thickness corresponding to the accelerated laboratory test. This provides efficient technical support for the design of rare earth steel composition and the evaluation of its corrosion resistance. Furthermore, adding more real atmospheric data points in the future will improve the model's generalization ability.

[0106] Embodiments of the present invention may be provided as methods, systems, or computer program products. Therefore, the present invention may take the form of a completely hardware embodiment, a completely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product implemented on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical memory, etc.) containing computer-usable program code.

[0107] This invention is described with reference to flowchart illustrations and / or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each block of the flowchart illustrations and / or block diagrams, and combinations of blocks in the flowchart illustrations and / or block diagrams, can be implemented by computer program instructions. These computer program instructions can be provided to a processor of a general-purpose computer, special-purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create a device for implementing the functions specified in one or more processes of the flowchart and / or one or more boxes of the block diagram.

[0108] These computer program instructions may also be stored in a computer-readable storage medium that can direct a computer or other programmable data processing device to function in a particular manner, such that the instructions stored in the computer-readable storage medium produce an article of manufacture including instruction means, which implement the functions specified in one or more processes of the flowchart and / or one or more boxes of the block diagram.

[0109] These computer program instructions may also be loaded onto a computer or other programmable data processing equipment to cause a series of operational steps to be performed on the computer or other programmable equipment to produce a computer-implemented process, such that the instructions that execute on the computer or other programmable equipment provide steps for implementing the functions specified in one or more processes of the flowchart and / or one or more boxes of the block diagram.

[0110] Contents not described in detail in this specification are prior art known to those skilled in the art. It is hereby indicated that the above description is intended to help those skilled in the art understand this invention, but does not limit the scope of protection of this invention. Any equivalent substitutions, modifications, improvements, or simplifications of the above descriptions that do not depart from the essential content of this invention fall within the scope of protection of this invention.

Claims

1. A method for predicting the loss thickness of rare earth steel based on machine learning, characterized in that it includes the following steps:

S1. Data Acquisition and Preprocessing: Collect chemical composition data of rare earth steel samples with different rare earth contents, laboratory corrosion datasets obtained through accelerated corrosion tests, and real atmospheric corrosion datasets obtained through real atmospheric exposure tests. Clean and standardize all datasets.

S2. Feature Engineering: Determine the model input features from the preprocessed data. The input features include at least the content of rare earth elements. Use feature selection methods to screen out key features that have a significant impact on the loss thickness.

S3. First Model Construction and Training: Construct a first machine learning model, using the selected key features as input and the loss thickness in the laboratory corrosion dataset as output, and train the model.

S4. Second Model Construction and Training: Construct a second machine learning model, using the predicted value of laboratory accelerated corrosion loss thickness output by the first machine learning model as input and the loss thickness in the real atmospheric corrosion dataset as output, and train the model.

S5. Joint Model Optimization: Jointly optimize the trained first and second machine learning models using a portion of the data in the real atmospheric corrosion dataset to minimize the prediction error of the real atmospheric loss thickness.

S6. Corrosion Prediction and Application: Input the key features of the rare earth steel to be predicted into the optimized first machine learning model to obtain the predicted value of laboratory accelerated corrosion loss thickness. Then input this predicted value into the optimized second machine learning model to obtain the final predicted value of loss thickness under real atmospheric conditions.
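
The chained prediction in S3, S4, and S6 can be sketched as follows. This is a minimal illustration, not the implementation of the invention: it assumes a 1-nearest-neighbor regressor as the first model and a least-squares line as the second, and all names (`knn_predict`, `fit_line`, `predict_field_loss`) and all numbers are made-up placeholders. The joint optimization of S5 is not shown.

```python
import math

def knn_predict(train_X, train_y, x, k=1):
    """Average the targets of the k training points closest to x."""
    ranked = sorted(zip(train_X, train_y), key=lambda pair: math.dist(pair[0], x))
    nearest = ranked[:k]
    return sum(y for _, y in nearest) / len(nearest)

def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

# Synthetic placeholder data, NOT measurements.
features = [(0.00, 0.20), (0.01, 0.20), (0.02, 0.20), (0.03, 0.20)]  # (RE wt%, Cu wt%)
lab_loss = [120.0, 100.0, 90.0, 95.0]    # accelerated test loss thickness, um
field_loss = [30.0, 25.0, 22.5, 23.75]   # atmospheric exposure loss thickness, um

# S4: the second model is trained on the first model's predictions,
# not on the raw laboratory measurements.
lab_hat_train = [knn_predict(features, lab_loss, x) for x in features]
slope, intercept = fit_line(lab_hat_train, field_loss)

def predict_field_loss(x):
    """S6: features -> predicted lab loss -> predicted atmospheric loss."""
    lab_hat = knn_predict(features, lab_loss, x)
    return lab_hat, slope * lab_hat + intercept

lab_hat, field_hat = predict_field_loss((0.021, 0.20))
print(lab_hat, field_hat)
```

With `k=1` the first model reproduces the training targets exactly, so in practice a larger `k` or held-out predictions would be a more honest input for training the second model.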

2. A method for predicting the loss thickness of rare earth steel based on machine learning, characterized in that it includes the following steps:

S1. Data Acquisition and Preprocessing: Collect chemical composition data of rare earth steel samples with different rare earth contents, laboratory corrosion datasets obtained through accelerated corrosion tests, and real atmospheric corrosion datasets obtained through real atmospheric exposure tests. Clean and standardize all datasets.

S2. Feature Engineering: Determine the model input features from the preprocessed data. The input features include at least the content of rare earth elements, other chemical components of the steel, and laboratory accelerated corrosion test parameters. Screen out the key features.

S3. Integrated Model Construction and Training: Construct an integrated machine learning model, using the selected key features as input and the loss thickness in the real atmospheric corrosion dataset as the direct output, and train the model.

S4. Model Optimization: Optimize the trained integrated machine learning model using the real atmospheric corrosion dataset.

S5. Corrosion Prediction and Application: Input the key features of the rare earth steel to be predicted into the optimized integrated machine learning model, and directly output its predicted loss thickness under real atmospheric conditions.

3. The method for predicting the loss thickness of rare earth steel based on machine learning according to claim 1 or 2, characterized in that, in S1, the laboratory accelerated corrosion test includes at least one of the following: neutral salt spray test, cyclic wet-dry corrosion test, and immersion corrosion test.

4. The method for predicting the loss thickness of rare earth steel based on machine learning according to claim 3, characterized in that, in S2, the feature selection method includes at least one of correlation analysis, principal component analysis, and recursive feature elimination.
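
Of the feature selection methods listed, correlation analysis is the simplest to illustrate. The sketch below assumes Pearson correlation against the loss thickness and a cutoff of |r| >= 0.5; the cutoff, the function names, and the data are hypothetical and chosen only for illustration.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def select_features(columns, target, threshold=0.5):
    """Keep the features whose |r| with the target reaches the threshold."""
    scores = {name: pearson(values, target) for name, values in columns.items()}
    selected = [name for name, r in scores.items() if abs(r) >= threshold]
    return selected, scores

# Synthetic placeholder data, NOT measurements.
columns = {
    "RE_wt_pct": [0.00, 0.01, 0.02, 0.03, 0.04],
    "Mn_wt_pct": [1.0, 1.2, 1.0, 1.2, 1.0],
}
loss_um = [120.0, 110.0, 100.0, 90.0, 80.0]

selected, scores = select_features(columns, loss_um)
print(selected)
```

Pearson correlation only captures linear relationships, which is one reason the claim also allows principal component analysis and recursive feature elimination.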

5. The method for predicting the loss thickness of rare earth steel based on machine learning according to claim 3, characterized in that the machine learning model takes loss thickness as its prediction target and includes one or more of the following: linear regression model, support vector machine regression model, K-nearest neighbor regression model, decision tree model, random forest model, gradient boosting model, and artificial neural network model.

6. The method for predicting the loss thickness of rare earth steel based on machine learning according to claim 1, characterized in that, in S5, the joint optimization of the models is an iterative process that involves continuously feeding back new real atmospheric exposure data to update and optimize the parameters of the first and second machine learning models.
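
One way to picture the feedback loop is a model whose parameters are recomputed each time a new exposure result arrives. The sketch below is an assumption-laden illustration: it updates only a linear second model (laboratory loss to atmospheric loss) from running sums, whereas the claim covers both models; the class name `RunningLine` and all values are hypothetical.

```python
class RunningLine:
    """Least-squares line y = a * x + b kept up to date from running sums."""

    def __init__(self):
        self.n = 0
        self.sx = self.sy = self.sxx = self.sxy = 0.0

    def update(self, x, y):
        """Fold one new (lab loss, atmospheric loss) pair into the sums."""
        self.n += 1
        self.sx += x
        self.sy += y
        self.sxx += x * x
        self.sxy += x * y

    def coefficients(self):
        """Return (a, b) for the current data."""
        denom = self.n * self.sxx - self.sx ** 2
        if self.n < 2 or denom == 0:
            raise ValueError("need at least two distinct x values")
        a = (self.n * self.sxy - self.sx * self.sy) / denom
        b = (self.sy - a * self.sx) / self.n
        return a, b

# Synthetic placeholder pairs, NOT measurements.
model = RunningLine()
for lab, field in [(120.0, 30.0), (100.0, 25.0), (90.0, 22.5)]:
    model.update(lab, field)
a_before, b_before = model.coefficients()

model.update(95.0, 26.0)  # a new atmospheric exposure result arrives
a_after, b_after = model.coefficients()
print(a_before, a_after)
```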

7. The method for predicting the loss thickness of rare earth steel based on machine learning according to claim 3, characterized in that the method further includes: obtaining predicted loss thickness values at different rare earth contents by varying the rare earth content parameter input to the model, and determining the optimal range of rare earth addition based on the predicted values.
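
The content sweep can be sketched as a simple grid search. Here `predict_loss_um` is a hypothetical stand-in for a trained model (a made-up quadratic, not a fitted relationship), and the "within 5% of the best prediction" rule for defining the optimal range is an assumption; the claim does not specify the criterion.

```python
def predict_loss_um(re_wt_pct):
    """Hypothetical stand-in for a trained model; NOT a fitted relationship."""
    return 90.0 + 40000.0 * (re_wt_pct - 0.025) ** 2

def optimal_range(predict, contents, tolerance=0.05):
    """Return (best content, lowest and highest content within tolerance of the best)."""
    predictions = [(c, predict(c)) for c in contents]
    best_content, best_loss = min(predictions, key=lambda p: p[1])
    near_best = [c for c, loss in predictions if loss <= best_loss * (1 + tolerance)]
    # min/max assume the near-optimal contents form one contiguous interval.
    return best_content, min(near_best), max(near_best)

# Sweep 0.000 to 0.050 wt% in 0.005 wt% steps.
contents = [i / 1000 for i in range(0, 51, 5)]
best, low, high = optimal_range(predict_loss_um, contents)
print(best, low, high)
```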

8. The method for predicting the loss thickness of rare earth steel based on machine learning according to claim 3, characterized in that the method further includes: establishing an equivalent conversion relationship between laboratory accelerated corrosion time and real atmospheric exposure time, wherein the equivalent conversion relationship is obtained by constructing corrosion kinetic models for the laboratory environment and the real atmospheric environment respectively, and linking the two based on the principle of equal loss thickness.
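
As a worked illustration of the equal-loss-thickness principle, assume both environments follow a power-law kinetic model D = A * t^n, a form often used for atmospheric corrosion but not one the claim prescribes. Setting the laboratory and atmospheric loss thicknesses equal gives t_field = (A_lab * t_lab^n_lab / A_field)^(1 / n_field). The coefficients and units below are made up.

```python
def loss_thickness(t, A, n):
    """Power-law kinetics D = A * t**n (assumed model form)."""
    return A * t ** n

def equivalent_field_time(t_lab, A_lab, n_lab, A_field, n_field):
    """Atmospheric exposure time that gives the same loss thickness as t_lab in the lab."""
    d = loss_thickness(t_lab, A_lab, n_lab)
    return (d / A_field) ** (1.0 / n_field)

# Hypothetical coefficients: lab time in days, field time in years, thickness in um.
t_field = equivalent_field_time(4.0, A_lab=10.0, n_lab=1.0, A_field=20.0, n_field=0.5)
print(t_field)
```

The conversion is only meaningful within the time span over which both kinetic models were fitted.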

9. The method for predicting the loss thickness of rare earth steel based on machine learning according to claim 8, characterized in that, when the real atmospheric corrosion data is insufficient to construct a complete corrosion kinetic model, a data extrapolation method is used to predict the corrosion loss thickness at the target time point, thereby completing the real atmospheric corrosion kinetic model, and the predicted values are then calibrated against the real atmospheric corrosion dataset to correct the prediction error.
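
The extrapolation step could look like the sketch below, which assumes a power-law model D = A * t^n fitted by least squares in log-log space; the claim does not name a specific extrapolation method, and the data are synthetic. The subsequent calibration against measured atmospheric data is not shown.

```python
import math

def fit_power_law(times, losses):
    """Fit D = A * t**n by linear least squares on (log t, log D)."""
    xs = [math.log(t) for t in times]
    ys = [math.log(d) for d in losses]
    count = len(xs)
    mx, my = sum(xs) / count, sum(ys) / count
    n = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    A = math.exp(my - n * mx)
    return A, n

# Synthetic short-term exposure data generated from D = 20 * t**0.5, NOT measurements.
times_yr = [1.0, 2.0, 4.0]
loss_um = [20.0 * t ** 0.5 for t in times_yr]

A, n = fit_power_law(times_yr, loss_um)
predicted_16yr = A * 16.0 ** n  # extrapolate beyond the observed exposure times
print(A, n, predicted_16yr)
```

Extrapolating far beyond the observed exposure times amplifies any error in the fitted exponent, which is why the claim follows the extrapolation with a calibration step.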

10. The method for predicting the loss thickness of rare earth steel based on machine learning according to claim 3, characterized in that the rare earth steel is rare earth microalloyed low alloy steel, rare earth stainless steel, or rare earth heat-resistant steel.