A method for predicting creep life of high-temperature alloy based on machine learning and symbolic regression
By constructing a creep life prediction model for high-temperature alloys using machine learning and symbolic regression methods, the problems of long creep performance testing time and high cost are solved, achieving efficient creep life prediction and interpretability of creep mechanism, thus improving material design efficiency.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- SICHUAN UNIV
- Filing Date
- 2024-04-09
- Publication Date
- 2026-06-30
AI Technical Summary
Experiments on the creep properties of high-temperature alloys are time-consuming and costly. Existing technologies struggle to provide efficient and interpretable predictions of creep life and lack a deep understanding of the creep mechanism.
By employing machine learning and symbolic regression methods, combined with thermodynamic software data, and through feature parameter selection and model training, an interpretable creep life prediction model for high-temperature alloys is constructed. Finally, a creep life prediction formula is established using a symbolic regression algorithm based on genetic programming and a least squares fitting formula.
It reduces creep life prediction time and economic costs, improves material design efficiency, and enables interpretable prediction of creep performance and in-depth exploration of creep mechanisms. The model exhibits excellent prediction performance and good generalization ability.
Figure CN118522369B_ABST
Description
Technical Field
[0001] This invention belongs to the field of high-temperature alloys and specifically relates to a high-temperature alloy creep life prediction model based on machine learning and symbolic regression.
Background Technology
[0002] High-temperature alloys are alloys that possess high strength and good resistance to oxidation and gas corrosion in the temperature range of 650–1000℃. They exhibit excellent thermal and mechanical stability at high temperatures, enabling them to operate near their melting points. High-temperature alloys typically need to withstand both high temperatures and high stress levels during service, thus requiring extremely high high-temperature strength and creep performance. Creep performance is a crucial mechanical property determining the behavior of high-temperature alloy components at high temperatures; however, the time-consuming and costly nature of creep testing for alloys has limited their development.
[0003] With the development and popularization of computer technology, mathematical modeling and machine learning methods have been widely applied in the field of materials performance prediction. Machine learning can uncover intrinsic information about materials and establish relevant relationships through data analysis. Using computational techniques to predict creep properties can greatly save experimental time and accelerate materials development. However, black-box prediction alone cannot reveal the intrinsic relationship between material microstructure and properties. In recent years, the demand for interpretable machine learning methods has surged, and symbolic regression has become a research hotspot. Based on existing empirical knowledge and intrinsic theories of materials, it exploits the mapping relationship between data and machine learning models, combining mathematical operators with material characteristics to regress an explicit mathematical expression relating the target quantity to the material characteristic inputs, thereby constructing an interpretable materials performance prediction model.
[0004] Therefore, using machine learning combined with symbolic regression to construct an interpretable creep life prediction model for high-temperature alloys can not only achieve reliable prediction of the creep life of alloys, but also provide guidance for the study of the creep mechanism of high-temperature alloys.
Summary of the Invention
[0005] The purpose of this invention is to provide a method for predicting the creep life of high-temperature alloys based on machine learning and symbolic regression. The method aims to predict the creep life of alloys while simultaneously providing a mechanistic explanation of the relationship between the creep properties and microstructure of high-temperature alloys.
[0006] To achieve the above objectives, the present invention employs the following technical solution:
[0007] Step 1: Use thermodynamic software, including but not limited to JMatPro, to calculate high-temperature alloy data. Parameters that the software cannot provide directly are derived separately, giving the creep-related characteristic parameters of the alloy, which are then normalized.
[0008] Step 2: Using the feature parameters as input and the creep life of the high-temperature alloy as output, select the most representative feature parameters from the feature groups with strong correlations through methods such as correlation screening.
[0009] Step 3: Divide the dataset into training and testing sets, and construct multiple learning models with high-temperature alloy creep life as the output to further filter feature parameters. Then, use the optimal subset idea to obtain the optimal combination of feature parameters.
[0010] Step 4, symbolic regression modeling, includes the following sub-steps.
[0011] (1) Input the filtered feature parameters, randomly divide 80% of the data into the training set and 20% into the test set, and standardize the modeling set data; construct the formula model based on the symbolic regression algorithm (GPSR) of genetic programming.
[0012] (2) Candidate formula models are obtained by screening based on the best subset results and prior knowledge of the materials;
[0013] (3) The coefficients and constants of the formula model are fitted using the least squares method to obtain the final formula model.
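Sub-step (1) of Step 4 can be sketched as follows. This is a minimal illustration in plain Python and not the implementation of the invention: the function names `split_train_test` and `standardize`, the fixed random seed, and the choice to compute the mean and standard deviation on the training set only are assumptions, and the data are made up.

```python
import random
import statistics

def split_train_test(rows, train_fraction=0.8, seed=0):
    """Randomly split rows into a training set and a test set."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]

def standardize(train, test):
    """Scale each column to zero mean and unit variance (training statistics)."""
    columns = list(zip(*train))
    means = [statistics.fmean(c) for c in columns]
    # Fall back to 1.0 for constant columns to avoid division by zero.
    stds = [statistics.pstdev(c) or 1.0 for c in columns]

    def scale(row):
        return [(v - m) / s for v, m, s in zip(row, means, stds)]

    return [scale(r) for r in train], [scale(r) for r in test]

# Illustrative data only: 10 samples with 2 made-up feature values each.
rows = [[float(i), float(i * i)] for i in range(10)]
train, test = split_train_test(rows)
train_std, test_std = standardize(train, test)
```

Reusing the training statistics for the test set keeps information from the test samples out of the model, which is one common convention; the source does not state which convention is used.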
[0014] A further improvement of the present invention is that:
[0015] Preferably, the creep-related characteristic parameters in step 1 include, but are not limited to, tissue structure parameters and energy parameters.
[0016] Preferably, the feature parameters are normalized using the following formula:
[0017] Y' = ln(Y+1)
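The transform Y' = ln(Y+1) can be illustrated with a short sketch. The function names and the sample values are hypothetical, and the inverse transform is added here only for illustration; the source states the forward formula only.

```python
import math

def log_transform(y):
    """Forward transform Y' = ln(Y + 1)."""
    return math.log1p(y)

def inverse_log_transform(y_prime):
    """Inverse transform Y = exp(Y') - 1 (not stated in the source)."""
    return math.expm1(y_prime)

# Made-up values spanning several orders of magnitude.
values = [0.0, 9.0, 999.0]
transformed = [log_transform(v) for v in values]
restored = [inverse_log_transform(t) for t in transformed]
```

Adding 1 before taking the logarithm keeps the transform defined at Y = 0 and compresses values that span several orders of magnitude.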
[0018] Preferably, the correlation coefficient in step 2 can be obtained using the following formula:
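[0019] $r = \dfrac{\sum_{i=1}^{n} (y_i - \bar{y})(\hat{y}_i - \bar{\hat{y}})}{\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}\,\sqrt{\sum_{i=1}^{n} (\hat{y}_i - \bar{\hat{y}})^2}}$, where $y_i$ is the true value of creep life for the i-th sample, $\bar{y}$ is the average creep life, $\hat{y}_i$ is the predicted value of creep life for the i-th sample, $\bar{\hat{y}}$ is the average predicted creep life, and $n$ is the number of data samples.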
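A Pearson correlation coefficient of the kind referred to in step 2 can be computed as in the sketch below. The function name `pearson_r` and the sample values are assumptions for illustration; the same function applies to any two equal-length series, for example a feature parameter and the creep life.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)

# Made-up values: a perfectly linear and a perfectly inverse relationship.
r_positive = pearson_r([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
r_negative = pearson_r([1.0, 2.0, 3.0], [3.0, 2.0, 1.0])
```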
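[0020] Preferably, in step 3 a machine learning model is trained on the high-temperature alloy characteristic parameters obtained in step 2 to obtain the correlation coefficient R² and the mean absolute error (MAE);
[0021] The correlation coefficient R² is calculated as follows:
[0022] $R^2 = 1 - \dfrac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}$
[0023] The mean absolute error is calculated as follows:
[0024] $\mathrm{MAE} = \dfrac{1}{n} \sum_{i=1}^{n} \left| y_i - \hat{y}_i \right|$, where $n$ is the number of data samples, $y_i$ is the true value of creep life, $\hat{y}_i$ is the predicted value of creep life, and $\bar{y}$ is the average creep life.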
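The two evaluation metrics can be sketched as follows, assuming that R² denotes the standard coefficient of determination and that MAE is the plain average of absolute residuals. The function names and the sample values are hypothetical.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between true and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

# Made-up true and predicted values.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.5, 2.0, 3.0, 3.5]
r2 = r2_score(y_true, y_pred)
mae = mean_absolute_error(y_true, y_pred)
```

An R² close to 1 together with an MAE close to 0 indicates better prediction performance, which is the criterion the claims use when comparing models.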
[0025] Preferably, the final feature parameters obtained by taking the intersection of the parameter sets selected by the machine learning model are input into the symbolic regression model.
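Taking the intersection of the parameter sets selected by the individual models can be sketched with Python sets. The per-model selections below are invented for illustration and are not results from the invention; only the feature names follow the notation used in this description.

```python
# Hypothetical selections; each set holds the features one model kept.
selected_by_model = {
    "SVC": {"V_gamma_prime", "G", "Gamma", "APB"},
    "BPNN": {"V_gamma_prime", "G", "Gamma", "V_TCP"},
    "SVR": {"V_gamma_prime", "G", "Gamma", "T_gamma_prime"},
}

# Keep only the features that every model selected.
final_features = set.intersection(*selected_by_model.values())
```

A feature survives only if all models agree on it, so the intersection is a conservative choice: it favors a small, robust input set for the symbolic regression step.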
[0026] A symbolic regression algorithm based on genetic programming is used to construct the formula model, and the fitness of an individual is evaluated using the following formula:
[0027] fitness = r - C * len(X);
[0028] Where fitness is the individual's fitness, r is the correlation coefficient, C is the penalty coefficient, and len(X) is the number of nodes in the individual's syntax tree.
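The fitness evaluation can be sketched as below, assuming the formula is read as the correlation coefficient r minus a length penalty C * len(X); this reading is an interpretation. The nested-tuple representation of the syntax tree, the function names, and the numeric values are hypothetical.

```python
def count_nodes(tree):
    """Count the nodes of a syntax tree given as nested tuples.

    An operator node is a tuple (name, child, child, ...); a leaf is a
    variable name or a numeric constant.
    """
    if isinstance(tree, tuple):
        return 1 + sum(count_nodes(child) for child in tree[1:])
    return 1

def penalized_fitness(r, tree, penalty):
    """Assumed form: correlation r minus penalty times the tree size."""
    return r - penalty * count_nodes(tree)

# Hypothetical individual representing (x0 * x1) + 2.0, which has 5 nodes.
individual = ("add", ("mul", "x0", "x1"), 2.0)
size = count_nodes(individual)
fitness = penalized_fitness(r=0.9, tree=individual, penalty=0.01)
```

Under this reading, two formulas with the same correlation are ranked by size, so the search is steered toward shorter and more interpretable expressions.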
[0029] Preferably, the coefficients and constants in the formula model are obtained by methods including, but not limited to, fitting the data using the least squares method. The formula for the least squares method is shown below:
[0030] $E = \sum_{i=1}^{m} (y_i - \hat{y}_i)^2$
[0031] Where $E$ is the value to be fitted (the sum of squared residuals to be minimized), $m$ is the number of data samples in the training set, $y_i$ represents the true value of the i-th sample, and $\hat{y}_i$ represents the predicted value of the i-th sample;
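Least squares fitting of coefficients and a constant can be sketched for a model that is linear in its coefficients. This is a minimal plain-Python illustration using the normal equations, not the fitting routine of the invention; the function names and the synthetic data are assumptions.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def fit_least_squares(features, targets):
    """Return [w_1, ..., w_k, constant] minimizing E = sum (y_i - yhat_i)^2."""
    design = [list(row) + [1.0] for row in features]
    k = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(design, targets)) for i in range(k)]
    return solve_linear_system(xtx, xty)

def squared_error(features, targets, coeffs):
    """Sum of squared residuals E for the given coefficients."""
    preds = [sum(w * v for w, v in zip(coeffs, list(row) + [1.0])) for row in features]
    return sum((y - p) ** 2 for y, p in zip(targets, preds))

# Synthetic data generated from y = 2*x1 - 3*x2 + 4 (made up for the demo).
features = [[1.0, 2.0], [2.0, 1.0], [3.0, 5.0], [4.0, 3.0], [5.0, 8.0]]
targets = [2.0 * a - 3.0 * b + 4.0 for a, b in features]
coeffs = fit_least_squares(features, targets)
error = squared_error(features, targets, coeffs)
```

For a formula that is nonlinear in its coefficients, the same objective E would have to be minimized with an iterative optimizer instead of the normal equations.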
[0032] Two formula models were finally established.
[0033] Compared with the prior art, the present invention has the following advantages:
[0034] This invention discloses an interpretable creep life prediction formula model based on machine learning and symbolic regression. The model is based on a dataset calculated using thermodynamic software, considers the influence of microstructure on the creep properties of high-temperature alloys, and is constructed by screening the key parameters governing the alloy's creep life. The model's accuracy and generalization were then verified using collected datasets. The model not only reduces the time and economic cost of creep life prediction and improves material design efficiency, but also demonstrates excellent prediction performance and can explain how the individual parameters affect the creep rupture life, which is beneficial for in-depth research into the creep mechanism of high-temperature alloys.
Attached Figure Description
[0035] Figure 1 is a flowchart of a method for predicting the creep life of high-temperature alloys based on machine learning and symbolic regression.
[0036] Figure 2 is a Pearson correlation diagram of characteristic parameters and creep life in this invention.
[0037] Figure 3 shows the results calculated using different learning models in this invention; (a) is the result calculated using the SVC machine learning model; (b) is the result calculated using the BPNN machine learning model; (c) is the result calculated using the SVR machine learning model; and (d) is the result calculated using the optimal subset of the SVR model.
[0038] Figure 4 is a comparison chart of the values calculated by JMatPro software and the values predicted by the nonlinear formula model established by machine learning and symbolic regression methods, using existing alloy grade data in this invention.
[0039] Figure 5 is a comparison chart of the creep life predicted by the two formula models established in this invention based on machine learning and symbolic regression and the actual experimental values, using the creep dataset from Beijing University of Aeronautics and Astronautics.
Detailed Implementation
[0040] After initial screening using Pearson correlation and random forest importance ranking, the MD value and Young's modulus E were removed from the original 15 feature parameters, leaving 13 feature parameters for secondary screening, including the γ′ phase volume fraction Vγ′, TCP phase volume fraction V-TCP, carbide volume fraction VC, stacking fault energy Γ, antiphase domain boundary energy APB, shear modulus G, T_L, Tγ′, and the Burgers vector.
[0041] Based on the BPNN, SVR, and SVC machine learning models and combined with prior knowledge of materials science, the optimal feature parameter combination Vγ′, APB, δ, Γ, T_L, Tγ′, and G was obtained through secondary screening and is used for symbolic regression modeling;
[0042] The present invention will now be described in further detail with reference to the accompanying drawings.
[0043] The overall framework of this invention is shown in Figure 1: a method for predicting the creep life of high-temperature alloys is constructed by combining machine learning with symbolic regression.
[0044] S1. Use JMatPro software to construct a high-temperature alloy creep dataset, including characteristic parameter data and creep life data. The characteristic parameters include microstructure parameters such as the γ′ phase volume fraction Vγ′ and the TCP phase volume fraction V-TCP, as well as energy parameters such as the stacking fault energy Γ and the antiphase domain boundary energy APB.
[0045] S2. As shown in Figure 2, the preprocessed data are screened using Pearson correlation calculation together with a random forest importance ranking whose hyperparameters are tuned by Bayesian optimization, aiming to select representative feature parameters from the two clusters of strongly correlated feature parameters.
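The screening in S2 can be sketched as follows. This is a minimal illustration and not the implementation of the invention: the pairwise correlations and importance scores are made-up inputs assumed to come from a Pearson calculation and a random forest trained elsewhere, the use of the absolute value of the correlation is an assumption, and all function names and thresholds are hypothetical.

```python
def correlated_clusters(features, corr, threshold):
    """Group features linked by |correlation| > threshold (union-find)."""
    parent = {f: f for f in features}

    def find(f):
        while parent[f] != f:
            parent[f] = parent[parent[f]]  # path compression
            f = parent[f]
        return f

    for (a, b), r in corr.items():
        if abs(r) > threshold:
            parent[find(a)] = find(b)
    clusters = {}
    for f in features:
        clusters.setdefault(find(f), []).append(f)
    # Only groups with at least two members count as correlated clusters.
    return [c for c in clusters.values() if len(c) > 1]

def screen_features(features, corr, importance, corr_threshold, importance_threshold):
    """Drop low-importance features, but only inside correlated clusters."""
    dropped = set()
    for cluster in correlated_clusters(features, corr, corr_threshold):
        dropped.update(f for f in cluster if importance[f] < importance_threshold)
    return [f for f in features if f not in dropped]

# Made-up inputs for illustration only.
features = ["E", "G", "V_gamma_prime", "APB"]
corr = {("E", "G"): 0.97, ("V_gamma_prime", "APB"): 0.40, ("E", "APB"): 0.10}
importance = {"E": 0.02, "G": 0.30, "V_gamma_prime": 0.40, "APB": 0.01}
kept = screen_features(features, corr, importance, 0.9, 0.05)
```

In this toy input only E and G form a strongly correlated cluster, so E is dropped for its low importance while APB is kept because it does not belong to any cluster.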
[0046] S3. As shown in Figure 3(a), a support vector machine (SVC) model is constructed using machine learning. The feature parameters obtained in step S2 are sequentially removed, and the changes in model accuracy are observed. If the model accuracy decreases significantly after removing a certain parameter, that parameter is an important feature parameter.
[0047] S4. As shown in Figure 3(b), the model is constructed using the BP neural network (BPNN) algorithm in machine learning, and the model accuracy is observed by adopting the same sequential removal strategy.
[0048] S5. As shown in Figure 3(c), the model is constructed using the support vector regression (SVR) algorithm in machine learning, and the model accuracy is observed by adopting the same sequential removal strategy.
[0049] There is no specific order among steps S3, S4, and S5;
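The sequential removal strategy used in S3 to S5 can be sketched independently of the model type. The sketch below assumes that features are removed one at a time from the full set and that a caller-supplied `score_fn` returns the model accuracy for a given feature subset; the toy scoring function, the feature contributions, and the `min_drop` threshold are invented for illustration.

```python
def removal_impact(features, score_fn):
    """Return the baseline score and the score drop caused by removing each feature."""
    baseline = score_fn(tuple(features))
    impact = {}
    for f in features:
        reduced = tuple(x for x in features if x != f)
        impact[f] = baseline - score_fn(reduced)
    return baseline, impact

def important_features(features, score_fn, min_drop):
    """Keep features whose removal lowers the score by at least min_drop."""
    _, impact = removal_impact(features, score_fn)
    return [f for f in features if impact[f] >= min_drop]

# Toy stand-in for training and scoring a model (SVC, BPNN, or SVR in the text).
contribution = {"V_gamma_prime": 0.20, "G": 0.15, "V_C": 0.01}

def toy_score(subset):
    return 0.5 + sum(contribution[f] for f in subset)

names = ["V_gamma_prime", "G", "V_C"]
baseline, impact = removal_impact(names, toy_score)
important = important_features(names, toy_score, min_drop=0.05)
```

In practice `score_fn` would retrain the chosen model on the reduced feature set and return a validation metric such as R².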
[0050] S6. As shown in Figure 3(d), the optimal subset model is constructed using ten-fold cross-validation of the support vector regression (SVR) algorithm, aiming to find the number and combination of feature parameters that give high model accuracy and to provide a reference for subsequent symbolic regression modeling;
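The optimal subset search in S6 can be sketched as an exhaustive search over feature combinations scored by cross-validation. In this minimal illustration `k_fold_indices` shows how ten folds could be formed, while `cv_score` is a toy stand-in for the ten-fold cross-validated SVR accuracy; the function names, the scoring rule, and the feature contributions are assumptions.

```python
from itertools import combinations

def k_fold_indices(n_samples, k=10):
    """Yield (train_idx, test_idx) pairs for k contiguous folds."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test_idx = list(range(start, start + size))
        train_idx = [i for i in range(n_samples) if not start <= i < start + size]
        yield train_idx, test_idx
        start += size

def best_subsets(features, cv_score):
    """For each subset size, return (score, subset) with the highest score."""
    best = {}
    for size in range(1, len(features) + 1):
        scored = [(cv_score(s), s) for s in combinations(features, size)]
        best[size] = max(scored)
    return best

# Toy score: useful features add accuracy, every extra feature costs a little.
useful = {"V_gamma_prime": 0.4, "G": 0.3}

def cv_score(subset):
    return sum(useful.get(f, 0.0) for f in subset) - 0.02 * len(subset)

candidates = ["V_gamma_prime", "G", "V_C"]
best = best_subsets(candidates, cv_score)
best_size = max(best, key=lambda size: best[size][0])
folds = list(k_fold_indices(25, k=10))
```

A real `cv_score` would loop over the folds, train the SVR on each training part, and average the accuracy on the held-out parts. Exhaustive search grows as 2^n in the number of features, which is why it is applied only after the earlier screening steps.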
[0051] S7. A formulaic model is established using symbolic regression based on a genetic algorithm, and the model form is determined as follows:
[0052] (1) Nonlinear Model
[0053]
[0054] (2) Linear Model
[0055] t_f = a2*Vγ′ + b2*G - c2*Γ - d2*Tγ′ - e2
[0056] Where a1, b1, c1, d1, e1, and f1 are the coefficients to be fitted, and h1 is the constant to be fitted (the same applies to a2-h2). Finally, the least squares method is used to fit the model to obtain the final formula.
[0057] Figure 4 compares the creep life predicted by the model of this invention with the theoretical calculation value. The model achieves good prediction results and, compared with a black-box model, has the advantage of interpretability.
[0058] Figure 5 compares the model predictions with experimental values using data from the Beihang University materials database. The nonlinear model of this invention demonstrates superior predictive performance, with a prediction error of less than 20%. While the linear model cannot accurately predict extreme creep life, its simple structure and convenient calculation make it suitable for conservative estimation of alloy creep life.
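A check of the kind "prediction error of less than 20%" can be sketched as below. The source does not state how the error is defined, so this sketch assumes a relative error |predicted - measured| / |measured| on the creep life itself; the function names and the numeric values are made up.

```python
def relative_errors(predicted, measured):
    """Relative error of each prediction with respect to the measured value."""
    return [abs(p - m) / abs(m) for p, m in zip(predicted, measured)]

def within_band(predicted, measured, band=0.20):
    """True if every prediction lies within the given relative error band."""
    return all(e < band for e in relative_errors(predicted, measured))

# Made-up creep lives in hours, for illustration only.
measured = [100.0, 250.0, 400.0]
predicted = [110.0, 230.0, 460.0]
errors = relative_errors(predicted, measured)
ok = within_band(predicted, measured)
```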
[0059] As can be seen from the examples above, the model of this invention, while ensuring the accuracy and generalization of model predictions, also achieves visualization of model predictions and interpretability of the alloy creep life change mechanism. The model construction methods described above are not intended to limit this invention. Any modifications, equivalent substitutions, or improvements made within the spirit and principles of this invention should be included within the scope of protection of this invention.
Claims
1. A method for predicting the creep life of high-temperature alloys based on machine learning and symbolic regression, characterized in that it includes the following steps:
Step 1: Dataset construction. Based on prior knowledge of materials science, the sampling space is determined. High-temperature alloy data with random compositions are calculated using thermodynamic software, a high-temperature alloy creep-related dataset is constructed, and the data are normalized. The high-temperature alloy creep-related dataset is alloy data obtained by calculation over the optimal composition space using thermodynamic software.
Step 2: Feature parameter filtering;
Step 2.1: Select clusters of strongly correlated feature parameters by Pearson correlation, where strong correlation means that the correlation coefficient is greater than the set threshold. Use random forest importance ranking to filter out, from the same cluster, feature parameters with importance lower than the set threshold, and retain feature parameters with importance greater than or equal to the set threshold. The correlation coefficient $r$ is calculated as follows: $r = \dfrac{\sum_{i=1}^{n} (y_i - \bar{y})(\hat{y}_i - \bar{\hat{y}})}{\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}\,\sqrt{\sum_{i=1}^{n} (\hat{y}_i - \bar{\hat{y}})^2}}$; where $y_i$ is the true value of creep life, $\bar{y}$ is the average creep life, $\hat{y}_i$ is the predicted value of creep life, $\bar{\hat{y}}$ is the average predicted creep life, $n$ is the number of data samples, and $i$ denotes the i-th prediction sample;
Step 2.2: Divide the dataset corresponding to the feature parameters selected in Step 2.1 into training and testing sets, construct machine learning models with the creep life of the high-temperature alloy as the output, and calculate the correlation coefficient R² and the mean absolute error (MAE). The closer R² is to 1 and the closer the MAE is to 0, the better the model's prediction performance. Each model has a set correlation coefficient threshold and a set mean absolute error threshold, and the optimal parameters for each model are selected based on these thresholds. The optimal parameter combination is obtained by taking the intersection of the feature parameters based on prior knowledge of the materials and the screening results of several machine learning models.
Step 3: Construct the formula model and determine the coefficients and constants of the formula model;
Step 3.1: Construct a syntax tree with the feature parameters as input and the creep life of the high-temperature alloy as output, where the higher the fitness of an individual, the higher its accuracy. Construct multiple formula models for predicting creep life, based on the symbolic regression method of genetic programming, as candidate formula models.
Step 3.2: Determine the final formula model form from the candidate formula models based on prior knowledge of the materials;
Step 3.3: Fit the coefficients and constants of the selected formula model to obtain the final formula model;
Step 4: Calculate the creep life of the target high-temperature alloy using the formula model constructed in Step 3.
2. The method for predicting the creep life of high-temperature alloys based on machine learning and symbolic regression according to claim 1, characterized in that the correlation coefficient R² in step 2 is calculated as follows: $R^2 = 1 - \dfrac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}$; where $n$ is the number of data samples, $y_i$ is the true value of creep life, $\hat{y}_i$ is the predicted value of creep life, and $\bar{y}$ is the average creep life.
3. The method for predicting the creep life of high-temperature alloys based on machine learning and symbolic regression according to claim 1, characterized in that the mean absolute error (MAE) is calculated as follows: $\mathrm{MAE} = \dfrac{1}{n} \sum_{i=1}^{n} \left| y_i - \hat{y}_i \right|$; where $n$ is the number of data samples, $y_i$ is the true value of creep life, and $\hat{y}_i$ is the predicted value of creep life.
4. The method for predicting the creep life of high-temperature alloys based on machine learning and symbolic regression according to claim 1, characterized in that, in step 2.2, multiple machine learning models are used, the feature parameters selected by each machine learning model are determined, and the intersection of these selections is taken as the final selection result.
5. The method for predicting the creep life of high-temperature alloys based on machine learning and symbolic regression according to claim 1, characterized in that, in step 3.1, when constructing the formula model for predicting creep life using the symbolic regression method based on genetic programming, the following formula is used to evaluate the fitness of an individual: fitness = r - C * len(X); where fitness is the individual's fitness, r is the correlation coefficient, C is the penalty coefficient, and len(X) is the number of nodes in the individual's syntax tree; Step 3.3 uses the least squares method to fit the model coefficients and constants; the formula for the least squares method is shown below: $E = \sum_{i=1}^{m} (y_i - \hat{y}_i)^2$; where $E$ is the value to be fitted, $m$ is the number of data samples in the training set, $y_i$ is the true value of the i-th sample, and $\hat{y}_i$ is the predicted value of the i-th sample.
6. An information data processing terminal for implementing the high-temperature alloy creep life prediction method according to any one of claims 1-5.
7. A computer-readable storage medium comprising instructions, when executed on a computer, causing the computer to perform the high-temperature alloy creep life prediction method as described in any one of claims 1-5.