A method and system for predicting the fatigue life of a shaft part repaired by laser cladding
An adaptive feature extraction framework combined with the XGBoost algorithm addresses the insufficient fatigue life prediction accuracy of traditional methods under complex working conditions, and enables high-precision real-time fatigue life prediction and dynamic model updating for laser-repaired shaft parts.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- TIANJIN UNIV
- Filing Date
- 2025-06-05
- Publication Date
- 2026-06-26
AI Technical Summary
Traditional methods for predicting the fatigue life of mechanical parts have limitations when dealing with complex and variable actual working conditions. They cannot achieve real-time prediction and dynamic updates, and traditional models are difficult to effectively handle high-dimensional features and multi-factor coupling, resulting in low prediction accuracy and inability to adapt to the iteration of laser repair processes or the application of new materials.
We employ a method based on feature extraction and the XGBoost algorithm. By constructing an adaptive feature extraction framework driven by both mechanism and data, and combining online learning and feedback optimization mechanisms, we build an XGBoost fatigue life prediction model. We dynamically adjust hyperparameters and set regularization constraints to screen out key feature parameters, thereby achieving effective processing of high-dimensional heterogeneous data.
It improves the accuracy of fatigue life prediction under complex working conditions, ensures that the prediction results conform to the laws of actual engineering applications, and realizes real-time model updates and high-precision prediction.
[Figure CN120670818B_ABST: abstract drawing]
Abstract
Description
Technical Field
[0001] This invention relates to the technical field of life prediction for laser-repaired shaft components, and specifically to a method and system for predicting the fatigue life of laser-repaired shaft components based on feature extraction and the XGBoost algorithm.
Background Technology
[0002] In the fields of rail transit, wind power equipment, marine equipment, and mechanical engineering, shaft parts (such as axles, main shafts, drive shafts, crankshafts, rotor shafts, etc.) are core components of power transmission systems, bearing key power and motion transmission functions, and they play a vital role in various mechanical equipment.
[0003] These types of shaft parts frequently fail and become unusable due to service wear, scratches, and impacts. Traditional methods for predicting the fatigue life of mechanical parts mainly rely on physical models and experimental data. These methods typically require significant time and resources and have limitations when dealing with complex and variable real-world operating conditions. Furthermore, statically deployed models cannot be updated online, causing predictive performance to degrade over time. Therefore, traditional methods struggle to handle complex nonlinear relationships and cannot achieve real-time prediction and dynamic updates, thus affecting prediction accuracy.
[0004] Current methods for predicting the fatigue life of repaired shaft components still have the following problems:
[0005] (1) Predicting the fatigue life of repaired shaft parts is a complex multi-factor coupled problem. Its failure mechanism is jointly affected by multi-source heterogeneous features such as cladding layer composition, laser power, spot diameter, cladding speed, and overlap rate. Traditional prediction methods are difficult to effectively handle high-dimensional features, and the feature selection of model variables depends on manual experience and lacks theoretical support, resulting in low prediction accuracy.
[0006] (2) Traditional fatigue life prediction models for parts are mainly based on classical mechanics theory, such as the S-N curve method (stress-life method), the local strain method, and fracture mechanics models (Paris formula). However, these methods rely too heavily on simplifying assumptions and ignore complex working conditions and multi-factor coupling, resulting in insufficient prediction accuracy and poor adaptability. In recent years, machine learning has provided new approaches to fatigue life prediction; for example, neural networks and support vector machines have been used to establish nonlinear mapping relationships. However, such models have poor interpretability, require large amounts of data, are sensitive to outliers, and lack real-time performance, making them difficult to adapt to industrial scenarios.
[0007] (3) When laser repair processes iterate or new materials are applied, traditional models need to be recalibrated experimentally, which is time-consuming and costly. In addition, most existing prediction models are trained offline and deployed statically, and cannot be continuously updated with online data.
[0008] XGBoost (eXtreme Gradient Boosting) is a high-efficiency machine learning algorithm based on the gradient boosting framework. It constructs a strong learner by integrating multiple weak learners (typically decision trees), demonstrating outstanding performance in regression, classification, and ranking tasks. Its core advantage lies in combining the iterative optimization concept of gradient boosting with multiple engineering optimization techniques, making it one of the preferred algorithms for machine learning competitions and industrial applications.
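For reference, the regularized objective usually given for gradient-boosted trees of this kind can be written as follows. The symbols are not taken from this document, except that α and λ correspond to the regularization parameters named later in step S43; the loss function l and the leaf-count penalty γ are stated here only as the commonly used general form.

```latex
\mathcal{L} = \sum_{i=1}^{n} l\left(y_i, \hat{y}_i\right) + \sum_{k=1}^{K} \Omega\left(f_k\right),
\qquad
\Omega(f) = \gamma T + \tfrac{1}{2}\,\lambda \lVert w \rVert_2^2 + \alpha \lVert w \rVert_1
```

Here n is the number of samples, K the number of trees, T the number of leaves of a tree f, and w its vector of leaf weights.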
[0009] Therefore, this invention combines feature extraction with the XGBoost algorithm to improve fatigue life prediction for repaired shaft parts.
Summary of the Invention
[0010] This invention addresses the shortcomings of existing technologies by providing a method and system for predicting the fatigue life of laser cladding repair shaft parts based on feature extraction and the XGBoost algorithm. The prediction method and system, through the construction of an adaptive feature extraction framework driven by both mechanism and data, and the integration of online learning and feedback optimization mechanisms, overcome the challenges of low prediction accuracy and poor generalization ability under high-dimensional heterogeneous data coupling. This is of great significance for improving the accuracy of fatigue life prediction under complex working conditions.
[0011] To achieve the above objectives, the first aspect of this invention provides a method for predicting the fatigue life of laser cladding repair shaft parts based on feature extraction and the XGBoost algorithm, comprising:
[0012] Step 1: For shaft-type part specimens prepared using typical laser cladding repair process parameters, micro-area characterization techniques (micro-CT) are used to obtain the micro-defect characteristics of each specimen and the heterogeneous microstructure characteristics of the cladding layer-heat-affected zone-matrix microstructure. Rotational bending fatigue tests are conducted with different fatigue amplitudes and a stress ratio of -1. Using ultra-depth-of-field electron microscopy and scanning electron microscopy, the crack initiation location and propagation path of each fatigue specimen are statistically analyzed, and the rotational bending fatigue life data corresponding to the micro-defect characteristics and heterogeneous microstructure characteristics are obtained. Fatigue life labels are established for the data. The micro-defect characteristics include the number, shape, size, and distribution characteristics of the micro-defects in the fatigue specimens. The heterogeneous microstructure characteristics include grain size, grain orientation, texture intensity, austenite / martensite content, and residual stress.
[0013] Step 2: Based on the above microstructure characterization and fatigue test results of the laser cladding repaired axles, establish an initial dataset including the characteristic parameters of micro-defects in the cladding layer, the characteristic parameters of the heterogeneous microstructure, and the fatigue life data of the axles; preprocess the initial dataset and reduce its dimensionality to obtain a dataset with a balanced number of samples across the different fatigue lives of the repaired axles.
[0014] Step 3: Form a dataset from the dimensionality-reduced data, build a feature extraction model, and train it using the obtained dataset; use a random forest regression model to quantify feature importance and identify key features;
[0015] Step 4: Construct a fatigue life prediction model based on the XGBoost algorithm. The key feature variables obtained in Step 3 are integrated with the fatigue life labels obtained in Step 1 to form a structured dataset for training. The correlation between the feature parameters of micro-defects in the cladding layer and the feature parameters of the heterogeneous microstructure is evaluated using dynamic hyperparameter adjustment and regularization constraints. Redundant features with high correlation are removed, leaving highly independent feature parameters that characterize the micro-defects and the heterogeneous microstructure. Then, SHAP values are used to analyze the importance of the feature parameters for the fatigue life of the axle, and the key feature parameters affecting the fatigue performance of laser cladding repaired axles, together with their weights, are selected.
[0016] By screening out key characteristic parameters that affect the fatigue performance of laser cladding repaired axles, the influence of these key characteristic parameters is quantified into fatigue crack tip stress intensity factors according to their weights, and the remaining fatigue life is predicted based on the fracture mechanics framework.
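As an illustration of the fracture mechanics step, the following minimal sketch integrates a Paris-type crack growth law, da/dN = C(ΔK)^m with ΔK = Y·Δσ·√(πa), from an initial to a final crack size. The text does not state which growth law, geometry factor, or material constants are used, nor how the feature weights enter the stress intensity factor; the Paris form (named in the background section), the constant geometry factor `Y`, and every numerical value below are assumptions chosen only for illustration.

```python
import math

def paris_remaining_cycles(a0, af, delta_sigma, C, m, Y=1.0, steps=20000):
    """Numerically integrate dN = da / (C * dK**m) from crack size a0 to af.

    a0, af are in metres, delta_sigma in MPa, and C in units consistent
    with dK in MPa*sqrt(m). Y is an assumed constant geometry factor.
    """
    if not (0 < a0 < af):
        raise ValueError("require 0 < a0 < af")
    h = (af - a0) / steps
    total = 0.0
    for i in range(steps):
        a = a0 + (i + 0.5) * h                      # midpoint of the sub-interval
        dK = Y * delta_sigma * math.sqrt(math.pi * a)
        total += h / (C * dK ** m)
    return total

def paris_closed_form(a0, af, delta_sigma, C, m, Y=1.0):
    """Closed-form integral, valid for constant Y and m != 2."""
    k = C * (Y * delta_sigma * math.sqrt(math.pi)) ** m
    p = 1.0 - m / 2.0
    return (af ** p - a0 ** p) / (k * p)

# Hypothetical inputs: 0.5 mm initial defect, 5 mm critical crack size.
cycles_numeric = paris_remaining_cycles(0.5e-3, 5e-3, delta_sigma=200.0, C=1e-11, m=3.0)
cycles_exact = paris_closed_form(0.5e-3, 5e-3, delta_sigma=200.0, C=1e-11, m=3.0)
print(f"numeric: {cycles_numeric:.0f} cycles, closed form: {cycles_exact:.0f} cycles")
```

The closed form is included only as a cross-check on the numerical integration; a geometry factor that varies with crack size would require the numerical route.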
[0017] In the above steps, crack initiation refers to the critical state at which a crack begins to propagate in a material. It is the point at which a pre-existing crack inside or on the surface of the material transitions from a stable state to unstable propagation under external loads or residual stress. This process determines the fracture toughness of the material and directly affects the safety and lifespan assessment of engineering structures.
[0018] Furthermore, step two specifically includes:
[0019] S21: Standardize the data collected in step one to make different features comparable;
[0020] S22: Principal component analysis is used to reduce the dimensionality of the standardized multidimensional features; then the original data is projected onto a low-dimensional space composed of k feature vectors to obtain the dimensionality-reduced data; the threshold k is selected based on the cumulative variance contribution rate and the scree plot inflection point criterion.
[0021] Furthermore, step three specifically includes:
[0022] S31: Form a dataset from the dimensionality-reduced data and divide it into training and test sets; build a feature extraction model and train it using the obtained dataset;
[0023] S32: Evaluate the performance of the model obtained in S31 by calculating R² and RMSE; the accuracy of the model is judged from the magnitude of these values. The R² value lies between 0 and 1, with a larger value indicating a more accurate model; the smaller the RMSE, the more accurate the model.
[0024] S33: A random forest regression model is used to quantify feature importance. The principal component importance score is calculated based on the reduction of node splitting error and then mapped back to the original feature weights to identify key features.
[0025] S34: Determine whether the result of the evaluation in step S32 meets the expected accuracy. If yes, proceed to step four; otherwise, return to S31.
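The text does not say how principal component importance scores are "mapped back to the original feature weights" in S33. One plausible reading, sketched below, distributes each component's importance over the original features in proportion to the absolute loadings of that component. The function name, the loading matrix, and the importance scores are hypothetical.

```python
def map_importance_to_features(pc_importance, loadings):
    """Spread principal component importances over the original features.

    pc_importance: list of k importance scores, one per component.
    loadings: k rows, each a list of n loadings (one per original feature).
    Returns n weights normalized to sum to 1.
    """
    n_features = len(loadings[0])
    weights = [0.0] * n_features
    for importance, row in zip(pc_importance, loadings):
        row_total = sum(abs(v) for v in row)
        if row_total == 0:
            continue                                # component carries no loading
        for j, v in enumerate(row):
            weights[j] += importance * abs(v) / row_total
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical example: two components over three original features.
pc_importance = [0.7, 0.3]
loadings = [[0.8, 0.6, 0.0],
            [0.0, 0.6, 0.8]]
feature_weights = map_importance_to_features(pc_importance, loadings)
```

With these inputs the second feature, which loads on both components, receives the largest weight.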
[0026] Furthermore, step four specifically includes:
[0027] S41: After matching the obtained key feature variables with the fatigue life labels obtained in step one, integrate them into a structured dataset;
[0028] S42: Build an XGBoost model and train it using the dataset generated in S41;
[0029] S43: Five-fold cross-validation and early stopping are used for model training and cross-validation; then Bayesian optimization is used to automatically search for the optimal hyperparameter combination for hyperparameter tuning, and regularization constraints are introduced to prevent overfitting.
[0030] S44: Verify model performance using the test set, test prediction stability using the residual distribution, and then analyze feature contribution based on SHAP values to test the obtained model until a well-trained XGBoost model is obtained.
[0031] S45: Using the trained XGBoost model, we obtain feature parameters that characterize minute defects and heterogeneous structures with high independence; then, we use SHAP values to analyze the importance of the feature parameters in the fatigue life of the axle, and screen out the key feature parameters and weights that affect the fatigue performance of the axle repaired by laser cladding.
[0032] Furthermore, for the trained XGBoost algorithm fatigue life prediction model, the method also includes: establishing a feedback mechanism, which dynamically extracts features every 24 hours through principal component analysis and random forest algorithm in steps two and three, and then inputs the new data into the fatigue life prediction model to trigger incremental training.
[0033] If the SHAP value detects a feature contribution drift exceeding a preset threshold, return to step four for retraining.
[0034] The second aspect of this invention discloses a fatigue life prediction system for laser cladding repair of shaft parts, comprising:
[0035] Multi-feature collection module: For shaft parts prepared using typical laser cladding repair process parameters, micro-area characterization methods are used to obtain the micro-defect characteristics of each specimen and the heterogeneous microstructure characteristics of the cladding layer-heat-affected zone-matrix microstructure. Rotational bending fatigue tests are conducted with different fatigue amplitudes and a stress ratio of -1. Using ultra-depth-of-field electron microscopy and scanning electron microscopy, the crack initiation location and propagation path of each fatigue specimen are statistically analyzed, and the rotational bending fatigue life data corresponding to the micro-defect characteristics and heterogeneous microstructure characteristics are obtained. Fatigue life labels are established for the data. The micro-defect characteristics include the number, shape, size, and distribution characteristics of the micro-defects in the fatigue specimen. The heterogeneous microstructure characteristics include grain size, grain orientation, texture intensity, austenite / martensite content, and residual stress.
[0036] Data preprocessing module: Based on the microstructure characterization and fatigue test results of the laser cladding repaired axles from the multi-feature collection module, an initial dataset is established, including the micro-defect feature parameters of the cladding layer, the feature parameters of the heterogeneous microstructure, and the fatigue life data of the axles; the initial dataset is preprocessed and its dimensionality is reduced to obtain a dataset with a balanced number of samples across the different fatigue lives of the repaired axles.
[0037] Feature extraction module: The data obtained by the data preprocessing module is used to form a dataset, a feature extraction model is established, and the obtained dataset is used for training; a random forest regression model is used to quantify the importance of features and identify key features;
[0038] Key feature parameter extraction module: A fatigue life prediction model based on the XGBoost algorithm is constructed. Key feature variables obtained from the feature extraction module are matched with the fatigue life labels obtained from the multi-feature collection module and integrated into a structured dataset for training. Dynamic hyperparameter adjustment and regularization constraints are used to evaluate the correlation between the feature parameters of micro-defects in the cladding layer and the feature parameters of the heterogeneous microstructure. Redundant features with high correlation are deleted, leaving highly independent feature parameters that characterize the micro-defects and the heterogeneous microstructure. Then, SHAP values are used to analyze the importance of the feature parameters for the fatigue life of the axle, and the key feature parameters affecting the fatigue performance of laser cladding repaired axles, together with their weights, are selected.
[0039] Remaining fatigue life prediction module: The key feature parameter extraction module is used to screen out the key feature parameters that affect the fatigue performance of laser cladding repair of axles. The influence of the key feature parameters is quantified into the fatigue crack tip stress intensity factor according to the weight, and the remaining fatigue life is predicted based on the fracture mechanics framework.
[0040] The beneficial effects of this invention are as follows:
[0041] The prediction method described in this invention first reduces the dimensionality of the original features affecting the fatigue life of shaft parts through principal component analysis (PCA), and then uses the random forest algorithm to evaluate the importance of the principal components and select the core features affecting fatigue life as model input variables.
[0042] A fatigue life prediction model based on the XGBoost algorithm is constructed. The model is trained and optimized by dynamically adjusting hyperparameters and setting regularization constraints. Feature contributions are analyzed using SHAP values to quantify the influence weight of key factors on fatigue life and to ensure that the prediction results conform to the laws of actual engineering applications.
Attached Figure Description
[0043] Figure 1 is a flowchart of the method for predicting the fatigue life of laser cladding repaired shaft parts described in this invention.
Detailed Implementation
[0044] To make the objectives, technical solutions, beneficial effects, and significant advancements of the embodiments of the present invention clearer, the technical solutions in the embodiments will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some, and not all, of the embodiments of the present invention. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.
[0045] In the description of this application, unless otherwise expressly specified and limited, the terms "first," "second," and "third" are used for descriptive purposes only and should not be construed as indicating or implying relative importance; the term "multiple" refers to two or more; unless otherwise specified or explained, the terms "connected," "fixed," etc., should be interpreted broadly. For example, "connected" can be a fixed connection, a detachable connection, an integral connection, or an electrical connection; "connected" can be a direct connection or an indirect connection through an intermediate medium. Those skilled in the art can understand the specific meaning of the above terms in this application according to the specific circumstances.
[0046] As shown in Figure 1, a method for predicting the fatigue life of laser cladding repaired shaft parts includes:
[0047] Step 1: Multi-source feature data acquisition
[0048] Shaft-type part specimens were prepared using typical laser cladding repair process parameters; multi-source characteristic data affecting the fatigue life of shaft-type parts after laser cladding repair were collected, including:
[0049] Geometric parameters were measured using micro-CT, including the size, location, shape, and distribution of pores in the laser cladding repair layer.
[0050] The surface roughness was measured using a surface roughness meter: the surface roughness was Ra 1.6–6.3 μm;
[0051] The properties of the cladding repair material were measured using equipment: hardness 45–62 HRC, tensile strength 1200–1800 MPa, yield strength 900–1400 MPa, and elongation 8%–18%.
[0052] Process parameters: laser power 1.1–1.5 kW, cladding speed 11–15 mm/s, spot diameter 1–3 mm, defocusing amount -2 to 2 mm, scanning speed 100–500 mm/min, powder feeding speed 8–25 g/min, preheating temperature 200–600 ℃;
[0053] Loading conditions: cyclic stress amplitude of 200–800 MPa, average stress of 100–400 MPa, load frequency of 5–20 Hz, and stress ratio of -1 to 0.5.
[0054] Simultaneously acquire measured fatigue life data as labels C1, C2, ..., Cn to ensure a one-to-one correspondence between the data. For example, C1 corresponds to a cladding layer with a maximum cross-sectional diameter of 20 mm, a surface roughness of Ra 1.6 μm, a hardness of 45 HRC, a tensile strength of 1200 MPa, a yield strength of 900 MPa, an elongation of 8%, a laser power of 1.1 kW, a cladding speed of 11 mm / s, a spot diameter of 1 mm, a defocusing amount of -2 mm, a scanning speed of 100 mm / min, a powder feeding speed of 8 g / min, a preheating temperature of 200℃, a cyclic stress amplitude of 200 MPa, an average stress of 100 MPa, a load frequency of 5 Hz, and a stress ratio of -1. C2 corresponds to a cladding layer with a maximum cross-sectional diameter of 30 mm and a surface roughness of Ra 1.6 μm. The specifications include: 2.4 μm thickness, hardness of 50 HRC, tensile strength of 1300 MPa, yield strength of 950 MPa, elongation of 9%, laser power of 1.4 kW, cladding speed of 13 mm / s, spot diameter of 1.5 mm, defocusing amount of 0 mm, scanning speed of 150 mm / min, powder feeding speed of 10 g / min, preheating temperature of 400℃, cyclic stress amplitude of 200 MPa, average stress of 100 MPa, load frequency of 5 Hz, stress ratio of -1, etc. Fatigue life is defined as the number of loading cycles corresponding to fatigue fracture of the specimen.
[0055] Step 2: Create a dataset
[0056] S21: Data Cleaning
[0057] The data collected in step one is screened. If there are missing values, they are processed using the average interpolation method. One-hot encoding is performed on the categorical variables to convert them into numerical format.
[0058] And normalize the values according to the standardization formula to eliminate the dimension difference and make different features comparable.
[0059] Standardization formula:
[0060] S22: Use the principal component analysis (PCA) method to reduce the dimension of the multi-dimensional features standardized in step S21. The idea of PCA is to map the n-dimensional features to the k-dimensional space (k < n), and the k-dimensional space is a new orthogonal feature space.
[0061] First, centralize the data: subtract the mean of each feature of the data to make the mean of the data 0, thus moving the distribution center of the data to the origin for subsequent calculation of the covariance matrix.
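[0062] Centralization formula: $x'_{ij} = x_{ij} - \bar{x}_j$, where $x_{ij}$ is the value of feature $j$ for sample $i$ and $\bar{x}_j$ is the mean of feature $j$ over all samples.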
[0063] S23: Calculate the covariance of the centralized data matrix. The covariance matrix describes the correlation between data features; obtain the eigenvalues and eigenvectors by solving the covariance matrix; the eigenvalues reflect the variance of the data in the direction of the corresponding eigenvectors. The larger the eigenvalue, the greater the degree of change of the data in that direction and the more information it contains.
[0064] S24: Then sort the eigenvalues in descending order, with the corresponding eigenvectors sorted accordingly; select the k largest eigenvalues and their corresponding eigenvectors. These eigenvectors form the basis vectors of the new low-dimensional space; project the original data into the low-dimensional space spanned by the k eigenvectors to obtain the dimensionality-reduced data. The number of principal components k is selected based on the cumulative variance contribution rate (≥95%) and the scree plot inflection point criterion, giving 5 to 15 principal components.
[0065] S25: Associate physical meanings with the selected principal component parameters, such as: 1.1 kW is associated with laser power, 45 HRC is associated with hardness, 1200 MPa is associated with tensile strength, 8% is associated with elongation, etc., so that each parameter corresponds to the actual situation one by one.
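Steps S21 to S24 can be sketched with NumPy as follows. The standardization formula is not reproduced in the text, so the usual z-score form (subtract the feature mean, divide by the feature standard deviation) is assumed here. Only the cumulative variance criterion (≥95%) is implemented; the scree plot inflection point criterion is not. The data are synthetic, and the function and variable names are hypothetical.

```python
import numpy as np

def pca_reduce(X, var_threshold=0.95):
    """Standardize X (samples x features) and project it onto the first k
    principal components, where k is the smallest number of components whose
    cumulative variance contribution reaches var_threshold."""
    X = np.asarray(X, dtype=float)
    mu = X.mean(axis=0)
    sigma = X.std(axis=0, ddof=1)
    sigma[sigma == 0] = 1.0                   # constant columns stay at zero
    Z = (X - mu) / sigma                      # assumed z-score; this also centers the data
    cov = np.cov(Z, rowvar=False)             # covariance between features
    eigvals, eigvecs = np.linalg.eigh(cov)    # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]         # re-sort in descending order
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    cum_ratio = np.cumsum(eigvals) / eigvals.sum()
    k = min(int(np.searchsorted(cum_ratio, var_threshold)) + 1, len(eigvals))
    components = eigvecs[:, :k]
    return Z @ components, components, cum_ratio[:k]

# Synthetic stand-in for the measured feature table: 40 specimens, 6 features.
rng = np.random.default_rng(0)
latent = rng.normal(size=(40, 2))
X_demo = latent @ rng.normal(size=(2, 6)) + 0.05 * rng.normal(size=(40, 6))
scores, components, cum_ratio = pca_reduce(X_demo)
print("components kept:", components.shape[1])
```

Note that this sketch projects the standardized data rather than the raw measurements, so that features with large physical units do not dominate the components.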
[0066] Step three: Random forest feature extraction
[0067] S31: Based on the data after PCA dimensionality reduction, divide the data set into a training set (80%) and a test set (20%); set the model parameters, set parameters such as the number of trees from 10 to 1000 and the maximum depth of the tree from 3 to 10, establish a feature extraction model, and train the model.
[0068] S32: Evaluate the performance of the model obtained in S31 by calculating R², RMSE, and similar metrics; the accuracy of the model is judged from the magnitude of these values. The R² value lies between 0 and 1, with a larger value indicating a more accurate model; the smaller the RMSE, the more accurate the model.
[0069] S33: Then, a random forest regression model is used to quantify the importance of features. The importance score of the principal component is calculated based on the reduction of node splitting error and then mapped back to the original feature weights. The feature variables are identified as minimum cross-sectional diameter, surface roughness, hardness, tensile strength, laser power, cladding speed, and cyclic stress amplitude, thereby identifying key features.
[0070] S34: Determine whether the evaluation result of step S32 meets the expected accuracy of the model, for example whether the calculated R² is greater than 0.9 (threshold) and whether the calculated RMSE is close to 0. If yes, proceed to step four; otherwise, return to S31.
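The acceptance check in S32 and S34 can be sketched in plain Python. Because "close to 0" is not quantified for the RMSE at this step, the sketch borrows the bound stated later in S45 (RMSE no more than 5% of the lifetime range); that choice, the function names, and the sample values are assumptions.

```python
import math

def r2_rmse(y_true, y_pred):
    """Return (R^2, RMSE) for paired measured and predicted fatigue lives."""
    n = len(y_true)
    mean_y = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot, math.sqrt(ss_res / n)

def meets_accuracy(y_true, y_pred, r2_min=0.9, rmse_frac_max=0.05):
    """True if R^2 >= r2_min and RMSE <= rmse_frac_max * (max life - min life)."""
    r2, rmse = r2_rmse(y_true, y_pred)
    life_range = max(y_true) - min(y_true)
    return r2 >= r2_min and rmse <= rmse_frac_max * life_range

# Hypothetical fatigue lives in cycles.
measured = [1.0e5, 2.0e5, 4.0e5, 8.0e5]
predicted = [1.1e5, 1.9e5, 4.2e5, 7.9e5]
r2, rmse = r2_rmse(measured, predicted)
accepted = meets_accuracy(measured, predicted)
```

When evaluated on held-out data, R² computed this way can fall below 0 for a model that predicts worse than the mean of the measured values, so a negative value should be treated as a failed check rather than as an error.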
[0071] Step 4: XGBoost Model Training
[0072] S41: Integrate the feature variables extracted by principal component analysis and random forest with the fatigue life labels obtained in step one into a structured dataset;
[0073] S42: Divide the dataset into a training set (80%), a validation set (10%), and a test set (10%). The training set is used for model training, the validation set is used for parameter tuning and early stopping, and the test set is used for final model evaluation; build the XGBoost model;
[0074] S43: Initialize the XGBoost model parameters, setting the maximum tree depth to 3-10, the learning rate to 0.01-0.3, the regularization parameter α to 0.1-1, and λ to 0.8-1;
[0075] S44: The model is trained and cross-validated using 5-fold cross-validation and early stopping (stopping when the validation set loss does not decrease for 10 consecutive iterations); then, the optimal hyperparameter combination is automatically searched through Bayesian optimization for hyperparameter tuning, and regularization constraints are introduced to prevent overfitting.
[0076] S45: Validate model performance using the test set (R² ≥ 0.90, RMSE ≤ 5% of the lifetime range), test prediction stability using the residual distribution, and analyze feature contributions based on SHAP values, achieving full-dimensional verification of "prediction accuracy - error distribution - physical interpretability" to ensure the reliability of the model in industrial scenarios. If the model performance does not meet the above standards, return to S44; otherwise, the XGBoost model is considered trained; proceed to step five.
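The early stopping rule in S44 (stop when the validation loss has not decreased for 10 consecutive iterations) can be illustrated independently of any library. The sketch below works on a recorded sequence of validation losses; the function name and the loss values are hypothetical, and in practice the XGBoost Python package provides comparable behavior through its `early_stopping_rounds` setting.

```python
def early_stopping_round(val_losses, patience=10):
    """Return (best_round, stop_round) for a sequence of validation losses.

    Training stops at the first round for which the loss has failed to
    improve on the best value for `patience` consecutive rounds. If that
    never happens, stop_round is the last round.
    """
    best_loss = float("inf")
    best_round = -1
    for i, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss, best_round = loss, i
        elif i - best_round >= patience:
            return best_round, i
    return best_round, len(val_losses) - 1

# Hypothetical loss curve: improves for five rounds, then plateaus.
losses = [1.0, 0.8, 0.6, 0.5, 0.45] + [0.46] * 20
best_round, stop_round = early_stopping_round(losses)
```

Here the best loss occurs at round 4 and training stops at round 14, after ten rounds without improvement; the model from the best round is the one that would be kept.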
[0077] Step 5: Lightweight Deployment and Dynamic Updates of the Model
[0078] The trained XGBoost model is exported as a JSON file and embedded in an edge computing device or industrial server; the exported file contains the tree structure, splitting conditions, and leaf node weights.
[0079] Deploy the model on an industrial server and design a RESTful API interface to receive real-time data streams (such as stress and temperature signals collected by sensors).
[0080] Establish a feedback mechanism: every 24 hours, newly collected data are passed through the PCA and random forest modules for dynamic feature extraction and then input into the model, triggering incremental training.
[0081] If the SHAP value detects a feature contribution drift exceeding the threshold (e.g., ±10%), return to step four to automatically start model retraining; retain 20% of historical data as a validation set to prevent model performance degradation, thereby solving the problem that statically deployed models in industrial scenarios cannot be updated online, leading to a decline in prediction performance over time.
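The drift check can be sketched as a comparison of feature contribution shares between a baseline window and the current window. The text gives only the example threshold of ±10%; interpreting it as a relative change in each feature's share of the mean absolute SHAP value is an assumption, as are the function names, feature names, and SHAP values below (which would come from an explainer run on the deployed model).

```python
def contribution_shares(shap_rows):
    """Share of total mean |SHAP| per feature.

    shap_rows: list of per-sample lists of SHAP values, one per feature.
    """
    n_features = len(shap_rows[0])
    mean_abs = [sum(abs(row[j]) for row in shap_rows) / len(shap_rows)
                for j in range(n_features)]
    total = sum(mean_abs)
    return [m / total for m in mean_abs]

def drifted_features(baseline_rows, current_rows, names, threshold=0.10):
    """Names of features whose contribution share changed by more than
    `threshold` relative to the baseline share."""
    base = contribution_shares(baseline_rows)
    cur = contribution_shares(current_rows)
    return [name for name, b, c in zip(names, base, cur)
            if b > 0 and abs(c - b) / b > threshold]

# Hypothetical SHAP values for three features over two samples per window.
names = ["laser_power", "hardness", "stress_amplitude"]
baseline = [[0.2, -0.1, 0.7], [-0.2, 0.1, -0.7]]
current = [[0.3, -0.1, 0.6], [-0.3, 0.1, -0.6]]
flagged = drifted_features(baseline, current, names)
retrain_needed = bool(flagged)
```

In this example the shares of `laser_power` and `stress_amplitude` move by more than 10% of their baseline values, so retraining would be triggered.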
[0082] Furthermore, it should be understood that although this specification describes embodiments, not every embodiment contains only one independent technical solution. This narrative style of the specification is merely for clarity. Those skilled in the art should regard the specification as a whole, and the technical solutions in the embodiments can also be appropriately combined to form other embodiments that can be understood by those skilled in the art.
Claims
1. A method for predicting the fatigue life of laser cladding repaired shaft parts, comprising: Step 1: For shaft parts prepared using typical laser cladding repair process parameters, micro-area characterization techniques are used to obtain the micro-defect characteristics of each part and the heterogeneous microstructure characteristics of the cladding layer-heat affected zone-matrix. Rotational bending fatigue tests are conducted with different fatigue amplitudes and a stress ratio of -1. The crack initiation location and propagation path of each fatigue specimen are statistically analyzed using ultra-depth-of-field electron microscopy and scanning electron microscopy. The rotational bending fatigue life data corresponding to the micro-defect characteristics and heterogeneous microstructure characteristics are obtained, and fatigue life labels are established for the data. The micro-defect characteristics include the number, shape, size, and distribution characteristics of micro-defects in the fatigue specimen, and the heterogeneous microstructure characteristics include grain size, grain orientation, texture intensity, austenite / martensite content, and residual stress. Step 2: Based on the above-mentioned microstructure characterization and fatigue test results of laser cladding repaired axles, establish an initial dataset including the characteristic parameters of minute defects in the cladding layer, the characteristic parameters of heterogeneous microstructure, and axle fatigue life data; The initial dataset is preprocessed and dimensionality reduced to obtain a dataset with a balanced number of samples of different fatigue lives of the repaired axles; Step 3: Form a dataset from the dimensionality-reduced data, build a feature extraction model, and train it using the obtained dataset; use a random forest regression model to quantify feature importance and identify key features; Step 4: Construct a fatigue life prediction model for the XGBoost algorithm. 
The key feature variables obtained in Step 3 are integrated with the fatigue life labels obtained in Step 1 to form a structured dataset for training. The correlation between the feature parameters of minor defects in the cladding layer and the feature parameters of heterogeneous structures is evaluated using dynamic hyperparameter adjustment and regularization constraints. Redundant features with high correlation are removed, and feature parameters that characterize minor defects and heterogeneous structures with high independence are obtained. Then, SHAP values are used to analyze the importance of feature parameters in affecting the fatigue life of the axle, and key feature parameters and their weights affecting the fatigue performance of laser cladding repaired axles are selected. By screening out key characteristic parameters that affect the fatigue performance of laser cladding repaired axles, the influence of these key characteristic parameters is quantified into fatigue crack tip stress intensity factors according to their weights, and the remaining fatigue life is predicted based on the fracture mechanics framework.
2. The method for predicting the fatigue life of laser cladding repaired shaft parts according to claim 1, characterized in that Step 2 specifically comprises:
S21: The data collected in Step 1 are standardized so that different features are comparable.
S22: Principal component analysis is used to reduce the dimensionality of the standardized multidimensional features; the original data are then projected into a low-dimensional space spanned by k eigenvectors to obtain the dimensionality-reduced data, where the number k is selected based on the cumulative variance contribution rate and the inflection point of the scree plot.
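As a minimal sketch of S21 and S22, the following standardizes the features and selects k from the cumulative variance contribution rate alone. The 0.95 threshold is an assumed value, the scree-plot inflection criterion is not implemented here, and the PCA is done via a plain NumPy SVD rather than any particular library the patent may have used.

```python
import numpy as np

def standardize(X):
    """Z-score each column so features with different units are comparable."""
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    std[std == 0] = 1.0  # avoid dividing by zero for constant columns
    return (X - mean) / std

def pca_reduce(X, variance_threshold=0.95):
    """Project standardized data onto the first k principal components.

    k is the smallest number of components whose cumulative explained
    variance ratio reaches `variance_threshold` (an assumed value).
    """
    Z = standardize(np.asarray(X, dtype=float))
    # Rows of Vt are the principal directions (eigenvectors of Z^T Z).
    _, s, Vt = np.linalg.svd(Z, full_matrices=False)
    explained = s**2 / np.sum(s**2)
    cumulative = np.cumsum(explained)
    k = min(int(np.searchsorted(cumulative, variance_threshold)) + 1, len(s))
    components = Vt[:k]
    return Z @ components.T, components, explained[:k]

# Toy data: the second column is exactly twice the first.
X = [[1, 2, 0], [2, 4, 1], [3, 6, 0], [4, 8, 1]]
scores, components, explained = pca_reduce(X)
```

The returned `components` matrix is what a later step would need in order to map importance scores computed on principal components back to the original features.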
3. The method for predicting the fatigue life of laser cladding repaired shaft parts according to claim 1, characterized in that Step 3 specifically comprises:
S31: A dataset is formed from the dimensionality-reduced data and divided into a training set and a test set; a feature extraction model is built and trained using the training set.
S32: The performance of the model obtained in S31 is evaluated by calculating R² and RMSE, whose magnitudes determine the accuracy of the model. The calculated R² value lies between 0 and 1, with a larger value indicating a more accurate model; the smaller the calculated RMSE, the more accurate the model.
S33: A random forest regression model is used to quantify feature importance. The importance score of each principal component is calculated from the reduction in node-splitting error and then mapped back to weights on the original features to identify key features.
S34: It is determined whether the evaluation result of S32 meets the expected accuracy. If so, the method proceeds to Step 4; otherwise, it returns to S31.
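The metrics in S32 and the back-mapping in S33 can be sketched as follows. The formulas for R² and RMSE are the standard ones. The back-mapping rule (weighting absolute PCA loadings by component importance, then normalizing) is an assumption, since the claim does not state how the mapping is performed. Note also that R² computed this way can fall below 0 on a test set when the model is worse than predicting the mean.

```python
import numpy as np

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)

def rmse(y_true, y_pred):
    """Root mean squared error, in the same units as the fatigue life label."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def map_importance_to_features(component_importance, components):
    """Assumed mapping: spread each component's importance over the original
    features in proportion to the absolute loadings, then normalize to 1.

    components: array of shape (k, n_features), e.g. the PCA loadings.
    """
    weights = np.abs(np.asarray(components, dtype=float)).T @ np.asarray(
        component_importance, dtype=float
    )
    return weights / weights.sum()

r2 = r2_score([1, 2, 3, 4], [2, 2, 3, 3])
err = rmse([1, 2, 3, 4], [2, 2, 3, 3])
w = map_importance_to_features([0.75, 0.25], [[1, 0], [0, 1]])
```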
4. The method for predicting the fatigue life of laser cladding repaired shaft parts according to claim 1, characterized in that Step 4 specifically comprises:
S41: The key feature variables obtained in Step 3 are matched with the fatigue life labels obtained in Step 1 and integrated into a structured dataset.
S42: An XGBoost model is built and trained using the dataset generated in S41.
S43: Five-fold cross-validation and early stopping are used for model training and validation; Bayesian optimization is then used to search automatically for the optimal hyperparameter combination, and regularization constraints are introduced to prevent overfitting.
S44: Model performance is verified on the test set, prediction stability is checked using the residual distribution, and feature contributions are analyzed based on SHAP values; the model is tested in this way until a trained XGBoost model is obtained.
S45: Using the trained XGBoost model, highly independent feature parameters characterizing the micro-defects and the heterogeneous microstructure are obtained; SHAP values are then used to analyze the importance of each feature parameter to the fatigue life of the shaft part, and the key feature parameters affecting the fatigue performance of laser cladding repaired shaft parts, together with their weights, are selected.
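One common way to turn SHAP values into the feature weights mentioned in S45 is to average their absolute values per feature and normalize. The sketch below assumes the SHAP matrix (samples by features) has already been computed for the trained model; the normalization, the top-k selection, and the feature names are illustrative assumptions rather than details taken from the claim.

```python
import numpy as np

def shap_feature_weights(shap_values, feature_names, top_k=3):
    """Rank features by mean absolute SHAP value and return the top_k
    as (name, normalized weight) pairs.

    shap_values: array of shape (n_samples, n_features), assumed to be
    precomputed for the trained model.
    """
    shap_values = np.asarray(shap_values, dtype=float)
    mean_abs = np.abs(shap_values).mean(axis=0)
    weights = mean_abs / mean_abs.sum()  # weights sum to 1 over all features
    order = np.argsort(weights)[::-1][:top_k]
    return [(feature_names[i], float(weights[i])) for i in order]

# Hypothetical feature names and SHAP values for two samples.
names = ["pore_size", "grain_size", "texture_intensity"]
shap_matrix = [[0.4, -0.1, 0.0], [-0.6, 0.1, 0.0]]
key_features = shap_feature_weights(shap_matrix, names, top_k=2)
```

Using absolute values keeps features whose effect changes sign across samples from cancelling out, which is why the mean of raw SHAP values is not used here.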
5. The method for predicting the fatigue life of laser cladding repaired shaft parts according to claim 4, characterized in that the method further comprises, for the trained XGBoost fatigue life prediction model: establishing a feedback mechanism in which, every 24 hours, newly collected data are passed through the principal component analysis of Step 2 and the random forest algorithm of Step 3 to extract features dynamically, and the extracted features are input into the fatigue life prediction model to trigger incremental training; and, if the SHAP values indicate a feature contribution drift exceeding a preset threshold, returning to Step 4 for retraining.
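The drift check in the feedback mechanism could look roughly like the following sketch. The claim does not define how contribution drift is measured, so the metric (largest absolute change in each feature's share of mean absolute SHAP value) and the 0.1 threshold are assumptions, as are the function names.

```python
import numpy as np

def contribution_shares(shap_values):
    """Each feature's share of the total mean absolute SHAP value."""
    mean_abs = np.abs(np.asarray(shap_values, dtype=float)).mean(axis=0)
    return mean_abs / mean_abs.sum()

def needs_retraining(baseline_shap, recent_shap, drift_threshold=0.1):
    """Return (flag, per-feature drift). The flag is True when any feature's
    contribution share moved by more than `drift_threshold` (assumed value)
    between the baseline window and the most recent window.
    """
    drift = np.abs(
        contribution_shares(recent_shap) - contribution_shares(baseline_shap)
    )
    return bool(drift.max() > drift_threshold), drift

# Toy SHAP matrices (samples by features) for two monitoring windows.
baseline = [[0.5, 0.5], [0.5, -0.5]]
recent = [[0.9, 0.1], [-0.9, 0.1]]
retrain, drift = needs_retraining(baseline, recent)
stable, _ = needs_retraining(baseline, baseline)
```

In a deployment following the claim, a check like this would run after each 24-hour incremental update, with a True flag sending the workflow back to Step 4.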
6. A fatigue life prediction system for laser cladding repaired shaft parts, characterized in that it comprises:
Multi-feature collection module: For shaft parts prepared using typical laser cladding repair process parameters, micro-area characterization methods are used to obtain the micro-defect characteristics of each specimen and the heterogeneous microstructure characteristics of the cladding layer, heat-affected zone, and substrate. Rotational bending fatigue tests are carried out at different fatigue amplitudes and a stress ratio of -1. The crack initiation location and propagation path of each fatigue specimen are statistically analyzed using ultra-depth-of-field electron microscopy and scanning electron microscopy, the rotational bending fatigue life data corresponding to the micro-defect characteristics and heterogeneous microstructure characteristics are obtained, and fatigue life labels are established for the data. The micro-defect characteristics include the number, shape, size, and distribution of micro-defects in the fatigue specimen, and the heterogeneous microstructure characteristics include grain size, grain orientation, texture intensity, austenite/martensite content, and residual stress.
Data preprocessing module: Based on the microstructure characterization and fatigue test results of the laser cladding repaired shaft parts from the multi-feature collection module, an initial dataset is established that includes the micro-defect feature parameters of the cladding layer, the heterogeneous microstructure feature parameters, and the shaft part fatigue life data. The initial dataset is preprocessed and dimensionality-reduced to obtain a dataset in which the numbers of samples for the different fatigue lives of the repaired shaft parts are balanced.
Feature extraction module: The data obtained by the data preprocessing module are formed into a dataset, a feature extraction model is built and trained using that dataset, and a random forest regression model is used to quantify feature importance and identify key features.
Key feature parameter extraction module: A fatigue life prediction model based on the XGBoost algorithm is constructed. The key feature variables obtained from the feature extraction module are matched with the fatigue life labels obtained from the multi-feature collection module and integrated into a structured dataset for training. Using dynamic hyperparameter adjustment and regularization constraints, the correlation between the micro-defect feature parameters of the cladding layer and the heterogeneous microstructure feature parameters is evaluated. Redundant, highly correlated features are removed to obtain highly independent feature parameters that characterize the micro-defects and the heterogeneous microstructure. SHAP values are then used to analyze the importance of each feature parameter to the fatigue life of the shaft part, and the key feature parameters affecting the fatigue performance of laser cladding repaired shaft parts, together with their weights, are selected.
Remaining fatigue life prediction module: The key feature parameters selected by the key feature parameter extraction module as affecting the fatigue performance of laser cladding repaired shaft parts are used. The influence of these key feature parameters is quantified, according to their weights, into the stress intensity factor at the fatigue crack tip, and the remaining fatigue life is predicted within a fracture mechanics framework.