A method for optimizing the prediction accuracy of a soil erosion model
By introducing biocrust factors into the soil erosion model and combining them with machine learning optimization methods, the problem of insufficient prediction accuracy of soil erosion models in arid and semi-arid regions was solved, and more accurate soil loss prediction was achieved.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- 黄河流域水土保持生态环境监测中心
- Filing Date
- 2026-03-12
- Publication Date
- 2026-06-12
AI Technical Summary
Existing soil erosion models have insufficient prediction accuracy in arid and semi-arid regions, mainly due to the lack of quantitative characterization of biocrusts and factor coupling, which makes it impossible to effectively capture the nonlinear relationships and interactions among multiple factors, thus limiting the applicability of the models in biocrust distribution areas.
A biological crust factor equation was constructed and incorporated as a sub-factor into the Chinese soil loss equation. Nonlinear relationships were captured by gradient boosting decision trees, and the model was calibrated by combining cross-validation and measured data to generate a high-resolution raster layer, thereby improving the model's forecast accuracy.
The erosion reduction effect of biological crusts is characterized quantitatively, the prediction accuracy of the model in areas where biological crusts are distributed is improved, the parameter system of the model is made more complete, and both the spatial matching degree and the fitting accuracy are enhanced.
Figure CN122198348A_ABST
Abstract
Description
Technical Field
[0001] This invention relates to the field of soil and water conservation and ecological environment engineering technology, specifically to a method for optimizing the prediction accuracy of soil erosion models.
Background Technology
[0002] Soil erosion models are key tools for quantitatively assessing soil loss and guiding regional soil and water conservation planning. The Chinese Soil Loss Equation (CSLE) is a mainstream model adapted to China's soil erosion characteristics and landform types, yet its prediction accuracy remains markedly insufficient when it is applied to arid and semi-arid regions. The core reason is that existing models neither quantitatively characterize the soil and water conservation function of biological crusts nor integrate it into the model system. Specific problems include:
[0003] Within the current soil erosion model framework, there is a lack of universally applicable quantitative factors for biocrusts. There is no unified and operable method to quantitatively characterize the role of biocrusts in reducing soil loss, thus failing to achieve the model-based transformation of the soil and water conservation function of biocrusts.
[0004] The vegetation cover and biological measures factors in the existing Chinese soil loss equation are only constructed based on vascular vegetation and do not include the erosion reduction effect of biocrust. There is no mature technical means to couple the biocrust factor as a sub-factor with the existing factor, which leads to the core parameters of the model not matching the actual situation in areas with extensive biocrust coverage.
[0005] Optimization methods for soil erosion models lack the ability to capture complex nonlinear relationships and high-order interactions among multiple factors. Traditional model validation methods struggle to evaluate the accuracy of the corrected model scientifically and systematically, which further limits the applicability of the model in areas where biological crusts are distributed.
[0006] Therefore, there is an urgent need to construct a factor equation that quantifies the erosion reduction effect of biological crusts, to integrate it into the Chinese soil loss equation to complete the model correction, and to establish scientific model optimization and evaluation methods, so as to improve the prediction accuracy of soil erosion models in areas where biological crusts are distributed. Such a method has important application value for improving the soil erosion model system and for supporting the accurate assessment of regional soil and water loss.
Summary of the Invention
[0007] This invention aims to overcome the technical shortcomings of existing soil erosion models in terms of insufficient forecast accuracy in arid and semi-arid regions, and provides a method to optimize the forecast accuracy of soil erosion models. This invention achieves quantitative characterization of the erosion-reducing effect of biocrust by constructing a biocrust factor equation, incorporates it as a sub-factor into the Chinese soil loss equation to complete model correction, and combines gradient boosting decision trees to capture the complex nonlinear relationships between erosion factors. Model calibration is completed through cross-validation and measured data, ultimately achieving a significant improvement in the forecast accuracy of soil erosion models.
[0008] To solve the above-mentioned technical problems, the technical solution provided by the present invention is as follows:
[0009] A method for optimizing the prediction accuracy of a soil erosion model is proposed. This method constructs a biocrust factor equation applicable to different environments, calculates soil erosion using the Chinese soil loss equation, incorporates the biocrust factor as a sub-factor into a soil erosion database, and modifies the original model. A gradient boosting decision tree is used to capture complex nonlinear relationships and high-order interactions in the data. The modified model is then evaluated and calibrated using cross-validation and measured data. The method includes the following steps:
[0010] S1. Calculation of the biocrust factor: the biocrust factor is characterized by the soil loss ratio. Biocrust factor values are calculated grid by grid by coupling probability mapping, spatial interpolation and kriging optimization, generating a watershed-scale high-resolution raster layer.
[0011] S2. Soil erosion model calculation: the Chinese soil loss equation is used to calculate the amount of soil loss on the slope. The biocrust factor is included in the model as a sub-factor of the vegetation cover and biological measures factor, and the original vegetation cover and biological measures factor is corrected to obtain the corrected Chinese soil loss equation.
[0012] S3. Evaluation of the modified soil erosion model: the data are divided into a training set and a validation set by spatial stratified sampling. A machine learning optimization model with a gradient boosting decision tree at its core is constructed, with the Huber loss function as the objective function. The optimal hyperparameter combination is searched automatically by a Bayesian optimization algorithm. The prediction accuracy of the model, which combines the original factors and the biocrust factor, is evaluated using root mean square error, mean absolute error and the coefficient of determination.
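Step S3 names the Huber loss as the objective function. The following is a minimal sketch, using the standard textbook definition of the Huber loss, of why it is less sensitive to outliers than squared error. The threshold `delta` and the sample residuals are illustrative assumptions, not values taken from this application.

```python
def huber_loss(residual: float, delta: float = 1.0) -> float:
    """Standard Huber loss: quadratic for small residuals, linear for large ones."""
    a = abs(residual)
    if a <= delta:
        return 0.5 * a * a
    return delta * (a - 0.5 * delta)


def mean_huber(observed, predicted, delta: float = 1.0) -> float:
    """Average Huber loss over paired observed and predicted values."""
    pairs = list(zip(observed, predicted))
    return sum(huber_loss(o - p, delta) for o, p in pairs) / len(pairs)


# Hypothetical residuals (observed minus predicted); the last one is an outlier.
for r in [0.2, -0.5, 0.1, 8.0]:
    print(f"residual={r:5.1f}  half-squared={0.5 * r * r:6.2f}  huber={huber_loss(r):5.2f}")
```

For the outlier, the half-squared error grows quadratically while the Huber loss grows only linearly, so a single anomalous plot measurement pulls the fit less strongly.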
[0013] Further, in step S1, the biological crust factor equation is:
[0014] ;
[0015] where the two terms are, in order, the biocrust factor and the soil loss ratio (expressed as a percentage);
[0016] ;
[0017] where the two terms are, in order, the sediment yield of the biocrust plots and the amount of soil erosion on sloping farmland.
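The definitions above describe the soil loss ratio in terms of two quantities: the sediment yield of biocrust plots and the soil erosion on sloping farmland. As a hedged illustration, assuming the ratio is the first quantity divided by the second (the usual meaning of a soil loss ratio), it can be computed as follows. The function name, plot names and sediment values are hypothetical.

```python
def soil_loss_ratio(biocrust_sediment: float, farmland_soil_loss: float) -> float:
    """Assumed form: sediment yield of a biocrust plot divided by the
    soil loss of the sloping-farmland reference plot (same units)."""
    if farmland_soil_loss <= 0:
        raise ValueError("reference soil loss must be positive")
    return biocrust_sediment / farmland_soil_loss


# Hypothetical plot measurements: (biocrust plot, sloping-farmland plot).
plots = {"plot_a": (120.0, 800.0), "plot_b": (40.0, 800.0)}
for name, (sediment, reference) in plots.items():
    slr = soil_loss_ratio(sediment, reference)
    print(f"{name}: SLR = {slr:.3f} ({slr * 100:.1f} %)")
```

A lower ratio means the biocrust plot lost less soil relative to the farmland reference, that is, a stronger erosion reduction effect.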
[0018] Furthermore, in step S2, the original model expression for the Chinese soil loss equation is:
[0019] $A = R \cdot K \cdot LS \cdot B \cdot E \cdot T$;
[0020] where $A$ is the soil erosion modulus, $R$ is the rainfall erosivity factor, $K$ is the soil erodibility factor, $LS$ is the slope length and slope steepness factor, $B$ is the vegetation cover and biological measures factor, $E$ is the engineering measures factor, and $T$ is the tillage measures factor; $LS$, $B$, $E$ and $T$ are all dimensionless;
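The CSLE is commonly written as a product of its factors, A = R · K · LS · B · E · T. Assuming that form, a minimal sketch of the per-cell calculation is given below. The class name, function name and numeric values are hypothetical and only illustrate the multiplicative structure.

```python
from dataclasses import dataclass


@dataclass
class CsleFactors:
    R: float   # rainfall erosivity factor
    K: float   # soil erodibility factor
    LS: float  # slope length and slope steepness factor (dimensionless)
    B: float   # vegetation cover and biological measures factor (dimensionless)
    E: float   # engineering measures factor (dimensionless)
    T: float   # tillage measures factor (dimensionless)


def csle_soil_loss(f: CsleFactors) -> float:
    """Soil erosion modulus A as the product of the six CSLE factors."""
    return f.R * f.K * f.LS * f.B * f.E * f.T


# Hypothetical factor values for a single raster cell.
cell = CsleFactors(R=1000.0, K=0.02, LS=5.0, B=0.5, E=1.0, T=1.0)
print(f"A = {csle_soil_loss(cell):.2f}")
```

Because the model is a pure product, any relative change in one factor, such as a corrected B, carries through to A by the same proportion.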
[0021] Furthermore, the vegetation cover and biological measures factor is calculated differently according to land use type, using the following formula:
[0022] ;
[0023] where the two terms are, in order, the soil loss ratio and the monthly weight;
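The formula itself is not reproduced in this text. One reading that is consistent with the two defined terms is a weighted sum of period soil loss ratios, with weights that sum to 1. The sketch below works under that assumption only; the function name, the number of periods and all values are hypothetical.

```python
def cover_factor(slr_by_period, weight_by_period) -> float:
    """Assumed form: B = sum(SLR_i * w_i), with the weights w_i summing to 1."""
    if len(slr_by_period) != len(weight_by_period):
        raise ValueError("need one weight per period")
    if abs(sum(weight_by_period) - 1.0) > 1e-6:
        raise ValueError("weights must sum to 1")
    return sum(s * w for s, w in zip(slr_by_period, weight_by_period))


# Hypothetical values for four periods; the wet period gets the largest weight.
slr = [0.60, 0.30, 0.10, 0.40]
weights = [0.10, 0.30, 0.50, 0.10]
print(f"B = {cover_factor(slr, weights):.3f}")
```

Under this reading, periods that carry more of the annual erosive rainfall count more heavily toward B.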
[0024] Furthermore, the vegetation cover and biological measures factor is calculated separately for two land use types: grassland and non-grassland.
[0025] The formula for calculating the soil loss ratio for non-grassland areas is:
[0026] ;
[0027] The formula for calculating the soil loss ratio for grassland is:
[0028] ;
[0029] where the variable is the vegetation cover calculated from the normalized difference vegetation index (NDVI);
[0030] The revised expressions for vegetation cover and biological measures factors are as follows:
[0031] ;
[0032] where the three terms are, in order, the vegetation cover and biological measures factor that includes the biocrust factor, the original vegetation cover and biological measures factor, and the biocrust factor; all three are dimensionless.
[0033] The advantages of this invention compared to the prior art are:
[0034] This invention constructs a biocrust factor equation with soil loss ratio as the core, realizes the quantitative characterization of the role of biocrust in reducing soil loss, fills the gap in the quantitative factor of biocrust in the current soil erosion model, and provides a unified and operable quantitative indicator for the model integration of biocrust function.
[0035] This invention incorporates biocrust factors as sub-factors of vegetation cover and biological measures factors into the Chinese soil loss equation, establishes a coupling method between biocrust factors and core parameters of existing models, improves the parameter system of soil erosion models, and solves the core problem that existing models do not consider the erosion reduction effect of biocrust.
[0036] This invention calculates the biocrust factor values grid by grid through coupled probability mapping, spatial interpolation, and kriging optimization, generating a high-resolution raster layer at the watershed scale. This achieves accurate calculation and spatial representation of the biocrust factor at the watershed scale, and improves the spatial matching degree between the factor and the soil erosion model.
[0037] This invention uses gradient boosting decision trees to construct a machine learning optimization model, and combines the Huber loss function and Bayesian optimization algorithm to effectively capture the complex nonlinear relationships and high-order interactions among various factors affecting soil erosion, thereby improving the model's fitting accuracy and generalization ability.
[0038] This invention establishes a model evaluation system based on root mean square error, mean absolute error, and coefficient of determination, combined with spatial stratified sampling and cross-validation. This system enables scientific and systematic verification of the accuracy of the corrected soil erosion model, providing a basis for the model's practical application.
Description of the Drawings
[0039] The accompanying drawings, which form part of this application, are used to provide a further understanding of the application and to make other features, objects, and advantages of the application more apparent. The illustrative embodiments and descriptions of this application are used to explain the application and do not constitute an undue limitation of the application.
[0040] In the drawings:
[0041] Figure 1 is a flowchart of a method for optimizing the prediction accuracy of a soil erosion model in Example 1.
[0042] Figure 2 is a statistical graph illustrating the establishment of the biocrust factor equation under indoor simulated rainfall conditions in Example 1.
[0043] Figure 3 is a verification diagram of the biocrust factor equation under natural rainfall conditions in the field in Example 1.
[0044] Figure 4 is a comparison chart of the calculated and measured values of soil loss under different rainfall conditions in Example 1.
[0045] Figure 5 is a comparison chart of the calculated and measured values of soil loss under different rainfall conditions in Example 1 using the corrected model.
Detailed Description of the Embodiments
[0046] The following detailed description of the embodiments is intended to exemplify the principles of this application, but should not be used to limit the scope of this application. That is, the method for optimizing the prediction accuracy of soil erosion models in this application is not limited to the described embodiments.
[0047] The present invention will be further described below with reference to embodiments.
[0048] As shown in Figure 1, a method for optimizing the prediction accuracy of a soil erosion model is proposed. This method constructs a biocrust factor equation suitable for different environments, calculates soil erosion using the Chinese soil loss equation, incorporates the biocrust factor as a sub-factor into a soil erosion database, and modifies the original model. A gradient boosting decision tree is used to capture complex nonlinear relationships and high-order interactions in the data. The modified model is then evaluated and calibrated using cross-validation and measured data. The method includes the following steps:
[0049] S1. Calculation of the biocrust factor: the biocrust factor is characterized by the soil loss ratio. Biocrust factor values are calculated grid by grid by coupling probability mapping, spatial interpolation and kriging optimization, generating a watershed-scale high-resolution raster layer.
[0050] S2. Soil erosion model calculation: the Chinese soil loss equation is used to calculate the amount of soil loss on the slope. The biocrust factor is included in the model as a sub-factor of the vegetation cover and biological measures factor, and the original vegetation cover and biological measures factor is corrected to obtain the corrected Chinese soil loss equation.
[0051] S3. Evaluation of the modified soil erosion model: the data are divided into a training set and a validation set by spatial stratified sampling. A machine learning optimization model with a gradient boosting decision tree at its core is constructed, with the Huber loss function as the objective function. The optimal hyperparameter combination is searched automatically by a Bayesian optimization algorithm. The prediction accuracy of the model, which combines the original factors and the biocrust factor, is evaluated using root mean square error, mean absolute error and the coefficient of determination.
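Step S3 divides the data by spatial stratified sampling. A minimal sketch of one way to do this is shown below: samples are grouped by a spatial stratum and each stratum is split separately, so both subsets cover every stratum. The 80/20 proportion follows the embodiment described later in this document; the `zone` field, the function name and the sample records are hypothetical.

```python
import random
from collections import defaultdict


def stratified_split(samples, stratum_of, train_fraction=0.8, seed=0):
    """Shuffle within each stratum, then split each stratum separately."""
    rng = random.Random(seed)
    groups = defaultdict(list)
    for sample in samples:
        groups[stratum_of(sample)].append(sample)
    train, valid = [], []
    for key in sorted(groups):
        members = groups[key][:]
        rng.shuffle(members)
        cut = round(len(members) * train_fraction)
        train.extend(members[:cut])
        valid.extend(members[cut:])
    return train, valid


# Hypothetical observations tagged with a spatial zone (for example, a grid block).
samples = [{"id": i, "zone": f"zone_{i % 3}"} for i in range(30)]
train, valid = stratified_split(samples, lambda s: s["zone"])
print(len(train), "training samples,", len(valid), "validation samples")
```

Splitting per stratum keeps the validation set from being dominated by one part of the watershed, which a plain random split cannot guarantee.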
[0052] Further, in step S1, the biological crust factor equation is:
[0053] ;
[0054] where the two terms are, in order, the biocrust factor and the soil loss ratio (expressed as a percentage);
[0055] ;
[0056] where the two terms are, in order, the sediment yield of the biocrust plots and the amount of soil erosion on sloping farmland.
[0057] In a specific embodiment, based on a spatial distribution model of biocrust that can quantify the relationship between environmental variables and distribution probabilities, and slope biocrust factor data under different environmental backgrounds, the biocrust factor values are calculated grid-by-grid by a method that couples probability mapping, spatial interpolation, and kriging optimization. The probability mapping multiplies the distribution probability of the spatial distribution model of biocrust with the slope biocrust factor data according to environmental parameters. Spatial interpolation uses inverse distance weighting or random forest interpolation for missing areas. Kriging optimization uses a semi-variogram function to fit spatial autocorrelation to generate a continuous surface. Finally, a high-resolution raster layer at the watershed scale is generated. The accuracy is ensured by cross-validation and correction with measured data. The output format maintains the Albers projection coordinate system and GeoTIFF encoding standard consistent with the basic geographic data.
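The first two stages described above, probability mapping and gap filling by inverse distance weighting, can be sketched on a small grid as follows. This is an illustration under simplifying assumptions: cells are addressed by column and row index, missing inputs are marked with `None`, and the kriging stage is omitted entirely. All function names and values are hypothetical.

```python
import math


def idw(x, y, known, power=2.0):
    """Inverse distance weighted estimate at (x, y) from (xi, yi, value) triples."""
    numerator = denominator = 0.0
    for xi, yi, value in known:
        distance = math.hypot(x - xi, y - yi)
        if distance == 0.0:
            return value
        weight = 1.0 / distance ** power
        numerator += weight * value
        denominator += weight
    return numerator / denominator


def map_and_fill(probability, slope_factor, power=2.0):
    """Multiply probability by the slope biocrust factor cell by cell,
    then fill cells with missing inputs by IDW from the computed cells."""
    rows, cols = len(probability), len(probability[0])
    grid = [[None] * cols for _ in range(rows)]
    known = []
    for r in range(rows):
        for c in range(cols):
            p, f = probability[r][c], slope_factor[r][c]
            if p is not None and f is not None:
                grid[r][c] = p * f
                known.append((c, r, grid[r][c]))
    if not known:
        raise ValueError("no cell has both inputs")
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] is None:
                grid[r][c] = idw(c, r, known, power)
    return grid


# Hypothetical 1 x 3 strip; the middle cell has no distribution probability.
probability = [[1.0, None, 1.0]]
slope_factor = [[0.2, 0.3, 0.4]]
print(map_and_fill(probability, slope_factor))
```

In a real workflow the filled surface would then be refined by kriging with a fitted semivariogram and written out as a GeoTIFF, which is outside the scope of this sketch.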
[0058] Furthermore, in step S2, the soil erosion model calculation adopts the Chinese Soil Loss Equation (CSLE), which builds on the USLE / RUSLE model and was developed by Liu Baoyuan et al. according to the soil and water loss conditions and landform characteristics of China. It is particularly suitable for the landform type of the Loess Plateau and can estimate the slope soil erosion modulus in this region more accurately.
[0059] The original expression of the Chinese soil loss equation is:
[0060] $A = R \cdot K \cdot LS \cdot B \cdot E \cdot T$;
[0061] where $A$ is the soil erosion modulus, $R$ is the rainfall erosivity factor, $K$ is the soil erodibility factor, $LS$ is the slope length and slope steepness factor, $B$ is the vegetation cover and biological measures factor, $E$ is the engineering measures factor, and $T$ is the tillage measures factor; $LS$, $B$, $E$ and $T$ are all dimensionless;
[0062] With the exception of the vegetation cover and biological measures factor, which is treated below, the other factors are calculated according to the "Technical Regulations for Dynamic Monitoring of Regional Soil and Water Loss (Trial)" issued by the Ministry of Water Resources of the People's Republic of China in 2018.
[0063] Furthermore, the vegetation cover and biological measures factor is calculated differently according to land use type, using the following formula:
[0064] ;
[0065] where the two terms are, in order, the soil loss ratio and the monthly weight;
[0066] Furthermore, the vegetation cover and biological measures factor is calculated separately for two land use types: grassland and non-grassland.
[0067] The formula for calculating the soil loss ratio for non-grassland areas is:
[0068] ;
[0069] The formula for calculating the soil loss ratio for grassland is:
[0070] ;
[0071] where the variable is the vegetation cover calculated from the normalized difference vegetation index (NDVI);
[0072] The revised expressions for vegetation cover and biological measures factors are as follows:
[0073] ;
[0074] where the three terms are, in order, the vegetation cover and biological measures factor that includes the biocrust factor, the original vegetation cover and biological measures factor, and the biocrust factor; all three are dimensionless.
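The revised expression is not reproduced in this text. Because the biocrust factor is described as a sub-factor and the CSLE is multiplicative, one plausible reading is that the corrected factor is the product of the original factor and the biocrust factor. The sketch below assumes exactly that; this form, the function names and the numbers are assumptions rather than the formula of this application.

```python
def corrected_cover_factor(b_original: float, biocrust_factor: float) -> float:
    """Assumed form: corrected B = original B * biocrust factor."""
    if b_original < 0 or biocrust_factor < 0:
        raise ValueError("factors must be non-negative")
    return b_original * biocrust_factor


def corrected_csle(R, K, LS, b_original, biocrust_factor, E, T) -> float:
    """CSLE product with the corrected cover factor substituted for B."""
    return R * K * LS * corrected_cover_factor(b_original, biocrust_factor) * E * T


# Hypothetical cell: the biocrust factor of 0.5 halves the predicted soil loss.
original = corrected_csle(1000.0, 0.02, 5.0, 0.4, 1.0, 1.0, 1.0)
corrected = corrected_csle(1000.0, 0.02, 5.0, 0.4, 0.5, 1.0, 1.0)
print(f"original A = {original:.2f}, corrected A = {corrected:.2f}")
```

Under this assumption a biocrust factor of 1 leaves the original model unchanged, and smaller values lower the predicted soil loss, which matches the reported overestimation by the uncorrected model.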
[0075] In a specific embodiment, the evaluation of the modified soil erosion model involves comparing the simulated erosion values of the original and new models with multi-year observation values from multiple field ecological observation stations, including the Suide Soil and Water Conservation Comprehensive Experimental Station and the Ansai Soil and Water Conservation Comprehensive Experimental Station of the Chinese Academy of Sciences, as well as field monitoring values from the research group. A spatial stratified sampling method was used to randomly divide the experimental data into two parts: 80% as the training set for model construction and 20% as the validation set for model accuracy verification. Based on the erosion calculated by the original CSLE model, and combined with the original factors and biocrust factors in the dataset, a machine learning model with Gradient Boosting Decision Tree (GBDT) as its core was constructed. The model's objective function adopted the Huber loss function to reduce outlier interference, and the optimal hyperparameter combination was automatically searched using a Bayesian optimization algorithm. The modified GBDT model was evaluated using an independent test set, and indicators such as root mean square error (RMSE), mean absolute error (MAE), and coefficient of determination (R²) were calculated to ensure that the model's prediction accuracy was significantly better than the traditional CSLE model.
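To make the idea of gradient boosting with a Huber objective concrete, the following is a deliberately small, dependency-free sketch: depth-1 regression stumps are fitted to the negative gradient of the Huber loss and added with a learning rate. It is not the model of this application. The features (an original CSLE estimate and a biocrust factor), the targets, the outlier and all hyperparameters are hypothetical, and the Bayesian hyperparameter search is omitted.

```python
def huber_loss(residual, delta):
    a = abs(residual)
    return 0.5 * a * a if a <= delta else delta * (a - 0.5 * delta)


def huber_negative_gradient(residual, delta):
    """Negative gradient w.r.t. the prediction (residual = observed - predicted)."""
    if abs(residual) <= delta:
        return residual
    return delta if residual > 0 else -delta


def fit_stump(X, targets):
    """Best least-squares single split over all features and thresholds."""
    mean_all = sum(targets) / len(targets)
    best_sse = sum((t - mean_all) ** 2 for t in targets)
    best = (None, None, mean_all, mean_all)
    for j in range(len(X[0])):
        for threshold in sorted({row[j] for row in X})[:-1]:
            left = [t for row, t in zip(X, targets) if row[j] <= threshold]
            right = [t for row, t in zip(X, targets) if row[j] > threshold]
            ml, mr = sum(left) / len(left), sum(right) / len(right)
            sse = sum((t - ml) ** 2 for t in left) + sum((t - mr) ** 2 for t in right)
            if sse < best_sse:
                best_sse, best = sse, (j, threshold, ml, mr)
    return best


def stump_predict(stump, row):
    j, threshold, ml, mr = stump
    if j is None:
        return ml
    return ml if row[j] <= threshold else mr


def fit_gbdt(X, y, n_rounds=200, learning_rate=0.1, delta=5.0):
    base = sorted(y)[len(y) // 2]  # median as a robust starting prediction
    preds = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        grads = [huber_negative_gradient(yi - pi, delta) for yi, pi in zip(y, preds)]
        stump = fit_stump(X, grads)
        stumps.append(stump)
        preds = [p + learning_rate * stump_predict(stump, row) for p, row in zip(preds, X)]
    return {"base": base, "learning_rate": learning_rate, "stumps": stumps}


def predict(model, row):
    boost = sum(stump_predict(s, row) for s in model["stumps"])
    return model["base"] + model["learning_rate"] * boost


def mean_huber(y, preds, delta):
    return sum(huber_loss(a - b, delta) for a, b in zip(y, preds)) / len(y)


# Hypothetical data: features are [original CSLE estimate, biocrust factor].
X = [[a, b] for a in (10.0, 20.0, 30.0, 40.0) for b in (0.2, 0.5, 0.8, 1.0)]
y = [a * b for a, b in X]
y[5] += 50.0  # one artificial outlier
delta = 5.0
model = fit_gbdt(X, y, delta=delta)
baseline_loss = mean_huber(y, [model["base"]] * len(y), delta)
fitted_loss = mean_huber(y, [predict(model, row) for row in X], delta)
print(f"mean Huber loss: baseline {baseline_loss:.2f}, boosted {fitted_loss:.2f}")
```

Stumps give an additive model, so capturing interactions between factors would require deeper trees. In practice a library implementation would normally be used, for example scikit-learn's `GradientBoostingRegressor` with `loss="huber"`, together with a separate hyperparameter search tool.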
[0076] As shown in Figure 2, it is reasonable to express the biocrust development factor through an exponential function. Following the definition of the cover and management factor (C factor) in the USLE / RUSLE model, the biocrust development factor is expressed using the soil loss ratio (SLR).
[0077] As shown in Figure 3, the model was validated using observational data from 10 slope runoff experimental plots in Ansai from 2021 to 2022. The model-calculated and measured values of the soil loss ratio (SLR) were relatively evenly distributed on both sides of the 1:1 line. The Nash coefficients of the model for 2021 and 2022 were 0.82 and 0.66, respectively, and the RMSEs were 0.10 and 0.08, respectively.
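The validation statistics quoted here can be computed from paired measured and calculated values. The sketch below assumes that the Nash coefficient refers to the Nash-Sutcliffe efficiency in its standard form; the sample values are hypothetical and are not the Ansai plot data.

```python
import math


def rmse(observed, simulated) -> float:
    """Root mean square error."""
    n = len(observed)
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(observed, simulated)) / n)


def mae(observed, simulated) -> float:
    """Mean absolute error."""
    return sum(abs(o - s) for o, s in zip(observed, simulated)) / len(observed)


def nash_sutcliffe(observed, simulated) -> float:
    """1 - (sum of squared errors) / (variance of observations about their mean)."""
    mean_obs = sum(observed) / len(observed)
    total = sum((o - mean_obs) ** 2 for o in observed)
    if total == 0:
        raise ValueError("observations are constant")
    errors = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    return 1.0 - errors / total


# Hypothetical measured and model-calculated soil loss ratios.
measured = [0.10, 0.25, 0.40, 0.55, 0.80]
calculated = [0.15, 0.20, 0.45, 0.50, 0.70]
print(f"NSE = {nash_sutcliffe(measured, calculated):.2f}, "
      f"RMSE = {rmse(measured, calculated):.2f}, "
      f"MAE = {mae(measured, calculated):.2f}")
```

An efficiency of 1 indicates a perfect match, and 0 means the model does no better than predicting the mean of the measurements.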
[0078] As shown in Figure 4, incorporating the biocrust factor into the slope soil erosion model is necessary. Soil loss calculated with the original model shows significant errors relative to the corresponding measured values under both simulated and natural rainfall conditions; the model-calculated values are all higher than the measured values.
[0079] As shown in Figure 5, after the biocrust factor was incorporated into the soil erosion prediction model, the corrected model has a Nash coefficient of 0.56 and an RMSE of 0.41 under simulated rainfall in the field, and a Nash coefficient of 0.48 and an RMSE of 0.34 under natural rainfall in the field. The figure also shows that the predicted soil loss and the measured soil loss are relatively evenly distributed on both sides of the 1:1 line. The corrected soil loss equation can be used for soil erosion surveys and also allows soil and water loss on slopes to be calculated more accurately.
[0080] It should be noted that the combination of the technical features in this case is not limited to the combination methods described in the claims of this case or the combination methods described in the specific embodiments. All technical features described in this case can be freely combined or combined in any way, unless they contradict each other.
[0081] It should also be noted that the embodiments listed above are merely specific embodiments of the present invention. Obviously, the present invention is not limited to the above embodiments, and similar changes or modifications made thereto are those that can be directly derived or easily conceived by those skilled in the art from the content disclosed in the present invention, and should all fall within the protection scope of the present invention.
Claims
1. A method for optimizing the prediction accuracy of soil erosion models, characterized in that a biocrust factor equation suitable for different environments is constructed, and soil erosion is calculated using the Chinese soil loss equation; the biocrust factor is incorporated as a sub-factor into the soil erosion database to modify the original model; a gradient boosting decision tree is used to capture complex nonlinear relationships and high-order interactions in the data; and the modified model is evaluated and calibrated using cross-validation and measured data; the method includes the following steps: S1. Calculation of the biocrust factor: the biocrust factor is characterized by the soil loss ratio; biocrust factor values are calculated grid by grid by coupling probability mapping, spatial interpolation and kriging optimization, generating a watershed-scale high-resolution raster layer. S2. Soil erosion model calculation: the Chinese soil loss equation is used to calculate the amount of soil loss on the slope; the biocrust factor is included in the model as a sub-factor of the vegetation cover and biological measures factor, and the original vegetation cover and biological measures factor is corrected to obtain the corrected Chinese soil loss equation. S3. Evaluation of the modified soil erosion model: the data are divided into a training set and a validation set by spatial stratified sampling; a machine learning optimization model with a gradient boosting decision tree at its core is constructed, with the Huber loss function as the objective function; the optimal hyperparameter combination is searched automatically by a Bayesian optimization algorithm; and the prediction accuracy of the model, which combines the original factors and the biocrust factor, is evaluated using root mean square error, mean absolute error and the coefficient of determination.
2. The method for optimizing the prediction accuracy of a soil erosion model according to claim 1, characterized in that: in step S1, the biocrust factor equation is: ; where the two terms are, in order, the biocrust factor and the soil loss ratio (expressed as a percentage); ; where the two terms are, in order, the sediment yield of the biocrust plots and the amount of soil erosion on sloping farmland.
3. The method for optimizing the prediction accuracy of a soil erosion model according to claim 2, characterized in that: in step S2, the original expression of the Chinese soil loss equation is: $A = R \cdot K \cdot LS \cdot B \cdot E \cdot T$; where $A$ is the soil erosion modulus, $R$ is the rainfall erosivity factor, $K$ is the soil erodibility factor, $LS$ is the slope length and slope steepness factor, $B$ is the vegetation cover and biological measures factor, $E$ is the engineering measures factor, and $T$ is the tillage measures factor; $LS$, $B$, $E$ and $T$ are all dimensionless.
4. The method for optimizing the prediction accuracy of a soil erosion model according to claim 3, characterized in that: the vegetation cover and biological measures factor is calculated differently according to land use type, and the calculation formula is as follows: ; where the two terms are, in order, the soil loss ratio and the monthly weight.
5. The method for optimizing the prediction accuracy of soil erosion models according to claim 4, characterized in that: the vegetation cover and biological measures factor is calculated separately for two land use types: grassland and non-grassland. The formula for calculating the soil loss ratio for non-grassland areas is as follows: ; the formula for calculating the soil loss ratio for grassland is: ; where the variable is the vegetation cover calculated from the normalized difference vegetation index (NDVI); the revised expression for the vegetation cover and biological measures factor is as follows: ; where the three terms are, in order, the vegetation cover and biological measures factor that includes the biocrust factor, the original vegetation cover and biological measures factor, and the biocrust factor; all three are dimensionless.