Prediction method for collapse gully, computer device and readable storage medium
By dynamically adjusting the neighborhood radius and hyperparameter combination using the DNSO algorithm, the local optimum problem in traditional gully collapse prediction methods is solved, improving prediction accuracy and stability. This achieves accurate capture of the multi-factor nonlinear characteristics of gully collapse and enhances the robustness of the model.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- JIANGXI ACAD OF WATER RESOURCES (JIANGXI PROVINCE DAM SAFETY MANAGEMENT CENT JIANGXI PROVINCE WATER RESOURCES MANAGEMENT CENT)
- Filing Date
- 2026-05-19
- Publication Date
- 2026-06-19
AI Technical Summary
Traditional methods for predicting landslides struggle to fully capture the nonlinear evolution driven by multiple factors, resulting in limited prediction accuracy. Swarm intelligence optimization algorithms are prone to losing population diversity in the later stages of the search, getting trapped in local optima, which limits the improvement of the accuracy and stability of the prediction model.
The Dynamic Neighborhood Search Optimization (DNSO) algorithm is adopted. By randomly generating an initial population and neighborhood radius, the model is trained and the neighborhood radius is dynamically adjusted according to the fitness value to optimize the hyperparameter combination, construct a target prediction model, and generate the collapse prediction result.
It improves the accuracy and stability of the landslide prediction model, can more accurately capture the nonlinear characteristics of multiple factors coupled together in landslides, enhances the robustness of the model on different datasets, and significantly improves the prediction accuracy.
[Figure: abstract drawing, CN122242283A_ABST]
Description
Technical Field
[0001] This application belongs to the field of artificial intelligence technology, and specifically relates to a method for predicting landslides, a computer device, and a readable storage medium.
Background Technology
[0002] Collapse gully erosion is a severe soil erosion disaster unique to the granite regions of southern China. Its formation is influenced by a combination of factors, including topography, geology, hydrology, and vegetation, and is characterized by high complexity and nonlinearity.
[0003] Traditional prediction methods primarily rely on empirical models and statistical regression, focusing on single-factor analysis or linear relationship modeling. This makes it difficult to comprehensively capture the nonlinear evolutionary patterns driven by multiple factors, resulting in limited prediction accuracy. In recent years, swarm intelligence optimization algorithms (such as genetic algorithms and particle swarm optimization) have been introduced into hill collapse prediction, improving the ability of prediction models to handle complex problems to some extent. However, significant shortcomings remain in practical applications: swarm intelligence optimization algorithms are prone to losing population diversity in the later stages of the search, getting trapped in local optima, and struggling to achieve global parameter optimization, thus limiting the improvement in accuracy and stability of the prediction model.
Summary of the Invention
[0004] The purpose of this application is to provide a method, computer device, and readable storage medium for predicting landslides, which can improve the prediction accuracy and stability of the prediction model.
[0005] To solve the above-mentioned technical problems, this application is implemented as follows:
In a first aspect, embodiments of this application provide a method for predicting landslides, the method comprising:
randomly generating an initial population of hyperparameters and determining an initial neighborhood radius;
training the training model corresponding to each individual in the initial population, wherein the training model is constructed based on the hyperparameters;
performing neighborhood search optimization based on the initial neighborhood radius and the fitness values of each trained model to obtain the target optimal individual, wherein the target optimal individual is a target hyperparameter, the target hyperparameter is determined based on candidate hyperparameters, and the candidate hyperparameters are determined based on the neighborhood radius used in the neighborhood search optimization process; for the neighborhood radius, in the first round of the neighborhood search optimization, the neighborhood radius is the initial neighborhood radius, and in subsequent rounds of the neighborhood search optimization, the neighborhood radius is determined by the initial neighborhood radius and the fitness values of the trained models;
training the training model corresponding to the target optimal individual to obtain the target prediction model; and
in response to the acquisition of data on factors causing landslides, invoking the target prediction model to generate a prediction result on whether a landslide will occur.
[0006] In a second aspect, embodiments of this application provide a hill collapse prediction device, the hill collapse prediction device comprising:
a generation module, used to randomly generate the initial population of hyperparameters and determine the initial neighborhood radius;
a first training module, used to train the training model corresponding to each individual in the initial population, wherein the training model is constructed based on the hyperparameters;
an optimization module, used to perform neighborhood search optimization based on the initial neighborhood radius and the fitness values of each trained model to obtain the target optimal individual, wherein the target optimal individual is a target hyperparameter, the target hyperparameter is determined based on candidate hyperparameters, and the candidate hyperparameters are determined based on the neighborhood radius used in the neighborhood search optimization process; for the neighborhood radius, in the first round of the neighborhood search optimization, the neighborhood radius is the initial neighborhood radius, and in subsequent rounds of the neighborhood search optimization, the neighborhood radius is determined by the initial neighborhood radius and the fitness values of the trained models;
a second training module, used to train the training model corresponding to the target optimal individual to obtain the target prediction model; and
a calling module, used to call the target prediction model in response to the acquisition of data on the disaster-causing factors of landslides, and generate a prediction result on whether a landslide will occur.
[0007] In a third aspect, embodiments of this application provide a computer device including a processor, a memory, and a program or instructions stored in the memory and executable on the processor, wherein the program or instructions, when executed by the processor, implement the steps of the method described in the first aspect.
[0008] In a fourth aspect, embodiments of this application provide a computer-readable storage medium on which a program or instructions are stored, which, when executed by a processor, implement the steps of the method described in the first aspect.
[0009] In a fifth aspect, embodiments of this application also provide a computer program product, including a computer program that, when executed by a processor, implements the steps of the method described in the first aspect.
[0010] In this embodiment, the neighborhood radius is dynamically determined by the fitness value, allowing the search range to be adaptively adjusted based on the current performance of the trained model. This expands the neighborhood in the early stages of the search (when the fitness value is low) to enhance global exploration, and shrinks it in the later stages (when the fitness value is high) for fine-grained exploitation, effectively preventing the loss of population diversity and solving the problem of traditional algorithms easily getting trapped in local optima. Furthermore, since the neighborhood radius is determined by the fitness value of the trained model, and the fitness value changes during the neighborhood search optimization process, the neighborhood radius changes dynamically accordingly, thus reducing the sensitivity of model performance to the initially set neighborhood radius. In summary, this embodiment improves the prediction accuracy and stability of the target prediction model. Moreover, because a better combination of hyperparameters can be found through dynamic neighborhood search, the trained target prediction model can more accurately capture the nonlinear characteristics of multi-factor coupling in landslides, thereby significantly improving the prediction accuracy of whether a landslide will occur and enhancing the model's robustness across different datasets.
Attached Figure Description
[0011] Figure 1 is a flowchart illustrating a landslide prediction method provided in some embodiments of this application;
Figure 2 shows the ROC curves of the initial RF involved in some embodiments of this application;
Figure 3 shows the ROC curves of DNSO-RF provided in some embodiments of this application;
Figure 4 shows the prediction results of the initial RF involved in some embodiments of this application;
Figure 5 shows the prediction results of DNSO-RF provided in some embodiments of this application;
Figure 6 is a structural block diagram of a landslide prediction device provided in some embodiments of this application;
Figure 7 is an internal structural diagram of a computer device provided in some embodiments of this application.
Detailed Implementation
[0012] The technical solutions of the embodiments of this application will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of this application, not all embodiments. Based on the embodiments of this application, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of this application.
[0013] The terms "first," "second," etc., used in the specification and claims of this application are used to distinguish similar objects and not to describe a specific order or sequence. It should be understood that such use of data can be interchanged where appropriate so that embodiments of this application can be implemented in orders other than those illustrated or described herein. Furthermore, in the specification and claims, "and / or" indicates at least one of the connected objects, and the character " / " generally indicates that the preceding and following objects are in an "or" relationship.
[0014] The method for predicting landslides provided in this application will be described in detail below with reference to the accompanying drawings, through specific embodiments and application scenarios.
[0015] In one exemplary embodiment, this application proposes a method for predicting landslides. Referring to Figure 1, the method includes steps 102 to 110, wherein:
Step 102: Randomly generate the initial population of hyperparameters and determine the initial neighborhood radius.
[0016] It should be noted that the initial population can be the initial solution set $X = \{x_1, x_2, \dots, x_M\}$, where $x_i$ is a hyperparameter and $M$ is a positive integer; the number of individuals in the initial population is not limited in this embodiment.
[0017] In some embodiments, the initial neighborhood radius can be set to a larger value, thereby enabling a global search of the solution space.
[0018] Step 104: Train the training model corresponding to each individual in the initial population; wherein the training model is constructed based on hyperparameters.
[0019] Step 106: Perform neighborhood search optimization based on the initial neighborhood radius and the fitness values of each trained model to obtain the target optimal individual; wherein, the target optimal individual is a target hyperparameter, the target hyperparameter is determined based on candidate hyperparameters, and the candidate hyperparameters are determined based on the neighborhood radius used in the neighborhood search optimization process; for the neighborhood radius, in the first round of the neighborhood search optimization, the neighborhood radius is the initial neighborhood radius, and in subsequent rounds of the neighborhood search optimization, the neighborhood radius is determined by the initial neighborhood radius and the fitness values of the trained models.
[0020] Step 108: Train the training model corresponding to the optimal individual of the target to obtain the target prediction model.
[0021] Step 110: In response to obtaining the data on the disaster-causing factors of hill collapse, the target prediction model is invoked to generate a prediction result on whether a hill collapse will occur.
[0022] In some embodiments, the prediction result indicates whether a collapse will occur, that is, either occurrence or non-occurrence.
[0023] In some embodiments, the data on landslide-causing factors cover dimensions such as topography, geology, hydrology and meteorology, vegetation cover, and human activities.
[0024] Among them, the data of the topography dimension are the topographic factors, and the data sources of the topographic factors may include, but are not limited to: Digital Elevation Model (DEM) and contour maps.
[0025] The DEM can be generated using ALOS PALSAR, TanDEM-X, or UAV LiDAR point clouds. Contour maps can be obtained by digitizing and vectorizing historical surveying data.
[0026] In some embodiments, terrain factors may include, but are not limited to, slope, aspect, and topographic relief.
[0027] Among them, the slope $\theta$ is a key factor determining the component of gravity and the runoff velocity, and its mathematical expression is as follows:
$$\theta = \arctan\sqrt{\left(\frac{\partial z}{\partial x}\right)^{2} + \left(\frac{\partial z}{\partial y}\right)^{2}}$$
where $\partial z / \partial x$ and $\partial z / \partial y$ are the rates of change of the DEM elevation $z$ in the $x$ and $y$ directions, respectively.
[0028] Among them, the slope aspect $A$ influences the degree of sunlight exposure, evaporation, and weathering, and its mathematical expression is as follows:
$$A = \arctan\left(\frac{\partial z / \partial y}{\partial z / \partial x}\right)$$
Among them, the topographic relief $H_r$ reflects the depth of surface incision, and its mathematical expression is as follows:
$$H_r = \max_{i,j} z_{ij} - \min_{i,j} z_{ij}$$
where $z_{ij}$ indicates the elevation value of the raster cell in row $i$ and column $j$ of the window, and the window can be 3×3 or 5×5. A window is a continuous subset of raster cells of a certain shape and size, centered on a raster cell to be processed, used to define the local neighborhood analysis range.
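As an illustration of the terrain factors above, the following is a minimal sketch that computes the slope and the topographic relief for one interior raster cell. It assumes a DEM stored as a nested list of elevations, a square cell size, and central differences for the elevation gradients; the function names `slope_deg` and `relief` and the sample grid are hypothetical and not taken from this application. Slope aspect is omitted because its angular convention varies between GIS tools.

```python
import math

def slope_deg(dem, i, j, cell_size):
    """Slope (degrees) at interior cell (i, j), using central differences."""
    dz_dx = (dem[i][j + 1] - dem[i][j - 1]) / (2 * cell_size)
    dz_dy = (dem[i + 1][j] - dem[i - 1][j]) / (2 * cell_size)
    return math.degrees(math.atan(math.hypot(dz_dx, dz_dy)))

def relief(dem, i, j, half=1):
    """Max minus min elevation in a (2*half+1) x (2*half+1) window around (i, j)."""
    window = [dem[r][c]
              for r in range(i - half, i + half + 1)
              for c in range(j - half, j + half + 1)]
    return max(window) - min(window)

# Hypothetical 3x3 DEM (metres) with 10 m cells: a uniform ramp rising 1 m per metre.
dem = [[100.0, 100.0, 100.0],
       [110.0, 110.0, 110.0],
       [120.0, 120.0, 120.0]]
center_slope = slope_deg(dem, 1, 1, 10.0)
center_relief = relief(dem, 1, 1)
```

For this ramp the slope at the center cell is about 45 degrees and the 3×3 relief is 20 m.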
[0029] In some embodiments, the data for the geological lithology dimension may include, but are not limited to: lithology index, soil type, and soil erosion modulus.
[0030] For lithology indices, values can be assigned based on the rock's resistance to weathering (e.g., granite = 0.9, sandstone = 0.6, shale = 0.4, loose deposits = 1.0). Using ArcGIS spatial analysis tools, the lithology type map can be converted into continuously distributed lithology index raster data.
[0031] Soil types can be classified into red soil, yellow soil, lateritic red soil, purple soil, and paddy soil based on soil survey data and field sampling analysis results in the study area. These soil types are assigned values of 0.75, 0.68, 0.82, 0.90, and 0.55, respectively. The soil type map is then converted into continuously distributed soil type index raster data using ArcGIS spatial analysis tools.
[0032] Soil erosion modulus is a quantitative indicator reflecting the intensity of soil erosion in a region, with units of t / (km²·a). It is used to characterize the potential risk of surface soil erosion and is an important basic factor in assessing the likelihood of landslides.
[0033] In some embodiments, the sources of hydrometeorological data may include, but are not limited to, rain gauge data and radar rainfall data. Rain gauge data consists of daily and hourly rainfall records from meteorological stations in and around the study area; radar rainfall data consists of gridded rainfall data with a spatial resolution of 1 km × 1 km.
[0034] In some embodiments, hydro-meteorological data may include, but are not limited to: multi-year average rainfall, number of days with daily rainfall ≥20 mm, and distance from the water system.
[0035] The multi-year average rainfall can be the arithmetic mean of annual rainfall over the past 30 years (or a shorter or longer time series) in the study area, expressed in mm. This indicator reflects the overall level of precipitation in the region, is a fundamental parameter for measuring the impact of water input on surface erosion, and is one of the key dynamic factors triggering collapse gullies.
[0036] The number of days with daily rainfall ≥ 20 mm refers to the total number of days during the study period on which the daily rainfall reached or exceeded 20 mm. This indicator directly reflects the frequency of heavy rainfall events. It is obtained by screening the daily rainfall data from meteorological stations in the study area over the past 30 years (or a shorter or longer time series) for samples with daily rainfall ≥ 20 mm and accumulating the number of days. This allows for the quantification of the intra-annual and inter-annual distribution characteristics of heavy rainfall in the region, providing fundamental data for assessing the hydrological driving risks of collapse gullies.
[0037] The distance to the water system refers to the straight-line distance from each raster cell in the study area to the nearest river, valley, or other water system, measured in meters. It characterizes the direct erosive effect of surface water on the slope and the groundwater recharge conditions. The closer the distance, the stronger the infiltration and lateral erosion of the slope by the water flow, and the higher the risk of landslides. The Euclidean distance from the raster cell to the nearest river / valley can be calculated using ArcGIS spatial analysis tools' buffer analysis.
[0038] In some embodiments, the data sources for the vegetation cover dimension may include, but are not limited to: remote sensing imagery and Normalized Difference Vegetation Index (NDVI) data products.
[0039] The remote sensing imagery can be Landsat 8 OLI or Sentinel-2 MSI (growing season imagery). NDVI data products can be MODIS NDVI (high temporal resolution) or calculated from remote sensing imagery.
[0040] In some embodiments, the data for the vegetation cover dimension can be vegetation cover degree.
[0041] Among them, vegetation cover (C factor) is an indicator that characterizes the degree of vegetation cover on the ground surface. The higher the vegetation cover, the larger the C factor value, indicating that the vegetation has a stronger protective effect on the ground surface and can effectively reduce the risk of landslides. It is an important ecological factor for assessing the susceptibility to landslides. It can be calculated using the pixel dichotomy model, and its mathematical expression is as follows:
$$C = \frac{NDVI - NDVI_{soil}}{NDVI_{veg} - NDVI_{soil}}$$
where $NDVI_{soil}$ is the NDVI value of pure bare-soil pixels, which can typically be between 0.05 and 0.2; $NDVI_{veg}$ is the NDVI value of pure vegetation pixels, which can usually be 0.9-0.95; the specific values can be adjusted according to the measured data of the study area, and this embodiment does not limit them.
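The pixel dichotomy model can be sketched as a one-line function. The default end-member values below are picked from the ranges stated above, and clamping the result to [0, 1] is an assumption added here so that pixels outside the end-member range stay valid; the function name `vegetation_cover` is hypothetical.

```python
def vegetation_cover(ndvi, ndvi_soil=0.05, ndvi_veg=0.95):
    """Pixel dichotomy model: scale NDVI between bare-soil and full-vegetation end members."""
    cover = (ndvi - ndvi_soil) / (ndvi_veg - ndvi_soil)
    # Assumption: clamp to [0, 1] so out-of-range NDVI values remain valid cover fractions.
    return min(1.0, max(0.0, cover))

half_covered = vegetation_cover(0.5)   # midway between the two end members
bare = vegetation_cover(0.0)           # below the soil end member, clamped to 0
dense = vegetation_cover(1.0)          # above the vegetation end member, clamped to 1
```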
[0042] In some embodiments, the data for the human activity dimension can be historical landslide data and human activity data. The data sources may include, but are not limited to: historical disaster investigation reports, satellite image interpretation, OpenStreetMap (OSM), and the landslide susceptibility index (SI).
[0043] Among them, historical disaster investigation reports are used to extract the locations, areas, and volumes of landslides that occurred over the past 10-30 years; satellite image interpretation (such as Google Earth, 0.5 m resolution) is used for manual delineation or AI identification of historical landslide boundaries; OSM is used to obtain data on roads, settlements, and farmland irrigation facilities; SI can be calculated using the frequency ratio (FR) method for landslide sites, and its mathematical expression is as follows:
$$FR_{ij} = \frac{N_{ij} / N}{A_{ij} / A}$$
where $N_{ij}$ is the number of collapses within the $j$-th grade or category of the $i$-th evaluation factor; $N$ is the total number of hill collapses within the study area; $A_{ij}$ is the grid area within the $j$-th grade or category of the $i$-th evaluation factor; and $A$ is the total grid area of the study area.
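The frequency ratio compares the share of collapses falling in a class with the share of area that class occupies. A minimal sketch follows; the function name `frequency_ratio` and the sample counts are hypothetical.

```python
def frequency_ratio(n_class, n_total, area_class, area_total):
    """Share of collapses in a class divided by the share of area the class occupies."""
    return (n_class / n_total) / (area_class / area_total)

# Hypothetical class: 30 of 100 collapses occur on 10 of 100 km^2 of the study area.
fr = frequency_ratio(30, 100, 10.0, 100.0)
```

Here the ratio is about 3, meaning collapses are roughly three times as frequent in this class as its area share alone would suggest; a value near 1 indicates no association.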
[0044] In some embodiments, in order to eliminate the influence of dimensions and adapt to the search space of the subsequent Dynamic Neighborhood Search Optimization (DNSO) algorithm, the acquired raw data can be preprocessed after acquisition to obtain the data on landslide disaster factors.
[0045] Data preprocessing can include missing value handling, outlier removal, and standardization.
[0046] Missing value handling refers to filling missing meteorological data using Kriging interpolation or Multiple Imputation by Chained Equations (MICE).
[0047] The formula for the Kriging estimate is:
$$\hat{Z}(x_0) = \sum_{i=1}^{n} \lambda_i Z(x_i)$$
where $\hat{Z}(x_0)$ is the predicted value at the point $x_0$ to be estimated; $\lambda_i$ are the weighting coefficients, which satisfy unbiasedness and minimum estimation variance; $Z(x_i)$ is the observation at the $i$-th known sampling point $x_i$; and $n$ is the total number of sampling points involved in the interpolation.
[0048] Outlier removal refers to the process of eliminating outliers caused by sensor errors using the 3σ principle or the interquartile range (IQR) method of box plots.
[0049] The exclusion criterion is: if x > Q3 + 1.5 × IQR or x < Q1 − 1.5 × IQR, the value is considered abnormal; otherwise, it is normal. Here Q1 and Q3 are the first and third quartiles, and IQR = Q3 − Q1.
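The IQR rule can be sketched with the Python standard library. Quartile conventions differ between tools; this sketch uses the default method of `statistics.quantiles`, which is an assumption and may give slightly different fences than other software. The function name `remove_outliers_iqr` and the sample readings are hypothetical.

```python
import statistics

def remove_outliers_iqr(values, k=1.5):
    """Keep values inside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if lower <= v <= upper]

# Hypothetical sensor readings with one spurious spike.
readings = [1, 2, 3, 4, 5, 6, 7, 8, 100]
cleaned = remove_outliers_iqr(readings)
```

For these readings the quartiles are 2.5 and 7.5, so the fences are -5 and 15 and only the value 100 is excluded.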
[0050] Standardization refers to normalizing the data values. The standardized factor values (such as slope 0-1, vegetation coverage 0-1) will be used as input features when training the model whose hyperparameters are optimized by DNSO (such as the random forest model), thereby ensuring the consistency of the search space and the effectiveness of the algorithm.
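One way to bring a factor into the 0-1 range mentioned above is min-max scaling. The text does not state which standardization formula is used, so min-max scaling is an assumption here, and the function name `min_max_scale` is hypothetical.

```python
def min_max_scale(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Degenerate case: a constant factor carries no information; map it to 0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

scaled_slopes = min_max_scale([10.0, 20.0, 30.0])
```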
[0051] In some embodiments, label construction is performed before model training. The construction process can be as follows: based on historical collapse event records, non-collapse points equal in number to the collapse points (1:1) are randomly selected, and binary labels (1 = occurred, 0 = did not occur) are constructed. The historical collapse event records cover data from the above-mentioned dimensions.
[0052] In some embodiments, the training model may employ a Random Forest (RF) model or other binary classification models; this embodiment is not limited to any particular model. It should be noted that after the feature variables are input into the RF model, the probability P of a landslide can be output. That is, the model output is a probability value P ∈ [0,1], which is divided into different risk intensities by a preset probability threshold (such as 0.5).
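The 1:1 label construction can be sketched as follows. The point identifiers, the fixed seed, and the function name `build_labeled_samples` are hypothetical; the sketch only assumes that the pool of candidate non-collapse points is at least as large as the set of collapse points.

```python
import random

def build_labeled_samples(collapse_points, non_collapse_pool, seed=0):
    """Pair each collapse point (label 1) with one randomly drawn non-collapse point (label 0)."""
    rng = random.Random(seed)
    negatives = rng.sample(non_collapse_pool, k=len(collapse_points))
    samples = [(p, 1) for p in collapse_points] + [(p, 0) for p in negatives]
    rng.shuffle(samples)  # avoid an ordered block of positives followed by negatives
    return samples

collapse_points = ["c1", "c2", "c3"]
non_collapse_pool = ["n1", "n2", "n3", "n4", "n5", "n6"]
samples = build_labeled_samples(collapse_points, non_collapse_pool)
```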
[0053] In this embodiment, hyperparameters can be tree depth, number of features, number of trees, etc. It is understood that training models built using different algorithms will have different types of hyperparameters.
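For a random forest, the random initial population of step 102 could be drawn as shown below. The hyperparameter names and integer ranges in `SEARCH_SPACE` are hypothetical placeholders, since the text does not specify search ranges, and `random_population` is an illustrative helper rather than part of this application.

```python
import random

# Hypothetical integer search ranges (inclusive); adjust to the study data.
SEARCH_SPACE = {
    "max_depth": (2, 20),        # tree depth
    "max_features": (1, 10),     # number of features considered per split
    "n_estimators": (50, 500),   # number of trees
}

def random_population(size, space=SEARCH_SPACE, seed=0):
    """Draw `size` individuals, each one a combination of hyperparameter values."""
    rng = random.Random(seed)
    return [{name: rng.randint(lo, hi) for name, (lo, hi) in space.items()}
            for _ in range(size)]

population = random_population(10)
```

Each individual is one hyperparameter combination; in step 104 one training model would be built and trained per individual.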
[0054] In some embodiments, the fitness value can be the accuracy or the AUC-ROC value.
[0055] The mathematical expression of the accuracy is as follows:
$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$
where TP is the number of true positives, TN is the number of true negatives, FP is the number of false positives, and FN is the number of false negatives.
[0056] The ROC curve is plotted with the false positive rate (FPR = FP / (FP+TN)) on the horizontal axis and the true positive rate (TPR = TP / (TP+FN)) on the vertical axis. The AUC value is the area under the ROC curve; the closer it is to 1, the stronger the model's ability to distinguish between the occurrence and non-occurrence of collapse gullies.
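Both fitness measures can be computed without external libraries. The sketch below uses the rank interpretation of AUC (the probability that a randomly chosen positive sample scores higher than a randomly chosen negative one, with ties counted as one half), which equals the area under the ROC curve. The function names and the sample labels and scores are hypothetical.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)

def roc_auc(y_true, scores):
    """AUC via pairwise comparison of positive and negative scores (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical labels and predicted collapse probabilities.
y_true = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.6, 0.2]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # preset probability threshold of 0.5
acc = accuracy(y_true, y_pred)
auc = roc_auc(y_true, scores)
```

In this example the accuracy is 0.5 while the AUC is 0.75, which shows why the two fitness choices can rank the same model differently.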
[0057] This embodiment dynamically determines the neighborhood radius through the fitness value, adaptively adjusting the search range based on the current performance of the trained model. This expands the neighborhood in the early stages of the search (when the fitness value is low) to enhance global exploration, and shrinks the neighborhood in the later stages (when the fitness value is high) for fine-grained exploitation, effectively preventing the loss of population diversity and solving the problem of traditional algorithms easily getting trapped in local optima. Furthermore, since the neighborhood radius is determined by the fitness value of the trained model, and the fitness value changes during the neighborhood search optimization process, the neighborhood radius changes dynamically accordingly, thus reducing the sensitivity of model performance to the initially set neighborhood radius. In summary, this embodiment improves the prediction accuracy and stability of the target prediction model. Moreover, because a better combination of hyperparameters can be found through dynamic neighborhood search, the trained target prediction model can more accurately capture the nonlinear characteristics of multi-factor coupling in landslides, thereby significantly improving the prediction accuracy of whether a landslide will occur and enhancing the model's robustness across different datasets.
[0058] In some embodiments, the step of performing neighborhood search optimization based on the initial neighborhood radius and the fitness values of each trained model to obtain the target optimal individual includes: Repeat the following operations until the loop termination condition is met: The elite individuals of the current population are determined based on the fitness values of each training model after training.
[0059] The elite individuals are randomly perturbed based on their neighborhood radius; wherein, in the first round, the neighborhood radius is the initial neighborhood radius.
[0060] The intermediate population for the next round is determined based on the elite individuals after the perturbation and the elite individuals in the current round of the population.
[0061] If the loop termination condition is met, the candidate hyperparameters corresponding to the training model with the highest fitness value in this round of the population are taken as the target hyperparameters.
[0062] It can be understood that all individuals in all populations are candidate hyperparameters as described above.
[0063] In some embodiments, the loop termination condition may be that the number of iterations reaches a preset threshold, which is not limited in this embodiment; the loop termination condition may also be that the neighborhood radius is less than the minimum neighborhood radius.
[0064] In some embodiments, the elite individuals in the current population can be the individuals with fitness values in the top n% of the current population, where n is a positive integer, such as 5.
[0065] In some embodiments, the random perturbation can be the addition of Gaussian noise, and the mathematical expression for the perturbation is as follows:
$$x' = x + \sigma \varepsilon$$
where $x'$ is the elite individual after the perturbation; $x$ is the elite individual; $\sigma$ is the intensity of the perturbation, which is controlled by the neighborhood radius, the two being positively correlated; and $\varepsilon$ follows a normal distribution.
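Elite selection and Gaussian perturbation can be sketched together as follows. The sketch assumes individuals are lists of numeric hyperparameter values, that the noise is standard normal, and that the perturbation intensity equals the neighborhood radius; the text only states that the two are positively correlated. The function names `select_elites` and `perturb` are hypothetical.

```python
import math
import random

def select_elites(population, fitness_values, top_percent=5):
    """Return the individuals whose fitness ranks in the top `top_percent` percent."""
    n_elite = max(1, math.ceil(len(population) * top_percent / 100))
    ranked = sorted(zip(fitness_values, population), key=lambda pair: pair[0], reverse=True)
    return [individual for _, individual in ranked[:n_elite]]

def perturb(individual, radius, rng):
    """Add zero-mean Gaussian noise; assumption: noise scale equals the neighborhood radius."""
    return [value + radius * rng.gauss(0.0, 1.0) for value in individual]

rng = random.Random(0)
population = [[float(i)] for i in range(40)]      # hypothetical one-dimensional individuals
fitness_values = [i / 40 for i in range(40)]      # hypothetical fitness: larger index is fitter
elites = select_elites(population, fitness_values, top_percent=5)
candidates = [perturb(e, radius=0.5, rng=rng) for e in elites]
```

With 40 individuals and n = 5, two elites are kept and each yields one perturbed candidate.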
[0066] In some embodiments, in the subsequent rounds, the random perturbation of the elite individuals based on the neighborhood radius includes: The initial neighborhood radius is dynamically reduced based on the fitness values to obtain the neighborhood radius for this round.
[0067] The elite individuals are randomly perturbed based on the current neighborhood radius.
[0068] In some embodiments, the initial neighborhood radius is dynamically reduced based on each fitness value, and the mathematical expression for the neighborhood radius in this round is as follows:
$$r_t = r_0 e^{-\alpha t}$$
where $r_t$ is the neighborhood radius in this round, $r_0$ is the initial neighborhood radius, $\alpha$ is the attenuation coefficient, and $t$ is the number of iterations.
[0069] It can be understood that as the number of iterations increases, the fitness value increases and the neighborhood radius decreases. Therefore, the fitness value can implicitly determine the neighborhood radius, that is, as the fitness value increases, the neighborhood radius decreases.
[0070] Specifically, in the early stages of the search, the number of iterations is small, the decay term approaches 1, and the neighborhood radius is approximately equal to the initial neighborhood radius, so the solution space is explored broadly to discover potential high-quality solutions; in the later stages of the search, as the number of iterations increases, the decay term approaches 0 and the neighborhood radius gradually shrinks until it reaches the minimum neighborhood radius. This avoids wasting computing resources in suboptimal areas.
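The decay schedule described above can be sketched as follows, assuming an exponential form (initial radius multiplied by exp(-alpha * t)), which is one form consistent with a decay term that starts near 1 and tends to 0. Flooring the result at a minimum radius is also an assumption; the function name `neighborhood_radius` and the parameter values are hypothetical.

```python
import math

def neighborhood_radius(r0, alpha, t, r_min=0.0):
    """Assumed exponential decay of the radius with iteration count t, floored at r_min."""
    return max(r_min, r0 * math.exp(-alpha * t))

# Hypothetical schedule: initial radius 1.0, attenuation coefficient 0.1, floor 0.01.
schedule = [neighborhood_radius(1.0, 0.1, t, r_min=0.01) for t in range(0, 101, 10)]
```

The schedule starts at the initial radius, decreases monotonically, and stays at the floor once the decayed value falls below it.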
[0071] In some embodiments, the step of dynamically reducing the initial neighborhood radius based on the fitness value to obtain the current neighborhood radius includes: Based on the maximum fitness value among all fitness values, a radius scaling factor is obtained from a preset mapping relationship; wherein the preset mapping relationship indicates the association between fitness values and radius scaling factors.
[0072] The neighborhood radius for this round is obtained based on the radius scaling factor and the initial neighborhood radius.
[0073] In some embodiments, the radius scaling factor can be the ratio between the fitness value and the initial neighborhood radius, and different fitness values correspond to different radius scaling factors, with a negative correlation between the fitness value and the radius scaling factor. For example, if the maximum fitness value is 80%, the corresponding radius scaling factor is 7, then the neighborhood radius in this round is 5.6.
[0074] In some embodiments, the step of dynamically reducing the initial neighborhood radius based on the fitness value to obtain the current neighborhood radius includes: Determine the target elite individual corresponding to the maximum fitness value among all fitness values.
[0075] Based on the target elite individual, multiple surrounding hyperparameters are determined to assess the changing trend of the neighborhood radius.
[0076] The trend parameters are determined based on the fitness values corresponding to the surrounding hyperparameters; wherein, the trend parameters include local gradient and curvature.
[0077] The initial neighborhood radius is adjusted based on the changing trend parameter to obtain the neighborhood radius for this round.
[0078] In this embodiment, the changing trend of the neighborhood radius corresponding to the target elite individual refers to the local terrain around the target elite individual in the solution space, that is, the surrounding hyperparameters reflect the local terrain information.
[0079] It is understandable that when the local gradient and curvature are large, the initial neighborhood radius can be reduced to obtain the current neighborhood radius; while when the local gradient and curvature are small, the current neighborhood radius can be maintained at the same level as the previous neighborhood radius, or even increased. It should be noted that a large local gradient means the local gradient is greater than a preset gradient threshold, and a small local gradient means the local gradient is less than a preset gradient threshold; a large curvature means the curvature is greater than a preset curvature threshold, and a small curvature means the curvature is less than a preset curvature threshold. The preset gradient threshold and preset curvature threshold are empirical values and are not limited in this embodiment.
[0080] In this embodiment, when the local gradient and curvature are large, the local terrain around the target elite individual is a steep region, meaning that the fitness value changes rapidly in this region. Therefore, the neighborhood radius is reduced to achieve a refined search and improve the accuracy of the target hyperparameters. Conversely, when the local gradient and curvature are small, the local terrain around the target elite individual is a flat region, meaning that the fitness value changes slowly in this region. Therefore, the neighborhood radius is maintained or even increased to quickly traverse this region and improve search efficiency.
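The gradient-and-curvature rule can be illustrated with finite differences over surrounding hyperparameters. This is a sketch under several assumptions not stated in the text: the surrounding hyperparameters are taken one step away along each dimension, the curvature is the mean absolute second difference, and the shrink and grow factors and thresholds are hypothetical. Mixed cases (one quantity large, the other small) are not specified in the text and leave the radius unchanged here.

```python
def local_trend(fitness, x, step):
    """Estimate gradient magnitude and mean absolute curvature of the fitness at x."""
    f0 = fitness(x)
    grad_sq, curv = 0.0, 0.0
    for d in range(len(x)):
        plus = x[:d] + [x[d] + step] + x[d + 1:]    # surrounding hyperparameter, +step
        minus = x[:d] + [x[d] - step] + x[d + 1:]   # surrounding hyperparameter, -step
        f_plus, f_minus = fitness(plus), fitness(minus)
        grad_sq += ((f_plus - f_minus) / (2 * step)) ** 2
        curv += abs(f_plus - 2 * f0 + f_minus) / step ** 2
    return grad_sq ** 0.5, curv / len(x)

def adjust_radius(r0, gradient, curvature, grad_thr, curv_thr, shrink=0.5, grow=1.2):
    """Shrink the radius in steep regions, keep or enlarge it in flat regions."""
    if gradient > grad_thr and curvature > curv_thr:
        return r0 * shrink
    if gradient < grad_thr and curvature < curv_thr:
        return r0 * grow
    return r0  # assumption: mixed cases keep the radius unchanged

# Hypothetical fitness surface with a peak at x = 0, evaluated on its steep flank.
toy_fitness = lambda x: -(x[0] ** 2)
gradient, curvature = local_trend(toy_fitness, [1.0], step=0.1)
steep_radius = adjust_radius(1.0, gradient, curvature, grad_thr=1.0, curv_thr=1.0)
flat_radius = adjust_radius(1.0, 0.1, 0.1, grad_thr=1.0, curv_thr=1.0)
```

In practice each fitness evaluation means training one model, so the number of surrounding hyperparameters sampled directly affects cost.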
[0081] In some embodiments, determining the intermediate population for the next round based on the perturbed elite individuals and the elite individuals of the current round population includes: determining the fitness value of the elite individuals after the perturbation.
[0082] The acceptance probability is determined based on the fitness value of the elite individual and the fitness value of the perturbed elite individual.
[0083] The elite individuals of the current population are replaced with perturbed elite individuals according to the acceptance probability to obtain the intermediate population of the next round; wherein the acceptance probability is less than or equal to 1.
[0084] In some embodiments, inferior solutions are accepted with probability $P$, which simulates the annealing process. The mathematical expression is as follows: $P = \min\{1, \exp(-\Delta f / T)\}$, where $\Delta f$ is the fitness difference, i.e., the difference between the fitness value of an elite individual and the fitness value of the elite individual after perturbation, and $T$ is a temperature parameter, for example 100.
[0085] It can be understood that when the acceptance probability equals 1, the elite individuals in the current population are necessarily replaced by the perturbed elite individuals.
[0086] In this embodiment, the purpose of accepting inferior solutions is to avoid local optima, find the global optimum as much as possible, and balance the model's convergence speed and diversity, thereby further improving the accuracy of the model's predictions.
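The acceptance step can be sketched as follows. This is a minimal illustration that assumes a Metropolis-style rule with a higher fitness value being better; the function names and the use of Python's `random` module are assumptions made for the example.

```python
import math
import random


def acceptance_probability(fit_elite, fit_perturbed, temperature=100.0):
    """Probability of replacing the elite individual with its perturbed copy.

    delta > 0 means the perturbed individual is worse (higher fitness is better).
    The result never exceeds 1.
    """
    delta = fit_elite - fit_perturbed
    if delta <= 0:
        return 1.0  # perturbed individual is at least as good: always accept
    return math.exp(-delta / temperature)


def maybe_replace(elite, perturbed, fit_elite, fit_perturbed,
                  temperature=100.0, rng=random):
    """Return the individual that enters the intermediate population."""
    p = acceptance_probability(fit_elite, fit_perturbed, temperature)
    # rng.random() lies in [0, 1), so p == 1.0 always leads to replacement
    return perturbed if rng.random() < p else elite
```

Under these assumptions, with fitness values on the AUC scale (0 to 1) and a temperature of 100, the acceptance probability stays close to 1, so a smaller temperature would be needed for the rule to reject inferior solutions noticeably often.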
[0087] In some embodiments, the step of performing neighborhood search optimization based on the initial neighborhood radius and the fitness values of each trained model to obtain the target optimal individual includes: dividing the initial population into multiple initial subpopulations. For example, the initial population is divided into $K$ neighborhoods $N_1, N_2, \ldots, N_K$.
[0088] The initial optimal individuals for each initial subpopulation are determined based on the fitness values of each trained model after training.
[0089] If the number of cycles exceeds a preset cycle threshold, the initial optimal individuals of each of the initial subpopulations are swapped.
[0090] After the loop ends, the optimal individual for the target is obtained.
[0091] In some embodiments, the preset cycle threshold is an empirical value, which is not limited in this embodiment. For example, if the preset cycle threshold is 10, an exchange is performed every 10 cycles. Through the exchange, the initial optimal individual of one subpopulation is incorporated into the individual set of another subpopulation by a set union, which enables the cross-population transfer of high-quality solutions. The mathematical expression for exchanging the initial optimal individuals is as follows: $N_i \leftarrow N_i \cup \{x_j^*\}$, $x_j^* = \arg\max_{x \in N_j} f(x)$, $i \neq j$, where $N_i$ denotes the set of individuals in the $i$-th subpopulation; $i \neq j$ indicates that subpopulations $i$ and $j$ are different subpopulations (ensuring that exchanges occur between different subpopulations and avoiding meaningless self-exchanges); $f(\cdot)$ is the fitness value; $N_j$ is the set of individuals in the $j$-th subpopulation; $x_j^*$ denotes the initial optimal individual of a subpopulation; and $\cup$ is the set union operator, which combines all elements of two sets into a new set (duplicate elements are kept only once).
[0092] It should be noted that the process of obtaining the target hyperparameters based on the search optimization of each initial subpopulation can refer to the process of obtaining the target hyperparameters based on the search optimization of the initial population described above, and will not be repeated in this embodiment.
[0093] In this embodiment, the target optimal individual is the individual with the highest fitness value in each subpopulation after the cycle ends. It can be understood that periodically exchanging the initial optimal individuals of each subpopulation can promote information exchange between sub-neighborhoods, thereby enhancing the robustness of the model.
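The periodic exchange between subpopulations can be sketched as follows. The sketch assumes a ring topology in which each subpopulation receives the best individual of the preceding subpopulation; the embodiment only requires that the donor and the receiver be different subpopulations, so the ring layout and the function name are assumptions.

```python
def exchange_best(subpops, fitness):
    """Copy each subpopulation's best individual into the next subpopulation.

    subpops: list of lists of individuals (e.g. tuples of hyperparameters).
    fitness: callable returning the fitness of an individual (higher is better).
    The donor is added by set-union semantics: duplicates are kept only once.
    """
    k = len(subpops)
    bests = [max(sp, key=fitness) for sp in subpops]
    merged = []
    for i, sp in enumerate(subpops):
        new_sp = list(sp)
        donor = bests[(i - 1) % k]  # best of the previous subpopulation (ring)
        if k > 1 and donor not in new_sp:
            new_sp.append(donor)
        merged.append(new_sp)
    return merged


# Toy example with one-dimensional individuals and fitness equal to the value
groups = [[(1,), (2,)], [(5,), (3,)], [(4,)]]
groups = exchange_best(groups, fitness=lambda ind: ind[0])
```

In a full loop, such a function would be called only on the cycles that satisfy the preset cycle threshold, for example every 10 cycles.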
[0094] It should be noted that after the DNSO iteration terminates and the target hyperparameters are obtained, the optimal parameter combination $x$ is output. The target hyperparameters are loaded into the base prediction model (i.e., the trained model after training) to obtain the target prediction model. The input is the feature data $D_{test}$ of the region to be predicted (i.e., a certain area). At this point, the target prediction model has learned, from historical training data, the nonlinear mapping relationship $f(\cdot)$ between disaster-causing factors and landslide occurrence, namely: $P = f(X_{features} \mid x)$, where $P$ is the probability of a collapse and $X_{features}$ is the input feature vector.
[0095] Obtain real-time monitoring or forecast data of disaster-causing factors for the area to be predicted or for a future time period, denoted as the vector $X_{input}$: $X_{input} = [\theta, \alpha, R, K, C, \ldots]^T$. $X_{input}$ is input into the target prediction model, and the collapse probability $P_{collapse}$ is calculated through ensemble learning or a probabilistic model.
[0096] Using the target prediction model, single-tree voting is performed: the $i$-th decision tree traverses its nodes based on the input features $X_{input}$ until it reaches a leaf node and then outputs the binary classification result $y_i \in \{0, 1\}$ (0 = stable, 1 = collapse). The proportion of trees predicting "collapse" among all $N_{tree}$ trees is taken as $P_{collapse}$. The mathematical expression is as follows: $P_{collapse} = \frac{1}{N_{tree}} \sum_{i=1}^{N_{tree}} I(y_i = 1)$, where $P_{collapse}$ is the probability of a landslide occurring in the area to be predicted; $N_{tree}$ is the total number of decision trees in the model; $y_i$ is the prediction result of the $i$-th tree; and $I(\cdot)$ is the indicator function.
[0097] The risk level threshold is determined by analyzing the probability distribution of historical landslide samples: High risk: P >0.7; Medium risk: 0.3≤ P ≤0.7; Low risk: P <0.3.
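The voting and risk grading steps can be sketched as follows. The function names are illustrative, and the list of per-tree votes is assumed to have been produced already by the decision trees of the target prediction model.

```python
def collapse_probability(votes):
    """Fraction of trees voting 1 (collapse) among all trees."""
    if not votes:
        raise ValueError("at least one tree vote is required")
    return sum(1 for y in votes if y == 1) / len(votes)


def risk_level(p):
    """Map a collapse probability to the three risk levels of this embodiment."""
    if p > 0.7:
        return "high"
    if p >= 0.3:
        return "medium"
    return "low"


# Example: 3 of 4 trees predict a collapse
p = collapse_probability([1, 1, 0, 1])
level = risk_level(p)
```

The boundary values 0.3 and 0.7 fall into the medium class, matching the closed interval 0.3 ≤ P ≤ 0.7.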
[0098] It should be understood that although the steps in the flowcharts of the embodiments described above are shown sequentially according to the arrows, these steps are not necessarily executed in the order indicated by the arrows. Unless explicitly stated herein, there is no strict order restriction on the execution of these steps, and they can be executed in other orders. Moreover, at least some steps in the flowcharts of the embodiments described above may include multiple steps or multiple stages. These steps or stages are not necessarily completed at the same time, but can be executed at different times. The execution order of these steps or stages is not necessarily sequential, but can be performed alternately or in turn with other steps or at least some of the steps or stages of other steps.
[0099] For ease of understanding, a specific embodiment will be used as an example: Taking Xingning City, Guangdong Province as an example, based on the typical characteristics of gully erosion development in this area, the steps of this method are as follows: Step 1: Overview of the study area and data collection.
[0100] Xingning City, located in the mountainous region of northeastern Guangdong, belongs to the Nanling granite red soil area. Its terrain is mainly low mountains and hills, with elevations ranging from 10 to 1000 meters and steep slopes (average 25°-35°). Annual rainfall is 1500-1800 mm, concentrated during the rainy season (April-September), accounting for over 80% of the annual rainfall. Due to factors such as a deep weathered granite crust and low vegetation cover, landslides are frequent, making it a typical high-risk area for landslides in Guangdong Province. Its data characteristics (such as differences in granite lithology and vegetation cover) represent common problems in southern granite regions.
[0101] Collect multi-source heterogeneous data from 2024 to construct a database containing 10 key impact factors: Slope, aspect, and topographic relief: extracted using the 3D Analyst tool in ArcGIS software based on 30 m resolution DEM data of the study area.
[0102] Lithological index: 0.9 for granite, 0.6 for sandstone, 0.4 for shale, and 1.0 for loose sedimentary layers.
[0103] Soil types: Red soil 0.75, Yellow soil 0.68, Lateritic red soil 0.82, Purple soil 0.90, Paddy soil 0.55.
[0104] Multi-year average rainfall: arithmetic mean of meteorological station data from 2014 to 2024.
[0105] Number of days with daily rainfall ≥20mm: calculated from meteorological station data from 2014 to 2024.
[0106] Distance to river (distance to water system): The Euclidean distance (in meters) from the raster to the nearest water system is calculated through GIS buffer analysis.
[0107] Soil erosion capacity (soil erosion modulus): calculated by field sampling and the RUSLE model (unit: t / (km²·a)).
[0108] Vegetation cover: calculated from Landsat 8 NDVI data using the pixel dichotomy (pixel-based binary) model.
[0109] Historical hill collapse density: The number of hill collapses per unit area was statistically analyzed using high-resolution imagery interpretation (Google Earth 0.5m).
[0110] Step 2: Data preprocessing.
[0111] Missing value handling: Missing rainfall data: weather station data for July 5-10, 2024 were missing and were filled by Kriging interpolation, with an interpolation error of RMSE = 5.2 mm.
[0112] Missing soil erosion modulus: For 12 grid cells, the average value of neighboring sites was used to fill the gap (error <8%).
[0113] Outlier removal: Soil erosion modulus: Three outliers were removed using the box plot method (IQR=70.5) (lower limit=0, upper limit=191.45, corresponding to Q1=15.2 and Q3=85.7).
[0114] Vegetation coverage: Pixels with NDVI values > 0.95 were capped at 0.95 (to avoid oversaturation of vegetation).
[0115] Standardization process: Min-Max standardization was applied to all factors to eliminate dimensional differences.
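The outlier limits and the Min-Max standardization can be checked with a short sketch. It assumes the usual 1.5 × IQR box plot rule and assumes that the lower limit is clipped at 0 because the soil erosion modulus cannot be negative; both assumptions reproduce the limits reported above (0 and 191.45). The function names are illustrative.

```python
def iqr_fences(q1, q3, k=1.5, floor=0.0):
    """Box plot limits: [Q1 - k*IQR, Q3 + k*IQR], lower limit clipped at floor."""
    iqr = q3 - q1
    lower = max(floor, q1 - k * iqr)
    upper = q3 + k * iqr
    return lower, upper


def min_max(values):
    """Min-Max standardization of a sequence to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant factor: no spread to rescale
    return [(v - lo) / (hi - lo) for v in values]


# Quartiles of the soil erosion modulus from this embodiment
lower, upper = iqr_fences(15.2, 85.7)
scaled = min_max([10, 20, 30])
```

With Q1 = 15.2 and Q3 = 85.7 the IQR is 70.5, so the upper limit is 85.7 + 1.5 × 70.5 = 191.45.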
[0116] Step 3: DNSO parameter settings.
[0117] Algorithm initialization: Neighborhood radius: initial value = 50 (global search), minimum value = 5 (local search).
[0118] Attenuation coefficient: 0.05 (controls the neighborhood shrinkage rate).
[0119] Elite retention rate: 5% (the top 5% of individuals with the highest fitness are retained in each iteration).
[0120] Disturbance intensity: 0.1; temperature parameter T = 100 (simulated annealing control parameter).
[0121] Multi-neighborhood cooperation: Divide the population into 4 sub-neighborhoods (K=4), and exchange the optimal solution every 10 iterations.
[0122] Fitness function: The AUC-ROC value is used to evaluate the model's discriminative ability, with a target value ≥ 0.85.
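The parameter settings of Step 3 can be gathered into a single configuration object, as in the sketch below. The class name and field names are hypothetical, and rounding the elite count to the nearest integer (with at least one elite) is an assumption; only the numeric values come from this embodiment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DnsoConfig:
    """DNSO settings of Step 3 (field names are illustrative)."""
    radius_init: float = 50.0      # initial neighborhood radius (global search)
    radius_min: float = 5.0        # minimum neighborhood radius (local search)
    decay: float = 0.05            # attenuation coefficient
    elite_rate: float = 0.05       # top 5% of individuals are retained
    perturb_strength: float = 0.1  # disturbance intensity
    temperature: float = 100.0     # simulated annealing temperature T
    n_subpops: int = 4             # number of sub-neighborhoods K
    exchange_every: int = 10       # exchange the optimal solution every 10 iterations
    target_auc: float = 0.85       # target AUC-ROC value


def select_elites(population, fitness_values, elite_rate=0.05):
    """Return the individuals with the highest fitness (at least one)."""
    n_elite = max(1, round(elite_rate * len(population)))
    order = sorted(range(len(population)),
                   key=lambda i: fitness_values[i], reverse=True)
    return [population[i] for i in order[:n_elite]]


cfg = DnsoConfig()
# Toy population of 40 individuals whose fitness grows with their index
elites = select_elites(list(range(40)), [i / 100 for i in range(40)], cfg.elite_rate)
```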
[0123] Step 4: Construction and optimization of the hill collapse prediction model.
[0124] Label construction: 10,685 collapse points were assigned label 1, and an equal number (10,685) of randomly generated non-collapse points were assigned label 0. The samples were then divided into a training set (14,959 samples) and a test set (6,411 samples) in a 7:3 ratio.
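The sample counts above can be reproduced with a short sketch. It assumes a simple shuffled, non-stratified split with a fixed random seed; the function name, the seed, and the use of integer placeholders for the points are assumptions made for the example.

```python
import random


def build_and_split(collapse_points, stable_points, train_frac=0.7, seed=0):
    """Label the points (1 = collapse, 0 = non-collapse), shuffle, and split."""
    samples = [(p, 1) for p in collapse_points] + [(p, 0) for p in stable_points]
    rng = random.Random(seed)
    rng.shuffle(samples)
    n_train = round(train_frac * len(samples))
    return samples[:n_train], samples[n_train:]


# Placeholders stand in for the 10,685 collapse and 10,685 non-collapse points
train_set, test_set = build_and_split(range(10685), range(10685))
```

With 21,370 samples in total, a 7:3 ratio gives 14,959 training samples and 6,411 test samples.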
[0125] Base model selection: A random forest model was used as the prediction model, with initial hyperparameters: tree depth d = 10, number of features m = 30%, and number of trees $N_{tree}$ = 100. As shown in Figure 2, the model's AUC-ROC value reached 0.76 (test set).
[0126] DNSO optimization of the random forest parameters: Parameters to be optimized: tree depth d (5-20), number of features m (10%-90% of total features), and number of trees $N_{tree}$ (100-500).
[0127] Optimization result: The optimal parameter combination is d = 12, m = 50%, and $N_{tree}$ = 300. As shown in Figure 3, the model's AUC-ROC value reached 0.87 (test set).
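The search space of the three hyperparameters can be written down as in the sketch below, which also shows how an initial population might be drawn at random. The dictionary keys, the function names, and the uniform sampling scheme are assumptions; the bounds are those stated in this embodiment.

```python
import random

# Bounds of the hyperparameters to be optimized (keys are illustrative names)
SEARCH_SPACE = {
    "max_depth": (5, 20),           # tree depth d
    "max_features": (0.10, 0.90),   # fraction of total features m
    "n_trees": (100, 500),          # number of trees N_tree
}


def random_individual(rng):
    """Draw one hyperparameter combination uniformly from the search space."""
    d_lo, d_hi = SEARCH_SPACE["max_depth"]
    m_lo, m_hi = SEARCH_SPACE["max_features"]
    n_lo, n_hi = SEARCH_SPACE["n_trees"]
    return {
        "max_depth": rng.randint(d_lo, d_hi),
        "max_features": rng.uniform(m_lo, m_hi),
        "n_trees": rng.randint(n_lo, n_hi),
    }


def in_space(individual):
    """Check that every hyperparameter lies within its bounds."""
    return all(SEARCH_SPACE[k][0] <= individual[k] <= SEARCH_SPACE[k][1]
               for k in SEARCH_SPACE)


rng = random.Random(0)
population = [random_individual(rng) for _ in range(20)]
reported_optimum = {"max_depth": 12, "max_features": 0.50, "n_trees": 300}
```

The reported optimum (d = 12, m = 50%, 300 trees) lies inside these bounds, as does the initial setting of the base model.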
[0128] Step 5: Predict and verify the probability of a collapse.
[0129] Prediction results: Xingning City was divided into 30 m × 30 m grid units (2,301,372 units in total). The grid units were input into the optimized model (i.e., the target prediction model), which output the probability of landslides occurring in each unit. Using ArcGIS software, the city was divided into three landslide susceptibility levels based on the probability values: 1. low susceptibility area, 2. medium susceptibility area, and 3. high susceptibility area. The initial RF prediction results are shown in Figure 4, and the prediction results of DNSO-RF (i.e., the target prediction model) are shown in Figure 5.
[0130] Model validation: Using 10-fold cross-validation, the average accuracy was 86.5% and the Kappa coefficient was 0.73; the accuracy is 8.3 percentage points higher than that of the unoptimized random forest model (accuracy 78.2%).
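The relationship between the accuracy and the Kappa coefficient can be illustrated with a short sketch of Cohen's kappa for a binary confusion matrix. The confusion-matrix counts below are hypothetical and chosen only for illustration (balanced classes with symmetric errors); they are not data from this embodiment.

```python
def accuracy_and_kappa(tp, fp, fn, tn):
    """Accuracy and Cohen's kappa from a binary confusion matrix."""
    n = tp + fp + fn + tn
    p_observed = (tp + tn) / n
    # Agreement expected by chance, from the row and column totals
    p_expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    if p_expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    kappa = (p_observed - p_expected) / (1.0 - p_expected)
    return p_observed, kappa


# Hypothetical counts: 2000 samples, balanced classes, 86.5% correct
acc, kappa = accuracy_and_kappa(tp=865, fp=135, fn=135, tn=865)
```

For balanced classes with symmetric errors the chance agreement is 0.5, so kappa = 2 × accuracy − 1; an accuracy of 86.5% then corresponds to a Kappa coefficient of 0.73, which is consistent with the values reported above.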
[0131] This embodiment constructs a landslide sample database by collecting and preprocessing multi-source heterogeneous data, including topographic, geological, hydrological, vegetation, and human activity data. Binary labels are built based on historical landslide events, and a random forest model is used to output the probability of landslide occurrence, with accuracy or the AUC-ROC value as the model fitness function. DNSO effectively addresses the local optimum problem of traditional swarm intelligence optimization algorithms by dynamically adjusting the neighborhood radius, retaining and perturbing elite individuals, and promoting multi-neighborhood cooperation. After the key hyperparameters of the random forest are optimized using DNSO, the probability of landslide is calculated and risk levels are classified by inputting multi-source factor data of the area to be predicted. This method considers both global and local search, improving the algorithm's convergence speed and robustness, and increasing the prediction accuracy by 8.3 percentage points compared to traditional swarm intelligence optimization algorithms.
[0132] Based on the same inventive concept, this application also provides a hill collapse prediction device for implementing the hill collapse prediction method described above. The solution provided by this device is similar to the solution described in the above method; therefore, the specific limitations in one or more hill collapse prediction device embodiments provided below can be found in the limitations of the hill collapse prediction method described above, and will not be repeated here.
[0133] In one exemplary embodiment, as shown in Figure 6, a landslide prediction device is provided, comprising: a generation module 100, a first training module 200, an optimization module 300, a second training module 400, and a calling module 500, wherein: the generation module 100 is used to randomly generate the initial population of hyperparameters and determine the initial neighborhood radius.
[0134] The first training module 200 is used to train a training model corresponding to each individual in the initial population; wherein the training model is constructed based on the hyperparameters.
[0135] The optimization module 300 is used to perform neighborhood search optimization based on the initial neighborhood radius and the fitness values of each trained model after training, to obtain the target optimal individual; wherein, the target optimal individual is a target hyperparameter, the target hyperparameter is determined based on candidate hyperparameters, and the candidate hyperparameters are determined based on the neighborhood radius used in the neighborhood search optimization process; for the neighborhood radius, in the first round of the neighborhood search optimization, the neighborhood radius is the initial neighborhood radius, and in subsequent rounds of the neighborhood search optimization, the neighborhood radius is determined by the initial neighborhood radius and the fitness values of the trained models.
[0136] The second training module 400 is used to train the training model corresponding to the optimal individual of the target to obtain the target prediction model.
[0137] The calling module 500 is used to call the target prediction model in response to the acquisition of landslide disaster factor data, and generate a prediction result on whether a landslide will occur.
[0138] In some embodiments, the optimization module 300 is specifically used for: Repeat the following operations until the loop termination condition is met: The elite individuals of the current population are determined based on the fitness values of each training model after training.
[0139] The elite individuals are randomly perturbed based on their neighborhood radius; wherein, in the first round, the neighborhood radius is the initial neighborhood radius.
[0140] The intermediate population for the next round is determined based on the elite individuals after the perturbation and the elite individuals in the current round of the population.
[0141] If the loop termination condition is met, the candidate hyperparameters corresponding to the training model with the highest fitness value in this round of the population are taken as the target hyperparameters.
[0142] In some embodiments, in the subsequent rounds, the optimization module 300 is further configured to: The initial neighborhood radius is dynamically reduced based on the fitness values to obtain the neighborhood radius for this round.
[0143] The elite individuals are randomly perturbed based on the current neighborhood radius.
[0144] In some embodiments, the optimization module 300 is further configured to: Based on the maximum fitness value among all fitness values, a radius scaling factor is obtained from a preset mapping relationship; wherein the preset mapping relationship indicates the association between fitness values and radius scaling factors.
[0145] The neighborhood radius for this round is obtained based on the radius scaling factor and the initial neighborhood radius.
[0146] In some embodiments, the optimization module 300 is further configured to: Determine the target elite individual corresponding to the maximum fitness value among all fitness values.
[0147] Based on the target elite individual, multiple surrounding hyperparameters are determined to assess the changing trend of the neighborhood radius.
[0148] The trend parameters are determined based on the fitness values corresponding to the surrounding hyperparameters; wherein, the trend parameters include local gradient and curvature.
[0149] The initial neighborhood radius is adjusted based on the changing trend parameter to obtain the neighborhood radius for this round.
[0150] In some embodiments, the optimization module 300 is further configured to: Determine the fitness value of the elite individuals after the perturbation.
[0151] The acceptance probability is determined based on the fitness value of the elite individual and the fitness value of the perturbed elite individual.
[0152] The elite individuals of the current population are replaced with perturbed elite individuals according to the acceptance probability to obtain the intermediate population of the next round; wherein the acceptance probability is less than or equal to 1.
[0153] In some embodiments, the optimization module 300 is specifically used for: The initial population is divided into multiple initial subpopulations.
[0154] The initial optimal individuals for each initial subpopulation are determined based on the fitness values of each trained model after training.
[0155] If the number of cycles exceeds a preset cycle threshold, the initial optimal individuals of each of the initial subpopulations are swapped.
[0156] After the loop ends, the optimal individual for the target is obtained.
[0157] Each module in the aforementioned landslide prediction device can be implemented entirely or partially through software, hardware, or a combination thereof. These modules can be embedded in or independent of the processor in a computer device, or stored in the memory of a computer device as software, so that the processor can call and execute the corresponding operations of each module.
[0158] In one exemplary embodiment, a computer device is provided, which may be a server, and its internal structure may be as shown in Figure 7. This computer device includes a processor, memory, input/output (I/O) interfaces, and a communication interface. The processor, memory, and I/O interfaces are connected via a system bus, and the communication interface is also connected to the system bus via the I/O interfaces. The processor provides computational and control capabilities. The memory includes non-volatile storage media and internal memory. The non-volatile storage media stores the operating system and computer programs. The internal memory provides the environment for the operation of the operating system and computer programs stored in the non-volatile storage media. The I/O interfaces are used for exchanging information between the processor and external devices. The communication interface is used for communication with external terminals via a network connection. When the computer program is executed by the processor, it implements the collapse prediction method.
[0159] Those skilled in the art will understand that the structure shown in Figure 7 is merely a block diagram of a portion of the structure related to the present application and does not constitute a limitation on the computer device to which the present application is applied. Specific computer devices may include more or fewer components than those shown in the figure, or combine certain components, or have different component arrangements.
[0160] In one embodiment, a computer program product is provided, including a computer program that, when executed by a processor, implements the steps in the above method embodiments.
[0161] Those skilled in the art will understand that all or part of the processes in the methods of the above embodiments can be implemented by a computer program instructing related hardware. The computer program can be stored in a non-volatile computer-readable storage medium, and when executed, it can include the processes of the embodiments of the above methods. Any references to memory, databases, or other media used in the embodiments provided in this application can include at least one of non-volatile memory and volatile memory. Non-volatile memory can include read-only memory (ROM), magnetic tape, floppy disk, flash memory, optical memory, high-density embedded non-volatile memory, resistive random access memory (ReRAM), magnetic random access memory (MRAM), ferroelectric random access memory (FRAM), phase change memory (PCM), graphene memory, etc. Volatile memory can include random access memory (RAM) or external cache memory, etc. By way of illustration and not limitation, RAM can take many forms, such as Static Random Access Memory (SRAM) or Dynamic Random Access Memory (DRAM). The databases involved in the embodiments provided in this application may include at least one type of relational database and non-relational database. Non-relational databases may include, but are not limited to, blockchain-based distributed databases. The processors involved in the embodiments provided in this application may be general-purpose processors, central processing units, graphics processing units, digital signal processors, programmable logic devices, quantum computing-based data processing logic devices, artificial intelligence (AI) processors, etc., and are not limited to these.
[0162] The technical features of the above embodiments can be combined in any way. For the sake of brevity, not all possible combinations of the technical features in the above embodiments are described. However, as long as there is no contradiction in the combination of these technical features, they should be considered to be within the scope of this application.
[0163] The embodiments described above are merely illustrative of several implementation methods of this application, and while the descriptions are specific and detailed, they should not be construed as limiting the scope of this patent application. It should be noted that those skilled in the art can make various modifications and improvements without departing from the concept of this application, and these all fall within the protection scope of this application. Therefore, the protection scope of this application should be determined by the appended claims.
Claims
1. A method for predicting hill collapse, characterized in that the method includes: randomly generating an initial population of hyperparameters and determining an initial neighborhood radius; training a training model corresponding to each individual in the initial population, wherein the training model is constructed based on the hyperparameters; performing neighborhood search optimization based on the initial neighborhood radius and the fitness values of each trained model to obtain the target optimal individual, wherein the target optimal individual is a target hyperparameter, the target hyperparameter is determined based on candidate hyperparameters, and the candidate hyperparameters are determined based on the neighborhood radius used in the neighborhood search optimization process; for the neighborhood radius, in the first round of the neighborhood search optimization, the neighborhood radius is the initial neighborhood radius, and in subsequent rounds of the neighborhood search optimization, the neighborhood radius is determined by the initial neighborhood radius and the fitness values of the trained models; training the training model corresponding to the target optimal individual to obtain the target prediction model; and, in response to the acquisition of data on factors causing landslides, invoking the target prediction model to generate a prediction result on whether a landslide will occur.
2. The method for predicting hill collapse according to claim 1, characterized in that the step of performing neighborhood search optimization based on the initial neighborhood radius and the fitness values of each trained model to obtain the target optimal individual includes: repeating the following operations until the loop termination condition is met: determining the elite individuals of the current population based on the fitness values of each of the trained models after training; randomly perturbing the elite individuals based on the neighborhood radius, wherein, in the first round, the neighborhood radius is the initial neighborhood radius; determining the intermediate population for the next round based on the elite individuals after the perturbation and the elite individuals of the current population; and, if the loop termination condition is met, taking the candidate hyperparameters corresponding to the training model with the highest fitness value in the current round of the population as the target hyperparameters.
3. The method for predicting hill collapse according to claim 2, characterized in that, in the subsequent rounds, the random perturbation of the elite individuals based on the neighborhood radius includes: dynamically reducing the initial neighborhood radius based on the fitness values to obtain the neighborhood radius for the current round; and randomly perturbing the elite individuals based on the neighborhood radius for the current round.
4. The method for predicting hill collapse according to claim 3, characterized in that, The step of dynamically reducing the initial neighborhood radius based on the fitness value to obtain the neighborhood radius for this round includes: Based on the maximum fitness value among all fitness values, a radius scaling factor is obtained from a preset mapping relationship; wherein, the preset mapping relationship indicates the association between fitness values and radius scaling factors; The neighborhood radius for this round is obtained based on the radius scaling factor and the initial neighborhood radius.
5. The method for predicting hill collapse according to claim 3, characterized in that, The step of dynamically reducing the initial neighborhood radius based on the fitness value to obtain the neighborhood radius for this round includes: Determine the target elite individual corresponding to the maximum fitness value among all the fitness values; Based on the target elite individual, multiple surrounding hyperparameters are determined to assess the changing trend of the neighborhood radius; The trend parameters are determined based on the fitness values corresponding to the surrounding hyperparameters; wherein, the trend parameters include local gradient and curvature; The initial neighborhood radius is adjusted based on the changing trend parameter to obtain the neighborhood radius for this round.
6. The method for predicting hill collapse according to claim 2, characterized in that the process of determining the intermediate population for the next round based on the perturbed elite individuals and the elite individuals of the current population includes: determining the fitness value of the elite individuals after the perturbation; determining the acceptance probability based on the fitness value of the elite individual and the fitness value of the perturbed elite individual; and replacing the elite individuals of the current population with the perturbed elite individuals according to the acceptance probability to obtain the intermediate population of the next round, wherein the acceptance probability is less than or equal to 1.
7. The method for predicting hill collapse according to claim 1, characterized in that the step of performing neighborhood search optimization based on the initial neighborhood radius and the fitness values of each trained model to obtain the target optimal individual includes: dividing the initial population into multiple initial subpopulations; determining the initial optimal individual of each initial subpopulation based on the fitness values of each trained model after training; if the number of cycles exceeds a preset threshold, exchanging the initial optimal individuals of the initial subpopulations; and, after the loop ends, obtaining the target optimal individual.
8. A computer device, characterized in that, It includes a processor, a memory, and a program or instructions stored in the memory and executable on the processor, wherein the program or instructions, when executed by the processor, implement the steps of the landslide prediction method as described in any one of claims 1-7.
9. A readable storage medium, characterized in that, The readable storage medium stores a program or instructions that, when executed by a processor, implement the steps of the hill collapse prediction method as described in any one of claims 1-7.