Crop suitability-based planting distribution area prediction method and device, and computer device

By screening multiple environmental factors and training a maximum entropy model, combined with machine learning to assess crop suitability, the subjective and nonlinear response problems of crop suitability evaluation in existing technologies have been solved, and highly accurate prediction of planting distribution areas has been achieved in the context of climate change.

CN122242832A · Pending · Publication Date: 2026-06-19 · GUANGDONG UNIV OF TECH

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
GUANGDONG UNIV OF TECH
Filing Date
2026-02-06
Publication Date
2026-06-19

AI Technical Summary

Technical Problem

Existing crop suitability assessment methods rely on empirical dominant indicators, have relatively mechanical grading standards, and are greatly influenced by subjectivity. They are unable to reflect the complex nonlinear response relationship between crops and the environment, and are especially unable to meet the needs of agricultural adaptability and risk management in the context of climate change.

Method used

By using a multi-source environmental factor screening method to reduce the collinearity between factors, a maximum entropy model is used for training, and a machine learning model is combined to evaluate crop suitability and predict the planting distribution area with probability output. This optimizes the screening and fusion analysis of environmental factors and improves the accuracy and consistency of the prediction results.

Benefits of technology

It achieves high consistency and reliability of crop suitability prediction results under different sample conditions, and continuously characterizes crop suitability at the spatial unit scale based on probability output, thereby improving the accuracy of planting distribution area prediction.

✦ Generated by Eureka AI based on patent content.


Abstract

This invention relates to the field of planting distribution area prediction, and in particular to a method, apparatus, and computer equipment for predicting planting distribution areas based on crop suitability. It optimizes multi-source environmental factors, reduces the collinearity effects between environmental factors, and ensures that crop suitability prediction results maintain high consistency and reliability under different sample conditions. Based on probability output, the crop suitability prediction method continuously characterizes the changes in crop suitability at the spatial unit scale, predicts planting distribution areas, and improves the accuracy of the prediction results.

Description

Technical Field

[0001] This invention relates to the field of planting distribution area prediction, and in particular to a method, apparatus, computer equipment, and storage medium for predicting planting distribution areas based on crop suitability.

Background Technology

[0002] Against the backdrop of global climate change, agricultural systems are facing unprecedented uncertainties and risks. Rising temperatures, reorganization of precipitation patterns, and a significant increase in the frequency and intensity of extreme weather events are systematically altering the spatiotemporal distribution of agricultural climate resources. Crop growth is highly sensitive to environmental conditions; its growth cycle, yield formation, and spatial distribution are all constrained by a variety of climatic factors.

[0003] Existing crop suitability assessment methods are mainly based on agroclimatic zoning, stepwise zoning, and multi-index comprehensive evaluation. These methods typically select a few climate indicators such as accumulated temperature, precipitation, and growing season length, and delineate suitable areas through threshold grading or empirical judgment. However, they have significant shortcomings in practical application: traditional methods rely heavily on empirical indicators, grading standards are relatively mechanical, zoning results are significantly influenced by subjectivity, boundary treatment is inflexible, and it is difficult to fully reflect the complex nonlinear response relationship between crops and the environment. Especially against the backdrop of intensified climate change and frequent extreme weather events, evaluation systems based solely on multi-year average climate indicators are insufficient to meet the needs of agricultural adaptability and risk management. Mathematical statistical methods have improved the objectivity and precision of zoning to some extent, but they still cannot completely overcome the human dependence in variable selection and weight assignment, and they lack sufficient insight into the physiological and ecological mechanisms of crops.

Summary of the Invention

[0004] Based on this, the purpose of this invention is to provide a method, apparatus, computer equipment, and storage medium for predicting planting distribution areas based on crop suitability. This method optimizes multi-source environmental factors, reduces the collinearity between environmental factors, and ensures that the crop suitability prediction results maintain high consistency and reliability under different sample conditions. Based on a probability-output-based crop suitability prediction method, this method continuously characterizes the changes in crop suitability at the spatial unit scale, predicts planting distribution areas, and improves the accuracy of the prediction results.

[0005] In a first aspect, embodiments of this application provide a method for predicting planting distribution areas based on crop suitability, comprising the following steps:

[0006] Obtain an environmental factor dataset and crop suitability raster label data for a sample area over a historical period, wherein the environmental factor dataset includes environmental factor data for several raster cells of the sample area, and the environmental factor data includes several types of environmental factors;

Use several environmental factor screening methods to screen target environmental factors based on the environmental factor dataset, obtaining several environmental factor screening results, wherein the environmental factor screening results include several types of environmental factors after screening;

Perform a fusion analysis on the several environmental factor screening results to determine the target environmental factors, and obtain a target environmental factor dataset for the sample area over the historical period;

Input the target environmental factor dataset and the crop suitability raster label data of the sample area over the historical period into a crop suitability prediction model to be trained, obtaining a target crop suitability prediction model;

Obtain the target environmental factor dataset of an area to be predicted, and input it into the target crop suitability prediction model to perform crop suitability prediction, obtaining crop suitability raster prediction data for the area to be predicted;

Obtain area prediction data for the area to be predicted, and predict the planting distribution area based on the area prediction data and the crop suitability raster prediction data, obtaining a planting distribution area prediction result for the area to be predicted.

[0007] In a second aspect, embodiments of this application provide a crop suitability-based planting distribution area prediction device, comprising:

A data acquisition module, used to acquire an environmental factor dataset and crop suitability raster label data of the sample area over a historical period, wherein the environmental factor dataset includes environmental factor data of several raster cells of the sample area, and the environmental factor data includes several types of environmental factors;

An environmental factor screening module, used to screen target environmental factors based on the environmental factor dataset using several environmental factor screening methods to obtain several environmental factor screening results, wherein the environmental factor screening results include several types of environmental factors after screening;

An environmental factor analysis module, used to perform fusion analysis on the several environmental factor screening results, determine the target environmental factors, and obtain the target environmental factor dataset of the sample area over the historical period;

A model training module, used to input the target environmental factor dataset and crop suitability raster label data of the sample area over the historical period into the crop suitability prediction model to be trained, so as to obtain the target crop suitability prediction model;

A crop suitability prediction module, used to obtain the target environmental factor dataset of the area to be predicted and input it into the target crop suitability prediction model to perform crop suitability prediction, obtaining crop suitability raster prediction data of the area to be predicted;

A planting distribution area prediction module, used to obtain the area prediction data of the area to be predicted, and to predict the planting distribution area based on the area prediction data and the crop suitability raster prediction data, so as to obtain the planting distribution area prediction result of the area to be predicted.

[0008] In a third aspect, embodiments of this application provide a computer device, including: a processor, a memory, and a computer program stored in the memory and executable on the processor; when the computer program is executed by the processor, it implements the steps of the crop suitability-based planting distribution area prediction method as described in the first aspect.

[0009] In a fourth aspect, embodiments of this application provide a storage medium storing a computer program that, when executed by a processor, implements the steps of the crop suitability-based planting distribution area prediction method as described in the first aspect.

[0010] In this application embodiment, a method, apparatus, computer equipment, and storage medium for predicting planting distribution areas based on crop suitability are provided. The method optimizes multi-source environmental factors, reduces the collinearity between environmental factors, and ensures that the crop suitability prediction results maintain high consistency and reliability under different sample conditions. The crop suitability prediction method based on probability output continuously characterizes the changes in crop suitability at the spatial unit scale, predicts planting distribution areas, and improves the accuracy of the prediction results.

[0011] To better understand and implement this invention, the following detailed description is provided in conjunction with the accompanying drawings.

Description of the Drawings

[0012] Figure 1 is a flowchart of a crop suitability-based planting distribution area prediction method provided in one embodiment of this application;

Figure 2 is a flowchart of step S2 of a crop suitability-based planting distribution area prediction method provided in one embodiment of this application;

Figure 3 is a flowchart of step S2 of a crop suitability-based planting distribution area prediction method provided in another embodiment of this application;

Figure 4 is a flowchart of step S2 of a crop suitability-based planting distribution area prediction method provided in another embodiment of this application;

Figure 5 is a flowchart of step S3 of a crop suitability-based planting distribution area prediction method provided in one embodiment of this application;

Figure 6 is a flowchart of step S4 of a crop suitability-based planting distribution area prediction method provided in one embodiment of this application;

Figure 7 is a flowchart of step S6 of a crop suitability-based planting distribution area prediction method provided in one embodiment of this application;

Figure 8 is a schematic diagram of the structure of a crop suitability-based planting distribution area prediction device provided in one embodiment of this application;

Figure 9 is a schematic diagram of the structure of a computer device provided in one embodiment of this application.

Detailed Implementation

[0013] Exemplary embodiments will now be described in detail, examples of which are illustrated in the accompanying drawings. When the following description relates to the drawings, unless otherwise indicated, the same numbers in different drawings denote the same or similar elements. The embodiments described in the following exemplary embodiments do not represent all embodiments consistent with this application. Rather, they are merely examples of apparatuses and methods consistent with some aspects of this application as detailed in the appended claims.

[0014] The terminology used in this application is for the purpose of describing particular embodiments only and is not intended to be limiting of the application. The singular forms “a,” “said,” and “the” used in this application and the appended claims are also intended to include the plural forms unless the context clearly indicates otherwise. It should also be understood that the term “and/or” as used herein refers to and includes any or all possible combinations of one or more of the associated listed items.

[0015] It should be understood that although the terms first, second, third, etc., may be used in this application to describe various information, such information should not be limited to these terms. These terms are only used to distinguish information of the same type from one another. For example, without departing from the scope of this application, first information may also be referred to as second information, and similarly, second information may also be referred to as first information. Depending on the context, the word "if" as used herein may be interpreted as "at the time of," "when," or "in response to determining."

[0016] Referring to Figure 1, which is a flowchart of a crop suitability-based planting distribution area prediction method according to an embodiment of this application, the method includes the following steps:

S1: Obtain environmental factor datasets and crop suitability raster label data for the sample area over a historical period.

[0017] The method is executed by a prediction device for the crop suitability-based planting distribution area prediction method (hereinafter referred to as the prediction device). In an optional embodiment, the prediction device may be a computer device, a server, or a server cluster composed of multiple computer devices.

[0018] In this embodiment, the prediction device obtains environmental factor datasets and crop suitability raster label data for a sample area over a historical period. The environmental factor dataset includes environmental factor data for several raster cells of the sample area, and the environmental factor data includes several types of environmental factors, including climate factors, topographic factors, and soil factors.

[0019] Specifically, the climate factors are selected based on climate data, which is obtained from a surface meteorological element-driven dataset of the sample area. The dataset includes meteorological elements such as near-surface temperature, air pressure, specific humidity, wind speed, downward shortwave radiation flux, downward longwave radiation flux, and precipitation rate. The temporal resolution is 3 hours, and the horizontal spatial resolution is 0.1°. The sunshine duration is supplemented by a normalized dataset of daily sunshine duration in China from 1961 to 2022. Based on existing research findings on crop climate zoning, climate suitability, and natural vegetation zoning, 15 climatic factors with clear biological significance that may affect the planting and distribution of major grain crops were selected at the national and annual scales. These factors are: frost-free period, accumulated temperature ≥0℃, accumulated temperature ≥10℃, accumulated temperature ≥18℃, annual extreme minimum temperature, annual average temperature, annual temperature range, annual total precipitation, average temperature of the coldest month, average temperature of the hottest month, annual number of days with average temperature ≥0℃, annual number of days with average temperature ≥10℃, annual number of days with average temperature ≥18℃, annual precipitation, and sunshine duration.
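As an illustration of how threshold-based climate factors such as accumulated temperature and day counts can be derived from a daily series for one raster cell, the following minimal Python sketch may help. The patent text does not state the exact computation; the sketch assumes one common definition in which accumulated temperature is the sum of daily mean temperatures on days at or above the threshold. The function names and the dictionary keys are hypothetical.

```python
def accumulated_temperature(daily_mean_temps, threshold):
    """Sum of daily mean temperatures over days at or above the threshold.

    Assumption: the "active accumulated temperature" definition; the
    source does not specify which definition is used.
    """
    return sum(t for t in daily_mean_temps if t >= threshold)


def days_at_or_above(daily_mean_temps, threshold):
    """Number of days whose daily mean temperature is at or above the threshold."""
    return sum(1 for t in daily_mean_temps if t >= threshold)


def annual_indicators(daily_mean_temps, daily_precip):
    """Derive a few annual climate factors for a single raster cell.

    daily_mean_temps: daily mean temperatures in degrees Celsius for one year.
    daily_precip: daily precipitation totals for the same days.
    """
    return {
        "acc_temp_ge_10": accumulated_temperature(daily_mean_temps, 10.0),
        "days_ge_10": days_at_or_above(daily_mean_temps, 10.0),
        "annual_mean_temp": sum(daily_mean_temps) / len(daily_mean_temps),
        "annual_total_precip": sum(daily_precip),
    }
```

In practice the same computation would be applied to every raster cell after the 3-hourly data has been aggregated to daily values.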

[0020] The terrain factors are constructed based on terrain data, which is obtained from the digital elevation model dataset of the sample area. The terrain data includes altitude, slope, aspect, and topographic relief. The altitude raster data is processed with the "Project Raster", "Resample", and "Extract by Mask" tools into altitude data with the same latitude and longitude range and resolution as the climate factor data. Slope and aspect analysis is then applied to obtain slope and aspect data with the same latitude and longitude range and resolution as the climate factor data, from which the terrain factors are constructed.

[0021] The soil factors were constructed based on soil data, which included 23 soil physical, chemical, and fertility attributes at six standard depth layers (0-5, 5-15, 15-30, 30-60, 60-100, and 100-200 cm) within the sample area. These attributes were mapped at a spatial resolution of 90 meters. ArcGIS software was used to link the data, and the "Define Projection" tool was used to convert the original raster to the WGS 1984 geographic coordinate system. By joining the soil attribute table, point-to-raster conversion tools were used to export the required soil factor raster data. Then, tools such as "Resample" and "Extract by Mask" were used to process the data into raster data with the same latitude and longitude range and resolution as the climate factor data, thus constructing the soil factors.

[0022] S2: Using several environmental factor screening methods, target environmental factors are screened based on the environmental factor dataset to obtain several environmental factor screening results.

[0023] In this embodiment, the prediction device employs several environmental factor screening methods to screen target environmental factors based on the environmental factor dataset, thereby obtaining several environmental factor screening results. The environmental factor screening results include several types of environmental factors after screening.

[0024] The environmental factor screening results include a linearly redundant environmental factor set. Referring to Figure 2, which is a flowchart of step S2 of the crop suitability-based planting distribution area prediction method provided in one embodiment of this application, step S2 includes step S201, as follows:

S201: Based on the environmental factor dataset and the preset Pearson correlation coefficient calculation algorithm, obtain the Pearson correlation coefficient between various types of environmental factors; based on the Pearson correlation coefficient between various types of environmental factors and the preset first correlation threshold, obtain several linearly redundant environmental factor pairs and construct a linearly redundant environmental factor set.

[0025] In this embodiment, the prediction device obtains the Pearson correlation coefficients between various types of environmental factors based on the environmental factor dataset and a preset Pearson correlation coefficient calculation algorithm. The Pearson correlation coefficient calculation algorithm is as follows:

$$R = \frac{N\sum xy - \sum x \sum y}{\sqrt{N\sum x^{2} - \left(\sum x\right)^{2}}\;\sqrt{N\sum y^{2} - \left(\sum y\right)^{2}}}$$

[0026] In the formula, $R$ is the Pearson correlation coefficient; $X$ is the set of values of environmental factor $x$ over all grid cells; $Y$ is the set of values of environmental factor $y$ over all grid cells; $N$ is the number of grid cells; and each sum runs over all grid cells.

[0027] The prediction device obtains several linearly redundant environmental factor pairs based on the Pearson correlation coefficients between various types of environmental factors and a preset first correlation threshold, and constructs a linearly redundant environmental factor set. Specifically, when the absolute value of the Pearson correlation coefficient between the environmental factors is higher than the first correlation threshold, it is determined that there is a significant linear correlation between the two environmental factors, which are considered linearly redundant environmental factor pairs, and several linearly redundant environmental factor pairs are obtained to construct a linearly redundant environmental factor set.
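The screening in S201 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented implementation: the threshold value 0.8 is a placeholder (the text only speaks of a "preset first correlation threshold"), the function names are hypothetical, and a constant factor is treated as uncorrelated to avoid a division by zero.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient of two equally long value lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # assumption: a constant factor counts as uncorrelated
    return cov / math.sqrt(var_x * var_y)


def linearly_redundant_pairs(factors, threshold=0.8):
    """Return (name_a, name_b, r) for every pair with |r| above the threshold.

    factors: dict mapping a factor name to its values over all grid cells.
    threshold: hypothetical stand-in for the preset first correlation threshold.
    """
    pairs = []
    for a, b in combinations(sorted(factors), 2):
        r = pearson(factors[a], factors[b])
        if abs(r) > threshold:
            pairs.append((a, b, r))
    return pairs
```

Each returned pair would then be a candidate for the optional ecological filtering step, in which one factor of the pair is preferentially retained.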

[0028] In an optional embodiment, for environmental factor pairs determined to be linearly redundant, the prediction device further filters several linearly redundant environmental factor pairs based on the physiological and ecological characteristics of crops in the corresponding grid cells, so as to preferentially retain environmental factors that have more explicit ecological constraints or higher data quality during crop growth as the final linearly redundant environmental factor pairs, thereby reducing the linear collinearity effect between environmental factors.

[0029] The environmental factor screening results also include a set of nonlinearly correlated environmental factors. Referring to Figure 3, which is a flowchart of step S2 of the crop suitability-based planting distribution area prediction method provided in another embodiment of this application, step S2 includes steps S211 to S212, as follows:

S211: Based on the environmental factor dataset and the preset maximum information coefficient calculation algorithm, obtain the maximum information coefficient between various types of environmental factors.

[0030] In this embodiment, the prediction device calculates the mutual information values of environmental factors at different grid scales based on the environmental factor dataset and a preset maximum information coefficient calculation algorithm. Under the premise of satisfying grid complexity constraints, the result with the largest normalized mutual information is selected as the maximum information coefficient of the environmental factor pair. This process obtains the maximum information coefficients between various types of environmental factors to assess the degree of nonlinear dependence between them. The maximum information coefficient calculation algorithm is as follows:

[0031]

$$\mathrm{MIC}(X;Y) = \max_{a \cdot b < B} \frac{I(X;Y)}{\log_2 \min(a, b)}, \qquad I(X;Y) = \sum_{i=1}^{a}\sum_{j=1}^{b} p(i,j)\,\log_2 \frac{p(i,j)}{p_X(i)\,p_Y(j)}$$

[0032] In the formula, $\mathrm{MIC}$ is the maximum information coefficient; $\max()$ is the function that returns the maximum value; $a$ is the number of grid divisions of the environmental factor along the horizontal axis; $b$ is the number of grid divisions of the environmental factor along the vertical axis; $B$ is the upper limit of the total number of grid cells; $p(i,j)$ is the joint probability density, i.e. the proportion of raster cells whose environmental factor values fall simultaneously in the $i$-th interval of $X$ and the $j$-th interval of $Y$; $p_X(i)$ is the marginal probability density of $X$, i.e. the proportion of raster cells whose environmental factor value falls in the $i$-th interval of $X$; and $p_Y(j)$ is the marginal probability density of $Y$, i.e. the proportion of raster cells whose environmental factor value falls in the $j$-th interval of $Y$.
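To make the grid search behind the maximum information coefficient concrete, here is a simplified Python sketch. It is not the full MINE approximation algorithm: it only tries equal-frequency partitions for every grid shape whose cell count stays below the budget, and takes the largest normalized mutual information. The budget exponent 0.6 is the commonly used default rather than a value given in the text, and all function names are hypothetical. Tied values may be split across bins by the rank-based binning, so the sketch is best read as an illustration of the idea.

```python
import math
from collections import Counter


def equal_frequency_bins(values, n_bins):
    """Assign each value a bin index in 0..n_bins-1 according to its rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    for rank, idx in enumerate(order):
        bins[idx] = rank * n_bins // len(values)
    return bins


def mutual_information(bins_x, bins_y):
    """Mutual information (in bits) of two discretized factors."""
    n = len(bins_x)
    joint = Counter(zip(bins_x, bins_y))
    count_x, count_y = Counter(bins_x), Counter(bins_y)
    mi = 0.0
    for (i, j), c in joint.items():
        p_ij = c / n
        mi += p_ij * math.log2(p_ij / ((count_x[i] / n) * (count_y[j] / n)))
    return mi


def approx_mic(x, y, alpha=0.6):
    """Largest normalized mutual information over grids with a * b < n ** alpha.

    Simplification: only equal-frequency grids are tried, so the result is a
    lower bound on the value the full MIC search would find.
    """
    budget = len(x) ** alpha
    best = 0.0
    a = 2
    while a * 2 < budget:
        b = 2
        while a * b < budget:
            mi = mutual_information(equal_frequency_bins(x, a),
                                    equal_frequency_bins(y, b))
            best = max(best, mi / math.log2(min(a, b)))
            b += 1
        a += 1
    return best
```

For example, a factor and its square over 64 cells share the same rank order, so the 2 x 2 grid already captures the dependence fully and the sketch returns a value of about 1.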

[0033] S212: Normalize the maximum information coefficients between various types of environmental factors. Based on the normalized maximum information coefficients between various types of environmental factors and the preset second correlation threshold, obtain several pairs of nonlinearly correlated environmental factors and construct a set of nonlinearly correlated environmental factors.

[0034] In this embodiment, the prediction device normalizes the maximum information coefficients between various types of environmental factors to obtain the normalized maximum information coefficients between various types of environmental factors, making the nonlinear dependence between different environmental factor pairs comparable.

[0035] The prediction device obtains several pairs of nonlinearly correlated environmental factors based on the normalized maximum information coefficients between the various types of environmental factors and a preset second correlation threshold, and constructs a set of nonlinearly correlated environmental factors. Specifically, when the normalized maximum information coefficient of a pair of environmental factors is higher than the second correlation threshold, it is determined that there is a significant nonlinear correlation between the two environmental factors, and the pair is treated as a nonlinearly correlated environmental factor pair; the several pairs obtained in this way make up the set of nonlinearly correlated environmental factors.

[0036] In an optional embodiment, for environmental factor pairs determined to be nonlinearly correlated, the prediction device further filters several nonlinearly correlated environmental factor pairs based on the physiological and ecological characteristics of crops in the corresponding grid cells, so as to prioritize and retain environmental factors that have a more direct impact on crop suitability, clearer ecological significance and better spatial stability as the final nonlinearly correlated environmental factor pairs.

[0037] The environmental factor screening results also include a set of key environmental factors. Referring to Figure 4, which is a flowchart of step S2 of the crop suitability-based planting distribution area prediction method provided in another embodiment of this application, step S2 includes steps S221 to S222, as follows:

S221: Calculate the contribution of each type of environmental factor to each grid cell under each machine learning model based on the environmental factor dataset and several preset machine learning models.

[0038] The machine learning models may be tree-based models such as CatBoost, XGBoost, LightGBM, Random Forest, and AdaBoost. For any sample to be predicted, i.e., the set of environmental factor values corresponding to one spatial grid cell, the model's prediction can be represented as the sample being routed layer by layer from the root node to a leaf node along each decision tree, where each leaf node corresponds to an incremental contribution to the final prediction result. In the TreeSHAP calculation, a single decision tree is analyzed first: all possible feature-subset paths in the tree are enumerated and, with the tree structure held fixed, the change in predicted value under different feature combinations is calculated, yielding the marginal contribution value of each environmental factor to the prediction result in that tree. By taking a weighted average according to whether each feature participates in node splitting, this process accounts for every possible order in which features could be added, which ensures that the contribution of each environmental factor is calculated fairly.

[0039] In this embodiment, the prediction device calculates the contribution based on the environmental factor dataset and several preset machine learning models to obtain the contribution value of each type of environmental factor to each grid cell under each machine learning model, wherein the contribution value is:

$$\phi_i = \sum_{S \subseteq N \setminus \{i\}} \frac{|S|!\,\left(|N| - |S| - 1\right)!}{|N|!}\,\left[f(S \cup \{i\}) - f(S)\right]$$

[0040] In the formula, $\phi_i$ is the marginal contribution value of the $i$-th type of environmental factor to the current machine learning model on the current grid cell; $S$ is a subset of environmental factors, namely any subset that does not contain the $i$-th type of environmental factor; $N$ is the environmental factor dataset; $N \setminus \{i\}$ is the environmental factor dataset with the $i$-th type of environmental factor removed; and $f(S \cup \{i\}) - f(S)$ is the marginal contribution, i.e. the change in the prediction result of the current machine learning model after the $i$-th type of environmental factor is added to the current environmental factor subset $S$.
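The Shapley-style contribution described here can be checked on a small example by brute force. The following Python sketch enumerates every subset directly, which is only feasible for a handful of factors; TreeSHAP obtains the same quantity efficiently for tree models. The `value_fn` argument is a hypothetical stand-in for the model's prediction when only the factors in a subset are known, and the function name is an assumption.

```python
from itertools import combinations
from math import factorial


def shapley_values(value_fn, features):
    """Exact Shapley values by enumerating all subsets (exponential cost).

    value_fn: callable taking a frozenset of feature names and returning the
        prediction obtained when only those features are known (hypothetical).
    features: list of feature names.
    """
    n = len(features)
    phi = {}
    for i in features:
        others = [f for f in features if f != i]
        total = 0.0
        for size in range(n):
            # Weight |S|! (n - |S| - 1)! / n! shared by all subsets of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = frozenset(subset)
                total += weight * (value_fn(s | {i}) - value_fn(s))
        phi[i] = total
    return phi
```

A useful sanity check is that the values sum to the difference between the prediction with all factors and the prediction with none.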

[0041] For gradient boosting tree models such as CatBoost, XGBoost, and LightGBM, the model prediction result is obtained by summing the outputs of multiple weak learning trees. Therefore, after calculating the Shapley value of a single tree, the Shapley values of the corresponding environmental factors in all trees are summed to obtain the total contribution of that environmental factor to the predicted sample under the model. For the Random Forest model, since the prediction result is jointly determined by multiple independent decision trees, the Shapley value is calculated for each tree separately, and the average of the results of all trees is taken to obtain the final contribution of the environmental factor. For the AdaBoost model, based on the Shapley value of a single weak classifier or regression tree, the weight coefficients of each weak learner in the model are further combined to perform a weighted summation of their contributions, thereby obtaining the contribution value of each type of environmental factor to each grid cell.
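The three aggregation rules above differ only in the weights applied to the per-tree values, which the following Python sketch makes explicit. The function name and the `model_kind` labels are hypothetical, and the per-tree Shapley values are assumed to have been computed already for one grid cell.

```python
def aggregate_tree_contributions(per_tree_values, model_kind, tree_weights=None):
    """Combine per-tree Shapley values into one contribution per factor.

    per_tree_values: one list per tree, each holding one value per factor.
    model_kind: "boosting" (sum), "random_forest" (mean), or "adaboost"
        (weighted sum using tree_weights). These labels are illustrative.
    tree_weights: weak-learner weight coefficients, required for "adaboost".
    """
    n_trees = len(per_tree_values)
    n_factors = len(per_tree_values[0])
    if model_kind == "boosting":
        weights = [1.0] * n_trees
    elif model_kind == "random_forest":
        weights = [1.0 / n_trees] * n_trees
    elif model_kind == "adaboost":
        if tree_weights is None or len(tree_weights) != n_trees:
            raise ValueError("adaboost needs one weight per weak learner")
        weights = list(tree_weights)
    else:
        raise ValueError("unknown model kind: " + str(model_kind))
    return [
        sum(w * tree[j] for w, tree in zip(weights, per_tree_values))
        for j in range(n_factors)
    ]
```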

[0042] S222: Average the contribution values of each type of environmental factor to each grid cell under the same machine learning model to obtain the average contribution value of each type of environmental factor under each machine learning model; extract key environmental factors based on the average contribution value of each type of environmental factor under each machine learning model to obtain several key environmental factors, and construct a key environmental factor set.

[0043] In this embodiment, the prediction device averages the contribution values of each type of environmental factor to each grid cell under the same machine learning model to obtain the average contribution value of each type of environmental factor under each machine learning model.

[0044] The prediction device extracts key environmental factors based on the average contribution value of each type of environmental factor under each machine learning model, obtains several key environmental factors, and constructs a key environmental factor set.

[0045] Specifically, the prediction device comprehensively compares the average contribution values of the various types of environmental factors under the various machine learning models. When the contribution value of an environmental factor is low across different models, or fluctuates significantly over multiple training runs, the contribution of that environmental factor to crop suitability prediction is judged to be unstable and the factor is removed. Only the environmental factors that show a high contribution and good stability under multiple model conditions are retained as key environmental factors to construct the key environmental factor set.
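One way to read the averaging and selection in S222 is sketched below in Python. Several choices are assumptions rather than statements from the text: contributions are averaged as absolute values (a common convention, since signed contributions tend to cancel), each factor's average is expressed as a share of the model's total, and a factor is kept only if its share reaches a hypothetical `min_share` in every model. The stability check across repeated training runs is not modeled here.

```python
def mean_abs_contribution(cell_values):
    """Per-factor mean of absolute contribution values over all grid cells.

    cell_values: one list per grid cell, each holding one value per factor.
    """
    n_cells = len(cell_values)
    n_factors = len(cell_values[0])
    return [
        sum(abs(row[j]) for row in cell_values) / n_cells
        for j in range(n_factors)
    ]


def select_key_factors(contrib_by_model, factor_names, min_share=0.1):
    """Keep factors whose contribution share is at least min_share in every model.

    contrib_by_model: dict mapping a model name to its per-cell contributions.
    min_share: hypothetical cut-off; the source gives no numeric value.
    """
    shares_by_model = []
    for cells in contrib_by_model.values():
        means = mean_abs_contribution(cells)
        total = sum(means)
        shares_by_model.append([m / total for m in means])
    return [
        name
        for j, name in enumerate(factor_names)
        if min(shares[j] for shares in shares_by_model) >= min_share
    ]
```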

[0046] S3: Perform fusion analysis on the screening results of several target environmental factors to determine the target environmental factors and obtain the target environmental factor dataset of the sample area in the historical time period.

[0047] In this embodiment, the prediction device performs a fusion analysis on the screening results of several target environmental factors to determine the target environmental factors and obtain a dataset of target environmental factors for the sample area over a historical period. By fusing correlation analysis, the multi-source environmental factors are systematically optimized, avoiding the model instability problems caused by arbitrary selection of environmental factors and severe variable redundancy in traditional methods. This ensures that the crop suitability prediction results maintain high consistency and reliability under different sample conditions, thereby significantly improving the scientific validity of the prediction results.

[0048] Please refer to Figure 5, which is a flowchart of S3 in the crop suitability-based planting distribution area prediction method provided in one embodiment of this application. S3 includes steps S31 to S32, as follows:

S31: Construct a set of candidate environmental factors.

[0049] In this embodiment, the prediction device constructs a candidate environmental factor set based on the linear redundancy environmental factor set, the nonlinear correlation environmental factor set, and the key environmental factor set, wherein the candidate environmental factor set is as follows:

[0050] In the formula, the symbols denote, in order: the set of candidate environmental factors; the set of linearly redundant environmental factors; the set of nonlinearly correlated environmental factors; and the set of key environmental factors.

[0051] S32: Based on the candidate environmental factor set and the preset trust degree calculation algorithm, obtain the trust value of each type of environmental factor; determine the target environmental factor based on the trust value of each type of environmental factor, and obtain the target environmental factor dataset of the sample area in the historical time period.

[0052] In this embodiment, the prediction device obtains the trust value of each type of environmental factor based on the candidate environmental factor set and a preset trust calculation algorithm, wherein the trust calculation algorithm is as follows:

[0053] In the formula, the symbols denote, in order: the set of candidate environmental factors; the set of linearly redundant environmental factors; the set of nonlinearly correlated environmental factors; and the set of key environmental factors.

[0054] The prediction device determines the target environmental factors based on the trust value of each type of environmental factor, and obtains the target environmental factor dataset of the sample area in the historical time period. Specifically, the prediction device takes the types of environmental factors with the largest trust values as the target environmental factors and obtains the target environmental factor dataset of the sample area in the historical time period.
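The trust formula itself is not reproduced in this text, so the following is only a hedged sketch of one plausible fusion rule: each screening result is treated as a set of factors it supports, and the trust value of a factor is the weighted fraction of screening results that contain it. The functions `trust_values` and `select_targets`, the optional `weights`, the cut-off `k`, and the factor names in the usage note are all hypothetical.

```python
def trust_values(screening_sets, weights=None):
    """Trust of each candidate factor = weighted share of sets that contain it.

    screening_sets: list of sets of factor names (one set per screening method).
    weights: optional per-method weights (assumption; equal weights by default).
    """
    candidates = set().union(*screening_sets)  # candidate set as a union (assumption)
    if weights is None:
        weights = [1.0] * len(screening_sets)
    total = sum(weights)
    return {
        factor: sum(w for s, w in zip(screening_sets, weights) if factor in s) / total
        for factor in sorted(candidates)
    }

def select_targets(trust, k):
    """Return the k factors with the largest trust values (ties broken by name)."""
    ranked = sorted(trust.items(), key=lambda kv: (-kv[1], kv[0]))
    return [factor for factor, _ in ranked[:k]]
```

For example, with the hypothetical sets `{"bio1", "bio12", "elev"}`, `{"bio1", "bio12"}` and `{"bio1", "slope"}`, the factor `bio1` receives a trust value of 1.0 and `bio12` a value of 2/3, so `select_targets(trust, 2)` keeps those two.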

[0055] S4: Input the target environmental factor dataset and crop suitability raster label data of the sample area in the historical time period into the crop suitability prediction model to be trained, and obtain the target crop suitability prediction model.

[0056] The crop suitability prediction model adopts the maximum entropy model. The maximum entropy (MaxEnt) model is a species distribution prediction model based on the principle of maximum entropy: when estimating the unknown spatial distribution of a species, it selects, among all probability distributions that satisfy the known constraints (that is, that match the environmental characteristics at the known distribution points of the species), the one with the largest entropy.
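As a hedged illustration of the principle, the maximum entropy distribution under linear feature constraints is known to take an exponential (Gibbs) form over the grid cells. The sketch below computes that form; whether the probability calculation algorithm of this application uses exactly this expression is an assumption, and the names `gibbs_distribution`, `features` and `weights` are introduced here for illustration only.

```python
import numpy as np

def gibbs_distribution(features, weights):
    """Exponential-family distribution over grid cells.

    features: (G, m) array, feature values of m factors on G grid cells.
    weights:  (m,) array, one weight coefficient per factor.
    Returns a length-G vector of probabilities that sums to 1.
    """
    features = np.asarray(features, dtype=float)
    weights = np.asarray(weights, dtype=float)
    scores = features @ weights          # weighted sum of features per cell
    scores = scores - scores.max()       # shift for numerical stability
    expo = np.exp(scores)
    return expo / expo.sum()             # normalise over all G cells
```

With all weights equal to zero no constraint is active, and the result is the uniform distribution, which is the distribution of largest entropy.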

[0057] In this embodiment, the prediction device inputs the target environmental factor dataset and crop suitability raster label data of the sample area in the historical time period into the crop suitability prediction model to be trained, and obtains the target crop suitability prediction model.

[0058] Please refer to Figure 6, which is a flowchart of S4 in the crop suitability-based planting distribution area prediction method provided in one embodiment of this application. S4 includes steps S41 to S42, as follows:

S41: Based on the target environmental factor data of each grid cell in the target environmental factor dataset and the preset crop suitability probability calculation algorithm, obtain crop suitability prediction probability data.

[0059] In this embodiment, the prediction device obtains crop suitability prediction probability data based on the target environmental factor data of each grid cell in the target environmental factor dataset and a preset crop suitability probability calculation algorithm. This data reflects the ecological potential of ecological and climatic conditions on the spatial distribution of a crop. The crop suitability prediction probability data includes crop suitability prediction probability vectors for each grid cell. The crop suitability probability calculation algorithm is as follows:

[0060] In the formula, the defined quantities are: the crop suitability prediction probability vector of the g-th grid cell; G, the number of grid cells; m, the number of target environmental factors; the feature weight coefficient of the i-th target environmental factor; and the feature value of the i-th target environmental factor for the g-th grid cell.

S42: Calculate the loss value based on the crop suitability prediction probability vector of each grid cell in the crop suitability prediction probability data and the crop suitability label probability vector of each grid cell in the crop suitability label probability data; train the crop suitability prediction model to be trained based on the loss value to obtain the target crop suitability prediction model.

[0061] In this embodiment, the prediction device calculates a loss value based on the crop suitability prediction probability vector of each grid cell in the crop suitability prediction probability data and the crop suitability label probability vector of each grid cell in the crop suitability label probability data. Specifically, the loss value is constructed based on the cross-entropy criterion, and class weights are introduced to alleviate the class imbalance problem.
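A class-weighted cross-entropy of this kind can be sketched as below. The text does not give the exact loss expression or how the class weights are chosen, so the averaging over grid cells, the clipping constant `eps`, and the function name `weighted_cross_entropy` are assumptions.

```python
import numpy as np

def weighted_cross_entropy(pred, label, class_weights, eps=1e-12):
    """Class-weighted cross-entropy averaged over grid cells.

    pred, label:   (G, C) arrays of predicted and label probability vectors.
    class_weights: (C,) array; larger weights emphasise rarer classes.
    """
    pred = np.clip(np.asarray(pred, dtype=float), eps, 1.0)  # avoid log(0)
    label = np.asarray(label, dtype=float)
    w = np.asarray(class_weights, dtype=float)
    per_cell = -(w * label * np.log(pred)).sum(axis=1)
    return per_cell.mean()
```

Giving a rare class (for example, suitable cells) a weight above 1 increases the penalty for misclassifying it, which is the usual motivation for class weights under imbalance.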

[0062] The prediction device trains the crop suitability prediction model to be trained based on the loss value to obtain the target crop suitability prediction model.

[0063] S5: Obtain the target environmental factor dataset of the area to be predicted, input the target environmental factor dataset of the area to be predicted into the target crop suitability prediction model to perform crop suitability prediction, and obtain the crop suitability raster prediction data of the area to be predicted.

[0064] In this embodiment, the prediction device obtains a target environmental factor dataset of the area to be predicted, inputs the target environmental factor dataset of the area to be predicted into the target crop suitability prediction model to perform crop suitability prediction, and obtains crop suitability raster prediction data of the area to be predicted.

[0065] S6: Obtain the area prediction data of the area to be predicted; predict the planting distribution area based on the area prediction data of the area to be predicted and the crop suitability grid prediction data, and obtain the planting distribution area prediction result of the area to be predicted.

[0066] In this embodiment, the prediction device obtains the area prediction data of the area to be predicted. Specifically, the area prediction data is crop distribution intensity information extracted based on historical harvest area raster data, which can be understood as the actual or potential planting intensity of each spatial unit, obtained through multi-year averaging and spatial interpolation.
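The multi-year averaging step can be sketched as follows. This is a simplified illustration: the spatial interpolation mentioned above is omitted, missing observations are assumed to be encoded as NaN, and the rescaling to the range 0 to 1 is an assumption made here so that the result can act as a per-cell intensity. The name `mean_harvest_intensity` is hypothetical.

```python
import numpy as np

def mean_harvest_intensity(stack):
    """Average yearly harvest-area rasters and rescale to [0, 1].

    stack: (years, rows, cols) array; NaN marks a missing observation.
    """
    stack = np.asarray(stack, dtype=float)
    mean_area = np.nanmean(stack, axis=0)      # per-cell mean over available years
    return mean_area / np.nanmax(mean_area)    # relative planting intensity
```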

[0067] The prediction device predicts the planting distribution area based on the area prediction data and the crop suitability raster prediction data of the area to be predicted, and obtains the planting distribution area prediction result of the area to be predicted. Optimizing the multi-source environmental factors reduces the effect of collinearity between environmental factors, ensuring high consistency and reliability of the crop suitability prediction results under different sample conditions. The probability-based crop suitability prediction method continuously characterizes changes in crop suitability at the spatial unit scale, improving the accuracy of the prediction results.

Please refer to Figure 7, which is a flowchart of S6 in the crop suitability-based planting distribution area prediction method provided in one embodiment of this application. S6 includes steps S61 to S62, as follows:

S61: Multiply the area prediction parameter of each grid cell in the area prediction data by the crop suitability prediction probability vector of the corresponding grid cell in the crop suitability raster prediction data to obtain the comprehensive index of each grid cell.

[0068] In this embodiment, the prediction device multiplies the area prediction parameters of each grid cell in the area prediction data with the crop suitability prediction probability vector of the corresponding grid cell in the crop suitability grid prediction data to obtain the comprehensive index of each grid cell.

[0069] S62: Based on the comprehensive index of each grid cell and the preset planting area identification threshold, each grid cell is divided into planting area grids and non-planting area grids to obtain the planting distribution area prediction result of the area to be predicted.

[0070] In this embodiment, the prediction device divides the grid cells into planting area grids and non-planting area grids based on the comprehensive index of each grid cell and a preset planting area identification threshold. A grid cell whose comprehensive index is greater than the planting area identification threshold is identified as a planting area grid, and the remaining grid cells are identified as non-planting area grids. The planting area grids together form the planting distribution area prediction result of the area to be predicted, thus achieving a transition from the potential distribution of the crop to its actual suitable distribution. The area prediction data provides spatial weights, while the crop suitability data reflects ecological constraints. The planting distribution area result obtained by applying both constraints simultaneously is climatically and ecologically reasonable and preserves the spatial consistency of the actual agricultural distribution pattern.
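Steps S61 and S62 amount to an element-wise product followed by a threshold, as in the minimal sketch below. It assumes that the suitability of each grid cell is reduced to a single probability and that both rasters share the same shape; the function name `planting_mask` and the example threshold are hypothetical.

```python
import numpy as np

def planting_mask(area_param, suitability, threshold):
    """Combine area intensity and suitability, then threshold.

    area_param, suitability: arrays of the same shape (one value per grid cell).
    Returns (index, mask), where mask is True for planting area grids.
    """
    index = np.asarray(area_param, dtype=float) * np.asarray(suitability, dtype=float)
    return index, index > threshold
```

A cell with high suitability but almost no historical planting intensity, or the reverse, yields a low comprehensive index and is therefore excluded, which reflects the double constraint described above.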

[0071] Please refer to Figure 8, which is a schematic diagram of a crop suitability-based planting distribution area prediction device according to an embodiment of this application. The device can be implemented entirely or partially through software, hardware, or a combination of both. The device 8 includes:

The data acquisition module 81, used to acquire an environmental factor dataset and crop suitability raster label data of the sample area over a historical period, wherein the environmental factor dataset includes environmental factor data of several raster cells of the sample area, and the environmental factor data includes several types of environmental factors.

The environmental factor screening module 82, used to screen target environmental factors based on the environmental factor dataset using several environmental factor screening methods to obtain several environmental factor screening results, wherein the environmental factor screening results include several types of environmental factors after screening.

The environmental factor analysis module 83, used to perform fusion analysis on the screening results of several target environmental factors, determine the target environmental factors, and obtain the target environmental factor dataset of the sample area in the historical time period.

The model training module 84, used to input the target environmental factor dataset and crop suitability raster label data of the sample area in the historical time period into the crop suitability prediction model to be trained, so as to obtain the target crop suitability prediction model.

The crop suitability prediction module 85, used to obtain the target environmental factor dataset of the area to be predicted, input it into the target crop suitability prediction model to perform crop suitability prediction, and obtain the crop suitability raster prediction data of the area to be predicted.

The planting distribution area prediction module 86, used to obtain the area prediction data of the area to be predicted, and to predict the planting distribution area based on the area prediction data of the area to be predicted and the crop suitability raster prediction data, so as to obtain the planting distribution area prediction result of the area to be predicted.

[0072] In this embodiment, a data acquisition module obtains environmental factor datasets and crop suitability raster label data for a sample area over a historical period. The environmental factor dataset includes environmental factor data for several raster cells within the sample area, and these environmental factor data include several types of environmental factors. An environmental factor screening module employs several environmental factor screening methods to screen target environmental factors based on the environmental factor dataset, obtaining several environmental factor screening results. These environmental factor screening results include several types of screened environmental factors. An environmental factor analysis module performs a fusion analysis on the several target environmental factor screening results to determine the target environmental factors, thus obtaining a target environmental factor dataset for the sample area over a historical period. The process involves several steps: First, a model training module inputs the target environmental factor dataset and crop suitability raster label data for the sample region over a historical period into the crop suitability prediction model for training, resulting in a target crop suitability prediction model. Second, a crop suitability prediction module obtains the target environmental factor dataset for the region to be predicted, inputting this dataset into the target crop suitability prediction model to predict crop suitability, resulting in crop suitability raster prediction data for the region to be predicted. Third, a planting distribution area prediction module obtains the area prediction data for the region to be predicted. Finally, based on the area prediction data and crop suitability raster prediction data, a planting distribution area prediction is performed to obtain the planting distribution area prediction result for the region to be predicted. 
This approach optimizes multi-source environmental factors to reduce collinearity among them, ensuring high consistency and reliability of crop suitability prediction results under different sample conditions. The probability-based crop suitability prediction method continuously characterizes changes in crop suitability at the spatial unit scale, improving the accuracy of the prediction results.

[0073] Please refer to Figure 9, which is a schematic diagram of the structure of a computer device provided in one embodiment of this application. The computer device 9 includes: a processor 91, a memory 92, and a computer program 93 stored in the memory 92 and executable on the processor 91. The computer device can store multiple instructions, which are adapted to be loaded by the processor 91 to execute the method steps of the embodiments illustrated in Figures 1 to 7. For the specific execution process, refer to the description of the embodiments illustrated in Figures 1 to 7, which is not repeated here.

[0074] The processor 91 may include one or more processing cores. The processor 91 connects to various parts of the server using various interfaces and lines. It executes various functions and processes data of the crop suitability-based planting distribution area prediction device 8 by running or executing instructions, programs, code sets, or instruction sets stored in the memory 92, and by calling data from the memory 92. Optionally, the processor 91 may be implemented using at least one hardware form of Digital Signal Processing (DSP), Field-Programmable Gate Array (FPGA), or Programmable Logic Array (PLA). The processor 91 may integrate one or more of the following: a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), and a modem. The CPU primarily handles the operating system, user interface, and applications; the GPU is responsible for rendering and drawing the content required to be displayed on the touch screen; and the modem handles wireless communication. It is understood that the modem may also not be integrated into the processor 91 and may be implemented as a separate chip.

[0075] The memory 92 may include random access memory (RAM) or read-only memory. Optionally, the memory 92 may include a non-transitory computer-readable storage medium. The memory 92 can be used to store instructions, programs, code, code sets, or instruction sets. The memory 92 may include a program storage area and a data storage area, wherein the program storage area may store instructions for implementing an operating system, instructions for at least one function (such as touch instructions), instructions for implementing the various method embodiments described above, etc.; the data storage area may store data involved in the various method embodiments described above, etc. Optionally, the memory 92 may also be at least one storage device located remotely from the aforementioned processor 91.

[0076] An embodiment of this application also provides a storage medium that can store multiple instructions, which are adapted to be loaded by a processor to execute the method steps of the embodiments illustrated in Figures 1 to 7. For the specific execution process, refer to the description of the embodiments illustrated in Figures 1 to 7, which is not repeated here.

[0077] Those skilled in the art will clearly understand that, for the sake of convenience and brevity, the above-described division of functional units and modules is merely an example. In practical applications, the above functions can be assigned to different functional units and modules as needed, that is, the internal structure of the device can be divided into different functional units or modules to complete all or part of the functions described above. The functional units and modules in the embodiments can be integrated into one processing unit, or each unit can exist physically separately, or two or more units can be integrated into one unit. The integrated unit can be implemented in hardware or as a software functional unit. Furthermore, the specific names of the functional units and modules are only for easy differentiation and are not intended to limit the scope of protection of this application. The specific working process of the units and modules in the above system can be referred to the corresponding process in the foregoing method embodiments, and will not be repeated here.

[0078] In the above embodiments, the descriptions of each embodiment have different focuses. For parts that are not described in detail or recorded in a certain embodiment, please refer to the relevant descriptions of other embodiments.

[0079] Those skilled in the art will recognize that the units and algorithm steps of the various examples described in conjunction with the embodiments disclosed herein can be implemented in electronic hardware, or a combination of computer software and electronic hardware. Whether these functions are implemented in hardware or software depends on the specific application and design constraints of the algorithm. Those skilled in the art can use different methods to implement the described functions for each specific application, but such implementations should not be considered beyond the scope of this invention.

[0080] In the embodiments provided by this invention, it should be understood that the disclosed apparatus / terminal devices and methods can be implemented in other ways. For example, the apparatus / terminal device embodiments described above are merely illustrative. For instance, the division of modules or units is only a logical functional division, and in actual implementation, there may be other division methods. For example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. Furthermore, the coupling or direct coupling or communication connection shown or discussed may be through some interfaces; the indirect coupling or communication connection between devices or units may be electrical, mechanical, or other forms.

[0081] The units described as separate components may or may not be physically separate. The components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units can be selected to achieve the purpose of this embodiment according to actual needs.

[0082] Furthermore, the functional units in the various embodiments of the present invention can be integrated into one processing unit, or each unit can exist physically separately, or two or more units can be integrated into one unit. The integrated unit can be implemented in hardware or as a software functional unit.

[0083] If the integrated module / unit is implemented as a software functional unit and sold or used as an independent product, it can be stored in a computer-readable storage medium. Based on this understanding, all or part of the processes in the methods of the above embodiments of the present invention can also be implemented by a computer program instructing related hardware. The computer program can be stored in a computer-readable storage medium, and when executed by a processor, it can implement the steps of the various method embodiments described above. The computer program includes computer program code, which can be in the form of source code, object code, executable files, or certain intermediate forms.

[0084] This invention is not limited to the above-described embodiments. If any modifications or variations to this invention do not depart from the spirit and scope of this invention, and if such modifications and variations fall within the scope of the claims and equivalent technologies of this invention, then this invention also intends to include such modifications and variations.

Claims

1. A method for predicting planting distribution areas based on crop suitability, characterized in that it includes the following steps:
Obtaining an environmental factor dataset and crop suitability raster label data for a sample area over a historical period, wherein the environmental factor dataset includes environmental factor data for several raster cells of the sample area, and the environmental factor data includes several types of environmental factors;
Using several environmental factor screening methods to screen target environmental factors based on the environmental factor dataset, obtaining several environmental factor screening results, wherein the environmental factor screening results include several types of environmental factors after screening;
Performing a fusion analysis on the screening results of several target environmental factors to determine the target environmental factors, and obtaining a target environmental factor dataset for the sample area in the historical time period;
Inputting the target environmental factor dataset and the crop suitability raster label data of the sample area in the historical time period into the crop suitability prediction model to be trained, to obtain the target crop suitability prediction model;
Obtaining the target environmental factor dataset of the area to be predicted, inputting the target environmental factor dataset of the area to be predicted into the target crop suitability prediction model to perform crop suitability prediction, and obtaining the crop suitability raster prediction data of the area to be predicted;
Obtaining the area prediction data of the area to be predicted; predicting the planting distribution area based on the area prediction data of the area to be predicted and the crop suitability raster prediction data, and obtaining the planting distribution area prediction result of the area to be predicted.

2. The method for predicting planting distribution areas based on crop suitability according to claim 1, characterized in that: The environmental factor screening results include a set of linearly redundant environmental factors, a set of nonlinearly correlated environmental factors, and a set of key environmental factors. The step of using several environmental factor screening methods to screen target environmental factors based on the environmental factor dataset and obtain several environmental factor screening results includes the following steps: Based on the environmental factor dataset and a preset Pearson correlation coefficient calculation algorithm, Pearson correlation coefficients between the various types of environmental factors are obtained. Based on the Pearson correlation coefficients between the various types of environmental factors and a preset first correlation threshold, several linearly redundant environmental factor pairs are obtained, and a linearly redundant environmental factor set is constructed. The Pearson correlation coefficient calculation algorithm is as follows: In the formula, R is the Pearson correlation coefficient; X is the set of values of environmental factor x over all grid cells; Y is the set of values of environmental factor y over all grid cells; and N is the number of grid cells.

3. The method for predicting planting distribution areas based on crop suitability according to claim 1, characterized in that: The environmental factor screening results also include a set of nonlinearly correlated environmental factors; The step of using several environmental factor screening methods to screen target environmental factors based on the environmental factor dataset and obtain several environmental factor screening results includes the following steps: Based on the environmental factor dataset and a preset maximum information coefficient calculation algorithm, the maximum information coefficients between the various types of environmental factors are obtained, wherein the maximum information coefficient calculation algorithm is as follows: In the formula, the defined quantities are: the maximum information coefficient; max(), the function that returns the maximum value; the number of grid divisions of the environmental factors along the horizontal axis; the number of grid divisions of the environmental factors along the vertical axis; B, the upper limit of the total number of grid cells; the joint probability density, which represents the proportion of raster cells whose corresponding environmental factors simultaneously fall within the i-th interval of X and the j-th interval of Y; the marginal probability density of X, which represents the proportion of raster cells whose corresponding environmental factor falls within the i-th interval of X; and the marginal probability density of Y, which represents the proportion of raster cells whose corresponding environmental factor falls within the j-th interval of Y; The maximum information coefficients between the various types of environmental factors are normalized. Based on the normalized maximum information coefficients between the various types of environmental factors and a preset second correlation threshold, several nonlinearly correlated environmental factor pairs are obtained, and a set of nonlinearly correlated environmental factors is constructed.

4. The method for predicting planting distribution areas based on crop suitability according to claim 1, characterized in that: The environmental factor screening results also include a set of key environmental factors; The step of using several environmental factor screening methods to screen target environmental factors based on the environmental factor dataset and obtain several environmental factor screening results includes the following steps: Based on the environmental factor dataset and several preset machine learning models, contribution values are calculated to obtain the contribution value of each type of environmental factor to each grid cell under each machine learning model, wherein the contribution values are calculated as follows: In the formula, the defined quantities are: the marginal contribution value, on the current grid cell, of the i-th type of environmental factor to the current machine learning model; S, a subset of environmental factors, denoting any subset that does not contain the i-th type of environmental factor; N, the environmental factor dataset; and N \ {i}, the environmental factor dataset with the i-th type of environmental factor removed. The marginal contribution value represents the change in the prediction result of the current machine learning model after the i-th type of environmental factor is added to the current environmental factor subset S; The contribution values of each type of environmental factor to each grid cell under the same machine learning model are averaged to obtain the average contribution value of each type of environmental factor under each machine learning model. Based on the average contribution value of each type of environmental factor under each machine learning model, key environmental factors are extracted to obtain several key environmental factors and construct a key environmental factor set.

5. The method for predicting planting distribution areas based on crop suitability according to claim 2, characterized in that, The step of performing fusion analysis on the screening results of several target environmental factors to determine the target environmental factors and obtain the target environmental factor dataset of the sample area over a historical period includes the following steps: Construct a candidate environmental factor set, wherein the candidate environmental factor set is as follows: In the formula, the symbols denote, in order: the set of candidate environmental factors; the set of linearly redundant environmental factors; the set of nonlinearly correlated environmental factors; and the set of key environmental factors; Based on the candidate environmental factor set and a preset trust calculation algorithm, the trust value of each type of environmental factor is obtained; based on the trust value of each type of environmental factor, the target environmental factors are determined to obtain the target environmental factor dataset of the sample area in a historical time period, wherein the trust calculation algorithm is as follows: In the formula, the symbols denote, in order: the set of candidate environmental factors; the set of linearly redundant environmental factors; the set of nonlinearly correlated environmental factors; and the set of key environmental factors.

6. The method for predicting planting distribution areas based on crop suitability according to claim 1, characterized in that: The crop suitability label probability data includes crop suitability label probability vectors of several grid cells; The step of inputting the target environmental factor dataset and crop suitability raster label data of the sample area in a historical time period into the crop suitability prediction model to be trained, and obtaining the target crop suitability prediction model, includes the following steps: Based on the target environmental factor data of each grid cell in the target environmental factor dataset and a preset crop suitability probability calculation algorithm, crop suitability prediction probability data is obtained. The crop suitability prediction probability data includes a crop suitability prediction probability vector for each grid cell. The crop suitability probability calculation algorithm is as follows: In the formula, the defined quantities are: the crop suitability prediction probability vector of the g-th grid cell; G, the number of grid cells; m, the number of target environmental factors; the feature weight coefficient of the i-th target environmental factor; and the feature value of the i-th target environmental factor for the g-th grid cell; A loss value is calculated based on the crop suitability prediction probability vector of each grid cell in the crop suitability prediction probability data and the crop suitability label probability vector of each grid cell in the crop suitability label probability data; based on the loss value, the crop suitability prediction model to be trained is trained to obtain the target crop suitability prediction model.

7. The method for predicting planting distribution areas based on crop suitability according to claim 3, characterized in that the area prediction data includes an area prediction parameter for each grid cell; the step of predicting the planting distribution area based on the area prediction data and the crop suitability raster prediction data of the area to be predicted, so as to obtain the planting distribution area prediction result of the area to be predicted, includes the following steps: multiplying the area prediction parameter of each grid cell in the area prediction data by the crop suitability prediction probability vector of the corresponding grid cell in the crop suitability raster prediction data to obtain a comprehensive index for each grid cell; and, based on the comprehensive index of each grid cell and a preset planting area identification threshold, classifying each grid cell as either a planting area grid cell or a non-planting area grid cell, so as to obtain the planting distribution area prediction result of the area to be predicted.
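The multiply-and-threshold step of claim 7 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: each grid cell carries a single suitability probability (the claim speaks of a probability vector), cells whose index is greater than or equal to the threshold count as planting area (the claim does not say whether the comparison is strict), and the function name and sample values are hypothetical.

```python
import numpy as np

def predict_planting_area(area_params, suitability_prob, threshold):
    """Combine per-cell area parameters with suitability probabilities.

    area_params:      array of area prediction parameters, one per grid cell.
    suitability_prob: array of the same shape with suitability probabilities.
    threshold:        preset planting area identification threshold.
    Returns (index, mask): the comprehensive index of each cell and a boolean
    mask that is True for planting area cells (assumed rule: index >= threshold).
    """
    area_params = np.asarray(area_params, dtype=float)
    suitability_prob = np.asarray(suitability_prob, dtype=float)
    if area_params.shape != suitability_prob.shape:
        raise ValueError("area_params and suitability_prob must share a shape")
    index = area_params * suitability_prob   # element-wise product per cell
    return index, index >= threshold

if __name__ == "__main__":
    params = [[1.0, 0.5], [0.2, 0.0]]        # illustrative values only
    probs = [[0.9, 0.8], [0.9, 0.7]]
    index, mask = predict_planting_area(params, probs, threshold=0.5)
    print(index)
    print(mask)
```

Returning the continuous index alongside the mask keeps the probability-based characterisation of each cell available after the binary classification.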

8. A crop suitability-based planting distribution area prediction device, characterized in that it comprises: a data acquisition module, used to acquire an environmental factor dataset and crop suitability raster label data of the sample area over a historical time period, wherein the environmental factor dataset includes environmental factor data of several raster cells of the sample area, and the environmental factor data includes several types of environmental factors; an environmental factor screening module, used to screen target environmental factors from the environmental factor dataset using several environmental factor screening methods to obtain several environmental factor screening results, wherein each environmental factor screening result includes the types of environmental factors retained after screening; an environmental factor analysis module, used to perform fusion analysis on the several environmental factor screening results, determine the target environmental factors, and obtain the target environmental factor dataset of the sample area over the historical time period; a model training module, used to input the target environmental factor dataset and the crop suitability raster label data of the sample area over the historical time period into the crop suitability prediction model to be trained, so as to obtain the target crop suitability prediction model; a crop suitability prediction module, used to obtain a target environmental factor dataset of the area to be predicted and input it into the target crop suitability prediction model to perform crop suitability prediction, so as to obtain crop suitability raster prediction data of the area to be predicted; and
a planting distribution area prediction module, used to obtain area prediction data of the area to be predicted and to predict the planting distribution area based on the area prediction data and the crop suitability raster prediction data of the area to be predicted, so as to obtain the planting distribution area prediction result of the area to be predicted.

9. A computer device, characterized in that it comprises: a processor, a memory, and a computer program stored in the memory and executable on the processor; the computer program, when executed by the processor, implements the steps of the crop suitability-based planting distribution area prediction method according to any one of claims 1 to 7.

10. A storage medium, characterized in that the storage medium stores a computer program that, when executed by a processor, implements the steps of the crop suitability-based planting distribution area prediction method according to any one of claims 1 to 7.