Vegetation drought index construction method integrating soil information and multi-source remote sensing data
By constructing a vegetation drought index that integrates comprehensive soil information and multi-source remote sensing data, and combining multiple remote sensing indices with soil physical information, the shortcomings of existing vegetation drought monitoring technologies have been addressed, enabling accurate, applicable, and forward-looking monitoring of drought conditions.
Patent Information
- Authority / Receiving Office
- WO · WO
- Patent Type
- Applications
- Current Assignee / Owner
- SHENZHEN INST OF ADVANCED TECH CHINESE ACAD OF SCI
- Filing Date
- 2024-12-25
- Publication Date
- 2026-07-02
AI Technical Summary
Existing technologies cannot comprehensively and accurately monitor vegetation drought conditions. A single remote sensing index is inadequate to reflect the complexity of vegetation drought. Traditional methods fall short in spatial coverage and timeliness and do not fully consider soil physical characteristics, so monitoring results are delayed or contain large errors.
A vegetation drought index integrating soil information and multi-source remote sensing data was constructed. Multiple remote sensing indices were integrated through principal component analysis, and the first principal component was extracted as the comprehensive drought index in combination with soil physical information. Considering the lagged response of vegetation to drought, Z-score processing and data dimensionality reduction were performed to avoid the subjectivity of empirical weights.
It improves the accuracy and applicability of vegetation drought monitoring, making it applicable to different ecological environments, enhancing the accuracy of drought detection and prediction capabilities, and providing scientific drought early warning support.
Description
A Method for Constructing a Vegetation Drought Index Based on Integrated Soil Information and Multi-Source Remote Sensing Data
Technical Field
[0001] This invention relates to the field of agricultural remote sensing technology, and more specifically, to a method for constructing a vegetation drought index that integrates soil information and multi-source remote sensing data.
Background Technology
[0002] Vegetation drought is a continuation and transfer of meteorological drought. Its occurrence and development are influenced by a variety of complex factors, including the volatility of meteorological factors, the drought resistance of crop species, soil water retention capacity, and the overall characteristics of the regional ecological environment. Because vegetation drought involves not only meteorological conditions such as precipitation and temperature, but also soil moisture, vegetation cover, and its response speed to climate change, monitoring a single indicator is often insufficient to comprehensively and accurately identify and assess vegetation drought conditions. Furthermore, vegetation exhibits a significant lag in response to meteorological drought and changes in soil moisture. The physical and chemical composition of the soil (such as soil texture and organic matter content) also has a crucial impact on field water holding capacity. These factors collectively determine the drought resistance of soil and vegetation. Moreover, the diversity and complexity of soil composition not only affect its ability to absorb and retain precipitation but also directly determine the water use efficiency of crop roots. This multi-factor interaction places higher demands on the monitoring and prediction of the occurrence mechanism and dynamic changes of vegetation drought.
[0003] Traditional drought monitoring methods mainly rely on meteorological observation data (such as precipitation and evapotranspiration) and crop physiological models, but these methods have the following limitations. Insufficient spatial coverage: limited meteorological station data makes it difficult to reflect drought conditions over a wide area. Insufficient timeliness: monitoring methods that rely on ground observations cannot meet the need for rapid updates. Single variable: traditional drought indices (such as the Standardized Precipitation Index (SPI) and the Crop Water Stress Index (CWSI)) are usually based on a single variable and struggle to reflect the comprehensive characteristics of drought.
[0004] With the rapid development of remote sensing technology, the ability to acquire large-scale, high-frequency data has significantly improved, providing new possibilities for drought monitoring. Remote sensing technology can acquire surface radiation and reflectance information to generate key drought-related remote sensing indices, such as: vegetation-related indices: NDVI (Normalized Difference Vegetation Index) and VCI (Vegetation Condition Index); meteorological-related indices: TCI (Temperature Condition Index); soil-related indices: SWDI (Soil Moisture Drought Index); and water-related indices: NDWI (Normalized Difference Water Index, referred to in this description as the Water Body Index). These indices provide rich data support for analyzing the relationship between meteorological drought and vegetation drought. As research deepens, vegetation drought monitoring is trending towards multi-source remote sensing data fusion. A single remote sensing index cannot fully reflect the complexity of vegetation drought, while multi-source remote sensing data fusion can integrate multi-dimensional information from vegetation, meteorology, and soil, providing a more accurate solution for vegetation drought monitoring. For example, by combining indicators such as PCI (Precipitation Condition Index), VCI (Vegetation Condition Index), and SWDI (Soil Moisture Drought Index), different dimensions of drought can be reflected more comprehensively.
[0005] Currently, there is no unified definition for vegetation drought, leading to uncertainty in the classification of drought severity. For example, some researchers use SPEI (Standardized Precipitation Evapotranspiration Index), SPI (Standardized Precipitation Index), and soil moisture content as ground truth values for vegetation drought, employing machine learning methods to invert different levels of vegetation drought. These methods may introduce biases in the assessment of drought severity. Different crops have different water requirements during their growth cycles in vegetation drought, and the choice of time scale affects whether SPEI accurately reflects the actual drought situation. SPI cannot reflect the impact of rising temperatures or soil drying on vegetation drought and is slow to respond to rapid changes in vegetation drought, making it unsuitable for assessing short-term drought. Especially under hot drought scenarios with high temperatures, it tends to underestimate the severity of drought. For large-scale or spatially heterogeneous agricultural areas, soil moisture content data may not reflect the overall drought situation and cannot comprehensively reflect the complexity of vegetation drought on its own. Whether it's SPEI, SPI, or soil moisture content, no single indicator can fully describe the complex characteristics of vegetation drought because it is simultaneously influenced by multiple dimensions of factors, including meteorological, soil, and vegetation conditions. Furthermore, spatial distance methods typically rely on predefined distance weight matrices, which are difficult to accurately reflect the complex spatial heterogeneity of factors such as topography, climate, and soil in actual vegetation drought monitoring. 
This can lead to overestimating or underestimating drought levels in certain areas, particularly in regions with significant geographical features (such as mountain-plain boundaries) or large differences in meteorological conditions. Weight matrices are often based on human experience or simple distance metrics (such as Euclidean distance), lacking a systematic consideration of environmental variables, as illustrated in patent application CN202310792531.8. While this approach can capture the joint distribution among variables, it is primarily based on probabilistic statistical modeling and struggles to directly reflect the complex dynamic interactions between meteorological factors, soil moisture, and vegetation status. Moreover, other existing weighting methods are often based on empirical weights and lack scientific basis.
[0006] Analysis shows that existing technologies primarily rely on commonly used remote sensing drought monitoring indicators (such as canopy temperature, vegetation greenness, and soil moisture) and combine them with observation station data to calibrate the model. While this approach is suitable for specific regions, its applicability is poor in areas with scarce data or sparse station coverage, making universal global or cross-regional application difficult. Furthermore, station data cannot fully represent the vegetation drought process, leading to monitoring errors. In addition, models of the relationship between vegetation and meteorological and soil moisture conditions often fail to adequately consider the lagged response of vegetation to meteorological and soil drought, so monitoring results tend to lag behind the actual occurrence of vegetation drought, fail to accurately capture drought dynamics, and lose predictive power and foresight. Moreover, existing methods typically use soil information only for direct monitoring of soil moisture content, without fully integrating soil physical properties (such as field capacity and the soil wilting coefficient) into the construction of a drought index, which affects the accuracy of drought severity monitoring.
Summary of the Invention
[0007] The purpose of this invention is to overcome the shortcomings of the prior art and provide a method for constructing a vegetation drought index that integrates soil information and multi-source remote sensing data. This method includes the following steps:
[0008] Construct an index dataset of multi-source remote sensing data for a given historical year. The index dataset includes a precipitation condition index dataset, a temperature condition index dataset, a vegetation condition index dataset, a water body index dataset, a temperature-vegetation-drought index dataset, and a soil moisture-drought index dataset.
[0009] Standardize the index dataset using Z-scores, and, based on the data distribution, identify pixels with Z-scores greater than a set threshold as outliers. Remove these outliers from the data to obtain a preprocessed dataset.
[0010] Perform principal component analysis on the preprocessed dataset and extract the first principal component as the comprehensive drought index. Fill the pixels that were outliers in the original image back into the comprehensive drought index result to obtain the regional comprehensive drought index.
[0011] Based on the regional comprehensive drought index, assess the drought risk of the target area.
[0012] Compared with existing technologies, the advantages of this invention are: it integrates multiple indices, providing more comprehensive information than a single remote sensing index; compared with traditional methods that rely solely on vegetation information, this invention introduces soil physical information, enhancing the response capability to vegetation drought; it considers time lag characteristics, improving the accuracy of the index in capturing drought; and it uses PCA to extract principal components, avoiding the subjectivity of empirical weights and improving the reliability of the results. In summary, this invention is applicable to various agricultural scenarios and has stronger adaptability to different ecological environments.
[0013] Other features and advantages of the invention will become clear from the following detailed description of exemplary embodiments of the invention with reference to the accompanying drawings.
Attached Figure Description
[0014] The accompanying drawings, which are incorporated in and form part of this specification, illustrate embodiments of the invention and, together with their description, serve to explain the principles of the invention.
[0015] Figure 1 is a flowchart of a method for constructing a vegetation drought index based on integrated soil information and multi-source remote sensing data according to an embodiment of the present invention;
[0016] Figure 2 is a schematic diagram of the process of constructing a vegetation drought index by integrating soil information and multi-source remote sensing data according to an embodiment of the present invention.
[0017] Figure 3 is a location map of the Kazakhstan research area according to an embodiment of the present invention;
[0018] Figure 4 is a schematic diagram of the comprehensive drought index of Kazakhstan from May to September 2024 according to an embodiment of the present invention;
[0019] Figure 5 is a schematic diagram of the distribution of average vegetation drought level and soil moisture monitoring stations in May 2024 according to an embodiment of the present invention.
[0020] Figure 6 is a scatter plot and correlation diagram of the average vegetation drought level and soil moisture in May 2024 according to an embodiment of the present invention.
[0021] Figure 7 is a schematic diagram of the distribution of average vegetation drought level and soil moisture monitoring stations in August 2024 according to an embodiment of the present invention;
[0022] Figure 8 is a scatter plot and correlation diagram of the average vegetation drought level in August 2024 and the ground station SPI-1 according to an embodiment of the present invention.
[0023] Figure 9 is a location map of the research area in Tajikistan according to an embodiment of the present invention;
[0024] Figure 10 is a schematic diagram of the comprehensive drought index of Tajikistan from May to September 2023 according to an embodiment of the present invention;
[0025] Figure 11 is a scatter plot and correlation diagram of the average drought level from May to September 2024 with SPEI-1 according to an embodiment of the present invention.
Detailed Implementation
[0026] Various exemplary embodiments of the present invention will now be described in detail with reference to the accompanying drawings. It should be noted that, unless otherwise specifically stated, the relative arrangement, numerical expressions, and values of the components and steps set forth in these embodiments do not limit the scope of the invention.
[0027] The following description of at least one exemplary embodiment is merely illustrative and is in no way intended to limit the invention or its application or use.
[0028] Techniques, methods, and equipment known to those skilled in the art may not be discussed in detail, but where appropriate, such techniques, methods, and equipment should be considered part of the specification.
[0029] In all the examples shown and discussed herein, any specific values should be interpreted as merely exemplary and not as limitations. Therefore, other examples of exemplary embodiments may have different values.
[0030] It should be noted that similar labels and letters in the following figures indicate similar items; therefore, once an item is defined in one figure, it does not need to be discussed further in subsequent figures.
[0031] As shown in Figures 1 and 2, the method for constructing the integrated vegetation drought index based on comprehensive soil information and multi-source remote sensing data includes the following steps.
[0032] Step S1: Construct a precipitation condition index dataset.
[0033] For example, NASA / GPM_L3 data is acquired, synthesized every 10 days, and the spatial resolution of the data is forcibly resampled to 1km to form a Precipitation Condition Index (PCI) dataset for historical years, which is generated every 10 days.
[0034] Specifically, the cumulative daily rainfall is first calculated from the NASA / GPM_L3 data to obtain the rainfall value. The maximum and minimum rainfall values over a certain period are then obtained from the historical dataset. Finally, the PCI value is calculated pixel by pixel using the following formula:
$$\mathrm{PCI} = \frac{P_i - P_{\min}}{P_{\max} - P_{\min}} \tag{1}$$
[0035] where $P_i$ is the rainfall of pixel i, $P_{\min}$ is the minimum value of the historical dataset for pixel i, and $P_{\max}$ is the maximum value of the historical dataset for pixel i.
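As an illustration of formula (1), the following is a minimal NumPy sketch rather than the applicant's implementation. The function name `condition_index`, the toy rainfall values, and the choice to leave pixels with no historical range as NaN are assumptions made for this example.

```python
import numpy as np

def condition_index(current, hist_min, hist_max):
    """Min-max scale `current` against per-pixel historical extremes."""
    current = np.asarray(current, dtype=float)
    hist_min = np.asarray(hist_min, dtype=float)
    hist_max = np.asarray(hist_max, dtype=float)
    span = hist_max - hist_min
    out = np.full(current.shape, np.nan)
    # Assumption: pixels whose history has no range (span == 0) stay NaN
    # instead of triggering a division by zero.
    np.divide(current - hist_min, span, out=out, where=span > 0)
    return out

# Hypothetical 10-day rainfall totals (mm) for a 2x2 tile over three years.
history = np.array([
    [[10.0, 0.0], [5.0, 20.0]],
    [[30.0, 0.0], [15.0, 60.0]],
    [[20.0, 0.0], [25.0, 40.0]],
])
p_min, p_max = history.min(axis=0), history.max(axis=0)
pci = condition_index(np.array([[20.0, 0.0], [25.0, 30.0]]), p_min, p_max)
print(pci)
```

A PCI near 0 means the current period is close to the driest period on record for that pixel, and a value near 1 means it is close to the wettest.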
[0036] Step S2: Construct datasets for temperature condition index, vegetation condition index, water body index, and temperature-vegetation-drought index.
[0037] For example, MODIS data is acquired and synthesized every 10 days to form a dataset of TCI (Temperature Condition Index), VCI (Vegetation Condition Index), NDWI (Water Index), and TVDI (Temperature-Vegetation-Drought Index) for historical years, and the spatial resolution of the data is forcibly resampled to 1km.
[0038] Specifically, step S2 includes the following sub-steps:
[0039] Step 2.1: Obtain MOD13Q1, MYD13Q1, MOD11A2, and MYD11A2 from the MODIS data; after merging identical datasets, take the average value for each 10-day period, and obtain the maximum and minimum values of NDVI and LST within a certain period from the historical datasets. Calculate the Vegetation Condition Index (VCI) and Temperature Condition Index (TCI) according to formulas (2) and (3), respectively:
$$\mathrm{VCI} = \frac{\mathrm{NDVI}_i - \mathrm{NDVI}_{\min}}{\mathrm{NDVI}_{\max} - \mathrm{NDVI}_{\min}} \tag{2}$$
$$\mathrm{TCI} = \frac{\mathrm{LST}_{\max} - \mathrm{LST}_i}{\mathrm{LST}_{\max} - \mathrm{LST}_{\min}} \tag{3}$$
[0040] where $\mathrm{NDVI}_i$ is the NDVI value of pixel i, $\mathrm{NDVI}_{\min}$ is the minimum value in the NDVI historical dataset, and $\mathrm{NDVI}_{\max}$ is the maximum value in the NDVI historical dataset; $\mathrm{LST}_i$ is the LST value of pixel i, $\mathrm{LST}_{\min}$ is the minimum value in the LST historical dataset, and $\mathrm{LST}_{\max}$ is the maximum value in the LST historical dataset. It should be understood that, to ensure that TCI indicates drought in the same direction as VCI and the other indices, the numerator is written as $(\mathrm{LST}_{\max} - \mathrm{LST}_i)$.
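The direction convention described above can be checked with a small sketch. The function names and the sample NDVI and LST values (in kelvin) are hypothetical; only the two ratios themselves come from formulas (2) and (3).

```python
import numpy as np

def vci(ndvi, ndvi_min, ndvi_max):
    """Vegetation Condition Index: low values indicate stressed vegetation."""
    return (ndvi - ndvi_min) / (ndvi_max - ndvi_min)

def tci(lst, lst_min, lst_max):
    """Temperature Condition Index with the reversed numerator, so that
    hotter-than-usual pixels score low, in the same direction as VCI."""
    return (lst_max - lst) / (lst_max - lst_min)

# Hypothetical pixels ordered from dry and hot to green and cool.
ndvi_now = np.array([0.2, 0.5, 0.8])
lst_now = np.array([318.0, 305.0, 292.0])

vci_now = vci(ndvi_now, 0.2, 0.8)
tci_now = tci(lst_now, 290.0, 320.0)
print(vci_now, tci_now)
```

Both arrays increase from the first pixel to the last, which is the consistency the reversed numerator is meant to provide.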
[0041] Step 2.2: Obtain MOD13Q1, MYD13Q1, MOD11A2, and MYD11A2 from the MODIS data; after merging identical datasets, take the average value for each 10-day period, obtain the spatial distribution of NDVI-LST from the historical datasets, fit $\mathrm{LST}_{\max}$ and $\mathrm{LST}_{\min}$, and calculate the Temperature-Vegetation-Drought Index (TVDI) according to formula (4):
$$\mathrm{TVDI} = \frac{\mathrm{LST}_{\max} - \mathrm{LST}_i}{\mathrm{LST}_{\max} - \mathrm{LST}_{\min}} \tag{4}$$
[0042] where $\mathrm{LST}_i$ is the current land surface temperature (LST); $\mathrm{LST}_{\max} = a \cdot \mathrm{NDVI}_{\max} + b$ and $\mathrm{LST}_{\min} = d \cdot \mathrm{NDVI}_{\min} + e$, where a, b and d, e are the slopes and intercepts of the straight lines fitted to the NDVI maxima ($\mathrm{NDVI}_{\max}$) and NDVI minima ($\mathrm{NDVI}_{\min}$) within different LST value ranges. $\mathrm{LST}_{\min}$ is the minimum temperature at the wet edge, and $\mathrm{LST}_{\max}$ is the maximum temperature at the dry edge. $\mathrm{NDVI}_{\max}$ and $\mathrm{NDVI}_{\min}$ are the maximum and minimum NDVI values for different LST thresholds.
[0043] It should be noted that, to ensure that TVDI indicates drought in the same direction as VCI and the other indices, the numerator is written as $(\mathrm{LST}_{\max} - \mathrm{LST}_i)$.
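The edge fitting can be sketched as follows. This is an illustrative simplification, not the procedure of the invention: it uses the conventional approach of binning pixels by NDVI and fitting straight lines to the per-bin LST maxima (dry edge) and minima (wet edge), whereas the text describes the fit in terms of NDVI extremes within LST ranges. The function names, the number of bins, and the synthetic scene are all assumptions. The numerator is reversed as in formula (4).

```python
import numpy as np

def fit_edges(ndvi, lst, n_bins=10):
    """Fit dry (max LST) and wet (min LST) edges as straight lines in NDVI."""
    bins = np.linspace(ndvi.min(), ndvi.max(), n_bins + 1)
    idx = np.clip(np.digitize(ndvi, bins) - 1, 0, n_bins - 1)
    centers, hi, lo = [], [], []
    for b in range(n_bins):
        sel = idx == b
        if sel.any():
            centers.append(0.5 * (bins[b] + bins[b + 1]))
            hi.append(lst[sel].max())
            lo.append(lst[sel].min())
    dry = np.polyfit(centers, hi, 1)  # (slope a, intercept b)
    wet = np.polyfit(centers, lo, 1)  # (slope d, intercept e)
    return dry, wet

def tvdi_reversed(ndvi, lst, dry, wet):
    """TVDI with the reversed numerator (LST_max - LST_i)."""
    lst_max = dry[0] * ndvi + dry[1]
    lst_min = wet[0] * ndvi + wet[1]
    return (lst_max - lst) / (lst_max - lst_min)

# Synthetic scene: flat wet edge at 290 K, dry edge falling with NDVI.
rng = np.random.default_rng(0)
ndvi = rng.uniform(0.1, 0.9, 2000)
dry_true = 330.0 - 30.0 * ndvi
lst = 290.0 + rng.uniform(0.0, 1.0, 2000) * (dry_true - 290.0)

dry_fit, wet_fit = fit_edges(ndvi, lst)
tvdi = tvdi_reversed(ndvi, lst, dry_fit, wet_fit)
```

With this orientation, pixels close to the dry edge take values near 0 and pixels close to the wet edge take values near 1.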
[0044] Step 2.3: Obtain MOD09GA and MYD09GA from the MODIS data, filter by cloud cover and remove clouds, and then merge the datasets. Take the average value for each 10-day period and calculate the Water Body Index (NDWI) according to formula (5):
$$\mathrm{NDWI} = \frac{G - \mathrm{NIR}}{G + \mathrm{NIR}} \tag{5}$$
[0045] Where G is the reflectance in the green light band, and NIR is the reflectance in the near-infrared band.
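A short sketch of the water index follows, using the standard normalized-difference form (G - NIR) / (G + NIR). The function name and the reflectance values are hypothetical, and leaving pixels with zero total reflectance as NaN is an assumption of this example.

```python
import numpy as np

def ndwi(green, nir):
    """Normalized difference of green and near-infrared reflectance."""
    green = np.asarray(green, dtype=float)
    nir = np.asarray(nir, dtype=float)
    total = green + nir
    out = np.full(total.shape, np.nan)
    # Assumption: pixels with zero total reflectance are left as NaN.
    np.divide(green - nir, total, out=out, where=total != 0)
    return out

# Hypothetical reflectances: an open-water pixel and a dense-vegetation pixel.
values = ndwi(np.array([0.10, 0.08]), np.array([0.02, 0.40]))
print(values)
```

Open water reflects more green than near-infrared light and so yields a positive value, while dense vegetation yields a negative one.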
[0046] Step S3: Obtain soil sand content, clay content, organic matter content and soil bulk density datasets, and calculate the soil field water holding capacity and wilting point of the soil layer to construct a soil moisture dataset.
[0047] For example, obtain the Soilgrids soil sand content, clay content, organic matter content and soil bulk density datasets for 0-100cm. According to formulas (6) and (7), calculate the field water holding capacity and wilting point of the 0-100cm soil layer, and resample the data to a spatial resolution of 1km. Obtain the SMAP (Soil Moisture Active Passive) soil moisture (0-100cm) dataset, average the data every 10 days, and resample it to a spatial resolution of 1km to obtain the soil moisture dataset of historical years. Calculate the soil moisture drought index (SWDI) according to formula (8).
[0048] where $U_{33t}$ is the soil moisture content at a suction of 33 kPa, corresponding to field capacity (FC); $U_{1500t}$ is the soil moisture content at a suction of 1500 kPa, corresponding to the permanent wilting point (WP); Sand is the soil sand content (%); Clay is the soil clay content (%); OM is the soil organic matter content (%); $SM_i$ is the soil moisture value of pixel i; FC is the soil field capacity; and WP is the wilting point.
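To show how $SM_i$, FC, and WP combine, the sketch below assumes the commonly used definition SWDI = (SM - FC) / (FC - WP) × 10. Formula (8) itself is not reproduced in this text, so this form, the scale factor of 10, the function name, and the sample values are all assumptions rather than the invention's exact expression.

```python
import numpy as np

def swdi(sm, fc, wp, scale=10.0):
    """Soil water deficit relative to plant-available water (FC - WP)."""
    sm, fc, wp = (np.asarray(x, dtype=float) for x in (sm, fc, wp))
    paw = fc - wp
    out = np.full(sm.shape, np.nan)
    # Assumption: pixels where FC <= WP are treated as invalid (NaN).
    np.divide((sm - fc) * scale, paw, out=out, where=paw > 0)
    return out

# Hypothetical volumetric water contents (m3/m3) for three pixels.
sm = np.array([0.30, 0.20, 0.10])
fc = np.array([0.30, 0.30, 0.30])
wp = np.array([0.10, 0.10, 0.10])
print(swdi(sm, fc, wp))
```

Under this definition a pixel at field capacity scores 0 and a pixel at the wilting point scores -10, so more negative values indicate a larger soil water deficit.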
[0049] In one embodiment, step S3 includes the following sub-steps:
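[0050] Step 3.1: Obtain soil sand content, clay content, organic matter content, and soil bulk density datasets for soil layers 0cm, 10cm, 30cm, 60cm, and 100cm. Correct the soil data for the 0-100cm layer using the following method: for each soil layer, multiply the soil bulk density SBK by 0.01, by the corresponding layer depth (cm), and by the sand, clay, or organic matter content of that layer; then sum the corrected data and divide by the soil bulk density for the 0-100cm layer (formula 9):
$$B_1 = SBK_1 \times 0.05;\quad B_2 = SBK_2 \times 0.1;\quad B_3 = SBK_3 \times 0.3;\quad B_4 = SBK_4 \times 0.6;\quad B_5 = SBK_5 \times 1.0$$
$$B_{0\text{-}100} = B_1 + B_2 + B_3 + B_4 + B_5$$
$$S_1 = SBK_1 \times Sand_1 \times 0.01 \times 0.05;\quad S_2 = SBK_2 \times Sand_2 \times 0.01 \times 0.1;\quad S_3 = SBK_3 \times Sand_3 \times 0.01 \times 0.3$$
$$S_4 = SBK_4 \times Sand_4 \times 0.01 \times 0.6;\quad S_5 = SBK_5 \times Sand_5 \times 0.01 \times 1.0$$
$$S_{0\text{-}100} = \frac{S_1 + S_2 + S_3 + S_4 + S_5}{B_{0\text{-}100}} \tag{9}$$
[0051] Similarly for organic matter (in the expressions below, the symbol on the left denotes the depth-corrected value and the same symbol on the right denotes the raw layer value):
$$OM_1 = SBK_1 \times OM_1 \times 0.01 \times 0.05 \times 1.724;\quad OM_2 = SBK_2 \times OM_2 \times 0.01 \times 0.1 \times 1.724;\quad OM_3 = SBK_3 \times OM_3 \times 0.01 \times 0.3 \times 1.724$$
$$OM_4 = SBK_4 \times OM_4 \times 0.01 \times 0.6 \times 1.724;\quad OM_5 = SBK_5 \times OM_5 \times 0.01 \times 1.0 \times 1.724$$
$$OM_{0\text{-}100} = \frac{OM_1 + OM_2 + OM_3 + OM_4 + OM_5}{B_{0\text{-}100}}$$
[0052] Similarly, $C_{0\text{-}100}$ can be calculated:
$$C_1 = SBK_1 \times C_1 \times 0.01 \times 0.05;\quad C_2 = SBK_2 \times C_2 \times 0.01 \times 0.1;\quad C_3 = SBK_3 \times C_3 \times 0.01 \times 0.3$$
$$C_4 = SBK_4 \times C_4 \times 0.01 \times 0.6;\quad C_5 = SBK_5 \times C_5 \times 0.01 \times 1.0$$
$$C_{0\text{-}100} = \frac{C_1 + C_2 + C_3 + C_4 + C_5}{B_{0\text{-}100}} \tag{10}$$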
[0053] Where:
[0054] SBK1, Sand1, OM1, and C1 represent the soil bulk density, soil sand content, soil organic matter content, and soil clay content at a depth of 0-5 cm.
[0055] B1, S1, OM1, and C1 represent the soil bulk density, soil sand content, soil organic matter content, and soil clay content at a depth of 0-5 cm after depth correction.
SBK2, Sand2, OM2, and C2 represent the soil bulk density, soil sand content, soil organic matter content, and soil clay content at a depth of 5-10 cm.
[0056] B2, S2, OM2, and C2 represent the soil bulk density, soil sand content, soil organic matter content, and soil clay content at a depth of 5-10 cm after depth correction.
[0057] SBK3, Sand3, OM3, and C3 represent the soil bulk density, soil sand content, soil organic matter content, and soil clay content at a depth of 10-30 cm.
[0058] B3, S3, OM3, and C3 represent the soil bulk density, soil sand content, soil organic matter content, and soil clay content at a depth of 10-30 cm after depth correction.
[0059] SBK4, Sand4, OM4, and C4 represent the soil bulk density, soil sand content, soil organic matter content, and soil clay content at a depth of 30-60 cm.
[0060] B4, S4, OM4, and C4 represent the soil bulk density, soil sand content, soil organic matter content, and soil clay content at a depth of 30-60 cm after depth correction.
[0061] SBK5, Sand5, OM5, and C5 represent the soil bulk density, soil sand content, soil organic matter content, and soil clay content at a depth of 60-100 cm.
[0062] B5, S5, OM5, and C5 represent the soil bulk density, soil sand content, soil organic matter content, and soil clay content at a depth of 60-100 cm after depth correction.
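The depth correction of Step 3.1 amounts to a bulk-density-weighted average over the five layers. The sketch below is one way to express it, assuming the layer factors 0.05, 0.1, 0.3, 0.6, and 1.0 listed in formula (9). The function name, the `extra_factor` argument (used for the 1.724 multiplier in the organic matter expressions), and the sample layer values are hypothetical.

```python
import numpy as np

# Layer factors as listed in formula (9), ordered from the top layer down.
DEPTH_FACTORS = np.array([0.05, 0.1, 0.3, 0.6, 1.0])

def profile_content(bulk_density, content_pct, extra_factor=1.0):
    """Bulk-density- and depth-weighted 0-100 cm content as a fraction.

    Both inputs have the five layers along axis 0; any trailing axes
    (for example rows and columns of a raster) are preserved.
    """
    sbk = np.asarray(bulk_density, dtype=float)
    pct = np.asarray(content_pct, dtype=float)
    w = DEPTH_FACTORS.reshape((-1,) + (1,) * (sbk.ndim - 1))
    b_total = (sbk * w).sum(axis=0)                               # B_0-100
    corrected = (sbk * pct * 0.01 * w * extra_factor).sum(axis=0)
    return corrected / b_total

# Hypothetical profile: bulk density rises with depth, contents are uniform.
sbk_layers = np.array([1.30, 1.35, 1.40, 1.45, 1.50])
sand_0_100 = profile_content(sbk_layers, np.full(5, 40.0))
om_0_100 = profile_content(sbk_layers, np.full(5, 2.0), extra_factor=1.724)
print(sand_0_100, om_0_100)
```

With uniform contents the weighting cancels out, so 40 % sand returns a fraction of 0.40, which is a quick sanity check on the weights.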
[0063] Step 3.2: Correct the soil sand content, clay content, organic matter content and soil bulk density data of 0-100cm soil according to soil bulk density data, and calculate the field water holding capacity (FC) and wilting point (WP) of 0-100cm soil according to formula (6) and formula (7).
[0064] Step 3.3: Obtain the SMAP (Soil Moisture Active Passive) soil moisture (0-100cm) dataset, and average the data every 10 days to obtain the soil moisture dataset for historical years. Calculate the soil moisture drought index (SWDI) according to formula (8).
[0065] Step S4: Perform Z-score standardization on the multi-source remote sensing index dataset of historical years and, according to the data distribution, mark pixels with Z-scores greater than a set threshold as outliers, thereby removing outliers from the data.
[0066] Step S4 is the data preprocessing process. For example, considering the lag in vegetation response to drought, VCI and NDWI are lagged by 10 days, i.e., 1 period. Z-score standardization is then performed on the remote sensing index dataset of historical years. According to the data distribution, pixels with Z-scores greater than 3 are marked as outliers (NAN) and removed from the data.
[0067] In one embodiment, step S4 includes the following sub-steps:
[0068] Step 4.1, calculate the Z-score (standard score), which measures the degree of deviation of a data point from the mean of its distribution. The calculation formula is as follows: Z=(X-μ) / σ (11)
[0069] Where Z is the standardized Z-score, X is the original data point value, μ is the mean of the data, and σ is the standard deviation of the data.
[0070] Step 4.2: Set the pixels with Z-scores greater than 3 as NANs and mark the location information of all NAN values.
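Steps 4.1 and 4.2 can be sketched as below. The text states only that Z-scores greater than 3 are flagged; this example assumes a two-sided test on the absolute Z-score and assumes that a pixel flagged in any index is masked in all of them. The function name and the toy data are hypothetical.

```python
import numpy as np

def zscore_and_mask(stack, threshold=3.0):
    """Standardize each index layer and flag outlier pixels.

    stack: array of shape (n_indices, rows, cols).
    Returns the Z-scores (NaN at invalid pixels) and a boolean mask that
    records the location of every invalid pixel.
    """
    stack = np.asarray(stack, dtype=float)
    mean = np.nanmean(stack, axis=(1, 2), keepdims=True)
    std = np.nanstd(stack, axis=(1, 2), keepdims=True)
    z = (stack - mean) / std
    flagged = np.abs(z) > threshold  # assumption: two-sided threshold
    invalid = (flagged | np.isnan(z)).any(axis=0)
    z[:, invalid] = np.nan
    return z, invalid

# Toy stack of two indices with one corrupted pixel in the first index.
layer = np.linspace(0.0, 1.0, 100).reshape(10, 10)
stack = np.stack([layer, layer]).copy()
stack[0, 3, 4] = 50.0

z, invalid = zscore_and_mask(stack)
print(int(invalid.sum()), "pixel(s) flagged")
```

Keeping the `invalid` mask alongside the Z-scores is what later allows the flagged locations to be restored in the index map.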
[0071] Step S5: Principal component analysis is used to reduce the dimensionality of the data, the first principal component is extracted as the comprehensive drought index, and the outliers in the original image are filled into the comprehensive drought index result to obtain the vegetation comprehensive drought index.
[0072] In step S5, dimensionality reduction is performed on the multi-source data using principal component analysis (PCA). The first principal component is extracted as the comprehensive drought index (PCA_CDI), and the pixels with NAN values from the original image are filled into the comprehensive drought index result to obtain the regional comprehensive drought index.
[0073] In one embodiment, the principal component analysis process includes the following steps:
[0074] Step 5.1, Standardize Data
[0075] If different features of the data have different scales or units, the data is standardized so that the mean of each feature is 0 and the standard deviation is 1.
[0076] Step 5.2, calculate the covariance matrix.
[0077] Calculate the covariance matrix for the standardized data. The covariance matrix reflects the correlation between different features. The covariance matrix is a symmetric matrix whose elements represent the covariance between different features. Assuming we have an n×d data matrix X, where n is the number of samples and d is the number of features, the formula for calculating the covariance matrix C is as follows:
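For reference, one standard form of this matrix is given below. It assumes that the columns of X have already been centered (which the standardization in Step 5.1 ensures) and uses the sample normalization 1/(n-1). The normalization is an assumption of this note; the 1/n convention changes only the scale of the eigenvalues, not the eigenvectors.

```latex
C = \frac{1}{n-1} X^{\top} X, \qquad
C_{jk} = \frac{1}{n-1} \sum_{i=1}^{n} X_{ij} X_{ik}, \quad j, k = 1, \ldots, d
```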
[0078] Step 5.3, calculate eigenvalues and eigenvectors.
[0079] Eigenvalue decomposition is performed on the covariance matrix to obtain eigenvalues and corresponding eigenvectors. The eigenvalues represent the variance of the data along the eigenvector directions.
[0080] Suppose C is a symmetric matrix whose eigenvalues are $\lambda_1, \lambda_2, \ldots, \lambda_d$ and whose corresponding eigenvectors are $v_1, v_2, \ldots, v_d$. The eigenvalue decomposition satisfies:
$$C v_i = \lambda_i v_i \tag{13}$$
[0081] Step 5.4: Select principal components and project the data
[0082] Based on the magnitude of the eigenvalues, the eigenvectors corresponding to the k largest eigenvalues are selected as principal components, where k is the dimension after dimensionality reduction. The original data is then projected onto the selected principal components to obtain the dimensionality-reduced dataset.
[0083] Take the eigenvectors $v_1, v_2, \ldots, v_k$ of the first k principal components as the columns of $V_k$, and project the original data X onto them to obtain the dimensionality-reduced data matrix $X_{new}$, which, with the first principal component retained, is the comprehensive drought index. The projection is calculated as follows:
$$X_{new} = X V_k \tag{14}$$
[0084] Principal component analysis (PCA) can directly reduce the dimensionality of multi-source data (such as meteorological data, vegetation indices, and soil moisture), integrating the main information of multiple variables into a few principal components. The first principal component is used as the comprehensive drought index, which has a clear physical meaning, facilitating the analysis of drought driving factors. This method avoids complex marginal distribution fitting and joint distribution modeling, significantly reducing computational complexity. Furthermore, it eliminates the need for complex assumptions about input variables (such as distribution types and parameterized models), making it more flexible and applicable to processing heterogeneous multi-source data. In addition, matrix factorization (such as singular value decomposition) is a core step in PCA, offering high computational efficiency and suitability for large-scale data processing. Moreover, during dimensionality reduction, PCA automatically adjusts weights based on the statistical correlation of the data, eliminating the need for manually specifying weight matrices. This allows it to effectively capture the inherent spatiotemporal variations of the data, making it suitable for processing data with significant regional heterogeneity.
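Steps 5.1 to 5.4, together with restoring the masked pixels, can be sketched with NumPy alone. This is a minimal illustration under stated assumptions, not the invention's code: the function name `pca_cdi` is hypothetical, the sign of the first eigenvector is fixed so that its loadings sum to a positive value (PCA leaves the sign arbitrary), and the synthetic indices share one common signal plus noise.

```python
import numpy as np

def pca_cdi(stack, invalid):
    """First-principal-component index from an (n_indices, rows, cols) stack.

    `invalid` is a boolean (rows, cols) mask of pixels excluded from PCA;
    they are written back as NaN in the output map.
    """
    n_idx, rows, cols = stack.shape
    valid = ~invalid
    X = stack[:, valid].T                          # (n_pixels, n_indices)
    X = (X - X.mean(axis=0)) / X.std(axis=0)       # Step 5.1: standardize
    C = np.cov(X, rowvar=False)                    # Step 5.2: covariance
    eigvals, eigvecs = np.linalg.eigh(C)           # Step 5.3: ascending order
    v1 = eigvecs[:, -1]                            # largest eigenvalue
    if v1.sum() < 0:                               # assumption: sign convention
        v1 = -v1
    cdi = np.full((rows, cols), np.nan)
    cdi[valid] = X @ v1                            # Step 5.4: project, k = 1
    explained = eigvals[-1] / eigvals.sum()
    return cdi, v1, explained

# Synthetic example: four indices sharing one underlying moisture signal.
rng = np.random.default_rng(0)
signal = rng.normal(size=(30, 30))
stack = np.stack([signal + 0.3 * rng.normal(size=signal.shape) for _ in range(4)])
invalid = np.zeros((30, 30), dtype=bool)
invalid[0, 0] = True

cdi, loadings, explained = pca_cdi(stack, invalid)
```

The loadings play the role of data-driven weights: they are derived from the covariance structure of the indices rather than specified by hand.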
[0085] Step S6: Assess the drought level of the target area based on the comprehensive drought index.
[0086] Mapping is performed based on the comprehensive drought index (PCA_CDI) to obtain a regional vegetation drought risk map. For example, the PCA_CDI values of a specified area are classified, with drought levels divided proportionally, to obtain the regional vegetation drought risk map.
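One reading of "divided proportionally" is equal-share quantile classes, sketched below. The number of levels, the function name, and the use of -1 for no-data pixels are assumptions. Which end of the scale is the driest depends on the sign orientation of the index, which this sketch does not decide.

```python
import numpy as np

def classify_by_quantile(cdi, n_levels=5):
    """Assign each valid pixel to one of `n_levels` equal-share classes.

    Level 0 holds the lowest index values; no-data pixels are marked -1.
    """
    valid = ~np.isnan(cdi)
    inner = np.linspace(0.0, 1.0, n_levels + 1)[1:-1]
    edges = np.quantile(cdi[valid], inner)
    levels = np.full(cdi.shape, -1, dtype=int)
    levels[valid] = np.digitize(cdi[valid], edges)
    return levels

# Toy index map with one no-data pixel.
cdi_map = np.arange(100, dtype=float).reshape(10, 10)
cdi_map[0, 0] = np.nan
levels = classify_by_quantile(cdi_map)
print(np.unique(levels, return_counts=True))
```

Each class covers roughly the same share of valid pixels, so the resulting map shows relative rather than absolute severity within the mapped area.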
[0087] To verify the application performance of this invention in real-world scenarios, the Central Asian region was selected as the research target area. Taking Kazakhstan as an example, as shown in Figure 3 (location map of the Kazakhstan study area), its climate is predominantly arid and semi-arid, with most of its land consisting of grasslands and deserts. Therefore, drought has a significant impact on agricultural production and the ecological environment. In recent years, with the intensification of global climate change, the frequency and intensity of droughts in Kazakhstan have been increasing, especially during the crop growing season, when water scarcity becomes even more severe.
[0088] To provide effective agricultural management decision support, this invention was used to calculate the drought index for major agricultural regions in Kazakhstan. The calculation process includes:
[0089] Step 11: Calculate the 2024 Precipitation Condition Index (PCI). Obtain NASA / GPM_L3 data, synthesize a period every 10 days, and force the spatial resolution of the data to be resampled to 1km to form a PCI index dataset with a period of 10 days for historical years, and calculate the 2024 Precipitation Condition Index.
[0090] Step 12: Calculate the Temperature Condition Index (TCI), Vegetation Condition Index (VCI), Water Body Index (NDWI), and Temperature Vegetation Drought Index (TVDI). Obtain MODIS data and synthesize a period every 10 days to form a historical TCI, VCI, NDWI, and TVDI dataset every 10 days. Force the spatial resolution of the data to be resampled to 1km. Calculate the 2024 Temperature Condition Index (TCI), Vegetation Condition Index (VCI), Water Body Index (NDWI), and Temperature Vegetation Drought Index (TVDI) data based on historical data.
[0091] Step 13: Obtain the soil sand content, clay content, organic matter content, and soil bulk density datasets for the 0-100 cm soil layer from SoilGrids. Calculate the field capacity and wilting point of the 0-100 cm soil layer according to formulas (6) to (8), and resample the data to a spatial resolution of 1 km. Obtain the SMAP (Soil Moisture Active Passive) dataset (0-100 cm), and average the data every 10 days to obtain the 2024 soil moisture dataset. Combine this with the soil sand content, clay content, organic matter content, and soil bulk density datasets to calculate the soil moisture drought index (SWDI).
[0092] Step 14: Data preprocessing. The remote sensing index datasets for the historical years are standardized using Z-scores. Based on the data distribution, pixels whose Z-score exceeds 3 are treated as outliers, and the outlier pixels and pixels with NaN values are removed from the data.
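The preprocessing of Step 14 might be sketched as follows. Several choices here are assumptions, because the text does not fix them: Z-scores are computed per index layer across all valid pixels, the outlier test is two-sided (|z| > 3), and a pixel is dropped if any of its index values is NaN or an outlier. The function name `zscore_and_mask` and the sample data are hypothetical.

```python
import numpy as np

def zscore_and_mask(stack, threshold=3.0):
    """Standardize each index layer and drop NaN or outlier pixels.

    stack: array (n_indices, rows, cols).
    Returns (samples, keep): samples is (n_kept_pixels, n_indices) of Z-scores,
    keep is a (rows, cols) boolean mask marking the retained pixels.
    """
    stack = np.asarray(stack, dtype=float)
    mean = np.nanmean(stack, axis=(1, 2), keepdims=True)
    std = np.nanstd(stack, axis=(1, 2), keepdims=True)
    z = (stack - mean) / std
    # Assumption: values beyond the threshold in either direction are outliers.
    z[np.abs(z) > threshold] = np.nan
    # Keep only pixels that are valid in every index layer.
    keep = ~np.isnan(z).any(axis=0)
    samples = z[:, keep].T
    return samples, keep

# Hypothetical stack: 2 indices on a 4 x 5 grid, one missing value and one extreme value.
stack = np.empty((2, 4, 5))
stack[0] = np.arange(20.0).reshape(4, 5)
stack[1] = 0.1 * np.arange(20.0).reshape(4, 5)
stack[0, 0, 0] = np.nan
stack[1, 3, 4] = 1000.0
samples, keep = zscore_and_mask(stack)
print(samples.shape, int(keep.sum()))
```

The `keep` mask records where the retained pixels came from, which is what allows the result of the later PCA step to be written back to the original grid.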
[0093] Step 15: Apply PCA to the multi-source data for dimensionality reduction, extract the first principal component as the comprehensive drought index (PCA_CDI), and restore the pixels that had NaN values in the original image to their positions in the comprehensive drought index result to obtain the regional comprehensive drought index.
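Step 15 can be illustrated with the sketch below, which follows the PCA procedure of claim 2 (covariance matrix, eigenvalue decomposition, projection onto the leading eigenvector) and then writes the scores back onto the grid, leaving NaN where pixels were removed. The function names, the synthetic six-index sample, and the mask are hypothetical. The sign of an eigenvector is arbitrary, so the orientation of the resulting index (whether high values mean wetter or drier) has to be fixed separately, for example by checking its correlation with one input index; the text does not describe how this is done.

```python
import numpy as np

def first_principal_component(samples):
    """samples: (n_pixels, n_indices). Returns (PC1 scores, explained variance ratio)."""
    centered = samples - samples.mean(axis=0)
    cov = np.cov(centered, rowvar=False)
    # eigh suits symmetric matrices and returns eigenvalues in ascending order.
    eigvals, eigvecs = np.linalg.eigh(cov)
    v1 = eigvecs[:, -1]
    return centered @ v1, eigvals[-1] / eigvals.sum()

def to_raster(scores, keep_mask):
    """Place scores at the retained pixels; removed pixels stay NaN."""
    out = np.full(keep_mask.shape, np.nan)
    out[keep_mask] = scores
    return out

# Synthetic example: 10 retained pixels on a 3 x 4 grid, 6 correlated indices.
rng = np.random.default_rng(0)
common = rng.normal(size=10)
samples = np.column_stack([common + 0.1 * rng.normal(size=10) for _ in range(6)])
keep_mask = np.ones((3, 4), dtype=bool)
keep_mask[0, 0] = keep_mask[2, 3] = False

scores, explained = first_principal_component(samples)
cdi = to_raster(scores, keep_mask)
print(cdi.shape, round(float(explained), 3))
```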
[0094] Step 16: Create a map based on the Composite Drought Index (PCA_CDI) to obtain a regional vegetation drought risk map.
[0095] Based on the above process, the comprehensive drought index for vegetation in Kazakhstan from May to September 2024 was calculated, as shown in Figure 4. The index was also validated using soil moisture data for May and standardized precipitation index (SPI) data for August provided by the Kazakh meteorological department. The results show that the comprehensive drought index proposed in this invention can accurately reflect the spatial distribution of drought.
[0096] Furthermore, during verification, the mean vegetation drought index for May was compared with soil moisture data, as shown in Figures 5 and 6. The correlation coefficient between the comprehensive drought index and the soil moisture content measured at the stations was 0.821 (a strong correlation), with an R² of 0.673, and both passed the significance test. This comparison indicates that the index effectively reflects changes in actual soil moisture content, demonstrating the effectiveness of the comprehensive drought index in the early identification of vegetation drought. Areas with high drought indices generally have lower soil moisture, indicating that PCA_CDI can reliably monitor the spatial distribution characteristics of drought.
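For a simple linear fit between two variables, R² equals the square of the Pearson correlation coefficient, which is consistent with the figures reported above (0.821² ≈ 0.674, matching 0.673 to within rounding of the coefficient). The sketch below shows how the two statistics can be computed; the function name `r_and_r2` is hypothetical and the values are made up for illustration only, not the station data used in the validation.

```python
import numpy as np

def r_and_r2(x, y):
    """Pearson r between x and y, and R^2 of the least-squares line y ~ x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    r = np.corrcoef(x, y)[0, 1]
    slope, intercept = np.polyfit(x, y, 1)
    residual = y - (slope * x + intercept)
    r2 = 1.0 - residual.var() / y.var()
    return r, r2

# Made-up index and soil moisture values at five hypothetical stations.
index_values = [0.2, 0.4, 0.5, 0.7, 0.9]
soil_moisture = [0.12, 0.20, 0.19, 0.30, 0.33]
r, r2 = r_and_r2(index_values, soil_moisture)
print(round(float(r), 3), round(float(r2), 3))
```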
[0097] Furthermore, Figures 7 and 8 show the correlation between the average Composite Drought Index (CDI) and the Standardized Precipitation Index (SPI) in August. Correlation analysis revealed a significant positive correlation between the CDI and SPI, exceeding 0.8, particularly in areas experiencing persistent drought, where their trends were consistent. This further validates the scientific validity and accuracy of the drought index in assessing vegetation drought, contributing to future improvements in drought monitoring and early warning systems.
[0098] Verification analysis shows that the Composite Drought Index (PCA_CDI) can provide timely drought warnings for the agricultural sector in Kazakhstan, helping farmers and policymakers formulate reasonable irrigation and farmland management measures, thereby reducing the impact of drought on agricultural production. This invention also provides valuable experience for the promotion and application of PCA_CDI in other parts of Central Asia.
[0099] Tajikistan, a Central Asian country (see Figure 9), faces a challenging environment due to its geographical location and climate. Annual average precipitation varies significantly across different regions. While mountainous areas receive abundant rainfall, the lowlands and the desert and semi-desert regions in the east experience very limited precipitation, frequently leading to water shortages. In recent years, global climate change has further exacerbated the instability of precipitation patterns in Tajikistan. The increased frequency of extreme weather events, including prolonged droughts, has further intensified the pressure on Tajikistan's water resources.
[0100] According to the present invention, the comprehensive drought index of Tajikistan from May to September 2023 was calculated and validated against a commonly used drought index, the Standardized Precipitation Evapotranspiration Index (SPEI).
[0101] In the validation, the mean drought index of Tajikistan from May to September was compared with SPEI-1 data, as shown in Figures 10 and 11. The correlation coefficient between the composite drought index and the SPEI data was 0.80 (a strong correlation), with an R² of 0.64, and both passed the significance test. This comparison indicates that the index effectively reflects meteorological drought, demonstrating the effectiveness of the composite drought index in the early identification of vegetation drought. The results show that areas with high drought indices have lower meteorological drought levels, indicating that the PCA_CDI can reliably monitor the spatial distribution characteristics of drought.
[0102] In summary, compared with the prior art, the present invention has the following advantages:
[0103] 1) This invention fully integrates multi-source remote sensing data and extracts key features through principal component analysis, avoiding the information loss problems that may result from simple resolution conversion. By extracting principal components, it avoids the oversimplification of variable interactions in empirical models, and more scientifically reflects the complexity and multidimensional characteristics of vegetation drought. This unsupervised method for constructing a comprehensive vegetation drought index based on principal component analysis ensures the scientific validity and simplicity of the index.
[0104] 2) This invention integrates multiple remote sensing indicators, namely the precipitation condition index, vegetation condition index, temperature condition index, water body index, temperature-vegetation drought index, and soil moisture drought index, to construct a comprehensive drought index with strong universality, which can be applied to regions with different climatic conditions and ecological environments.
[0105] 3) This invention performs lag discrimination between remote sensing vegetation data (such as VCI and NDWI) on the one hand and meteorological data (PCI, TCI) and soil moisture data (SWDI) on the other, to quantitatively assess the lagged response of vegetation to meteorological and soil drought. It then introduces these lag characteristics into the construction of the comprehensive drought index, thereby improving the index's ability to dynamically monitor vegetation drought.
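One way to quantify such a lag is a lagged cross-correlation: the vegetation series is shifted by 0, 1, 2, ... periods relative to the meteorological or soil series, and the shift with the highest correlation is taken as the response lag. The text does not state which procedure is used for lag discrimination, so this is only an assumed approach; the function name `best_lag`, the maximum lag, and the synthetic series are hypothetical.

```python
import numpy as np

def best_lag(driver, response, max_lag=6):
    """Return (lag, r) maximizing Pearson r between driver[t] and response[t + lag].

    Lags are counted in time steps of the series (e.g. 10-day periods).
    """
    driver = np.asarray(driver, dtype=float)
    response = np.asarray(response, dtype=float)
    best = (0, -np.inf)
    for lag in range(max_lag + 1):
        x = driver[: len(driver) - lag]
        y = response[lag:]
        r = np.corrcoef(x, y)[0, 1]
        if r > best[1]:
            best = (lag, r)
    return best

# Synthetic series in which the response repeats the driver two steps later.
rng = np.random.default_rng(1)
base = rng.normal(size=40)
driver = base[2:]      # stands in for e.g. a PCI series
response = base[:-2]   # stands in for e.g. a VCI series
lag, r = best_lag(driver, response)
print(lag, round(float(r), 3))
```

With a lag estimated this way, the vegetation layers could be shifted by the corresponding number of periods before standardization, in the manner described in claim 8.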
[0106] 4) This invention, based on remote sensing soil moisture monitoring, further incorporates soil physical properties such as field capacity and wilting point, enhancing the sensitivity of the comprehensive drought index to vegetation drought and its physical significance. Furthermore, it considers regional differences in soil properties, making the monitoring model more adaptable to different regions. By strengthening the utilization of soil physical information, it can accurately reflect the complexity of vegetation drought. By integrating multi-source data from meteorology, vegetation, and soil, it comprehensively describes the occurrence mechanism of vegetation drought, achieving integrated modeling of multi-dimensional factors.
[0107] 5) This invention can not only monitor the dynamic changes of vegetation drought in real time, but also make forward-looking predictions of drought occurrence trends through lagged response analysis, providing more scientific data support for vegetation drought resistance decision-making and improving the ability to predict drought occurrence.
[0108] This invention can be a system, method, and / or computer program product. A computer program product may include a computer-readable storage medium having computer-readable program instructions loaded thereon for causing a processor to implement various aspects of the invention.
[0109] Computer-readable storage media can be tangible devices capable of holding and storing instructions for use by an instruction execution device. Computer-readable storage media can be, for example, but not limited to, electrical storage devices, magnetic storage devices, optical storage devices, electromagnetic storage devices, semiconductor storage devices, or any suitable combination thereof. More specific examples (a non-exhaustive list) of computer-readable storage media include: portable computer disks, hard disks, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), static random access memory (SRAM), portable compact disc read-only memory (CD-ROM), digital versatile discs (DVD), memory sticks, floppy disks, mechanically encoded devices such as punch cards or raised structures in a groove having instructions stored thereon, and any suitable combination thereof. The computer-readable storage media used herein are not to be construed as transitory signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through waveguides or other transmission media (e.g., light pulses through fiber optic cables), or electrical signals transmitted through wires.
[0110] The computer-readable program instructions described herein can be downloaded from computer-readable storage media to various computing / processing devices, or downloaded via a network, such as the Internet, local area network, wide area network, and / or wireless network, to an external computer or external storage device. The network may include copper transmission cables, fiber optic transmission, wireless transmission, routers, firewalls, switches, gateway computers, and / or edge servers. A network adapter card or network interface in each computing / processing device receives the computer-readable program instructions from the network and forwards them to the computer-readable storage media in the respective computing / processing device.
[0111] The computer program instructions used to perform the operations of this invention may be assembly instructions, instruction set architecture (ISA) instructions, machine instructions, machine-dependent instructions, microcode, firmware instructions, state setting data, or source code or object code written in any combination of one or more programming languages, including object-oriented programming languages such as Smalltalk, C++, Python, etc., and conventional procedural programming languages such as "C" or similar languages. The computer-readable program instructions may be executed entirely on the user's computer, partially on the user's computer, as a standalone software package, partially on the user's computer and partially on a remote computer, or entirely on a remote computer or server. In cases involving a remote computer, the remote computer may be connected to the user's computer via any type of network—including a local area network (LAN) or a wide area network (WAN)—or may be connected to an external computer (e.g., via the Internet using an Internet service provider). In some embodiments, electronic circuitry, such as programmable logic circuitry, field-programmable gate arrays (FPGAs), or programmable logic arrays (PLAs), is personalized by utilizing state information from the computer-readable program instructions. This electronic circuitry can execute the computer-readable program instructions to implement various aspects of the invention.
[0112] Various aspects of the present invention are described herein with reference to flowchart illustrations and / or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It should be understood that each block of the flowchart illustrations and / or block diagrams, and combinations of blocks in the flowchart illustrations and / or block diagrams, can be implemented by computer-readable program instructions.
[0113] These computer-readable program instructions can be provided to a processor of a general-purpose computer, a special-purpose computer, or other programmable data processing apparatus to produce a machine such that, when executed by the processor of the computer or other programmable data processing apparatus, they create means for implementing the functions / actions specified in one or more blocks of the flowchart and / or block diagram. These computer-readable program instructions can also be stored in a computer-readable storage medium that causes a computer, programmable data processing apparatus, and / or other device to operate in a particular manner; thus, the computer-readable medium storing the instructions comprises an article of manufacture that includes instructions for implementing aspects of the functions / actions specified in one or more blocks of the flowchart and / or block diagram.
[0114] Computer-readable program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other device to cause a series of operational steps to be performed on the computer, other programmable data processing apparatus, or other device to produce a computer-implemented process, thereby causing the instructions executed on the computer, other programmable data processing apparatus, or other device to perform the functions / actions specified in one or more blocks of a flowchart and / or block diagram.
[0115] The flowcharts and block diagrams in the accompanying drawings illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of the present invention. In this regard, each block in a flowchart or block diagram may represent a module, segment, or portion of an instruction containing one or more executable instructions for implementing a specified logical function. In some alternative implementations, the functions marked in the blocks may occur in a different order than those marked in the drawings. For example, two consecutive blocks may actually be executed substantially in parallel, and they may sometimes be executed in reverse order, depending on the functions involved. It should also be noted that each block in the block diagrams and / or flowcharts, and combinations of blocks in the block diagrams and / or flowcharts, can be implemented using a dedicated hardware-based system that performs the specified function or action, or using a combination of dedicated hardware and computer instructions. It will be known to those skilled in the art that implementation in hardware, implementation in software, and implementation using a combination of software and hardware are equivalent.
[0116] The various embodiments of the present invention have been described above. These descriptions are exemplary and not exhaustive, and are not limited to the disclosed embodiments. Many modifications and variations will be apparent to those skilled in the art without departing from the scope and spirit of the described embodiments. The terminology used herein is chosen to best explain the principles of the embodiments, their practical application, or their technical improvement over technologies found in the marketplace, or to enable others skilled in the art to understand the embodiments disclosed herein. The scope of the invention is defined by the appended claims.
Claims
1. A method for constructing a vegetation drought index by integrating soil information and multi-source remote sensing data, comprising the following steps:
constructing an index dataset of multi-source remote sensing data for given historical years, the index dataset including a precipitation condition index dataset, a temperature condition index dataset, a vegetation condition index dataset, a water body index dataset, a temperature-vegetation-drought index dataset, and a soil moisture-drought index dataset;
standardizing the index dataset using Z-scores, identifying pixels whose Z-score exceeds a set threshold as outliers based on the data distribution, and removing the outliers from the data to obtain a preprocessed dataset;
performing principal component analysis on the preprocessed dataset to extract the first principal component as the comprehensive drought index, and filling the pixels that were outliers in the original image back into the comprehensive drought index result to obtain the regional comprehensive drought index; and
assessing the drought risk of the target area based on the regional comprehensive drought index.
2. The method according to claim 1, characterized in that performing principal component analysis on the preprocessed index dataset includes:
standardizing the preprocessed index dataset to obtain standardized data;
calculating the covariance matrix of the standardized data;
performing eigenvalue decomposition on the covariance matrix to obtain eigenvalues and corresponding eigenvectors, where each eigenvalue represents the variance of the data in the direction of its eigenvector; and
selecting, based on the magnitude of the obtained eigenvalues, the eigenvectors corresponding to the k largest eigenvalues as principal components, and projecting the original data onto the principal components to obtain a dimensionality-reduced data matrix, which serves as the comprehensive drought index, expressed as follows:
$$X_{new} = X V_k$$
where k is the dimension after dimensionality reduction, $X$ represents the original data before dimensionality reduction, and $X_{new}$ is the data matrix after dimensionality reduction.
3. The method according to claim 1, characterized in that the precipitation condition index dataset is constructed according to the following steps:
acquiring NASA/GPM_L3 data, compositing it into 10-day periods, and resampling it to a spatial resolution of 1 km to form a historical precipitation condition index dataset with a 10-day interval; and
calculating the PCI value pixel by pixel according to the following formula:
$$PCI = \frac{P_i - P_{min}}{P_{max} - P_{min}}$$
where $P_i$ is the rainfall of pixel i, $P_{min}$ is the minimum value of the historical dataset for pixel i, and $P_{max}$ is the maximum value of the historical dataset for pixel i.
4. The method according to claim 1, characterized in that the temperature condition index dataset, the vegetation condition index dataset, the water body index dataset, and the temperature-vegetation-drought index dataset are constructed according to the following steps:
obtaining MOD13Q1, MYD13Q1, MOD11A2, and MYD11A2 from the MODIS data, merging datasets of the same type, and averaging over 10-day periods to calculate the vegetation condition index dataset and the temperature condition index dataset;
obtaining MOD13Q1, MYD13Q1, MOD11A2, and MYD11A2 from the MODIS data, merging datasets of the same type, averaging over 10-day periods, obtaining the NDVI-LST spatial distribution based on the historical datasets, and then calculating the temperature-vegetation-drought index dataset; and
obtaining MOD09GA and MYD09GA from the MODIS data, filtering out cloud cover and compositing the data, merging the datasets, and averaging over 10-day periods to calculate the water body index dataset.
5. The method according to claim 1, characterized in that the soil moisture drought index dataset is constructed according to the following steps:
obtaining soil sand content, clay content, organic matter content, and soil bulk density datasets for depths of 0 cm, 10 cm, 30 cm, 60 cm, and 100 cm from the global soil dataset SoilGrids;
correcting the soil data for the 0-100 cm depth as follows: for each soil layer, multiplying the soil bulk density (SBK) × 0.01 × the depth of the layer by the sand, clay, and organic matter content of that layer, summing the corrected values over the layers, and dividing by the soil bulk density for the 0-100 cm depth;
calculating the field capacity and wilting point of the 0-100 cm soil layer from the corrected 0-100 cm soil sand content, clay content, organic matter content, and soil bulk density data; and
obtaining the 0-100 cm soil moisture dataset from Soil Moisture Active Passive (SMAP) observation data, averaging the data every 10 days to obtain the soil moisture dataset for the historical years, and then calculating the soil moisture drought index dataset.
6. The method according to claim 4, characterized in that the temperature condition index dataset, the vegetation condition index dataset, the water body index dataset, and the temperature-vegetation-drought index dataset are respectively calculated based on the following formulas:
$$TCI = \frac{LST_{max} - LST_i}{LST_{max} - LST_{min}}$$
$$VCI = \frac{NDVI_i - NDVI_{min}}{NDVI_{max} - NDVI_{min}}$$
$$TVDI = \frac{LST_{max} - LST_i}{LST_{max} - LST_{min}}$$
$$NDWI = \frac{G - NIR}{G + NIR}$$
where TCI represents the temperature condition index, $LST_{max}$ is the maximum value in the historical data for the same period, $LST_{min}$ is the minimum value in the historical data for the same period, and $LST_i$ is the current surface temperature; VCI represents the vegetation condition index, $NDVI_i$ is the NDVI value of pixel i, $NDVI_{min}$ is the minimum value of the NDVI historical dataset for the same period, and $NDVI_{max}$ is the maximum value of the NDVI historical dataset for the same period; NDWI represents the water body index, G is the green-band reflectance, and NIR is the near-infrared reflectance; TVDI represents the temperature vegetation drought index, for which $LST_i$ is the current surface temperature, $LST_{max} = a \cdot NDVI_{max} + b$, and $LST_{min} = d \cdot NDVI_{min} + e$, where a and d are the slopes of the corresponding terms, b and e are the intercepts of the corresponding terms, and $NDVI_{max}$ and $NDVI_{min}$ are the maximum and minimum NDVI values for different LST thresholds.
7. The method according to claim 2, characterized in that eigenvalue decomposition is performed according to the following formula:
$$C v_i = \lambda_i v_i$$
where $C$ is the symmetric covariance matrix, $\lambda_1, \lambda_2, \ldots, \lambda_d$ are the eigenvalues of the covariance matrix, and $v_1, v_2, \ldots, v_d$ are the corresponding eigenvectors.
8. The method according to claim 1, characterized in that, during data preprocessing and before Z-score standardization is performed on the index dataset, the vegetation condition index dataset and the water body index dataset are lagged by a predetermined number of days relative to the other index datasets.
9. A computer-readable storage medium having a computer program stored thereon, wherein, when the computer program is executed by a processor, it implements the steps of the method according to any one of claims 1 to 8.
10. A computer device comprising a memory and a processor, wherein a computer program capable of running on the processor is stored in the memory, characterized in that, when the processor executes the computer program, it implements the steps of the method according to any one of claims 1 to 8.