Unmanned aerial vehicle multi-spectral remote sensing water body fine classification extraction method

By constructing a joint feature matrix and a multilayer perceptron model, combined with morphological processing, the accuracy problem of UAV multispectral remote sensing water body extraction in complex terrain scenarios was solved, and high-fidelity water body classification was achieved.

CN121904641B · Active · Publication Date: 2026-06-23 · SICHUAN PASTEUR ENVIRONMENTAL PROTECTION TECH CO LTD

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
SICHUAN PASTEUR ENVIRONMENTAL PROTECTION TECH CO LTD
Filing Date
2026-03-19
Publication Date
2026-06-23

AI Technical Summary

Technical Problem

Existing UAV multispectral remote sensing water extraction methods suffer from decreased accuracy in complex terrain scenarios with overlapping features, especially in areas shaded by buildings and tall trees where misclassification is common. Furthermore, the lack of standardized post-processing optimization procedures leads to noise and boundary distortion.

Method used

Spectral features, grayscale texture features, and elevation difference features are combined into a joint feature matrix that serves as the input to a multilayer perceptron model. Feature importance selection screens the spectral features, K-Means clustering generates the water and non-water labels used to train and optimize the model parameters, and the trained model performs pixel-level classification of water bodies, followed by morphological dilation and erosion filtering.

Benefits of technology

It improves the accuracy of classifying complex water bodies, eliminates shadow artifacts, smooths boundaries, and achieves high-fidelity water body extraction.



Abstract

The application discloses a method for fine classification and extraction of water bodies using unmanned aerial vehicle (UAV) multispectral remote sensing, and relates to the technical field of remote sensing image processing. The application creatively constructs a joint feature matrix comprising spectral features, grayscale texture features, and elevation difference features, which effectively eliminates redundant interference information and provides a highly discriminative, high-quality input for the multilayer perceptron model. At the same time, illumination geometry and the digital elevation model are deeply coupled during feature construction. This not only accurately identifies and suppresses the shadow pseudo-water features cast by complex terrain or buildings, overcoming the failure of traditional elevation-only filtering in flat shadow areas, but also, together with a standardized closing operation based on morphological dilation and erosion filtering, eliminates small holes and jagged artifacts in the prediction results and restores water body boundaries with high fidelity, thereby improving the classification accuracy for complex water bodies such as narrow rivers, fragmented pits and ponds, and urban waterlogging.

Description

Technical Field

[0001] This invention belongs to the field of remote sensing image processing technology, and specifically relates to a method for fine classification and extraction of water bodies using UAV multispectral remote sensing.

Background Technology

[0002] In recent years, UAV multispectral remote sensing technology, with its advantages of ultra-high spatial resolution, flexible and convenient operation, controllable monitoring costs, and rapid emergency response, has gradually become a research hotspot and application focus in the field of refined water environment monitoring. Equipped with multispectral sensors, UAVs can capture the spectral reflectance characteristics of ground objects in different bands, providing a large, high-value basic data source for refined analysis of the water environment in complex scenarios and playing an increasingly important role in regional water ecological protection and routine inspections. However, existing UAV-based multispectral water extraction methods still have many technical shortcomings. In complex scenarios where ground objects such as vegetation shadows, exposed bare soil, and building reflections are intermingled, the classification accuracy of spectral feature models decreases significantly. In particular, the large-area shadows cast by buildings and tall trees have spectral characteristics highly similar to those of real water bodies and are easily misclassified as water pixels. In addition, existing elevation-assisted identification methods often fail for tree canopy shadows or building roof shadows cast on flat ground, because the local elevation change there is close to zero. Not only is it difficult to accurately remove these deep shadow artifacts, but most extraction methods also lack standardized post-processing optimization procedures for the classification results. As a result, the initial classification results are filled with salt-and-pepper noise, small holes, and jagged boundaries. The extracted water body boundaries are severely distorted, and the final classification results are difficult to apply directly to engineering practice and management work.

Summary of the Invention

[0003] To address the shortcomings of existing technologies, this invention provides a method for fine-grained classification and extraction of water bodies using UAV multispectral remote sensing, thereby solving the aforementioned technical problems.

[0004] A method for refined classification and extraction of water bodies using UAV multispectral remote sensing includes the following steps:

[0005] Step S1: Acquire multispectral remote sensing data of the water body in the target area and preprocess it to generate multispectral reflectance data and digital elevation model data;

[0006] Step S2: Based on the multispectral reflectance data, extract initial sample data including water samples and non-water samples, perform spectral feature selection based on feature importance selection on the initial sample data, and obtain the selection results as intermediate sample data;

[0007] Step S3: Process the intermediate sample data using the K-Means clustering algorithm to obtain the classification result labels for water bodies and non-water bodies;

[0008] Step S4: Extract grayscale texture features from intermediate sample data based on the multispectral reflectance data, extract elevation difference features from intermediate sample data based on the digital elevation model data, construct a joint feature matrix containing spectral features, grayscale texture features and elevation difference features, use the classification result label as the target and the joint feature matrix as the input to train and optimize the parameters of the multilayer perceptron model to obtain a trained multilayer perceptron model.

[0009] Step S5: Use the trained multilayer perceptron model to perform pixel-level classification prediction of the water body in the target area, and perform a morphological closing operation (dilation followed by erosion filtering) on the prediction results to obtain the final water body extraction result.

[0010] Preferably, step S1, when acquiring multispectral remote sensing data of the water body in the target area, specifically includes the following steps:

[0011] Using drones equipped with multispectral imaging sensors, on-site image collection is carried out on the target area's water body and surrounding area according to preset flight routes, flight altitudes, and overlap.

[0012] Acquire multispectral remote sensing data including green band and near-infrared characteristic bands.

[0013] Preferably, step S1, in which multispectral reflectance data and digital elevation model data are generated, specifically includes the following steps:

[0014] Correction processing is performed on multispectral remote sensing data;

[0015] The multispectral remote sensing data that has undergone the aforementioned correction process are fused and stitched together to generate multispectral reflectance data with absolute radiance values, and digital elevation model data for the same region are extracted simultaneously.

[0016] Preferably, the correction process includes at least one of radiometric calibration, geometric correction, and atmospheric correction.

[0017] Preferably, step S2, when extracting initial sample data based on the multispectral reflectance data, specifically includes the following steps:

[0018] Based on the green band and near-infrared band in the multispectral reflectance data, the normalized water index of each pixel is calculated.

[0019] Using the normalized water index segmentation threshold, pixels are initially divided into water body pixels and non-water body pixels;

[0020] For the water body pixels and non-water body pixels, the full-band spectral gray values, as well as the spectral mean and spectral variance features calculated based on the local spatial window, are extracted and integrated to form the initial sample data.
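
For reference, the normalized water index computed from the green and near-infrared bands in [0018] is conventionally written as below. The symbols for the band reflectances are notation assumed here for illustration and do not appear in the patent text; the bounds hold for non-negative reflectances that are not both zero.

```latex
\mathrm{NDWI} = \frac{\rho_{\mathrm{green}} - \rho_{\mathrm{NIR}}}{\rho_{\mathrm{green}} + \rho_{\mathrm{NIR}}},
\qquad -1 \le \mathrm{NDWI} \le 1
```

Water reflects comparatively more in the green band than in the near-infrared band, so water pixels tend toward positive values while vegetation and bare soil tend toward negative values.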

[0021] Preferably, step S2, when performing spectral feature screening based on feature importance selection on the initial sample data and obtaining the screening results as intermediate sample data, specifically includes the following steps:

[0022] Random forest or gradient boosting decision tree algorithms are used to calculate the importance score of each feature in the initial sample data for distinguishing between water bodies and non-water bodies, and this importance score is used as the contribution of each feature.

[0023] Sort the contributions by size and set a contribution threshold;

[0024] Features whose contribution is higher than a preset contribution threshold are selected to form a feature subset, and the sample data containing the feature subset is used as intermediate sample data.

[0025] Preferably, step S3, when processing the intermediate sample data using the K-Means clustering algorithm to obtain the classification result labels, specifically includes the following steps:

[0026] Use intermediate sample data as the input dataset;

[0027] The dataset is divided into a water body cluster and a non-water body cluster by iteratively optimizing the cluster centers using the K-Means clustering algorithm.

[0028] The mean normalized water index of the two cluster centers is calculated and compared. The cluster with the higher mean is labeled as the water cluster, and the cluster with the lower mean is labeled as the non-water cluster, thereby obtaining the classification result labels for all samples.

[0029] Preferably, step S4, in which a joint feature matrix comprising spectral features, grayscale texture features, and elevation difference features is constructed, specifically includes the following steps:

[0030] The original multispectral reflectance data is fused into a single-channel grayscale image according to the preset band weights, and the gray-level co-occurrence matrix is calculated to extract the grayscale texture features of each pixel.

[0031] Based on digital elevation model data, the elevation variation range or mean difference between each pixel and its preset neighboring pixels is calculated as the elevation difference feature.

[0032] Index extraction is performed at the sample location points, and the intermediate sample data, the grayscale texture features, and the elevation difference features are spliced and fused channel-wise at the pixel scale to construct the joint feature matrix.

[0033] Preferably, step S4, which involves training and optimizing the multilayer perceptron model using the classification result label as the target and the joint feature matrix as the input, specifically includes the following steps:

[0034] A multilayer perceptron model is constructed using the classification result labels as supervision signals and the joint feature matrix as input features.

[0035] The number of hidden layer neurons, learning rate, and number of iterations of the multilayer perceptron model are optimized using grid search or Bayesian optimization algorithms as hyperparameter optimization algorithms until the loss function converges, thus obtaining the trained multilayer perceptron model.
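
As a non-limiting illustration of [0034] and [0035], the following minimal sketch trains a small multilayer perceptron and selects hyperparameters by grid search. It is written in plain Python, which the patent does not prescribe. The class `TinyMLP`, the function `grid_search`, the two-column toy features (a water index and a normalized elevation range), and all numeric values are assumptions made for illustration. The sketch scores each configuration by its final training loss for brevity; a held-out validation split would normally be used instead.

```python
import itertools
import math
import random
from typing import Dict, List, Sequence, Tuple

Sample = Sequence[float]


def sigmoid(z: float) -> float:
    z = max(-60.0, min(60.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))


class TinyMLP:
    """One tanh hidden layer and a sigmoid output (1 = water, 0 = non-water)."""

    def __init__(self, n_in: int, n_hidden: int, seed: int = 0) -> None:
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b2 = 0.0

    def forward(self, x: Sample) -> Tuple[List[float], float]:
        hidden = [math.tanh(sum(w * v for w, v in zip(row, x)) + b)
                  for row, b in zip(self.w1, self.b1)]
        prob = sigmoid(sum(w * h for w, h in zip(self.w2, hidden)) + self.b2)
        return hidden, prob

    def loss(self, xs: Sequence[Sample], ys: Sequence[int]) -> float:
        """Mean binary cross-entropy over the samples."""
        eps = 1e-12
        total = 0.0
        for x, y in zip(xs, ys):
            p = self.forward(x)[1]
            total -= y * math.log(p + eps) + (1 - y) * math.log(1.0 - p + eps)
        return total / len(xs)

    def train(self, xs: Sequence[Sample], ys: Sequence[int],
              lr: float, epochs: int) -> float:
        """Per-sample gradient descent; returns the final mean loss."""
        for _ in range(epochs):
            for x, y in zip(xs, ys):
                hidden, prob = self.forward(x)
                d_out = prob - y  # gradient of the loss w.r.t. the output logit
                for j, h in enumerate(hidden):
                    d_hid = d_out * self.w2[j] * (1.0 - h * h)
                    self.w2[j] -= lr * d_out * h
                    for i, v in enumerate(x):
                        self.w1[j][i] -= lr * d_hid * v
                    self.b1[j] -= lr * d_hid
                self.b2 -= lr * d_out
        return self.loss(xs, ys)

    def predict(self, x: Sample) -> int:
        return int(self.forward(x)[1] >= 0.5)


def grid_search(xs, ys, hidden_sizes, learning_rates, epoch_options):
    """Try every combination and keep the model with the lowest final loss."""
    best_loss, best_cfg, best_model = math.inf, {}, None
    for n_hidden, lr, epochs in itertools.product(
            hidden_sizes, learning_rates, epoch_options):
        model = TinyMLP(len(xs[0]), n_hidden)
        final_loss = model.train(xs, ys, lr, epochs)
        if final_loss < best_loss:
            best_loss = final_loss
            best_cfg = {"hidden": n_hidden, "lr": lr, "epochs": epochs}
            best_model = model
    return best_cfg, best_model


# Hypothetical joint features: (water index, normalized elevation range).
xs = [(0.60, 0.00), (0.50, 0.10), (0.70, 0.05), (0.55, 0.00),    # water
      (-0.50, 0.20), (-0.60, 0.10), (0.40, 0.90), (0.30, 1.00)]  # land, shadow
ys = [1, 1, 1, 1, 0, 0, 0, 0]
best_cfg, best_model = grid_search(xs, ys, [2, 4], [0.05, 0.5], [100, 300])
accuracy = sum(best_model.predict(x) == y for x, y in zip(xs, ys)) / len(xs)
```

The last two toy samples stand in for building shadows: their water index resembles that of water, and only the elevation column separates them, which is the role the joint feature matrix assigns to the elevation difference feature.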

[0036] Preferably, in step S5, when the trained multilayer perceptron model is used to perform classification prediction of the water body in the target area, and the prediction results are processed sequentially by a morphological closing operation of dilation followed by erosion filtering to obtain the final water body extraction result, the specific steps include:

[0037] A dilation operation is first performed on the prediction results to fill the small holes and narrow gaps in the water area;

[0038] Then, an erosion operation is performed on the dilated result to smooth the water body boundary, and the final water body extraction result after filtering and noise reduction is output.
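
The two steps above amount to a morphological closing. The following minimal sketch, in plain Python on a binary mask stored as nested lists, illustrates the idea. The square structuring element, the `radius` parameter, the function names, and the choice to ignore neighbors that fall outside the image are assumptions made for illustration; the patent only states that preset structuring elements are used.

```python
from typing import List

Mask = List[List[int]]


def _window(mask: Mask, row: int, col: int, radius: int) -> List[int]:
    """Values of the in-bounds neighbors inside a square structuring element."""
    rows, cols = len(mask), len(mask[0])
    return [mask[r][c]
            for r in range(max(0, row - radius), min(rows, row + radius + 1))
            for c in range(max(0, col - radius), min(cols, col + radius + 1))]


def dilate(mask: Mask, radius: int = 1) -> Mask:
    """A pixel becomes water if any neighbor in the window is water."""
    return [[int(any(_window(mask, r, c, radius)))
             for c in range(len(mask[0]))] for r in range(len(mask))]


def erode(mask: Mask, radius: int = 1) -> Mask:
    """A pixel stays water only if every neighbor in the window is water."""
    return [[int(all(_window(mask, r, c, radius)))
             for c in range(len(mask[0]))] for r in range(len(mask))]


def closing(mask: Mask, radius: int = 1) -> Mask:
    """Dilation followed by erosion with the same structuring element."""
    return erode(dilate(mask, radius), radius)


# Hypothetical prediction: a 3x3 water patch with a one-pixel hole.
predicted = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 1, 0, 0],
    [0, 0, 1, 0, 1, 0, 0],
    [0, 0, 1, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
]
closed = closing(predicted)
```

In this example the hole at row 3, column 3 is filled while the outer extent of the patch is unchanged, because the erosion undoes the outward growth introduced by the dilation.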

[0039] The beneficial effects of this invention are as follows: This invention creatively constructs a joint feature matrix that includes spectral features, grayscale texture features, and elevation difference features, effectively eliminating redundant interference information and providing a highly discriminative, high-quality input for the multilayer perceptron model. Simultaneously, it deeply couples illumination geometry and digital elevation models during feature construction, which not only accurately identifies and suppresses pseudo-water features cast by complex terrain or buildings, overcoming the limitations of traditional elevation-only filtering methods in flat shadow areas, but also, in conjunction with a standardized closing operation based on morphological dilation and erosion filtering, eliminates small holes and jagged artifacts in the prediction results, achieving high-fidelity restoration of water body boundaries, thereby improving the classification accuracy of complex water bodies such as narrow rivers, fragmented ponds, and urban flooding.

Attached Figure Description

[0040] To more clearly illustrate the technical solutions in the embodiments of this application or the prior art, the drawings used in the description of the embodiments or the prior art will be briefly introduced below. Obviously, the drawings described below are only some embodiments of this application. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.

[0041] Figure 1 is a schematic diagram of the UAV multispectral remote sensing water body fine classification and extraction method provided by the present invention;

[0042] Figure 2 shows the variation of the loss function of the multilayer perceptron model with training rounds in the UAV multispectral remote sensing water body fine classification and extraction method provided by the present invention;

[0043] Figure 3 shows the change in accuracy of the multilayer perceptron model with training rounds in the UAV multispectral remote sensing water body fine classification and extraction method provided by the present invention;

[0044] Figure 4 shows the original multispectral remote sensing data used in the UAV multispectral remote sensing water body fine classification and extraction method provided by the present invention;

[0045] Figure 5 shows the water body classification result of the UAV multispectral remote sensing water body fine classification and extraction method provided by the present invention.

Detailed Implementation

[0046] The following disclosure provides many different embodiments or examples for implementing various embodiments of the invention. To simplify the disclosure, specific embodiments are described below. Of course, these are merely examples and are not intended to limit the scope of the invention.

[0047] The embodiments of the invention will now be described in detail with reference to the accompanying drawings.

[0048] As shown in Figure 1, a method for refined classification and extraction of water bodies using UAV multispectral remote sensing includes the following steps:

[0049] Step S1: Acquire multispectral remote sensing data of the water body in the target area and preprocess it to generate multispectral reflectance data and digital elevation model data;

[0050] Step S2: Based on the multispectral reflectance data, extract initial sample data including water samples and non-water samples, perform spectral feature selection based on feature importance selection on the initial sample data, and obtain the selection results as intermediate sample data;

[0051] Step S3: Process the intermediate sample data using the K-Means clustering algorithm to obtain the classification result labels for water bodies and non-water bodies;

[0052] Step S4: Extract grayscale texture features from intermediate sample data based on the multispectral reflectance data, extract elevation difference features from intermediate sample data based on the digital elevation model data, construct a joint feature matrix containing spectral features, grayscale texture features and elevation difference features, use the classification result label as the target and the joint feature matrix as the input to train and optimize the parameters of the multilayer perceptron model to obtain a trained multilayer perceptron model.

[0053] Step S5: Use the trained multilayer perceptron model to perform pixel-level classification prediction of the water body in the target area, and perform a morphological closing operation (dilation followed by erosion filtering) on the prediction results to obtain the final water body extraction result.

[0054] In practical implementation, the multispectral imagery and digital elevation model (DEM) data of an urban river section, as shown in Figure 4, are first obtained, and the relevant water body indices are calculated to extract initial water and land samples. A random forest algorithm is then used to evaluate the contribution of each feature, and redundant bands are removed to output dimensionality-reduced intermediate sample data. Subsequently, K-Means clustering is used to perform unsupervised segmentation of the intermediate samples, automatically generating water and non-water body labels. On this basis, the gray-level co-occurrence matrix texture features and the DEM-based local neighborhood elevation range features are calculated. These are then combined with the selected spectral features to construct a multidimensional joint feature matrix, which is fed as input into a multilayer perceptron model for supervised parameter optimization training. After the model outputs pixel-level prediction results, morphological dilation is first performed using preset structuring elements to fill small holes in the water surface. Erosion filtering is then applied to smooth the jagged shoreline, completing the closing operation and outputting the final result. Figure 5 shows the resulting classification. The core principle of this design lies in breaking through the information bottleneck of a single optical dimension. By introducing physical elevation parameters and spatial texture attributes, the low-dimensional spectrum is mapped to a high-dimensional joint feature space. The nonlinear fitting capability of the multilayer perceptron is used to resolve the strong optical confusion between real water bodies and shadows, and a spatial morphological correction is added.

The design principle of this scheme is to introduce three-dimensional spatial constraints through elevation range features, enabling the multilayer perceptron to nonlinearly separate, in the hidden feature space, building shadows with "low reflectivity and abrupt elevation changes" from real water bodies with "low reflectivity and flat elevation". At the same time, morphological structuring elements are used to fill isolated holes. Compared with existing techniques that only use spectral threshold segmentation, this method blocks the spectral confusion caused by shadow areas when processing scenes with tall buildings or tree canopies, smooths the jagged boundaries caused by mixed pixels, and outputs a high-fidelity water mask without holes.

[0055] More specifically, in step S1, when acquiring multispectral remote sensing data of the target area's water body, the following steps are included:

[0056] Using drones equipped with multispectral imaging sensors, on-site image collection is carried out on the target area's water body and surrounding area according to preset flight routes, flight altitudes, and overlap.

[0057] Acquire multispectral remote sensing data including green band and near-infrared characteristic bands.

[0058] More specifically, in step S1, generating multispectral reflectance data and digital elevation model data includes the following steps:

[0059] Correction processing is performed on multispectral remote sensing data;

[0060] The multispectral remote sensing data that has undergone the aforementioned correction process are fused and stitched together to generate multispectral reflectance data with absolute radiance values, and digital elevation model data for the same region are extracted simultaneously.

[0061] More specifically, the correction process includes at least one of radiometric calibration, geometric correction, and atmospheric correction.

[0062] In a specific implementation of complementary remote sensing monitoring of a complex urban river network, a UAV equipped with a multispectral imaging sensor is used. A preset relative flight altitude is set, together with a high image overlap of 80% in the forward (heading) direction and 70% in the side (lateral) direction. Raw image data in the blue, green, red, and near-infrared bands are collected along the planned flight path to cover the target area. Subsequently, radiometric calibration of the raw digital number (DN) values is performed using a ground standard reflectance calibration panel. Atmospheric correction is performed using the dark pixel method to eliminate absorption and scattering interference from atmospheric molecules and aerosols. Geometric orthorectification is performed by introducing geographic coordinates and elevation control points. Finally, the single-scene image sequence is fused and stitched into a multispectral orthoreflectance image with absolute physical radiometric values. Using a structure-from-motion algorithm on the high-overlap images, high-precision digital elevation model data of the same area are simultaneously calculated and output. The core principle of this front-end data acquisition and preprocessing design is that the radiant energy received by the raw sensors varies significantly in space and time under the combined influence of ground illumination conditions, atmospheric attenuation, and sensor dark current. By introducing radiometric calibration and systematic atmospheric and geometric corrections, the relative radiance values are converted into absolute reflectance, which characterizes the inherent physical and optical properties of ground objects. Furthermore, the three-dimensional terrain reference is reconstructed using the principle of multi-view stereo photogrammetry. Compared with the conventional practice of directly classifying uncalibrated RGB images or raw quantized values, this implementation scheme eliminates the radiometric distortion introduced by drastic changes in illumination and atmospheric fluctuations, as well as the geometric projection differences caused by terrain undulations. This allows downstream feature engineering and model training to be based on unified and objective physical dimensions.
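
As a rough illustration of the calibration and dark pixel steps described in [0062], the sketch below converts raw digital numbers of one band to reflectance with a single reference panel and then subtracts the darkest pixel. This is a deliberately simplified stand-in: the single-gain panel model, the function names, and all numeric values are assumptions made for illustration, and a production workflow would also handle sensor offsets, vignetting, and illumination changes during the flight.

```python
from typing import List

Band = List[List[float]]


def panel_gain(panel_dn_mean: float, panel_reflectance: float) -> float:
    """Scale factor that maps raw digital numbers to reflectance.

    Assumes a linear sensor response with zero offset (a simplification).
    """
    if panel_dn_mean <= 0:
        raise ValueError("panel DN mean must be positive")
    return panel_reflectance / panel_dn_mean


def calibrate(band_dn: Band, gain: float) -> Band:
    """Apply the panel-derived gain to every pixel of one band."""
    return [[dn * gain for dn in row] for row in band_dn]


def dark_object_subtract(band: Band) -> Band:
    """Subtract the darkest pixel value, treated as the atmospheric offset."""
    dark = min(min(row) for row in band)
    return [[value - dark for value in row] for row in band]


# Hypothetical near-infrared band and a panel of known reflectance 0.5.
raw_nir = [[1200.0, 1500.0],
           [9000.0, 30000.0]]
gain = panel_gain(panel_dn_mean=30000.0, panel_reflectance=0.5)
reflectance = dark_object_subtract(calibrate(raw_nir, gain))
```

With these numbers the darkest pixel maps to a reflectance of 0 and the brightest to 0.48, so all later features are expressed on a common physical scale rather than in sensor counts.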

[0063] More specifically, in step S2, when extracting initial sample data based on the multispectral reflectance data, the following steps are included:

[0064] Based on the green band and near-infrared band in the multispectral reflectance data, the normalized water index of each pixel is calculated.

[0065] Using the normalized water index segmentation threshold, pixels are initially divided into water body pixels and non-water body pixels;

[0066] For the water body pixels and non-water body pixels, the full-band spectral gray values, as well as the spectral mean and spectral variance features calculated based on the local spatial window, are extracted and integrated to form the initial sample data.

[0067] In the specific implementation of extracting initial sample data, the normalized water index of each pixel is first calculated based on the green and near-infrared bands in the multispectral reflectance data. The core principle is to exploit the strong absorption of water in the near-infrared band and its comparatively higher reflectance in the green band. By taking the difference in reflectance between the two bands and dividing it by their sum, the water characteristics are enhanced and the interference from vegetation and soil background is suppressed. To address the shift in index distribution caused by differences in lighting conditions and water quality between flights, this embodiment abandons the traditional fixed zero-value segmentation and adopts the Otsu method, which traverses the normalized water index histogram of all pixels and adaptively solves for the optimal segmentation threshold by maximizing the inter-class variance. This threshold is used to make an initial, objective division of all pixels in the scene into water pixels and non-water pixels. Then, because an individual pixel is easily affected by local noise and carries no spatial information, the full-band spectral gray values of the segmented samples are extracted, and in addition a 3×3 or 5×5 local spatial window centered on the target pixel is constructed. The gray values of all pixels in each band within this neighborhood are extracted, and the spectral mean and spectral variance, which reflect the central value and the degree of dispersion within this local spatial range, are calculated. Finally, the above multi-dimensional features are integrated and concatenated to form the initial sample data.
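
The adaptive thresholding described in [0067] can be sketched as follows. This is a minimal plain-Python illustration of Otsu's criterion on a histogram of index values; the bin count, the function names, and the six toy reflectance pairs are assumptions made for illustration and are not taken from the patent.

```python
from typing import List, Sequence


def ndwi(green: float, nir: float) -> float:
    """Normalized water index for one pixel: (green - nir) / (green + nir)."""
    total = green + nir
    return 0.0 if total == 0 else (green - nir) / total


def otsu_threshold(values: Sequence[float], bins: int = 64) -> float:
    """Return the histogram cut that maximizes the between-class variance."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return lo
    width = (hi - lo) / bins
    hist = [0] * bins
    for v in values:
        hist[min(int((v - lo) / width), bins - 1)] += 1
    centers = [lo + (i + 0.5) * width for i in range(bins)]
    total = len(values)
    total_sum = sum(c * h for c, h in zip(centers, hist))
    best_score, best_cut = -1.0, lo
    count0, sum0 = 0, 0.0
    for i in range(bins - 1):
        count0 += hist[i]
        sum0 += centers[i] * hist[i]
        count1 = total - count0
        if count0 == 0 or count1 == 0:
            continue
        mean0 = sum0 / count0
        mean1 = (total_sum - sum0) / count1
        # Proportional to the between-class variance (constant factor dropped).
        score = count0 * count1 * (mean0 - mean1) ** 2
        if score > best_score:
            best_score = score
            best_cut = lo + (i + 1) * width  # upper edge of bin i
    return best_cut


# Hypothetical (green, nir) reflectances: three water-like, three land-like.
pixels = [(0.10, 0.02), (0.09, 0.03), (0.11, 0.02),
          (0.08, 0.40), (0.10, 0.35), (0.12, 0.45)]
index = [ndwi(g, n) for g, n in pixels]
threshold = otsu_threshold(index)
water_mask: List[bool] = [v > threshold for v in index]
```

On this toy data the threshold falls between the two groups of index values rather than at zero, which is the behavior the embodiment relies on when lighting or water quality shifts the whole distribution.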

[0068] More specifically, in step S2, when performing spectral feature screening based on feature importance selection on the initial sample data and obtaining the screening results as intermediate sample data, the following steps are specifically included:

[0069] Random forest or gradient boosting decision tree algorithms are used to calculate the importance score of each feature in the initial sample data for distinguishing between water bodies and non-water bodies, and this importance score is used as the contribution of each feature.

[0070] Sort the contributions by size and set a contribution threshold;

[0071] Features whose contribution is higher than a preset contribution threshold are selected to form a feature subset, and the sample data containing the feature subset is used as intermediate sample data.

[0072] In the specific implementation process, after acquiring the multidimensional initial sample data, the sample matrix integrating the full-band spectrum, local mean, and local variance is input into the random forest algorithm to construct multiple decision trees. By calculating the mean decrease in Gini impurity contributed by each feature variable at the split nodes, the importance score of each feature dimension in distinguishing water from non-water bodies is quantified and defined as its contribution. Subsequently, all feature dimensions are sorted in descending order of contribution, and a preset contribution threshold is applied as a hard cutoff, removing redundant features whose scores fall below the threshold. Finally, the core features that meet the contribution threshold are selected to form a compact feature subset, and the samples containing only this subset are output downstream as intermediate sample data. It should be noted that the preset contribution threshold is a numerical cutoff set manually after the random forest or gradient boosting decision tree outputs the feature contribution ranking. After multi-band and spatial neighborhood features are fused, UAV remote sensing data is highly susceptible to the curse of dimensionality and to multicollinearity in the high-dimensional space. Using the information gain evaluation mechanism built into tree-based ensemble models, the dimensions that substantially support the classification decision can be selected objectively in a purely data-driven manner. Compared with the conventional practice of indiscriminately inputting all features into the classifier, this implementation scheme eliminates interfering bands that contribute little to water body identification or easily lead to model overfitting. This not only significantly reduces the dimensionality of the input matrix and lowers the computational cost of the subsequent multilayer perceptron model, but also effectively prevents redundant noise generated by complex terrain environments from perturbing the model's decision boundary along uninformative feature dimensions.
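
The ranking and cutoff logic of [0072] can be illustrated with a much smaller stand-in. The sketch below scores each feature by the largest Gini impurity decrease achievable with a single threshold split, normalizes the scores into contributions, and keeps the features above a cutoff. It is not a random forest or gradient boosting model: a real ensemble averages such decreases over many trees and nodes. The feature names, sample values, and the 0.2 cutoff are assumptions made for illustration.

```python
from typing import Dict, List, Sequence


def gini(labels: Sequence[int]) -> float:
    """Gini impurity of a binary label set (1 = water, 0 = non-water)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def best_gini_decrease(feature: Sequence[float], labels: Sequence[int]) -> float:
    """Largest impurity decrease from one threshold split on this feature."""
    parent = gini(labels)
    n = len(labels)
    best = 0.0
    for cut in sorted(set(feature))[:-1]:
        left = [y for x, y in zip(feature, labels) if x <= cut]
        right = [y for x, y in zip(feature, labels) if x > cut]
        child = (len(left) * gini(left) + len(right) * gini(right)) / n
        best = max(best, parent - child)
    return best


def select_features(contribution: Dict[str, float], threshold: float) -> List[str]:
    """Sort by contribution (descending) and keep those above the cutoff."""
    ranked = sorted(contribution.items(), key=lambda kv: kv[1], reverse=True)
    return [name for name, score in ranked if score > threshold]


# Hypothetical initial samples: three water pixels followed by three land pixels.
labels = [1, 1, 1, 0, 0, 0]
features = {
    "nir": [0.02, 0.03, 0.02, 0.40, 0.35, 0.45],
    "green_mean": [0.10, 0.09, 0.11, 0.08, 0.10, 0.12],
    "blue_var": [0.5, 0.1, 0.3, 0.5, 0.1, 0.3],
}
raw = {name: best_gini_decrease(col, labels) for name, col in features.items()}
total = sum(raw.values())
contribution = {name: score / total for name, score in raw.items()}
selected = select_features(contribution, threshold=0.2)
```

Here only the near-infrared column separates the two classes cleanly, so it is the only feature that survives the cutoff; the other two columns would be dropped before clustering and training.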

[0073] More specifically, in step S3, when processing the intermediate sample data using the K-Means clustering algorithm to obtain the classification result labels, the following steps are included:

[0074] Use intermediate sample data as the input dataset;

[0075] The dataset is divided into a water body cluster and a non-water body cluster by iteratively optimizing the cluster centers using the K-Means clustering algorithm.

[0076] The mean normalized water index of the two cluster centers is calculated and compared. The cluster with the higher mean is labeled as the water cluster, and the cluster with the lower mean is labeled as the non-water cluster, thereby obtaining the classification result labels for all samples.

[0077] In the specific implementation process, the multi-dimensional feature matrix is used as input, and the K-Means clustering algorithm performs iterative optimization under the Euclidean distance measure in the feature space to divide the large set of pixel samples into two mutually exclusive clusters. Then, the values of the two cluster centers along the normalized water index dimension are calculated and compared. Because water reflects comparatively strongly in the green band and absorbs strongly in the near-infrared band, it produces markedly high index values; the cluster with the higher mean value is therefore identified and labeled as water, while the cluster with the lower mean value is identified and labeled as non-water. This achieves automated labeling of all samples. The unsupervised clustering algorithm captures the natural distribution of the data in the multi-dimensional feature space, and by introducing the normalized water index, which has a clear geophysical meaning, as a logical anchor, the abstract clustering results are mapped to category labels with geoscientific meaning. This data-driven approach ensures that the label generation process closely matches the lighting and water quality conditions of the current UAV flight, which significantly improves the efficiency and objectivity of sample labeling.
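
A minimal sketch of the clustering and labeling described in [0077] is given below, in plain Python. The two-column toy samples (a water index followed by near-infrared reflectance), the random initialization from two samples, and the function names are assumptions made for illustration; the patent does not specify an initialization strategy.

```python
import random
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def sq_dist(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_two(samples: List[Vector], iters: int = 100,
               seed: int = 0) -> Tuple[List[int], List[List[float]]]:
    """Two-cluster K-Means: alternate assignment and center updates."""
    rng = random.Random(seed)
    centers = [list(c) for c in rng.sample(samples, 2)]
    assign: List[int] = []
    for _ in range(iters):
        new_assign = [
            0 if sq_dist(s, centers[0]) <= sq_dist(s, centers[1]) else 1
            for s in samples
        ]
        if new_assign == assign:
            break  # assignments are stable, so the centers have converged
        assign = new_assign
        for k in (0, 1):
            members = [s for s, a in zip(samples, assign) if a == k]
            if members:
                centers[k] = [sum(col) / len(members) for col in zip(*members)]
    return assign, centers


def label_by_water_index(assign: List[int], centers: List[List[float]],
                         index_col: int) -> List[int]:
    """Mark the cluster whose center has the higher water index as water (1)."""
    water_cluster = 0 if centers[0][index_col] > centers[1][index_col] else 1
    return [1 if a == water_cluster else 0 for a in assign]


# Hypothetical intermediate samples: (water index, near-infrared reflectance).
samples = [(0.67, 0.02), (0.50, 0.03), (0.69, 0.02),
           (-0.67, 0.40), (-0.56, 0.35), (-0.58, 0.45)]
assign, centers = kmeans_two(samples)
labels = label_by_water_index(assign, centers, index_col=0)
```

Cluster numbers produced by K-Means are arbitrary, which is why the comparison of the two centers along the water index column is needed to turn them into water and non-water labels.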

[0078] More specifically, in step S4, when constructing the joint feature matrix containing spectral features, grayscale texture features, and elevation difference features, the following steps are included:

[0079] The original multispectral reflectance data is fused into a single-channel grayscale image according to the preset band weights, and the gray-level co-occurrence matrix is calculated to extract the grayscale texture features of each pixel.

[0080] Based on digital elevation model data, the elevation variation range or mean difference between each pixel and its preset neighboring pixels is calculated as the elevation difference feature.

[0081] Index extraction is performed at the sample location points, and the intermediate sample data, the grayscale texture features, and the elevation difference features are concatenated channel-wise at the pixel scale to construct the joint feature matrix.
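
As a rough sketch of the last two steps, the snippet below computes the elevation variation range in a local window and concatenates it with per-pixel spectral and texture values at the sample points. The function names, the nested-list raster layout, and the placeholder texture value are assumptions for illustration; the mean-difference variant mentioned in the text is not shown.

```python
def elevation_range(dem, row, col, radius=1):
    """Max minus min elevation in a (2*radius+1)^2 window, clipped at the raster edge."""
    rows, cols = len(dem), len(dem[0])
    window = [
        dem[r][c]
        for r in range(max(0, row - radius), min(rows, row + radius + 1))
        for c in range(max(0, col - radius), min(cols, col + radius + 1))
    ]
    return max(window) - min(window)

def build_joint_features(spectral, texture, dem, sample_points, radius=1):
    """One row per sample point: spectral bands + texture features + elevation range."""
    matrix = []
    for row, col in sample_points:
        features = list(spectral[row][col]) + list(texture[row][col])
        features.append(elevation_range(dem, row, col, radius))
        matrix.append(features)
    return matrix

# Toy 3x3 rasters (all values are made up for the example).
dem = [[10.0, 10.0, 10.0],
       [10.0, 10.0, 12.0],
       [10.0, 15.0, 20.0]]
spectral = [[(0.10, 0.02)] * 3 for _ in range(3)]  # (green, nir) per pixel
texture = [[(0.5,)] * 3 for _ in range(3)]         # placeholder texture value per pixel
joint = build_joint_features(spectral, texture, dem, [(0, 0), (1, 1)])
print(joint)  # [[0.1, 0.02, 0.5, 0.0], [0.1, 0.02, 0.5, 10.0]]
```

With `radius=1` the window is 3×3; `radius=2` gives the 5×5 neighborhood.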

[0082] In one implementation of constructing a joint feature matrix incorporating spectral, textural, and topographical features, the original multi-band multispectral reflectance data is first fused into a single-channel grayscale image according to preset band weights, which serves as the basis for the subsequent texture calculations. As a preferred embodiment, the near-infrared band is extracted directly as the grayscale image, exploiting the strong absorption of water in the near-infrared band. Alternatively, a weighted average assigns proportional weights to the blue, green, red, and near-infrared bands for linear summation, or the near-infrared band is assigned the largest coefficient based on the absorptivity of water. One empirical setting of the preset band weights is: green band, 0.587; near-infrared band, 0.299. The generated grayscale image preserves the contrast of the land-water boundary as far as possible. Texture features such as the angular second moment, contrast, and entropy of each pixel are then extracted from the grayscale image by calculating the gray-level co-occurrence matrix. In parallel, based on the digital elevation model data, the elevation variation range or mean difference is calculated in the 3×3 or 5×5 local neighborhood of each pixel to characterize the flatness and vertical drop of the local terrain. Finally, through pixel position indexing, the selected core spectral bands, the extracted grayscale texture features, and the elevation difference features are concatenated along the channel dimension to construct a multidimensional joint feature matrix, which is then input into the multilayer perceptron model.
In this scheme, the purely spectral attributes are extended to a joint feature space that also contains spatial texture information and physical elevation parameters, so that texture uniformity and local terrain flatness provide strong constraints for the classifier. Compared with traditional extraction techniques that rely solely on spectral reflectance or a single water body index, this approach can distinguish and eliminate low-reflectance pseudo-water signals generated by the tops of tall buildings or the edges of steep terrain, while the grayscale texture features further capture the subtle difference in roughness between water surfaces and shadows.
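
To make the texture step concrete, the following sketch fuses bands into a gray value, quantizes it, and derives the angular second moment, contrast, and entropy from a symmetric, normalized gray-level co-occurrence matrix of one window. The function names, the two-level quantization, the horizontal pixel offset, and the toy reflectances are assumptions for illustration; the demo uses the near-infrared-only weighting named as the preferred embodiment.

```python
import math

def fuse_to_gray(pixel, weights):
    """Weighted linear sum of band reflectances (band names are illustrative)."""
    return sum(w * pixel[band] for band, w in weights.items())

def quantize(gray, levels):
    """Map a gray value in [0, 1] to an integer level in [0, levels - 1]."""
    return min(levels - 1, int(gray * levels))

def glcm_features(window, levels, offset=(0, 1)):
    """Texture features from a symmetric, normalized co-occurrence matrix."""
    d_row, d_col = offset
    rows, cols = len(window), len(window[0])
    counts = [[0] * levels for _ in range(levels)]
    total = 0
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + d_row, c + d_col
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = window[r][c], window[r2][c2]
                counts[i][j] += 1
                counts[j][i] += 1  # count each pair in both directions
                total += 2
    if total == 0:
        raise ValueError("window too small for the chosen offset")
    asm = contrast = entropy = 0.0
    for i in range(levels):
        for j in range(levels):
            p = counts[i][j] / total
            if p > 0:
                asm += p * p
                contrast += (i - j) ** 2 * p
                entropy -= p * math.log(p)
    return {"asm": asm, "contrast": contrast, "entropy": entropy}

# Toy 2x2 window: alternating dark (water-like) and bright NIR reflectance.
pixels = [[{"nir": 0.05}, {"nir": 0.80}],
          [{"nir": 0.80}, {"nir": 0.05}]]
weights = {"nir": 1.0}
gray = [[quantize(fuse_to_gray(p, weights), levels=2) for p in row] for row in pixels]
rough = glcm_features(gray, levels=2)
flat = glcm_features([[0, 0], [0, 0]], levels=2)
print(gray, rough, flat)
```

A uniform window, such as a calm water surface, gives an angular second moment of 1 with zero contrast and entropy, while the alternating window gives lower uniformity and higher contrast, which is the kind of separation the text relies on.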

[0083] In another implementation, when constructing the joint feature matrix, an anti-shadow water body feature layer is additionally derived, in addition to the conventional grayscale texture features and basic elevation difference features. The specific implementation process includes the following steps:

[0084] The intermediate sample data is converted into single-channel grayscale images according to preset band weights, and grayscale texture features are extracted by calculating the grayscale co-occurrence matrix. Simultaneously, based on digital elevation model data, the elevation variation range or mean difference between each pixel and its preset neighboring pixels is calculated as the basic elevation difference feature. Metadata of the UAV multispectral remote sensing data is parsed concurrently to extract the precise latitude and longitude coordinates and timestamp at the time of image acquisition, and based on this, the solar azimuth and solar altitude angles of the target area at the moment of capture are calculated.

[0085] Using the digital elevation model data as a three-dimensional reference, every pixel in the scene is taken in turn as a starting point, and a step-by-step line-of-sight scan is performed along the elevation surface against the direction of the solar azimuth angle. During the scan, the relative elevation angle of each obstacle pixel on the line-of-sight path, as seen from the starting pixel, is calculated. When an obstacle pixel with a relative elevation angle greater than the solar altitude angle is found on the search path, the direct sunlight of the starting pixel is determined to be blocked; the pixel is marked as a theoretical shadow area and assigned a shadow weight according to the maximum elevation angle difference on the path. Pixels that are not blocked are marked as non-shadow areas. This generates a theoretical shadow occlusion mask covering the entire surface.

[0086] Then, the theoretical shadow occlusion mask is used as a spatial constraint weight to apply feature coupling constraints to the spectral water features of each pixel, such as the normalized water index, and to the basic elevation difference. Specifically, for pixels that are light-occluded and assigned a high shadow weight, even if they exhibit water-like high spectral values and locally flat elevations, the shadow weight is used to inversely suppress their water body spectral features, significantly weakening their input feature values; for pixels in non-shadowed areas, the features are preserved unchanged. The output is an anti-shadow water body feature layer in which shadow pseudo-features are suppressed.

[0087] Finally, the extracted grayscale texture features and basic elevation difference features are concatenated with the generated anti-shadow water body feature layer along the pixel channel dimension to obtain the final joint feature matrix, which serves as the input for training and parameter optimization of the multilayer perceptron model.
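
The shadow scan and the suppression step can be sketched as below. The source does not give formulas for the shadow weight or the suppression, so several choices here are assumptions: the scan steps from each pixel toward the sun, row 0 is taken as north with azimuth measured clockwise from north, the weight is the excess of the largest obstacle elevation angle over the solar altitude scaled by a hypothetical `saturation_deg`, and only positive index values are damped.

```python
import math

def shadow_weights(dem, cell_size, sun_azimuth_deg, sun_altitude_deg,
                   saturation_deg=10.0, max_steps=50):
    """Per-pixel shadow weight in [0, 1]; 0 means direct sunlight is not blocked."""
    rows, cols = len(dem), len(dem[0])
    azimuth = math.radians(sun_azimuth_deg)
    step_row, step_col = -math.cos(azimuth), math.sin(azimuth)  # unit step toward the sun
    weights = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            max_excess = 0.0
            for step in range(1, max_steps + 1):
                rr = round(r + step * step_row)
                cc = round(c + step * step_col)
                if not (0 <= rr < rows and 0 <= cc < cols):
                    break  # left the raster: nothing further can block the sun
                rise = dem[rr][cc] - dem[r][c]
                angle = math.degrees(math.atan2(rise, step * cell_size))
                max_excess = max(max_excess, angle - sun_altitude_deg)
            weights[r][c] = min(1.0, max_excess / saturation_deg)
    return weights

def suppress_water_feature(index_layer, weights):
    """Damp water-like (positive) index values in proportion to the shadow weight."""
    return [
        [v * (1.0 - w) if v > 0 else v for v, w in zip(value_row, weight_row)]
        for value_row, weight_row in zip(index_layer, weights)
    ]

# Toy scene: flat ground with a 30 m obstacle at the east end, sun due east at 50 degrees.
dem = [[0.0, 0.0, 0.0, 30.0]]
water_index = [[0.6, 0.6, 0.6, -0.4]]
mask = shadow_weights(dem, cell_size=10.0, sun_azimuth_deg=90.0, sun_altitude_deg=50.0)
anti_shadow = suppress_water_feature(water_index, mask)
print(mask, anti_shadow)
```

In the toy scene the pixel next to the obstacle is fully shadowed and its water-like index is driven to zero, the pixel 30 m away sees the obstacle at 45 degrees, below the sun, and keeps its value, and the pixel in between is partially damped.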

[0088] The joint feature matrix constructed in this way addresses the inability of elevation filtering methods to identify tree canopy shadows or building roof shadows that fall on flat ground. In actual UAV low-altitude remote sensing scenarios, urban high-rise buildings or steep mountain terrain easily cast large areas of shadow. Because the shadowed areas lack direct sunlight, their multispectral reflectance characteristics are highly similar to those of real water bodies, and their local elevation changes are also close to zero, which can easily lead to serious misjudgments by the multilayer perceptron model. The anti-shadow water body feature layer couples the purely data-driven features with a physical model of the scene: before the features are input into the model, the pseudo-water features of shadows on flat terrain are suppressed on the basis of the three-dimensional geometry, giving the multilayer perceptron model an explicit prior constraint on illumination and thereby significantly improving the accuracy of water body extraction in complex surface environments such as urban and mountainous areas.

[0089] More specifically, in step S4, when training and optimizing the multilayer perceptron model using the classification result label as the target and the joint feature matrix as the input, the following steps are included:

[0090] A multilayer perceptron model is constructed using the classification result labels as supervision signals and the joint feature matrix as input features.

[0091] The number of hidden layer neurons, learning rate, and number of iterations of the multilayer perceptron model are optimized using grid search or Bayesian optimization algorithms as hyperparameter optimization algorithms until the loss function converges, thus obtaining the trained multilayer perceptron model.
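
The grid search option can be sketched as an exhaustive loop over a preset hyperparameter space. In this illustration `train_and_score` is a hypothetical callback that would train the multilayer perceptron with the given settings and return a validation score; the toy scorer below is only a stand-in so the example runs on its own, and the candidate values are made up.

```python
import math
from itertools import product

def grid_search(search_space, train_and_score):
    """Evaluate every hyperparameter combination and keep the best-scoring one."""
    names = list(search_space)
    best_params, best_score = None, float("-inf")
    for values in product(*(search_space[name] for name in names)):
        params = dict(zip(names, values))
        score = train_and_score(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score

def toy_score(params):
    """Stand-in for 'train the model, return validation accuracy' (not a real model)."""
    return -(abs(params["hidden_neurons"] - 64) / 64
             + abs(math.log10(params["learning_rate"]) + 3)
             + abs(params["max_iterations"] - 500) / 500)

search_space = {
    "hidden_neurons": [32, 64, 128],
    "learning_rate": [1e-2, 1e-3, 1e-4],
    "max_iterations": [200, 500],
}
best_params, best_score = grid_search(search_space, toy_score)
print(best_params)  # {'hidden_neurons': 64, 'learning_rate': 0.001, 'max_iterations': 500}
```

Grid search costs one training run per combination (18 here); Bayesian optimization, the other option named in the text, would instead choose each new combination based on the scores seen so far.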

[0092] In the specific implementation process, the constructed joint feature matrix containing spectral, grayscale texture, and elevation difference features is used as a multidimensional input vector, and the classification result labels generated by K-Means clustering are mapped to logistic regression supervision signals. A multilayer perceptron network topology consisting of an input layer, multiple hidden layers, and an output layer is then constructed, and a Bayesian optimization algorithm performs adaptive optimization within a preset hyperparameter space, applying heuristic sampling and utility evaluation to core parameters such as the number of hidden layer neurons, the initial learning rate, and the maximum number of iterations. The cross-entropy loss function between the predicted category and the supervision label is calculated, and a backpropagation algorithm is executed. As shown in Figure 2, the network weight matrix is dynamically updated until the loss function curve converges smoothly. As shown in Figure 3, the optimal global parameter combination is finally fixed and the trained classification prediction model is exported, improving the model's accuracy. The core principle of this implementation lies in using the nonlinear approximation capability of deep neural networks to map heterogeneous multi-source physical features to a high-dimensional, linearly separable decision space, while the automated optimization mechanism avoids the empirical bias introduced by manually set hyperparameters.
Compared with traditional extraction techniques based on fixed threshold discrimination or simple linear kernel functions, this implementation uses an optimized nonlinear perception mechanism to capture the subtle boundary differences between water bodies and complex building shadows under multi-dimensional feature combinations, and mitigates the risks of insufficient model training and overfitting caused by the increase in feature dimensions.
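
For reference, the loss and weight update described in this paragraph can be written out as follows. The source names only a cross-entropy loss and backpropagation; the binary form with a sigmoid output, and the symbols N (number of samples), y_i (K-Means-derived label), \hat{y}_i (predicted water probability), W (a weight matrix), and \eta (learning rate), are assumptions made for this sketch.

```latex
% Binary cross-entropy between supervision labels and predicted probabilities
L = -\frac{1}{N} \sum_{i=1}^{N} \Big[ y_i \log \hat{y}_i + (1 - y_i) \log\big(1 - \hat{y}_i\big) \Big]

% Gradient-descent update applied after backpropagation computes the gradient
W \leftarrow W - \eta \, \frac{\partial L}{\partial W}
```

The loss is zero only when every predicted probability matches its label exactly, and training is stopped once successive values of L no longer decrease appreciably.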

[0093] More specifically, in step S5, when the trained multilayer perceptron model is used to classify and predict the water bodies in the target area, and the classification prediction results are processed by a closing operation of morphological dilation followed by erosion filtering to obtain the final water body extraction result, the following steps are specifically included:

[0094] The prediction result is first subjected to a dilation operation to fill the small holes and narrow gaps in the water area;

[0095] Then, an erosion operation is performed on the dilated result to smooth the water body boundary, and the final water body extraction result after filtering and noise reduction is output.

[0096] In the specific implementation of water body extraction in the target area using a trained multilayer perceptron model, the joint feature matrix of the region to be identified is first input into the model for pixel-level inference to generate a preliminary binary classification probability map. For discrete voids and physical breaks in the shoreline caused by floating duckweed, localized strong reflections, or sensor dark current noise in the probability map, morphological dilation is first performed using pre-sized square structuring elements (e.g., 3×3 or 5×5 pixels). This fills small non-water body pores within the water body and connects narrow channel gaps through local neighborhood maxima filtering logic. Subsequently, based on the dilation result, morphological erosion is performed using structuring elements of the same size to shrink redundant edge pixels generated in the first step of dilation and remove independent salt-and-pepper noise points, thus completing the closing operation and outputting the final refined water body vector boundary. The core principle of this design lies in utilizing the spatial topological constraint mechanism in mathematical morphology to perform nonlinear spatial domain filtering. Through a geometric logic of first increasing and then decreasing, the connectivity of the target is repaired and the spatial heterogeneity is eliminated while maintaining the basic geometric dimensions of the target object. Compared to the traditional method of directly outputting pixel-level prediction results after classification, the post-processing mechanism of this implementation scheme complements the multi-source feature deep learning at the front end, realizing a closed-loop process from high-dimensional attribute recognition to spatial morphology correction, which greatly improves the visual fidelity and mapping accuracy of water body extraction in complex environments.
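
The closing operation can be illustrated on a small binary mask with a square structuring element. This is a dependency-free sketch with assumed function names; windows are clipped at the raster edge, so the toy water body is kept away from the border to avoid edge effects.

```python
def _filter(mask, size, reducer):
    """Apply max (dilation) or min (erosion) over a size x size window, clipped at edges."""
    half = size // 2
    rows, cols = len(mask), len(mask[0])
    return [
        [
            reducer(
                mask[rr][cc]
                for rr in range(max(0, r - half), min(rows, r + half + 1))
                for cc in range(max(0, c - half), min(cols, c + half + 1))
            )
            for c in range(cols)
        ]
        for r in range(rows)
    ]

def dilate(mask, size=3):
    """Local neighborhood maximum: grows water regions and fills small holes."""
    return _filter(mask, size, max)

def erode(mask, size=3):
    """Local neighborhood minimum: shrinks regions back by the same amount."""
    return _filter(mask, size, min)

def closing(mask, size=3):
    """Dilation followed by erosion with the same structuring element."""
    return erode(dilate(mask, size), size)

# Toy 9x9 prediction: a 5x5 water block (1) with a one-pixel hole in the middle.
mask = [[1 if 2 <= r <= 6 and 2 <= c <= 6 else 0 for c in range(9)] for r in range(9)]
mask[4][4] = 0
closed = closing(mask)
print(closed[4][4], sum(map(sum, closed)))  # 1 25
```

In this sketch the hole is filled and the outline of the block returns to its original extent. An isolated single water pixel would also survive the closing unchanged, so removing such specks would need an additional step, for example an opening, which the text does not describe.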

[0097] Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of the present invention, and not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that modifications can still be made to the technical solutions described in the foregoing embodiments, or equivalent substitutions can be made to some or all of the technical features therein. Such modifications or substitutions do not cause the essence of the corresponding technical solutions to deviate from the scope of the technical solutions of the embodiments of the present invention, and they should all be covered within the scope of the claims and specification of the present invention.

Claims

1. A method for fine classification and extraction of water bodies by unmanned aerial vehicle multi-spectral remote sensing, characterized in that the method comprises the following steps: Step S1: obtaining multispectral remote sensing data of a target area water body and preprocessing the data to generate multispectral reflectance data and digital elevation model data; Step S2: extracting initial sample data containing water body samples and non-water body samples based on the multispectral reflectance data, performing spectral feature screening based on feature importance selection on the initial sample data, and obtaining a screening result as intermediate sample data; Step S3: processing the intermediate sample data using a K-Means clustering algorithm to obtain a classification result label of water body and non-water body; Step S4: extracting gray texture features of the intermediate sample data based on the multispectral reflectance data, extracting elevation difference features of the intermediate sample data based on the digital elevation model data, constructing a joint feature matrix containing spectral features, gray texture features and elevation difference features, and, taking the classification result label as a target and the joint feature matrix as an input, training and parameter-optimizing a multilayer perceptron model to obtain a trained multilayer perceptron model; Step S5: using the trained multilayer perceptron model to perform pixel-level classification prediction on the water body of the target area, and sequentially performing a closing operation of morphological dilation and erosion filtering on the prediction result to obtain a final water body extraction result; wherein, in step S4, when constructing the joint feature matrix containing spectral features, gray texture features and elevation difference features, the following steps are specifically included: fusing the original multispectral reflectance data into a single-channel gray image according to a preset band weight, and calculating a gray-level co-occurrence matrix to extract gray texture features of each pixel; based on the digital elevation model data, calculating the elevation variation range or mean difference of each pixel and the pixels in its preset neighborhood as the elevation difference features; performing index extraction on the sample position points, and fusing the intermediate sample data, the gray texture features and the elevation difference features on the pixel scale to construct the joint feature matrix.

2. The unmanned aerial vehicle multi-spectral remote sensing water body refined classification extraction method according to claim 1, characterized in that, In step S1, when obtaining the multispectral remote sensing data of the target area water body, the following steps are specifically included: Using a drone carrying a multispectral imaging sensor to collect images on site according to a preset flight route, flight height and overlap for the target area water body and the surrounding area; Obtaining multispectral remote sensing data containing green band and near-infrared characteristic band.

3. The unmanned aerial vehicle multi-spectral remote sensing water body refined classification extraction method according to claim 2, characterized in that, In step S1, when generating multispectral reflectance data and digital elevation model data, the following steps are specifically included: Performing correction processing on the multispectral remote sensing data; Fusing and splicing the multispectral remote sensing data after the correction processing to generate multispectral reflectance data with absolute radiation values, and synchronously extracting digital elevation model data of the same area.

4. The unmanned aerial vehicle multi-spectral remote sensing water body refined classification extraction method according to claim 3, characterized in that, The correction processing includes at least one of radiation calibration, geometric correction and atmospheric correction.

5. The unmanned aerial vehicle multi-spectral remote sensing water body refined classification extraction method according to claim 2, characterized in that, In step S2, when extracting initial sample data based on the multispectral reflectance data, the following steps are specifically included: Based on the green band and near-infrared band in the multispectral reflectance data, calculating the normalized water body index of each pixel; Using a normalized water body index segmentation threshold to preliminarily divide the pixels into water body pixels and non-water body pixels; For the water body pixels and non-water body pixels, the full-band spectral gray values, as well as the spectral mean and spectral variance features calculated based on the local spatial window, are extracted and integrated to form the initial sample data.

6. The unmanned aerial vehicle multi-spectral remote sensing water body refined classification extraction method according to claim 5, characterized in that, in step S2, when performing spectral feature screening based on feature importance selection on the initial sample data and obtaining the screening results as intermediate sample data, the specific steps include: using a random forest or gradient boosting decision tree algorithm to calculate the importance score of each feature in the initial sample data for distinguishing between water bodies and non-water bodies, and using this importance score as the contribution of each feature; sorting the contributions by size and setting a contribution threshold; selecting the features with a contribution higher than the preset contribution threshold to form a feature subset, and using the sample data containing the feature subset as the intermediate sample data.

7. The unmanned aerial vehicle multi-spectral remote sensing water body refined classification extraction method according to claim 1, characterized in that, in step S3, when processing the intermediate sample data using the K-Means clustering algorithm to obtain the classification result labels, the specific steps include: using the intermediate sample data as the input dataset; dividing the dataset into a water body cluster and a non-water body cluster by iteratively optimizing the cluster centers with the K-Means clustering algorithm; calculating and comparing the mean normalized water index of the two cluster centers, and labeling the cluster with the higher mean as the water cluster and the cluster with the lower mean as the non-water cluster, thereby obtaining the classification result labels for all samples.

8. The unmanned aerial vehicle multi-spectral remote sensing water body refined classification extraction method according to claim 1, characterized in that, In step S4, when training and optimizing the multilayer perceptron model using the classification result label as the target and the joint feature matrix as the input, the specific steps include: A multilayer perceptron model is constructed using the classification result labels as supervision signals and the joint feature matrix as input features. The number of hidden layer neurons, learning rate, and number of iterations of the multilayer perceptron model are optimized using grid search or Bayesian optimization algorithms as hyperparameter optimization algorithms until the loss function converges, thus obtaining the trained multilayer perceptron model.

9. The unmanned aerial vehicle multi-spectral remote sensing water body refined classification extraction method according to claim 1, characterized in that, in step S5, when the trained multilayer perceptron model is used to classify and predict the water bodies in the target area and the classification prediction results are processed by a closing operation of morphological dilation and erosion filtering to obtain the final water body extraction result, the following steps are specifically included: first performing a dilation operation on the prediction result to fill the small holes and narrow gaps in the water area; then performing an erosion operation on the dilated result to smooth the water body boundary, and outputting the final water body extraction result after filtering and noise reduction.