An InSAR geological disaster deformation identification method based on deep learning

CN121962839B · Active · Publication Date: 2026-06-19 · JINAN SATELLITE IND DEV GRP CO LTD

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Patents (China)
Current Assignee / Owner: JINAN SATELLITE IND DEV GRP CO LTD
Filing Date: 2026-04-03
Publication Date: 2026-06-19

AI Technical Summary

Technical Problem

Traditional geological hazard deformation identification methods achieve limited deformation-inversion accuracy and spatial continuity in complex geological environments and struggle to provide reliable confidence information, particularly when monitoring geological hazards such as landslides and ground subsidence.

Method used

A deep learning-based InSAR geological hazard deformation identification method is adopted. By fusing multi-source data at the pixel level, extracting deep features at multiple scales, combining deep learning-based three-dimensional inversion with elastic mechanical physical constraints, and quantifying uncertainty, the accuracy and spatial continuity of deformation inversion are improved, and reliable confidence information is provided.

Benefits of technology

It significantly improves the accuracy and reliability of geological hazard deformation identification in complex geological environments, enhances the practicality of geological hazard monitoring and risk assessment, and outputs quantitative result maps containing three-dimensional deformation and confidence level assessment.

✦ Generated by Eureka AI based on patent content.


Abstract

This invention discloses a deep learning-based InSAR geological hazard deformation identification method, belonging to the field of geological hazard monitoring and prevention technology. The method includes the following steps: S1, multi-source data acquisition; S2, establishing a multi-source data spatiotemporal registration and pixel-level fusion model; S3, constructing a joint extraction network for multi-scale three-dimensional deformation features; S4, introducing elastic mechanical constraints to optimize the three-dimensional deformation field inversion; S5, fusing the physical model with the deep learning output and quantifying uncertainty, and finally outputting the three-dimensional deformation of each pixel with its corresponding confidence assessment to form a quantitative deformation result map for geological hazard risk assessment. The method combines surface deformation monitoring via synthetic aperture radar interferometry, multi-source remote sensing data fusion, and intelligent deformation inversion and risk assessment that couple deep learning with physical constraints. It can be applied to the identification, monitoring, and risk assessment of geological hazards.

Description

Technical Field

[0001] This invention specifically relates to a deep learning-based InSAR geological disaster deformation identification method, belonging to the field of geological disaster monitoring and prevention technology.

Background Technology

[0002] Synthetic Aperture Radar Interferometry (InSAR) is a crucial remote sensing technique for monitoring surface deformation and identifying geological hazards. It analyzes the phase variations between SAR images of the same area acquired at different times to obtain small-displacement information along the radar line of sight, and it offers all-weather operation, wide-swath coverage, and high-precision displacement detection. In recent years, as multi-track SAR data acquisition capabilities have improved, time-series InSAR technology has been able to reveal the temporal evolution of surface deformation and has played a significant role in monitoring geological hazards such as landslides, ground subsidence, and fault activity. At the same time, high-resolution optical images, with their rich spectral and textural information, effectively assist surface cover classification and the semantic understanding of deformation, creating a technological foundation for multi-source data fusion and intelligent deformation identification methods. Traditional geological hazard deformation identification relies mainly on a single data source and shallow learning strategies for deformation detection and hazard identification. For example, granted Chinese patent CN118035879B discloses a method for identifying geological hazard risks based on InSAR and semi-supervised learning, which includes: initially delineating the spatial location of suspected geological hazard risks within a study area using InSAR technology; learning the geological environment characteristics of identified geological hazards using semi-supervised learning methods; constructing a geological environment characteristic learning model for geological hazards; and then using this model to determine whether each delineated spatial location is a geological hazard risk. That method identifies hazards automatically with the support of limited geological hazard survey data and achieves wide-area identification of geological hazard risks. However, it has several shortcomings: an insufficient ability to solve three-dimensional deformation fields, a limited ability to integrate deformation information acquired under different observation geometries, and neglect of physical continuity constraints. As a result, deformation inversion accuracy and spatial continuity are poor in complex geological environments, and it is difficult to provide reliable confidence information for subsequent uncertainty assessment and risk evaluation.

Summary of the Invention

[0003] To address the aforementioned issues, this invention proposes a deep learning-based InSAR geological hazard deformation identification method. This method overcomes the shortcomings of traditional techniques in three-dimensional deformation component calculation and physical rationality assurance by employing multi-source data pixel-level fusion, multi-scale deep feature joint extraction, deep learning-based three-dimensional inversion combined with elasticity constraints, and fusion of physical models and uncertainty quantification. It improves the accuracy and spatial continuity of deformation inversion in complex geological environments, provides reliable confidence information, and enhances the accuracy and reliability of geological hazard deformation identification. This provides more comprehensive technical support for the identification, monitoring, and risk assessment of geological hazards such as landslides, ground subsidence, and collapses.

[0004] The present invention provides a deep learning-based InSAR geological hazard deformation identification method, the method comprising the following steps:

[0005] S1. Multi-source data acquisition: Acquire multi-track synthetic aperture radar interferometry (InSAR) data and corresponding high-resolution optical image data in the same monitoring area. The multi-track InSAR data includes multiple SAR image sequences from different observation directions, and the optical image data includes multispectral or panchromatic band images, which are used to provide auxiliary ground cover classification and deformation semantic information.

[0006] S2. Establish a multi-source data spatiotemporal registration and pixel-level fusion model: Spatiotemporal normalization processing is performed on the multi-track InSAR data and optical image data obtained in S1. The image registration algorithm is used to unify the data from different sensors and different time phases into the same geographic coordinate system and grid space. On this basis, a pixel-level multi-source feature fusion module is constructed to jointly extract features and form a multi-channel fusion feature map for subsequent network input.

[0007] S3. Construct a joint extraction network for multi-scale three-dimensional deformation features: The joint extraction network adopts an encoder-decoder structure. The encoder uses a multi-branch convolutional neural network to extract the InSAR interferometric features of the different orbits and the spatial context information of the optical images, and it captures multi-scale deformation patterns, from local details to regional background, through a feature pyramid structure. The decoder fuses the multi-level features, enhances the feature response of areas with significant deformation through an attention mechanism, and outputs a feature tensor containing preliminary information on the direction and magnitude of surface deformation.

[0008] S4. Elastic mechanical constraints are introduced to optimize the inversion of the three-dimensional deformation field. Based on the deformation features extracted in S3 and combined with the InSAR deformation measurement equation of multiple geometric observations, a loss function containing elastic mechanical priors is constructed. The surface deformation components in the east-west, north-south, and vertical directions are inverted using a deep learning network. Physical constraints are used to reduce the ill-posedness of the solution and improve the spatial continuity and physical rationality of the three-dimensional deformation field.

[0009] S5. Fuse the physical model with the deep learning output and quantify uncertainty: The deformation forward modeling results based on the elastic half-space or layered medium model are fused with the neural network prediction results. The uncertainty of the deformation prediction is estimated using a Bayesian deep learning framework or the Monte Carlo dropout method. Finally, the three-dimensional deformation of each pixel and its corresponding confidence assessment are output to form a quantitative deformation result map that can be used for geological disaster risk assessment.

[0010] Furthermore, the joint extracted features in S2 include radar amplitude, phase, coherence, and optical spectral and texture features; the deformation forward modeling results in S5 are output by an elastic half-space or layered medium model.

[0011] Furthermore, during the multi-source data acquisition in S1, a unified data acquisition and organization mechanism is constructed for the same monitoring area. By scheduling at least two synthetic aperture radar observation tasks with different orbital parameters, SAR image sequences covering the same spatial range and with consistent time spans are acquired simultaneously. The differences in incident angle and azimuth angle between different orbits provide necessary geometric redundancy for subsequent three-dimensional deformation decomposition. At the same time, optical image data spatially overlapping with the SAR images are acquired within the same time window. The optical images contain multispectral or panchromatic bands to characterize land cover type, structural boundaries, and texture continuity information, and serve as an external observation source for deformation semantic constraints. The geometric sensitivity of multi-orbit InSAR observations is quantitatively characterized during the data acquisition stage by defining an orbital observation sensitivity vector:

[0012] [Equation not reproduced in the source text.]

[0013] where the defined quantity is the observation sensitivity vector of a given SAR orbit, and its two arguments are the incidence angle and the corresponding azimuth direction of that orbit's SAR images. The vector describes the orbit's combined response to the east-west, north-south, and vertical deformation components, which ensures at the data level that the multi-orbit combination is observable for three-dimensional deformation inversion. In addition, an optical structure complexity index is introduced:

[0014] [Equation not reproduced in the source text.]

[0015] where the index is computed over the total number of bands in the optical image, from the radiation intensity distribution of each optical band or panchromatic image. A spatial gradient operator quantifies how strongly feature boundaries and textures vary in the optical imagery, which ensures that the acquired optical data clearly distinguish landslide, subsidence, and stable areas. Next, the multi-track SAR image sequences are organized under a unified time index to construct a temporally consistent observation set:

[0016] [Equation not reproduced in the source text.]

[0017] where each element pairs the SAR observation of a given orbit at a given time with the optical image corresponding to the same time. This set provides a structured input for the spatiotemporal registration, pixel-level fusion, and joint multi-scale deformation feature extraction of the subsequent steps, and it leads directly into the multi-source spatiotemporal normalization and fusion modeling of the next step.
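The two S1 quantities described above can be illustrated with a minimal sketch. Because the original equations are not reproduced in the text, both functions are assumptions: `los_sensitivity` uses one common line-of-sight convention (right-looking sensor, heading clockwise from north, motion toward the satellite positive), and `structure_complexity` takes the mean gradient magnitude averaged over bands as one plausible reading of the complexity index. The function names and the sample angles are hypothetical.

```python
import math


def los_sensitivity(incidence_deg, heading_deg):
    """Unit vector (east, north, up) that projects a 3D displacement onto
    the radar line of sight.

    Assumed convention, not taken from the patent: right-looking sensor,
    heading measured clockwise from north, motion toward the satellite
    counted as positive.
    """
    theta = math.radians(incidence_deg)
    alpha = math.radians(heading_deg)
    return (
        -math.sin(theta) * math.cos(alpha),
        math.sin(theta) * math.sin(alpha),
        math.cos(theta),
    )


def structure_complexity(bands):
    """Mean forward-difference gradient magnitude, averaged over bands.

    `bands` is a list of 2D lists (rows of intensities), one per band.
    This is an assumed form of the optical structure complexity index.
    """
    total = 0.0
    for band in bands:
        rows, cols = len(band), len(band[0])
        acc = 0.0
        for r in range(rows - 1):
            for c in range(cols - 1):
                gx = band[r][c + 1] - band[r][c]
                gy = band[r + 1][c] - band[r][c]
                acc += math.hypot(gx, gy)
        total += acc / ((rows - 1) * (cols - 1))
    return total / len(bands)


# Hypothetical ascending and descending tracks with a 39 degree incidence angle.
ascending = los_sensitivity(39.0, -10.0)
descending = los_sensitivity(39.0, -170.0)

# A featureless band and a band with one vertical edge.
flat_band = [[1.0] * 3 for _ in range(3)]
edge_band = [[0.0, 0.0, 1.0] for _ in range(3)]
```

Under this convention the two tracks respond to eastward motion with opposite signs, which is the geometric redundancy that the three-dimensional decomposition relies on; the flat band scores zero complexity while the edge band scores above zero.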

[0018] Furthermore, based on the temporally consistent observation set constructed in S1, a unified spatiotemporal normalization process is first implemented on multi-orbit InSAR data and optical imagery. By introducing a joint spatiotemporal mapping function, observation data from different sensors and at different times are precisely aligned to the same geographic coordinate system and regular grid space to eliminate the impact of orbital differences, imaging geometric differences, and temporal sampling inconsistencies on subsequent feature learning. For this purpose, the cross-modal spatiotemporal registration error energy is defined as follows:

[0019] [Equation not reproduced in the source text.]

[0020] where the left-hand side is the cross-modal spatiotemporal registration error energy, which is minimized to correct orbital errors and improve the accuracy of deformation inversion. The double summation runs over all orbits of the multi-orbit InSAR data and over all observation phases (time points), with one index for the InSAR orbit and one for time. The remaining symbols denote, in order: the pixel coordinates in the uniform grid; the spatial mapping operator estimated by the registration model; the radar observation signal (amplitude, phase, and coherence features) of a given orbit at a given time and spatial location; and the optical observation signal (spectral and texture features) at the same time, taken at the optical pixel that the mapping associates with the SAR pixel. This energy function constrains the structural consistency of the radar and optical observations at the same spatial location, which ensures the geometric consistency of the multi-source information at the data level. After spatiotemporal registration, a pixel-level multi-source feature fusion model is constructed at each pixel location, jointly encoding the radar amplitude, phase, and coherence features with the optical spectral and texture features. To keep differences in dimension and statistical distribution between modalities from biasing the fusion result, adaptive modality normalization weights are introduced:

[0021] [Equation not reproduced in the source text.]

[0022] where the left-hand side is the normalized weight of a given target item (such as a SAR orbit, observation time phase, pixel, or feature); each weight lies in the interval (0, 1), and the weights of all target items sum to 1. Each item corresponds to a radar or optical modality, and a score learned automatically by the network during training measures that modality's contribution to the deformation representation at the current pixel. This weighting mechanism lets the fused features keep a stable discriminative ability under complex terrain conditions. In the numerator, the raw score of each target item (such as a weight error or feature similarity) is exponentiated to amplify the score differences between items; in the denominator, the exponentiated raw scores of all target items are summed, and this sum serves as the normalization term that makes the weights sum to 1. On this basis, the pixel-level fused feature vector is defined:

[0023] [Equation not reproduced in the source text.]

[0024] where each term is the standardized feature extracted from a given modality at a given pixel. The fused vector combines, in a single feature space, the deformation-sensitive information of the multi-track InSAR data and the semantic ground-feature information of the optical images. Applying it at every pixel yields a spatially continuous, channel-complete multi-channel fusion feature map over the entire monitoring area. This feature map is passed directly as structured input to the multi-scale three-dimensional deformation feature joint extraction network of the next step, giving the deep feature learning of the subsequent encoder-decoder structure a consistent data foundation.
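The weighting described above (exponentiate each raw score, then divide by the sum of the exponentials) is a softmax. A minimal sketch of the weights and of one possible fusion rule follows. The weighted sum in `fuse_pixel` is an assumption: the original equation is not reproduced in the text, and a weighted concatenation would be an equally plausible reading. The function names and sample values are hypothetical.

```python
import math


def softmax(scores):
    """Normalize raw scores into weights in (0, 1) that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_pixel(features, scores):
    """Fuse per-modality feature vectors at one pixel.

    `features` holds one standardized feature vector per modality, all of
    equal length; `scores` holds one raw (learned) score per modality.
    Assumed fusion rule: softmax-weighted sum of the modality vectors.
    """
    weights = softmax(scores)
    dim = len(features[0])
    return [
        sum(w * feat[k] for w, feat in zip(weights, features))
        for k in range(dim)
    ]


# Two hypothetical modalities (radar, optical) with equal raw scores.
radar = [1.0, 0.0]
optical = [0.0, 1.0]
fused = fuse_pixel([radar, optical], [0.0, 0.0])
```

With equal scores each modality receives weight 0.5; raising one score shifts the fused vector toward that modality without ever driving the other weight to exactly zero.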

[0025] Furthermore, S3 takes the multi-channel fused feature map formed in S2 as a unified input and constructs the multi-scale joint feature extraction network for three-dimensional deformation representation. The overall structure adopts a symmetrically coupled encoder-decoder design. In the encoder stage, a separate convolutional branch is set up for each track of the multi-track InSAR observations, so that interferometric phase changes, amplitude stability, and coherence evolution under different observation geometries are each modeled in an independent feature subspace. At the same time, spatial context information from the optical images is introduced to enhance the perception of ground-object boundaries and continuously deforming regions. To express cross-scale deformation patterns in a unified way, a multi-layer feature pyramid is constructed in the encoder, and features at different resolutions are aggregated through a scale correlation operator, whose scale response is defined as:

[0026] [Equation not reproduced in the source text.]

[0027] where the summed terms are the encoded features of each feature-pyramid layer at the given pixel, weighted by scale-related learnable weights, and the upper limit of the sum is the total number of layers participating in the weighted fusion. This lets the network capture both the spatial consistency of small localized deformations and that of large-scale gradual deformations. In the decoder stage, high-level semantic information and low-level spatial detail are fused through progressive upsampling and cross-layer connections. In addition, a deformation saliency attention mechanism is introduced to enhance the feature response of potential geological hazard areas, with the attention weights defined as:

[0028] [Equation not reproduced in the source text.]

[0029] where the attention weight is obtained by applying a normalized activation function to the features through the attention mapping parameters. The weight suppresses the features of stable background regions and highlights key regions whose deformation gradient is continuous and consistent in direction. On this basis, the network constructs an initial three-dimensional deformation representation tensor at the decoder output, whose pixel-level expression is as follows: [Equation not reproduced in the source text.]

[0030] where the mapping is given by the deformation mapping matrix learned by the network. This tensor encodes both deformation amplitude and deformation direction in the feature space. It provides a continuous, differentiable, and physically directional initial deformation feature input for the subsequent inversion of the three-dimensional deformation field under elastic mechanical constraints, and it leads directly into the physical constraint optimization modeling of the next step.
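The per-pixel S3 computations described above (weighted aggregation of pyramid levels, a saliency attention weight, and a learned mapping to a three-component deformation vector) can be sketched as follows. Several details are assumptions because the original equations are not reproduced: the scale weights are normalized with a softmax, the "normalized activation function" is taken to be a sigmoid, and the pyramid levels are assumed to be already resampled to a common resolution. All names and numbers are hypothetical.

```python
import math


def softmax(values):
    """Normalize raw values into positive weights that sum to 1."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate_scales(level_features, scale_logits):
    """Weighted sum of the per-level feature vectors at one pixel."""
    weights = softmax(scale_logits)
    channels = len(level_features[0])
    return [
        sum(w * feats[k] for w, feats in zip(weights, level_features))
        for k in range(channels)
    ]


def attention_weight(feature, attn_params):
    """Sigmoid of a linear score; values near 0 suppress stable background."""
    score = sum(p * f for p, f in zip(attn_params, feature))
    return 1.0 / (1.0 + math.exp(-score))


def initial_deformation(feature, attn, mapping):
    """Map an attended feature vector to (east, north, up) components.

    `mapping` is a 3 x C matrix given as three rows of length C.
    """
    return [attn * sum(m * f for m, f in zip(row, feature)) for row in mapping]


# Two hypothetical pyramid levels with two channels each.
levels = [[1.0, 0.0], [0.0, 1.0]]
feature = aggregate_scales(levels, [0.0, 0.0])
attn = attention_weight(feature, [0.0, 0.0])
d0 = initial_deformation(feature, attn, [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
```

In a real network these operations would run as convolutional layers over the whole feature map; the scalar version only makes the data flow at a single pixel explicit.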

[0031] Furthermore, after the initial three-dimensional deformation feature tensor is output, the deep learning results are explicitly coupled with the multi-track InSAR observation geometry, and elastic mechanical constraints are introduced to jointly invert the three-dimensional deformation field, so that physical priors suppress the ill-posedness caused by the limited observation dimension. First, the pixel-level deformation features output by the network are mapped into a continuous three-dimensional deformation vector field, which gives the east, north, and vertical deformation components at each location. This field is related to the orbital observation sensitivity vectors of S1 through a consistent observation relationship, in which the transpose denotes the vector transpose, to construct the multi-geometry InSAR measurement consistency constraint. Its pixel-level error term is defined as:

[0032] [Equation not reproduced in the source text.]

[0033] where the observed quantity is the InSAR line-of-sight deformation of a given orbit at a given time at the corresponding pixel, and the error is computed in the form used for iteratively weighted least squares. This constraint term keeps the inverted three-dimensional deformation consistent with the actual interferometric measurements in observation space. On this basis, elasticity priors are introduced to constrain the spatial continuity and physical plausibility of the deformation field. A strain energy density constraint is constructed on the three-dimensional deformation field, and its elastic regularization term is defined as:

[0034] [Equation not reproduced in the source text.]

[0035] where the domain symbol denotes the spatial domain of the monitoring area; the divergence term applies the divergence operator to the three-dimensional deformation vector field; the gradient term is the gradient matrix of that field; and the elastic parameters adjust the penalty strength on volumetric and shear deformation. The regularization term suppresses discontinuous, non-physical abrupt deformation solutions during optimization, so that the inversion results agree with the continuous deformation behavior of the geological medium. Finally, the observation consistency constraint and the elastic mechanical constraint are combined into an end-to-end optimizable physical constraint loss function:

[0036] [Equation not reproduced in the source text.]

[0037] where the left-hand side is the joint loss function, whose weighting balances the accuracy of the observation fit against the strength of the physical constraint. It drives the deep learning network to complete the joint inversion of the deformation components in the east-west, north-south, and vertical directions, so that the resulting three-dimensional deformation field is valid in both observation space and physical space. It also provides a stable and reliable inversion basis for the next step, which introduces uncertainty quantification and physical model fusion.
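The structure of the S4 loss described above can be sketched on a small grid. This is an assumed discretization, not the patent's formula: a single epoch is used, derivatives are forward differences with unit pixel spacing along the two horizontal grid axes only, the elastic term is `lam * div^2 + mu * |grad|^2`, and the parameter names `lam`, `mu`, and `alpha` are hypothetical stand-ins for the elastic parameters and the balancing weight.

```python
def observation_loss(field, sens, los_obs):
    """Sum of squared line-of-sight misfits over orbits and pixels.

    field[r][c] = (east, north, up) deformation at a pixel;
    sens[i] = sensitivity vector of orbit i;
    los_obs[i][r][c] = measured line-of-sight deformation for orbit i.
    """
    total = 0.0
    for s, obs in zip(sens, los_obs):
        for r, row in enumerate(field):
            for c, d in enumerate(row):
                predicted = s[0] * d[0] + s[1] * d[1] + s[2] * d[2]
                total += (predicted - obs[r][c]) ** 2
    return total


def elastic_loss(field, lam, mu):
    """Discrete strain-energy-style penalty on the deformation field."""
    total = 0.0
    rows, cols = len(field), len(field[0])
    for r in range(rows - 1):
        for c in range(cols - 1):
            d = field[r][c]
            # Forward differences along columns (x) and rows (y).
            dx = [field[r][c + 1][k] - d[k] for k in range(3)]
            dy = [field[r + 1][c][k] - d[k] for k in range(3)]
            divergence = dx[0] + dy[1]
            grad_sq = sum(v * v for v in dx) + sum(v * v for v in dy)
            total += lam * divergence ** 2 + mu * grad_sq
    return total


def joint_loss(field, sens, los_obs, lam, mu, alpha):
    """Observation misfit plus alpha times the elastic penalty."""
    return observation_loss(field, sens, los_obs) + alpha * elastic_loss(field, lam, mu)


# A uniform field observed by one hypothetical vertical-looking orbit.
uniform = [[(1.0, 2.0, 3.0)] * 2 for _ in range(2)]
sens = [(0.0, 0.0, 1.0)]
los = [[[3.0, 3.0], [3.0, 3.0]]]

# A field with an abrupt eastward jump between columns.
jump = [[(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)] for _ in range(2)]
```

The uniform field that matches its observations incurs zero loss, while the field with the jump is penalized by the elastic term even before any observation is considered, which is the mechanism that discourages non-physical abrupt solutions.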

[0038] Furthermore, based on the three-dimensional deformation field obtained in S4, which satisfies the observation consistency and elasticity constraints, S5 introduces a joint uncertainty quantification mechanism for the physical model and the deep learning results in order to assess the credibility of the deformation inversion results and express the associated risk. Specifically, a deformation forward modeling operator for an elastic half-space or layered medium is first constructed from a medium description consistent with the elastic parameters in S4, and the three-dimensional deformation field output by this physical model is recorded. Meanwhile, the deep learning network is run through multiple random-inactivation (dropout) inference passes, and the resulting predictions are collected into a set: for each spatial location in the monitoring area, the set contains all three-dimensional deformation predictions output by the network from the first inference pass to the last, with an index running over the inference passes and one deformation prediction per pass. The uncertainty of the network prediction is characterized at the pixel level by the mean and dispersion of this set, and the prediction mean is defined as:

[0039] [Equation not reproduced in the source text.]

[0040] This mean reflects the steady-state prediction of the deep learning model for the three-dimensional deformation after physical constraint optimization.
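The mean and dispersion of repeated dropout inference passes can be sketched as follows. The "network" here is a hypothetical toy (`toy_forward`) that only mimics dropout staying active at inference time; the statistics in `mean_and_variance` are the standard per-component sample mean and population variance, which is an assumed reading since the original equation is not reproduced.

```python
import random


def toy_forward(x, rng, keep=0.8):
    """Hypothetical stand-in for a network with dropout left active.

    Each input is kept with probability `keep` and rescaled (inverted
    dropout); the three outputs play the role of (east, north, up).
    """
    out = []
    for _ in range(3):
        acc = 0.0
        for v in x:
            if rng.random() < keep:
                acc += v / keep
        out.append(acc)
    return tuple(out)


def mc_dropout_predict(forward, x, passes, seed=0):
    """Run `passes` stochastic forward passes and collect the outputs."""
    rng = random.Random(seed)
    return [forward(x, rng) for _ in range(passes)]


def mean_and_variance(samples):
    """Per-component mean and population variance of 3-vectors."""
    n = len(samples)
    mean = [sum(s[k] for s in samples) / n for k in range(3)]
    var = [sum((s[k] - mean[k]) ** 2 for s in samples) / n for k in range(3)]
    return mean, var


samples = mc_dropout_predict(toy_forward, [0.1, 0.2, 0.3], passes=20)
mean, var = mean_and_variance(samples)
```

In a deep learning framework the same pattern amounts to keeping the dropout layers in training mode during inference and stacking the outputs of several forward passes before taking the mean and variance.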

[0041] Furthermore, to achieve complementary fusion of the physical model and the data-driven model, an uncertainty-adaptive fusion weight is introduced, which weights the two types of deformation results at the pixel level. The fused deformation field is defined as follows:

[0042] [Equation not reproduced in the source text.]

[0043] where the fusion weight is obtained by inversely mapping the variance of the network predictions; it strengthens the contribution of the deep learning results in regions of low uncertainty and the constraining effect of the physical model in regions of high uncertainty, which improves the stability and reliability of the overall deformation field. On this basis, a pixel-level deformation confidence index is constructed by quantifying the dispersion of the predictions, and it is defined as:

[0044] [Equation not reproduced in the source text.]

[0045] where the variance term is the variance vector of the three-dimensional deformation components over the multiple random inference passes. The index maps deformation uncertainty to a continuous confidence value, so that the final output contains not only the spatial distribution of the three-dimensional deformation but also a corresponding reliability measure. The result is a quantitative deformation map for geological disaster risk assessment, which provides direct input for subsequent disaster zoning, threshold identification, and early-warning decision-making.
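The uncertainty-adaptive fusion and the confidence index described above can be sketched at a single pixel. The text only states that the weight is an inverse mapping of the prediction variance and that the index maps dispersion to a continuous confidence value, so the specific forms below, `1 / (1 + variance)` for the weight and `exp(-variance)` for the confidence, are assumptions, as are the function name and the `var_scale` parameter.

```python
import math


def fuse_with_physics(dl_mean, dl_var, phys, var_scale=1.0):
    """Blend the network mean with the physical-model prediction.

    dl_mean, dl_var: per-component mean and variance from repeated
    dropout passes; phys: forward-model deformation at the same pixel.
    Returns (fused deformation, fusion weight, confidence).
    """
    total_var = sum(dl_var) / var_scale
    weight = 1.0 / (1.0 + total_var)      # high variance -> trust physics more
    fused = [weight * d + (1.0 - weight) * p for d, p in zip(dl_mean, phys)]
    confidence = math.exp(-total_var)     # 1 when passes agree, falls toward 0
    return fused, weight, confidence


# A pixel where the dropout passes agree exactly.
certain = fuse_with_physics([2.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0])

# A pixel where the east component varies between passes.
uncertain = fuse_with_physics([2.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 0.0])
```

At the certain pixel the fused value equals the network mean with confidence 1; at the uncertain pixel the result is pulled halfway toward the physical model and the confidence drops below 1, which matches the qualitative behavior the text describes.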

[0046] Compared with existing technologies, the deep learning-based InSAR geological hazard deformation identification method of the present invention has the following advantages:

[0047] 1. Multi-orbit InSAR observations and high-resolution optical images are introduced together at the data source level, and the orbital observation sensitivity and optical structure complexity are modeled quantitatively. This enhances the geometric observability and semantic interpretability of three-dimensional surface deformation from the source and overcomes the difficulty of characterizing true three-dimensional deformation with single-line-of-sight InSAR data.

[0048] 2. A unified multi-source data spatiotemporal registration and pixel-level fusion model is constructed. Through cross-modal spatiotemporal consistency constraints and an adaptive modality weighting mechanism, radar amplitude, phase, and coherence features are fused stably with optical spectral and texture features in the same feature space. This effectively reduces the impact of differences in geometry, resolution, and statistical distribution among multi-sensor data on deformation recognition accuracy.

[0049] 3. A multi-scale three-dimensional deformation feature joint extraction network is proposed. Through the coordinated design of multi-branch encoding, a feature pyramid, and a deformation saliency attention mechanism, the network captures both small local deformations and gradual regional-scale deformations, which significantly improves the completeness and robustness of deformation pattern recognition in complex geological disaster scenarios.

[0050] 4. In the deep learning inversion process, the prior constraints of elasticity are explicitly introduced, and the multi-track InSAR measurement equations and the elastic properties of the continuous medium are integrated into the end-to-end optimization framework. This effectively reduces the ill-posedness of three-dimensional deformation inversion, avoids the generation of non-physical abrupt solutions, and significantly improves the spatial continuity and physical rationality of the inversion results.

[0051] 5. By integrating physical forward modeling and deep learning prediction results, and combining an uncertainty quantification mechanism based on random inactivation inference, an adaptive trade-off between data-driven and physical mechanisms is achieved in deformation results, ensuring that the inversion results remain stable and reliable in high uncertainty regions, thereby enhancing the credibility of the overall deformation field.

[0052] 6. The final output is a quantitative deformation result map that contains both the three-dimensional deformation and a pixel-level confidence assessment. This upgrades geological disaster risk assessment from interpreting deformation amplitude alone to joint decision-making based on deformation amplitude and confidence, and it significantly improves the practicality and decision-support capability of monitoring and early-warning applications for geological disasters such as landslides and ground subsidence.

Attached Figure Description

[0053] Figure 1 is a schematic diagram of the overall process of the deep learning-based InSAR geological disaster deformation identification of the present invention.

[0054] Figure 2 is a schematic diagram of the multi-source data spatiotemporal registration and fusion process of the present invention.

[0055] Figure 3 is a schematic diagram of the multi-scale three-dimensional deformation feature joint extraction network structure of the present invention.

[0056] Figure 4 is a schematic diagram of the physically constrained three-dimensional deformation field inversion process of the present invention.

[0057] Figure 5 is a schematic diagram of the uncertainty quantification and result fusion output process of the present invention.

Detailed Implementation

[0058] As shown in Figures 1 to 5, the deep learning-based InSAR geological hazard deformation identification method includes the following steps:

[0059] S1. Multi-source data acquisition: Acquire multi-track synthetic aperture radar interferometry (InSAR) data and corresponding high-resolution optical image data in the same monitoring area. The multi-track InSAR data includes multiple SAR image sequences from different observation directions, and the optical image data includes multispectral or panchromatic band images, which are used to provide auxiliary ground cover classification and deformation semantic information.

[0060] S2. Establish a multi-source data spatiotemporal registration and pixel-level fusion model: Spatiotemporal normalization processing is performed on the multi-track InSAR data and optical image data obtained in S1. The image registration algorithm is used to unify the data from different sensors and different time phases into the same geographic coordinate system and grid space. On this basis, a pixel-level multi-source feature fusion module is constructed to jointly extract features and form a multi-channel fusion feature map for subsequent network input.

[0061] S3. Construct a joint extraction network for multi-scale three-dimensional deformation features: The joint extraction network adopts an encoder-decoder structure. The encoder uses a multi-branch convolutional neural network to extract the InSAR interferometric features of the different orbits and the spatial context information of the optical images, and it captures multi-scale deformation patterns, from local details to regional background, through a feature pyramid structure. The decoder fuses the multi-level features, enhances the feature response of areas with significant deformation through an attention mechanism, and outputs a feature tensor containing preliminary information on the direction and magnitude of surface deformation.

[0062] S4. Elastic mechanical constraints are introduced to optimize the inversion of the three-dimensional deformation field. Based on the deformation features extracted in S3 and combined with the InSAR deformation measurement equation of multiple geometric observations, a loss function containing elastic mechanical priors is constructed. The surface deformation components in the east-west, north-south, and vertical directions are inverted using a deep learning network. Physical constraints are used to reduce the ill-posedness of the solution and improve the spatial continuity and physical rationality of the three-dimensional deformation field.

[0063] S5. Fuse the physical model with the deep learning output and quantify uncertainty: The deformation forward modeling results based on an elastic half-space or layered medium model are fused with the neural network prediction results. The uncertainty of the deformation prediction is estimated using a Bayesian deep learning framework or the Monte Carlo dropout method. Finally, the three-dimensional deformation of each pixel and its corresponding confidence assessment are output, forming a quantitative deformation result map that can be used for geological hazard risk assessment.

[0064] The jointly extracted features in S2 include radar amplitude, phase, and coherence features together with optical spectral and texture features; the deformation forward modeling results in S5 are output by an elastic half-space or layered medium model.

[0065] During the multi-source data acquisition in S1, a unified data acquisition and organization mechanism is constructed for the same monitoring area. At least two synthetic aperture radar (SAR) observation tasks with different orbital parameters are scheduled to simultaneously acquire SAR image sequences covering the same spatial range and with a consistent time span. The differences in incident angle and azimuth angle between different orbits provide necessary geometric redundancy for subsequent three-dimensional deformation decomposition. Simultaneously, optical image data spatially overlapping with the SAR images are acquired within the same time window. These optical images contain multispectral or panchromatic bands to characterize land cover type, structural boundaries, and texture continuity information, and serve as external observation sources for deformation semantic constraints. The geometric sensitivity of multi-orbit InSAR observations is quantitatively characterized during the data acquisition phase by defining an orbital observation sensitivity vector:

[0066] $$S_k = \left(\sin\theta_k\cos\varphi_k,\ \sin\theta_k\sin\varphi_k,\ \cos\theta_k\right)^{T};$$

[0067] where $S_k$ is the orbital observation sensitivity vector of the SAR imagery from the $k$-th orbit, $\theta_k$ is the incidence angle of the $k$-th orbit's SAR imagery, and $\varphi_k$ is the corresponding azimuth angle. The vector describes the orbit's combined response to the east-west, north-south, and vertical deformation components, which ensures at the data level that the multi-orbit combination can observe the three-dimensional deformation to be inverted. Simultaneously, an optical structure complexity index $C_O$ is introduced: ;

[0068] In this index, $B$ is the total number of bands in the optical image, $I_b$ is the radiation intensity distribution of the $b$-th optical band or panchromatic image, and $\nabla$ is the spatial gradient operator. The index quantifies how strongly the boundaries and textures of ground features vary in the optical imagery, ensuring that the acquired optical data clearly distinguishes landslide, subsidence, and stable areas. Next, the multi-orbit SAR image sequences are organized under a unified time index to construct a temporally consistent observation set:

[0069] $$\mathcal{D} = \left\{\left(X_{k,t},\ O_t\right)\right\}_{k,t};$$

[0070] where $X_{k,t}$ denotes the SAR observation of the $k$-th orbit at time $t$, and $O_t$ denotes the optical image corresponding to the same time. This set provides a structured input for the spatiotemporal registration, pixel-level fusion, and joint extraction of multi-scale deformation features in the subsequent steps, and leads directly into the multi-source spatiotemporal normalization and fusion modeling of the next step.
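The optical structure complexity index is described through the band count and the spatial gradient of each band's intensity, but its exact formula is not reproduced in this text. As a minimal sketch, the following assumes one plausible form: the spatial mean of the gradient magnitude, averaged over bands. The function name `optical_structure_complexity` and the synthetic test images are illustrative assumptions, not part of the source.

```python
import numpy as np

def optical_structure_complexity(bands):
    """Assumed form of the index: spatial mean of the gradient magnitude, averaged over bands.

    bands: array of shape (B, H, W), one 2-D intensity image per optical band.
    """
    bands = np.asarray(bands, dtype=float)
    per_band = []
    for image in bands:
        d_row, d_col = np.gradient(image)          # finite differences along rows and columns
        per_band.append(np.hypot(d_row, d_col).mean())
    return float(np.mean(per_band))

# Synthetic check images (not real data): a featureless scene and a uniform ramp.
flat = np.ones((2, 8, 8))
ramp = np.tile(np.arange(8.0), (2, 8, 1))          # intensity grows by 1 per column
c_flat = optical_structure_complexity(flat)
c_ramp = optical_structure_complexity(ramp)
```

Under this assumed form a featureless image scores zero and stronger edges or textures raise the score, which matches the stated purpose of checking that the optical data can separate landslide, subsidence, and stable areas.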

[0071] Based on the time-consistent observation set constructed in S1, S2 first applies a unified spatiotemporal normalization to the multi-orbit InSAR data and the optical imagery. A joint spatiotemporal mapping function aligns the observations from different sensors and different times to the same geographic coordinate system and regular grid space, eliminating the influence of orbital differences, imaging geometry differences, and inconsistent temporal sampling on subsequent feature learning. For this purpose, the cross-modal spatiotemporal registration error energy is defined as follows: $$E_{reg} = \sum_{k}\sum_{t}\left\lVert R_{k,t}(p) - O_t\big(\mathcal{T}(p)\big)\right\rVert^{2};$$

[0072] where $E_{reg}$ is the cross-modal spatiotemporal registration error energy, which is minimized to correct orbital errors and improve the accuracy of the deformation inversion; the double summation $\sum_{k}\sum_{t}$ traverses all orbits of the multi-orbit InSAR data and all observation phases (time points), with $k$ the orbit index and $t$ the time index; $p$ denotes the pixel coordinates in the uniform grid; $\mathcal{T}$ denotes the spatial mapping operator estimated by the registration model; $R_{k,t}(p)$ denotes the radar observation signal (including amplitude, phase, and coherence features) of the $k$-th orbit at time $t$ and spatial location $p$; and $O_t(\mathcal{T}(p))$ denotes the optical observation signal (including spectral and texture features) at the same time, taken at the optical pixel that $\mathcal{T}$ registers to SAR pixel $p$. This energy function constrains the structural consistency of the radar and optical observations at the same spatial location, ensuring the geometric consistency of the multi-source information at the data level. After spatiotemporal registration, a pixel-level multi-source feature fusion model is constructed for each pixel location, jointly encoding the radar amplitude, phase, and coherence features with the optical spectral and texture features. To avoid bias in the fusion result caused by differences in the dimensions and statistical distributions of the modalities, adaptive modal normalization weights are introduced:

[0073] $$w_m = \frac{\exp(a_m)}{\sum_{n}\exp(a_n)};$$

[0074] where $w_m$ is the normalized weight of the $m$-th target item (such as a SAR orbit, observation time phase, pixel, or feature); each weight lies in (0, 1) and the weights of all target items sum to 1. The index $m$ corresponds to a radar or optical modality. $a_m$ is the raw score of the $m$-th target item (such as a weight error or feature similarity); it is learned automatically by the network during training and measures the contribution of each modality to the deformation representation at the current pixel, which lets the fused features keep a stable discriminative ability under complex terrain conditions. $\exp(a_m)$ applies the exponential to the raw score to amplify the score differences between target items, and $\sum_{n}\exp(a_n)$ sums the exponentials of the raw scores of all target items and serves as the normalizing denominator, so that all weights sum to 1. Based on this, the pixel-level fused feature vector is defined:

[0075] ;

[0076] In this fused vector, $f_m(p)$ denotes the standardized features extracted from the $m$-th modality at pixel $p$. The fused vector combines, within the same feature space, the deformation-sensitive information of the multi-orbit InSAR data and the semantic land cover information of the optical images. Applied across the entire monitoring area, it yields a spatially continuous and channel-complete multi-channel fusion feature map. This feature map is passed directly as a structured input to the multi-scale three-dimensional deformation feature joint extraction network of the next step, providing a consistent data foundation for the deep feature learning of the encoder-decoder structure.
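The adaptive modal weighting can be sketched in a few lines of NumPy. The softmax normalization follows the description of the weights above (exponentiate the raw scores, divide by their sum). How the weighted features are combined into the fused vector is not reproduced in this text, so the sketch assumes one weighted channel per modality, in line with the 5-channel fused map of Example 1. The names `modal_weights` and `fuse_pixelwise` and the toy inputs are illustrative assumptions.

```python
import numpy as np

def modal_weights(scores):
    """Softmax over the modality axis (axis 0) of raw scores shaped (M, H, W)."""
    scores = np.asarray(scores, dtype=float)
    shifted = scores - scores.max(axis=0, keepdims=True)   # subtract the max for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum(axis=0, keepdims=True)

def fuse_pixelwise(features, scores):
    """Weight each standardized modality map; keep one channel per modality (assumption)."""
    weights = modal_weights(scores)
    return weights * np.asarray(features, dtype=float), weights

# Toy input: three modalities with constant feature values and equal raw scores.
features = np.stack([np.full((2, 2), v) for v in (1.0, 2.0, 3.0)])
scores = np.zeros((3, 2, 2))
fused, weights = fuse_pixelwise(features, scores)
```

With equal raw scores every modality receives the weight 1/3 at each pixel; in the method itself the scores are learned during training, so the weights vary from pixel to pixel.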

[0077] In S3, the multi-channel fused feature map formed in S2 is used as the unified input of the multi-scale joint feature extraction network for three-dimensional deformation. The overall structure couples the encoder and decoder symmetrically. In the encoder stage, multi-branch convolutional paths corresponding to the multi-orbit InSAR observations model the interferometric phase changes, amplitude stability, and coherence evolution under the different observation geometries, each in an independent feature subspace. At the same time, spatial context information from the optical images is introduced to improve the perception of ground object boundaries and continuous deformation regions. To express deformation patterns consistently across scales, a multi-layer feature pyramid is constructed in the encoder, and the features at different resolutions are aggregated through a scale correlation operator. Its scale response is defined as:

[0078] $$F_{ms}(p) = \sum_{l=1}^{L}\beta_l\,F_l(p);$$

[0079] where $F_l(p)$ denotes the encoded features of the $l$-th feature pyramid layer at pixel $p$, $\beta_l$ is the scale-related learnable weight, and $L$ is the total number of layers participating in the weighted fusion. This lets the network capture both the spatial consistency of small localized deformations and large-scale gradual deformations. In the decoder stage, high-level semantic information and low-level spatial details are fused through progressive upsampling and cross-layer connections. Furthermore, a deformation saliency attention mechanism is introduced to enhance the feature response of potential geological hazard areas, with the attention weights as follows:

[0080] ;

[0081] Here $W_a$ denotes the attention mapping parameters and $\sigma(\cdot)$ denotes the normalized activation function. The attention weights suppress the features of stable background regions and highlight key regions where the deformation gradient is continuous and consistent in direction. Based on this, the network constructs a three-dimensional deformation initial representation tensor at the decoder output, whose pixel-level expression is as follows: ;

[0082] Here $W_d$ denotes the deformation mapping matrix learned by the network. The tensor encodes both deformation amplitude and direction information in the feature space. It provides continuous, differentiable, and physically directional initial deformation features for the subsequent inversion of the three-dimensional deformation field under elastic mechanical constraints, leading into the physical constraint optimization of the next step.
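The scale-weighted pyramid aggregation and the saliency attention described above can be illustrated with a small NumPy sketch. It is only a sketch under stated assumptions: coarse pyramid levels are brought to the finest grid by nearest-neighbour upsampling, the attention is a sigmoid of a scalar-weighted feature, and the enhanced response is the element-wise product of attention and feature. The source does not give these details, and the names `aggregate_scales` and `saliency_attention` are hypothetical.

```python
import numpy as np

def upsample_nearest(x, factor):
    """Nearest-neighbour upsampling of a 2-D map by an integer factor."""
    return np.repeat(np.repeat(x, factor, axis=0), factor, axis=1)

def aggregate_scales(levels, beta):
    """Weighted sum of pyramid levels; levels[l] has shape (H / 2**l, W / 2**l)."""
    out = np.zeros_like(levels[0], dtype=float)
    for l, (feat, b) in enumerate(zip(levels, beta)):
        out += b * upsample_nearest(feat, 2 ** l)
    return out

def saliency_attention(feat, w_a, b_a=0.0):
    """Sigmoid attention weight in (0, 1); w_a and b_a stand in for learned parameters."""
    return 1.0 / (1.0 + np.exp(-(w_a * feat + b_a)))

# Toy three-level pyramid, mirroring the 10 m / 20 m / 40 m layers of Example 1.
levels = [np.ones((4, 4)), 2.0 * np.ones((2, 2)), 4.0 * np.ones((1, 1))]
beta = [0.5, 0.3, 0.2]
f_ms = aggregate_scales(levels, beta)      # 0.5*1 + 0.3*2 + 0.2*4 at every pixel
attn = saliency_attention(f_ms, w_a=1.0)
enhanced = attn * f_ms                     # attention-enhanced feature response
```

In a trained network the weights `beta`, `w_a`, and `b_a` would be learned, and the features would be multi-channel tensors rather than single 2-D maps.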

[0083] After the initial three-dimensional deformation feature tensor is output, S4 explicitly couples the deep learning results with the multi-orbit InSAR observation geometry and introduces elastic mechanical constraints to jointly invert the three-dimensional deformation field, using physical priors to suppress the ill-posedness caused by the limited number of observation dimensions. First, the pixel-level deformation features output by the network are mapped to a continuous three-dimensional deformation vector field $d(p)$, and a consistent observation relationship is established with the orbital observation sensitivity vectors $S_k$ of S1 to construct the multi-geometry InSAR measurement consistency constraint. Here $d(p) = \left(d_E(p),\ d_N(p),\ d_U(p)\right)^{T}$ is the continuous three-dimensional deformation vector field in the east, north, and vertical directions at position $p$, and the superscript $T$ denotes the vector transpose. Its pixel-level error term is defined as:

[0084] $$E_{obs} = \sum_{k}\sum_{t}\left\lVert S_k^{T}\,d(p) - d^{LOS}_{k,t}(p)\right\rVert^{2};$$

[0085] where $d^{LOS}_{k,t}(p)$ denotes the InSAR line-of-sight deformation of the $k$-th orbit at time $t$ at pixel $p$, and $\lVert\cdot\rVert^{2}$ is the error term of the iteratively weighted least-squares formulation. This constraint keeps the inverted three-dimensional deformation consistent with the actual interferometric measurements in the observation space. Based on this, elasticity priors are introduced to constrain the spatial continuity and physical plausibility of the deformation field. By imposing a strain energy density constraint on the three-dimensional deformation field, its elastic regularization term is defined as:

[0086] $$E_{elas} = \int_{\Omega}\left(\lambda\,\big(\nabla\cdot d\big)^{2} + \mu\,\big\lVert\nabla d\big\rVert^{2}\right)\mathrm{d}p;$$

[0087] where $\Omega$ denotes the spatial domain of the monitoring area; $\nabla\cdot d$ is the divergence of the three-dimensional deformation vector field, with $\nabla\cdot$ the divergence operator and $d$ the three-dimensional deformation vector field; $\nabla d$ is the gradient matrix of the three-dimensional deformation vector field; and $\lambda$ and $\mu$ are the elastic parameters, which adjust the penalty strength for volumetric and shear deformation. This term suppresses discontinuous, non-physical abrupt deformation solutions during optimization, keeping the inversion results consistent with the continuous deformation behavior of the geological medium. Finally, the observation consistency constraint and the elastic mechanical constraint are combined into an end-to-end optimizable physical constraint loss function:

[0088] $$\mathcal{L} = E_{obs} + \gamma\,E_{elas};$$

[0089] where $\mathcal{L}$ is the joint loss function and $\gamma$ is the weighting factor that balances the accuracy of the observation fit against the strength of the physical constraints. The loss drives the deep learning network to jointly invert the deformation components in the east-west, north-south, and vertical directions, so that the resulting three-dimensional deformation field is valid in both the observation space and the physical space. It also provides a stable and reliable deformation inversion basis for the uncertainty quantification and physical model fusion of the next step.
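To make the structure of the joint loss concrete, here is a minimal NumPy sketch that evaluates an observation consistency term and an elastic regularization term on a gridded deformation field and combines them with a tradeoff coefficient. It rests on several assumptions that the source does not state: derivatives are taken by finite differences over the horizontal grid only, rows run north and columns run east, the elastic parameters are normalized by the shear modulus, and the residuals are summed over all pixels. The sensitivity vectors and fields are illustrative values, not data from the embodiments.

```python
import numpy as np

def observation_error(d, s_vectors, los):
    """Sum of squared line-of-sight residuals.

    d: (H, W, 3) east/north/up field; s_vectors: (K, 3); los: (K, H, W).
    """
    pred = np.einsum("kc,hwc->khw", s_vectors, d)      # project the field onto each orbit
    return float(np.sum((pred - los) ** 2))

def elastic_energy(d, lam, mu, spacing=1.0):
    """Discrete penalty lam * (div d)^2 + mu * |grad d|^2 over the horizontal grid (assumption)."""
    grads = [np.gradient(d[..., c], spacing) for c in range(3)]   # each entry: [d/d_row, d/d_col]
    divergence = grads[0][1] + grads[1][0]             # d(east)/d(col) + d(north)/d(row)
    grad_sq = sum(g ** 2 for comp in grads for g in comp)
    return float(np.sum(lam * divergence ** 2 + mu * grad_sq) * spacing ** 2)

def joint_loss(d, s_vectors, los, lam, mu, gamma, spacing=1.0):
    return observation_error(d, s_vectors, los) + gamma * elastic_energy(d, lam, mu, spacing)

# Illustrative two-orbit geometry and a spatially uniform "true" field.
s_vectors = np.array([[-0.16, 0.59, 0.79], [-0.58, 0.33, 0.74]])
smooth = np.tile(np.array([1.0, 2.0, 3.0]), (6, 6, 1))
los = np.einsum("kc,hwc->khw", s_vectors, smooth)
rough = smooth.copy()
rough[3, 3, 2] += 1.0                                  # inject a single-pixel vertical jump
loss_smooth = joint_loss(smooth, s_vectors, los, lam=1.5, mu=1.0, gamma=0.3)
loss_rough = joint_loss(rough, s_vectors, los, lam=1.5, mu=1.0, gamma=0.3)
```

The uniform field reproduces the observations and has no spatial gradients, so its loss is zero, while the single-pixel jump is penalized by both terms. In the method this quantity would be minimized with respect to the network parameters rather than evaluated on fixed arrays.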

[0090] Based on the three-dimensional deformation field obtained in S4, which satisfies both observation consistency and the elastic mechanical constraints, S5 introduces a joint uncertainty quantification mechanism for the physical model and the deep learning results in order to assess the credibility of the deformation inversion and express risk. Specifically, using a medium description consistent with the elastic parameters in S4, a deformation forward modeling operator is constructed for an elastic half-space or layered medium, and the three-dimensional deformation field output by the physical model is denoted $d_{phys}(p)$. Meanwhile, the set of predictions obtained by the deep learning network over multiple random-inactivation (dropout) inference passes is denoted $\{d_n(p)\}_{n=1}^{N}$, that is, the complete set of three-dimensional deformation predictions output at spatial location $p$ within the monitoring area during the 1st to $N$-th random-inactivation inference passes. Here $n$ is the inference index, taking values from 1 to $N$, and $d_n(p)$ is the deformation prediction of the $n$-th inference pass. The uncertainty of the network prediction is characterized by the pixel-level mean and dispersion of this set, and the prediction mean is defined as:

[0091] $$\bar{d}(p) = \frac{1}{N}\sum_{n=1}^{N} d_n(p);$$

[0092] This mean reflects the stable prediction of the deep learning model for the three-dimensional deformation after physical constraint optimization;

[0093] To achieve complementary fusion of the physical model and the data-driven model, an uncertainty-adaptive fusion weight is introduced, which weights the two types of deformation results at the pixel level. The fused deformation field is defined as follows:

[0094] $$d_{fused}(p) = \alpha(p)\,\bar{d}(p) + \big(1-\alpha(p)\big)\,d_{phys}(p);$$

[0095] where $\bar{d}(p)$ is the network prediction mean, $d_{phys}(p)$ is the deformation field from the physical forward model, and the fusion weight $\alpha(p)$ is obtained by inversely mapping the variance of the network predictions. The weight increases the contribution of the deep learning results in regions of low uncertainty and strengthens the constraint of the physical model in regions of high uncertainty, improving the stability and reliability of the overall deformation field. Based on this, a pixel-level deformation confidence index is constructed by quantifying the dispersion of the prediction results, and its definition is as follows:

[0096] ;

[0097] Here $\sigma^{2}(p)$ denotes the variance vector of the three-dimensional deformation components over the multiple random inference passes. The index maps deformation uncertainty to continuous confidence values, so that the final output includes not only the spatial distribution of the three-dimensional deformation but also a corresponding reliability measure. The result is a quantitative deformation map for geological hazard risk assessment that provides direct input for subsequent hazard zoning, threshold identification, and early warning decisions.
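The uncertainty step can be sketched as follows. The pixel-level mean and variance over the dropout passes follow the description directly. The specific inverse mapping from variance to fusion weight and the mapping from variance to confidence are not reproduced in this text, so the forms used here, `1 / (1 + total variance / scale)` and `exp(-norm of variance / scale)`, are assumptions chosen only because they decrease monotonically and stay within 0 to 1. All names and the synthetic samples are illustrative.

```python
import numpy as np

def mc_statistics(samples):
    """Pixel-level mean and variance over N stochastic passes; samples: (N, H, W, 3)."""
    return samples.mean(axis=0), samples.var(axis=0)

def fusion_weight(var, scale=1.0):
    """Assumed inverse mapping: low variance -> weight near 1 (trust the network)."""
    total = var.sum(axis=-1, keepdims=True)
    return 1.0 / (1.0 + total / scale)

def fuse(mean, d_phys, alpha):
    """Convex combination of the network mean and the physical forward model."""
    return alpha * mean + (1.0 - alpha) * d_phys

def confidence(var, scale=1.0):
    """Assumed confidence map in (0, 1]: decreases as the variance vector grows."""
    return np.exp(-np.linalg.norm(var, axis=-1) / scale)

base = np.tile(np.array([1.0, 2.0, 3.0]), (4, 4, 1))
d_phys = np.zeros_like(base)

# Case 1: identical passes, so zero variance.
certain = np.stack([base] * 5)
mean_c, var_c = mc_statistics(certain)
fused_c = fuse(mean_c, d_phys, fusion_weight(var_c))
conf_c = confidence(var_c)

# Case 2: synthetic noisy passes (random numbers, not model output).
rng = np.random.default_rng(0)
noisy = base + rng.normal(scale=0.5, size=(50, 4, 4, 3))
mean_n, var_n = mc_statistics(noisy)
alpha_n = fusion_weight(var_n)
conf_n = confidence(var_n)
```

With zero variance the fused field equals the network mean and the confidence is 1; as the spread of the passes grows, the fusion shifts toward the physical model and the confidence drops below 1.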

[0098] Example 1:

[0099] This embodiment uses a landslide-prone area in a mountainous region in Southwest China as the monitoring area, covering approximately 50 km². The area has significant topographic relief, complex geological structures, and frequent landslides. Traditional InSAR methods in this region suffer from low deformation inversion accuracy and poor physical plausibility. The method of this invention is used for geological hazard deformation identification, and the specific implementation process is as follows:

[0100] Multi-source remote sensing data acquisition: Two SAR satellites in different orbits were used to acquire InSAR data of the monitoring area. Orbit 1 parameters: incidence angle 38°, azimuth angle 105°; Orbit 2 parameters: incidence angle 42°, azimuth angle 150°. The time span was 12 months, and a total of 36 SAR images were acquired, forming two SAR image sequences with different observation directions that provide geometric redundancy for the three-dimensional deformation decomposition. High-resolution optical satellite images covering the same space and time range were acquired simultaneously, with a resolution of 2 m, four multispectral bands (red, green, blue, and near-infrared), and one panchromatic band. The optical structure complexity index was calculated as $C_O$ = 8.2, which meets the ground feature discrimination requirement. The observation sensitivity vectors of the two orbits were then calculated:

[0101] Orbit 1: S1 = (sin38°cos105°, sin38°sin105°, cos38°);

[0102] Orbit 2: S2 = (sin42°cos150°, sin42°sin150°, cos42°);
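The two sensitivity vectors can be evaluated numerically from the stated angles. The short sketch below uses the component order given above (east-west, north-south, vertical); the function name `sensitivity_vector` is an illustrative choice.

```python
import numpy as np

def sensitivity_vector(incidence_deg, azimuth_deg):
    """(sin(theta) cos(phi), sin(theta) sin(phi), cos(theta)) for angles given in degrees."""
    theta, phi = np.radians([incidence_deg, azimuth_deg])
    return np.array([np.sin(theta) * np.cos(phi),
                     np.sin(theta) * np.sin(phi),
                     np.cos(theta)])

# Orbit parameters from this embodiment.
S = np.vstack([sensitivity_vector(38.0, 105.0),
               sensitivity_vector(42.0, 150.0)])
norms = np.linalg.norm(S, axis=1)     # each row is a unit vector by construction
rank = np.linalg.matrix_rank(S)       # number of independent observation geometries
```

Two non-parallel line-of-sight geometries give a rank of 2, so the three deformation components are not fully determined by the observations alone. This is the ill-posedness that the elastic mechanical constraints of S4 are introduced to reduce.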

[0103] The two sets of SAR images and optical images are aligned with a unified time index to construct a time-consistent observation set.

[0104] Multi-source data spatiotemporal registration and pixel-level fusion: A joint spatiotemporal mapping function is constructed using thin-plate spline interpolation. All data are unified to the WGS84 geographic coordinate system with a grid resolution of 10 m. The spatial mapping operator is optimized by minimizing the cross-modal spatiotemporal registration error energy $E_{reg}$; after registration, the structural consistency error between the radar and optical data is less than 0.05 pixels. Three types of radar features (amplitude, interferometric phase, and coherence) and two types of optical features (spectral features and gray-level co-occurrence matrix texture) are extracted for each pixel. Adaptive modal normalization weights based on a Sigmoid activation function are introduced, and the contribution of each feature is learned automatically through training. A 5-channel pixel-level fusion feature vector is constructed, generating a multi-channel fusion feature map of the monitoring area with a size of 5000×1000 pixels.

[0105] Multi-scale 3D deformation feature joint extraction: A multi-scale 3D deformation feature joint extraction network is constructed. The encoder has two radar branches and one optical branch. Each branch consists of three convolutional blocks with a kernel size of 3×3 and a stride of 2. The feature pyramid has three scale layers, corresponding to 10m, 20m, and 40m resolutions, respectively. Cross-scale feature aggregation is achieved through 1×1 convolution. The decoder uses deconvolution for upsampling with a stride of 2 and fuses the encoder's low-level features through cross-layer connections. A channel attention mechanism is introduced to enhance the feature response in regions where the deformation gradient is greater than 0.5mm / mm. The multi-channel fused feature map is input into the network. After training, the network outputs an initial 3D deformation representation tensor with a size of 5000×1000×3, corresponding to the initial deformation features in the east-west, north-south, and vertical directions, respectively.

[0106] Three-dimensional deformation inversion with elastic mechanical constraints: Based on the geological survey data of the monitoring area, the elastic parameters of the rock mass were determined: shear modulus 8×10⁹ Pa and Lamé constant 1.2×10¹⁰ Pa. The initial three-dimensional deformation representation tensor was mapped to a three-dimensional deformation vector field, and the multi-geometry InSAR measurement consistency constraint $E_{obs}$ was constructed using the observation sensitivity vectors of the two orbits. The elastic regularization term $E_{elas}$ was constructed from the elastic parameters, and a tradeoff coefficient of 0.3 was set to form the physical constraint joint loss function. The Adam optimizer was used to train the network end-to-end with a batch size of 32 and 100 iterations. After training, the three-dimensional deformation components in the east-west, north-south, and vertical directions of the monitoring area were obtained by inversion. The root mean square error of the inversion results was less than 0.3 mm, and no obvious non-physical abrupt deformation was present.
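For readers who think in terms of Poisson's ratio and Young's modulus, the stated elastic parameters can be converted with the standard relations of isotropic linear elasticity. This is a worked arithmetic sketch that assumes the "Lamé constant" above is Lamé's first parameter; the variable names are illustrative.

```python
# Elastic parameters from this embodiment (Pa).
lam = 1.2e10   # Lamé's first parameter (assumed meaning of "Lamé constant")
mu = 8.0e9     # shear modulus

# Standard conversions for an isotropic linear elastic medium.
poisson_ratio = lam / (2.0 * (lam + mu))               # 1.2e10 / 4.0e10
youngs_modulus = mu * (3.0 * lam + 2.0 * mu) / (lam + mu)   # 8e9 * 5.2e10 / 2.0e10
bulk_modulus = lam + 2.0 * mu / 3.0
```

Under that assumption the values correspond to a Poisson's ratio of 0.3 and a Young's modulus of about 2.08×10¹⁰ Pa.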

[0107] Uncertainty quantification and output: An elastic half-space deformation forward model was constructed from the same elastic parameters to obtain the forward deformation field of the physical model. Fifty Monte Carlo dropout (random inactivation) inference passes were performed on the deep learning network to obtain the prediction result set, and the pixel-level mean and variance vector were computed. Adaptive fusion weights were set to obtain the final fused three-dimensional deformation field. A pixel-level deformation confidence index was constructed, mapping the variance to a confidence value between 0 and 1. Finally, a quantitative deformation result map with confidence values was output for the monitoring area. The map clearly identified three core landslide deformation zones, with a maximum vertical deformation of 15.2 mm/month, an east-west deformation of 8.5 mm/month, and a north-south deformation of 6.3 mm/month. The confidence of the core deformation zones was greater than 0.9, providing accurate data support for the risk level classification and early warning of landslide hazards in this area.

[0108] Example 2:

[0109] This embodiment uses a land subsidence area in a plain in eastern China as the monitoring area, covering approximately 200 km². The main geological hazard is regional land subsidence caused by groundwater over-extraction. The deformation identification method of this invention is used, and the differences from Example 1 are as follows:

[0110] During the data acquisition phase, SAR satellite data from three different orbits were used to further improve the geometric redundancy of the three-dimensional deformation inversion; 0.5m resolution panchromatic optical images were used to improve the identification accuracy of urban area boundaries and construct a time-series observation set.

[0111] In the elastic mechanical constraint stage, a layered medium model is used to construct elastic regularization terms to adapt to the geological structure characteristics of loose sedimentary layers in the plain area. The elastic parameters are set in layers according to the lithological test results of different strata.

[0112] In the uncertainty quantification stage, a Bayesian deep learning framework is used to replace the Monte Carlo dropout method to improve the accuracy of uncertainty estimation.

[0113] The remaining steps are identical to those of Example 1, and high-precision identification of ground subsidence in the plain area is achieved. The correlation coefficient between the vertical deformation obtained by inversion and the leveling measurements reaches 0.96, which is significantly better than the traditional InSAR method.

[0114] The above embodiments are merely preferred embodiments of the present invention. Therefore, all equivalent changes or modifications made to the structure, features and principles described in the claims of the present invention are included within the scope of the present invention.

Claims

1. A deep learning-based InSAR geological hazard deformation identification method, characterized by comprising the following steps:
S1. Multi-source data acquisition: acquire multi-orbit InSAR data of the same monitoring area and optical image data covering the corresponding space and time; the multi-orbit InSAR data includes multiple SAR image sequences from different observation directions, and the optical image data includes multispectral or panchromatic images.
S2. Establish a multi-source data spatiotemporal registration and pixel-level fusion model: spatiotemporal normalization is applied to the multi-orbit InSAR data and optical image data obtained in S1; an image registration algorithm unifies the data from different sensors and different time phases into the same geographic coordinate system and grid space; a pixel-level multi-source feature fusion module is then constructed to jointly extract features and generate a multi-channel fusion feature map.
S3. Construct a joint extraction network for multi-scale three-dimensional deformation features: the joint extraction network adopts an encoder-decoder structure; the encoder uses a multi-branch convolutional neural network to extract InSAR interferometric features from the different orbits and spatial context information from the optical images, and captures multi-scale deformation patterns, from local details to regional background, through a feature pyramid structure; the decoder fuses the multi-level features, enhances the feature response of areas with significant deformation through an attention mechanism, and outputs a feature tensor containing preliminary information on the direction and magnitude of surface deformation.
S4. Introduce elastic mechanical constraints to optimize the inversion of the three-dimensional deformation field: based on the deformation features extracted in S3 and the InSAR deformation measurement equations of the multiple observation geometries, a loss function containing elastic mechanical priors is constructed, and a deep learning network inverts the surface deformation components in the east-west, north-south, and vertical directions.
S5. Fuse the physical model with the deep learning output and quantify uncertainty: the deformation forward modeling results are fused with the neural network prediction results; the uncertainty of the deformation prediction is estimated using a Bayesian deep learning framework or the Monte Carlo dropout method; finally, the three-dimensional deformation of each pixel and its corresponding confidence assessment are output, forming a quantitative deformation result map that can be used for geological hazard risk assessment.
During the multi-source data acquisition in S1, a unified data acquisition and organization mechanism is constructed for the same monitoring area by scheduling at least two synthetic aperture radar (SAR) observation tasks with different orbital parameters, and the geometric sensitivity of the multi-orbit InSAR observations is quantitatively characterized by defining an orbital observation sensitivity vector: $$S_k = \left(\sin\theta_k\cos\varphi_k,\ \sin\theta_k\sin\varphi_k,\ \cos\theta_k\right)^{T};$$ where $S_k$ is the orbital observation sensitivity vector of the SAR imagery from the $k$-th orbit, $\theta_k$ is the incidence angle of the $k$-th orbit's SAR imagery, and $\varphi_k$ is the corresponding azimuth angle; the vector describes the orbit's combined response to the east-west, north-south, and vertical deformation components, ensuring at the data level that the multi-orbit combination can observe the three-dimensional deformation to be inverted. Simultaneously, an optical structure complexity index $C_O$ is introduced: ; in which $B$ is the total number of bands in the optical image, $I_b$ is the radiation intensity distribution of the $b$-th optical band or panchromatic image, and $\nabla$ is the spatial gradient operator; the index quantifies how strongly the boundaries and textures of ground features vary in the optical imagery, ensuring that the acquired optical data clearly distinguishes landslide, subsidence, and stable areas. Next, the multi-orbit SAR image sequences are organized under a unified time index to construct a temporally consistent observation set: $$\mathcal{D} = \left\{\left(X_{k,t},\ O_t\right)\right\}_{k,t};$$ where $X_{k,t}$ denotes the SAR observation of the $k$-th orbit at time $t$, and $O_t$ denotes the optical image corresponding to the same time.
In S4, the three-dimensional deformation initial representation tensor output by S3 is explicitly coupled with the multi-orbit InSAR observation geometry, and elastic mechanical constraints are introduced to jointly invert the three-dimensional deformation field, as follows: first, the pixel-level deformation features output by the network are mapped to a continuous three-dimensional deformation vector field $d(p)$, and a consistent observation relationship is established with the orbital observation sensitivity vectors $S_k$ of S1 to construct the multi-geometry InSAR measurement consistency constraint, where $d(p) = \left(d_E(p),\ d_N(p),\ d_U(p)\right)^{T}$ is the continuous three-dimensional deformation vector field in the east, north, and vertical directions at position $p$, and the superscript $T$ denotes the vector transpose; the pixel-level error term is defined as: $$E_{obs} = \sum_{k}\sum_{t}\left\lVert S_k^{T}\,d(p) - d^{LOS}_{k,t}(p)\right\rVert^{2};$$ where $d^{LOS}_{k,t}(p)$ denotes the InSAR line-of-sight deformation of the $k$-th orbit at time $t$ at pixel $p$, and $\lVert\cdot\rVert^{2}$ is the error term of the iteratively weighted least-squares formulation. Next, elasticity priors are introduced to constrain the spatial continuity and physical plausibility of the deformation field; by imposing a strain energy density constraint on the three-dimensional deformation field, its elastic regularization term is defined as: $$E_{elas} = \int_{\Omega}\left(\lambda\,\big(\nabla\cdot d\big)^{2} + \mu\,\big\lVert\nabla d\big\rVert^{2}\right)\mathrm{d}p;$$ where $\Omega$ denotes the spatial domain of the monitoring area; $\nabla\cdot d$ is the divergence of the three-dimensional deformation vector field, with $\nabla\cdot$ the divergence operator and $d$ the three-dimensional deformation vector field; $\nabla d$ is the gradient matrix of the three-dimensional deformation vector field; and $\lambda$ and $\mu$ are the elastic parameters. The observation consistency constraint and the elastic mechanical constraint are finally combined into an end-to-end optimizable physical constraint loss function: $$\mathcal{L} = E_{obs} + \gamma\,E_{elas},$$ where $\gamma$ is the weighting factor.
S5 is specifically as follows: first, using a medium description consistent with the elastic parameters in S4, a deformation forward modeling operator is constructed for an elastic half-space or layered medium, and the three-dimensional deformation field output by the physical model is denoted $d_{phys}(p)$; meanwhile, the set of predictions obtained by the deep learning network over multiple random-inactivation (dropout) inference passes is denoted $\{d_n(p)\}_{n=1}^{N}$, that is, the complete set of three-dimensional deformation predictions output at spatial location $p$ within the monitoring area during the 1st to $N$-th random-inactivation inference passes, where $n$ is the inference index, taking values from 1 to $N$, and $d_n(p)$ is the deformation prediction of the $n$-th inference pass; the uncertainty of the network prediction is characterized by the pixel-level mean and dispersion of this set, and the prediction mean is defined as: $$\bar{d}(p) = \frac{1}{N}\sum_{n=1}^{N} d_n(p);$$ this mean reflects the stable prediction of the deep learning model for the three-dimensional deformation after physical constraint optimization. After the prediction mean is calculated, the two types of deformation results are weighted and combined at the pixel level, and the fused deformation field is defined as: $$d_{fused}(p) = \alpha(p)\,\bar{d}(p) + \big(1-\alpha(p)\big)\,d_{phys}(p);$$ where the fusion weight $\alpha(p)$ is obtained by inversely mapping the variance of the network prediction results. After the weighted combination, the dispersion of the prediction results is quantified to construct a pixel-level deformation confidence index, which is defined as follows: ; in which $\sigma^{2}(p)$ denotes the variance vector of the three-dimensional deformation components over the multiple random inference passes.

2. The InSAR geological hazard deformation identification method based on deep learning according to claim 1, characterized in that: the jointly extracted features in S2 include radar amplitude, phase, and coherence features together with optical spectral and texture features; the deformation forward modeling results in S5 are output by an elastic half-space or layered medium model.

3. The InSAR geological hazard deformation identification method based on deep learning according to claim 1, characterized in that, during the spatiotemporal normalization process of S2, the cross-modal spatiotemporal registration error energy is defined as [formula not reproduced in the source text], wherein the double summation traverses all orbits of the multi-orbit InSAR and all observation epochs; $k$ is the multi-orbit InSAR orbit index and $t$ is the time index; $\mathbf{x}$ represents pixel coordinates in the uniform grid; $\mathcal{T}$ represents the spatial mapping operator estimated by the registration model; $R_{k,t}(\mathbf{x})$ denotes the radar observation signal of the $k$-th orbit at time $t$ and spatial location $\mathbf{x}$; $O_t(\mathcal{T}(\mathbf{x}))$ denotes the optical observation signal, at the same time, at the optical pixel that corresponds to SAR pixel $\mathbf{x}$ after mapping by $\mathcal{T}$; this energy function constrains the structural consistency of the radar observation and the optical observation at the same spatial location; adaptive modal normalization weights are introduced during pixel-level fusion: $w_m(\mathbf{x}) = \dfrac{\exp\big(s_m(\mathbf{x})\big)}{\sum_{j}\exp\big(s_j(\mathbf{x})\big)}$; wherein $w_m$ is the normalized weight of the $m$-th target term, $m$ corresponds to the radar or optical modality, $s_m$ is the raw score, which is automatically learned by the network during training and measures the contribution of each modality to the deformation representation at the current pixel, and the denominator takes the exponentials of the raw scores of all target terms and sums them; the pixel-level fused feature vector is then constructed [formula not reproduced in the source text], wherein $\mathbf{f}_m(\mathbf{x})$ denotes the standardized features extracted from the $m$-th modality at pixel $\mathbf{x}$; this fused feature vector is used to form a spatially continuous and channel-complete multi-channel fused feature map across the entire monitoring area.
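
The adaptive modal normalization in this claim amounts to a per-pixel softmax over modality scores. The following NumPy sketch shows one way it might look. The softmax follows the claim's description (exponentiate the raw scores and divide by their sum); building the fused vector by concatenating the weighted per-modality features is an assumption, since the fusion formula itself is not reproduced here, and the function names are hypothetical.

```python
import numpy as np

def modal_softmax_weights(scores):
    """scores: (M, H, W) raw per-pixel scores for M modalities.

    Returns (M, H, W) weights that sum to 1 over the modality axis.
    """
    # Subtracting the per-pixel maximum leaves the softmax unchanged
    # and avoids overflow in exp().
    shifted = scores - scores.max(axis=0, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=0, keepdims=True)

def fuse_features(features, scores):
    """features: list of M arrays, each (H, W, C_m), already standardized.

    ASSUMPTION: the fused vector concatenates the weighted modality
    features along the channel axis, giving (H, W, sum of C_m).
    """
    w = modal_softmax_weights(scores)
    weighted = [w[m][..., None] * f for m, f in enumerate(features)]
    return np.concatenate(weighted, axis=-1)
```

Concatenation keeps every modality's channels in the output, which is one reading of "channel-complete"; a weighted sum would be an alternative when all modalities share the same channel count.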

4. The InSAR geological hazard deformation identification method based on deep learning according to claim 3, characterized in that, when constructing the joint extraction network for multi-scale three-dimensional deformation features, S3 takes the multi-channel fused feature map output by S2 as a unified input; the overall design adopts an encoder-decoder structure, in which a multi-layer feature pyramid is constructed within the encoder and features at different resolutions are aggregated through a scale correlation operator, whose scale response is [formula not reproduced in the source text], wherein the formula contains the encoded feature of the $l$-th feature pyramid layer at pixel $\mathbf{x}$ and a scale-dependent learnable weight; a deformation saliency attention mechanism is introduced in the decoder stage to enhance the feature responses of potential geological hazard areas, with attention weights [formula not reproduced in the source text], wherein the formula contains the attention mapping parameters and a normalized activation function; the initial three-dimensional deformation representation tensor is then extracted at the decoding output of the network, with pixel-level expression [formula not reproduced in the source text], wherein the formula contains the deformation mapping matrix learned by the network; this tensor encodes both deformation amplitude and direction information in the feature space.
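
The three stages in this claim (scale aggregation, saliency attention, deformation mapping) can be sketched in NumPy as follows. Every concrete form here is an assumption made for illustration, because the claim's formulas are not reproduced: the aggregation is taken as a weighted sum of nearest-neighbour upsampled pyramid levels, the normalized activation as a sigmoid, and the deformation mapping as a per-pixel linear map to three components. All function and parameter names are hypothetical.

```python
import numpy as np

def upsample_nearest(feat, factor):
    """Nearest-neighbour upsampling of an (H, W, C) map by an integer factor."""
    return feat.repeat(factor, axis=0).repeat(factor, axis=1)

def aggregate_pyramid(levels, scale_weights):
    """levels[l]: (H / 2**l, W / 2**l, C) encoder features.

    ASSUMPTION: scale response = weighted sum of levels brought to
    full resolution, one learnable scalar weight per level.
    """
    out = np.zeros_like(levels[0], dtype=float)
    for l, (feat, a) in enumerate(zip(levels, scale_weights)):
        out += a * upsample_nearest(feat, 2 ** l)
    return out

def saliency_attention(feat, w_att):
    """ASSUMPTION: sigmoid of a per-pixel linear projection.

    feat: (H, W, C), w_att: (C,) -> attention map (H, W, 1) in (0, 1).
    """
    a = 1.0 / (1.0 + np.exp(-(feat @ w_att)))
    return a[..., None]

def deformation_tensor(feat, w_att, w_def):
    """Map attended features to three deformation components.

    w_def: (3, C) hypothetical deformation mapping matrix -> (H, W, 3).
    """
    attended = saliency_attention(feat, w_att) * feat
    return attended @ w_def.T
```

In a trained network the scale weights, `w_att`, and `w_def` would be learned parameters; fixed arrays are used here only so the data flow between the three stages is visible.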