Disaster risk identification and early warning method and system based on multi-modal remote sensing image data

By applying registration, quality assessment, and a hierarchical gated hybrid expert (mixture-of-experts) model to multimodal remote sensing images, the invention addresses data quality fluctuations and inconsistent modal responses in disaster risk identification, with the stated aim of stable and reliable disaster risk identification and early warning.

CN122244625A · Pending · Publication Date: 2026-06-19 · CHEM IND GEOTECHN ENG

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Applications(China)
Current Assignee / Owner: CHEM IND GEOTECHN ENG
Filing Date: 2026-05-22
Publication Date: 2026-06-19

Application Information

IPC: G06V10/80; G06V10/98; G06V10/82; G06T7/30; G06N3/045; G06N3/0442

AI Tagging

Application Domain

Image analysis; Character and pattern recognition

Abstract

This invention discloses a disaster risk identification and early warning method and system based on multimodal remote sensing image data, belonging to the field of remote sensing image processing. Multimodal remote sensing images suffer from large quality fluctuations and inconsistent modal responses caused by cloud cover, nighttime imaging, speckle noise, and geometric distortion, which lead to false alarms and missed alarms during fusion and leave reliability unquantified. To address these problems, the invention performs the following steps: multimodal image registration preprocessing; quality assessment that outputs a multidimensional quality vector and a quality uncertainty map; generation of gating weights by a hierarchical gated hybrid expert (mixture-of-experts) model, whose expert branches output disaster risk evidence maps; generation of a conflict map from evidence differences and secondary adjustment of the gating weights based on quality uncertainty; two-stage evidence fusion within and between modalities, with output of risk and uncertainty maps; confidence calibration grouped by quality and conflict; and generation of early warning results. The stated technical effects are adaptive selection of reliable information sources, explicit confidence and conflict-area prompts, and improved stability and accuracy of early warnings.

Description

Technical Field

[0001] This invention relates to the field of remote sensing image processing, and in particular to a method and system for disaster risk identification and early warning based on multimodal remote sensing image data.

Background Technology

[0002] Disaster risk identification and early warning is one of the important directions of remote sensing applications. With the development of satellite and airborne remote sensing sensors, optical remote sensing images are widely used for monitoring disasters such as floods, landslides, and forest fires due to their intuitive texture and spectral information; synthetic aperture radar remote sensing images have all-weather, day-and-night imaging capabilities and can acquire surface information under conditions of cloud cover or nighttime; thermal infrared remote sensing images can reflect surface thermal anomalies and temperature field changes, and are often used for fire monitoring and high-temperature anomaly identification. To improve the completeness and robustness of disaster identification, existing technologies generally adopt multimodal remote sensing image joint analysis, and are gradually evolving from traditional threshold discrimination and rule-based methods to feature extraction and fusion methods based on machine learning and deep learning, including pixel-level fusion, feature-level fusion, decision-level fusion, and adaptive weighted fusion based on attention mechanisms.

[0003] However, existing technologies still have shortcomings in multimodal disaster risk identification and early warning. First, images of different modalities are significantly affected by cloud and fog obstruction, low illumination, speckle noise, geometric distortion, and changes in imaging conditions, so data quality and availability fluctuate greatly across regions and time periods. Existing fusion methods struggle to adaptively select the more reliable information source at each spatial location, which easily leads to false alarms, missed alarms, or boundary deviations. Second, different modalities may respond inconsistently or even contradictorily to the same ground feature or disaster phenomenon. Existing methods typically lack mechanisms to explicitly characterize and suppress inter-modal conflicts, so the fusion results are not sufficiently stable. Third, existing identification results mainly output categories or risk levels directly, without quantifying or calibrating the reliability of the output. This makes it difficult to provide reliable confidence information and conflict-area indications when issuing early warnings, which limits their usefulness for operational decisions and emergency response.

[0004] Therefore, there is a need for a disaster risk identification and early warning method and system that can overcome the shortcomings of the existing technologies.

Summary of the Invention

[0005] One objective of this invention is to propose a disaster risk identification and early warning method and system based on multimodal remote sensing image data. In the joint application of multimodal images (optical, synthetic aperture radar, and thermal infrared), existing technologies face three problems: cloud cover, low illumination, speckle noise, and geometric distortion cause large spatiotemporal fluctuations in data quality; inconsistent responses between modalities cause conflicts; and fusion results lack a reliable quantitative expression of their reliability. To address these problems, this invention proposes a technical solution including: image registration preprocessing; quality assessment outputting a multidimensional quality vector and a quality uncertainty map; using a hierarchical gated hybrid expert (mixture-of-experts) model to generate gating weights and output disaster risk evidence maps; generating a conflict map based on evidence differences and performing secondary adjustment of the gating weights based on quality uncertainty; performing two-stage evidence fusion within and between modalities and outputting risk and uncertainty maps; and performing confidence calibration grouped by quality and conflict and generating early warning results. The technical effects of this invention are adaptive selection of reliable information sources, suppression of the impact of modal conflicts, fewer false alarms and missed alarms, and output of confidence levels and conflict-area prompts that can be used when issuing early warnings, thereby improving the stability and accuracy of early warnings.

[0006] This invention provides a disaster risk identification and early warning method based on multimodal remote sensing image data, including:
S1. Acquire remote sensing image data of at least two different modalities for the same monitoring area within the same monitoring period, and preprocess them to obtain spatially registered images;
S2. Use a quality assessment model to output a multidimensional quality vector and a quality uncertainty map for each modality's remote sensing image;
S3. Based on the spatially registered images, the multidimensional quality vectors, and the quality uncertainty maps, generate gating weights using a hierarchical gated hybrid expert (mixture-of-experts) model, in which at least two expert branches are set for each modality and each expert branch outputs a disaster risk evidence map;
S4. Based on the disaster risk evidence maps, calculate the evidence differences between the different modalities to generate a conflict map, and adjust the gating weights in combination with the quality uncertainty maps to reweight the disaster risk evidence maps;
S5. Under the adjusted gating weights, perform two-stage evidence fusion on the disaster risk evidence maps to obtain a fused evidence map, where the first stage fuses within each modality and the second stage fuses across modalities; output a disaster risk map and an uncertainty map based on the fused evidence map; and calibrate the posterior probability in groups defined by the multidimensional quality vectors and the conflict map to obtain a confidence map;
S6. Generate an early warning result based on the disaster risk map, the confidence map, and the conflict map.

[0007] Optionally, S1 includes: acquiring multimodal remote sensing images of the same monitoring area within the same monitoring period, together with the imaging time, sensor parameters, and geolocation parameters of each modality's image, wherein the imaging time difference between the multimodal remote sensing images does not exceed a preset threshold Δt. Radiometric consistency processing is performed on each modality's image; it includes radiometric calibration and radiometric normalization based on the same reference image, where the radiometric normalization is either histogram matching normalization or linear regression normalization. Geometric correction is then performed on each modality's image after radiometric consistency processing, including orthorectification based on the geolocation parameters. After geometric correction, the images of all modalities are unified to the same coordinate reference and the same spatial resolution, and spatial registration is performed on the same spatial grid to obtain the registered image set. The images of all modalities in the registered image set correspond one-to-one in pixel position.

[0008] Optionally, S2 includes: for each modality's remote sensing image in the registered image set, a quality feature map is extracted using the quality assessment model, and a multidimensional quality vector and a quality uncertainty map for that image are generated from the quality feature map. The multidimensional quality vector is obtained by spatial pooling of the quality feature map and includes at least one of: optical quality indicators characterizing the degree of cloud and fog obstruction, the degree of shadow, and the degree of low illumination; synthetic aperture radar quality indicators characterizing the intensity of speckle noise and the degree of geometric distortion; and thermal infrared quality indicators characterizing the degree of thermal infrared saturation, the thermal infrared noise intensity, the thermal infrared stripe artifact intensity, and the degree of abnormal high-temperature artifacts. The quality uncertainty map characterizes the spatial uncertainty of the quality indicators corresponding to the multidimensional quality vector. The quality vector set is obtained by collecting the multidimensional quality vectors of all modalities, and the quality uncertainty map set is obtained by collecting the quality uncertainty maps of all modalities.

[0009] Optionally, S3 includes: in the hierarchical gated hybrid expert (mixture-of-experts) model, at least two expert branches are set for each modality's remote sensing image in the registered image set; a corresponding modal feature map is generated from each modality's remote sensing image; and a first gating weight set is generated based on the quality vector set and the quality uncertainty map set. The first gating weight set includes a modal gating weight for each modality and an expert gating weight for each expert branch of the same modality. The modal gating weights and the expert gating weights are determined by a monotonically decreasing function of the quality uncertainty, so that the greater the quality uncertainty, the smaller the corresponding gating weight. The monotonically decreasing function includes at least one of the following: an exponential decay form, whose output is the natural constant e raised to the power of negative λ times the quality uncertainty, where λ is a parameter greater than zero; a reciprocal decay form, whose output is the ratio of one to one plus λ times the quality uncertainty, where λ is a parameter greater than zero; and a softmax-based normalization form, which first maps the quality uncertainty to a gating score negatively correlated with it and then applies softmax normalization to the gating scores to obtain the gating weights. The modal feature maps are then fused by hierarchical weighting according to the first gating weight set, and each expert branch outputs its disaster risk evidence map, yielding the first evidence map set.
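The three monotonically decreasing forms named in this paragraph can be sketched in NumPy as follows. This is a minimal illustration rather than the applicant's implementation: the function names, the array layout (modality axis first), the choice of `-u` as the gating score, and the normalization of the exponential and reciprocal forms across modalities are all assumptions made for the example.

```python
import numpy as np

def exp_decay(u, lam=1.0):
    """Exponential decay form: e^(-lam * u), with lam > 0."""
    return np.exp(-lam * u)

def reciprocal_decay(u, lam=1.0):
    """Reciprocal decay form: 1 / (1 + lam * u), with lam > 0."""
    return 1.0 / (1.0 + lam * u)

def softmax_gate(u, axis=0):
    """Softmax form: map uncertainty to a negatively correlated score
    (here simply -u, an assumed choice), then softmax over `axis`."""
    score = -u
    score = score - score.max(axis=axis, keepdims=True)  # numerical stability
    e = np.exp(score)
    return e / e.sum(axis=axis, keepdims=True)

def normalise(w, axis=0):
    """Rescale weights so they sum to one over the modality axis (assumed step)."""
    return w / w.sum(axis=axis, keepdims=True)

# Toy quality uncertainty maps: 3 modalities on a 2 x 2 pixel grid.
u = np.stack([
    np.full((2, 2), 0.1),   # modality 0: low uncertainty
    np.full((2, 2), 1.0),   # modality 1: medium uncertainty
    np.full((2, 2), 3.0),   # modality 2: high uncertainty
])

w_exp = normalise(exp_decay(u, lam=1.0))
w_rec = normalise(reciprocal_decay(u, lam=1.0))
w_soft = softmax_gate(u)
```

With the score chosen as `-u` and λ = 1, the normalized exponential form and the softmax form coincide; a learned score mapping would generally break that equivalence.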

[0010] Optionally, S4 includes: based on the first evidence map set, the evidence difference between the modalities is calculated at each pixel position. The evidence difference is any one of Jensen-Shannon divergence, Kullback-Leibler divergence, L1 distance, and L2 distance. The evidence differences are spatially aggregated within a preset neighborhood window to generate the conflict map; the spatial aggregation is either mean pooling or max pooling. High-conflict pixels are pixels in the conflict map whose conflict value is greater than a preset threshold Tconf, which is either a fixed threshold or an adaptive threshold determined from statistical quantiles of the conflict values. The first gating weight set is subjected to suppressive adjustment based on the conflict map to generate a second gating weight set. The suppressive adjustment includes reducing the gating weights at high-conflict pixel positions and reducing the gating weights at the high-uncertainty pixel positions indicated by the quality uncertainty map set. High-uncertainty pixels are those whose quality uncertainty in the quality uncertainty map is greater than a threshold Tu, which is either a fixed threshold or an adaptive threshold determined from statistical quantiles of the quality uncertainty. The second gating weight set is obtained by multiplying the first gating weight set by the conflict attenuation factor and the uncertainty attenuation factor and then normalizing the result. The second gating weight set includes modal gating weights for the different modalities and expert gating weights for each expert branch of the same modality.
The conflict attenuation factor is a monotonically decreasing function of the conflict value, and the uncertainty attenuation factor is a monotonically decreasing function of the quality uncertainty. Each of the two factors includes at least one of the following: an exponential attenuation form, whose output is the natural constant e raised to the power of negative λ times the input value, where λ is a parameter greater than zero; a reciprocal attenuation form, whose output is the ratio of one to one plus λ times the input value, where λ is a parameter greater than zero; and a softmax-based normalization form, which first maps the input value to an attenuation score negatively correlated with it and then applies softmax normalization to the attenuation scores to obtain the attenuation factor. The first evidence map set is then reweighted according to the second gating weight set to generate the second evidence map set. Furthermore, the conflict map includes a conflict type discrimination map that distinguishes quality-driven conflict from semantic-driven conflict. A quality-driven conflict pixel is one whose conflict value and quality uncertainty are both greater than the corresponding thresholds. A semantic-driven conflict pixel is one whose conflict value is greater than the corresponding threshold while its quality uncertainty is not. The reduction in modal gating weight applied to semantic-driven conflict pixels is smaller than the reduction applied to quality-driven conflict pixels.
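The conflict modeling and secondary gating adjustment described for S4 can be sketched as below. This is an illustrative reading under stated assumptions, not the patented implementation: it uses Jensen-Shannon divergence between two modalities, mean pooling over a 3 x 3 window, fixed thresholds, and exponential attenuation. The function names, the per-modality interpretation of the conflict type, and the parameters `lam_quality`, `lam_semantic`, and `lam_u` are hypothetical.

```python
import numpy as np

def js_divergence(p, q, eps=1e-12):
    """Pixel-wise Jensen-Shannon divergence.
    p, q: class-probability maps of shape (K, H, W) that sum to 1 over axis 0."""
    p = np.clip(p, eps, 1.0)
    q = np.clip(q, eps, 1.0)
    m = 0.5 * (p + q)
    kl_pm = np.sum(p * np.log(p / m), axis=0)
    kl_qm = np.sum(q * np.log(q / m), axis=0)
    return 0.5 * (kl_pm + kl_qm)

def mean_pool(x, k=3):
    """Mean pooling over a k x k neighbourhood with edge padding (same output shape)."""
    pad = k // 2
    xp = np.pad(x, pad, mode="edge")
    out = np.zeros(x.shape, dtype=float)
    for dy in range(k):
        for dx in range(k):
            out += xp[dy:dy + x.shape[0], dx:dx + x.shape[1]]
    return out / (k * k)

def second_gates(w, conflict, u, t_conf, t_u,
                 lam_quality=2.0, lam_semantic=0.5, lam_u=1.0):
    """Suppressive adjustment of modal gating weights.
    w, u: (M, H, W) first gating weights and quality uncertainty maps.
    conflict: (H, W) conflict map.
    Assumption: the conflict type is decided per modality, so a modality that is
    itself uncertain at a high-conflict pixel is attenuated more strongly."""
    high_conf = (conflict > t_conf)[None]          # (1, H, W)
    high_unc = u > t_u                             # (M, H, W)
    lam_c = np.where(high_conf & high_unc, lam_quality,
                     np.where(high_conf, lam_semantic, 0.0))
    conflict_factor = np.exp(-lam_c * conflict[None])
    uncertainty_factor = np.exp(-lam_u * u)
    w2 = w * conflict_factor * uncertainty_factor
    return w2 / w2.sum(axis=0, keepdims=True)      # renormalise over modalities

# Toy case: 2 modalities, 2 risk classes, 4 x 4 pixels.
p0 = np.empty((2, 4, 4)); p0[0], p0[1] = 0.9, 0.1
p1 = p0.copy()
p1[0, :2, :2], p1[1, :2, :2] = 0.1, 0.9           # modality 1 disagrees in one corner
u = np.full((2, 4, 4), 0.1)
u[1, :2, :2] = 2.0                                 # and is uncertain there
w1 = np.full((2, 4, 4), 0.5)

conflict = mean_pool(js_divergence(p0, p1), k=3)
w2 = second_gates(w1, conflict, u, t_conf=0.2, t_u=1.0)
```

A uniform attenuation factor shared by all modalities would cancel out in the final normalization, which is why this sketch lets the factor differ per modality.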

[0011] Optionally, S5 includes: two-stage evidence fusion is performed based on the second evidence map set. In the first stage, for each modality, the disaster risk evidence maps of that modality's expert branches are fused according to the expert gating weights in the second gating weight set to obtain the modal evidence map of that modality. In the second stage, the modal evidence maps of all modalities are fused according to the modal gating weights in the second gating weight set to obtain the fused evidence map. The fusion operation includes at least one of: weighted summation or weighted averaging of the evidence vectors according to the corresponding gating weights; weighted superposition of the Dirichlet distribution parameter vectors according to the corresponding gating weights; and the Dempster-Shafer evidence combination rule. The disaster risk evidence map contains an evidence vector for each pixel. Each component of the evidence vector corresponds to a disaster risk level and is non-negative. A Dirichlet distribution parameter vector is constructed from the evidence vector.
The amount of evidence is the sum of the components of the Dirichlet distribution parameter vector, or a monotonic function of that sum. The disaster risk map is determined from the normalized fused Dirichlet distribution parameter vector. The uncertainty map is determined from the fused evidence map, and the uncertainty includes at least one of: an uncertainty based on the Shannon entropy of the probability distribution over disaster risk levels, and an uncertainty based on the reciprocal of the sum of the components of the Dirichlet distribution parameter vector. The pixels in the disaster risk map are divided into multiple groups based on the quality vector set and the conflict map, and the posterior probability of each group is calibrated using that group's calibration parameters to obtain the confidence map. The posterior probability is the expected probability of the disaster risk level distribution obtained from the fused Dirichlet distribution parameter vector, or the maximum class probability in that distribution. The group calibration parameters are learned offline from historical labeled samples or a validation set. The offline learning uses any one of temperature scaling, Platt scaling, and isotonic regression, with the goal of minimizing the negative log-likelihood loss or the expected calibration error. Furthermore, the two-stage evidence fusion includes quality prior injection: a Dirichlet prior parameter vector is determined for each modality based on its multidimensional quality vector and is added to the Dirichlet distribution parameter vector constructed from the disaster risk evidence map before the fusion operation, so that higher-quality modalities contribute more evidence after fusion.
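The two-stage fusion and the two uncertainty measures described for S5 can be sketched as follows. This is a minimal sketch under assumptions: it uses the weighted-summation option for fusion, builds the Dirichlet parameters as evidence plus one (the common subjective-logic convention, which the text does not spell out), and scales the reciprocal-of-sum uncertainty by the number of classes. The function names and array layouts are hypothetical, and group-wise calibration is not shown.

```python
import numpy as np

def weighted_sum(evidence, weights):
    """Weighted summation over the leading axis.
    evidence: (N, K, H, W) non-negative evidence vectors; weights: (N, H, W)."""
    return np.sum(weights[:, None] * evidence, axis=0)

def two_stage_fusion(evidence, expert_w, modal_w):
    """evidence: (M, E, K, H, W); expert_w: (M, E, H, W); modal_w: (M, H, W).
    Stage 1 fuses the experts within each modality, stage 2 fuses across modalities."""
    modal_evidence = np.stack([
        weighted_sum(evidence[m], expert_w[m]) for m in range(evidence.shape[0])
    ])
    return weighted_sum(modal_evidence, modal_w)

def risk_and_uncertainty(fused_evidence):
    """Derive the risk and uncertainty maps from fused evidence of shape (K, H, W)."""
    alpha = fused_evidence + 1.0                 # assumed: alpha = evidence + 1
    strength = alpha.sum(axis=0)                 # amount of evidence
    prob = alpha / strength                      # expected class probabilities
    risk_level = prob.argmax(axis=0)             # disaster risk map
    u_evidence = alpha.shape[0] / strength       # reciprocal-of-sum uncertainty (scaled by K)
    u_entropy = -(prob * np.log(prob)).sum(axis=0)  # Shannon-entropy uncertainty
    return prob, risk_level, u_evidence, u_entropy

# Toy case: 2 modalities, 2 experts each, 3 risk levels, a single pixel.
evidence = np.zeros((2, 2, 3, 1, 1))
evidence[0, :, 0] = 10.0      # modality 0: strong evidence for level 0
evidence[1, :, 2] = 2.0       # modality 1: weak evidence for level 2
expert_w = np.full((2, 2, 1, 1), 0.5)
modal_w = np.array([0.8, 0.2]).reshape(2, 1, 1)

fused = two_stage_fusion(evidence, expert_w, modal_w)
prob, risk_level, u_evidence, u_entropy = risk_and_uncertainty(fused)
```

In the toy case the fused evidence is (8, 0, 0.4), so the higher-weighted modality dominates the risk level while the disagreement still leaves non-zero uncertainty.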

[0012] Optionally, S6 includes: based on the disaster risk map, early warning areas are extracted and the early warning level of each area is determined. The confidence level of each warning area is determined from the confidence map as a statistic of the pixel confidence values within the area. Based on the conflict map, conflict area prompts are extracted; each prompt covers the area formed by high-conflict pixels in the conflict map. The warning area, the warning level, the confidence level, and the conflict area prompt are compiled and output according to preset warning release rules to generate the warning result. The preset warning release rules include comparing the warning level with a preset level threshold and comparing the confidence level with a preset confidence threshold. The warning result is output only when both the level release condition and the confidence release condition are met.
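The release rule in S6 can be illustrated with a short sketch. It is deliberately simplified and rests on assumptions: the whole raster is treated as one warning area (no connected-component labelling), the area confidence is the mean pixel confidence, and the names `WarningResult`, `make_warning`, `min_level`, `level_thr`, `conf_thr`, and `t_conf` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

import numpy as np

@dataclass
class WarningResult:
    level: int                 # warning level of the area
    confidence: float          # statistic (mean) of pixel confidence in the area
    n_pixels: int              # size of the warning area
    conflict_fraction: float   # share of the area flagged as high conflict

def make_warning(risk_level, confidence, conflict,
                 min_level=1, level_thr=2, conf_thr=0.7,
                 t_conf=0.2) -> Optional[WarningResult]:
    """Return a warning only if both the level and the confidence conditions hold."""
    area = risk_level >= min_level               # assumed: any non-zero risk forms the area
    if not area.any():
        return None
    level = int(risk_level[area].max())
    area_conf = float(confidence[area].mean())
    if level < level_thr or area_conf < conf_thr:
        return None                              # release conditions not met
    conflict_fraction = float((conflict[area] > t_conf).mean())
    return WarningResult(level, area_conf, int(area.sum()), conflict_fraction)

risk_level = np.array([[0, 0], [2, 3]])
confidence = np.array([[0.9, 0.9], [0.8, 0.9]])
conflict = np.array([[0.0, 0.0], [0.5, 0.0]])

released = make_warning(risk_level, confidence, conflict)
withheld = make_warning(risk_level, confidence, conflict, conf_thr=0.9)
```

Here `released` carries level 3 with a mean confidence of about 0.85, while raising the confidence threshold to 0.9 withholds the warning.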

[0013] In another aspect, the present invention also provides a disaster risk identification and early warning system based on multimodal remote sensing image data, comprising the following modules: an image generation module, which acquires remote sensing images of at least two different modalities for the same monitoring area and monitoring period and generates spatially registered images; a quality assessment module, which outputs a multidimensional quality vector and a quality uncertainty map for each modality; an expert module, which generates gating weights based on the spatially registered images, the multidimensional quality vectors, and the quality uncertainty maps, and sets at least two expert branches for each modality to output disaster risk evidence maps; a gating adjustment module, which generates a conflict map from the disaster risk evidence maps and adjusts the gating weights in combination with the quality uncertainty maps to reweight the disaster risk evidence maps; a fusion and calibration module, which performs intra-modal and inter-modal fusion under the adjusted gating weights to obtain a fused evidence map, and outputs a disaster risk map, an uncertainty map, and a confidence map obtained by calibration in groups defined by the multidimensional quality vectors and the conflict map; and an early warning generation module, which outputs early warning results based on the disaster risk map, the confidence map, and the conflict map.

[0014] The beneficial effects of this invention are: 1. The quality assessment model outputs a multidimensional quality vector and a quality uncertainty map, and the hierarchical gated hybrid expert (mixture-of-experts) model generates gating weights that decrease monotonically with quality uncertainty. This enables the fusion process to adaptively suppress low-quality modalities and unreliable expert branches at the pixel scale, thereby improving the stability and accuracy of disaster risk identification under cloud cover, low illumination, speckle noise, and geometric distortion, and reducing false alarms and missed alarms.

[0015] 2. By calculating evidence differences from the disaster risk evidence maps to generate a conflict map, and by using the conflict map together with the quality uncertainty map to make a secondary adjustment to the gating weights, the system can explicitly characterize and suppress contradictory information between modalities, reduce boundary deviations in the fusion result, and output conflict area prompts, thereby improving the interpretability and operational usability of the early warning conclusions.

[0016] 3. Two-stage evidence fusion, first within and then between modalities, yields a fused evidence map, from which an evidence-based uncertainty map is output. In addition, the posterior probability is calibrated in groups defined by the multidimensional quality vectors and the conflict map. This quantifies the reliability of the output and provides a confidence map that supports the formulation of early warning release rules and emergency response decisions.

Description of the Drawings

[0017] The accompanying drawings are provided to further illustrate the invention and form part of the specification. They are used in conjunction with embodiments of the invention to explain the invention and do not constitute a limitation thereof. In the drawings:
Figure 1 is a flowchart of the overall process of the disaster risk identification and early warning method based on multimodal remote sensing image data.
Figure 2 is a flowchart of step S4 of the present invention, which covers multimodal evidence conflict modeling and secondary adjustment of the gating weights.

Detailed Description

[0018] The present invention will now be described in further detail with reference to the accompanying drawings. These drawings are simplified schematic diagrams, illustrating only the basic structure of the invention, and therefore only show the components relevant to the invention.

[0019] Referring to Figures 1-2, the disaster risk identification and early warning method based on multimodal remote sensing image data includes:
S1. Acquire remote sensing image data of at least two different modalities for the same monitoring area within the same monitoring period, and preprocess them to obtain spatially registered images;
S2. Use a quality assessment model to output a multidimensional quality vector and a quality uncertainty map for each modality's remote sensing image;
S3. Based on the spatially registered images, the multidimensional quality vectors, and the quality uncertainty maps, generate gating weights using a hierarchical gated hybrid expert (mixture-of-experts) model, in which at least two expert branches are set for each modality and each expert branch outputs a disaster risk evidence map;
S4. Based on the disaster risk evidence maps, calculate the evidence differences between the different modalities to generate a conflict map, and adjust the gating weights in combination with the quality uncertainty maps to reweight the disaster risk evidence maps;
S5. Under the adjusted gating weights, perform two-stage evidence fusion on the disaster risk evidence maps to obtain a fused evidence map, where the first stage fuses within each modality and the second stage fuses across modalities; output a disaster risk map and an uncertainty map based on the fused evidence map; and calibrate the posterior probability in groups defined by the multidimensional quality vectors and the conflict map to obtain a confidence map;
S6. Generate an early warning result based on the disaster risk map, the confidence map, and the conflict map.

[0020] In this specific embodiment, S1 includes: the system simultaneously acquires optical remote sensing images, synthetic aperture radar remote sensing images, and thermal infrared remote sensing images of the same monitoring area within the same monitoring period as the multimodal remote sensing images. It also reads the image metadata to obtain the imaging time, sensor parameters, and geolocation parameters of each modality's image, where $m$ denotes the modality index and corresponds one-to-one with the modalities. Imaging temporal consistency is constrained by the threshold $\Delta t$, with $\Delta t$ set to 2 hours. The imaging time difference of each modality is calculated from the timestamps in the metadata; if the imaging time difference of any modality is greater than 2 hours, that modality's image is excluded from the current processing run to ensure that the information within the same monitoring period is comparable.
Radiometric consistency processing is performed on each modality's image that takes part in the processing. Radiometric calibration is performed pixel by pixel using the gain and bias provided in the metadata and satisfies $L_{m,b}(p) = G_{m,b} \cdot DN_{m,b}(p) + B_{m,b}$, where $L_{m,b}(p)$ is the calibrated radiometric value of band $b$ of modality $m$ at pixel position $p$; $G_{m,b}$ is the radiometric calibration gain coefficient of band $b$ of modality $m$; $DN_{m,b}(p)$ is the original digital quantization value of band $b$ of modality $m$ at pixel position $p$; $B_{m,b}$ is the radiometric calibration bias coefficient of band $b$ of modality $m$; $b$ is the band index, determined by the band configuration of each sensor; and $p$ is the pixel position index, corresponding one-to-one with the row and column coordinates of the image.
After radiometric calibration, the optical remote sensing image is used as the reference image, and radiometric normalization against this reference is performed on the remaining modalities. The radiometric normalization uses histogram matching, applied separately to each modality and band. The number of histogram bins $N_{bin}$ is set to 2048, where $N_{bin}$ is the number of discrete bins spanning the range of grayscale or radiometric values of each band. The cumulative histograms of the corresponding bands of the image to be normalized and of the reference image are computed within the same monitoring area, and a monotonic mapping is constructed so that the band values of the image to be normalized follow a cumulative distribution consistent with that of the reference image, eliminating radiometric scale differences across sensors and imaging conditions.
After radiometric consistency processing, geometric correction is performed on each modality's image. The geometric correction uses orthorectification based on the geolocation parameters, combining orbit, attitude, and sensor imaging geometry parameters with a digital elevation model (DEM) to establish the projection relationship between pixels and ground coordinates. The DEM spatial resolution $r_{DEM}$ is 30 m, where $r_{DEM}$ is the ground sampling interval of the DEM grid. Each pixel in the orthorectified output is assigned ground coordinates under a unified coordinate reference, and the geometric distortion caused by terrain relief and side-looking imaging is removed. After orthorectification, the images of all modalities are unified to the same coordinate reference and the same spatial resolution $r_{out}$ of 10 m, where $r_{out}$ is the ground sampling interval of the output image.
Each modality's image is resampled to the target resolution using bilinear resampling and cropped to the same monitoring area boundary. Finally, with the projection grid of the reference image as the unified spatial grid, the remaining modalities' images are resampled onto this grid through grid alignment under the same coordinate reference, yielding the registered image set, in which the images of all modalities correspond one-to-one in pixel position.
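The radiometric calibration and histogram-matching steps of this embodiment can be sketched in NumPy as follows. This is a minimal sketch under assumptions: calibration is taken as gain times digital number plus bias, matching is done through 2048-bin cumulative histograms, and empty bins are dropped so that both cumulative curves are strictly increasing for interpolation. The function names are hypothetical, and orthorectification and resampling are not shown.

```python
import numpy as np

def radiometric_calibration(dn, gain, bias):
    """Per-pixel linear calibration: gain * DN + bias (coefficients from metadata)."""
    return gain * np.asarray(dn, dtype=np.float64) + bias

def _cumulative_histogram(values, n_bins):
    """Return bin centres and cumulative frequencies, keeping non-empty bins only."""
    hist, edges = np.histogram(values, bins=n_bins)
    keep = hist > 0
    cdf = np.cumsum(hist)[keep] / values.size
    centers = (0.5 * (edges[:-1] + edges[1:]))[keep]
    return centers, cdf

def histogram_match(src, ref, n_bins=2048):
    """Map one band of `src` so that its cumulative distribution follows `ref`."""
    src_centers, src_cdf = _cumulative_histogram(src, n_bins)
    ref_centers, ref_cdf = _cumulative_histogram(ref, n_bins)
    # value -> cumulative frequency in src -> value with that frequency in ref
    quantile = np.interp(src.ravel(), src_centers, src_cdf)
    matched = np.interp(quantile, ref_cdf, ref_centers)
    return matched.reshape(src.shape)

# Synthetic single-band example (values are illustrative, not real sensor data).
rng = np.random.default_rng(0)
band_to_normalise = rng.normal(0.0, 1.0, size=(100, 200))
reference_band = rng.normal(5.0, 2.0, size=(100, 200))

calibrated = radiometric_calibration(np.array([0, 10]), gain=0.5, bias=1.0)
matched = histogram_match(band_to_normalise, reference_band)
```

Because the mapping is a composition of two non-decreasing interpolations, it preserves the rank order of pixel values within the band, which is what keeps spatial structure intact while the radiometric scale changes.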

[0021] In this specific embodiment, S2 includes: For the registered image set, the system reads the registered image of each modality one by one according to the modality index $m$ and inputs it into the quality assessment model, which outputs the quality feature map, the multidimensional quality vector, and the quality uncertainty map of that modality. The quality assessment model is an encoder-decoder convolutional neural network deployed independently for each modality, in one-to-one correspondence with the modalities of the remote sensing imagery, and it uses the same network structure and a fixed parameter scale across modalities. The encoder contains four scale levels; each level is composed of two [...] convolution, batch normalization, and ReLU activation units, and downsampling is performed by a convolution with a stride of 2. The numbers of channels of the four levels are set sequentially to [...]. The decoder contains four scale levels symmetric to the encoder, recovers spatial resolution through bilinear upsampling, and uses skip connections to fuse encoder features at the corresponding scales so that spatial detail is preserved. The network has two convolutional output heads, a quality indicator branch and an uncertainty branch, both of which use a [...] convolution to generate a multi-channel map that is output pixel by pixel. The quality indicator branch outputs the quality feature map, whose value at pixel position $p$ is the quality score vector $s_m(p)$; each component is normalized by sigmoid activation. The uncertainty branch outputs an uncertainty feature map whose channels correspond one-to-one with the quality indicators; its value at pixel position $p$ is the quality uncertainty vector, and each component passes through softplus activation so that it is positive. The number of quality indicators, denoted $K_m$, is determined by the modality and is fixed in this embodiment at 3 for the optical modality, 2 for the synthetic aperture radar modality, and 4 for the thermal infrared modality. The three components of the optical modality represent the degree of cloud and fog obstruction, the degree of shadow, and the degree of low illumination. The two components of the synthetic aperture radar modality represent the speckle noise intensity and the degree of geometric distortion. The four components of the thermal infrared modality represent thermal infrared saturation, thermal infrared noise intensity, thermal infrared stripe artifact intensity, and abnormal high-temperature artifact intensity. In every case, a larger component value indicates a more severe quality problem. The multidimensional quality vector $q_m$ is obtained from the quality feature map by global average pooling over the monitoring area and satisfies:

$$q_m = \frac{1}{|\Omega_m|} \sum_{p \in \Omega_m} s_m(p)$$

where $q_m$ is the multidimensional quality vector of modality $m$, with dimension $K_m$; $\Omega_m$ is the set of pixels of the registered image of modality $m$ that cover the monitoring area under the unified spatial grid; $|\Omega_m|$ is the number of pixels in $\Omega_m$; $p$ is the pixel position index; and $s_m(p)$ is the quality indicator score vector at pixel position $p$. The quality uncertainty map, denoted $U_m$, expresses the spatial uncertainty of the quality indicators corresponding to the multidimensional quality vector as a scalar map: the uncertainty feature map is averaged across the indicator dimension, so that each pixel position corresponds to exactly one uncertainty value and remains pixel-aligned with the subsequent gating weights. $U_m(p)$ is the uncertainty value at pixel position $p$; the larger the value, the more unstable the quality assessment at that pixel and the more strongly the pixel is suppressed in subsequent steps. After obtaining $q_m$ and $U_m$ for each modality, the system collects the multidimensional quality vectors of all modalities into the quality vector set and the quality uncertainty maps of all modalities into the quality uncertainty map set.
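The pooling and averaging steps of S2 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented network: the per-pixel head outputs are assumed to be given as nested lists, and the function names (`quality_scores`, `quality_vector`, `uncertainty_map`) are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def softplus(x):
    # Numerically stable log(1 + exp(x)).
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def quality_scores(logits):
    """Apply sigmoid per component: logits[row][col] is a list of K raw values."""
    return [[[sigmoid(v) for v in px] for px in row] for row in logits]


def quality_vector(score_map):
    """Global average pooling of per-pixel score vectors over all pixels."""
    pixels = [px for row in score_map for px in row]
    k = len(pixels[0])
    return [sum(px[i] for px in pixels) / len(pixels) for i in range(k)]


def uncertainty_map(unc_features):
    """Average across the indicator dimension: one scalar per pixel."""
    return [[sum(px) / len(px) for px in row] for row in unc_features]


# Optical modality with 3 indicators on a 1 x 2 grid (made-up values).
scores = [[[0.2, 0.4, 0.6], [0.4, 0.6, 0.8]]]
q_optical = quality_vector(scores)
u_optical = uncertainty_map([[[0.1, 0.3, 0.2], [0.2, 0.2, 0.2]]])
```

The quality vector keeps one entry per indicator, while the uncertainty map collapses the indicator dimension, which is what allows it to be combined pixel by pixel with the gating weights later.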

[0022] In this specific embodiment, S3 includes: For each modality in the registered image set, the system constructs a hierarchical gated hybrid expert model and generates a first gating weight set and a first evidence map set. The modality index $m$ is consistent with that used in S2 and corresponds to the registered image, the multidimensional quality vector $q_m$, and the quality uncertainty map $U_m$ of that modality. For the registered image of each modality, a modal feature map is generated by a modal feature extraction network, which in this embodiment consists of one input adaptation layer and three groups of residual convolutional blocks connected in series. The input adaptation layer uses a [...] convolution to map the number of channels of the registered image to 32, with a stride of 1 and zero padding to maintain the spatial size. Each residual convolutional block contains two [...] convolutions, each followed in sequence by batch normalization and ReLU activation, together with an identity residual connection. The numbers of output channels of the three groups of residual convolutional blocks are [...]. This yields modal feature maps that are aligned one-to-one with the registered images on the unified spatial grid and that serve as the input to the subsequent gating and expert branches. A first gating weight set is then generated from the quality vector set and the quality uncertainty map set. It includes modal gating weights output per pixel and expert gating weights output per pixel, where $p$ is the pixel position index under the unified spatial grid and the expert branch index $e$ takes the values 1 and 2, denoting the two expert branches of the same modality. When calculating the gating weights, the system first computes from the multidimensional quality vector $q_m$ a quality severity scalar for the modality, defined as the arithmetic mean of the components of $q_m$; the larger its value, the more serious the overall quality problem of the modality. The quality severity scalar and the quality uncertainty map $U_m$ jointly determine a gating score, from which the gating weights are generated. The calculation satisfies: [...] In the formula, the modal gating weight of modality $m$ at pixel position $p$ is normalized over the modalities in the modality set; the expert gating weight of expert branch $e$ of modality $m$ at pixel position $p$ has the value range [...]; $\exp$ is the exponential function with the natural constant as its base; the expert gating weight is computed with a logistic function; $U_m(p)$ is the uncertainty value of the quality uncertainty map at pixel position $p$; the quality severity scalar is that of modality $m$; the modality index set contains the modalities participating in this processing, and a separate modality index is used for the normalizing summation; the uncertainty decay coefficient takes [...]; the severity attenuation coefficient takes [...]; the expert uncertainty attenuation coefficient takes [...]; and the gating bias of the $e$-th expert branch takes [...]. This ensures that as $U_m(p)$ increases, the gating score decreases monotonically and the corresponding modal and expert gating weights both decrease, thereby suppressing modal and expert contributions in high-uncertainty regions at the pixel scale. After obtaining the first gating weight set, the system weights the modal feature maps according to the hierarchical gating and drives the expert branches to output disaster risk evidence maps. Specifically, at each pixel position the modal feature map is first weighted by the modal gating weight to obtain gated modal features, which are then input into the two expert branches of that modality; each branch outputs a disaster risk evidence vector per pixel, and these vectors form a disaster risk evidence map. The two expert branches use different receptive fields to achieve complementary feature extraction, and both have a 4-layer convolutional structure. The first three layers of the first expert branch are [...] convolutions with a dilation rate of 1, each followed by batch normalization and ReLU; the first three layers of the second expert branch are [...] convolutions with a dilation rate of 2, each followed by batch normalization and ReLU; the fourth layer of both expert branches is a [...] convolution that outputs the disaster risk evidence vector, with a fixed dimension of 4 corresponding to the four disaster risk levels. ReLU activation is applied at the output so that each component of the evidence vector is non-negative. Finally, the disaster risk evidence map output by each expert branch is multiplied pixel by pixel by the corresponding expert gating weight to obtain the first evidence map of that modality and expert, and the first evidence maps of all modalities and all experts are collected into the first evidence map set.
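The qualitative behaviour of the first-stage gating can be sketched as follows. The exact formula and coefficient values are not recoverable from the text above, so this is only one plausible form consistent with the description: a softmax over modalities of a score that decreases with uncertainty and severity, and a logistic expert gate with a per-branch bias. The coefficients `a_u`, `a_s`, `a_e` and the bias values are hypothetical placeholders.

```python
import math


def modal_gate_weights(unc, severity, a_u=1.0, a_s=1.0):
    """Per-pixel modal weights, normalized over modalities (assumed softmax form).

    unc[m]      -- quality uncertainty of modality m at this pixel
    severity[m] -- quality severity scalar of modality m (mean of its quality vector)
    """
    scores = [-(a_u * u + a_s * s) for u, s in zip(unc, severity)]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(sc - top) for sc in scores]
    total = sum(exps)
    return [e / total for e in exps]


def expert_gate_weights(u, biases=(0.0, 0.0), a_e=1.0):
    """Per-pixel expert weights in (0, 1), one per expert branch (assumed logistic form)."""
    return [1.0 / (1.0 + math.exp(-(b - a_e * u))) for b in biases]


# Two modalities with equal severity; the second has higher uncertainty here.
w = modal_gate_weights(unc=[0.1, 0.9], severity=[0.2, 0.2])
g_low = expert_gate_weights(0.1)
g_high = expert_gate_weights(0.9)
```

In this form the modal weights always sum to 1 at a pixel, and both kinds of weight shrink as the uncertainty grows, which matches the monotonic behaviour the paragraph describes.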

[0023] In this specific embodiment, S4 includes: The system takes the first evidence map set as input and constructs a conflict map at the pixel scale, which it uses to perform a suppressive adjustment of the first gating weight set, thereby obtaining the second gating weight set and the second evidence map set. For each modality $m$ at each pixel position $p$, the system first obtains the first modal evidence vector of the modality by intra-modal aggregation of the first evidence maps of the two expert branches of that modality. This vector is a non-negative vector of length 4 whose four components correspond to the four disaster risk levels. The aggregation is a component-wise summation of the evidence vectors of the two expert branches, so that the result contains the evidence contributions of both experts of the modality at that pixel position. To eliminate the influence of differences in evidence magnitude on the difference calculation, the system normalizes the components of the modal evidence vector to obtain the evidence distribution vector $\pi_m(p)$, a non-negative vector of length 4 whose components sum to 1; a numerical stability constant, set to [...], is introduced in the normalization to avoid division by zero when the component sum is zero. At each pixel position the system then calculates the evidence difference for every pair of different modalities, using the L1 distance as the measure of evidence difference. The system averages the evidence differences of all modality pairs to obtain a pixel-level difference map and then obtains the conflict map by mean pooling within the neighborhood window centered on each pixel. The neighborhood window has a fixed size [...], and the set of pixels within the window centered on $p$ is denoted $N(p)$. The larger the value of the conflict map $C$, the more inconsistent the multimodal evidence at that location. The calculation of the conflict map satisfies:

$$C(p) = \frac{1}{|N(p)|} \sum_{r \in N(p)} \frac{2}{M(M-1)} \sum_{\substack{i, j \in \mathcal{M} \\ i < j}} \lVert \pi_i(r) - \pi_j(r) \rVert_1$$

where $C(p)$ is the conflict value at pixel position $p$; $N(p)$ is the set of pixel positions within the neighborhood window centered on $p$; $|N(p)|$ is the number of pixels in $N(p)$; $r$ is the pixel position index within the neighborhood window; $\mathcal{M}$ is the set of modality indices participating in the fusion; $M$ is the number of modalities; $i$ and $j$ are the indices of two different modalities with $i < j$, used to enumerate all non-repeating modality pairs; $\pi_i(r)$ and $\pi_j(r)$ are the evidence distribution vectors of modality $i$ and modality $j$ at pixel position $r$; and $\lVert \cdot \rVert_1$ is the L1 norm of a vector, that is, the sum of the absolute values of its components. After obtaining the conflict map, the system determines high-conflict pixels with a fixed threshold, set to [...]: when the conflict value at a pixel position exceeds the threshold, the position is marked as a high-conflict pixel. To discriminate the conflict type, the system averages the quality uncertainty maps across the modality dimension to obtain a joint quality uncertainty map and determines higher-uncertainty pixels with a fixed threshold, set to [...]; the larger the joint quality uncertainty value at a pixel position, the more unstable the quality assessment at that location. When [...], the system marks the pixel position in the conflict type discrimination map as a quality-driven conflict; when [...], the system marks the pixel position in the conflict type discrimination map as a semantically driven conflict. The system then uses the conflict map and the quality uncertainty map set to perform a suppressive adjustment of the first gating weight set, generating the second gating weight set. The adjustment applies to both the modal gating weights and the expert gating weights and reduces the gating weights at high-conflict pixels and at higher-uncertainty pixels. Conflict suppression uses an exponential decay mechanism with two conflict decay coefficients that distinguish quality-driven from semantically driven conflicts: the coefficient for quality-driven conflicts is set to [...] and the coefficient for semantically driven conflicts is set to [...], which ensures that the reduction in modal gating weights at semantically driven conflict pixels is smaller than the reduction at quality-driven conflict pixels. Quality uncertainty suppression also uses an exponential decay mechanism, with the uncertainty decay coefficient set to [...], to further reduce the weight of high-uncertainty regions. The system multiplies each modal gating weight and each expert gating weight in the first gating weight set by the conflict attenuation factor determined by the conflict map and by the uncertainty attenuation factor determined by the quality uncertainty map, yielding the unnormalized second gating weights. The modal gating weights of all modalities are then normalized along the modality dimension so that they sum to 1 at each pixel position, and the expert gating weights of the two expert branches within the same modality are normalized along the expert dimension so that they sum to 1 at each pixel position. Finally, the system reweights the first evidence map set according to the second gating weight set to generate the second evidence map set: at each pixel position, the evidence vector in the first evidence map set that belongs to modality $m$ and expert branch $e$ is multiplied by the product of the second modal gating weight and the second expert gating weight at that pixel position. The conflict map and the conflict type discrimination map are retained at the same time.
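The conflict map construction can be sketched as follows, on small nested lists. Several details are assumptions rather than values from the patent: the stability constant `EPS`, the 3 x 3 window, the clipping of the window at image borders, both thresholds, and the rule in `conflict_types` that a high-conflict pixel counts as quality-driven when its joint uncertainty exceeds the uncertainty threshold and as semantically driven otherwise.

```python
from itertools import combinations

EPS = 1e-6  # assumed numerical stability constant


def distribution(evidence_vec):
    """Normalize a non-negative evidence vector so its components sum to about 1."""
    total = sum(evidence_vec) + EPS
    return [x / total for x in evidence_vec]


def difference_map(evidence):
    """evidence[m][row][col] is a 4-vector; returns the mean pairwise L1 distance."""
    n_modes = len(evidence)
    rows, cols = len(evidence[0]), len(evidence[0][0])
    pairs = list(combinations(range(n_modes), 2))
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            dists = [distribution(evidence[m][r][c]) for m in range(n_modes)]
            total = sum(
                sum(abs(a - b) for a, b in zip(dists[i], dists[j]))
                for i, j in pairs
            )
            out[r][c] = total / len(pairs)
    return out


def mean_pool(grid, window=3):
    """Mean over a window centered on each pixel, clipped at the image border."""
    half = window // 2
    rows, cols = len(grid), len(grid[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            vals = [
                grid[i][j]
                for i in range(max(0, r - half), min(rows, r + half + 1))
                for j in range(max(0, c - half), min(cols, c + half + 1))
            ]
            out[r][c] = sum(vals) / len(vals)
    return out


def conflict_types(conflict, joint_unc, t_conf=0.5, t_u=0.5):
    """Label pixels 'none', 'quality' or 'semantic' (assumed discrimination rule)."""
    return [
        ["none" if c <= t_conf else ("quality" if u > t_u else "semantic")
         for c, u in zip(c_row, u_row)]
        for c_row, u_row in zip(conflict, joint_unc)
    ]


# Two modalities on a 1 x 2 grid: they agree at pixel 0 and disagree at pixel 1.
evidence = [
    [[[4.0, 0.0, 0.0, 0.0], [4.0, 0.0, 0.0, 0.0]]],
    [[[2.0, 0.0, 0.0, 0.0], [0.0, 4.0, 0.0, 0.0]]],
]
diff = difference_map(evidence)
conflict = mean_pool(diff, window=1)
labels = conflict_types(conflict, joint_unc=[[0.1, 0.9]])
```

Because each evidence distribution sums to roughly 1, the L1 distance between two modalities lies between 0 (identical distributions) and 2 (disjoint support), which gives the conflict value a bounded range for thresholding.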

[0024] In this specific embodiment, S5 includes: The system takes the second evidence map set and the second gating weight set as input, performs two-stage evidence fusion, and outputs a disaster risk map, an uncertainty map, and a confidence map obtained by grouped calibration. The element of the second evidence map set at pixel position $p$ for expert branch $e$ of modality $m$ is the disaster risk evidence vector output by that branch at that position; it is non-negative and its components correspond to the four disaster risk levels. The second gating weight set includes the second modal gating weights and the second expert gating weights; both are obtained by the normalization in S4, so that the modal weights sum to 1 over the modality index set and the expert weights sum to 1 over the expert branch index within the same modality. Before the two-stage fusion, the system performs quality prior injection, transforming the multidimensional quality vector of each modality into a modality-level Dirichlet prior parameter vector. The modality index is consistent with that used in S2, and the components of the multidimensional quality vector represent the severity of the quality problems of the modality. The mean of the components is computed and used to calculate the modal reliability, so that modalities with fewer quality problems obtain a larger amount of prior evidence. The prior takes the same value across the four disaster risk levels, so that it expresses only modal reliability without introducing class bias. The system then sums the prior with the Dirichlet distribution parameter vector constructed from the evidence vector to obtain the expert-level Dirichlet distribution parameter vector. The expert-level vectors are weighted and superimposed according to the two-stage gating weights to complete intra-modal and inter-modal fusion, yielding the fused Dirichlet distribution parameter vector at the pixel scale, from which the posterior probability distribution is obtained. The calculation satisfies: [...] In these formulas, the reliability of modality $m$ is computed with $\exp$, the exponential function with the natural constant as its base; the reliability attenuation coefficient takes [...]; the mean of the components of the multidimensional quality vector enters the reliability; the Dirichlet prior parameter vector of modality $m$ is built with the prior strength coefficient, which takes [...], and the all-ones vector of length 4; the expert-level Dirichlet distribution parameter vector is defined for expert branch $e$ of modality $m$ at pixel position $p$; the modality-level Dirichlet distribution parameter vector of modality $m$ at pixel position $p$ is obtained by intra-modal fusion; the fused Dirichlet distribution parameter vector at pixel position $p$ is obtained by inter-modal fusion; the posterior probability that pixel position $p$ belongs to the $k$-th disaster risk level is computed from the $k$-th component of the fused vector; the modality index set contains the modalities participating in the fusion; the second modal gating weight and the second expert gating weight are those at pixel position $p$; $m$ is the modality index; $e$ is the expert branch index and takes the values 1 and 2; and $p$ is the pixel position index under the unified spatial grid. Based on the posterior probability distribution, the system generates the disaster risk map, taking the disaster risk level with the highest posterior probability as the output level at each pixel position. At the same time it computes the evidence amount as the sum of the components of the fused Dirichlet distribution parameter vector, and the evidence amount and the posterior probability distribution jointly generate the uncertainty map. The uncertainty consists of two parts. The first part is the Shannon entropy of the posterior probability distribution, normalized by the logarithm of the number of categories (4) to limit its numerical range. The second part is the ratio of the number of categories (4) to the evidence amount, which represents the uncertainty caused by insufficient evidence; this ratio is truncated and linearly normalized to limit it to [...]. The two parts are weighted by 0.5 and 0.5 respectively to obtain the final uncertainty value, which forms the uncertainty map. After obtaining the disaster risk map, the system performs confidence calibration grouped by quality and conflict to generate the confidence map. The uncalibrated confidence is the maximum posterior probability at the pixel position. The grouping is determined jointly by the conflict map and the fused quality severity, where the fused quality severity is obtained as the weighted sum of the quality severity of each modality according to the second modal gating weights. The conflict value is divided into three levels by the threshold set [...], and the fused quality severity is divided into three levels by the threshold set [...], giving nine groups in total, and a set of calibration parameters is stored for each group. The calibration model uses Platt scaling, and each group corresponds to a pair of scalar parameters, with the group index running from 1 to 9. During the offline learning phase, a validation set with historical annotations is used, and within each group the two parameters are iteratively optimized to minimize the negative log-likelihood loss, with the parameters initialized to [...]. In the online inference stage, the uncalibrated confidence is input into the logistic regression parameterized by the two parameters of its group to obtain the calibrated confidence, which is output by pixel position to form the confidence map. This keeps the confidence map consistent with the actual accuracy under different quality and conflict conditions, and the confidence map is output together with the disaster risk map and the uncertainty map.
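The fusion, posterior, and uncertainty computations for a single pixel can be sketched as follows. The exact formulas and coefficient values are not recoverable from the text above, so several choices here are assumptions: the prior is taken as `tau * reliability` on every level, the evidence-based Dirichlet vector is taken as `evidence + 1`, the insufficient-evidence ratio is clipped to [0, 1], and the Platt parameters `a` and `b` are placeholders. All function names are hypothetical.

```python
import math

NUM_LEVELS = 4


def reliability(quality_vec, kappa=1.0):
    """Modal reliability decreasing in the mean quality severity (assumed form)."""
    return math.exp(-kappa * sum(quality_vec) / len(quality_vec))


def fuse_pixel(evidence, modal_w, expert_w, quality, tau=2.0, kappa=1.0):
    """Two-stage weighted fusion at one pixel.

    evidence[m][e] -- non-negative 4-vector of expert e in modality m
    modal_w[m]     -- second modal gating weight (sums to 1 over m)
    expert_w[m][e] -- second expert gating weight (sums to 1 over e)
    quality[m]     -- multidimensional quality vector of modality m
    """
    fused = [0.0] * NUM_LEVELS
    for m, experts in enumerate(evidence):
        prior = tau * reliability(quality[m], kappa)  # same value for all levels
        for e, ev in enumerate(experts):
            for k in range(NUM_LEVELS):
                alpha_mek = prior + ev[k] + 1.0  # assumed expert-level parameter
                fused[k] += modal_w[m] * expert_w[m][e] * alpha_mek
    return fused


def posterior(alpha):
    total = sum(alpha)
    return [a / total for a in alpha]


def uncertainty(alpha):
    """0.5 * normalized Shannon entropy + 0.5 * clipped (levels / evidence amount)."""
    probs = posterior(alpha)
    entropy = -sum(p * math.log(p) for p in probs if p > 0.0)
    entropy_part = entropy / math.log(NUM_LEVELS)
    vacuity_part = min(max(NUM_LEVELS / sum(alpha), 0.0), 1.0)
    return 0.5 * entropy_part + 0.5 * vacuity_part


def platt(confidence, a, b):
    """Platt scaling: logistic function of a * confidence + b."""
    return 1.0 / (1.0 + math.exp(-(a * confidence + b)))


# One modality, two experts that both support level index 3 (made-up values).
alpha = fuse_pixel(
    evidence=[[[0.0, 0.0, 0.0, 10.0], [0.0, 0.0, 0.0, 10.0]]],
    modal_w=[1.0],
    expert_w=[[0.5, 0.5]],
    quality=[[0.0, 0.0, 0.0]],
)
probs = posterior(alpha)
level = probs.index(max(probs))
calibrated = platt(max(probs), a=1.0, b=0.0)
```

In a grouped calibration, the pair `(a, b)` would be looked up from the group that the pixel falls into according to its conflict level and fused quality severity level; the sketch applies a single pair for brevity.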

[0025] In this specific embodiment, S6 includes: The system generates early warning results using the disaster risk map, the confidence map, and the conflict map as inputs. The value of the disaster risk map at pixel position $p$ is the disaster risk level at that position, the value of the confidence map is the calibrated confidence at that position, and the value of the conflict map is the conflict value at that position. The system first extracts warning regions and determines warning levels from the disaster risk map. The extraction rule uses a fixed threshold [...]: pixels that satisfy [...] are taken as candidate warning pixels. Connected component labeling is then performed on the candidate warning pixels to obtain the set of warning regions, using 8-neighborhood connectivity to avoid region fragmentation caused by diagonal breaks. For each warning region, the pixel count is calculated, and area filtering with a fixed threshold [...] removes small noise regions, where the threshold is the minimum number of pixels a warning region must contain in order to be issued. For each warning region that passes the area filter, the system determines the corresponding warning level, where the region index $k$ corresponds one-to-one with the elements of the region set. The warning level is the maximum disaster risk level within the region, so that the most severe risk within the region is covered: when the region contains pixels satisfying [...], the warning level of the region is 4; when it contains pixels satisfying [...] and no pixels satisfying [...], the warning level of the region is 3. The system then determines the confidence of each warning region from the confidence map, using the arithmetic mean of the pixel confidences within the region as a stable region-level statistic. The calculation satisfies:

$$\bar{c}_k = \frac{1}{|R_k|} \sum_{p \in R_k} c(p)$$

where $\bar{c}_k$ is the confidence of the $k$-th warning region; $R_k$ is the set of pixel positions of the $k$-th warning region; $|R_k|$ is the number of pixels in $R_k$; $p$ is the pixel position index under the unified spatial grid; and $c(p)$ is the confidence value at pixel position $p$. The system extracts conflict area indicators from the conflict map, treating pixels that satisfy [...] as high-conflict pixels, where [...]. The system performs 8-neighborhood connected component labeling on the high-conflict pixels to obtain the set of conflict regions and takes the conflict regions that spatially intersect a warning region as the conflict area indicators of that warning region, highlighting locations where the warning conclusion may be affected by modal inconsistency. The system outputs each warning region as a geographic vector boundary for publication and display. Specifically, under the unified coordinate reference, the outer boundary of each warning region is extracted from its pixel set and a closed polygon is generated. The outer boundary extraction uses raster-to-vector boundary tracing, and Douglas-Peucker simplification is applied to the generated polygon to reduce the number of vertices; the simplification tolerance is set to 1 pixel side length so that the boundary error does not exceed one pixel. The system generates warning results according to preset warning release rules, which include a level release condition and a confidence release condition, with the parameters fixed as the level threshold [...] and the confidence threshold [...]. For each warning region, if and only if both [...] and [...] are satisfied, the region is classified as a valid warning region, and the system outputs its warning region polygon, its warning level, its region confidence, and the set of conflict region polygons that intersect it as conflict area indicators. Warning regions that do not satisfy both conditions are not issued and are only saved as internal candidate results for tracking and review.
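The region extraction and release logic of S6 can be sketched as follows. The threshold values (`risk_min`, `area_min`, `level_min`, `conf_min`) are hypothetical placeholders, since the values used in the embodiment are not recoverable from the text, and polygon extraction with Douglas-Peucker simplification is omitted.

```python
from collections import deque

NEIGHBORS_8 = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]


def label_regions(mask):
    """8-neighborhood connected component labeling via breadth-first search."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    regions = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            seen[r][c] = True
            queue, region = deque([(r, c)]), []
            while queue:
                y, x = queue.popleft()
                region.append((y, x))
                for dy, dx in NEIGHBORS_8:
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            regions.append(region)
    return regions


def warning_regions(risk, conf, risk_min=3, area_min=2, level_min=3, conf_min=0.6):
    """Extract warning regions and apply the release rule (assumed thresholds)."""
    mask = [[level >= risk_min for level in row] for row in risk]
    results = []
    for region in label_regions(mask):
        if len(region) < area_min:
            continue  # area filtering removes small noise regions
        level = max(risk[y][x] for y, x in region)
        mean_conf = sum(conf[y][x] for y, x in region) / len(region)
        results.append({
            "pixels": region,
            "level": level,
            "confidence": mean_conf,
            "released": level >= level_min and mean_conf >= conf_min,
        })
    return results


risk_map = [
    [1, 3, 1, 1],
    [1, 1, 4, 1],
    [1, 1, 1, 1],
    [3, 1, 1, 1],
]
conf_map = [
    [0.5, 0.9, 0.5, 0.5],
    [0.5, 0.5, 0.7, 0.5],
    [0.5, 0.5, 0.5, 0.5],
    [0.9, 0.5, 0.5, 0.5],
]
warnings_out = warning_regions(risk_map, conf_map)
```

In the sample grid the two diagonal pixels are merged into one region only because 8-neighborhood connectivity is used; with 4-neighborhood connectivity they would form two single-pixel regions and both would be removed by the area filter.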

[0026] The above description is only a preferred embodiment of the present invention, but the scope of protection of the present invention is not limited thereto. Any equivalent substitutions or modifications made by those skilled in the art within the scope of the technology disclosed in the present invention, based on the technical solution and inventive concept of the present invention, should be covered within the scope of protection of the present invention.

[0027] This invention addresses the problems of large quality fluctuations, inconsistent modal responses, and the lack of reliable quantitative evidence in multimodal remote sensing images across different regions and time periods. It synergistically couples quality assessment, gated fusion, evidence-based uncertainty modeling, and confidence calibration. First, a quality assessment model outputs a multidimensional quality vector and a quality uncertainty map, characterizing the usability of each modality at the pixel scale. Then, a hierarchical gated hybrid expert model maps quality information to modal and expert gate weights, adaptively suppressing low-quality modalities and unstable expert branches during fusion. Next, a conflict map is constructed based on the differences in evidence from different modalities, and the gate weights are adjusted a second time, with the evidence reweighted to prioritize consistent and reliable information sources when modal contradictions exist. Finally, two-stage evidence fusion yields fused evidence and outputs a disaster risk map and uncertainty map. Simultaneously, confidence calibration of the posterior probability is performed by combining quality and conflict grouping, forming a confidence map and conflict area indication that can be used for early warning issuance. This reduces false alarms, missed alarms, and boundary bias, and improves the stability and interpretability of early warning conclusions.

[0028] Compared to existing multimodal fusion methods, this invention makes several structural improvements to address the aforementioned technical problems. First, it expands the quality representation from a single quality score to a quality vector containing multiple indicators and introduces a quality uncertainty map, making the gating decision spatially sensitive and able to reflect the uncertainty of the quality assessment itself. Second, it adopts a hierarchical gating structure that gates modalities and expert branches simultaneously and adds a secondary gating mechanism driven by conflict feedback, so that the fusion is not only quality-driven but also explicitly suppresses contradictory information between modalities. Third, it employs two-stage evidence fusion, within and between modalities, and outputs conflict and uncertainty maps, so that risk conclusions and their reliability are given together. Fourth, it improves the consistency between confidence and true accuracy through calibration grouped by quality and conflict, thereby better serving early warning issuance rules and emergency decision-making. Together, these structural improvements enable this invention to complete disaster risk identification and early warning more stably and accurately under complex imaging conditions.

Claims

1. A disaster risk identification and early warning method based on multimodal remote sensing image data, characterized in that the method includes: S1. Acquire remote sensing image data of at least two different modalities for the same monitoring area within the same monitoring period, and perform preprocessing to obtain spatially registered images; S2. Output a multidimensional quality vector and a quality uncertainty map for each modal remote sensing image using the quality assessment model; S3. Based on the spatially registered images, the multidimensional quality vectors, and the quality uncertainty maps, generate gating weights using a hierarchical gated hybrid expert model, wherein at least two expert branches are set for each modal remote sensing image and each expert branch outputs a disaster risk evidence map; S4. Calculate the evidence differences between different modalities based on the disaster risk evidence maps to generate a conflict map, and adjust the gating weights in combination with the quality uncertainty maps to reweight the disaster risk evidence maps; S5. Under the adjusted gating weights, perform two-stage evidence fusion on the disaster risk evidence maps to obtain a fused evidence map, wherein the first stage is fusion within the same modality and the second stage is fusion between different modalities; output the disaster risk map and the uncertainty map based on the fused evidence map; and calibrate the posterior probability in groups according to the multidimensional quality vector and the conflict map to obtain the confidence map; S6. Generate early warning results based on the disaster risk map, the confidence map, and the conflict map.

2. The disaster risk identification and early warning method based on multimodal remote sensing image data according to claim 1, characterized in that, S1 includes: Acquire multimodal remote sensing images of the same monitoring area within the same monitoring period, and acquire the imaging time, sensor parameters, and geographic location parameters corresponding to each modal remote sensing image, wherein the imaging time difference of the multimodal remote sensing images does not exceed a preset threshold Δt. Radiometric consistency processing is performed on the multimodal remote sensing images respectively. The radiometric consistency processing includes radiometric calibration and radiometric normalization based on the same reference image. The radiometric normalization includes either histogram matching normalization or linear regression normalization. Geometric correction is performed on each modal remote sensing image after radiometric consistency processing, including orthorectification based on the geolocation parameters; After geometric correction, the remote sensing images of each modality are unified to the same coordinate reference and the same spatial resolution, and spatial registration is performed using the same spatial grid to obtain the registered image set. The remote sensing images of each modality in the registered image set correspond one-to-one in terms of pixel position.

3. The disaster risk identification and early warning method based on multimodal remote sensing image data according to claim 1, characterized in that S2 includes: For each modal remote sensing image in the registered image set, a quality feature map is extracted using the quality assessment model, and a multidimensional quality vector and a quality uncertainty map of the modal remote sensing image are generated based on the quality feature map. The multidimensional quality vector is obtained by spatial aggregation of the quality feature map and includes optical quality indicators for characterizing the degree of cloud and fog obstruction, the degree of shadow, and the degree of low illumination, synthetic aperture radar quality indicators for characterizing the intensity of speckle noise and the degree of geometric distortion, and at least one thermal infrared quality indicator for characterizing the degree of thermal infrared saturation, thermal infrared noise intensity, thermal infrared stripe artifact intensity, and abnormal high-temperature artifact degree. The quality uncertainty map is used to characterize the spatial uncertainty of the quality indicators corresponding to the multidimensional quality vector; the quality vector set is obtained by collecting the multidimensional quality vectors corresponding to each modal remote sensing image, and the quality uncertainty map set is obtained by collecting the quality uncertainty maps corresponding to each modal remote sensing image.

4. The disaster risk identification and early warning method based on multimodal remote sensing image data according to claim 1, characterized in that S3 includes: In the hierarchical gated hybrid expert model, at least two expert branches are set for each modal remote sensing image in the registered image set; a corresponding modal feature map is generated based on each modal remote sensing image; a first gating weight set is generated based on the quality vector set and the quality uncertainty map set. The first gating weight set includes modal gating weights for each modal remote sensing image and expert gating weights for each expert branch of the same modal remote sensing image. The modal gating weights and the expert gating weights are determined by a monotonically decreasing function of the quality uncertainty, such that the greater the quality uncertainty, the smaller the corresponding gating weights. The monotonically decreasing function includes at least one of the following: an exponential decay form, which outputs the natural constant raised to the power of negative λ times the quality uncertainty, where λ is a parameter greater than zero; a reciprocal decay form, which outputs the ratio of one to the sum of one and λ times the quality uncertainty, where λ is a parameter greater than zero; and a softmax-based normalization form, which first maps the quality uncertainty to a gating score that is negatively correlated with the quality uncertainty and then performs softmax normalization on the gating score to obtain the gating weight. The modal feature maps are fused in a hierarchical weighted manner based on the first gating weight set, and each expert branch outputs the corresponding disaster risk evidence map to obtain the first evidence map set.

5. The disaster risk identification and early warning method based on multimodal remote sensing image data according to claim 1, characterized in that S4 includes: based on the first evidence map set, the evidence difference between the modal remote sensing images at the same pixel location is calculated. The evidence difference includes any one of Jensen-Shannon divergence, Kullback-Leibler divergence, L1 distance, and L2 distance. The evidence differences are spatially aggregated within a preset neighborhood window to generate a conflict map. The spatial aggregation includes either mean pooling or max pooling. High-conflict pixels are pixels in the conflict map whose conflict value is greater than a preset threshold Tconf. The preset threshold Tconf is a fixed threshold or an adaptive threshold determined based on statistical quantiles of the conflict value. The first gating weight set is subjected to suppressive adjustment based on the conflict map to generate a second gating weight set. The suppressive adjustment includes reducing the gating weights at the high-conflict pixel positions, and reducing the gating weights at the high-uncertainty pixel positions indicated by the quality uncertainty map set. High-uncertainty pixels are pixels whose quality uncertainty in the quality uncertainty map is greater than a threshold Tu. The threshold Tu is a fixed threshold or an adaptive threshold determined based on statistical quantiles of the quality uncertainty. The second gating weight set is obtained by multiplying the first gating weight set by the conflict attenuation factor and by the uncertainty attenuation factor and then normalizing the result. The second gating weight set includes modal gating weights for the different modal remote sensing images and expert gating weights for each expert branch of the same modal remote sensing image.
The conflict attenuation factor is a monotonically decreasing function of the conflict value, and the uncertainty attenuation factor is a monotonically decreasing function of the quality uncertainty. The conflict attenuation factor and the uncertainty attenuation factor each include at least one of the following: an exponential decay form, which outputs the natural constant e raised to the power of negative λ times the input value, where λ is a parameter greater than zero; a reciprocal decay form, which outputs the ratio of one to the sum of one and λ times the input value, where λ is a parameter greater than zero; and a softmax-based normalization form, which first maps the input value to a decay score that is negatively correlated with the input value and then applies softmax normalization to the decay scores to obtain the decay factor. The first evidence map set is reweighted based on the second gating weight set to generate the second evidence map set.
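The conflict measurement and secondary gating adjustment of claim 5 can be sketched at a single pixel as follows. The sketch makes several assumptions that the claim leaves open: each modality's conflict is taken as its Jensen-Shannon divergence from the mean distribution over all modalities, both attenuation factors use the exponential form, and neighborhood pooling and thresholding are omitted. All names are hypothetical.

```python
import math


def js_divergence(p, q):
    """Jensen-Shannon divergence (natural log) of two discrete distributions."""
    mid = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        return sum(a * math.log(a / b) for a, b in zip(x, y) if a > 0)

    return 0.5 * kl(p, mid) + 0.5 * kl(q, mid)


def modality_conflicts(distributions):
    """Assumption: conflict of a modality = JS divergence from the consensus
    (the mean distribution over all modalities) at this pixel."""
    k = len(distributions[0])
    n = len(distributions)
    consensus = [sum(d[c] for d in distributions) / n for c in range(k)]
    return [js_divergence(d, consensus) for d in distributions]


def second_gating(weights, conflicts, uncertainties, lam_c=5.0, lam_u=2.0):
    """Multiply by exponential conflict and uncertainty factors, then normalize."""
    raw = [
        w * math.exp(-lam_c * c) * math.exp(-lam_u * u)
        for w, c, u in zip(weights, conflicts, uncertainties)
    ]
    total = sum(raw)
    return [r / total for r in raw]


# Hypothetical two-level risk distributions from three modalities at one pixel:
# the first two broadly agree, the third disagrees.
evidence = [[0.8, 0.2], [0.7, 0.3], [0.1, 0.9]]
first_weights = [1 / 3, 1 / 3, 1 / 3]
conflicts = modality_conflicts(evidence)
second_weights = second_gating(first_weights, conflicts, [0.2, 0.2, 0.2])
print(conflicts)
print(second_weights)  # the dissenting third modality is suppressed
```

A per-modality conflict value is used because a single conflict value shared by all modalities would cancel out during normalization and leave the weights unchanged.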

6. The disaster risk identification and early warning method based on multimodal remote sensing image data according to claim 1, characterized in that S5 includes: a two-stage evidence fusion process is performed based on the second evidence map set. In the first stage, for each modal remote sensing image, the disaster risk evidence maps of its expert branches are fused according to the expert gating weights in the second gating weight set to obtain the modal evidence map of that modal remote sensing image. In the second stage, the modal evidence maps of the modal remote sensing images are fused according to the modal gating weights in the second gating weight set to obtain the fused evidence map. The fusion operation includes at least one of the following: weighted summation or weighted averaging of the evidence vectors according to the corresponding gating weights, weighted superposition of the Dirichlet distribution parameter vectors according to the corresponding gating weights, and the Dempster-Shafer evidence combination rule. The disaster risk evidence map includes evidence vectors output pixel by pixel. Each component of the evidence vector corresponds to a disaster risk level and is non-negative. A Dirichlet distribution parameter vector is constructed based on the evidence vector.
The evidence quantity is the sum of the components of the Dirichlet distribution parameter vector, or a monotonic function of that sum. The disaster risk map is determined based on the normalization result of the fused Dirichlet distribution parameter vector. The uncertainty map is determined based on the fused evidence map, and the uncertainty includes at least one of the following: an uncertainty determined from the Shannon entropy of the probability distribution over disaster risk levels, and an uncertainty determined from the reciprocal of the sum of the components of the Dirichlet distribution parameter vector. The pixels in the disaster risk map are divided into multiple groups based on the quality vector set and the conflict map, and the posterior probability of each group is confidence-calibrated using group calibration parameters to obtain the confidence map. The posterior probability is the expected probability of the disaster risk level probability distribution obtained from the fused Dirichlet distribution parameter vector, or the maximum class probability in that distribution. The group calibration parameters are obtained by offline learning on historical labeled samples or a validation set. The offline learning includes any one of temperature scaling, Platt scaling, and isotonic regression, with the goal of minimizing the negative log-likelihood loss or the expected calibration error.
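The two-stage fusion and the derived risk, uncertainty, and calibrated probabilities of claim 6 can be illustrated at one pixel as below. This sketch picks the weighted-superposition option and makes the following assumptions that the claim does not state: the Dirichlet parameters are built as α = evidence + 1, the uncertainty is K / S (the number of levels times the reciprocal of the parameter sum), and group calibration is temperature scaling with a hypothetical temperature of 2.0. All names and numbers are illustrative.

```python
import math


def to_alpha(evidence):
    """Assumption: Dirichlet parameters are evidence + 1 (non-negative evidence)."""
    return [e + 1.0 for e in evidence]


def fuse_dirichlet(alphas, weights):
    """Weighted superposition of Dirichlet parameter vectors."""
    k = len(alphas[0])
    return [sum(w * a[c] for a, w in zip(alphas, weights)) for c in range(k)]


def risk_and_uncertainty(alpha):
    """Normalized parameters give level probabilities; K / S gives uncertainty."""
    s = sum(alpha)
    return [a / s for a in alpha], len(alpha) / s


def temperature_scale(probs, temperature):
    """Temperature scaling applied to log-probabilities of one pixel group."""
    logits = [math.log(p) / temperature for p in probs]
    top = max(logits)
    exps = [math.exp(v - top) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Stage 1: fuse the two expert branches inside each modality.
optical = fuse_dirichlet([to_alpha([8, 1, 0]), to_alpha([6, 2, 0])], [0.5, 0.5])
sar = fuse_dirichlet([to_alpha([2, 2, 2]), to_alpha([2, 2, 2])], [0.5, 0.5])
# Stage 2: fuse the modal evidence with the modal gating weights.
fused = fuse_dirichlet([optical, sar], [0.75, 0.25])
probs, uncertainty = risk_and_uncertainty(fused)
calibrated = temperature_scale(probs, temperature=2.0)
print(fused, probs, uncertainty, calibrated)
```

With a temperature above one the calibrated distribution is softer than the raw one, which corresponds to lowering confidence for a pixel group that was found to be overconfident on validation data.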

7. The disaster risk identification and early warning method based on multimodal remote sensing image data according to claim 1, characterized in that S6 includes: based on the disaster risk map, early warning areas are extracted and the early warning level of each early warning area is determined; the confidence of each early warning area is determined based on the confidence map, the confidence being a statistical value of the pixel confidences within the early warning area; based on the conflict map, conflict area prompts are extracted, each conflict area prompt including the area range formed by high-conflict pixels in the conflict map; the early warning area, the early warning level, the corresponding confidence, and the conflict area prompt are compiled and output according to preset warning release rules to generate the early warning result. The preset warning release rules include comparing the early warning level with a preset level threshold and comparing the corresponding confidence with a preset confidence threshold. The early warning result is output when both the level release condition and the confidence release condition are met.
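The release rule in claim 7 amounts to a two-condition check, which the following sketch illustrates. It assumes that the regional confidence statistic is the mean of the pixel confidences and that both conditions are "greater than or equal to" comparisons; the claim fixes neither, and the thresholds and field names are hypothetical.

```python
from statistics import mean


def warning_result(level, pixel_confidences, conflict_area=None,
                   level_threshold=2, confidence_threshold=0.7):
    """Compile a warning record and decide whether it may be released."""
    region_confidence = mean(pixel_confidences)  # assumed statistic: mean
    released = (level >= level_threshold
                and region_confidence >= confidence_threshold)
    return {
        "level": level,
        "confidence": region_confidence,
        "conflict_area": conflict_area,
        "released": released,
    }


# High level and high confidence: released.
print(warning_result(3, [0.9, 0.8, 0.7]))
# High level but low confidence: withheld.
print(warning_result(3, [0.5, 0.6], conflict_area="hypothetical region A"))
```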

8. The disaster risk identification and early warning method based on multimodal remote sensing image data according to claim 5, characterized in that the conflict map further includes a conflict type discrimination map that distinguishes quality-driven conflict from semantic-driven conflict. A quality-driven conflict pixel is a pixel whose conflict value and quality uncertainty are both greater than their corresponding thresholds. A semantic-driven conflict pixel is a pixel whose conflict value is greater than the corresponding threshold and whose quality uncertainty is not greater than the corresponding threshold. The modal gating weight is reduced by a smaller amount for semantic-driven conflict pixels than for quality-driven conflict pixels.
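The conflict type discrimination of claim 8 can be sketched per pixel as a pair of threshold tests. The threshold values and the two attenuation factors below are hypothetical; the claim only requires that semantic-driven conflict be attenuated less than quality-driven conflict.

```python
def conflict_type(conflict, uncertainty, t_conf=0.1, t_u=0.5):
    """Classify one pixel as 'none', 'quality' or 'semantic' conflict."""
    if conflict <= t_conf:
        return "none"
    return "quality" if uncertainty > t_u else "semantic"


def gate_factor(kind, quality_factor=0.3, semantic_factor=0.8):
    """Multiplicative factor for the modal gating weight.

    Assumption: semantic-driven conflict keeps more weight (0.8) than
    quality-driven conflict (0.3); both values are illustrative.
    """
    return {"none": 1.0, "quality": quality_factor, "semantic": semantic_factor}[kind]


for conflict, uncertainty in [(0.05, 0.9), (0.4, 0.9), (0.4, 0.2)]:
    kind = conflict_type(conflict, uncertainty)
    print(conflict, uncertainty, kind, gate_factor(kind))
```

Keeping more weight under semantic-driven conflict reflects that the modalities disagree despite good image quality, so the disagreement may carry real information instead of noise.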

9. The disaster risk identification and early warning method based on multimodal remote sensing image data according to claim 6, characterized in that the two-stage evidence fusion process further includes quality prior injection: a Dirichlet prior parameter vector for each modal remote sensing image is determined based on its multidimensional quality vector, and before the fusion operation is performed, the Dirichlet prior parameter vector is superimposed on the Dirichlet distribution parameter vector constructed from the disaster risk evidence map, so that the evidence quantity of a higher-quality modality is larger after fusion.
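One possible reading of the quality prior injection in claim 9 is sketched below. The claim does not say how the prior is computed from the quality vector, so everything here is an assumption: the quality indicators are treated as degradation degrees in [0, 1], a scalar quality score is taken as one minus their mean, and the prior is a symmetric vector scaled by a hypothetical `strength` parameter.

```python
def quality_prior(quality_vector, num_levels, strength=2.0):
    """Symmetric Dirichlet prior that grows with image quality (assumed form)."""
    score = 1.0 - sum(quality_vector) / len(quality_vector)  # 1.0 = clean image
    return [strength * score] * num_levels


def inject_prior(alpha, prior):
    """Superimpose the prior on the Dirichlet parameters before fusion."""
    return [a + p for a, p in zip(alpha, prior)]


alpha = [3.0, 1.0]  # hypothetical Dirichlet parameters from the evidence map
clean = inject_prior(alpha, quality_prior([0.0, 0.0], num_levels=2))
cloudy = inject_prior(alpha, quality_prior([1.0, 0.5], num_levels=2))
print(clean, sum(clean))    # [5.0, 3.0] 8.0
print(cloudy, sum(cloudy))  # [3.5, 1.5] 5.0
```

With identical evidence, the clean modality ends up with a larger parameter sum, that is, a larger evidence quantity, than the cloudy one.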

10. A disaster risk identification and early warning system based on multimodal remote sensing image data, used to execute the disaster risk identification and early warning method based on multimodal remote sensing image data as described in any one of claims 1 to 9, characterized in that the system includes: an image generation module, which acquires at least two remote sensing images of different modalities covering the same monitoring area and the same monitoring period and generates spatially registered images; a quality assessment module, which outputs a multidimensional quality vector and a quality uncertainty map for each modality; an expert module, which generates gating weights based on the spatially registered images, the multidimensional quality vectors, and the quality uncertainty maps, and sets at least two expert branches for each modality to output disaster risk evidence maps; a gating adjustment module, which generates a conflict map from the disaster risk evidence maps and adjusts the gating weights in conjunction with the quality uncertainty maps to reweight the disaster risk evidence maps; a fusion and calibration module, which performs intra-modal fusion and inter-modal fusion under the adjusted gating weights to obtain a fused evidence map, and outputs a disaster risk map, an uncertainty map, and a confidence map obtained by grouped calibration according to the multidimensional quality vectors and the conflict map; and an early warning generation module, which outputs early warning results based on the disaster risk map, the confidence map, and the conflict map.