Pulse-suppressed landslide range enhancement extraction method and electronic equipment

By combining an encoder with impulse response characteristics and a cross-layer perception decoder, the problem of imbalance between landslide pixels and background pixels in large-scale landslide extraction is solved, and high-precision landslide feature extraction in complex scenes is achieved.

CN121789059B · Active · Publication Date: 2026-06-30 · AEROSPACE INFORMATION RES INST CAS

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Patents (China)
Current Assignee / Owner: AEROSPACE INFORMATION RES INST CAS
Filing Date: 2025-12-30
Publication Date: 2026-06-30

AI Technical Summary

Technical Problem

In large-scale landslide extraction, existing technologies suffer from an imbalance between landslide pixels and background pixels, making it difficult for models to accurately capture landslide features at different scales in complex scenes. In particular, the recognition performance is poor under conditions of extreme class imbalance.

Method used

An encoder with impulse response characteristics is used to encode candidate landslide images, generating multiple levels of image features. A cross-layer perceptual decoder is then used for decoding to suppress background interference and enhance landslide feature extraction.

Benefits of technology

It effectively focuses on the response of landslide areas, suppresses large-area background interference, improves the accuracy and fine perception of landslide area identification, and is suitable for complex remote sensing scenarios.



Abstract

This invention provides a landslide range enhancement extraction method and electronic device with impulse spatial suppression, applicable to the fields of image processing technology and deep learning technology. The method includes: preprocessing the remote sensing image to be extracted to obtain candidate landslide images; encoding the candidate landslide images using an encoder with impulse response characteristics to generate multiple levels of image features, wherein the encoder with impulse response characteristics is used to improve the extraction of landslide features in the candidate landslide images and to suppress background interference features in the candidate landslide images; and decoding the L levels of image features using a cross-layer perceptual decoder to obtain the target landslide image.

Description

Technical Field

[0001] This invention relates to the fields of image processing technology and deep learning technology, and more specifically, to a method and electronic device for enhancing the extraction of landslide range with impulse spatial suppression.

Background Technology

[0002] Landslides are among the most common and serious geological hazards, and their frequent occurrence severely impacts regional infrastructure and land resources. Therefore, quickly and accurately identifying landslide areas is not only fundamental to emergency response but also crucial for landslide causation analysis, risk monitoring, and long-term remediation.

[0003] Remote sensing technology, due to its ability to acquire large-scale, multi-temporal, and high-coverage surface information, has become an important data source for landslide identification. In recent years, the combination of image processing and artificial intelligence technologies has driven the development of automatic landslide identification methods.

[0004] However, most current research focuses on landslide identification in specific small areas. In practical applications, when extracting large-scale landslides, the number of landslide pixels in remote sensing images is far less than the number of background pixels. The model is prone to ignoring small-area or blurred-boundary landslide areas, making it difficult to accurately capture landslide features at different scales in complex scenarios.

Summary of the Invention

[0005] In view of this, the present invention provides a method and electronic device for enhancing the extraction of landslide range by pulse spatial suppression.

[0006] One aspect of the present invention provides a method for enhancing landslide extent extraction using impulse spatial suppression, comprising: preprocessing a remote sensing image to be extracted to obtain a candidate landslide image; encoding the candidate landslide image using an encoder with impulse response characteristics to generate multiple levels of image features; and decoding the L levels of image features using a cross-layer perceptual decoder to obtain the target landslide image. The encoder with impulse response characteristics includes L impulse driving modules, each of which generates its own corresponding level of image feature. The encoding includes repeatedly performing the following operations until the L-th level image feature corresponding to the L-th impulse driving module is generated, resulting in L levels of image features. For the i-th impulse driving module, layer normalization and linear mapping are performed on the (i-1)-th level image feature to generate the query feature, key feature, and value feature corresponding to the i-th impulse driving module, and their... In this context, L is a positive integer ≥ 3, and 2 ≤ i ≤ L. Based on the query feature and the key feature, the i-th first intermediate image feature is determined. The i-th first intermediate image feature is encoded with an impulse response to obtain the i-th second intermediate image feature. Based on the value feature and the i-th second intermediate image feature, the i-th level image feature corresponding to the i-th impulse driving module is generated. When i = 1, the first-level image feature is generated by encoding the candidate landslide image using the first impulse driving module. The encoder with impulse response characteristics is used to improve the extraction of landslide features in the candidate landslide image and to suppress background interference features in the candidate landslide image.

[0007] According to an embodiment of the present invention, each pulse driving module includes K layers of pulse driving sub-modules. Encoding the i-th first intermediate image feature with an impulse response to obtain the i-th second intermediate image feature includes: for the k-th layer of the K layers of pulse driving sub-modules of the i-th pulse driving module, performing a Gaussian convolution operation on the i-th first intermediate image feature to obtain the k-th layer convolutional image feature; and performing attention sorting and normalization on the k-th layer convolutional image feature based on a selective attention mechanism to obtain the i-th second intermediate image feature of the k-th layer, thereby obtaining the i-th second intermediate image features of the K layers, where K is a positive integer ≥ 1, and 1 ≤ k ≤ K.

[0008] According to an embodiment of the present invention, generating the i-th level image feature corresponding to the i-th pulse driving module based on the value feature and the i-th second intermediate image feature includes: multiplying the i-th second intermediate image feature of the k-th layer with the value feature to obtain the third intermediate image feature of the k-th layer, thereby obtaining the third intermediate image features of the K layers; and concatenating the third intermediate image features of the K layers to generate the i-th level image feature corresponding to the i-th pulse driving module.

[0009] According to an embodiment of the present invention, the cross-layer perceptual decoder includes a first decoding module, a second decoding module, and a third decoding module. Using the cross-layer perceptual decoder to decode the at least two levels of image features to obtain a target landslide image includes: using the first decoding module to decode the L-th level image feature corresponding to the L-th pulse driving module to obtain the L-th image decoding feature; repeatedly performing the following operation until the second image decoding feature is generated: using the second decoding module to decode the i-th image decoding feature and the (i-1)-th level image feature corresponding to the (i-1)-th pulse driving module to obtain the (i-1)-th image decoding feature; using the third decoding module to decode the second image decoding feature and the first-level image feature corresponding to the first pulse driving module to obtain the first image decoding feature; and obtaining the target landslide image based on the first image decoding feature.

[0010] According to an embodiment of the present invention, the first decoding module includes a multi-scale context-aware module and a high-efficiency upsampling module; using the first decoding module, the Lth level image feature corresponding to the Lth pulse driving module is decoded to obtain the Lth image decoding feature, including: using the multi-scale context-aware module to perform multi-scale feature extraction on the Lth level image feature to obtain the Lth intermediate image decoding feature; and using the high-efficiency upsampling module to upsample the Lth intermediate image decoding feature to obtain the Lth image decoding feature.

[0011] According to an embodiment of the present invention, the second decoding module includes a cross-scale cross-fusion module, a multi-scale context-aware module, and a high-efficiency upsampling module. Using the second decoding module to decode the i-th image decoding feature and the (i-1)-th level image feature corresponding to the (i-1)-th pulse driving module to obtain the (i-1)-th image decoding feature includes: using the cross-scale cross-fusion module to perform cross-scale cross-feature fusion on the i-th image decoding feature and the (i-1)-th level image feature corresponding to the (i-1)-th pulse driving module to obtain the (i-1)-th image fusion feature; using the multi-scale context-aware module to perform multi-scale feature extraction on the (i-1)-th image fusion feature to obtain the (i-1)-th intermediate image decoding feature; and using the high-efficiency upsampling module to upsample the (i-1)-th intermediate image decoding feature to obtain the (i-1)-th image decoding feature.

[0012] According to an embodiment of the present invention, the third decoding module includes a cross-scale cross-fusion module and a multi-scale context-aware module. The third decoding module decodes the second image decoding feature and the first-level image feature corresponding to the first pulse-driven module to obtain the first image decoding feature. This includes: using the cross-scale cross-fusion module to perform cross-scale cross-feature fusion on the second image decoding feature and the first-level image feature corresponding to the first pulse-driven module to obtain the first image fusion feature; and using the multi-scale context-aware module to extract multi-scale features from the first image fusion feature to obtain the first image decoding feature.

[0013] According to an embodiment of the present invention, obtaining a target landslide image based on a first image decoding feature includes: obtaining a target landslide image based on a first image decoding feature, an (i-1)th intermediate image decoding feature, and an Lth intermediate image decoding feature.
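The decoding order described in paragraphs [0009] to [0012] can be sketched as a short control-flow example. This is only an illustration of the order in which the three decoding modules are applied; the function name `cross_layer_decode` and the three callables are hypothetical placeholders, the internals of each module are not modeled, and the additional use of intermediate decoding features mentioned in [0013] is omitted.

```python
from typing import Callable, Sequence, TypeVar

F = TypeVar("F")  # stands in for whatever feature type the modules exchange


def cross_layer_decode(
    level_features: Sequence[F],
    first_module: Callable[[F], F],
    second_module: Callable[[F, F], F],
    third_module: Callable[[F, F], F],
) -> F:
    """Apply the three decoding modules in the order described in the text.

    level_features[0] is the first-level feature and level_features[-1]
    is the L-th level feature (hypothetical ordering chosen for this sketch).
    """
    num_levels = len(level_features)
    if num_levels < 3:
        raise ValueError("the text requires L >= 3 levels of image features")

    # First decoding module: L-th level feature -> L-th image decoding feature.
    decoded = first_module(level_features[num_levels - 1])

    # Second decoding module, repeated for i = L, L-1, ..., 3:
    # i-th decoding feature + (i-1)-th level feature -> (i-1)-th decoding feature.
    for i in range(num_levels, 2, -1):
        decoded = second_module(decoded, level_features[i - 2])

    # Third decoding module: second decoding feature + first-level feature.
    return third_module(decoded, level_features[0])
```

With string-building stand-ins for the modules, the nesting of the calls makes the order visible, for example `D3(D2(D2(D1(f4),f3),f2),f1)` for L = 4.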

[0014] Another aspect of the present invention provides an electronic device comprising:

[0015] One or more processors;

[0016] Memory, used to store one or more programs.

[0017] Wherein, when the one or more programs are executed by the one or more processors, the one or more processors implement the above method.

[0018] Another aspect of the present invention provides a computer-readable storage medium having executable instructions stored thereon, which, when executed by a processor, cause the processor to perform the method described above.

[0019] According to an embodiment of the present invention, candidate landslide images are obtained by preprocessing the remote sensing image to be extracted; the candidate landslide images are then encoded using an encoder with impulse response characteristics. The encoder includes L impulse drive modules, each generating a corresponding layer-level image feature, until the Lth layer-level image feature corresponding to the Lth impulse drive module is generated, resulting in L layer-level image features. These L layer-level image features are then decoded using a cross-layer perception decoder to obtain the target landslide image. By employing an encoder with impulse response characteristics to extract landslide features from the candidate landslide image and effectively suppressing background interference, the technology effectively focuses on the response to the landslide area and suppresses large-area background interference. Simultaneously, the use of a cross-layer perception decoding module enhances the fine perception and expression of landslide features, improving the accuracy of landslide area identification.

Attached Figure Description

[0020] The above and other objects, features and advantages of the present invention will become more apparent from the following description of embodiments of the invention with reference to the accompanying drawings, in which:

[0021] Figure 1 shows an exemplary system architecture to which the landslide range enhancement extraction method with pulse spatial suppression can be applied, according to an embodiment of the present invention;

[0022] Figure 2 shows a flowchart of a landslide range enhancement extraction method with pulse spatial suppression according to an embodiment of the present invention;

[0023] Figure 3 shows a schematic flowchart of the process of obtaining candidate landslide images according to an embodiment of the present invention;

[0024] Figure 4 shows a schematic diagram of generating the i-th level image feature corresponding to the i-th pulse driving module using the network structure of the i-th pulse driving module;

[0025] Figure 5 shows a schematic diagram of the structure of a multi-scale context-aware module according to an embodiment of the present invention;

[0026] Figure 6 shows a schematic diagram of the structure of a high-efficiency upsampling module according to an embodiment of the present invention;

[0027] Figure 7 shows a schematic diagram of the cross-scale cross-fusion module in the second decoding module according to an embodiment of the present invention;

[0028] Figure 8 shows a schematic diagram of the landslide extraction method according to an embodiment of the present invention;

[0029] Figure 9(A) shows the original image of a landslide according to an embodiment of the present invention;

[0030] Figure 9(B) is a schematic diagram of the landslide range enhancement extraction result after pulse spatial suppression of the original image in Figure 9(A);

[0031] Figure 9(C) shows an original image of a landslide according to another embodiment of the present invention;

[0032] Figure 9(D) is a schematic diagram of the landslide extent enhancement extraction result after pulse spatial suppression of the original image in Figure 9(C);

[0033] Figure 10 shows a block diagram of a landslide range enhancement extraction device with pulse spatial suppression according to an embodiment of the present invention;

[0034] Figure 11 shows a block diagram of an electronic device suitable for implementing the landslide range enhancement extraction method with pulse spatial suppression described above, according to an embodiment of the present invention.

Detailed Implementation

[0035] Hereinafter, embodiments of the present invention will be described with reference to the accompanying drawings. However, it should be understood that these descriptions are exemplary only and are not intended to limit the scope of the invention. In the following detailed description, numerous specific details are set forth to provide a thorough understanding of the embodiments of the invention for ease of explanation. However, it will be apparent that one or more embodiments may be practiced without these specific details. Furthermore, descriptions of well-known structures and techniques are omitted in the following description to avoid unnecessarily obscuring the concept of the invention.

[0036] The terminology used herein is for the purpose of describing particular embodiments only and is not intended to limit the invention. The terms “comprising,” “including,” etc., as used herein indicate the presence of features, steps, operations, and / or components, but do not exclude the presence or addition of one or more other features, steps, operations, or components.

[0037] All terms used herein (including technical and scientific terms) have the meanings commonly understood by those skilled in the art, unless otherwise defined. It should be noted that the terms used herein are to be interpreted in a manner consistent with the context of this specification, and not in an idealized or overly rigid way.

[0038] When using expressions such as "at least one of A, B and C", they should generally be interpreted in accordance with the meaning that is commonly understood by those skilled in the art (e.g., "a system having at least one of A, B and C" should include, but is not limited to, a system having A alone, a system having B alone, a system having C alone, a system having A and B, a system having A and C, a system having B and C, and / or a system having A, B and C, etc.).

[0039] In the embodiments of this invention, the collection, updating, analysis, processing, use, transmission, provision, disclosure, and storage of data (e.g., including but not limited to user personal information) comply with relevant laws and regulations, are used for legitimate purposes, and do not violate public order and good morals. In particular, necessary measures have been taken to prevent unauthorized access to user personal information data and to maintain the security of user personal information and network security.

[0040] In the embodiments of the present invention, the user's authorization or consent is obtained before acquiring or collecting the user's personal information.

[0041] Remote sensing technology, due to its ability to acquire large-scale, multi-temporal, and high-coverage surface information, has become an important data source for landslide extraction. In recent years, the combination of image processing and artificial intelligence technologies has driven the development of automatic landslide identification methods. Among these methods, pixel-based and object-based extraction strategies are widely used. However, these methods often rely on manually set feature thresholds, making them difficult to adapt to varied and complex terrain backgrounds. Machine learning methods, by learning the feature patterns in labeled samples, can achieve automatic differentiation between landslide and non-landslide areas to a certain extent, but they still rely heavily on manually constructed features and struggle to cope with the extraction challenges in environments with high background interference.

[0042] To further enhance the adaptive capabilities and recognition accuracy of models, deep learning has been rapidly applied in landslide extraction tasks. Through end-to-end training, it automatically learns feature representations without requiring manual feature rule design. This allows for more effective capture of complex landslide morphologies and differences between multiple landform sources, improving the model's robustness and generalization ability under various landform types and observation conditions, thus becoming an important development direction for landslide remote sensing identification. A multi-scale feature fusion scene parsing framework is proposed in related technologies to effectively extract landslide features. This framework, by integrating local and global information, can capture the details and overall structure of landslide areas at different scales, thereby improving the accuracy and robustness of landslide extraction.

[0043] However, the aforementioned methods primarily focus on specific, small-scale areas. In large-scale landslide extraction, landslides, as sparse targets, occupy only a tiny fraction of the pixels in remote sensing images, while background features are diverse and numerous, leading to a severe imbalance between the landslide and non-landslide categories. When a large area with complex landform types is processed, models easily overlook small-area landslides or those with blurred boundaries. Furthermore, in complex terrain, background elements such as bare soil and rocks have features similar to those of landslides, making boundary identification difficult and the extraction results fragmented. Existing pseudo-sample generation, prototype learning, or domain adaptation methods can alleviate some of these problems, but synthetic samples lack realistic texture, prototype learning cannot fully learn the diversity of landslides, and domain adaptation may blur boundaries or enhance irrelevant backgrounds, making it difficult to accurately capture landslide features at different scales in complex scenes. Therefore, improving the model's sensitivity and discriminative power for landslide areas under large-scale, highly disturbed background conditions, and especially maintaining stable performance under extreme class imbalance, has become the core challenge and development direction of current landslide remote sensing extraction research.

[0044] To address the aforementioned issues, this invention utilizes landslide candidate region extraction based on principal component analysis to eliminate large-scale background interference. Simultaneously, it introduces pulse-driven sparse feature learning to adaptively suppress background response and enhance the capture of landslides with blurred boundaries at the model level, thereby achieving high-precision extraction of large-scale landslides.

[0045] In view of this, embodiments of the present invention provide a landslide range enhancement extraction method with impulse spatial suppression, comprising: preprocessing the remote sensing image to be extracted to obtain a candidate landslide image; encoding the candidate landslide image using an encoder with impulse response characteristics to generate at least two levels of image features, wherein the encoder with impulse response characteristics is used to improve the extraction of landslide features in the candidate landslide image and to suppress background interference features in the candidate landslide image; and decoding the at least two levels of image features using a cross-layer perceptual decoder to obtain a target landslide image.

[0046] Figure 1 shows an exemplary system architecture for landslide extent enhancement extraction using a pulse spatial suppression method according to an embodiment of the present invention. It should be noted that Figure 1 shows merely an example of a system architecture to which embodiments of the present invention can be applied, in order to help those skilled in the art understand the technical content of the present invention; it does not mean that embodiments of the present invention cannot be used in other devices, systems, environments or scenarios.

[0047] As shown in Figure 1, the system architecture according to this embodiment may include a first terminal device 101, a second terminal device 102, a third terminal device 103, a network 104, and a server 105. The network 104 serves as a medium for providing communication links between the first terminal device 101, the second terminal device 102, the third terminal device 103, and the server 105. The network 104 may include various connection types, such as wired and / or wireless communication links, etc.

[0048] Users can use the first terminal device 101, the second terminal device 102, and the third terminal device 103 to interact with the server 105 via the network 104 to receive or send messages, etc. Various communication client applications can be installed on the first terminal device 101, the second terminal device 102, and the third terminal device 103, such as shopping applications, web browser applications, search applications, instant messaging tools, email clients, and / or social media platform software, etc. (for example only).

[0049] The first terminal device 101, the second terminal device 102, and the third terminal device 103 can be various electronic devices with displays and support web browsing, including but not limited to smartphones, tablets, laptops, and desktop computers.

[0050] Server 105 can be a server that provides various services, such as a backend management server that supports websites browsed by users using the first terminal device 101, the second terminal device 102, and the third terminal device 103 (this is just an example). The backend management server can analyze and process data such as received user requests, and feed back the processing results (such as web pages, information, or data obtained or generated according to user requests) to the terminal devices.

[0051] It should be noted that the landslide range enhancement extraction method with pulse spatial suppression provided in this embodiment of the invention can generally be executed by server 105. Correspondingly, the landslide range enhancement extraction device with pulse spatial suppression provided in this embodiment of the invention can generally be located in server 105. The landslide range enhancement extraction method with pulse spatial suppression provided in this embodiment of the invention can also be executed by a server or server cluster that is different from server 105 and capable of communicating with the first terminal device 101, the second terminal device 102, the third terminal device 103, and / or server 105. Correspondingly, the landslide range enhancement extraction device with pulse spatial suppression provided in this embodiment of the invention can also be located in a server or server cluster that is different from server 105 and capable of communicating with the first terminal device 101, the second terminal device 102, the third terminal device 103, and / or server 105. Alternatively, the landslide range enhancement extraction method with pulse spatial suppression provided in this embodiment of the invention can also be executed by the first terminal device 101, the second terminal device 102, or the third terminal device 103, or by other terminal devices different from the first terminal device 101, the second terminal device 102, or the third terminal device 103. Accordingly, the landslide range enhancement extraction device with pulse spatial suppression provided in this embodiment of the invention can also be set in the first terminal device 101, the second terminal device 102 or the third terminal device 103, or in other terminal devices different from the first terminal device 101, the second terminal device 102 or the third terminal device 103.

[0052] For example, the remote sensing image to be extracted may originally be stored in any one of the first terminal device 101, the second terminal device 102, or the third terminal device 103 (e.g., the first terminal device 101, but not limited thereto), or it may be stored on an external storage device and imported into the first terminal device 101. Then, the first terminal device 101 may locally execute the landslide range enhancement extraction method with pulse spatial suppression provided in this embodiment of the invention, or send the remote sensing image to be extracted to other terminal devices, servers, or server clusters, and have the other terminal devices, servers, or server clusters that receive the remote sensing image to be extracted execute the landslide range enhancement extraction method with pulse spatial suppression provided in this embodiment of the invention.

[0053] It should be understood that the numbers of terminal devices, networks, and servers shown in Figure 1 are merely illustrative. Depending on implementation needs, any number of terminal devices, networks, and servers can be included.

[0054] Figure 2 shows a flowchart of a landslide range enhancement extraction method with pulse spatial suppression according to an embodiment of the present invention.

[0055] As shown in Figure 2, the method includes operations S201 to S203.

[0056] In operation S201, the remote sensing image to be extracted is preprocessed to obtain candidate landslide images.

[0057] According to an embodiment of the present invention, the remote sensing image to be extracted is an image with a large number of background area features and sparse landslide area features.

[0058] According to embodiments of the present invention, the ground terrain of the remote sensing image to be extracted can be digitally simulated to calculate the slope of each region of the remote sensing image. Based on a set slope threshold, regions with slopes less than the set slope threshold are removed, while regions with slopes greater than the set slope threshold are retained. For example, regions with slopes greater than the set slope threshold can be regions with a certain tilt angle and obvious terrain undulations.
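The slope screening in [0058] can be illustrated with a minimal sketch. It assumes the terrain is available as a gridded elevation array (a digital elevation model) with square cells; the function names, the `cell_size` parameter, and the finite-difference slope estimate are illustrative choices that the source text does not specify.

```python
import numpy as np


def slope_degrees(dem: np.ndarray, cell_size: float) -> np.ndarray:
    """Estimate slope in degrees from an elevation grid with finite differences.

    dem       : 2-D elevation array (assumed input, same units as cell_size)
    cell_size : ground distance between neighbouring cells (assumed square cells)
    """
    # np.gradient returns one derivative array per axis: rows first, then columns.
    dz_dy, dz_dx = np.gradient(np.asarray(dem, dtype=float), cell_size)
    return np.degrees(np.arctan(np.hypot(dz_dx, dz_dy)))


def slope_mask(dem: np.ndarray, cell_size: float, min_slope_deg: float) -> np.ndarray:
    """Keep cells whose slope exceeds the set threshold; flatter cells are removed."""
    return slope_degrees(dem, cell_size) > min_slope_deg
```

The threshold value itself is left to the caller, since the text only refers to "a set slope threshold".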

[0059] According to an embodiment of the present invention, the snow cover and ice cover regions of the remote sensing image to be extracted can be identified by calculating the Normalized Difference Snow Index (NDSI), and the identified snow cover and ice cover regions can be masked to avoid confusion between the high reflectivity of the snow cover and ice cover regions and landslide characteristics.

[0060] According to an embodiment of the present invention, vegetation cover areas in the remote sensing image to be extracted can be identified by calculating the Normalized Difference Vegetation Index (NDVI), and the identified vegetation cover areas can be masked.
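Both indices in [0059] and [0060] are normalized band differences: NDSI is conventionally computed as (Green - SWIR) / (Green + SWIR) and NDVI as (NIR - Red) / (NIR + Red). The sketch below shows one way the two masks might be derived. The band arguments, function names, and the default thresholds 0.4 and 0.3 are assumptions for illustration; the source text does not give threshold values.

```python
import numpy as np


def normalized_difference(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Compute (a - b) / (a + b), returning 0 where the denominator is 0."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    denom = a + b
    out = np.zeros_like(denom)
    np.divide(a - b, denom, out=out, where=denom != 0)
    return out


def snow_and_vegetation_masks(green, red, nir, swir,
                              ndsi_threshold=0.4, ndvi_threshold=0.3):
    """Return boolean masks (snow_or_ice, vegetation) from reflectance bands.

    The thresholds are illustrative assumptions, not values from the patent.
    """
    ndsi = normalized_difference(green, swir)  # snow and ice: high green, low SWIR
    ndvi = normalized_difference(nir, red)     # vegetation: high NIR, low red
    return ndsi > ndsi_threshold, ndvi > ndvi_threshold
```

Pixels flagged by either mask would then be excluded from the candidate landslide area.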

[0061] According to an embodiment of the present invention, adjacent pre-temporal remote sensing images of the same region as the remote sensing image to be extracted can be acquired. Principal Component Analysis (PCA) is then used to perform principal component decomposition of multi-band, multi-temporal pixel reflectance data on both the remote sensing image to be extracted and the adjacent pre-temporal remote sensing images of the same region. This extracts the principal component images of the two images along the direction of maximum variance, i.e., the variation images, to highlight areas of significant surface disturbance. These disturbed areas reflect the degree of change in the surface state between the adjacent pre-temporal remote sensing image and the remote sensing image to be extracted at those two time points, and can serve as a potential indicator of landslide occurrence.
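A minimal sketch of the principal component step in [0061] follows. It assumes the two co-registered images are given as arrays of shape (bands, height, width), stacks their bands per pixel, and projects each pixel onto the direction of maximum variance. The function name and the array layout are hypothetical, and the patent may organize the decomposition differently.

```python
import numpy as np


def pca_change_image(before: np.ndarray, after: np.ndarray) -> np.ndarray:
    """Project stacked bi-temporal reflectance onto the first principal component.

    before, after : arrays of shape (bands, height, width), assumed co-registered.
    Returns an array of shape (height, width) with the first-component score.
    """
    if before.shape != after.shape:
        raise ValueError("both images must have the same shape")
    bands, height, width = before.shape

    # One row per pixel, one column per (date, band) combination.
    stacked = np.concatenate([before, after], axis=0)
    samples = stacked.reshape(2 * bands, height * width).T.astype(float)

    centered = samples - samples.mean(axis=0)
    covariance = centered.T @ centered / max(height * width - 1, 1)

    # eigh returns eigenvalues in ascending order, so the last eigenvector
    # is the direction of maximum variance.
    _, eigenvectors = np.linalg.eigh(covariance)
    first_component = eigenvectors[:, -1]

    return (centered @ first_component).reshape(height, width)
```

The sign of a principal component is arbitrary, so the absolute value of the score is the more convenient quantity when looking for strongly disturbed pixels.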

[0062] According to an embodiment of the present invention, based on the verification constraints formed by the surface disturbance areas highlighted in the above-mentioned change image, the slope, the Normalized Difference Snow Index, and the Normalized Difference Vegetation Index, areas that show obvious surface disturbance, satisfy the terrain condition for landslide occurrence (i.e., the slope condition), and are not covered by snow or vegetation are extracted from the remote sensing image to be extracted as candidate landslide images, so as to eliminate large-scale background interference.
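The combination of constraints in [0062] amounts to a logical AND over per-pixel conditions. The sketch below assumes each constraint has already been computed as an array of the same height and width; the function names, the `change_threshold` parameter, and the choice to zero out excluded pixels are illustrative assumptions rather than details from the source.

```python
import numpy as np


def candidate_mask(change_score: np.ndarray, slope_ok: np.ndarray,
                   snow: np.ndarray, vegetation: np.ndarray,
                   change_threshold: float) -> np.ndarray:
    """Combine the verification constraints into one boolean mask.

    A pixel is kept only if it is strongly disturbed, satisfies the slope
    condition, and is covered by neither snow/ice nor vegetation.
    """
    disturbed = np.abs(change_score) > change_threshold
    return disturbed & slope_ok & ~snow & ~vegetation


def apply_mask(image: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Zero out excluded pixels of a (bands, height, width) image."""
    return np.where(mask[None, :, :], image, 0)
```

Cropping the image to the bounding boxes of the retained regions would be an equally plausible way to form the candidate images.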

[0063] According to embodiments of the present invention, applying the above-mentioned verification constraints can significantly reduce background interference features in the remote sensing image to be extracted and improve the distinction between landslide features and non-landslide features, thereby effectively alleviating the impact that the class imbalance between landslide areas and background areas in the remote sensing image has on the landslide feature extraction task.

[0064] In operation S202, the candidate landslide image is encoded using an encoder with impulse response characteristics to generate multiple levels of image features. The encoder with impulse response characteristics is used to improve the extraction of landslide features in the candidate landslide image and to suppress background interference features in the candidate landslide image.

[0065] According to an embodiment of the present invention, the encoder with impulse response characteristics simulates the nonlinear selective response mechanism of neurons, namely "activation above a threshold, and non-activation or inhibition otherwise". When extracting features from candidate landslide images, it therefore extracts features only from high-response landslide areas of interest, while suppressing features from low-response background areas.

[0066] According to an embodiment of the present invention, each of the multiple hierarchical image features can be obtained by extracting features from the previous hierarchical image feature using the corresponding pulse drive module in an encoder with impulse response characteristics, and the first hierarchical image feature is obtained by extracting features from the candidate landslide image using the first pulse drive module. Each pulse drive module generates a corresponding hierarchical image feature.

[0067] According to an embodiment of the present invention, the above operation S202 may include: repeatedly performing the following operations until the Lth level image feature corresponding to the Lth pulse driving module is generated, thereby obtaining L level image features: sub-operations S202-1 to sub-operations S202-4.

[0068] In sub-operation S202-1, for the i-th pulse driving module, layer normalization and linear mapping are performed on the (i-1)-th level image features to generate query features, key features and value features corresponding to the i-th pulse driving module, where L is a positive integer ≥ 3 and 2 ≤ i ≤ L.

[0069] In sub-operation S202-2, the i-th first intermediate image feature is determined based on the query feature and the key feature.

[0070] In sub-operation S202-3, the i-th first intermediate image feature is encoded with impulse response to obtain the i-th second intermediate image feature.

[0071] In sub-operation S202-4, based on the value features and the i-th second intermediate image features, the i-th level image feature corresponding to the i-th pulse driving module is generated. When i=1, the first level image feature is generated by encoding the candidate landslide image using the first pulse driving module.

[0072] According to an embodiment of the present invention, an encoder with impulse response characteristics includes L impulse drive modules, each impulse drive module generating a corresponding layer image feature.

[0073] According to an embodiment of the present invention, an encoder with impulse response characteristics may include L pulse drive modules. Each pulse drive module of the encoder with impulse response characteristics can generate corresponding hierarchical image features.

[0074] According to an embodiment of the present invention, a candidate landslide image can be used as the input of the encoder. The candidate landslide image is divided into blocks, and the block-wise candidate landslide image is input to the first pulse driving module, which outputs the first level image feature. The first level image feature is input to the second pulse driving module, which outputs the second level image feature, and so on. For the i-th pulse driving module, the (i-1)-th level image feature output by the (i-1)-th pulse driving module is input to the i-th pulse driving module, which outputs the i-th level image feature, until the (L-1)-th level image feature is input to the L-th pulse driving module, which outputs the L-th level image feature.

[0075] According to an embodiment of the present invention, for the i-th pulse driving module, obtaining the i-th level image feature corresponding to the i-th pulse driving module may include the following operations: after performing layer normalization and linear mapping on the (i-1)-th level image feature, corresponding query features, key features and value features can be generated; matrix multiplication calculation is performed based on the query features and key features to obtain the i-th first intermediate image feature.
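The layer normalization, linear mapping and query-key multiplication described above can be sketched as follows. The token layout `(N, C)`, the omission of learnable gain and bias in the layer normalization, and the random weight matrices are assumptions for illustration. No scaling factor is applied because the text mentions only a matrix multiplication.

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    """Normalize each token over its channel dimension (no learnable parameters)."""
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def first_intermediate_feature(prev_feature, w_q, w_k, w_v):
    """prev_feature: (N, C) tokens of the (i-1)-th level. Returns (scores, v)."""
    x = layer_norm(prev_feature)
    q, k, v = x @ w_q, x @ w_k, x @ w_v  # linear mappings
    scores = q @ k.T                     # i-th first intermediate image feature, (N, N)
    return scores, v

rng = np.random.default_rng(0)
tokens = rng.standard_normal((16, 8))          # hypothetical 16 tokens, 8 channels
w_q, w_k, w_v = (0.1 * rng.standard_normal((8, 8)) for _ in range(3))
scores, v = first_intermediate_feature(tokens, w_q, w_k, w_v)
```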

[0076] In operation S203, the cross-layer perception decoder is used to decode the image features of L layers to obtain the target landslide image.

[0077] According to embodiments of the present invention, a cross-layer perceptual decoder may include a multi-scale context-aware module, an efficient upsampling module, and a cross-scale cross-fusion module. The multi-scale context-aware module is embedded in each stage of the decoding process to improve the discriminative power of features and the ability to focus on key regions; the efficient upsampling module is inserted in each decoding stage and is dedicated to restoring spatial dimensions and compensating for the lack of low-level semantic information; the cross-scale cross-fusion module fuses multi-layer feature information and establishes connections between different semantic levels through a cross-convolutional fusion strategy.

[0078] According to an embodiment of the present invention, the modules of the cross-layer sensing decoder can take the level image features output by the respective pulse drive modules of the encoder with impulse response characteristics and decode them to obtain corresponding image decoding features. The obtained image decoding features are then fused and weighted to obtain the target landslide image.

[0079] According to an embodiment of the present invention, candidate landslide images are obtained by preprocessing the remote sensing image to be extracted; the candidate landslide images are then encoded using an encoder with impulse response characteristics. The encoder includes L impulse drive modules, each generating a corresponding layer-level image feature, until the Lth layer-level image feature corresponding to the Lth impulse drive module is generated, resulting in L layer-level image features. At least two layer-level image features are then decoded using a cross-layer perception decoder to obtain the target landslide image. By employing an encoder with impulse response characteristics to extract landslide features from the candidate landslide image and effectively suppressing background interference, the technology effectively focuses on the response to the landslide area and suppresses large-area background interference. Simultaneously, the use of a cross-layer perception decoding module enhances the fine perception and expression of landslide features, improving the accuracy of landslide area identification.

[0080] Figure 3 shows a schematic diagram of the process for obtaining candidate landslide images according to an embodiment of the present invention.

[0081] As shown in Figure 3, a remote sensing image 301 to be extracted and an adjacent pre-temporal remote sensing image 302 of the same region are acquired. Remote sensing feature extraction 303, which includes slope calculation 303-1 and index calculation 303-2, is performed on the remote sensing image 301 to be extracted. Slope calculation 303-1 yields the slope value 304 of the remote sensing image to be extracted. Index calculation 303-2 yields the Normalized Difference Snow Index (NDSI) 305 and the Normalized Difference Vegetation Index (NDVI) 306 of the remote sensing image to be extracted. Principal component analysis (PCA) 307 is performed on the remote sensing image 301 to be extracted and the adjacent pre-temporal remote sensing image 302 to obtain the change image 308 between the two. Based on a preset threshold, threshold segmentation is performed on the slope value 304, the NDSI 305, the NDVI 306 and the change image 308 to obtain the candidate landslide image 309.

[0082] According to an embodiment of the present invention, each pulse driving module includes K layers of pulse driving sub-modules. Encoding the i-th first intermediate image feature with impulse response to obtain the i-th second intermediate image feature includes: at the k-th layer pulse driving sub-module of the i-th pulse driving module, performing a Gaussian convolution operation on the i-th first intermediate image feature to obtain the k-th layer convolutional image feature; and performing attention sorting and normalization processing on the k-th layer convolutional image feature based on a selective attention mechanism to obtain the i-th second intermediate image feature of the k-th layer, thereby obtaining the i-th second intermediate image features of all K layers, where K is a positive integer ≥ 1, and 1 ≤ k ≤ K.

[0083] According to an embodiment of the present invention, for the i-th pulse driving module, the i-th first intermediate image feature is input into the K-layer pulse driving sub-modules in the i-th pulse driving module. For each layer pulse driving sub-module, based on the pulse response mechanism, the landslide feature of the i-th first intermediate image feature is made to achieve high response, the landslide feature is extracted, and the background feature is suppressed, thereby obtaining the i-th second intermediate image feature output by each layer pulse driving sub-module.

[0084] Specifically, each pulse-driven submodule can include a Gaussian convolution unit, a selective attention unit, an attention mask unit, and a normalization unit. For the k-th layer pulse-driven submodule of the i-th pulse-driven module, the Gaussian convolution unit can be used to perform a Gaussian convolution operation on the i-th first intermediate image feature. Based on the impulse response mechanism, the Gaussian convolution unit models a decay distribution that decreases with spatial distance, so that feature responses closer to the center of the target region receive higher weights while the influence of regions far from the center is gradually reduced, thereby forming a spatially selective attention response map, i.e., the k-th layer convolutional image feature. Then, the selective attention unit is used to calculate attention over this feature, yielding the attention weight for each feature element. These attention weights are sorted from highest to lowest, retaining the top 75% of the feature elements in the k-th layer's connection channels. A sparse activation mask is constructed using the attention mask unit, and normalization is performed using the normalization unit to obtain the i-th second intermediate image feature of the k-th layer. In this way, the i-th second intermediate image features of all K layers can be obtained.

[0085] According to an embodiment of the present invention, generating an i-th level image feature corresponding to the i-th pulse driving module based on the value feature and the i-th second intermediate image feature includes: multiplying the i-th second intermediate image feature of the k-th layer with the value feature to obtain a third intermediate image feature of the k-th layer, thereby obtaining the third intermediate image features of K layers; and stitching the third intermediate image features of the K layers to generate the i-th level image feature corresponding to the i-th pulse driving module.

[0086] According to an embodiment of the present invention, the i-th second intermediate image feature of each of the K layers can be multiplied with the value feature to obtain the third intermediate image feature of that layer, so that the third intermediate image features of K layers are obtained. These can then be stitched together to obtain the i-th level image feature corresponding to the i-th pulse driving module.

[0087] According to embodiments of the present invention, a sparse activation strategy integrating a pulse-driven Gaussian convolution module and a selective attention module in the encoder can be utilized to trigger attention calculation only in the region of interest, significantly suppressing low-response background regions and improving the response sensitivity and discrimination ability for landslide areas. The final output is the landslide extraction result. This effectively focuses on the response of the landslide area, suppresses large-area background interference, and is suitable for remote sensing scenarios where the landslide and background categories are extremely imbalanced.

[0088] According to an embodiment of the present invention, the two-dimensional Gaussian distribution of the Gaussian convolution unit in each pulse-driven submodule can be used as a weight generation function. Gaussian convolution units are constructed using the weight values calculated by this weight generation function, so that the response decay rate of the candidate landslide image features can be adjusted by controlling the standard deviation parameter of the Gaussian convolution unit. The weight generation function is shown in equation (1):

[0089] (1);

[0090] Where x represents the horizontal offset of the current feature element position relative to the center of the convolution kernel, y represents the vertical offset of the current feature element position relative to the center of the convolution kernel, σ is a distribution control parameter used to adjust the rate at which the response value decays as the spatial distance changes, and d is a shape control parameter.

[0091] According to an embodiment of the present invention, a spatially sensitive weighting function is first introduced into the local feature region at each location in each candidate landslide image to construct a location-related adjustment factor. This factor employs a pixel-centric decay distribution modeling approach, decreasing with spatial distance. This ensures that feature responses closer to the target center receive higher weights, while regions farther from the center have their influence gradually reduced, thus forming a spatially selective attention response map. Through this spatial adjustment mechanism, the model can be guided to focus on key regions while maintaining its receptive field breadth, enhancing the expressive strength of the landslide target and improving the robustness and discriminative performance of the attention module in complex remote sensing images.
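The distance-decaying weighting can be illustrated with a standard two-dimensional Gaussian kernel. This is a sketch under assumptions: the kernel is normalized to sum to 1, the kernel size is odd, zero padding is used, and the shape control parameter d is left out because its exact role in equation (1) cannot be recovered from the text. Feeding a unit impulse through the convolution returns the kernel itself, which is the impulse response of the unit.

```python
import numpy as np

def gaussian_kernel(size, sigma):
    """Odd-sized 2-D Gaussian kernel; x, y are offsets from the kernel center."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    weights = np.exp(-(x**2 + y**2) / (2.0 * sigma**2))
    return weights / weights.sum()  # normalization is an assumption

def gaussian_conv2d(feature, kernel):
    """Same-size 2-D convolution with zero padding (kernel is symmetric)."""
    kh, kw = kernel.shape
    h, w = feature.shape
    padded = np.pad(feature, ((kh // 2, kh // 2), (kw // 2, kw // 2)))
    out = np.zeros((h, w), dtype=np.float64)
    for dy in range(kh):
        for dx in range(kw):
            out += kernel[dy, dx] * padded[dy:dy + h, dx:dx + w]
    return out

narrow = gaussian_kernel(5, sigma=0.5)  # fast decay: response concentrated at center
wide = gaussian_kernel(5, sigma=2.0)    # slow decay: broader spatial influence

impulse = np.zeros((7, 7))
impulse[3, 3] = 1.0
response = gaussian_conv2d(impulse, narrow)
```

A smaller σ concentrates the weight near the center, while a larger σ spreads it over a wider neighborhood.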

[0092] According to an embodiment of the present invention, selective attention units can be used to perform selective attention calculation on the convolutional image features obtained by the Gaussian convolution units, generating an attention weight for each feature element in the convolutional image features and sorting the feature elements from high to low by attention weight. An attention weight ratio parameter can be set for each pulse-driven submodule, and the feature elements of the convolutional image features whose attention weights fall within the preset ratio are used to construct an activation mask, yielding a mask matrix. The attention weight ratio parameters (i.e., top-k ratio parameters) set for the individual pulse-driven submodules can be different. For example, if the pulse-driven module includes 3 pulse-driven submodules, the attention weight ratio parameter set for the first pulse-driven submodule can be 75%, that of the second pulse-driven submodule can be 67%, and that of the third pulse-driven submodule can be 50%. Activation masks with different sparsities correspond to response ranges under different threshold conditions. The generated mask matrix is calculated using equation (2), which is shown below:

[0093] (2);

[0094] Where j denotes the j-th attention head, p denotes the p-th query position, and k denotes the k-th layer pulse-driven submodule; the quantity computed by equation (2) is the generated mask corresponding to the k-th layer pulse-driven submodule, and the ratio in equation (2) is the attention weight ratio parameter. The formula means: if the attention score corresponding to the p-th query position of the current query falls within the top proportion given by the attention weight ratio parameter, the mask takes the value 1, indicating that the connection channel is selected and the feature corresponding to the p-th position is retained; otherwise, the mask takes the value 0, indicating that the connection channel is not selected and the feature corresponding to the p-th position is not retained.
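The top-ratio masking rule can be sketched as below. The sketch ranks the scores within each row and keeps the highest-scoring fraction; ranking per row, rounding the kept count to the nearest integer, and the function name are assumptions, since the original symbols of equation (2) are not available in the text. The ratios 75%, 67% and 50% are the example values given for three submodules.

```python
import numpy as np

def top_ratio_mask(scores, ratio):
    """Keep the top `ratio` fraction of entries in each row (1 = kept, 0 = dropped)."""
    n = scores.shape[-1]
    keep = max(1, int(round(ratio * n)))  # rounding rule is an assumption
    # Indices of the `keep` largest scores in every row.
    top_idx = np.argsort(scores, axis=-1)[..., ::-1][..., :keep]
    mask = np.zeros(scores.shape, dtype=np.int64)
    np.put_along_axis(mask, top_idx, 1, axis=-1)
    return mask

# One query row with 12 attention scores (toy values).
scores = np.array([[3, 11, 0, 7, 5, 9, 1, 10, 2, 8, 4, 6]], dtype=float)
masks = [top_ratio_mask(scores, r) for r in (0.75, 0.67, 0.50)]
kept_counts = [int(m.sum()) for m in masks]
```

Lower ratios give sparser masks, so later submodules respond only to the strongest scores.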

[0095] According to an embodiment of the present invention, based on the above method, a selective attention result mask feature corresponding to each layer of pulse-driven submodule is generated, that is, the i-th second intermediate image feature of each layer. Normalization and feature aggregation are then performed on the i-th second intermediate image feature of each layer to obtain the third intermediate image feature of each layer, for example as shown in equation (3):

[0096] (3);

[0097] In equation (3), the terms are, respectively: the selective attention result mask feature corresponding to the k-th layer pulse-driven submodule; the normalization function; and V, the value feature.

[0098] According to an embodiment of the present invention, the i-th layer image feature corresponding to the i-th pulse driving module can be obtained by weighted summation of the third intermediate image features of each layer.
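The per-layer normalization, multiplication with the value feature and weighted summation can be sketched as follows. Softmax is used as the normalization function and equal layer weights are used in the example; both are assumptions, since the text does not name the normalization function or the weights. Every row of a mask must keep at least one entry for the softmax to be defined.

```python
import numpy as np

def masked_softmax(scores, mask):
    """Softmax over kept entries only; dropped entries receive weight 0."""
    masked = np.where(mask == 1, scores, -np.inf)
    masked = masked - masked.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(masked)
    return weights / weights.sum(axis=-1, keepdims=True)

def aggregate_layers(scores_per_layer, masks, v, layer_weights):
    """Weighted sum of the K third intermediate features (one per submodule layer)."""
    thirds = [masked_softmax(s, m) @ v for s, m in zip(scores_per_layer, masks)]
    return sum(w * t for w, t in zip(layer_weights, thirds))

scores = np.array([[2.0, 1.0, 0.0], [0.0, 1.0, 2.0]])
mask_a = np.array([[1, 1, 0], [0, 1, 1]])  # denser mask
mask_b = np.array([[1, 0, 0], [0, 0, 1]])  # sparser mask
v = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
feature = aggregate_layers([scores, scores], [mask_a, mask_b], v, [0.5, 0.5])
```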

[0099] According to an embodiment of the present invention, the above method can be used to determine the i-th level image feature corresponding to each pulse driving module.

[0100] Figure 4 illustrates the generation of the i-th level image feature corresponding to the i-th pulse driving module using the network structure of the i-th pulse driving module.

[0101] As shown in Figure 4, the number of pulse driving sub-modules included in the i-th pulse driving module can be set to K=3 layers, and the network structure of each layer of pulse driving sub-module is the same. The (i-1)th level image feature 401 corresponding to the (i-1)th pulse driving module is input into the i-th pulse driving module for layer normalization and linear mapping to generate query feature 402, key feature 403, and value feature 404 corresponding to the i-th pulse driving module. Query feature 402 and key feature 403 are matrix-multiplied to obtain the i-th first intermediate image feature 405. The i-th first intermediate image feature 405 is input into the Gaussian convolution unit 406-1 in the first layer pulse driving submodule 406 for Gaussian convolution to obtain the first layer convolutional image feature. Attention is calculated on the first layer convolutional image feature by the selective attention unit 406-2, the attention mask unit 406-3 masks the first layer convolutional image feature obtained after attention calculation, and the normalization unit 406-4 performs normalization to generate the i-th second intermediate image feature 409 of the first layer. Similarly, the i-th first intermediate image feature 405 is input into the second layer pulse driving submodule 407 to generate the i-th second intermediate image feature 410 of the second layer, and into the third layer pulse driving submodule 408 to generate the i-th second intermediate image feature 411 of the third layer. The i-th second intermediate image feature 409 of the first layer is multiplied by the value feature 404 to obtain the third intermediate image feature 412 of the first layer; similarly, the i-th second intermediate image feature 410 of the second layer is multiplied by the value feature 404 to obtain the third intermediate image feature 413 of the second layer, and the i-th second intermediate image feature 411 of the third layer is multiplied by the value feature 404 to obtain the third intermediate image feature 414 of the third layer. The third intermediate image features 412, 413 and 414 of the first, second and third layers are weighted and summed to obtain the i-th level image feature 415 corresponding to the i-th pulse driving module.

[0102] According to an embodiment of the present invention, the cross-layer sensing decoder includes a first decoding module, a second decoding module, and a third decoding module. The cross-layer sensing decoder decodes L levels of image features to obtain a target landslide image, including:

[0103] Using the first decoding module, the Lth level image feature corresponding to the Lth pulse driving module is decoded to obtain the Lth image decoded feature;

[0104] Repeat this operation until the second image decoding feature is generated: using the second decoding module, decode the i-th image decoding feature and the i-1th level image feature corresponding to the i-1th pulse drive module to obtain the i-1th image decoding feature;

[0105] The third decoding module is used to decode the second image decoding feature and the first level image feature corresponding to the first pulse drive module to obtain the first image decoding feature; based on the first image decoding feature, the target landslide image is obtained.

[0106] According to an embodiment of the present invention, the first decoding module corresponds to the last pulse drive module in the encoder with impulse response characteristics. That is, the first decoding module in the cross-layer sensing decoder is used to decode the Lth level image feature output by the last pulse drive module (the Lth pulse drive module). The second decoding module may include multiple modules, each corresponding one-to-one with the 2nd to L-1th pulse drive modules in the encoder with impulse response characteristics, and is used to decode the fusion feature of the level image feature output by the corresponding pulse drive module and the image decoding feature output by the previous second decoding module. The third decoding module corresponds to the first pulse drive module in the encoder with impulse response characteristics. That is, the third decoding module in the cross-layer sensing decoder is used to decode the fusion feature of the first level image feature output by the first pulse drive module and the image decoding feature decoded by the last second decoding module.

[0107] According to an embodiment of the present invention, for example, the first decoding module is used to decode the Lth level image feature to obtain the Lth image decoding feature. Then, the second decoding module corresponding to the (L-1)th pulse driving module is used to decode the Lth image decoding feature and the (L-1)th level image feature output by the (L-1)th pulse driving module to obtain the (L-1)th image decoding feature. By analogy, the second decoding module corresponding to the (i-1)th pulse driving module is used to decode the i-th image decoding feature and the (i-1)th level image feature corresponding to the (i-1)th pulse driving module, yielding the (i-1)th image decoding feature. This process continues until the second decoding module corresponding to the second pulse drive module decodes the third image decoding feature and the second-level image feature corresponding to the second pulse drive module, yielding the second image decoding feature. Then, the third decoding module decodes the second image decoding feature and the first-level image feature corresponding to the first pulse drive module, yielding the first image decoding feature. Based on the first image decoding feature, the target landslide image is obtained.
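The top-down order in which the decoding modules are applied can be sketched as a simple loop. The function name `decode`, the dictionary keyed by level, and the string-building stand-in modules are hypothetical; they only trace the data flow and do not perform any real decoding.

```python
def decode(level_features, first_module, second_modules, third_module):
    """level_features[0] is level 1 and level_features[-1] is level L (L >= 3)."""
    L = len(level_features)
    decoded = first_module(level_features[L - 1])       # L-th image decoding feature
    # Levels L-1 down to 2 each use their own second decoding module.
    for level in range(L - 1, 1, -1):
        decoded = second_modules[level](decoded, level_features[level - 1])
    return third_module(decoded, level_features[0])     # first image decoding feature

# Stand-in modules that record the call order as a string (L = 4).
features = ["F1", "F2", "F3", "F4"]
first = lambda f: f"D4({f})"
seconds = {lvl: (lambda d, f, lvl=lvl: f"D{lvl}({d},{f})") for lvl in (2, 3)}
third = lambda d, f: f"D1({d},{f})"
trace = decode(features, first, seconds, third)
```

For L = 4 the trace reads `D1(D2(D3(D4(F4),F3),F2),F1)`, matching the order described in the paragraph above.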

[0108] According to embodiments of the present invention, fine perception and representation of landslide areas are achieved through cross-layer image feature fusion. This decoder structure, while maintaining semantic consistency, effectively utilizes contextual information at different scales, which helps improve the recognition capability of complex landslide areas and ultimately outputs high-precision landslide recognition results.

[0109] According to an embodiment of the present invention, the first decoding module includes a multi-scale context-aware module and a high-efficiency upsampling module. The first decoding module decodes the Lth-level image features corresponding to the Lth pulse-driven module to obtain the Lth image decoding feature, including: using the multi-scale context-aware module to perform multi-scale feature extraction on the Lth-level image features to obtain the Lth intermediate image decoding feature; and using the high-efficiency upsampling module to upsample the Lth intermediate image decoding feature to obtain the Lth image decoding feature.

[0110] According to embodiments of the present invention, a Multi-scale Convolutional Attention Module (MSCAM) can be embedded in each stage of the decoding process to improve the discriminative power of features and the ability to focus on key regions. This module consists of a Channel Attention Block (CAB), a Spatial Attention Block (SAB), and a Multi-scale Convolutional Block (MSCB).

[0111] According to an embodiment of the present invention, the Efficient Up-convolution Block (EUCB) can be embedded in each stage of the decoding process to recover spatial dimensions and compensate for the lack of low-level semantic information, thereby improving the spatial restoration accuracy during the decoding process and enhancing the structural representation capability of shallow features.

[0112] Figure 5 shows a schematic diagram of the structure of a multi-scale context-aware module according to an embodiment of the present invention; Figure 6 shows a schematic diagram of the structure of a high-efficiency upsampling module according to an embodiment of the present invention.

[0113] Taking the decoding of the Lth level image features corresponding to the Lth pulse drive module as an example, the Lth level image features are input into the first decoding module, and the multi-scale context awareness module in the first decoding module is used to extract multi-scale features of the Lth level image features.

[0114] Specifically, as shown in Figure 5, the Channel Attention Block (CAB) 501 in the multi-scale context-aware module processes the L-th level image features using Adaptive Average Pooling (AAP) 501-1 and Adaptive Max Pooling (AMP) 501-2, respectively, to obtain the average-pooled image features and the max-pooled image features. The average-pooled image features are subjected to a first 1×1 convolution operation 501-3 to obtain the first convolutional image features, which are then subjected to a first ReLU activation function calculation 501-4 followed by a second 1×1 convolution operation 501-5 to obtain the third convolutional image features. Likewise, the max-pooled image features are subjected to the first 1×1 convolution operation 501-3 to obtain the second convolutional image features, which are then subjected to the first ReLU activation function calculation 501-4 followed by the second 1×1 convolution operation 501-5 to obtain the fourth convolutional image features. The third and fourth convolutional image features are added element-wise 501-6 to obtain the fused image features. The fused image features are subjected to the Sigmoid activation function calculation 501-7, and the Hadamard product 501-8 of the result and the L-th level image features is calculated to obtain the attention image features output by the channel attention block. The attention image features are then input into the Spatial Attention Block (SAB) 502 for spatial attention calculation, and the resulting spatial attention image features are input into the Multi-Scale Convolutional Block (MSCB) 503. In the MSCB 503, the spatial attention image features are successively subjected to a third 1×1 convolution operation 503-1, batch normalization (BN) and a second ReLU activation function calculation 503-2, a multi-scale (parallel) depth-wise convolution (MSDC) 503-3, a fourth 1×1 convolution operation 503-4, and a batch normalization operation 503-5. The resulting image features are then added element-wise 503-6 to the input features of the MSCB 503 (i.e., the spatial attention image features). The output is the L-th intermediate image decoding feature, which is the output of the multi-scale context-aware module.
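The channel attention branch of this module can be sketched in a few lines. The sketch assumes that both adaptive pooling operations reduce each channel to a single value, represents the two shared 1×1 convolutions as plain weight matrices `w1` and `w2`, and omits biases; these choices and all names are illustrative, not taken from the source.

```python
import numpy as np

def channel_attention(x, w1, w2):
    """x: (C, H, W); w1: (C_r, C) and w2: (C, C_r) act as the shared 1x1 convolutions."""
    avg = x.mean(axis=(1, 2))  # adaptive average pooling to one value per channel
    mx = x.max(axis=(1, 2))    # adaptive max pooling to one value per channel

    def branch(pooled):
        hidden = np.maximum(w1 @ pooled, 0.0)  # first 1x1 convolution + ReLU
        return w2 @ hidden                     # second 1x1 convolution

    fused = branch(avg) + branch(mx)           # element-wise addition
    gate = 1.0 / (1.0 + np.exp(-fused))        # sigmoid
    return x * gate[:, None, None]             # Hadamard product over H and W

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 6, 6))
w1 = 0.1 * rng.standard_normal((2, 4))  # hypothetical channel reduction 4 -> 2
w2 = 0.1 * rng.standard_normal((4, 2))
attended = channel_attention(x, w1, w2)
```

With all-zero weights the gate is 0.5 for every channel, so the output is simply half the input, which is a convenient sanity check.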

[0115] As shown in Figure 6, the Lth intermediate image decoding feature is input into the efficient upsampling module. A bilinear upsampling operation 601 is performed on the input Lth intermediate image decoding feature to expand the spatial resolution. Then, a 3×3 depthwise separable convolution (DWC) 602 is applied to extract local features from the upsampled features. Batch normalization (BN) 603 and the third ReLU activation function 604 are then applied to improve the stability and nonlinearity of the extracted local features. Finally, a fifth 1×1 convolution operation 605 is used to achieve channel integration and feature compression to obtain the Lth image decoding feature.
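The sequence of operations in the efficient upsampling module can be sketched as below. The sketch makes several assumptions that are not stated in the source: a scale factor of 2, corner-aligned bilinear sampling, zero padding, batch normalization approximated by per-channel statistics of the single input without learned parameters, and illustrative weight shapes.

```python
import numpy as np

def bilinear_upsample(x, scale=2):
    """x: (C, H, W). Bilinear interpolation with corner-aligned sampling."""
    c, h, w = x.shape
    ys = np.linspace(0, h - 1, h * scale)
    xs = np.linspace(0, w - 1, w * scale)
    y0, x0 = np.floor(ys).astype(int), np.floor(xs).astype(int)
    y1, x1 = np.minimum(y0 + 1, h - 1), np.minimum(x0 + 1, w - 1)
    wy, wx = (ys - y0)[None, :, None], (xs - x0)[None, None, :]
    rows0, rows1 = x[:, y0], x[:, y1]
    top = rows0[:, :, x0] * (1 - wx) + rows0[:, :, x1] * wx
    bottom = rows1[:, :, x0] * (1 - wx) + rows1[:, :, x1] * wx
    return top * (1 - wy) + bottom * wy

def depthwise_conv3x3(x, kernels):
    """kernels: (C, 3, 3), one 3x3 filter per channel, zero padding."""
    c, h, w = x.shape
    padded = np.pad(x, ((0, 0), (1, 1), (1, 1)))
    out = np.zeros((c, h, w), dtype=np.float64)
    for dy in range(3):
        for dx in range(3):
            out += kernels[:, dy, dx][:, None, None] * padded[:, dy:dy + h, dx:dx + w]
    return out

def efficient_upsample(x, dw_kernels, pw_weights, eps=1e-5):
    up = bilinear_upsample(x)                                    # expand resolution
    local = depthwise_conv3x3(up, dw_kernels)                    # local features
    mean = local.mean(axis=(1, 2), keepdims=True)
    var = local.var(axis=(1, 2), keepdims=True)
    activated = np.maximum((local - mean) / np.sqrt(var + eps), 0.0)  # BN stand-in + ReLU
    return np.einsum("oc,chw->ohw", pw_weights, activated)      # 1x1 convolution

x = np.arange(18, dtype=float).reshape(2, 3, 3)
dw = np.zeros((2, 3, 3))
dw[:, 1, 1] = 1.0              # identity depthwise filters for the example
pw = np.ones((1, 2))           # compress 2 channels into 1
decoded = efficient_upsample(x, dw, pw)
```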

[0116] According to embodiments of the present invention, the multi-scale context-aware module employs the Channel Attention Block (CAB) to mine the differences in responses between channels, enhancing category-related semantic features; the Spatial Attention Block (SAB) is used to highlight spatially salient regions and suppress irrelevant background information; and the Multi-scale Convolutional Block (MSCB) expands the receptive field by introducing depthwise separable convolutions at different scales to adapt to the diversity of landslide boundaries and textures. These three components work synergistically to construct feature representations with strong contextual expressive capabilities, supporting the subsequent accurate extraction of landslide areas. Furthermore, the joint design of upsampling and convolution operations in the efficient upsampling module effectively enhances the continuity of boundary structures when obtaining the Lth image decoding feature, while alleviating the problems of weak shallow features and blurred target edges during the decoding stage, thereby improving the reconstruction capability of small-scale landslide areas.

[0117] According to an embodiment of the present invention, the second decoding module includes a multi-scale feature aggregation (MSFA) module, a multi-scale context-aware module, and an efficient upsampling module. The structures and feature extraction processes of the multi-scale context-aware module and the efficient upsampling module are the same as those in the first decoding module described above, and will not be repeated here.

[0118] According to an embodiment of the present invention, the i-th image decoding feature and the i-1th level image feature output by the (i-1)th pulse driving module are input into the second decoding module. The cross-scale cross-fusion module in the second decoding module is used to perform cross-scale cross-fusion of the i-th image decoding feature and the i-1th level image feature output by the (i-1)th pulse driving module to obtain the (i-1)th image fusion feature.

[0119] For example, Figure 7 shows a schematic diagram of the cross-scale cross-fusion module in the second decoding module according to an embodiment of the present invention.

[0120] As shown in Figure 7, taking the (i-1)th image fusion feature as an example, the process by which the cross-scale cross-fusion module fuses the i-th image decoding feature with the (i-1)th level image feature output by the (i-1)th pulse driving module is explained as follows. Specifically, the i-th image decoding feature and the (i-1)th level image feature are each subjected to a first 3×3 group convolution operation 701. The convolved i-th image decoding feature and the convolved (i-1)th level image feature are then cross-stitched 702 to obtain their respective stitched features. Each stitched feature is subjected to a second 3×3 group convolution operation 703 and then to a third ReLU activation function 704, after which feature fusion 705 is performed. The fused features are subjected to a sixth 1×1 convolution operation 706 to complete channel compression, and the (i-1)th image fusion feature is output.

[0121] According to an embodiment of the present invention, the (i-1)th image fusion feature is then input into the multi-scale context-aware module for multi-scale feature extraction, and the extracted (i-1)th intermediate image decoding feature is input into the efficient upsampling module for upsampling, finally outputting the (i-1)th image decoding feature. The processing procedures of the multi-scale context-aware module and the efficient upsampling module in this process are the same as those in the first decoding module described above, and will not be repeated here.

[0122] According to embodiments of the present invention, a cross-scale cross-fusion module is used to perform cross-scale feature fusion on two image features at different levels, thereby achieving information interaction between the two image features at different levels and realizing efficient integration between features at different levels. This not only strengthens the linkage between deep semantics and shallow structure, but also improves the accuracy of landslide area identification while maintaining computational efficiency. The cross-scale cross-fusion module effectively compensates for the inconsistency problem in the information fusion process at different scales, which helps to improve the spatial positioning accuracy and boundary integrity of landslide areas, especially showing good adaptability and stability when facing multi-scale landslide targets.

[0123] According to an embodiment of the present invention, the third decoding module includes a cross-scale cross-fusion module and a multi-scale context-aware module. The third decoding module decodes the second image decoding feature and the first-level image feature corresponding to the first pulse-driven module to obtain the first image decoding feature. This includes: using the cross-scale cross-fusion module to perform cross-scale cross-feature fusion on the second image decoding feature and the first-level image feature corresponding to the first pulse-driven module to obtain the first image fusion feature; and using the multi-scale context-aware module to extract multi-scale features from the first image fusion feature to obtain the first image decoding feature.

[0124] According to an embodiment of the present invention, the structure and processing of the cross-scale cross-fusion module and the multi-scale context-aware module in the third decoding module are the same as those of the cross-scale cross-fusion module and the multi-scale context-aware module in the first and second decoding modules described above, and will not be repeated here.

[0125] According to an embodiment of the present invention, the first image decoding feature is essentially the output of the third decoding module, which is obtained by multi-scale feature extraction of the first image fusion feature obtained by cross-scale cross-feature fusion of the second image decoding feature and the first level image feature corresponding to the first pulse driving module.

[0126] According to an embodiment of the present invention, obtaining a target landslide image based on a first image decoding feature includes: obtaining a target landslide image based on a first image decoding feature, an (i-1)th intermediate image decoding feature, and an Lth intermediate image decoding feature.

[0127] According to an embodiment of the present invention, for each pulse driving module, the corresponding decoding module outputs the corresponding image decoding features on one hand, and the intermediate image decoding features obtained by multi-scale feature extraction by the multi-scale context awareness module on the other hand.

[0128] For example, the first decoding module corresponding to the Lth pulse driving module outputs the Lth image decoding feature, which is decoded by the second decoding module corresponding to the (L-1)th pulse driving module. It also outputs the Lth intermediate image decoding feature, obtained after multi-scale feature extraction by the multi-scale context-aware module in the first decoding module. Similarly, the second decoding module corresponding to the (L-1)th pulse driving module outputs the (L-1)th intermediate image decoding feature obtained by its multi-scale context-aware module, the second decoding module corresponding to the (i-1)th pulse driving module outputs the (i-1)th intermediate image decoding feature, and so on, down to the second intermediate image decoding feature output by the second decoding module corresponding to the second pulse driving module. Finally, the third decoding module corresponding to the first pulse driving module outputs the first image decoding feature.

[0129] According to an embodiment of the present invention, the first image decoding feature to the Lth intermediate image decoding feature are weighted and fused to obtain the target landslide image.
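The weighted fusion step can be sketched as follows. The text does not specify how features of different resolution are aligned, how the weights are chosen, or how the fused result becomes a mask, so this sketch assumes single-channel score maps, nearest-neighbour resizing to the finest resolution, fixed example weights, and a sigmoid followed by a 0.5 threshold. The names `resize_nearest` and `weighted_fuse` are hypothetical.

```python
import math

def resize_nearest(m, out_h, out_w):
    """Nearest-neighbour resize of a 2-D map (assumed alignment strategy)."""
    h, w = len(m), len(m[0])
    return [[m[y * h // out_h][x * w // out_w] for x in range(out_w)]
            for y in range(out_h)]

def weighted_fuse(maps, weights, threshold=0.5):
    """Fuse score maps into a binary landslide mask.

    maps[0] is assumed to be the finest map (from the first image decoding
    feature); the remaining maps come from the intermediate decoding features.
    """
    out_h, out_w = len(maps[0]), len(maps[0][0])
    fused = [[0.0] * out_w for _ in range(out_h)]
    for m, wt in zip(maps, weights):
        resized = resize_nearest(m, out_h, out_w)
        for y in range(out_h):
            for x in range(out_w):
                fused[y][x] += wt * resized[y][x]
    # Sigmoid turns fused scores into probabilities; threshold gives the mask.
    return [[1 if 1.0 / (1.0 + math.exp(-v)) > threshold else 0 for v in row]
            for row in fused]

fine = [[2.0, 2.0, -2.0, -2.0] for _ in range(4)]   # 4x4 map
mid = [[1.0, -1.0], [1.0, -1.0]]                    # 2x2 map
coarse = [[0.0]]                                    # 1x1 map
mask = weighted_fuse([fine, mid, coarse], [0.5, 0.3, 0.2])
```

Here the left half of every row is marked as landslide, because both the fine and the mid-resolution maps score it positively.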

[0130] Figure 8 shows a schematic diagram of the structure of the landslide range enhancement extraction method with pulse spatial suppression according to an embodiment of the present invention.

[0131] As shown in Figure 8, taking as an example an encoder with impulse response characteristics that comprises four pulse drive modules and a cross-layer perceptual decoder that comprises two second decoding modules, the candidate landslide image 801 is divided into blocks to obtain a block-divided candidate landslide image 802. The block-divided candidate landslide image 802 is input into the encoder with impulse response characteristics 803, which includes a first pulse drive module 803-1, a second pulse drive module 803-2, a third pulse drive module 803-3, and a fourth pulse drive module 803-4. The block-divided candidate landslide image 802 is input to the first pulse drive module 803-1 for encoding, and the first-level image feature 804 is output. The first-level image feature 804 is used as the input to the second pulse drive module 803-2 for encoding, and the second-level image feature 805 is output. The second-level image feature 805 is used as the input to the third pulse drive module 803-3 for encoding, and the third-level image feature 806 is output. The third-level image feature 806 is used as the input to the fourth pulse drive module 803-4 for encoding, and the fourth-level image feature 807 is output.

[0132] The first-level image features 804, the second-level image features 805, the third-level image features 806, and the fourth-level image features 807 obtained above are input into the cross-layer perceptual decoder 808 for decoding. Specifically, the cross-layer perceptual decoder 808 includes a first decoding module 808-1, a second decoding module 808-2, and a third decoding module 808-3. Among them, the first decoding module 808-1 includes a multi-scale context-aware module (MSCAM) 808-11 and an efficient upsampling module (EUCB) 808-12; the second decoding module 808-2 includes a cross-scale cross-fusion module (MSFA) 808-21, a multi-scale context-aware module (MSCAM) 808-11, and an efficient upsampling module (EUCB) 808-12; the third decoding module 808-3 includes a cross-scale cross-fusion module (MSFA) 808-21 and a multi-scale context-aware module (MSCAM) 808-11. The fourth-level image feature 807 is input into the multi-scale context-aware module (MSCAM) 808-11 in the first decoding module 808-1 to obtain the fourth intermediate image decoding feature 809. The fourth intermediate image decoding feature 809 serves as both the input to the efficient upsampling module (EUCB) 808-12 in the first decoding module 808-1 and the output of the first decoding module. The fourth intermediate image decoding feature 809 is input into the efficient upsampling module (EUCB) 808-12 in the first decoding module 808-1, and the fourth image decoding feature 810 is output. The fourth image decoding feature 810 and the third-level image feature 806 corresponding to the third pulse-driven module are input into the cross-scale cross-fusion module (MSFA) 808-21 of the second decoding module 808-2 corresponding to the third pulse-driven module, and the third image fusion feature 811 is output. 
The third image fusion feature 811 is input into the multi-scale context-aware module (MSCAM) 808-11 of the second decoding module 808-2 corresponding to the third pulse-driven module, and the third intermediate image decoding feature 812 is output. The third intermediate image decoding feature 812 serves as both the input to the efficient upsampling module (EUCB) 808-12 in the second decoding module 808-2 corresponding to the third pulse-driven module and the output of that second decoding module. The third intermediate image decoding feature 812 is input into the efficient upsampling module (EUCB) 808-12 in the second decoding module 808-2 corresponding to the third pulse-driven module, and the third image decoding feature 813 is output. Similarly, the third image decoding feature 813 and the second-level image feature 805 corresponding to the second pulse-driven module are input into the second decoding module 808-2 corresponding to the second pulse-driven module to obtain the second intermediate image decoding feature 814 and the second image decoding feature 815. The processing is the same as that of the second decoding module 808-2 corresponding to the third pulse-driven module and will not be repeated here. The second image decoding feature 815 and the first-level image feature 804 corresponding to the first pulse-driven module are input into the cross-scale cross-fusion module (MSFA) 808-21 in the third decoding module 808-3, which outputs the first image fusion feature 816. The first image fusion feature 816 is then input into the multi-scale context-aware module (MSCAM) 808-11 in the third decoding module 808-3, which outputs the first image decoding feature 817. 
The first image decoding feature 817, the second intermediate image decoding feature 814, the third intermediate image decoding feature 812, and the fourth intermediate image decoding feature 809 are then weighted and fused to output the target landslide image 818.
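The data flow through the cross-layer perceptual decoder can be traced with a short sketch. The three modules are passed in as plain callables and replaced here by string-building stubs, so the code shows only the wiring, not the operations themselves. The function name `decode` and the stub names are hypothetical.

```python
def decode(levels, mscam, eucb, msfa):
    """Trace the decoder wiring for L encoder levels.

    levels[0] is the first-level image feature and levels[-1] the L-th.
    Returns a dict mapping each level index to the feature that takes part
    in the final weighted fusion.
    """
    num_levels = len(levels)
    outputs = {}
    # First decoding module: MSCAM, then EUCB, on the deepest feature.
    outputs[num_levels] = mscam(levels[num_levels - 1])
    decoded = eucb(outputs[num_levels])
    # Second decoding module, repeated for levels L-1 down to 2.
    for i in range(num_levels - 1, 1, -1):
        fused = msfa(decoded, levels[i - 1])
        outputs[i] = mscam(fused)
        decoded = eucb(outputs[i])
    # Third decoding module: fusion and MSCAM only, with no upsampling.
    outputs[1] = mscam(msfa(decoded, levels[0]))
    return outputs

def mscam_stub(x):
    return f"MSCAM({x})"

def eucb_stub(x):
    return f"EUCB({x})"

def msfa_stub(decoded, skip):
    return f"MSFA({decoded}, {skip})"

trace = decode(["F1", "F2", "F3", "F4"], mscam_stub, eucb_stub, msfa_stub)
for level in sorted(trace, reverse=True):
    print(level, trace[level])
```

With four levels, `trace[4]`, `trace[3]`, and `trace[2]` correspond to the fourth, third, and second intermediate image decoding features, and `trace[1]` to the first image decoding feature.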

[0133] According to an embodiment of the present invention, to quantitatively evaluate the landslide extraction performance of the present invention against existing methods, intersection over union (IoU), precision (P), recall (R), and F1 score (F1) are used. These metrics are calculated according to formulas (4)-(7) below, and the comparison results are shown in Table 1.

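[0134] $\mathrm{IoU} = \dfrac{TP}{TP + FP + FN}$ (4);

[0135] $P = \dfrac{TP}{TP + FP}$ (5);

[0136] $R = \dfrac{TP}{TP + FN}$ (6);

[0137] $F1 = \dfrac{2 \times P \times R}{P + R}$ (7);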

[0138] Wherein, TP represents the number of landslide pixels correctly classified as landslide; FP represents the number of non-landslide pixels incorrectly classified as landslide; and FN represents the number of landslide pixels incorrectly classified as non-landslide.
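The four metrics can be computed directly from pixel counts, as in the following sketch. It uses the standard definitions of IoU, precision, recall, and F1 score; returning 0.0 when a denominator is zero is an assumption of this sketch, and the function names are hypothetical.

```python
def count_confusion(pred, truth):
    """Count TP, FP, FN over two binary masks (1 = landslide, 0 = background)."""
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p == 1 and t == 1:
                tp += 1
            elif p == 1 and t == 0:
                fp += 1
            elif p == 0 and t == 1:
                fn += 1
    return tp, fp, fn

def landslide_metrics(tp, fp, fn):
    """Return IoU, precision, recall, and F1 from pixel counts."""
    iou = tp / (tp + fp + fn) if tp + fp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"iou": iou, "precision": precision, "recall": recall, "f1": f1}

pred = [[1, 1, 0], [0, 1, 0]]
truth = [[1, 0, 0], [0, 1, 1]]
metrics = landslide_metrics(*count_confusion(pred, truth))
```

True negatives do not appear in any of the four formulas, which is why these metrics stay informative when background pixels vastly outnumber landslide pixels.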

[0139] Table 1

[0140]

[0141] As shown in Table 1, the present invention achieved the best result on all four evaluation metrics, indicating that the present invention can obtain more accurate extraction results for landslides of different scales and exhibits superior performance.

[0142] Figure 9(A) shows the original image of a landslide according to an embodiment of the present invention; Figure 9(B) is a schematic diagram of the result of landslide range enhancement extraction by pulse spatial suppression on the original image of Figure 9(A); Figure 9(C) shows the original image of a landslide according to another embodiment of the present invention; Figure 9(D) is a schematic diagram of the result of landslide range enhancement extraction by pulse spatial suppression on the original image of Figure 9(C).

[0143] As shown in Figures 9(A)-9(B), the extraction result in Figure 9(B) accurately and completely covers the landslide area 901 in the original image of Figure 9(A); similarly, as shown in Figures 9(C)-9(D), the extraction result in Figure 9(D) accurately and completely covers the landslide area 901 in the original image of Figure 9(C). Figures 9(A)-9(D) therefore show that the method provided by this invention obtains accurate and complete extraction results for landslides at different scales, demonstrating its advantages for landslide detection in complex geographical environments.

[0144] Figure 10 shows a block diagram of a landslide range enhancement extraction device with pulse spatial suppression according to an embodiment of the present invention.

[0145] As shown in Figure 10, the pulse spatial suppression landslide range enhancement extraction device 1000 includes: an image preprocessing module 1010, an image encoding module 1020, and an image decoding module 1030.

[0146] The image preprocessing module 1010 is used to preprocess the remote sensing image to be extracted to obtain candidate landslide images.

[0147] The image encoding module 1020 is used to encode the candidate landslide image using an encoder with impulse response characteristics to generate at least two levels of image features. The encoder with impulse response characteristics is used to improve the extraction of landslide features in the candidate landslide image and to suppress background interference features in the candidate landslide image.

[0148] The image decoding module 1030 is used to decode the at least two levels of image features using a cross-layer perceptual decoder to obtain the target landslide image.

[0149] According to an embodiment of the present invention, the encoder with impulse response characteristics includes L pulse driving modules, each pulse driving module generating a corresponding level of image features. The image encoding module 1020 includes: a first feature generation submodule 1020-1, a feature determination submodule 1020-2, a feature acquisition submodule 1020-3, and a second feature generation submodule 1020-4.

[0150] The first feature generation submodule 1020-1 is used to repeatedly perform this operation until the Lth level image feature corresponding to the Lth pulse driving module is generated: for the i-th pulse driving module, the (i-1)th level image feature is normalized and linearly mapped to generate the query feature, key feature and value feature corresponding to the i-th pulse driving module, where L is a positive integer ≥3 and 2≤i≤L.

[0151] Feature determination submodule 1020-2 is used to determine the i-th first intermediate image feature based on query features and key features.

[0152] The feature acquisition submodule 1020-3 is used to perform impulse response encoding processing on the i-th first intermediate image feature to obtain the i-th second intermediate image feature.

[0153] The second feature generation submodule 1020-4 is used to generate the i-th level image feature corresponding to the i-th pulse driving module based on the value feature and the i-th second intermediate image feature. When i=1, the first level image feature is generated by encoding the candidate landslide image using the first pulse driving module.

[0154] According to an embodiment of the present invention, each pulse driving module includes a K-layer pulse driving sub-module. The first feature generation sub-module includes a convolution operation unit and a normalization processing unit.

[0155] The convolution operation unit is used to perform a Gaussian convolution operation on the i-th first intermediate image feature in the k-th layer of the K-layer pulse driving submodule of the i-th pulse driving module to obtain the k-th layer convolution image feature.

[0156] The normalization processing unit is used to perform attention sorting and normalization processing on the convolutional image features of the k-th layer based on the selective attention mechanism, to obtain the i-th second intermediate image feature of the k-th layer, and thereby the i-th second intermediate image features of all K layers, where K is a positive integer ≥1 and 1≤k≤K.

[0157] According to an embodiment of the present invention, the second feature generation submodule includes a product operation unit and a splicing processing unit.

[0158] The product operation unit is used to perform a product operation on the i-th second intermediate image feature of the k-th layer and the value feature to obtain the third intermediate image feature of the k-th layer, and thereby the third intermediate image features of all K layers.

[0159] The stitching processing unit is used to stitch together the third intermediate image features of the K layers to generate the i-th layer image features corresponding to the i-th pulse driving module.
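The per-layer processing inside one pulse driving module can be sketched in one dimension. This is a loose illustration under several assumptions, not the patented operation: features are flat lists of numbers, each of the K layers uses a Gaussian kernel with a different standard deviation, a softmax stands in for the attention sorting and normalization step (whose details are not given in this section), and the product with the value feature is element-wise. All function names are hypothetical.

```python
import math

def gaussian_kernel(sigma, radius=2):
    """Normalized 1-D Gaussian kernel of length 2 * radius + 1."""
    weights = [math.exp(-(d * d) / (2 * sigma * sigma))
               for d in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def smooth(signal, kernel):
    """Convolve with edge replication so the output keeps the input length."""
    r = len(kernel) // 2
    n = len(signal)
    return [sum(kernel[j + r] * signal[min(max(i + j, 0), n - 1)]
                for j in range(-r, r + 1)) for i in range(n)]

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def pulse_drive(first_intermediate, value, sigmas):
    """Gaussian convolution -> normalization -> product with value, per layer."""
    out = []
    for sigma in sigmas:                      # one pass per layer k = 1..K
        conv = smooth(first_intermediate, gaussian_kernel(sigma))
        second = softmax(conv)                # stand-in for attention sorting + normalization
        third = [a * v for a, v in zip(second, value)]
        out.extend(third)                     # concatenation across the K layers
    return out

scores = [0.0, 0.0, 5.0, 0.0, 0.0]            # a single strong response
level_feature = pulse_drive(scores, [1.0] * 5, sigmas=[0.5, 1.0])
```

In this toy example the strong response at the centre position keeps the largest weight in both layers, while the surrounding positions are pushed toward zero, which is the kind of background suppression the encoder is described as providing.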

[0160] According to an embodiment of the present invention, the cross-layer perceptual decoder includes a first decoding module, a second decoding module, and a third decoding module. The image decoding module 1030 includes a first decoding submodule, a second decoding submodule, a third decoding submodule, and an image acquisition submodule.

[0161] The first decoding submodule is used to decode the Lth level image feature corresponding to the Lth pulse driving module using the first decoding module, so as to obtain the Lth image decoding feature.

[0162] The second decoding submodule is used to repeatedly perform this operation until the second image decoding feature is generated: using the second decoding module, the i-th image decoding feature and the i-1th level image feature corresponding to the i-1th pulse driving module are decoded to obtain the i-1th image decoding feature.

[0163] The third decoding submodule is used to decode the second image decoding feature and the first level image feature corresponding to the first pulse drive module using the third decoding module, so as to obtain the first image decoding feature.

[0164] The image acquisition submodule is used to obtain the target landslide image based on the first image decoding feature.

[0165] According to an embodiment of the present invention, the first decoding module includes a multi-scale context-aware module and an efficient upsampling module. The first decoding submodule includes a first multi-scale feature extraction unit and a first upsampling unit.

[0166] The first multi-scale feature extraction unit is used to extract multi-scale features from the Lth level image features using the multi-scale context awareness module, and obtain the Lth intermediate image decoding features.

[0167] The first upsampling unit is used to upsample the Lth intermediate image decoding feature using the efficient upsampling module to obtain the Lth image decoding feature.

[0168] According to an embodiment of the present invention, the second decoding module includes a cross-scale cross-fusion module, a multi-scale context-aware module, and an efficient upsampling module. The second decoding submodule includes: a first feature fusion unit, a second multi-scale feature extraction unit, and a second upsampling unit.

[0169] The first feature fusion unit is used to perform cross-scale cross-feature fusion on the i-th image decoding feature and the i-1th level image feature corresponding to the i-1th pulse driving module to obtain the i-1th image fusion feature.

[0170] The second multi-scale feature extraction unit is used to extract multi-scale features from the (i-1)th image fusion feature using the multi-scale context-aware module, so as to obtain the (i-1)th intermediate image decoding feature.

[0171] The second upsampling unit is used to upsample the (i-1)th intermediate image decoding feature using the efficient upsampling module to obtain the (i-1)th image decoding feature.

[0172] According to an embodiment of the present invention, the third decoding module includes a cross-scale cross-fusion module and a multi-scale context-aware module. The third decoding submodule includes a second feature fusion unit and a third multi-scale feature extraction unit.

[0173] The second feature fusion unit is used to perform cross-scale cross-feature fusion on the second image decoding feature and the first level image feature corresponding to the first pulse driving module to obtain the first image fusion feature.

[0174] The third multi-scale feature extraction unit uses the multi-scale context awareness module to extract multi-scale features from the first image fusion feature, thus obtaining the first image decoding feature.

[0175] According to an embodiment of the present invention, the image acquisition submodule includes: an image acquisition unit.

[0176] The image acquisition unit is used to obtain the target landslide image based on the first image decoding feature, the (i-1)th intermediate image decoding feature, and the Lth intermediate image decoding feature.

[0177] Any one or more of the modules, submodules, and units according to embodiments of the present invention, or at least part of the functions of any one or more of them, can be implemented in a single module. Any one or more of the modules, submodules, and units according to embodiments of the present invention can be implemented by dividing them into multiple modules. Any one or more of the modules, submodules, and units according to embodiments of the present invention can be at least partially implemented as hardware circuits, such as field-programmable gate arrays (FPGAs), programmable logic arrays (PLAs), systems-on-a-chip, systems-on-a-substrate, systems-on-package, application-specific integrated circuits (ASICs), or implemented in hardware or firmware by any other reasonable means of integrating or packaging circuits, or implemented in software, hardware, and firmware, or in any suitable combination of any of these three implementation methods. Alternatively, one or more of the modules, submodules, units, and subunits according to embodiments of the present invention can be at least partially implemented as computer program modules, which, when run, can perform corresponding functions.

[0178] For example, any plurality of the image preprocessing module 1010, image encoding module 1020, and image decoding module 1030 can be combined into one module / unit / subunit, or any one of these modules / units / subunits can be split into multiple modules / units / subunits. Alternatively, at least part of the functionality of one or more of these modules / units / subunits can be combined with at least part of the functionality of other modules / units / subunits and implemented in one module / unit / subunit. According to embodiments of the present invention, at least one of the image preprocessing module 1010, image encoding module 1020, and image decoding module 1030 can be at least partially implemented as hardware circuitry, such as a field-programmable gate array (FPGA), a programmable logic array (PLA), a system-on-a-chip, a system-on-a-substrate, a system-on-package, an application-specific integrated circuit (ASIC), or any other reasonable means of integrating or packaging the circuitry, or implemented in software, hardware, or firmware, or in any suitable combination of any of these three implementation methods. Alternatively, at least one of the image preprocessing module 1010, image encoding module 1020, and image decoding module 1030 may be implemented at least partially as a computer program module, which can perform corresponding functions when the computer program module is run.

[0179] It should be noted that the pulse-space-suppressed landslide range enhancement extraction device part in the embodiments of the present invention corresponds to the pulse-space-suppressed landslide range enhancement extraction method part in the embodiments of the present invention. For a detailed description of the pulse-space-suppressed landslide range enhancement extraction device part, please refer to the pulse-space-suppressed landslide range enhancement extraction method part, which will not be repeated here.

[0180] Figure 11 shows a block diagram of an electronic device suitable for implementing the landslide range enhancement extraction method with pulse spatial suppression described above, according to an embodiment of the present invention. The electronic device shown in Figure 11 is merely an example and should not be construed as limiting the functionality and scope of use of the embodiments of the present invention.

[0181] As shown in Figure 11, an electronic device according to an embodiment of the present invention includes a processor 1101, which can perform various appropriate actions and processes according to a program stored in a read-only memory (ROM) 1102 or a program loaded from a storage portion 1108 into a random access memory (RAM) 1103. The processor 1101 may include, for example, a general-purpose microprocessor (e.g., a CPU), an instruction set processor and / or an associated chipset and / or a special-purpose microprocessor (e.g., an application-specific integrated circuit (ASIC)), etc. The processor 1101 may also include onboard memory for caching purposes. The processor 1101 may include a single processing unit or multiple processing units for performing different actions of the method flow according to an embodiment of the present invention.

[0182] RAM 1103 stores various programs and data required for the operation of the electronic device. Processor 1101, ROM 1102, and RAM 1103 are interconnected via bus 1104. Processor 1101 executes various operations of the method flow according to embodiments of the present invention by executing programs in ROM 1102 and / or RAM 1103. It should be noted that programs may also be stored in one or more memories other than ROM 1102 and RAM 1103. Processor 1101 may also execute various operations of the method flow according to embodiments of the present invention by executing programs stored in one or more memories.

[0183] According to embodiments of the present invention, the electronic device may further include an input / output (I / O) interface 1105, which is also connected to a bus 1104. The electronic device may also include one or more of the following components connected to the input / output (I / O) interface 1105: an input section 1106 including a keyboard, mouse, etc.; an output section 1107 including a cathode ray tube (CRT), liquid crystal display (LCD), etc., and a speaker, etc.; a storage section 1108 including a hard disk, etc.; and a communication section 1109 including a network interface card such as a LAN card, modem, etc. The communication section 1109 performs communication processing via a network such as the Internet. A drive 1110 is also connected to the input / output (I / O) interface 1105 as needed. A removable medium 1111, such as a disk, optical disk, magneto-optical disk, semiconductor memory, etc., is installed on the drive 1110 as needed so that computer programs read from it can be installed into the storage section 1108 as needed.

[0184] According to embodiments of the present invention, the method flow according to embodiments of the present invention can be implemented as a computer software program. For example, embodiments of the present invention include a computer program product comprising a computer program carried on a computer-readable storage medium, the computer program containing program code for performing the method shown in the flowchart. In such embodiments, the computer program can be downloaded and installed from a network via communication section 1109, and / or installed from removable medium 1111. When the computer program is executed by processor 1101, it performs the functions defined in the system of the embodiments of the present invention. According to embodiments of the present invention, the systems, devices, apparatuses, modules, units, etc., described above can be implemented by computer program modules.

[0185] The present invention also provides a computer-readable storage medium, which may be included in the device / apparatus / system described in the above embodiments; or it may exist independently and not assembled into the device / apparatus / system. The computer-readable storage medium carries one or more programs, which, when executed, implement the method according to the embodiments of the present invention.

[0186] According to embodiments of the present invention, the computer-readable storage medium may be a non-volatile computer-readable storage medium. Examples include, but are not limited to: portable computer disks, hard disks, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), portable compact disk read-only memory (CD-ROM), optical storage devices, magnetic storage devices, or any suitable combination thereof. In the present invention, the computer-readable storage medium may be any tangible medium containing or storing a program that can be used by or in conjunction with an instruction execution system, apparatus, or device.

[0187] For example, according to embodiments of the present invention, a computer-readable storage medium may include the ROM 1102 and / or RAM 1103 described above, and / or one or more memories other than the ROM 1102 and RAM 1103.

[0188] Embodiments of the present invention also include a computer program product comprising a computer program containing program code for performing the methods provided in the embodiments of the present invention. When the computer program product is run on an electronic device, the program code is used to enable the electronic device to implement the landslide range enhancement extraction method for pulse spatial suppression provided in the embodiments of the present invention.

[0189] When the computer program is executed by the processor 1101, it performs the functions defined in the system / apparatus of this embodiment of the invention. According to embodiments of the invention, the systems, apparatuses, modules, units, etc., described above can be implemented by computer program modules.

[0190] In one embodiment, the computer program may be stored on a tangible storage medium such as an optical storage device or a magnetic storage device. In another embodiment, the computer program may also be transmitted and distributed in the form of signals over a network medium, downloaded and installed via the communication section 1109, and/or installed from the removable medium 1111. The program code contained in the computer program can be transmitted using any suitable network medium, including but not limited to wireless media, wired media, or any suitable combination thereof.

[0191] According to embodiments of the present invention, program code for executing the computer programs provided in the embodiments of the present invention can be written in any combination of one or more programming languages. Specifically, these computer programs can be implemented using high-level procedural and/or object-oriented programming languages, and/or assembly/machine languages. Such programming languages include, but are not limited to, Java, C++, Python, C, and similar languages. The program code can be executed entirely on the user's computing device, partially on the user's computing device, partially on a remote computing device, or entirely on a remote computing device or server. Where a remote computing device is involved, it can be connected to the user's computing device via any type of network, including a local area network (LAN) or a wide area network (WAN), or it can be connected to an external computing device (e.g., via the Internet using an Internet service provider).

[0192] The flowcharts and block diagrams in the accompanying drawings illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of the present invention. In this regard, each block in a flowchart or block diagram may represent a module, segment, or portion of code containing one or more executable instructions for implementing a specified logical function. It should also be noted that in some alternative implementations, the functions indicated in the blocks may occur in a different order than that indicated in the drawings. For example, two blocks shown in succession may actually be executed substantially in parallel, or may sometimes be executed in reverse order, depending on the functions involved. It should also be noted that each block in a block diagram or flowchart, and combinations of blocks in a block diagram or flowchart, may be implemented using a dedicated hardware-based system that performs the specified function or operation, or using a combination of dedicated hardware and computer instructions. Those skilled in the art will understand that the features described in the various embodiments of the present invention can be combined and/or integrated in various ways, even if such combinations or integrations are not explicitly described in the present invention. In particular, the features described in the various embodiments of the present invention can be combined and/or integrated in various ways without departing from the spirit and teachings of the present invention. All such combinations and/or integrations fall within the scope of the present invention.

[0193] The embodiments of the present invention have been described above. However, these embodiments are merely illustrative and not intended to limit the scope of the invention. Although various embodiments have been described above, this does not mean that the measures in the various embodiments cannot be used advantageously in combination. Various substitutions and modifications can be made by those skilled in the art without departing from the scope of the invention, and all such substitutions and modifications should fall within the scope of the invention.

Claims

1. A method for enhancing landslide extent extraction using pulse spatial suppression, characterized in that the method comprises: preprocessing a remote sensing image to be extracted to obtain a candidate landslide image; encoding the candidate landslide image using an encoder with impulse response characteristics to generate multiple levels of image features, wherein the encoder with impulse response characteristics comprises L pulse driving modules, each generating the image feature of a corresponding level, and the encoding comprises repeating the following operations until the L-th level image feature corresponding to the L-th pulse driving module is generated, thereby obtaining L levels of image features: for the i-th pulse driving module, normalizing and linearly mapping the (i-1)-th level image feature to generate a query feature, a key feature, and a value feature corresponding to the i-th pulse driving module, where L is a positive integer ≥ 3 and 2 ≤ i ≤ L; determining an i-th first intermediate image feature based on the query feature and the key feature; performing impulse response encoding on the i-th first intermediate image feature to obtain an i-th second intermediate image feature; and generating, based on the value feature and the i-th second intermediate image feature, the i-th level image feature corresponding to the i-th pulse driving module; wherein, when i = 1, the first level image feature is generated by encoding the candidate landslide image using the first pulse driving module, and the encoder with impulse response characteristics is used to improve the extraction of landslide features from the candidate landslide image and to suppress background interference features in the candidate landslide image; and decoding the L levels of image features using a cross-layer perception decoder to obtain a target landslide image.
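The data flow of the encoder recited in claim 1 can be illustrated with a minimal NumPy sketch. Several details are assumptions made only for illustration and are not stated in the claim: the first intermediate feature is taken to be a scaled dot product of query and key, a row-wise softmax stands in for the impulse response encoding, the first level is a plain linear embedding, and all levels keep the same resolution. The names `pulse_driving_module`, `encode_image`, and `softmax_rows` are hypothetical.

```python
import numpy as np

def layer_norm(x, eps=1e-6):
    """Normalize each token (row) to zero mean and unit variance."""
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def softmax_rows(scores):
    """Stand-in for the impulse response encoding (an assumption)."""
    e = np.exp(scores - scores.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def pulse_driving_module(prev, w_q, w_k, w_v, encode=softmax_rows):
    """One module for 2 <= i <= L, mapping (tokens, dim) to (tokens, dim)."""
    x = layer_norm(prev)                       # normalization
    q, k, v = x @ w_q, x @ w_k, x @ w_v        # linear mapping to Q, K, V
    first = q @ k.T / np.sqrt(q.shape[-1])     # first intermediate feature (assumed form)
    second = encode(first)                     # second intermediate feature
    return second @ v                          # i-th level image feature

def encode_image(tokens, w_embed, weights):
    """Return the list of L level features, level 1 first."""
    feats = [tokens @ w_embed]                 # level 1: hypothetical linear embedding
    for w_q, w_k, w_v in weights:              # levels 2..L reuse the previous level
        feats.append(pulse_driving_module(feats[-1], w_q, w_k, w_v))
    return feats

rng = np.random.default_rng(0)
L, n_tokens, in_dim, dim = 3, 16, 12, 8
tokens = rng.normal(size=(n_tokens, in_dim))   # flattened candidate image patches
w_embed = rng.normal(size=(in_dim, dim))
weights = [tuple(rng.normal(size=(dim, dim)) for _ in range(3)) for _ in range(L - 1)]
levels = encode_image(tokens, w_embed, weights)
print(len(levels), levels[-1].shape)
```

The sketch only shows the dependency of each level on the previous one; a practical encoder would typically also change resolution and channel width between levels.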

2. The method according to claim 1, characterized in that each of the pulse driving modules comprises K layers of pulse driving sub-modules; and performing impulse response encoding on the i-th first intermediate image feature to obtain the i-th second intermediate image feature comprises: for the k-th layer pulse driving sub-module of the i-th pulse driving module, performing a Gaussian convolution operation on the i-th first intermediate image feature to obtain a k-th layer convolutional image feature; and, based on a selective attention mechanism, performing attention sorting and normalization on the k-th layer convolutional image feature to obtain the i-th second intermediate image feature of the k-th layer, thereby obtaining the i-th second intermediate image features of the K layers, where K is a positive integer ≥ 1 and 1 ≤ k ≤ K.
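One possible reading of the per-layer operation in claim 2 is sketched below. The claim does not define the Gaussian kernel or the sorting rule, so the following are assumptions: each layer k uses its own standard deviation `sigma`, the convolution is a separable zero-padded 2-D Gaussian blur, "attention sorting" keeps the `keep` largest entries of each row, and normalization is a softmax over the kept entries. All function and parameter names are hypothetical.

```python
import numpy as np

def gaussian_kernel(sigma):
    """1-D Gaussian kernel truncated at 3 sigma and normalized to sum to 1."""
    radius = int(np.ceil(3 * sigma))
    t = np.arange(-radius, radius + 1)
    k = np.exp(-t**2 / (2 * sigma**2))
    return k / k.sum()

def gaussian_blur(x, sigma):
    """Separable 2-D Gaussian convolution with zero padding at the borders."""
    k = gaussian_kernel(sigma)
    conv = lambda r: np.convolve(r, k, mode="same")
    x = np.apply_along_axis(conv, 0, x)
    return np.apply_along_axis(conv, 1, x)

def select_and_normalize(scores, keep):
    """Keep the `keep` largest scores per row, then softmax over the kept ones."""
    order = np.argsort(scores, axis=-1)            # ascending order per row
    masked = scores.copy()
    np.put_along_axis(masked, order[:, :-keep], -np.inf, axis=-1)
    e = np.exp(masked - masked.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def impulse_response_encode(first, sigmas, keep):
    """Return one second intermediate feature per layer k = 1..K."""
    return [select_and_normalize(gaussian_blur(first, s), keep) for s in sigmas]

rng = np.random.default_rng(0)
first = rng.normal(size=(16, 16))                  # i-th first intermediate feature
second = impulse_response_encode(first, sigmas=(0.5, 1.0), keep=4)
```

Under these assumptions each row of a second intermediate feature has exactly `keep` non-zero weights that sum to 1, so weak (background-like) responses are zeroed out while the strongest responses are retained.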

3. The method according to claim 2, characterized in that generating, based on the value feature and the i-th second intermediate image feature, the i-th level image feature corresponding to the i-th pulse driving module comprises: multiplying the i-th second intermediate image feature of the k-th layer by the value feature to obtain a third intermediate image feature of the k-th layer, thereby obtaining the third intermediate image features of the K layers; and concatenating the third intermediate image features of the K layers to generate the i-th level image feature corresponding to the i-th pulse driving module.
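The multiply-then-concatenate step of claim 3 might look as follows. This is a sketch under two assumptions that the claim leaves open: "multiplying" is interpreted as a matrix product of a (tokens, tokens) weight map with a (tokens, dim) value feature, and the K results are concatenated along the channel axis. The name `build_level_feature` is hypothetical.

```python
import numpy as np

def build_level_feature(second_feats, value):
    """Multiply each layer's second intermediate feature by V, then concatenate."""
    third = [a @ value for a in second_feats]      # K arrays of shape (tokens, dim)
    return np.concatenate(third, axis=-1)          # shape (tokens, K * dim)

rng = np.random.default_rng(0)
K, n_tokens, dim = 2, 16, 8
second_feats = [rng.random((n_tokens, n_tokens)) for _ in range(K)]
value = rng.normal(size=(n_tokens, dim))
level_feature = build_level_feature(second_feats, value)
print(level_feature.shape)  # (16, 16)
```

With this interpretation the channel width of the level feature grows with K, so each layer's response stays separately addressable in the output.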

4. The method according to claim 1, characterized in that the cross-layer perception decoder comprises a first decoding module, a second decoding module, and a third decoding module; and decoding the L levels of image features using the cross-layer perception decoder to obtain the target landslide image comprises: decoding, using the first decoding module, the L-th level image feature corresponding to the L-th pulse driving module to obtain an L-th image decoding feature; repeating the following operation until the second image decoding feature is generated: decoding, using the second decoding module, the i-th image decoding feature and the (i-1)-th level image feature corresponding to the (i-1)-th pulse driving module to obtain the (i-1)-th image decoding feature; decoding, using the third decoding module, the second image decoding feature and the first level image feature corresponding to the first pulse driving module to obtain a first image decoding feature; and obtaining the target landslide image based on the first image decoding feature.

5. The method according to claim 4, characterized in that the first decoding module comprises a multi-scale context-aware module and an efficient upsampling module; and decoding, using the first decoding module, the L-th level image feature corresponding to the L-th pulse driving module to obtain the L-th image decoding feature comprises: performing multi-scale feature extraction on the L-th level image feature using the multi-scale context-aware module to obtain an L-th intermediate image decoding feature; and upsampling the L-th intermediate image decoding feature using the efficient upsampling module to obtain the L-th image decoding feature.

6. The method according to claim 5, characterized in that the second decoding module comprises a cross-scale cross-fusion module, the multi-scale context-aware module, and the efficient upsampling module; and decoding, using the second decoding module, the i-th image decoding feature and the (i-1)-th level image feature corresponding to the (i-1)-th pulse driving module to obtain the (i-1)-th image decoding feature comprises: performing cross-scale cross-feature fusion on the i-th image decoding feature and the (i-1)-th level image feature corresponding to the (i-1)-th pulse driving module using the cross-scale cross-fusion module to obtain an (i-1)-th image fusion feature; performing multi-scale feature extraction on the (i-1)-th image fusion feature using the multi-scale context-aware module to obtain an (i-1)-th intermediate image decoding feature; and upsampling the (i-1)-th intermediate image decoding feature using the efficient upsampling module to obtain the (i-1)-th image decoding feature.

7. The method according to claim 6, characterized in that the third decoding module comprises the cross-scale cross-fusion module and the multi-scale context-aware module; and decoding, using the third decoding module, the second image decoding feature and the first level image feature corresponding to the first pulse driving module to obtain the first image decoding feature comprises: performing cross-scale cross-feature fusion on the second image decoding feature and the first level image feature corresponding to the first pulse driving module using the cross-scale cross-fusion module to obtain a first image fusion feature; and performing multi-scale feature extraction on the first image fusion feature using the multi-scale context-aware module to obtain the first image decoding feature.
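The decoding order recited in claims 4 to 7 can be sketched as a short control-flow example. Only the order of operations follows the claims; the module internals are placeholders chosen for illustration and are assumptions: the multi-scale context-aware module is an identity function, the cross-scale cross-fusion module is an element-wise mean, the upsampling module is nearest-neighbour 2x upsampling, and each encoder level is assumed to have half the spatial size of the previous one. All function names are hypothetical.

```python
import numpy as np

def upsample2x(x):
    """Nearest-neighbour 2x upsampling of an (H, W, C) map (placeholder)."""
    return x.repeat(2, axis=0).repeat(2, axis=1)

def context(x):
    """Placeholder for the multi-scale context-aware module (identity)."""
    return x

def fuse(a, b):
    """Placeholder for cross-scale cross-fusion (element-wise mean)."""
    return 0.5 * (a + b)

def first_decoding_module(f_top):
    return upsample2x(context(f_top))                  # context, then upsample

def second_decoding_module(d_i, f_prev):
    return upsample2x(context(fuse(d_i, f_prev)))      # fuse, context, upsample

def third_decoding_module(d_2, f_1):
    return context(fuse(d_2, f_1))                     # fuse, context, no upsample

def decode(levels):
    """levels[0] is level 1 (finest); returns decoding features keyed by index."""
    L = len(levels)
    assert L >= 3, "claim 1 requires L >= 3"
    decoded = {L: first_decoding_module(levels[L - 1])}
    for i in range(L, 2, -1):                          # i = L, L-1, ..., 3
        decoded[i - 1] = second_decoding_module(decoded[i], levels[i - 2])
    decoded[1] = third_decoding_module(decoded[2], levels[0])
    return decoded

# Three pyramid levels with spatial sizes 32, 16, 8 and 4 channels each.
levels = [np.ones((32 // 2**i, 32 // 2**i, 4)) for i in range(3)]
decoded = decode(levels)
print({i: d.shape for i, d in sorted(decoded.items())})
```

Because the first and second decoding modules upsample their output, each decoding feature already matches the resolution of the next shallower encoder level when the two are fused.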

8. The method according to claim 7, characterized in that obtaining the target landslide image based on the first image decoding feature comprises: obtaining the target landslide image based on the first image decoding feature, the (i-1)-th intermediate image decoding feature, and the L-th intermediate image decoding feature.

9. An electronic device, comprising: one or more processors; and a memory configured to store one or more programs; characterized in that, when the one or more programs are executed by the one or more processors, the one or more processors are caused to implement the method according to any one of claims 1 to 8.