Hyperspectral anomaly detection method based on dual-domain consistent reconstruction network

CN122244624A · Pending · Publication Date: 2026-06-19 · QUANZHOU INST OF EQUIP MFG +1

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
QUANZHOU INST OF EQUIP MFG
Filing Date
2026-05-18
Publication Date
2026-06-19


Abstract

This invention relates to the field of computer vision, specifically to a hyperspectral anomaly detection method based on a dual-domain consistency reconstruction network. The method comprises: S1: acquiring the hyperspectral image to be detected, preprocessing it, and using the preprocessed hyperspectral image for unsupervised training; S2: training the DDCRNet model to obtain a trained DDCRNet model; S3: constructing a dual-domain consistency reconstruction loss function and using it to adjust the trained DDCRNet model to obtain an adjusted DDCRNet model; S4: using the adjusted DDCRNet model to perform anomaly detection on the image and generate an anomaly detection map. The method resolves the conflict between long-range modeling and local detail preservation and improves detection accuracy.

Description

Technical Field

[0001] This invention relates to the field of computer vision, and more specifically to a hyperspectral anomaly detection method based on a dual-domain consistency reconstruction network.

Background Technology

[0002] Existing hyperspectral anomaly detection methods typically rely on traditional statistical models, such as the RX algorithm, that assume the background follows a specific distribution, such as a multivariate Gaussian distribution. However, real-world hyperspectral data often exhibits complex nonlinear structures and multimodal distributions, so simple statistical assumptions are ill-suited to complex and variable backgrounds, which degrades detection performance. Linear representation-based methods such as CRD and LRR incorporate subspace learning, but they are limited by shallow linear structures and cannot effectively capture deep nonlinear spatial-spectral dependencies in hyperspectral images. This restricts their ability to model complex backgrounds and identify subtle anomalies.

Summary of the Invention

[0003] The purpose of this invention is to provide a hyperspectral anomaly detection method based on a dual-domain consistency reconstruction network to improve recognition accuracy.

[0004] To achieve the above objectives, the present invention adopts the following technical solution: A method for establishing a dual-domain consistency reconstruction network is proposed, which constructs a DDCRNet model for hyperspectral anomaly detection. The DDCRNet model includes a spatial branch, a frequency branch, a fusion module, and an image reconstruction module. This spatial branch uses the Mamba sequence model to capture the long-range contextual dependencies of the input image sequence. At the same time, it uses the Haar wavelet transform to explicitly separate low-frequency structure and high-frequency texture through the Haar wavelet gated residual fusion module, and suppresses high-frequency noise through the gating mechanism to obtain spatial domain features. This frequency branch uses Fast Fourier Transform to map the image sequence to the frequency domain, employs a joint real and imaginary part modeling strategy to perform frequency domain feature interaction, and then recovers the frequency domain features through inverse transform. The fusion module uses learnable parameters to dynamically adjust the spatial domain features and the frequency domain features and perform weighted fusion, and generates a background reconstruction image through the image reconstruction module.

[0005] Preferably, the specific processing steps for the spatial branch are as follows: The first feature of the image sequence is extracted using the Mamba sequence model. The processing steps of the Haar wavelet-gated residual fusion module are as follows: The first feature is orthogonally decomposed using the discrete Haar wavelet transform to obtain a low-frequency approximation subband, a horizontal high-frequency subband, a vertical high-frequency subband, and a diagonal high-frequency subband. The low-frequency approximation subband, the horizontal high-frequency subband, the vertical high-frequency subband, and the diagonal high-frequency subband are concatenated to form a high-frequency subband feature. This high-frequency subband feature is input into a gating network containing a gating mapping function and an activation function to generate a reliability weight, and the reliability weight is used to adjust the high-frequency subband feature to obtain an enhanced high-frequency feature. The enhanced high-frequency feature and the low-frequency approximation subband are inverse-transformed and recombined by projection to obtain an enhanced feature. The enhanced feature is merged along the channel dimension with the feature slice at the corresponding position extracted from the backbone of the Mamba sequence model, and the merged features interact through a learnable point-wise channel fusion operator. The features after interaction are added element-wise to the first feature to obtain the spatial domain features.

[0006] Preferably, the specific processing steps for the frequency branch are as follows: The spectrum of the image sequence is obtained using the two-dimensional discrete Fourier transform, and a frequency shift operation is then performed to obtain the complex spectrum. The real and imaginary parts of the complex spectrum are concatenated along the channel dimension to form a tensor. The tensor is input into a shared frequency domain convolution module, which learns the correlation between the real and imaginary parts in the frequency domain to obtain convolutional features. The convolutional features are fused with the tensor through a residual connection to obtain the first frequency feature. The first frequency feature is separated again into real and imaginary parts to obtain a complex spectrum, which is restored to the spatial domain using the inverse frequency shift and the two-dimensional inverse Fourier transform and then added to the first feature through a residual connection to obtain the frequency domain feature.

[0007] The hyperspectral anomaly detection method based on a dual-domain consistency reconstruction network includes the following steps performed sequentially: S1: Obtain the hyperspectral image to be detected, preprocess the hyperspectral image, and use the preprocessed hyperspectral image for unsupervised training; S2: Train the DDCRNet model constructed by the dual-domain consistency reconstruction network construction method described above using the preprocessed hyperspectral image to obtain the trained DDCRNet model; S3: Construct a dual-domain consistency reconstruction loss function, and use this dual-domain consistency reconstruction loss function to adjust the trained DDCRNet model to obtain the adjusted DDCRNet model; S4: Use the adjusted DDCRNet model to perform anomaly detection on the preprocessed hyperspectral image and generate an anomaly detection map.

[0008] Preferably, the preprocessing step in step S1 specifically involves: constructing a multi-scale patch extractor using a sliding window strategy; using the multi-scale patch extractor to extract image patch sequences of different scales from the hyperspectral image; and using a first convolution module to map the image patch sequences to a unified high-dimensional feature space to obtain a high-dimensional image patch sequence.

[0009] Preferably, the dual-domain consistency reconstruction loss function $\mathcal{L}_{\mathrm{DDCR}}$ is expressed by the following formula: $\mathcal{L}_{\mathrm{DDCR}} = \lVert X - \hat{X} \rVert_1 + \lambda \, \lVert \, |\mathcal{F}(X)| - |\mathcal{F}(\hat{X})| \, \rVert_1$; where $X$ represents the hyperspectral image to be detected, $\hat{X}$ represents the background reconstructed image, $\lambda$ represents the weighting coefficient, $\lVert \cdot \rVert_1$ denotes the L1 norm, $|\mathcal{F}(X)|$ represents the spectral amplitude obtained after the hyperspectral image undergoes a two-dimensional discrete Fourier transform, and $|\mathcal{F}(\hat{X})|$ represents the spectral amplitude obtained after the background reconstructed image undergoes a two-dimensional discrete Fourier transform.

By adopting the aforementioned design scheme, the beneficial effects of the present invention are as follows: This application uses a dual-domain parallel DDCRNet model for hyperspectral image detection. The spatial branch uses Mamba to capture long-range dependencies, and the frequency branch uses FFT to directly process the global spectrum. The two branches are structurally independent but tightly coupled through a fusion mechanism, thus achieving dual-domain collaboration. The frequency branch adopts a joint real and imaginary part modeling strategy, directly performing convolution interaction between the real and imaginary parts in the frequency domain, which avoids complex attention calculations, significantly reduces computational complexity, and preserves global frequency consistency. The spatial branch incorporates a HaarGate-ResFuse module, which explicitly separates low-frequency structure and high-frequency texture using the Haar wavelet transform and suppresses high-frequency noise through a gating mechanism. This structural design is unique among existing Mamba-like methods and resolves the conflict between long-range modeling and local detail preservation. A dual-domain consistency reconstruction loss is introduced, adding a frequency domain amplitude spectrum consistency loss to the spatial L1 loss. This additional constraint forces the network to maintain background consistency in the frequency domain as well, thereby more effectively suppressing background interference and highlighting abnormal targets.

Attached Figure Description

[0010] Figure 1 is a flowchart of the anomaly detection method of the present invention; Figure 2 is a diagram of the architecture of the Haar wavelet-gated residual fusion module of the present invention; Figure 3 is a comparison of the anomaly detection results of the present invention and other methods on four datasets; Figure 4 shows the detection results of the present invention on four datasets.

Detailed Implementation

[0011] To make the objectives, technical solutions, and advantages of this invention clearer, the invention will be further described in detail below with reference to the accompanying drawings. Obviously, the described embodiments are merely some embodiments of this invention, and not all embodiments. Based on the embodiments of this invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of this invention.

[0012] The terms "first," "second," "third," etc., used in the specification, claims, and accompanying drawings of this invention are used to distinguish different objects, not to describe a specific order. Furthermore, the terms "comprising" and "having," and any variations thereof, are intended to cover non-exclusive inclusion. For example, a process, method, system, product, or apparatus that includes a series of steps or units is not limited to the listed steps or units, but may optionally include steps or units not listed, or may optionally include other steps or units inherent to these processes, methods, products, or apparatuses.

[0013] A method for establishing a dual-domain consistency reconstruction network is proposed, which constructs the DDCRNet model for hyperspectral anomaly detection. The DDCRNet model includes a spatial branch, a frequency branch, a fusion module, and an image reconstruction module. The fusion module includes learnable weight parameters and a feature interaction layer. The learnable weight parameters dynamically adjust the fusion ratio of spatial and frequency domain features, and the feature interaction layer performs channel concatenation and convolution interaction on the weighted spatial and frequency domain features. The image reconstruction module includes a convolutional reconstruction layer, which maps the fused dual-domain features to a background reconstruction image with the same size as the input hyperspectral image.
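The fusion and reconstruction stage described above can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the scalar weights `alpha` and `beta`, the use of 1×1 (point-wise) convolutions for both the interaction layer and the reconstruction layer, the ReLU nonlinearity, and all function names are hypothetical choices that the text does not specify.

```python
import numpy as np

def pointwise_conv(x, weight):
    """1x1 convolution: mix the channels of x (C_in, H, W) with weight (C_out, C_in)."""
    return np.einsum("oc,chw->ohw", weight, x)

def fuse_and_reconstruct(f_spatial, f_freq, alpha, beta, w_interact, w_recon):
    # Learnable scalars alpha and beta (hypothetical) set the fusion ratio.
    weighted = np.concatenate([alpha * f_spatial, beta * f_freq], axis=0)
    # Feature interaction layer: channel concatenation followed by convolution.
    # The ReLU here is an assumption; the text names no activation.
    fused = np.maximum(pointwise_conv(weighted, w_interact), 0.0)
    # Convolutional reconstruction layer: map fused features to the B input bands.
    return pointwise_conv(fused, w_recon)

rng = np.random.default_rng(0)
C, B, H, W = 8, 5, 6, 6  # feature channels, spectral bands, height, width
f_s = rng.normal(size=(C, H, W))
f_f = rng.normal(size=(C, H, W))
background = fuse_and_reconstruct(
    f_s, f_f, alpha=0.6, beta=0.4,
    w_interact=rng.normal(size=(C, 2 * C)),
    w_recon=rng.normal(size=(B, C)),
)
print(background.shape)  # (5, 6, 6)
```

In a trained model the weights would be learned jointly with the two branches; random values are used here only to show the tensor shapes.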

[0014] This spatial branch uses the Mamba sequence model to capture the long-range contextual dependencies of the input image sequence. At the same time, it uses the Haar wavelet transform to explicitly separate low-frequency structure and high-frequency texture through the Haar wavelet gated residual fusion module (HaarGate-ResFuse), and suppresses high-frequency noise through the gating mechanism to obtain spatial domain features. This frequency branch uses Fast Fourier Transform to map the image sequence to the frequency domain, employs a joint real and imaginary part modeling strategy to perform frequency domain feature interaction, and then recovers the frequency domain features through inverse transform. The fusion module uses learnable parameters to dynamically adjust the spatial domain features and the frequency domain features and perform weighted fusion, and generates a background reconstruction image through the image reconstruction module.

[0015] In this embodiment, the specific processing steps for the spatial branch are as follows: The first feature $F_1$ of the input image sequence is extracted using the Mamba sequence model. As shown in Figure 2, the processing steps of the Haar wavelet-gated residual fusion module are as follows: The structural information of an image is typically concentrated in the low-frequency components, while texture details and noise are mixed in the high-frequency components. Therefore, the discrete Haar wavelet transform (DWT) is used to orthogonally decompose the first feature: a guide map is generated through lightweight convolution, and a single-level wavelet transform is then performed on it to obtain the low-frequency approximation subband (LL), the horizontal high-frequency subband (LH), the vertical high-frequency subband (HL), and the diagonal high-frequency subband (HH). Not all high-frequency information is beneficial for anomaly detection, as it often contains a large amount of redundant noise. Therefore, a gating network is designed to evaluate the effectiveness of the high-frequency components. The low-frequency approximation subband, the horizontal high-frequency subband, the vertical high-frequency subband, and the diagonal high-frequency subband are concatenated to form the high-frequency subband feature $F_H$. The high-frequency subband feature is input into a gating network containing a gating mapping function $\phi$ and a Sigmoid activation function $\sigma$ to generate reliability weights, and these reliability weights are used to adjust the high-frequency subband feature to obtain the enhanced high-frequency feature $\hat{F}_H$. The process is represented by the following formula: $\hat{F}_H = \sigma(\phi(F_H)) \odot F_H$; where $\odot$ represents the element-wise multiplication operator. The reliability weight quantitatively indicates whether the high-frequency texture at the corresponding location should be preserved. A weight value close to 1 indicates that the texture details at that location are rich and reliable and should be preserved; a weight value close to 0 indicates that the content at that location is likely noise and should be suppressed.
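The decomposition and gating steps can be illustrated with a short NumPy sketch. Several points are assumptions rather than details from the text: the gating mapping function is modeled as a 1×1 convolution with a hypothetical weight matrix `w_gate`, the lightweight guide-map convolution is omitted, and only the three detail subbands are gated (the text lists all four subbands in the concatenation, but the low-frequency subband is also passed separately to the inverse transform, so this sketch keeps it out of the gate). Which detail subband is called LH and which HL varies between conventions.

```python
import numpy as np

def haar_dwt2(x):
    """Single-level orthonormal 2D Haar DWT of x with shape (C, H, W); H and W even."""
    a, b = x[:, 0::2, 0::2], x[:, 0::2, 1::2]  # top-left, top-right of each 2x2 block
    c, d = x[:, 1::2, 0::2], x[:, 1::2, 1::2]  # bottom-left, bottom-right
    ll = (a + b + c + d) / 2.0  # low-frequency approximation
    lh = (a + b - c - d) / 2.0  # difference between rows
    hl = (a - b + c - d) / 2.0  # difference between columns
    hh = (a - b - c + d) / 2.0  # diagonal difference
    return ll, lh, hl, hh

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gate_high_frequency(lh, hl, hh, w_gate):
    """Return (enhanced high-frequency feature, reliability weights)."""
    high = np.concatenate([lh, hl, hh], axis=0)  # (3C, H/2, W/2)
    # Gating mapping (assumed 1x1 conv) followed by Sigmoid gives weights in (0, 1).
    weights = sigmoid(np.einsum("oc,chw->ohw", w_gate, high))
    return weights * high, weights  # element-wise modulation

rng = np.random.default_rng(0)
feature = rng.normal(size=(4, 8, 8))
ll, lh, hl, hh = haar_dwt2(feature)
enhanced, weights = gate_high_frequency(lh, hl, hh, rng.normal(size=(12, 12)))
print(ll.shape, enhanced.shape)  # (4, 4, 4) (12, 4, 4)
```

Because the transform is orthonormal, the four subbands together carry exactly the energy of the input, so any suppression of high-frequency content comes from the gate alone.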

[0016] The enhanced high-frequency feature and the low-frequency approximation subband are inverse-transformed and recombined by projection to obtain the enhanced feature $F_e$. This enhanced feature is merged along the channel dimension with the feature slice $F_m$ at the corresponding position extracted from the backbone of the Mamba sequence model, and the merged features interact through $\psi$, a learnable point-wise channel fusion operator. The operator $\psi$ can be implemented using 1×1 convolutions that map the 2M-channel concatenated features to M-channel features while maintaining spatial resolution, which fuses the Mamba backbone features and the Haar-guided enhancement features. The features after interaction are then added element-wise to the first feature $F_1$ to obtain the spatial domain feature $F_s$. The process is represented by the following formula: $F_s = F_1 + \psi(\mathrm{Concat}(F_e, F_m))$; where $\mathrm{Concat}(\cdot)$ indicates the channel concatenation operation.
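The inverse transform and residual fusion described in this paragraph can be sketched as follows. This is an illustration under assumptions: the enhanced high-frequency feature is taken to hold three detail subbands stacked along the channel axis, the point-wise fusion operator is a single 1×1 convolution with a hypothetical weight matrix `w_fuse` of shape (M, 2M), and the function names are invented for the example.

```python
import numpy as np

def haar_idwt2(ll, lh, hl, hh):
    """Inverse of a single-level orthonormal 2D Haar DWT; subbands are (C, h, w)."""
    c, h, w = ll.shape
    x = np.empty((c, 2 * h, 2 * w), dtype=float)
    x[:, 0::2, 0::2] = (ll + lh + hl + hh) / 2.0
    x[:, 0::2, 1::2] = (ll + lh - hl - hh) / 2.0
    x[:, 1::2, 0::2] = (ll - lh + hl - hh) / 2.0
    x[:, 1::2, 1::2] = (ll - lh - hl + hh) / 2.0
    return x

def spatial_feature(f1, ll, enhanced_high, f_mamba, w_fuse):
    m = ll.shape[0]
    # Split the gated detail subbands back out of the stacked tensor (assumed layout).
    lh, hl, hh = enhanced_high[:m], enhanced_high[m:2 * m], enhanced_high[2 * m:]
    f_e = haar_idwt2(ll, lh, hl, hh)                 # enhanced feature, (M, H, W)
    merged = np.concatenate([f_e, f_mamba], axis=0)  # (2M, H, W)
    interacted = np.einsum("oc,chw->ohw", w_fuse, merged)  # 1x1 conv: 2M -> M
    return f1 + interacted                           # residual addition

rng = np.random.default_rng(0)
M, h, w = 4, 3, 3
f1 = rng.normal(size=(M, 2 * h, 2 * w))
f_s = spatial_feature(
    f1,
    ll=rng.normal(size=(M, h, w)),
    enhanced_high=rng.normal(size=(3 * M, h, w)),
    f_mamba=rng.normal(size=(M, 2 * h, 2 * w)),
    w_fuse=rng.normal(size=(M, 2 * M)),
)
print(f_s.shape)  # (4, 6, 6)
```

The residual form means that with zero fusion weights the spatial branch falls back to the Mamba feature unchanged.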

[0017] Through the above steps, the Haar wavelet-gated residual fusion module of this application can adaptively filter effective textures based on frequency domain characteristics, effectively preventing high-frequency noise from interfering with background modeling. This not only compensates for the shortcomings of the Mamba model in capturing local details, but also promotes the network to generate a feature representation that has both global context and retains fine structure, providing a solid foundation for subsequent anomaly separation.

[0018] This application addresses the shortcomings of convolutional networks and Transformers in capturing global spectral-frequency consistency by proposing a frequency branch based on joint real and imaginary part modeling. This mechanism utilizes the global receptive field of FFT to explicitly capture cross-regional dependencies in the frequency domain, aiming to solve the balance between efficiency and effectiveness in long-distance modeling of existing methods.

[0019] The specific processing steps for the frequency branch are as follows: The spectrum of the first feature $F_1$ is obtained using the two-dimensional discrete Fourier transform (2D FFT), and a frequency shift (FFTShift) operation is then performed to move the low-frequency components, which carry most of the image energy, to the center of the spectrum so that the layout suits the spatial locality of the subsequent convolution operations, thus obtaining a complex spectrum $Z$. The real part $\mathrm{Re}(Z)$ and the imaginary part $\mathrm{Im}(Z)$ of the complex spectrum are concatenated along the channel dimension to form a tensor $T$. This tensor is input into the shared frequency domain convolution module $\mathcal{C}$, which learns the correlation between the real and imaginary parts (which together encode amplitude and phase) in the frequency domain to obtain convolutional features. These convolutional features are then residually fused with the tensor to obtain the first frequency feature $T'$, which efficiently captures global structural consistency. The process is expressed by the following formula: $T' = T + \mathcal{C}(T)$. The first frequency feature is separated again into real and imaginary parts to obtain a complex spectrum, which is restored to the spatial domain by the inverse shift and the two-dimensional inverse Fourier transform (2D IFFT). The result is then added to the first feature through a residual connection to obtain the frequency domain feature.
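The frequency branch can be sketched with NumPy's FFT routines. The sketch rests on assumptions: the shared frequency domain convolution module is replaced by a single 1×1 convolution with a hypothetical weight matrix `w_conv` of shape (2C, 2C), and the real part of the inverse transform is kept, which the text does not state explicitly.

```python
import numpy as np

def frequency_branch(f1, w_conv):
    """f1: real feature of shape (C, H, W); w_conv: (2C, 2C) stand-in conv weights."""
    c = f1.shape[0]
    # 2D FFT over the spatial axes, then shift low frequencies to the center.
    spec = np.fft.fftshift(np.fft.fft2(f1, axes=(-2, -1)), axes=(-2, -1))
    # Joint modeling: stack real and imaginary parts along the channel axis.
    t = np.concatenate([spec.real, spec.imag], axis=0)   # (2C, H, W)
    # Shared frequency-domain convolution (assumed 1x1) with residual fusion.
    t_out = t + np.einsum("oc,chw->ohw", w_conv, t)
    # Rebuild the complex spectrum and return to the spatial domain.
    spec_out = t_out[:c] + 1j * t_out[c:]
    restored = np.fft.ifft2(np.fft.ifftshift(spec_out, axes=(-2, -1)), axes=(-2, -1))
    # Keeping only the real part is an assumption of this sketch.
    return f1 + restored.real

rng = np.random.default_rng(0)
f1 = rng.normal(size=(4, 8, 8))
f_freq = frequency_branch(f1, 0.1 * rng.normal(size=(8, 8)))
print(f_freq.shape)  # (4, 8, 8)
```

Each output frequency bin depends on every spatial position of the input, which is how the branch obtains a global receptive field at the O(N log N) cost of the FFT rather than the quadratic cost of self-attention.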

[0020] Through the steps described above, the frequency branch of this application can explicitly capture global frequency consistency without incurring quadratic computational complexity. This not only avoids the high computational cost of self-attention mechanisms but also enhances the model's ability to fit the global background distribution by jointly modeling the real and imaginary parts, making the background reconstruction smoother and more consistent, thus making it easier to highlight targets with abnormal frequency characteristics.

[0021] As shown in Figure 1, the hyperspectral anomaly detection method based on the dual-domain consistency reconstruction network includes the following steps performed sequentially: S1: Obtain the hyperspectral image to be detected, preprocess the hyperspectral image, and use the preprocessed hyperspectral image for unsupervised training; the preprocessing step in step S1 is as follows: construct a multi-scale patch extractor using a sliding window strategy, use the multi-scale patch extractor to extract image patch sequences of different scales from the hyperspectral image, and use the first convolution module to map the image patch sequences to a unified high-dimensional feature space to obtain the high-dimensional image patch sequence, i.e., the preprocessed hyperspectral image.

[0022] S2: Train the DDCRNet model constructed by the dual-domain consistency reconstruction network construction method described above using the preprocessed hyperspectral image to obtain the trained DDCRNet model. S3: Construct a dual-domain consistent reconstruction loss function (DDCRLoss) and use it to adjust the trained DDCRNet model to obtain the adjusted DDCRNet model. This dual-domain consistent reconstruction loss function simultaneously constrains spatial pixels and frequency domain amplitude during training, aiming to solve the problem of structural information loss in background reconstruction.

[0023] S4: Use the adjusted DDCRNet model to perform anomaly detection on the preprocessed hyperspectral image and generate an anomaly detection map.
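The preprocessing in step S1 can be sketched as a sliding-window patch extractor followed by a channel embedding. The patch sizes, the stride ratio, the embedding dimension, and the use of a 1×1 convolution (a matrix product over the band axis) for the first convolution module are all assumptions chosen for illustration; the text does not give these values.

```python
import numpy as np

def extract_patches(image, size, stride):
    """Slide a size x size window over image (H, W, B); return (N, size, size, B)."""
    h, w, _ = image.shape
    patches = [image[i:i + size, j:j + size]
               for i in range(0, h - size + 1, stride)
               for j in range(0, w - size + 1, stride)]
    return np.stack(patches)

def multi_scale_patches(image, sizes=(4, 8), stride_ratio=0.5):
    """One patch sequence per scale; the stride is a fraction of the patch size."""
    return {s: extract_patches(image, s, max(1, int(s * stride_ratio))) for s in sizes}

def embed(patches, w_embed):
    """Assumed 1x1 convolution: map B bands to D feature channels at every pixel."""
    return patches @ w_embed  # (N, s, s, B) @ (B, D) -> (N, s, s, D)

rng = np.random.default_rng(0)
image = rng.normal(size=(8, 8, 3))   # toy image: 8 x 8 pixels, 3 bands
w_embed = rng.normal(size=(3, 16))   # hypothetical embedding to 16 channels
scales = multi_scale_patches(image)
embedded = {s: embed(p, w_embed) for s, p in scales.items()}
print(embedded[4].shape, embedded[8].shape)  # (9, 4, 4, 16) (1, 8, 8, 16)
```

Sharing one embedding across scales gives every patch sequence the same channel dimension, which is one plausible reading of a "unified high-dimensional feature space".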

[0024] The specific implementation steps of the dual-domain consistency reconstruction loss function are as follows: Spatial domain fidelity constraint: calculate the L1 distance between the hyperspectral image to be detected $X$ and the background reconstructed image $\hat{X}$ to ensure pixel-level reconstruction accuracy. Frequency domain amplitude consistency constraint: perform the FFT on the hyperspectral image and the background reconstructed image respectively, and calculate the L1 difference between their amplitude spectra $|\mathcal{F}(X)|$ and $|\mathcal{F}(\hat{X})|$. This process forces the model to learn a consistent background representation in the frequency domain as well.

[0025] The dual-domain consistency reconstruction loss function $\mathcal{L}_{\mathrm{DDCR}}$ is expressed by the following formula: $\mathcal{L}_{\mathrm{DDCR}} = \lVert X - \hat{X} \rVert_1 + \lambda \, \lVert \, |\mathcal{F}(X)| - |\mathcal{F}(\hat{X})| \, \rVert_1$; where $X$ represents the hyperspectral image to be detected, $\hat{X}$ represents the background reconstructed image, $\lambda$ represents the weighting coefficient, $\lVert \cdot \rVert_1$ denotes the L1 norm, $|\mathcal{F}(X)|$ represents the spectral amplitude obtained after the hyperspectral image undergoes a two-dimensional discrete Fourier transform, and $|\mathcal{F}(\hat{X})|$ represents the spectral amplitude obtained after the background reconstructed image undergoes a two-dimensional discrete Fourier transform. Through the above steps, the dual-domain consistency reconstruction loss function of this application constrains the network from two complementary domains, suppressing high-frequency noise interference in the background, making the model more robust when reconstructing the background, and thus highlighting abnormal targets more clearly in the final residual map.
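A minimal NumPy sketch of a loss with this structure is given below. It assumes images stored as (H, W, B) arrays with the 2D transform applied over the two spatial axes, uses the mean absolute difference in place of the summed L1 norm (a normalization choice, not something the text specifies), and takes the weighting coefficient `lam` as a hypothetical default of 0.1.

```python
import numpy as np

def ddcr_loss(x, x_hat, lam=0.1):
    """Spatial L1 term plus lam times an amplitude-spectrum L1 term."""
    # Spatial domain fidelity: pixel-wise absolute error (mean-normalized).
    spatial = np.mean(np.abs(x - x_hat))
    # Frequency domain consistency: compare amplitude spectra per band.
    amp = np.abs(np.fft.fft2(x, axes=(0, 1)))
    amp_hat = np.abs(np.fft.fft2(x_hat, axes=(0, 1)))
    frequency = np.mean(np.abs(amp - amp_hat))
    return spatial + lam * frequency

rng = np.random.default_rng(0)
x = rng.normal(size=(8, 8, 5))
noisy = x + 0.1 * rng.normal(size=x.shape)
print(ddcr_loss(x, x))          # 0.0
print(ddcr_loss(x, noisy) > 0)  # True
```

The amplitude term ignores phase, so it is insensitive to circular shifts of the reconstruction; the spatial term is what penalizes misplacement, which is one reason the two constraints complement each other.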

[0026] Figure 3 presents a comparison of the anomaly detection results of the proposed method and other methods such as RX, FRFE-RX, CRD, PCA, LREN, Auto-AD, GT-HAD, and SSHAD on the Pavia, Beach-4, MUUFL, and Salinas datasets. The first column shows the ground truth anomaly annotations, and subsequent columns show the anomaly response maps generated by the different methods. As can be seen from Figure 3, in the anomaly detection map generated by the method of this application, the response of the abnormal target area is more concentrated, while the response of the background area is relatively weak. The method can highlight abnormal targets in the presence of complex backgrounds, edge texture, and high-frequency noise, indicating that the dual-domain consistency reconstruction network can effectively suppress background interference and improve the separability of the abnormal area.

[0027] Figure 4 illustrates the detection results of the proposed method on the Pavia, Beach-4, MUUFL, and Salinas datasets. For each dataset, the upper image is the ground truth anomaly annotation map, and the lower image is the anomaly detection map generated by the proposed method. As can be seen from Figure 4, the high-response regions obtained by the method of this application largely coincide with the real abnormal target locations, and clear responses are formed on abnormal targets of different scales and shapes, indicating that the method has good background reconstruction ability and anomaly saliency.
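The description refers to a residual map between the input and the reconstructed background but does not give the scoring rule. One common choice, shown here purely as an assumption, is the per-pixel Euclidean norm of the spectral residual followed by min-max normalization; the function name and the normalization are hypothetical.

```python
import numpy as np

def anomaly_map(image, background):
    """Per-pixel anomaly score from the reconstruction residual; inputs are (H, W, B)."""
    # Pixels the background model cannot reproduce receive a large residual norm.
    residual = np.linalg.norm(image - background, axis=-1)
    span = residual.max() - residual.min()
    if span == 0:
        return np.zeros_like(residual)  # flat residual: nothing stands out
    return (residual - residual.min()) / span  # scale scores to [0, 1]

image = np.zeros((5, 5, 3))
image[2, 3] = [1.0, 2.0, 2.0]       # a single anomalous pixel
background = np.zeros((5, 5, 3))    # idealized reconstruction without the anomaly
scores = anomaly_map(image, background)
print(scores[2, 3], scores.sum())   # 1.0 1.0
```

High-response regions in such a map can then be compared against the ground truth annotations, as is done visually in Figures 3 and 4.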

[0028] The specific embodiments described above further illustrate the purpose, technical solution, and beneficial effects of the present invention. It should be understood that the above description is only a specific embodiment of the present invention and is not intended to limit the scope of protection of the present invention. Any modifications, equivalent substitutions, improvements, etc., made within the spirit and principles of the present invention should be included within the scope of protection of the present invention.

Claims

1. A method for establishing a dual-domain consistency reconstruction network, characterized in that: A DDCRNet model is constructed for hyperspectral anomaly detection. The DDCRNet model includes a spatial branch, a frequency branch, a fusion module, and an image reconstruction module. The spatial branch uses the Mamba sequence model to capture the long-range contextual dependencies of the input image sequence. At the same time, it uses the Haar wavelet transform to explicitly separate low-frequency structure and high-frequency texture through the Haar wavelet-gated residual fusion module, and suppresses high-frequency noise through the gating mechanism to obtain spatial domain features. The frequency branch uses the Fast Fourier Transform to map the image sequence to the frequency domain, employs a joint real and imaginary part modeling strategy to perform frequency domain feature interaction, and then recovers the frequency domain features through the inverse transform. The fusion module uses learnable parameters to dynamically adjust the spatial domain features and the frequency domain features and perform weighted fusion, and generates a background reconstruction image through the image reconstruction module.

2. The method of claim 1, wherein: The specific processing steps for the spatial branch are as follows: The first feature of the image sequence is extracted using the Mamba sequence model. The processing steps of the Haar wavelet-gated residual fusion module are as follows: The first feature is orthogonally decomposed using the discrete Haar wavelet transform to obtain a low-frequency approximation subband, a horizontal high-frequency subband, a vertical high-frequency subband, and a diagonal high-frequency subband. The low-frequency approximation subband, the horizontal high-frequency subband, the vertical high-frequency subband, and the diagonal high-frequency subband are concatenated to form a high-frequency subband feature. This high-frequency subband feature is input into a gating network containing a gating mapping function and an activation function to generate a reliability weight, and the reliability weight is used to adjust the high-frequency subband feature to obtain an enhanced high-frequency feature. The enhanced high-frequency feature and the low-frequency approximation subband are inverse-transformed and recombined by projection to obtain an enhanced feature. The enhanced feature is merged along the channel dimension with the feature slice at the corresponding position extracted from the backbone of the Mamba sequence model, and the merged features interact through a learnable point-wise channel fusion operator. The features after interaction are added to the first feature to obtain the spatial domain features.

3. The method of claim 2, wherein: The specific processing steps for the frequency branch are as follows: The spectrum of the image sequence is obtained using the two-dimensional discrete Fourier transform, and a frequency shift operation is then performed to obtain the complex spectrum. The real and imaginary parts of the complex spectrum are concatenated along the channel dimension to form a tensor. The tensor is input into a shared frequency domain convolution module, which learns the correlation between the real and imaginary parts in the frequency domain to obtain convolutional features. The convolutional features are fused with the tensor through a residual connection to obtain the first frequency feature. The first frequency feature is separated again into real and imaginary parts to obtain a complex spectrum, which is restored to the spatial domain using the inverse frequency shift and the two-dimensional inverse Fourier transform and then added to the first feature through a residual connection to obtain the frequency domain feature.

4. A hyperspectral anomaly detection method based on a dual-domain consistency reconstruction network, characterized in that it comprises the following steps performed sequentially: S1: Obtain the hyperspectral image to be detected, preprocess the hyperspectral image, and use the preprocessed hyperspectral image for unsupervised training; S2: Train the DDCRNet model constructed by the method for establishing a dual-domain consistency reconstruction network as described in any one of claims 1-3 using the preprocessed hyperspectral image to obtain the trained DDCRNet model; S3: Construct a dual-domain consistency reconstruction loss function, and use this dual-domain consistency reconstruction loss function to adjust the trained DDCRNet model to obtain the adjusted DDCRNet model; S4: Use the adjusted DDCRNet model to perform anomaly detection on the preprocessed hyperspectral image and generate an anomaly detection map.

5. The hyperspectral anomaly detection method based on dual-domain consistent reconstruction network of claim 4, wherein: The preprocessing step in step S1 is as follows: a multi-scale patch extractor is constructed using a sliding window strategy; the multi-scale patch extractor is used to extract image patch sequences of different scales from the hyperspectral image; and the first convolution module is used to map the image patch sequences to a unified high-dimensional feature space to obtain a high-dimensional image patch sequence.

6. The hyperspectral anomaly detection method based on dual-domain consistent reconstruction network of claim 5, wherein: The dual-domain consistency reconstruction loss function $\mathcal{L}_{\mathrm{DDCR}}$ is expressed by the following formula: $\mathcal{L}_{\mathrm{DDCR}} = \lVert X - \hat{X} \rVert_1 + \lambda \, \lVert \, |\mathcal{F}(X)| - |\mathcal{F}(\hat{X})| \, \rVert_1$; wherein $X$ denotes the hyperspectral image to be detected, $\hat{X}$ denotes the background reconstruction image, $\lambda$ denotes a weight coefficient, $\lVert \cdot \rVert_1$ denotes the L1 norm, $|\mathcal{F}(X)|$ denotes the spectral amplitude obtained by two-dimensional discrete Fourier transform of the hyperspectral image, and $|\mathcal{F}(\hat{X})|$ denotes the spectral amplitude obtained by two-dimensional discrete Fourier transform of the background reconstruction image.