A hyperspectral image super-resolution method, device, system and storage medium

By constructing high-resolution and low-resolution models for feature analysis and training, the problems of high computational cost and poor interpretability of hyperspectral image super-resolution methods are solved, achieving high-quality image reconstruction and reducing computational burden.

CN120689203B · Active · Publication Date: 2026-06-19 · WUHAN INST OF TECH

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Patents (China)
Current Assignee / Owner: WUHAN INST OF TECH
Filing Date: 2025-05-26
Publication Date: 2026-06-19

AI Technical Summary

Technical Problem

Existing hyperspectral image super-resolution methods are computationally expensive and time-consuming, and deep learning-based methods are difficult to interpret and verify, making them unsuitable for applications in fields requiring high interpretability.

Method used

By constructing high-resolution and low-resolution models, performing feature analysis to obtain dictionary features and coefficient features, and training the super-resolution model, the mapping from low-resolution images to high-resolution images is achieved.

Benefits of technology

Higher-quality hyperspectral images with high spatial resolution are generated, reducing computational cost and time while maintaining the interpretability of the model.


Abstract

This invention provides a method, apparatus, system, and storage medium for hyperspectral image super-resolution, belonging to the field of image reconstruction technology. The method includes: importing an original high-resolution hyperspectral image and an original low-resolution hyperspectral image; constructing a high-resolution model and a low-resolution model; and performing high-resolution feature analysis on the original high-resolution hyperspectral image using the high-resolution model to obtain high-resolution dictionary features and target high-resolution coefficient features. This invention solves the technical problem of poor quality of generated results in the current process of generating high-resolution hyperspectral images from low-resolution hyperspectral images, realizing end-to-end mapping between low-resolution and high-resolution hyperspectral images, thereby generating higher-quality high spatial resolution hyperspectral images and reducing computational costs and time.

Description

Technical Field

[0001] This invention relates primarily to the field of image reconstruction technology, specifically to a hyperspectral image super-resolution method, apparatus, system, and storage medium.

Background Technology

[0002] Hyperspectral images possess rich reflectance information and a continuous, broad spectral range. Thanks to this abundant spectral information, hyperspectral images are widely used in fields such as target detection, land cover classification, and remote sensing change detection. However, the acquisition of hyperspectral images is always subject to a trade-off between spatial detail and spectral coverage. Due to the physical limitations of imaging sensors, it is not possible to directly acquire hyperspectral images that have high resolution in both the spatial and spectral domains at the same time. Therefore, super-resolution, a classic computer vision task, is a powerful alternative to directly acquiring high-resolution hyperspectral images in terms of economy, practicality, and sustainability.

[0003] In recent years, researchers have proposed numerous super-resolution methods for single hyperspectral images, including sparse representation, total variation methods, and low-rank priors. However, these methods typically model the super-resolution process as a complex optimization problem. While highly interpretable, they require extensive iterative computation, which makes them computationally expensive and time-consuming. Deep learning methods have gained widespread attention in various fields of computer vision due to their powerful feature extraction and representation capabilities. Some scholars have used deep learning-based hyperspectral image super-resolution techniques to improve the subjective and objective quality of reconstructed images. Although deep learning-based methods are highly efficient in their optimization processes, their inherent "black box" nature makes their decision-making processes difficult to understand and verify, limiting their adoption in domains that demand high interpretability, such as government and public policy formulation.

Summary of the Invention

[0004] The technical problem to be solved by the present invention is to provide a hyperspectral image super-resolution method, apparatus, system and storage medium to address the shortcomings of the prior art.

[0005] The technical solution of this invention to solve the above-mentioned technical problems is as follows: A hyperspectral image super-resolution method, comprising the following steps:

[0006] Import multiple original high-resolution hyperspectral images and original low-resolution hyperspectral images corresponding to each of the original high-resolution hyperspectral images;

[0007] A high-resolution model and a low-resolution model are constructed. The high-resolution model is used to perform high-resolution feature analysis on each of the original high-resolution hyperspectral images to obtain high-resolution dictionary features corresponding to each of the original high-resolution hyperspectral images and target high-resolution coefficient features corresponding to each of the original high-resolution hyperspectral images.

[0008] The low-resolution model is used to perform low-resolution feature analysis on each of the original low-resolution hyperspectral images to obtain the target low-resolution dictionary features and target low-resolution coefficient features corresponding to each of the original high-resolution hyperspectral images.

[0009] The low-resolution model is trained using all the high-resolution dictionary features, all the target high-resolution coefficient features, all the target low-resolution dictionary features, and all the target low-resolution coefficient features to obtain a super-resolution model.

[0010] Import the low-resolution image to be processed, and reconstruct the image using the super-resolution model to obtain the hyperspectral image super-resolution result.

[0011] Another technical solution of the present invention to solve the above-mentioned technical problems is as follows: A hyperspectral image super-resolution device, comprising:

[0012] The import module is used to import multiple original high-resolution hyperspectral images and original low-resolution hyperspectral images corresponding to each of the original high-resolution hyperspectral images.

[0013] The high-resolution feature analysis module is used to construct a high-resolution model and a low-resolution model. The high-resolution model is used to perform high-resolution feature analysis on each of the original high-resolution hyperspectral images to obtain high-resolution dictionary features corresponding to each of the original high-resolution hyperspectral images and target high-resolution coefficient features corresponding to each of the original high-resolution hyperspectral images.

[0014] The low-resolution feature analysis module is used to perform low-resolution feature analysis on each of the original low-resolution hyperspectral images using the low-resolution model, so as to obtain the target low-resolution dictionary features and the target low-resolution coefficient features corresponding to each of the original high-resolution hyperspectral images.

[0015] The model training module is used to train the low-resolution model using all the high-resolution dictionary features, all the target high-resolution coefficient features, all the target low-resolution dictionary features, and all the target low-resolution coefficient features to obtain a super-resolution model.

[0016] The import module is also used to import low-resolution images to be processed;

[0017] The super-resolution result acquisition module is used to reconstruct the low-resolution image to be processed using the super-resolution model to obtain the hyperspectral image super-resolution result.

[0018] Based on the above-mentioned hyperspectral image super-resolution method, the present invention also provides a hyperspectral image super-resolution system.

[0019] Another technical solution of the present invention to solve the above-mentioned technical problems is as follows: a hyperspectral image super-resolution system, including a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein when the processor executes the computer program, the hyperspectral image super-resolution method described above is implemented.

[0020] Based on the above-described hyperspectral image super-resolution method, the present invention also provides a computer-readable storage medium.

[0021] Another technical solution of the present invention to solve the above-mentioned technical problems is as follows: a computer-readable storage medium storing a computer program, which, when executed by a processor, implements the hyperspectral image super-resolution method as described above.

[0022] The beneficial effects of this invention are as follows: High-resolution dictionary features and target high-resolution coefficient features are obtained through high-resolution feature analysis of the original high-resolution hyperspectral image using a high-resolution model; target low-resolution dictionary features and target low-resolution coefficient features are obtained through low-resolution feature analysis of the original low-resolution hyperspectral image using a low-resolution model; a super-resolution model is obtained by training the low-resolution model using the high-resolution dictionary features, target high-resolution coefficient features, target low-resolution dictionary features, and target low-resolution coefficient features; and a hyperspectral image super-resolution result is obtained by reconstructing the low-resolution image to be processed using the super-resolution model. This solves the technical problem of poor quality of generated results in the current process of generating high-resolution hyperspectral images from low-resolution hyperspectral images, and realizes end-to-end mapping between low-resolution and high-resolution hyperspectral images, thereby generating higher-quality high spatial resolution hyperspectral images and reducing computational costs and time.

Attached Figure Description

[0023] Figure 1 is a schematic flowchart of the hyperspectral image super-resolution method provided in an embodiment of the present invention;

[0024] Figure 2 is the first schematic diagram of the training model for the hyperspectral image super-resolution method provided in an embodiment of the present invention;

[0025] Figure 3 is the second schematic diagram of the training model for the hyperspectral image super-resolution method provided in an embodiment of the present invention;

[0026] Figure 4 is a block diagram of a hyperspectral image super-resolution device provided in an embodiment of the present invention.

Detailed Implementation

[0027] The principles and features of the present invention are described below with reference to the accompanying drawings. The examples given are only for explaining the present invention and are not intended to limit the scope of the present invention.

[0028] Figure 1 is a flowchart of a hyperspectral image super-resolution method provided in an embodiment of the present invention.

[0029] As shown in Figure 1, a hyperspectral image super-resolution method includes the following steps:

[0030] Import multiple original high-resolution hyperspectral images and original low-resolution hyperspectral images corresponding to each of the original high-resolution hyperspectral images;

[0031] A high-resolution model and a low-resolution model are constructed. The high-resolution model is used to perform high-resolution feature analysis on each of the original high-resolution hyperspectral images to obtain high-resolution dictionary features corresponding to each of the original high-resolution hyperspectral images and target high-resolution coefficient features corresponding to each of the original high-resolution hyperspectral images.

[0032] The low-resolution model is used to perform low-resolution feature analysis on each of the original low-resolution hyperspectral images to obtain the target low-resolution dictionary features and target low-resolution coefficient features corresponding to each of the original high-resolution hyperspectral images.

[0033] The low-resolution model is trained using all the high-resolution dictionary features, all the target high-resolution coefficient features, all the target low-resolution dictionary features, and all the target low-resolution coefficient features to obtain a super-resolution model.

[0034] Import the low-resolution image to be processed, and reconstruct the image using the super-resolution model to obtain the hyperspectral image super-resolution result.

[0035] It should be understood that a high-resolution hyperspectral image (i.e., the original high-resolution hyperspectral image) is input, and the teacher network (i.e., the high-resolution model) is trained in a self-supervised learning manner.

[0036] It should be understood that the teacher network (i.e., the high-resolution model) is trained to acquire a high-resolution dictionary (i.e., high-resolution dictionary features) and sparse representation coefficients (i.e., target high-resolution coefficient features).

[0037] Specifically, the teacher network (i.e., the high-resolution model) is then frozen, and the student network (i.e., the low-resolution model) is trained to obtain a high-dimensional low-resolution dictionary (i.e., target low-resolution dictionary features) and sparse representation coefficients (i.e., target low-resolution coefficient features).
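The schedule described above (train the teacher, then freeze it while the student learns) can be illustrated with a toy sketch. Everything in it is an assumption made for illustration: scalar linear "networks", a squared-error loss between student and teacher outputs, plain gradient descent, and the hypothetical names `teacher`, `student`, and `train_student` stand in for the actual models and loss, which this passage does not specify.

```python
# Toy sketch: the teacher's parameter is frozen; only the student's is updated.
# Scalar models and a squared-error loss are illustrative assumptions.

def teacher(x, w_teacher):
    return w_teacher * x

def student(x, w_student):
    return w_student * x

def train_student(samples, w_teacher, w_student=0.0, lr=0.05, epochs=200):
    """Fit the student to the frozen teacher's outputs by gradient descent."""
    for _ in range(epochs):
        grad = 0.0
        for x in samples:
            target = teacher(x, w_teacher)       # teacher is evaluated, never updated
            error = student(x, w_student) - target
            grad += 2.0 * error * x              # d/dw of (w * x - target) ** 2
        w_student -= lr * grad / len(samples)
    return w_student

samples = [0.5, 1.0, 1.5, 2.0]
w_teacher = 3.0                                   # stands in for a trained, frozen teacher
w_student = train_student(samples, w_teacher)
```

In this sketch, freezing simply means that `w_teacher` is read but never assigned during the student's training loop.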

[0038] In the above embodiments, high-resolution dictionary features and target high-resolution coefficient features are obtained by analyzing the high-resolution features of the original high-resolution hyperspectral image using a high-resolution model. Target low-resolution dictionary features and target low-resolution coefficient features are obtained by analyzing the low-resolution features of the original low-resolution hyperspectral image using a low-resolution model. A super-resolution model is obtained by training the low-resolution model using the high-resolution dictionary features, target high-resolution coefficient features, target low-resolution dictionary features, and target low-resolution coefficient features. The super-resolution model is then used to reconstruct the hyperspectral image from the low-resolution image to be processed, resulting in a hyperspectral image super-resolution result. This solves the technical problem of poor quality of generated results in the current process of generating high-resolution hyperspectral images from low-resolution hyperspectral images. It achieves end-to-end mapping between low-resolution and high-resolution hyperspectral images, thereby generating higher-quality high spatial resolution hyperspectral images and reducing computational costs and time.

[0039] Optionally, as an embodiment of the present invention, the high-resolution model includes multiple first ESSA Block networks and a high-resolution coefficient generation network;

[0040] The process of performing high-resolution feature analysis on each of the original high-resolution hyperspectral images using the high-resolution model to obtain the high-resolution dictionary features corresponding to each of the original high-resolution hyperspectral images and the target high-resolution coefficient features corresponding to each of the original high-resolution hyperspectral images includes:

[0041] By using multiple first ESSA Block networks, high-resolution dictionary features are extracted from each of the original high-resolution hyperspectral images to obtain high-resolution dictionary features corresponding to each of the original high-resolution hyperspectral images.

[0042] The high-resolution coefficient generation network is used to extract high-resolution coefficient features from each of the original high-resolution hyperspectral images to obtain target high-resolution coefficient features corresponding to each of the original high-resolution hyperspectral images.

[0043] Preferably, the number of the first ESSA Block networks can be 5.

[0044] Specifically, the first ESSA Block network is an existing technology and adopts the ESSA Block network described in the literature: Zhang M, Zhang C, Zhang Q, et al. Essaformer: Efficient transformer for hyperspectral image super-resolution[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision. 2023: 23073-23084.

[0045] It should be understood that the teacher network (i.e., the high-resolution model) consists of two parts: a dictionary generation network and a coefficient generation network. The dictionary generation network is composed of 5 ESSA Blocks (i.e., the first ESSA Block networks) and learns the corresponding high-resolution dictionary (i.e., high-resolution dictionary features) based on the input high-resolution hyperspectral image (i.e., the original high-resolution hyperspectral image).

[0046] In the above embodiments, high-resolution dictionary features and target high-resolution coefficient features are obtained by performing high-resolution feature analysis on the original high-resolution hyperspectral image using a high-resolution model. This solves the technical problem of poor quality of the generated result in the current process of generating high-resolution hyperspectral images from low-resolution hyperspectral images, and realizes end-to-end mapping between low-resolution hyperspectral images and high-resolution hyperspectral images.
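As a minimal sketch of how dictionary features and coefficient features combine, the example below multiplies a small dictionary by a small coefficient matrix, mirroring the multiplication used later in the training steps. The layout (pixels by atoms for the dictionary, atoms by bands for the coefficients), the sizes, and the values are assumptions chosen for illustration, and `matmul` is a hypothetical helper.

```python
# Sketch: reconstruct an image as dictionary x coefficients.
# Layout, sizes, and values are illustrative assumptions.

def matmul(a, b):
    """Multiply matrix a (m x k) by matrix b (k x n), both given as lists of rows."""
    assert len(a[0]) == len(b), "inner dimensions must match"
    return [[sum(a[i][t] * b[t][j] for t in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

# Dictionary: 4 pixels (a flattened 2 x 2 grid) x 2 atoms.
dictionary = [[1.0, 0.0],
              [0.5, 0.5],
              [0.0, 1.0],
              [0.0, 0.0]]

# Coefficients: 2 atoms x 3 spectral bands; zero entries keep the code sparse.
coefficients = [[2.0, 0.0, 1.0],
                [0.0, 3.0, 1.0]]

# Reconstructed image: 4 pixels x 3 bands.
reconstructed = matmul(dictionary, coefficients)
```

Each reconstructed pixel is a weighted mix of the two coefficient rows, with the weights given by that pixel's dictionary row.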

[0047] Optionally, as an embodiment of the present invention, the high-resolution coefficient generation network includes a first fully connected layer, multiple TInv Block layers, a first 1D attention layer, and a second fully connected layer;

[0048] The process of extracting high-resolution coefficient features from each of the original high-resolution hyperspectral images using the high-resolution coefficient generation network to obtain target high-resolution coefficient features corresponding to each of the original high-resolution hyperspectral images includes:

[0049] The first fully connected layer is used to perform feature extraction processing on each of the original high-resolution hyperspectral images to obtain the original high-resolution hyperspectral features corresponding to each of the original high-resolution hyperspectral images.

[0050] Each of the original high-resolution hyperspectral features is extracted by multiple TInv Block layers to obtain high-resolution hyperspectral features after feature extraction corresponding to each of the original high-resolution hyperspectral images.

[0051] The first 1D attention layer performs weight updating on each of the high-resolution hyperspectral features after feature extraction to obtain updated high-resolution hyperspectral features corresponding to each of the original high-resolution hyperspectral images.

[0052] Feature extraction is performed on each of the updated high-resolution hyperspectral features by the second fully connected layer to obtain the target high-resolution coefficient features corresponding to each of the original high-resolution hyperspectral images.

[0053] It should be understood that the first and second fully connected layers are Linear layers, also called densely connected layers in the field of deep learning. They are a fundamental component of neural networks, capable of mapping input features to a new feature space, which helps the model learn the complex relationships between input and output. Multilayer perceptrons (MLPs) can be constructed by stacking multiple Linear layers and combining them with non-linear activation functions to enhance the model's expressive power.
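The mapping performed by a fully connected layer, and the way stacked layers form an MLP, can be sketched as follows. The weights are fixed illustrative values rather than learned parameters, ReLU is assumed as the non-linear activation, and the helper names `linear`, `relu`, and `mlp` are hypothetical.

```python
# Sketch: a fully connected (Linear) layer computes y = W x + b.
# Weights are fixed illustrative values; ReLU is an assumed activation.

def linear(x, weights, bias):
    """Apply y = W x + b, where weights is a list of rows (out_dim x in_dim)."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]

def relu(v):
    """Element-wise non-linear activation."""
    return [max(0.0, a) for a in v]

def mlp(x, layers):
    """Stack Linear layers with ReLU between them (none after the last layer)."""
    for index, (weights, bias) in enumerate(layers):
        x = linear(x, weights, bias)
        if index < len(layers) - 1:
            x = relu(x)
    return x

layers = [
    ([[1.0, -1.0], [0.5, 0.5], [-1.0, 1.0]], [0.0, 0.0, 0.0]),  # 2 -> 3 features
    ([[1.0, 1.0, 1.0]], [0.1]),                                 # 3 -> 1 feature
]
output = mlp([2.0, 1.0], layers)
```

Without the activation between them, the two Linear layers would collapse into a single linear map, which is why the non-linearity is what adds expressive power.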

[0054] Specifically, the TInv Block layer is a prior art technique and adopts the TInv Block layer from the literature: Wang J, Lu T, Huang X, et al. Pan-sharpening via conditional invertible neural network[J]. Information Fusion, 2024, 101:101980.

[0055] It should be understood that the first 1D attention layer is a key component for processing sequential data (such as time series, audio, and text), enabling the model to focus on important parts of the sequence.
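The passage does not give the internal form of the 1D attention layer, so the following is only one possible sketch: each position of a 1D sequence receives a softmax weight, and the sequence is rescaled by those weights. The scoring rule (`score_weight * value`) is an assumption standing in for a learned scoring function, and `attention_1d` is a hypothetical name.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_1d(sequence, score_weight=1.0):
    """Reweight each element of a 1D sequence by a softmax attention weight.

    The score of each position is score_weight * value, an assumed stand-in
    for a learned scoring function.
    """
    weights = softmax([score_weight * v for v in sequence])
    reweighted = [w * v for w, v in zip(weights, sequence)]
    return weights, reweighted

weights, reweighted = attention_1d([0.1, 2.0, 0.3, 1.0])
```

The weights sum to one, and the largest weight falls on the position with the highest score, which is how the layer emphasizes the important parts of the sequence.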

[0056] Specifically, the coefficient network (i.e., the high-resolution coefficient generation network) consists of linear layers (i.e., the first fully connected layer and the second fully connected layer), TInv Blocks (i.e., TInv Block layers), and a 1D attention layer (i.e., the first 1D attention layer), which learn the corresponding sparse representation coefficients (i.e., target high-resolution coefficient features) based on the input high-resolution hyperspectral image (i.e., the original high-resolution hyperspectral image).

[0057] In the above embodiments, the target high-resolution coefficient features are obtained by extracting high-resolution coefficient features from the original high-resolution hyperspectral image through a high-resolution coefficient generation network. This solves the technical problem of poor quality of the generated result in the current process of generating high-resolution hyperspectral images from low-resolution hyperspectral images. It realizes end-to-end mapping between low-resolution hyperspectral images and high-resolution hyperspectral images, thereby generating higher quality high spatial resolution hyperspectral images and reducing computational costs and time.

[0058] Optionally, as an embodiment of the present invention, the low-resolution model includes a low-resolution dictionary generation network and a low-resolution coefficient generation network;

[0059] The process of performing low-resolution feature analysis on each of the original low-resolution hyperspectral images using the low-resolution model to obtain the target low-resolution dictionary features and target low-resolution coefficient features corresponding to each of the original high-resolution hyperspectral images includes:

[0060] The low-resolution dictionary generation network is used to extract low-resolution dictionary features from each of the original low-resolution hyperspectral images to obtain target low-resolution dictionary features corresponding to each of the original high-resolution hyperspectral images.

[0061] The low-resolution coefficient generation network is used to extract low-resolution coefficient features from each of the original low-resolution hyperspectral images to obtain target low-resolution coefficient features corresponding to each of the original high-resolution hyperspectral images.

[0062] In the above embodiments, the target low-resolution dictionary features and target low-resolution coefficient features are obtained by performing low-resolution feature analysis on the original low-resolution hyperspectral image through a low-resolution model. This realizes an end-to-end mapping between low-resolution hyperspectral images and high-resolution hyperspectral images, thereby generating higher-quality high spatial resolution hyperspectral images and reducing computational costs and time.

[0063] Optionally, as an embodiment of the present invention, the low-resolution dictionary generation network includes multiple second ESSA Block networks and an upsampling layer;

[0064] The process of extracting low-resolution dictionary features from each of the original low-resolution hyperspectral images using the low-resolution dictionary generation network to obtain target low-resolution dictionary features corresponding to each of the original high-resolution hyperspectral images includes:

[0065] By using multiple second ESSA Block networks, feature extraction is performed on each of the original low-resolution hyperspectral images to obtain the original low-resolution dictionary features corresponding to each of the original high-resolution hyperspectral images.

[0066] The upsampling layer performs upsampling processing on each of the original low-resolution dictionary features to obtain target low-resolution dictionary features corresponding to each of the original high-resolution hyperspectral images.

[0067] Preferably, the number of the plurality of second ESSA Block networks can be 3.

[0068] It should be understood that the second ESSA Block network is prior art and adopts the ESSA Block network described in the literature: Zhang M, Zhang C, Zhang Q, et al. Essaformer: Efficient transformer for hyperspectral image super-resolution[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision. 2023: 23073-23084.

[0069] Specifically, a student network (i.e., a low-resolution model) is constructed that is similar to the teacher network (i.e., the high-resolution model) but has lighter dictionary and coefficient networks. During the training of the student network (i.e., the low-resolution model), the teacher network (i.e., the high-resolution model) is frozen. A low-resolution hyperspectral image (i.e., the original low-resolution hyperspectral image) is input, and a dictionary (i.e., target low-resolution dictionary features) and coefficients (i.e., target low-resolution coefficient features) for the low-resolution image are generated. Because the low-resolution image and the high-resolution image differ in spatial scale, the dictionary of the low-resolution image must be upsampled to align it with the dictionary of the high-resolution image, which yields a high-dimensional low-resolution dictionary.
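The spatial alignment step can be sketched with a simple upsampling routine. The passage does not state which upsampling method is used, so nearest-neighbor interpolation with an integer scale factor is assumed here purely for illustration, and `upsample_nearest` is a hypothetical helper.

```python
def upsample_nearest(feature_map, scale):
    """Upsample a 2D map (list of rows) by an integer factor via nearest neighbor."""
    upsampled = []
    for row in feature_map:
        # Repeat each value horizontally, then repeat the widened row vertically.
        wide_row = [value for value in row for _ in range(scale)]
        upsampled.extend(list(wide_row) for _ in range(scale))
    return upsampled

low_res_dictionary = [[1, 2],
                      [3, 4]]
aligned = upsample_nearest(low_res_dictionary, scale=2)
```

After upsampling, the 2 x 2 map has the same 4 x 4 spatial size as a hypothetical high-resolution counterpart, so the two can be compared or combined element by element.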

[0070] In the above embodiments, the target low-resolution dictionary features are obtained by extracting low-resolution dictionary features from the original low-resolution hyperspectral image through a low-resolution dictionary generation network. This achieves end-to-end mapping between low-resolution hyperspectral images and high-resolution hyperspectral images, thereby generating higher-quality high spatial resolution hyperspectral images and reducing computational costs and time.

[0071] Optionally, as an embodiment of the present invention, the low-resolution coefficient generation network includes a third fully connected layer, a second 1D attention layer, and a fourth fully connected layer;

[0072] The process of extracting low-resolution coefficient features from each of the original low-resolution hyperspectral images using the low-resolution coefficient generation network to obtain target low-resolution coefficient features corresponding to each of the original high-resolution hyperspectral images includes:

[0073] The third fully connected layer is used to extract features from each of the original low-resolution hyperspectral images to obtain the original low-resolution coefficient features corresponding to each of the original high-resolution hyperspectral images.

[0074] The second 1D attention layer updates the weights of each of the original low-resolution coefficient features to obtain the updated low-resolution coefficient features corresponding to each of the original high-resolution hyperspectral images.

[0075] The fourth fully connected layer is used to extract features from each of the updated low-resolution coefficient features to obtain the target low-resolution coefficient features corresponding to each of the original high-resolution hyperspectral images.

[0076] It should be understood that the third and fourth fully connected layers are Linear layers, also called densely connected layers in the field of deep learning. They are a fundamental component of neural networks, capable of mapping input features to a new feature space, which helps the model learn the complex relationships between input and output. Multilayer perceptrons (MLPs) can be constructed by stacking multiple Linear layers and combining them with non-linear activation functions to enhance the model's expressive power.

[0077] It should be understood that the second 1D attention layer is a key component for processing sequential data (such as time series, audio, and text), allowing the model to focus on important parts of the sequence.

[0078] In the above embodiments, the target low-resolution coefficient features are obtained by extracting low-resolution coefficient features from the original low-resolution hyperspectral image through a low-resolution coefficient generation network. This achieves end-to-end mapping between low-resolution hyperspectral images and high-resolution hyperspectral images, thereby generating higher-quality high spatial resolution hyperspectral images and reducing computational costs and time.

[0079] Optionally, as an embodiment of the present invention, the process of training the low-resolution model using all the high-resolution dictionary features, all the target high-resolution coefficient features, all the target low-resolution dictionary features, and all the target low-resolution coefficient features to obtain a super-resolution model includes:

[0080] The target low-resolution dictionary features and the target low-resolution coefficient features corresponding to each of the original high-resolution hyperspectral images are multiplied to obtain the first reconstructed high-resolution hyperspectral image corresponding to each of the original high-resolution hyperspectral images;

[0081] The high-resolution dictionary features and the target high-resolution coefficient features corresponding to each of the original high-resolution hyperspectral images are multiplied to obtain the second reconstructed high-resolution hyperspectral image corresponding to each of the original high-resolution hyperspectral images;

[0082] Spectral features are extracted from each of the first reconstructed high-resolution hyperspectral images using 3×3 convolutional layers to obtain the spectral features corresponding to each of the original high-resolution hyperspectral images and the spectral channel dimensions corresponding to each of the original high-resolution hyperspectral images.

[0083] By using a 1×1 convolutional layer, spatial features are extracted from each of the first reconstructed high-resolution hyperspectral images to obtain the spatial features corresponding to each of the original high-resolution hyperspectral images and the spatial channel dimensions corresponding to each of the original high-resolution hyperspectral images.

[0084] Each of the spectral features is normalized by a layer normalization layer to obtain a spectral query vector corresponding to each of the original high-resolution hyperspectral images.

[0085] Each of the spatial features is normalized by the layer normalization layer to obtain a spatial query vector corresponding to each of the original high-resolution hyperspectral images.

[0086] By performing feature extraction on each of the spectral query vectors through convolutional layers, spectral key vectors corresponding to each of the original high-resolution hyperspectral images are obtained.

[0087] The convolutional layers are used to extract features from each spatial query vector to obtain spatial key vectors corresponding to each of the original high-resolution hyperspectral images.

[0088] Feature extraction is performed on each of the spectral key vectors by depthwise separable convolutional layers to obtain spectral numerical vectors corresponding to each of the original high-resolution hyperspectral images.

[0089] Feature extraction is performed on each of the spatial key vectors by the depthwise separable convolutional layer to obtain spatial numerical vectors corresponding to each of the original high-resolution hyperspectral images.

[0090] The spectral attention corresponding to each original high-resolution hyperspectral image is obtained by calculating the spatial query vector, the spectral channel dimension corresponding to each original high-resolution hyperspectral image, and the spectral key vector corresponding to each original high-resolution hyperspectral image using the first formula. The first formula is:

[0091]

[0092] where $CA_j$ is the spectral attention corresponding to the j-th original high-resolution hyperspectral image, SoftMax() is the SoftMax function, and the remaining symbols denote, respectively, the spatial query vector, the spectral key vector, and the spectral channel dimension corresponding to the j-th original high-resolution hyperspectral image;

[0093] The spatial attention corresponding to each original high-resolution hyperspectral image is obtained by calculating the spectral query vector corresponding to each original high-resolution hyperspectral image, the spatial channel dimension corresponding to each original high-resolution hyperspectral image, and the spatial key vector corresponding to each original high-resolution hyperspectral image using the second equation. The second equation is:

[0094]

[0095] where $SA_j$ is the spatial attention corresponding to the j-th original high-resolution hyperspectral image, SoftMax() is the SoftMax function, and the remaining symbols denote, respectively, the spectral query vector, the spatial key vector, and the spatial channel dimension corresponding to the j-th original high-resolution hyperspectral image;

[0096] The target high-resolution hyperspectral image corresponding to each of the original high-resolution hyperspectral images is obtained by calculating, using the third equation, the spectral features, spectral attention, spectral numerical vector, spatial features, spatial attention, and spatial numerical vector corresponding to each of the original high-resolution hyperspectral images. The third equation is:

[0097]

[0098] where the left-hand side is the target high-resolution hyperspectral image corresponding to the j-th original high-resolution hyperspectral image, Conv() is the convolution processing function, $f_{FFN}()$ is the feedforward neural network function, γ and β are learnable parameters, $CA_j$ is the spectral attention corresponding to the j-th original high-resolution hyperspectral image, $SA_j$ is the spatial attention corresponding to the j-th original high-resolution hyperspectral image, and the remaining symbols denote, respectively, the spatial numerical vector, the spatial features, the spectral numerical vector, and the spectral features corresponding to the j-th original high-resolution hyperspectral image;

[0099] Loss functions are calculated for each of the high-resolution dictionary features, the target high-resolution coefficient features corresponding to each of the original high-resolution hyperspectral images, the target low-resolution dictionary features corresponding to each of the original high-resolution hyperspectral images, the target low-resolution coefficient features corresponding to each of the original high-resolution hyperspectral images, the second reconstructed high-resolution hyperspectral image corresponding to each of the original high-resolution hyperspectral images, and the target high-resolution hyperspectral image corresponding to each of the original high-resolution hyperspectral images, respectively, to obtain the target distillation loss function corresponding to each of the original high-resolution hyperspectral images;

[0100] The parameters of the low-resolution model are updated based on all of the target distillation loss functions to obtain the super-resolution model.

[0101] It should be understood that the high-dimensional low-resolution dictionary of the student network (i.e., the target low-resolution dictionary features) and the sparse representation coefficients (i.e., the target low-resolution coefficient features) are multiplied to obtain the initial reconstructed hyperspectral image (i.e., the first reconstructed high-resolution hyperspectral image).

[0102] Specifically, the dictionary (i.e., the target low-resolution dictionary features) and the coefficients (i.e., the target low-resolution coefficient features) learned by the student network are multiplied and then restored to the image space to obtain the initial reconstructed high-resolution hyperspectral image (i.e., the first reconstructed high-resolution hyperspectral image).

[0103] It should be understood that the high-resolution dictionary (i.e., the high-resolution dictionary features) and the sparse representation coefficients (i.e., the target high-resolution coefficient features) are multiplied to obtain the high-resolution image (i.e., the second reconstructed high-resolution hyperspectral image).

[0104] Specifically, the dictionary (i.e., the high-resolution dictionary features) and the coefficients (i.e., the target high-resolution coefficient features) learned by the teacher network are restored to the image space through the inverse singular value decomposition process, resulting in the reconstructed high-resolution hyperspectral image (i.e., the second reconstructed high-resolution hyperspectral image).

[0105] It should be understood that the spatial features and the spectral features of the initial reconstructed hyperspectral image (i.e., the first reconstructed high-resolution hyperspectral image) are optimized to obtain the final hyperspectral image (i.e., the target high-resolution hyperspectral image).

[0106] Specifically, the initial reconstruction of the student network (i.e., the first reconstructed high-resolution hyperspectral image) undergoes spatial and spectral feature refinement. First, spectral features are generated using a 3×3 convolution (i.e., the 3×3 convolutional layer), and spatial features are generated using a 1×1 convolution (i.e., the 1×1 convolutional layer). Then, the spatial tokens $Q_{spa}$ (i.e., the spatial query vector), $K_{spa}$ (i.e., the spatial key vector), and $V_{spa}$ (i.e., the spatial numerical vector), and the spectral tokens $Q_{spe}$ (i.e., the spectral query vector), $K_{spe}$ (i.e., the spectral key vector), and $V_{spe}$ (i.e., the spectral numerical vector), are constructed from the spatial features and the spectral features, respectively. The spatial and spectral features of the image are then recovered through a cross-attention mechanism.

[0107] Specifically, the dictionary branch of the student network needs to upsample the low-resolution dictionary to restore it to a scale aligned with the high-resolution dictionary. Therefore, the initial reconstructed image of the student network (i.e., the first reconstructed high-resolution hyperspectral image) is:

[0108]

[0109] It should be understood that the dictionary (i.e., the high-resolution dictionary features) and the coefficients (i.e., the target high-resolution coefficient features) of the teacher network supervise, through distillation, the learning of the high-dimensional low-resolution dictionary (i.e., the target low-resolution dictionary features) and the sparse representation coefficients (i.e., the target low-resolution coefficient features) of the student network.

[0110] Specifically, considering the reconstruction accuracy of the student network and the unique spectral and spatial characteristics of the image, the reconstructed image of the student network needs to be fine-tuned and optimized to restore better spatial and spectral fidelity. Therefore, a spatial-spectral cross-attention module is proposed, consisting of a spatial branch and a spectral branch. Specifically, the spectral and spatial features of the image are extracted using a 3×3 convolution (i.e., the 3×3 convolutional layer) and a 1×1 convolution (i.e., the 1×1 convolutional layer), respectively. $X_{spe}$ (i.e., the spectral features) and $X_{spa}$ (i.e., the spatial features) are input into the spatial-spectral cross-attention module, and after layer normalization (i.e., the layer normalization layer), convolution (Conv) (i.e., the convolutional layer), and depthwise separable convolution (i.e., the depthwise separable convolutional layer), the respective query vectors Q (i.e., the spatial query vector and the spectral query vector), key vectors K (i.e., the spatial key vector and the spectral key vector), and numerical vectors V (i.e., the spatial numerical vector and the spectral numerical vector) are obtained:

[0111] $Q_{spa}, K_{spa} = W_{QK} X_{spa}, \quad V_{spa} = W_V X_{spa}$,

[0112] $Q_{spe}, K_{spe} = W_{QK} X_{spe}, \quad V_{spe} = W_V X_{spe}$,

[0113] where $W_{QK}$ and $W_V$ are learnable linear transformations; $Q_{spa}$, $K_{spa}$, and $V_{spa}$ are obtained from $X_{spa}$, and $Q_{spe}$, $K_{spe}$, and $V_{spe}$ are obtained from $X_{spe}$. The formulas for calculating the spatial attention (SA) and the spectral attention (CA) are as follows:

[0114]

[0115] where $d_1$ and $d_2$ denote the channel dimensions of $X_{spa}$ and $X_{spe}$, respectively, and are used for scaling to prevent vanishing gradients. Since K and Q come from two different inputs, each of the two attentions fuses information from the other dimension. The final output is as follows:

[0116] $Y_{spa} = f_{FFN}(\gamma(CA \times V_{spa}) + X_{spa})$,

[0117] $Y_{spe} = f_{FFN}(\beta(SA \times V_{spe}) + X_{spe})$,

[0118] where β and γ are learnable parameters and FFN denotes a feedforward neural network. The final reconstructed image of the student network (i.e., the target high-resolution hyperspectral image) is:

[0119]
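As a hedged illustration of the spatial-spectral cross-attention described in paragraphs [0110] to [0118], the following NumPy sketch pairs the query of one branch with the key of the other. Several points are assumptions rather than part of the source text: the features are flattened to token matrices of shape (n, d), plain matrices `w_q`, `w_k`, and `w_v` (shared by both branches) stand in for the layer normalization, convolution, and depthwise separable convolution layers, the scaling uses the conventional square root of the channel dimension, and `ffn` is a hypothetical two-layer ReLU network. The final convolution that merges the two outputs is not shown.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)    # subtract the row max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(q, k, d):
    """SoftMax(Q K^T / sqrt(d)); the sqrt(d) scaling is an assumption."""
    return softmax(q @ k.T / np.sqrt(d))

def ffn(x, w1, w2):
    """Two-layer feedforward network with ReLU (hypothetical stand-in for f_FFN)."""
    return np.maximum(x @ w1, 0.0) @ w2

rng = np.random.default_rng(0)
n, d = 16, 8                                    # n tokens (pixels), d channels (assumed equal for both branches)
x_spa, x_spe = rng.normal(size=(n, d)), rng.normal(size=(n, d))
w_q, w_k, w_v = (rng.normal(size=(d, d)) * 0.1 for _ in range(3))
w1, w2 = rng.normal(size=(d, 2 * d)) * 0.1, rng.normal(size=(2 * d, d)) * 0.1
gamma, beta = 0.5, 0.5                          # learnable in the source; fixed constants here

q_spa, k_spa, v_spa = x_spa @ w_q, x_spa @ w_k, x_spa @ w_v
q_spe, k_spe, v_spe = x_spe @ w_q, x_spe @ w_k, x_spe @ w_v

ca = cross_attention(q_spa, k_spe, d)           # spectral attention: spatial query, spectral key
sa = cross_attention(q_spe, k_spa, d)           # spatial attention: spectral query, spatial key
y_spa = ffn(gamma * (ca @ v_spa) + x_spa, w1, w2)
y_spe = ffn(beta * (sa @ v_spe) + x_spe, w1, w2)
```

Each row of `ca` and `sa` sums to one, so every output token is a convex combination of value tokens plus the residual input, which is the sense in which each branch borrows information from the other.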

[0120] In the above embodiments, a super-resolution model is obtained by training the low-resolution model with all high-resolution dictionary features, all target high-resolution coefficient features, all target low-resolution dictionary features, and all target low-resolution coefficient features. This recovers the spectral and texture details of the hyperspectral image, solves the technical problem of poor quality of the generated result in the current process of generating high-resolution hyperspectral images from low-resolution hyperspectral images, and realizes end-to-end mapping between low-resolution hyperspectral images and high-resolution hyperspectral images, thereby generating higher-quality high spatial resolution hyperspectral images and reducing computational costs and time.

[0121] Optionally, as an embodiment of the present invention, the process of calculating the loss function for each of the high-resolution dictionary features, the target high-resolution coefficient features corresponding to each of the original high-resolution hyperspectral images, the target low-resolution dictionary features corresponding to each of the original high-resolution hyperspectral images, the target low-resolution coefficient features corresponding to each of the original high-resolution hyperspectral images, the second reconstructed high-resolution hyperspectral image corresponding to each of the original high-resolution hyperspectral images, and the target high-resolution hyperspectral image corresponding to each of the original high-resolution hyperspectral images, to obtain the target distillation loss function corresponding to each of the original high-resolution hyperspectral images, includes:

[0122] The fourth equation is used to perform loss function calculation on each of the high-resolution dictionary features, the target high-resolution coefficient features corresponding to each of the original high-resolution hyperspectral images, the target low-resolution dictionary features corresponding to each of the original high-resolution hyperspectral images, the target low-resolution coefficient features corresponding to each of the original high-resolution hyperspectral images, the second reconstructed high-resolution hyperspectral image corresponding to each of the original high-resolution hyperspectral images, and the target high-resolution hyperspectral image corresponding to each of the original high-resolution hyperspectral images, respectively, to obtain the target distillation loss function corresponding to each of the original high-resolution hyperspectral images. The fourth equation is:

[0123]

[0124] where,

[0125] where the left-hand side of the fourth equation is the target distillation loss function corresponding to the j-th original high-resolution hyperspectral image; α, δ, and ε are all balance coefficients; the remaining symbols denote, respectively, the target high-resolution hyperspectral image, the second reconstructed high-resolution hyperspectral image, the dictionary distillation loss function, the coefficient distillation loss function, the high-resolution dictionary features, the target low-resolution dictionary features, the target high-resolution coefficient features, and the target low-resolution coefficient features corresponding to the j-th original high-resolution hyperspectral image; and $\|\cdot\|_1$ is the L1 norm.

[0126] It should be understood that, based on the high-resolution dictionary (i.e., the high-resolution dictionary features) and the coefficients (i.e., the target high-resolution coefficient features) obtained by the teacher network, relational knowledge distillation is used to supervise and guide the learning and updating of the dictionary (i.e., the target low-resolution dictionary features) and the coefficients (i.e., the target low-resolution coefficient features) of the student network. The specific formulas are as follows:

[0127]

[0128] where $L_{dic}$ denotes the dictionary distillation loss (i.e., the dictionary distillation loss function) and $L_{coe}$ denotes the coefficient distillation loss (i.e., the coefficient distillation loss function).

[0129] Specifically, the distillation loss function (i.e., the target distillation loss function) of the student network is:

[0130]

[0131] Where α, δ and ε are balance coefficients.
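For illustration only, the weighted combination of a reconstruction term, a dictionary distillation term, and a coefficient distillation term can be sketched in NumPy as follows. The exact form of each term is not recoverable from the text, so this sketch assumes plain mean absolute differences (an L1-style term) between teacher and student quantities; the function names, the argument layout, and the default values of `alpha`, `delta`, and `epsilon` are hypothetical.

```python
import numpy as np

def l1(a, b):
    """Mean absolute difference, used here as the L1-style term (assumption)."""
    return float(np.mean(np.abs(a - b)))

def distillation_loss(target_img, teacher_img, d_hr, d_lr_up, c_hr, c_lr,
                      alpha=1.0, delta=0.1, epsilon=0.1):
    """alpha * image term + delta * dictionary term + epsilon * coefficient term."""
    l_rec = l1(target_img, teacher_img)   # student output vs. teacher reconstruction
    l_dic = l1(d_hr, d_lr_up)             # teacher dictionary vs. upsampled student dictionary
    l_coe = l1(c_hr, c_lr)                # teacher coefficients vs. student coefficients
    return alpha * l_rec + delta * l_dic + epsilon * l_coe

# Toy inputs: images and dictionaries differ by 1 everywhere, coefficients match.
ones, zeros = np.ones((2, 3, 3)), np.zeros((2, 3, 3))
loss = distillation_loss(ones, zeros, ones, zeros, ones, ones)
```

With these toy inputs the image and dictionary terms are each 1 and the coefficient term is 0, so the balance coefficients alone determine the total.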

[0132] In the above embodiments, the target distillation loss function is obtained by calculating the loss function for each high-resolution dictionary feature, target high-resolution coefficient feature, target low-resolution dictionary feature, target low-resolution coefficient feature, second reconstructed high-resolution hyperspectral image, and target high-resolution hyperspectral image. This solves the technical problem of poor quality of generated results in the current process of generating high-resolution hyperspectral images from low-resolution hyperspectral images. It realizes end-to-end mapping between low-resolution hyperspectral images and high-resolution hyperspectral images, thereby generating higher quality high spatial resolution hyperspectral images and reducing computational cost and time.

[0133] Optionally, as another embodiment of the present invention: according to dictionary learning theory, a high-dimensional tensor can be decomposed into the product of sparse coefficients and a low-rank subspace, i.e., the product of sparse representation coefficients and a low-rank dictionary. To achieve effective supervision of dictionary and coefficient learning, this method adopts the idea of knowledge distillation and is divided into two stages: a teacher network and a student network. The dictionary and coefficients of the high-resolution image learned by the teacher network supervise the learning of the dictionary and coefficients during low-resolution image reconstruction. Specifically, the teacher network is trained to obtain the dictionary and the sparse representation coefficients of the high-resolution image, and the high-resolution image is obtained by multiplying the high-resolution dictionary and the sparse representation coefficients. The teacher network is then frozen, and the student network is trained to obtain a low-resolution dictionary and sparse representation coefficients; the dictionary is upsampled to obtain a high-dimensional low-resolution dictionary. The dictionary and coefficients of the teacher network supervise the learning and updating of the dictionary and coefficients of the student network. The dictionary and coefficients of the student network are multiplied to obtain the initial reconstructed high-resolution hyperspectral image. The spatial-spectral cross-attention module refines the spatial and spectral features of the initial reconstructed image, and the high spatial resolution hyperspectral image is obtained through the final super-resolution reconstruction. This invention solves the technical problem of poor quality of generated results in the current process of generating high-resolution hyperspectral images from low-resolution hyperspectral images. It realizes end-to-end mapping between low-resolution hyperspectral images and high-resolution hyperspectral images, thereby generating higher quality high spatial resolution hyperspectral images.

[0134] Optionally, as another embodiment of the present invention, the present invention includes the following steps:

[0135] Step S1: Train the teacher network to obtain the high-resolution dictionary and the sparse representation coefficients;

[0136] Step S2: Multiply the high-resolution dictionary and the sparse representation coefficients to obtain the high-resolution image;

[0137] Step S3: Freeze the teacher network and train the student network to obtain the high-dimensional low-resolution dictionary and the sparse representation coefficients;

[0138] Step S4: Use the dictionary and coefficients of the teacher network to supervise the learning of the high-dimensional low-resolution dictionary and the sparse representation coefficients of the student network;

[0139] Step S5: Multiply the high-dimensional low-resolution dictionary and the sparse representation coefficients of the student network to obtain the initial reconstructed hyperspectral image;

[0140] Step S6: Use the spatial-spectral cross-attention module to optimize the spatial and spectral features of the initial reconstructed hyperspectral image to obtain the final hyperspectral image.

[0141] Optionally, as another embodiment of the present invention, the present invention addresses the problems of complex optimization process and poor interpretability of existing deep learning-based hyperspectral image super-resolution technology by applying sparse representation modeling theory, that is, assuming that a three-dimensional hyperspectral image can be decomposed into a product of a low-rank dictionary and sparse coefficients, and proposes a new framework.

[0142] Optionally, as another embodiment of the present invention, compared with existing methods, the present invention has the following advantages and positive effects: The present invention consists of a teacher network, a student network, and a spatial-spectral cross-attention module. The teacher network is trained to obtain a high-resolution dictionary and sparse representation coefficients. The teacher network is then frozen, and the student network is trained to obtain a high-dimensional low-resolution dictionary and sparse representation coefficients. The dictionary and coefficients of the teacher network supervise, through distillation, the learning of the dictionary and sparse representation coefficients of the student network. The spatial-spectral cross-attention module optimizes the spatial and spectral features of the hyperspectral image reconstructed by the student network to obtain the final hyperspectral image. The present invention uses self-supervised learning in the teacher stage to obtain the ground-truth values of the dictionary and coefficients, which guide the lightweight student network in learning the dictionary and coefficients. The spatial-spectral cross-attention module ensures the fidelity of the spectral and spatial information in the reconstruction results of the student network.

[0143] Optionally, as another embodiment of the present invention, the environment used in this example is as follows: the server is equipped with a Xeon(R) Silver 4316 CPU and an NVIDIA GeForce RTX 4090D GPU, runs the Ubuntu 20.04.5 operating system, and uses PyTorch 2.5.1+cu118 with Python 3.11 as the software environment.

[0144] Optionally, as another embodiment of the present invention, based on dictionary learning theory, a high-dimensional tensor can be decomposed into the product of sparse coefficients and a low-rank subspace, i.e., the product of sparse representation coefficients and a low-rank dictionary. Combining dictionary learning and the low-rank property of the spectrum, the hyperspectral image I can be decomposed as:

[0145] $I = D_{ic} C_{oe}$,

[0146] where $D_{ic}$ denotes the low-rank dictionary and $C_{oe}$ denotes the coefficients.
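A minimal NumPy sketch of such a decomposition is given below, under stated assumptions. The patent learns the dictionary and coefficients with neural networks; here a truncated singular value decomposition is swapped in purely to show that a low-rank cube can be written as a product of two factors and folded back to image space. The layout (coefficients of shape (C, k), dictionary of shape (k, H, W)) and the function names are hypothetical, and SVD does not enforce sparsity of the coefficients.

```python
import numpy as np

def decompose_low_rank(image, rank):
    """Split a (C, H, W) cube into a dictionary (rank, H, W) and coefficients (C, rank)."""
    c, h, w = image.shape
    flat = image.reshape(c, h * w)                  # unfold: one row per spectral band
    u, s, vt = np.linalg.svd(flat, full_matrices=False)
    coeffs = u[:, :rank] * s[:rank]                 # (C, rank) spectral coefficients
    dictionary = vt[:rank].reshape(rank, h, w)      # (rank, H, W) spatial atoms
    return dictionary, coeffs

def reconstruct(dictionary, coeffs):
    """Multiply the coefficients by the dictionary and fold back to (C, H, W)."""
    return np.einsum("ck,khw->chw", coeffs, dictionary)

rng = np.random.default_rng(0)
# Synthetic cube of exact rank 3: 8 bands, 6 x 5 pixels.
cube = np.einsum("ck,khw->chw", rng.normal(size=(8, 3)), rng.normal(size=(3, 6, 5)))
d_ic, c_oe = decompose_low_rank(cube, rank=3)
recon = reconstruct(d_ic, c_oe)
max_err = float(np.max(np.abs(recon - cube)))       # close to machine precision for a rank-3 cube
```

Because the synthetic cube has exact rank 3, three atoms suffice to reproduce it; for real hyperspectral data the rank would be a design choice trading accuracy against model size.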

[0147] The high-resolution hyperspectral image is denoted $I_{hr} \in \mathbb{R}^{C \times H \times W}$ and the corresponding low-resolution hyperspectral image is denoted $I_{lr} \in \mathbb{R}^{C \times h \times w}$, where H, h and W, w denote height and width, respectively, and C denotes the number of spectral bands of the hyperspectral image. $I_{hr}$ and $I_{lr}$ have the same spectral resolution but different spatial resolutions. Considering the difficulty of obtaining ground-truth values of the dictionary and coefficients of a hyperspectral image, the high-resolution hyperspectral image $I_{hr}$ is used to train the teacher network in a self-supervised manner. The teacher network mainly consists of a coefficient learning branch and a dictionary learning branch, which are used to obtain the dictionary and the coefficients of the high-resolution image and to restore the high-resolution hyperspectral image. The loss function $L_t$ of the teacher network is as follows:

[0148]

[0149] where,

[0150] Optionally, as another embodiment of the present invention, as shown in Figure 2, the advantage of knowledge distillation in this invention lies in its ability to transfer knowledge from a complex, large teacher network to a smaller, simpler student network. Therefore, although the student network also mainly consists of a dictionary learning branch and a coefficient learning branch, it is much lighter. It should be noted that the dictionary branch of the student network needs to upsample the low-resolution dictionary to restore it to a scale aligned with the high-resolution dictionary. Therefore, the initial reconstructed image of the student network is:

[0151]
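The upsample-then-multiply step of the student network can be sketched as follows. This is an assumption-laden illustration: the text does not state which upsampling operator is used, so nearest-neighbour repetition is swapped in, and the tensor layout (dictionary of shape (k, h, w), coefficients of shape (C, k)) and the function names are hypothetical.

```python
import numpy as np

def upsample_nearest(dictionary, scale):
    """Nearest-neighbour upsampling of a (k, h, w) dictionary to (k, h*scale, w*scale)."""
    return dictionary.repeat(scale, axis=1).repeat(scale, axis=2)

def student_reconstruct(lr_dictionary, coeffs, scale):
    """Upsample the low-resolution dictionary, then multiply by the coefficients."""
    hr_like = upsample_nearest(lr_dictionary, scale)
    return np.einsum("ck,khw->chw", coeffs, hr_like)

rng = np.random.default_rng(1)
lr_dict = rng.normal(size=(3, 4, 4))     # 3 atoms on a 4 x 4 low-resolution grid
coeffs = rng.normal(size=(8, 3))         # 8 spectral bands
initial = student_reconstruct(lr_dict, coeffs, scale=2)   # shape (8, 8, 8)
```

The result has the spatial size of the high-resolution dictionary, which is what allows the teacher's dictionary to supervise the student's dictionary element by element.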

[0152] Optionally, as another embodiment of the present invention, as shown in Figure 3, this invention uses the high-resolution dictionary and coefficients obtained by the teacher network, together with relational knowledge distillation, to supervise and guide the learning and updating of the dictionary and coefficients of the student network. The specific formulas are as follows:

[0153]

[0154] where $L_{dic}$ denotes the dictionary distillation loss and $L_{coe}$ denotes the coefficient distillation loss.

[0155] Optionally, as another embodiment of the present invention, compared with existing methods, the advantages and positive effects of the present invention are as follows: To alleviate the uninterpretability problem of hyperspectral image super-resolution based on deep learning, the present invention combines the advantages of dictionary learning and deep learning, decomposing the hyperspectral image reconstruction task into the task of learning sparse representation coefficients and a low-rank dictionary. To generate supervision of the dictionary and coefficients, a self-supervised teacher network is constructed, learning realistic dictionaries and coefficients from high-resolution images to guide the training of a lightweight student network, recovering the spectral and texture details of the hyperspectral image.

[0156] Figure 4 is a block diagram of a hyperspectral image super-resolution device provided in an embodiment of the present invention.

[0157] Optionally, as another embodiment of the present invention, as shown in Figure 4, a hyperspectral image super-resolution device includes:

[0158] The import module is used to import multiple original high-resolution hyperspectral images and original low-resolution hyperspectral images corresponding to each of the original high-resolution hyperspectral images.

[0159] The high-resolution feature analysis module is used to construct a high-resolution model and a low-resolution model. The high-resolution model is used to perform high-resolution feature analysis on each of the original high-resolution hyperspectral images to obtain high-resolution dictionary features corresponding to each of the original high-resolution hyperspectral images and target high-resolution coefficient features corresponding to each of the original high-resolution hyperspectral images.

[0160] The low-resolution feature analysis module is used to perform low-resolution feature analysis on each of the original low-resolution hyperspectral images using the low-resolution model, so as to obtain the target low-resolution dictionary features and the target low-resolution coefficient features corresponding to each of the original high-resolution hyperspectral images.

[0161] The model training module is used to train the low-resolution model using all the high-resolution dictionary features, all the target high-resolution coefficient features, all the target low-resolution dictionary features, and all the target low-resolution coefficient features to obtain a super-resolution model.

[0162] The import module is also used to import low-resolution images to be processed;

[0163] The super-resolution result acquisition module is used to reconstruct the low-resolution image to be processed using the super-resolution model to obtain the hyperspectral image super-resolution result.

[0164] Optionally, another embodiment of the present invention provides a hyperspectral image super-resolution system, including a memory, a processor, and a computer program stored in the memory and executable on the processor. When the processor executes the computer program, it implements the hyperspectral image super-resolution method as described above. This system can be a computer or similar system.

[0165] Optionally, another embodiment of the present invention provides a computer-readable storage medium storing a computer program that, when executed by a processor, implements the hyperspectral image super-resolution method as described above.

[0166] It should be noted that, in this document, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any such actual relationship or order between these entities or operations. Furthermore, the terms "comprising," "including," or any other variations thereof are intended to cover non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such process, method, article, or apparatus.

[0167] Those skilled in the art will clearly understand that, for the sake of convenience and brevity, the specific working process of the above-described apparatus and unit can be referred to the corresponding process in the foregoing method embodiments, and will not be repeated here.

[0168] In the several embodiments provided in this application, it should be understood that the disclosed apparatus and methods can be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative. For instance, the division of units is only a logical functional division, and in actual implementation, there may be other division methods. For example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed.

[0169] The units described as separate components may or may not be physically separate. The components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units can be selected to achieve the purpose of the embodiments of the present invention, depending on actual needs.

[0170] Furthermore, the functional units in the various embodiments of the present invention can be integrated into one processing unit, or each unit can exist physically separately, or two or more units can be integrated into one unit. The integrated unit can be implemented in hardware or as a software functional unit.

[0171] If the integrated unit is implemented as a software functional unit and sold or used as an independent product, it can be stored in a computer-readable storage medium. Based on this understanding, the technical solution of this invention, in essence, or the part that contributes to the prior art, or all or part of the technical solution, can be embodied in the form of a software product. This computer software product is stored in a storage medium and includes several instructions to cause a computer device (which may be a personal computer, server, or network device, etc.) to execute all or part of the steps of the methods of the various embodiments of this invention. The aforementioned storage medium includes various media capable of storing program code, such as USB flash drives, portable hard drives, read-only memory (ROM), random access memory (RAM), magnetic disks, or optical disks.

[0172] The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention. Any modifications, equivalent substitutions, improvements, etc., made within the spirit and principles of the present invention should be included within the protection scope of the present invention.

Claims

1. A hyperspectral image super-resolution method, characterized in that it includes the following steps:
Import multiple original high-resolution hyperspectral images and original low-resolution hyperspectral images corresponding to each of the original high-resolution hyperspectral images;
A high-resolution model and a low-resolution model are constructed. The high-resolution model is used to perform high-resolution feature analysis on each of the original high-resolution hyperspectral images to obtain high-resolution dictionary features corresponding to each of the original high-resolution hyperspectral images and target high-resolution coefficient features corresponding to each of the original high-resolution hyperspectral images;
The low-resolution model is used to perform low-resolution feature analysis on each of the original low-resolution hyperspectral images to obtain the target low-resolution dictionary features and target low-resolution coefficient features corresponding to each of the original high-resolution hyperspectral images;
The low-resolution model is trained using all the high-resolution dictionary features, all the target high-resolution coefficient features, all the target low-resolution dictionary features, and all the target low-resolution coefficient features to obtain a super-resolution model;
Import the low-resolution image to be processed, and reconstruct the image using the super-resolution model to obtain the hyperspectral image super-resolution result;
The high-resolution model includes multiple first ESSA Block networks and a high-resolution coefficient generation network;
The process of performing high-resolution feature analysis on each of the original high-resolution hyperspectral images using the high-resolution model to obtain the high-resolution dictionary features corresponding to each of the original high-resolution hyperspectral images and the target high-resolution coefficient features corresponding to each of the original high-resolution hyperspectral images includes:
By using the multiple first ESSA Block networks, high-resolution dictionary features are extracted from each of the original high-resolution hyperspectral images to obtain high-resolution dictionary features corresponding to each of the original high-resolution hyperspectral images;
The high-resolution coefficient generation network is used to extract high-resolution coefficient features from each of the original high-resolution hyperspectral images to obtain target high-resolution coefficient features corresponding to each of the original high-resolution hyperspectral images.

2. The hyperspectral image super-resolution method according to claim 1, characterized in that the high-resolution coefficient generation network includes a first fully connected layer, multiple TInv Block layers, a first 1D attention layer, and a second fully connected layer. The process of extracting high-resolution coefficient features from each of the original high-resolution hyperspectral images using the high-resolution coefficient generation network to obtain the target high-resolution coefficient features corresponding to each of the original high-resolution hyperspectral images includes: The first fully connected layer performs feature extraction on each of the original high-resolution hyperspectral images to obtain the original high-resolution hyperspectral features corresponding to each of the original high-resolution hyperspectral images. The multiple TInv Block layers perform feature extraction on each of the original high-resolution hyperspectral features to obtain the extracted high-resolution hyperspectral features corresponding to each of the original high-resolution hyperspectral images. The first 1D attention layer performs a weight update on each of the extracted high-resolution hyperspectral features to obtain the updated high-resolution hyperspectral features corresponding to each of the original high-resolution hyperspectral images. The second fully connected layer performs feature extraction on each of the updated high-resolution hyperspectral features to obtain the target high-resolution coefficient features corresponding to each of the original high-resolution hyperspectral images.

3. The hyperspectral image super-resolution method according to claim 1, characterized in that the low-resolution model includes a low-resolution dictionary generation network and a low-resolution coefficient generation network. The process of performing low-resolution feature analysis on each of the original low-resolution hyperspectral images using the low-resolution model to obtain the target low-resolution dictionary features and the target low-resolution coefficient features corresponding to each of the original high-resolution hyperspectral images includes: The low-resolution dictionary generation network extracts low-resolution dictionary features from each of the original low-resolution hyperspectral images to obtain the target low-resolution dictionary features corresponding to each of the original high-resolution hyperspectral images. The low-resolution coefficient generation network extracts low-resolution coefficient features from each of the original low-resolution hyperspectral images to obtain the target low-resolution coefficient features corresponding to each of the original high-resolution hyperspectral images.

4. The hyperspectral image super-resolution method according to claim 3, characterized in that the low-resolution dictionary generation network includes multiple second ESSA Block networks and an upsampling layer. The process of extracting low-resolution dictionary features from each of the original low-resolution hyperspectral images using the low-resolution dictionary generation network to obtain the target low-resolution dictionary features corresponding to each of the original high-resolution hyperspectral images includes: The multiple second ESSA Block networks perform feature extraction on each of the original low-resolution hyperspectral images to obtain the original low-resolution dictionary features corresponding to each of the original high-resolution hyperspectral images. The upsampling layer upsamples each of the original low-resolution dictionary features to obtain the target low-resolution dictionary features corresponding to each of the original high-resolution hyperspectral images.

5. The hyperspectral image super-resolution method according to claim 3, characterized in that the low-resolution coefficient generation network includes a third fully connected layer, a second 1D attention layer, and a fourth fully connected layer. The process of extracting low-resolution coefficient features from each of the original low-resolution hyperspectral images using the low-resolution coefficient generation network to obtain the target low-resolution coefficient features corresponding to each of the original high-resolution hyperspectral images includes: The third fully connected layer performs feature extraction on each of the original low-resolution hyperspectral images to obtain the original low-resolution coefficient features corresponding to each of the original high-resolution hyperspectral images. The second 1D attention layer performs a weight update on each of the original low-resolution coefficient features to obtain the updated low-resolution coefficient features corresponding to each of the original high-resolution hyperspectral images. The fourth fully connected layer performs feature extraction on each of the updated low-resolution coefficient features to obtain the target low-resolution coefficient features corresponding to each of the original high-resolution hyperspectral images.
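The data flow of claim 5 (fully connected layer, 1D attention, fully connected layer) can be illustrated with a minimal sketch. The claim does not specify the internal form of the 1D attention layer, so the SoftMax re-weighting used here is purely an assumption, and all function names, weights, and the toy pixel spectrum are hypothetical.

```python
import math

def linear(x, weight, bias):
    """Fully connected layer: y = W x + b for a single feature vector."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weight, bias)]

def attention_1d(x):
    """Re-weight each feature by its SoftMax score (assumed form of 1D attention)."""
    peak = max(x)
    exps = [math.exp(v - peak) for v in x]
    total = sum(exps)
    return [v * e / total for v, e in zip(x, exps)]

def lr_coefficient_network(pixel, w3, b3, w4, b4):
    """Third FC layer -> second 1D attention layer -> fourth FC layer."""
    original_coeffs = linear(pixel, w3, b3)          # original low-resolution coefficient features
    updated_coeffs = attention_1d(original_coeffs)   # weight-updated coefficient features
    return linear(updated_coeffs, w4, b4)            # target low-resolution coefficient features

# Hypothetical toy input: one pixel spectrum with 3 bands, 4 hidden units, 2 coefficients.
pixel = [0.2, 0.4, 0.6]
w3 = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [1.0, 1.0, 1.0]]
b3 = [0.0, 0.0, 0.0, 0.0]
w4 = [[1.0, 1.0, 1.0, 1.0], [1.0, -1.0, 1.0, -1.0]]
b4 = [0.0, 0.0]
target_coeffs = lr_coefficient_network(pixel, w3, b3, w4, b4)
```

In a trained model the weights would be learned rather than fixed, and the layers would operate on every pixel of each low-resolution image.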

6. The hyperspectral image super-resolution method according to claim 1, characterized in that the process of training the low-resolution model using all the high-resolution dictionary features, all the target high-resolution coefficient features, all the target low-resolution dictionary features, and all the target low-resolution coefficient features to obtain the super-resolution model includes: The target low-resolution dictionary features and the target low-resolution coefficient features corresponding to each of the original high-resolution hyperspectral images are multiplied to obtain the first reconstructed high-resolution hyperspectral image corresponding to each of the original high-resolution hyperspectral images. The high-resolution dictionary features and the target high-resolution coefficient features corresponding to each of the original high-resolution hyperspectral images are multiplied to obtain the second reconstructed high-resolution hyperspectral image corresponding to each of the original high-resolution hyperspectral images. Spectral features are extracted from each of the first reconstructed high-resolution hyperspectral images using 3×3 convolutional layers to obtain the spectral features and the spectral channel dimension corresponding to each of the original high-resolution hyperspectral images. Spatial features are extracted from each of the first reconstructed high-resolution hyperspectral images using a 1×1 convolutional layer to obtain the spatial features and the spatial channel dimension corresponding to each of the original high-resolution hyperspectral images. Each of the spectral features is normalized by a layer normalization layer to obtain the spectral query vector corresponding to each of the original high-resolution hyperspectral images. Each of the spatial features is normalized by the layer normalization layer to obtain the spatial query vector corresponding to each of the original high-resolution hyperspectral images. Feature extraction is performed on each of the spectral query vectors by convolutional layers to obtain the spectral key vector corresponding to each of the original high-resolution hyperspectral images. Feature extraction is performed on each of the spatial query vectors by the convolutional layers to obtain the spatial key vector corresponding to each of the original high-resolution hyperspectral images. Feature extraction is performed on each of the spectral key vectors by depthwise separable convolutional layers to obtain the spectral numerical vector corresponding to each of the original high-resolution hyperspectral images. Feature extraction is performed on each of the spatial key vectors by the depthwise separable convolutional layers to obtain the spatial numerical vector corresponding to each of the original high-resolution hyperspectral images. The spectral attention corresponding to each of the original high-resolution hyperspectral images is calculated from the spatial query vector, the spectral channel dimension, and the spectral key vector corresponding to that image using the first formula.
The first formula is:

$$A_i^{spe} = \mathrm{SoftMax}\left(\frac{Q_i^{spa}\,(K_i^{spe})^{\top}}{\sqrt{d_i^{spe}}}\right),$$

where $A_i^{spe}$ is the spectral attention corresponding to the $i$-th original high-resolution hyperspectral image, $\mathrm{SoftMax}(\cdot)$ is the SoftMax function, $Q_i^{spa}$ is the spatial query vector corresponding to the $i$-th original high-resolution hyperspectral image, $K_i^{spe}$ is the spectral key vector corresponding to the $i$-th original high-resolution hyperspectral image, and $d_i^{spe}$ is the spectral channel dimension corresponding to the $i$-th original high-resolution hyperspectral image. The spatial attention corresponding to each of the original high-resolution hyperspectral images is calculated from the spectral query vector, the spatial channel dimension, and the spatial key vector corresponding to that image using the second equation. The second equation is:

$$A_i^{spa} = \mathrm{SoftMax}\left(\frac{Q_i^{spe}\,(K_i^{spa})^{\top}}{\sqrt{d_i^{spa}}}\right),$$

where $A_i^{spa}$ is the spatial attention corresponding to the $i$-th original high-resolution hyperspectral image, $\mathrm{SoftMax}(\cdot)$ is the SoftMax function, $Q_i^{spe}$ is the spectral query vector corresponding to the $i$-th original high-resolution hyperspectral image, $K_i^{spa}$ is the spatial key vector corresponding to the $i$-th original high-resolution hyperspectral image, and $d_i^{spa}$ is the spatial channel dimension corresponding to the $i$-th original high-resolution hyperspectral image. The target high-resolution hyperspectral image corresponding to each of the original high-resolution hyperspectral images is calculated from the spectral features, spectral attention, spectral numerical vector, spatial features, spatial attention, and spatial numerical vector corresponding to that image using the third equation. The terms of the third equation are, for the $i$-th original high-resolution hyperspectral image: the corresponding target high-resolution hyperspectral image; convolution processing; the feedforward neural network function; two learnable parameters; the corresponding spectral attention; the corresponding spatial numerical vector; the corresponding spatial features; the corresponding spatial attention; the corresponding spectral numerical vector; and the corresponding spectral features. Loss functions are calculated from the high-resolution dictionary features, the target high-resolution coefficient features, the target low-resolution dictionary features, the target low-resolution coefficient features, the second reconstructed high-resolution hyperspectral image, and the target high-resolution hyperspectral image corresponding to each of the original high-resolution hyperspectral images, respectively, to obtain the target distillation loss function corresponding to each of the original high-resolution hyperspectral images. The parameters of the low-resolution model are updated based on all the target distillation loss functions to obtain the super-resolution model.
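The cross-attention step of claim 6, in which a spatial query is matched against a spectral key under a SoftMax, can be sketched as follows. This is a minimal illustration rather than the patented implementation: the scaled dot-product form (division by the square root of the channel dimension) is an assumption based on the terms the claim names, and the helper names, matrix shapes, and toy values are hypothetical.

```python
import math

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(m):
    """Swap rows and columns."""
    return [list(col) for col in zip(*m)]

def softmax_rows(m):
    """Apply a numerically stable SoftMax to each row."""
    out = []
    for row in m:
        peak = max(row)
        exps = [math.exp(v - peak) for v in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out

def cross_attention(query, key, channel_dim):
    """SoftMax(Q K^T / sqrt(d)); the sqrt(d) scaling is an assumption."""
    scores = matmul(query, transpose(key))
    scale = math.sqrt(channel_dim)
    return softmax_rows([[s / scale for s in row] for row in scores])

# Hypothetical toy inputs: 2 query tokens and 3 key tokens with d = 4 channels each.
spatial_query = [[1.0, 0.0, 1.0, 0.0], [0.0, 1.0, 0.0, 1.0]]
spectral_key = [[1.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 1.0], [1.0, 0.0, 1.0, 0.0]]
spectral_attention = cross_attention(spatial_query, spectral_key, channel_dim=4)
```

Each row of `spectral_attention` sums to 1, so it can be used to weight the rows of a numerical (value) vector. The spatial attention would be obtained the same way by swapping in the spectral query, the spatial key, and the spatial channel dimension.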

7. The hyperspectral image super-resolution method according to claim 6, characterized in that the process of calculating the loss function from the high-resolution dictionary features, the target high-resolution coefficient features, the target low-resolution dictionary features, the target low-resolution coefficient features, the second reconstructed high-resolution hyperspectral image, and the target high-resolution hyperspectral image corresponding to each of the original high-resolution hyperspectral images, to obtain the target distillation loss function corresponding to each of the original high-resolution hyperspectral images, includes: The fourth equation is used to calculate the target distillation loss function corresponding to each of the original high-resolution hyperspectral images from the high-resolution dictionary features, the target high-resolution coefficient features, the target low-resolution dictionary features, the target low-resolution coefficient features, the second reconstructed high-resolution hyperspectral image, and the target high-resolution hyperspectral image corresponding to that image. The fourth equation is:

$$L_i = \lambda_1 \left\| Y_i - \hat{Y}_i \right\|_1 + \lambda_2 L_i^{D} + \lambda_3 L_i^{C},$$

where

$$L_i^{D} = \left\| D_i^{H} - D_i^{L} \right\|_1, \qquad L_i^{C} = \left\| C_i^{H} - C_i^{L} \right\|_1,$$

where $L_i$ is the target distillation loss function corresponding to the $i$-th original high-resolution hyperspectral image; $\lambda_1$, $\lambda_2$, and $\lambda_3$ are all balance coefficients; $Y_i$ is the target high-resolution hyperspectral image corresponding to the $i$-th original high-resolution hyperspectral image; $\hat{Y}_i$ is the second reconstructed high-resolution hyperspectral image corresponding to the $i$-th original high-resolution hyperspectral image; $L_i^{D}$ is the dictionary distillation loss function corresponding to the $i$-th original high-resolution hyperspectral image; $L_i^{C}$ is the coefficient distillation loss function corresponding to the $i$-th original high-resolution hyperspectral image; $D_i^{H}$ is the high-resolution dictionary features corresponding to the $i$-th original high-resolution hyperspectral image; $D_i^{L}$ is the target low-resolution dictionary features corresponding to the $i$-th original high-resolution hyperspectral image; $C_i^{H}$ is the target high-resolution coefficient features corresponding to the $i$-th original high-resolution hyperspectral image; $C_i^{L}$ is the target low-resolution coefficient features corresponding to the $i$-th original high-resolution hyperspectral image; and $\|\cdot\|_1$ is the L1 norm.
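The distillation loss of claim 7 can be sketched as a weighted sum of three L1 terms: a reconstruction term between the target image and the second reconstructed image (the product of the high-resolution dictionary and coefficient features), a dictionary term, and a coefficient term. This is a minimal sketch under stated assumptions: the assignment of the three balance coefficients to the three terms, the treatment of features as 2D matrices, and every name and toy value below are hypothetical.

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def l1_norm(a, b):
    """Entrywise L1 norm of the difference between two equal-shape matrices."""
    return sum(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def distillation_loss(target, dict_hr, coef_hr, dict_lr, coef_lr,
                      weights=(1.0, 1.0, 1.0)):
    """Weighted sum of reconstruction, dictionary, and coefficient L1 terms."""
    second_recon = matmul(dict_hr, coef_hr)      # dictionary x coefficients
    loss_recon = l1_norm(target, second_recon)   # target image vs. second reconstruction
    loss_dict = l1_norm(dict_hr, dict_lr)        # dictionary distillation term
    loss_coef = l1_norm(coef_hr, coef_lr)        # coefficient distillation term
    w_recon, w_dict, w_coef = weights            # balance coefficients (assumed ordering)
    return w_recon * loss_recon + w_dict * loss_dict + w_coef * loss_coef

# Hypothetical toy features for one training image.
dict_hr = [[1.0, 0.0], [0.0, 1.0]]
coef_hr = [[2.0, 3.0], [4.0, 5.0]]
dict_lr = [[1.0, 0.5], [0.0, 1.0]]
coef_lr = [[2.0, 3.0], [4.0, 4.0]]
target = [[2.0, 3.0], [4.0, 6.0]]
loss = distillation_loss(target, dict_hr, coef_hr, dict_lr, coef_lr,
                         weights=(1.0, 0.5, 0.25))
```

With these toy values the three terms are 1.0, 0.5, and 1.0, so the weighted loss is 1.0 + 0.25 + 0.25 = 1.5. During training, only the low-resolution model's parameters would be updated to reduce this loss, as the claims describe.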

8. A hyperspectral image super-resolution device, characterized in that it includes: an import module, used to import multiple original high-resolution hyperspectral images and the original low-resolution hyperspectral image corresponding to each of the original high-resolution hyperspectral images; a high-resolution feature analysis module, used to construct a high-resolution model and a low-resolution model, and to perform high-resolution feature analysis on each of the original high-resolution hyperspectral images using the high-resolution model to obtain the high-resolution dictionary features and the target high-resolution coefficient features corresponding to each of the original high-resolution hyperspectral images; a low-resolution feature analysis module, used to perform low-resolution feature analysis on each of the original low-resolution hyperspectral images using the low-resolution model to obtain the target low-resolution dictionary features and the target low-resolution coefficient features corresponding to each of the original high-resolution hyperspectral images; a model training module, used to train the low-resolution model using all the high-resolution dictionary features, all the target high-resolution coefficient features, all the target low-resolution dictionary features, and all the target low-resolution coefficient features to obtain a super-resolution model; the import module being also used to import a low-resolution image to be processed; and a super-resolution result acquisition module, used to reconstruct the low-resolution image to be processed using the super-resolution model to obtain the hyperspectral image super-resolution result. The high-resolution model includes multiple first ESSA Block networks and a high-resolution coefficient generation network. The high-resolution feature analysis module is specifically used to: extract high-resolution dictionary features from each of the original high-resolution hyperspectral images using the multiple first ESSA Block networks to obtain the high-resolution dictionary features corresponding to each of the original high-resolution hyperspectral images; and extract high-resolution coefficient features from each of the original high-resolution hyperspectral images using the high-resolution coefficient generation network to obtain the target high-resolution coefficient features corresponding to each of the original high-resolution hyperspectral images.

9. A computer-readable storage medium storing a computer program, characterized in that, when the computer program is executed by a processor, it implements the hyperspectral image super-resolution method according to any one of claims 1 to 7.