Hyperspectral and multispectral image fusion method and system based on deep dictionary learning
By employing a deep dictionary learning method and utilizing a hybrid linear fusion module to construct a mapping relationship between hyperspectral and multispectral images, the problems of spectral information loss and computational resource dependence in existing technologies are solved, achieving efficient image fusion results.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- WUHAN INST OF TECH
- Filing Date
- 2025-05-29
- Publication Date
- 2026-06-19
AI Technical Summary
Existing hyperspectral and multispectral image fusion methods, while improving spatial resolution, are prone to destroying spectral information, leading to spectral distortion. Furthermore, deep learning-based methods suffer from limited model generalization ability and high computational resource dependence.
A deep dictionary learning-based approach is adopted, which constructs a mapping relationship from low spatial resolution hyperspectral images to high spatial resolution hyperspectral images by using a hybrid linear fusion module including a dictionary generation network, a dictionary update network, an abundance update network, and a linear fusion network. Image fusion is then performed using a dimensionality-reduced convolutional neural network, the least squares method, a transformer structure, and a one-dimensional attention mechanism.
It effectively alleviates the problem of non-overlapping spectral responses, reduces the blurring and spectral distortion of the fusion results, and features process interpretability and lightweight characteristics, thereby improving the accuracy and efficiency of image fusion.
[Figure: abstract drawing, CN120726430B_ABST]
Description
Technical Field
[0001] This invention relates to the field of remote sensing image processing technology, and in particular to a method and system for fusion of hyperspectral and multispectral images based on deep dictionary learning.
Background Technology
[0002] Hyperspectral images possess rich spectral dimensional information, enabling precise capture of the spectral characteristics of ground objects. However, limitations in imaging principles and sensor technology typically result in lower spatial resolution. Multispectral images, on the other hand, offer higher spatial resolution, but their spectral information is relatively limited. Fusing hyperspectral and multispectral images combines the advantages of both, generating a high-resolution hyperspectral image that possesses both rich spectral information and high spatial resolution. High-quality fusion results can reduce sensor constraints and improve the performance of remote sensing applications, providing a superior data foundation for areas such as land cover classification, target detection and recognition, and environmental monitoring.
[0003] Early hyperspectral and multispectral fusion methods were primarily based on dictionary learning theory, assuming that the low spatial resolution hyperspectral image and the desired high spatial resolution hyperspectral image are described by the same dictionary, and the result is obtained through sparse coding and dictionary learning. However, while enhancing spatial details, this often destroys the original spectral information, leading to a certain degree of spectral distortion. Later, tensor decomposition (TFD) was introduced to reduce spectral distortion. TFD directly models the three-dimensional data of the hyperspectral image, better capturing the correlation between data in different dimensions. Although the aforementioned fusion methods based on traditional algorithms have achieved certain performance, they mostly rely on manually designed models, often requiring precise model assumptions, and are computationally expensive.
[0004] Deep learning-based hyperspectral and multispectral fusion methods construct a deep neural network model to learn the mapping from the input low-spatial-resolution hyperspectral image and high-spatial-resolution multispectral image to the target high-spatial-resolution hyperspectral image. Hyperspectral images exhibit significant spectral redundancy: they display high inter-band correlation in image space and also readily generate redundant embedded information in feature space, which increases the modeling difficulty. Current deep learning-based hyperspectral and multispectral fusion methods, while improving fusion accuracy to some extent, still face practical challenges such as limited model generalization ability and high computational resource dependence. With the development of sensor technology, the relationship between hyperspectral and multispectral images has become more complex and nonlinear, which traditional models struggle to characterize effectively, easily leading to distortion in the fusion results.
[0005] Therefore, there is an urgent need to provide a technical solution to address the above problems.
Summary of the Invention
[0006] To address the aforementioned technical problems, this invention provides a method and system for hyperspectral and multispectral image fusion based on deep dictionary learning.
[0007] Firstly, the present invention provides a hyperspectral and multispectral image fusion method based on deep dictionary learning, the technical solution of which is as follows:
[0008] Acquire the target hyperspectral image and target multispectral image of the region to be fused;
[0009] Based on an image fusion model comprising multiple sequentially connected hybrid linear fusion modules, the target hyperspectral image and the target multispectral image are fused to obtain a target fused image of the region to be fused.
[0010] Each hybrid linear fusion module includes: a dictionary generation network, a dictionary update network, an abundance update network, and a linear fusion network. The dictionary generation network is used to: receive the input hyperspectral image of the region to be fused and obtain an initial spectral dictionary corresponding to the hyperspectral image. The dictionary update network is used to: generate a target spectral dictionary based on the target multispectral image and the initial spectral dictionary. The abundance update network is used to: obtain a target abundance matrix based on the initial spectral dictionary and the reshaped hyperspectral image. The linear fusion network is used to: linearly fuse the target spectral dictionary with the fine-tuned target abundance matrix to obtain a fused image, which is then used as the hyperspectral image of the region to be fused received by the next hybrid linear fusion module.
[0011] Specifically, the fused image output by the last hybrid linear fusion module is determined as the target fused image.
[0012] The beneficial effects of the hyperspectral and multispectral image fusion method based on deep dictionary learning of the present invention are as follows:
[0013] The method of the present invention can effectively alleviate the problem of non-overlapping spectral responses and reduce the blurring and spectral distortion of the fusion results, while remaining interpretable and lightweight.
[0014] Based on the above scheme, the hyperspectral and multispectral image fusion method based on deep dictionary learning of the present invention can be further improved as follows.
[0015] In one alternative approach, the dictionary generation network is specifically used for:
[0016] Features are extracted from the hyperspectral image by a dimensionality-reducing convolutional neural network to obtain a feature map, which is then input into a reshaping function to construct the initial spectral dictionary.
[0017] In one alternative approach, the abundance update network is specifically used for:
[0018] The hyperspectral image is input into the reshaping function to obtain the reshaped hyperspectral image;
[0019] The target abundance matrix is calculated using the least squares method, combined with the initial spectral dictionary and the reshaped hyperspectral image.
[0020] In one alternative approach, the dictionary update network is specifically used for:
[0021] The target multispectral image is concatenated with the initial spectral dictionary to obtain the concatenated features;
[0022] The concatenated features are input into the transformer structure and fine-tuned in conjunction with the initial spectral dictionary to generate the target spectral dictionary.
[0023] In one alternative approach, the linear fusion network is specifically used for:
[0024] Based on a one-dimensional attention mechanism, the target abundance matrix is fine-tuned to obtain the fine-tuned target abundance matrix;
[0025] The target spectral dictionary is linearly fused with the fine-tuned target abundance matrix to obtain the fused image.
[0026] In one alternative approach, it also includes:
[0027] A high-resolution original hyperspectral image of the region to be fused is obtained, and the original hyperspectral image is downsampled to obtain a low-resolution hyperspectral image.
[0028] The low-resolution hyperspectral image is interpolated to obtain the target hyperspectral image.
[0029] Secondly, this invention provides a hyperspectral and multispectral image fusion system based on deep dictionary learning, the technical solution of which is as follows:
[0030] Includes: an acquisition unit and a fusion unit;
[0031] The acquisition unit is used to: acquire the target hyperspectral image and the target multispectral image of the region to be fused;
[0032] The fusion unit is used to: fuse the target hyperspectral image and the target multispectral image based on an image fusion model that includes multiple sequentially connected hybrid linear fusion modules, to obtain a target fused image of the region to be fused;
[0033] Each hybrid linear fusion module includes: a dictionary generation network, a dictionary update network, an abundance update network, and a linear fusion network. The dictionary generation network is used to: receive the input hyperspectral image of the region to be fused and obtain an initial spectral dictionary corresponding to the hyperspectral image. The dictionary update network is used to: generate a target spectral dictionary based on the target multispectral image and the initial spectral dictionary. The abundance update network is used to: obtain a target abundance matrix based on the initial spectral dictionary and the reshaped hyperspectral image. The linear fusion network is used to: linearly fuse the target spectral dictionary with the fine-tuned target abundance matrix to obtain a fused image, which is then used as the hyperspectral image of the region to be fused received by the next hybrid linear fusion module.
[0034] Specifically, the fused image output by the last hybrid linear fusion module is determined as the target fused image.
[0035] The beneficial effects of the hyperspectral and multispectral image fusion system based on deep dictionary learning of the present invention are as follows:
[0036] The system of the present invention can effectively alleviate the problem of non-overlapping spectral responses and reduce the blurring and spectral distortion of the fusion results, while remaining interpretable and lightweight.
[0037] Based on the above scheme, the hyperspectral and multispectral image fusion system based on deep dictionary learning of the present invention can be further improved as follows.
[0038] In an alternative embodiment, the system further includes a preprocessing unit; the preprocessing unit is used for:
[0039] A high-resolution original hyperspectral image of the region to be fused is obtained, and the original hyperspectral image is downsampled to obtain a low-resolution hyperspectral image.
[0040] The low-resolution hyperspectral image is interpolated to obtain the target hyperspectral image.
[0041] Thirdly, the technical solution of an electronic device according to the present invention is as follows:
[0042] It includes a memory, a processor, and a program stored in the memory and running on the processor, wherein the processor executes the program to implement the steps of the hyperspectral and multispectral image fusion method based on deep dictionary learning as described in this invention.
[0043] Fourthly, the technical solution of a computer-readable storage medium provided by the present invention is as follows:
[0044] The computer-readable storage medium stores instructions that, when read by a computer, cause the computer to perform the steps of the hyperspectral and multispectral image fusion method based on deep dictionary learning of the present invention.
[0045] The above description is merely an overview of the technical solution of the present invention. In order to better understand the technical means of the present invention and to implement it in accordance with the contents of the specification, and in order to make the above and other objects, features and advantages of the present invention more apparent and understandable, specific embodiments of the present invention are described below.
Attached Figure Description
[0046] The accompanying drawings are for illustrative purposes only and are not intended to limit the invention. Furthermore, the same reference numerals denote the same parts throughout the drawings. In the drawings:
[0047] Figure 1 is a flowchart illustrating an embodiment of a hyperspectral and multispectral image fusion method based on deep dictionary learning according to the present invention;
[0048] Figure 2 is a schematic diagram illustrating the overall principle;
[0049] Figure 3 is a schematic diagram illustrating the effect of determining the number of hybrid linear fusion modules;
[0050] Figure 4 is a diagram showing the comparison of experimental results;
[0051] Figure 5 is a schematic diagram of an embodiment of a hyperspectral and multispectral image fusion system based on deep dictionary learning according to the present invention;
[0052] Figure 6 is a schematic diagram of an embodiment of an electronic device according to the present invention.
Detailed Implementation
[0053] Exemplary embodiments of the invention will now be described in more detail with reference to the accompanying drawings. While exemplary embodiments of the invention are shown in the drawings, it should be understood that the invention can be implemented in various forms and should not be limited to the embodiments set forth herein.
[0054] Figure 1 illustrates a flowchart of an embodiment of a hyperspectral and multispectral image fusion method based on deep dictionary learning provided by the present invention. This method can be executed by electronic devices such as terminal devices or servers. The terminal device can be any fixed or mobile terminal, such as user equipment (UE), mobile device, user terminal, terminal, cellular phone, cordless phone, personal digital assistant (PDA), handheld device, computing device, in-vehicle device, or wearable device. The server can be a single server or a server cluster composed of multiple servers. Any electronic device can implement the hyperspectral and multispectral image fusion method based on deep dictionary learning by having its processor call computer-readable instructions stored in memory. Taking a server as an example, the server in this embodiment has an Intel Xeon E5-2665 CPU, an NVIDIA GTX2080Ti GPU, an Ubuntu 18.04 operating system, and a software environment of PyTorch 1.1.0, Python 3.5, CUDA 9.0, and cuDNN 7.1. As shown in Figure 1, the hyperspectral and multispectral image fusion method based on deep dictionary learning includes the following steps:
[0055] S1. Obtain the target hyperspectral image and target multispectral image of the region to be fused.
[0056] In this embodiment, the region to be fused refers to the area where image fusion is required. Hyperspectral images are acquired using a hyperspectral sensor, and multispectral images are acquired using a multispectral sensor. The target hyperspectral image is the low-resolution hyperspectral image after interpolation processing; it has N bands, a height of H, and a width of W, that is, a size of N×H×W. The target multispectral image has n channels.
[0057] S2. Based on an image fusion model containing multiple sequentially connected hybrid linear fusion modules, the target hyperspectral image and the target multispectral image are fused to obtain the target fused image of the region to be fused.
[0058] As shown in Figure 2, each hybrid linear fusion module includes: a dictionary generation network, a dictionary update network, an abundance update network, and a linear fusion network. The dictionary generation network is used to: receive the input hyperspectral image of the region to be fused and obtain an initial spectral dictionary corresponding to the hyperspectral image; the dictionary update network is used to: generate a target spectral dictionary based on the target multispectral image and the initial spectral dictionary; the abundance update network is used to: obtain a target abundance matrix based on the initial spectral dictionary and the reshaped hyperspectral image; the linear fusion network is used to: linearly fuse the target spectral dictionary with the fine-tuned target abundance matrix to obtain a fused image, which is then used as the hyperspectral image of the region to be fused received by the next hybrid linear fusion module.
[0059] Specifically, the fused image output by the last hybrid linear fusion module is determined as the target fused image.
[0060] It should be noted that the hyperspectral image input to the dictionary generation network in the first hybrid linear fusion module of the image fusion model is the target hyperspectral image, and the fused image output by the last hybrid linear fusion module is the target fused image. The multispectral image input to the dictionary update network in each hybrid linear fusion module is the target multispectral image, and the reconstruction of the hyperspectral image dictionary is continuously guided by the target multispectral image.
[0061] In one alternative approach, the dictionary generation network is specifically used for:
[0062] Features are extracted from the hyperspectral image by a dimensionality-reducing convolutional neural network to obtain a feature map, which is then input into a reshaping function to construct the initial spectral dictionary.
[0063] In the expression corresponding to the dictionary generation network, $D_k$ represents the initial spectral dictionary corresponding to the k-th hybrid linear fusion module; $f_e(\cdot)$ denotes the dimensionality-reducing convolutional neural network, which encodes the N-band hyperspectral image of the region to be fused received by the k-th hybrid linear fusion module into an m-channel feature map; and $\mathrm{Reshape}(\cdot)$ denotes the reshaping function, which constructs the initial spectral dictionary $D_k$ from the feature map. The size of $D_k$ is B×m×HW, where B represents the batch size.
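The shape bookkeeping of the dictionary generation step can be illustrated with a minimal NumPy sketch. It is not the patented network: the source only states that the encoder is a dimensionality-reducing convolutional neural network, so a single 1×1 convolution (a per-pixel band-mixing matrix) is assumed here as a stand-in, and the names `generate_dictionary` and `weight` are hypothetical.

```python
import numpy as np

def generate_dictionary(x, weight):
    """Map a (B, N, H, W) hyperspectral batch to a (B, m, HW) dictionary.

    ASSUMPTION: `weight` (shape (m, N)) acts as a 1x1 convolution standing in
    for the dimensionality-reducing CNN; the real encoder is not specified.
    """
    b, n, h, w = x.shape
    # Mix the N bands into m channels at every pixel: (B, m, H, W).
    features = np.einsum("mn,bnhw->bmhw", weight, x)
    # Reshape: flatten the two spatial axes into one axis of length HW.
    return features.reshape(b, weight.shape[0], h * w)

rng = np.random.default_rng(0)
x = rng.random((2, 31, 8, 8))            # B=2, N=31 bands, H=W=8
weight = rng.standard_normal((6, 31))    # m=6 channels
d_k = generate_dictionary(x, weight)
print(d_k.shape)
```

With these toy sizes the result has B=2, m=6, and HW=64, matching the B×m×HW layout stated above.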
[0064] In one alternative approach, the abundance update network is specifically used for:
[0065] The hyperspectral image is input into the reshaping function to obtain the reshaped hyperspectral image.
[0066] The reshaped hyperspectral image corresponding to the k-th hybrid linear fusion module is obtained by applying $\mathrm{Reshape}(\cdot)$ to the hyperspectral image received by that module; its size is B×N×HW.
[0067] The target abundance matrix is calculated using the least squares method, combined with the initial spectral dictionary and the reshaped hyperspectral image.
[0068] In the expression for calculating the target abundance matrix, $\lambda_k$ represents the target abundance matrix corresponding to the k-th hybrid linear fusion module, $\alpha_k$ represents a learnable hyperparameter, $E$ is the identity matrix, and $D_k^{T}$ represents the transpose of the initial spectral dictionary $D_k$.
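One closed form that is consistent with the symbols and sizes listed above is a regularized (ridge) least squares solution. The sketch below assumes that form; it is not taken from the source, whose exact expression is not reproduced in this text. Given a reshaped image of size B×N×HW and a dictionary of size B×m×HW, it computes an abundance matrix of size B×N×m as X·Dᵀ·(D·Dᵀ + αE)⁻¹. The function name `update_abundance` is hypothetical.

```python
import numpy as np

def update_abundance(x_flat, d_k, alpha):
    """Ridge least squares: find lam with lam @ d_k ~= x_flat.

    x_flat: (B, N, HW) reshaped hyperspectral image
    d_k:    (B, m, HW) initial spectral dictionary
    alpha:  scalar regularization weight (learnable in the source)
    ASSUMPTION: lam = X D^T (D D^T + alpha E)^-1, returned as (B, N, m).
    """
    m = d_k.shape[1]
    d_t = np.swapaxes(d_k, 1, 2)                  # (B, HW, m)
    gram = d_k @ d_t + alpha * np.eye(m)          # (B, m, m), symmetric
    rhs = x_flat @ d_t                            # (B, N, m)
    # Solve lam @ gram = rhs via gram @ lam^T = rhs^T (gram is symmetric).
    lam_t = np.linalg.solve(gram, np.swapaxes(rhs, 1, 2))
    return np.swapaxes(lam_t, 1, 2)

rng = np.random.default_rng(0)
d_k = rng.standard_normal((2, 3, 16))             # B=2, m=3, HW=16
lam_true = rng.random((2, 5, 3))                  # N=5
x_flat = lam_true @ d_k                           # exact linear mixture
lam = update_abundance(x_flat, d_k, alpha=1e-8)
print(np.abs(lam - lam_true).max())               # close to zero
```

Solving against the small m×m Gram matrix keeps the cost independent of the number of pixels, which fits the lightweight character claimed for the method.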
[0069] In one alternative approach, the dictionary update network is specifically used for:
[0070] The target multispectral image is concatenated with the initial spectral dictionary to obtain concatenated features. The concatenated features are then input into a transformer structure and fine-tuned in conjunction with the initial spectral dictionary to generate the target spectral dictionary.
[0071] In the expression for generating the target spectral dictionary, the output is the target spectral dictionary corresponding to the k-th hybrid linear fusion module, $\mathrm{concat}(\cdot)$ represents the concatenation layer, and $f_{Dref}(\cdot)$ represents the transformer structure.
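The data flow of the dictionary update can be sketched as follows. This is a rough illustration under stated assumptions: a single self-attention layer over pixel tokens stands in for the transformer structure, and a residual connection is assumed as one reading of "fine-tuned in conjunction with the initial spectral dictionary". The function `update_dictionary` and the projection matrices `w_q`, `w_k`, `w_v`, and `w_out` are hypothetical.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)       # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def update_dictionary(y_flat, d_k, w_q, w_k, w_v, w_out):
    """y_flat: (B, n, HW) multispectral image; d_k: (B, m, HW) dictionary.

    ASSUMPTIONS: one self-attention layer replaces the transformer, and the
    refinement is added to d_k as a residual. Returns (B, m, HW).
    """
    # Concatenate along the channel axis, then treat each pixel as a token.
    tokens = np.swapaxes(np.concatenate([y_flat, d_k], axis=1), 1, 2)
    q, k, v = tokens @ w_q, tokens @ w_k, tokens @ w_v        # (B, HW, c)
    attn = softmax(q @ np.swapaxes(k, 1, 2) / np.sqrt(q.shape[-1]))
    refined = (attn @ v) @ w_out                              # (B, HW, m)
    return d_k + np.swapaxes(refined, 1, 2)

rng = np.random.default_rng(0)
y_flat = rng.random((2, 4, 16))                   # n=4 channels, HW=16
d_k = rng.standard_normal((2, 3, 16))             # m=3
w_q, w_k, w_v = (rng.standard_normal((7, 8)) * 0.1 for _ in range(3))
w_out = rng.standard_normal((8, 3)) * 0.1
d_hat = update_dictionary(y_flat, d_k, w_q, w_k, w_v, w_out)
print(d_hat.shape)
```

Because of the assumed residual, setting `w_out` to zero returns the initial dictionary unchanged, so the multispectral image only contributes a correction.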
[0072] It should be noted that traditional deep learning-based hyperspectral and multispectral image fusion tasks typically perform image fusion at the pixel level or feature level, while the framework proposed in this embodiment converts hyperspectral and multispectral images into a low-dimensional dictionary space before fusion.
[0073] In one alternative approach, the linear fusion network is specifically used for:
[0074] Based on a one-dimensional attention mechanism, the target abundance matrix is fine-tuned to obtain the fine-tuned target abundance matrix.
[0075] In the expression for generating the fine-tuned target abundance matrix, the output is the fine-tuned target abundance matrix corresponding to the k-th hybrid linear fusion module, and $f_{\lambda ref}(\cdot)$ represents the one-dimensional attention mechanism.
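The source does not describe the internals of the one-dimensional attention mechanism. As one plausible illustration, the sketch below assumes a gate in the style of efficient channel attention: the abundance matrix is averaged over its dictionary axis, filtered by a one-dimensional convolution along the band axis, passed through a sigmoid, and used to rescale each band. The function `attention_1d` and the `kernel` argument are hypothetical.

```python
import numpy as np

def attention_1d(lam, kernel):
    """Rescale each band of an abundance matrix with a 1D attention gate.

    lam:    (B, N, m) target abundance matrix
    kernel: odd-length 1D filter applied along the band axis
    ASSUMPTION: the gating design is illustrative, not taken from the source.
    """
    pooled = lam.mean(axis=2)                                  # (B, N)
    pad = len(kernel) // 2
    padded = np.pad(pooled, ((0, 0), (pad, pad)), mode="edge")
    # Correlate every batch row with the kernel; output length stays N.
    conv = np.stack([np.convolve(row, kernel[::-1], mode="valid")
                     for row in padded])
    gate = 1.0 / (1.0 + np.exp(-conv))                         # sigmoid
    return lam * gate[:, :, None]

rng = np.random.default_rng(0)
lam = rng.standard_normal((2, 5, 3))
kernel = np.array([0.25, 0.5, 0.25])
lam_ref = attention_1d(lam, kernel)
print(lam_ref.shape)
```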
[0076] The target spectral dictionary is linearly fused with the fine-tuned target abundance matrix to obtain the fused image.
[0077] In the expression for generating the fused image, k is a positive integer from 1 to d, where d represents the total number of hybrid linear fusion modules. When k is not equal to d, the fused image output by the k-th hybrid linear fusion module serves as the hyperspectral image of the region to be fused received by the (k+1)-th hybrid linear fusion module. When k equals d, the fused image is determined as the target fused image.
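The linear fusion step and the chaining of modules can be sketched as follows. The multiplication order is an assumption chosen so that the sizes agree (an abundance matrix of size B×N×m times a dictionary of size B×m×HW gives B×N×HW), and the modules in the usage example are toy stand-ins; `linear_fusion` and `run_cascade` are hypothetical names.

```python
import numpy as np

def linear_fusion(d_hat, lam_ref, height, width):
    """Fuse a dictionary (B, m, HW) with abundances (B, N, m).

    ASSUMPTION: the fused image is the batched product lam_ref @ d_hat,
    reshaped back to (B, N, H, W).
    """
    fused = lam_ref @ d_hat                                    # (B, N, HW)
    return fused.reshape(fused.shape[0], fused.shape[1], height, width)

def run_cascade(x_first, y, modules):
    """Feed the output of module k into module k+1; every module sees y."""
    x = x_first
    for module in modules:
        x = module(x, y)
    return x          # output of the last module: the target fused image

rng = np.random.default_rng(0)
fused = linear_fusion(rng.random((1, 2, 4)), rng.random((1, 3, 2)), 2, 2)
print(fused.shape)

# Toy stand-in modules: each one moves x halfway toward y.
toy_modules = [lambda x, y: 0.5 * (x + y)] * 5                 # d = 5
result = run_cascade(np.zeros(3), np.ones(3), toy_modules)
print(result)
```

The loop makes the handover explicit: only the hyperspectral input changes from module to module, while the multispectral image is passed to every module unchanged.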
[0078] It should be noted that the successive application of the hybrid linear fusion modules in the image fusion model can be regarded as an iterative optimization process for solving a cost function. This embodiment uses the mean absolute error between the fused image and the ground truth as the loss function, where the fused image is the one output by the d-th (last) hybrid linear fusion module, i.e., the target fused image.
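The mean absolute error named above can be written in a few lines. The sketch uses NumPy arrays in place of the training framework's tensors, and the function name `mae_loss` is hypothetical.

```python
import numpy as np

def mae_loss(fused, reference):
    """Mean absolute error between the fused image and the ground truth."""
    return float(np.mean(np.abs(fused - reference)))

fused = np.array([1.0, 2.0, 3.0])
reference = np.array([1.0, 1.0, 5.0])
print(mae_loss(fused, reference))  # (0 + 1 + 2) / 3 = 1.0
```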
[0079] Furthermore, the value of d is determined based on the performance during model training. As shown in Figure 3, the default value of d is 5. The horizontal axis of Figure 3 represents the computational complexity metric, used to measure the computational load of the model: the higher the GFLOPS (billion floating-point operations per second) value, the higher the computational complexity of the model. The vertical axis of Figure 3 represents the image quality assessment metric, used to measure the similarity between the reconstructed image and the original image: a higher PSNR (Peak Signal-to-Noise Ratio) value indicates better image quality and a reconstruction that is closer to the original image.
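For reference, PSNR follows the standard definition 10·log10(peak² / MSE). The sketch below assumes images scaled to the range [0, 1], so the peak value defaults to 1.0; the source does not state the scaling it uses.

```python
import numpy as np

def psnr(reference, estimate, peak=1.0):
    """Peak signal-to-noise ratio in dB; `peak` is an assumed data range."""
    mse = np.mean((reference - estimate) ** 2)
    if mse == 0:
        return float("inf")       # identical images
    return float(10.0 * np.log10(peak ** 2 / mse))

reference = np.zeros((4, 4))
estimate = np.full((4, 4), 0.1)   # constant error of 0.1, so MSE = 0.01
print(psnr(reference, estimate))  # approximately 20 dB
```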
[0080] In one alternative approach, it also includes:
[0081] A high-resolution original hyperspectral image of the region to be fused is obtained, and the original hyperspectral image is downsampled to obtain a low-resolution hyperspectral image.
[0082] The low-resolution hyperspectral image is interpolated to obtain the target hyperspectral image.
[0083] The original hyperspectral image is denoted $I_{HR}$, and the low-resolution hyperspectral image is obtained from it by downsampling.
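The preprocessing described here (downsample the original image, then interpolate the result back to the working size) can be sketched as follows. The source does not name the downsampling kernel or the interpolation method, so block averaging and nearest-neighbor interpolation are assumed purely for illustration; both function names and the scale factor `s` are hypothetical.

```python
import numpy as np

def downsample(x, s):
    """Average s-by-s blocks of a (B, N, H, W) image (assumed kernel)."""
    b, n, h, w = x.shape
    return x.reshape(b, n, h // s, s, w // s, s).mean(axis=(3, 5))

def interpolate_nearest(x, s):
    """Upsample by repeating pixels (assumed interpolation method)."""
    return np.repeat(np.repeat(x, s, axis=2), s, axis=3)

original = np.arange(32, dtype=float).reshape(1, 2, 4, 4)
low_res = downsample(original, 2)            # (1, 2, 2, 2)
target = interpolate_nearest(low_res, 2)     # back to (1, 2, 4, 4)
print(low_res.shape, target.shape)
```

In practice a smoother interpolation such as bicubic would be a common choice; the point of the sketch is only the size change from H×W down to (H/s)×(W/s) and back.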
[0084] It should be noted that, as shown in Figure 4, the first row of Figure 4 shows the hyperspectral images recovered by deep learning methods, and the second row shows the error maps between the restored images and the real high-resolution hyperspectral image. The last column shows the image restored using the fusion method described in this embodiment. The less content in the error map, the smaller the difference between the restored image and the original image, indicating higher reconstruction quality and better fidelity of spectral and spatial information. Representative fusion methods that have performed well in recent years were selected for comparison. From the perspective of detail preservation in the restored image and the cleanliness of the error map, this embodiment significantly outperforms existing methods in terms of visual effect and error control, demonstrating excellent fusion performance and methodological advantages.
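An error map of the kind described above can be computed as a per-pixel difference between the restored and reference images. The source does not say how the bands are aggregated, so the sketch assumes the mean absolute difference over the band axis; the function name `error_map` is hypothetical.

```python
import numpy as np

def error_map(restored, reference):
    """Per-pixel error for (N, H, W) images, averaged over the N bands.

    ASSUMPTION: mean absolute difference across bands; other aggregations
    (for example a single band or a sum) are equally possible.
    """
    return np.abs(restored - reference).mean(axis=0)           # (H, W)

reference = np.zeros((3, 2, 2))
restored = reference.copy()
restored[:, 0, 0] = [0.3, 0.0, -0.3]     # error confined to one pixel
emap = error_map(restored, reference)
print(emap)
```

A map that is zero almost everywhere, as in this toy case, corresponds to the "clean" error maps the paragraph associates with higher reconstruction quality.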
[0085] The technical solution of this embodiment abandons the traditional highly coupled fusion mode in feature and image space. Guided by the generalized mixed information model, it constructs a maximum a posteriori probability model and uses the solution process of that model as a guide to perform information fusion in a low-dimensional dictionary space. This can effectively alleviate the problem of non-overlapping spectral responses and reduce the blurring and spectral distortion of the fusion results, while remaining interpretable and lightweight.
[0086] Figure 5 illustrates a structural schematic of an embodiment of a hyperspectral and multispectral image fusion system 200 based on deep dictionary learning provided by the present invention. As shown in Figure 5, the system 200 includes: an acquisition unit 210 and a fusion unit 220;
[0087] The acquisition unit 210 is used to: acquire the target hyperspectral image and the target multispectral image of the region to be fused;
[0088] The fusion unit 220 is used to: fuse the target hyperspectral image and the target multispectral image based on an image fusion model that includes multiple sequentially connected hybrid linear fusion modules, to obtain a target fused image of the region to be fused;
[0089] Each hybrid linear fusion module includes: a dictionary generation network, a dictionary update network, an abundance update network, and a linear fusion network. The dictionary generation network is used to: receive the input hyperspectral image of the region to be fused and obtain an initial spectral dictionary corresponding to the hyperspectral image. The dictionary update network is used to: generate a target spectral dictionary based on the target multispectral image and the initial spectral dictionary. The abundance update network is used to: obtain a target abundance matrix based on the initial spectral dictionary and the reshaped hyperspectral image. The linear fusion network is used to: linearly fuse the target spectral dictionary with the fine-tuned target abundance matrix to obtain a fused image, which is then used as the hyperspectral image of the region to be fused received by the next hybrid linear fusion module.
[0090] Specifically, the fused image output by the last hybrid linear fusion module is determined as the target fused image.
[0091] In one alternative approach, the dictionary generation network is specifically used for:
[0092] Features are extracted from the hyperspectral image by a dimensionality-reducing convolutional neural network to obtain a feature map, which is then input into a reshaping function to construct the initial spectral dictionary.
[0093] In one alternative approach, the abundance update network is specifically used for:
[0094] The hyperspectral image is input into the reshaping function to obtain the reshaped hyperspectral image;
[0095] The target abundance matrix is calculated using the least squares method, combined with the initial spectral dictionary and the reshaped hyperspectral image.
[0096] In one alternative approach, the dictionary update network is specifically used for:
[0097] The target multispectral image is concatenated with the initial spectral dictionary to obtain the concatenated features;
[0098] The concatenated features are input into the transformer structure and fine-tuned in conjunction with the initial spectral dictionary to generate the target spectral dictionary.
[0099] In one alternative approach, the linear fusion network is specifically used for:
[0100] Based on a one-dimensional attention mechanism, the target abundance matrix is fine-tuned to obtain the fine-tuned target abundance matrix;
[0101] The target spectral dictionary is linearly fused with the fine-tuned target abundance matrix to obtain the fused image.
[0102] In an alternative embodiment, the system further includes a preprocessing unit; the preprocessing unit is used for:
[0103] A high-resolution original hyperspectral image of the region to be fused is obtained, and the original hyperspectral image is downsampled to obtain a low-resolution hyperspectral image.
[0104] The low-resolution hyperspectral image is interpolated to obtain the target hyperspectral image.
[0105] It should be noted that the beneficial effects of the hyperspectral and multispectral image fusion system based on deep dictionary learning provided in the above embodiments are the same as those of the hyperspectral and multispectral image fusion method based on deep dictionary learning, and will not be repeated here. Furthermore, the system provided in the above embodiments is only illustrated by the division of the above functional modules. In practical applications, the above functions can be assigned to different functional modules as needed, that is, the system can be divided into different functional modules according to the actual situation to complete all or part of the functions described above. In addition, the system and method embodiments provided in the above embodiments belong to the same concept, and their specific implementation process is detailed in the method embodiments, and will not be repeated here.
[0106] The hyperspectral and multispectral image fusion system based on deep dictionary learning of the present invention can be a computer program (including program code) running on a computer device. For example, the hyperspectral and multispectral image fusion system based on deep dictionary learning of the present invention is an application software that can be used to execute the corresponding steps in the hyperspectral and multispectral image fusion method based on deep dictionary learning of the present invention.
[0107] In some embodiments, the hyperspectral and multispectral image fusion system based on deep dictionary learning of the present invention can be implemented in a combination of hardware and software. As an example, the hyperspectral and multispectral image fusion system based on deep dictionary learning of the present invention can be a processor in the form of a hardware decoding processor, which is programmed to execute the hyperspectral and multispectral image fusion method based on deep dictionary learning of the present invention. For example, the processor in the form of a hardware decoding processor can be one or more application-specific integrated circuits (ASICs), DSPs, programmable logic devices (PLDs), complex programmable logic devices (CPLDs), field-programmable gate arrays (FPGAs), or other electronic components.
[0108] The modules described in the embodiments of this invention can be implemented in software or hardware. In some cases, the name of a module does not constitute a limitation on the module itself.
[0109] An electronic device according to an embodiment of the present invention includes a memory, a processor, and a computer program stored in the memory and executable on the processor. When the processor executes the computer program, it implements any of the above-mentioned hyperspectral and multispectral image fusion methods based on deep dictionary learning. That is, an electronic device according to an embodiment of the present invention may include, but is not limited to: a processor and a memory; the memory is used to store the computer program; the processor is used to execute the hyperspectral and multispectral image fusion method based on deep dictionary learning shown in any embodiment of the present invention by calling the computer program.
[0110] In one alternative embodiment, an electronic device is provided. As shown in Figure 6, the electronic device 4000 includes a processor 4001 and a memory 4003. The processor 4001 and the memory 4003 are connected, for example, via a bus 4002. Optionally, the electronic device 4000 may further include a transceiver 4004, which can be used for data interaction between the electronic device and other electronic devices, such as sending and/or receiving data. It should be noted that in practical applications, the transceiver 4004 is not limited to one type, and the structure of the electronic device 4000 does not constitute a limitation on the embodiments of the present invention.
[0111] Processor 4001 may be a CPU (Central Processing Unit), a general-purpose processor, a DSP (Digital Signal Processor), an ASIC (Application Specific Integrated Circuit), an FPGA (Field Programmable Gate Array), or other programmable logic devices, transistor logic devices, hardware components, or any combination thereof. It can implement or execute the various exemplary logic blocks, modules, and circuits described in conjunction with the disclosure of this invention. Processor 4001 may also be a combination that implements computational functions, such as a combination of one or more microprocessors, a combination of a DSP and a microprocessor, etc.
[0112] Bus 4002 may include a path for transmitting information between the aforementioned components. Bus 4002 may be a PCI (Peripheral Component Interconnect) bus or an EISA (Extended Industry Standard Architecture) bus, etc. Bus 4002 can be divided into an address bus, a data bus, a control bus, etc. For ease of representation, the bus 4002 is shown in Figure 6 as a single thick line, but this does not mean that there is only one bus or one type of bus.
[0113] The memory 4003 may be ROM (Read Only Memory) or other types of static storage devices capable of storing static information and instructions, RAM (Random Access Memory) or other types of dynamic storage devices capable of storing information and instructions, or EEPROM (Electrically Erasable Programmable Read Only Memory), CD-ROM (Compact Disc Read Only Memory) or other optical disc storage (including compact discs, laser discs, optical discs, digital versatile discs, Blu-ray discs, etc.), magnetic disk storage media or other magnetic storage devices, or any other medium capable of carrying or storing desired program code in the form of instructions or data structures and accessible by a computer, but is not limited thereto.
[0114] The memory 4003 stores the application code (computer program) for executing the present invention, and its execution is controlled by the processor 4001. The processor 4001 executes the application code stored in the memory 4003 to implement the content shown in the foregoing method embodiments.
[0115] The electronic device may also be a terminal device. A terminal device can be any terminal device that can install applications and access web pages through applications, including at least one of smartphones, tablets, laptops, desktop computers, smart speakers, smartwatches, smart TVs, and smart in-vehicle devices.
[0116] It should be noted that the electronic device shown in Figure 6 is merely an example and should not be construed as limiting the functionality and scope of use of the embodiments of the present invention.
[0117] An embodiment of the present invention provides a computer-readable storage medium storing a computer program, which, when executed by a processor, implements any of the above-mentioned hyperspectral and multispectral image fusion methods based on deep dictionary learning.
[0118] Optionally, the computer-readable storage medium may be a read-only memory (ROM), a random access memory (RAM), a compact disc read-only memory (CD-ROM), magnetic tape, a floppy disk, an optical data storage device, etc.
[0119] In an exemplary embodiment, a computer program product or computer program is also provided, which includes computer instructions stored in a computer-readable storage medium. A processor of an electronic device reads the computer instructions from the computer-readable storage medium and executes the computer instructions, causing the electronic device to perform the aforementioned hyperspectral and multispectral image fusion method based on deep dictionary learning.
[0120] Computer program code for performing the operations of this invention can be written in one or more programming languages or a combination thereof. These programming languages include object-oriented programming languages—such as Java, Smalltalk, and C++—and conventional procedural programming languages—such as the "C" language or similar programming languages. The program code can be executed entirely on the user's computer, partially on the user's computer, as a standalone software package, partially on the user's computer and partially on a remote computer, or entirely on a remote computer or server. In cases involving remote computers, the remote computer can be connected to the user's computer via any type of network—including a local area network (LAN) or a wide area network (WAN)—or can be connected to an external computer (e.g., via the Internet using an Internet service provider).
[0121] It should be understood that the flowcharts and block diagrams in the accompanying drawings illustrate the architecture, functionality, and operation of possible implementations of methods and computer program products according to various embodiments of the present invention. In this regard, each block in a flowchart or block diagram may represent a module, segment, or portion of code containing one or more executable instructions for implementing the specified logical function. It should also be noted that in some alternative implementations, the functions indicated in the blocks may occur in a different order than those indicated in the drawings. For example, two consecutively indicated blocks may actually be executed substantially in parallel, and they may sometimes be executed in reverse order, depending on the functions involved. It should also be noted that each block in the block diagrams and / or flowcharts, and combinations of blocks in the block diagrams and / or flowcharts, may be implemented using a dedicated hardware-based system that performs the specified function or operation, or using a combination of dedicated hardware and computer instructions.
[0122] The computer-readable storage medium provided in this invention can be, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination thereof. More specific examples of a computer-readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disk read-only memory (CD-ROM), optical storage device, magnetic storage device, or any suitable combination thereof. In this invention, a computer-readable storage medium can be any tangible medium containing or storing a program that can be used by or in conjunction with an instruction execution system, apparatus, or device.
[0123] The aforementioned computer-readable storage medium carries one or more programs, which, when executed by the electronic device, cause the electronic device to perform the method shown in the above embodiments.
[0124] The above description is merely a preferred embodiment of the present invention and an explanation of the technical principles employed. Those skilled in the art should understand that the scope of disclosure in this invention is not limited to technical solutions formed by specific combinations of the above-described technical features, but should also cover other technical solutions formed by arbitrary combinations of the above-described technical features or their equivalents without departing from the above-described concept. One example is a technical solution formed by replacing the above features with (but not limited to) technical features with similar functions disclosed in this invention.
[0125] It should be noted that the terms "first," "second," etc., used in the specification and claims of this application are used to distinguish similar objects and do not represent a limitation on a specific order or sequence. Where appropriate, the order of use for similar objects can be interchanged so that the embodiments of this application described herein can be implemented in an order other than that shown or described.
[0126] Those skilled in the art will recognize that this invention can be implemented as a system, method, or computer program product. Therefore, this invention can be specifically implemented in the following forms: it can be entirely hardware, entirely software (including firmware, resident software, microcode, etc.), or a combination of hardware and software, generally referred to herein as a "circuit," "module," or "system." Furthermore, in some embodiments, this invention can also be implemented as a computer program product contained in one or more computer-readable media, which includes computer-readable program code.
[0127] Although embodiments of the present invention have been shown and described above, it is understood that the above embodiments are exemplary and should not be construed as limiting the present invention. Those skilled in the art can make changes, modifications, substitutions and variations to the above embodiments within the scope of the present invention.
Claims
1. A hyperspectral and multispectral image fusion method based on deep dictionary learning, characterized in that it comprises: acquiring a target hyperspectral image and a target multispectral image of a region to be fused; and fusing the target hyperspectral image and the target multispectral image, based on an image fusion model comprising multiple sequentially connected hybrid linear fusion modules, to obtain a target fused image of the region to be fused; wherein each hybrid linear fusion module includes a dictionary generation network, a dictionary update network, an abundance update network, and a linear fusion network; the dictionary generation network receives the input hyperspectral image of the region to be fused and obtains an initial spectral dictionary corresponding to the hyperspectral image; the dictionary update network generates a target spectral dictionary based on the target multispectral image and the initial spectral dictionary; the abundance update network inputs the hyperspectral image into a reshaping function to obtain a reshaped hyperspectral image, and calculates a target abundance matrix using the least squares method in combination with the initial spectral dictionary and the reshaped hyperspectral image; the linear fusion network fine-tunes the target abundance matrix based on a one-dimensional attention mechanism to obtain a fine-tuned target abundance matrix, and linearly fuses the target spectral dictionary with the fine-tuned target abundance matrix to obtain a fused image, which is then used as the hyperspectral image of the region to be fused received by the next hybrid linear fusion module; and the fused image output by the last hybrid linear fusion module is determined as the target fused image.
2. The hyperspectral and multispectral image fusion method based on deep dictionary learning according to claim 1, wherein the dictionary generation network is specifically used for: extracting features from the hyperspectral image using a dimensionality-reducing convolutional neural network to obtain a feature map, and inputting the feature map into a reshaping function to construct the initial spectral dictionary.
3. The hyperspectral and multispectral image fusion method based on deep dictionary learning according to claim 1, wherein the dictionary update network is specifically used for: concatenating the target multispectral image with the initial spectral dictionary to obtain concatenated features; and inputting the concatenated features into a transformer structure and fine-tuning them in conjunction with the initial spectral dictionary to generate the target spectral dictionary.
4. The hyperspectral and multispectral image fusion method based on deep dictionary learning according to any one of claims 1 to 3, characterized in that it further comprises: obtaining a high-resolution original hyperspectral image of the region to be fused, and downsampling the original hyperspectral image to obtain a low-resolution hyperspectral image; and interpolating the low-resolution hyperspectral image to obtain the target low-resolution hyperspectral image.
5. A hyperspectral and multispectral image fusion system based on deep dictionary learning, characterized in that it comprises an acquisition unit and a fusion unit; the acquisition unit is used to acquire a target hyperspectral image and a target multispectral image of a region to be fused; the fusion unit is used to fuse the target hyperspectral image and the target multispectral image, based on an image fusion model that includes multiple sequentially connected hybrid linear fusion modules, to obtain a target fused image of the region to be fused; wherein each hybrid linear fusion module includes a dictionary generation network, a dictionary update network, an abundance update network, and a linear fusion network; the dictionary generation network receives the input hyperspectral image of the region to be fused and obtains an initial spectral dictionary corresponding to the hyperspectral image; the dictionary update network generates a target spectral dictionary based on the target multispectral image and the initial spectral dictionary; the abundance update network inputs the hyperspectral image into a reshaping function to obtain a reshaped hyperspectral image, and calculates a target abundance matrix using the least squares method in combination with the initial spectral dictionary and the reshaped hyperspectral image; the linear fusion network fine-tunes the target abundance matrix based on a one-dimensional attention mechanism to obtain a fine-tuned target abundance matrix, and linearly fuses the target spectral dictionary with the fine-tuned target abundance matrix to obtain a fused image, which is then used as the hyperspectral image of the region to be fused received by the next hybrid linear fusion module; and the fused image output by the last hybrid linear fusion module is determined as the target fused image.
6. The hyperspectral and multispectral image fusion system based on deep dictionary learning according to claim 5, characterized in that it further comprises a preprocessing unit; the preprocessing unit is used for: obtaining a high-resolution original hyperspectral image of the region to be fused, and downsampling the original hyperspectral image to obtain a low-resolution hyperspectral image; and interpolating the low-resolution hyperspectral image to obtain the target low-resolution hyperspectral image.
7. An electronic device, comprising a processor coupled to a memory, the memory storing at least one computer program, which is loaded and executed by the processor to enable the electronic device to implement the hyperspectral and multispectral image fusion method based on deep dictionary learning as described in any one of claims 1 to 4.
8. A computer-readable storage medium, characterized in that the computer-readable storage medium stores at least one computer program, which is loaded and executed by a processor to enable the computer-readable storage medium to implement the hyperspectral and multispectral image fusion method based on deep dictionary learning as described in any one of claims 1 to 4.
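As an illustration only, the abundance update and linear fusion steps recited in claim 1 can be sketched in NumPy. This is a minimal sketch under stated assumptions, not the patented implementation: the function names, the (height, width, bands) array layout, and the dictionary shape (bands, atoms) are hypothetical; the learned networks (dimensionality-reducing CNN, transformer structure, one-dimensional attention) are omitted, so the dictionaries are simply passed in and the abundance fine-tuning is treated as an identity step.

```python
import numpy as np


def reshape_image(image):
    """Reshaping function (assumed layout): (H, W, B) cube -> (B, H*W) matrix of pixel spectra."""
    height, width, bands = image.shape
    return image.reshape(height * width, bands).T


def abundance_least_squares(initial_dictionary, pixels):
    """Solve min_A ||D A - X||_F by least squares.

    initial_dictionary: D with shape (B, K), one spectral atom per column.
    pixels:             X with shape (B, N), one pixel spectrum per column.
    Returns the abundance matrix A with shape (K, N).
    """
    abundance, *_ = np.linalg.lstsq(initial_dictionary, pixels, rcond=None)
    return abundance


def linear_fusion(target_dictionary, abundance, height, width):
    """Linear fusion: multiply dictionary by abundances and restore the (H, W, B) cube."""
    fused = target_dictionary @ abundance  # shape (B, N)
    return fused.T.reshape(height, width, -1)


def hybrid_linear_fusion_step(image, initial_dictionary, target_dictionary):
    """One simplified module pass; the attention-based fine-tuning is omitted (assumption)."""
    height, width, _ = image.shape
    pixels = reshape_image(image)
    abundance = abundance_least_squares(initial_dictionary, pixels)
    return linear_fusion(target_dictionary, abundance, height, width)
```

In this simplified form, stacking several calls to `hybrid_linear_fusion_step`, each taking the previous fused image as input, mirrors the sequential connection of modules described in the claim. When the image lies exactly in the span of the dictionary and the same dictionary is used for both arguments, the step returns the input image unchanged.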