Nighttime infrared and visible image fusion method and system with synergistic illumination enhancement and modal balance
By using an adaptive gated fusion network and a dual-branch feature enhancement network, the method addresses the loss of texture detail and the modal imbalance that arise when fusing infrared and visible light images in low-light nighttime scenes, generating high-quality fused images with improved visual quality and information integrity.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- DALIAN UNIV
- Filing Date
- 2025-10-31
- Publication Date
- 2026-06-19
AI Technical Summary
Existing infrared and visible light image fusion methods suffer from loss of texture details and modal feature imbalance in low-light nighttime scenes, resulting in insufficient visual quality and practical value of the fusion results.
By employing an adaptive gated fusion network (AGFNet) and a dual-branch feature enhancement and decoupling network (DFEDNet), a collaborative illumination enhancement and modal balance mechanism is constructed by adaptively adjusting the feature contribution ratio of infrared and visible light images, thereby mining potential features of low-light visible light images and achieving dynamic feature balance.
The method generates fused images with higher visual quality and greater feature richness, preserving both the texture details of visible light images and the target saliency of infrared images, thus meeting the requirements for information integrity and clarity in low-light nighttime scenes.
Figure CN121437286B_ABST
Description
Technical Field
[0001] This invention relates to the field of image fusion technology, and more specifically to a method and system for nighttime infrared and visible image fusion with synergistic illumination enhancement and modal balance.
Background Technology
[0002] In computer vision and industrial applications, the fusion of infrared and visible light images plays a crucial role. Visible light images possess rich texture details and excellent visual perception adaptability, clearly presenting the shape, texture, and other appearance information of objects in a scene. Infrared images, relying on their ability to capture the thermal radiation information of objects, can accurately identify important targets such as vehicles and pedestrians even in extreme environments such as low light, fog, and rain, unaffected by ambient lighting conditions. By combining the advantages of both types of images, not only can data redundancy be reduced, but high-quality images with high contrast, rich texture details, and target saliency can also be generated, providing strong support for information processing in complex scenes.
[0003] However, current infrared and visible light image fusion methods have significant shortcomings in low-light nighttime scenes. On the one hand, visible light images degrade in low-light environments. Although they still contain locally identifiable features such as edge structures and semantic information, existing methods fail to fully exploit this potentially useful information, resulting in severe loss of visible-light texture details in the fusion results. On the other hand, existing methods often fuse infrared and visible light modal features directly and crudely, without considering the feature imbalance between the two modalities in low-light scenes. This makes it difficult for the fusion results to achieve a dynamic balance between detail preservation and target saliency, violating the core principle of multimodal fusion, "cooperative enhancement of complementary features," and ultimately degrading the visual quality and practical value of the fused image.
Summary of the Invention
[0004] The purpose of this invention is to propose a method and system for fusing nighttime infrared and visible images with synergistic illumination enhancement and modal balance. By constructing a synergistic image enhancement and feature balancing mechanism, the potential features of low-light visible images are fully explored, and the contribution ratio of complementary features of the two modalities is dynamically adjusted to generate a fused image with higher visual quality and greater feature richness.
[0005] According to a first aspect of the present disclosure, a nighttime infrared and visible image fusion method with synergistic illumination enhancement and modal balance is provided, comprising the following steps:
[0006] Acquire a visible light image with suitable brightness and rich features;
[0007] Construct an adaptive gated fusion network, AGFNet, which includes an encoder, a spatial adaptive gating module (AGF-Module), and a decoder; enhance the Y channel of the visible light image, and input the enhanced Y channel and the infrared image into the encoder for feature extraction to obtain a visible light feature map and an infrared feature map;
[0008] Apply channel weighting to the visible light feature map and the infrared feature map separately, then concatenate the results to obtain a concatenated feature map; input the concatenated feature map into the AGF-Module, which generates spatial weights G and 1-G to allocate the contributions of the visible light and infrared image features, where G represents the degree of attention to visible light features at the corresponding spatial location, and 1-G represents the degree of attention to infrared features at the corresponding spatial location;
[0009] Obtain the final fused features from the spatial weights G and 1-G, the channel-weighted visible light feature map, and the channel-weighted infrared feature map; input the final fused features, together with the Cb and Cr channels of the enhanced visible light image, into the decoder, which reconstructs the fused features and combines them with the color information of the Cb and Cr channels to generate the final fused image.
[0010] According to a second aspect of the present disclosure, a nighttime infrared and visible image fusion system with synergistic illumination enhancement and modal balance is provided, comprising:
[0011] A low-light visible light image processing module, which acquires a visible light image with suitable brightness and rich features;
[0012] An AGFNet construction and input preprocessing module, which constructs an adaptive gated fusion network, AGFNet, including an encoder, a spatial adaptive gating module (AGF-Module), and a decoder; the module enhances the Y channel of the visible light image and inputs the enhanced Y channel and the infrared image into the encoder for feature extraction to obtain a visible light feature map and an infrared feature map;
[0013] A dual-modal feature channel weighting and spatial weight generation module, which applies channel weighting to the visible light feature map and the infrared feature map separately, then concatenates the results to obtain a concatenated feature map; the concatenated feature map is input into the AGF-Module, which generates spatial weights G and 1-G to allocate the contributions of the visible light and infrared image features, where G represents the degree of attention to visible light features at the corresponding spatial location, and 1-G represents the degree of attention to infrared features at the corresponding spatial location;
[0014] A fusion image reconstruction module, which obtains the final fused features from the spatial weights G and 1-G, the channel-weighted visible light feature map, and the channel-weighted infrared feature map; the final fused features, together with the Cb and Cr channels of the enhanced visible light image, are input into the decoder, which reconstructs the fused features and combines them with the color information of the Cb and Cr channels to generate the final fused image.
[0015] According to a third aspect of the present disclosure, an electronic device is provided, including a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor executes the program to implement the nighttime infrared and visible image fusion method with synergistic illumination enhancement and modal balance.
[0016] According to a fourth aspect of the present disclosure, a computer-readable storage medium is provided having a computer program stored thereon that, when executed by a processor, implements the described nighttime infrared and visible image fusion method with synergistic illumination enhancement and modal balance.
[0017] The advantages of the above technical solutions adopted in this invention compared with the prior art are as follows:
[0018] 1. This invention proposes an adaptive low-light infrared-visible light image fusion method with balanced modal feature representation. By optimizing the integration logic of the two complementary multimodal features of infrared and visible light, it effectively avoids the problem of single modal feature dominating the fusion result, and finally generates a fused image with higher visual quality and greater feature richness. It not only retains the texture details, edge structure and other appearance information of the visible light image, but also incorporates the target saliency advantage of the infrared image, meeting the core requirements of image information integrity and clarity in low-light night scenes.
[0019] 2. This invention designs a dual-branch feature enhancement and decoupling network (DFEDNet) based on Retinex theory. This network has dual functions of "feature mining + illumination cancellation": on the one hand, it deeply mines the intrinsic features of visible light images suppressed under low light conditions (such as semantic information of dark areas and fine textures) and awakens the expressive ability of degraded features; on the other hand, it simultaneously performs illumination-reflection component decomposition to accurately eliminate interference such as uneven illumination and insufficient brightness caused by low light environment, thereby improving the quality of low light visible light images from the root and providing high-quality basic data for subsequent multimodal fusion.
[0020] 3. To address the problem of imbalance between infrared and visible light modal features under extreme nighttime conditions, this invention designs an adaptive gated fusion network (AGFNet). By dynamically adjusting the contribution weights of the two modal features, feature compatibility constraints are constructed in the frequency domain. This avoids texture loss caused by excessive highlighting of infrared features and prevents target blurring caused by weak visible light features. It achieves dynamic balance of different imaging modal features during the fusion process, ensuring that the fusion result can maintain stable information integrity and visual consistency even in complex and extreme scenarios.
Attached Figure Description
[0021] The accompanying drawings, which form part of this application, are used to provide a further understanding of this application. The illustrative embodiments of this application and their descriptions are used to explain this application and do not constitute an undue limitation of this application.
[0022] Figure 1 is a flowchart of the image fusion process of the present invention;
[0023] Figure 2 is a flowchart of the DFEDNet implementation;
[0024] Figure 3 is a flowchart of the AGFNet implementation;
[0025] Figure 4 is a flowchart of the AGF-Module implementation.
Detailed Implementation
[0026] The present disclosure will be further described below with reference to the accompanying drawings and embodiments.
[0027] It should be noted that the following detailed descriptions are exemplary and intended to provide further explanation of this application. Unless otherwise specified, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application pertains.
[0028] It should be noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to limit the exemplary embodiments according to this application. As used herein, the singular form is intended to include the plural form as well, unless the context clearly indicates otherwise. Furthermore, it should be understood that when the terms "comprising" and / or "including" are used in this specification, they indicate the presence of features, steps, operations, devices, components, and / or combinations thereof.
[0029] It should be noted that the flowcharts and block diagrams in the accompanying drawings illustrate the architecture, functionality, and operation of possible implementations of methods and systems according to various embodiments of this disclosure. It should be noted that each block in a flowchart or block diagram may represent a module, segment, or portion of code, which may include one or more executable instructions for implementing the logical functions specified in the various embodiments. It should also be noted that in some alternative implementations, the functions marked in the blocks may occur in a different order than that shown in the drawings. For example, two consecutively represented blocks may actually be executed substantially in parallel, or they may sometimes be executed in reverse order, depending on the functions involved. It should also be noted that each block in the flowcharts and / or block diagrams, and combinations of blocks in the flowcharts and / or block diagrams, may be implemented using a dedicated hardware-based system that performs the specified functions or operations, or using a combination of dedicated hardware and computer instructions.
[0030] Example 1:
[0031] As shown in Figure 1, this embodiment provides a nighttime infrared and visible image fusion method with synergistic illumination enhancement and modal balance, including the following steps:
[0032] S1. Acquire a visible light image with suitable brightness and rich features;
[0033] Specifically, a dual-branch feature enhancement and decoupling network, DFEDNet, is constructed. As shown in Figure 2, the network includes a decomposition module (Decom-Module) and a feature mining network module (FMN-Module). The low-light visible light image is input to the Decom-Module and the FMN-Module separately. The Decom-Module decomposes the low-light visible light image into a reflection component R and an illumination component L. The FMN-Module extracts features from the low-light visible light image to obtain a feature map. A comparative structural loss is used to constrain the reflection component R with the feature map, yielding the constrained reflection component R'. The constrained reflection component R' is then multiplied element-wise with the illumination component L to obtain a visible light image with suitable brightness and rich features.
[0034] The low-light visible light image is decomposed into a reflection component R and an illumination component L as follows:
[0035] $I_{low} = R \odot L$
[0036] where $I_{low}$ denotes the low-light visible light image and $\odot$ denotes element-wise multiplication.
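The Retinex-style relation described here (image equals reflection times illumination, element-wise) can be illustrated with a minimal NumPy sketch. It is not the patent's Decom-Module, which is a learned network. As an assumption for illustration only, the illumination is estimated as the per-pixel maximum over the color channels, and the function names `retinex_decompose` and `retinex_recompose` are hypothetical.

```python
import numpy as np

def retinex_decompose(image, eps=1e-6):
    """Split an RGB image (H, W, 3) in [0, 1] into reflectance R and illumination L.

    Assumption: L is the per-pixel maximum over the colour channels, a simple
    hand-crafted estimate; the patent's Decom-Module learns this instead.
    """
    illumination = image.max(axis=2, keepdims=True)   # (H, W, 1)
    reflectance = image / (illumination + eps)        # (H, W, 3), values below 1
    return reflectance, illumination

def retinex_recompose(reflectance, illumination):
    """Element-wise product R * L, broadcasting L over the colour channels."""
    return reflectance * illumination

rng = np.random.default_rng(0)
low_light = rng.uniform(0.0, 0.2, size=(4, 4, 3))     # synthetic dark image
R, L = retinex_decompose(low_light)
restored = retinex_recompose(R, L)
```

Multiplying the two components back together recovers the input up to the small `eps` term, which is the property the decomposition relies on.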
[0037] In one embodiment, feature maps are obtained by extracting features from low-light visible light images using an FMN-Module. The method is as follows:
[0038] A subspace attention module is used to extract a multi-channel feature map F from the low-light visible light image, where $F \in \mathbb{R}^{m \times h \times w}$, m is the number of channels in the feature map, and h and w are the spatial dimensions of the feature map in the height and width directions, respectively;
[0039] Divide the multi-channel feature map F into i mutually exclusive subspaces. Each subspace contains G=m / i channels to distribute complex feature information;
[0040] To further improve the robustness of cross-channel dependency capture and the stability of feature redistribution, considering the diversity of local features at different scales and the fact that a larger receptive field helps capture more global contextual information in extreme nighttime scenes, multi-scale pooling was employed on the feature maps of each subspace during the pooling stage. Specifically, 5×5 max pooling (with padding=2) and 3×3 pooling were used, and the results of the two pooling methods were averaged to obtain the fused multi-scale pooling result.
[0041]
[0042] Here, the operator denotes the multi-scale pooling operation. Dividing the feature map in this way distributes complex feature information across different subspaces, providing finer-grained processing units for the subsequent attention mechanism.
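The subspace division and multi-scale pooling described above can be sketched as follows. This is an illustration under stated assumptions rather than the patented implementation: the text does not give the stride or the type of the 3×3 pooling, so the sketch assumes stride 1, max pooling for both kernels, and padding of 1 for the 3×3 kernel so that both outputs keep the input size. The helper names are hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def max_pool_same(x, k):
    """Stride-1 k x k max pooling over (C, H, W) with padding k // 2 (keeps H and W)."""
    p = k // 2
    padded = np.pad(x, ((0, 0), (p, p), (p, p)), constant_values=-np.inf)
    windows = sliding_window_view(padded, (k, k), axis=(1, 2))  # (C, H, W, k, k)
    return windows.max(axis=(-2, -1))

def multi_scale_pool(x):
    """Average of 5x5 and 3x3 pooling. Assumption: the 3x3 pooling is also max pooling."""
    return 0.5 * (max_pool_same(x, 5) + max_pool_same(x, 3))

def split_subspaces(features, num_subspaces):
    """Split (m, h, w) features into disjoint groups of m / num_subspaces channels."""
    return np.split(features, num_subspaces, axis=0)

F = np.arange(8 * 6 * 6, dtype=np.float64).reshape(8, 6, 6)    # m = 8, h = w = 6
subspaces = split_subspaces(F, num_subspaces=4)                # 4 groups of 2 channels
pooled = [multi_scale_pool(s) for s in subspaces]
```

With m = 8 channels and i = 4 subspaces, each subspace holds m / i = 2 channels, and each pooled map has the same shape as its subspace.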
[0043] The subspace feature map after multi-scale pooling is sequentially processed by MaxPool local information extraction, Softmax attention weight generation, DepthwiseConvolution (DW, 1×1 kernel depthwise convolution) cross-channel information extraction, PointwiseConvolution (PW, single-filter pointwise convolution) channel fusion, and BatchNormalization (BN layer) normalization. Normalization effectively balances the activation amplitudes between channels, helping to alleviate gradient fluctuations and accelerate convergence. The resulting processed subspace feature map is shown below.
[0044]
[0045] The processed subspace feature map is redistributed using element-wise multiplication and addition operations:
[0046]
[0047] Here, the two operators denote element-wise multiplication and element-wise addition, respectively;
[0048] The redistributed feature maps of all subspaces are concatenated to form the final feature map:
[0049]
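The redistribution and concatenation steps can be sketched in NumPy as below. The original formulas are not preserved in this text, so the sketch assumes a common form of subspace attention in which each subspace is re-weighted by an attention map and the original features are added back as a residual. The attention score used here (the channel mean followed by a spatial softmax) is a simplified stand-in for the MaxPool, Softmax, depthwise convolution, pointwise convolution, and batch normalization chain described in the text, and all function names are hypothetical.

```python
import numpy as np

def spatial_softmax(scores):
    """Softmax over all spatial positions of an (h, w) score map."""
    shifted = scores - scores.max()            # subtract the max for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum()

def redistribute(subspace, attention):
    """(attention * F) + F: re-weight features, keeping the original as a residual."""
    return attention[None, :, :] * subspace + subspace

def subspace_attention(features, num_subspaces):
    """Assumption: the attention score is the channel mean of each subspace,
    standing in for the learned chain of layers described in the text."""
    outputs = []
    for subspace in np.split(features, num_subspaces, axis=0):
        attention = spatial_softmax(subspace.mean(axis=0))
        outputs.append(redistribute(subspace, attention))
    return np.concatenate(outputs, axis=0)     # back to (m, h, w)

rng = np.random.default_rng(0)
F = rng.normal(size=(8, 4, 4))
F_out = subspace_attention(F, num_subspaces=4)
```

The residual addition means a subspace is never suppressed to zero, which is one plausible reading of why the text pairs multiplication with addition.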
[0050] S2. Construct an adaptive gated fusion network, AGFNet. As shown in Figure 3, the network includes an encoder, a spatial adaptive gating module (AGF-Module), and a decoder. The Y channel of the visible light image is enhanced, and the enhanced Y channel and the infrared image are input into the encoder for feature extraction to obtain a visible light feature map and an infrared feature map;
[0051] Specifically, the Y channel of the visible light image is processed with the Contrast Limited Adaptive Histogram Equalization (CLAHE) algorithm to enhance image contrast.
[0052] The encoder extracts features to provide high-quality feature representations for subsequent fusion and reconstruction, specifically:
[0053]
[0054] where E denotes the encoder, whose visible-light input is the enhanced Y channel of the visible light image.
[0055] S3. Channel weighting is applied to the visible light feature map and the infrared feature map separately, and the results are concatenated to obtain a concatenated feature map. The concatenated feature map is input into the AGF-Module. As shown in Figure 4, this module generates spatial weights G and 1-G to allocate the contributions of the visible light and infrared image features, where G represents the degree of attention to visible light features at the corresponding spatial location, and 1-G represents the degree of attention to infrared features at the corresponding spatial location.
[0056] Specifically, the concatenated feature map is obtained as follows: a squeeze-and-excitation (SE) module performs channel weighting on the visible light feature map and the infrared feature map separately to obtain the corresponding channel weights; the visible-light channel weight is multiplied element-wise with the visible light feature map to obtain the channel-weighted visible light feature map; the infrared channel weight is multiplied element-wise with the infrared feature map to obtain the channel-weighted infrared feature map; the two channel-weighted feature maps are then concatenated along the channel dimension to obtain the concatenated feature map.
[0057] The channel weights of the two modalities are obtained as follows:
[0058]
[0059] The values of the two channel weights typically lie within [0, 1]; the resulting weighted features are then concatenated and channel-compressed;
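The channel weighting and concatenation step can be sketched with a standard squeeze-and-excitation block in NumPy. The patent's exact formula is not preserved here, so this follows the usual SE structure (global average pooling, two fully connected layers, ReLU, then Sigmoid) as an assumption. The random weights, the reduction ratio `r`, and the function names are hypothetical and stand in for trained parameters.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def se_channel_weights(features, w_reduce, w_expand):
    """Squeeze-and-excitation weights for a (C, H, W) feature map.

    Squeeze: global average pooling to a length-C vector.
    Excitation: FC -> ReLU -> FC -> Sigmoid, one weight in [0, 1] per channel.
    """
    squeezed = features.mean(axis=(1, 2))             # (C,)
    hidden = np.maximum(w_reduce @ squeezed, 0.0)     # (C // r,)
    return sigmoid(w_expand @ hidden)                 # (C,)

def channel_weighted_concat(f_vis, f_ir, params_vis, params_ir):
    """Weight each modality by its own SE weights, then concatenate along channels."""
    w_vis = se_channel_weights(f_vis, *params_vis)
    w_ir = se_channel_weights(f_ir, *params_ir)
    f_vis_w = f_vis * w_vis[:, None, None]
    f_ir_w = f_ir * w_ir[:, None, None]
    return f_vis_w, f_ir_w, np.concatenate([f_vis_w, f_ir_w], axis=0)

rng = np.random.default_rng(0)
C, H, W, r = 8, 5, 5, 4                               # r: hypothetical reduction ratio

def make_params():
    """Random stand-ins for the trained SE layer weights."""
    return rng.normal(size=(C // r, C)), rng.normal(size=(C, C // r))

f_vis = rng.normal(size=(C, H, W))
f_ir = rng.normal(size=(C, H, W))
f_vis_w, f_ir_w, f_cat = channel_weighted_concat(f_vis, f_ir, make_params(), make_params())
```

Because each weight lies in [0, 1], channel weighting can only attenuate a channel, never amplify it; the concatenated map has twice the channel count of either input.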
[0060] The concatenated feature map is processed to obtain the fused attention map A:
[0061]
[0062] Here, the Sigmoid activation function ensures that the weights lie between 0 and 1, Conv denotes the convolution operation, and ReLU is the activation function. Based on the fused attention map A, the spatial weights G are generated as follows:
[0063]
[0064] Spatial gating weights G are generated by sigmoid activation. Each value in G represents the degree of attention to visible light features at the corresponding spatial location. The closer the value is to 1, the more dependent the location is on visible light features, and vice versa.
[0065] S4. The final fused features are obtained from the spatial weights G and 1-G, the channel-weighted visible light feature map, and the channel-weighted infrared feature map. The final fused features, together with the Cb and Cr channels of the enhanced visible light image, are input into the decoder, which reconstructs the fused features and combines them with the color information of the Cb and Cr channels to generate the final fused image.
[0066] Specifically, the fused features are given by:
[0067]
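A minimal sketch of the spatial gating described in S3 and S4 follows. Since the original formula is not preserved, it assumes the straightforward reading of the text: the gate is G = Sigmoid(A), and the fused features are G times the channel-weighted visible features plus (1 - G) times the channel-weighted infrared features, element-wise. The function name `gated_fusion` and the toy inputs are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gated_fusion(attention_map, f_vis_w, f_ir_w):
    """Blend two (C, H, W) feature maps with a spatial gate from an (H, W) map.

    G = sigmoid(A) lies in [0, 1]; G weights the visible-light features and
    1 - G weights the infrared features at each spatial location.
    """
    gate = sigmoid(attention_map)[None, :, :]         # (1, H, W), broadcast over channels
    fused = gate * f_vis_w + (1.0 - gate) * f_ir_w
    return fused, gate[0]

f_vis_w = np.full((2, 2, 2), 10.0)                    # toy visible-light features
f_ir_w = np.full((2, 2, 2), -10.0)                    # toy infrared features
A = np.array([[20.0, -20.0],
              [0.0, 0.0]])
fused, G = gated_fusion(A, f_vis_w, f_ir_w)
```

In this toy case the top-left location is dominated by visible-light features (G near 1), the top-right by infrared features (G near 0), and the bottom row blends both equally (G = 0.5).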
[0068] The blue-difference and red-difference chroma components (Cb and Cr) are taken from the enhanced visible light image. The process by which the decoder reconstructs the image can be defined as follows:
[0069]
[0070] Here, the transfer matrix converts an image from YCbCr to RGB, and the two chroma terms are the Cb and Cr channels of the enhanced visible light image, respectively.
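The final color reconstruction can be sketched as a YCbCr-to-RGB conversion in NumPy. The patent does not state which YCbCr variant its transfer matrix uses, so the full-range BT.601 (JPEG/JFIF) coefficients below are an assumption, as is treating the decoder output as a single fused luminance channel. The names `YCBCR_TO_RGB` and `ycbcr_to_rgb` are hypothetical.

```python
import numpy as np

# Assumption: full-range BT.601 (JPEG/JFIF) coefficients; the patent does not
# state which YCbCr variant its transfer matrix uses.
YCBCR_TO_RGB = np.array([
    [1.0,  0.0,       1.402],
    [1.0, -0.344136, -0.714136],
    [1.0,  1.772,     0.0],
])

def ycbcr_to_rgb(y, cb, cr):
    """Combine a fused Y channel with Cb/Cr (each (H, W), values in [0, 1]) into RGB."""
    ycc = np.stack([y, cb - 0.5, cr - 0.5], axis=-1)  # centre chroma at zero
    rgb = ycc @ YCBCR_TO_RGB.T                        # apply the matrix per pixel
    return np.clip(rgb, 0.0, 1.0)

fused_y = np.array([[0.0, 0.5],
                    [1.0, 0.25]])
neutral = np.full_like(fused_y, 0.5)                  # Cb = Cr = 0.5 carries no colour
rgb = ycbcr_to_rgb(fused_y, neutral, neutral)
```

With neutral chroma the result is a gray image whose three channels all equal the fused luminance; in the described pipeline the Cb and Cr inputs would instead come from the enhanced visible light image.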
[0071] Example 2:
[0072] This embodiment provides a nighttime infrared and visible image fusion system with synergistic illumination enhancement and modal balance, including:
[0073] A low-light visible light image processing module, which acquires a visible light image with suitable brightness and rich features;
[0074] An AGFNet construction and input preprocessing module, which constructs an adaptive gated fusion network, AGFNet, including an encoder, a spatial adaptive gating module (AGF-Module), and a decoder; the module enhances the Y channel of the visible light image and inputs the enhanced Y channel and the infrared image into the encoder for feature extraction to obtain a visible light feature map and an infrared feature map;
[0075] A dual-modal feature channel weighting and spatial weight generation module, which applies channel weighting to the visible light feature map and the infrared feature map separately, then concatenates the results to obtain a concatenated feature map; the concatenated feature map is input into the AGF-Module, which generates spatial weights G and 1-G to allocate the contributions of the visible light and infrared image features, where G represents the degree of attention to visible light features at the corresponding spatial location, and 1-G represents the degree of attention to infrared features at the corresponding spatial location;
[0076] A fusion image reconstruction module, which obtains the final fused features from the spatial weights G and 1-G, the channel-weighted visible light feature map, and the channel-weighted infrared feature map; the final fused features, together with the Cb and Cr channels of the enhanced visible light image, are input into the decoder, which reconstructs the fused features and combines them with the color information of the Cb and Cr channels to generate the final fused image.
[0077] Example 3:
[0078] An electronic device includes a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the program, implements the above-described nighttime infrared and visible image fusion method with synergistic illumination enhancement and modal balance, comprising:
[0079] Acquire a visible light image with suitable brightness and rich features;
[0080] Construct an adaptive gated fusion network, AGFNet, which includes an encoder, a spatial adaptive gating module (AGF-Module), and a decoder; enhance the Y channel of the visible light image, and input the enhanced Y channel and the infrared image into the encoder for feature extraction to obtain a visible light feature map and an infrared feature map;
[0081] Apply channel weighting to the visible light feature map and the infrared feature map separately, then concatenate the results to obtain a concatenated feature map; input the concatenated feature map into the AGF-Module, which generates spatial weights G and 1-G to allocate the contributions of the visible light and infrared image features, where G represents the degree of attention to visible light features at the corresponding spatial location, and 1-G represents the degree of attention to infrared features at the corresponding spatial location;
[0082] Obtain the final fused features from the spatial weights G and 1-G, the channel-weighted visible light feature map, and the channel-weighted infrared feature map; input the final fused features, together with the Cb and Cr channels of the enhanced visible light image, into the decoder, which reconstructs the fused features and combines them with the color information of the Cb and Cr channels to generate the final fused image.
[0083] Example 4:
[0084] A computer-readable storage medium having a computer program stored thereon, which, when executed by a processor, implements the above-described method for nighttime infrared and visible image fusion with synergistic illumination enhancement and modal balance, comprising:
[0085] Acquire a visible light image with suitable brightness and rich features;
[0086] Construct an adaptive gated fusion network, AGFNet, which includes an encoder, a spatial adaptive gating module (AGF-Module), and a decoder; enhance the Y channel of the visible light image, and input the enhanced Y channel and the infrared image into the encoder for feature extraction to obtain a visible light feature map and an infrared feature map;
[0087] Apply channel weighting to the visible light feature map and the infrared feature map separately, then concatenate the results to obtain a concatenated feature map; input the concatenated feature map into the AGF-Module, which generates spatial weights G and 1-G to allocate the contributions of the visible light and infrared image features, where G represents the degree of attention to visible light features at the corresponding spatial location, and 1-G represents the degree of attention to infrared features at the corresponding spatial location;
[0088] Obtain the final fused features from the spatial weights G and 1-G, the channel-weighted visible light feature map, and the channel-weighted infrared feature map; input the final fused features, together with the Cb and Cr channels of the enhanced visible light image, into the decoder, which reconstructs the fused features and combines them with the color information of the Cb and Cr channels to generate the final fused image.
[0089] Those skilled in the art will understand that the modules or steps described above can be implemented using general-purpose computer devices. Optionally, they can be implemented using computer-executable program code, which can then be stored in a storage device for execution by a computer device. Alternatively, they can be fabricated as separate integrated circuit modules, or multiple modules or steps can be fabricated as a single integrated circuit module. This disclosure is not limited to any particular combination of hardware and software.
[0090] The above description is merely a preferred embodiment of this application and is not intended to limit this application. Various modifications and variations can be made to this application by those skilled in the art. Any modifications, equivalent substitutions, improvements, etc., made within the spirit and principles of this application should be included within the protection scope of this application.
[0091] While the specific embodiments of this disclosure have been described above in conjunction with the accompanying drawings, this is not intended to limit the scope of protection of this disclosure. Those skilled in the art should understand that various modifications or variations that can be made by those skilled in the art without creative effort based on the technical solutions of this disclosure are still within the scope of protection of this disclosure.
Claims
1. A nighttime infrared and visible image fusion method with synergistic illumination enhancement and modal balance, characterized in that it comprises the following steps: acquiring a visible light image with suitable brightness and rich features; constructing an adaptive gated fusion network, AGFNet, which includes an encoder, a spatial adaptive gating module (AGF-Module), and a decoder; enhancing the Y channel of the visible light image, and inputting the enhanced Y channel and the infrared image into the encoder for feature extraction to obtain a visible light feature map and an infrared feature map; applying channel weighting to the visible light feature map and the infrared feature map separately, then concatenating the results to obtain a concatenated feature map; inputting the concatenated feature map into the AGF-Module, which generates spatial weights G and 1-G to allocate the contributions of the visible light and infrared image features, where G represents the degree of attention to visible light features at the corresponding spatial location, and 1-G represents the degree of attention to infrared features at the corresponding spatial location; obtaining the final fused features from the spatial weights G and 1-G, the channel-weighted visible light feature map, and the channel-weighted infrared feature map; inputting the final fused features, together with the Cb and Cr channels of the enhanced visible light image, into the decoder, which reconstructs the fused features and combines them with the color information of the Cb and Cr channels to generate the final fused image; wherein the visible light image with suitable brightness and rich features is acquired as follows: constructing a dual-branch feature enhancement and decoupling network, DFEDNet, which includes a decomposition module (Decom-Module) and a feature mining network module (FMN-Module); inputting the low-light visible light image to the Decom-Module and the FMN-Module separately; the Decom-Module decomposes the low-light visible light image into a reflection component R and an illumination component L; the FMN-Module extracts features from the low-light visible light image to obtain a feature map; a comparative structural loss is used to constrain the reflection component R with the feature map, yielding the constrained reflection component R'; and the constrained reflection component R' is multiplied element-wise with the illumination component L to obtain the visible light image with suitable brightness and rich features.
2. The nighttime infrared and visible image fusion method with synergistic illumination enhancement and modal balance according to claim 1, characterized in that the FMN-Module extracts features from the low-light visible light image to obtain the feature map as follows: a subspace attention module is used to extract a multi-channel feature map F from the low-light visible light image, where $F \in \mathbb{R}^{m \times h \times w}$, m is the number of channels in the feature map, and h and w are the spatial dimensions of the feature map in the height and width directions, respectively; the multi-channel feature map F is divided into i mutually exclusive subspaces, each containing m / i channels, to distribute complex feature information; multi-scale pooling is performed on the feature map of each subspace, specifically 5×5 max pooling and 3×3 pooling, and the results of the two pooling operations are averaged to obtain the fused multi-scale pooling result, wherein the operator denotes the multi-scale pooling operation; the subspace feature map after multi-scale pooling is subjected to MaxPool local information extraction, Softmax attention weight generation, DepthwiseConvolution cross-channel information extraction, PointwiseConvolution channel fusion, and BatchNormalization normalization to obtain the processed subspace feature map; the processed subspace feature map is redistributed using element-wise multiplication and element-wise addition; and the redistributed feature maps of all subspaces are concatenated to form the final feature map.
3. The nighttime infrared and visible image fusion method with synergistic illumination enhancement and modal balance according to claim 1, characterized in that the Y channel of the visible light image is processed with a contrast-limited adaptive histogram equalization algorithm to enhance image contrast.
4. The method for nighttime infrared and visible image fusion with synergistic illumination enhancement and modal balance as claimed in claim 1, wherein the concatenated feature map is obtained as follows: squeeze-and-excitation modules perform channel weighting on the visible light feature map and the infrared feature map, respectively, to obtain the corresponding channel weights; the channel weight of the visible light branch is multiplied element-wise with the visible light feature map to obtain the channel-weighted visible light feature map; the channel weight of the infrared branch is multiplied element-wise with the infrared feature map to obtain the channel-weighted infrared feature map; and the channel-weighted visible light feature map and the channel-weighted infrared feature map are concatenated along the channel dimension to obtain the concatenated feature map.
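The channel weighting and concatenation of claim 4 can be illustrated with a standard squeeze-and-excitation computation (global average pooling, two fully connected layers, sigmoid). The sketch below is a minimal version under assumptions: the reduction ratio, the random stand-in weights, and all function and variable names are hypothetical, and the claim does not state the internal layout of its squeeze-and-excitation modules.

```python
import numpy as np

def sigmoid(x: np.ndarray) -> np.ndarray:
    return 1.0 / (1.0 + np.exp(-x))

def se_weights(feat: np.ndarray, w1: np.ndarray, w2: np.ndarray) -> np.ndarray:
    """Squeeze-and-excitation channel weights for a (c, h, w) feature map."""
    squeezed = feat.mean(axis=(1, 2))         # squeeze: global average pool -> (c,)
    hidden = np.maximum(w1 @ squeezed, 0.0)   # excitation: FC + ReLU -> (c / r,)
    return sigmoid(w2 @ hidden)               # FC + sigmoid -> (c,), each in (0, 1)

def weight_and_concat(f_vi, f_ir, params_vi, params_ir):
    """Weight each modality per channel, then concatenate along the channel axis."""
    f_vi_w = f_vi * se_weights(f_vi, *params_vi)[:, None, None]
    f_ir_w = f_ir * se_weights(f_ir, *params_ir)[:, None, None]
    return f_vi_w, f_ir_w, np.concatenate([f_vi_w, f_ir_w], axis=0)

rng = np.random.default_rng(0)
c, r, h, w = 8, 4, 5, 5                       # r is an assumed reduction ratio

def random_params():
    # Random stand-ins for trained weights.
    return rng.standard_normal((c // r, c)), rng.standard_normal((c, c // r))

f_vi = rng.standard_normal((c, h, w))         # stand-in visible light feature map
f_ir = rng.standard_normal((c, h, w))         # stand-in infrared feature map
f_vi_w, f_ir_w, f_cat = weight_and_concat(f_vi, f_ir, random_params(), random_params())
print(f_cat.shape)  # (16, 5, 5)
```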
5. The nighttime infrared and visible image fusion method with synergistic illumination enhancement and modal balance according to claim 1, characterized in that the concatenated feature map is processed to obtain a fused attention map A, wherein the Sigmoid activation function ensures that the weights lie between 0 and 1, Conv denotes a convolution operation, and ReLU is an activation function; and the spatial weights G are generated from the fused attention map A, each value in G representing the degree of attention paid to visible light features at the corresponding spatial location.
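One plausible reading of the attention computation in claim 5 is a Conv, ReLU, Conv, Sigmoid stack applied to the concatenated feature map. The sketch below assumes exactly that ordering, uses 1×1 convolutions, and produces a single-channel map; the kernel sizes, channel counts, random stand-in weights, and function names are assumptions, and the step that derives G from A is not sketched because the claim text does not give it in recoverable form.

```python
import numpy as np

def conv1x1(x: np.ndarray, weight: np.ndarray, bias: np.ndarray) -> np.ndarray:
    """1x1 convolution: x is (c_in, h, w), weight is (c_out, c_in), bias is (c_out,)."""
    return np.einsum("oc,chw->ohw", weight, x) + bias[:, None, None]

def attention_map(f_cat, w1, b1, w2, b2):
    """Assumed structure: Sigmoid(Conv(ReLU(Conv(f_cat))))."""
    hidden = np.maximum(conv1x1(f_cat, w1, b1), 0.0)           # Conv + ReLU
    return 1.0 / (1.0 + np.exp(-conv1x1(hidden, w2, b2)))       # Conv + Sigmoid

rng = np.random.default_rng(0)
f_cat = rng.standard_normal((16, 5, 5))                         # stand-in concatenated features
w1, b1 = 0.1 * rng.standard_normal((4, 16)), np.zeros(4)        # random stand-in weights
w2, b2 = 0.1 * rng.standard_normal((1, 4)), np.zeros(1)
A = attention_map(f_cat, w1, b1, w2, b2)
print(A.shape)  # (1, 5, 5)
```

The Sigmoid keeps every entry of A strictly between 0 and 1, which is what allows a map of this kind to act as a per-pixel mixing weight.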
6. The nighttime infrared and visible image fusion method with synergistic illumination enhancement and modal balance according to claim 1, characterized in that the fusion features and the fused image are each defined by a formula, wherein the formula for the fused image uses a transfer matrix that converts an image from YCbCr to RGB, together with the Cb and Cr channels of the enhanced visible light image.
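To make the two operations of claim 6 concrete, the sketch below combines channel-weighted features with spatial weights G and 1-G and converts a YCbCr image to RGB. Both parts rest on assumptions: the weighted sum G * F_vi + (1 - G) * F_ir is inferred from the role the claims give to G and 1-G, and the conversion uses the full-range BT.601 (JPEG) coefficients as a stand-in for the unspecified transfer matrix. Function names are hypothetical.

```python
import numpy as np

def gated_fusion(g: np.ndarray, f_vi_w: np.ndarray, f_ir_w: np.ndarray) -> np.ndarray:
    # Assumed form: G weights the visible light features, 1 - G the infrared features.
    return g * f_vi_w + (1.0 - g) * f_ir_w

def ycbcr_to_rgb(y: np.ndarray, cb: np.ndarray, cr: np.ndarray) -> np.ndarray:
    """Full-range BT.601 (JPEG) YCbCr -> RGB for values in [0, 255] (assumed convention)."""
    r = y + 1.402 * (cr - 128.0)
    g = y - 0.344136 * (cb - 128.0) - 0.714136 * (cr - 128.0)
    b = y + 1.772 * (cb - 128.0)
    return np.clip(np.stack([r, g, b], axis=0), 0.0, 255.0)

rng = np.random.default_rng(0)
f_vi_w = rng.standard_normal((8, 4, 4))        # stand-in channel-weighted visible light features
f_ir_w = rng.standard_normal((8, 4, 4))        # stand-in channel-weighted infrared features
G = rng.uniform(0.0, 1.0, size=(1, 4, 4))      # stand-in spatial weights
fused = gated_fusion(G, f_vi_w, f_ir_w)

y = np.full((4, 4), 100.0)                     # stand-in fused luminance
neutral = np.full((4, 4), 128.0)               # neutral chroma
rgb = ycbcr_to_rgb(y, neutral, neutral)        # neutral chroma gives a grey image
print(fused.shape, rgb.shape)  # (8, 4, 4) (3, 4, 4)
```

With G equal to 1 everywhere the fusion returns the visible light features unchanged, and with G equal to 0 it returns the infrared features, which matches the stated meaning of G as attention to visible light features.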
7. A nighttime infrared and visible image fusion system with synergistic illumination enhancement and modal balance, characterized in that it comprises:
a low-light visible light image processing module, which acquires a visible light image with appropriate brightness and rich features as follows: a dual-branch feature enhancement and decoupling network DFEDNet is constructed, which includes a decomposition module (Decom-Module) and a feature mining network module (FMN-Module); the low-light visible light image is input to the Decom-Module and the FMN-Module, respectively; the Decom-Module decomposes the low-light visible light image into a reflection component R and an illumination component L; the FMN-Module extracts features from the low-light visible light image to obtain a feature map; a comparative structural loss is applied to constrain the reflection component R with the feature map, yielding the constrained reflection component R'; and the constrained reflection component R' is multiplied element-wise with the illumination component L to obtain the visible light image with suitable brightness and rich features;
an AGFNet construction and input preprocessing module, which constructs an adaptive gated fusion network AGFNet including an encoder, a spatial adaptive gating module (AGF-Module), and a decoder, enhances the Y channel of the visible light image, and feeds the visible light image and the infrared image into the encoder for feature extraction to obtain a visible light feature map and an infrared feature map;
a dual-modal feature channel weighting and spatial weight generation module, which performs channel weighting on the visible light feature map and the infrared feature map separately to obtain the concatenated feature map, and inputs the concatenated feature map into the AGF-Module, which generates spatial weights G and 1-G to allocate the contribution ratios of visible light image features and infrared image features, where G represents the degree of attention to visible light features at the corresponding spatial location and 1-G represents the degree of attention to infrared features at the corresponding spatial location; and
a fusion image reconstruction module, which obtains the final fusion features from the spatial weights G and 1-G, the channel-weighted visible light feature map, and the channel-weighted infrared feature map, and inputs the final fusion features together with the Cb and Cr channels of the enhanced visible light image into the decoder, which reconstructs the fused features and combines them with the color information of the Cb and Cr channels to generate the final fused image.
8. An electronic device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that, when the processor executes the program, it implements the nighttime infrared and visible image fusion method with synergistic illumination enhancement and modal balance as described in any one of claims 1-6.
9. A computer-readable storage medium having a computer program stored thereon, which, when executed by a processor, implements the nighttime infrared and visible image fusion method with synergistic illumination enhancement and modal balance as described in any one of claims 1-6.