An unsupervised low-light image enhancement method, system, device, and medium

By combining Retinex theory with a deep-neural-network-based latent-space Retinex diffusion model, the method addresses the poor adaptability of low-light image enhancement to different lighting conditions and achieves high-quality unsupervised image enhancement.

CN117893456B · Active · Publication Date: 2026-06-26 · UNIV OF ELECTRONICS SCI & TECH OF CHINA

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Patents (China)
Current Assignee / Owner: UNIV OF ELECTRONICS SCI & TECH OF CHINA
Filing Date: 2024-01-18
Publication Date: 2026-06-26

AI Technical Summary

Technical Problem

Existing low-light image enhancement methods are difficult to adapt to different lighting conditions. Traditional methods lack robustness, deep learning-based methods suffer from overfitting and poor generalization, and unsupervised learning methods perform poorly in real-world scenarios.

Method used

Combining traditional Retinex theory with deep neural networks, the method uses a latent-space Retinex diffusion model to decompose the features of low-light images into reflectance and illuminance maps. The model is trained on unpaired data by unsupervised learning, and a self-constrained consistency loss function is used to optimize the model and improve its generalization ability.

Benefits of technology

The method improves visual fidelity and overall quality under different lighting conditions, improves the generalization performance of the diffusion model, and achieves effective image enhancement through unsupervised learning.


Abstract

The present application relates to the fields of image processing and computer vision, and discloses an unsupervised low-light image enhancement method and system. The method comprises the following steps: S1, constructing a diffusion model; S2, training the diffusion model using paired low-quality images from a public dataset and the sum of four loss functions; S3, training the diffusion model using arbitrary unpaired low-light and normal-light images and the sum of two loss functions; S4, using the reflectance map of the low-light features and the illuminance map of the normal-light features as the input of the diffusion model, obtaining enhanced features by restoration under the guidance of the low-light features, and using the enhanced features as the input of the decoder to reconstruct the final enhanced image. The system comprises a model construction unit, a model training unit, and an enhanced image output unit. The present application also discloses an electronic device and a computer-readable storage medium. The present application enhances low-light images using traditional Retinex theory and deep neural networks.

Description

Technical Field

[0001] This invention relates to the fields of image processing and computer vision, and in particular to an unsupervised low-light image enhancement method, system, device, and medium that enhance low-light images using traditional Retinex theory and deep neural networks.

Background Technology

[0002] Images captured in low-light environments suffer from various degradation factors, such as low visibility, low contrast, and noise. Converting low-light images into high-quality, normally lit images can help improve the performance of downstream vision tasks and real-world intelligent systems, such as image classification, object detection, autonomous driving, and visual navigation.

[0003] Traditional low-light image enhancement methods primarily rely on manual priors, such as histogram equalization and Retinex Theory. However, low-light image enhancement is an ill-posed problem, making it difficult to adjust these priors to adapt to different lighting conditions. Deep learning-based low-light image enhancement methods utilize powerful neural network architectures to learn the mapping from low-light images to normal-light images in an end-to-end manner. Although deep learning-based methods are more robust than traditional methods, they often suffer from overfitting and poor generalization ability, resulting in unsatisfactory visual fidelity in the enhancement results.

[0004] Recently, generative model-based methods have achieved excellent performance in low-light image enhancement tasks. Among them, diffusion models have attracted much attention due to their powerful generative capabilities and avoidance of instability and mode collapse problems found in previous generative models such as Generative Adversarial Networks (GANs) and Variational Autoencoders (VAEs). However, most diffusion model-based methods rely on supervised learning using large-scale pairwise data and conditional mechanisms, but collecting pairs of real-world low-light / normal-light images is challenging. To leverage the unlabeled nature of unsupervised learning to improve the generalization ability of diffusion models, some methods utilize structural and textural priors from pre-trained diffusion models for reconstruction without training from scratch. However, these methods are limited by known degradation patterns and therefore often perform poorly in real-world scenes where degradation is complex and unknown.

[0005] To address the aforementioned problems, this invention provides an unsupervised low-light image enhancement method, system, device, and medium that use traditional Retinex theory and deep neural networks for low-light image enhancement.

Summary of the Invention

[0006] This invention provides an unsupervised low-light image enhancement method, system, device, and medium for enhancing low-light images using traditional Retinex theory and deep neural networks.

[0007] This invention is achieved through the following technical solution: an unsupervised low-light image enhancement method, comprising the following steps:

[0008] Step S1: Construct a latent-space Retinex diffusion model, which includes a CNN encoder, a content transfer decomposition network, and a CNN decoder connected sequentially from front to back;

[0009] Step S2: Train the CNN encoder, content transfer decomposition network, and CNN decoder using paired low-quality images from a public dataset and the sum of four loss functions;

[0010] Step S3: Train the latent-space Retinex diffusion model using arbitrary unpaired low-light and normal-light images and the sum of two loss functions;

[0011] Step S4: Use the reflectance map of the low-light features and the illuminance map of the normal-light features as the input of the latent-space Retinex diffusion model. Under the guidance of the low-light features, enhanced features are obtained by restoration and are then used as the input of the decoder to reconstruct the final enhanced image.

[0012] To better realize the present invention, the content transfer decomposition network in step S1 further includes the following structure: the content transfer decomposition network includes an upper branch structure and a lower branch structure;

[0013] The upper branch structure includes, from front to back, a first convolutional layer, a second convolutional layer, a cross attention module, a feature addition layer, a third convolutional layer, a fourth convolutional layer, and a first activation function module connected sequentially.

[0014] The lower branch structure includes, from front to back, a fifth convolutional layer, a sixth convolutional layer, a self-attention module, a feature subtraction layer, a seventh convolutional layer, an eighth convolutional layer, and a second activation function module; the sixth convolutional layer and the feature subtraction layer are connected in a skip connection.

[0015] The sixth convolutional layer is connected to the cross-attention module, and the self-attention module is connected to the feature addition layer.

[0016] To better implement the present invention, further, the paired low-quality images in step S2 include pairs of two low-light images, pairs of a low-light image and an overexposed image, and pairs of two overexposed images.

[0017] To better realize the present invention, step S3 further includes:

[0018] A CNN encoder transforms the unpaired low-light image $I_{low}$ and normal-light image $I_{high}$ into the latent space of the latent-space Retinex diffusion model to obtain the corresponding low-light features and normal-light features. Based on Retinex theory, an initialized reflectance map and an initialized illuminance map are then estimated for the low-light features and for the normal-light features.

[0019] The initialized reflectance maps and illuminance maps are then fed into the content transfer decomposition network for optimization. The network uses the cross-attention module to enhance the content information in the initialized reflectance maps with the content information in the illuminance maps, and uses the self-attention module to further extract the content information in the illuminance maps and add it to the reflectance maps. This yields the reflectance maps $R_{low}$, $R_{high}$ and the illuminance maps $L_{low}$, $L_{high}$ of the low-light features and the normal-light features.

[0020] To better realize the present invention, step S4 further includes:

[0021] The reflectance map $R_{low}$ and the illuminance map $L_{high}$ are combined as the input $x_0$ of the latent-space Retinex diffusion model to perform the forward diffusion process;

[0022] Using a predefined variance sequence $\{\beta_1, \beta_2, \ldots, \beta_T\}$, $x_0$ is gradually converted into Gaussian noise over $T$ steps. In the reverse denoising process, the data distribution learned by the latent-space Retinex diffusion model is used, under the guidance of the low-light features, to gradually convert randomly sampled Gaussian noise into high-quality enhanced features. The enhanced features obtained from the reverse denoising process are used as the input of the CNN decoder to reconstruct the final high-quality enhanced image.

[0023] To better realize the present invention, further, the four loss functions in step S2 include a content loss function, a Retinex reconstruction loss function, a reflectance consistency loss function, and an illumination smoothing loss function.

[0024] To better realize the present invention, the two loss functions in step S3 further include the diffusion loss function and the self-constrained consistency loss function.

[0025] This invention also provides an unsupervised low-light image enhancement system, comprising a model building unit, a model training unit, and an enhanced image output unit, wherein:

[0026] The model building unit is used to build a latent-space Retinex diffusion model, which includes a CNN encoder, a content transfer decomposition network, and a CNN decoder connected sequentially from front to back.

[0027] The model training unit is used to train the CNN encoder, content transfer decomposition network, and CNN decoder using paired low-quality images from a public dataset and the sum of four loss functions, and to train the latent-space Retinex diffusion model using arbitrary unpaired low-light and normal-light images and the sum of two loss functions.

[0028] The enhanced image output unit is used to take the reflectance map of the low-light features and the illuminance map of the normal-light features as the input of the latent-space Retinex diffusion model and to obtain enhanced features by restoration under the guidance of the low-light features. These enhanced features are then used as the input of the decoder to reconstruct the final enhanced image.

[0029] The present invention also provides an electronic device comprising a processor and a memory; the processor includes the unsupervised low-light image enhancement system described in the second aspect above.

[0030] The present invention also provides a computer-readable storage medium comprising instructions that, when executed on an electronic device described in the third aspect, cause the electronic device to perform the method described in the first aspect.

[0031] Compared with the prior art, the present invention has the following advantages and beneficial effects:

[0032] (1) This invention provides an unsupervised low-light image enhancement method, system, device, and medium that combine physically interpretable Retinex theory with diffusion models. Training on a large amount of unpaired real-world data to learn degradation representations under different lighting scenarios improves the generalization ability of the diffusion model. An encoder first converts unpaired low-light and normal-light images into the latent space. The constructed content transfer decomposition network then decomposes the encoded features, based on Retinex theory, into a content-rich reflectance map and a content-independent illuminance map.

[0033] (2) This invention provides an unsupervised low-light image enhancement method, system, device, and medium. The reflectance map of the low-light features and the illuminance map of the normal-light features are used as the input of a diffusion model. Under the guidance of the low-light features, enhanced features are obtained by restoration and used as the input of a decoder to reconstruct the final enhanced image. Furthermore, a self-constrained consistency loss is proposed to encourage the diffusion model to reconstruct enhanced features with the same intrinsic content information as the input low-light image, further improving the overall visual quality.

Description of the Drawings

[0034] The present invention will be further described in conjunction with the following drawings and embodiments. All inventive concepts of the present invention should be considered as disclosed content and within the scope of protection of the present invention.

[0035] Figure 1 is a schematic diagram of the latent-space Retinex diffusion model in the unsupervised low-light image enhancement method, system, device, and medium provided in an embodiment of this application;

[0036] Figure 2 is a schematic diagram of the structure of the content transfer decomposition network in the unsupervised low-light image enhancement method, system, device, and medium provided in an embodiment of this application;

[0037] Figure 3 is enhancement result diagram (a) of the unsupervised low-light image enhancement method, system, device, and medium provided in an embodiment of this application;

[0038] Figure 4 is enhancement result diagram (b) of the unsupervised low-light image enhancement method, system, device, and medium provided in an embodiment of this application.

Detailed Implementation

[0039] To more clearly illustrate the technical solutions of the embodiments of the present invention, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. It should be understood that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments, and therefore should not be regarded as a limitation on the scope of protection. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.

[0040] In the description of this invention, it should be noted that, unless otherwise explicitly specified and limited, the terms "set up," "connected," and "linked" should be interpreted broadly. For example, they can refer to a fixed connection, a detachable connection, or an integral connection; they can refer to a mechanical connection or an electrical connection; they can refer to a direct connection or an indirect connection through an intermediate medium; and they can refer to the internal connection of two components. Those skilled in the art can understand the specific meaning of the above terms in this invention based on the specific circumstances.

[0041] Example 1:

[0042] This embodiment provides an unsupervised low-light image enhancement method, system, device, and medium. As shown in Figure 1, this invention proposes an unsupervised low-light image enhancement method based on a latent-space Retinex diffusion model. It combines physically interpretable Retinex theory with a diffusion model, and improves the generalization ability of the diffusion model by training on a large amount of unpaired real-world data to learn degradation representations under different lighting scenarios. First, an encoder transforms unpaired low-light and normal-light images into a latent space. Then, the constructed content transfer decomposition network, based on Retinex theory, decomposes the encoded features into a content-rich reflectance map and a content-independent illuminance map. Subsequently, the reflectance map of the low-light features and the illuminance map of the normal-light features are used as the input of the diffusion model. Guided by the low-light features, enhanced features are obtained by restoration and are then used as the input of the decoder to reconstruct the final enhanced image. Furthermore, a self-constrained consistency loss is proposed to encourage the diffusion model to reconstruct enhanced features with the same intrinsic content information as the input low-light image, further improving the overall visual quality.

[0043] Example 2:

[0044] This embodiment further optimizes upon embodiment 1. We employ a two-stage training approach. In the first stage, we train the CNN encoder, content transfer decomposition network, and CNN decoder using pairs of low-quality images (including pairs of low-light and low-light images, low-light and overexposed images, and overexposed and overexposed images) from a publicly available dataset. In the second stage, we train the diffusion model using arbitrary unpaired low-light and normal-light images, leveraging the unlabeled nature of unsupervised learning to improve the generalization ability of the diffusion model and enhance its performance on low-light images.
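The two-stage schedule can be summarized in a short sketch. The module names and the helper `trainable_modules` are hypothetical labels chosen for illustration. The stage assignments and loss names follow this document's description of the two training stages, where the modules not being trained in a stage are kept frozen.

```python
from typing import Dict, Tuple

# Hypothetical module labels; the document names the components but not identifiers.
MODULES: Tuple[str, ...] = (
    "encoder", "decomposition_network", "decoder", "diffusion_model",
)

# Loss terms summed in each stage, as listed in the document.
STAGE_LOSSES: Dict[int, Tuple[str, ...]] = {
    1: ("content", "retinex_reconstruction",
        "reflectance_consistency", "illumination_smoothing"),
    2: ("diffusion", "self_constrained_consistency"),
}


def trainable_modules(stage: int) -> Dict[str, bool]:
    """Return which modules are updated (True) or frozen (False) in a stage."""
    if stage == 1:
        trained = {"encoder", "decomposition_network", "decoder"}
    elif stage == 2:
        trained = {"diffusion_model"}
    else:
        raise ValueError("stage must be 1 or 2")
    return {name: name in trained for name in MODULES}
```

In a real training loop, the returned flags would typically decide which parameters receive gradient updates; how that is wired depends on the framework in use.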

[0045] The other parts of this embodiment are the same as those in Embodiment 1, so they will not be described again.

[0046] Example 3:

[0047] This embodiment further optimizes embodiment 1 or 2 above. A CNN encoder transforms the unpaired low-light image $I_{low}$ and normal-light image $I_{high}$ into the latent space to obtain the corresponding low-light features and normal-light features. Based on Retinex theory, an initialized reflectance map and an initialized illuminance map are estimated for the low-light features and for the normal-light features. The initialized reflectance maps and illuminance maps are then fed into the constructed content transfer decomposition network for optimization. This network uses a cross-attention module to enhance the content information in the reflectance map with the content information in the illuminance map. It also uses a self-attention module to further extract the content information in the illuminance map and supplement it into the reflectance map, resulting in the final content-rich reflectance maps $R_{low}$, $R_{high}$ and content-independent illuminance maps $L_{low}$, $L_{high}$.

[0048] For low-light images, the input to the content transfer decomposition network is the low-light features together with their initialized reflectance map and illuminance map. The output is the final reflectance map $R_{low}$ and the content-independent illuminance map $L_{low}$.

[0049] For normal-light images, the input to the content transfer decomposition network is the normal-light features together with their initialized reflectance map and illuminance map. The output is the final reflectance map $R_{high}$ and the content-independent illuminance map $L_{high}$.
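Retinex theory models a signal as the element-wise product of a reflectance map and an illuminance map. The sketch below shows one common way to initialize the two maps before refinement. It assumes, which the text does not state, that the illuminance is the per-pixel maximum over feature channels and that the reflectance is the feature divided by that illuminance. The function name `init_retinex`, the nested-list layout, and `eps` are illustrative.

```python
from typing import List, Tuple

Feature = List[List[List[float]]]  # layout: channels x height x width
Grid = List[List[float]]           # layout: height x width


def init_retinex(feature: Feature, eps: float = 1e-4) -> Tuple[Feature, Grid]:
    """Return (reflectance, illuminance) so that feature ~ reflectance * illuminance."""
    channels = len(feature)
    height, width = len(feature[0]), len(feature[0][0])
    # Assumed initialization: illuminance is the per-pixel maximum over channels.
    illum = [
        [max(feature[c][y][x] for c in range(channels)) for x in range(width)]
        for y in range(height)
    ]
    # Reflectance is the feature divided element-wise by the illuminance.
    # eps guards against division by zero in dark regions.
    refl = [
        [[feature[c][y][x] / (illum[y][x] + eps) for x in range(width)]
         for y in range(height)]
        for c in range(channels)
    ]
    return refl, illum
```

Multiplying the returned reflectance by the illuminance recovers the input up to the small error introduced by `eps`.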

[0050] The estimated reflectance map $R_{low}$ of the low-light features and the illuminance map $L_{high}$ of the normal-light features are combined as the input $x_0$ of the diffusion model to perform the forward diffusion process. Using a predefined variance sequence $\{\beta_1, \beta_2, \ldots, \beta_T\}$, $x_0$ is gradually converted into Gaussian noise over $T$ steps. In the reverse denoising process, the data distribution learned by the diffusion model is used, under the guidance of the low-light features, to gradually convert randomly sampled Gaussian noise into high-quality enhanced features.

[0051] The enhanced features obtained from the reverse denoising process are used as the input of the CNN decoder to reconstruct the final high-quality enhanced image. By training on unpaired real-world data, the diffusion model can learn complex and unknown real-world degradation distributions, and thus generalizes well under different lighting conditions.
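The forward diffusion process can be illustrated with the standard closed form used by denoising diffusion models, $x_t = \sqrt{\bar\alpha_t}\,x_0 + \sqrt{1-\bar\alpha_t}\,\epsilon$ with $\bar\alpha_t = \prod_{s \le t}(1-\beta_s)$. The sketch below is a minimal version under assumptions: the linear schedule and its endpoints are common defaults and are not taken from this document, and features are flattened to a list of floats.

```python
import math
import random
from typing import List, Tuple


def linear_betas(T: int, beta_start: float = 1e-4, beta_end: float = 0.02) -> List[float]:
    """Assumed linear variance schedule beta_1..beta_T (endpoints are illustrative)."""
    if T == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (T - 1)
    return [beta_start + i * step for i in range(T)]


def alpha_bars(betas: List[float]) -> List[float]:
    """Cumulative products of (1 - beta_s); entry t-1 holds alpha_bar_t."""
    out, prod = [], 1.0
    for beta in betas:
        prod *= 1.0 - beta
        out.append(prod)
    return out


def q_sample(x0: List[float], t: int, abar: List[float],
             rng: random.Random) -> Tuple[List[float], List[float]]:
    """Draw x_t given x_0 for a 1-indexed step t; returns (x_t, noise)."""
    a = abar[t - 1]
    noise = [rng.gauss(0.0, 1.0) for _ in x0]
    x_t = [math.sqrt(a) * v + math.sqrt(1.0 - a) * n for v, n in zip(x0, noise)]
    return x_t, noise
```

As `t` approaches `T`, the cumulative product shrinks toward zero, so `x_t` is dominated by the noise term, which matches the description of gradually converting $x_0$ into Gaussian noise.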

[0052] The other parts of this embodiment are the same as any one of the embodiments 1-2 above, so they will not be described again.

[0053] Example 4:

[0054] This embodiment further optimizes any one of embodiments 1-3 above. The training of the above model is divided into two stages. In the first stage, the CNN encoder, content transfer decomposition network and CNN decoder are optimized using pairs of low-quality images from the public dataset, and the parameters of the diffusion model are frozen. The loss function used is mainly as follows:

[0055] 1. Content Loss:

[0056]

[0057] Here, $I_1$ and $I_2$ are a pair of low-quality images from the public dataset, $\varepsilon(\cdot)$ denotes the encoder, the loss also involves the decoder, and $\|\cdot\|_2$ denotes the L2 distance.

[0058] The purpose of this loss constraint is to constrain the encoder and decoder to reconstruct predictions that are consistent with the input image.
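The formula itself is not reproduced in this text, so the following is only a sketch of one plausible form: the L2 distance between each input image and its encode-then-decode reconstruction, summed over the two images of a pair. The names `content_loss`, `encoder`, and `decoder` are illustrative, and images are flattened to lists of floats.

```python
import math
from typing import Callable, List, Sequence

Image = List[float]  # flattened image, an illustrative simplification


def l2_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean (L2) distance between two equally sized vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def content_loss(images: Sequence[Image],
                 encoder: Callable[[Image], List[float]],
                 decoder: Callable[[List[float]], Image]) -> float:
    """Assumed form: sum over the images of ||decoder(encoder(I)) - I||_2."""
    return sum(l2_distance(decoder(encoder(img)), img) for img in images)
```

With a perfect autoencoder the loss is zero, which reflects the stated goal of reconstructing predictions consistent with the input image.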

[0059] 2. Retinex Reconstruction Loss:

[0060]

[0061] This loss is based on Retinex theory and constrains the content transfer decomposition network so that the estimated reflectance and illuminance maps can reconstruct the input features. Here, $\|\cdot\|_1$ denotes the L1 distance. The formula also contains an auxiliary term that improves the decomposition quality of reflectance maps under different lighting conditions. This term is calculated as:

[0062]

[0063] 3. Reflectance Consistency Loss:

[0064]

[0065] The purpose of this loss constraint is to ensure that the reflectance map only represents the inherent content information of the image and should remain consistent under different lighting conditions.

[0066] 4. Illumination Smoothing Loss:

[0067]

[0068] The purpose of this loss constraint is to ensure that the illumination map only represents the contrast and brightness information of the image and is locally smooth. Here, $\lambda_g$ is a coefficient that balances the strength of structure awareness.
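A structure-aware smoothness term of this kind is often written as the illumination gradient weighted by $\exp(-\lambda_g \cdot |\text{gradient of a guide map}|)$, so that smoothing is relaxed across structural edges. The sketch below follows that common form as an assumption; the document's exact formula is not reproduced here. The choice of guide map, the default `lambda_g = 10.0`, and the function name are illustrative.

```python
import math
from typing import List

Grid = List[List[float]]  # layout: height x width


def illumination_smoothness_loss(illum: Grid, guide: Grid,
                                 lambda_g: float = 10.0) -> float:
    """Mean of |grad(illum)| * exp(-lambda_g * |grad(guide)|) over forward differences."""
    height, width = len(illum), len(illum[0])
    total, count = 0.0, 0
    for y in range(height):
        for x in range(width):
            # Horizontal (0, 1) and vertical (1, 0) forward differences.
            for dy, dx in ((0, 1), (1, 0)):
                ny, nx = y + dy, x + dx
                if ny < height and nx < width:
                    grad_l = abs(illum[ny][nx] - illum[y][x])
                    grad_g = abs(guide[ny][nx] - guide[y][x])
                    # Strong edges in the guide reduce the smoothness penalty.
                    total += grad_l * math.exp(-lambda_g * grad_g)
                    count += 1
    return total / count if count else 0.0
```

A larger `lambda_g` makes the penalty more tolerant of illumination changes that coincide with edges in the guide map.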

[0069] The loss function for the first stage is the sum of the four loss functions mentioned above:

[0070]

[0071] Here, $\lambda_1$ and $\lambda_2$ are coefficients that balance the strengths of the loss terms.

[0072] The second stage uses arbitrary unpaired low-light and normal-light images to optimize the diffusion model, while freezing the parameters of the encoder, content transfer decomposition network, and decoder. The loss functions used are mainly as follows:

[0073] 1. Diffusion Loss:

[0074]

[0075] Here, $\epsilon_t$ is the given random Gaussian noise, which is compared with the noise estimated by the diffusion model at time $t$; $\epsilon_\theta$ is the noise estimation network used in the diffusion model; $x_t$ is the degraded data obtained at time $t$ by adding noise to the input $x_0$ through the forward diffusion process; and the low-light features serve as the condition.
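A noise-prediction objective of this kind can be sketched as follows. The sketch assumes a mean squared error between the sampled noise and the network's estimate, and it takes the cumulative schedule value `alpha_bar_t` as an argument. The norm, the function name, and the callable signature of `eps_theta` are assumptions, not details taken from this document.

```python
import math
import random
from typing import Callable, List, Sequence

# Hypothetical signature: (x_t, low_light_condition, t) -> predicted noise.
NoiseNet = Callable[[List[float], Sequence[float], int], List[float]]


def diffusion_loss(x0: Sequence[float], cond: Sequence[float], t: int,
                   alpha_bar_t: float, eps_theta: NoiseNet,
                   rng: random.Random) -> float:
    """Mean squared error between sampled noise and the conditional noise estimate."""
    noise = [rng.gauss(0.0, 1.0) for _ in x0]
    # Forward diffusion in closed form: x_t = sqrt(abar) * x_0 + sqrt(1 - abar) * eps.
    x_t = [math.sqrt(alpha_bar_t) * v + math.sqrt(1.0 - alpha_bar_t) * n
           for v, n in zip(x0, noise)]
    predicted = eps_theta(x_t, cond, t)
    return sum((n - p) ** 2 for n, p in zip(noise, predicted)) / len(noise)
```

A network that recovers the injected noise exactly drives this loss to zero; the condition argument is where the low-light features would enter.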

[0076] 2. Self-constrained Consistency Loss:

[0077]

[0078] Here, the pseudo-label features are obtained by applying gamma correction to enhance the illumination of the input low-light features:

[0079]

[0080] Where γ is the illumination correction coefficient. The purpose of this loss constraint is to ensure that the enhanced features have the same intrinsic information as the input low-light features.
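As a rough illustration of how a gamma-corrected pseudo-label could be formed, the sketch below divides the low-light features by the illuminance raised to the power $\gamma$, which brightens regions whose illuminance lies between 0 and 1. This particular form, the mean absolute difference used for the consistency term, and all names are assumptions made for illustration. The document's own formulas are not reproduced in this text.

```python
from typing import List, Sequence


def gamma_pseudo_label(feature: Sequence[float], illum: Sequence[float],
                       gamma: float, eps: float = 1e-4) -> List[float]:
    """Assumed form: feature / illuminance**gamma (illuminance must be non-negative)."""
    return [f / ((l + eps) ** gamma) for f, l in zip(feature, illum)]


def consistency_loss(enhanced: Sequence[float], pseudo: Sequence[float]) -> float:
    """Assumed form: mean absolute difference between enhanced and pseudo-label features."""
    return sum(abs(e - p) for e, p in zip(enhanced, pseudo)) / len(pseudo)
```

With `gamma = 0` the pseudo-label equals the input features, and larger values brighten dark regions more strongly.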

[0081] The loss function in the second stage is the sum of the two loss functions mentioned above:

[0082]

[0083] The other parts of this embodiment are the same as any one of the embodiments 1-3 above, so they will not be described again.

[0084] Example 5:

[0085] This embodiment is a further optimization based on any one of embodiments 1-4 above. As shown in Figure 3, results on a real-world paired dataset are presented first: (a) is the input low-light image, (b)-(l) are the enhancement results of the comparison methods, (m) is the enhancement result of the method of the present invention, and (n) is the reference normal-light image.

[0086] In addition, as shown in Figure 4, enhancement results on real unpaired datasets are presented to verify the generalization performance of the model: (a) is the input low-light image, (b)-(g) are the enhancement results of the comparison methods, and (h) is the enhancement result of the method of the present invention.

[0087] The other parts of this embodiment are the same as any one of the embodiments 1-4 above, so they will not be described again.

[0088] Example 6:

[0089] The present invention also provides an unsupervised low-light image enhancement system that matches the method, including a model building unit, a model training unit, and an enhanced image output unit.

[0090] The present invention also provides an electronic device comprising a processor and a memory; the processor includes the unsupervised low-light image enhancement system described above.

[0091] Example 7:

[0092] The present invention also provides a computer-readable storage medium comprising instructions; when the instructions are executed on the electronic device described in the above embodiments, they cause the electronic device to perform the methods described in the above embodiments. Optionally, the computer-readable storage medium may be a memory.

[0093] The processor involved in the embodiments of this application can be a chip. For example, it can be a field programmable gate array (FPGA), an application specific integrated circuit (ASIC), a system on chip (SoC), a central processor unit (CPU), a network processor (NP), a digital signal processor (DSP), a microcontroller unit (MCU), a programmable logic device (PLD), or other integrated chips.

[0094] The memory involved in the embodiments of this application can be volatile memory or non-volatile memory, or may include both volatile and non-volatile memory. The non-volatile memory can be read-only memory (ROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), or flash memory. The volatile memory can be random access memory (RAM), which is used as an external cache. By way of example, but not limitation, many forms of RAM are available, such as static random access memory (SRAM), dynamic random access memory (DRAM), synchronous dynamic random access memory (SDRAM), double data rate synchronous dynamic random access memory (DDR SDRAM), enhanced synchronous dynamic random access memory (ESDRAM), synchronous linked dynamic random access memory (SLDRAM), and direct rambus RAM (DR RAM). It should be noted that the memory used in the systems and methods described herein is intended to include, but is not limited to, these and any other suitable types of memory.

[0095] It should be understood that in the various embodiments of this application, the order of the above-mentioned processes does not imply the order of execution. The execution order of each process should be determined by its function and internal logic, and should not constitute any limitation on the implementation process of the embodiments of this application.

[0096] Those skilled in the art will recognize that the modules and algorithm steps of the various examples described in conjunction with the embodiments disclosed herein can be implemented in electronic hardware, or a combination of computer software and electronic hardware. Whether these functions are implemented in hardware or software depends on the specific application and design constraints of the technical solution. Those skilled in the art can use different methods to implement the described functions for each specific application, but such implementation should not be considered beyond the scope of this application.

[0097] Those skilled in the art will clearly understand that, for the sake of convenience and brevity, the specific working processes of the systems, devices, and modules described above can be referred to the corresponding processes in the foregoing method embodiments, and will not be repeated here.

[0098] In the several embodiments provided in this application, it should be understood that the disclosed systems, devices, and methods can be implemented in other ways. For example, the device embodiments described above are merely illustrative; for instance, the division of modules is only a logical functional division, and in actual implementation, there may be other division methods. For example, multiple modules or components may be combined or integrated into another device, or some features may be ignored or not executed. Furthermore, the coupling or direct coupling or communication connection shown or discussed may be through some interfaces; the indirect coupling or communication connection between devices or modules may be electrical, mechanical, or other forms.

[0099] The modules described as separate components may or may not be physically separate. The components shown as modules may or may not be physical modules; that is, they may be located on one device or distributed across multiple devices. Some or all of the modules can be selected to achieve the purpose of this embodiment according to actual needs.

[0100] In addition, the functional modules in the various embodiments of this application can be integrated into one device, or each module can exist physically separately, or two or more modules can be integrated into one device.

[0101] In the above embodiments, implementation can be achieved, in whole or in part, through software, hardware, firmware, or any combination thereof. When implemented using software programs, implementation can be, in whole or in part, in the form of a computer program product. This computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, all or part of the processes or functions described in the embodiments of this application are generated. The computer can be a general-purpose computer, a special-purpose computer, a computer network, or other programmable device. The computer instructions can be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another. For example, the computer instructions can be transmitted from one website, computer, server, or data center to another website, computer, server, or data center via wired (e.g., coaxial cable, fiber optic, Digital Subscriber Line (DSL)) or wireless (e.g., infrared, wireless, microwave, etc.) means. The computer-readable storage medium can be any available medium accessible to a computer or a data storage device containing one or more servers, data centers, etc., that can be integrated with the medium. The available media can be magnetic media (e.g., floppy disks, hard disks, magnetic tapes), optical media (e.g., DVDs), or semiconductor media (e.g., solid-state drives (SSDs)).

[0102] The above description is merely a preferred embodiment of the present invention and is not intended to limit the present invention in any way. Any simple modifications or equivalent changes made to the above embodiments based on the technical essence of the present invention shall fall within the protection scope of the present invention.

Claims

1. An unsupervised low-light image enhancement method, characterized in that it includes the following steps:
Step S1: construct a latent space-retinal diffusion model, which includes a CNN encoder, a content transfer decomposition network, and a CNN decoder connected sequentially from front to back;
Step S2: train the CNN encoder, the content transfer decomposition network, and the CNN decoder using pairs of low-quality images from a public dataset and the sum of four loss functions;
Step S3: train the latent space-retinal diffusion model using arbitrary unpaired low-light and normal-light images and the sum of two loss functions; the CNN encoder maps the unpaired low-light image and normal-light image into the latent space of the latent space-retinal diffusion model to obtain the corresponding low-light features and normal-light features; based on retinal (Retinex) theory, an initial reflectance map and an initial illuminance map are estimated for each of the low-light features and the normal-light features; the initial reflectance maps and illuminance maps are then fed into the content transfer decomposition network for optimization; the content transfer decomposition network uses a cross-attention module to reinforce the content information of the initial reflectance maps with the content information in the illuminance maps, and uses a self-attention module to further extract the content information in the illuminance maps and add it to the reflectance maps, thereby obtaining the reflectance maps and illuminance maps of the low-light features and the normal-light features;
Step S4: the reflectance map of the low-light features and the illuminance map of the normal-light features are used as the input of the latent space-retinal diffusion model, enhanced features are recovered under the guidance of the low-light features, and the enhanced features are used as the input of the CNN decoder to reconstruct the final enhanced image.
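For illustration only, the initialization in step S3 can be sketched as follows. The claim states that initial reflectance and illuminance maps are estimated from the latent features, but it does not fix the estimator. This sketch assumes non-negative features and initializes the illuminance as the per-pixel channel maximum, a common heuristic in Retinex-style methods. The function name `init_retinex`, the `eps` constant, and the random stand-in features are hypothetical.

```python
import numpy as np

def init_retinex(feat, eps=1e-4):
    """Split a non-negative feature map (C, H, W) into reflectance and illuminance.

    Assumption: illuminance is initialized as the per-pixel channel maximum;
    the claim itself does not specify how the initial maps are estimated.
    """
    illum = feat.max(axis=0, keepdims=True)   # (1, H, W)
    reflect = feat / (illum + eps)            # (C, H, W), values in [0, 1)
    return reflect, illum

rng = np.random.default_rng(0)
# Random stand-ins for the encoder outputs (hypothetical shapes and ranges).
low_feat = rng.uniform(0.0, 0.2, size=(8, 16, 16))
normal_feat = rng.uniform(0.2, 1.0, size=(8, 16, 16))

r_low, l_low = init_retinex(low_feat)
r_norm, l_norm = init_retinex(normal_feat)

# Retinex model: the feature is approximately reflectance * illuminance.
max_err = np.abs(r_low * l_low - low_feat).max()
print(f"max reconstruction error: {max_err:.2e}")
```

In the claimed method these initial maps are only a starting point; the content transfer decomposition network refines them before the diffusion stage.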

2. The unsupervised low-light image enhancement method according to claim 1, characterized in that the content transfer decomposition network in step S1 includes an upper branch structure and a lower branch structure; the upper branch structure includes, connected sequentially from front to back, a first convolutional layer, a second convolutional layer, a cross-attention module, a feature addition layer, a third convolutional layer, a fourth convolutional layer, and a first activation function module; the lower branch structure includes, connected sequentially from front to back, a fifth convolutional layer, a sixth convolutional layer, a self-attention module, a feature subtraction layer, a seventh convolutional layer, an eighth convolutional layer, and a second activation function module; the sixth convolutional layer is connected to the feature subtraction layer by a skip connection; the sixth convolutional layer is also connected to the cross-attention module, and the self-attention module is connected to the feature addition layer.
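The wiring recited in claim 2 can be traced in a small NumPy sketch. Several details are assumptions made only so the sketch runs: convolutions are modelled as 1x1 convolutions (per-token linear maps) with random weights, attention has no learned projections, both activations are sigmoids, the upper branch is taken to carry the reflectance map and the lower branch the illuminance map, and the subtraction is taken as the sixth convolution output minus the self-attention output. The class name `ContentTransferSketch` is hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(q_src, kv_src):
    """Scaled dot-product attention without learned projections (simplification)."""
    scores = q_src @ kv_src.T / np.sqrt(q_src.shape[-1])
    return softmax(scores) @ kv_src

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class ContentTransferSketch:
    def __init__(self, channels, seed=0):
        rng = np.random.default_rng(seed)
        # Eight "convolutions", modelled as 1x1 convs acting on (N, C) tokens.
        self.w = [rng.normal(0.0, channels ** -0.5, (channels, channels))
                  for _ in range(8)]

    def __call__(self, reflect, illum):
        w = self.w
        u = reflect @ w[0] @ w[1]          # upper branch: conv1, conv2
        v = illum @ w[4] @ w[5]            # lower branch: conv5, conv6
        sa = attention(v, v)               # self-attention on the lower branch
        ca = attention(u, v)               # cross-attention fed by conv6 output
        u = ca + sa                        # feature addition layer
        v = v - sa                         # feature subtraction, skip from conv6
        r_out = sigmoid(u @ w[2] @ w[3])   # conv3, conv4, first activation
        l_out = sigmoid(v @ w[6] @ w[7])   # conv7, conv8, second activation
        return r_out, l_out

rng = np.random.default_rng(1)
tokens_r = rng.uniform(size=(64, 8))   # 8x8 spatial grid flattened, 8 channels
tokens_l = rng.uniform(size=(64, 8))
r_out, l_out = ContentTransferSketch(channels=8)(tokens_r, tokens_l)
print(r_out.shape, l_out.shape)
```

The point of the topology is that whatever content the self-attention module extracts from the lower branch is added to the upper branch and removed from the lower branch, which is one reading of "content transfer".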

3. The unsupervised low-light image enhancement method according to claim 1, characterized in that the pairs of low-quality images in step S2 include pairs of a low-light image and a low-light image, pairs of a low-light image and an overexposed image, and pairs of an overexposed image and an overexposed image.

4. The unsupervised low-light image enhancement method according to claim 1, characterized in that step S4 includes: combining the reflectance map of the low-light features and the illuminance map of the normal-light features as the input of the latent space-retinal diffusion model to perform the forward diffusion process; using a predefined variance sequence, converting the input step by step into Gaussian noise; in the reverse denoising process, using the data distribution learned by the latent space-retinal diffusion model and under the guidance of the low-light features, gradually converting randomly sampled Gaussian noise into high-quality enhanced features; and using the enhanced features obtained from the reverse denoising process as the input of the CNN decoder to reconstruct the final high-quality enhanced image.
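The forward and reverse processes of claim 4 can be sketched with the standard DDPM formulation, which is assumed here because the claim only names a predefined variance sequence. The linear schedule values, the function names, and the `denoiser(x, t, cond)` callable (standing in for the trained network guided by the low-light features) are all hypothetical.

```python
import numpy as np

def make_schedule(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Predefined variance sequence (linear schedule, an assumption)."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    alpha_bars = np.cumprod(1.0 - betas)
    return betas, alpha_bars

def q_sample(x0, t, alpha_bars, noise):
    """Forward diffusion in closed form; index t means t + 1 noising steps."""
    return np.sqrt(alpha_bars[t]) * x0 + np.sqrt(1.0 - alpha_bars[t]) * noise

def p_step(x_t, t, eps_pred, betas, alpha_bars, rng):
    """One reverse denoising step given the predicted noise eps_pred."""
    mean = (x_t - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps_pred)
    mean = mean / np.sqrt(1.0 - betas[t])
    if t == 0:
        return mean                      # no noise is added at the last step
    return mean + np.sqrt(betas[t]) * rng.standard_normal(x_t.shape)

def reverse_process(denoiser, cond, shape, betas, alpha_bars, rng):
    """Start from Gaussian noise and denoise under the guidance of cond."""
    x = rng.standard_normal(shape)
    for t in reversed(range(len(betas))):
        x = p_step(x, t, denoiser(x, t, cond), betas, alpha_bars, rng)
    return x

betas, alpha_bars = make_schedule()
rng = np.random.default_rng(0)
x0 = rng.standard_normal((4, 8, 8))      # stand-in for the combined input
eps = rng.standard_normal(x0.shape)

# After the final step almost no signal remains: x_T is close to pure noise.
x_T = q_sample(x0, len(betas) - 1, alpha_bars, eps)
print(f"remaining signal scale: {np.sqrt(alpha_bars[-1]):.4f}")

# With the true noise as the prediction, one reverse step undoes one forward step.
x1 = q_sample(x0, 0, alpha_bars, eps)
x0_rec = p_step(x1, 0, eps, betas, alpha_bars, rng)
```

In practice `eps_pred` would come from the trained denoising network conditioned on the low-light features, and the output of `reverse_process` would be passed to the CNN decoder.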

5. The unsupervised low-light image enhancement method according to claim 1, characterized in that the four loss functions in step S2 include a content loss function, a retinal reconstruction loss function, a reflectance consistency loss function, and an illuminance smoothing loss function.
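As a rough illustration of how the four loss terms of claim 5 could be summed, the sketch below uses common Retinex-style forms. These forms are assumptions, not the definitions given in the specification: L1 distance between a decoded image and its input for the content loss, L1 distance between the product of reflectance and illuminance and the encoded feature for the reconstruction loss, L1 distance between the reflectance maps of a pair for the consistency loss, and total variation for the smoothing loss. Equal weighting and all function names are also hypothetical.

```python
import numpy as np

def l1(a, b):
    return float(np.mean(np.abs(a - b)))

def content_loss(decoded, image):
    # Assumed form: the decoder should reproduce the input image.
    return l1(decoded, image)

def retinal_reconstruction_loss(reflect, illum, feat):
    # Assumed form: reflectance * illuminance should rebuild the feature.
    return l1(reflect * illum, feat)

def reflectance_consistency_loss(reflect_a, reflect_b):
    # Assumed form: both images of a pair share one reflectance.
    return l1(reflect_a, reflect_b)

def illuminance_smoothing_loss(illum):
    # Assumed form: total variation over the two spatial axes.
    dx = np.abs(np.diff(illum, axis=-1))
    dy = np.abs(np.diff(illum, axis=-2))
    return float(dx.mean() + dy.mean())

def stage_one_loss(decoded, image, reflect_a, illum_a, feat_a,
                   reflect_b, illum_b, feat_b):
    """Unweighted sum of the four loss terms for one image pair (a, b)."""
    return (
        content_loss(decoded, image)
        + retinal_reconstruction_loss(reflect_a, illum_a, feat_a)
        + retinal_reconstruction_loss(reflect_b, illum_b, feat_b)
        + reflectance_consistency_loss(reflect_a, reflect_b)
        + illuminance_smoothing_loss(illum_a)
        + illuminance_smoothing_loss(illum_b)
    )

rng = np.random.default_rng(0)
reflect = rng.uniform(size=(3, 8, 8))
illum = np.full((1, 8, 8), 0.5)          # perfectly smooth illuminance
feat = reflect * illum
total = stage_one_loss(feat, feat, reflect, illum, feat, reflect, illum, feat)
print(total)
```

A perfectly consistent decomposition with flat illuminance drives every term to zero, which is the behaviour the four terms are meant to encourage jointly.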

6. The unsupervised low-light image enhancement method according to claim 1, characterized in that the two loss functions in step S3 include a diffusion loss function and a self-constrained consistency loss function.

7. An unsupervised low-light image enhancement system, characterized in that it includes a model building unit, a model training unit, and an enhanced image output unit, wherein:
the model building unit is used to build a latent space-retinal diffusion model, which includes a CNN encoder, a content transfer decomposition network, and a CNN decoder connected sequentially from front to back;
the model training unit is used to train the CNN encoder, the content transfer decomposition network, and the CNN decoder using pairs of low-quality images from a public dataset and the sum of four loss functions, and to train the latent space-retinal diffusion model using arbitrary unpaired low-light and normal-light images and the sum of two loss functions; the CNN encoder maps the unpaired low-light image and normal-light image into the latent space of the latent space-retinal diffusion model to obtain the corresponding low-light features and normal-light features; based on retinal theory, an initial reflectance map and an initial illuminance map are estimated for each of the low-light features and the normal-light features; the initial reflectance maps and illuminance maps are then fed into the content transfer decomposition network for optimization; the content transfer decomposition network uses a cross-attention module to reinforce the content information of the initial reflectance maps with the content information in the illuminance maps, and uses a self-attention module to further extract the content information in the illuminance maps and add it to the reflectance maps, thereby obtaining the reflectance maps and illuminance maps of the low-light features and the normal-light features;
the enhanced image output unit is used to take the reflectance map of the low-light features and the illuminance map of the normal-light features as the input of the latent space-retinal diffusion model, recover enhanced features under the guidance of the low-light features, and use the enhanced features as the input of the CNN decoder to reconstruct the final enhanced image.

8. An electronic device, characterized in that it includes a processor and a memory; the processor is used to run the unsupervised low-light image enhancement system according to claim 7.

9. A computer-readable storage medium, characterized in that the computer-readable storage medium includes instructions that, when executed on the electronic device of claim 8, cause the electronic device to perform the method of any one of claims 1-6.