Underwater image enhancement method based on hybrid expert and state space model
By using an end-to-end deep neural network based on a hybrid expert and state-space model, the adaptability and efficiency issues of underwater image enhancement methods in complex environments are solved. This achieves efficient processing of color distortion and detail blurring, improving the model's generalization ability and image quality.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- 安徽交检交通发展研究中心有限责任公司
- Filing Date
- 2026-03-10
- Publication Date
- 2026-06-19
AI Technical Summary
Existing underwater image enhancement methods are not adaptable enough to the complex and ever-changing underwater environment, making it difficult to simultaneously handle color distortion and detail blurring, and it is also difficult to balance computational efficiency and performance.
An end-to-end deep neural network based on a hybrid expert and state-space model is adopted. Image degradation is adaptively processed through the hybrid expert state-space module, and multi-scale features are integrated using a dual-gated cross-attention fusion module. The network is trained by combining pixel-domain and frequency-domain loss functions.
It achieves adaptive and refined processing of complex degradation patterns, improves the model's generalization ability and robustness, and outputs high-quality enhanced images suitable for real-time applications.
Figure CN122243773A_ABST
Description
Technical Field
[0001] This invention relates to the field of image processing technology, and in particular to an underwater image enhancement method based on a hybrid expert and state-space model.
Background Technology
[0002] Underwater images commonly suffer from color distortion, low contrast, and blurred details due to light absorption and scattering in water, as well as the influence of suspended particles, severely impacting the accuracy of underwater detection and observation applications. Existing enhancement methods mainly fall into two categories: methods based on physical models rely on precise parameter estimation, but lack generalization ability and robustness in complex and variable underwater environments; while data-driven deep learning methods have made progress, they still face key challenges.
[0003] These challenges mainly include: insufficient adaptability of models to complex, variable, and non-uniform underwater degradation patterns, making it difficult to handle color distortion and detail blurring simultaneously; the network is prone to losing detailed information during feature extraction and fusion, and existing multi-scale feature fusion mechanisms are inefficient; the model's generalization ability to unseen underwater scenes or new degradation types is limited; furthermore, network performance and computational efficiency are often difficult to balance, restricting real-time applications. Therefore, there is an urgent need for a novel underwater image enhancement method that can adaptively handle complex degradation, efficiently fuse features, generalize well, and remain computationally efficient.
Summary of the Invention
[0004] The purpose of this invention is to solve the technical problems existing in current underwater image enhancement methods, such as insufficient adaptability to complex degradation patterns, low efficiency of feature extraction and fusion, limited model generalization ability, and difficulty in balancing computational efficiency and performance, and to provide an underwater image enhancement method based on hybrid expert and state space model.
[0005] To achieve the above objectives, this application adopts the following technical solution: an underwater image enhancement method based on a hybrid expert and state-space model, comprising the following steps:
[0006] S1. Constructing an image enhancement model: Construct an end-to-end deep neural network containing an encoder, a bottleneck layer, and a decoder as the image enhancement model; the encoder consists of multiple cascaded hybrid expert state space modules used to extract deep features from the input underwater image;
[0007] S2. Core Model Processing: The underwater image to be enhanced is input into the image enhancement model; the hybrid expert state space module performs the following sequential operations on the input features: noise-injection-based gated routing to dynamically select and activate experts, the activated visual state space experts perform state space modeling of the features including semantic sequence rearrangement, and gated fusion of the outputs of each expert;
[0008] S3. Feature Fusion and Reconstruction: Between corresponding levels of the encoder and decoder, a dual-gated cross-attention fusion module is used to fuse the deep semantic features output by the encoder with the detailed features from shallow or skip connections; the decoder upsamples and reconstructs the fused features to output an enhanced image.
[0009] S4. Model Training and Inference: The image enhancement model is trained using a hybrid loss function that includes pixel domain loss and frequency domain loss; the trained model is then used to enhance the input underwater degraded image.
[0010] Preferably, in step S1, the processing procedure of the hybrid expert state space module specifically includes:
[0011] S11. Gated Routing: A lightweight gating network computes the routing weights for the input features in three stages that include a noise perturbation: first, feature aggregation is performed on the input features to obtain a global descriptor; next, the routing score with noise injection is calculated; finally, Top-k sparse selection is performed to generate the normalized gating weights;
[0012] S12. Expert Handling: Based on the normalized gating weights, the input features are dynamically routed to one or more of N parallel visual state space experts; each expert has a differentiated parameter configuration for targeted processing of different types of image degradation;
[0013] S13. State-space modeling: Within each activated expert, layer normalization is performed on the input feature sequence, which is then decoupled into a content branch and a gating branch; the content branch is rearranged based on semantic similarity, and discretized state-space computation is performed on the rearranged sequence;
[0014] S14. Feature Output: The feature sequence produced by the state-space computation is restored to its original order and output after gated fusion with the gating branch features.
[0015] Preferably, in step S11, the routing score with noise injection, $V$, is calculated as:
[0016] $V = W_g\, g + \epsilon \odot \sigma(W_n\, g)$;
[0017] where $g$ is the global feature descriptor, $W_g$ and $W_n$ are learnable weight matrices, $\epsilon$ is a random noise vector sampled from a standard normal distribution, $\odot$ denotes element-wise multiplication, and $\sigma$ is a smooth activation function;
[0018] The normalized gating weights $G$ are obtained through the following formula:
[0019] $G = \mathrm{Softmax}(\mathrm{TopK}(V, k))$;
[0020] where $\mathrm{TopK}(\cdot, k)$ denotes the operation of retaining the $k$ highest scores and setting the remainder to negative infinity, and $k$ is the preset number of activated experts.
[0021] Preferably, in step S12, the differentiated parameter configuration of the $i$-th expert consists of its internal rank, convolution kernel size, state-space dimension, and channel expansion ratio; the targeted processing specifically includes:
[0022] For high-frequency detail regions in an image, priority is given to routing to experts with a higher internal rank and a smaller convolution kernel size;
[0023] For low-frequency regions with color distortion, priority is given to routing to experts with a larger state-space dimension and a larger convolution kernel size;
[0024] For regular regions, routing is performed to the default expert with balanced parameters;
[0025] For complex degraded regions, routing is performed to experts with a higher channel expansion ratio.
[0026] Preferably, in step S13, the sequence rearrangement mechanism based on semantic similarity specifically includes:
[0027] Preset a learnable prompt pool containing a fixed number of semantic prototypes;
[0028] For each token in the input content branch sequence $U$, the matching degree between the token and each prototype in the prompt pool is calculated, and a one-hot encoded routing matrix $R$ is generated by Gumbel-Softmax sampling, its size being set by the sequence length and the number of prototypes;
[0029] Based on the semantic categories indicated by the routing matrix $R$, semantically similar tokens in the original sequence are aggregated to generate a new sequence in which semantically similar tokens are adjacent.
[0030] Preferably, in step S13, the discretized state-space calculation process is as follows:
[0031] For the rearranged sequence, the continuous-time state-space equations are discretized using the zero-order hold technique, and the discrete state transition matrix $\bar{A}$ and input matrix $\bar{B}$ are calculated as:
[0032] $\bar{A} = \exp(\Delta A)$;
[0033] $\bar{B} = (\Delta A)^{-1}\left(\exp(\Delta A) - I\right)\Delta B$;
[0034] where $A$ is the evolution matrix, $B$ is the projection matrix, $\Delta$ is the time-scale parameter that varies adaptively with the input, and $I$ is the identity matrix;
[0035] The recursive state is updated as follows: $h_t = \bar{A}\, h_{t-1} + \bar{B}\, x_t$, where $h_t$ is the hidden state and $x_t$ is the $t$-th element of the rearranged sequence.
[0036] Preferably, in step S13, during the output stage of state-space computation, the attention state-space observation equation is constructed:
[0037] ;
[0038] in which the terms denote, respectively, the base output matrix, the projection matrix, and the instance-level dynamic prompts extracted from the prompt pool based on the routing matrix;
[0039] After the computed feature sequence is reverse-permuted to restore its original order, it is fused with the gating branch features to obtain the expert's final output features. The calculation formula is:
[0040] ;
[0041] in which the terms denote, respectively, the activation function, element-wise multiplication, and a linear projection layer.
[0042] Preferably, in step S3, the fusion process of the dual-gated cross-attention fusion module is as follows:
[0043] Layer normalization is performed separately on the deep semantic features of the encoder and the current features of the decoder;
[0044] The query vector, key vector, and value vector are generated through linear mapping;
[0045] The fused features are calculated as:
[0046] ;
[0047] in which the terms denote, respectively, the output projection matrix, the scaling factor, and two learnable bidirectional gating coefficients.
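[0048] Preferably, in step S4, the hybrid loss function $\mathcal{L}$ is a weighted sum of the pixel domain loss $\mathcal{L}_{pix}$ and the frequency domain loss $\mathcal{L}_{freq}$:
[0049] $\mathcal{L} = \mathcal{L}_{pix} + \lambda\, \mathcal{L}_{freq}$;
[0050] where $\lambda$ is the preset balance coefficient.
[0051] Preferably, the frequency domain loss $\mathcal{L}_{freq}$ is calculated as follows: a two-dimensional fast Fourier transform is applied to the network's predicted image $\hat{Y}$ and to the clear ground-truth image $Y$, and the $\ell_1$-norm differences of their real and imaginary parts are calculated:
[0052] $\mathcal{L}_{freq} = \left\| \mathrm{Re}(\mathcal{F}(\hat{Y})) - \mathrm{Re}(\mathcal{F}(Y)) \right\|_1 + \left\| \mathrm{Im}(\mathcal{F}(\hat{Y})) - \mathrm{Im}(\mathcal{F}(Y)) \right\|_1$;
[0053] where $\mathcal{F}$ represents the two-dimensional fast Fourier transform, and $\mathrm{Re}$ and $\mathrm{Im}$ represent the real and imaginary parts of a complex number, respectively.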
[0054] The technical effects and advantages of this invention are as follows:
[0055] This invention integrates a hybrid expert mechanism and a visual state space model, efficiently solving the challenges of global color shift correction and local detail restoration in underwater image enhancement while maintaining linear computational complexity. The visual state space model endows the network with the ability to capture global long-distance dependencies, fundamentally alleviating the limited receptive field of traditional convolutional networks and ensuring global consistency of color correction. Simultaneously, the integration of a noise-injected intelligent gating network and a functionally specialized expert pool enables the model to dynamically activate the optimal expert combination based on the degradation characteristics of local image regions, achieving adaptive and refined processing of non-uniform and complex degradation patterns. Furthermore, the dual-gated cross-attention module optimizes multi-scale feature fusion, effectively suppressing noise and preserving key details. The hybrid loss function, which combines pixel-domain and frequency-domain terms, further ensures high quality in visual naturalness and texture fidelity of the enhanced image. Together, these significantly improve the model's generalization ability and robustness to varying underwater environments, providing a clearer and more reliable data foundation for subsequent advanced vision tasks.
Attached Figure Description
[0056] The disclosure of this invention is illustrated with reference to the accompanying drawings. It should be understood that the drawings are for illustrative purposes only and are not intended to limit the scope of protection of this invention. In the drawings, the same reference numerals are used to refer to the same parts:
[0057] Figure 1 is a diagram of the overall network architecture of the present invention;
[0058] Figure 2 is a block diagram of the hybrid expert state space module structure of the present invention;
[0059] Figure 3 is a block diagram of the hybrid expert gating routing and expert pool structure of the present invention.
Detailed Implementation
[0060] It is readily understood that, based on the technical solution of this invention, those skilled in the art can propose various interchangeable structural methods and implementations without altering the essential spirit of the invention. Therefore, the following detailed embodiments and accompanying drawings are merely illustrative examples of the technical solution of this invention and should not be considered as the entirety of the invention or as limitations or restrictions on the technical solution of this invention.
[0061] As shown in Figures 1-3, this invention provides an underwater image enhancement method based on a hybrid expert and state-space model, aiming to solve problems such as complex underwater image degradation, poor adaptability of existing methods, and insufficient detail recovery. This method constructs an end-to-end deep neural network, utilizes a hybrid expert state-space module to adaptively process different degradation modes, and efficiently integrates multi-scale features through a dual-gated cross-attention fusion module, ultimately outputting a high-quality enhanced image.
[0062] The method of the present invention includes the following main steps:
[0063] S1. Constructing an image enhancement model:
[0064] This invention constructs an end-to-end deep neural network as an image enhancement model, which adopts an encoder-bottleneck layer-decoder architecture.
[0065] The encoder consists of multiple cascaded hybrid expert state space modules; each hybrid expert state space module is designed to extract deep features from the input underwater image and can adaptively handle different types of image degradation.
[0066] The bottleneck layer connects the encoder and decoder and typically contains one or more hybrid expert state space modules for further extraction of the highest-level abstract features.
[0067] The decoder is used to upsample and reconstruct the features extracted by the encoder, and finally output an enhanced image. Between the corresponding levels of the encoder and decoder, feature fusion is performed through a dual-gated cross-attention fusion module to effectively integrate feature information from different levels.
[0068] S2, Core Model Processing:
[0069] The underwater image to be enhanced is input into the image enhancement model. The hybrid expert state space module is the core component of this invention. It performs the following operations on the input features in sequence: First, it performs noise injection-based gating routing. This routing mechanism uses a lightweight gating network to dynamically select and activate one or more visual state space experts according to the characteristics of the input features. This dynamic selection enables the model to specifically handle image degradation in different regions or of different types.
[0070] Secondly, the activated visual state space experts perform state space modeling of the features, which includes semantic sequence rearrangement; each expert uses a state space model to capture the long-distance dependencies of the feature sequence and optimizes the sequence processing through the semantic sequence rearrangement mechanism.
[0071] Finally, the outputs of each expert are gated and fused; the outputs of multiple activated experts are weighted and combined through the gated fusion mechanism to form the final output features of the hybrid expert state space module.
[0072] S3, Feature Fusion and Reconstruction:
[0073] Between corresponding levels of the encoder and decoder, a dual-gated cross-attention fusion module is used to fuse deep semantic features output by the encoder with detailed features from shallow or skip connections. This fusion method can effectively integrate features at different scales and semantic levels, avoid loss of details, and suppress noise.
[0074] The decoder upsamples and reconstructs the fused features to gradually restore the spatial resolution of the image and finally outputs the enhanced image.
[0075] S4. Model Training and Inference: The image enhancement model is trained using a hybrid loss function that includes pixel-domain loss and frequency-domain loss. This hybrid loss function can simultaneously constrain the quality of the enhanced image at the pixel level and in the frequency domain, thereby obtaining a more natural and detailed enhancement effect.
[0076] The trained model is used to enhance input underwater degraded images, outputting high-quality enhanced images.
[0077] The key modules and mechanisms in the above steps will be described in more detail below.
[0078] The hybrid expert state space module has the following internal structure and processing flow:
[0079] S11. Gated Router:
[0080] A lightweight gating network computes the routing weights for the input features; the gating network performs a three-stage computation that includes a noise perturbation:
[0081] First, feature aggregation is performed on the input features $X$ to obtain a global descriptor $g$. Specifically, global average pooling and global max pooling are performed in parallel and their results are summed, yielding a global feature descriptor that accounts for both background smoothness information and texture saliency information. The calculation formula is as follows:
[0082] $g = \mathrm{GAP}(X) + \mathrm{GMP}(X)$;
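The pooling step described above can be sketched in a few lines. This is a minimal illustration in plain Python, assuming a feature map stored as nested lists in `[channel][row][col]` layout; the function name `global_descriptor` and the sample values are hypothetical and not taken from the source.

```python
from typing import List

FeatureMap = List[List[List[float]]]  # assumed layout: [channel][row][col]


def global_descriptor(x: FeatureMap) -> List[float]:
    """Sum of global average pooling and global max pooling, per channel."""
    descriptor = []
    for channel in x:
        values = [v for row in channel for v in row]
        average = sum(values) / len(values)  # smooth background statistic
        peak = max(values)                   # most salient response
        descriptor.append(average + peak)
    return descriptor


# Hypothetical 2-channel, 2x2 feature map.
x = [[[1.0, 3.0], [5.0, 7.0]],
     [[0.0, 0.0], [0.0, 2.0]]]
g = global_descriptor(x)
```

The result has one entry per channel, so its length does not depend on the spatial size of the input.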
[0083] Secondly, the routing score with noise injection, $V$, is calculated;
[0084] To enhance the robustness of the gating mechanism and promote expert load balancing, learnable random noise is introduced when generating the expert scores; two learnable weight matrices are defined: the gating weight matrix $W_g$ and the noise weight matrix $W_n$. The original score and the noise component together constitute the unnormalized routing score $V$, which is calculated as follows:
[0085] $V = W_g\, g + \epsilon \odot \sigma(W_n\, g)$;
[0086] where $g$ is the global feature descriptor, $\epsilon$ is a random noise vector sampled from a standard normal distribution, $\odot$ denotes element-wise multiplication, and $\sigma$ is a smooth activation function used to keep the noise amplitude non-negative;
[0087] Finally, Top-k sparse selection is performed to generate the normalized gating weights. This step retains only the $k$ highest-scoring experts and forces the scores of the remaining experts to negative infinity; the final normalized gating weight vector $G$ is calculated using the Softmax function:
[0088] $G = \mathrm{Softmax}(\mathrm{TopK}(V, k))$;
[0089] where $\mathrm{TopK}(\cdot, k)$ denotes the operation of retaining the $k$ highest scores and setting the remainder to negative infinity, and $k$ is the preset number of activated experts.
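The three-stage routing resembles the noisy Top-k gating commonly used in mixture-of-experts networks. The sketch below illustrates the idea under stated assumptions: the smooth activation is taken to be Softplus (the source only says "smooth activation function"), weight matrices are stored as one row per expert, and all names (`noisy_topk_gate`, `w_gate`, `w_noise`) are hypothetical.

```python
import math
import random


def softplus(z):
    # Numerically stable log(1 + exp(z)); the result is always positive.
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


def matvec(w, g):
    return [sum(wi * gi for wi, gi in zip(row, g)) for row in w]


def noisy_topk_gate(g, w_gate, w_noise, k, rng=None):
    """Return gating weights over experts; unselected experts get 0.

    g: global descriptor; w_gate / w_noise: one row per expert.
    rng: a random.Random instance, or None to disable noise (inference).
    """
    clean = matvec(w_gate, g)
    amplitude = [softplus(s) for s in matvec(w_noise, g)]
    noise = [rng.gauss(0.0, 1.0) if rng else 0.0 for _ in clean]
    scores = [c + n * a for c, n, a in zip(clean, noise, amplitude)]
    # Keep the k best scores. Setting the rest to -inf before the softmax
    # is equivalent to running the softmax over the kept entries only.
    keep = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    top = max(scores[i] for i in keep)
    kept = {i: math.exp(scores[i] - top) for i in keep}
    total = sum(kept.values())
    return [kept.get(i, 0.0) / total for i in range(len(scores))]


# Hypothetical 4-expert example with noise disabled.
w_gate = [[2.0, 0.0], [1.0, 0.0], [0.0, 0.0], [-1.0, 0.0]]
w_noise = [[0.0, 0.0] for _ in range(4)]
weights = noisy_topk_gate([1.0, 0.0], w_gate, w_noise, k=2)
```

During training one would pass a random generator so that the noise term perturbs which experts are selected; disabling the noise at inference time is a common choice, not something the source specifies.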
[0090] S12. Expert Handling:
[0091] Based on the normalized gating weights, the input features are dynamically routed to one or more of N parallel visual state space experts; each expert has a differentiated parameter configuration for targeted processing of different types of image degradation.
[0092] The differentiated parameter configuration of the $i$-th expert consists of its internal rank, convolution kernel size, state-space dimension, and channel expansion ratio; the targeted processing specifically includes:
[0093] For high-frequency detail regions in images: the gating network preferentially routes input features to experts with a higher internal rank and a smaller convolution kernel size; a high internal rank allows the model to capture subtle feature changes precisely, and small convolution kernels focus on local neighborhood interactions, thus effectively recovering high-frequency details;
[0094] For low-frequency regions with color distortion: the gating network preferentially routes input features to experts with a larger state-space dimension and a larger convolution kernel size; a larger state-space dimension gives the hidden state in the state-space equation greater capacity to encode historical information over longer time steps. Combined with the wide receptive field provided by large convolution kernels, this allows the model to maintain global color consistency and correct color casts;
[0095] For regular regions: the gating network routes the input features to a default expert with balanced parameters, whose moderate configuration is suited to common, widespread degradation;
[0096] For complex degraded regions: the gating network routes input features to experts with a higher channel expansion ratio $e_i$; a high channel expansion ratio maps the input features to a higher-dimensional feature space, leveraging the significantly increased parameter capacity and nonlinear fitting ability to decouple extremely complex degradation patterns.
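One way to picture the differentiated expert pool is as a small table of configurations. The sketch below is illustrative only: the source gives no concrete numbers, so every value, field name, and expert name here is a hypothetical placeholder chosen to respect the orderings described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExpertConfig:
    name: str
    inner_rank: int         # internal rank
    kernel_size: int        # convolution kernel size
    state_dim: int          # state-space dimension
    expansion_ratio: float  # channel expansion ratio


# Hypothetical values; only the relative ordering follows the description.
EXPERT_POOL = [
    ExpertConfig("detail", inner_rank=64, kernel_size=3, state_dim=16, expansion_ratio=2.0),
    ExpertConfig("color", inner_rank=32, kernel_size=7, state_dim=64, expansion_ratio=2.0),
    ExpertConfig("default", inner_rank=32, kernel_size=5, state_dim=32, expansion_ratio=2.0),
    ExpertConfig("complex", inner_rank=32, kernel_size=5, state_dim=32, expansion_ratio=4.0),
]

by_name = {cfg.name: cfg for cfg in EXPERT_POOL}
```

In the described method the gating network learns which expert to prefer for a region; the names above are labels for the reader, not routing rules.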
[0097] S13. State-space modeling:
[0098] Within each activated expert, layer normalization is performed on the input feature sequence, which is then decoupled into a content branch and a gating branch.
[0099] The sequence rearrangement mechanism based on semantic similarity is as follows:
[0100] Preset a learnable prompt pool containing a fixed number of semantic prototypes;
[0101] For each token in the input content branch sequence $U$, the matching degree between the token and each prototype in the prompt pool is calculated, and a one-hot encoded routing matrix $R$ is generated by Gumbel-Softmax sampling, its size being set by the sequence length and the number of prototypes;
[0102] Based on the semantic categories indicated by the routing matrix $R$, semantically similar tokens in the original sequence are clustered to generate a new sequence in which semantically similar tokens are adjacent.
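The rearrangement can be sketched as three small steps: assign each token to a prototype, group tokens by assignment, and remember the permutation so the original order can be restored later. This is a simplified illustration, assuming dot-product matching and a hard Gumbel-max choice (the forward pass of hard Gumbel-Softmax sampling; the gradient path is omitted). All function names are hypothetical.

```python
import math


def gumbel(rng):
    # Standard Gumbel sample; u is clamped away from 0 and 1 to keep log() finite.
    u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
    return -math.log(-math.log(u))


def route_tokens(tokens, prototypes, rng=None):
    """Return, for each token, the index of its chosen prototype."""
    labels = []
    for tok in tokens:
        logits = [sum(t * p for t, p in zip(tok, proto)) for proto in prototypes]
        if rng is not None:  # sampling mode: perturb logits with Gumbel noise
            logits = [value + gumbel(rng) for value in logits]
        labels.append(max(range(len(logits)), key=logits.__getitem__))
    return labels


def rearrange(tokens, labels):
    """Group tokens with the same label; the stable sort keeps in-group order."""
    order = sorted(range(len(tokens)), key=lambda i: labels[i])
    return [tokens[i] for i in order], order


def restore(sequence, order):
    """Invert rearrange(): put every element back at its original index."""
    out = [None] * len(sequence)
    for position, source in enumerate(order):
        out[source] = sequence[position]
    return out


tokens = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [0.1, 0.9]]
prototypes = [[1.0, 0.0], [0.0, 1.0]]
labels = route_tokens(tokens, prototypes)
grouped, order = rearrange(tokens, labels)
```

The `restore` helper corresponds to the step in which the processed sequence is returned to its original order before gated fusion.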
[0103] The discretized state-space calculation process is as follows:
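[0104] For the rearranged sequence, the continuous-time state-space equations are discretized using the zero-order hold technique; the discrete state transition matrix $\bar{A}$ and input matrix $\bar{B}$ are calculated as:
[0105] $\bar{A} = \exp(\Delta A)$;
[0106] $\bar{B} = (\Delta A)^{-1}\left(\exp(\Delta A) - I\right)\Delta B$;
[0107] where $A$ is the evolution matrix, $B$ is the projection matrix, $\Delta$ is the time-scale parameter that varies adaptively with the input, and $I$ is the identity matrix;
[0108] The recursive state is updated as follows: $h_t = \bar{A}\, h_{t-1} + \bar{B}\, x_t$, where $h_t$ is the hidden state and $x_t$ is the $t$-th element of the rearranged sequence.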
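Zero-order hold discretization followed by the recursive update can be illustrated with a small scalar-input example. The sketch assumes a diagonal evolution matrix, as is common in selective state space models, so that the matrix exponential and inverse reduce to per-state scalars; the source does not state this, and the function names are hypothetical.

```python
import math


def zoh_discretize(a_diag, b, delta):
    """Zero-order hold for a diagonal A: returns (A_bar, B_bar) as lists."""
    a_bar, b_bar = [], []
    for a, b_i in zip(a_diag, b):
        a_bar.append(math.exp(delta * a))
        if abs(delta * a) < 1e-12:
            b_bar.append(delta * b_i)  # limit of the formula as a -> 0
        else:
            # (delta*a)^-1 * (exp(delta*a) - 1) * delta*b == expm1(delta*a) / a * b
            b_bar.append(math.expm1(delta * a) / a * b_i)
    return a_bar, b_bar


def selective_scan(xs, deltas, a_diag, b):
    """Run h_t = A_bar * h_{t-1} + B_bar * x_t with a per-step time scale."""
    h = [0.0] * len(a_diag)
    states = []
    for x, delta in zip(xs, deltas):
        a_bar, b_bar = zoh_discretize(a_diag, b, delta)
        h = [ab * h_i + bb * x for ab, h_i, bb in zip(a_bar, h, b_bar)]
        states.append(h)
    return states


# A stable one-state system driven by a constant input settles at -b/a = 1.
a_bar, b_bar = zoh_discretize([-1.0], [1.0], 0.5)
states = selective_scan([1.0] * 50, [0.5] * 50, [-1.0], [1.0])
```

Because the time scale is passed per step, the same loop covers the input-adaptive case: a larger step lets the current input overwrite more of the state, while a smaller step preserves more history.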
[0109] S14. Feature Output:
[0110] In the output stage of state-space computation, the attention state-space observation equation is constructed as follows:
[0111] ;
[0112] in which the terms denote, respectively, the base output matrix, the projection matrix, and the instance-level dynamic prompts extracted from the prompt pool based on the routing matrix.
[0113] After the computed feature sequence is reverse-permuted to restore its original order, it is fused with the gating branch features to obtain the expert's final output features. The calculation formula is:
[0114] ;
[0115] in which the terms denote, respectively, the activation function, element-wise multiplication, and a linear projection layer.
[0116] Detailed description of the dual-gated cross-attention fusion module:
[0117] The dual-gated cross-attention fusion module is used to fuse the deep semantic features from the encoder with the current features of the decoder. The fusion process is as follows:
[0118] First, layer normalization is performed separately on the deep semantic features of the encoder and the current features of the decoder;
[0119] The query vector, key vector, and value vector are generated through linear mapping;
[0120] The fused features are calculated as:
[0121] ;
[0122] in which the terms denote, respectively, the output projection matrix, the scaling factor, and two learnable bidirectional gating coefficients. The gating coefficients enable the model to adaptively balance information from the encoder and decoder, effectively suppressing noise and restoring image details.
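The exact fusion formula is not recoverable from the text above, so the following is only one plausible arrangement of the named ingredients, not the patented formula. It assumes that queries come from the decoder features, that keys and values come from the encoder features, that the learned linear projections are replaced by the identity for brevity, and that the two gates are scalars `alpha` (attention path) and `beta` (decoder residual path). All names are hypothetical.

```python
import math


def layer_norm(v, eps=1e-5):
    mean = sum(v) / len(v)
    var = sum((x - mean) ** 2 for x in v) / len(v)
    return [(x - mean) / math.sqrt(var + eps) for x in v]


def softmax(z):
    top = max(z)
    e = [math.exp(x - top) for x in z]
    total = sum(e)
    return [x / total for x in e]


def dual_gated_cross_attention(enc, dec, alpha, beta):
    """enc, dec: lists of token vectors with the same width d."""
    enc_n = [layer_norm(t) for t in enc]
    dec_n = [layer_norm(t) for t in dec]
    d = len(dec_n[0])
    fused = []
    for query, dec_tok in zip(dec_n, dec):
        # Scaled dot-product scores of this decoder token against encoder tokens.
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d) for key in enc_n]
        attn = softmax(scores)
        attended = [sum(w * value[j] for w, value in zip(attn, enc_n)) for j in range(d)]
        # Two gates balance the attended encoder signal and the decoder signal.
        fused.append([alpha * a + beta * r for a, r in zip(attended, dec_tok)])
    return fused


enc = [[1.0, 2.0, 3.0], [3.0, 1.0, 2.0]]
dec = [[0.5, 0.1, 0.9], [2.0, 2.5, 1.0]]
fused = dual_gated_cross_attention(enc, dec, alpha=0.5, beta=1.0)
```

Setting `alpha` to zero reduces the module to a plain skip connection, which is one reason gated fusion of this kind is easy to train from a residual starting point.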
[0123] Hybrid loss function and model training:
[0124] During the model training phase, this invention employs a hybrid loss function $\mathcal{L}$ that includes a pixel-domain loss and a frequency-domain loss.
[0125] The hybrid loss function $\mathcal{L}$ is a weighted sum of the pixel-domain loss $\mathcal{L}_{pix}$ and the frequency-domain loss $\mathcal{L}_{freq}$:
[0126] $\mathcal{L} = \mathcal{L}_{pix} + \lambda\, \mathcal{L}_{freq}$;
[0127] where $\lambda$ is a preset balance coefficient used to adjust the contribution of the frequency-domain loss to the total loss.
[0128] The pixel-domain loss $\mathcal{L}_{pix}$ is the sum of the absolute differences between the pixels of the network's predicted image $\hat{Y}$ and the clear ground-truth image $Y$, i.e.:
[0129] $\mathcal{L}_{pix} = \| \hat{Y} - Y \|_1$;
[0130] This loss function encourages the predicted image to approximate the ground-truth image at the pixel level.
[0131] The frequency-domain loss $\mathcal{L}_{freq}$ is calculated as follows: a two-dimensional fast Fourier transform is applied to the network's predicted image $\hat{Y}$ and to the clear ground-truth image $Y$, and the $\ell_1$-norm differences of their real and imaginary parts are calculated:
[0132] $\mathcal{L}_{freq} = \left\| \mathrm{Re}(\mathcal{F}(\hat{Y})) - \mathrm{Re}(\mathcal{F}(Y)) \right\|_1 + \left\| \mathrm{Im}(\mathcal{F}(\hat{Y})) - \mathrm{Im}(\mathcal{F}(Y)) \right\|_1$;
[0133] where $\mathcal{F}$ represents the two-dimensional fast Fourier transform, and $\mathrm{Re}$ and $\mathrm{Im}$ represent the real and imaginary parts of a complex number, respectively. The purpose of introducing the frequency-domain loss is to constrain the global frequency distribution of the enhanced image so that it is consistent with the ground-truth image both in high-frequency information such as texture and edges and in low-frequency information such as color and brightness, thereby avoiding artifacts or unnatural visual effects in the enhanced image.
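The combined loss can be sketched for a single-channel image stored as nested lists. The example uses a naive discrete Fourier transform, which produces the same values a fast Fourier transform would, only more slowly. It assumes absolute (L1) differences for the frequency term, and the default weight `lam=0.1` is a hypothetical placeholder; the source does not give a value.

```python
import cmath
import math


def dft2(img):
    """Naive 2-D discrete Fourier transform of a real-valued image."""
    h, w = len(img), len(img[0])
    out = []
    for u in range(h):
        row = []
        for v in range(w):
            acc = 0j
            for y in range(h):
                for x in range(w):
                    angle = -2.0 * math.pi * (u * y / h + v * x / w)
                    acc += img[y][x] * cmath.exp(1j * angle)
            row.append(acc)
        out.append(row)
    return out


def pixel_loss(pred, target):
    """Sum of absolute pixel differences."""
    return sum(abs(p - t) for pr, tr in zip(pred, target) for p, t in zip(pr, tr))


def frequency_loss(pred, target):
    """Sum of absolute differences of the real and imaginary spectra."""
    fp, ft = dft2(pred), dft2(target)
    return sum(abs(a.real - b.real) + abs(a.imag - b.imag)
               for ra, rb in zip(fp, ft) for a, b in zip(ra, rb))


def hybrid_loss(pred, target, lam=0.1):
    return pixel_loss(pred, target) + lam * frequency_loss(pred, target)


# A single bright pixel differs from a black image by 1 in the pixel domain
# and by 1 at each of the four frequency bins.
pred = [[1.0, 0.0], [0.0, 0.0]]
target = [[0.0, 0.0], [0.0, 0.0]]
loss = hybrid_loss(pred, target)
```

The example shows why the two terms complement each other: a localized pixel error spreads across every frequency bin, so the frequency term penalizes it globally.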
[0134] Model training process:
[0135] Dataset preparation: Collect a dataset containing pairs of underwater images. Each pair of images includes one degraded image generated from a real underwater environment or synthetic means as input, and a corresponding clear, distortion-free image as a ground truth. Standardize and preprocess the original image data, including resizing and pixel value normalization. Implement data augmentation strategies, such as random horizontal flipping, vertical flipping, and random cropping, to increase the diversity of the training data.
[0136] Network initialization: The weight parameters of the image enhancement model are randomly initialized, or pre-trained weights are used.
[0137] Optimizer selection: Optimizers such as AdamW are used for iterative updates of model parameters.
[0138] Iterative training: In each training iteration, a batch of degraded underwater images is input into the model. After processing by the encoder, bottleneck layer, and decoder, the predicted enhanced images are obtained. The hybrid loss between the predicted and ground-truth images is calculated. Based on the loss value, the gradient is calculated using the backpropagation algorithm, and the model parameters are updated.
[0139] Convergence criterion: Continue training until the model achieves optimal performance on the validation set or the loss function converges.
[0140] Model inference process:
[0141] After training, the model is used to enhance new underwater degraded images. The underwater image to be enhanced is input into the model, which performs the forward-propagation computation, namely steps S2 and S3, and outputs the enhanced image. Since the model has learned the mapping from degraded images to clear images, it can directly generate high-quality enhanced images.
[0142] The technical scope of this invention is not limited to the content described above. Those skilled in the art can make various modifications and variations to the above embodiments without departing from the technical concept of this invention, and all such modifications and variations should fall within the protection scope of this invention.
Claims
1. An underwater image enhancement method based on a hybrid expert and state-space model, characterized in that it comprises the following steps: S1. Constructing an image enhancement model: Construct an end-to-end deep neural network containing an encoder, a bottleneck layer, and a decoder as the image enhancement model; the encoder consists of multiple cascaded hybrid expert state space modules used to extract deep features from the input underwater image; S2. Core Model Processing: The underwater image to be enhanced is input into the image enhancement model; the hybrid expert state space module performs the following sequential operations on the input features: noise-injected gated routing to dynamically select and activate experts, state space modeling of the features by the activated visual state space experts including semantic sequence rearrangement, and gated fusion of the outputs of each expert; S3. Feature Fusion and Reconstruction: Between the corresponding levels of the encoder and decoder, the deep semantic features output by the encoder are fused with the detailed features from shallow or skip connections through a dual-gated cross-attention fusion module; the decoder upsamples and reconstructs the fused features to output an enhanced image; S4. Model Training and Inference: The image enhancement model is trained using a hybrid loss function that includes pixel domain loss and frequency domain loss; the trained model is then used to enhance the input underwater degraded image.
2. The underwater image enhancement method based on a hybrid expert and state-space model according to claim 1, characterized in that, in step S1, the processing procedure of the hybrid expert state space module specifically includes: S11. Gated Routing: A lightweight gating network computes the routing weights for the input features in three stages that include a noise perturbation: first, feature aggregation is performed on the input features to obtain a global descriptor; next, the routing score with noise injection is calculated; finally, Top-k sparse selection is performed to generate the normalized gating weights; S12. Expert Handling: Based on the normalized gating weights, the input features are dynamically routed to one or more of N parallel visual state space experts; each expert has a differentiated parameter configuration for targeted processing of different types of image degradation; S13. State-space modeling: Within each activated expert, layer normalization is performed on the input feature sequence, which is then decoupled into a content branch and a gating branch; the content branch is rearranged based on semantic similarity, and discretized state-space computation is performed on the rearranged sequence; S14. Feature Output: The feature sequence produced by the state-space computation is restored to its original order and output after gated fusion with the gating branch features.
3. The underwater image enhancement method based on a hybrid expert and state-space model according to claim 2, characterized in that, in step S11, the routing score with noise injection, $V$, is calculated as: $V = W_g\, g + \epsilon \odot \sigma(W_n\, g)$; where $g$ is the global feature descriptor, $W_g$ and $W_n$ are learnable weight matrices, $\epsilon$ is a random noise vector sampled from a standard normal distribution, $\odot$ denotes element-wise multiplication, and $\sigma$ is a smooth activation function; the normalized gating weights $G$ are obtained through the following formula: $G = \mathrm{Softmax}(\mathrm{TopK}(V, k))$; where $\mathrm{TopK}(\cdot, k)$ denotes the operation of retaining the $k$ highest scores and setting the remainder to negative infinity, and $k$ is the preset number of activated experts.
4. The underwater image enhancement method based on a hybrid expert and state-space model according to claim 3, characterized in that, in step S12, the differentiated parameter configuration of the $i$-th expert consists of its internal rank, convolution kernel size, state-space dimension, and channel expansion ratio; the targeted processing specifically includes: for high-frequency detail regions in an image, priority is given to routing to experts with a higher internal rank and a smaller convolution kernel size; for low-frequency regions with color distortion, priority is given to routing to experts with a larger state-space dimension and a larger convolution kernel size; for regular regions, routing is performed to the default expert with balanced parameters; for complex degraded regions, routing is performed to experts with a higher channel expansion ratio.
5. The underwater image enhancement method based on a hybrid expert and state-space model according to claim 2, characterized in that, in step S13, the sequence rearrangement mechanism based on semantic similarity specifically includes: presetting a learnable prompt pool $P$ containing $T$ semantic prototypes; for each token in the input content-branch sequence $X$, computing the degree of matching between the token and each prototype in the prompt pool, and generating a one-hot routing matrix $R \in \{0,1\}^{L \times T}$ by Gumbel-Softmax sampling, where $L$ is the sequence length; and, based on the semantic categories indicated by the routing matrix $R$, aggregating semantically similar tokens of the original sequence to generate a new, semantically adjacent sequence $X'$.
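As a rough illustration of the rearrangement in claim 5 and the order restoration in step S14, the following forward-only sketch assigns each token to a prototype with a hard Gumbel-Softmax sample and then groups tokens by that assignment. Several details are assumptions made for the example: the matching degree is taken to be a dot product, the helper names are hypothetical, and the straight-through gradient that a trainable version would need is omitted.

```python
import math
import random


def gumbel_argmax(logits, rng, tau=1.0):
    """Hard Gumbel-Softmax sample: index of the largest perturbed logit."""
    noisy = []
    for logit in logits:
        u = max(rng.random(), 1e-12)  # guard against log(0)
        gumbel = -math.log(-math.log(u))
        noisy.append((logit + gumbel) / tau)
    return max(range(len(noisy)), key=lambda i: noisy[i])


def rearrange_by_prototype(tokens, prototypes, rng, tau=1.0):
    """Group tokens by sampled prototype; return (sequence, order, labels)."""
    labels = []
    for tok in tokens:
        # Matching degree: dot product with each prototype (an assumption).
        logits = [sum(a * b for a, b in zip(tok, p)) for p in prototypes]
        labels.append(gumbel_argmax(logits, rng, tau))
    # Stable sort keeps the original relative order inside each group.
    order = sorted(range(len(tokens)), key=lambda i: labels[i])
    return [tokens[i] for i in order], order, labels


def restore_order(sequence, order):
    """Inverse permutation: element j of `sequence` goes back to order[j]."""
    restored = [None] * len(sequence)
    for j, i in enumerate(order):
        restored[i] = sequence[j]
    return restored


# Hypothetical toy data: 4 tokens and a 2-prototype prompt pool.
rng = random.Random(0)
tokens = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [0.1, 0.9]]
prototypes = [[5.0, 0.0], [0.0, 5.0]]
grouped, order, labels = rearrange_by_prototype(tokens, prototypes, rng)
assert restore_order(grouped, order) == tokens
```

Here `labels[i]` plays the role of the non-zero column in row `i` of the one-hot routing matrix. Returning `order` alongside the grouped sequence is what allows the state-space output to be mapped back to the original token positions afterwards.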
6. The underwater image enhancement method based on a hybrid expert and state-space model according to claim 5, characterized in that, in step S13, the discretized state-space computation specifically includes: for the rearranged sequence $X'$, discretizing the continuous-time state-space equations using the zero-order hold technique to obtain the discrete state transition matrix $\bar{A}$ and the discrete input matrix $\bar{B}$, calculated as: $\bar{A} = \exp(\Delta A)$; $\bar{B} = (\Delta A)^{-1}\left(\exp(\Delta A) - I\right)\Delta B$; where $A$ is the evolution matrix, $B$ is the projection matrix, $\Delta$ is the time-scale parameter that varies adaptively with the input, and $I$ is the identity matrix; the recursive state update is: $h_t = \bar{A} h_{t-1} + \bar{B} x'_t$, where $x'_t$ is the $t$-th token of $X'$ and $h_t$ is the hidden state.
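The zero-order hold discretization and the recursive update of claim 6 can be checked numerically on a one-dimensional system. The sketch below is a simplification under stated assumptions: the evolution matrix is reduced to a single scalar (equivalently, one entry of a diagonal matrix), and the names `zoh_discretize` and `scan` are hypothetical.

```python
import math


def zoh_discretize(a, b, delta):
    """Zero-order hold for a scalar system (one diagonal entry).

    Continuous:  h'(t) = a * h(t) + b * x(t)
    Discrete:    h_t   = a_bar * h_{t-1} + b_bar * x_t
    """
    a_bar = math.exp(delta * a)
    # (delta*a)^-1 * (exp(delta*a) - 1) * delta*b simplifies to
    # (exp(delta*a) - 1) / a * b; expm1 keeps it accurate for small delta*a.
    b_bar = b * delta if a == 0 else math.expm1(delta * a) / a * b
    return a_bar, b_bar


def scan(xs, deltas, a, b, h0=0.0):
    """Run the recurrence with a per-step (input-dependent) step size."""
    h, states = h0, []
    for x, delta in zip(xs, deltas):
        a_bar, b_bar = zoh_discretize(a, b, delta)
        h = a_bar * h + b_bar * x
        states.append(h)
    return states


# Constant unit input on h' = -h + x: the exact solution is 1 - exp(-t),
# and zero-order hold reproduces it exactly at the sample times.
states = scan([1.0] * 10, [0.1] * 10, a=-1.0, b=1.0)
```

Because zero-order hold assumes the input is constant over each step, the discrete recurrence matches the continuous solution exactly for piecewise-constant inputs, which is why the final state above agrees with $1 - e^{-1}$ up to rounding.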
7. The underwater image enhancement method based on a hybrid expert and state-space model according to claim 6, characterized in that, in step S13, at the output stage of the state-space computation, an attention state-space observation equation is constructed from a base output matrix, a projection matrix, and instance-level dynamic prompts, the instance-level dynamic prompts being extracted from the prompt pool according to the routing matrix; after the computed feature sequence is reverse-permuted to restore its original order, it is fused with the gating-branch features to produce the expert's final output features, the fusion using an activation function, element-wise multiplication, and a linear projection layer.
8. The underwater image enhancement method based on a hybrid expert and state-space model according to claim 7, characterized in that, in step S3, the fusion process of the dual-gated cross-attention fusion module is specifically as follows: layer normalization is applied separately to the deep semantic features of the encoder and the current features of the decoder; a query vector $Q$, a key vector $K$, and a value vector $V$ are generated by linear mapping; and the fused features are computed from $Q$, $K$, and $V$ using an output projection matrix, a scaling factor, and learnable bidirectional gating coefficients.
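Claim 8 names the ingredients of the fusion (layer normalization, query/key/value projections, a scaling factor, an output projection, and two learnable gating coefficients), but the exact expression is not legible here. The sketch below is therefore only one plausible arrangement, and every structural choice in it is an assumption: queries are taken from the decoder features, keys and values from the encoder features, the scaling factor is the square root of the projection width, and the two coefficients `alpha` and `beta` weight the attention output and a decoder skip path respectively.

```python
import math


def layer_norm(row, eps=1e-5):
    mean = sum(row) / len(row)
    var = sum((v - mean) ** 2 for v in row) / len(row)
    return [(v - mean) / math.sqrt(var + eps) for v in row]


def matmul(a, b):
    # a: n x m, b: m x p (lists of rows) -> n x p.
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    top = max(row)
    exps = [math.exp(v - top) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def gated_cross_attention(f_dec, f_enc, w_q, w_k, w_v, w_o, alpha, beta):
    """Hypothetical form: alpha * Attn(Q, K, V) W_o + beta * f_dec."""
    d = len(w_q[0])  # projection width, used for the scaling factor
    dec = [layer_norm(r) for r in f_dec]
    enc = [layer_norm(r) for r in f_enc]
    q = matmul(dec, w_q)  # queries from decoder features (assumption)
    k = matmul(enc, w_k)  # keys from encoder features (assumption)
    v = matmul(enc, w_v)  # values from encoder features (assumption)
    scores = matmul(q, [list(col) for col in zip(*k)])  # Q K^T
    attn = [softmax([s / math.sqrt(d) for s in row]) for row in scores]
    out = matmul(matmul(attn, v), w_o)
    return [[alpha * o + beta * x for o, x in zip(o_row, x_row)]
            for o_row, x_row in zip(out, f_dec)]


# Hypothetical toy inputs: 3 decoder tokens, 2 encoder tokens, width 2.
eye = [[1.0, 0.0], [0.0, 1.0]]
f_dec = [[1.0, 2.0], [3.0, 5.0], [0.0, 1.0]]
f_enc = [[2.0, 0.0], [0.0, 4.0]]
fused = gated_cross_attention(f_dec, f_enc, eye, eye, eye, eye, 0.5, 0.5)
```

With `alpha = 0` and `beta = 1` this form reduces to passing the decoder features through unchanged, which illustrates how gating coefficients of this kind let a network learn how much cross-scale information to admit.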
9. The underwater image enhancement method based on a hybrid expert and state-space model according to claim 1, characterized in that, in step S4, the hybrid loss function $\mathcal{L}$ is the weighted sum of the pixel-domain loss $\mathcal{L}_{pix}$ and the frequency-domain loss $\mathcal{L}_{freq}$: $\mathcal{L} = \mathcal{L}_{pix} + \lambda \mathcal{L}_{freq}$; where $\lambda$ is the preset balance coefficient.
10. The underwater image enhancement method based on a hybrid expert and state-space model according to claim 9, characterized in that, the frequency-domain loss $\mathcal{L}_{freq}$ is calculated as follows: a two-dimensional fast Fourier transform is applied to the image $\hat{Y}$ predicted by the network and to the clear ground-truth image $Y$, and the norm differences of their real and imaginary parts are calculated: $\mathcal{L}_{freq} = \lVert \mathcal{R}(\mathcal{F}(\hat{Y})) - \mathcal{R}(\mathcal{F}(Y)) \rVert + \lVert \mathcal{I}(\mathcal{F}(\hat{Y})) - \mathcal{I}(\mathcal{F}(Y)) \rVert$; where $\mathcal{F}$ denotes the two-dimensional fast Fourier transform, and $\mathcal{R}$ and $\mathcal{I}$ denote the real and imaginary parts of a complex number, respectively.
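The hybrid loss of claims 9 and 10 can be illustrated on tiny single-channel images. This is a sketch under stated assumptions rather than the claimed training code: a naive two-dimensional discrete Fourier transform stands in for a fast Fourier transform so that the example needs no external library, both norms are taken to be mean absolute (L1) differences, the pixel-domain loss is also taken to be L1, and the default balance coefficient `lam=0.1` is a hypothetical value.

```python
import cmath


def dft2(img):
    """Naive 2-D DFT of an H x W image (list of rows); O((H*W)^2)."""
    h, w = len(img), len(img[0])
    out = []
    for u in range(h):
        row = []
        for v in range(w):
            acc = 0j
            for y in range(h):
                for x in range(w):
                    angle = -2.0 * cmath.pi * (u * y / h + v * x / w)
                    acc += img[y][x] * cmath.exp(1j * angle)
            row.append(acc)
        out.append(row)
    return out


def l1_mean(a, b):
    """Mean absolute difference between two equally sized 2-D arrays."""
    diffs = [abs(p - q) for ra, rb in zip(a, b) for p, q in zip(ra, rb)]
    return sum(diffs) / len(diffs)


def frequency_loss(pred, target):
    """L1 difference of real parts plus L1 difference of imaginary parts."""
    fp, ft = dft2(pred), dft2(target)
    real = l1_mean([[c.real for c in r] for r in fp],
                   [[c.real for c in r] for r in ft])
    imag = l1_mean([[c.imag for c in r] for r in fp],
                   [[c.imag for c in r] for r in ft])
    return real + imag


def hybrid_loss(pred, target, lam=0.1):
    """Pixel-domain L1 loss plus lam times the frequency-domain loss."""
    return l1_mean(pred, target) + lam * frequency_loss(pred, target)


# Hypothetical 2 x 2 example: the prediction is uniformly 1.0 too bright.
target = [[0.2, 0.4], [0.6, 0.8]]
pred = [[1.2, 1.4], [1.6, 1.8]]
loss = hybrid_loss(pred, target, lam=0.5)
```

A uniform brightness offset changes only the zero-frequency coefficient, so in this example the frequency-domain term isolates the global error while the pixel-domain term measures it per pixel. In practice an FFT routine from a numerical library would replace `dft2`.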