A semi-supervised 3D left atrium segmentation method based on DoubleW-Net
By combining the DoubleW-Net network architecture with the attention module, the accuracy and generalization ability of semi-supervised medical image segmentation are improved, solving the problem of insufficient model generalization ability in existing methods, and achieving high accuracy and stability in left atrial segmentation.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- HEBEI UNIV OF ENG
- Filing Date
- 2025-12-02
- Publication Date
- 2026-06-19
AI Technical Summary
Existing semi-supervised medical image segmentation methods have limited utilization of unlabeled data and insufficient model generalization ability, making it difficult to meet the high accuracy and stability requirements of clinical diagnosis. In particular, they suffer from insufficient segmentation accuracy and robustness in the task of left atrial segmentation.
We adopt the DoubleW-Net network architecture, combining channel attention modules and global attention modules to construct parallel upper and lower layer networks. Through feature interaction and extraction, we improve the accuracy and generalization ability of the model.
Using only a small amount of labeled data, we achieved high accuracy and robustness in left atrial segmentation, which is superior to existing methods and has strong generalization ability and segmentation performance.
Description
Technical Field
[0001] This invention relates to the field of image segmentation technology, and in particular to a semi-supervised 3D left atrial segmentation method based on DoubleW-Net.
Background Technology
[0002] Medical image segmentation is a key technology assisting doctors in disease diagnosis. While fully supervised methods based on convolutional neural networks achieve high accuracy, their performance heavily relies on large amounts of expert-annotated data. In the medical field, acquiring such labeled data is time-consuming, labor-intensive, and costly, becoming a major bottleneck for practical applications. In contrast, unlabeled medical image data is readily available; therefore, semi-supervised learning methods that effectively utilize unlabeled data have become a research hotspot.
[0003] Current mainstream semi-supervised methods primarily rely on pseudo-labels and consistency regularization. Pseudo-labeling methods use the initial model to generate predictions on unlabeled data as surrogate ground truth for self-training, but their performance is severely limited by the quality of the initial pseudo-labels. Low-quality pseudo-labels lead to error accumulation, causing confirmation bias and performance degradation. Consistency regularization improves model robustness by perturbing unlabeled data and constraining model output consistency. However, existing methods often employ a single network or simple perturbations, resulting in limited constraint strength and difficulty in fully exploiting the potential of unlabeled data. The models also exhibit insufficient generalization ability when faced with varying data distributions. Existing state-of-the-art semi-supervised methods have achieved competitive performance using only a small amount of labeled data, demonstrating the feasibility of this approach. However, their performance still lags behind fully supervised methods, with room for improvement in segmentation accuracy and robustness, failing to fully meet the stringent requirements of high accuracy and stability in clinical diagnosis.
[0004] Therefore, a semi-supervised 3D left atrial segmentation method based on DoubleW-Net is proposed to solve the above problems.
Summary of the Invention
[0005] To address the aforementioned challenges, this invention provides a semi-supervised 3D left atrial segmentation method based on DoubleW-Net. By constructing a differentiated parallel network with an upper layer of 3D convolutions and a lower layer of depthwise separable convolutions, and combining a channel attention module used only in the upper layer with a global attention module, efficient feature interaction and extraction are achieved, significantly improving the accuracy and generalization ability of convolutional neural network models in the field of atrial segmentation.
[0006] To achieve the above objectives, this invention provides a semi-supervised 3D left atrial segmentation method based on DoubleW-Net, comprising the following steps:
[0007] S1: Acquire and preprocess the 3D left atrial magnetic resonance image to be tested;
[0008] S2: Construct a training dataset based on labeled 3D left atrial magnetic resonance imaging (MRI) data and unlabeled 3D left atrial MRI data;
[0009] S3: Construct a semi-supervised segmentation neural network and train it using a segmentation loss function based on the training dataset to obtain the trained semi-supervised segmentation neural network; the semi-supervised segmentation neural network is a parallel DoubleW-Net structure, including an upper W network constructed by integrating channel attention modules and global attention modules, and a lower W network;
[0010] S4: Input the preprocessed 3D left atrial magnetic resonance image to be tested into the trained semi-supervised segmentation neural network to obtain the segmentation result of the left atrial magnetic resonance image.
[0011] Preferably, the preprocessing in S1 specifically includes:
[0012] Remove the region surrounding the 3D left atrial MRI image to be tested and retain the central image portion by cropping the image to 112×112×80, so that the edge regions of the image are discarded; then save the cropped data in an h5 file.
[0013] Preferably, the upper W network uses standard 3×3×3 convolutional kernels for feature extraction, integrates a channel attention module in the first layer at the network input, and integrates a global attention module in the skip connection stage; the lower W network uses depthwise separable convolutions to construct the encoder-decoder, and does not integrate channel attention modules or global attention modules.
[0014] Preferably, in step S3, the channel attention module obtains the weight of each channel through parallel adaptive average pooling and adaptive max pooling, processes the weights with the ReLU activation function, multiplies them channel by channel with the original feature map, and adds the result to the input feature map as a residual connection, thereby emphasizing or suppressing different feature channels according to the needs of the task.
[0015] Preferably, the global attention module in step S3 is a Transformer-like structure based on 3D convolution and integrates position encoding to capture long-distance dependencies of feature maps during the skip connection stage, thereby enhancing the global context feature representation capability.
[0016] Preferably, both the upper W network and the lower W network use transposed convolution for upsampling. The feature maps of the upsampling stage of the lower W network are added element-wise to the feature maps of the upsampling stage of the upper W network to obtain richer features.
[0017] Preferably, after completing all upsampling stages, a 1×1×1 convolutional block is used to fuse the feature maps output by the upper W network and the lower W network to obtain the final segmentation result.
[0018] Preferably, the semi-supervised segmentation neural network employs a channel reduction strategy to reduce the number of network parameters.
[0019] Preferably, the segmentation loss function in step S3 consists of the segmentation loss of the upper W network and the segmentation loss of the lower W network.
[0020] Preferably, the segmentation loss function is expressed as:
[0021] $L_{seg} = L_{upper} + L_{lower}$;
[0022] where $L_{upper}$ is the segmentation loss of the upper W network and $L_{lower}$ is the segmentation loss of the lower W network.
[0023] Therefore, the present invention employs the above-mentioned semi-supervised 3D left atrial segmentation method based on DoubleW-Net, which has the following beneficial effects:
[0024] (1) This invention introduces a DoubleW-Net network architecture and combines it with the designed channel attention module and global attention module. The two modules are used only in the upper W network and not in the lower W network, so that the two networks differ: one is a strong network and the other is a weak network. The two networks can therefore learn from each other during training, which improves the training ability and feature extraction ability of the network.
[0025] (2) The present invention designs a channel attention module and a global attention module. The channel attention block is introduced into the first layer of the input part of the segmentation network, and the global attention block is introduced into the skip connection stage of the upper network. At the same time, a position encoding is introduced so that the model can better focus on the spatial position information of the image.
[0026] (3) The present invention employs a channel reduction strategy to reduce the number of network parameters.
[0027] (4) Based on various improvements, this invention achieves new state-of-the-art performance in semi-supervised left atrial segmentation tasks on the LA database compared with other methods in the prior art, and has strong robustness.
[0028] (5) The network in this invention can be easily combined with other shape constraint models to enhance the segmentation results and has strong generalization ability.
[0029] The technical solution of the present invention will be further described in detail below with reference to the accompanying drawings and embodiments.
Description of the Drawings
[0030] Figure 1 is a flowchart of the semi-supervised 3D left atrial segmentation method based on DoubleW-Net of this invention;
[0031] Figure 2 is a schematic diagram of the DoubleW-Net network model structure in an embodiment of the present invention;
[0032] Figure 3 is a schematic diagram of the CAB module structure in an embodiment of the present invention;
[0033] Figure 4 is a schematic diagram of the GAB module structure in an embodiment of the present invention.
Detailed Implementation
[0034] The following detailed description of embodiments of the invention provided in the accompanying drawings is not intended to limit the scope of the claimed invention, but merely to illustrate selected embodiments of the invention. All other embodiments obtained by those skilled in the art based on the embodiments of the invention without inventive effort are within the scope of protection of the invention.
[0035] Unless otherwise defined, the technical or scientific terms used in this invention shall have the ordinary meaning as understood by one of ordinary skill in the art to which this invention pertains.
[0036] The terms "comprising" or "including" as used in this invention mean that the element preceding the term encompasses the element listed after the term, and do not exclude the possibility of encompassing other elements. Terms such as "inner," "outer," "upper," and "lower" indicate the orientation or positional relationship based on the orientation or positional relationship shown in the accompanying drawings, and are only for the convenience of describing the invention and simplifying the description, and do not indicate or imply that the device or element referred to must have a specific orientation, or be constructed and operated in a specific orientation. Therefore, they should not be construed as limitations on the invention. When the absolute position of the described object changes, the relative positional relationship may also change accordingly. In this invention, unless otherwise explicitly specified and limited, the term "attached" and similar terms should be interpreted broadly. For example, it can refer to a fixed connection, a detachable connection, or an integral part; it can refer to a direct connection or an indirect connection through an intermediate medium; it can refer to the internal communication of two elements or the interaction relationship between two elements. Those skilled in the art can understand the specific meaning of the above terms in this invention according to the specific circumstances.
[0037] Example
[0038] As shown in Figure 1, a semi-supervised 3D left atrial segmentation method based on DoubleW-Net includes the following steps:
[0039] S1: Acquire and preprocess the 3D left atrial magnetic resonance image to be tested;
[0040] Preprocessing specifically includes:
[0041] The region surrounding the 3D left atrial MRI image to be tested is removed and only the central image portion is retained; the image is cropped to..., the edge regions of the image are removed, and the cropped data are saved in an h5 file.
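The center-cropping step can be sketched in NumPy as follows. This is a minimal illustration, not the patented implementation: it assumes the 112×112×80 crop size given in paragraph [0012], and the function name `center_crop`, the zero-padding of undersized axes, and the example scan size are all hypothetical. Writing the result to an h5 file would typically go through a library such as h5py, which is left out here.

```python
import numpy as np

def center_crop(volume, target_shape=(112, 112, 80)):
    """Crop a 3D volume around its centre to target_shape.

    Axes shorter than the target are zero-padded first (an assumption;
    the source does not say how undersized scans are handled).
    """
    pad = []
    for size, target in zip(volume.shape, target_shape):
        missing = max(target - size, 0)
        pad.append((missing // 2, missing - missing // 2))
    volume = np.pad(volume, pad, mode="constant")

    # Start index of the central window along each axis.
    starts = [(size - target) // 2 for size, target in zip(volume.shape, target_shape)]
    window = tuple(slice(s, s + t) for s, t in zip(starts, target_shape))
    return volume[window]

# Hypothetical scan size, for illustration only.
scan = np.random.default_rng(0).random((160, 144, 88)).astype(np.float32)
cropped = center_crop(scan)
print(cropped.shape)  # (112, 112, 80)
```

Cropping around the geometric centre is the simplest reading of "retain the central image portion"; a crop centred on the labeled atrium would be an equally plausible variant.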
[0042] S2: Construct a training dataset based on labeled 3D left atrial magnetic resonance imaging (MRI) data and unlabeled 3D left atrial MRI data;
[0043] Specifically, data augmentation is performed on the training dataset, including random rotation, random cropping, and random flipping of 3D left atrial MRI images, and the same data augmentation strategy is used for both labeled and unlabeled samples.
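A rough sketch of such an augmentation routine is given below. It assumes rotations in multiples of 90° within the first two axes, independent flips with probability 0.5 per axis, and a 96×96×64 training patch; none of these values is stated in the source, and the function name `augment` is hypothetical. Passing identically seeded generators applies the same transform to an image and its label.

```python
import numpy as np

def augment(volume, rng, patch_shape=(96, 96, 64)):
    """Random rotation, random flipping and random cropping of a 3D volume."""
    # Random rotation by 0, 90, 180 or 270 degrees in the plane of axes 0 and 1.
    k = int(rng.integers(0, 4))
    volume = np.rot90(volume, k=k, axes=(0, 1))

    # Random flip along each axis with probability 0.5.
    for axis in range(3):
        if rng.random() < 0.5:
            volume = np.flip(volume, axis=axis)

    # Random crop of a patch_shape window.
    starts = [int(rng.integers(0, size - patch + 1))
              for size, patch in zip(volume.shape, patch_shape)]
    window = tuple(slice(s, s + p) for s, p in zip(starts, patch_shape))
    return np.ascontiguousarray(volume[window])

# Toy image/label pair; the label is derived from the image so that the
# alignment of the two augmented outputs can be checked.
image = np.random.default_rng(1).random((112, 112, 80))
label = image > 0.5
aug_image = augment(image, np.random.default_rng(42))
aug_label = augment(label, np.random.default_rng(42))
print(aug_image.shape, aug_label.shape)  # (96, 96, 64) (96, 96, 64)
```

For unlabeled samples the same function is called on the image alone, which matches the statement that labeled and unlabeled samples share one augmentation strategy.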
[0044] S3: Construct a semi-supervised segmentation neural network, as shown in Figure 2, and train it using a segmentation loss function based on the training dataset to obtain the trained semi-supervised segmentation neural network; the semi-supervised segmentation neural network is a parallel DoubleW-Net structure, including an upper W network constructed by integrating channel attention modules and global attention modules, and a lower W network;
[0045] The upper W network uses standard 3×3×3 convolutional kernels for feature extraction and integrates a channel attention block (CAB) module in the first layer at the network input and a global attention module in the skip connection stage; the lower W network uses depthwise separable convolutions to construct the encoder-decoder and does not integrate channel attention modules or global attention modules.
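To make the difference between the two branches concrete, the weight count of one standard 3×3×3 convolution can be compared with that of a depthwise separable convolution (a per-channel 3×3×3 depthwise convolution followed by a 1×1×1 pointwise convolution). The channel sizes below are hypothetical and biases are ignored.

```python
def standard_conv3d_params(c_in, c_out, k=3):
    """Weights of a standard k x k x k 3D convolution (bias ignored)."""
    return k ** 3 * c_in * c_out

def depthwise_separable_conv3d_params(c_in, c_out, k=3):
    """Weights of a depthwise k x k x k conv plus a 1x1x1 pointwise conv."""
    depthwise = k ** 3 * c_in   # one k x k x k filter per input channel
    pointwise = c_in * c_out    # 1x1x1 conv that mixes channels
    return depthwise + pointwise

# Hypothetical channel sizes, for illustration only.
std = standard_conv3d_params(16, 32)
sep = depthwise_separable_conv3d_params(16, 32)
print(std, sep, round(std / sep, 1))  # 13824 944 14.6
```

Under these assumed sizes the lower branch would need roughly one fifteenth of the weights per layer, which is consistent with describing it as the lighter, weaker network of the pair.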
[0046] The channel attention module obtains the weight of each channel through parallel adaptive average pooling and adaptive max pooling. After processing by the ReLU activation function, the weights are multiplied with the original feature map channel by channel, and the result is added to the input feature map as a residual connection, so as to emphasize or suppress different feature channels according to the needs of the task.
[0047] Specifically, as shown in Figure 3, adaptive average pooling and adaptive max pooling are applied in parallel to obtain the importance of each channel of the feature map, and this importance is used to assign a weight to each channel. The weights are then activated by the ReLU function. Next, the weight tensor y is expanded, by dimension expansion and copying of its values, to exactly the same shape as the input feature map x. The learned weights y are multiplied with the original feature map x, and the product is finally added to the input feature map to obtain a more useful feature map.
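The data flow of the channel attention block can be sketched in NumPy as below. This is only an approximation under stated assumptions: the two pooled descriptors are combined by summation, and no learned layers are placed between pooling and ReLU, because the source does not describe either detail. The function name `channel_attention` is hypothetical.

```python
import numpy as np

def channel_attention(x):
    """Channel attention on a feature map x of shape (C, D, H, W)."""
    # Parallel global average pooling and global max pooling per channel.
    avg = x.mean(axis=(1, 2, 3))
    mx = x.max(axis=(1, 2, 3))

    # Assumption: the two descriptors are summed before the ReLU.
    y = np.maximum(avg + mx, 0.0)

    # Expand y to broadcast against x, reweight the channels,
    # then add the input back as a residual.
    y = y[:, None, None, None]
    return x * y + x

# Two toy channels: one all -1 (weight 0 after ReLU), one all +1 (weight 2).
x = np.stack([-np.ones((2, 2, 2)), np.ones((2, 2, 2))])
out = channel_attention(x)
print(out[0, 0, 0, 0], out[1, 0, 0, 0])  # -1.0 3.0
```

The toy input shows the two effects described above: a channel whose weight is zeroed passes through unchanged via the residual path, while a channel with a positive weight is amplified.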
[0048] The global attention block (GAB) module is a Transformer-like structure based on 3D convolutions, as shown in Figure 4. It integrates positional encoding to capture long-range dependencies of feature maps during the skip connection stage, thereby enhancing the global context feature representation capability.
[0049] Specifically, the GAB module is a 3D convolutional Transformer-like block. At each skip connection, the feature map from the corresponding layer of the encoding stage passes through this module, where positional encoding is added; this injects spatial position information and allows the model to learn positional relationships.
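One possible reading of such a block is single-head self-attention over the voxels of a skip-connection feature map, sketched below in NumPy. Several points are assumptions rather than details from the source: the sinusoidal encoding over the flattened voxel index, the single attention head, the residual connection, and the use of plain matrices `w_q`, `w_k`, `w_v` to stand in for 1×1×1 convolutions. All function names are hypothetical.

```python
import numpy as np

def sinusoidal_encoding(n_tokens, dim):
    """Sinusoidal positional encoding of shape (n_tokens, dim)."""
    pos = np.arange(n_tokens)[:, None]
    i = np.arange(dim)[None, :]
    angle = pos / np.power(10000.0, (2 * (i // 2)) / dim)
    return np.where(i % 2 == 0, np.sin(angle), np.cos(angle))

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)  # for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def global_attention(x, w_q, w_k, w_v):
    """Self-attention over all voxels of x with shape (C, D, H, W).

    w_q, w_k, w_v have shape (C, C); a 1x1x1 convolution is equivalent
    to applying such a matrix at every voxel.
    """
    c = x.shape[0]
    tokens = x.reshape(c, -1).T                        # (N, C), one token per voxel
    tokens = tokens + sinusoidal_encoding(tokens.shape[0], c)
    q, k, v = tokens @ w_q, tokens @ w_k, tokens @ w_v
    attn = softmax(q @ k.T / np.sqrt(c))               # (N, N) voxel-to-voxel weights
    out = tokens + attn @ v                            # residual connection
    return out.T.reshape(x.shape)

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 4, 4, 4))
w_q, w_k, w_v = (0.1 * rng.standard_normal((8, 8)) for _ in range(3))
out = global_attention(x, w_q, w_k, w_v)
print(out.shape)  # (8, 4, 4, 4)
```

Because every voxel attends to every other voxel, the attention matrix grows quadratically with the number of voxels, so a block of this kind is only practical on downsampled feature maps.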
[0050] Both the upper and lower W networks use transposed convolutions for upsampling. The feature maps from the upsampling stage of the lower W network are added element-wise to the feature maps from the upsampling stage of the upper W network to obtain richer features.
[0051] After completing all upsampling stages, a 1×1×1 convolutional block is used to fuse the feature maps output by the upper W network and the lower W network to obtain the final segmentation result.
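The two fusion steps can be sketched as follows. The element-wise addition follows the description directly; for the final 1×1×1 convolution, the sketch assumes that the two branch outputs are concatenated along the channel axis before the convolution, which the source does not specify. Function names, shapes and weights are hypothetical.

```python
import numpy as np

def fuse_decoder_features(upper_feat, lower_feat):
    """Element-wise addition of same-shaped decoder feature maps."""
    return upper_feat + lower_feat

def fuse_outputs(upper_out, lower_out, weight, bias):
    """1x1x1 convolution over the channel-wise concatenation of both outputs.

    upper_out, lower_out: (C, D, H, W); weight: (n_classes, 2C); bias: (n_classes,).
    """
    stacked = np.concatenate([upper_out, lower_out], axis=0)   # (2C, D, H, W)
    # A 1x1x1 convolution is a linear map over channels applied at each voxel.
    logits = np.einsum("oc,cdhw->odhw", weight, stacked)
    return logits + bias[:, None, None, None]

rng = np.random.default_rng(0)
upper = rng.standard_normal((2, 4, 4, 4))
lower = rng.standard_normal((2, 4, 4, 4))
merged = fuse_decoder_features(upper, lower)
logits = fuse_outputs(upper, lower, weight=np.ones((1, 4)), bias=np.zeros(1))
print(merged.shape, logits.shape)  # (2, 4, 4, 4) (1, 4, 4, 4)
```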
[0052] Semi-supervised segmentation neural networks employ a channel reduction strategy to reduce the number of network parameters.
[0053] The segmentation loss function consists of the segmentation loss of the upper W network and the segmentation loss of the lower W network. The segmentation loss function is expressed as:
[0054] $L_{seg} = L_{upper} + L_{lower}$;
[0055] where $L_{upper}$ is the segmentation loss of the upper W network and $L_{lower}$ is the segmentation loss of the lower W network.
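A minimal sketch of a two-branch loss of this kind is shown below. It assumes that the two branch losses are added without weighting and that each branch uses a soft Dice loss against the label; the source states only that the total loss consists of the two branch losses, so both choices are assumptions, as are the function names.

```python
import numpy as np

def soft_dice_loss(prob, target, eps=1e-5):
    """Soft Dice loss between foreground probabilities and a binary mask."""
    intersection = (prob * target).sum()
    return 1.0 - (2.0 * intersection + eps) / (prob.sum() + target.sum() + eps)

def total_segmentation_loss(upper_prob, lower_prob, target):
    """Assumed unweighted sum of the upper-branch and lower-branch losses."""
    loss_upper = soft_dice_loss(upper_prob, target)
    loss_lower = soft_dice_loss(lower_prob, target)
    return loss_upper + loss_lower

# Toy label: a 2x2x2 cube of foreground inside a 4x4x4 volume.
target = np.zeros((4, 4, 4))
target[1:3, 1:3, 1:3] = 1.0

perfect = total_segmentation_loss(target, target, target)
wrong = total_segmentation_loss(1.0 - target, 1.0 - target, target)
print(round(perfect, 3), round(wrong, 3))  # 0.0 2.0
```

With two branches the total ranges from 0 (both branches match the label) to about 2 (neither branch overlaps the label at all).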
[0056] Finally, the network is adjusted based on the different inputs and the predicted outputs to adapt to specific application scenarios. The number of pixel-level labels is reduced, and the experiment is repeated continuously to observe changes in network performance.
[0057] S4: Input the preprocessed 3D left atrial magnetic resonance image to be tested into the trained semi-supervised segmentation neural network to obtain the segmentation result of the left atrial magnetic resonance image.
[0058] Example 1
[0059] To further verify the segmentation accuracy of the method proposed in this application, this embodiment uses the Dice score, intersection over union (IoU), 95% Hausdorff distance, and average surface distance as evaluation metrics, and compares the performance of the algorithm provided in this application with existing semi-supervised and fully supervised segmentation algorithms in the field of atrial segmentation. The specific results are shown in Table 1 below:
[0060] Table 1 Evaluation Results
[0061]
[0062] As shown in Table 1, using only 10% labeled data, the method proposed in this application achieves the highest Dice score and intersection over union (IoU) among the compared methods, at 88.86% and 80.35%, respectively. Its 95% Hausdorff distance and average surface distance are also the lowest in the group, at 9.18 and 2.49, respectively, demonstrating superior overall segmentation performance compared to existing semi-supervised and fully supervised methods. With 20% labeled data, the method continues to lead in Dice score and IoU, reaching 91.58% and 83.71%, respectively, while the 95% Hausdorff distance and average surface distance are further reduced to 5.61 and 1.59, respectively, the best among all compared methods. These results verify the effectiveness of the method in left atrial segmentation tasks.
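For reference, the two overlap metrics reported above can be computed on binary masks as sketched below. The function names and the convention of returning 1.0 when both masks are empty are assumptions; the two distance metrics are omitted because they require surface extraction.

```python
import numpy as np

def dice_score(pred, gt):
    """Dice = 2|A ∩ B| / (|A| + |B|) for binary masks."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    denom = pred.sum() + gt.sum()
    if denom == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2.0 * np.logical_and(pred, gt).sum() / denom

def iou_score(pred, gt):
    """IoU = |A ∩ B| / |A ∪ B| for binary masks."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    union = np.logical_or(pred, gt).sum()
    if union == 0:
        return 1.0  # same assumed convention as above
    return np.logical_and(pred, gt).sum() / union

# Toy masks: 32 voxels each, overlapping in 16 voxels.
gt = np.zeros((4, 4, 4), dtype=bool)
pred = np.zeros((4, 4, 4), dtype=bool)
gt[0:2] = True
pred[1:3] = True
print(dice_score(pred, gt), round(iou_score(pred, gt), 3))  # 0.5 0.333
```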
[0063] Therefore, this invention employs a semi-supervised 3D left atrial segmentation method based on DoubleW-Net, introducing a channel attention mechanism module into the DoubleW-Net network architecture. This module learns from the first layer of the upper network input to adjust the feature representations of subsequent network channels, simultaneously extracting features from multiple aspects. The network uses parallel processing for feature interaction, avoiding information loss during feature extraction. Furthermore, a Transformer-like self-attention global attention block is introduced, which better focuses on global information during skip connection stages, thereby improving the accuracy and generalization ability of the convolutional neural network model in atrial segmentation. By improving the accuracy of medical image segmentation, this invention opens new avenues for computer-aided diagnostic technology.
[0064] Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of the present invention and not to limit them. Although the present invention has been described in detail with reference to preferred embodiments, those skilled in the art should understand that modifications or equivalent substitutions can still be made to the technical solutions of the present invention, and these modifications or equivalent substitutions cannot cause the modified technical solutions to deviate from the spirit and scope of the technical solutions of the present invention.
Claims
1. A semi-supervised 3D left atrial segmentation method based on DoubleW-Net, characterized in that it includes the following steps:
S1: Acquire and preprocess the 3D left atrial magnetic resonance image to be tested;
S2: Construct a training dataset based on labeled 3D left atrial magnetic resonance imaging (MRI) data and unlabeled 3D left atrial MRI data;
S3: Construct a semi-supervised segmentation neural network and train it using a segmentation loss function based on the training dataset to obtain the trained semi-supervised segmentation neural network; the semi-supervised segmentation neural network is a parallel DoubleW-Net structure, including an upper W network constructed by integrating channel attention modules and global attention modules, and a lower W network; the upper W network uses standard 3×3×3 convolutional kernels for feature extraction, and integrates a channel attention module in the first layer at the network input and a global attention module in the skip connection stage; the lower W network uses depthwise separable convolutions to construct the encoder-decoder, and does not integrate channel attention modules or global attention modules; the channel attention module obtains the weight of each channel through parallel adaptive average pooling and adaptive max pooling, and after processing by the ReLU activation function, the weights are multiplied with the original feature map channel by channel and the result is added to the input feature map as a residual connection, so as to emphasize or suppress different feature channels according to the needs of the task; the global attention module is a Transformer-like structure based on 3D convolution and integrates positional encoding to capture long-range dependencies of feature maps during the skip connection stage, thereby enhancing the global context feature representation capability;
S4: Input the preprocessed 3D left atrial magnetic resonance image to be tested into the trained semi-supervised segmentation neural network to obtain the segmentation result of the left atrial magnetic resonance image.
2. The semi-supervised 3D left atrial segmentation method based on DoubleW-Net according to claim 1, characterized in that: the preprocessing in S1 specifically includes: removing the region surrounding the 3D left atrial MRI image to be tested and retaining the central image portion by cropping the image to 112×112×80, so that the edge regions of the image are discarded, and saving the cropped data in an h5 file.
3. The semi-supervised 3D left atrial segmentation method based on DoubleW-Net according to claim 2, characterized in that: both the upper W network and the lower W network use transposed convolutions for upsampling, and the feature maps from the upsampling stage of the lower W network are added element-wise to the feature maps from the upsampling stage of the upper W network to obtain richer features.
4. The semi-supervised 3D left atrial segmentation method based on DoubleW-Net according to claim 3, characterized in that: after completing all upsampling stages, a 1×1×1 convolutional block is used to fuse the feature maps output by the upper W network and the lower W network to obtain the final segmentation result.
5. The semi-supervised 3D left atrial segmentation method based on DoubleW-Net according to claim 4, characterized in that: the semi-supervised segmentation neural network employs a channel reduction strategy to reduce the number of network parameters.
6. The semi-supervised 3D left atrial segmentation method based on DoubleW-Net according to claim 5, characterized in that: in step S3, the segmentation loss function consists of the segmentation loss of the upper W network and the segmentation loss of the lower W network.
7. The semi-supervised 3D left atrial segmentation method based on DoubleW-Net according to claim 6, characterized in that: the segmentation loss function is expressed as: $L_{seg} = L_{upper} + L_{lower}$; wherein $L_{upper}$ is the segmentation loss of the upper W network and $L_{lower}$ is the segmentation loss of the lower W network.