A lightweight multi-scale enhanced polyp segmentation system based on spatial channel grouping decoupling

By introducing an enhanced polarization multi-scale module and a multi-scale lightweight fusion module, the lightweight multi-scale polyp segmentation system addresses the trade-off between model size and segmentation accuracy in existing technologies, and achieves high-precision polyp segmentation in resource-constrained scenarios such as colonoscopy.

CN122201660A · Pending · Publication Date: 2026-06-12 · NANTONG UNIV


Patent Information

Authority / Receiving Office: CN · China
Patent Type: Application (China)
Current Assignee / Owner: NANTONG UNIV
Filing Date: 2026-03-30
Publication Date: 2026-06-12


IPC: G16H30/40; G16H50/20; G06T7/11; G06T5/60; G06T3/4053; G06V10/764; G06V10/44; G06V10/80; G06V10/82; G06V10/46; G06V10/26; G06N3/0464

AI Tagging

Application Domain

Image enhancement; Image analysis


AI Technical Summary

⚠Technical Problem

Existing polyp segmentation algorithms struggle to achieve efficient utilization of multi-scale features and high-precision edge segmentation while maintaining lightweight characteristics. This is especially true in computationally limited scenarios such as colonoscopy, where they suffer from loss of detail and insufficient segmentation accuracy.

⚗Method used

A lightweight multi-scale enhanced polyp segmentation system based on spatial channel grouping decoupling is adopted. By introducing an enhanced polarization multi-scale module in the encoding stage and a multi-scale lightweight fusion module in the decoding stage, combined with a parallel multi-scale self-attention mechanism and depthwise separable convolution, the system achieves collaborative expression of global channel semantics and local details and high-precision segmentation.

🎯Benefits of technology

While reducing computational complexity, it significantly improves the edge accuracy and segmentation effect of polyp segmentation, especially for small polyps and polyps with blurred boundaries, making it suitable for clinical application scenarios with limited computing resources.

✦ Generated by Eureka AI based on patent content.


Abstract

The application discloses a lightweight multi-scale enhanced polyp segmentation system based on spatial channel grouping decoupling, comprising: a colonoscopy image acquisition module for acquiring colonoscopy images; an image encoding module for extracting intestinal polyp feature maps at different scales; a global attention fusion module for extracting a global self-attention feature map from the intestinal polyp feature maps at different scales; an encoding fusion module for performing convolutional fusion on the intestinal polyp feature maps at different scales to obtain a first fused feature map; an enhanced polarization multi-scale module for extracting, from the first fused feature map, an intestinal polyp feature map containing local context and polyp lesion boundary details; a feature fusion module for performing element-wise addition of that feature map with the global self-attention feature map to obtain a second fused feature map; and an image decoding module for fusing the intestinal polyp feature maps of different scales in the second fused feature map to obtain the segmentation result of intestinal polyps.


Description

Technical Field

[0001] This invention relates to the fields of computer vision and medical image processing technology, specifically to a lightweight multi-scale enhanced polyp segmentation system based on spatial channel grouping decoupling.

Background Technology

[0002] Colorectal cancer, a prevalent malignant tumor worldwide, often originates from colorectal polyps. Early screening for intestinal polyps has therefore become a crucial measure for reducing the incidence and mortality of colorectal cancer. Colonoscopy is currently the mainstream screening method, but its diagnostic accuracy depends heavily on the operator's experience, and differences between operators can easily lead to missed and inconsistent diagnoses. Polyp segmentation algorithms reduce subjective interpretation bias through automated pixel-level analysis and can thereby improve the objectivity and consistency of colonoscopy. Among deep learning methods, Bo Dong, Wenhai Wang, and others proposed a Transformer-based method with stronger global modeling capabilities. However, its self-attention mechanism has high computational complexity, making real-time inference difficult on mobile devices or in resource-constrained scenarios. Furthermore, its reliance on pooling to obtain multi-scale information leads to loss of detail, resulting in insufficient polyp segmentation accuracy.

[0003] In summary, existing polyp segmentation algorithms have not effectively solved the balance between lightweight models and segmentation accuracy. There is an urgent need for a polyp segmentation system that can maintain lightweight characteristics while achieving efficient utilization of multi-scale features, collaborative expression of global dependencies and local details, and high-precision edge segmentation.

Summary of the Invention

[0004] To address the problems existing in the prior art, this invention provides a lightweight multi-scale enhanced polyp segmentation system based on spatial channel grouping decoupling. By introducing an enhanced polarization multi-scale module in the encoding stage and a multi-scale lightweight fusion module in the decoding stage, it can achieve high-precision segmentation of polyp regions while maintaining the lightweight characteristics of the polyp segmentation system. It is suitable for clinical application scenarios with limited computing resources, such as colonoscopy.

[0005] To achieve the above technical objectives, the present invention adopts the following technical solution:

[0006] A lightweight multi-scale enhanced polyp segmentation system based on spatial channel grouping decoupling includes: a colonoscopy image acquisition module, an image encoding module, an encoding fusion module, a global attention fusion module, an enhanced polarization multi-scale module, a feature fusion module, and an image decoding module that incorporates multi-scale lightweight fusion. The colonoscopy image acquisition module is used to acquire colonoscopy images containing intestinal polyps; the image encoding module is used to extract intestinal polyp feature maps at different scales from the colonoscopy images; the global attention fusion module is used to concatenate the intestinal polyp feature maps at different scales and extract a global self-attention feature map; the encoding fusion module is used to perform convolutional fusion on the extracted intestinal polyp feature maps of different scales to obtain a first fused feature map; the enhanced polarization multi-scale module is used to perform deep enhancement on the first fused feature map and extract an intestinal polyp feature map containing local context and polyp lesion boundary details; the feature fusion module is used to perform element-wise addition of the global self-attention feature map and the intestinal polyp feature map containing local context and polyp lesion boundary details to obtain a second fused feature map; the image decoding module incorporating multi-scale lightweight fusion is used to fuse the intestinal polyp feature maps of different scales in the second fused feature map to obtain the segmentation result of intestinal polyps.

[0007] Furthermore, the system also includes a colonoscopy image processing module for converting the acquired colonoscopy images to a uniform resolution and performing image enhancement, including random mirroring and rotation, on the uniform-resolution colonoscopy images.

[0008] Furthermore, the image encoding module consists of three cascaded downsampling units, used to progressively reduce the size of the intestinal polyp feature map and increase the number of channels.

[0009] Furthermore, the enhanced polarization multi-scale module includes: a parallel multi-scale self-attention feature extraction unit, a multi-scale feature fusion unit, a channel modeling unit, a spatial modeling unit, a spatial channel decoupling enhancement unit, and a depthwise separable convolution unit. The parallel multi-scale self-attention feature extraction unit is used to extract complementary global channel intestinal polyp feature maps and global spatial intestinal polyp feature maps from the first fused feature; the multi-scale feature fusion unit is used to concatenate the global channel intestinal polyp feature map and the global spatial intestinal polyp feature map along the channel dimension, and to adjust the number of channels of the concatenated feature to match that of the first fused feature, obtaining a multi-scale fused feature map; the channel modeling unit is used to model the channel dimension of the multi-scale fused feature map and extract the global channel enhancement feature map related to intestinal polyp segmentation; the spatial modeling unit is used to model the spatial dimension of the multi-scale fused feature map and extract a global spatial enhancement feature map that represents the local details of intestinal polyps; the spatial channel decoupling enhancement unit is used to add the global channel enhancement feature map and the global spatial enhancement feature map element by element to obtain the spatial channel decoupling enhancement feature map; the depthwise separable convolution unit is used to enhance the local context of the spatial channel decoupling enhancement feature map through depthwise separable convolution, and to form a residual connection with the spatial channel decoupling enhancement feature map, so as to extract the intestinal polyp feature map containing local context and polyp lesion boundary details.

[0010] Furthermore, the global channel intestinal polyp feature map is extracted as follows: (formula not reproduced)

[0011] The global spatial intestinal polyp feature map is extracted as follows: (formula not reproduced)

[0012] where the symbols in the two formulas denote, in order: the first fused feature; the target feature extraction function, the reference feature extraction function, and the feature generation function corresponding to a 1×1 convolution transformation; the target feature extraction function, the reference feature extraction function, and the feature generation function corresponding to a 3×3 convolution transformation; the transpose operation; the channel dimension C; and the spatial dimension H×W.

[0013] Furthermore, the global channel enhancement feature map is extracted as follows: i: The multi-scale fused feature map is divided evenly into feature groups along the channel dimension. The features of each feature group are linearly projected using a 2D convolution with a 1×1 kernel, and the channel attention weights associated with intestinal polyp segmentation are calculated for each feature group (formula not reproduced), where the symbols denote, in order: the 1×1 convolution transform function used to extract global target features within a channel group; the 1×1 convolution transform function used to extract global reference features within a channel group; the total number of channels in the multi-scale fused feature map; and the transpose operation; ii: The channel attention weights of each feature group are used to enhance the features of that group, yielding the channel enhancement features of each feature group; iii: The channel enhancement features of all feature groups are concatenated along the channel dimension to obtain the global channel enhancement feature map related to intestinal polyp segmentation.

[0014] Furthermore, the global spatial enhancement feature map representing the local details of intestinal polyps is extracted as follows: i: The multi-scale fused feature map is divided into region groups along the spatial dimension. The features of each region group are linearly projected using a 2D convolution with a 3×3 kernel, and the spatial attention weights are calculated for each region group (formula not reproduced), where the symbols denote, in order: the target feature extraction function corresponding to the 3×3 convolution transformation; the reference feature extraction function corresponding to the 3×3 convolution transformation; the height of the multi-scale fused feature map; the width of the multi-scale fused feature map; and the transpose operation; ii: The spatial attention weights of each region group are used to enhance the features of that group, yielding spatially enhanced feature maps that characterize the local details of intestinal polyps in each region group; iii: The spatially enhanced feature maps of all region groups are concatenated along the spatial dimension to obtain the global spatial enhancement feature map that represents the local details of intestinal polyps.

[0015] Furthermore, the image decoding module incorporating multi-scale lightweight fusion includes: a first upsampling unit, a multi-scale lightweight fusion unit, and a second upsampling unit. The first upsampling unit is used to upsample the second fused feature map; the multi-scale lightweight fusion unit is used to apply multi-scale depthwise separable convolutions to the upsampled second fused feature map and concatenate the results, extracting feature maps that express intestinal polyps at multiple scales; the second upsampling unit is used to concatenate the multi-scale intestinal polyp feature map with the feature map extracted by the first downsampling unit to obtain the segmentation result of intestinal polyps.

[0016] Furthermore, the multi-scale lightweight fusion unit includes: parallel multi-scale depthwise separable convolutional layers, a fusion layer, and an enhancement layer. The parallel multi-scale depthwise separable convolutional layers are used to extract multi-scale feature maps from the upsampled second fused feature map using depthwise separable convolution kernels of 1×1, 3×3, 5×5 and 7×7, respectively; the fusion layer is used to concatenate the extracted multi-scale feature maps; the enhancement layer is used to apply a 1×1 depthwise separable convolution to the concatenated multi-scale feature maps to obtain an enhanced feature map.

[0017] Furthermore, before deployment, the lightweight multi-scale enhanced polyp segmentation system needs to be trained until a composite loss function combining Dice loss, IoU loss, and accuracy loss converges. The composite loss function is specifically: (formula not reproduced)

[0018] where the symbols denote, in order: the Dice term, which indicates the degree of overlap between the predicted and the annotated segmented regions of intestinal polyps; the predicted segmentation map of intestinal polyps; the annotated segmentation map of intestinal polyps; a positive term; the weighting coefficient of the Dice term; the IoU term, which indicates the percentage overlap between the predicted and the annotated segmented regions of intestinal polyps; the weighting coefficient of the IoU term; the accuracy term, which represents the overall accuracy of the system in classifying each pixel; the number of samples correctly predicted as positive; the number of samples correctly predicted as negative; the number of negative samples incorrectly predicted as positive; the number of positive samples incorrectly predicted as negative; and the weighting coefficient of the accuracy term.
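The composite loss can be sketched as follows. Because the formula itself is not reproduced above, this is only one plausible reading: it assumes each term takes the common "1 minus metric" form, that the positive term is a small smoothing constant `eps`, and that masks are flat lists of 0/1 values. The function names and default weights are hypothetical.

```python
def confusion_counts(pred, target):
    """Count TP, TN, FP, FN for two flat binary masks of equal length."""
    if len(pred) != len(target) or not pred:
        raise ValueError("masks must be non-empty and the same length")
    tp = tn = fp = fn = 0
    for p, t in zip(pred, target):
        if p and t:
            tp += 1
        elif not p and not t:
            tn += 1
        elif p and not t:
            fp += 1
        else:
            fn += 1
    return tp, tn, fp, fn


def composite_loss(pred, target, w_dice=1.0, w_iou=1.0, w_acc=1.0, eps=1e-6):
    """Weighted sum of Dice, IoU and accuracy losses (assumed 1 - metric form)."""
    tp, tn, fp, fn = confusion_counts(pred, target)
    dice = (2 * tp + eps) / (2 * tp + fp + fn + eps)  # overlap of the two regions
    iou = (tp + eps) / (tp + fp + fn + eps)           # intersection over union
    acc = (tp + tn) / (tp + tn + fp + fn)             # per-pixel accuracy
    return w_dice * (1 - dice) + w_iou * (1 - iou) + w_acc * (1 - acc)


if __name__ == "__main__":
    print(composite_loss([1, 1, 0, 0], [1, 0, 0, 0]))
```

Hard 0/1 masks are used here only to make the arithmetic easy to follow; a trainable version would compute the same ratios on predicted probabilities so that gradients can flow.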

[0019] Compared with the prior art, the present invention has the following beneficial effects: This invention presents a lightweight multi-scale enhanced polyp segmentation system based on spatial channel grouping decoupling. By constructing an encoder-decoder architecture, an enhanced polarization multi-scale module (EPM) is introduced in the encoding stage. A parallel multi-scale self-attention mechanism fuses global channel semantics and global spatial structure features. A parallel channel-space grouping decoupling attention structure is then employed, grouping the multi-scale fused feature map into channels and spaces. Attention calculations are performed on both the channel and spatial dimensions within each group, reducing computational complexity while precisely capturing local feature patterns of polyp lesions. Furthermore, a 3×3 depthwise separable convolution is introduced for local context enhancement, effectively improving the perception of polyp boundaries, reducing boundary blurring and inaccuracies in the segmentation results, and significantly improving edge accuracy. Simultaneously, addressing the issue of detail loss due to pooling in existing technologies for acquiring multi-scale information, this invention introduces a lightweight multi-scale fusion module (MSLFM) in the decoding stage. Parallel multi-scale depthwise separable convolution replaces traditional pooling operations, effectively capturing feature information of intestinal polyps at different scales, improving segmentation accuracy for polyps of different sizes and shapes, especially for small polyps and polyps with blurred boundaries. 
Through the collaborative work of the above modules, this invention achieves high-precision segmentation of polyp regions while maintaining the lightweight characteristics of the enhanced polyp segmentation system, significantly reducing the number of model parameters and computational complexity, and is suitable for clinical application scenarios with limited computing resources, such as colonoscopy.

Brief Description of the Drawings

[0020] Figure 1 is a schematic diagram of the lightweight multi-scale enhanced polyp segmentation system based on spatial channel grouping decoupling of the present invention; Figure 2 is a schematic diagram of the enhanced polarization multi-scale module of the present invention; Figure 3 is a schematic diagram of the multi-scale lightweight fusion module of the present invention; Figure 4 is a schematic diagram illustrating the segmentation effect of the present invention.

Detailed Implementation

[0021] The technical solution of the present invention will be further explained and described below with reference to the accompanying drawings.

[0022] As shown in Figure 1, the lightweight multi-scale enhanced polyp segmentation system based on spatial channel grouping decoupling of the present invention includes: a colonoscopy image acquisition module, an image encoding module, an encoding fusion module, a global attention fusion module, an enhanced polarization multi-scale module, a feature fusion module, and an image decoding module that incorporates multi-scale lightweight fusion.

[0023] The colonoscopy image acquisition module is used to acquire colonoscopy images containing intestinal polyps; the image encoding module is used to extract feature maps of intestinal polyps at different scales from the colonoscopy images; the global attention fusion module is used to concatenate the feature maps of intestinal polyps at different scales and extract a global self-attention feature map; the encoding fusion module is used to perform convolutional fusion on the extracted intestinal polyp feature maps of different scales to obtain the first fused feature map; the enhanced polarization multi-scale module (EPM) is used to perform deep enhancement on the first fused feature map using a channel-space grouping decoupled attention mechanism, reducing computational complexity while achieving fine-grained feature enhancement, and to extract an intestinal polyp feature map containing local context and polyp lesion boundary details; the feature fusion module is used to perform element-wise addition of the global self-attention feature map and the intestinal polyp feature map containing local context and polyp lesion boundary details to obtain the second fused feature map; the image decoding module incorporating multi-scale lightweight fusion is used to fuse the intestinal polyp feature maps of different scales in the second fused feature map, progressively recovering and refining the features to obtain the segmentation result of intestinal polyps.

[0024] This invention presents a lightweight multi-scale enhanced polyp segmentation system based on spatial channel grouping decoupling. By constructing an encoder-decoder architecture, an enhanced polarization multi-scale module (EPM) is introduced in the encoding stage. A parallel multi-scale self-attention mechanism fuses global channel semantics and global spatial structure features. A parallel channel-space grouping decoupling attention structure is then employed, grouping the multi-scale fused feature map into channels and spaces. Attention calculations are performed on both the channel and spatial dimensions within each group, reducing computational complexity while precisely capturing local feature patterns of polyp lesions. Furthermore, a 3×3 depthwise separable convolution is introduced for local context enhancement, effectively improving the perception of polyp boundaries, reducing boundary blurring and inaccuracies in the segmentation results, and significantly improving edge accuracy. Simultaneously, addressing the issue of detail loss due to pooling in existing technologies for acquiring multi-scale information, this invention introduces a lightweight multi-scale fusion module (MSLFM) in the decoding stage. Parallel multi-scale depthwise separable convolution replaces traditional pooling operations, effectively capturing feature information of intestinal polyps at different scales, improving segmentation accuracy for polyps of different sizes and shapes, especially for small polyps and polyps with blurred boundaries. Through the collaborative work of the above modules, this invention achieves high-precision segmentation of polyp regions while maintaining the lightweight characteristics of the enhanced polyp segmentation system, significantly reducing the number of model parameters and computational complexity, and is suitable for clinical application scenarios with limited computing resources, such as colonoscopy.

[0025] In one technical solution of the present invention, a colonoscopy image processing module is also included, which is used to perform uniform resolution processing on the acquired colonoscopy images and to perform image enhancement on the colonoscopy images with uniform resolution, including random mirroring and rotation, so as to improve the diversity and generalization ability of colonoscopy image data, and improve the stability of numerical calculation through normalization.
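The augmentation and normalization steps named above can be sketched on a small grayscale image stored as a list of rows. The source does not give rotation angles or the normalization rule, so this sketch assumes rotations in multiples of 90 degrees and division by 255; resizing to a uniform resolution is omitted, and all function names are hypothetical.

```python
import random


def mirror(img):
    """Flip an image (list of rows) left to right."""
    return [row[::-1] for row in img]


def rotate90(img):
    """Rotate an image 90 degrees clockwise."""
    return [list(col) for col in zip(*img[::-1])]


def normalize(img, max_value=255.0):
    """Scale pixel values into [0, 1] (assumed normalization rule)."""
    return [[p / max_value for p in row] for row in img]


def augment(img, rng):
    """Randomly mirror, rotate by a random multiple of 90 degrees, then normalize."""
    if rng.random() < 0.5:
        img = mirror(img)
    for _ in range(rng.randrange(4)):
        img = rotate90(img)
    return normalize(img)


if __name__ == "__main__":
    print(augment([[0, 255], [51, 102]], random.Random(0)))
```

In a segmentation setting the same random mirror and rotation would also have to be applied to the annotation mask so that image and label stay aligned.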

[0026] In one technical solution of the present invention, the image encoding module consists of three cascaded downsampling units, used to progressively reduce the size of the intestinal polyp feature map and increase the number of channels. The first-level downsampling unit processes the colonoscopy image to generate a first-scale feature map X1; the second-level downsampling unit processes the first-scale feature map X1 to generate a second-scale feature map X2; and the third-level downsampling unit processes the second-scale feature map X2 to generate a third-scale feature map X3.
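The effect of the three cascaded downsampling units on feature-map size can be illustrated with simple shape bookkeeping. The input size (352×352), the stride of 2 per unit, and the channel widths (32, 64, 128) are assumptions chosen for illustration; the source states only that the spatial size shrinks and the channel count grows at each level.

```python
def encoder_shapes(height, width, channels=(32, 64, 128), stride=2):
    """Return the (C, H, W) shape after each cascaded downsampling unit.

    Assumption: every unit divides H and W by `stride` and sets the channel
    count to the next entry of `channels`; neither value comes from the source.
    """
    shapes = []
    h, w = height, width
    for c in channels:
        h, w = h // stride, w // stride
        shapes.append((c, h, w))
    return shapes


if __name__ == "__main__":
    for name, shape in zip(("X1", "X2", "X3"), encoder_shapes(352, 352)):
        print(name, shape)
```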

[0027] In one technical solution of the present invention, as shown in Figure 2, the enhanced polarization multi-scale module includes: a parallel multi-scale self-attention feature extraction unit, a multi-scale feature fusion unit, a channel modeling unit, a spatial modeling unit, a spatial-channel decoupling enhancement unit, and a depthwise separable convolution unit. The parallel multi-scale self-attention feature extraction unit fuses global semantic and local structural features; the parallel channel modeling unit and spatial modeling unit then reduce computational complexity and achieve fine-grained feature enhancement through a grouping mechanism. Finally, the depthwise separable convolution enhances the local context and polyp lesion boundary details of the features.

[0028] Parallel multi-scale self-attention feature extraction units are used to extract complementary global channel intestinal polyp feature maps and global spatial intestinal polyp feature maps from the first fusion features through convolution of 1×1 and 3×3 receptive fields. These are used to capture the global channel correlation between intestinal polyps and the background, as well as the global spatial structure of polyps such as edges, textures, and shapes, thereby enhancing the semantic discrimination ability of lesions and improving the localization accuracy of lesion boundaries.

[0029] The global channel intestinal polyp feature map is extracted as follows: (formula not reproduced)

[0030] The global spatial intestinal polyp feature map is extracted as follows: (formula not reproduced)

[0031] where the symbols in the two formulas denote, in order: the first fused feature; the target feature extraction function corresponding to a 1×1 convolution transformation, which extracts feature representations of intestinal polyp candidate regions; the reference feature extraction function corresponding to a 1×1 convolution transformation, which extracts the feature representation of the global context; the feature generation function corresponding to a 1×1 convolution transformation, which extracts global context feature maps; the target feature extraction function corresponding to the 3×3 convolution transformation, which extracts feature representations of intestinal polyp candidate regions; the reference feature extraction function corresponding to the 3×3 convolution transformation, which extracts feature representations of the local context; the feature generation function corresponding to the 3×3 convolution transformation, which extracts local context feature maps; the transpose operation; the channel dimension C; and the spatial dimension H×W.

[0032] The multi-scale feature fusion unit is used to concatenate the global channel intestinal polyp feature map and the global spatial intestinal polyp feature map along the channel dimension, and to adjust the number of channels of the concatenated features to match that of the first fused feature, obtaining a multi-scale fused feature map (formula not reproduced).

[0033] In that formula, Concat() represents the concatenation operation along the channel dimension.
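Since the extraction formulas are not reproduced, the sketch below only illustrates the general shape of the two parallel branches and their fusion, under several assumptions: the feature map is flattened to a C×N matrix (N = H×W), the 1×1 and 3×3 convolutional projections are replaced by the identity, scores are scaled by the square root of the contracted dimension, and the channel-adjusting 1×1 convolution is stood in for by a fixed averaging matrix. None of these choices is confirmed by the source.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def softmax_rows(m):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    out = []
    for row in m:
        peak = max(row)
        exps = [math.exp(v - peak) for v in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out


def channel_branch(x):
    """x is C x N. Builds a C x C affinity: each channel attends to all channels."""
    n = len(x[0])
    scores = [[v / math.sqrt(n) for v in row] for row in matmul(x, transpose(x))]
    return matmul(softmax_rows(scores), x)  # C x N


def spatial_branch(x):
    """x is C x N. Builds an N x N affinity: each pixel attends to all pixels."""
    c = len(x)
    xt = transpose(x)  # N x C
    scores = [[v / math.sqrt(c) for v in row] for row in matmul(xt, x)]
    return transpose(matmul(softmax_rows(scores), xt))  # back to C x N


def fuse(x):
    """Concatenate both branches along channels (2C x N), then mix back to C x N."""
    c = len(x)
    stacked = channel_branch(x) + spatial_branch(x)  # list concat = channel concat
    # Placeholder for a learned 1x1 convolution: average the two branches.
    weights = [[0.5 if j in (i, i + c) else 0.0 for j in range(2 * c)] for i in range(c)]
    return matmul(weights, stacked)


if __name__ == "__main__":
    print(fuse([[1.0, 2.0, 3.0], [0.5, 0.0, -1.0]]))
```

The point of the sketch is the shapes: the channel branch costs a C×C affinity, the spatial branch an N×N affinity, and the fusion step returns the concatenated 2C channels to C.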

[0034] The channel modeling unit is used to model the channel dimension of the multi-scale fused feature map and extract the global channel enhancement feature map related to intestinal polyp segmentation; specifically: i: The multi-scale fused feature map is uniformly divided into G feature groups along the channel dimension. To balance computational efficiency, feature granularity, and segmentation accuracy for small polyps and blurred boundaries, this invention sets G=4: fewer than 4 channel groups would lead to computational redundancy and coarse granularity, while more than 4 would lead to feature fragmentation and semantic loss. The channel dimension of each group is Gg=C / G, where C is the number of channels.

[0035] Each feature group is linearly projected using a 2D convolution with a 1×1 kernel, and the channel attention weights associated with intestinal polyp segmentation are calculated for that group (formula not reproduced), where the symbols denote, in order: the 1×1 convolution transform function used to extract global target features within a channel group; the 1×1 convolution transform function used to extract global reference features within a channel group; the total number of channels in the multi-scale fused feature map; and the transpose operation; ii: The channel attention weights of each feature group are used to enhance the features of that group, yielding the channel enhancement features of each feature group; iii: The channel enhancement features of all feature groups are concatenated along the channel dimension to obtain the global channel enhancement feature map related to intestinal polyp segmentation.
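Steps i to iii above follow a split, process, concatenate pattern, and the saving from grouping can be checked with simple arithmetic. The sketch below assumes the channel attention builds one square affinity matrix per group, so G groups of C/G channels need G·(C/G)² entries instead of C²; the per-group attention itself is passed in as a hypothetical callable `fn`, and C = 64 is an illustrative value.

```python
def channel_affinity_entries(channels, groups=1):
    """Number of affinity entries when attention is computed within each group."""
    if channels % groups != 0:
        raise ValueError("channels must be divisible by groups")
    per_group = channels // groups
    return groups * per_group * per_group


def grouped_channel_apply(x, groups, fn):
    """Split the C rows of x into equal groups, apply fn to each, concatenate."""
    if len(x) % groups != 0:
        raise ValueError("channel count must be divisible by groups")
    size = len(x) // groups
    out = []
    for g in range(groups):
        out.extend(fn(x[g * size:(g + 1) * size]))  # step ii on one group
    return out  # step iii: concatenation along the channel dimension


if __name__ == "__main__":
    print(channel_affinity_entries(64), channel_affinity_entries(64, groups=4))
```

With G = 4 the affinity size drops by a factor of four, which is the kind of reduction the grouping mechanism is described as providing.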

[0036] The spatial modeling unit is used to model the spatial dimensions of multi-scale fused feature maps and extract global spatially enhanced feature maps that characterize local details of intestinal polyps; specifically: i: The scale-fused feature map is divided into R region groups along the spatial dimension. To achieve the optimal balance between computational efficiency, feature granularity, and segmentation accuracy for small polyps and blurred boundaries, this invention uses a 2×2 grid. If there are too few grid groups, the regions will be too large, background interference will be strong, and polyp boundaries will be easily blurred; if there are too many grid groups, the computational cost will increase, and excessively fragmented features will lose global context, which will also affect segmentation accuracy. The spatial resolution of each region is Hr=H / 2, Wr=W / 2.

[0037] Each region group is linearly projected using a 2D convolution with a 3×3 kernel, and the spatial attention weights are calculated for that region group (formula not reproduced), where the symbols denote, in order: the target feature extraction function corresponding to the 3×3 convolution transformation; the reference feature extraction function corresponding to the 3×3 convolution transformation; the height of the multi-scale fused feature map; the width of the multi-scale fused feature map; and the transpose operation; ii: The spatial attention weights of each region group are used to enhance the features of that group, yielding spatially enhanced feature maps that characterize the local details of intestinal polyps in each region group; iii: The spatially enhanced feature maps of all region groups are concatenated along the spatial dimension to obtain the global spatial enhancement feature map representing the local details of intestinal polyps.
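The 2×2 spatial grouping can be illustrated on a single-channel map stored as a list of rows: split into four regions, process each, and stitch the results back in place. The cost helper assumes a square pixel-to-pixel affinity per region; the 44×44 size is an illustrative value, not one given in the source, and the function names are hypothetical.

```python
def split_regions(fmap, rows=2, cols=2):
    """Split an H x W map into rows*cols equal regions, in row-major order."""
    h, w = len(fmap), len(fmap[0])
    if h % rows or w % cols:
        raise ValueError("map size must be divisible by the grid")
    rh, rw = h // rows, w // cols
    return [
        [line[j * rw:(j + 1) * rw] for line in fmap[i * rh:(i + 1) * rh]]
        for i in range(rows)
        for j in range(cols)
    ]


def stitch_regions(regions, rows=2, cols=2):
    """Inverse of split_regions: reassemble the regions into one H x W map."""
    out = []
    for i in range(rows):
        band = regions[i * cols:(i + 1) * cols]
        for k in range(len(band[0])):
            line = []
            for block in band:
                line.extend(block[k])
            out.append(line)
    return out


def spatial_affinity_entries(h, w, rows=1, cols=1):
    """Affinity entries when pixels attend only within their own region."""
    n = (h // rows) * (w // cols)
    return rows * cols * n * n


if __name__ == "__main__":
    print(spatial_affinity_entries(44, 44), spatial_affinity_entries(44, 44, 2, 2))
```

For a 2×2 grid the total affinity size is a quarter of the ungrouped case, while each region still covers a quarter of the image, which matches the stated trade-off between cost and retained context.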

[0038] The spatial channel decoupling enhancement unit performs element-wise addition of the global channel enhancement feature map and the global spatial enhancement feature map to obtain the spatial channel decoupling enhancement feature map: [formula missing].

[0039] The depthwise separable convolution unit enhances the local context of the spatial channel decoupling enhancement feature map through depthwise separable convolution and then applies a residual connection with the spatial channel decoupling enhancement feature map to extract the intestinal polyp feature map containing local context and polyp lesion boundary details: [formula missing].

[0040] In one technical solution of the present invention, the image decoding module incorporating multi-scale lightweight fusion includes: a first upsampling unit, a multi-scale lightweight fusion unit (MSLFM), and a second upsampling unit. The first upsampling unit is used to upsample the second fused feature map. The MSLFM applies multi-scale depthwise separable convolutions to the upsampled second fused feature map and concatenates the results to extract feature maps of intestinal polyps expressed at multiple scales. As shown in Figure 3, the MSLFM includes parallel multi-scale depthwise separable convolutional layers, a fusion layer, and an enhancement layer. The parallel multi-scale depthwise separable convolutional layers extract multi-scale feature maps from the upsampled second fused feature map using 1×1, 3×3, 5×5, and 7×7 depthwise separable convolution kernels, with each kernel size corresponding to one feature extraction branch. Parallel convolution replaces traditional pooling operations for obtaining multi-scale information and avoids the loss of detail caused by pooling. The fusion layer concatenates the extracted multi-scale feature maps. The enhancement layer applies a 1×1 depthwise separable convolution to uniformly adjust the number of channels of the concatenated multi-scale feature maps, thereby obtaining enhanced feature maps and enriching the multi-scale representation of features.
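To show why depthwise separable kernels keep the four parallel branches lightweight, the following sketch compares weight counts for standard and depthwise separable convolutions at each kernel size. The formulas are the usual ones (bias terms omitted); the channel count of 64 and the function names are illustrative assumptions, not values from the invention.

```python
def standard_conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias omitted)."""
    return k * k * c_in * c_out

def depthwise_separable_params(c_in, c_out, k):
    """Weights in a k x k depthwise convolution followed by a 1x1 pointwise
    convolution (bias omitted)."""
    return k * k * c_in + c_in * c_out

KERNEL_SIZES = (1, 3, 5, 7)  # the four parallel branches named in the text

def branch_param_table(c_in, c_out):
    """Map each kernel size to (standard, depthwise separable) weight counts."""
    return {
        k: (standard_conv_params(c_in, c_out, k),
            depthwise_separable_params(c_in, c_out, k))
        for k in KERNEL_SIZES
    }

table = branch_param_table(64, 64)  # 64 channels is an illustrative assumption
for k, (standard, separable) in table.items():
    print(f"{k}x{k}: standard={standard}, depthwise separable={separable}")

total_standard = sum(standard for standard, _ in table.values())
total_separable = sum(separable for _, separable in table.values())
```

Under these assumptions the 1×1 branch gains nothing from the separable form, and the saving grows with kernel size, which is where the 5×5 and 7×7 branches would otherwise dominate the parameter budget.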

[0041] The second upsampling unit is used to concatenate the feature maps of intestinal polyps expressed at multiple scales with the feature maps extracted by the first downsampling unit to obtain the segmentation results of intestinal polyps.

[0042] In one technical solution of the present invention, before the lightweight multi-scale enhanced polyp segmentation system based on spatial channel grouping decoupling is put into use, it needs to be trained until a composite loss function that fuses the Dice loss, the IoU loss, and the accuracy loss converges. The composite loss supervises the model from three dimensions: region overlap, boundary consistency, and pixel classification accuracy, thereby improving the segmentation accuracy of the lightweight multi-scale enhanced polyp segmentation system for polyp regions in colonoscopy images.

[0043] The composite loss function of this invention is specifically: [formula missing]

[0044] In the formula, the Dice term indicates the degree of overlap between the predicted segmented regions of intestinal polyps and the actually labeled segmented regions of intestinal polyps. It is computed from the predicted segmentation map of intestinal polyps and the actually annotated segmentation map of intestinal polyps, with an extremely small positive term added to prevent the denominator from being 0, and it has its own weighting coefficient. The IoU term indicates the percentage overlap between the predicted segmented regions of intestinal polyps and the actually labeled segmented regions of intestinal polyps, and it has its own weighting coefficient. The accuracy term represents the overall accuracy of the lightweight multi-scale enhanced polyp segmentation system in classifying each pixel. It is computed from the number of samples correctly predicted as positive (true positives), the number of samples correctly predicted as negative (true negatives), the number of negative samples incorrectly predicted as positive (false positives), and the number of positive samples incorrectly predicted as negative (false negatives), and it has its own weighting coefficient.

[0045] Since intestinal polyp segmentation is a typical class-imbalanced task, Dice is the more suitable primary supervision term. IoU can be used as a second supervision term to enhance overlap consistency, while ACC is easily biased by background pixels when the foreground is sparse, so it is better given a smaller weight. This invention suggests setting the three weighting coefficients to [values missing].
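The composite loss described above can be sketched in plain Python. Since the formula and the suggested weights are missing from this text, the sketch assumes the common form in which each term enters as one minus its score, uses hard binary masks for readability, and takes the weights as caller-supplied arguments. The values 0.5, 0.3, and 0.2 in the example are hypothetical placeholders, chosen only so that ACC carries the smallest weight.

```python
def confusion_counts(pred, target):
    """Count TP, TN, FP, FN for two flat binary masks of equal length."""
    tp = sum(1 for p, t in zip(pred, target) if p == 1 and t == 1)
    tn = sum(1 for p, t in zip(pred, target) if p == 0 and t == 0)
    fp = sum(1 for p, t in zip(pred, target) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, target) if p == 0 and t == 1)
    return tp, tn, fp, fn

def composite_loss(pred, target, w_dice, w_iou, w_acc, eps=1e-6):
    """Weighted sum of Dice, IoU and accuracy losses (assumed form: 1 - score)."""
    tp, tn, fp, fn = confusion_counts(pred, target)
    dice = (2 * tp) / (2 * tp + fp + fn + eps)  # eps keeps the denominator non-zero
    iou = tp / (tp + fp + fn + eps)
    acc = (tp + tn) / (tp + tn + fp + fn)       # masks are assumed non-empty
    return w_dice * (1 - dice) + w_iou * (1 - iou) + w_acc * (1 - acc)

pred = [1, 1, 0, 0, 1, 0, 0, 0]    # predicted mask, flattened
target = [1, 0, 0, 0, 1, 1, 0, 0]  # annotated mask, flattened
# Placeholder weights: the values suggested by the source are not available.
loss = composite_loss(pred, target, w_dice=0.5, w_iou=0.3, w_acc=0.2)
```

In this toy case six of eight pixels are classified correctly, so accuracy is 0.75 while IoU is only 0.5, which illustrates how background pixels inflate accuracy when the foreground is sparse. A training implementation would replace the hard counts with differentiable sums over predicted probabilities.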

[0046] The intestinal polyp segmentation performance of the lightweight multi-scale enhanced polyp segmentation system based on spatial channel grouping decoupling of the present invention was quantitatively evaluated on the Kvasir-SEG dataset. Table 1 compares the present invention with the existing lightweight polyp segmentation algorithm PMFSNet on this dataset. The Dice, IoU, and Acc of the present invention are all higher than those of PMFSNet on the same dataset, indicating that the present invention has high segmentation accuracy for intestinal polyps in colonoscopy images. At the same time, the number of model parameters and the computational cost of the present invention are lower than those of PMFSNet, indicating that the present invention is better suited to clinical applications with limited computational resources, such as colonoscopy. Table 2 shows the results of ablation experiments that verify the impact of EPM and MSLFM on the intestinal polyp segmentation performance of the present invention. Ablation experiments are a core method for verifying the effectiveness of modules in deep learning and machine learning. Removing EPM and MSLFM in turn shows that the best segmentation results are obtained only when EPM and MSLFM work together.

[0047] Table 1: Performance of the present invention and the existing lightweight polyp segmentation algorithm PMFSNet on the Kvasir-SEG dataset.

[0048] Table 2: Comparison of ablation experiments. The second row of the table shows the ablation of EPM, and the third row shows the ablation of MSLFM.

[0049] The polyp segmentation effect of the lightweight multi-scale enhanced polyp segmentation system based on spatial channel grouping decoupling of this invention was further analyzed qualitatively. As shown in Figure 4, this invention can accurately extract spatial detail information from small, low-contrast regions in complex polyp morphology. Even in images with optical noise interference, it can accurately distinguish polyp tissue from interfering information, and it exhibits stronger robustness against artifact interference and boundary blurring. At the same time, the number of parameters is less than 1M and the FLOPs are only 1.74G, meeting the requirements for lightweight design. These results demonstrate that the lightweight multi-scale enhanced polyp segmentation system based on spatial channel grouping decoupling of this invention can achieve precise polyp segmentation, providing a reliable auxiliary tool for clinical diagnosis and treatment, with significant application value especially in scenarios such as colonoscopy.

[0050] The above are merely preferred embodiments of the present invention. The scope of protection of the present invention is not limited to the above embodiments. All technical solutions falling within the scope of the present invention's concept are within the scope of protection of the present invention. It should be noted that for those skilled in the art, any improvements and modifications made without departing from the principles of the present invention should be considered within the scope of protection of the present invention.

Claims

1. A lightweight multi-scale enhanced polyp segmentation system based on spatial channel grouping decoupling, characterized in that the system includes: a colonoscopy image acquisition module, an image encoding module, an encoding fusion module, a global attention fusion module, an enhanced polarization multi-scale module, a feature fusion module, and an image decoding module that incorporates multi-scale lightweight fusion; the colonoscopy image acquisition module is used to acquire colonoscopy images containing intestinal polyps; the image encoding module is used to extract intestinal polyp feature maps at different scales from the colonoscopy images; the global attention fusion module is used to concatenate the intestinal polyp feature maps at different scales and extract a global self-attention feature map; the encoding fusion module is used to perform convolutional fusion on the extracted intestinal polyp feature maps at different scales to obtain a first fused feature map; the enhanced polarization multi-scale module is used to perform deep enhancement on the first fused feature map and extract an intestinal polyp feature map containing local context and polyp lesion boundary details; the feature fusion module is used to perform element-wise addition on the global self-attention feature map and the intestinal polyp feature map containing local context and polyp lesion boundary details to obtain a second fused feature map; the image decoding module that incorporates multi-scale lightweight fusion is used to fuse the intestinal polyp feature maps at different scales in the second fused feature map to obtain the segmentation result of intestinal polyps.

2. The lightweight multi-scale enhanced polyp segmentation system based on spatial channel grouping decoupling according to claim 1, characterized in that it further includes a colonoscopy image processing module, which is used to bring the acquired colonoscopy images to a uniform resolution and to perform image enhancement, including random mirroring and rotation, on the uniform-resolution colonoscopy images.

3. The lightweight multi-scale enhanced polyp segmentation system based on spatial channel grouping decoupling according to claim 1, characterized in that the image encoding module consists of three cascaded downsampling units, which are used to progressively reduce the size of the intestinal polyp feature map and increase the number of channels.

4. The lightweight multi-scale enhanced polyp segmentation system based on spatial channel grouping decoupling according to claim 1, characterized in that the enhanced polarization multi-scale module includes: a parallel multi-scale self-attention feature extraction unit, a multi-scale feature fusion unit, a channel modeling unit, a spatial modeling unit, a spatial channel decoupling enhancement unit, and a depthwise separable convolution unit; the parallel multi-scale self-attention feature extraction unit is used to extract complementary global channel intestinal polyp feature maps and global spatial intestinal polyp feature maps from the first fused feature map; the multi-scale feature fusion unit is used to concatenate the global channel intestinal polyp feature map and the global spatial intestinal polyp feature map along the channel dimension, and to adjust the number of channels of the concatenated features to be consistent with the number of channels of the first fused feature map, to obtain a multi-scale fused feature map; the channel modeling unit is used to perform channel dimension modeling on the multi-scale fused feature map and extract the global channel enhancement feature map related to intestinal polyp segmentation; the spatial modeling unit is used to perform spatial dimension modeling on the multi-scale fused feature map and extract a global spatial enhancement feature map that represents the local details of intestinal polyps; the spatial channel decoupling enhancement unit is used to add the global channel enhancement feature map and the global spatial enhancement feature map element by element to obtain the spatial channel decoupling enhancement feature map; the depthwise separable convolution unit is used to enhance the local context of the spatial channel decoupling enhancement feature map through depthwise separable convolution, and to perform a residual connection with the spatial channel decoupling enhancement feature map to extract the intestinal polyp feature map containing local context and polyp lesion boundary details.

5. The lightweight multi-scale enhanced polyp segmentation system based on spatial channel grouping decoupling according to claim 4, characterized in that the extraction process of the global channel intestinal polyp feature map is as follows: [formula missing]; the extraction process of the global spatial intestinal polyp feature map is as follows: [formula missing]; in these formulas, the symbols denote, respectively: the first fused feature; the target feature extraction function, the reference feature extraction function, and the feature generation function corresponding to a 1×1 convolution transformation; the target feature extraction function, the reference feature extraction function, and the feature generation function corresponding to a 3×3 convolution transformation; the transpose operation; the channel dimension; and the spatial dimension.

6. The lightweight multi-scale enhanced polyp segmentation system based on spatial channel grouping decoupling according to claim 4, characterized in that the extraction process of the global channel enhancement feature map is as follows: i: the scale-fused feature map is divided evenly into feature groups along the channel dimension, and the features of each feature group [symbol missing] are linearly projected using a 2D convolution with a kernel size of 1×1 to calculate the channel attention weights associated with intestinal polyp segmentation for each feature group: [formula missing], where one 1×1 convolution transform function extracts the global target features within a channel group, another 1×1 convolution transform function extracts the global reference features within a channel group, and the remaining symbols denote the total number of channels in the scale-fused feature map and the transpose operation; ii: the channel attention weights of each feature group are used to enhance the features of that feature group to obtain the channel enhancement features of each feature group: [formula missing]; iii: the channel enhancement features of all feature groups are concatenated along the channel dimension to obtain the global channel enhancement feature map related to intestinal polyp segmentation.

7. The lightweight multi-scale enhanced polyp segmentation system based on spatial channel grouping decoupling according to claim 4, characterized in that the extraction process of the global spatial enhancement feature map representing the local details of intestinal polyps is as follows: i: the scale-fused feature map is divided into region groups along the spatial dimension, and the features of each region group [symbol missing] are linearly projected using a 2D convolution with a 3×3 kernel to calculate the spatial attention weights of each region group: [formula missing], where one 3×3 convolution transformation serves as the target feature extraction function, another serves as the reference feature extraction function, and the remaining symbols denote the height of the scale-fused feature map, the width of the scale-fused feature map, and the transpose operation; ii: the spatial attention weights of each region group are used to enhance the features of that region group to obtain a spatial enhancement feature map characterizing the local details of intestinal polyps in each region group: [formula missing]; iii: the spatial enhancement feature maps of all region groups are concatenated along the spatial dimension to obtain the global spatial enhancement feature map that represents the local details of intestinal polyps.

8. The lightweight multi-scale enhanced polyp segmentation system based on spatial channel grouping decoupling according to claim 3, characterized in that the image decoding module that incorporates multi-scale lightweight fusion includes: a first upsampling unit, a multi-scale lightweight fusion unit, and a second upsampling unit; the first upsampling unit is used to upsample the second fused feature map; the multi-scale lightweight fusion unit is used to apply multi-scale depthwise separable convolutions to the upsampled second fused feature map and concatenate the results to extract the feature maps of intestinal polyps expressed at multiple scales; the second upsampling unit is used to concatenate the feature maps of intestinal polyps expressed at multiple scales with the feature map extracted by the first downsampling unit to obtain the segmentation result of intestinal polyps.

9. The lightweight multi-scale enhanced polyp segmentation system based on spatial channel grouping decoupling according to claim 8, characterized in that the multi-scale lightweight fusion unit includes: parallel multi-scale depthwise separable convolutional layers, a fusion layer, and an enhancement layer; the parallel multi-scale depthwise separable convolutional layers are used to extract multi-scale feature maps from the upsampled second fused feature map using depthwise separable convolution kernels of 1×1, 3×3, 5×5, and 7×7, respectively; the fusion layer is used to concatenate the extracted multi-scale feature maps; the enhancement layer is used to obtain an enhanced feature map by applying a 1×1 depthwise separable convolution to uniformly adjust the concatenated multi-scale feature maps.

10. The lightweight multi-scale enhanced polyp segmentation system based on spatial channel grouping decoupling according to claim 1, characterized in that, before deployment, the lightweight multi-scale enhanced polyp segmentation system needs to be trained until a composite loss function that fuses the Dice loss, the IoU loss, and the accuracy loss converges; the composite loss function is specifically: [formula missing]; in the formula, the Dice term indicates the degree of overlap between the predicted segmented regions of intestinal polyps and the actually labeled segmented regions of intestinal polyps, is computed from the predicted segmentation map of intestinal polyps, the actually annotated segmentation map of intestinal polyps, and a positive term, and has its own weighting coefficient; the IoU term indicates the percentage overlap between the predicted segmented regions of intestinal polyps and the actually labeled segmented regions of intestinal polyps, and has its own weighting coefficient; the accuracy term represents the overall accuracy of the lightweight multi-scale enhanced polyp segmentation system in classifying each pixel, is computed from the number of samples correctly predicted as positive, the number of samples correctly predicted as negative, the number of negative samples incorrectly predicted as positive, and the number of positive samples incorrectly predicted as negative, and has its own weighting coefficient.