Method and system for fully automatic segmentation and classification of pathological images based on deep learning
By combining dilated-SE residual units and self-attention dense units in a deep learning framework with a tissue-region guidance branch, the invention addresses the accuracy and efficiency problems of cell nucleus segmentation and classification in pathological images, achieving efficient, fully automatic segmentation and classification.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- NANJING UNIV OF SCI & TECH
- Filing Date
- 2025-07-17
- Publication Date
- 2026-06-12
Description
Technical Field
[0001] This invention belongs to the field of medical image processing technology, and specifically relates to a method and system for fully automatic segmentation and classification of pathological images based on deep learning.
Background Technology
[0002] With the advent of whole-slide imaging technology, digital pathology has come to play a crucial role in modern clinical practice. Nuclear segmentation and classification are fundamental tasks in digital pathology. As a key step in the analysis process, accurate nuclear segmentation provides a reliable basis for subsequent histological analysis and clinical prognostic prediction, helping to assess disease outcomes. Furthermore, nuclear feature extraction and analysis are widely applied in survival prediction, pathological grading, and cancer diagnosis. Different types of nuclei may also have biological significance. For example, the tumor microenvironment can influence cancer progression by recruiting specific types of neural progenitor cells (NPCs).
[0003] However, manual nuclear analysis is often affected by inter- and intra-observer variability and is typically limited by processing efficiency and classification accuracy, especially in large-scale tissue section analysis. Therefore, automated segmentation methods, particularly deep learning-based nuclear segmentation and classification techniques, are increasingly becoming a focus of pathological image analysis research.
[0004] Despite some progress, nucleus segmentation remains a challenging task, especially in certain tissues such as tumor tissue, where nuclei frequently cluster or overlap, significantly increasing the difficulty of automated segmentation. Furthermore, inconsistencies in color and blurred nucleus boundaries caused by uneven manual manipulation also affect the quality of nucleus segmentation. For nucleus classification tasks, class imbalance in datasets has always been a persistent problem. In addition, misclassification, overlap, and blurred boundaries of nuclei remain significant challenges in research.
[0005] In addition to the challenges mentioned above, some methods treat cell nucleus segmentation and classification as a two-stage task: first, cell nuclei are segmented or detected, and then they are classified based on nuclear or surrounding environmental features. The two-stage approach fails to fully utilize the spatial relationships between nuclei and the overall tissue structure information, which affects classification accuracy; furthermore, errors in the first-stage segmentation results (such as missed detections or incorrect contours) directly impact the second-stage classification; and it inevitably increases training time.
Summary of the Invention
[0006] The purpose of this invention is to provide a method and system for fully automatic segmentation and classification of pathological images based on deep learning, which solves the problems of incorrect classification due to insufficient extraction of environmental information, large number of training parameters, and slow speed in existing methods for segmenting and classifying cell nuclei in pathological images, and achieves the goal of accurate segmentation and classification of cell nuclei.
[0007] To achieve the above-mentioned objectives, the present invention adopts the following technical solution: Firstly, the present invention provides a method for fully automatic segmentation and classification of pathological images based on deep learning, comprising the following steps:
[0008] (1) Select the tissue slice image to be processed and perform preliminary image preprocessing, including pre-segmentation, content filtering and color normalization;
[0009] (2) Feature extraction is performed using an encoder composed of dilated-SE residual units and a unified feature decoder composed of self-attention dense units;
[0010] (3) Based on the extracted features, nuclear pixel maps, horizontal and vertical distance maps, and classification maps are generated in the segmentation and classification modules. In the classification module, a tissue-region guidance branch is introduced to improve classification accuracy.
[0011] (4) Each instance is classified by performing pixel-by-pixel voting within each predicted cell nucleus region; the final cell nucleus segmentation classification result is obtained.
[0012] Furthermore, in step (1), images of a size suitable for deep learning are obtained through pre-segmentation, color differences are eliminated through color normalization, and the processed images are used as the input images.
[0013] Further, step (2) specifically includes: an encoder composed of dilated-SE residual units extracts features; a unified feature decoder composed of self-attention dense units obtains shared features. The dilated portion of the dilated-SE residual unit can be represented as:
[0014] $$Y(i,j)=\sum_{m=0}^{k_h-1}\sum_{n=0}^{k_w-1} X(i+r\,m,\; j+r\,n)\,W(m,n)$$
[0015] where $Y$ is the output feature map, $X$ is the input feature map, $W$ denotes the convolution kernel weights, $r$ is the dilation rate, which controls the size of the receptive field, and $k_h$ and $k_w$ are the height and width of the convolution kernel;
[0016] The SE layer can be represented as:
[0017] $$\tilde{X} = X \odot \sigma\big(W_2\,\delta(W_1\,\mathrm{GAP}(X))\big)$$
[0018] where $\tilde{X}$ is the output feature map, $X$ is the input feature map, $\mathrm{GAP}$ is the global average pooling operation, $\delta$ is the ReLU activation function, $\sigma$ is the Sigmoid activation function, $W_1$ and $W_2$ are the weight matrices of two linear layers, and $\odot$ denotes element-wise multiplication;
[0019] For the SE layer, global information for each channel is obtained through global average pooling:
[0020] $$z_{b,c}=\frac{1}{H\,W}\sum_{i=1}^{H}\sum_{j=1}^{W} X_{b,c,i,j}$$
[0021] where $z_{b,c}$ is the global average pooling result of the $b$-th sample on the $c$-th channel, $X \in \mathbb{R}^{B\times C\times H\times W}$ is the input feature map, $B$ is the batch size, $C$ is the number of channels, and $H$ and $W$ are the height and width of the feature map, respectively.
[0022] The fully connected layers are represented as:
[0023] $$s_b=\sigma\big(W_2\,\delta(W_1\,z_b)\big)$$
[0024] where $s_b$ is the desired channel attention weight vector, $W_1$ and $W_2$ are the weight matrices of the two fully connected layers, and $z_b$ is the vector of global average pooling results for the $b$-th sample;
[0025] The formula for calculating feature recalibration is:
[0026] $$\tilde{X}_{b,c,i,j}=s_{b,c}\,X_{b,c,i,j}$$
[0027] where $\tilde{X}$ is the final output feature map.
[0028] Furthermore, the encoder composed of dilated-SE residual units consists of the following parts: (1) a conv0 module, used as the starting module, consisting of a convolutional layer with a kernel size of [missing information] and a stride of 1, a batch normalization layer, and a ReLU activation layer; (2) an encoder backbone module consisting of four feature extraction layers, each composed of a different number of dilated-SE residual units, used to extract local and global features; (3) a convbot layer, which adjusts the feature dimension and improves the feature expression capability.
[0029] Furthermore, the unified feature decoder composed of self-attention dense units can be expressed as follows:
[0030] $$Y = X + \gamma\,A(X)$$
[0031] where $A(X)$ is the output of the self-attention module, which integrates global information from all spatial locations; it is weighted by a learnable parameter $\gamma$ and then added back to the input feature map $X$. $A(X)$ is calculated as follows:
[0032] [missing information]
[0033] where $Q$, $K$, and $V$ are the query, key, and value matrices obtained through convolutional transformations, and a learnable weight matrix is also applied.
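The residual self-attention described above can be illustrated with a minimal NumPy sketch. The exact attention formula is not legible in the source, so the unscaled softmax dot-product form used here is an assumption, as are the function name `self_attention_residual`, the parameter names, and the modelling of the convolutional transformations as 1x1 convolutions (per-pixel matrix products). It is not the patented implementation.

```python
import numpy as np

def softmax(a, axis=-1):
    """Numerically stable softmax along the given axis."""
    a = a - a.max(axis=axis, keepdims=True)
    e = np.exp(a)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention_residual(x, w_q, w_k, w_v, gamma):
    """x: (C, H, W) feature map.
    w_q, w_k: (C_qk, C) and w_v: (C, C) play the role of 1x1 convolutions
    (assumed form). gamma is the learnable scalar weighting the attention output.
    """
    c, h, w = x.shape
    flat = x.reshape(c, h * w)            # (C, N) with N = H * W positions
    q = w_q @ flat                        # queries, (C_qk, N)
    k = w_k @ flat                        # keys,    (C_qk, N)
    v = w_v @ flat                        # values,  (C, N)
    attn = softmax(q.T @ k, axis=-1)      # (N, N); row i weights all positions j
    out = v @ attn.T                      # out[:, i] = sum_j attn[i, j] * v[:, j]
    return x + gamma * out.reshape(c, h, w)
```

With `gamma = 0` the unit reduces to the identity, so the attention contribution can grow gradually during training.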
[0034] The unified feature decoder includes two feature transfer layers composed of self-attention dense units and two independent modules with shared internal features, namely a segmentation module and a classification module, which further process the feature maps to output the results.
[0035] Furthermore, the segmentation module and the classification module are configured as follows: (1) The shared part is the u1 module, which consists of two layers. The pad layer, with a convolutional kernel size of [missing information], applies padding so that the output size matches the input size in subsequent convolution operations; the conva layer, with a convolutional kernel size of [missing information], performs preliminary feature extraction and transformation on the input features, providing a shared feature representation for the subsequent task branches. (2) The output of the u1 module is connected to two heads, each consisting of a batch normalization layer, a ReLU activation function layer, and a conv layer, which produce the final output.
[0036] Furthermore, the classification module includes an auxiliary tissue-region segmentation task that makes nucleus segmentation more context-aware. The annotations for this task are derived entirely from existing nucleus annotations, and the pseudo-region mask is generated as follows: given an annotation type map $T$ and each nuclear category $c \in \{1,\dots,C\}$, where $C$ is the total number of categories, an influence map $I_c$ is generated by aggregating the distance-attenuated influence of the foreground pixels. The final pseudo-region mask $R$ is obtained from the influence maps $I_c$ of all categories by dual assignment:
[0037] $$R(x,y)=\begin{cases}T(x,y), & T(x,y)\neq \mathrm{BG}\\ \arg\max_{c}\, I_c(x,y), & T(x,y)=\mathrm{BG}\end{cases}$$
[0038] where $T(x,y)$ is the category of the pixel at coordinates $(x,y)$ and BG denotes the background. This dual assignment method considers both the original annotation information of the foreground pixels and the influence of the different categories on the background pixels, thus generating the final pseudo-region mask. This allows tissue regions to be approximated accurately with minimal human intervention.
[0039] Secondly, the present invention provides a system for fully automated segmentation and classification of pathological images based on deep learning, for implementing the method described in the first aspect, comprising:
[0040] The first module is used to select tissue slice images to be processed and perform preliminary image preprocessing, including pre-segmentation, content filtering and color normalization.
[0041] The second module uses an encoder composed of dilated-SE residual units and a unified feature decoder composed of self-attention dense units for feature extraction.
[0042] The third module generates nuclear pixel maps, horizontal and vertical distance maps, and classification maps in the segmentation and classification modules based on the extracted features. In the classification module, a tissue-region guidance branch is introduced to improve classification accuracy.
[0043] The fourth module classifies each instance by performing pixel-by-pixel voting within each predicted cell nucleus region, ultimately yielding the cell nucleus segmentation and classification results.
[0044] Thirdly, the present invention provides a computer device including a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor executes the program to implement the steps of the method described in the first aspect.
[0045] Compared with existing technologies, the advantages of this invention are as follows: this invention introduces a novel deep learning framework that improves the accuracy of cell nucleus segmentation and classification by combining unified feature decoding with tissue region guidance. The framework leverages shared feature extraction and context awareness to address misclassification caused by insufficient extraction of environmental information, as well as the large number of training parameters and slow speed of existing methods.
Description of the Drawings
[0046] Figure 1 is a flowchart of a method for fully automatic segmentation and classification of pathological images based on deep learning, according to an embodiment of the present invention;
[0047] Figure 2 is a schematic diagram of the overall image processing workflow according to an embodiment of the present invention;
[0048] Figure 3 is a schematic diagram of the network structure (panels A, B, and C) according to an embodiment of the present invention;
[0049] Figure 4 is an example of tissue pseudo-region mask generation according to an embodiment of the present invention;
[0050] Figure 5 is a comparison of segmentation results of different models on images with relatively little interference, according to embodiments of the present invention;
[0051] Figure 6 is a comparison of segmentation results of different models on images with significant interference, according to embodiments of the present invention.
Detailed Implementation
[0052] This invention proposes a fully automated segmentation and classification method for pathological images based on deep learning, applicable to tissue slide image analysis in digital pathology. The method includes: selecting the tissue slide image to be processed, extracting the feature information of cell nuclei, and performing initialization processing; processing the image by introducing a unified feature decoder to integrate different feature information to improve the accuracy of segmentation and classification; using a convolutional neural network (CNN) for instance segmentation of cell nuclei, and combining a self-attention mechanism and a residual network to optimize the feature extraction process; simultaneously, using a classification module to predict the category of cell nuclei, fusing tissue region guidance information to improve classification accuracy. To overcome the problems of misclassification due to insufficient extraction of environmental information and the large number of training parameters and slow speed, this invention employs several innovative modules, including dilated-SE residual units and self-attention dense units, to enhance the network's context awareness and improve the model's robustness. Finally, through multi-task joint training, the segmentation and classification results are comprehensively optimized until a predetermined stopping condition is met, resulting in high-quality cell nucleus segmentation and classification results. This invention can automatically and efficiently segment and classify cell nuclei in digital pathological images, with high accuracy and stability. It is suitable for various complex tissue section analysis tasks and requires no manual intervention.
[0053] To make the objectives, technical solutions, and advantages of the present invention clearer, the embodiments of the present invention will be described in further detail below with reference to the accompanying drawings.
[0054] Referring to Figure 1, this invention proposes a fully automated segmentation and classification method for pathological images based on deep learning, comprising the following steps:
[0055] Step (1): Select the tissue slice image to be processed and perform preliminary image preprocessing, including pre-segmentation, content filtering, and color normalization.
[0056] In step (1), images of a size suitable for deep learning are obtained through pre-segmentation, and color differences are eliminated by the Stain-to-Stain Translation (STST) color normalization method. The processed images are then used as the input images.
[0057] The pathological slides to be processed are pre-segmented into multiple sub-images of 256 pixels × 256 pixels. The average pixel value of each sub-image is calculated, and sub-images with an average pixel value below 20 or above 200 are discarded; this preliminary screening filters out background and improves speed. A color normalization method is then used to eliminate interference from color variation. Figure 2 is a schematic diagram of the overall image processing workflow according to an embodiment of the present invention.
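The pre-segmentation and content-filtering step can be sketched as follows. The tile size (256) and the thresholds (20 and 200) come from the description; the function name `tile_and_filter`, the use of non-overlapping tiles, the dropping of incomplete border tiles, and averaging over all color channels are assumptions made only for illustration.

```python
import numpy as np

def tile_and_filter(image, tile=256, low=20, high=200):
    """Split an (H, W) or (H, W, 3) uint8 image into non-overlapping tiles and
    keep those whose mean pixel value lies in [low, high].

    Returns a list of ((top, left), patch) pairs. Incomplete tiles at the
    right and bottom borders are dropped (assumption).
    """
    kept = []
    h, w = image.shape[:2]
    for top in range(0, h - tile + 1, tile):
        for left in range(0, w - tile + 1, tile):
            patch = image[top:top + tile, left:left + tile]
            mean = patch.mean()  # averaged over all pixels and channels
            if low <= mean <= high:
                kept.append(((top, left), patch))
    return kept
```

Very dark tiles (mean below 20) and nearly white tiles (mean above 200) are typically scanner background or empty glass, so discarding them reduces the number of tiles passed to the network.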
[0058] Step (2) uses an encoder composed of dilated-SE residual units and a unified feature decoder composed of self-attention dense units for feature extraction and feature decoding.
[0059] The dilated portion of the dilated-SE residual unit is represented as:
[0060] $$Y(i,j)=\sum_{m=0}^{k_h-1}\sum_{n=0}^{k_w-1} X(i+r\,m,\; j+r\,n)\,W(m,n)$$
[0061] where $Y$ is the output feature map, $X$ is the input feature map, $W$ denotes the convolution kernel weights, $r$ is the dilation rate, which controls the size of the receptive field, and $k_h$ and $k_w$ are the height and width of the convolution kernel;
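As an illustration of how the dilation rate enlarges the receptive field without adding weights, the following is a minimal single-channel NumPy sketch of a dilated convolution (computed as a cross-correlation, as is usual in deep learning frameworks). The function name `dilated_conv2d` and the absence of padding are assumptions; the actual unit operates on multi-channel feature maps.

```python
import numpy as np

def dilated_conv2d(x, w, rate=1):
    """Valid (unpadded) single-channel dilated convolution.

    y[i, j] = sum over m, n of x[i + rate*m, j + rate*n] * w[m, n]
    """
    kh, kw = w.shape
    span_h = rate * (kh - 1) + 1   # effective kernel height
    span_w = rate * (kw - 1) + 1   # effective kernel width
    out_h = x.shape[0] - span_h + 1
    out_w = x.shape[1] - span_w + 1
    y = np.zeros((out_h, out_w), dtype=float)
    for m in range(kh):
        for n in range(kw):
            # Shifted view of the input that lines up with kernel tap (m, n).
            y += w[m, n] * x[rate * m: rate * m + out_h,
                             rate * n: rate * n + out_w]
    return y
```

With a 3x3 kernel, `rate=2` covers a 5x5 input window while still using only nine weights.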
[0062] For the SE (Squeeze-and-Excitation) layer, global information for each channel is obtained through global average pooling (Squeeze operation):
[0063] $$z_{b,c}=\frac{1}{H\,W}\sum_{i=1}^{H}\sum_{j=1}^{W} X_{b,c,i,j}$$
[0064] where $z_{b,c}$ is the global average pooling result of the $b$-th sample on the $c$-th channel, $X \in \mathbb{R}^{B\times C\times H\times W}$ is the input feature map, $B$ is the batch size, $C$ is the number of channels, and $H$ and $W$ are the height and width of the feature map, respectively.
[0065] The fully connected layers (Excitation operation) can be represented as:
[0066] $$s_b=\sigma\big(W_2\,\delta(W_1\,z_b)\big)$$
[0067] where $s_b$ is the desired channel attention weight vector, $W_1$ and $W_2$ are the weight matrices of the two fully connected layers, $\sigma$ is the Sigmoid function, $\delta$ is the ReLU activation function, and $z_b$ is the vector of global average pooling results for the $b$-th sample;
[0068] The calculation formula for the feature recalibration (Scale operation) is as follows:
[0069] $$\tilde{X}_{b,c,i,j}=s_{b,c}\,X_{b,c,i,j}$$
[0070] where $\tilde{X}$ is the final output feature map.
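The three SE operations (Squeeze, Excitation, Scale) can be sketched in a few lines of NumPy. The function name `se_block`, the omission of bias terms, and the shapes chosen for the two weight matrices are assumptions for illustration; the description only states that two fully connected layers with ReLU and Sigmoid activations are used.

```python
import numpy as np

def se_block(x, w1, w2):
    """Squeeze-and-Excitation on a batch of feature maps.

    x:  (B, C, H, W) input feature map
    w1: (C_reduced, C) first fully connected layer (no bias, assumption)
    w2: (C, C_reduced) second fully connected layer (no bias, assumption)
    Returns the recalibrated feature map and the channel weights s.
    """
    z = x.mean(axis=(2, 3))                       # Squeeze: (B, C)
    hidden = np.maximum(z @ w1.T, 0.0)            # ReLU: (B, C_reduced)
    s = 1.0 / (1.0 + np.exp(-(hidden @ w2.T)))    # Sigmoid: (B, C)
    return x * s[:, :, None, None], s             # Scale: broadcast over H, W
```

Each channel is multiplied by a single weight in (0, 1), so informative channels can be emphasized and less useful ones suppressed.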
[0071] Feature extraction is performed by an encoder composed of dilated-SE residual units. The encoder includes: (1) a conv0 module, used as the starting module, consisting of a convolutional layer with a kernel size of [missing information] and a stride of 1, a batch normalization layer, and a ReLU activation layer; (2) an encoder backbone module consisting of four feature extraction layers, each composed of a different number of dilated-SE residual units, used to extract local and global features; (3) a convbot layer, which adjusts the feature dimension and improves the feature expression capability.
[0072] The features extracted above are decoded by a unified feature decoder composed of self-attention dense units. The decoder includes: (1) The self.u3 submodule. Its conva layer, a convolutional layer with a kernel size of [missing information], a stride of 1, and padding of 0, processes the input feature map. The self-attention dense unit layer receives the feature map output by the conva layer; it uses 1x1 and 5x5 convolutional kernels, contains 8 self-attention dense units, and divides the feature map into 4 parts for processing, thereby enhancing feature propagation, alleviating the vanishing-gradient problem, and increasing feature reuse. Finally, the convf layer processes the feature map output by the self-attention dense unit layer with a 1x1 convolutional kernel, a stride of 1, padding of 0, and no bias term, completing the channel-dimension adjustment and feature fusion. (2) The self.u2 submodule, which receives the feature map output by self.u3 and continues to process it. Its conva layer, a two-dimensional convolutional layer with a kernel size of 5, a stride of 1, and padding of 0, further reduces the number of channels of the input feature map. The self-attention dense unit layer receives the output feature map of the conva layer; it uses 1x1 and 5x5 convolutional kernels, contains 4 self-attention dense units, and divides the feature map into 4 parts for processing, which enhances feature propagation and reuse. Finally, the convf layer, a two-dimensional 1x1 convolutional layer, adjusts the channel dimension and fuses the feature map output by the self-attention dense unit layer. (3) Two independent modules with shared internal features, the segmentation module and the classification module, which further process the feature map to output the results. Specifically, the feature map is input into the segmentation module and the classification module, which output result maps through convolution, pooling, activation, and other operations. The result maps are then passed through the post-processing of the fourth module to output the final result, as shown in panel A of Figure 3.
[0073] Step (3) uses the extracted features to generate, in two independent modules, the nuclear pixel map (NP map) and the horizontal and vertical distance map (HoVer map), as well as the classification map (TP map). In the classification module, a tissue-region guidance branch (AR branch) is introduced to improve classification accuracy.
[0074] First, the annotations for the auxiliary tissue-region task must be obtained. Given an annotation type map $T$ and each nuclear category $c \in \{1,\dots,C\}$, where $C$ is the total number of categories, an influence map $I_c$ is generated by aggregating the distance-attenuated influence of the foreground pixels. For each nuclear category $c$, its binary mask is first extracted:
[0075] $$M_c(x,y)=\mathbb{1}\left[T(x,y)=c\right]$$
[0076] where $\mathbb{1}[\cdot]$ is the indicator function and $T(x,y)$ is the category label of the pixel at coordinates $(x,y)$. The spatial influence propagated within the region is calculated as:
[0077] [missing information]
[0078] This formula involves the Euclidean distance between two pixels, an attenuation coefficient, and a foreground weight parameter, which can be used to adjust the influence of each foreground pixel. For each foreground pixel $q$ belonging to category $c$, its influence $\phi(p,q)$ on every pixel $p$ is accumulated into the influence map $I_c$, which can also be expressed as:
[0079] $$I_c(p)=\sum_{q\,:\,M_c(q)=1}\phi(p,q)$$
[0080] After the influence of all foreground pixels of category $c$ has been calculated and accumulated, the influence map $I_c$ of that category is obtained. It reflects the combined influence distribution of all foreground pixels of category $c$ across the entire image region. The final pseudo-region mask $R$ is obtained from the influence maps $I_c$ of all categories by dual assignment:
[0081] $$R(x,y)=\begin{cases}T(x,y), & T(x,y)\neq \mathrm{BG}\\ \arg\max_{c}\, I_c(x,y), & T(x,y)=\mathrm{BG}\end{cases}$$
[0082] where $T(x,y)$ represents the category of the pixel at coordinates $(x,y)$, and BG represents the background, that is, pixels that do not belong to any cell nucleus;
[0083] This dual assignment method considers both the original annotation information of the foreground pixels and the influence of the different categories on the background pixels, thus generating the final pseudo-region mask. This allows tissue regions to be approximated accurately with minimal human intervention. As shown in Figure 4, (a) is a pathological section stained with H&E, (b) is the distribution of cell nuclei on the pathological section, (c) is the tissue pseudo-region mask generated by this method from the distribution of cell nuclei, and (d) is the real mask of the tissue region. It can be seen that the generated tissue region labeling is close to the real labeling.
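The pseudo-region mask generation can be illustrated with the following NumPy sketch. The source does not preserve the influence formula, so the exponential decay `weight * exp(-decay * distance)` used here is purely an assumed stand-in, as are the function name `pseudo_region_mask`, the parameter names, the background label `BG = 0`, and the use of an arg-max over categories for background pixels. The brute-force loop is meant for small examples only.

```python
import numpy as np

BG = 0  # assumed label for background pixels in the type map

def pseudo_region_mask(type_map, num_classes, decay=0.1, weight=1.0):
    """Build a pseudo tissue-region mask from a nucleus type map.

    type_map: (H, W) int array with labels 0 (background) .. num_classes.
    Foreground pixels keep their label; each background pixel takes the
    category with the largest accumulated influence.
    """
    h, w = type_map.shape
    ys, xs = np.mgrid[0:h, 0:w]
    influence = np.zeros((num_classes + 1, h, w))
    for c in range(1, num_classes + 1):
        fy, fx = np.nonzero(type_map == c)          # binary mask of category c
        for qy, qx in zip(fy, fx):
            dist = np.sqrt((ys - qy) ** 2 + (xs - qx) ** 2)  # Euclidean distance
            influence[c] += weight * np.exp(-decay * dist)   # assumed decay form
    # If no category has any foreground pixel, argmax falls back to category 1.
    strongest = influence[1:].argmax(axis=0) + 1
    return np.where(type_map != BG, type_map, strongest)
```

For a one-row type map `[1, 0, 0, 0, 0, 2]`, the background pixels nearer the left nucleus are assigned category 1 and those nearer the right nucleus category 2.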
[0084] The segmentation and classification modules are structured as follows: (1) The shared part is the u1 module, which consists of two layers. The pad layer, with a convolutional kernel size of 1, applies padding so that the output size matches the input size in subsequent convolution operations; the conva layer, with a convolutional kernel size of [missing information], performs preliminary feature extraction and transformation on the input features, providing a shared feature representation for the subsequent task branches. (2) The output of the u1 module is connected to two heads, each consisting of a batch normalization layer, a ReLU activation function layer, and a conv layer, which produce the final output. As shown in panels B and C of Figure 3, the nuclear pixel map (NP map) and the horizontal and vertical distance map (HoVer map) are obtained, as well as the classification map (TP map) and the tissue region segmentation map (only during the training phase).
[0085] Step (4) classifies each instance by performing pixel-by-pixel voting within each predicted nucleus region. This yields an accurate nucleus segmentation and classification result.
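The pixel-by-pixel voting of step (4) can be sketched as a majority vote over the predicted class map inside each nucleus instance. The function name `vote_instance_types`, the convention that instance id 0 is background, and the tie-breaking rule (the smallest class label wins) are assumptions for illustration.

```python
import numpy as np

def vote_instance_types(instance_map, type_map):
    """Assign one class to each predicted nucleus by majority vote.

    instance_map: (H, W) int array, 0 = background, k > 0 = nucleus id.
    type_map:     (H, W) int array of per-pixel predicted classes.
    Returns a dict {nucleus id: voted class}.
    """
    result = {}
    for inst_id in np.unique(instance_map):
        if inst_id == 0:
            continue  # skip background
        votes = type_map[instance_map == inst_id]
        classes, counts = np.unique(votes, return_counts=True)
        # argmax returns the first maximum, so ties go to the smallest label.
        result[int(inst_id)] = int(classes[counts.argmax()])
    return result
```

Voting makes the per-nucleus label robust to a few misclassified pixels near the nucleus boundary.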
[0086] The following specific example will be used to verify the implementation effect of the method of the present invention.
[0087] On three datasets, totaling 15,916 images containing 468,743 cell nuclei, the deep learning-based fully automatic pathological image segmentation and classification method of this invention is compared with five other methods: PSPNet (H. Zhao, J. Shi, X. Qi, X. Wang, J. Jia, Pyramid scene parsing network, in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2017), pp. 2881–2890), K-Net (W. Zhang, J. Pang, K. Chen, C. C. Loy, K-Net: Towards Unified Image Segmentation, in NeurIPS (2021)), Mask2Former (B. Cheng, I. Misra, A. G. Schwing, A. Kirillov, R. Girdhar, Masked-attention mask transformer for universal image segmentation, in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (2022), pp. 1290–1299), SegFormer (E. Xie, et al., SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers, arXiv preprint arXiv:2105.15203 (2021)), and Hover-net (S. Graham, et al., Hover-net: Simultaneous segmentation and classification of nuclei in multi-tissue histology images, Medical Image Analysis 58, 101563 (2019)). The Dice coefficient is used to compare the segmentation and classification performance. The calculation formula is as follows:
[0088] $$\mathrm{Dice}=\frac{2\,|A\cap G|}{|A|+|G|}$$
[0089] where $A$ represents the segmentation result and $G$ represents the actual target region manually labeled by the doctor. Dice measures the degree of overlap between the segmentation result and the actual target region; a higher Dice value indicates a more effective method. Table 1 compares the Dice values obtained by applying the different segmentation models to the three datasets.
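For reference, the Dice coefficient on binary masks can be computed as in the short sketch below. The function name `dice` and the convention of returning 1.0 when both masks are empty are assumptions; the evaluation code used for Table 1 is not given in the source.

```python
import numpy as np

def dice(a, g):
    """Dice coefficient between a predicted mask a and a reference mask g."""
    a = np.asarray(a, dtype=bool)
    g = np.asarray(g, dtype=bool)
    denom = a.sum() + g.sum()
    if denom == 0:
        return 1.0  # both masks empty: treated as perfect agreement (assumption)
    return 2.0 * np.logical_and(a, g).sum() / denom
```

For example, a prediction `[1, 1, 0, 0]` against a reference `[1, 0, 1, 0]` shares one pixel out of two in each mask, giving a Dice value of 0.5.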
[0090] Table 1. Dice values corresponding to the segmentation of each model in the examples.
[0091]
Model | PanMix | CoNSeP | PanNuke
PSPNet | 0.719 | 0.797 | 0.707
KNet | 0.759 | 0.757 | 0.748
Mask2Former | 0.684 | 0.823 | 0.609
Segformer | 0.762 | 0.813 | 0.637
Hovernet | 0.714 | 0.742 | 0.660
The model of the present invention | 0.802 | 0.823 | 0.751
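[0092] The experiments show that, when segmenting and classifying pathological images, the model of this invention achieves the highest Dice value on PanMix (0.802) and PanNuke (0.751) and matches the best of the other five models, Mask2Former, on CoNSeP (0.823).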
[0093] Figure 5 is a comparison of segmentation results for different images from the three datasets using different models. Figure 6 shows the model performance under imbalanced data conditions. The first column shows the original image, the second column shows the manually annotated ground-truth object boundaries, and the third through seventh columns show the segmentation and classification results of the other advanced models (PSPNet, K-Net, Mask2Former, SegFormer, and Hover-net; see the references in paragraph [0087]). The eighth column shows the segmentation and classification results of the model of this invention. It can be seen that the results obtained using the model of this invention are better than those of the other methods, especially in Figure 6, where the images are affected by data imbalance.
Claims
1. A method for fully automatic segmentation and classification of pathological images based on deep learning, characterized in that, Includes the following steps: (1) Select the tissue slice image to be processed and perform preliminary image preprocessing, including pre-segmentation, content filtering and color normalization; (2) Feature extraction is performed using an encoder composed of dilated-SE residual units and a unified feature decoder composed of self-attention dense units; The unified feature decoder, composed of self-attention dense units, is expressed as follows: ; in The output of the self-attention module integrates global information from all spatial locations and passes it through a learnable parameter. After weighting, add back to the input feature map. , The calculation formula is as follows: ; in, , , It is a query, key, and value matrix obtained through convolution transformation. It is a learnable weight matrix; The unified feature decoder includes: (1) the self.u3 submodule, which includes a conva layer with a kernel size of . 
The convolutional layer with a stride of 1 and padding of 0 processes the input feature map; the self-attention dense unit layer receives the feature map output by the conva layer, uses convolutional kernels of two sizes, 1x1 and 5x5, contains 8 self-attention dense units, and divides the feature map into 4 parts for processing; finally, the convf layer, as another convolutional layer, processes the feature map output by the self-attention dense unit layer, using a 1x1 convolutional kernel, a stride of 1, padding of 0 and no bias term, to complete the adjustment of channel dimensions and feature fusion; (2) the self.u2 submodule receives the feature map output by the self.u3 submodule and continues to process it; the conva layer, as a two-dimensional convolutional layer, reduces the number of channels in the input feature map, with a convolutional kernel size of 5, a stride of 1, and padding of 0; the self-attention dense unit layer receives the feature map output by the conva layer, uses 1x1 and 5x5 convolutional kernels, contains 4 self-attention dense units, and divides the feature map into 4 parts for processing, which enhances feature propagation and reuse; finally, the convf layer, a two-dimensional convolutional layer with a 1x1 kernel, adjusts the channel dimension and fuses the feature map output by the self-attention dense unit layer; (3) two independent modules that share internal features, a segmentation module and a classification module, process the feature map to output the result;
(3) Based on the extracted features, nuclear pixel maps, horizontal and vertical distance maps, and classification maps are generated in the segmentation and classification modules. In the classification module, a tissue region guided branch is introduced to improve classification accuracy. The classification module includes a tissue region segmentation auxiliary task that makes cell nucleus segmentation more context-aware. The annotations for this task are derived entirely from existing cell nucleus annotations, and the pseudo-region mask is generated as follows: given an annotation type map $T$ and each nuclear category $c \in \{1, \dots, C\}$, where $C$ is the total number of categories, an influence map $I_c$ is generated by aggregating the distance attenuation effect from foreground pixels. For each nuclear category $c$, its binary mask is first extracted: $M_c(x, y) = \mathbb{1}[T(x, y) = c]$; where $\mathbb{1}[\cdot]$ is the indicator function and $T(x, y)$ is the category to which the pixel at coordinates $(x, y)$ belongs. The spatial influence $S_{(i,j)}(x, y)$ that a foreground pixel $(i, j)$ propagates to a pixel $(x, y)$ within the region is computed from the Euclidean distance $d = \sqrt{(x - i)^2 + (y - j)^2}$, an attenuation coefficient, and a foreground weight, the latter being a parameter that adjusts the influence of each foreground pixel; for each foreground pixel $(i, j)$ belonging to category $c$, its influence on all pixels $(x, y)$ is accumulated into the influence map $I_c$, expressed as: $I_c(x, y) = \sum_{(i, j):\, M_c(i, j) = 1} S_{(i,j)}(x, y)$; after all foreground pixels of category $c$ have been processed and accumulated, the influence map $I_c$ of category $c$ is obtained, which reflects the combined influence distribution of all foreground pixels of category $c$ across the entire image region; the final pseudo-region mask $R$ is obtained from the influence maps $I_c$ of all categories by dual assignment: $R(x, y) = T(x, y)$ if $T(x, y) \neq BG$, and $R(x, y) = \arg\max_{c} I_c(x, y)$ if $T(x, y) = BG$; where $T(x, y)$ represents the category of the pixel at coordinates $(x, y)$, and $BG$ represents the background; with the dual assignment, both the original annotation of the foreground pixels and the influence of the different categories on the background pixels are taken into account to generate the final pseudo-region mask;
(4) Each instance is classified by pixel-by-pixel voting within each predicted cell nucleus region, yielding the final cell nucleus segmentation and classification result.
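The pseudo-region mask generation of step (3) can be illustrated with a minimal NumPy sketch. It is not the claimed implementation: the exponential decay form `weight * exp(-decay * d)`, the function name `pseudo_region_mask`, and the parameters `decay`, `weight` and `bg` are assumptions introduced here, since the claim only states that the influence depends on the Euclidean distance, an attenuation coefficient and a foreground weight. Assigning background pixels to the category with the largest influence is likewise an assumed reading of the dual assignment.

```python
import numpy as np

def pseudo_region_mask(type_map, num_classes, decay=0.5, weight=1.0, bg=0):
    """Build a pseudo tissue-region mask from a nucleus type map.

    type_map: 2-D int array, bg for background, 1..num_classes for nuclei.
    decay, weight: assumed attenuation coefficient and foreground weight.
    """
    h, w = type_map.shape
    ys, xs = np.mgrid[0:h, 0:w]
    influence = np.zeros((num_classes, h, w), dtype=np.float64)

    for c in range(1, num_classes + 1):
        mask = type_map == c                       # binary mask of category c
        for i, j in zip(*np.nonzero(mask)):        # foreground pixel at row i, column j
            dist = np.sqrt((ys - i) ** 2 + (xs - j) ** 2)        # Euclidean distance
            influence[c - 1] += weight * np.exp(-decay * dist)   # assumed decay form

    # Dual assignment: foreground keeps its annotated category,
    # background takes the category with the largest accumulated influence.
    strongest = influence.argmax(axis=0) + 1
    region = np.where(type_map != bg, type_map, strongest)
    return region, influence

type_map = np.zeros((5, 7), dtype=int)
type_map[2, 1] = 1   # one nucleus pixel of category 1 on the left
type_map[2, 5] = 2   # one nucleus pixel of category 2 on the right
region, influence = pseudo_region_mask(type_map, num_classes=2)
print(region)
```

In this toy input, background pixels nearer the left nucleus fall into region 1 and those nearer the right nucleus into region 2. The double loop costs one full-image pass per foreground pixel, so it is suitable only for small illustrative inputs.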
2. The method for fully automatic segmentation and classification of pathological images based on deep learning according to claim 1, characterized in that, in step (1), images of a size suitable for deep learning are obtained through pre-segmentation, color normalization is applied to eliminate color differences, and the processed images are then used as the input images.
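Pre-segmentation of this kind amounts to cutting a large slide image into fixed-size tiles. The sketch below computes tile origins only; the tile size of 256, the stride, and the function name `tile_origins` are hypothetical, as the claim does not specify them. Color normalization is omitted because the claim does not name a method.

```python
def tile_origins(height, width, tile=256, stride=256):
    """Return top-left (row, col) corners of tiles covering the image.

    The last tile in each direction is shifted back so it stays inside
    the image; an axis no longer than one tile yields the single start 0.
    """
    def starts(size):
        if size <= tile:
            return [0]
        last = size - tile
        pos = list(range(0, last, stride))
        pos.append(last)  # final tile flush with the image border
        return pos

    return [(r, c) for r in starts(height) for c in starts(width)]

origins = tile_origins(600, 512)
print(origins)
# [(0, 0), (0, 256), (256, 0), (256, 256), (344, 0), (344, 256)]
```

Shifting the last tile back instead of padding keeps every tile fully inside the image, at the cost of some overlap along the border.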
3. The method for fully automatic segmentation and classification of pathological images based on deep learning according to claim 1, characterized in that the dilated portion of the dilated-SE residual unit is expressed as: $y(i, j) = \sum_{m=0}^{k_h - 1} \sum_{n=0}^{k_w - 1} x(i + r m,\, j + r n)\, w(m, n)$; where $y$ is the output feature map, $x$ is the input feature map, $w$ denotes the convolution kernel weights, $r$ is the dilation rate, which controls the size of the receptive field, and $k_h$ and $k_w$ are the height and width of the convolution kernel; for the SE layer, global information for each channel is obtained through global average pooling: $z_{n,c} = \frac{1}{H W} \sum_{i=1}^{H} \sum_{j=1}^{W} x_{n,c,i,j}$; where $z_{n,c}$ is the global average pooling result of the $n$-th sample on the $c$-th channel, $x \in \mathbb{R}^{N \times C \times H \times W}$ is the input feature map, $N$ is the batch size, $C$ is the number of channels, and $H$ and $W$ are the height and width of the feature map, respectively; the fully connected layers are expressed as: $s_n = \sigma(W_2\, \delta(W_1 z_n))$; where $s_n$ is the desired channel attention weight vector, $W_1$ and $W_2$ are the weight matrices of the two fully connected layers, $\sigma$ is the Sigmoid function, $\delta$ is the ReLU activation function, and $z_n$ is the vector of global average pooling results for the $n$-th sample; feature recalibration is computed as: $\tilde{x}_{n,c,i,j} = s_{n,c} \cdot x_{n,c,i,j}$; where $\tilde{x}$ is the final output feature map.
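The dilated convolution and SE recalibration described in claim 3 can be sketched in NumPy as follows. This is an illustration under stated assumptions, not the claimed network: the function names, the single-channel convolution, the omission of bias terms, and the channel reduction from 4 to 2 in the fully connected layers are all choices made here for brevity.

```python
import numpy as np

def dilated_conv2d(x, w, rate):
    """Valid (no padding) single-channel dilated convolution, stride 1.

    y(i, j) = sum over m, n of x(i + rate*m, j + rate*n) * w(m, n)
    """
    kh, kw = w.shape
    out_h = x.shape[0] - rate * (kh - 1)
    out_w = x.shape[1] - rate * (kw - 1)
    y = np.zeros((out_h, out_w))
    for m in range(kh):
        for n in range(kw):
            y += w[m, n] * x[rate * m: rate * m + out_h, rate * n: rate * n + out_w]
    return y

def se_recalibrate(x, w1, w2):
    """Squeeze-and-excitation on a batch x of shape (N, C, H, W)."""
    z = x.mean(axis=(2, 3))                       # squeeze: global average pooling, (N, C)
    hidden = np.maximum(z @ w1.T, 0.0)            # first fully connected layer + ReLU
    s = 1.0 / (1.0 + np.exp(-(hidden @ w2.T)))    # second fully connected layer + Sigmoid
    return x * s[:, :, None, None], s             # channel-wise rescaling

rng = np.random.default_rng(0)
x = rng.normal(size=(2, 4, 8, 8))
w1 = rng.normal(size=(2, 4))    # C = 4 reduced to 2 (reduction ratio is assumed)
w2 = rng.normal(size=(4, 2))
out, s = se_recalibrate(x, w1, w2)
y = dilated_conv2d(x[0, 0], np.ones((3, 3)), rate=2)
print(out.shape, s.shape, y.shape)   # (2, 4, 8, 8) (2, 4) (4, 4)
```

With a 3x3 kernel and a dilation rate of 2, the kernel spans 5 pixels per axis, which is why the unpadded 8x8 input shrinks to 4x4.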
4. The method for fully automatic segmentation and classification of pathological images based on deep learning according to claim 3, characterized in that the encoder composed of dilated-SE residual units consists of the following parts: (1) a conv0 module, consisting of one convolutional layer with a stride of 1, a batch normalization layer and a ReLU activation layer, used as the starting module; (2) an encoder backbone module, consisting of four feature extraction layers composed of different numbers of dilated-SE residual units, used to extract local and global features; (3) a convbot layer, which adjusts the feature dimension to improve the feature expression capability.
5. The method for fully automatic segmentation and classification of pathological images based on deep learning according to claim 1, characterized in that the segmentation module and the classification module are composed as follows: (1) the shared part is the u1 module, which consists of two layers: the pad layer, with a convolutional kernel size of ..., pads the input to ensure that the output size matches the input size during subsequent convolutional operations; the conva layer, a convolutional structure with a kernel size of ..., performs preliminary feature extraction and transformation on the input features, providing shared feature representations for the subsequent task branches; (2) the output of the u1 module is connected to two heads, each consisting of a batch normalization layer, a ReLU activation function layer and a conv layer, which are responsible for producing the final output.
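The role of the pad layer can be checked with the standard output-size relation for a convolution, out = floor((in + 2p - k) / s) + 1. The sketch below uses a hypothetical kernel size of 5 and input size of 64, since the kernel sizes of the u1 module are not given in this text; the function names are illustrative.

```python
def conv_out_size(size, kernel, padding=0, stride=1):
    """Spatial output size of a convolution along one axis."""
    return (size + 2 * padding - kernel) // stride + 1

def same_padding(kernel):
    """Per-side padding that keeps the size unchanged for stride 1 and an odd kernel."""
    if kernel % 2 == 0:
        raise ValueError("symmetric 'same' padding needs an odd kernel size")
    return (kernel - 1) // 2

k = 5                                                     # hypothetical kernel size
unpadded = conv_out_size(64, k)                           # padding 0 shrinks the map by k - 1
padded = conv_out_size(64, k, padding=same_padding(k))    # padding restores the input size
print(unpadded, padded)   # 60 64
```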
6. A system for fully automatic segmentation and classification of pathological images based on deep learning, characterized in that the system implements the method according to any one of claims 1 to 5 and comprises: a first module, used to select the tissue slice images to be processed and perform preliminary image preprocessing, including pre-segmentation, content filtering and color normalization; a second module, which uses an encoder composed of dilated-SE residual units and a unified feature decoder composed of self-attention dense units for feature extraction; a third module, which generates nuclear pixel maps, horizontal and vertical distance maps, and classification maps in the segmentation and classification modules based on the extracted features, wherein a tissue region guided branch is introduced in the classification module to improve classification accuracy; and a fourth module, which classifies each instance by pixel-by-pixel voting within each predicted cell nucleus region, ultimately yielding the cell nucleus segmentation and classification results.
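The pixel-by-pixel voting performed by the fourth module can be sketched as a majority vote per predicted nucleus. The function name `vote_instance_types` and the choice to ignore background votes are assumptions made for this illustration; the claims do not state how background pixels inside a nucleus are handled.

```python
import numpy as np

def vote_instance_types(inst_map, type_map, bg=0):
    """Assign each predicted nucleus the most frequent non-background
    class among its pixels in the pixel-wise classification map."""
    result = {}
    for inst_id in np.unique(inst_map):
        if inst_id == bg:
            continue
        votes = type_map[inst_map == inst_id]
        votes = votes[votes != bg]            # ignore background votes (assumption)
        if votes.size == 0:
            result[int(inst_id)] = bg         # no class evidence for this nucleus
        else:
            result[int(inst_id)] = int(np.bincount(votes).argmax())
    return result

inst_map = np.array([[1, 1, 0, 2],
                     [1, 1, 0, 2],
                     [0, 0, 0, 2]])
type_map = np.array([[3, 3, 0, 1],
                     [3, 2, 0, 0],
                     [0, 0, 0, 1]])
labels = vote_instance_types(inst_map, type_map)
print(labels)   # {1: 3, 2: 1}
```

Nucleus 1 receives three votes for class 3 and one for class 2, so it is labeled 3; nucleus 2 receives two votes for class 1 after its single background pixel is discarded.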
7. A computer device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that, when the processor executes the program, it implements the steps of the method according to any one of claims 1 to 5.