An instance segmentation method and system for unstained epithelial cell images

By constructing the Yolov8-YLBNSwin model and combining deep feature fusion and parallel substructures, the problem of low generalization performance of epithelial cell models was solved, achieving lightweight and high-precision segmentation of unstained epithelial cell images, thus improving the model's generalization performance and segmentation accuracy.

CN118334328B · Active · Publication Date: 2026-06-19 · XI AN JIAOTONG UNIV

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
XI AN JIAOTONG UNIV
Filing Date
2024-02-29
Publication Date
2026-06-19

AI Technical Summary

Technical Problem

Existing epithelial cell instance segmentation models generalize poorly and are highly complex. Furthermore, staining cells affects cell activity and morphology, which hinders the study of drug action and disease pathogenesis.

Method used

Based on the YOLOv8-seg instance segmentation network, and combined with the YOLOv8-BiFPN-Neck module with deep feature fusion and parallel substructure, a Swin-transformer structure is introduced to construct the YOLOv8-YLBNSwin model. The dataset is then optimized using Grad-CAM to achieve lightweight and high-performance segmentation of unstained epithelial cell images.

Benefits of technology

It improves the model's generalization performance and segmentation accuracy, reduces the number of model parameters, enhances cell boundary clarity, and speeds up training convergence.

✦ Generated by Eureka AI based on patent content.


Abstract

This invention discloses a method and system for instance segmentation of unstained epithelial cell images. A YOLOv8-BiFPN-Neck module is designed to replace the YOLOv8-seg Neck module, making the model lightweight. A Swin-transformer is introduced into the YOLOv8-seg Backbone module to construct a YOLOv8-YLBNSwin instance segmentation network model, improving model performance while keeping the model lightweight. The model is trained, validated, and tested using five epithelial cell datasets from the LIVECell cell dataset. The interpretable model Grad-CAM is used to visualize the features extracted from the last layer of the model, guiding further improvements in segmentation performance. The generalization performance of the model is tested using three fibroblast cell datasets from the LIVECell cell dataset. Based on YOLOv8-seg, this invention borrows the feature fusion concept of BiFPN and the parallel substructure of ParNet, introduces a Swin-transformer structure, and utilizes the interpretable model Grad-CAM to improve the model's ability to segment epithelial cells.

Description

Technical Field

[0001] This invention belongs to the field of medical image processing technology, specifically relating to a method and system for instance segmentation of unstained epithelial cell images.

Background Technology

[0002] Previous studies on epithelial cells have focused on specific cells, resulting in low generalization performance of instance segmentation models. Cell instance segmentation studies typically involve observing stained cells, which can hinder research on drug effects and disease mechanisms. The effectiveness of a model is often directly proportional to its size, while lightweight models are beneficial for the research and development of biomedical devices.

[0003] First, previous studies on epithelial cells have been conducted on specific cells, such as alveolar epithelium, skin epithelium, and oral epithelium. The generalization performance of these models is relatively low, and different cell types often require redesigning or changing the model.

[0004] Second, staining cells can affect cell activity and morphology, which is not conducive to the study of drug effects and disease mechanisms.

[0005] Third, generally speaking, the more complex the model, the better the results. However, a complex model places higher demands on the performance of the equipment, which is detrimental to equipment research and development. A lightweight model is beneficial for biomedical device research and development.

[0006] The YOLOv8-YLBNSwin instance segmentation network model, which incorporates deep feature fusion, parallel substructures, and a Swin-transformer structure, demonstrates good performance in segmenting epithelial cells. It exhibits good generalization capabilities, has few parameters, and possesses significant practical application value.

Summary of the Invention

[0007] The technical problem to be solved by this invention is to provide an instance segmentation method and system for unstained epithelial cell images, addressing the shortcomings of the prior art. This method utilizes a YOLOv8-seg instance segmentation baseline network model, combines deep feature fusion and parallel substructures to reduce model parameters, and introduces a Swin-transformer structure to improve model performance. This achieves lightweight and high-performance instance segmentation of unstained epithelial cell images, solving the technical problems of complex unstained epithelial cell models and low segmentation accuracy.

[0008] The present invention adopts the following technical solution:

[0009] An instance segmentation method for an image of unstained epithelial cells includes the following steps:

[0010] S1. Divide the epithelial cell segmentation dataset into a training set, a validation set, and a test set;

[0011] S2. Construct a YOLOv8-YLBNSwin instance segmentation network model with a YOLOv8-BiFPN-Neck module and a Swin-transformer structure. Use the training set, validation set, and test set obtained in step S1 to train, validate, and test the YOLOv8-YLBNSwin instance segmentation model and to run ablation experiments on it, obtaining the evaluation index results of the model segmentation.

[0012] S3. Use the interpretable model to guide the optimization of the model segmentation evaluation index results obtained in step S2;

[0013] S4. Using the optimized model segmentation evaluation index results from step S3, train, validate, and test the Yolov8-YLBNSwin instance segmentation model obtained in step S2 to obtain new model segmentation evaluation index results and segmentation contour display images. Use the fibroblast cell dataset to test the generalization performance of the model and achieve instance segmentation of epithelial cells.

[0014] Preferably, in step S1, the entire LIVECell cell dataset is subjected to contrast-limited adaptive histogram equalization (CLAHE); then, the five types of epithelial cell datasets in the LIVECell cell dataset are used to form an epithelial cell instance segmentation dataset, which is divided into a training set, a validation set, and a test set in a ratio of 6:1:3; the three types of fibroblast cells are used to form a fibroblast cell dataset for generalization performance testing.

[0015] More preferably, the contrast-limited adaptive histogram equalization processing step is as follows:

[0016] The original image is divided into multiple sub-images, each of size M×N. The frequency of each pixel value within each sub-image is normalized to obtain the frequency P(j) of each pixel value. The cumulative distribution function (CDF) C(j) is then calculated. The equalized pixel value H(j) is calculated as follows:

[0017]

[0018] Where L represents the range of pixel values.
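The formula of paragraph [0017] does not appear in this text. As a hedged reconstruction, assuming the standard histogram-equalization mapping (the patent's own expression may differ, for example in rounding or in using L instead of L − 1), the quantities defined above would relate as:

```latex
C(j) = \sum_{i=0}^{j} P(i), \qquad H(j) = (L - 1)\, C(j)
```

Under this reading, a pixel value whose cumulative frequency is 0.75 in an 8-bit sub-image (L = 256) would map to about 191.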

[0019] Preferably, step S2 specifically includes:

[0020] S201. Drawing on the feature fusion idea of the weighted bidirectional feature pyramid network BiFPN and the parallel substructure of ParNet, construct the YLBN module to replace the Neck module of YOLOv8-seg, and introduce the Swin-transformer into the Backbone to obtain the YOLOv8-YLBNSwin instance segmentation network model.

[0021] S202. Remove the modules from the Yolov8-YLBNSwin instance segmentation network model obtained in step S201 to obtain the Yolov8-seg, Yolov8-YLBN, Yolov8-Swin and Yolov8-YLBNSwin instance segmentation network models.

[0022] S203. Using the training set and validation set obtained in step S1, train and validate the YOLOv8-seg, YOLOv8-YLBN, YOLOv8-Swin, and YOLOv8-YLBNSwin instance segmentation networks obtained in step S202, respectively. Then, use the test set obtained in step S1 to test the four trained networks. The test results are evaluated using the AP50 and mAP of instance segmentation and bounding boxes, the number of model parameters, and FLOPs, to obtain the evaluation metrics for model segmentation.

[0023] More preferably, step S201 specifically includes:

[0024] S2011. For the YLBN module, first, the connection between the Neck layer and the Backbone is changed from P3, P4, P5 to P2, P3, P4, P5. Second, a skip connection to the corresponding layer is added at each Concat.

[0025] S2012. Replace the Concat modules with Fusion and Block modules;

[0026] S2013. Replace the last two C2f modules of the Backbone with the Swin-transformer structure, changing the number of repetitions of the first to 9 while keeping that of the second unchanged at 3.

[0027] S2014. Based on the YOLOv8-seg instance segmentation network model, the YLBN module is used to replace the Neck module of YOLOv8-seg, and the Swin-transformer is introduced into the Backbone of YOLOv8-seg to construct the YOLOv8-YLBNSwin instance segmentation network model.

[0028] More preferably, in step S202, the optimizer of the Yolov8-seg, Yolov8-YLBN, Yolov8-Swin and Yolov8-YLBNSwin instance segmentation network models is set to the SGD optimizer, the activation function is SiLU activation function, the initial learning rate is 0.01, the number of iterations is 100, the input image size is 704×520, the batch size is set to 4, and the loss functions are BCEWithLogitsLoss and DFL Loss+CIOU Loss.

[0029] More preferably, in step S203, AP is the area under the RP curve, and the recall R and precision P are calculated as follows:

[0030] $$R = \frac{TP}{TP + FN}$$

[0031] $$P = \frac{TP}{TP + FP}$$

[0032] Where TP represents the number of positive samples that are correctly classified; FN represents the number of positive samples that are misclassified; and FP represents the number of negative samples that are misclassified.

[0033] mAP is calculated as follows:

[0034]

[0035] Here, AP50 is the value of AP when the IoU threshold is 0.5.
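The relationship between the counts, precision, recall, and AP described above can be sketched in Python as follows. This is a minimal illustration, not the patent's evaluation code: the function names are hypothetical, and the area under the precision-recall curve is approximated with a plain rectangle rule, whereas common evaluation toolkits use interpolated variants.

```python
def precision_recall(tp: int, fp: int, fn: int) -> tuple[float, float]:
    """Return (precision, recall) from raw counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def average_precision(is_tp: list[bool], num_gt: int) -> float:
    """Approximate area under the precision-recall curve.

    is_tp[k] tells whether the k-th prediction, with predictions sorted
    by descending confidence, matched a ground-truth instance at the
    chosen IoU threshold (0.5 for AP50). num_gt is the number of
    ground-truth instances.
    """
    tp = fp = 0
    ap = 0.0
    prev_recall = 0.0
    for hit in is_tp:
        if hit:
            tp += 1
        else:
            fp += 1
        precision, recall = precision_recall(tp, fp, num_gt - tp)
        # Rectangle rule: precision times the recall gained at this step.
        ap += precision * (recall - prev_recall)
        prev_recall = recall
    return ap
```

For example, three ranked predictions of which the first and third are correct, against two ground-truth cells, give an AP of 0.5 + (2/3)(0.5) under this rule.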

[0036] Preferably, in step S3, Grad-CAM is used to visualize the last-layer output of the YOLOv8-seg, YOLOv8-YLBN, YOLOv8-Swin, and YOLOv8-YLBNSwin instance segmentation network models obtained in step S2. The image is converted to grayscale, and edge detection is performed with the Sobel operator in the x and y directions respectively; the convertScaleAbs() function is then used to convert the result back to the original uint8 form. Taking into account the feature extraction characteristics of YOLOv8-YLBNSwin, the cv2 library of Python is used to remove impurities smaller than 30 from the BT474 and MCF7 cell image data.
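The two image operations named in this step can be sketched in dependency-free Python. The patent uses OpenCV; the functions below are hypothetical stand-ins that mimic a 3×3 Sobel filter followed by the absolute-value-and-saturate behaviour of convertScaleAbs() with default arguments, plus a small-region filter. Reading "smaller than 30" as a pixel area of a connected region is an assumption, since the text gives no unit.

```python
from collections import deque

SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def sobel_abs(gray, kernel):
    """Apply a 3x3 Sobel kernel to a grayscale image (list of rows),
    then take the absolute value saturated to 0..255."""
    h, w = len(gray), len(gray[0])
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):  # border pixels are left at 0
        for x in range(1, w - 1):
            acc = 0
            for dy in range(3):
                for dx in range(3):
                    acc += kernel[dy][dx] * gray[y + dy - 1][x + dx - 1]
            out[y][x] = min(255, abs(acc))
    return out


def remove_small_regions(mask, min_area=30):
    """Zero out 4-connected foreground regions with fewer than
    min_area pixels (pixel-area interpretation is an assumption)."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    cleaned = [row[:] for row in mask]
    for sy in range(h):
        for sx in range(w):
            if not mask[sy][sx] or seen[sy][sx]:
                continue
            region, queue = [], deque([(sy, sx)])
            seen[sy][sx] = True
            while queue:  # breadth-first flood fill of one region
                y, x = queue.popleft()
                region.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if len(region) < min_area:
                for y, x in region:
                    cleaned[y][x] = 0
    return cleaned
```

With OpenCV itself, the corresponding calls would be cv2.Sobel, cv2.convertScaleAbs, and a connected-component routine such as cv2.connectedComponentsWithStats.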

[0037] Preferably, step S4 specifically includes:

[0038] S401. Use the optimized unstained epithelial cell image dataset to train, validate, and test YOLOv8-seg, YOLOv8-YLBN, YOLOv8-Swin, and YOLOv8-YLBNSwin to obtain the evaluation index results of the model segmentation and the segmentation contour display images.

[0039] S402. On the optimized unstained epithelial cell image dataset, YOLOv8-YLBNSwin achieves an mAP of 0.442 for bounding boxes (bbox) and 0.330 for segmentation (segm), and an AP50 of 0.691 for bbox and 0.665 for segm. The model has 7.56M parameters and requires 11.0 GFLOPs of computation.

[0040] S403. Generalization performance is validated using the fibroblast cell dataset. The mAP for bbox and segm is 0.365 and 0.215, respectively, and the AP50 for bbox and segm is 0.582 and 0.515, respectively.

[0041] In a second aspect, embodiments of the present invention provide an instance segmentation system for unstained epithelial cell images, comprising:

[0042] The data module divides the epithelial cell segmentation dataset into training, validation, and test sets;

[0043] The network module constructs a YOLOv8-YLBNSwin instance segmentation network model with a YOLOv8-BiFPN-Neck module and a Swin-transformer structure. The training set, validation set, and test set obtained from the data module are used to train, validate, and test the YOLOv8-YLBNSwin instance segmentation model and to run ablation experiments on it, obtaining the evaluation index results of the model segmentation.

[0044] The optimization module uses an interpretable model to guide the optimization of the model segmentation evaluation index results obtained by the network module.

[0045] The segmentation module uses the optimized model segmentation evaluation metrics from the optimization module to train, validate, and test the Yolov8-YLBNSwin instance segmentation model obtained from the network module. This yields new model segmentation evaluation metrics and segmentation contour visualizations. The generalization performance of the model is tested using a fibroblast cell dataset, achieving instance segmentation of epithelial cells.

[0046] Compared with the prior art, the present invention has at least the following beneficial effects:

[0047] This invention provides an instance segmentation method for images of unstained epithelial cells. It uses the YOLOv8-seg instance segmentation network as the base network to segment images of unstained epithelial cells. It combines deep feature fusion and parallel substructures to reduce model parameters and achieve lightweighting, and introduces a Swin-transformer structure to improve model performance. By visualizing the features extracted by the model and analyzing their characteristics, the dataset is optimized based on the feature extraction characteristics of YOLOv8-YLBNSwin to further improve model performance.

[0048] Furthermore, the initial data is subjected to contrast-limited adaptive histogram equalization (CLAHE), which enhances the contrast between cells and the background. The image is divided into sub-images, and histogram equalization is applied to each sub-image. Bilinear interpolation is used at the junctions of sub-images to ensure pixel continuity. Compared to the original image, after CLAHE processing the image contrast is enhanced, cell boundaries are clearer, and network training converges faster.

[0049] Furthermore, using YOLOv8-seg as the baseline network for instance segmentation ensures the lower bound of the network's segmentation results, playing a crucial role in subsequent network design. The YOLOv8-seg network is a single-stage instance segmentation network that directly outputs the location and boundary information of objects. It comprises three parts: Backbone, Neck, and Head. The Backbone employs a series of convolutional and deconvolutional layers to extract features, while also using residual connections and bottleneck structures to reduce network size and improve performance. It uses C2f modules as basic building blocks, which have fewer parameters and superior feature extraction capabilities. The Neck employs multi-scale feature fusion, fusing feature maps from different stages of the Backbone to enhance feature representation. It includes one SPPF module, one PAA module, and two PAN modules. The Head includes a detection head, a classification head, and a segmentation head. The detection head contains a series of convolutional and deconvolutional layers to generate detection results; the classification head uses global average pooling to classify each feature map.

[0050] Furthermore, the Neck structure is replaced with a YLBN module that incorporates the feature fusion concept of BiFPN and the parallel substructure of ParNet. The bidirectional connection mechanism in BiFPN facilitates efficient information flow. During upsampling and downsampling, BiFPN enables cross-scale information transfer and interaction through top-down and bottom-up connections, thereby better controlling information flow and maintaining feature consistency.
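To make the weighted fusion idea concrete, the sketch below shows the fast normalized fusion rule published with BiFPN, in which each input feature gets a non-negative learnable weight and the weights are normalized by their sum. Whether the YLBN module uses exactly this rule is an assumption; the function name and the flat-list representation of features are illustrative only.

```python
def fast_normalized_fusion(features, weights, eps=1e-4):
    """Fuse same-length feature vectors as
    out = sum_i(w_i * f_i) / (eps + sum_j w_j),
    with each weight clamped at zero (ReLU) before normalization."""
    w = [max(0.0, wi) for wi in weights]
    norm = eps + sum(w)
    length = len(features[0])
    return [
        sum(wi * f[k] for wi, f in zip(w, features)) / norm
        for k in range(length)
    ]
```

In a real network the inputs would be feature maps resized to a common resolution, and the weights would be trained together with the rest of the model.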

[0051] ParNet's parallel substructure consists of a Fusion Module and a Block Module. The Fusion Module allows for multiple branches in parallel, each focusing on feature maps at different resolutions. By fusing multiple branches, features from different layers can be combined, helping the model better adapt to targets of different scales and sizes, thus improving detection performance. The Block Module enhances feature representation by stacking multiple residual blocks. These residual blocks have skip connections, allowing features to flow freely within the network, enriching feature representation and better capturing target details and contextual information. The combination of the Fusion and Block Modules can reduce the number of parameters without changing model performance.

[0052] Furthermore, the Swin-transformer structure is introduced into the Backbone. The Swin Transformer employs a window-based image partitioning strategy, dividing the image into fixed-size windows, which significantly reduces computational and memory overhead when processing large images. It extracts and organizes features from the input image hierarchically; this hierarchical representation helps the model capture features at different levels, improving its understanding of different scales and semantics within the image. The Swin Transformer also introduces a window interaction mechanism, capturing global and local relationships in the image through relative position encoding and shifted-window attention between neighbouring windows. Compared to traditional global self-attention, this window-based attention mechanism reduces computational complexity and improves the model's scalability.
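The saving from window-based attention can be illustrated with a small sketch. The helper names are hypothetical, the grid stands in for a feature map of tokens, and the 56×56 grid with 7×7 windows in the usage note is an example configuration rather than a value taken from the patent.

```python
def window_partition(grid, window):
    """Split an H x W grid (list of rows) into non-overlapping
    window x window tiles, returned in row-major order."""
    h, w = len(grid), len(grid[0])
    if h % window or w % window:
        raise ValueError("grid size must be a multiple of the window size")
    tiles = []
    for top in range(0, h, window):
        for left in range(0, w, window):
            tiles.append([row[left:left + window] for row in grid[top:top + window]])
    return tiles


def attention_pairs(h, w, window=None):
    """Count query-key pairs: global attention compares every token with
    every token; windowed attention compares tokens within each tile only."""
    tokens = h * w
    if window is None:
        return tokens * tokens
    return tokens * window * window
```

For a 56×56 token grid, global attention needs 3136 × 3136 pairs, while 7×7 windows need 3136 × 49, a 64-fold reduction; the cost of windowed attention grows linearly with the number of tokens rather than quadratically.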

[0053] Furthermore, Grad-CAM (Gradient Weighted Class Activation Mapping) is used to decode the importance of each feature map to a specific class by analyzing the gradient in the last convolutional layer. Grad-CAM is used in the last layer of the backbone network to generate class activation maps to highlight important regions in the image. Analysis of the different models' focus on feature extraction for different epithelial cell types guides the model, optimizes the dataset, and further improves model performance, increasing the accuracy of epithelial cell instance segmentation.
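The Grad-CAM computation described above can be sketched as follows, assuming the activations of the chosen layer and the gradients of the class score with respect to them are already available as nested lists. The function name is hypothetical; obtaining the gradients from the actual network is outside this sketch.

```python
def grad_cam(activations, gradients):
    """activations, gradients: K feature maps, each an H x W list of rows.

    Returns the H x W map ReLU(sum_k alpha_k * A_k), where alpha_k is the
    spatial mean of the gradient of the class score with respect to A_k.
    """
    h, w = len(activations[0]), len(activations[0][0])
    alphas = [sum(sum(row) for row in g) / (h * w) for g in gradients]
    cam = [[0.0] * w for _ in range(h)]
    for alpha, fmap in zip(alphas, activations):
        for y in range(h):
            for x in range(w):
                cam[y][x] += alpha * fmap[y][x]
    # ReLU keeps only regions that contribute positively to the class.
    return [[max(0.0, v) for v in row] for row in cam]
```

The resulting map is usually upsampled to the input resolution and overlaid on the image to show which regions drove the prediction.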

[0054] It is understandable that the beneficial effects of the second aspect mentioned above can be found in the relevant descriptions in the first aspect mentioned above, and will not be repeated here.

[0055] In summary, this invention uses the YOLOv8-seg instance segmentation network model as the baseline framework. It replaces the Neck module with the YOLOv8-BiFPN-Neck module, which incorporates feature fusion and parallel substructures, enhancing cross-scale information transfer and interaction while reducing the number of model parameters. Furthermore, the introduction of the Swin-transformer structure leverages its excellent hierarchical feature extraction capabilities to improve the accuracy of epithelial cell instance segmentation. Grad-CAM is used to analyze the gradients in the last convolutional layer to decode the importance of each feature map for a specific class, optimizing the dataset and further improving model performance.

[0056] The technical solution of the present invention will be further described in detail below with reference to the accompanying drawings and embodiments.

Description of the Drawings

[0057] Figure 1 is the overall flowchart of the present invention;

[0058] Figure 2 is a schematic diagram of contrast-limited adaptive histogram equalization (CLAHE);

[0059] Figure 3 is a comparison of the histograms of an image after CLAHE processing and of the original image;

[0060] Figure 4 is the structure diagram of YOLOv8-seg;

[0061] Figure 5 is a schematic diagram of the BiFPN module;

[0062] Figure 6 is a schematic diagram of the Fusion module and the Block module;

[0063] Figure 7 is a schematic diagram of the YLBN module;

[0064] Figure 8 is a comparison of the Backbone before and after adding the Swin-transformer;

[0065] Figure 9 is the loss function curve for the YOLOv8-YLBNSwin training process;

[0066] Figure 10 is a comparison of the visualized features extracted from the last layer of the four models on the same image, alongside the ground-truth labels;

[0067] Figure 11 is a schematic diagram of a computer device provided in an embodiment of the present invention;

[0068] Figure 12 is a block diagram of a chip provided according to an embodiment of the present invention.

Detailed Implementation

[0069] The technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.

[0070] In the description of this invention, it should be understood that the terms "comprising" and "including" indicate the presence of the described features, integers, steps, operations, elements and/or components, but do not exclude the presence or addition of one or more other features, integers, steps, operations, elements, components and/or collections thereof.

[0071] It should also be understood that the terminology used in this specification is for the purpose of describing particular embodiments only and is not intended to limit the invention. As used in this specification and the appended claims, the singular forms “a,” “an,” and “the” are intended to include the plural forms unless the context clearly indicates otherwise.

[0072] It should also be further understood that the term "and / or" as used in this specification and the appended claims refers to any combination and all possible combinations of one or more of the associated listed items, and includes such combinations. For example, A and / or B can represent three cases: A alone, A and B simultaneously, and B alone. Additionally, the character " / " in this invention generally indicates that the preceding and following objects have an "or" relationship.

[0073] It should be understood that although terms such as first, second, third, etc., may be used in the embodiments of the present invention to describe the preset range, these preset ranges should not be limited to these terms. These terms are only used to distinguish the preset ranges from one another. For example, without departing from the scope of the embodiments of the present invention, the first preset range may also be referred to as the second preset range, and similarly, the second preset range may also be referred to as the first preset range.

[0074] Depending on the context, the word "if" as used here can be interpreted as "when," "upon," "in response to determination," or "in response to detection." Similarly, depending on the context, the phrase "if determination" or "if detection (of the stated condition or event)" can be interpreted as "when determination," "in response to determination," "when detection (of the stated condition or event)," or "in response to detection (of the stated condition or event)."

[0075] The accompanying drawings illustrate various structural schematic diagrams according to embodiments disclosed in this invention. These drawings are not to scale, and some details have been enlarged for clarity, and some details may have been omitted. The shapes of the various regions and layers shown in the drawings, as well as their relative sizes and positional relationships, are merely exemplary and may deviate from reality due to manufacturing tolerances or technical limitations. Furthermore, those skilled in the art can design regions / layers with different shapes, sizes, and relative positions as needed.

[0076] This invention provides an instance segmentation method for unstained epithelial cell images. Based on the YOLOv8-seg instance segmentation network model, it borrows the feature fusion idea of BiFPN and the parallel substructure of ParNet to construct a lightweight YOLOv8-BiFPN-Neck (YLBN) module. A Swin-transformer structure is introduced into the Backbone to improve model performance, constructing the YOLOv8-YLBNSwin instance segmentation network model. The constructed unstained epithelial cell image dataset is used for training and validation, and the segmentation performance of the network model is tested. Grad-CAM (gradient-weighted class activation mapping) is used to visualize the features extracted by YOLOv8-YLBNSwin for various cells, optimize the dataset, and further improve the model's segmentation performance on the epithelial cell dataset. The YOLOv8-YLBNSwin instance segmentation model and model evaluation metrics are obtained through retraining, validation, and testing. The generalization performance of the model is verified using a fibroblast dataset.

[0077] Referring to Figure 1, this invention discloses an instance segmentation method for unstained epithelial cell images. Based on the YOLOv8-seg instance segmentation network model, it replaces the Neck structure of YOLOv8-seg with a YLBN module featuring feature fusion and parallel substructures, enhancing information interaction between network layers and keeping the model lightweight. A Swin-transformer structure is introduced into the Backbone so that the network attends more closely to detailed cell information, resulting in more accurate segmentation. Grad-CAM (gradient-weighted class activation mapping) is used to visualize the features that YOLOv8-YLBNSwin extracts from various cells and to optimize the dataset; the model is then retrained and validated to obtain the YOLOv8-YLBNSwin instance segmentation model, further improving segmentation performance on epithelial cell datasets. The specific steps are as follows:

[0078] S1. Preprocess the LIVECell dataset and combine the five types of epithelial cell data in the preprocessed LIVECell dataset into an unstained epithelial cell image dataset. The unstained epithelial cell image dataset is divided into a training set, a validation set, and a test set. The three types of fibroblast cell data are combined into a fibroblast cell dataset.

[0079] S101, Dataset Acquisition

[0080] This invention uses the publicly available LIVECell dataset. The LIVECell dataset contains eight cell types: five epithelial cell types and three fibroblast cell types, totaling over 1.6 million annotated cells. Each cell type has annotated images showing growth from the initial seeding phase to a fully confluent monolayer, resulting in significant variations in cell size and shape, including large, flat SKOV3 cells and neuron-like SH-SY5Y cells. Each image is 704×520 pixels.

[0081] S102, Data Preprocessing and Partitioning

[0082] For the LIVECell dataset, contrast-limited adaptive histogram equalization (CLAHE) was used as the data preprocessing method to obtain the unstained epithelial cell image dataset and the fibroblast cell dataset. The datasets were divided according to a 6:1:3 ratio. The training set, validation set, and test set of the unstained epithelial cell image dataset consisted of 2089 images, 366 images, and 1044 images, respectively.

[0083] Referring to Figure 2 and Figure 3, the implementation process is as follows:

[0084] S1021. The contrast-limited adaptive histogram equalization (CLAHE) processing steps are as follows: divide the original image into many small blocks (sub-images), each M×N in size. Normalize the frequency of each pixel value within a sub-image to obtain the frequency of each pixel value, denoted as P(j). Calculate the cumulative distribution function (CDF) C(j), where... The equalized pixel value H(j) is calculated as follows:
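A 6:1:3 split can be sketched as below. The patent does not say how images were assigned to each subset, so the shuffling, the fixed seed, and the function name are assumptions. The reported counts (2089, 366, and 1044 of 3499 images) correspond to roughly 59.7%, 10.5%, and 29.8%, which is close to but not exactly 6:1:3, so this sketch will not reproduce them image for image.

```python
import random


def split_dataset(items, ratios=(6, 1, 3), seed=0):
    """Shuffle items reproducibly and split them into
    (train, validation, test) lists according to ratios."""
    items = list(items)
    random.Random(seed).shuffle(items)
    total = sum(ratios)
    n_train = len(items) * ratios[0] // total
    n_val = len(items) * ratios[1] // total
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]  # the remainder goes to the test set
    return train, val, test
```

Fixing the seed keeps the split identical across the ablation variants, so that differences in metrics come from the model rather than from the data partition.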

[0085]

[0086] Where L represents the range of pixel values.

[0087] S1022. Pixels in the original image are processed in three ways according to their location: pixels in non-adjacent regions are transformed directly using the transformation function; pixels in a single adjacent region are transformed and then linearly interpolated between the two adjacent sub-images; pixels in a double adjacent region are transformed and then bilinearly interpolated among the four adjacent sub-images. The BT474 cell image after CLAHE processing is shown in Figure 3.
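The per-tile part of CLAHE can be sketched in plain Python as follows. This is an illustration under stated assumptions rather than the patent's implementation: the clip limit is expressed as an absolute count per histogram bin, clipped excess is spread evenly over all bins, the mapping is taken as (levels − 1)·C(j) with rounding, and the interpolation between neighbouring tiles described in S1022 is omitted.

```python
def equalize_tile(tile, clip_limit, levels=256):
    """Contrast-limited histogram equalization of a single tile.

    tile: list of rows of ints in 0..levels-1.
    clip_limit: maximum count allowed in any histogram bin.
    """
    pixels = [v for row in tile for v in row]
    hist = [0] * levels
    for v in pixels:
        hist[v] += 1
    # Clip each bin and spread the clipped excess evenly over all bins,
    # which limits how steep the mapping (and so the contrast gain) can be.
    excess = sum(max(0, c - clip_limit) for c in hist)
    hist = [min(c, clip_limit) + excess / levels for c in hist]
    # Frequencies P(j) -> cumulative C(j) -> equalized value H(j).
    n = len(pixels)
    mapping, cum = [], 0.0
    for count in hist:
        cum += count / n
        mapping.append(round((levels - 1) * cum))
    return [[mapping[v] for v in row] for row in tile]
```

In OpenCV the full procedure, including tiling and interpolation, is available through cv2.createCLAHE, whose clip limit is defined differently from the absolute count used here.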

[0088] S2. Construct a YOLOv8-YLBNSwin instance segmentation network model with a YOLOv8-BiFPN-Neck (YLBN) module and a Swin-transformer structure. Use the training set, validation set, and test set of the unstained epithelial cell image dataset obtained in step S1 to train, validate, test, and perform ablation experiments on the YOLOv8-YLBNSwin instance segmentation network model.

[0089] Referring to Figures 4 to 9, the specific steps are as follows:

[0090] S201. Construct a YLBN module with feature fusion and parallel substructure, introduce the Swin-transformer structure in the Backbone, and obtain the YOLOv8-YLBNSwin instance segmentation network model.

[0091] The module construction process is as follows:

[0092] S2011. The overall structure of YOLOv8-seg is shown in Figure 4; it consists of a Backbone, a Neck, and a Head. Drawing on the top-down and bottom-up multi-scale feature fusion idea of BiFPN (Figure 5), first, the connection between the Neck layer and the Backbone is changed from P4, P6, P9 to P2, P4, P6, P9; second, a skip connection to the corresponding layer is added at each Concat.

[0093] S2012. Drawing on ParNet's idea of reducing model depth through parallel substructures, Fusion and Block modules are used to replace Concat, as shown in Figure 6. The Fusion module reduces resolution and increases width to achieve multi-scale processing, while the Block module combines information from multiple resolutions to reduce the number of parameters. This forms the YLBN module, as shown in Figure 7;

[0094] S2013. Introduce the Swin-transformer structure into the Backbone of YOLOv8-seg, replacing the two C2f modules therein, as shown in Figure 8;

[0095] S2014. Based on the YOLOv8-seg instance segmentation network model, the Neck structure is replaced with the YLBN module, which embodies feature fusion and parallel substructures. The Swin-transformer structure is introduced in the Backbone to obtain the YOLOv8-YLBNSwin instance segmentation network model.

[0096] S202. Remove modules from the YOLOv8-YLBNSwin network model obtained in step S201 to obtain YOLOv8-seg, YOLOv8-YLBN, YOLOv8-Swin, and YOLOv8-YLBNSwin segmentation networks; and use the epithelial cell segmentation training set, validation set, and test set obtained in step S1 to train, validate, and test the YOLOv8-seg, YOLOv8-YLBN, YOLOv8-Swin, and YOLOv8-YLBNSwin instance segmentation networks respectively.

[0097] The parameters were set as follows: the optimizer was SGD, the activation function was SiLU, the initial learning rate was 0.01, the number of iterations was 100, the input image size was 704×520, the batch size was 4, and the loss functions used were BCEWithLogitsLoss and DFL Loss+CIOU Loss.
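For reference, the settings above can be collected in a small configuration sketch. The dictionary keys are hypothetical names, not options of any particular training framework; reading the 100 "iterations" as training epochs is an assumption, and the image size follows the 704×520 pixels stated for LIVECell images.

```python
TRAIN_CONFIG = {
    "optimizer": "SGD",
    "activation": "SiLU",
    "initial_lr": 0.01,
    "epochs": 100,             # the text says "iterations"; read here as epochs
    "image_size": (704, 520),  # width x height of LIVECell images
    "batch_size": 4,
    "losses": ("BCEWithLogitsLoss", "DFL Loss", "CIOU Loss"),
}


def steps_per_epoch(num_images, batch_size):
    """Number of optimizer steps needed to see every image once."""
    return -(-num_images // batch_size)  # ceiling division
```

With the 2089 training images reported for the epithelial dataset and a batch size of 4, one pass over the data takes 523 steps.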

[0098] The formula for calculating the SiLU activation function is as follows:

[0099] $$\mathrm{SiLU}(x) = x \cdot \sigma(x) = \frac{x}{1 + e^{-x}}$$

[0100] Here, x represents a pixel in the image. The SiLU function is smooth, which can make the network more stable and reduce the risk of overfitting. When the network is relatively deep, gradients propagate better owing to the fundamental properties of the sigmoid function.
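A minimal sketch of SiLU and its derivative, using the standard definition SiLU(x) = x·σ(x), shows the smoothness referred to above. The function names are illustrative.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic function, written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def silu(x: float) -> float:
    """SiLU(x) = x * sigmoid(x)."""
    return x * sigmoid(x)


def silu_grad(x: float) -> float:
    """d/dx SiLU(x) = sigmoid(x) * (1 + x * (1 - sigmoid(x)))."""
    s = sigmoid(x)
    return s * (1.0 + x * (1.0 - s))
```

Unlike ReLU, the derivative is continuous at zero (it equals 0.5 there) and stays non-zero for moderately negative inputs, which is one reason gradients can keep flowing in deeper networks.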

[0101] The three loss functions are calculated as follows:

[0102]

[0103] $$\mathrm{DFL}(S_i, S_{i+1}) = -\big((y_{i+1} - y)\log(S_i) + (y - y_i)\log(S_{i+1})\big)$$

[0104]

[0105] where the sigmoid output gives the probability that a pixel belongs to the target class, $y$ is the sample label, $\sigma$ is the sigmoid function, and $y_i$ and $y_{i+1}$ are the two labels closest to label $y$, with $y_i \le y \le y_{i+1}$.

[0106] Δ represents the distance loss.

[0107]

[0108]

[0109] Λ represents the angular loss:

[0110]

[0111] where $c_h$ represents the height difference between the center points of the ground-truth and predicted bounding boxes, and $\theta$ represents the distance between those center points.
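Two of the losses named in [0097] can be sketched numerically. The block below is an illustration under assumptions, not the patent's implementation: `bce_with_logits` follows the standard binary cross-entropy on a raw logit (the behaviour documented for PyTorch's BCEWithLogitsLoss on a single element), and `dfl` follows the DFL expression in [0103], assuming integer bins 0..n-1 so that $y_i$ is the floor of the target and $y_{i+1} = y_i + 1$. The bounding-box regression term is omitted because its formulas are not recoverable from the text.

```python
import math


def bce_with_logits(logit: float, target: float) -> float:
    """Binary cross-entropy on a raw logit, in the numerically stable form
    max(x, 0) - x*y + log(1 + exp(-|x|))."""
    return max(logit, 0.0) - logit * target + math.log1p(math.exp(-abs(logit)))


def dfl(y: float, probs: list[float]) -> float:
    """Distribution focal loss for a continuous target y.

    probs holds probabilities over integer bins 0..n-1 (an assumption);
    only the two bins bracketing y contribute.
    """
    y_i = min(int(math.floor(y)), len(probs) - 2)  # left neighbouring bin
    y_next = y_i + 1                               # right neighbouring bin
    return -((y_next - y) * math.log(probs[y_i])
             + (y - y_i) * math.log(probs[y_next]))


print(bce_with_logits(2.0, 1.0))
print(dfl(2.5, [0.1, 0.1, 0.4, 0.4]))
```

The DFL weights are the distances from the target to the opposite bin, so the closer bin receives the larger weight.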

[0112] The training loss function curve of the Yolov8-YLBNSwin segmentation network model is shown in Figure 9.

[0113] S203. Using the epithelial cell segmentation test set obtained in step S1, test the trained YOLOv8-seg, YOLOv8-YLBN, YOLOv8-Swin, and YOLOv8-YLBNSwin instance segmentation network models respectively; evaluate the test results using the AP50 and mAP of instance segmentation (segm) and bounding boxes (bbox), the number of model parameters, and FLOPs to obtain the evaluation metrics for model segmentation.

[0114] AP is the area under the RP curve, where R (Recall) and P (Precision) are calculated using the following formulas:

[0115] $$R = \frac{TP}{TP + FN}$$

[0116] $$P = \frac{TP}{TP + FP}$$

[0117] where TP represents the number of correctly classified positive samples, FN represents the number of misclassified positive samples, and FP represents the number of misclassified negative samples. AP50 is the AP value at an IoU threshold of 0.5, and AP75 is the AP value at an IoU threshold of 0.75. The mAP is calculated as follows:

[0118]
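The relationship between precision, recall, and AP can be illustrated with a small sketch. It assumes predictions have already been matched to ground truth at a fixed IoU threshold (for example 0.5 for AP50) and sorted by descending confidence, and it integrates the precision envelope over recall. This all-point integration is one common convention chosen here for illustration; the patent does not state which interpolation it uses.

```python
def precision_recall(tp: int, fp: int, fn: int) -> tuple[float, float]:
    """Precision P = TP/(TP+FP) and recall R = TP/(TP+FN)."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def average_precision(hits: list[bool], num_ground_truth: int) -> float:
    """Area under the R-P curve.

    hits[i] is True when the i-th most confident prediction matches a
    ground-truth instance at the chosen IoU threshold.
    """
    tp = fp = 0
    precisions, recalls = [], []
    for hit in hits:
        tp += int(hit)
        fp += int(not hit)
        p, r = precision_recall(tp, fp, num_ground_truth - tp)
        precisions.append(p)
        recalls.append(r)
    # Replace each precision by the best precision at any higher recall.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    # Sum rectangle areas between consecutive recall values.
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap


print(average_precision([True, False, True], num_ground_truth=2))
```

Repeating the computation at several IoU thresholds and averaging the results is one way an averaged AP score can be formed.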

[0119] The final segmentation metrics of the Yolov8-YLBNSwin instance segmentation network model on the epithelial cell segmentation test set are shown in Table 1. The total training loss of Yolov8-YLBNSwin is shown in Figure 9.

[0120] Table 1. Results of Epithelial Cell Data Set Segmentation Experiment

[0121]

[0122] S3. Use the interpretable model to guide the Yolov8-YLBNSwin instance segmentation model trained in step S2, optimize the data, and further improve the model's segmentation performance.

[0123] Grad-CAM (Gradient-weighted Class Activation Mapping) was used to visualize the last-layer features of YOLOv8-seg, YOLOv8-YLBN, YOLOv8-Swin, and YOLOv8-YLBNSwin, extracting the features of the various cells and comparing them with the ground-truth labels, as shown in Figure 10. It was found that after introducing the combination of Swin-transformer and BiFPN, the model's ability to detect small targets was greatly enhanced, so impurities in the image were also treated as cells and segmented, resulting in... The features extracted by YOLOv8-seg, YOLOv8-YLBN, YOLOv8-Swin, and YOLOv8-YLBNSwin and the ground-truth labels are shown in the table below. Therefore, the epithelial dataset is optimized.
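The core Grad-CAM computation can be sketched independently of any particular network. The example below assumes the last-layer activations and the gradients of the target score with respect to them have already been extracted as arrays of shape (K, H, W); how they are obtained from YOLOv8 is not shown and is outside what the text specifies.

```python
import numpy as np


def grad_cam(activations: np.ndarray, gradients: np.ndarray) -> np.ndarray:
    """Grad-CAM heat map from feature maps and their gradients, both (K, H, W)."""
    # Channel weights: global average of the gradients over the spatial axes.
    weights = gradients.mean(axis=(1, 2))
    # Weighted sum of the feature maps, giving an (H, W) map.
    cam = np.tensordot(weights, activations, axes=1)
    # Keep only regions that contribute positively to the target score.
    cam = np.maximum(cam, 0.0)
    peak = cam.max()
    return cam / peak if peak > 0 else cam


# Toy input: two 2x2 feature maps with constant gradients.
acts = np.stack([np.eye(2), np.ones((2, 2))])
grads = np.stack([np.ones((2, 2)), -0.5 * np.ones((2, 2))])
print(grad_cam(acts, grads))
```

The resulting map is typically upsampled to the input resolution and overlaid on the image, which is how spurious responses on impurities would become visible.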

[0124] Observation of the cell dataset images revealed that impurities were most prevalent in BT474 and MCF7 cell images, while they were rare in the other three types. To avoid additional losses caused by image manipulation, optimization was only performed on BT474 and MCF7. The optimization was mainly based on Python's OpenCV library, and the optimization steps are as follows:

[0125] First, convert the image to grayscale, and then use the Sobel operator to perform edge detection in the x and y directions respectively;

[0126] Then use the convertScaleAbs() function to convert it back to its original uint8 form;

[0127] Next, use cv2.connectedComponentsWithStats to perform connected component analysis, obtain the area of each region, and mark regions with an area of less than 30 for deletion while retaining their location information;

[0128] Finally, use cv2.inpaint to fill the deleted regions with the surrounding pixels, completing the optimization.

[0129] S4. The optimized dataset obtained in step S3 is used to retrain, validate, and test the Yolov8-YLBNSwin instance segmentation model obtained in step S2. The new instance segmentation evaluation metrics are shown in Table 2. The generalization performance of the model is tested using the fibroblast cell dataset. Instance segmentation of epithelial cells is achieved.

[0130] Table 2. Experimental results of the epithelial cell segmentation model test set.

[0131]

[0132] The fibroblast cell dataset was processed in the same way as the epithelial cell dataset, and generalization performance was tested using YOLOv8-seg, YOLOv8-YLBN, YOLOv8-Swin, and YOLOv8-YLBNSwin. The generalization performance evaluation results of the model are shown in Table 3.

[0133] Table 3. Experimental results of the generalization performance of the epithelial cell segmentation model.

[0134]

[0135] In another embodiment of the present invention, an instance segmentation system for unstained epithelial cell images is provided. This system can be used to implement the above-mentioned instance segmentation method for unstained epithelial cell images. Specifically, the instance segmentation system for unstained epithelial cell images includes a data module, a network module, an optimization module, and a segmentation module.

[0136] The data module divides the epithelial cell segmentation dataset into a training set, a validation set, and a test set.

[0137] The network module constructs a Yolov8-YLBNSwin instance segmentation network model with a Yolov8-BiFPN-Neck module and a Swin-transformer structure. The training set, validation set, and test set obtained from the data module are used to train, validate, test, and perform ablation experiments on the Yolov8-YLBNSwin instance segmentation model to obtain the evaluation index results of the model segmentation.

[0138] The optimization module uses an interpretable model to guide the optimization of the model segmentation evaluation index results obtained by the network module.

[0139] The segmentation module uses the optimized model segmentation evaluation metrics from the optimization module to train, validate, and test the Yolov8-YLBNSwin instance segmentation model obtained from the network module. This yields new model segmentation evaluation metrics and segmentation contour visualizations. The generalization performance of the model is tested using a fibroblast cell dataset, achieving instance segmentation of epithelial cells.

[0140] In another embodiment of the present invention, a terminal device is provided, comprising a processor and a memory. The memory stores a computer program, which includes program instructions. The processor executes the program instructions stored in the computer storage medium. The processor may be a Central Processing Unit (CPU), or other general-purpose processors, digital signal processors (DSPs), application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, etc. It is the computing and control core of the terminal, suitable for implementing one or more instructions, specifically suitable for loading and executing one or more instructions to achieve a corresponding method flow or corresponding function. The processor described in this embodiment of the present invention can be used for the operation of an instance segmentation method for unstained epithelial cell images, including:

[0141] The epithelial cell segmentation dataset was divided into training, validation, and test sets. A Yolov8-YLBNSwin instance segmentation network model with a Yolov8-BiFPN-Neck module and a Swin-transformer structure was constructed. The Yolov8-YLBNSwin instance segmentation model was trained, validated, tested, and subjected to ablation experiments using the training, validation, and test sets, respectively, to obtain the model segmentation evaluation metrics. The obtained model segmentation evaluation metrics were optimized using an interpretable model. The optimized model segmentation evaluation metrics were then used to train, validate, and test the obtained Yolov8-YLBNSwin instance segmentation model, resulting in new model segmentation evaluation metrics and segmentation contour visualizations. The generalization performance of the model was tested using a fibroblast cell dataset, achieving instance segmentation of epithelial cells.

[0142] Referring to Figure 11, the terminal device is a computer device. In this embodiment, the computer device 60 includes a processor 61, a memory 62, and a computer program 63 stored in the memory 62 and executable on the processor 61. When executed by the processor 61, the computer program 63 implements the instance segmentation method for unstained epithelial cell images of this embodiment. To avoid repetition, the details are not elaborated here. Alternatively, when executed by the processor 61, the computer program 63 implements the functions of each module / unit in the instance segmentation system for unstained epithelial cell images of this embodiment. To avoid repetition, the details are not elaborated here.

[0143] Computer device 60 can be a desktop computer, laptop, handheld computer, cloud server, or other computing device. Computer device 60 may include, but is not limited to, a processor 61 and a memory 62. Those skilled in the art will understand that Figure 11 is merely an example of computer device 60 and does not constitute a limitation on computer device 60. It may include more or fewer components than shown, combine certain components, or use different components. For example, the computer device may also include input / output devices, network access devices, buses, etc.

[0144] The processor 61 may be a Central Processing Unit (CPU), or other general-purpose processors, digital signal processors (DSPs), application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, etc. A general-purpose processor may be a microprocessor or any conventional processor.

[0145] The memory 62 can be an internal storage unit of the computer device 60, such as a hard disk or RAM of the computer device 60. The memory 62 can also be an external storage device of the computer device 60, such as a plug-in hard disk, smart media card (SMC), secure digital (SD) card, flash card, etc. equipped on the computer device 60.

[0146] Furthermore, the memory 62 may include both internal storage units of the computer device 60 and external storage devices. The memory 62 is used to store computer programs and other programs and data required by the computer device. The memory 62 can also be used to temporarily store data that has been output or will be output.

[0147] Referring to Figure 12, the terminal device is a chip. In this embodiment, the chip 600 includes a processor 622, which may be one or more, and a memory 632 for storing computer programs executable by the processor 622. The computer program stored in the memory 632 may include one or more modules, each corresponding to a set of instructions. Furthermore, the processor 622 may be configured to execute the computer program to perform the instance segmentation method for unstained epithelial cell images described above.

[0148] Additionally, chip 600 may also include a power supply component 626 and a communication component 650. The power supply component 626 can be configured to perform power management of chip 600, and the communication component 650 can be configured to enable communication of chip 600, such as wired or wireless communication. Furthermore, chip 600 may also include an input / output interface 658. Chip 600 can operate on an operating system stored in memory 632.

[0149] In another embodiment of the present invention, a storage medium is provided, specifically a computer-readable storage medium, which is a memory device in a terminal device for storing programs and data. It is understood that the computer-readable storage medium here can include both the built-in storage medium in the terminal device and extended storage media supported by the terminal device. The computer-readable storage medium provides storage space that stores the terminal's operating system. Furthermore, the storage space also stores one or more instructions suitable for loading and execution by a processor; these instructions can be one or more computer programs. It should be noted that the computer-readable storage medium here can be high-speed RAM or non-volatile memory, such as at least one disk storage device.

[0150] One or more instructions stored in a computer-readable storage medium can be loaded and executed by a processor to implement the corresponding steps of the instance segmentation method for unstained epithelial cell images in the above embodiments; one or more instructions in the computer-readable storage medium are loaded and executed by the processor to perform the following steps:

[0151] The epithelial cell segmentation dataset was divided into training, validation, and test sets. A Yolov8-YLBNSwin instance segmentation network model with a Yolov8-BiFPN-Neck module and a Swin-transformer structure was constructed. The Yolov8-YLBNSwin instance segmentation model was trained, validated, tested, and subjected to ablation experiments using the training, validation, and test sets, respectively, to obtain the model segmentation evaluation metrics. The obtained model segmentation evaluation metrics were optimized using an interpretable model. The optimized model segmentation evaluation metrics were then used to train, validate, and test the obtained Yolov8-YLBNSwin instance segmentation model, resulting in new model segmentation evaluation metrics and segmentation contour visualizations. The generalization performance of the model was tested using a fibroblast cell dataset, achieving instance segmentation of epithelial cells.

[0152] To make the objectives, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. The components of the embodiments of the present invention described and shown in the accompanying drawings can generally be arranged and designed in various different configurations. Therefore, the following detailed description of the embodiments of the present invention provided in the accompanying drawings is not intended to limit the scope of the claimed invention, but merely to illustrate selected embodiments of the invention. All other embodiments obtained by those skilled in the art based on the embodiments of the present invention without inventive effort are within the scope of protection of the present invention.

[0153] Compared with the traditional Mask-RCNN and CenterMask models, the YOLOv8-seg instance segmentation model significantly improves the segmentation accuracy of epithelial cell images while greatly reducing the number of parameters and floating-point operations. The improved model, the YOLOv8-YLBNSwin instance segmentation network model, has 39% fewer parameters than YOLOv8-seg; its bounding-box and segm mAP increase by 0.02 and 0.017, respectively, and its bounding-box and segm AP50 increase by 0.015 and 0.018, respectively.

[0154] In summary, this invention provides an instance segmentation method and system for unstained epithelial cell images. It uses a Yolov8-seg network as the base network for instance segmentation of unstained epithelial cell images. It combines deep feature fusion and parallel substructures to reduce model parameters, achieving lightweight design. A Swin-transformer structure is introduced to improve model performance. Grad-CAM (gradient-weighted class activation mapping) is used to visualize the extracted features, analyze their characteristics, and optimize the dataset based on the feature extraction characteristics of Yolov8-YLBNSwin, further improving model performance. Under the same experimental conditions, its generalization performance is also superior to three other models.

[0155] Those skilled in the art will clearly understand that, for the sake of convenience and brevity, the above-described division of functional units and modules is merely an example. In practical applications, the above functions can be assigned to different functional units and modules as needed, that is, the internal structure of the device can be divided into different functional units or modules to complete all or part of the functions described above. The functional units and modules in the embodiments can be integrated into one processing unit, or each unit can exist physically separately, or two or more units can be integrated into one unit. The integrated unit can be implemented in hardware or as a software functional unit. Furthermore, the specific names of the functional units and modules are only for easy differentiation and are not intended to limit the scope of protection of this application. The specific working process of the units and modules in the above system can be referred to the corresponding process in the foregoing method embodiments, and will not be repeated here.

[0156] In the above embodiments, the descriptions of each embodiment have different focuses. For parts that are not described in detail or recorded in a certain embodiment, please refer to the relevant descriptions of other embodiments.

[0157] Those skilled in the art will recognize that the units and algorithm steps of the various examples described in conjunction with the embodiments disclosed in this invention can be implemented in electronic hardware, or a combination of computer software and electronic hardware. Whether these functions are implemented in hardware or software depends on the specific application and design constraints of the technical solution. Those skilled in the art can use different methods to implement the described functions for each specific application, but such implementations should not be considered beyond the scope of this invention.

[0158] In the embodiments provided by this invention, it should be understood that the disclosed devices / terminals and methods can be implemented in other ways. For example, the device / terminal embodiments described above are merely illustrative. For instance, the division of modules or units is only a logical functional division, and in actual implementation, there may be other division methods. For example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. Furthermore, the coupling or direct coupling or communication connection shown or discussed may be through some interfaces; the indirect coupling or communication connection between devices or units may be electrical, mechanical, or other forms.

[0159] The units described as separate components may or may not be physically separate. The components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units can be selected to achieve the purpose of this embodiment according to actual needs.

[0160] Furthermore, the functional units in the various embodiments of the present invention can be integrated into one processing unit, or each unit can exist physically separately, or two or more units can be integrated into one unit. The integrated unit can be implemented in hardware or as a software functional unit.

[0161] If the integrated module / unit is implemented as a software functional unit and sold or used as an independent product, it can be stored in a computer-readable storage medium. Based on this understanding, all or part of the processes in the methods of the above embodiments can also be implemented by a computer program instructing related hardware. The computer program can be stored in a computer-readable storage medium, and when executed by a processor, it can implement the steps of the various method embodiments described above. The computer program includes computer program code, which can be in the form of source code, object code, executable files, or certain intermediate forms. The computer-readable medium can include: any entity or device capable of carrying the computer program code, recording media, USB flash drives, portable hard drives, magnetic disks, optical disks, computer memory, read-only memory (ROM), random-access memory (RAM), electrical carrier signals, telecommunication signals, and software distribution media, etc. It should be noted that the content included in the computer-readable medium can be appropriately added or removed according to the requirements of legislation and patent practice in the jurisdiction. For example, in some jurisdictions, according to legislation and patent practice, computer-readable media do not include electrical carrier signals and telecommunication signals.

[0162] This application is described with reference to flowchart illustrations and / or block diagrams of methods, apparatus, and computer program products according to embodiments of this application. It will be understood that each block of the flowchart illustrations and / or block diagrams, and combinations of blocks in the flowchart illustrations and / or block diagrams, can be implemented by computer program instructions. These computer program instructions can be provided to a processor of a general-purpose computer, special-purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create a device for implementing the functions specified in one or more processes of the flowchart and / or one or more boxes of the block diagram.

[0163] These computer program instructions may also be stored in a computer-readable storage medium that can direct a computer or other programmable data processing device to function in a particular manner, such that the instructions stored in the computer-readable storage medium produce an article of manufacture including instruction means that implement the functions specified in one or more processes of the flowchart and / or one or more boxes of the block diagram.

[0164] These computer program instructions may also be loaded onto a computer or other programmable data processing equipment to cause a series of operational steps to be performed on the computer or other programmable equipment to produce a computer-implemented process, such that the instructions executed on the computer or other programmable equipment provide steps for implementing the functions specified in one or more processes of the flowchart and / or one or more boxes of the block diagram.

[0165] The above content is only for illustrating the technical concept of the present invention and should not be construed as limiting the scope of protection of the present invention. Any modifications made to the technical solution based on the technical concept proposed in this invention shall fall within the scope of protection of the claims of this invention.

Claims

1. An instance segmentation method for unstained epithelial cell images, characterized in that it includes the following steps: S1. Divide the epithelial cell segmentation dataset into a training set, a validation set, and a test set; S2. Construct a YOLOv8-YLBNSwin instance segmentation network model with a YOLOv8-BiFPN-Neck module and a Swin-transformer structure, and use the training set, validation set, and test set obtained in step S1 to train, validate, test, and perform ablation experiments on the YOLOv8-YLBNSwin instance segmentation model to obtain the evaluation index results of the model segmentation, specifically: S201. Drawing on the feature fusion concept of the weighted Bidirectional Feature Pyramid Network (BiFPN) and the parallel substructure of ParNet, construct a YLBN module to replace the Neck module of YOLOv8-seg, and introduce the Swin-transformer into the Backbone to obtain the YOLOv8-YLBNSwin instance segmentation network model, specifically: S2011. For the YLBN module, first, the connection between the Neck layer and the Backbone is changed from P3, P4, P5 to P2, P3, P4, P5; second, a skip connection of the corresponding layer is added at Concat; S2012. Replace the Concat and Fusion modules with Fusion and Block modules; S2013. Replace the two C2f modules at the end of the Backbone with the Swin-transformer structure, changing the number of repetitions of the first to 9 while keeping that of the second unchanged at 3; S2014. Based on the YOLOv8-seg instance segmentation network model, use the YLBN module to replace the Neck module of YOLOv8-seg and introduce the Swin-transformer into the Backbone of YOLOv8-seg to construct the YOLOv8-YLBNSwin instance segmentation network model; S202. Remove modules from the YOLOv8-YLBNSwin instance segmentation network model obtained in step S201 to obtain the YOLOv8-seg, YOLOv8-YLBN, YOLOv8-Swin, and YOLOv8-YLBNSwin instance segmentation network models; S203. Use the training set and validation set obtained in step S1 to train and validate the YOLOv8-seg, YOLOv8-YLBN, YOLOv8-Swin, and YOLOv8-YLBNSwin instance segmentation networks obtained in step S202, respectively; use the test set obtained in step S1 to test the trained YOLOv8-seg, YOLOv8-YLBN, YOLOv8-Swin, and YOLOv8-YLBNSwin segmentation networks, respectively; and evaluate the test results using the AP50 and mAP of instance segmentation and detection boxes, the number of model parameters, and FLOPs to obtain the evaluation index results of the model segmentation; S3. Use the interpretable model to guide the optimization of the model segmentation evaluation index results obtained in step S2; S4. Using the optimized model segmentation evaluation index results from step S3, train, validate, and test the YOLOv8-YLBNSwin instance segmentation model obtained in step S2 to obtain new model segmentation evaluation index results and segmentation contour display images, and use the fibroblast cell dataset to test the generalization performance of the model, achieving instance segmentation of epithelial cells.

2. The instance segmentation method for unstained epithelial cell images according to claim 1, characterized in that, in step S1, contrast-limited adaptive histogram equalization is applied to the entire LIVECell cell dataset; the five epithelial cell datasets in the LIVECell cell dataset then form the epithelial cell instance segmentation dataset, which is divided into a training set, a validation set, and a test set at a ratio of 6:1:3; and the three fibroblast cell datasets form a fibroblast cell dataset for generalization performance testing.

3. The instance segmentation method for unstained epithelial cell images according to claim 2, characterized in that the steps of contrast-limited adaptive histogram equalization are as follows: the original image is divided into multiple sub-images, each sub-image being of size [size missing]; the frequency $P(j)$ of each pixel value within the sub-image is obtained by normalizing the number of occurrences of that pixel value; the cumulative distribution function (CDF) $C(j)$ is calculated; and the equalized pixel values $H(j)$ are calculated, expressed as: [formula missing], where [symbol missing] indicates the range of pixel values.

4. The instance segmentation method of a non-stained epithelial type cell image according to claim 1, characterized by, In step S202, the optimizer for the Yolov8-seg, Yolov8-YLBN, Yolov8-Swin, and Yolov8-YLBNSwin instance segmentation network models is set to the SGD optimizer, the activation function is SiLU, the initial learning rate is 0.01, the number of iterations is 100, the input image size is 704×520, the batch size is set to 4, and the loss functions used are BCEWithLogitsLoss and DFL Loss+CIOULoss.

5. The instance segmentation method for unstained epithelial cell images according to claim 1, characterized in that, in step S203, AP is the area under the R-P curve, and the recall $R$ and precision $P$ are calculated as $R = TP/(TP+FN)$ and $P = TP/(TP+FP)$, where $TP$ represents the number of correctly classified positive samples, $FN$ represents the number of misclassified positive samples, and $FP$ represents the number of misclassified negative samples; mAP is calculated as follows: [formula missing], where AP50 is the value of AP when the IoU threshold is 0.5.

6. The instance segmentation method for unstained epithelial cell images according to claim 1, characterized in that, in step S3, Grad-CAM is used to visualize the last-layer output of the YOLOv8-seg, YOLOv8-YLBN, YOLOv8-Swin, and YOLOv8-YLBNSwin instance segmentation network models obtained in step S2; the images are converted to grayscale and the Sobel operator is used for edge detection in the x and y directions; the convertScaleAbs() function is then used to convert the result back to the original uint8 form; and, taking into account the feature extraction characteristics of YOLOv8-YLBNSwin, Python's cv2 library is used to remove impurities with an area smaller than 30 from the BT474 and MCF7 cell image data.

7. The instance segmentation method for unstained epithelial cell images according to claim 1, characterized in that step S4 is as follows: S401. Use the optimized unstained epithelial cell image dataset to train, validate, and test YOLOv8-seg, YOLOv8-YLBN, YOLOv8-Swin, and YOLOv8-YLBNSwin to obtain the evaluation index results and segmentation contour display images of the model segmentation; S402. On the optimized unstained epithelial cell image dataset, YOLOv8-YLBNSwin achieves bounding-box and segm mAP of 0.442 and 0.330, respectively, and bounding-box and segm AP50 of 0.691 and 0.665, respectively, with 7.56M parameters and 11.0 GFLOPs of computation; S403. Generalization performance is validated using a fibroblast cell dataset, on which the bounding-box and segm mAP are 0.365 and 0.215, respectively, and the bounding-box and segm AP50 are 0.582 and 0.515, respectively.

8. An instance segmentation system for unstained epithelial cell images, characterized in that it includes: a data module, which divides the epithelial cell segmentation dataset into a training set, a validation set, and a test set; a network module, which constructs a YOLOv8-YLBNSwin instance segmentation network model with a YOLOv8-BiFPN-Neck module and a Swin-transformer structure, and uses the training set, validation set, and test set obtained from the data module to train, validate, test, and perform ablation experiments on the YOLOv8-YLBNSwin instance segmentation model to obtain the evaluation index results of the model segmentation, specifically: drawing on the feature fusion concept of the weighted Bidirectional Feature Pyramid Network (BiFPN) and the parallel substructure of ParNet, a YLBN module is constructed to replace the Neck module of YOLOv8-seg, and the Swin-transformer is introduced into the Backbone to obtain the YOLOv8-YLBNSwin instance segmentation network model, as follows: for the YLBN module, first, the connection between the Neck layer and the Backbone is changed from P3, P4, P5 to P2, P3, P4, P5, and second, a skip connection of the corresponding layer is added at Concat; the Fusion and Block modules are used to replace the Concat and Fusion modules; the two C2f modules at the end of the Backbone are replaced with the Swin-transformer structure, the number of repetitions of the first being changed to 9 while that of the second remains unchanged at 3; based on the YOLOv8-seg instance segmentation network model, the YLBN module replaces the Neck module of YOLOv8-seg and the Swin-transformer is introduced into the Backbone of YOLOv8-seg to construct the YOLOv8-YLBNSwin instance segmentation network model; modules are removed from the obtained YOLOv8-YLBNSwin instance segmentation network model to obtain the YOLOv8-seg, YOLOv8-YLBN, YOLOv8-Swin, and YOLOv8-YLBNSwin instance segmentation network models; the obtained training set and validation set are used to train and validate the obtained YOLOv8-seg, YOLOv8-YLBN, YOLOv8-Swin, and YOLOv8-YLBNSwin instance segmentation networks, respectively; the obtained test set is used to test the trained YOLOv8-seg, YOLOv8-YLBN, YOLOv8-Swin, and YOLOv8-YLBNSwin segmentation networks, respectively; and the test results are evaluated using the AP50 and mAP of instance segmentation and detection boxes, the number of model parameters, and FLOPs to obtain the evaluation index results of the model segmentation; an optimization module, which uses an interpretable model to guide the optimization of the model segmentation evaluation index results obtained by the network module; and a segmentation module, which uses the optimized model segmentation evaluation index results from the optimization module to train, validate, and test the YOLOv8-YLBNSwin instance segmentation model obtained from the network module, yielding new model segmentation evaluation index results and segmentation contour display images, and tests the generalization performance of the model using a fibroblast cell dataset, achieving instance segmentation of epithelial cells.