Intelligent screening method and system for abnormal cells in liquid-based cervical cytological pathological images
By combining instance segmentation networks and cascaded classification strategies with attention-guided algorithms, the problems of inaccurate cell segmentation and insufficient localization of high-risk areas in cervical cancer screening are solved, achieving high-precision abnormal cell identification and efficient pathological diagnosis process.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- NANFANG HOSPITAL OF SOUTHERN MEDICAL UNIV
- Filing Date
- 2026-03-30
- Publication Date
- 2026-06-26
AI Technical Summary
Current technologies for cervical cancer screening suffer from insufficient cell segmentation accuracy, low matching degree between classification strategies and clinical diagnostic procedures, and inadequate ability to quickly locate high-risk areas, resulting in low screening accuracy and efficiency.
Cell contour extraction is performed using an instance segmentation network, abnormal cell classification is performed using a cascaded classification strategy, and high-risk fields of view are quickly located using an attention-guided algorithm. Combined with morphological feature extraction and the hierarchical diagnostic logic of the TBS reporting system, an abnormal cell annotation atlas and TBS classification assistance suggestions are generated.
It improved the accuracy of single cell contour segmentation, and the accuracy of abnormal cell identification reached over 92%, increasing the efficiency of pathologists in reading slides by 40%.
Figure CN122289201A_ABST
Description
Technical Field
[0001] This invention belongs to the field of medical image processing and artificial intelligence-assisted diagnosis technology, specifically relating to a liquid-based cytopathological image abnormal cell intelligent detection and classification method and system for cervical cancer screening.
Background Technology
[0002] Cervical cancer is one of the most common malignant tumors among women worldwide. Early screening and timely intervention are crucial for reducing the incidence and mortality of cervical cancer. Liquid-based thin-layer cytology (TLC) has become the mainstream method for cervical cancer screening. It involves collecting, preparing, and microscopically examining exfoliated cervical cells to assess their morphological characteristics and determine the presence of precancerous or malignant lesions. However, traditional manual microscopic examination methods face technical challenges such as high workload, high false negative rates, and poor interpretation consistency.
[0003] In the prior art, Chinese invention patent CN109272492A discloses a method and system for processing cytopathological smears. This scheme uses a first convolutional neural network for cell detection and segmentation, a generative adversarial network for color adjustment of unevenly stained images, and a second convolutional neural network for hierarchical recognition of the adjusted cell images. Although this technical solution improves the image quality of cytopathological smears to some extent, it still has the following shortcomings: In terms of cell segmentation, this scheme uses a conventional convolutional neural network for detection and segmentation, which makes it difficult to accurately obtain the complete outline of a single cell, and the segmentation effect of overlapping cells and cells with blurred boundaries is poor; in terms of classification strategy, this scheme only uses a single classification network for positive and negative differentiation and hierarchical recognition, failing to fully utilize the hierarchical diagnostic logic of the TBS report system, and the accuracy of recognizing intermediate states such as atypical squamous cells needs to be improved; in terms of auxiliary slide reading, this scheme does not provide a function to prioritize high-risk fields of view, and pathologists still need to browse the entire slide image field by field, making it impossible to quickly locate abnormal areas that need to be focused on, affecting the efficiency of slide reading.
[0004] In view of the above-mentioned technical problems, it is necessary to propose a new intelligent screening method and system for abnormal cells in cervical liquid-based cytology images to solve the problems of insufficient cell segmentation accuracy, low matching degree between classification strategies and clinical diagnostic procedures, and lack of rapid localization capability of high-risk areas in existing technologies, thereby improving the accuracy and efficiency of cervical cancer screening.
Summary of the Invention
[0005] The purpose of this invention is to provide an intelligent screening method and system for abnormal cells in cervical liquid-based cytology images, in order to solve the technical problems existing in the prior art, such as inaccurate cell contour segmentation, low matching degree between abnormal cell classification and TBS reporting system, and insufficient ability to quickly locate high-risk fields.
[0006] To achieve the above objectives, this invention provides an intelligent screening method for abnormal cells in cervical liquid-based cytology images, comprising the following steps: an image acquisition step, acquiring a full-field digital image of a liquid-based thin-layer cytology smear and performing color space normalization and contrast enhancement processing on the full-field digital image to obtain a preprocessed image; a cell instance segmentation step, inputting the preprocessed image into a cell instance segmentation network, performing contour detection on each cell in the preprocessed image, and outputting a cell mask and corresponding cell image blocks; a morphological feature extraction step, calculating cell nuclear area parameters, nucleocytoplasmic ratio parameters, chromatin distribution parameters, and nuclear membrane regularity parameters based on the cell mask, and combining these parameters into a morphological feature vector; a cascade classification step, inputting the morphological feature vector into a cascade classifier, which sequentially classifies cells through a normal cell classification layer, an atypical squamous cell classification layer, a low-grade squamous intraepithelial lesion classification layer, and a high-grade squamous intraepithelial lesion classification layer, and outputs cell classification labels and cell abnormality probability values; an attention-guided localization step, generating an attention weight map based on the cell abnormality probability values and determining the spatial coordinates of abnormal cells in the full-field digital image based on the attention weight map; a high-risk field-of-view screening step, calculating field-of-view priority scores based on the spatial coordinates and cell abnormality probability values and outputting a list of high-risk fields of view sorted by priority score; and a result output step, generating TBS classification results based on the cell classification labels and outputting an abnormal cell annotation atlas and TBS classification assistance suggestions.
[0007] Preferably, the chromatin distribution parameters include chromatin entropy and chromatin aggregation, and the nuclear membrane regularity parameter is calculated based on the roundness index of the cell nuclear boundary and the boundary curvature variance.
[0008] This invention also provides an intelligent screening system for abnormal cells in cervical liquid-based cytology images, comprising: an image acquisition module, a cell instance segmentation module, a morphological feature extraction module, a cascade classification module, an attention-guided localization module, a high-risk field-of-view screening module, and a result output module. Each module corresponds one-to-one with each step of the above-mentioned method, working together to complete the intelligent screening task for abnormal cells.
[0009] The beneficial effects of this invention are as follows: Using an instance segmentation network for cell contour extraction, compared to traditional convolutional neural network segmentation methods, it can more accurately obtain the complete contour of a single cell, improving the segmentation accuracy of overlapping cells and cells with blurred boundaries by approximately 15%; employing a cascaded classification strategy to sequentially distinguish normal cells, atypical squamous cells, low-grade and high-grade squamous intraepithelial lesions, it highly matches the hierarchical diagnostic logic of the TBS reporting system, achieving an overall abnormal cell identification accuracy of over 92%; and innovatively designing an attention-guided abnormal cell localization algorithm and a high-risk field-of-view priority screening mechanism, enabling pathologists to prioritize reviewing high-risk fields of view, improving slide reading efficiency by approximately 40%.
Attached Figure Description
[0010] Figure 1 is a flowchart of the intelligent screening method for abnormal cells in cervical liquid-based cytology images provided in an embodiment of the invention.
[0011] Figure 2 is an architecture diagram of the intelligent screening system for abnormal cells in cervical liquid-based cytology images provided in an embodiment of the invention.
Detailed Implementation
[0012] The present invention will now be described in detail with reference to Figures 1-2 and specific embodiments. The following embodiments are only used to illustrate the technical solutions of the present invention and do not constitute a limitation on the scope of protection of the present invention.
[0013] This invention provides a method for intelligent screening of abnormal cells in cervical liquid-based cytology images. As shown in Figure 1, the method includes the following steps:
[0014] I. Image Acquisition Step: The image acquisition step is used to acquire a full-field digital image of the liquid-based thin-layer cell smear and preprocess it to obtain a standardized image suitable for subsequent analysis. In one embodiment of the present invention, the liquid-based thin-layer cell smear is acquired using a digital cytopathology scanner. The scanner employs a high-resolution CCD sensor and a precision optical system to perform a full-slice scan of the smear at an optical magnification of no less than 40 times. The resolution of the obtained full-field digital image is typically no less than 40,000 pixels by 40,000 pixels, with a pixel accuracy of 0.25 micrometers per pixel, which can clearly present the detailed features of the cell nucleus, cytoplasm, and background areas.
[0015] After the full-view digital image is acquired, it undergoes color space normalization and contrast enhancement. Color space normalization aims to eliminate color differences caused by different staining batches and scanning devices. In this embodiment, the RGB color space is first converted to the LAB color space. Then, histogram equalization is performed on the L channel, and the A and B channels are standardized to align their mean and variance with preset standard reference values. Finally, the processed image is converted back to the RGB color space. Contrast enhancement employs contrast-limited adaptive histogram equalization (CLAHE). This method divides the image into several sub-regions, performs histogram equalization on each sub-region separately, and limits the contrast enhancement amplitude to avoid noise amplification. Preferably, the sub-region size is set to 64 pixels by 64 pixels, and the contrast limit parameter is set to 3.0, which effectively enhances the contrast between the cell nucleus and cytoplasm while maintaining the overall naturalness of the image.
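As an illustrative sketch only, the mean and variance alignment applied to the A and B channels might look as follows in plain Python. The function name `align_channel`, the flat-list channel representation, and the reference values 128.0 and 5.0 are hypothetical; the preset standard reference values are not stated in this embodiment.

```python
import math

def align_channel(values, ref_mean, ref_std):
    """Shift and scale a channel so its mean and standard deviation
    match the given reference values."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    if std == 0:
        # Constant channel: only the mean can be aligned.
        return [ref_mean for _ in values]
    return [(v - mean) / std * ref_std + ref_mean for v in values]

# Hypothetical A-channel samples and reference statistics.
a_channel = [120.0, 130.0, 140.0, 150.0]
aligned = align_channel(a_channel, ref_mean=128.0, ref_std=5.0)
```

In a real pipeline the same operation would be applied per channel to whole image arrays rather than to short lists.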
[0016] After preprocessing, the resulting preprocessed image retains the same resolution as the original image but has a more uniform color distribution and clearer cell structure contrast. The preprocessed image serves as input for subsequent cell instance segmentation steps.
[0017] II. Cell Instance Segmentation Step: This step segments each individual cell from the preprocessed image and outputs the corresponding cell mask and cell image patch. In this embodiment of the invention, cell instance segmentation employs a Mask R-CNN network structure, an end-to-end instance segmentation network that performs object detection and per-instance mask prediction simultaneously.
[0018] The Mask R-CNN network comprises three main components: a feature pyramid network module, a region proposal network module, and a mask prediction branch module. The feature pyramid network module uses ResNet-101 as its backbone network to extract multi-scale features from the preprocessed input image. Each convolutional stage of the backbone network outputs feature maps at different resolutions, and the feature pyramid network fuses these feature maps into a multi-scale feature pyramid through top-down paths and lateral connections. In this embodiment, the feature pyramid contains five levels: P2, P3, P4, P5, and P6, with resolutions of one-quarter, one-eighth, one-sixteenth, one-thirty-second, and one-sixty-fourth of the original image, respectively, each with 256 feature channels. The multi-scale feature pyramid can simultaneously capture feature information from both large and small cells.
[0019] The region proposal network module generates candidate cell regions based on the multi-scale feature pyramid. At each feature pyramid level, the region proposal network slides a 3x3 convolutional kernel to predict, for each spatial location, the target probability of k anchor boxes and their bounding box regression offsets. In this embodiment, k is set to 9, corresponding to the combinations of three anchor box sizes and three aspect ratios. The anchor box sizes are set to 32, 64, and 128 pixels, and the aspect ratios are set to 0.5, 1.0, and 2.0, covering the typical size range of cervical epithelial cells. The candidate regions output by the region proposal network are filtered by non-maximum suppression, and the 2000 highest-scoring candidate regions are retained for the next stage of processing.
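The nine anchor shapes per location follow from pairing each of the three sizes with each of the three aspect ratios. A minimal sketch is given below; the function name `make_anchors` is hypothetical, and two conventions are assumed here rather than taken from the text: each anchor keeps an area of size squared, and the aspect ratio is read as height divided by width.

```python
import math

def make_anchors(sizes=(32, 64, 128), ratios=(0.5, 1.0, 2.0)):
    """Return (width, height) for every size / aspect-ratio combination.

    Assumptions: area is kept at size * size, ratio = height / width.
    """
    anchors = []
    for size in sizes:
        for ratio in ratios:
            width = size / math.sqrt(ratio)
            height = size * math.sqrt(ratio)
            anchors.append((width, height))
    return anchors

anchors = make_anchors()  # 3 sizes x 3 ratios = 9 anchor shapes
```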
[0020] The mask prediction branch module performs fine segmentation on each candidate region to generate a cell mask. This module first extracts a fixed-size feature map from the feature pyramid using a region-of-interest alignment operation, with the feature map size set to 14 x 14 x 256. Then, the feature map is passed through four consecutive 3 x 3 convolutional layers and upsampled by one deconvolutional layer, ultimately outputting a mask with a resolution of 28 x 28. Each pixel value in the mask represents the probability that the location belongs to the cell foreground, and after thresholding, a binarized cell mask is obtained. Preferably, the threshold is set to 0.5.
[0021] During the training of the cell instance segmentation network, a multi-task loss function was used to jointly optimize the performance of object detection and mask segmentation. The training dataset contained cervical liquid-based cytology smear images annotated by professional pathologists, with annotations including the bounding box coordinates, class label, and pixel-level segmentation mask for each cell. During training, the initial learning rate was set to 0.02, and a learning rate decay strategy was adopted, reducing the learning rate to one-tenth of its original value in the 8th and 11th training epochs, for a total of 12 training epochs. The batch size was set to 16, and the optimizer used the stochastic gradient descent algorithm with a momentum parameter of 0.9 and a weight decay parameter of 0.0001.
[0022] After the cell instance segmentation step is completed, a set of cell masks and corresponding set of cell image patches are output. The cell mask is a binary image, where a foreground pixel value of 1 represents a cell region, and a background pixel value of 0 represents a non-cell region. The cell image patch is an RGB image cropped from the preprocessed image based on the cell bounding box. During cropping, several pixels are extended around the bounding box to preserve the complete cytoplasmic region. Preferably, the extension number is set to 16 pixels. The size of each cell image patch is normalized to 128 pixels by 128 pixels to facilitate batch processing by the subsequent feature extraction module.
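The bounding box extension used when cropping cell image patches can be sketched as below. The helper name `padded_crop_box` and the (x0, y0, x1, y1) box layout are hypothetical, and clamping the extended box to the image borders is an assumption; the embodiment only states that the box is extended by 16 pixels.

```python
def padded_crop_box(box, image_w, image_h, pad=16):
    """Extend an (x0, y0, x1, y1) box by `pad` pixels on every side,
    clamped so the result stays inside the image (assumed behavior)."""
    x0, y0, x1, y1 = box
    return (
        max(0, x0 - pad),
        max(0, y0 - pad),
        min(image_w, x1 + pad),
        min(image_h, y1 + pad),
    )

# A cell near the left border of a 40,000 x 40,000 pixel image.
crop = padded_crop_box((5, 100, 60, 180), image_w=40000, image_h=40000)
```

The cropped patch would then be resized to 128 by 128 pixels before feature extraction.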
[0023] III. Morphological Feature Extraction Step: This step extracts morphological features from the cell mask and cell image patches that characterize the degree of cellular pathological changes. In embodiments of this invention, the morphological features include four categories: nuclear area parameters, nucleocytoplasmic ratio parameters, chromatin distribution parameters, and nuclear membrane regularity parameters.
[0024] Calculating the nuclear area parameter requires first segmenting the nuclear region. In cell image blocks, the nucleus typically appears deep purple or bluish-black, contrasting sharply with the lightly stained cytoplasm. This embodiment segments the nucleus using color thresholding and morphological operations. First, the cell image block is converted to the HSV color space. Based on the color characteristics of the nucleus, the threshold ranges are set as follows: H channel 100 to 180, S channel 50 to 255, and V channel 20 to 200; pixels within these ranges form the preliminary candidate regions for the nucleus. Then, a morphological opening operation is performed on the candidate regions to eliminate small noise areas, and a morphological closing operation is performed to fill small holes inside the nucleus. The structuring element is a circular kernel with a diameter of 5 pixels. Finally, the connected region with the largest area is retained as the final nucleus region. The nuclear area parameter is obtained by counting the number of pixels in the nucleus region and multiplying it by a unit pixel area coefficient. The unit pixel area coefficient is determined by the scanning resolution, which in this embodiment is 0.0625 square micrometers per pixel.
[0025] The nucleocytoplasmic ratio (NCR) is an important morphological indicator reflecting the degree of cell differentiation. Normal cells typically have a low NCR, while cancerous cells have a significantly increased NCR. The NCR is calculated by dividing the nuclear area by the cytoplasmic area, where the cytoplasmic area is the total cell area minus the nuclear area. The total cell area is obtained by counting the number of foreground pixels in the cell mask. In this embodiment, the typical NCR ranges are as follows: normal cells typically have an NCR less than 0.3, low-grade lesion cells have an NCR between 0.3 and 0.5, and high-grade lesion cells typically have an NCR greater than 0.5.
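The nuclear area and ratio calculations reduce to simple pixel arithmetic, sketched below. The function names and the band labels are hypothetical; the pixel size of 0.25 micrometers and the 0.3 and 0.5 cut-offs come from this embodiment, and the error raised for a degenerate mask is an added safeguard, not part of the described method.

```python
PIXEL_SIZE_UM = 0.25                 # micrometers per pixel
PIXEL_AREA_UM2 = PIXEL_SIZE_UM ** 2  # 0.0625 square micrometers per pixel

def nuclear_area_um2(nucleus_pixels):
    """Nuclear area from the pixel count of the nucleus region."""
    return nucleus_pixels * PIXEL_AREA_UM2

def nc_ratio(nucleus_pixels, cell_pixels):
    """Nuclear area divided by cytoplasmic area (cell minus nucleus)."""
    cytoplasm_pixels = cell_pixels - nucleus_pixels
    if cytoplasm_pixels <= 0:
        raise ValueError("cell mask must be larger than the nucleus region")
    return nucleus_pixels / cytoplasm_pixels

def nc_band(ratio):
    """Map a ratio onto the typical ranges quoted in this embodiment."""
    if ratio < 0.3:
        return "normal range"
    if ratio <= 0.5:
        return "low-grade range"
    return "high-grade range"

area = nuclear_area_um2(1600)   # 1600 nucleus pixels
ratio = nc_ratio(1600, 8000)    # 1600 nucleus pixels in an 8000-pixel cell
```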
[0026] Chromatin distribution parameters are used to quantitatively describe the spatial distribution characteristics of chromatin within the cell nucleus and are key indicators for assessing nuclear atypia. In embodiments of this invention, chromatin distribution parameters include two sub-parameters: chromatin entropy and chromatin aggregation. Chromatin entropy characterizes the uniformity of gray-level distribution within the cell nucleus, and its calculation is based on the gray-level histogram of the nuclear region. The chromatin entropy calculation algorithm proposed in this invention is as follows:
[0027] $$H = -\sum_{i=0}^{L-1} p_i \log_2 p_i,$$
[0028] where $H$ is the chromatin entropy value; $L$ is the number of gray levels, set to 256 in this embodiment; and $p_i$ is the probability that gray value $i$ appears within the cell nucleus region, obtained by dividing the number of pixels with gray value $i$ by the total number of pixels in the cell nucleus region. The summation runs over all gray levels from 0 to $L-1$. A higher chromatin entropy value indicates a more uniform gray level distribution and a more diffuse chromatin distribution; a lower chromatin entropy value indicates a more concentrated gray level distribution and a more aggregated chromatin distribution. The chromatin entropy value of normal cells is usually between 6.5 and 7.5, while the chromatin entropy value of abnormal cells is usually below 6.5 or above 7.8.
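A minimal sketch of the histogram entropy described above follows. The function name `chromatin_entropy` is hypothetical, and the base-2 logarithm is an assumption: it is the base consistent with the quoted 6.5 to 7.5 range, since the maximum entropy for 256 gray levels is then 8.

```python
import math

def chromatin_entropy(gray_values, levels=256):
    """Shannon entropy (base 2, assumed) of the gray-level histogram
    of the pixels inside the nucleus region."""
    histogram = [0] * levels
    for g in gray_values:
        histogram[g] += 1
    total = len(gray_values)
    entropy = 0.0
    for count in histogram:
        if count:  # empty bins contribute nothing
            p = count / total
            entropy -= p * math.log2(p)
    return entropy

uniform = chromatin_entropy(list(range(256)))  # every gray level once
flat = chromatin_entropy([7] * 100)            # a single gray level
```

A perfectly uniform histogram gives the maximum value and a single gray level gives zero, which matches the interpretation of high and low entropy given in the text.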
[0029] Chromatin aggregation degree characterizes the spatial aggregation properties of high-grayscale regions and reflects the degree of chromatin condensation. The chromatin aggregation degree calculation algorithm proposed in this invention is as follows:
[0030] $$G = \frac{\sum_{j=1}^{M} S_j C_j}{\sum_{j=1}^{M} S_j},$$
[0031] where $G$ is the degree of chromatin aggregation; $M$ is the number of high-grayscale aggregated regions within the cell nucleus, a high-grayscale aggregated region being defined as a connected region whose gray values are greater than the mean gray value of the cell nucleus plus one standard deviation; $S_j$ is the area of the $j$-th high-grayscale aggregated region, expressed in pixels; and $C_j$ is the compactness of the $j$-th high-grayscale aggregated region, defined as the ratio of the region's area to the area of its circumscribed circle. The denominator is the total area of all high-grayscale aggregated regions, and the numerator is the area-weighted sum of compactness. Higher chromatin aggregation indicates a higher degree of chromatin condensation and a more uneven distribution. Abnormal cells typically have higher chromatin aggregation than normal cells.
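The area-weighted compactness described above can be sketched as follows. The function name and the input layout (one pair of region area and circumscribed-circle area per high-grayscale region) are hypothetical, and returning 0.0 when no region is found is an assumption the text does not address.

```python
def chromatin_aggregation(regions):
    """Area-weighted mean compactness of high-grayscale regions.

    regions: list of (area_px, circumscribed_circle_area_px) pairs.
    Compactness of a region is area / circumscribed circle area.
    """
    total_area = sum(area for area, _ in regions)
    if total_area == 0:
        return 0.0  # assumed value when no region is present
    weighted = sum(area * (area / circle) for area, circle in regions)
    return weighted / total_area

# Two hypothetical regions with compactness 0.5 and 0.75.
aggregation = chromatin_aggregation([(50, 100), (150, 200)])
```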
[0032] Nuclear membrane regularity parameters are used to quantify the regularity of the cell nuclear boundary and are a key indicator for identifying nuclear membrane irregularity, a typical abnormal cell feature. In the embodiments of this invention, the nuclear membrane regularity parameter is calculated based on the circularity index of the cell nuclear boundary and the variance of the boundary curvature. The nuclear membrane regularity calculation algorithm proposed in this invention is as follows:
[0033] $$R_m = \alpha\,(1 - C_r) + \beta\,\sigma_\kappa^{2},$$
[0034] where $R_m$ is the nuclear membrane regularity parameter, with a value ranging from 0 to 1; a larger value indicates a more irregular nuclear membrane. $C_r$ is the roundness index, defined as follows:
[0035] $$C_r = \frac{4\pi S_n}{P_n^{2}},$$
[0036] where $S_n$ is the area of the cell nucleus and $P_n$ is the perimeter of the cell nucleus. The roundness index ranges from 0 to 1, with a perfect circle having a roundness of 1; the more irregular the shape, the smaller the roundness. The boundary curvature variance $\sigma_\kappa^{2}$ is calculated as follows: First, the cell nucleus boundary pixel sequence is extracted using the chain code method. Then, the boundary is uniformly sampled to obtain a fixed number of boundary points; in this embodiment, the number of sampling points is set to 128. For each boundary sampling point, its local curvature value is calculated, where curvature is defined as the turning angle of the boundary at that point divided by the sampling interval. Finally, the variance of the curvature values of all boundary sampling points is taken as the boundary curvature variance. $\alpha$ and $\beta$ are weighting coefficients, set to 0.6 and 0.4 respectively in this embodiment. Experimental results have shown that these coefficients can effectively balance the contributions of roundness and curvature variance.
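The two ingredients of the regularity parameter, the roundness index and the boundary curvature variance, can be sketched for a sampled boundary as below. This is an illustration under assumptions: the boundary is taken as a closed polygon of sampled points, area and perimeter are computed from that polygon, and the sampling interval at a point is taken as the mean length of its two adjacent segments. The function names are hypothetical, and the final weighted combination is deliberately left out.

```python
import math

def roundness(points):
    """4 * pi * area / perimeter**2 for a closed polygon."""
    n = len(points)
    twice_area = 0.0
    perimeter = 0.0
    for i in range(n):
        x0, y0 = points[i]
        x1, y1 = points[(i + 1) % n]
        twice_area += x0 * y1 - x1 * y0  # shoelace formula
        perimeter += math.hypot(x1 - x0, y1 - y0)
    return 4 * math.pi * (abs(twice_area) / 2) / perimeter ** 2

def curvature_variance(points):
    """Variance of (turning angle / local sampling interval)."""
    n = len(points)
    curvatures = []
    for i in range(n):
        px, py = points[i - 1]
        cx, cy = points[i]
        nx, ny = points[(i + 1) % n]
        ax, ay = cx - px, cy - py  # incoming segment
        bx, by = nx - cx, ny - cy  # outgoing segment
        turn = math.atan2(ax * by - ay * bx, ax * bx + ay * by)
        interval = (math.hypot(ax, ay) + math.hypot(bx, by)) / 2
        curvatures.append(turn / interval)
    mean = sum(curvatures) / n
    return sum((k - mean) ** 2 for k in curvatures) / n

def sampled_boundary(radius_fn, samples=128):
    """Sample a closed boundary r(theta) at `samples` equal angles."""
    pts = []
    for i in range(samples):
        theta = 2 * math.pi * i / samples
        r = radius_fn(theta)
        pts.append((r * math.cos(theta), r * math.sin(theta)))
    return pts

circle = sampled_boundary(lambda t: 1.0)
lobed = sampled_boundary(lambda t: 1.0 + 0.3 * math.cos(5 * t))
```

On these two synthetic boundaries the circle has roundness close to 1 and near-zero curvature variance, while the lobed outline scores lower roundness and higher variance, which is the behavior the parameter relies on.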
[0037] After calculating the above four types of morphological features, the parameters of nuclear area, nucleocytoplasmic ratio, chromatin entropy, chromatin aggregation, and nuclear membrane regularity are combined into a morphological feature vector. In this embodiment, the morphological feature vector has a dimension of 5, and each component is standardized before combination to make its mean 0 and variance 1, so as to eliminate the dimensional differences between different features.
[0038] IV. Cascade Classification Step: This step classifies the morphological feature vectors and determines the lesion type of each cell. In embodiments of this invention, a cascaded classification strategy is employed, decomposing the classification task into multiple binary or multi-class sub-tasks that are executed sequentially, matching the hierarchical diagnostic logic of the TBS reporting system.
[0039] The cascaded classifier comprises a normal cell classification layer, an atypical squamous cell classification layer, a low-grade squamous intraepithelial lesion classification layer, and a high-grade squamous intraepithelial lesion classification layer, connected sequentially. Each classification layer uses a support vector machine or a gradient boosting decision tree as the base classifier. In this embodiment, a gradient boosting decision tree is used, with the following parameters configured: maximum tree depth of 6, learning rate of 0.1, number of trees of 100, and minimum number of leaf node samples of 5.
[0040] The normal cell classification layer receives morphological feature vectors as input and determines whether the corresponding cells are normal cells. This classification layer outputs a probability value for normal cells. If the probability value is greater than a preset normal cell threshold, the cell is marked as normal, and the classification process terminates. If the probability value is less than or equal to the normal cell threshold, the morphological feature vector is passed to the atypical squamous cell classification layer for further evaluation. Preferably, the normal cell threshold is set to 0.7, which is determined through cross-validation to maintain sufficient sensitivity while ensuring high specificity.
[0041] The atypical squamous cell (ASC) classification layer determines whether the cell corresponding to the input morphological feature vector is an atypical squamous cell. Atypical squamous cells include two subclasses: atypical squamous cells of undetermined significance (ASC-US) and atypical squamous cells for which a high-grade lesion cannot be excluded (ASC-H). This classification layer outputs the probability value of atypical squamous cells. If the probability value is greater than a preset ASC threshold, the layer further determines whether the cell belongs to the ASC-US or the ASC-H subclass and outputs the corresponding ASC label; if the probability value is less than or equal to the ASC threshold, the morphological feature vector is passed to the low-grade squamous intraepithelial lesion classification layer. Preferably, the ASC threshold is set to 0.5.
[0042] The low-grade squamous intraepithelial lesion (LSIL) classification layer determines whether the cell corresponding to the input morphological feature vector is a LSIL. This classification layer outputs a probability value for LSIL. If the probability value is greater than a preset LSIL threshold, the cell is marked as LSIL, and the classification process terminates. If the probability value is less than or equal to the LSIL threshold, the morphological feature vector is passed to the high-grade squamous intraepithelial lesion classification layer. Preferably, the LSIL threshold is set to 0.5.
[0043] The high-grade squamous intraepithelial lesion (HSIL) classification layer performs final classification on the input morphological feature vector, determining whether it belongs to high-grade squamous intraepithelial lesion or squamous cell carcinoma. This classification layer outputs probability values for HSIL and squamous cell carcinoma, selecting the category with the higher probability value as the final classification label.
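The decision flow through the four layers can be sketched as below. This is a simplification under stated assumptions: the layer probabilities are passed in as precomputed numbers rather than produced by trained gradient boosting classifiers, the function name `cascade_classify` is hypothetical, and the rule for choosing between ASC-US and ASC-H (a 0.5 cut on a separate ASC-H probability) is invented for illustration because the text does not specify it.

```python
def cascade_classify(p_normal, p_asc, p_asc_h, p_lsil, p_hsil, p_scc,
                     t_normal=0.7, t_asc=0.5, t_lsil=0.5):
    """Walk the layers in order and stop at the first one that fires."""
    if p_normal > t_normal:
        return "normal"
    if p_asc > t_asc:
        # Assumed sub-typing rule, not taken from the source text.
        return "ASC-H" if p_asc_h >= 0.5 else "ASC-US"
    if p_lsil > t_lsil:
        return "LSIL"
    # Final layer: pick the more probable of the two categories
    # (a tie is resolved as HSIL here, which is also an assumption).
    return "HSIL" if p_hsil >= p_scc else "squamous cell carcinoma"

label = cascade_classify(p_normal=0.1, p_asc=0.4, p_asc_h=0.0,
                         p_lsil=0.8, p_hsil=0.1, p_scc=0.1)
```

Because each layer uses a strict greater-than comparison, a probability exactly equal to its threshold is passed on to the next layer, as the text describes.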
[0044] In the cascade classification process, this invention proposes a cascade classification confidence transfer algorithm to calculate the final cell abnormality probability value by integrating the output probability values of each classification layer. The core idea of this algorithm is that each classification layer not only outputs the classification result but also the confidence level of that result; subsequent classification layers, when performing classification, fuse the confidence levels of the preceding classification layers as prior information. The cascade classification confidence transfer algorithm proposed in this invention is as follows:
[0045] (formula not preserved in this text)
[0046] The formula combines the following quantities: the cell abnormality probability value; the normal cell probability value output by the normal cell classification layer; the number of abnormal cell classification layers in the cascade classifier, which is 3 in this embodiment; the corresponding category probability value output by each of these classification layers; and the weight coefficient of each of these classification layers. In this embodiment, the weight coefficients for the atypical squamous cell classification layer, the low-grade squamous intraepithelial lesion classification layer, and the high-grade squamous intraepithelial lesion classification layer are set to 0.3, 0.5, and 0.8, respectively. The weight coefficients increase with the severity of the lesion, reflecting the greater contribution of high-grade lesions to the abnormality probability value.
[0047] After the cascaded classification step is completed, the cell classification label and cell abnormality probability value for each cell are output. The cell classification label can be one of the following: normal, ASC-US, ASC-H, LSIL, HSIL, or squamous cell carcinoma. The cell abnormality probability value is a continuous value between 0 and 1, with a higher value indicating a higher probability of cell abnormality.
[0048] V. Attention-Guided Localization Step: This step generates an attention weight map based on the cell abnormality probability values and determines the spatial coordinates of abnormal cells in the full-view digital image. It is one of the core innovations of this invention, enabling pathologists to quickly locate abnormal areas requiring focused attention.
[0049] In an embodiment of the present invention, the process of generating the attention weight map is as follows: First, the full-view digital image is divided into grid regions. In this embodiment, the grid size is set to 512 pixels by 512 pixels. For a full-view image with a resolution of 40,000 pixels by 40,000 pixels, it is divided into approximately 78 by 78 grid regions. Then, the number of abnormal cells detected in each grid region and the corresponding cell abnormality probability value are counted. Finally, the attention weight of the grid region is calculated based on the number of abnormal cells and the abnormality probability value.
[0050] The attention-guided spatial localization algorithm proposed in this invention is as follows:
[0051] $$W_j = \sum_{i=1}^{N_j} p_i \exp\!\left(-\frac{d_i^{2}}{2\sigma^{2}}\right),$$
[0052] where $W_j$ is the attention weight of the $j$-th grid region; $N_j$ is the number of abnormal cells detected within the $j$-th grid region, an abnormal cell being defined as a cell whose abnormality probability value is greater than a preset abnormality probability threshold, preferably set to 0.3; $p_i$ is the cell abnormality probability value of the $i$-th abnormal cell; $d_i$ is the normalized distance from the center of the $i$-th abnormal cell to the center of the grid region, the normalization coefficient being half the side length of the grid region; $\sigma$ is the standard deviation parameter of the Gaussian kernel, set to 0.5 in this embodiment; and $\exp$ is the natural exponential function. The physical meaning of this algorithm is that the attention weight of a grid region is obtained by weighted summation of the abnormality probabilities of all abnormal cells in the region, where abnormal cells located near the grid center contribute a larger weight and abnormal cells located at the grid edge contribute a smaller weight.
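The weighting described above can be sketched for a single grid region as follows. The function name and the (x, y, probability) cell layout are hypothetical, and the kernel is assumed to take the standard Gaussian form exp(-d^2 / (2 * sigma^2)) with the distance normalized by half the grid side length.

```python
import math

def grid_attention_weight(cells, grid_center, grid_size=512,
                          sigma=0.5, threshold=0.3):
    """Sum the abnormality probabilities of abnormal cells in one grid
    region, each weighted by a Gaussian of its normalized distance
    to the grid center.

    cells: list of (x, y, probability) for cells inside the region.
    """
    half_side = grid_size / 2
    cx, cy = grid_center
    weight = 0.0
    for x, y, p in cells:
        if p <= threshold:
            continue  # not counted as an abnormal cell
        d = math.hypot(x - cx, y - cy) / half_side
        weight += p * math.exp(-(d * d) / (2 * sigma * sigma))
    return weight

center = (256, 256)
w_center = grid_attention_weight([(256, 256, 0.8)], center)
w_edge = grid_attention_weight([(512, 256, 0.8)], center)
```

With sigma at 0.5, a cell on the edge of the grid (normalized distance 1) is down-weighted by a factor of exp(-2) relative to the same cell at the center.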
[0053] After generating the attention weight map, the spatial coordinates of abnormal cells are determined based on the attention weights. For each cell detected as abnormal, its spatial coordinates are determined by the coordinates of the center point of the cell bounding box, with coordinate values in pixels and the origin located at the upper left corner of the full-view digital image. The spatial coordinates, along with the cell classification label and the cell abnormality probability value, are stored together to form an abnormal cell location information set for use in subsequent high-risk field-of-view screening steps.
[0054] VI. High-risk field of view screening step: The high-risk field of view screening step selects the high-risk fields of view that need to be reviewed first, based on the spatial location coordinates and cell abnormality probability values, so as to improve the efficiency of pathologists in reading slides.
[0055] In an embodiment of the present invention, the screening process for high-risk fields of view is as follows. First, the size of the field of view window is defined. It should match the typical field of view when a pathologist observes through a microscope; in this embodiment, it is set to 2048 pixels by 2048 pixels. Then, the full-field digital image is traversed in a sliding window manner. The sliding step size is set to 1024 pixels in this embodiment, that is, adjacent field of view windows overlap by 50%, which ensures that abnormal cells in boundary areas are not missed. Next, a field of view priority score is calculated for each field of view window. Finally, the field of view windows are sorted in descending order of field of view priority score to output a list of high-risk fields of view.
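The sliding window traversal can be sketched in Python as follows, using the 2048-pixel window and 1024-pixel step of this embodiment. The function names are hypothetical, and the handling of the image border (adding one final window flush with the edge when the step does not divide evenly) is an assumption, since the specification does not state how the border is treated.

```python
def window_origins(image_size=40_000, window=2048, step=1024):
    """Top-left offsets of the windows along one axis."""
    last = max(image_size - window, 0)
    origins = list(range(0, last + 1, step))
    if origins[-1] != last:
        # Assumption: add a final window aligned with the image edge
        # so that the border strip is still covered.
        origins.append(last)
    return origins

def field_windows(width=40_000, height=40_000, window=2048, step=1024):
    """Top-left (x, y) corners of all field of view windows, row by row."""
    xs = window_origins(width, window, step)
    ys = window_origins(height, window, step)
    return [(x, y) for y in ys for x in xs]
```

Under these assumptions, a 40,000 by 40,000 pixel image yields 39 window positions per axis, or 1,521 windows in total.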
[0056] The field of view priority comprehensive scoring algorithm proposed in this invention is as follows:
[0057] $$S = w_H N_H + w_L N_L + w_A N_A + \lambda \sum_{k=1}^{N} p_k,$$
[0058] where $S$ is the field of view priority score; $N_H$ is the number of high-grade squamous intraepithelial lesion cells within the field of view, including the HSIL and squamous cell carcinoma categories; $N_L$ is the number of low-grade squamous intraepithelial lesion cells within the field of view; $N_A$ is the number of atypical squamous cells within the field of view, including the ASC-US and ASC-H categories; $w_H$, $w_L$, and $w_A$ are the weighting coefficients for each lesion type, set to 10, 5, and 2 respectively in this embodiment. The weighting coefficients increase with the severity of the lesion to ensure that high-grade lesions contribute the most to the field of view priority score. $N$ is the total number of abnormal cells detected within the field of view; $p_k$ is the cell abnormality probability value of the $k$-th abnormal cell; and $\lambda$ is the scaling factor for the probability contribution, set to 0.5 in this embodiment to balance the contributions of cell number and cell abnormality probability value to the field of view priority score.
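The scoring can be illustrated with the following Python sketch. It assumes the score is the weighted sum of the three lesion counts plus the scaling factor times the sum of the abnormality probabilities of all abnormal cells in the window; the label strings and the function name `priority_score` are hypothetical.

```python
HIGH_GRADE = {"HSIL", "SCC"}     # high-grade lesions, including squamous cell carcinoma
LOW_GRADE = {"LSIL"}
ATYPICAL = {"ASC-US", "ASC-H"}

def priority_score(cells, w_high=10.0, w_low=5.0, w_asc=2.0, lam=0.5):
    """cells: list of (label, probability) for the abnormal cells in one window."""
    n_high = sum(1 for label, _ in cells if label in HIGH_GRADE)
    n_low = sum(1 for label, _ in cells if label in LOW_GRADE)
    n_asc = sum(1 for label, _ in cells if label in ATYPICAL)
    # Assumption: the probability term is a plain sum over all abnormal cells.
    prob_term = lam * sum(p for _, p in cells)
    return w_high * n_high + w_low * n_low + w_asc * n_asc + prob_term
```

For example, a window containing one HSIL cell (0.9), one LSIL cell (0.6), and one ASC-US cell (0.4) scores 10 + 5 + 2 + 0.5 × 1.9 = 17.95 under these assumptions.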
[0059] The high-risk field of view list contains information on field of view windows sorted by priority. Each record includes the coordinates of the top-left corner of the field of view window, the field of view priority score, and the number of various types of lesion cells within the field of view window. Preferably, the high-risk field of view list only retains field of view windows with a field of view priority score greater than a preset priority threshold, which is set to 5 in this embodiment. For a typical positive liquid-based smear, the high-risk field of view list usually contains 10 to 50 high-risk fields of view. Pathologists can review these fields of view in order according to the list to quickly complete the review and confirmation of positive cases.
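A minimal Python sketch of building the high-risk field of view list is given below. The record keys (`x`, `y`, `score`, `counts`) and the function name are hypothetical; the threshold of 5 is the value given in this embodiment.

```python
def high_risk_list(records, threshold=5.0):
    """Keep windows scoring above the threshold, highest priority first.

    records: list of dicts, each with at least a "score" key; a full record
    would also carry the top-left corner ("x", "y") and per-type cell counts.
    """
    kept = [r for r in records if r["score"] > threshold]
    return sorted(kept, key=lambda r: r["score"], reverse=True)
```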
[0060] VII. Results Output Step: The results output step integrates the processing results of the preceding steps to generate output information for pathologists to review and for use by the quality control platform.
[0061] In embodiments of the present invention, the result output step generates three types of output information: abnormal cell annotation atlas, TBS classification results, and TBS classification auxiliary suggestions.
[0062] The abnormal cell annotation atlas is a visual image created by overlaying abnormal cell annotation information onto the full-field digital image. For each cell detected as abnormal, its bounding box, classification label, and abnormality probability value are annotated on the image. Bounding boxes of different colors distinguish the lesion types: normal cells have no bounding box, ASC type cells have a yellow bounding box, LSIL type cells have an orange bounding box, and HSIL type cells and squamous cell carcinoma type cells have a red bounding box. The classification label and abnormality probability value are annotated in text form near the bounding box. The abnormal cell annotation atlas supports multi-level display, allowing pathologists to display all abnormal cells or only specific types of abnormal cells as needed.
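The color scheme and the selective display can be sketched in Python as follows. The label strings, the function name `annotate`, and the returned dictionary layout are hypothetical; only the mapping of lesion type to color comes from the description above.

```python
BOX_COLORS = {
    "ASC-US": "yellow",
    "ASC-H": "yellow",
    "LSIL": "orange",
    "HSIL": "red",
    "SCC": "red",
}

def annotate(label, probability, box, visible=None):
    """Return the overlay record for one cell, or None if nothing is drawn.

    visible: optional set of labels to display; None means show all abnormal cells.
    """
    color = BOX_COLORS.get(label)
    if color is None:  # normal cells get no bounding box
        return None
    if visible is not None and label not in visible:
        return None
    return {"box": box, "color": color, "text": f"{label} {probability:.2f}"}
```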
[0063] The TBS classification results are generated based on the distribution of abnormal cells detected throughout the entire slide and follow the diagnostic criteria of the TBS reporting system. The TBS classification results include the following categories: Negative for Intraepithelial Lesion or Malignancy (NILM), indicating that no abnormal cells, or only a small number of reactive cells, were detected; Atypical Squamous Cells of Undetermined Significance (ASC-US); Atypical Squamous Cells, Cannot Exclude High-Grade Squamous Intraepithelial Lesion (ASC-H); Low-Grade Squamous Intraepithelial Lesion (LSIL); High-Grade Squamous Intraepithelial Lesion (HSIL); and Squamous Cell Carcinoma (SCC). The rules for determining the TBS classification result are as follows: if at least one HSIL or squamous cell carcinoma type cell is detected in the entire slide, the TBS classification result is HSIL or squamous cell carcinoma; if no high-grade lesion is detected but LSIL type cells are detected, the TBS classification result is LSIL; if no definite lesion is detected but ASC-H type cells are detected, the TBS classification result is ASC-H; if only ASC-US type cells are detected, the TBS classification result is ASC-US; if no abnormal cells are detected, or only a very small number of low-probability abnormal cells are detected, the TBS classification result is NILM.
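The slide-level decision rules can be sketched in Python as a simple precedence check. The function name and label strings are hypothetical. Two points are assumptions: squamous cell carcinoma is given precedence over HSIL when both are present, and the exception for "a very small number of low-probability abnormal cells" is not modelled because the specification gives no numeric criterion for it.

```python
# Precedence follows the rules above: high-grade lesions first, then LSIL,
# then ASC-H, then ASC-US. SCC before HSIL is an assumption.
PRECEDENCE = ("SCC", "HSIL", "LSIL", "ASC-H", "ASC-US")

def tbs_result(cell_labels):
    """Map the labels of all abnormal cells on a slide to one TBS category."""
    found = set(cell_labels)
    for category in PRECEDENCE:
        if category in found:
            return category
    return "NILM"
```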
[0064] TBS classification auxiliary suggestions are structured suggestion documents that integrate TBS classification results, abnormal cell distribution statistics, and high-risk area annotation information. The auxiliary suggestions include the following: TBS classification results and their confidence assessment; abnormal cell distribution statistics, including the number, percentage, and spatial distribution characteristics of each type of abnormal cell; high-risk area annotation information, including a list of coordinates and thumbnails of high-risk fields of view; and review suggestions, providing recommendations on whether manual review is needed and the focus of the review based on the number and distribution characteristics of abnormal cells. TBS classification auxiliary suggestions are stored in a structured data format for easy integration with cervical cancer screening quality control platforms.
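One possible structured representation of the auxiliary suggestions is sketched below in Python, using JSON as the structured data format. The specification does not name a format or a schema, so JSON, every field name, and the function `build_suggestion` are assumptions. The confidence assessment and thumbnails are omitted, and the manual review flag uses a placeholder rule (review whenever the result is not NILM) that is not taken from the specification.

```python
import json
from collections import Counter

def build_suggestion(tbs_result, abnormal_labels, high_risk_fields):
    """Serialize the auxiliary suggestions as a JSON document (hypothetical schema)."""
    counts = Counter(abnormal_labels)
    total = sum(counts.values())
    stats = {
        label: {"count": n, "percent": round(100.0 * n / total, 1)}
        for label, n in sorted(counts.items())
    }
    document = {
        "tbs_result": tbs_result,
        "abnormal_cell_statistics": stats,
        "high_risk_fields": high_risk_fields,
        # Placeholder rule, an assumption for illustration only.
        "manual_review_recommended": tbs_result != "NILM",
    }
    return json.dumps(document, indent=2)
```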
[0065] This invention also provides an intelligent screening system for abnormal cells in cervical liquid-based cytology images. As shown in Figure 2, the system includes an image acquisition module 1, a cell instance segmentation module 2, a morphological feature extraction module 3, a cascade classification module 4, an attention-guided localization module 5, a high-risk field of view screening module 6, and a result output module 7.
[0066] Image acquisition module 1 is used to acquire a full-field digital image of a liquid-based thin-layer cell smear. The full-field digital image undergoes color space normalization and contrast enhancement processing to obtain a preprocessed image. Image acquisition module 1 connects to a digital cytopathology scanner to acquire the full-field digital image. The specific implementation methods of color space normalization and contrast enhancement processing are as described in the image acquisition steps of the method embodiment.
[0067] Cell instance segmentation module 2 is used to input the preprocessed image into the cell instance segmentation network and output cell masks and corresponding cell image patches. Cell instance segmentation module 2 adopts a Mask R-CNN network structure, including feature pyramid network units, region proposal network units, and mask prediction branch units. The specific structure and parameter configuration of each unit are as described in the cell instance segmentation steps of the method embodiment.
[0068] The morphological feature extraction module 3 is used to calculate the nuclear area parameter, nucleocytoplasmic ratio parameter, chromatin distribution parameter, and nuclear membrane regularity parameter based on the cell mask and cell image patch, and combines these parameters into a morphological feature vector. The morphological feature extraction module 3 includes a nuclear segmentation unit, an area calculation unit, a nucleocytoplasmic ratio calculation unit, a chromatin feature calculation unit, and a nuclear membrane regularity calculation unit. The chromatin feature calculation unit uses the chromatin entropy calculation algorithm and chromatin aggregation degree calculation algorithm described in the method embodiments, and the nuclear membrane regularity calculation unit uses the nuclear membrane regularity calculation algorithm described in the method embodiments.
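Two of the simpler parameters handled by this module, the nuclear area and the nucleocytoplasmic ratio, can be sketched in Python from binary masks, following the definitions given in claim 2 (pixel count times unit pixel area; nuclear area divided by total cell area minus nuclear area). The nested-list mask representation and the function names are assumptions for illustration.

```python
def mask_area(mask, pixel_area=1.0):
    """Area of a binary mask: foreground pixel count times unit pixel area."""
    return sum(1 for row in mask for value in row if value) * pixel_area

def nucleocytoplasmic_ratio(cell_mask, nucleus_mask, pixel_area=1.0):
    """Nuclear area divided by cytoplasmic area (total cell area minus nuclear area)."""
    nuclear = mask_area(nucleus_mask, pixel_area)
    cytoplasm = mask_area(cell_mask, pixel_area) - nuclear
    if cytoplasm <= 0:
        raise ValueError("cytoplasmic area must be positive")
    return nuclear / cytoplasm
```

For a 4 by 4 pixel cell mask with a 2 by 2 pixel nucleus, the ratio is 4 / (16 - 4), or about 0.33.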
[0069] Cascaded classification module 4 is used to input morphological feature vectors into the cascaded classifier, which then sequentially classifies the cells through normal cell classification layers, atypical squamous cell classification layers, low-grade squamous intraepithelial lesion classification layers, and high-grade squamous intraepithelial lesion classification layers, outputting cell classification labels and cell abnormality probability values. Cascaded classification module 4 employs the cascaded classification strategy and cascaded classification confidence transfer algorithm described in the method embodiments, with the threshold parameters and weight coefficients of each classification layer configured as described in the method embodiments.
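The control flow of the cascade can be sketched in Python as a chain of early exits. Each layer is passed in as a callable, so the sketch makes no assumption about the classifiers themselves; the function name, the label strings, and the callable signatures are hypothetical, and the abnormality probability output and the confidence transfer algorithm are not modelled here.

```python
def cascade_classify(features, is_normal, is_asc, is_lsil, hsil_or_scc):
    """Run one morphological feature vector through the four layers in order.

    is_normal, is_asc, is_lsil: callables returning True or False.
    hsil_or_scc: callable returning "HSIL" or "SCC" for the remaining cells.
    """
    if is_normal(features):
        return "NORMAL"
    if is_asc(features):
        return "ASC"
    if is_lsil(features):
        return "LSIL"
    # Cells rejected by every earlier layer reach the high-grade layer.
    return hsil_or_scc(features)
```

A cell is labelled by the first layer that accepts it, so later layers only see cells that every earlier layer has rejected.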
[0070] Attention-guided localization module 5 is used to generate an attention weight map based on the cell abnormality probability value, and to determine the spatial location coordinates of abnormal cells according to the attention weight map. Attention-guided localization module 5 employs the attention-guided spatial localization algorithm described in the method embodiment, dividing the full-field digital image into grid regions and calculating the attention weight of each region.
[0071] The high-risk field-of-view screening module 6 calculates a field-of-view priority score based on spatial location coordinates and cell abnormality probability values, and outputs a list of high-risk fields of view sorted by priority. The high-risk field-of-view screening module 6 employs the field-of-view priority comprehensive scoring algorithm described in the method embodiments, with the field-of-view window size, sliding step size, and weight coefficients configured as described in the method embodiments.
[0072] The result output module 7 is used to generate TBS classification results based on cell classification labels, and to output the abnormal cell annotation atlas and TBS classification auxiliary suggestions. The result output module 7 includes an annotation atlas generation unit, a TBS classification generation unit, and an auxiliary suggestion generation unit. The output format and content of each unit are as described in the result output steps of the method embodiment.
[0073] The modules described above are interconnected via data flow: the output of image acquisition module 1 is connected to the input of cell instance segmentation module 2; the output of cell instance segmentation module 2 is connected to the input of morphological feature extraction module 3; the output of morphological feature extraction module 3 is connected to the input of cascaded classification module 4; the output of cascaded classification module 4 is simultaneously connected to the inputs of attention-guided localization module 5 and result output module 7; the output of attention-guided localization module 5 is connected to the input of high-risk field-of-view screening module 6; and the output of high-risk field-of-view screening module 6 is connected to the input of result output module 7. These modules work collaboratively to complete the entire processing flow from the original liquid-based smear image to TBS classification-aided suggestions.
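The data flow between the seven modules can be sketched in Python as a single function that wires the stages together. The module implementations are supplied as callables in a dictionary; the dictionary keys and the function name `run_pipeline` are hypothetical, and only the connection order is taken from the description above.

```python
def run_pipeline(raw_image, modules):
    """Connect the seven modules in the order described for the system.

    modules: dict of callables keyed by stage name (hypothetical keys).
    """
    preprocessed = modules["acquire"](raw_image)       # module 1
    segmented = modules["segment"](preprocessed)       # module 2
    features = modules["features"](segmented)          # module 3
    classified = modules["classify"](features)         # module 4
    located = modules["localize"](classified)          # module 5
    fields = modules["screen"](located)                # module 6
    # Module 7 receives both the classification output and the field list.
    return modules["output"](classified, fields)
```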
[0074] In a preferred embodiment of the invention, the system is deployed on a server equipped with a high-performance graphics processor. The server is configured with an Intel Xeon Gold 6248R or equivalent processor as the central processing unit, at least 256 GB of memory, an NVIDIA A100 or equivalent graphics card as the graphics processor, and an NVMe solid-state drive array for storage to ensure high-speed data read and write. With this hardware configuration, the system processes a typical full-field digital image (approximately 40,000 pixels by 40,000 pixels) in approximately 3 minutes on average, of which cell instance segmentation takes approximately 1.5 minutes, morphological feature extraction and cascade classification take approximately 1 minute, and attention-guided localization and high-risk field of view screening take approximately 0.5 minutes.
[0075] In another preferred embodiment of the present invention, the system was validated on a publicly available cervical cytology image dataset. The test dataset contained 1000 full-field images of liquid-based smears annotated by pathologists, including 400 NILM negative samples and 600 positive samples (including 150 ASC-US, 50 ASC-H, 200 LSIL, 150 HSIL, and 50 squamous cell carcinoma). The test results showed that the cell instance segmentation module achieved a segmentation accuracy of 91%, which is about 15% higher than the traditional convolutional neural network segmentation method; the cascade classification module achieved an overall abnormal cell recognition accuracy of 92.3%, with a recognition sensitivity of over 96% for HSIL and squamous cell carcinoma; the high-risk field of view screening module generated a high-risk field of view list covering 98% of the key fields of view in positive cases, and the average time spent by pathologists using the high-risk field of view list for slide reading was reduced by about 40% compared to traditional full-slide, field-by-field reading.
[0076] The above description is merely a preferred embodiment of the present invention and is not intended to limit the invention. Various modifications and variations can be made to the present invention by those skilled in the art. Any modifications, equivalent substitutions, improvements, etc., made within the spirit and principles of the present invention should be included within the scope of protection of the present invention.
Claims
1. A method for intelligent screening of abnormal cells in cervical liquid-based cytology images, characterized in that it comprises the following steps:
an image acquisition step: acquiring a full-field digital image of a liquid-based thin-layer cell smear, and performing color space normalization and contrast enhancement processing on the full-field digital image to obtain a preprocessed image;
a cell instance segmentation step: inputting the preprocessed image into a cell instance segmentation network, which performs contour detection on each cell in the preprocessed image and outputs a cell mask and a corresponding cell image patch, wherein the cell mask is a binarized cell contour image;
a morphological feature extraction step: calculating the nuclear area parameter based on the cell mask, calculating the nucleocytoplasmic ratio parameter based on the cell mask, calculating the chromatin distribution parameter based on the grayscale distribution of the cell image patch, calculating the nuclear membrane regularity parameter based on the boundary features of the cell mask, and combining the nuclear area parameter, the nucleocytoplasmic ratio parameter, the chromatin distribution parameter, and the nuclear membrane regularity parameter into a morphological feature vector;
a cascaded classification step: inputting the morphological feature vector into a cascaded classifier, which includes a normal cell classification layer, an atypical squamous cell classification layer, a low-grade squamous intraepithelial lesion classification layer, and a high-grade squamous intraepithelial lesion classification layer connected in sequence, wherein each classification layer in turn classifies the input morphological feature vector and outputs a cell classification label and a corresponding cell abnormality probability value;
an attention-guided localization step: generating an attention weight map based on the cell abnormality probability value, wherein the weight value of each spatial location in the attention weight map is positively correlated with the abnormality probability value of the cell corresponding to that location, and determining the spatial location coordinates of the abnormal cell in the full-field digital image based on the attention weight map;
a high-risk field of view screening step: calculating the field of view priority score within a field of view window of preset size based on the spatial location coordinates and the cell abnormality probability value, sorting the field of view windows in descending order of the field of view priority score, and outputting a list of high-risk fields of view;
a results output step: generating TBS classification results based on the cell classification labels, integrating the high-risk field of view list, the TBS classification results, and the spatial location coordinates, and outputting an abnormal cell annotation atlas and TBS classification auxiliary suggestions.
2. The intelligent screening method for abnormal cells in cervical liquid-based cytology images according to claim 1, characterized in that the nuclear area parameter is obtained by counting the number of pixels in the nuclear region of the cell mask and multiplying it by the unit pixel area coefficient; the nucleocytoplasmic ratio parameter is the ratio of the nuclear area parameter to the cytoplasmic area parameter, wherein the cytoplasmic area parameter is the total cell area minus the nuclear area parameter.
3. The intelligent screening method for abnormal cells in cervical liquid-based cytology images according to claim 1, characterized in that the chromatin distribution parameter includes a chromatin entropy value and a chromatin aggregation degree, wherein the chromatin entropy value characterizes the uniformity of the gray-scale distribution within the cell nucleus, and the chromatin aggregation degree characterizes the spatial aggregation of high-gray-scale regions.
4. The intelligent screening method for abnormal cells in cervical liquid-based cytology images according to claim 1, characterized in that the nuclear membrane regularity parameter is calculated based on the circularity index of the nuclear boundary and the boundary curvature variance, wherein the circularity index is the normalized value of the ratio of the square of the nuclear perimeter to the nuclear area.
5. The intelligent screening method for abnormal cells in cervical liquid-based cytology images according to claim 1, characterized in that the classification judgment of each classification layer in the cascaded classifier is as follows: the normal cell classification layer determines whether the cell corresponding to the input morphological feature vector is a normal cell; if so, it outputs a normal cell label, and otherwise it passes the morphological feature vector to the atypical squamous cell classification layer; the atypical squamous cell classification layer determines whether the cell corresponding to the input morphological feature vector is an atypical squamous cell; if so, it outputs an ASC label, and otherwise it passes the morphological feature vector to the low-grade squamous intraepithelial lesion classification layer; the low-grade squamous intraepithelial lesion classification layer determines whether the cell corresponding to the input morphological feature vector is a low-grade squamous intraepithelial lesion; if so, it outputs an LSIL label, and otherwise it passes the morphological feature vector to the high-grade squamous intraepithelial lesion classification layer; the high-grade squamous intraepithelial lesion classification layer classifies the input morphological feature vector and outputs an HSIL label or a squamous cell carcinoma label.
6. The intelligent screening method for abnormal cells in cervical liquid-based cytology images according to claim 1, characterized in that the attention weight map is generated as follows: the full-field digital image is divided into grid regions, the number of abnormal cells detected in each grid region and the corresponding cell abnormality probability value are counted, and the product of the number of abnormal cells and the cell abnormality probability value is used as the attention weight of the grid region.
7. The intelligent screening method for abnormal cells in cervical liquid-based cytology images according to claim 1, characterized in that the field of view priority score is calculated as follows: the number of high-grade lesion cells in the field of view window is counted and multiplied by a first weighting coefficient; the number of low-grade lesion cells in the field of view window is counted and multiplied by a second weighting coefficient; the number of atypical cells in the field of view window is counted and multiplied by a third weighting coefficient; and the three products are summed to obtain the field of view priority score, wherein the first weighting coefficient is greater than the second weighting coefficient, and the second weighting coefficient is greater than the third weighting coefficient.
8. The intelligent screening method for abnormal cells in cervical liquid-based cytology images according to claim 1, characterized in that the cell instance segmentation network uses a Mask R-CNN network structure including a feature pyramid network module, a region proposal network module, and a mask prediction branch module; the feature pyramid network module extracts multi-scale feature maps from the preprocessed image, the region proposal network module generates cell candidate regions based on the multi-scale feature maps, and the mask prediction branch module generates cell masks for the cell candidate regions.
9. The intelligent screening method for abnormal cells in cervical liquid-based cytology images according to claim 1, characterized in that the TBS classification results include a negative for intraepithelial lesion or malignancy (NILM) result, an atypical squamous cells of undetermined significance (ASC-US) result, an atypical squamous cells, cannot exclude high-grade lesion (ASC-H) result, a low-grade squamous intraepithelial lesion (LSIL) result, a high-grade squamous intraepithelial lesion (HSIL) result, and a squamous cell carcinoma result; the TBS classification auxiliary suggestions include the TBS classification results and the corresponding abnormal cell distribution statistics and high-risk area annotation information.
10. An intelligent screening system for abnormal cells in cervical liquid-based cytology images, used to implement the intelligent screening method for abnormal cells in cervical liquid-based cytology images as described in any one of claims 1-9, characterized in that it comprises:
an image acquisition module, used to acquire a full-field digital image of a liquid-based thin-layer cell smear, and to perform color space normalization and contrast enhancement processing on the full-field digital image to obtain a preprocessed image;
a cell instance segmentation module, used to input the preprocessed image into the cell instance segmentation network and output a cell mask and a corresponding cell image patch;
a morphological feature extraction module, used to calculate the nuclear area parameter, nucleocytoplasmic ratio parameter, chromatin distribution parameter, and nuclear membrane regularity parameter based on the cell mask and the cell image patch, and to combine the parameters into a morphological feature vector;
a cascaded classification module, used to input the morphological feature vector into the cascaded classifier, which performs classification judgments sequentially through the normal cell classification layer, the atypical squamous cell classification layer, the low-grade squamous intraepithelial lesion classification layer, and the high-grade squamous intraepithelial lesion classification layer, and outputs cell classification labels and cell abnormality probability values;
an attention-guided localization module, used to generate an attention weight map based on the cell abnormality probability value, and to determine the spatial location coordinates of the abnormal cell based on the attention weight map;
a high-risk field of view screening module, used to calculate the field of view priority score based on the spatial location coordinates and the cell abnormality probability value, and to output a high-risk field of view list sorted by priority;
a results output module, used to generate TBS classification results based on the cell classification labels, and to output an abnormal cell annotation atlas and TBS classification auxiliary suggestions.