Choroid plexus segmentation quality evaluation method, program, medium, and device

The uncertainty of a choroid plexus segmentation model is evaluated using the MCDO and TTA methods. By combining feature extraction with a random forest model, erroneous segmentation results are detected automatically. This addresses the lack of credibility of choroid plexus segmentation models in the prior art and achieves high-precision segmentation quality evaluation and improved reliability.

CN122223328A · Pending · Publication Date: 2026-06-16 · ZHEJIANG UNIV

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
ZHEJIANG UNIV
Filing Date
2026-03-20
Publication Date
2026-06-16

AI Technical Summary

Technical Problem

Existing choroid plexus segmentation models cannot effectively assess prediction reliability and lack interpretability, making it difficult to guarantee segmentation quality and requiring extensive manual review.

Method used

Uncertainty is assessed using Monte Carlo dropout (MCDO) and test-time augmentation (TTA). Combined with choroid plexus features, erroneous segmentation results are detected automatically using a U-Net network and a random forest model, providing a segmentation quality assessment.

Benefits of technology

It improves the predictive reliability of choroid plexus segmentation, reduces the cost of manual review, and achieves high-precision segmentation quality assessment.

✦ Generated by Eureka AI based on patent content.


Abstract

This invention discloses a method, system, medium, and device for evaluating the quality of choroid plexus segmentation, belonging to the field of medical image analysis. The invention first uses the image to be segmented as the input image of a choroid plexus segmentation model and evaluates the uncertainty of the choroid plexus segmentation results during model inference, thereby obtaining a segmentation result map and an uncertainty map. Choroid plexus features are then extracted from the segmentation result map and uncertainty features from the uncertainty map, and both are input together into a choroid plexus segmentation quality evaluation model, trained on the basis of a machine learning model, to obtain the Dice coefficient of the segmentation result map. Finally, a threshold method is used to determine the segmentation quality of the segmentation result map. This invention evaluates prediction uncertainty while outputting the prediction result, treating it as an important component of model interpretability. Combined with the characteristics of the choroid plexus, it automatically detects erroneous segmentation results, greatly improving prediction reliability.

Description

Technical Field

[0001] This invention belongs to the field of medical image analysis, specifically relating to a method, system, medium, and device for assessing the quality of choroid plexus segmentation.

Background Technology

[0002] Hydrocephalus is a common disease caused by the excessive accumulation of cerebrospinal fluid (CSF) in the ventricles. The choroid plexus is the main organ producing CSF, so its automated segmentation could provide new parameters for CSF dynamics assessment and subsequent surgical planning. However, the choroid plexus is a small, ribbon-like structure with a pronounced partial volume effect, which makes accurate measurement difficult in traditional image analysis. While some researchers have proposed deep learning models for choroid plexus segmentation, the limited availability of manually labeled gold standards hinders quality assurance and necessitates extensive manual review. In recent years, with the rapid development of artificial intelligence, deep learning methods have been widely applied in medical image analysis, including choroid plexus segmentation. However, most models focus on the accuracy of predictions and do not assess prediction reliability, which leads to issues such as overconfidence and lack of interpretability. In the medical field, the uncertainty of model outputs is also crucial. Existing research has proposed various methods to quantify uncertainty, and the resulting uncertainty estimates can be used for tasks such as out-of-distribution detection, active learning, failure detection, calibration, and ambiguity modeling. In segmentation tasks, both prediction and uncertainty estimation are performed at the voxel level; they therefore need to be aggregated into image-level scores in order to identify borderline samples. Jungo et al. designed various aggregation methods for the obtained voxel-level uncertainty maps, using them as features to predict Dice coefficients and thereby detect abnormal segmentations (Jungo A, Balsiger F, Reyes M. Analyzing the quality and challenges of uncertainty estimations for brain tumor segmentation[J]. Frontiers in neuroscience, 2020, 14: 282.); Kahl et al. designed a patch-based and threshold-based method to aggregate uncertainty (Kahl KC, Lüth CT, Zenk M, et al. Values: A framework for systematic validation of uncertainty estimation in semantic segmentation[J]. arXiv preprint arXiv:2401.08501, 2024.). However, none of these methods considered the features of the segmented objects.

Summary of the Invention

[0003] Current choroid plexus segmentation models can only provide prediction results and cannot assess how reliable those predictions are. As a result, the quality of the segmentation results cannot be evaluated automatically, and significant manual review and annotation costs are incurred. The purpose of this invention is therefore to address these problems in the prior art and to provide a method for evaluating the quality of choroid plexus segmentation. This invention assesses prediction uncertainty while outputting the prediction results, treating it as an important component of model interpretability. Combining this with the characteristics of the choroid plexus, it automatically detects erroneous segmentation results, thereby improving prediction reliability.

[0004] The specific technical solution adopted in this invention is as follows: In a first aspect, the present invention provides a method for evaluating the quality of choroid plexus segmentation, comprising:
S1. The image to be segmented is used as the input image of the choroid plexus segmentation model, and the Monte Carlo dropout (MCDO) and/or test-time augmentation (TTA) methods are used to evaluate the uncertainty of the choroid plexus segmentation results during model inference, so as to obtain the segmentation result map and the uncertainty map.
S2. Choroid plexus features are extracted from the segmentation result map and uncertainty features are extracted from the uncertainty map. The extracted choroid plexus features and uncertainty features are then input together into the choroid plexus segmentation quality assessment model, trained on the basis of a machine learning model, to obtain the Dice coefficient of the segmentation result map.
S3. Based on the Dice coefficient of the segmentation result map, the segmentation quality of the segmentation result map is judged by the threshold method.
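The flow of steps S1 to S3 can be sketched as a small orchestration function. This is only an illustrative outline: the function name and the three callables (`infer_with_uncertainty`, `extract_features`, `predict_dice`) are hypothetical placeholders for the components described in the text, not names used in the patent.

```python
def assess_segmentation(image, infer_with_uncertainty, extract_features,
                        predict_dice, threshold):
    """Run S1-S3 for one image using caller-supplied components (all hypothetical)."""
    # S1: segmentation result map and uncertainty map from MCDO and/or TTA inference
    segmentation, uncertainty = infer_with_uncertainty(image)
    # S2: features from both maps go into the quality assessment model
    features = extract_features(segmentation, uncertainty)
    predicted_dice = predict_dice(features)
    # S3: threshold method on the predicted Dice coefficient
    return {
        "segmentation": segmentation,
        "predicted_dice": predicted_dice,
        "qualified": predicted_dice > threshold,
    }
```

Passing the components in as callables keeps the sketch independent of any particular segmentation network or regression model.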

[0005] As a preferred embodiment of the first aspect, the choroid plexus segmentation model employs a U-Net network, and Dropout layers with a controllable enabled state are added to the U-Net network before the deepest pooling operation and after the first upsampling operation.

[0006] As a preferred embodiment of the first aspect above, in S1, when the Monte Carlo dropout (MCDO) method is used to evaluate the uncertainty of the choroid plexus segmentation results during model inference, the Dropout layer is kept on and multiple model inferences are performed. The choroid plexus segmentation results obtained in all inference rounds are averaged to obtain the segmentation result map. At the same time, the normalized entropy of each voxel is calculated from the choroid plexus segmentation results of all inference rounds and used as the uncertainty metric to obtain the uncertainty map.

[0007] As a preferred embodiment of the first aspect above, in S1, when test-time augmentation (TTA) is used to evaluate the uncertainty of the choroid plexus segmentation results during model inference, the Dropout layer is kept off, and multiple model inferences are performed by introducing augmentation strategies including random horizontal flipping, random rotation, and random translation. The segmentation result obtained in each inference round is then subjected to the inverse operation of the introduced augmentation strategies and used as the choroid plexus segmentation result. The choroid plexus segmentation results obtained in all inference rounds are averaged to obtain the segmentation result map. At the same time, the normalized entropy of each voxel is calculated from the choroid plexus segmentation results of all inference rounds and used as the uncertainty metric to obtain the uncertainty map.

[0008] As a preferred embodiment of the first aspect above, in S2, the choroid plexus features extracted from the segmentation result map preferably include shape features, first-order statistical features, gray-level co-occurrence matrix features, gray-level run length matrix features, gray-level size zone matrix features, gray-level dependence matrix features, and neighborhood gray-tone difference matrix features. Preferably, a total of 16 uncertainty features are extracted from the uncertainty map, namely:
a) The mean and logarithmic sum of all voxels in the uncertainty map;
b) The mean and logarithmic sum of all voxels in the uncertainty map, excluding the segmentation boundary;
c) The mean and logarithmic sum of all voxels in the uncertainty map, weighted by their normalized distances to the segmentation boundary;
d) The mean and logarithmic sum of all voxels in the uncertainty map, excluding the segmentation boundary, weighted by their normalized distances to the segmentation boundary;
e) The sum of all voxel values in the uncertainty map after volume normalization, i.e., after dividing each voxel value by the foreground volume;
f) The mean values of the foreground region, background region, and boundary region in the uncertainty map;
g) The maximum and minimum values of the three-dimensional average pooling result of the uncertainty map;
h) The mean and sum of the regions in the uncertainty map whose voxel values are greater than the uncertainty threshold, where the uncertainty threshold is the voxel value corresponding to a quantile of the uncertainty map, and that quantile is defined in terms of the average foreground proportion of the uncertainty map.

[0009] As a preferred embodiment of the first aspect above, the choroid plexus segmentation quality assessment model is trained on a labeled dataset based on a random forest model, wherein the sample input of each training sample is the choroid plexus features and uncertainty features extracted for a single image to be segmented, and the sample label is the Dice coefficient calculated between the segmentation produced by the choroid plexus segmentation model and the ground truth annotation of the choroid plexus.

[0010] As a preferred embodiment of the first aspect above, in step S3, when judging the segmentation quality of the segmentation result image by the threshold method, it is determined whether the Dice coefficient of the segmentation result image is greater than a preset coefficient threshold. If it is greater, the segmentation result image is considered to be qualified; otherwise, the segmentation result image is considered to be unqualified and is sent to the manual inspection process.

[0011] In a second aspect, the present invention provides a computer program product, including a computer program/instructions which, when executed by a processor, implement the choroid plexus segmentation quality assessment method as described in any of the solutions of the first aspect above.

[0012] In a third aspect, the present invention provides a computer-readable storage medium storing a computer program that, when executed by a processor, implements the choroid plexus segmentation quality assessment method as described in any of the solutions of the first aspect above.

[0013] In a fourth aspect, the present invention provides a computer electronic device, which includes a memory and a processor; the memory is used to store a computer program; the processor is configured to, when executing the computer program, implement the choroid plexus segmentation quality assessment method as described in any of the solutions of the first aspect above.

[0014] Compared with the prior art, the present invention has the following advantages: This invention proposes an uncertainty quantification and evaluation framework suitable for choroid plexus segmentation models. It evaluates prediction uncertainty simultaneously with the output prediction results, treating it as a crucial component of model interpretability. By leveraging the characteristics of the choroid plexus, it automatically detects erroneous segmentation results, improving prediction reliability. Experimental results show that a model combining MCDO-based uncertainty features and segmentation-map features achieves the best Dice coefficient prediction accuracy, with 5-fold cross-validation results of MSE = 0.0067, R² = 0.7249, PCC = 0.8681, [metric symbol missing in source] = 0.8533, and AURC = 0.2560. After the model was retrained on the full training data, segmentation error detection was performed on a test set of 72 cases. For this choroid plexus segmentation task, an acceptable Dice threshold of 0.60 was set, resulting in a classification accuracy of 0.889.

Attached Figure Description

[0015] Figure 1 is a schematic diagram of the steps of the choroid plexus segmentation quality assessment method; Figure 2 is a schematic diagram of the structure of the computer electronic device; Figure 3 is a diagram of the overall evaluation framework in the embodiment; Figure 4 shows the distribution of Dice coefficients in the training set in the embodiment; Figure 5 is an example of an uncertainty map in the embodiment; Figure 6 is the confusion matrix for segmentation error detection in the embodiment; Figure 7 is an example of a failed choroid plexus segmentation in the embodiment.

Detailed Implementation

[0016] To make the above-mentioned objects, features, and advantages of the present invention more apparent and understandable, specific embodiments of the present invention will be described in detail below with reference to the accompanying drawings. Many specific details are set forth in the following description to provide a thorough understanding of the present invention. However, the present invention can be practiced in many other ways different from those described herein, and those skilled in the art can make similar modifications without departing from the spirit of the present invention. Therefore, the present invention is not limited to the specific embodiments disclosed below. Technical features in various embodiments of the present invention can be combined accordingly without mutual conflict.

[0017] This invention provides a method for evaluating the quality of choroid plexus segmentation. The method is suitable for quantifying and evaluating the uncertainty of choroid plexus segmentation models. It assesses prediction uncertainty while outputting prediction results, treating it as an important component of model interpretability. Combining the characteristics of the choroid plexus, it automatically detects erroneous segmentation results, thereby improving prediction reliability. The specific implementation of this choroid plexus segmentation quality evaluation method is described in detail below.

[0018] S1. The image to be segmented is used as the input image of the choroid plexus segmentation model. During model inference, the Monte Carlo dropout (MCDO) and/or test-time augmentation (TTA) methods are used to evaluate the uncertainty of the choroid plexus segmentation results, thereby obtaining the segmentation result map and the uncertainty map.

[0019] It should be noted that the choroid plexus segmentation model in this invention is not limited and can be any model capable of segmenting the choroid plexus in images, such as the traditional U-Net or its variants. Furthermore, the image to be segmented is generally a three-dimensional image, preferably a T1W MRI image; however, in principle this method can also be applied to two-dimensional images.

[0020] In embodiments of the present invention, the choroid plexus segmentation model employs a U-Net network, and controllable Dropout layers are added to the U-Net network before the deepest pooling operation and after the first upsampling operation. When a Dropout layer is enabled, the Dropout operation is always applied to the features entering that layer; when the Dropout layer is disabled, the layer is skipped and no Dropout operation is performed.
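The behaviour of such a controllable Dropout layer can be illustrated with a framework-agnostic NumPy sketch. The class name, the drop probability `p`, and the inverted-dropout rescaling are assumptions made for illustration; the source does not state the dropout rate or the deep learning framework used.

```python
import numpy as np

class ControllableDropout:
    """Dropout whose enabled state is set explicitly (hypothetical helper)."""

    def __init__(self, p=0.5, enabled=False, seed=None):
        if not 0.0 <= p < 1.0:
            raise ValueError("p must be in [0, 1)")
        self.p = p                # drop probability (assumed value)
        self.enabled = enabled    # True for MCDO inference, False for TTA inference
        self.rng = np.random.default_rng(seed)

    def __call__(self, x):
        if not self.enabled:
            return x              # layer is skipped, features pass through unchanged
        keep = self.rng.random(x.shape) >= self.p
        # Inverted-dropout scaling keeps the expected activation unchanged.
        return x * keep / (1.0 - self.p)
```

The point of the explicit flag is that dropout can stay active at inference time, which a conventional train/eval switch would normally turn off.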

[0021] It should be noted that both Monte Carlo dropout (MCDO) and test-time augmentation (TTA) are existing techniques for assessing the uncertainty of prediction results. See the existing literature for reference: Gal Y, Ghahramani Z. Dropout as a Bayesian approximation: Representing model uncertainty in deep learning[C]//International Conference on Machine Learning. PMLR, 2016: 1050-1059. and Wang G, Li W, Aertsen M, et al. Aleatoric uncertainty estimation with test-time augmentation for medical image segmentation with convolutional neural networks[J]. Neurocomputing, 2019, 338: 34-45.

[0022] In an embodiment of the present invention, when the Monte Carlo dropout (MCDO) method is used to evaluate the uncertainty of the choroid plexus segmentation results during model inference, the Dropout layer is kept on and multiple model inferences are performed. The choroid plexus segmentation results obtained in all inference rounds are averaged to obtain the segmentation result map. Simultaneously, the normalized entropy of each voxel is calculated from the choroid plexus segmentation results of all inference rounds and used as the uncertainty metric, thus obtaining the uncertainty map. Specifically, in an embodiment of the present invention, the MCDO method enables the Dropout layer during inference and performs T segmentation inferences on the input image to be segmented (the value of T can be adjusted according to actual conditions). The T groups of segmentation results are averaged (i.e., the class probability distributions produced by the segmentation are averaged) to obtain the segmentation result map, and the prediction entropy of each voxel is calculated from the T groups of segmentation results to obtain the voxel-level uncertainty map.
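A minimal NumPy sketch of this aggregation is given below. It assumes that each inference returns class probabilities of shape `(C, *spatial)`, that the entropy is taken over the averaged distribution, and that "normalized" means division by `log(C)`; the source does not spell out these details, and the function names and `stochastic_model` callable are hypothetical.

```python
import numpy as np

def aggregate_predictions(prob_stack, eps=1e-12):
    """Average T probability maps and compute a normalized entropy map.

    prob_stack: array of shape (T, C, *spatial) with class probabilities.
    Returns (mean_prob of shape (C, *spatial), uncertainty of shape (*spatial)).
    """
    prob_stack = np.asarray(prob_stack, dtype=float)
    mean_prob = prob_stack.mean(axis=0)
    n_classes = mean_prob.shape[0]
    entropy = -(mean_prob * np.log(mean_prob + eps)).sum(axis=0)
    # Dividing by log(C) maps the entropy to [0, 1] (assumed normalization).
    uncertainty = np.clip(entropy / np.log(n_classes), 0.0, 1.0)
    return mean_prob, uncertainty

def mc_dropout_inference(stochastic_model, image, n_passes=20):
    """Run T = n_passes stochastic forward passes with dropout left enabled."""
    passes = [stochastic_model(image) for _ in range(n_passes)]
    mean_prob, uncertainty = aggregate_predictions(passes)
    return mean_prob.argmax(axis=0), uncertainty
```

With two classes, a voxel on which the passes agree completely gets an uncertainty near 0, while a voxel whose averaged foreground probability is 0.5 gets an uncertainty of 1.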

[0023] In an embodiment of the present invention, when test-time augmentation (TTA) is used to evaluate the uncertainty of the choroid plexus segmentation results during model inference, the Dropout layer is kept off and multiple model inferences are performed by introducing augmentation strategies including random horizontal flipping, random rotation, and random translation. The segmentation result obtained in each inference round is then subjected to the inverse operation of the introduced augmentation strategies and used as the choroid plexus segmentation result. The choroid plexus segmentation results obtained in all inference rounds are averaged to obtain the segmentation result map. At the same time, the normalized entropy of each voxel is calculated from the choroid plexus segmentation results of all inference rounds and used as the uncertainty metric to obtain the uncertainty map. Specifically, in an embodiment of the present invention, the TTA method turns off the Dropout layer during inference and applies T augmentations to each input image to be segmented. The augmentation strategy can be any combination of the following three operations: (1) random horizontal flipping; (2) random small-angle rotation; (3) random translation. The T augmented images are processed by the choroid plexus segmentation model to obtain segmentation results. The segmentation results are restored to the original image space by the inverse operation of the augmentation strategy and then averaged to obtain the segmentation result map. At the same time, the prediction entropy of each voxel is calculated from the T restored results as the uncertainty estimate, yielding the uncertainty map.
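The augment, infer, invert, and average loop can be sketched as follows. This is a simplified illustration, not the embodiment itself: only flipping and translation along the last axis are shown, translation is circular (`np.roll`) so that it is exactly invertible, and small-angle rotation is omitted because it requires interpolation. The function name, `max_shift`, and the `model` callable (image to `(C, *spatial)` probabilities) are assumptions.

```python
import numpy as np

def tta_inference(model, image, n_aug=8, max_shift=2, seed=0, eps=1e-12):
    """Test-time augmentation with flips and circular shifts along the last axis."""
    rng = np.random.default_rng(seed)
    restored = []
    for _ in range(n_aug):
        flip = rng.random() < 0.5
        shift = int(rng.integers(-max_shift, max_shift + 1))
        # Forward augmentation: optional flip, then translation.
        aug = np.flip(image, axis=-1) if flip else image
        aug = np.roll(aug, shift, axis=-1)
        prob = model(aug)
        # Inverse operations in reverse order: undo translation, then undo flip.
        prob = np.roll(prob, -shift, axis=-1)
        if flip:
            prob = np.flip(prob, axis=-1)
        restored.append(prob)
    mean_prob = np.mean(restored, axis=0)
    entropy = -(mean_prob * np.log(mean_prob + eps)).sum(axis=0)
    uncertainty = np.clip(entropy / np.log(mean_prob.shape[0]), 0.0, 1.0)
    return mean_prob.argmax(axis=0), uncertainty
```

For a model that responds identically to every augmented view, the restored predictions coincide and the uncertainty map is zero; disagreement between views is what raises the entropy.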

[0024] S2. Extract choroid plexus features from the segmentation result map, extract uncertainty features from the uncertainty map, and then input the extracted choroid plexus features and uncertainty features together into a choroid plexus segmentation quality assessment model, trained on the basis of a machine learning model, to obtain the Dice coefficient of the segmentation result map.

[0025] It should be noted that the choroid plexus features extracted from the segmentation result map and the uncertainty features extracted from the uncertainty map can be optimized according to actual conditions. In an embodiment of the present invention, as a preferred approach, the choroid plexus features extracted from the segmentation result map include shape-based 3D features, first-order statistics, gray-level co-occurrence matrix (GLCM), gray-level run length matrix (GLRLM), gray-level size zone matrix (GLSZM), gray-level dependence matrix (GLDM), and neighborhood gray-tone difference matrix (NGTDM) features. These features can be extracted automatically by PyRadiomics (https://pyradiomics.readthedocs.io/). Each category contains multiple sub-features. With PyRadiomics' default automatic extraction mode, a total of 14 shape features, 18 first-order statistical features, 24 GLCM features, 16 GLRLM features, 16 GLSZM features, 14 GLDM features, and 5 NGTDM features are obtained, as shown in the table below:

Table 1. Choroid plexus features based on the segmentation result map

Feature category | Number of features | Sub-features
Shape features | 14 | Elongation, Flatness, Least Axis Length, Major Axis Length, Maximum 2D Diameter Column, Maximum 2D Diameter Row, Maximum 2D Diameter Slice, Maximum 3D Diameter, Mesh Volume, Minor Axis Length, Sphericity, Surface Area, Surface Area to Volume Ratio, Voxel Volume
First-order statistical features | 18 | 10th Percentile, 90th Percentile, Energy, Entropy, Interquartile Range, Kurtosis, Maximum, Mean Absolute Deviation, Mean, Median, Minimum, Range, Robust Mean Absolute Deviation, Root Mean Squared, Skewness, Total Energy, Uniformity, Variance
Gray-level co-occurrence matrix (GLCM) | 24 | Autocorrelation, Cluster Prominence, Cluster Shade, Cluster Tendency, Contrast, Correlation, Difference Average, Difference Entropy, Difference Variance, Inverse Difference, Inverse Difference Moment, Inverse Difference Moment Normalized, Inverse Difference Normalized, Informational Measure of Correlation 1, Informational Measure of Correlation 2, Inverse Variance, Joint Average, Joint Energy, Joint Entropy, Maximum Correlation Coefficient, Maximum Probability, Sum Average, Sum Entropy, Sum Squares
Gray-level run length matrix (GLRLM) | 16 | Gray Level Non-Uniformity, Normalized Gray Level Non-Uniformity, Gray Level Variance, High Gray Level Run Emphasis, Long Run Emphasis, Long Run High Gray Level Emphasis, Long Run Low Gray Level Emphasis, Low Gray Level Run Emphasis, Run Entropy, Run Length Non-Uniformity, Normalized Run Length Non-Uniformity, Run Percentage, Run Variance, Short Run Emphasis, Short Run High Gray Level Emphasis, Short Run Low Gray Level Emphasis
Gray-level size zone matrix (GLSZM) | 16 | Gray Level Non-Uniformity, Normalized Gray Level Non-Uniformity, Gray Level Variance, High Gray Level Zone Emphasis, Large Area Emphasis, Large Area High Gray Level Emphasis, Large Area Low Gray Level Emphasis, Low Gray Level Zone Emphasis, Size-Zone Non-Uniformity, Normalized Size-Zone Non-Uniformity, Small Area Emphasis, Small Area High Gray Level Emphasis, Small Area Low Gray Level Emphasis, Zone Entropy, Zone Percentage, Zone Variance
Gray-level dependence matrix (GLDM) | 14 | Dependence Entropy, Dependence Non-Uniformity, Normalized Dependence Non-Uniformity, Dependence Variance, Gray Level Non-Uniformity, Gray Level Variance, High Gray Level Emphasis, Large Dependence Emphasis, Large Dependence High Gray Level Emphasis, Large Dependence Low Gray Level Emphasis, Low Gray Level Emphasis, Small Dependence Emphasis, Small Dependence High Gray Level Emphasis, Small Dependence Low Gray Level Emphasis
Neighborhood gray-tone difference matrix (NGTDM) | 5 | Busyness, Coarseness, Complexity, Contrast, Strength

Similarly, as a preferred approach, a total of 16 uncertainty features are extracted from the uncertainty map, namely:
a) The mean and logarithmic sum of all voxels in the uncertainty map;
b) The mean and logarithmic sum of all voxels in the uncertainty map, excluding the segmentation boundary;
c) The mean and logarithmic sum of all voxels in the uncertainty map, weighted by their normalized distances to the segmentation boundary;
d) The mean and logarithmic sum of all voxels in the uncertainty map, excluding the segmentation boundary, weighted by their normalized distances to the segmentation boundary;
e) The sum of all voxel values in the uncertainty map after volume normalization, i.e., after dividing each voxel value by the foreground volume;
f) The mean values of the foreground region, background region, and boundary region in the uncertainty map;
g) The maximum and minimum values of the three-dimensional average pooling result of the uncertainty map;
h) The mean and sum of the regions in the uncertainty map whose voxel values are greater than the uncertainty threshold, where the uncertainty threshold is the voxel value corresponding to a quantile of the uncertainty map, and that quantile is defined in terms of the average foreground proportion of the uncertainty map (obtained by averaging the foreground proportions of the uncertainty maps of all training samples).
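A few of the simpler uncertainty features (items a, e, f, and h) can be sketched in NumPy as shown below. This is a partial illustration under stated assumptions: the function and key names are hypothetical, a small `eps` is added before taking logarithms to avoid `log(0)`, the foreground and background are split without the boundary region, and the quantile for item h is passed in by the caller because its exact expression is not recoverable from the source.

```python
import numpy as np

def basic_uncertainty_features(unc, foreground, threshold_quantile, eps=1e-8):
    """Compute a subset of image-level features from a voxel-level uncertainty map.

    unc: uncertainty map; foreground: boolean mask of the predicted structure;
    threshold_quantile: quantile in [0, 1] supplied by the caller (assumption).
    """
    unc = np.asarray(unc, dtype=float)
    fg = np.asarray(foreground, dtype=bool)
    threshold = np.quantile(unc, threshold_quantile)
    high = unc[unc > threshold]
    return {
        # a) mean and logarithmic sum over all voxels
        "mean_all": unc.mean(),
        "logsum_all": np.log(unc + eps).sum(),
        # e) sum of voxel values normalized by the foreground volume
        "volume_normalized_sum": unc.sum() / max(int(fg.sum()), 1),
        # f) region means (boundary region omitted in this sketch)
        "mean_foreground": unc[fg].mean() if fg.any() else 0.0,
        "mean_background": unc[~fg].mean() if (~fg).any() else 0.0,
        # h) mean and sum of voxels above the quantile-based threshold
        "mean_above_threshold": high.mean() if high.size else 0.0,
        "sum_above_threshold": high.sum(),
    }
```

Returning a dictionary keyed by feature name makes it straightforward to concatenate these values with the PyRadiomics features into one input vector for the quality assessment model.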

[0026] It should be noted that the segmentation boundary in the uncertainty map above refers to the boundary between the foreground and the background after segmentation. To make the calculation reliable, in the embodiments of the present invention the boundary can be extended by 2 voxels into both the foreground and the background, forming a boundary region with a width of 4 voxels. This boundary region then separates the foreground region from the background region.
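One way to build such a boundary region is to dilate and erode the predicted foreground mask by 2 voxels each and take the difference. The NumPy sketch below assumes a face-connected structuring element (6 neighbours in 3D) and treats voxels outside the image as foreground during erosion; neither choice is specified in the source, and the function names are hypothetical.

```python
import numpy as np

def _dilate_once(mask):
    """One step of binary dilation with a face-connected neighbourhood."""
    out = mask.copy()
    for axis in range(mask.ndim):
        pad = [(1, 1) if a == axis else (0, 0) for a in range(mask.ndim)]
        padded = np.pad(mask, pad)  # pads with False
        lo = [slice(None)] * mask.ndim
        hi = [slice(None)] * mask.ndim
        lo[axis] = slice(0, -2)     # neighbour at index i - 1
        hi[axis] = slice(2, None)   # neighbour at index i + 1
        out |= padded[tuple(lo)] | padded[tuple(hi)]
    return out

def split_regions(foreground, half_width=2):
    """Return (inner foreground, outer background, boundary region) masks."""
    fg = np.asarray(foreground, dtype=bool)
    grown, shrunk = fg.copy(), fg.copy()
    for _ in range(half_width):
        grown = _dilate_once(grown)
        shrunk = ~_dilate_once(~shrunk)  # erosion as dilation of the complement
    boundary = grown & ~shrunk           # half_width voxels on each side
    return shrunk, ~grown, boundary
```

With `half_width=2` the band covers 2 voxels inside and 2 voxels outside the predicted contour, matching the 4-voxel width described above.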

[0027] It should be noted that the "logarithmic sum" mentioned above refers to summing the logarithmic values of all voxels involved in the calculation. Each voxel's value corresponds to a measure of its uncertainty, namely the aforementioned prediction entropy.

[0028] It should be noted that in c) and d) above, the nearest distance in the image from each participating voxel to the segmentation boundary (i.e., the boundary region with a width of 4 voxels) must be calculated. The nearest distances of all participating voxels are then summed, and this sum serves as the normalization denominator: dividing each voxel's nearest distance by the sum gives the weight of that voxel. Each voxel value is multiplied by its weight to obtain the normalized voxel value, which then enters the calculation of the mean and logarithmic sum.
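Written out, the weighting described in this paragraph takes the following form. The symbols are introduced here for illustration only: $\Omega$ is the set of participating voxels (all voxels for c, voxels outside the boundary region for d), $d_i$ is the nearest distance of voxel $i$ to the boundary region, and $u_i$ is its uncertainty value.

```latex
w_i = \frac{d_i}{\sum_{j \in \Omega} d_j}, \qquad
\tilde{u}_i = w_i \, u_i, \qquad
\text{mean} = \frac{1}{|\Omega|} \sum_{i \in \Omega} \tilde{u}_i, \qquad
\text{log-sum} = \sum_{i \in \Omega} \log \tilde{u}_i
```

The weights sum to 1 over $\Omega$, so voxels far from the boundary contribute more. The source does not say how voxels with $\tilde{u}_i = 0$ are handled in the logarithmic sum; an implementation would need a convention for them, such as adding a small constant.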

[0029] It should be noted that the above-mentioned three-dimensional average pooling result of the uncertainty map is obtained by average pooling the uncertainty map with a three-dimensional kernel. Specifically, a three-dimensional window slides across the uncertainty map with a stride of 1. The voxel values within each window position are averaged, and the average is used as one voxel value in the three-dimensional average pooling result. The preferred size of this three-dimensional window is 10 × 10 × 10.
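The stride-1 average pooling and its two extreme values (feature g) can be sketched with NumPy's `sliding_window_view`. The function name is hypothetical, and shrinking the window when a volume dimension is smaller than the window is an added assumption that the source does not address.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def avg_pool_extremes(unc, window=10):
    """Max and min of the stride-1 average pooling of an uncertainty map."""
    unc = np.asarray(unc, dtype=float)
    # Clamp the window to the volume size along each axis (assumption).
    win = tuple(min(window, s) for s in unc.shape)
    windows = sliding_window_view(unc, win)
    # Average over the trailing window axes, one value per window position.
    pooled = windows.mean(axis=tuple(range(-unc.ndim, 0)))
    return pooled.max(), pooled.min()
```

This direct form is easy to verify but visits every voxel of every window; for large volumes a cumulative-sum or convolution-based implementation would be faster.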

[0030] It should be noted that the aforementioned choroid plexus segmentation quality assessment model can in principle be trained on the basis of various machine learning models; in the embodiments of this invention, a random forest model is preferred. Specifically, this choroid plexus segmentation quality assessment model, based on a random forest model, can be trained on a labeled dataset. The input of each training sample consists of the choroid plexus features and uncertainty features extracted for a single image to be segmented. The sample label is the Dice coefficient calculated between the segmentation produced by the choroid plexus segmentation model and the ground truth choroid plexus annotation. For calculating the Dice coefficient of the sample label, the ground truth choroid plexus annotation can be obtained by experts through mask annotation on the image samples to be segmented. The specific training method of the random forest model is existing technology and will not be elaborated further.
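A minimal sketch of label computation and model fitting is shown below, using scikit-learn's `RandomForestRegressor`. The choice of scikit-learn, the hyperparameters (`n_estimators=200`), and the function names are assumptions for illustration; the source only states that a random forest is trained to predict the Dice coefficient.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def dice_coefficient(pred, truth):
    """Dice overlap between a predicted mask and an expert ground-truth mask."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    denom = pred.sum() + truth.sum()
    # Two empty masks are treated as a perfect match (convention assumed here).
    return 1.0 if denom == 0 else 2.0 * (pred & truth).sum() / denom

def train_quality_model(features, dice_labels, seed=0):
    """Fit a regressor mapping per-image feature vectors to Dice coefficients.

    features: array of shape (n_images, n_features), each row combining the
    choroid plexus features and the uncertainty features of one image.
    """
    model = RandomForestRegressor(n_estimators=200, random_state=seed)
    model.fit(features, dice_labels)
    return model
```

At inference time, `model.predict` on the feature vector of a new image gives the predicted Dice coefficient that step S3 compares against the threshold.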

[0031] S3. Based on the Dice coefficient of the segmentation result image, the segmentation quality of the segmentation result image is judged by the threshold method.

[0032] It should be noted that when judging the segmentation quality of the segmentation result image using the threshold method, a single threshold can be set for binary classification, or multiple thresholds can be set for multi-class classification. In the embodiments of the present invention, it is preferable to optimize a coefficient threshold in advance and use the binary threshold method to obtain the segmentation quality judgment for the segmentation result image. Specifically, the Dice coefficient of each segmentation result image is compared with the preset coefficient threshold. If it is greater, the segmentation result image is considered qualified; otherwise, the segmentation result image is considered unqualified and can be sent to the manual inspection process for review. Because only unqualified results require review, the manual review workload is greatly reduced.
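The binary threshold rule can be sketched as a small triage function. The function name and the dictionary input are hypothetical; the default threshold of 0.60 is the acceptable Dice threshold reported for the experiment in this document, and other tasks would need their own value.

```python
def triage_by_predicted_dice(predicted_dice, threshold=0.60):
    """Split cases into qualified results and results sent to manual review.

    predicted_dice: mapping from case identifier to predicted Dice coefficient.
    """
    qualified, needs_review = [], []
    for case_id, dice in predicted_dice.items():
        # Strictly greater than the threshold counts as qualified.
        (qualified if dice > threshold else needs_review).append(case_id)
    return qualified, needs_review
```

For example, `triage_by_predicted_dice({"a": 0.82, "b": 0.60, "c": 0.41})` returns `(["a"], ["b", "c"])`, since a value equal to the threshold is not greater than it.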

[0033] It should be noted that the method steps shown in S1 to S3 above can essentially be implemented in the form of computer programs or software functional modules.

[0034] Therefore, based on the same inventive concept, as shown in Figure 2, the present invention also provides a computer electronic device corresponding to the choroid plexus segmentation quality assessment method provided in the above embodiments, which includes a memory and a processor; the memory is used to store a computer program; the processor is configured to implement the choroid plexus segmentation quality assessment method as described above when executing the computer program.

[0035] Furthermore, the logical instructions in the aforementioned memory can be implemented as software functional units and, when sold or used as independent products, can be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention, in essence, or the part that contributes to the prior art, or a portion of the technical solution, can be embodied in the form of a software product. This computer software product is stored in a storage medium and includes several instructions to cause a computer device (which may be a personal computer, server, or network device, etc.) to execute all or part of the steps of the methods described in the various embodiments of the present invention.

[0036] Therefore, based on the same inventive concept, the present invention provides a computer-readable storage medium corresponding to the choroid plexus segmentation quality assessment method, wherein the storage medium stores a computer program which, when executed by a processor, implements the choroid plexus segmentation quality assessment method described above.

[0037] Therefore, based on the same inventive concept, the present invention provides a computer program product, including a computer program / instructions which, when executed by a processor, implement the aforementioned choroid plexus segmentation quality assessment method.

[0038] Specifically, in the computer-readable storage medium of the above three embodiments, the stored computer program is executed by a processor, which can perform the aforementioned steps S1 to S3.

[0039] It is understood that the aforementioned storage media may include random access memory (RAM) or non-volatile memory (NVM), such as at least one disk storage device. Furthermore, the storage media may also be various media capable of storing program code, such as USB flash drives, external hard drives, magnetic disks, or optical discs.

[0040] It is understood that the processors mentioned above can be general-purpose processors, including central processing units (CPUs), network processors (NPs), etc.; they can also be digital signal processors (DSPs), application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), or other programmable logic devices, discrete gate or transistor logic devices, or discrete hardware components.

[0041] It should also be noted that those skilled in the art will understand that, for the sake of convenience and brevity, the specific working process of the system described above can be referred to the corresponding process in the foregoing method embodiments, and will not be repeated here. In the embodiments provided in this application, the division of steps or modules in the system and method is merely a logical functional division, and there may be other division methods in actual implementation. For example, multiple modules or steps may be combined or integrated together, and a module or step may also be split.

[0042] A specific embodiment is used below to demonstrate, on a specific dataset, the detailed training and inference process and the technical effects of the choroid plexus segmentation quality assessment method shown in steps S1 to S3, so as to facilitate understanding of the essence of the present invention.

[0043] Example: In this embodiment, the inference framework of the choroid plexus segmentation quality assessment method follows steps S1 to S3, as shown in Figure 1. The specific training and inference process is explained in detail below.

[0044] This embodiment uses the two-stage choroid plexus segmentation model of Li et al. (Li, Jiaxin, et al. "Associations between the choroid plexus and tau in Alzheimer's disease using an active learning segmentation pipeline." Fluids and Barriers of the CNS 21.1 (2024): 56.) as the choroid plexus segmentation model in S1 above. To simulate various segmentation quality conditions, enrich the numerical distribution of Dice coefficients, and facilitate model training, the datasets used in this embodiment include:

Dataset (1): 20 whole-brain T1W MRI scans from the ADNI database; each image was annotated by 5 experts (average 4 years of experience), and the ground truth (GT) label was determined by voting;

Dataset (2): 20 low-quality scans obtained by applying Gaussian blurring to dataset (1), simulating low-resolution data;

Dataset (3): 20 scans from the training subset of the segmentation model, taken from the ADNI database, with GT annotated by expert LZ (11 years of MRI processing experience);

Dataset (4): 31 T1W MRI scans from the HCP Young, HCP Development and ABIDE databases, with GT annotated by expert JL (8 years of MRI processing experience);

Dataset (5): 72 T1W MRI scans from the HCP Young, HCP Aging and IXI databases, used as a test set for the quality assessment model, with GT annotated by expert JL (8 years of MRI processing experience).

[0045] Among them, datasets (1) and (2) are whole-brain images: the lateral ventricle segmentation module must first locate the lateral ventricles, after which the image is cropped to a size of (96, 96, 80) centered on them before being input into the choroid plexus segmentation model. Datasets (3) to (5) have already been cropped and can be fed directly into the choroid plexus segmentation model. Since this choroid plexus segmentation task has only two categories, foreground and background, the model directly predicts the probability that each voxel belongs to the foreground. If this probability is greater than 0.5, the voxel is judged as foreground (i.e., choroid plexus); otherwise it is judged as background.

[0046] The 91 samples from datasets (1) to (4) were used as the training set for the segmentation quality assessment model. They were fed into the U-Net segmentation model with Dropout layers, and the resulting Dice coefficient distribution is shown in Figure 4. For each sample, the prediction uncertainty was evaluated using two methods: Monte Carlo dropout (MCDO) and test-time augmentation (TTA). The specific process is as follows: 1. Assessing prediction uncertainty with the MCDO method: Note that MCDO requires the segmentation network to contain Dropout layers. This embodiment adopts the Dropout layer placement strategy that best balances uncertainty calibration and segmentation performance, as described by Jungo et al. (Jungo A, Balsiger F, Reyes M. Analyzing the quality and challenges of uncertainty estimations for brain tumor segmentation[J]. Frontiers in Neuroscience, 2020, 14: 282.). Dropout layers (p = 0.5) are added before the pooling operation at the deepest layer of the U-Net network in the two-stage choroid plexus segmentation model and after the first upsampling operation.

[0047] During inference, all Dropout layers in the U-Net network are kept enabled, and $T$ stochastic inference passes are performed for each input sample, so that each voxel $i$ obtains $T$ foreground probabilities. The final predicted probability $p_i$ is obtained by averaging them. If $p_i > 0.5$, voxel $i$ is classified as choroid plexus, and otherwise as background, which yields the segmentation result map. At the same time, based on the final predicted probability distribution derived from the $T$ inference results, the normalized entropy of each voxel $i$ is calculated as the uncertainty measure, as shown in formula (1).

[0048] $$H_i = -\frac{1}{\log C}\sum_{c=1}^{C} p_{i,c}\,\log p_{i,c} \tag{1}$$

[0049] where $C$ denotes the number of categories in the choroid plexus segmentation task (here $C = 2$: foreground and background), and $p_{i,c}$ denotes the probability that voxel $i$ is predicted as category $c$. In this task, the probability of predicting foreground is $p_i$ and the probability of predicting background is $1 - p_i$. The uncertainty map is composed of the normalized entropy of all voxels.
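The aggregation of the MCDO passes into a segmentation result map and a normalized-entropy uncertainty map can be sketched as below. This is a hedged illustration under stated assumptions: the function name `mc_aggregate` is hypothetical, the stochastic forward passes of the network are assumed to have been run already and stacked into one array, and a small `eps` is added inside the logarithm for numerical safety, which the source does not specify.

```python
import numpy as np


def mc_aggregate(prob_samples, threshold=0.5, eps=1e-12):
    """Aggregate repeated stochastic predictions.

    prob_samples: array of shape (T, D, H, W) holding the foreground
    probability of every voxel for each of the T inference passes.
    Returns (segmentation mask, normalized-entropy uncertainty map).
    """
    p_fg = prob_samples.mean(axis=0)            # final foreground probability
    seg = p_fg > threshold                      # foreground if probability > 0.5
    probs = np.stack([p_fg, 1.0 - p_fg])        # shape (C=2, D, H, W)
    n_classes = probs.shape[0]
    entropy = -(probs * np.log(probs + eps)).sum(axis=0) / np.log(n_classes)
    return seg, np.clip(entropy, 0.0, 1.0)      # normalized entropy lies in [0, 1]


# Hypothetical usage with simulated passes instead of a real network.
rng = np.random.default_rng(0)
samples = rng.uniform(0.0, 1.0, size=(8, 4, 4, 4))
seg_map, unc_map = mc_aggregate(samples)
print(seg_map.shape, float(unc_map.max()))
```

The entropy is largest where the averaged foreground probability is close to 0.5 and vanishes where the passes agree on 0 or 1.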

[0050] 2. Assessing prediction uncertainty with the TTA method: During TTA, T random augmentations are applied to each input image. Each augmentation applies an augmentation strategy comprising the following three operations: 1) horizontally flipping the input image with a preset probability; 2) random small-angle rotation, with the rotation parameter drawn from a preset range; 3) random translation applied with a preset probability, with the translation parameter drawn from a preset range.

[0051] The predictions for the T augmented samples are mapped back to the original image space by applying the inverse operations of their respective augmentation strategies. The probability distributions of each voxel in the T restored predictions are then averaged to obtain the final foreground prediction probability, and thresholding yields the final segmentation result map. Similarly, based on the final prediction probability distribution derived from the T segmentation results, the normalized entropy of each voxel is calculated according to formula (1) to obtain the voxel-level uncertainty map.
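The augment, predict, invert, and average loop of TTA can be sketched as follows. This is only an illustration under several assumptions: `tta_predict` and `predict_fn` are hypothetical names, the flip probability and maximum shift are placeholder values rather than the parameters of this embodiment, translation is approximated by a circular shift with `np.roll`, and the small-angle rotation is omitted because its exact inverse requires interpolation.

```python
import numpy as np


def tta_predict(image, predict_fn, n_aug, rng, flip_prob=0.5, max_shift=2):
    """Average predictions over randomly augmented copies of one image.

    predict_fn maps an image array to a foreground-probability array of the
    same shape. Each prediction is mapped back to the original image space
    by undoing the augmentations in reverse order before averaging.
    """
    axes = tuple(range(image.ndim))
    restored = []
    for _ in range(n_aug):
        flip = rng.random() < flip_prob
        shift = tuple(int(s) for s in rng.integers(-max_shift, max_shift + 1, size=image.ndim))
        aug = np.flip(image, axis=-1) if flip else image    # 1) horizontal flip
        aug = np.roll(aug, shift, axis=axes)                # 2) circular translation
        prob = predict_fn(aug)
        prob = np.roll(prob, tuple(-s for s in shift), axis=axes)  # undo translation
        if flip:
            prob = np.flip(prob, axis=-1)                   # undo flip
        restored.append(prob)
    return np.mean(restored, axis=0)


# Hypothetical usage: a voxel-wise stand-in replaces the segmentation network.
rng = np.random.default_rng(0)
volume = rng.uniform(0.0, 1.0, size=(6, 6, 6))
mean_prob = tta_predict(volume, lambda x: x, n_aug=5, rng=rng)
print(mean_prob.shape)
```

The averaged probability can then be thresholded and passed to the same normalized-entropy computation used for MCDO.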

[0052] With the uncertainty assessment methods described above, both the segmentation result and the uncertainty map are obtained at the output of the segmentation model. Figure 5 shows three examples; brighter colors indicate higher estimated uncertainty. Based on the uncertainty map, high-risk regions of the choroid plexus segmentation can be located. Observations show that high-uncertainty regions are concentrated at segmentation boundaries and in erroneous areas, which helps guide manual correction of the segmentation results.

[0053] A) Extraction of uncertainty feature X1: The segmentation results and uncertainty maps of the 91 samples were used to train the choroid plexus segmentation quality assessment model. For the uncertainty map, 16 features (denoted as X1) were designed and extracted. Some features were extracted from the entire uncertainty map, while others were extracted from the region remaining after removing the segmentation boundary (extending 2 voxels inward and outward from the foreground / background boundary, for a total width of 4 voxels). Their definitions are as follows:

Feature 1. The mean and logarithmic sum of the entire map;

Feature 2. The mean and logarithmic sum of all voxels remaining after masking the segmentation boundary (extending 2 voxels inward and outward from the boundary between the foreground and background, for a total width of 4 voxels);

Feature 3. The Euclidean distance between each voxel in the entire map and the nearest segmentation boundary is used as the weight, and the weight of voxels within the boundary region is reset to 0 (i.e., the segmentation boundary is masked). The weights of all voxels are summed, and the weight of each voxel is divided by this sum for normalization. The normalized weights are then applied to the voxel values. Finally, the mean and logarithmic sum of the weighted voxel values within the entire map are calculated;

Feature 4. The Euclidean distance between each voxel in the entire map and the nearest segmentation boundary is used as the weight, and the weight of voxels in the boundary region is reset to 0. The weights of all voxels are then incremented by 1 (i.e., boundary voxels are taken into account), and the weights are summed to obtain the total weight. The weight of each voxel is divided by the total weight for normalization. The normalized weights are then applied to the voxel values. Finally, the mean and logarithmic sum of the weighted voxel values in the entire map are calculated;

Feature 5. The foreground volume obtained from the segmentation of the full image is calculated, and the value of each voxel in the full map is divided by this foreground volume for volume normalization, which avoids the influence of the size of the segmented target; the mean and logarithmic sum of the volume-normalized full map are then calculated;

Feature 6. The mean values of the foreground region, background region, and boundary region obtained from the full image segmentation;

Feature 7. Patch-based method: a sliding window of size 10*10*10 (step size 1, not considering image boundaries) is slid across the entire map, all voxel values within each window are averaged, and the maximum and minimum of all window averages are selected;

Feature 8. Threshold-based method: the average foreground ratio of all 91 samples is first calculated, and a quantile is derived from it. The threshold is set to the voxel value corresponding to this quantile in the whole map, and the mean and sum are calculated over the regions of the whole map whose voxel values are greater than the threshold.
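A few of the uncertainty features above can be sketched with NumPy alone. This is a partial, hedged illustration rather than the 16-feature extractor of this embodiment: the function and key names are hypothetical, "logarithmic sum" is assumed here to mean the logarithm of the summed voxel values, and the boundary-based and distance-weighted features are left out because they need a morphological boundary mask and a distance transform.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def uncertainty_features(unc, seg, patch=10, eps=1e-12):
    """Compute a subset of whole-map uncertainty features.

    unc: 3D uncertainty map; seg: 3D binary segmentation of the same shape.
    """
    fg = seg.astype(bool)
    feats = {
        "mean": float(unc.mean()),                        # Feature 1, mean
        "log_sum": float(np.log(unc.sum() + eps)),        # Feature 1, assumed log of the sum
        "fg_mean": float(unc[fg].mean()) if fg.any() else 0.0,        # Feature 6, foreground
        "bg_mean": float(unc[~fg].mean()) if (~fg).any() else 0.0,    # Feature 6, background
    }
    volume = max(int(fg.sum()), 1)                        # guard against an empty foreground
    feats["vol_norm_sum"] = float((unc / volume).sum())   # Feature 5, volume-normalized sum
    # Feature 7: mean of every patch*patch*patch window (stride 1), then max and min.
    window_means = sliding_window_view(unc, (patch,) * 3).mean(axis=(-3, -2, -1))
    feats["patch_max"] = float(window_means.max())
    feats["patch_min"] = float(window_means.min())
    return feats


# Hypothetical usage on a small random volume.
rng = np.random.default_rng(0)
unc_map = rng.uniform(0.0, 1.0, size=(12, 12, 12))
seg_map = np.zeros((12, 12, 12), dtype=bool)
seg_map[4:8, 4:8, 4:8] = True
print(sorted(uncertainty_features(unc_map, seg_map)))
```

For a full-size (96, 96, 80) crop the window mean is better computed with an average-pooling or cumulative-sum routine, since the strided view visits every window element explicitly.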

[0054] B) Extraction of choroid plexus feature X2: For the choroid plexus segmentation image, 107 features (denoted as X2) are automatically extracted using the default parameters of PyRadiomics, as shown in Table 1. These features fall into the following 7 categories: Feature 1: 14 shape features that describe the geometry of the segmented region, such as volume, surface area, sphericity, and diameter; Feature 2: 18 first-order statistical features that describe the voxel intensity distribution, including mean, variance, energy, and entropy; Feature 3: 24 gray-level co-occurrence matrix (GLCM) features that describe the spatial relationship between voxel gray values, including contrast and correlation; Feature 4: 16 gray-level run length matrix (GLRLM) features that describe the distribution of runs of consecutive voxels with the same gray value, including long run emphasis and short run emphasis; Feature 5: 16 gray-level size zone matrix (GLSZM) features that describe the distribution of connected zones with the same gray value, including large area emphasis and small area emphasis; Feature 6: 14 gray-level dependence matrix (GLDM) features that describe voxel gray values and their dependencies, including dependence entropy and dependence variance; Feature 7: 5 neighboring gray-tone difference matrix (NGTDM) features that describe the difference between a voxel's gray value and the gray values of its neighbors, such as strength, contrast, and coarseness.

[0055] In this embodiment, to demonstrate the combined effect of uncertainty feature X1 and choroid plexus feature X2, random forest models were trained using X1, X2, and the combined feature X1+X2 to predict the Dice coefficient. During training, the random forest regressor from scikit-learn was used with the following parameters: n_estimators: 10, criterion: 'squared_error', max_depth: None. Five-fold cross-validation was employed, and the training was repeated five times, with the average value used as the final prediction result. The evaluation results are shown in Table 2. The evaluation metrics were mean squared error (MSE), coefficient of determination (R²), Pearson correlation coefficient (PCC), Spearman correlation coefficient, and the area under the risk-coverage curve (AURC), defined as follows: 1. MSE: Measures the average of the squared errors between the predicted and actual values. The smaller the value, the more accurate the prediction.

[0056] 2. R²: Reflects the degree to which the model fits the data, that is, the proportion of the variation in the dependent variable that the model can explain. It takes a value from 0 to 1, and the closer it is to 1, the better the fit.

[0057] 3. PCC: Measures the strength and direction of the linear correlation between two variables, ranging from -1 to 1. The larger the absolute value, the stronger the correlation.

[0058] 4. Spearman correlation coefficient: Measures the rank-order consistency between two variables, with values ranging from -1 to 1.

[0059] 5. AURC: Measures the relationship between the error rate and coverage of unselected samples at different screening thresholds. The smaller the value, the fewer the false predictions at a given coverage level, and the better the balance between risk and coverage.

[0060] Table 2. Results of 5-fold cross-validation of the segmentation quality assessment model

[0061] As shown in Table 2, compared with TTA, using MCDO-based uncertainty features and prediction result features predicts the Dice coefficient more accurately and achieves better segmentation quality assessment. The importance of the 123 features was evaluated using the Gini impurity reduction method of the random forest. Among them, the volume-normalized summation feature had the highest importance score (0.4717), indicating that it has the greatest impact on prediction performance.

[0062] Based on the above results, the optimal model selected in this embodiment is a random forest model that uses MCDO as the uncertainty assessment method and the combination of uncertainty feature X1 and choroid plexus feature X2 as input. This optimal model was retrained on the 91 samples to obtain the choroid plexus segmentation quality assessment model, which was then used for prediction on the 72-sample test set. The inference process during prediction is as follows: sample images from the test set are input into the two-stage choroid plexus segmentation model with Dropout layers, and the MCDO method is used during model inference, performing multiple inference passes to obtain a segmentation result map and an uncertainty map. Then, the choroid plexus feature X2 is extracted from the segmentation result map, and the uncertainty feature X1 is extracted from the uncertainty map. The extracted features X1 and X2 are input into the aforementioned choroid plexus segmentation quality assessment model trained on the basis of a random forest model to obtain the Dice coefficient of the segmentation result map. Finally, based on a set Dice threshold, the segmentation quality of the segmentation result map is judged by binary threshold classification.

[0063] In this embodiment, the test results are as follows: MSE = 0.0073, R² = 0.4856, PCC = 0.7697, Spearman correlation coefficient = 0.8109, AURC = 0.2718. To further filter out, based on the predicted Dice coefficient, the samples that need to be manually checked, a Dice threshold is set to divide the samples into two categories: accept (1) and reject (0). In this embodiment, for the current choroid plexus segmentation task, the Dice threshold is set to 0.60; the resulting classification accuracy is 0.889, and the confusion matrix is shown in Figure 6. Figure 7 shows examples of failed choroid plexus segmentations that were automatically detected by this method in this embodiment and require manual inspection. In (a), the actual Dice is 0.5817 and the predicted Dice is 0.5368; in (b), the actual Dice is 0.4441 and the predicted Dice is 0.4826.
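The accept / reject evaluation can be illustrated with the short sketch below. The function name and the layout of the returned counts are hypothetical; the only numbers taken from this embodiment are the 0.60 threshold and the two Dice pairs quoted for Figure 7, so the printed accuracy describes those two samples only, not the 72-sample test set.

```python
def accept_reject_report(actual_dice, predicted_dice, threshold=0.60):
    """Compare threshold decisions on predicted Dice against those on actual Dice.

    "Accept" (Dice > threshold) is treated as the positive class.
    Returns the confusion counts and the classification accuracy.
    """
    counts = {"TP": 0, "FP": 0, "FN": 0, "TN": 0}
    for actual, predicted in zip(actual_dice, predicted_dice):
        truly_ok = actual > threshold
        judged_ok = predicted > threshold
        if truly_ok and judged_ok:
            counts["TP"] += 1
        elif judged_ok:
            counts["FP"] += 1      # accepted although the true Dice is too low
        elif truly_ok:
            counts["FN"] += 1      # sent to manual review unnecessarily
        else:
            counts["TN"] += 1      # correctly flagged for manual review
    total = sum(counts.values())
    accuracy = (counts["TP"] + counts["TN"]) / total if total else 0.0
    return counts, accuracy


# The two failed segmentations quoted for Figure 7 (a) and (b).
counts, accuracy = accept_reject_report([0.5817, 0.4441], [0.5368, 0.4826])
print(counts, accuracy)
```

Both examples fall below the threshold in their actual and predicted Dice, so both count as correctly rejected samples.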

[0064] The embodiments described above are merely some preferred implementations of the present invention and are not intended to limit the invention. Those skilled in the art can make various changes and modifications without departing from the spirit and scope of the invention. Therefore, all technical solutions obtained through equivalent substitution or transformation fall within the protection scope of the present invention.

Claims

1. A method for evaluating the quality of choroid plexus segmentation, characterized by comprising: S1. using the image to be segmented as the input image of a choroid plexus segmentation model, and, during model inference, using Monte Carlo dropout (MCDO) and / or test-time augmentation (TTA) to evaluate the uncertainty of the choroid plexus segmentation results, thereby obtaining a segmentation result map and an uncertainty map; S2. extracting choroid plexus features from the segmentation result map, extracting uncertainty features from the uncertainty map, and inputting the extracted choroid plexus features and uncertainty features together into a choroid plexus segmentation quality assessment model trained on the basis of a machine learning model to obtain the Dice coefficient of the segmentation result map; S3. judging the segmentation quality of the segmentation result map by a threshold method based on the Dice coefficient of the segmentation result map.

2. The method for evaluating the quality of choroid plexus segmentation as described in claim 1, characterized in that the choroid plexus segmentation model uses a U-Net network, and Dropout layers with a controllable enabled state are added to the U-Net network before the deepest pooling operation and after the first upsampling operation.

3. The method for evaluating the quality of choroid plexus segmentation as described in claim 2, characterized in that, in S1, when the Monte Carlo dropout (MCDO) method is used to evaluate the uncertainty of the choroid plexus segmentation results during model inference, the Dropout layers are kept enabled and multiple model inferences are performed; the choroid plexus segmentation results obtained in all inference rounds are averaged to obtain the segmentation result map; at the same time, the normalized entropy of each voxel is calculated from the choroid plexus segmentation results obtained in all inference rounds and used as the uncertainty metric to obtain the uncertainty map.

4. The method for evaluating the quality of choroid plexus segmentation as described in claim 2, characterized in that, in step S1, when test-time augmentation (TTA) is used to evaluate the uncertainty of the choroid plexus segmentation results during model inference, the Dropout layers are kept disabled, and multiple model inferences are performed by introducing augmentation strategies including random horizontal flipping, random rotation, and random translation; the segmentation result obtained in each inference round is subjected to the inverse operation of the introduced augmentation strategy and used as the choroid plexus segmentation result; the choroid plexus segmentation results obtained in all inference rounds are averaged to obtain the segmentation result map; at the same time, the normalized entropy of each voxel is calculated from the choroid plexus segmentation results obtained in all inference rounds and used as the uncertainty metric to obtain the uncertainty map.

5. The method for evaluating the quality of choroid plexus segmentation as described in claim 1, characterized in that, in step S2, preferably, the choroid plexus features extracted from the segmentation result map include shape features, first-order statistical features, gray-level co-occurrence matrix features, gray-level run length matrix features, gray-level size zone matrix features, gray-level dependence matrix features, and neighboring gray-tone difference matrix features; preferably, a total of 16 uncertainty features are extracted from the uncertainty map, namely: a) the mean and logarithmic sum of all voxels in the uncertainty map; b) the mean and logarithmic sum of all voxels in the uncertainty map excluding the segmentation boundary; c) the mean and logarithmic sum of all voxels in the uncertainty map, weighted by their normalized distances to the segmentation boundary; d) the mean and logarithmic sum of all voxels in the uncertainty map except for the segmentation boundary, weighted by their normalized distances to the segmentation boundary; e) the sum of all voxel values in the uncertainty map after volume normalization, in which each voxel value is divided by the foreground volume; f) the mean values of the foreground region, background region, and boundary region in the uncertainty map; g) the maximum and minimum values of the three-dimensional average pooling result of the uncertainty map; h) the mean and sum of the regions in the uncertainty map whose voxel values are greater than an uncertainty threshold, where the uncertainty threshold is the voxel value corresponding to a quantile of the uncertainty map, the quantile being defined using the average foreground proportion.

6. The method for evaluating the quality of choroid plexus segmentation as described in claim 1, characterized in that the choroid plexus segmentation quality assessment model is trained on a labeled dataset on the basis of a random forest model; the input of each training sample consists of the choroid plexus features and uncertainty features extracted for a single image to be segmented; the sample label is the Dice coefficient calculated between the segmentation result map produced by the choroid plexus segmentation model and the ground-truth annotation of the choroid plexus.

7. The method for evaluating the quality of choroid plexus segmentation as described in claim 1, characterized in that, In step S3, when judging the segmentation quality of the segmentation result image by the threshold method, it is determined whether the Dice coefficient of the segmentation result image is greater than the preset coefficient threshold. If it is greater, the segmentation result image is considered to be qualified; otherwise, the segmentation result image is considered to be unqualified and is sent to the manual inspection process.

8. A computer program product comprising a computer program / instructions, characterized in that, when the computer program / instructions are executed by a processor, the choroid plexus segmentation quality assessment method as described in any one of claims 1 to 7 is implemented.

9. A computer-readable storage medium, characterized in that the storage medium stores a computer program which, when executed by a processor, implements the choroid plexus segmentation quality assessment method as described in any one of claims 1 to 7.

10. A computer electronic device, characterized by comprising a memory and a processor; the memory is used to store a computer program; the processor is configured to implement the choroid plexus segmentation quality assessment method as described in any one of claims 1 to 7 when executing the computer program.