Model training method, electronic device, and storage medium
By using a two-stage segmentation network model and data augmentation technology, the problems of low segmentation accuracy and heavy workload for doctors in prostate cancer diagnosis have been solved, achieving accurate and automated segmentation of areas such as the prostate and improving image processing efficiency.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- SHANGHAI UNITED IMAGING INTELLIGENCE CO LTD
- Filing Date
- 2022-11-23
- Publication Date
- 2026-06-26
AI Technical Summary
Existing medical image segmentation methods for prostate cancer diagnosis suffer from low accuracy, inability to achieve precise regional localization, and a heavy workload for doctors, especially in multimodal MRI diagnosis which requires a lot of repetitive work.
A two-stage segmentation network model is adopted. First, the region of interest is initially segmented through the first segmentation network. Then, attention-based data augmentation and pre-defined annotation information are used to update the parameters of the first and second segmentation networks. Finally, an image segmentation model is constructed to achieve accurate segmentation of the region of interest and automatic localization of sub-regions.
It improves the accuracy and efficiency of medical image segmentation, reduces the workload of doctors, and enables automated zoning and localization of areas such as the prostate, adapting to various changes in prostate morphology.
Figure CN116309626B_ABST
Description
Technical Field
[0001] This application relates to the field of image processing technology, and in particular to model training methods, image segmentation methods, electronic devices and storage media.
Background Technology
[0002] In medical research and practice, it is common to measure the shape, boundaries, and volume of human body parts in order to obtain pathological information about those parts, thereby assisting doctors in making accurate diagnoses.
[0003] The conventional clinical diagnostic method involves doctors visually reviewing and comparing multiple medical images to obtain the segmentation results of human body parts. However, doctors are subjective in manually marking boundary areas and often have to rely on their own experience to outline the contours. This process is lengthy, time-consuming, and inefficient, resulting in a heavy workload for doctors.
[0004] Therefore, there is an urgent need to provide model training methods, image segmentation methods, electronic devices, and storage media to improve existing technologies.
Summary of the Invention
[0005] The purpose of this application is to provide a model training method, an image segmentation method, an electronic device and a storage medium that automatically provides the partitioning and localization results of medical images, with high image processing efficiency and reduced workload for doctors.
[0006] The objective of this application is achieved through the following technical solution:
[0007] Firstly, this application provides a model training method, the method comprising:
[0008] A sample medical image is acquired and input into a first segmentation network for first-stage segmentation to obtain a first segmentation result of the sample medical image for the region of interest.
[0009] The first segmentation result and the sample medical image are respectively input into the second segmentation network to perform the second stage segmentation, thereby obtaining the second segmentation result of the sample medical image for at least one sub-region of the region of interest;
[0010] The parameters of the first segmentation network are updated using the first segmentation result and the preset annotation information of the sample medical image. The parameters of the second segmentation network are updated using the second segmentation result and the preset annotation information. The preset annotation information includes region annotation information of at least one sub-region.
[0011] An image segmentation model is constructed based on the updated first segmentation network and the second segmentation network.
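The data flow of the four steps above can be sketched as follows. This is only an illustrative skeleton: `ToyNet`, `training_step` and `build_segmentation_model` are hypothetical names that do not appear in the application, and the stand-in network merely records calls instead of performing real segmentation or parameter updates.

```python
class ToyNet:
    """Hypothetical stand-in for a segmentation network; it only records calls."""

    def __init__(self, name):
        self.name = name
        self.log = []

    def forward(self, *inputs):
        self.log.append("forward")
        return f"{self.name}_result"

    def update(self, result, annotation):
        # A real network would compute a loss against the annotation
        # and adjust its parameters here.
        self.log.append("update")


def training_step(image, annotation, first_net, second_net):
    """One training iteration over a single sample medical image."""
    first_result = first_net.forward(image)                  # first stage: region of interest
    second_result = second_net.forward(first_result, image)  # second stage: sub-regions
    first_net.update(first_result, annotation)
    second_net.update(second_result, annotation)
    return first_result, second_result


def build_segmentation_model(first_net, second_net):
    """Chain the two updated networks into one callable model."""
    def segment(image):
        roi = first_net.forward(image)
        return second_net.forward(roi, image)
    return segment


first_net, second_net = ToyNet("first"), ToyNet("second")
results = training_step("sample_image", "sub_region_labels", first_net, second_net)
model = build_segmentation_model(first_net, second_net)
```

The point of the sketch is the ordering: the second network always receives both the original image and the first network's output, and each network is updated from its own result.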
[0012] In some optional embodiments, the step of inputting the first segmentation result and the sample medical image into a second segmentation network for a second-stage segmentation to obtain a second segmentation result of the sample medical image for at least one sub-region of the region of interest includes:
[0013] Attention-based data augmentation is performed on the sample medical images to obtain augmented images corresponding to the sample medical images;
[0014] The first segmentation result and the enhanced image are respectively input into the second segmentation network for second-stage segmentation to obtain the second segmentation result of the sample medical image for at least one sub-region of the region of interest.
[0015] In some optional embodiments, the attention-based data augmentation of the sample medical image to obtain an augmented image corresponding to the sample medical image includes:
[0016] The sample medical image is input into a preset abnormal region segmentation model to obtain the abnormal region segmentation result of the sample medical image for the preset abnormal region;
[0017] Based on the abnormal region segmentation results, an abnormal region probability map with the same size as the sample medical image is generated.
[0018] Based on the probability map of the abnormal region and the preset intensity of data augmentation, the sample medical image is augmented to obtain the augmented image corresponding to the sample medical image.
[0019] In some optional embodiments, the second-stage segmentation process includes:
[0020] The feature map corresponding to the sample medical image is output using the encoding module of the second segmentation network;
[0021] The feature map is input into the segmentation decoding module of the second segmentation network to obtain a second segmentation result of the sample medical image for at least one sub-region of the region of interest.
[0022] In some optional embodiments, the method further includes:
[0023] The feature map is input into the classification module of the second segmentation network to obtain a binary classification result of the sample medical image for the region of interest. The binary classification result is used to indicate whether there are abnormal regions in the region of interest.
[0024] In some optional embodiments, updating the parameters of the second segmentation network using the second segmentation result and the preset annotation information includes:
[0025] Using the second segmentation result and the preset annotation information of the sample medical image, the segmentation loss of the second segmentation network is obtained;
[0026] Using the binary classification results and the classification annotation information of the sample medical images, the classification loss of the second segmentation network is obtained;
[0027] The parameters of the second segmentation network are updated based on the segmentation loss and the classification loss.
[0028] In some optional embodiments, updating the parameters of the second segmentation network based on the segmentation loss and the classification loss includes:
[0029] Obtain the segmentation weights corresponding to the segmentation loss and the classification weights corresponding to the classification loss, wherein the sum of the segmentation weights and the classification weights is a preset value;
[0030] The parameters of the second segmentation network are updated based on the segmentation loss, the segmentation weight, the classification loss, and the classification weight.
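As a minimal sketch of the weighting described above, the two losses can be combined so that their weights always sum to the preset value. The function name `combined_loss`, the default preset value of 1.0, and the restriction to non-negative weights are assumptions made for illustration; the application only states that the sum of the weights is a preset value.

```python
def combined_loss(seg_loss, cls_loss, seg_weight, preset_total=1.0):
    """Weighted sum of the segmentation loss and the classification loss.

    preset_total is the fixed value the two weights must sum to
    (1.0 is an assumption; the source only says "a preset value").
    """
    if not 0.0 <= seg_weight <= preset_total:
        raise ValueError("seg_weight must lie in [0, preset_total]")
    cls_weight = preset_total - seg_weight  # the weights sum to the preset value
    return seg_weight * seg_loss + cls_weight * cls_loss


total = combined_loss(seg_loss=0.5, cls_loss=0.25, seg_weight=0.5)
```

Fixing the sum means a single number controls the trade-off: raising the segmentation weight necessarily lowers the classification weight.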
[0031] Secondly, this application provides an image segmentation method, characterized in that the method includes:
[0032] The image to be processed is input into an image segmentation model to obtain image segmentation results of the image to be processed for at least one sub-region of the region of interest;
[0033] The image segmentation model is trained using any of the above-mentioned model training methods.
[0034] Thirdly, this application provides an electronic device, characterized in that the electronic device includes a memory and a processor, the memory storing a computer program, and the processor executing the computer program to implement the steps of any of the above-described model training methods or the steps of the above-described image segmentation methods.
[0035] Fourthly, this application provides a computer-readable storage medium storing a computer program that, when executed by a processor, implements the steps of any of the above-described model training methods or the steps of the above-described image segmentation methods.
[0036] With the aforementioned model training method, image segmentation method, electronic device, and storage medium, the model training process first inputs the sample medical image into the first segmentation network to obtain the first segmentation result for the region of interest (ROI). The sample medical image and the first segmentation result are then input into the second segmentation network to obtain the second segmentation result for at least one sub-region of the ROI. Next, the first segmentation network is trained using the sub-region annotation information and the first segmentation result, and the second segmentation network is trained using the sub-region annotation information and the second segmentation result. The trained first and second segmentation networks are used to construct an image segmentation model. With this image segmentation model, the entire ROI of a medical image can be segmented first, and sub-regions can then be segmented within the ROI. In other words, the image segmentation model automatically provides the partitioning and localization results of medical images, which raises image processing efficiency and reduces the workload of doctors.
Attached Figure Description
[0037] The present application will be further described below with reference to the accompanying drawings and embodiments.
[0038] Figure 1 is a flowchart illustrating a model training method provided in an embodiment of this application.
[0039] Figure 2 is a schematic diagram illustrating the principle of a model training method provided in an embodiment of this application.
[0040] Figure 3 is a schematic diagram illustrating the principle of a first-stage segmentation provided in an embodiment of this application.
[0041] Figure 4 is a schematic diagram of a second-stage segmentation process provided in an embodiment of this application.
[0042] Figure 5 is a schematic diagram illustrating the principle of the training process of a second segmentation network provided in an embodiment of this application.
[0043] Figure 6 is a schematic diagram of a data augmentation process provided in an embodiment of this application.
[0044] Figure 7 is a schematic diagram comparing a sample medical image with an abnormal region probability map provided in an embodiment of this application.
[0045] Figure 8 is a schematic diagram of a histogram statistical result provided in an embodiment of this application.
[0046] Figure 9 is a schematic diagram showing the comparison results of a sample medical image before and after data enhancement, provided in an embodiment of this application.
[0047] Figure 10 is a schematic diagram of another histogram statistical result provided in an embodiment of this application.
[0048] Figure 11 is a schematic diagram showing the comparison results of another sample medical image before and after data enhancement, provided in an embodiment of this application.
[0049] Figure 12 is a schematic diagram of a multi-task learning process provided in an embodiment of this application.
[0050] Figure 13 is a schematic diagram of a process for updating a second segmentation network provided in an embodiment of this application.
[0051] Figure 14 is a schematic diagram of another process for updating the second segmentation network provided in an embodiment of this application.
[0052] Figure 15 is a structural block diagram of an electronic device provided in an embodiment of this application.
[0053] Figure 16 is a structural block diagram of a program product provided in an embodiment of this application.
Detailed Implementation
[0054] The present application will now be further described in conjunction with the accompanying drawings and specific embodiments. It should be noted that, without conflict, the various embodiments or technical features described below can be arbitrarily combined to form new embodiments.
[0055] In the embodiments of this application, "at least one" refers to one or more, and "more than one" refers to two or more. "And / or" describes the relationship between related objects, indicating that three relationships can exist. For example, A and / or B can represent: A alone, A and B simultaneously, or B alone, where A and B can be singular or plural. The character " / " generally indicates that the preceding and following related objects are in an "or" relationship. "At least one of the following" or similar expressions refer to any combination of these items, including any combination of single or plural items. For example, at least one of a, b, or c can represent: a, b, c, a and b, a and c, b and c, a and b and c, where a, b, and c can be single or multiple. It is worth noting that "at least one" can also be interpreted as "one or more".
[0056] It should also be noted that, in the embodiments of this application, the words "exemplary" or "for example" are used to indicate that they are examples, illustrations, or descriptions. Any implementation or design scheme described as "exemplary" or "for example" in the embodiments of this application should not be construed as being more preferred or advantageous than other implementations or design schemes. Specifically, the use of words such as "exemplary" or "for example" is intended to present the relevant concepts in a specific manner.
[0057] A fundamental prerequisite for successful radiotherapy in treating cancer is confirming the location and size of the tumor while protecting vital organs surrounding the lesion. Therefore, accurately and efficiently extracting the contours of key organs and calculating the total tumor volume are crucial steps in radiotherapy and surgical navigation, and are of significant research importance. When lesions occur around the prostate, the prostate must be delineated separately and protected to avoid radiation exposure. Therefore, accurate prostate delineation is of paramount importance.
[0058] Prostate segmentation presents particular challenges. Firstly, the rectoprostate and bladder-prostate junctions have virtually no boundary and very little difference in grayscale. Secondly, the contents of the bladder and rectum change from one treatment procedure to another, thus their shape and size also change. Finally, the shape of the prostate itself is influenced by the bladder and rectum.
[0059] Magnetic resonance imaging (MRI) is increasingly valued by clinicians as a staging and diagnostic method for prostate cancer. However, during diagnosis, doctors need to repeatedly review and compare different modalities of prostate MRI using a picture archiving and communication system (PACS). The resulting low diagnostic efficiency and heavy workload are major pain points for doctors in the field of prostate diagnosis.
[0060] A PACS (Picture Archiving and Communication System) is a system used in hospital radiology departments. Its primary task is to digitize and store, through various interfaces (such as the DICOM protocol), the massive numbers of medical images (including MRI, CT, PET, X-ray, PET-CT, and PET-MR images) generated daily. When needed, these images can be quickly retrieved and used under appropriate authorization, and the system also provides auxiliary diagnostic and management functions. A PACS plays a crucial role in data transmission between imaging devices and in the organization of data storage.
[0061] MRI (magnetic resonance imaging), also known as spin imaging, is a diagnostic technique that exploits the nuclear magnetic resonance of certain atomic nuclei in human tissue: the acquired radio-frequency signals are processed by a computer to reconstruct an image of a given slice of the human body.
[0062] CT (Computed Tomography) is a computed tomography scan; PET (Positron Emission Tomography) is a positron emission tomography scan. PET-CT integrates PET and CT, with PET providing detailed functional and metabolic molecular information about lesions, while CT provides precise anatomical localization of lesions. A single imaging session can obtain tomographic images of the whole body from all directions, featuring sensitivity, accuracy, specificity, and precise localization. It allows for a clear understanding of the overall condition of the body, achieving the goal of early detection of lesions and diagnosis of diseases. PET-MR is a large-scale functional metabolic and molecular imaging diagnostic device that combines the powerful features of PET and MR imaging. It has the examination functions of both PET and MR, achieving maximum complementary advantages.
[0063] DICOM (Digital Imaging and Communications in Medicine) is an international standard (ISO 12052) for medical images and related information. It defines a medical image format that meets clinical needs for data exchange. DICOM is widely used in radiology, cardiovascular imaging, and diagnostic radiology equipment (X-ray, CT, MRI, ultrasound, etc.), and is increasingly being applied in other medical fields such as ophthalmology and dentistry. Among tens of thousands of medical imaging devices in use, DICOM is one of the most widely deployed medical information standards.
[0064] In addition, when diagnosing prostate cancer, doctors also need to assign PI-RADS scores based on multimodal MRI, which often requires a lot of repetitive work and further increases their workload.
[0065] PI-RADS (Prostate Imaging Reporting and Data System) is a set of guidelines that helps clinicians accurately classify the malignancy of prostate cancer.
[0066] PI-RADS provides a scoring method for assessing the likelihood of clinically significant prostate cancer based on the combined findings of prostate T2WI (T2-weighted imaging), DWI (diffusion-weighted imaging), and DCE (dynamic contrast enhancement). Specifically:
[0067] 1 point: Very low, extremely unlikely to have prostate cancer;
[0068] 2 points: Low, prostate cancer is unlikely;
[0069] 3 points: Moderate, prostate cancer suspected;
[0070] 4 points: High, prostate cancer may be present;
[0071] 5 points: Very high, prostate cancer is highly likely.
[0072] With the development of deep learning and other artificial intelligence algorithms in recent years, breakthroughs have been achieved in the field of medical image processing. Deep learning has been gradually applied to some existing disease diagnoses. In the field of prostate cancer analysis, some deep learning-based prostate segmentation algorithms have emerged. However, existing prostate segmentation still has the following shortcomings:
[0073] 1. Segmentation using traditional image processing methods is not very accurate;
[0074] 2. Precise partition localization cannot be achieved, so fully automatic operation is not possible: doctors are presented only with a preliminary segmentation result in which the individual zones of the prostate are not segmented, which is a "semi-automatic" result;
[0075] 3. The standard template and registration method cannot adapt to the prostate morphology under various conditions (such as thinning of the peripheral zone, invasion of cancerous lesions, etc.).
[0076] The following section introduces a model training method, image segmentation method, electronic device, and storage medium that automatically provides the partitioning and localization results of medical images, with high image processing efficiency and reduced workload for doctors.
[0077] See Figure 1 and Figure 2. Figure 1 is a flowchart illustrating a model training method provided in an embodiment of this application. Figure 2 is a schematic diagram illustrating the principle of a model training method provided in an embodiment of this application.
[0078] The method includes steps S101 to S104.
[0079] Step S101: Acquire a sample medical image and input the sample medical image into a first segmentation network for first-stage segmentation to obtain the first segmentation result of the sample medical image for the region of interest.
[0080] This application does not limit the sample medical images, which may include at least one of the following: MRI images, CT images, PET images, X-ray images, PET-CT images, and PET-MR images. Sample medical images can be retrieved from a PACS.
[0081] In a specific application, the sample medical images may include MRI images, which may include at least one of the following: T1-weighted imaging (T1WI), T2-weighted imaging (T2WI), and PD-weighted imaging (PDWI).
[0082] The embodiments of this application do not limit the first segmentation network and the second segmentation network mentioned below. The first segmentation network and the second segmentation network can be the same segmentation network or different segmentation networks.
[0083] The segmentation network can be a neural network, which can include at least one of the following: FCN, Unet, and Segnet.
[0084] This application does not limit the region of interest, which can correspond to body parts, organs, tissues, lesions, etc. In some embodiments, the region of interest can be a region connected to adjacent organs or the entire region corresponding to a single organ. For example, the region of interest can be the region where the rectum and prostate are connected, the region where the bladder and prostate are connected, or the region corresponding to the prostate itself.
[0085] In a specific application, the sample medical image includes a T2W image, and the region of interest is the region corresponding to the prostate. The sample medical image is input into a first segmentation network for first-stage segmentation to obtain the first segmentation result of the sample medical image for the region of interest. The first segmentation result is used to locate the region of interest.
[0086] The first stage of segmentation can be performed using image binarization, and the first segmentation result can include a binary image (binary mask) of the region of interest.
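One way to picture the binarization mentioned above is thresholding a per-pixel foreground score. The sketch below assumes the first segmentation network emits a probability map in [0, 1]; the function name `binarize` and the threshold of 0.5 are illustrative assumptions, not values given in the application.

```python
import numpy as np


def binarize(prob_map, threshold=0.5):
    """Turn a per-pixel foreground probability map into a binary mask (0 or 1)."""
    prob_map = np.asarray(prob_map, dtype=np.float32)
    return (prob_map >= threshold).astype(np.uint8)


# Hypothetical 2x2 network output, used only to show the thresholding.
probabilities = np.array([[0.1, 0.7],
                          [0.5, 0.2]])
gland_mask = binarize(probabilities)
```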
[0087] See Figure 3, which is a schematic diagram illustrating the principle of a first-stage segmentation provided in an embodiment of this application.
[0088] In Figure 3, the input to the first segmentation network is the T2W image of the prostate (prostate T2W), and the output of the first segmentation network is the binary mask of the prostate (i.e., the gland mask, the area circled in white in Figure 3).
[0089] Step S102: Input the first segmentation result and the sample medical image into the second segmentation network to perform the second stage segmentation, and obtain the second segmentation result of the sample medical image for at least one sub-region of the region of interest.
[0090] In some implementations, the process of inputting the sample medical image and the first segmentation result into the second segmentation network can be as follows:
[0091] The region corresponding to the bounding box of the binary mask in the sample medical image is cropped out from the sample medical image and used as the input to the second segmentation network.
[0092] This method requires less computation; however, because the original image (sample medical image) is cropped, the size changes, which interferes with the segmentation process of the second segmentation network, resulting in lower partitioning accuracy.
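The cropping variant described above can be sketched with NumPy as follows. The function name `crop_to_mask_bbox` and the toy 6x6 arrays are hypothetical; the sketch handles a single 2D slice and assumes the mask is non-empty.

```python
import numpy as np


def crop_to_mask_bbox(image, mask):
    """Crop a 2D image to the bounding box of the non-zero mask pixels."""
    rows = np.any(mask, axis=1)  # rows that contain at least one mask pixel
    cols = np.any(mask, axis=0)  # columns that contain at least one mask pixel
    if not rows.any():
        raise ValueError("mask is empty; no bounding box exists")
    r0, r1 = np.where(rows)[0][[0, -1]]
    c0, c1 = np.where(cols)[0][[0, -1]]
    return image[r0:r1 + 1, c0:c1 + 1]


image = np.arange(36, dtype=np.float32).reshape(6, 6)
mask = np.zeros((6, 6), dtype=bool)
mask[1:4, 2:5] = True  # hypothetical region-of-interest mask
patch = crop_to_mask_bbox(image, mask)
```

Note that the output shape depends on the mask, which is exactly the size change the paragraph above identifies as a drawback of this variant.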
[0093] In other embodiments, the process of inputting the sample medical image and the first segmentation result into the second segmentation network can be as follows:
[0094] The binary mask output by the first segmentation network is used as the first segmentation result. The sample medical image and the first segmentation result are input into the second segmentation network in a dual-channel input manner.
[0095] This approach, without changing the original image size, adds a binary mask output by the first segmentation network as an additional channel to provide attention information, which helps the second segmentation network to better learn the image features within the region of interest, further improving the accuracy of segmentation.
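A minimal sketch of the dual-channel input, assuming a channel-first layout (2, H, W); the application does not specify the layout, and the name `make_dual_channel_input` is hypothetical.

```python
import numpy as np


def make_dual_channel_input(image, mask):
    """Stack the image and the binary mask into one (2, H, W) array.

    Channel 0 holds the unmodified image; channel 1 holds the mask that
    serves as attention information for the second network.
    """
    if image.shape != mask.shape:
        raise ValueError("image and mask must have the same shape")
    return np.stack([image.astype(np.float32), mask.astype(np.float32)], axis=0)


t2w = np.linspace(-1.0, 1.0, 16, dtype=np.float32).reshape(4, 4)
gland = np.zeros((4, 4), dtype=np.uint8)
gland[1:3, 1:3] = 1  # hypothetical first-stage binary mask
net_input = make_dual_channel_input(t2w, gland)
```

Unlike cropping, the spatial size of the input is unchanged, so the second network sees the full image together with a hint about where the region of interest lies.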
[0096] In some implementations, when the region of interest is a region where adjacent organs are connected, the sub-regions of the region of interest can be different organs; when the region of interest is a region corresponding to an organ, the sub-regions of the region of interest can be a part of that organ.
[0097] In other words, the second-stage segmentation can segment two adjacent organs or tissues, or it can segment adjacent regions of an organ or tissue.
[0098] For example, the region of interest is the area where the prostate and bladder are connected, and the corresponding sub-regions are the prostate region and the bladder region, respectively.
[0099] As another example, the region of interest is the prostate, and the corresponding sub-regions are the central region of the gland and the peripheral zone, respectively.
[0100] See Figure 4 and Figure 5. Figure 4 is a schematic diagram of a second-stage segmentation process provided in an embodiment of this application. Figure 5 is a schematic diagram illustrating the principle of the training process of a second segmentation network provided in an embodiment of this application.
[0101] In some implementations, the step of inputting the first segmentation result and the sample medical image into a second segmentation network for a second stage of segmentation to obtain a second segmentation result of the sample medical image for at least one sub-region of the region of interest (i.e., step S102) may include steps S201 and S202.
[0102] Step S201: Perform attention-based data augmentation on the sample medical image to obtain the augmented image corresponding to the sample medical image.
[0103] See Figure 6, which is a schematic diagram of a data augmentation process provided in an embodiment of this application.
[0104] Step S201 may include steps S301 to S303.
[0105] Step S301: Input the sample medical image into a preset abnormal region segmentation model to obtain the abnormal region segmentation result of the sample medical image for the preset abnormal region. The abnormal region segmentation result is used to indicate the location information of the preset abnormal region.
[0106] The embodiments of this application do not limit the preset abnormal area. The abnormality type corresponding to the preset abnormal area includes, but is not limited to, one or more of the following: inflammation, mass, congestion, hyperplasia and tumor.
[0107] The embodiments of this application do not limit the preset abnormal region segmentation model. The abnormal region segmentation model can adopt a neural network, which includes, but is not limited to, one or more of the following: FCN, Unet and Segnet.
[0108] Step S302: Based on the abnormal region segmentation results, generate an abnormal region probability map with the same size as the sample medical image.
[0109] In some implementations, the value of each pixel in the abnormal region probability map represents the probability that the pixel belongs to an abnormal region. Generally, each pixel value in the probability map lies in the range [0, 1]. Areas with high probability values appear bright, while areas with lower probability values appear darker.
[0110] See Figure 7, which is a schematic diagram comparing a sample medical image and an abnormal region probability map provided in an embodiment of this application. In a specific application, the sample medical image is a T2W image of the prostate, and the abnormal region is a prostate cancer lesion.
[0111] In Figure 7, the left side shows a T2W image of the prostate, and the right side shows the abnormal region probability map.
[0112] As can be seen, the size of the abnormal region probability map is consistent with that of the prostate T2W image, and at the position highlighted in the abnormal region probability map (the crosshair intersection in Figure 7), the image features of the prostate T2W image differ significantly from those of the normal contralateral side (i.e., the right half of the prostate T2W image).
[0113] Step S303: Based on the abnormal region probability map and the preset intensity of data augmentation, perform data augmentation on the sample medical image to obtain the augmented image corresponding to the sample medical image.
[0114] In some implementations, data augmentation can be achieved through image normalization methods.
[0115] Specifically, image normalization is the process of scaling image data proportionally to make it fall into a small, specific range. For example, if the value of each pixel in the input image is between 0 and 255, after image normalization, the range of pixel values in the image is [-1, 1].
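The 0 to 255 example above corresponds to a linear rescaling, sketched below. Linear min-max scaling is an assumption here, since the application does not fix a particular normalization formula, and the name `normalize_to_unit_range` is hypothetical.

```python
import numpy as np


def normalize_to_unit_range(image, in_min=0.0, in_max=255.0):
    """Linearly map pixel values from [in_min, in_max] to [-1, 1]."""
    image = np.asarray(image, dtype=np.float32)
    return 2.0 * (image - in_min) / (in_max - in_min) - 1.0


pixels = np.array([0.0, 127.5, 255.0])
normalized = normalize_to_unit_range(pixels)
```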
[0116] In a specific application, image normalization is used as the data augmentation method. The data augmentation process for the sample medical images is as follows:
[0117] $I_{T2W\_aug} = \mathrm{normalization}(I_{T2W}) \times (P_{lesion} \times \alpha \times \mathrm{rand\_int} + 1)$
[0118] Here, $I_{T2W\_aug}$ is the enhanced image corresponding to the T2W image, $\mathrm{normalization}()$ is the image normalization method, $I_{T2W}$ is the T2W image of the prostate, $P_{lesion}$ is the abnormal region probability map, $\alpha$ is the preset intensity of data augmentation, and $\mathrm{rand\_int}$ is a random number in the range [0, 1].
[0119] The introduction of random numbers ensures that the model's input is perturbed to a certain extent during each training iteration, which helps improve the model's robustness.
[0120] The embodiments of this application do not limit α. The larger the absolute value of α, the greater the change in the data after data augmentation. α can be a positive number, such as 0.1, 0.3 or 0.5, or a negative number, such as -0.5, -0.3 or -0.1.
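The augmentation formula of paragraph [0117] can be sketched in NumPy as follows. Several details are assumptions rather than statements from the application: the function name `attention_augment`, drawing one scalar random number per call (rather than per pixel), and the use of `Generator.random()`, which samples from the half-open interval [0, 1) instead of the closed interval [0, 1].

```python
import numpy as np


def attention_augment(image_norm, lesion_prob, alpha, rng=None):
    """Return image_norm * (lesion_prob * alpha * r + 1) for a random r.

    image_norm  : already-normalized image, e.g. values in [-1, 1]
    lesion_prob : abnormal region probability map, same shape, values in [0, 1]
    alpha       : preset augmentation intensity (positive or negative)
    """
    image_norm = np.asarray(image_norm, dtype=np.float64)
    lesion_prob = np.asarray(lesion_prob, dtype=np.float64)
    if image_norm.shape != lesion_prob.shape:
        raise ValueError("image and probability map must have the same shape")
    rng = np.random.default_rng() if rng is None else rng
    r = rng.random()  # assumption: one scalar in [0, 1) per call
    return image_norm * (lesion_prob * alpha * r + 1.0)


image_norm = np.full((4, 4), -0.25)
lesion_prob = np.zeros((4, 4))
lesion_prob[1:3, 1:3] = 1.0  # hypothetical lesion in the centre
augmented = attention_augment(image_norm, lesion_prob, alpha=0.3,
                              rng=np.random.default_rng(0))
```

Pixels where the probability map is 0 are multiplied by exactly 1 and stay unchanged, so only the suspected abnormal region is perturbed; a fresh random draw on each call gives a different perturbation in every training iteration.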
[0121] In a specific application, α is set to 0.3. Histogram statistics are performed on the pixel values of the sample medical images and their corresponding enhanced images. The histogram statistics results are as follows: Figure 8 As shown, Figure 8 In the diagram, the horizontal axis represents the normalized pixel values, and the vertical axis represents the number of normalized pixels. Histograms with a left-hand diagonal fill pattern represent the central region of the gland, histograms with a right-hand diagonal fill pattern represent the peripheral zone, and histograms with a horizontal fill pattern represent abnormal areas. Correspondingly, the comparison results of the sample medical images before and after data augmentation are shown below. Figure 9 As shown.
[0122] Figure 8 The left side shows the histogram statistics of the sample medical images. It can be seen that the histogram distribution center of the abnormal regions in the sample medical images is around -0.25 (see [link to relevant documentation]). Figure 8 (The direction of the left arrow in the middle).
[0123] Figure 8 The right side shows the histogram statistics of the enhanced image (preset intensity α is 0.3). It can be seen that after positive data augmentation, the histogram distribution center of the abnormal region in the enhanced image is around -0.4 (see [link to image]). Figure 8 (The direction of the arrow on the right).
[0124] In other words, during the process of positively reinforcing the features of abnormal regions, the pixel value distribution of abnormal regions shifts away from 0.
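The per-region histogram statistics discussed above could be computed along the following lines. This is a minimal sketch: the function names, the bin count, the value range of [-1, 1], and the use of the peak bin as the "distribution center" are all assumptions made for illustration.

```python
import numpy as np


def region_histogram(image, mask, bins=50, value_range=(-1.0, 1.0)):
    """Histogram of the normalized pixel values inside one region.

    Returns (normalized_counts, bin_centers). Values outside value_range
    are ignored by np.histogram.
    """
    values = image[mask.astype(bool)]
    counts, edges = np.histogram(values, bins=bins, range=value_range)
    total = counts.sum()
    normalized = counts / total if total > 0 else counts.astype(float)
    centers = (edges[:-1] + edges[1:]) / 2.0
    return normalized, centers


def histogram_center(image, mask, bins=50, value_range=(-1.0, 1.0)):
    """Center of the distribution, taken here as the center of the peak bin."""
    normalized, centers = region_histogram(image, mask, bins, value_range)
    if normalized.sum() == 0:
        raise ValueError("region is empty or lies outside value_range")
    return float(centers[np.argmax(normalized)])
```

Calling `histogram_center` on the abnormal-region mask before and after augmentation gives the kind of shift that the two panels of Figure 8 are described as showing.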
[0125] Referring to Figure 9 (the left side shows the sample medical image and the right side shows the enhanced image), the abnormal region in the enhanced image (see the area circled in white on the right side of Figure 9) is darker than the abnormal region in the sample medical image (see the area circled in white on the left side of Figure 9). That is, the low-frequency signal characteristics of the abnormal region are enhanced after positive data augmentation.
[0126] In a specific application, α is set to -0.3. Histogram statistics are computed on the pixel values of the sample medical images and their corresponding enhanced images, and the results are shown in Figure 10. In Figure 10, the horizontal axis represents the normalized pixel value and the vertical axis represents the normalized pixel count. Histograms with a left-hand diagonal fill pattern represent the central region of the gland, histograms with a right-hand diagonal fill pattern represent the peripheral zone, and histograms with a horizontal fill pattern represent abnormal regions. Correspondingly, the comparison of the sample medical images before and after data augmentation is shown in Figure 11.
[0127] The left side of Figure 10 shows the histogram statistics of the sample medical images. The center of the histogram distribution of the abnormal regions in the sample medical images is around -0.25 (see the direction of the left arrow in Figure 10).
[0128] The right side of Figure 10 shows the histogram statistics of the enhanced image (preset strength α of -0.3). After negative data augmentation, the center of the histogram distribution of the abnormal region in the enhanced image is around -0.15 (see the direction of the right arrow in Figure 10).
[0129] In other words, during the process of negatively reinforcing the characteristics of the abnormal region, the histogram distribution of the abnormal region shifts towards 0.
[0130] Referring to Figure 11 (the left side shows the sample medical image and the right side shows the enhanced image), the abnormal region in the enhanced image (see the area circled in white on the right side of Figure 11) is brighter than the abnormal region in the sample medical image (see the area circled in white on the left side of Figure 11). That is, the low-frequency signal characteristics of the abnormal region are weakened after negative data augmentation.
[0131] By employing the aforementioned data augmentation method based on abnormal region attention, the second segmentation network learns more about the interference features of abnormal regions in the prostate, thereby improving the generalization ability of the image segmentation model.
[0132] Step S202: Input the first segmentation result and the enhanced image into the second segmentation network to perform second-stage segmentation, and obtain the second segmentation result of the sample medical image for at least one sub-region of the region of interest.
[0133] The second segmentation result indicates the location information of at least one sub-region (the partition mask; see the area circled in white on the right side of Figure 5).
[0134] Abnormal regions often interfere with the partitioning task of the model. In this embodiment, the data augmentation method based on abnormal region attention is used to perform data augmentation on the sample medical images. By using a preset abnormal region segmentation model, the image values of the images in the abnormal regions are perturbed and enhanced, which can effectively enhance the generalization of the second segmentation network under different abnormal region conditions.
[0135] Step S103: Update the parameters of the first segmentation network using the first segmentation result and the preset annotation information of the sample medical image; update the parameters of the second segmentation network using the second segmentation result and the preset annotation information, wherein the preset annotation information includes region annotation information of at least one sub-region.
[0136] This application does not limit the preset annotation information. The preset annotation information can be manually annotated by annotators, who can be radiologists or other personnel with professional diagnostic skills.
[0137] At least one sub-region may include the central area and the peripheral zone of the gland. The annotator uses different labels to delineate the central area and the peripheral zone of the gland in the sample medical image, and obtains the region annotation information of the central area and the region annotation information of the peripheral zone (i.e., the gold standard of partitioning), forming a basic annotation dataset and using it as the preset annotation information.
[0138] In some implementations, updating the parameters of the first segmentation network in step S103 using the first segmentation result and the preset annotation information of the sample medical image may include:
[0139] The first segmentation result is compared with the preset annotation information to obtain the first comparison result. The corresponding first loss value is determined based on the first comparison result. Based on the first loss value, the network parameters of the first segmentation network are updated, and the process returns to step S101 and repeats until the first iteration stopping condition is met.
[0140] The stopping condition for the first iteration can be that the first loss value is less than or equal to a preset threshold, the number of training iterations reaches a preset iteration value, or the first loss value converges, meaning that the first loss value no longer decreases as training continues.
[0141] In some implementations, the region labeling information of all sub-regions in the preset labeling information can be merged to obtain merged labeling information. The parameters of the first segmentation network can then be updated using the first segmentation result and the merged labeling information.
[0142] In a specific application, the preset annotation information includes the regional annotation information of the central area of the gland and the regional annotation information of the peripheral zone. The regional annotation information of the central area of the gland and the regional annotation information of the peripheral zone are merged to obtain the region delineation of the gland, which is used as the merged annotation information.
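Merging the sub-region annotations into a single gland delineation can be sketched as below. The integer label values (0 for background, 1 for the central region of the gland, 2 for the peripheral zone) are hypothetical, since the text only says that different labels are used.

```python
import numpy as np

# Hypothetical label values; the source does not specify the encoding.
BACKGROUND = 0
CENTRAL_GLAND = 1
PERIPHERAL_ZONE = 2


def merge_subregion_labels(zone_labels):
    """Merge all sub-region labels into one binary gland mask.

    Every pixel that belongs to any sub-region becomes 1, everything else 0.
    The result can serve as the target for the first segmentation network.
    """
    return (zone_labels != BACKGROUND).astype(np.uint8)
```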
[0143] In some implementations, updating the parameters of the second segmentation network in step S103 using the second segmentation result and the preset annotation information may include:
[0144] The second segmentation result is compared with the preset annotation information to obtain the second comparison result. The corresponding second loss value is determined based on the second comparison result. Based on the second loss value, the network parameters of the second segmentation network are updated, and the process returns to step S101 and repeats until the second iteration stopping condition is met.
[0145] The stopping condition for the second iteration can be that the second loss value is less than or equal to a preset threshold, the number of iterations reaches a preset iteration value, or the second loss value converges, meaning that the second loss value no longer decreases as training continues.
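The three stopping conditions named for both networks (loss below a threshold, iteration count reached, loss no longer decreasing) can be checked with a small helper such as the one below. The `patience` and `min_delta` parameters are assumptions introduced to make "no longer decreases" testable. They do not come from the source.

```python
def should_stop(loss_history, loss_threshold, max_iterations,
                patience=5, min_delta=1e-4):
    """Return True when any of the three stopping conditions holds.

    loss_history holds one loss value per completed training iteration.
    Convergence is approximated as: the best loss of the last `patience`
    iterations improves on the earlier best by less than `min_delta`.
    """
    if not loss_history:
        return False
    # Condition 1: loss is at or below the preset threshold.
    if loss_history[-1] <= loss_threshold:
        return True
    # Condition 2: the preset number of iterations has been reached.
    if len(loss_history) >= max_iterations:
        return True
    # Condition 3: the loss has stopped decreasing.
    if len(loss_history) > patience:
        best_before = min(loss_history[:-patience])
        best_recent = min(loss_history[-patience:])
        if best_before - best_recent < min_delta:
            return True
    return False
```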
[0146] Step S104: Construct an image segmentation model based on the updated first segmentation network and the second segmentation network.
[0147] In some implementations, the updated first segmentation network and the second segmentation network can be cascaded (using the output of the first segmentation network as the input of the second segmentation network) to construct an image segmentation model.
[0148] The model training method of this application embodiment first inputs a sample medical image into a first segmentation network to obtain a first segmentation result for the region of interest. Then, the sample medical image and the first segmentation result are respectively input into a second segmentation network to obtain a second segmentation result for at least one sub-region of the region of interest. Then, the first segmentation network is trained using the region labeling information of the sub-region and the first segmentation result, and the second segmentation network is trained using the region labeling information of the sub-region and the second segmentation result. The trained first and second segmentation networks are used to construct an image segmentation model. Using this image segmentation model, the entire region of interest of the medical image can be segmented first, and then sub-regions can be segmented within the region of interest. In other words, the image segmentation model can automatically provide the partitioning and localization results of the medical image, with high image processing efficiency and reduced workload for doctors.
[0149] In some implementations, the second segmentation network can learn segmentation tasks as well as classification tasks.
[0150] Referring to Figure 12, Figure 12 is a schematic diagram of a multi-task learning process provided in an embodiment of this application.
[0151] In some optional embodiments, the second-stage segmentation process may include steps S401 to S402.
[0152] Step S401: Output the feature map corresponding to the sample medical image using the encoding module of the second segmentation network.
[0153] In some implementations, the feature map may include low-dimensional features and high-dimensional features. Low-dimensional features have high resolution and a small receptive field, and mainly contain local details. High-dimensional features have a high degree of abstraction and a large receptive field, and mainly contain global information.
[0154] High-dimensional features aggregate rich information about the image from the entire encoder and are driven by both the segmentation annotations and the classification annotations, which can further improve the accuracy of the second segmentation network.
[0155] Step S402: Input the feature map into the segmentation decoding module of the second segmentation network to obtain the second segmentation result of the sample medical image for at least one sub-region of the region of interest.
[0156] In some alternative embodiments, the method may further include step S403.
[0157] Step S403: Input the feature map into the classification module of the second segmentation network to obtain the binary classification result of the sample medical image for the region of interest. The binary classification result is used to indicate whether there is an abnormal region in the region of interest.
[0158] The aforementioned data augmentation method based on abnormal region attention can improve the generalization ability of the image segmentation model to abnormal regions, but in actual use the input may be either abnormal data or normal, interference-free data.
[0159] To further help the image segmentation model learn the differences between the two data (normal and abnormal) distributions, this embodiment adds a multi-task learning module to the high-dimensional features output by the encoder of the second segmentation network to perform binary classification prediction on whether there are abnormal regions in the sample medical image.
[0160] The multi-task learning module includes a segmentation decoding module and a classification module. The segmentation decoding module is responsible for generating a binary segmentation map (the second segmentation result) for the region of interest, while the classification module is responsible for generating a binary classification result that indicates whether an abnormal region exists in the region of interest.
[0161] The region of interest can be the prostate, and the abnormal region can be prostate cancer (tumor). By performing multi-task learning in the output part of the encoder of the second segmentation network, the existence of abnormal regions (such as prostate cancer) in the region of interest can be binary classified. This allows the second segmentation network to better utilize the information of the positive or negative (presence or absence of abnormal regions) of the sample medical image, thereby improving the accuracy and robustness of the second segmentation network.
[0162] Referring to Figure 13, Figure 13 is a schematic diagram of a process for updating the second segmentation network provided in an embodiment of this application.
[0163] In some optional embodiments, step S103, which uses the second segmentation result and the preset annotation information to update the parameters of the second segmentation network, may include steps S501 to S503.
[0164] Step S501: Using the second segmentation result and the preset annotation information of the sample medical image, obtain the segmentation loss of the second segmentation network.
[0165] Step S502: Using the binary classification result and the classification annotation information of the sample medical image, obtain the classification loss of the second segmentation network.
[0166] In some implementations, the classification annotation information can be provided by annotators or derived from the output of the aforementioned preset abnormal region segmentation model.
[0167] The classification annotation information indicates the corresponding classification label. If the segmentation result for the abnormal region is empty, the value of the classification label is 0. If the segmentation result for the abnormal region is not empty, the value of the classification label is 1.
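The labeling rule above amounts to checking whether the abnormal region mask contains any foreground pixel. A minimal sketch, with a hypothetical function name:

```python
import numpy as np


def classification_label(abnormal_region_mask):
    """Return 1 if the abnormal region segmentation is non-empty, else 0."""
    return int(np.any(abnormal_region_mask))
```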
[0168] Step S503: Update the parameters of the second segmentation network based on the segmentation loss and the classification loss.
[0169] Referring to Figure 14, Figure 14 is a schematic diagram of another process for updating the second segmentation network provided in an embodiment of this application.
[0170] In some optional embodiments, step S503 may include steps S601 to S602.
[0171] Step S601: Obtain the segmentation weights corresponding to the segmentation loss and the classification weights corresponding to the classification loss, wherein the sum of the segmentation weights and the classification weights is a preset value.
[0172] The embodiments of this application do not limit the preset value, and the preset value can be 1, 2 or 3.
[0173] The segmentation weights and classification weights can be set according to the actual needs of the image segmentation model. The segmentation weights and classification weights can be different. For example, the segmentation weight can be 0.8, the classification weight can be 0.2, and the sum of the segmentation weights and classification weights can be 1.
[0174] Step S602: Update the parameters of the second segmentation network based on the segmentation loss, the segmentation weight, the classification loss, and the classification weight.
[0175] In some implementations, the calculation process for the total loss of the second segmentation network is expressed as follows:
[0176] $\mathrm{Loss} = \beta \times L_{cls} + (1 - \beta) \times L_{seg}$
[0177] where Loss represents the total loss of the second segmentation network, $L_{cls}$ represents the classification loss, $L_{seg}$ represents the segmentation loss, $\beta$ represents the classification weight, and $1 - \beta$ represents the segmentation weight.
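The weighted combination can be written as a small helper. The function and parameter names are hypothetical. `weight_sum` generalizes the formula to the case where the two weights add up to a preset value other than 1, as described in step S601.

```python
def total_loss(seg_loss, cls_loss, cls_weight=0.2, weight_sum=1.0):
    """Combine segmentation and classification losses.

    With weight_sum = 1 this is Loss = beta * L_cls + (1 - beta) * L_seg,
    where beta is cls_weight. The default of 0.2 mirrors the example
    weights 0.8 (segmentation) and 0.2 (classification) given in the text.
    """
    if not 0.0 <= cls_weight <= weight_sum:
        raise ValueError("cls_weight must lie between 0 and weight_sum")
    seg_weight = weight_sum - cls_weight
    return cls_weight * cls_loss + seg_weight * seg_loss
```

For example, with a segmentation loss of 0.5, a classification loss of 1.0 and the default weights, the total is 0.2 × 1.0 + 0.8 × 0.5 = 0.6.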
[0178] In calculating the total loss of the second segmentation network, due to the multi-task learning design, there are two aspects: segmentation loss and classification loss. The segmentation loss is calculated using the second segmentation result output by the segmentation decoding module and pre-defined annotation information, while the classification loss is calculated using the binary classification result output by the classification module and classification annotation information.
[0179] During the training iterations of the second segmentation network, the total loss gradually decreases until it converges, at which point the training of the second segmentation network is considered complete.
[0180] This application also provides an image segmentation method.
[0181] The method includes: inputting the image to be processed into an image segmentation model to obtain image segmentation results of the image to be processed for at least one sub-region of the region of interest.
[0182] The image segmentation model is trained using any of the above-mentioned model training methods.
[0183] In some implementations, the image segmentation model consists of a first segmentation network and a second segmentation network cascaded together, and the process of using the image segmentation model includes first-stage segmentation and second-stage segmentation.
[0184] The first stage of segmentation involves taking the image to be processed as input to the first segmentation network and outputting the first segmentation result of the image to be processed for the region of interest.
[0185] The second-stage segmentation process involves taking the image to be processed and the first segmentation result as input to the second segmentation network, and outputting the second segmentation result of the image to be processed for at least one sub-region of the region of interest.
[0186] In a specific application, the image to be processed may be a prostate T2W image, the region of interest may be the prostate, and at least one sub-region may include the central area of the gland and the peripheral zone.
[0187] The specific image segmentation process is as follows:
[0188] First, the prostate T2W image is input into the first segmentation network to obtain the corresponding gland mask;
[0189] Then, the prostate T2W image and the corresponding gland mask are input into the second segmentation network in a dual-channel manner, and the corresponding partition mask is output. The partition mask can include masks of two regions: the central region of the gland and the peripheral zone.
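The two-stage inference flow above can be sketched as follows, treating each network as an opaque callable. The function name, the channel-first stacking order (image first, gland mask second), and the callable interfaces are assumptions for illustration. The actual network architectures are not specified here.

```python
import numpy as np


def segment_zones(t2w_image, first_network, second_network):
    """Run the cascaded model: gland mask first, then the partition mask.

    first_network:  callable mapping an image to a binary gland mask.
    second_network: callable mapping a 2-channel array (image, gland mask)
                    to a partition mask for the sub-regions.
    """
    gland_mask = first_network(t2w_image)
    # Dual-channel input: channel 0 is the image, channel 1 the gland mask.
    two_channel = np.stack(
        [t2w_image, gland_mask.astype(t2w_image.dtype)], axis=0
    )
    partition_mask = second_network(two_channel)
    return gland_mask, partition_mask
```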
[0190] Image segmentation models can automatically provide partitioning and localization results for medical images, resulting in high image processing efficiency and reducing the workload of doctors.
[0191] In some implementations, the image segmentation model can also predict whether abnormal regions are present. The abnormal regions may include one or more of the following: inflammation, mass, congestion, hyperplasia, and tumor.
[0192] The classification prediction process is as follows: the encoding module of the second segmentation network outputs the feature map corresponding to the image to be processed;
[0193] The feature map is input into the classification module of the second segmentation network to obtain the binary classification result of the image to be processed for the region of interest. The binary classification result is used to indicate whether there are abnormal regions in the region of interest.
[0194] Image segmentation models not only have the function of partitioning and locating, but can also provide classification prediction results for the presence of abnormal regions, which facilitates doctors' subsequent diagnostic analysis.
[0195] Referring to Figure 15, this application also provides an electronic device, which includes at least one memory 210, at least one processor 220, and a bus 230 connecting different platform systems.
[0196] The memory 210 may include a readable medium in the form of volatile memory, such as random access memory (RAM) 211 and / or cache memory 212, and may further include read-only memory (ROM) 213.
[0197] The memory 210 also stores a computer program, which can be executed by the processor 220, causing the processor 220 to perform the steps of the method in the embodiments of this application. The specific implementation method is consistent with the implementation method and the technical effect achieved in the above method embodiments, and some contents will not be repeated.
[0198] The memory 210 may also include a utility 214 having at least one program module 215, such program module 215 including but not limited to: an operating system, one or more application programs, other program modules and program data, each or some combination of these examples may include an implementation of a network environment.
[0199] Accordingly, processor 220 can execute the aforementioned computer program, and can also execute utility 214.
[0200] Bus 230 can represent one or more of several types of bus structures, including a memory bus or memory controller, peripheral bus, graphics acceleration port, processor, or a local bus using any of the various bus structures.
[0201] The electronic device can also communicate with one or more external devices 240, such as a keyboard, pointing device, Bluetooth device, etc., and with one or more devices capable of interacting with the electronic device, and / or with any device that enables the electronic device to communicate with one or more other computing devices (e.g., a router, modem, etc.). This communication can be performed through input / output interface 250. Furthermore, the electronic device can communicate with one or more networks (e.g., local area network (LAN), wide area network (WAN), and / or public networks, such as the Internet) via network adapter 260. Network adapter 260 can communicate with other modules of the electronic device via bus 230. It should be understood that, although not shown in the figures, other hardware and / or software modules can be used in conjunction with the electronic device, including but not limited to: microcode, device drivers, redundant processors, external disk drive arrays, RAID systems, tape drives, and data backup storage platforms.
[0202] This application also provides a computer-readable storage medium for storing a computer program. When the computer program is executed, it implements the steps of the method in this application. The specific implementation method is consistent with the implementation method and the technical effect achieved in the above method embodiments, and some contents will not be repeated.
[0203] Figure 16 shows a program product for implementing the above-described method according to an embodiment of this application. The program product may employ a portable compact disc read-only memory (CD-ROM) and include program code, and may run on a terminal device, such as a personal computer. However, the program product of this invention is not limited thereto. In this application, the readable storage medium can be any tangible medium containing or storing a program that can be used by or in conjunction with an instruction execution system, apparatus, or device. The program product may employ any combination of one or more readable media. A readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may be, for example, but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination thereof. More specific examples of readable storage media (a non-exhaustive list) include: an electrical connection having one or more wires, a portable disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disc read-only memory (CD-ROM), optical storage device, magnetic storage device, or any suitable combination thereof.
[0204] Computer-readable storage media may include data signals propagated in baseband or as part of a carrier wave, carrying readable program code. Such propagated data signals may take various forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination thereof. The readable storage medium may also be any readable medium capable of sending, propagating, or transmitting a program for use by or in conjunction with an instruction execution system, apparatus, or device. The program code contained on the readable storage medium may be transmitted using any suitable medium, including but not limited to wireless, wired, optical fiber, RF, or any suitable combination thereof. Program code for performing operations of the present invention may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java and C++, as well as conventional procedural programming languages such as C or similar languages. The program code may be executed entirely on a user computing device, partially on a user device, as a standalone software package, partially on a user computing device and partially on a remote computing device, or entirely on a remote computing device or server. In cases involving remote computing devices, the remote computing devices can be connected to user computing devices via any type of network, including local area networks (LANs) or wide area networks (WANs), or they can be connected to external computing devices (e.g., via the Internet using an Internet service provider).
[0205] This application describes the invention from the perspectives of purpose, performance, progress, and novelty, and it meets the functional enhancement and use requirements emphasized by the Patent Law. The above description and drawings are merely preferred embodiments of this application and are not intended to limit this application. Therefore, all structures, devices, features, etc., that are similar to or identical to those of this application, i.e., all equivalent substitutions or modifications made in accordance with the scope of this patent application, shall fall within the scope of protection of this patent application.
Claims
1. A model training method, characterized in that the method includes:
a sample medical image is acquired and input into a first segmentation network for first-stage segmentation to obtain a first segmentation result of the sample medical image for the region of interest, the first segmentation result including a binary image for the region of interest;
the first segmentation result and the sample medical image are input into the second segmentation network in a dual-channel input manner, and after data augmentation based on abnormal region attention is performed on the sample medical image, the second-stage segmentation is performed to obtain the second segmentation result of the sample medical image for at least one sub-region of the region of interest;
the parameters of the first segmentation network are updated using the first segmentation result and the preset annotation information of the sample medical image, and the parameters of the second segmentation network are updated using the second segmentation result and the preset annotation information, the preset annotation information including region annotation information of at least one sub-region;
an image segmentation model is constructed based on the updated first and second segmentation networks;
the method further includes: merging the region labeling information of all sub-regions in the preset labeling information to obtain merged labeling information, and updating the parameters of the first segmentation network using the first segmentation result and the merged labeling information.
2. The model training method according to claim 1, characterized in that the second-stage segmentation, which yields a second segmentation result of the sample medical image for at least one sub-region of the region of interest, includes:
attention-based data augmentation is performed on the sample medical image to obtain an augmented image corresponding to the sample medical image;
the first segmentation result and the augmented image are respectively input into the second segmentation network for second-stage segmentation to obtain the second segmentation result of the sample medical image for at least one sub-region of the region of interest.
3. The model training method according to claim 2, characterized in that the step of performing attention-based data augmentation on the sample medical image to obtain an augmented image corresponding to the sample medical image includes:
the sample medical image is input into a preset abnormal region segmentation model to obtain the abnormal region segmentation result of the sample medical image for the preset abnormal region;
based on the abnormal region segmentation result, an abnormal region probability map with the same size as the sample medical image is generated;
based on the probability map of the abnormal region and the preset intensity of data augmentation, the sample medical image is augmented to obtain the augmented image corresponding to the sample medical image.
4. The model training method according to claim 1, characterized in that the second-stage segmentation process includes:
the feature map corresponding to the sample medical image is output using the encoding module of the second segmentation network;
the feature map is input into the segmentation decoding module of the second segmentation network to obtain a second segmentation result of the sample medical image for at least one sub-region of the region of interest.
5. The model training method according to claim 4, characterized in that the method further includes:
the feature map is input into the classification module of the second segmentation network to obtain a binary classification result of the sample medical image for the region of interest, the binary classification result being used to indicate whether there are abnormal regions in the region of interest.
6. The model training method according to claim 5, characterized in that the step of updating the parameters of the second segmentation network using the second segmentation result and the preset annotation information includes:
using the second segmentation result and the preset annotation information of the sample medical image, the segmentation loss of the second segmentation network is obtained;
using the binary classification result and the classification annotation information of the sample medical image, the classification loss of the second segmentation network is obtained;
the parameters of the second segmentation network are updated based on the segmentation loss and the classification loss.
7. The model training method according to claim 6, characterized in that the step of updating the parameters of the second segmentation network based on the segmentation loss and the classification loss includes:
the segmentation weight corresponding to the segmentation loss and the classification weight corresponding to the classification loss are obtained, wherein the sum of the segmentation weight and the classification weight is a preset value;
the parameters of the second segmentation network are updated based on the segmentation loss, the segmentation weight, the classification loss, and the classification weight.
8. An image segmentation method, characterized in that the method includes:
the image to be processed is input into an image segmentation model to obtain image segmentation results of the image to be processed for at least one sub-region of the region of interest;
the image segmentation model is trained using the model training method described in any one of claims 1-7.
9. An electronic device, characterized in that the electronic device includes a memory and a processor, the memory stores a computer program, and when the processor executes the computer program, it implements the steps of the model training method according to any one of claims 1-7 or the steps of the image segmentation method according to claim 8.
10. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program that, when executed by a processor, implements the steps of the model training method according to any one of claims 1-7 or the steps of the image segmentation method according to claim 8.