Three-dimensional medical image two-stage segmentation method and device based on shared CNN
Patent Information
- Authority / Receiving Office: CN · China
- Patent Type: Patents (China)
- Current Assignee / Owner: JINAN UNIVERSITY
- Filing Date: 2023-05-17
- Publication Date: 2026-06-23
Figure CN116664594B_ABST
Abstract
Description
Technical Field
[0001] This invention relates to the field of image processing technology, and in particular to a two-stage segmentation method and apparatus for three-dimensional medical images based on shared CNN.
Background Technology
[0002] Image segmentation refers to the process of separating objects of interest from an image. Medical image segmentation specifically refers to separating human tissues and organs of diagnostic interest from medical images such as MRI, CT, and ultrasound (US) images, removing them from the image background and various noise interferences for targeted diagnosis. For example, in the diagnosis and treatment of heart patients, to more clearly observe the diseased cardiac structures, it is often necessary to segment late gadolinium-enhanced magnetic resonance imaging (LGE-MRI) images to obtain the patient's atrial structures. As a crucial part of quantitative analysis of medical images, the segmentation quality of LGE-MRI directly affects the accuracy of subsequent diagnoses. Traditional LGE-MRI segmentation relies on manual operation by experienced medical professionals. To reduce the workload of medical staff and the operational difficulty of image segmentation, a high-precision automated image segmentation scheme is needed. However, due to the low contrast of LGE-MRI images and the presence of physiological noise from the human body and thermal noise caused by the equipment, the segmentation task is difficult to complete using traditional image processing algorithms. Therefore, the current segmentation schemes are still mainly based on manual or semi-automated methods.
[0003] In recent years, artificial intelligence deep learning technology has developed rapidly, providing new research ideas for automated intelligent operations in various fields. In the medical field, electronic medical data and internet-based healthcare have further promoted the development of emerging medical models. Deep learning-based assisted diagnostic systems can replace manual labor to achieve higher-quality and more efficient automated medical analysis tasks such as image localization, segmentation, and classification. Furthermore, the accuracy of computer-aided automatic medical image segmentation has surpassed that of manual operations. Compared to manual or semi-manual segmentation schemes, a deep learning-based 3D atrial segmentation algorithm is a superior choice for achieving 3D medical image segmentation.
[0004] 2D network convolutional kernels lack the z-axis dimension, failing to guarantee the continuity of images between adjacent z-axis layers. In contrast, 3D networks can perceive z-axis contextual information, resulting in segmentation results that are closer to the true values. Furthermore, image segmentation can be understood as performing binary classification on each pixel. However, in an image, most pixels are not the objects to be identified; the key structures to be segmented are only a small portion of the image. This imbalance between the number of foreground and background pixels often occurs in medical images.
[0005] To reduce interference from background and image noise, a CNN can first be used to detect the location of key structures in a medical image. Based on the detected location information, a Region of Interest (ROI) is cropped from the original image, and the ROI is then segmented to obtain the target structure. The Double 3D U-Net model follows this idea. However, to detect first and segment afterwards, it uses two CNNs, which increases the network depth. This inevitably increases the computational load and GPU memory consumption and makes the network "bloated." Enlarging the ROI can easily cause GPU memory overflow, so the model's versatility is poor.
Summary of the Invention
[0006] In view of this, embodiments of the present invention provide a versatile two-stage segmentation method for three-dimensional medical images based on shared CNN.
[0007] On one hand, embodiments of the present invention provide a two-stage segmentation method for three-dimensional medical images based on shared CNNs, including:
[0008] The initial image is input into the target model for image segmentation to obtain the initial segmentation map;
[0009] The target location information of the ROI in the initial segmentation map is obtained based on the initial segmentation map;
[0010] The position and size of the ROI cropping box are determined based on the target location information of the ROI and the preset ROI side length;
[0011] Based on the position and size of the ROI cropping box, the initial segmentation map and the initial image are cropped to obtain a first feature map and a second feature map;
[0012] The first feature map and the second feature map are fused to obtain a fused feature map;
[0013] The fused feature map is input into the target model for image segmentation to obtain the target segmentation result.
[0014] Optionally, in the step of inputting the initial image into the target model for image segmentation to obtain the initial segmentation map, the target model is a 3D U-Net network model.
[0015] Optionally, the step of inputting the initial image into the target model for image segmentation to obtain an initial segmentation map includes:
[0016] The encoder of the target model extracts features from the initial image to obtain a five-layer feature map;
[0017] The five-layer feature maps are fused using the decoder of the target model to obtain an initial segmentation map.
[0018] Optionally, the step of obtaining the target location information of the ROI of the initial segmentation map based on the initial segmentation map includes:
[0019] The initial segmentation map is processed by a fully connected layer to extract features and obtain the first location information;
[0020] The first location information is mapped using an activation function to obtain the second location information;
[0021] Based on the second location information, the ROI side length, and the size of the initial image, the target location information is obtained.
[0022] Optionally, the step of fusing the first feature map and the second feature map to obtain a fused feature map includes fusing the features by directly adding the first feature map and the second feature map together.
[0023] Optionally, in the step of extracting features from the initial image through the encoder of the target model to obtain a five-layer feature map, the encoder consists of five downsampling modules, each of which consists of two convolutional layers and one max pooling layer; wherein the size of the two convolutional layers is 3x3x3, and the size of the max pooling layer is 2x2x2.
[0024] Optionally, in the step of fusing the five feature maps through the decoder of the target model to obtain the initial segmentation map, the decoder consists of five upsampling modules, each of which consists of a deconvolutional layer, a skip connection layer, and two convolutional layers; wherein, the two convolutional layers are both 3x3x3 in size.
[0025] On the other hand, embodiments of the present invention also provide a two-stage segmentation device for three-dimensional medical images based on shared CNN, comprising:
[0026] The first module is used to input the initial image into the target model for image segmentation to obtain the initial segmentation map;
[0027] The second module is used to obtain the target location information of the ROI of the initial segmentation map based on the initial segmentation map;
[0028] The third module is used to determine the position and size of the ROI cropping box based on the target location information of the ROI and the preset ROI side length;
[0029] The fourth module is used to crop the initial segmentation map and the initial image according to the position and size of the ROI cropping box to obtain the first feature map and the second feature map;
[0030] The fifth module is used to fuse the first feature map and the second feature map to obtain a fused feature map;
[0031] The sixth module is used to input the fused feature map into the target model for image segmentation to obtain the target segmentation result.
[0032] On the other hand, embodiments of the present invention also provide an electronic device, including a processor and a memory; the memory is used to store a program; the processor executes the program to implement the two-stage segmentation method for three-dimensional medical images based on shared CNN as described above.
[0033] On the other hand, embodiments of the present invention also provide a computer-readable storage medium storing a program that is executed by a processor to implement the aforementioned two-stage segmentation method for three-dimensional medical images based on shared CNN.
[0034] This invention also discloses a computer program product or computer program, which includes computer instructions stored in a computer-readable storage medium. A processor of a computer device can read the computer instructions from the computer-readable storage medium and execute the computer instructions, causing the computer device to perform the aforementioned method.
[0035] The embodiments of the present invention include at least the following beneficial effects: The present invention obtains a fused feature map by fusing the first feature map and the second feature map, and updates the 3D U-Net network model with the information learned from the two feature maps, so that the detection and segmentation tasks can promote each other and ensure that the performance of detection and segmentation is not lost; The present invention first inputs the initial image into the target model for image segmentation, and finally inputs the fused feature map into the target model for image segmentation. By sharing the 3D U-Net network model, the size of the ROI cropping box can be adjusted within a larger range, improving the segmentation effect of three-dimensional medical images. Furthermore, it breaks the current framework of two-stage segmentation achieved by integrating two models, and achieves two-stage segmentation by calling the same model, which improves the flexibility and versatility of model segmentation. Compared with the prior art, which uses two models, it also reduces the space occupied by the model.
Attached Figure Description
[0036] To more clearly illustrate the technical solutions in the embodiments of this application, the accompanying drawings used in the description of the embodiments will be briefly introduced below. Obviously, the accompanying drawings described below are only some embodiments of this application. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.
[0037] Figure 1 is a flowchart of the steps of a two-stage segmentation method for three-dimensional medical images based on shared CNN, provided in an embodiment of the present invention;
[0038] Figure 2 is a flowchart of a two-stage segmentation method for three-dimensional medical images based on shared CNN, provided in an embodiment of the present invention;
[0039] Figure 3 is a comparison of the segmentation results of the two-stage segmentation of three-dimensional medical images based on shared CNN, provided in an embodiment of the present invention, with the segmentation results of Double 3D U-Net;
[0040] Figure 4 is a schematic diagram of a two-stage segmentation device for three-dimensional medical images based on shared CNN, provided in an embodiment of the present invention.
Detailed Implementation
[0041] To make the objectives, technical solutions, and advantages of this application clearer, the following detailed description is provided in conjunction with the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative and not intended to limit the scope of this application.
[0042] To address the problems existing in the prior art, embodiments of the present invention provide a two-stage segmentation method for three-dimensional medical images based on shared CNN. As shown in Figure 1, the method includes steps S100-S600:
[0043] S100: Input the initial image into the target model for image segmentation to obtain the initial segmentation map.
[0044] Specifically, an LGE-MRI (late gadolinium-enhanced cardiac magnetic resonance imaging) dataset is first acquired, and the 3D MRI images of patients with atrial fibrillation in the dataset are preprocessed. The grayscale values of the MRI images lie in the range [0, 255], and the spatial resolution is 0.625×0.625×0.625 mm³. The MRI images are first cropped to 576*576*80 pixels, and the x and y axes are then downsampled to one-third of their original size. Finally, the resulting images, which have a size of 192*192*80 pixels, are input into the 3D U-Net network model.
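The preprocessing described above can be sketched as follows. This is only an illustration: the description does not say where the crop is taken or which interpolation is used, so the center crop and the strided subsampling below are assumptions, and the function name `preprocess` is hypothetical. The 640*640*88 input size is one of the two sizes given in the example later in this description.

```python
import numpy as np

def preprocess(volume, crop_xy=576, crop_z=80, factor=3):
    """Crop a volume to crop_xy x crop_xy x crop_z, then downsample x and y.

    Assumptions (not stated in the description): the crop is centered and
    downsampling keeps every `factor`-th sample along x and y.
    """
    x, y, z = volume.shape
    x0 = (x - crop_xy) // 2
    y0 = (y - crop_xy) // 2
    z0 = (z - crop_z) // 2
    cropped = volume[x0:x0 + crop_xy, y0:y0 + crop_xy, z0:z0 + crop_z]
    # The z axis is left untouched; only x and y are reduced to one third.
    return cropped[::factor, ::factor, :]

volume = np.zeros((640, 640, 88), dtype=np.uint8)  # placeholder MRI volume
initial_image = preprocess(volume)
print(initial_image.shape)  # (192, 192, 80)
```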
[0045] Specifically, the target model is the 3D U-Net network model. The 3D U-Net is derived from the earlier 2D U-Net and has a U-shaped structure consisting of an encoder and a decoder. The encoder extracts features, and the decoder upsamples the feature maps to restore the original resolution. The model is mainly used to process volumetric medical images and can improve the processing efficiency of 3D images.
[0046] Specifically, the step of inputting the initial image into the target model for image segmentation to obtain the initial segmentation map includes steps S110-S120:
[0047] S110: The encoder of the target model extracts features from the initial image to obtain a five-layer feature map.
[0048] Specifically, the encoder consists of five downsampling modules, each including a 3D residual convolution module and a max-pooling layer. The 3D residual convolution module comprises two 3x3x3 3D convolutional layers, with a residual connection added at the feature input: the channel dimension of the input features is raised through a 1x1x1 convolution, and the result is added directly to the feature map produced by the two convolutions for feature fusion. The max-pooling layer is 2x2x2. The initial image is input to the target model, and the encoder extracts features from it. Each downsampling operation performed by the five downsampling modules halves the size of the extracted feature map. For example, if the spatial resolution of the initial image is H×W×Z, where H, W, and Z are the height, width, and length of the initial image, then passing the image through the encoder's five downsampling modules yields feature maps of successively halved size, which together form the five-layer feature map.
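To illustrate why each downsampling module halves the feature map, the following minimal sketch applies a non-overlapping 2x2x2 max pooling to a small volume. It covers the pooling step only, not the residual convolution module; the helper name `max_pool_2x2x2` is hypothetical, and the sketch assumes every dimension is even.

```python
import numpy as np

def max_pool_2x2x2(volume):
    """Non-overlapping 2x2x2 max pooling; assumes every dimension is even."""
    h, w, z = volume.shape
    # Split each axis into (blocks, 2) and take the maximum inside each block.
    blocks = volume.reshape(h // 2, 2, w // 2, 2, z // 2, 2)
    return blocks.max(axis=(1, 3, 5))

features = np.arange(4 * 4 * 2, dtype=np.float32).reshape(4, 4, 2)
pooled = max_pool_2x2x2(features)
print(pooled.shape)  # (2, 2, 1)
```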
[0049] S120: The five-layer feature maps are fused using the decoder of the target model to obtain an initial segmentation map.
[0050] Specifically, the decoder consists of five upsampling modules. Each upsampling module comprises a 2x2x2 3D transposed convolution (deconvolution) layer, a skip connection layer, and two 3x3x3 3D convolutional layers, and corresponds to one of the encoder's downsampling modules. The skip connection layer fuses the positional information of the shallow layers with the semantic information of the deep layers through concatenation. The five upsampling modules restore the five-layer feature maps to the initial image size. A sigmoid function then maps the values in the feature map to the range (0, 1), and the result is thresholded. According to the labeling rule of the training labels (0 for background, 1 for the target region), values greater than 0.5 are assigned to the target region (the region of the medical image that requires image processing), and values less than or equal to 0.5 are assigned to the background, thus yielding the initial segmentation map.
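The final sigmoid mapping and 0.5 threshold described above can be sketched as follows. The function name `to_segmentation_map` and the sample values are illustrative assumptions; the input stands in for the decoder output before the sigmoid.

```python
import numpy as np

def to_segmentation_map(logits, threshold=0.5):
    """Map raw decoder outputs to (0, 1) and binarize them."""
    probabilities = 1.0 / (1.0 + np.exp(-logits))
    # Values strictly greater than the threshold become target (1);
    # values less than or equal to it become background (0).
    return (probabilities > threshold).astype(np.uint8)

logits = np.array([-2.0, 0.0, 2.0])
print(to_segmentation_map(logits))  # [0 0 1]
```

Note that a value of exactly 0.5 is treated as background, matching the "less than or equal to 0.5" rule.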
[0051] S200: Obtain the target location information of the ROI of the initial segmentation map based on the initial segmentation map.
[0052] Specifically, ROI stands for Region of Interest in an image. In the embodiments of this invention, the ROI is the region of a three-dimensional medical image that requires image processing. For example, if the initial image is a cardiac MRI image (LGE-MRI), the ROI is the region containing the heart; if the initial image is a lung or kidney CT scan, the ROI is the lung or kidney region in the image; if the initial image is a uterine ultrasound image, the ROI is the uterine region in the ultrasound image. In machine vision and image processing, the region to be processed is delineated from the image using rectangles, circles, ellipses, irregular polygons, etc., and is called the Region of Interest, or ROI. Machine vision software such as Halcon, OpenCV, and Matlab commonly provides operators and functions for obtaining the ROI and performing further image processing.
[0053] Specifically, the step of obtaining the target location information of the ROI of the initial segmentation map based on the initial segmentation map includes steps S201-S203:
[0054] S201: Feature extraction is performed on the initial segmentation map through a fully connected layer to obtain the first location information.
[0055] Specifically, the initial segmentation map is input into the fully connected layer to obtain two values of first location information, which encode the relative position of the segmentation target. This relative position information gives the ROI position coordinates.
[0056] S202: The first location information is mapped through an activation function to obtain the second location information.
[0057] Specifically, the first location information is mapped using the Sigmoid function to obtain the second location information, location_0 and location_1, each in the range (0, 1). Because these values are relative, they can be applied to images of different sizes, namely the initial segmentation map and the initial image.
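Steps S201 and S202 can be illustrated with the sketch below, which flattens a segmentation map, applies a linear (fully connected) layer with two outputs, and maps the result through a sigmoid. The function name `predict_locations`, the flattening of the whole map, the small input size, and the random untrained weights are all assumptions made for illustration; the description does not specify the input layout or parameters of the fully connected layer.

```python
import numpy as np

def predict_locations(seg_map, weight, bias):
    """Fully connected layer with two outputs, followed by a sigmoid."""
    flat = seg_map.reshape(-1).astype(np.float64)
    first = weight @ flat + bias              # first location information, shape (2,)
    second = 1.0 / (1.0 + np.exp(-first))     # second location information in (0, 1)
    return second

rng = np.random.default_rng(0)
seg_map = rng.integers(0, 2, size=(8, 8, 4))             # placeholder segmentation map
weight = rng.normal(scale=0.01, size=(2, seg_map.size))  # hypothetical untrained weights
bias = np.zeros(2)
location_0, location_1 = predict_locations(seg_map, weight, bias)
print(location_0, location_1)
```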
[0058] S203: Based on the second location information, the ROI side length, and the size of the initial image, obtain the target location information.
[0059] Specifically, the ROI side length `len_of_roi` is preset according to the actual image conditions, and the actual x and y coordinates of the ROI are computed from the second location information and the initial image size. Only the X and Y axes are cropped; the Z axis is not. First, the ROI side length is subtracted from the maximum values of the image's X and Y axes, which ensures that the ROI box does not exceed the boundary of the original image. The differences are then multiplied by the second location information `location_0` and `location_1`, respectively, to obtain the coordinates (roi_x, roi_y) of the lower left corner of the ROI on the XY plane. This gives the position of the ROI box in the original image, [roi_x:roi_x+len_of_roi, roi_y:roi_y+len_of_roi], which is the target location information.
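The coordinate computation in this step can be sketched as follows. The names `roi_x`, `roi_y`, `len_of_roi`, `location_0` and `location_1` follow the description, while the function name `roi_origin` and the truncation to an integer index are assumptions, since the description does not state how fractional values are rounded.

```python
def roi_origin(location_0, location_1, image_size_xy, len_of_roi):
    """Convert relative locations in (0, 1) into the ROI corner (roi_x, roi_y)."""
    max_x, max_y = image_size_xy
    # Subtracting the side length first keeps the box inside the image.
    roi_x = int((max_x - len_of_roi) * location_0)
    roi_y = int((max_y - len_of_roi) * location_1)
    return roi_x, roi_y

roi_x, roi_y = roi_origin(0.5, 0.25, (192, 192), 96)
print(roi_x, roi_y)  # 48 24
```

Even for locations approaching 1, roi_x + len_of_roi cannot exceed the image size, which is the boundary guarantee mentioned above.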
[0060] S300: Determine the position and size of the ROI cropping box based on the target location information of the ROI and the preset ROI side length.
[0061] Specifically, the size of the ROI cropping box is determined by the preset ROI side length, and its position is determined by the coordinates in the target location information, namely [roi_x:roi_x+len_of_roi, roi_y:roi_y+len_of_roi].
[0062] S400: Based on the position and size of the ROI cropping box, crop the initial segmentation map and the initial image to obtain a first feature map and a second feature map.
[0063] Specifically, the region at position [roi_x:roi_x+len_of_roi, roi_y:roi_y+len_of_roi] in the initial segmentation map is cropped to obtain the first feature map, and the region at the same position in the initial image is cropped to obtain the second feature map. This yields two feature maps containing the ROI at different stages.
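A minimal sketch of this cropping step is given below. The same box is applied to both volumes and the Z axis is kept whole. The function name `crop_roi`, the placeholder data, and the example coordinates (48, 24) are assumptions for illustration; the side length 96 and the 192*192*80 image size are taken from the description.

```python
import numpy as np

def crop_roi(volume, roi_x, roi_y, len_of_roi):
    """Crop the XY plane to the ROI box; the Z axis is not cropped."""
    return volume[roi_x:roi_x + len_of_roi, roi_y:roi_y + len_of_roi, :]

initial_image = np.random.default_rng(0).random((192, 192, 80))   # placeholder image
initial_seg = (initial_image > 0.5).astype(np.float64)            # placeholder segmentation map

first_feature_map = crop_roi(initial_seg, 48, 24, 96)
second_feature_map = crop_roi(initial_image, 48, 24, 96)
print(first_feature_map.shape, second_feature_map.shape)  # (96, 96, 80) (96, 96, 80)
```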
[0064] S500: The first feature map and the second feature map are fused to obtain a fused feature map.
[0065] Specifically, the first feature map and the second feature map are fused by directly adding them together. The addition is element-wise, so the dimensions of the fused feature map do not change; only the corresponding values are superimposed, so that each element carries information from both feature maps.
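Fusion by direct addition can be sketched as follows. The function name `fuse` and the shape check are assumptions added for illustration; the description only states that the two maps are added.

```python
import numpy as np

def fuse(first_feature_map, second_feature_map):
    """Element-wise addition of two feature maps of identical shape."""
    if first_feature_map.shape != second_feature_map.shape:
        raise ValueError("feature maps must have the same shape to be added")
    return first_feature_map + second_feature_map

first = np.array([[0.0, 1.0], [1.0, 0.0]])    # cropped segmentation values
second = np.array([[0.2, 0.7], [0.9, 0.1]])   # cropped image intensities
fused = fuse(first, second)
print(fused.shape)  # (2, 2)
```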
[0066] S600: Input the fused feature map into the target model for image segmentation to obtain the target segmentation result.
[0067] Specifically, the fused feature map is input into the 3D U-Net network model for image segmentation to obtain the target segmentation result. In the segmentation result, values greater than 0.5 are identified as the region of the medical image that requires image processing, and values less than or equal to 0.5 are identified as the background region.
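Steps S100 to S600 can be put together in the following sketch, whose main point is that the same model object is called in both stages. Everything here is a stand-in: `two_stage_segment` is a hypothetical name, `toy_model` is a simple sigmoid used in place of the trained 3D U-Net, `locate` returns fixed values in place of the fully connected layer, and the tiny volume size is chosen only to keep the example fast. Cropping the thresholded map (rather than the probability map) is also an assumption.

```python
import numpy as np

def two_stage_segment(image, model, locate, len_of_roi):
    """Two-stage segmentation that reuses one model for both stages."""
    # Stage 1: coarse segmentation of the whole image.
    initial_seg = (model(image) > 0.5).astype(image.dtype)
    # Relative ROI location in (0, 1), converted to array indices.
    location_0, location_1 = locate(initial_seg)
    roi_x = int((image.shape[0] - len_of_roi) * location_0)
    roi_y = int((image.shape[1] - len_of_roi) * location_1)
    box = (slice(roi_x, roi_x + len_of_roi),
           slice(roi_y, roi_y + len_of_roi),
           slice(None))  # Z axis is not cropped
    # Crop both volumes with the same box and fuse them by addition.
    fused = initial_seg[box] + image[box]
    # Stage 2: the same model segments the fused ROI.
    return (model(fused) > 0.5).astype(np.uint8)

calls = []  # records the input shape of every model call

def toy_model(volume):
    """Stand-in for the shared 3D U-Net: a sigmoid, same output shape."""
    calls.append(volume.shape)
    return 1.0 / (1.0 + np.exp(-(volume - 0.5)))

image = np.random.default_rng(0).random((8, 8, 2))
result = two_stage_segment(image, toy_model, lambda seg: (0.5, 0.5), len_of_roi=4)
print(result.shape, len(calls))  # (4, 4, 2) 2
```

Because only one model is held in memory, changing `len_of_roi` changes the second-stage input size without adding a second set of weights.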
[0068] With reference to Figure 2, the following example illustrates the implementation and application of the two-stage segmentation method for three-dimensional medical images based on shared CNN.
[0069] 1. First, 154 3D MRI images of patients with atrial fibrillation were obtained from the LGE-MRI dataset of the 2018 Left Atrial Segmentation Challenge and divided into a training set of 100 cases and a test set of 54 cases. The images come in two sizes: 640*640*88 and 576*576*88. The MRI grayscale values lie in the range [0, 255], and the spatial resolution is 0.625×0.625×0.625 mm³. The labels of the MRI images are binary images, with 0 representing the background and 255 representing the left atrial region. The 640*640*88 and 576*576*88 images are first cropped to 576*576*80, and the x and y axes are then downsampled to 1/3 of their original size. The resulting initial image of size 192*192*80 is then input into the 3D U-Net network model.
[0070] 2. Input the initial image into the 3D U-Net network model. The encoder of the model extracts features from the initial image. Each time the image passes through a downsampling module, the size of the feature map is reduced by half, resulting in five layers of feature maps. The five layers of feature maps are then fused by the decoder of the model to obtain the initial segmentation map.
[0071] 3. Then, based on the initial segmentation map, obtain the target location information of the ROI in the initial segmentation map. In this example, the ROI is the left atrial region, so the location information of the left atrial region in the map is obtained;
[0072] 4. Then, based on the location information of the left atrial region in the initial segmentation map and the preset ROI side length, determine the position and size of the cropping box for the left atrial region;
[0073] 5. Based on the position and size of the obtained left atrial region cropping box, crop the initial segmentation map and the initial image to obtain the first feature map and the second feature map respectively; then add the first feature map and the second feature map directly to perform feature fusion and obtain the fused feature map;
[0074] 6. Finally, the fused feature map is input into the 3D U-Net network model for image segmentation to obtain the target segmentation result. Values greater than 0.5 are identified as atrium, and values less than or equal to 0.5 are identified as background. Referring to Figure 3, the segmentation results of the proposed two-stage segmentation method for 3D medical images based on shared CNN are compared with those of the existing Double 3D U-Net. Figure 3(a) shows the left atrial labels from the publicly available dataset, annotated by professional physicians; Figure 3(b) shows the segmentation result of the method proposed in this invention; and Figure 3(c) shows the segmentation result of the Double 3D U-Net model. It can be seen that the proposed method handles surface details better, giving results closer to the true values. Table 1 compares the evaluation metrics of Double 3D U-Net and the proposed method, where Shared 3D U-Net denotes the two-stage segmentation method for three-dimensional medical images based on shared CNN proposed in this invention.
[0075] Table 1
[0076]
Model | DICE | Jaccard | HD | Size
Double 3D U-Net | 0.916 | 0.845 | 1.300 | 45.18M
Shared 3D U-Net | 0.918 | 0.848 | 1.211 | 22.66M
[0077] Here, the DICE coefficient is a set similarity metric, typically used to measure the similarity between two samples; its value ranges from 0 (minimum similarity) to 1 (maximum similarity). The Jaccard coefficient compares the similarity and difference between finite sample sets; the larger its value, the higher the similarity. HD is the Hausdorff distance, which measures the distance between two sets; the smaller its value, the more similar the two sets. Size is the model size, which measures the space occupied by the model. The DICE coefficient, Jaccard coefficient, and HD are all computed against the manual segmentations made by professional doctors.
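For reference, the three similarity metrics can be computed as in the sketch below, using their standard definitions on flattened binary masks and on point sets. The function names and the brute-force Hausdorff computation are illustrative choices; this is not the evaluation code used to produce Table 1.

```python
import math

def dice(a, b):
    """DICE = 2|A ∩ B| / (|A| + |B|) for flattened binary masks."""
    inter = sum(1 for x, y in zip(a, b) if x and y)
    total = sum(1 for x in a if x) + sum(1 for y in b if y)
    return 2.0 * inter / total if total else 1.0

def jaccard(a, b):
    """Jaccard = |A ∩ B| / |A ∪ B| for flattened binary masks."""
    inter = sum(1 for x, y in zip(a, b) if x and y)
    union = sum(1 for x, y in zip(a, b) if x or y)
    return inter / union if union else 1.0

def hausdorff(points_a, points_b):
    """Symmetric Hausdorff distance between two non-empty point sets."""
    def directed(src, dst):
        return max(min(math.dist(p, q) for q in dst) for p in src)
    return max(directed(points_a, points_b), directed(points_b, points_a))

prediction = [1, 1, 0, 0]
reference = [1, 0, 1, 0]
print(dice(prediction, reference))           # 0.5
print(hausdorff([(0, 0)], [(3, 4)]))         # 5.0
```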
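[0078] As can be seen from Table 1, the shared 3D U-Net network model proposed in this invention achieves slightly better segmentation performance on all three metrics (DICE 0.918 versus 0.916, Jaccard 0.848 versus 0.845, HD 1.211 versus 1.300), so its results are closer to the true values, while its model size (22.66M) is about half that of Double 3D U-Net (45.18M).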
[0079] This embodiment is implemented in PyTorch and runs on an NVIDIA RTX 3090 GPU. The Adam optimizer was used in the experiment. The initial learning rate was set to 0.001, the number of iterations to 200, the batch size to 1, and the ROI side length to 96. During the training phase, five-fold cross-validation and early stopping were used: the model's loss on the validation set was monitored continuously, and training was stopped once the validation loss reached its minimum, to prevent overfitting and preserve generalization. In addition, the model from each iteration was saved, and the best model was selected from them based on the DICE coefficient.
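The training strategy can be illustrated with the sketch below, which builds five train/validation splits over the 100 training cases and finds the epoch at which early stopping would keep the model. The function names, the contiguous fold layout, and the `patience` parameter are assumptions; the description does not say how the folds are drawn or how long training waits before stopping.

```python
def five_fold_splits(n_cases, k=5):
    """Yield (train, validation) index lists for k contiguous folds.

    Assumes n_cases is divisible by k, as with 100 cases and 5 folds.
    """
    indices = list(range(n_cases))
    fold_size = n_cases // k
    for fold in range(k):
        val = indices[fold * fold_size:(fold + 1) * fold_size]
        train = indices[:fold * fold_size] + indices[(fold + 1) * fold_size:]
        yield train, val

def early_stopping_epoch(val_losses, patience=10):
    """Return the epoch with the lowest validation loss seen before stopping.

    Training stops once `patience` epochs pass without improvement
    (the patience value is a hypothetical setting).
    """
    best_loss = float("inf")
    best_epoch = -1
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss, best_epoch = loss, epoch
        elif epoch - best_epoch >= patience:
            break
    return best_epoch

splits = list(five_fold_splits(100))
print(len(splits), len(splits[0][0]), len(splits[0][1]))  # 5 80 20
print(early_stopping_epoch([1.0, 0.8, 0.9, 0.95, 0.7], patience=2))  # 1
```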
[0080] In summary, the two-stage segmentation method for 3D medical images based on shared CNN of the present invention has the following advantages:
[0081] 1. This invention obtains a fused feature map by fusing the first feature map and the second feature map, and updates the 3D U-Net network model with the information learned from the two feature maps, so that the detection and segmentation tasks can promote each other and ensure that the performance of detection and segmentation is not lost.
[0082] 2. This invention first inputs the initial image into the target model for image segmentation, and finally inputs the fused feature map into the target model for image segmentation. By sharing the 3D U-Net network model, the size of the ROI cropping box can be adjusted within a larger range, improving the segmentation effect. Furthermore, it breaks the current framework of two-stage segmentation achieved by integrating two models. By calling the same model to achieve two-stage segmentation, it improves the flexibility and versatility of model segmentation. Compared with the existing technology that uses two models, it also reduces the space occupied by the model.
[0083] Referring to Figure 4, the present invention also provides a two-stage segmentation device for three-dimensional medical images based on shared CNN, comprising:
[0084] The first module 401 is used to input the initial image into the target model for image segmentation to obtain an initial segmentation map;
[0085] The second module 402 is used to obtain the target location information of the ROI of the initial segmentation map based on the initial segmentation map;
[0086] The third module 403 is used to determine the position and size of the ROI cropping box based on the target location information of the ROI and the preset ROI side length;
[0087] The fourth module 404 is used to crop the initial segmentation map and the initial image according to the position and size of the ROI cropping box, so as to obtain the first feature map and the second feature map respectively;
[0088] The fifth module 405 is used to fuse the first feature map and the second feature map to obtain a fused feature map;
[0089] The sixth module 406 is used to input the fused feature map into the target model for image segmentation to obtain the target segmentation result.
[0090] This invention also provides an electronic device, including a processor and a memory; the memory is used to store a program; the processor executes the program to implement the two-stage segmentation method for three-dimensional medical images based on shared CNN as described above.
[0091] This invention also provides a computer-readable storage medium storing a program that is executed by a processor to implement the aforementioned two-stage segmentation method for three-dimensional medical images based on shared CNN.
[0092] This invention also discloses a computer program product or computer program, which includes computer instructions stored in a computer-readable storage medium. A processor of a computer device can read the computer instructions from the computer-readable storage medium and execute them, causing the computer device to perform the method shown in Figure 1.
[0093] In some alternative embodiments, the functions / operations mentioned in the block diagrams may not occur in the order shown in the operation diagrams. For example, depending on the functions / operations involved, two consecutively shown blocks may actually be executed substantially simultaneously, or the blocks may sometimes be executed in reverse order. Furthermore, the embodiments presented and described in the flowcharts of this invention are provided by way of example to provide a more comprehensive understanding of the technology. The disclosed methods are not limited to the operations and logic flows presented herein. Alternative embodiments are contemplated in which the order of various operations is altered and sub-operations described as part of a larger operation are executed independently.
[0094] Furthermore, although the invention has been described in the context of functional modules, it should be understood that, unless otherwise stated, one or more of the described functions and / or features may be integrated into a single physical device and / or software module, or one or more functions and / or features may be implemented in a separate physical device or software module. It is also understood that a detailed discussion of the actual implementation of each module is unnecessary for understanding the invention. Rather, given the properties, functions, and internal relationships of the various functional modules in the apparatus disclosed herein, the actual implementation of the module will be understood within the scope of conventional skill of an engineer. Therefore, those skilled in the art can implement the invention as set forth in the claims using ordinary techniques without excessive experimentation. It is also understood that the specific concepts disclosed are merely illustrative and not intended to limit the scope of the invention, which is determined by the full scope of the appended claims and their equivalents.
[0095] If the aforementioned functions are implemented as software functional units and sold or used as independent products, they can be stored in a computer-readable storage medium. Based on this understanding, the technical solution of this invention, essentially, or the part that contributes to the prior art, or a portion of the technical solution, can be embodied in the form of a software product. This computer software product is stored in a storage medium and includes several instructions to cause a computer device (which may be a personal computer, server, or network device, etc.) to execute all or part of the steps of the methods described in the various embodiments of this invention. The aforementioned storage medium includes various media capable of storing program code, such as USB flash drives, portable hard drives, read-only memory (ROM), random access memory (RAM), magnetic disks, or optical disks.
[0096] The logic and/or steps represented in the flowchart or otherwise described herein can, for example, be considered a sequenced list of executable instructions for implementing logical functions, and can be embodied in any computer-readable medium for use by, or in conjunction with, an instruction execution system, apparatus, or device (such as a computer-based system, a system including a processor, or another system that can fetch instructions from the instruction execution system, apparatus, or device and execute them). For the purposes of this specification, a "computer-readable medium" can be any means that can contain, store, communicate, propagate, or transmit programs for use by, or in conjunction with, an instruction execution system, apparatus, or device.
[0097] More specific examples of computer-readable media (a non-exhaustive list) include: electrical connections (electronic devices) having one or more wires, portable computer disk drives (magnetic devices), random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), fiber optic devices, and portable compact disc read-only memory (CD-ROM). Furthermore, the computer-readable medium can even be paper or another suitable medium on which the program is printed, because the program can be obtained electronically, for example by optically scanning the paper or other medium and then editing, interpreting, or otherwise processing it as necessary, and can then be stored in computer memory.
[0098] It should be understood that various parts of the present invention can be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, multiple steps or methods can be implemented in software or firmware stored in memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, it can be implemented using any one or a combination of the following techniques known in the art: discrete logic circuits having logic gates for implementing logical functions on data signals, application-specific integrated circuits (ASICs) having suitable combinational logic gates, programmable gate arrays (PGAs), field-programmable gate arrays (FPGAs), etc.
[0099] In the description of this specification, references to terms such as "one embodiment," "some embodiments," "example," "specific example," or "some examples," etc., indicate that a specific feature, structure, material, or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of the invention. In this specification, the illustrative expressions of the above terms do not necessarily refer to the same embodiment or example. Furthermore, the specific features, structures, materials, or characteristics described may be combined in any suitable manner in one or more embodiments or examples.
[0100] Although embodiments of the invention have been shown and described, those skilled in the art will understand that various changes, modifications, substitutions and alterations can be made to these embodiments without departing from the principles and spirit of the invention, the scope of which is defined by the claims and their equivalents.
[0101] The above is a detailed description of the preferred embodiments of the present invention, but the present invention is not limited to the embodiments described. Those skilled in the art can make various equivalent modifications or substitutions without departing from the spirit of the present invention, and these equivalent modifications or substitutions are all included within the scope defined by the claims of this application.
Claims
1. A two-stage segmentation method for three-dimensional medical images based on a shared CNN, characterized in that the method comprises:
inputting an initial image into a target model for image segmentation to obtain an initial segmentation map;
obtaining target location information of an ROI in the initial segmentation map based on the initial segmentation map;
determining the position and size of an ROI cropping box based on the target location information of the ROI and a preset ROI side length;
cropping the initial segmentation map and the initial image according to the position and size of the ROI cropping box to obtain a first feature map and a second feature map;
fusing the first feature map and the second feature map to obtain a fused feature map; and
inputting the fused feature map into the target model for image segmentation to obtain a target segmentation result;
wherein the step of obtaining the target location information of the ROI in the initial segmentation map based on the initial segmentation map comprises:
processing the initial segmentation map with a fully connected layer to extract features and obtain first location information;
mapping the first location information with an activation function to obtain second location information; and
obtaining the target location information based on the second location information, the ROI side length, and the size of the initial image;
and wherein the cropping of the initial segmentation map and the initial image comprises:
cropping the initial segmentation map and the initial image along the X and Y axes.
2. The two-stage segmentation method for three-dimensional medical images based on a shared CNN according to claim 1, characterized in that, in the step of inputting the initial image into the target model for image segmentation to obtain the initial segmentation map, the target model is a 3D U-Net network model.
3. The two-stage segmentation method for three-dimensional medical images based on a shared CNN according to claim 1, characterized in that the step of inputting the initial image into the target model for image segmentation to obtain the initial segmentation map comprises:
extracting features from the initial image with the encoder of the target model to obtain five-layer feature maps; and
fusing the five-layer feature maps with the decoder of the target model to obtain the initial segmentation map.
4. The two-stage segmentation method for three-dimensional medical images based on a shared CNN according to claim 1, characterized in that the step of fusing the first feature map and the second feature map to obtain a fused feature map comprises fusing the features by directly adding the first feature map and the second feature map.
5. The two-stage segmentation method for three-dimensional medical images based on a shared CNN according to claim 3, characterized in that, in the step of extracting features from the initial image with the encoder of the target model to obtain the five-layer feature maps, the encoder consists of five downsampling modules, each of which consists of two convolutional layers and one max pooling layer; wherein the size of each of the two convolutional layers is 3x3x3, and the size of the max pooling layer is 2x2x2.
6. The two-stage segmentation method for three-dimensional medical images based on a shared CNN according to claim 3, characterized in that, in the step of fusing the five-layer feature maps with the decoder of the target model to obtain the initial segmentation map, the decoder consists of five upsampling modules, each of which consists of a deconvolutional layer, a skip connection layer, and two convolutional layers; wherein the size of each of the two convolutional layers is 3x3x3.
7. A two-stage segmentation apparatus for three-dimensional medical images based on a shared CNN, characterized in that the apparatus comprises:
a first module, used to input an initial image into a target model for image segmentation to obtain an initial segmentation map;
a second module, used to obtain target location information of an ROI in the initial segmentation map based on the initial segmentation map;
a third module, used to determine the position and size of an ROI cropping box based on the target location information of the ROI and a preset ROI side length;
a fourth module, used to crop the initial segmentation map and the initial image according to the position and size of the ROI cropping box to obtain a first feature map and a second feature map;
a fifth module, used to fuse the first feature map and the second feature map to obtain a fused feature map; and
a sixth module, used to input the fused feature map into the target model for image segmentation to obtain a target segmentation result;
wherein the second module is specifically used to perform the following steps:
processing the initial segmentation map with a fully connected layer to extract features and obtain first location information; and
mapping the first location information with an activation function to obtain second location information;
and wherein the fourth module is specifically used to perform the following step:
cropping the initial segmentation map and the initial image along the X and Y axes.
8. An electronic device, characterized in that it comprises a processor and a memory; the memory is used to store a program; and the processor executes the program to implement the method according to any one of claims 1 to 6.
9. A computer-readable storage medium, characterized in that the storage medium stores a program which, when executed by a processor, implements the method according to any one of claims 1 to 6.
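
By way of illustration only, and without limiting the claims, the ROI localization, cropping, and fusion steps recited in claims 1 and 4 may be sketched as follows. The sketch rests on several assumptions that the claims do not state: the activation function is taken to be a sigmoid, the target location is taken to be the top-left corner of a square cropping box computed as the sigmoid output scaled by (image size minus ROI side length), and volumes are stored in (X, Y, Z) axis order. The function names `roi_crop_box`, `crop_xy`, and `fuse` are hypothetical, and the fully connected layer that produces the raw location values is not modeled.

```python
import numpy as np


def sigmoid(x):
    """Logistic function; assumed here as the activation that bounds the location."""
    return 1.0 / (1.0 + np.exp(-x))


def roi_crop_box(raw_xy, roi_side, image_shape_xy):
    """Turn raw location values into a square ROI cropping box (x0, y0, side).

    raw_xy stands in for the "first location information" (the output of a
    fully connected layer). The sigmoid maps it to (0, 1), giving the
    "second location information". Scaling by (image size - ROI side) is an
    assumed way of keeping the box inside the image.
    """
    norm = sigmoid(np.asarray(raw_xy, dtype=np.float64))
    max_start = np.asarray(image_shape_xy, dtype=np.int64) - roi_side
    if np.any(max_start < 0):
        raise ValueError("ROI side length exceeds the image size")
    start = np.rint(norm * max_start).astype(int)
    return int(start[0]), int(start[1]), int(roi_side)


def crop_xy(volume, box):
    """Crop a volume along the X and Y axes only; the full Z extent is kept."""
    x0, y0, side = box
    return volume[x0:x0 + side, y0:y0 + side, :]


def fuse(first_map, second_map):
    """Fuse two equally shaped maps by direct element-wise addition."""
    return first_map + second_map


# Example: an 8 x 8 x 4 image and a segmentation map of the same shape.
image = np.arange(8 * 8 * 4, dtype=np.float64).reshape(8, 8, 4)
seg_map = np.ones_like(image)

box = roi_crop_box(raw_xy=(0.0, 0.0), roi_side=4, image_shape_xy=image.shape[:2])
first = crop_xy(seg_map, box)   # cropped initial segmentation map
second = crop_xy(image, box)    # cropped initial image
fused = fuse(first, second)     # input to the second segmentation pass
```

Because the cropped segmentation map and the cropped image share the same cropping box, they have identical shapes, which is what allows fusion by direct addition before the fused map is passed back to the same (shared) target model.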