A method and system for automatic segmentation of prostate magnetic resonance T2 sequence images
By combining pixel labeling, scalar cropping, and data augmentation with a multi-path input module, a U-Net network, and an attention sub-network, the invention addresses the technical bottleneck in automatic segmentation of prostate MRI T2 sequence images, achieving efficient and accurate segmentation and improving system development accuracy.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- SHANGHAI QINGPAI INTELLIGENT TECHNOLOGY CO LTD
- Filing Date
- 2026-03-19
- Publication Date
- 2026-06-19
AI Technical Summary
Existing technologies cannot effectively solve the problem of automatic segmentation of T2 sequence images of prostate MRI, especially in terms of taking into account both local details and global structural information, fusing multi-scale prediction results, noise sensitivity, and consistency of segmentation results.
Pixel labeling, scalar cropping, and data augmentation methods are used to process NMR data. By combining a multi-path input module, a U-Net network, an enhanced residual module, and an attention sub-network, the network weights are optimized using the Dice loss function and a two-dimensional loss function to generate a prostate segmentation map.
It achieves efficient and accurate automatic segmentation of T2 sequence images of prostate MRI, improves segmentation efficiency and system development accuracy, and breaks through the technical bottleneck of multimodal spatial data fusion.
Figure CN122244074A_ABST
Abstract
Description
Technical Field
[0001] This invention relates to the field of medical image segmentation, specifically to an automatic segmentation method and system for prostate MRI T2 sequence images.
Background Technology
[0002] Medical image segmentation is a core supporting link in the precise diagnosis and treatment of prostate diseases. T2-weighted MRI images of the prostate, due to their ability to clearly present the anatomical structure of the gland, the extent of lesions, and boundary features, have become a key basis for clinical screening, disease assessment, and treatment plan formulation. Traditional clinical practice relies on manual image segmentation by physicians, which is not only time-consuming, labor-intensive, and inefficient, but also easily affected by subjective experience differences, leading to poor consistency in segmentation results. This makes it difficult to meet the practical needs of large-scale clinical data processing and precision medicine. Therefore, developing efficient and accurate automatic segmentation technology for T2-weighted MRI images of the prostate is of significant practical importance for improving diagnostic and treatment efficiency, reducing the risk of misdiagnosis, and promoting standardized diagnosis and treatment of prostate diseases.
[0003] Existing automatic segmentation techniques still suffer from significant technical bottlenecks: single-path input structures rely solely on fixed-scale feature extraction processes, failing to simultaneously consider both local details and global structural information. Multiple downsampling easily leads to the loss of fine-grained boundary information, resulting in redundant feature representations, low key information recognition, and sensitivity to imaging noise and individual differences. Furthermore, the lack of effective residual enhancement mechanisms or the use of simple residual structures makes them prone to feature degradation and hindered gradient propagation as network depth increases or multi-scale prediction branches multiply. Semantic inconsistencies exist between features at different scales, resulting in insufficient consistency and robustness of segmentation results. Finally, the fusion of multi-scale prediction results often involves simple stitching, lacking an adaptive weight allocation mechanism based on prediction reliability. This easily introduces redundant or unstable scale information, making the segmentation results sensitive to noise, artifacts, or local anomalies, thus limiting overall segmentation accuracy.
Summary of the Invention
[0004] To address the shortcomings of existing technologies, this invention proposes an automatic segmentation method and system for prostate MRI T2 sequence images, enabling automatic segmentation of prostate MRI images and the establishment of an intelligent segmentation system.
[0005] The technical solution to achieve the purpose of this invention is as follows:
[0006] An automatic segmentation method for prostate MRI T2 sequence images includes the following steps:
[0007] K MRI images of prostate patients were pre-collected. Each MRI image was labeled using a pixel labeling method, marking the corresponding prostate data position on the MRI image. A coordinate system was established with the lower left corner of the current MRI image as the origin. The coordinates of the marked prostate position were calculated to obtain the label coordinates. The pixel labels were then combined to generate a labeled image file. The MRI images and the labeled image files were combined to generate K sets of file pairs. The pre-collected MRI images were used in the training process.
[0008] The preprocessing process is executed, and the K sets of file pairs are processed using the scalar pruning method and normalized to obtain preprocessed data. The preprocessed data includes preprocessed NMR images and preprocessed marker images.
[0009] The data augmentation process is executed, and data augmentation methods are used to augment the preprocessed data to expand the limited training dataset and obtain training data. Simultaneously, the training data is divided into training set, validation set, and test set, accounting for 75%, 15%, and 10% respectively.
[0010] The training process is executed: the training data is input into the prostate prediction network, processed by the multi-path input module to obtain MRI enhanced images, processed by the U-Net network and the enhancement residual module to obtain gland prediction maps, and passed through the attention sub-network to extract features and obtain prostate segmentation blocks. The loss of the prostate segmentation blocks is calculated using Dice, the network weight parameters are updated using a two-dimensional loss function, and training completes upon convergence. The two-dimensional loss function is NLLLoss2d.
[0011] The prediction process involves collecting real-time MRI data and performing a preprocessing procedure before inputting it into a trained prostate prediction network. This generates prostate segmentation blocks, which are then stitched together to obtain a prostate segmentation map. The real-time MRI data is acquired and applied in real-time during the prediction process and does not include pre-acquired MRI data.
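The Dice term named in the training step can be illustrated with a small sketch. The function name `dice_loss`, the smoothing constant `eps`, and the use of flat Python lists are assumptions made for illustration; the source states only that Dice is used to calculate the loss of the prostate segmentation blocks.

```python
def dice_loss(pred, target, eps=1e-6):
    """Soft Dice loss for two equally sized flat sequences with values in [0, 1].

    `eps` is an assumed smoothing constant that avoids division by zero
    when both inputs are empty of foreground.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    intersection = sum(p * t for p, t in zip(pred, target))
    denominator = sum(pred) + sum(target)
    dice = (2.0 * intersection + eps) / (denominator + eps)
    return 1.0 - dice
```

A perfect match gives a loss close to 0 and a complete mismatch gives a loss close to 1, which is why the measure suits small foreground regions such as a gland.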
[0012] Furthermore, a pixel-marking method is used on the MRI data to mark the data locations corresponding to the prostate on the MRI data, including the following steps:
[0013] Pixels are extracted from each NMR image, and the pixel value and location information of each pixel are read to generate a pixel value file and a location file, which are sorted by pixel value. The first 80% of pixel values are extracted, and the location file is read to obtain the location information for those values. The corresponding pixels are located in the NMR image based on the location information, and the pixel values of the regions to which they belong are uniformly assigned a value of 1 and marked as the prostate region;
[0014] Extract the last 10% of pixel values, read the location file of the corresponding pixel value, obtain the location information, find the location of the corresponding pixel in the NMR data based on the location information, and assign the pixel value of the region to which the corresponding pixel belongs to 0 to mark it as a non-prostate region;
[0015] The remaining pixel values are extracted, and the pixel values are binary clustered using the size clustering method. The location file of the corresponding pixel value is read to obtain the location information. The pixel values of the large pixel locations are uniformly assigned a value of 1 and marked as the prostate region, while the pixel values of the small pixel locations are uniformly assigned a value of 0 and marked as the non-prostate region.
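The three-stage labeling rule above can be sketched as follows. Several points are assumptions, because the source does not fix them: the sort is taken to be descending, labels are assigned per pixel rather than per region, and the "size clustering method" is approximated by a one-dimensional two-means split. The names `two_means_split` and `label_pixels` are hypothetical.

```python
def two_means_split(values, iterations=20):
    """Return a threshold separating a 1-D list into a small and a large cluster."""
    c_small, c_large = float(min(values)), float(max(values))
    for _ in range(iterations):
        threshold = (c_small + c_large) / 2.0
        small = [v for v in values if v < threshold]
        large = [v for v in values if v >= threshold]
        if small:
            c_small = sum(small) / len(small)
        if large:
            c_large = sum(large) / len(large)
    return (c_small + c_large) / 2.0


def label_pixels(values, top_pct=80, bottom_pct=10):
    """Return one 0/1 label per pixel value in a flat list.

    Assumed reading of the text: the largest `top_pct` percent are labeled 1,
    the smallest `bottom_pct` percent are labeled 0, and the remainder is
    split into two clusters by value.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i], reverse=True)
    n_top = n * top_pct // 100
    n_bottom = n * bottom_pct // 100
    labels = [0] * n
    for i in order[:n_top]:
        labels[i] = 1  # first 80%: prostate region
    middle = order[n_top:n - n_bottom]  # the last 10% keep label 0
    if middle:
        threshold = two_means_split([values[i] for i in middle])
        for i in middle:
            labels[i] = 1 if values[i] >= threshold else 0
    return labels
```

For example, `label_pixels(list(range(1, 21)))` labels the values 1 to 3 as 0 and the values 4 to 20 as 1.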
[0016] Furthermore, scalar pruning is used to process the K sets of file pairs, and the data is normalized to obtain preprocessed data, including the following steps:
[0017] Extract MRI data and marker image files one by one from K sets of files. Set the scan value of a certain point in the MRI data as a scalar. Extract the maximum and minimum values of the scalar in the MRI data to obtain the global maximum and global minimum values. Find all prostate regions in the MRI data based on the marker image files.
[0018] Traverse the prostate region, find the maximum and minimum values of the scalar values in all prostate regions, obtain the local maximum and local minimum values, divide the local maximum value by the global maximum value to obtain the maximum value threshold, and divide the local minimum value by the global minimum value to obtain the minimum value threshold.
[0019] Traverse the prostate region, extract the maximum and minimum values of the current scalar value in the prostate region, and obtain the maximum and minimum values of the gland;
[0020] Divide the maximum value of the gland by the global maximum value to get the percentage of the maximum value, and divide the minimum value of the gland by the global minimum value to get the percentage of the minimum value;
[0021] The maximum and minimum percentages of all prostate regions are averaged to obtain the maximum mean and minimum mean. The inverse of the minimum mean is then superimposed with the maximum mean to obtain the clipping threshold.
[0022] When the clipping threshold is greater than the maximum threshold, update the clipping threshold to the maximum threshold; when the clipping threshold is less than the minimum threshold, update the clipping threshold to the minimum threshold.
[0023] Multiply all marker coordinates in the marker image file by the clipping threshold and overlay them with the marker coordinates to generate refined coordinates. Then, combine the pixel annotations to generate a refined image file.
[0024] The refined coordinates are matched with the left and right pixel coordinates of the NMR data. If the matching fails, the four coordinate points closest to the refined coordinates are found, and Kriging interpolation is used to interpolate the NMR data at the refined coordinates. The blocks are then cropped according to the refined coordinates to generate a refined NMR image.
[0025] The refined image file and the refined NMR image are normalized in the range of [-1,1] to obtain the preprocessed NMR image and the preprocessed labeled image, which are then combined to generate preprocessed data.
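As an illustration of the final normalization step, here is a minimal sketch that assumes linear min-max scaling; the source gives only the target range [-1, 1], not the scaling formula. The function name `normalize_to_unit_range` and the handling of constant inputs are assumptions.

```python
def normalize_to_unit_range(values):
    """Linearly rescale a flat list of numbers to the range [-1, 1].

    A constant input has no spread to rescale, so it is mapped to zeros
    (an assumed convention, not stated in the source).
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    scale = 2.0 / (hi - lo)
    return [(v - lo) * scale - 1.0 for v in values]
```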
[0026] Furthermore, data augmentation techniques are used to augment the preprocessed data, including the following steps:
[0027] Starting from the origin, the cropping window is set to 224×224 pixels. Simultaneously, the horizontal and vertical translation intervals are both 224 pixels. K preprocessed data images are traversed and cropped to obtain sub-NMR data and sub-labeled image files, forming K×M×N sub-file pairs. The coordinates of the lower left corner of the image in each sub-file pair are recorded synchronously to generate a coordinate file. Here, N is the ratio of the length of the NMR data to the length of the cropping window, and M is the ratio of the width of the NMR data to the width of the cropping window.
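The window traversal can be sketched as a computation of tile origins. This assumes that partial windows at the image edge are dropped, which the source does not state; `tile_origins` is a hypothetical helper name.

```python
def tile_origins(width, height, window=224, stride=224):
    """Return the lower-left (x, y) origin of every full window inside the image."""
    cols = (width - window) // stride + 1 if width >= window else 0
    rows = (height - window) // stride + 1 if height >= window else 0
    return [(c * stride, r * stride) for r in range(rows) for c in range(cols)]
```

For a 448×672 image this yields 2×3 = 6 origins, matching the M×N sub-file pairs per image described above, and the recorded origins play the role of the coordinate file used later when blocks are stitched back together.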
[0028] Rotate the sub-file pairs 90 degrees clockwise about the central axis to obtain K×M×N forward-rotated file pairs. Rotate the sub-file pairs 90 degrees counterclockwise about the central axis to obtain K×M×N reverse-rotated file pairs. This augments the limited training data and expands the dataset. Expand the dimensions of the sub-file pairs, forward-rotated file pairs, and reverse-rotated file pairs to construct training data with a size of 1×224×224 pixels. Divide the training data into a training set, validation set, and test set, with proportions of 75%, 15%, and 10%, respectively.
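The augmentation and split in this step can be sketched on nested lists. This reads the 90-degree operation as a rotation and splits the data in order without shuffling; both are assumptions, as are the helper names `rotate_cw`, `rotate_ccw`, `augment_pair`, and `split_dataset`.

```python
def rotate_cw(grid):
    """Rotate a 2-D nested list 90 degrees clockwise."""
    return [list(row) for row in zip(*grid[::-1])]


def rotate_ccw(grid):
    """Rotate a 2-D nested list 90 degrees counterclockwise."""
    return [list(row) for row in zip(*grid)][::-1]


def augment_pair(image, label):
    """Return the original pair and its two rotated copies.

    Each array is wrapped in a list to add the leading channel axis,
    giving the 1 x H x W layout described in the text.
    """
    pairs = [
        (image, label),
        (rotate_cw(image), rotate_cw(label)),
        (rotate_ccw(image), rotate_ccw(label)),
    ]
    return [([img], [lab]) for img, lab in pairs]


def split_dataset(items, train_pct=75, val_pct=15):
    """Split a list into training, validation, and test parts (75/15/10 by default)."""
    n_train = len(items) * train_pct // 100
    n_val = len(items) * val_pct // 100
    return items[:n_train], items[n_train:n_train + n_val], items[n_train + n_val:]
```

The image and its label are always rotated together so that the mask stays aligned with the anatomy.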
[0029] Furthermore, the training data is input into the prostate prediction network to obtain prostate segmentation blocks, including the following steps:
[0030] The training data is input into the prostate prediction network, which includes a multi-path input module, a U-Net module, an enhancement residual module, and an attention sub-network.
[0031] The multi-path input module inputs the training data into a 4-layer average pooling module. After each pooling layer, four NMR features with pixel sizes of 1×112×112, 1×56×56, 1×28×28, and 1×14×14 are obtained. The first three NMR features are input into the initial convolution module to obtain three NMR enhancement maps with pixel sizes of 64×112×112, 64×56×56, and 64×28×28. The fourth NMR feature is input into the initial convolution module and then processed by three double-layer convolution modules to obtain the fourth NMR enhancement map with a pixel size of 512×14×14. The convolution modules are conv modules, the initial convolution module is inconv module, which is an improved conv module, and the double-layer convolution module is double_conv, which is two convolution modules connected together.
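The multi-scale inputs described above can be illustrated with a plain-Python sketch of cascaded average pooling. It assumes a 2×2 window with stride 2, which is implied by the halving of sizes from 224 to 112, 56, 28, and 14 but not stated explicitly. The names `avg_pool_2x2` and `multi_scale_inputs` are hypothetical, and the convolution stages are omitted.

```python
def avg_pool_2x2(image):
    """Average-pool a 2-D nested list with a 2x2 window and stride 2."""
    height, width = len(image), len(image[0])
    return [
        [
            (image[r][c] + image[r][c + 1] + image[r + 1][c] + image[r + 1][c + 1]) / 4.0
            for c in range(0, width - 1, 2)
        ]
        for r in range(0, height - 1, 2)
    ]


def multi_scale_inputs(image, levels=4):
    """Apply the pooling step repeatedly and collect the output of each level."""
    outputs = []
    current = image
    for _ in range(levels):
        current = avg_pool_2x2(current)
        outputs.append(current)
    return outputs
```

Applied to a 224×224 input, the four outputs have side lengths 112, 56, 28, and 14, one per path of the multi-path input module.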
[0032] Four NMR-enhanced images and training data are input into the U-Net network to obtain five labeled features. The U-Net network includes four down modules, four up modules, and one initial convolutional module. The down and up modules are existing technologies and will not be described in detail.
[0033] The 5th labeled feature, with a pixel size of 64×224×224, is input into the 5th out module for output mapping, resulting in a gland prediction map with a pixel size of 2×224×224. The 4th labeled feature, with a pixel size of 64×112×112, is input into one residual convolution module and the 4th out module, resulting in a gland prediction map with a pixel size of 2×112×112. The 3rd labeled feature, with a pixel size of 128×56×56, is input into two residual convolution modules and the 3rd out module, resulting in a gland prediction map with a pixel size of 2×56×56. The 2nd labeled feature, with a pixel size of 256×28×28, is input into three residual convolution modules and the 2nd out module, resulting in a gland prediction map with a pixel size of 2×28×28. The 1st labeled feature, with a pixel size of 512×14×14, is input into four residual convolution modules and the 1st out module, resulting in a gland prediction map with a pixel size of 2×14×14. The residual convolution module is the res_conv module, and the out module includes batch normalization, ReLU activation, and conv channel compression.
[0034] Five gland prediction maps of different sizes are input into the attention subnetwork, including an enhanced attention module, an enhanced pooling module, and a tensor normalization module;
[0035] The enhanced attention module processes the i-th gland prediction map using batch normalization and ReLU activation to obtain a gland normalization map. Simultaneously, a 2×1 kernel convolution is input for channel compression to obtain a compressed gland map. The compressed gland map is then processed again using batch normalization and ReLU activation and input into a 1×1 kernel convolution module to traverse and obtain 5 gland feature maps. Through two batch normalization and ReLU activations, the 2×1 convolution compresses redundant channel information and aggregates local vertical context, while the 1×1 convolution performs linear reweighting and cross-channel fusion on the features of each channel to achieve linear correction of the prediction map, where i∈{1,2,3,4,5}.
[0036] The enhanced pooling module inputs the i-th gland feature map into the average pooling module, performs global average pooling in the spatial dimension and maps it to channel-level weights to obtain global feature weights. Simultaneously, the i-th gland feature map is input into the max pooling module, which enhances salient features through global max pooling to generate local feature weights, and weights them with global feature weights. This process iterates through the five gland enhancement weights to obtain five gland weight slices.
[0037] The slice with the highest weight among the gland weight slices is selected as the attention weight, and the five gland prediction maps are weighted according to the attention weight to generate the prostate segment block.
[0038] Furthermore, the NMR features are input into the initial convolutional module to obtain the enhanced NMR image. Specific steps include:
[0039] The q-th NMR feature of different sizes is first input into a 3×3 convolution module to obtain a refined NMR feature. Then, it is processed by a batch normalization and ReLU activation module to obtain a compressed NMR feature, which stabilizes the feature distribution and introduces a nonlinear expression. Finally, a 3×3 convolution module is used to process the compressed NMR feature to obtain an enhanced NMR map, where q∈{1,2,3} and the convolution module is a conv module.
[0040] Furthermore, the labeled features of different pixel sizes are fed into the residual convolution module for processing, including the following steps:
[0041] The labeled features are fed into a two-layer convolutional module to calculate the size of the output of the two-layer convolutional module. When the size of the output of the two-layer convolutional module is the same as the size of the labeled features, the output of the residual convolutional module is equal to the element-wise addition of the output of the two-layer convolutional module and the labeled features. Otherwise, the output of the residual convolutional module is equal to the element-wise addition of the output of the two-layer convolutional module and the output of the single-layer convolutional module. The single-layer convolutional module is called the one_conv module, and the residual convolutional module is called the res_conv module.
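The branching rule of the residual convolution module can be sketched independently of any deep learning framework. In this illustration features are flat lists, "size" is the list length, and `double_conv` and `one_conv` are passed in as plain callables; all of this is an assumption made to keep the example self-contained, not the source's implementation.

```python
def res_conv(x, double_conv, one_conv):
    """Add a shortcut to the output of `double_conv`.

    If the output has the same size as the input, the input itself is the
    shortcut (identity). Otherwise the input is first projected by
    `one_conv` so that the element-wise addition is well defined.
    """
    y = double_conv(x)
    shortcut = x if len(y) == len(x) else one_conv(x)
    if len(shortcut) != len(y):
        raise ValueError("one_conv must produce the same size as double_conv")
    return [a + b for a, b in zip(y, shortcut)]
```

A usage example with toy callables: `res_conv([1, 2, 3], lambda v: [2 * e for e in v], None)` takes the identity branch, while a `double_conv` that changes the length forces the projection branch.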
[0042] The prostate prediction network divides the training data into 75%, 15%, and 10% portions, with a maximum number of iterations of 10,000. It converges and stops iterating when the mean square error is less than 0.01. The learning rate is initially 0.001 and decays by 0.8 every 100 iterations.
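The training schedule can be written out as two small functions. This sketch assumes that "decays by 0.8" means the learning rate is multiplied by 0.8 every 100 iterations; the function names are hypothetical.

```python
def learning_rate(iteration, base_lr=0.001, decay=0.8, step=100):
    """Step-decayed learning rate: multiplied by `decay` once per `step` iterations."""
    return base_lr * decay ** (iteration // step)


def should_stop(iteration, mse, max_iterations=10000, tolerance=0.01):
    """Stop when the mean square error falls below the tolerance or the budget is spent."""
    return mse < tolerance or iteration >= max_iterations
```

Under this reading the rate is 0.001 for iterations 0 to 99, 0.0008 for iterations 100 to 199, and so on.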
[0043] An automatic segmentation system for prostate MRI T2 sequence images, used to execute the automatic segmentation method for prostate MRI T2 sequence images, includes an acquisition module, a preprocessing module, a data augmentation module, a training module, and a prediction module;
[0044] The acquisition module pre-collects K MRI images of prostate patients. Each MRI image is labeled with a pixel marker, marking the corresponding prostate data position. A coordinate system is established with the lower left corner of the current MRI image as the origin. The coordinates of the marked prostate position are calculated to obtain the marked coordinates. The pixel annotations are combined to generate a labeled image file. The MRI images and the labeled image files are combined to generate K sets of file pairs. The pre-collected MRI images are used in the training process.
[0045] The preprocessing module uses scalar pruning to process K sets of file pairs and normalizes them to obtain preprocessed data, which includes preprocessed NMR maps and preprocessed marker maps.
[0046] The data augmentation module uses data augmentation techniques to augment the preprocessed data, thereby expanding the limited training dataset to obtain training data. Simultaneously, the training data is divided into training set, validation set, and test set, accounting for 75%, 15%, and 10% respectively.
[0047] The training module inputs training data into the prostate prediction network, processes it using a multi-path input module to obtain an enhanced MRI image, processes it with the U-Net network and the enhancement residual to obtain a gland prediction image, extracts features using an attention sub-network to obtain prostate segmentation blocks, calculates the loss of the prostate segmentation blocks using Dice, updates the network weight parameters using a two-dimensional loss function, and completes training upon convergence. The two-dimensional loss function is NLLLoss2d.
[0048] The prediction module collects real-time MRI data and performs a preprocessing process. After extracting the red, green, and blue channel information, it inputs it into the trained prostate prediction network to generate prostate segmentation blocks and stitch them together to obtain a prostate segmentation map. The real-time MRI is acquired and applied in real time in the prediction process and does not include the pre-acquired MRI.
[0049] Compared with existing technologies, this invention pre-collects K MRI images of prostate patients using an acquisition module, performs pixel annotation and combination to generate labeled image files, combines the MRI data and labeled image files to generate K sets of file pairs, processes the K sets of file pairs using preprocessing methods, applies data augmentation to the preprocessed data, uses a multi-path input module to enhance the global perception and localization capabilities of the prostate region to obtain an enhanced MRI image, uses a U-Net network and an enhanced residual network to enhance discriminative information and suppress noise interference to obtain a gland prediction image, and uses an attention subnetwork to highlight high-confidence gland responses, weighted according to attention weights to generate a prostate segmentation map. This scheme constructs a cross-modal semantic mapping mechanism through multi-technology fusion, overcomes the bottleneck of manual intervention, systematically solves the technical bottleneck of automatic segmentation of prostate T2 MRI sequence images under multimodal spatial data fusion, realizes full-process automation from data acquisition to image segmentation, and significantly improves segmentation efficiency and system development accuracy.
Attached Figure Description
[0050] Figure 1 is a flowchart of the method.
[0051] Figure 2 is a structure diagram of the multi-path ResUNet.
[0052] Figure 3 is a diagram of the attention sub-network.
[0053] Figure 4 is a schematic diagram of the system.
Detailed Implementation
[0054] The present invention will be further described in detail below with reference to the accompanying drawings and embodiments.
[0055] Example 1
[0056] As shown in Figure 1, this invention discloses an automatic segmentation method for prostate MRI T2 sequence images, comprising the following steps:
[0057] K MRI images of prostate patients were pre-collected. Each MRI image was labeled using a pixel labeling method, marking the corresponding prostate data position on the MRI image. A coordinate system was established with the lower left corner of the current MRI image as the origin. The coordinates of the marked prostate position were calculated to obtain the label coordinates. The pixel labels were then combined to generate a labeled image file. The MRI images and the labeled image files were combined to generate K sets of file pairs. The pre-collected MRI images were used in the training process.
[0058] The preprocessing process is executed, and the K sets of file pairs are processed using the scalar pruning method and normalized to obtain preprocessed data. The preprocessed data includes preprocessed NMR images and preprocessed marker images.
[0059] The data augmentation process is executed, and data augmentation methods are used to augment the preprocessed data to expand the limited training dataset and obtain training data. Simultaneously, the training data is divided into training set, validation set, and test set, accounting for 75%, 15%, and 10% respectively.
[0060] The training process is executed: the training data is input into the prostate prediction network, processed by the multi-path input module to obtain MRI enhanced images, processed by the U-Net network and the enhancement residual module to obtain gland prediction maps, and passed through the attention sub-network to extract features and obtain prostate segmentation blocks. The loss of the prostate segmentation blocks is calculated using Dice, the network weight parameters are updated using a two-dimensional loss function, and training completes upon convergence. The two-dimensional loss function is NLLLoss2d.
[0061] The prediction process involves collecting real-time MRI data and performing a preprocessing procedure before inputting it into a trained prostate prediction network. This generates prostate segmentation blocks, which are then stitched together to obtain a prostate segmentation map. The real-time MRI data is acquired and applied in real-time during the prediction process and does not include pre-acquired MRI data.
[0062] Furthermore, a pixel-marking method is used on the MRI data to mark the data locations corresponding to the prostate on the MRI data, including the following steps:
[0063] Pixels are extracted from each NMR image, and the pixel value and location information of each pixel are read to generate a pixel value file and a location file, which are sorted by pixel value. The first 80% of pixel values are extracted, and the location file is read to obtain the location information for those values. The corresponding pixels are located in the NMR image based on the location information, and the pixel values of the regions to which they belong are uniformly assigned a value of 1 and marked as the prostate region;
[0064] Extract the last 10% of pixel values, read the location file of the corresponding pixel value, obtain the location information, find the location of the corresponding pixel in the NMR data based on the location information, and assign the pixel value of the region to which the corresponding pixel belongs to 0 to mark it as a non-prostate region;
[0065] The remaining pixel values are extracted, and the pixel values are binary clustered using the size clustering method. The location file of the corresponding pixel value is read to obtain the location information. The pixel values of the large pixel locations are uniformly assigned a value of 1 and marked as the prostate region, while the pixel values of the small pixel locations are uniformly assigned a value of 0 and marked as the non-prostate region.
[0066] Furthermore, scalar pruning is used to process the K sets of file pairs, and the data is normalized to obtain preprocessed data, including the following steps:
[0067] Extract MRI data and marker image files one by one from K sets of files. Set the scan value of a certain point in the MRI data as a scalar. Extract the maximum and minimum values of the scalar in the MRI data to obtain the global maximum and global minimum values. Find all prostate regions in the MRI data based on the marker image files.
[0068] Traverse the prostate region, find the maximum and minimum values of the scalar values in all prostate regions, obtain the local maximum and local minimum values, divide the local maximum value by the global maximum value to obtain the maximum value threshold, and divide the local minimum value by the global minimum value to obtain the minimum value threshold.
[0069] Traverse the prostate region, extract the maximum and minimum values of the current scalar value in the prostate region, and obtain the maximum and minimum values of the gland;
[0070] Divide the maximum value of the gland by the global maximum value to get the percentage of the maximum value, and divide the minimum value of the gland by the global minimum value to get the percentage of the minimum value;
[0071] The maximum and minimum percentages of all prostate regions are averaged to obtain the maximum mean and minimum mean. The inverse of the minimum mean is then superimposed with the maximum mean to obtain the clipping threshold.
[0072] When the clipping threshold is greater than the maximum threshold, update the clipping threshold to the maximum threshold; when the clipping threshold is less than the minimum threshold, update the clipping threshold to the minimum threshold.
[0073] Multiply all marker coordinates in the marker image file by a clipping threshold and overlay them with the marker coordinates to generate refined coordinates. Then, perform pixel annotation combination to generate a refined image file, thereby achieving refined clipping of the prostate region in the marker image file and improving the marking accuracy.
[0074] The refined coordinates are matched with the left and right pixel coordinates of the NMR data. If the matching fails, the four coordinate points closest to the refined coordinates are found, and Kriging interpolation is used to interpolate the NMR data at the refined coordinates. The blocks are then cropped according to the refined coordinates to generate a refined NMR image.
[0075] The refined image file and the refined NMR image are normalized in the range of [-1,1] to obtain the preprocessed NMR image and the preprocessed labeled image, which are then combined to generate preprocessed data.
[0076] Furthermore, data augmentation techniques are used to augment the preprocessed data, including the following steps:
[0077] Starting from the origin, the cropping window is set to 224×224 pixels. Simultaneously, the horizontal and vertical translation intervals are both 224 pixels. K preprocessed data images are traversed and cropped to obtain sub-NMR data and sub-labeled image files, forming K×M×N sub-file pairs. The coordinates of the lower left corner of the image in each sub-file pair are recorded synchronously to generate a coordinate file. Here, N is the ratio of the length of the NMR data to the length of the cropping window, and M is the ratio of the width of the NMR data to the width of the cropping window.
[0078] Rotate the sub-file pairs 90 degrees clockwise about the central axis to obtain K×M×N forward-rotated file pairs. Rotate the sub-file pairs 90 degrees counterclockwise about the central axis to obtain K×M×N reverse-rotated file pairs. This augments the limited training data and expands the dataset. Expand the dimensions of the sub-file pairs, forward-rotated file pairs, and reverse-rotated file pairs to construct training data with a size of 1×224×224 pixels. Divide the training data into a training set, validation set, and test set, with proportions of 75%, 15%, and 10%, respectively.
[0079] Furthermore, the training data is input into the prostate prediction network to obtain prostate segmentation blocks, including the following steps:
[0080] The training data is input into the prostate prediction network, which includes a multi-path input module, a U-Net module, an enhancement residual module, and an attention sub-network.
[0081] The multi-path input module inputs the training data into a four-layer average pooling module. After each pooling layer, four NMR features with pixel sizes of 1×112×112, 1×56×56, 1×28×28, and 1×14×14 are obtained, respectively. This reduces the resolution while preserving the overall structural information, forming a multi-scale representation from shallow to deep and enabling the learning of local details and global context. The first three NMR features are then input into the initial convolution module, resulting in three NMR enhancement maps with pixel sizes of 64×112×112, 64×56×56, and 64×28×28, respectively. Channel expansion... The initial semantic mapping enhances the boundary and texture response and improves the discriminativeness and noise resistance of shallow features. The fourth NMR feature is input into the initial convolution module and then processed by three double-layer convolution modules to obtain the fourth NMR enhancement map with a pixel size of 512×14×14. Deeper convolution stacking expands the receptive field and refines high-level semantics, which enhances global perception and localization of the prostate region. The convolution module is a conv module; the initial convolution module is the inconv module, which is an improved conv module; and the double-layer convolution module is double_conv, which is two convolution modules connected together.
[0082] Four NMR-enhanced images and training data are input into the U-Net network to obtain five labeled features. The U-Net network includes four down modules, four up modules, and one initial convolutional module. The down and up modules are existing technologies and will not be described in detail.
[0083] The fifth labeled feature, with a size of 64×224×224, is input into the fifth out module for output mapping, resulting in a gland prediction map with a size of 2×224×224. The fourth labeled feature, with a size of 64×112×112, is input into one residual convolution module and the fourth out module, resulting in a gland prediction map with a size of 2×112×112. The third labeled feature, with a size of 128×56×56, is input into two residual convolution modules and the third out module, resulting in a gland prediction map with a size of 2×56×56. The second labeled feature, with a size of 256×28×28, is input into three residual convolution modules and the second out module, resulting in a gland prediction map with a size of 2×28×28. The first labeled feature, with a size of 512×14×14, is input into four residual convolution modules and the first out module, resulting in a gland prediction map with a size of 2×14×14. In the multi-scale prediction process, the residual module stabilizes the feature distribution, enhances discriminative information, and suppresses noise interference, so that the model maintains high segmentation accuracy and good generalization ability under complex prostate morphology and diverse imaging conditions. The residual convolution module is the res_conv module, and the out module includes batch normalization, ReLU activation, and conv channel compression.
[0084] The five gland prediction maps of different sizes are input into the attention subnetwork, which includes an enhanced attention module, an enhanced pooling module, and a tensor normalization module.
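The per-scale arrangement described above follows a regular pattern, which the sketch below tabulates. The function name `head_plan` and the closed-form expressions for the side length and the number of res_conv modules are illustrative assumptions derived from the sizes listed in the text.

```python
# Channel counts of the five labeled features, indexed as in the description.
LABELED_FEATURE_CHANNELS = {1: 512, 2: 256, 3: 128, 4: 64, 5: 64}

def head_plan(index):
    """Return (input shape, number of res_conv modules, output shape) for one labeled feature."""
    if index not in LABELED_FEATURE_CHANNELS:
        raise ValueError("index must be between 1 and 5")
    side = 14 * 2 ** (index - 1)   # 14, 28, 56, 112, 224
    n_res_conv = 5 - index         # 4, 3, 2, 1, 0
    in_shape = (LABELED_FEATURE_CHANNELS[index], side, side)
    out_shape = (2, side, side)    # two-channel gland prediction map
    return in_shape, n_res_conv, out_shape

plans = [head_plan(i) for i in range(1, 6)]
```

Deeper, lower-resolution features pass through more residual convolution modules before their out module, while the full-resolution feature goes directly to its out module.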
[0085] The enhanced attention module processes the i-th gland prediction map using batch normalization and ReLU activation to obtain a gland normalization map. The gland normalization map is then input into a 2×1 kernel convolution for channel compression to obtain a compressed gland map. The compressed gland map is processed again using batch normalization and ReLU activation and input into a 1×1 kernel convolution module to obtain the i-th gland feature map. Traversing all i yields five gland feature maps. The two rounds of batch normalization and ReLU activation standardize the feature distribution of the gland prediction map and introduce nonlinear enhancement to highlight high-confidence gland responses and suppress background noise, achieving attention-guided feature recalibration and boundary refinement. The 2×1 convolution compresses redundant channel information and aggregates local vertical context to obtain a more compact discriminative representation. The 1×1 convolution performs linear reweighting and cross-channel fusion on the features of each channel to achieve linear correction of the prediction map and improve the accuracy and consistency of gland prediction, where i∈{1,2,3,4,5}.
[0086] The enhanced pooling module inputs the i-th gland feature map into the average pooling module, which performs global average pooling in the spatial dimension and maps the result to channel-level weights to obtain global feature weights. Simultaneously, the i-th gland feature map is input into the max pooling module, which enhances salient features through global max pooling to generate local feature weights. The local feature weights are weighted with the global feature weights to obtain the i-th gland enhancement weight. Traversing the five gland feature maps yields five gland enhancement weights. While extracting key gland structure and boundary information, this module attends to local details and captures global features at the same time, achieving adaptive recalibration and noise suppression of gland regions and improving the accuracy of gland boundary judgment.
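A minimal sketch of the enhanced pooling step is given below. The text does not state how the local feature weights are combined with the global feature weights, so the channel-wise product used here is an assumption, as are the function names and the channel-first nested-list layout.

```python
def global_avg_pool(feature_map):
    """Per-channel mean over the spatial dimensions (feature_map is [channel][row][col])."""
    return [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_map
    ]

def global_max_pool(feature_map):
    """Per-channel maximum over the spatial dimensions."""
    return [max(max(row) for row in channel) for channel in feature_map]

def enhancement_weights(feature_map):
    """Combine global (average) and local (max) weights; the product is an assumed choice."""
    global_weights = global_avg_pool(feature_map)
    local_weights = global_max_pool(feature_map)
    return [g * m for g, m in zip(global_weights, local_weights)]

# Two-channel toy feature map.
feature_map = [
    [[1.0, 3.0], [5.0, 7.0]],
    [[0.0, 0.0], [0.0, 2.0]],
]
weights = enhancement_weights(feature_map)
```

Average pooling summarizes the whole map, while max pooling responds to the single strongest activation, so the combined weight reflects both.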
[0087] The tensor normalization module concatenates the five gland enhancement weights through the cat module and inputs them into SoftMax to perform the normalization process, resulting in five gland weight slices. This enables adaptive allocation of the relative importance of the gland enhancement weights, avoids weight scale drift, highlights more critical gland responses, and improves the stability and discriminativeness of multi-gland feature fusion.
[0088] The slice with the highest weight among the gland weight slices is selected as the attention weight, and the five gland prediction maps are weighted according to the attention weight to generate the prostate segmentation block.
[0089] Furthermore, inputting the NMR features into the initial convolutional module to obtain the enhanced NMR map includes the following steps:
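The SoftMax normalization and the selection of the highest-weight slice can be sketched as follows. For illustration, each gland enhancement weight is assumed to be a single scalar per scale; the function names are hypothetical, and the subsequent weighting of the prediction maps is not shown because the text does not specify how maps of different sizes are combined.

```python
import math

def softmax(values):
    """Numerically stable softmax over a list of scalars."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def select_attention_weight(enhancement_weights):
    """Normalize the weights and return (index, value) of the largest slice plus all slices."""
    slices = softmax(enhancement_weights)
    best = max(range(len(slices)), key=lambda i: slices[i])
    return best, slices[best], slices

# Five hypothetical gland enhancement weights, one per prediction scale.
best_index, attention_weight, weight_slices = select_attention_weight(
    [0.1, 0.5, 2.0, 0.3, 1.0]
)
```

Because SoftMax maps the weights onto a common scale that sums to one, the selected slice expresses relative rather than absolute importance.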
[0090] The q-th NMR feature is first input into a 3×3 convolutional module to obtain a refined NMR feature, which enhances local texture and boundary response, suppresses the detail loss caused by downsampling, and improves the separability of shallow features. The refined NMR feature is then processed by a batch normalization and ReLU activation module to obtain a compressed NMR feature, which stabilizes the feature distribution and introduces nonlinear expression. Finally, a 3×3 convolutional module processes the compressed NMR feature to obtain an enhanced NMR map, which integrates the neighborhood context and highlights key structural regions, improving the feature fusion effect and discrimination ability. Here, q∈{1,2,3}, and the convolutional module is a conv module.
[0091] Furthermore, feeding the labeled features of different sizes into the residual convolution module for processing includes the following steps:
[0092] The labeled features are fed into a double-layer convolutional module, and the size of its output is computed. When the output of the double-layer convolutional module has the same size as the labeled features, the output of the residual convolutional module equals the element-wise sum of the output of the double-layer convolutional module and the labeled features. Otherwise, the output of the residual convolutional module equals the element-wise sum of the output of the double-layer convolutional module and the output of the single-layer convolutional module. The single-layer convolutional module is the one_conv module, and the residual convolutional module is the res_conv module. The prostate prediction network is trained on training data divided into 75%, 15%, and 10%, with a maximum of 10,000 iterations. Training is considered converged and stops when the mean squared error is less than 0.01. The learning rate is initially 0.001 and decays by 0.8 every 100 iterations.
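The shape-dependent shortcut and the training schedule can be sketched as below. Several points are assumptions: the convolutions are passed in as stand-in callables, the single-layer module is assumed to act on the labeled features themselves, "decays by 0.8" is read as multiplying the learning rate by 0.8, and all function names are hypothetical.

```python
def shape(feature):
    """(channels, length) of a feature stored as a list of flattened channels."""
    return (len(feature), len(feature[0]))

def add(a, b):
    """Element-wise sum of two features with identical shape."""
    return [[x + y for x, y in zip(ca, cb)] for ca, cb in zip(a, b)]

def res_conv(feature, double_conv, one_conv):
    """Residual rule: identity shortcut when shapes match, one_conv shortcut otherwise."""
    main = double_conv(feature)
    if shape(main) == shape(feature):
        return add(main, feature)
    return add(main, one_conv(feature))

def learning_rate(iteration, base_lr=0.001, decay=0.8, step=100):
    """Step decay: multiply by `decay` once every `step` iterations (assumed reading)."""
    return base_lr * decay ** (iteration // step)

def should_stop(iteration, mse, max_iterations=10000, tolerance=0.01):
    """Stop on convergence (MSE below tolerance) or when the iteration budget is spent."""
    return mse < tolerance or iteration >= max_iterations
```

The single-layer shortcut only comes into play when the double-layer path changes the feature size, for example when the channel count changes.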
[0093] Example 2
[0094] As shown in Figure 4, the present invention also discloses an automatic segmentation system for prostate MRI T2 sequence images, used to execute the aforementioned automatic segmentation method for prostate MRI T2 sequence images. The system includes an acquisition module, a preprocessing module, a data augmentation module, a training module, and a prediction module.
[0095] The acquisition module pre-collects K MRI images of prostate patients. Each MRI image is labeled with a pixel marker, marking the corresponding prostate data position. A coordinate system is established with the lower left corner of the current MRI image as the origin. The coordinates of the marked prostate position are calculated to obtain the marked coordinates. The pixel annotations are combined to generate a labeled image file. The MRI images and the labeled image files are combined to generate K sets of file pairs. The pre-collected MRI images are used in the training process.
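The coordinate convention described above, with the origin at the lower left corner of the image, can be illustrated with the sketch below. It assumes the label map is stored as a nested list with the top image row first, that x runs along columns, and that y increases upward; the function name is hypothetical.

```python
def marked_coordinates(mask):
    """Return (x, y) for every pixel labeled 1, with the origin at the lower-left corner."""
    height = len(mask)
    coords = []
    for row_index, row in enumerate(mask):
        for col_index, value in enumerate(row):
            if value == 1:
                # Row 0 is the top of the image, so flip it to measure y from the bottom.
                coords.append((col_index, height - 1 - row_index))
    return coords

mask = [
    [0, 1],
    [1, 0],
]
coords = marked_coordinates(mask)
```

With this convention the bottom-left pixel of the image has coordinates (0, 0).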
[0096] The preprocessing module uses scalar cropping to process the K sets of file pairs and normalizes them to obtain preprocessed data, which includes preprocessed NMR maps and preprocessed marker maps.
[0097] The data augmentation module uses data augmentation techniques to augment the preprocessed data, thereby expanding the limited training dataset to obtain training data. Simultaneously, the training data is divided into training set, validation set, and test set, accounting for 75%, 15%, and 10% respectively.
[0098] The training module inputs the training data into the prostate prediction network. The multi-path input module processes the data to obtain an enhanced MRI image, the U-Net network and the enhancement residual module then produce a gland prediction image, and the attention sub-network extracts features to obtain prostate segmentation blocks. The loss of the prostate segmentation blocks is calculated using Dice, the network weight parameters are updated using a two-dimensional loss function, and training is completed upon convergence. The two-dimensional loss function is NLLLoss2d.
[0099] The prediction module collects real-time MRI data and performs the preprocessing procedure on it before inputting it into the trained prostate prediction network to generate prostate segmentation blocks, which are stitched together to obtain a prostate segmentation map. The real-time MRI data is acquired and used during the prediction process and does not include the pre-collected MRI images.
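The two loss terms named above can be sketched in plain Python as follows. This is an illustrative formulation only: the smoothing constant `eps`, the mean reduction over pixels, and the function names are assumptions, and the text does not specify how the two terms are combined during training.

```python
import math

def dice_loss(pred, target, eps=1e-6):
    """Soft Dice loss on flattened foreground probabilities and binary labels."""
    intersection = sum(p * t for p, t in zip(pred, target))
    denominator = sum(pred) + sum(target)
    return 1.0 - (2.0 * intersection + eps) / (denominator + eps)

def nll_loss_2d(log_probs, target):
    """Mean negative log-likelihood over pixels.

    log_probs is indexed [class][row][col]; target holds a class index per pixel.
    """
    total = 0.0
    count = 0
    for r, row in enumerate(target):
        for c, cls in enumerate(row):
            total -= log_probs[cls][r][c]
            count += 1
    return total / count

# A 1 x 2 image with two classes and uniform predicted probabilities.
uniform_log_probs = [[[math.log(0.5), math.log(0.5)]],
                     [[math.log(0.5), math.log(0.5)]]]
example_nll = nll_loss_2d(uniform_log_probs, [[0, 1]])
```

Dice measures region overlap and is less sensitive to the imbalance between gland and background pixels, whereas the negative log-likelihood penalizes each pixel independently.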
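The stitching of prostate segmentation blocks into a full segmentation map can be sketched as below. The row-major ordering of the blocks and the function name are assumptions; in practice the block positions would come from whatever coordinate record is kept when the image is cropped.

```python
def stitch(blocks, grid_rows, grid_cols):
    """Assemble equally sized 2D blocks, given in row-major order, into one 2D map."""
    if len(blocks) != grid_rows * grid_cols:
        raise ValueError("number of blocks does not match the grid")
    block_height = len(blocks[0])
    stitched = []
    for grid_row in range(grid_rows):
        for r in range(block_height):
            row = []
            for grid_col in range(grid_cols):
                row.extend(blocks[grid_row * grid_cols + grid_col][r])
            stitched.append(row)
    return stitched

# Four 2 x 2 toy blocks arranged on a 2 x 2 grid.
toy_blocks = [[[k, k], [k, k]] for k in range(4)]
segmentation_map = stitch(toy_blocks, grid_rows=2, grid_cols=2)
```

For 224×224 blocks the same function applies unchanged, since it only relies on all blocks sharing one size.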
[0100] This invention discloses an automatic segmentation method and system for prostate T2 MRI sequence images. The acquisition module acquires K MRI images of prostate patients, pixel annotations are made and combined to generate labeled image files, and the MRI data and labeled image files are combined to generate K file pairs. These K file pairs are preprocessed, and data augmentation is then applied to the preprocessed data. A multi-path input module enhances the global perception and localization capabilities for the prostate region, resulting in an enhanced MRI image. A U-Net network and an enhanced residual network enhance discriminative information and suppress noise interference, yielding a gland prediction image. An attention subnetwork highlights high-confidence gland responses, and the gland prediction images are weighted according to the attention weights to generate a prostate segmentation map. By constructing a cross-modal semantic mapping mechanism through multi-technology fusion, this scheme overcomes the bottleneck of manual intervention and systematically solves the technical bottleneck of automatic segmentation of prostate T2 MRI sequence images under multi-modal spatial data fusion. It achieves full-process automation from data acquisition to image segmentation, significantly improving segmentation efficiency and system development accuracy.
[0101] The above description is merely a preferred embodiment of the present invention, and the scope of protection of the present invention is not limited to the above embodiments. All technical solutions falling within the scope of the present invention's concept are within the scope of protection of the present invention. It should be noted that for those skilled in the art, any improvements and modifications made without departing from the principle of the present invention should also be considered within the scope of protection of the present invention.
Claims
1. An automatic segmentation method for prostate MRI T2 sequence images, characterized in that it includes the following steps: K MRI images of prostate patients are pre-collected and labeled one by one using pixel labeling to generate K labeled image files, which are then combined with the K MRI images to form K file pairs; a preprocessing process is executed, in which the K file pairs are cropped into blocks and normalized using the scalar cropping method to obtain preprocessed data; data augmentation techniques are used to enhance the preprocessed data to obtain training data; a training process is executed, in which the training data is input into the prostate prediction network, processed by a multi-path input module to obtain an enhanced MRI image, processed by the U-Net network and the enhancement residual to obtain a gland prediction image, and passed through an attention sub-network that extracts features to obtain prostate segmentation blocks, the loss of the prostate segmentation blocks is calculated using Dice, the network weight parameters are updated using a two-dimensional loss function, and training is completed upon convergence, the two-dimensional loss function being NLLLoss2d; and after real-time MRI data is acquired and the preprocessing process is performed on it, the data is input into the trained prostate prediction network to generate prostate segmentation blocks, which are then stitched together to obtain a prostate segmentation map.
2. The automatic segmentation method for prostate MRI T2 sequence images as described in claim 1, characterized in that annotating using pixel labeling includes the following steps: all pixel values and locations are extracted from each NMR image, and pixel value files and location files are generated and sorted; the first 80% of the pixels are assigned 1 and marked as the prostate region, and the last 10% of the pixels are assigned 0 and marked as the non-prostate region; the remaining pixel values are clustered by size, with large pixel values assigned 1 to mark the prostate region and the remaining values assigned 0 to mark the non-prostate region.
3. The method for automatic segmentation of prostate MRI T2 sequence images as described in claim 1, characterized in that using scalar cropping to process the K sets of file pairs and normalizing them to obtain preprocessed data includes the following steps: MRI data and marker map files are extracted one by one from the K file pairs; the MRI data scan value is set as a scalar, the maximum and minimum values of the scalar in the MRI data are extracted to obtain the global maximum and minimum values, and all prostate regions in the MRI data are found based on the marker map files; all marker coordinates in the marker map file are multiplied by the clipping threshold and overlaid with the marker coordinates to generate refined coordinates, and the pixel annotations are then combined to generate a refined image file; the refined coordinates are matched with the left and right pixel coordinates of the NMR data, and if the matching fails, the four coordinate points closest to the refined coordinates are found and interpolated; the block is then cropped according to the refined coordinates to generate a refined NMR image; and the refined image file and the refined NMR image are normalized and combined to generate preprocessed data.
4. The method for automatic segmentation of prostate MRI T2 sequence images as described in claim 1, characterized in that using data augmentation to enhance the preprocessed data includes the following steps: the cropping window is set to 224×224 pixels, and the horizontal and vertical translation intervals are both 224 pixels; the K preprocessed data are traversed and cropped to obtain sub-NMR data and sub-label map files, which are combined into sub-file pairs, and a coordinate file is generated, where N is the ratio of the NMR data length to the cropping window length, and M is the ratio of the NMR data width to the cropping window width; the sub-file pairs are rotated 90 degrees clockwise and counterclockwise about the central axis to obtain forward-reverse file pairs and reverse-reverse file pairs; and the sub-file pairs, forward-reverse file pairs, and reverse-reverse file pairs are then combined to form pre-training data.
5. The method for automatic segmentation of prostate MRI T2 sequence images as described in claim 1, characterized in that processing with the multi-path input module to obtain an enhanced MRI image includes the following steps: the training data is input into a 4-layer average pooling module to obtain 4 NMR features of different sizes; the first 3 NMR features are input into an initial convolution module to obtain 3 NMR enhancement maps of different sizes; and the 4th NMR feature is input into the initial convolution module and simultaneously input into a 3-layer double convolution module to obtain the 4th NMR enhancement map, the double convolution module being a double_conv module.
6. The method for automatic segmentation of prostate MRI T2 sequence images as described in claim 1, characterized in that obtaining a gland prediction map after enhancement residual processing includes the following steps: the fifth labeled feature is input into the fifth out module for output mapping, and the gland prediction map is output; the remaining labeled features are input into the residual convolution module, residual convolution module *2, residual convolution module *3, and residual convolution module *4 respectively, and then passed through the corresponding out modules to output the gland prediction maps, the residual convolution module being the res_conv module.
7. The method for automatic segmentation of prostate MRI T2 sequence images as described in claim 1, characterized in that using the attention subnetwork to extract features includes the following steps: the attention subnetwork includes an enhanced attention module, an enhanced pooling module, and a tensor normalization module; the enhanced attention module is used to obtain the gland feature map, the enhanced pooling module is used to obtain the gland enhancement weights, and the tensor normalization module is used to obtain the gland weight slices.
8. The method for automatic segmentation of prostate MRI T2 sequence images as described in claim 7, characterized in that using the enhanced attention module to obtain gland feature maps includes the following steps: the enhanced attention module processes the gland prediction map using batch normalization and ReLU activation to obtain a normalized gland map; the normalized gland map is simultaneously input into a convolutional module for channel compression, resulting in a compressed gland map; the compressed gland map is processed again using batch normalization and ReLU activation and input into the convolutional module to obtain a gland feature map; through the two rounds of batch normalization and ReLU activation, redundant channel information is compressed via convolution, and linear reweighting and cross-channel fusion are then performed on the features of each channel to achieve linear correction of the gland feature map.
9. The method for automatic segmentation of prostate MRI T2 sequence images as described in claim 7, characterized in that using the enhanced pooling module to obtain gland enhancement weights includes the following steps: the enhanced pooling module inputs the gland feature map into the average pooling module to obtain global feature weights, and simultaneously inputs the gland feature map into the max pooling module to generate local feature weights, which are then weighted with the global feature weights to obtain gland enhancement weights; the tensor normalization module concatenates the gland enhancement weights through the cat module and inputs them into SoftMax to perform a normalization process, resulting in gland weight slices.
10. An automatic segmentation system for prostate MRI T2 sequence images, characterized in that it includes an acquisition module, a preprocessing module, a data augmentation module, a training module, and a prediction module; the acquisition module pre-acquires K MRI images of prostate patients and annotates each image using pixel labeling, generating K labeled image files and combining them with the K MRI images to form K file pairs; the preprocessing module uses scalar cropping to process the K sets of file pairs and normalizes them to obtain preprocessed data; the data augmentation module uses data augmentation techniques to enhance the preprocessed data to obtain training data; the training module executes the training process and inputs the training data into the prostate prediction network, in which the multi-path input module processes the data to obtain the enhanced MRI image, the U-Net network and the enhancement residual produce the gland prediction image, the attention sub-network extracts features to obtain prostate segmentation blocks, the loss of the prostate segmentation blocks is calculated through Dice, the network weight parameters are updated using a two-dimensional negative log-likelihood loss function, and training is completed upon convergence, NLLLoss2d being the two-dimensional negative log-likelihood loss function; and the prediction module collects real-time MRI data and performs the preprocessing procedure on it before inputting it into the trained prostate prediction network to generate prostate segmentation blocks, which are stitched together to obtain a prostate segmentation map.