A method and apparatus for cerebral artery segmentation classification based on MRI images
By constructing an end-to-end U-shaped encoder-decoder network that uses a Swin Transformer and multi-scale feature fusion, the invention addresses the problems of long imaging time, noise interference, and class imbalance in cerebral artery segment classification from MRI images, and achieves accurate vascular segment classification.
Patent Information
- Authority / Receiving Office: CN · China
- Patent Type: Patents(China)
- Current Assignee / Owner: ZHEJIANG UNIV OF TECH
- Filing Date: 2026-02-09
- Publication Date: 2026-06-30
AI Technical Summary
Existing MRI image processing methods suffer from problems such as long imaging time, uneven image quality, noise interference, and class imbalance in cerebral artery segmentation and classification, resulting in large segmentation errors and low accuracy, making it difficult to achieve accurate blood vessel segmentation and classification.
An end-to-end U-shaped encoder-decoder network is constructed for MRI images. A Swin Transformer module is used for feature extraction. Multi-scale feature fusion and graph structure updating, combined with data augmentation, improve the robustness and feature representation ability of the model, thereby achieving accurate classification of blood vessel segments.
It significantly improves the accuracy and robustness of cerebral artery segment classification, enabling precise segment classification while maintaining the integrity of vascular topology, reducing imaging time and improving image quality.
Figure CN121686115B_ABST
Description
Technical Field
[0001] This invention relates to the field of medical image processing, and more specifically to a method and device for classifying cerebral arteries based on MRI images.
Background Technology
[0002] As a vital circulatory pathway in the human body, the precise anatomical division of the cerebral arterial system is crucial for the diagnosis and treatment of neurovascular diseases. Achieving accurate segmentation and classification of cerebral arteries, including major branches of the Circle of Willis such as the anterior, middle, and posterior cerebral arteries and their respective subdivisions, can assist clinicians in accurately locating vascular lesions. This provides important reference for the diagnosis of cerebral aneurysms, the treatment of ischemic stroke, and navigation for endovascular interventional procedures.
[0003] Magnetic resonance imaging (MRI) is one of the most important tools in medical diagnosis and research. Due to its lack of radiation damage and high soft tissue resolution, MRI is widely used in clinical diagnosis. However, the long acquisition time required for MRI often causes discomfort to patients. Therefore, how to shorten MRI time without reducing, or even improving, the overall or local image quality is a key research focus and challenge.
[0004] Furthermore, during MRI acquisition, factors such as the imaging mechanism, the imaging equipment, and individual differences lead to gray-level inhomogeneity in the acquired MR images. This manifests mainly as significant gray-level differences within the same tissue type or substantial gray-level overlap between different tissue types. This inhomogeneity is known as the bias field of MRI images. The bias field causes significant errors in tissue segmentation of MRI images, leading many segmentation methods that rely on gray-level information to produce incorrect results. Moreover, MRI image processing methods often require substantial manual intervention, which is time-consuming and error-prone, so automated and efficient image processing methods are crucial for improving diagnostic efficiency. Additionally, post-processing of MRI images is affected by various factors and can yield blurred or low-resolution images, which impairs the physician's accurate judgment of lesion locations.
[0005] In recent years, deep learning algorithms have demonstrated tremendous potential in MRI analysis, attracting an increasing number of researchers to this challenging problem, with work focusing primarily on fundamental tasks such as image denoising, vessel enhancement, and vessel-background segmentation. For example, building on the strong image generation performance of the denoising diffusion probabilistic model (DDPM) and score-based generative models, diffusion models have been used to reconstruct undersampled MRI images with better reconstruction results. However, the slow sampling speed of diffusion models during imaging is a major obstacle to their adoption. Furthermore, in practical applications, acquisition time is often shortened by altering the basic acquisition parameters of existing sequences, which reduces accuracy.
[0006] Furthermore, MRI imaging is inevitably affected by inherent system noise, resulting in images with low signal-to-noise ratios, which significantly hinders the classification of cerebral blood vessels. In addition, the complex structure of cerebral arteries leads to varying proportions of different vessel segments. This class imbalance makes models prone to getting trapped in local optima focused on identifying large-scale vessel segments, neglecting the learning of features from smaller, equally clinically important segments. Achieving accurate cerebral artery segment classification requires not only overcoming the limitations of traditional image processing methods but also developing novel deep learning architectures and feature representation techniques.
Summary of the Invention
[0007] The purpose of this invention is to provide a method and device for cerebral artery segmentation and classification based on MRI images. An end-to-end U-shaped encoder-decoder network for cerebral artery segmentation and classification is constructed. Pre-training methods and data augmentation techniques are used to encode image features and enhance data diversity. The Swin Transformer module is used as the encoder layer for latent vector feature extraction, effectively improving the model's feature representation capability. A multi-scale feature fusion module enhances and reconstructs the original features to optimize the segmentation model at the latent space feature level. Finally, the decoder network upsamples the reconstructed features to obtain more accurate vessel segment classification results.
[0008] To achieve the above-mentioned objectives, an embodiment provides a method for classifying cerebral arteries based on MRI images, comprising the following steps:
[0009] Step 1: Using MRI 3D images of cerebral arteries as input, preprocessing is performed to obtain standardized vessel volume images;
[0010] Step 2: Using standardized blood vessel volume images as input, hierarchical feature extraction is performed through a Swin Transformer-based encoder to obtain multi-scale features;
[0011] Step 3: Based on multi-scale features, establish a graph structure through the image fusion module to form multi-level features, and update the graph structure to realize the updating and enhancement of graph node features, thus obtaining the updated multi-level features;
[0012] Step 4: Input the updated multi-level features into the decoding module, and reconstruct the feature map by upsampling layer by layer and fusing it with the multi-level features output by the image fusion module, and output the cerebral artery segmentation classification results.
[0013] In one embodiment, the preprocessing in step 1 includes data format conversion, image cropping, resampling, and standardization;
[0014] The data format conversion includes: converting the input MRI images into a four-dimensional NumPy array format and saving the corresponding metadata;
[0015] The image cropping includes: identifying non-zero regions in the image after data format conversion, where the non-zero regions contain actual cerebral artery structure information, calculating the minimum three-dimensional bounding box of the non-zero regions and cropping them;
[0016] The resampling includes: resampling the cropped image and unifying the voxel spacing of all samples, wherein the voxel represents the smallest sampling unit of a blood vessel in three-dimensional space;
[0017] The standardization includes: standardizing the resampled MRI images based on the mean and variance of a single image, and clipping the image intensity to remove extreme noise intensity values.
[0018] In one embodiment, the encoder in step 2 comprises four sequentially stacked encoding stages, each consisting of a Swin Transformer module for progressively capturing semantic features from local to global, with downsampling via an image patch merging layer after each stage.
[0019] Each Swin Transformer module includes: a window-based multi-head self-attention layer, a shift-window-based multi-head self-attention layer, and a multilayer perceptron.
[0020] In one embodiment, hierarchical feature extraction is performed at each encoding stage, specifically including:
[0021] The input standardized image is divided into multiple image patches through an image patch embedding layer. Each image patch is flattened and linearly projected onto a high-dimensional feature space to form initial features.
[0022] The initial features are input into the window-based multi-head self-attention layer in the Swin Transformer module, which divides the features into multiple local windows. Multi-head self-attention of the initial features is calculated within each local window to capture short-range dependencies within the local region, resulting in the first attention feature. Simultaneously, the window division is spatially offset by a multi-head self-attention layer based on shifted windows to establish long-range dependencies across windows, enhancing the model's global context modeling ability, resulting in the second attention feature.
[0023] The first and second attention features are subjected to nonlinear transformation and channel dimension feature enhancement by a multilayer perceptron, and then subjected to residual connection and layer normalization to obtain the enhanced features.
[0024] The enhanced features are input to the image patch merging layer for hierarchical downsampling to obtain multi-scale features.
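The patch embedding in [0021] can be illustrated with a minimal NumPy sketch. It assumes the 4×4×4 patch size stated later in the embodiment; the embedding width `embed_dim`, the function name `patch_embed_3d`, and the random matrix standing in for learned projection weights are all hypothetical choices for illustration, not details given in the patent.

```python
import numpy as np

def patch_embed_3d(volume, patch=4, embed_dim=48, seed=0):
    """Split a (D, H, W) volume into non-overlapping patch^3 blocks,
    flatten each block, and project it linearly to embed_dim channels."""
    d, h, w = volume.shape
    assert d % patch == 0 and h % patch == 0 and w % patch == 0, "pad the volume first"
    gd, gh, gw = d // patch, h // patch, w // patch
    # (gd, p, gh, p, gw, p) -> (gd, gh, gw, p, p, p): group voxels by patch
    blocks = volume.reshape(gd, patch, gh, patch, gw, patch).transpose(0, 2, 4, 1, 3, 5)
    tokens = blocks.reshape(gd, gh, gw, patch ** 3)  # flatten each patch
    # Assumption: a random matrix stands in for the learned projection.
    rng = np.random.default_rng(seed)
    weight = rng.standard_normal((patch ** 3, embed_dim)) * 0.02
    return tokens @ weight  # (gd, gh, gw, embed_dim)

volume = np.random.default_rng(1).standard_normal((32, 32, 32))
features = patch_embed_3d(volume)
print(features.shape)  # (8, 8, 8, 48)
```

Each 4×4×4 patch becomes one token, so a 32×32×32 volume yields an 8×8×8 grid of initial features.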
[0025] In one embodiment, step 3, which involves establishing a graph structure based on multi-scale features using an image fusion module, includes:
[0026] The input multi-scale features are unified to the same spatial size through pooling operations, resulting in multi-scale features of uniform size.
[0027] Multi-scale features of uniform size are fused, and the fused multi-scale features are constructed into a graph structure, where the nodes of the graph structure correspond to voxels and the edges represent the local interaction relationships between voxel nodes; a voxel represents the smallest sampling unit of a blood vessel in three-dimensional space.
[0028] In one embodiment, step 3 of updating the graph structure includes: performing a graph update operation on the graph structure, performing a nonlinear transformation by calculating the feature differences between nodes, and fusing it with the node features before the graph structure update to obtain the updated multi-level features and the updated graph structure.
[0029] In one embodiment, the decoding module in step 4 is a decoding path constructed based on multi-layer convolutional blocks; specifically, it includes:
[0030] During the decoding process, a hierarchical feature representation is learned from the updated multi-scale features through the first convolutional block to obtain the shallow and deep features of the MRI image;
[0031] The encoder extracts shallow and deep features from the MRI image, and fuses these features with the features upsampled by the decoder through skip connections. The fused features are then passed through a convolutional block to obtain a reconstructed feature map.
[0032] In one embodiment, the loss function used when reconstructing the feature map includes pixel-level loss and cross-entropy loss;
[0033] The pixel-level loss is a metric function that measures the overlap between the predicted and true regions, defined as:
[0034] (formula not reproduced in this text)
[0035] The formula uses the following quantities: the total number of vessel segment categories; the predicted probability that the vessel segment in a given MRI image belongs to a given class; the total number of MRI images; the true label of the vessel segment in that MRI image for that class; and a smoothing constant;
[0036] The cross-entropy loss measures the difference between the predicted class probability distribution and the true label distribution at each pixel in an image, and is defined as:
[0037] (formula not reproduced in this text)
[0038] The formula uses the following quantities: the total number of vessel segment categories; the one-hot label of each vessel segment class, which is 1 at the position of the true class and 0 elsewhere; and the logits output for each vessel segment class. The Softmax function converts the logits of all vessel segment classes into a probability distribution: the output value of each class is mapped to the interval [0, 1] and the probabilities of all classes sum to 1, which gives the predicted probability of each vessel segment class. The logarithm in the formula is the natural logarithm, and the negative sign ensures that the loss becomes smaller as the prediction gets closer to the true class.
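The two formulas themselves do not survive in this text. As a hedged illustration only, one common pair of forms that matches the verbal definitions (an overlap-based loss with a smoothing constant, and a Softmax cross-entropy over one-hot labels) is the soft Dice loss and the categorical cross-entropy. All symbol names below are assumptions: C is the number of vessel segment categories, N the number of MRI images, p_{i,c} and g_{i,c} the predicted probability and true label for image i and class c, ε the smoothing constant, y_c the one-hot label, and z_c the logit of class c. The patent's actual expressions may differ, for example in how the sums are averaged.

```latex
\mathcal{L}_{\text{pixel}}
  = 1 - \frac{1}{C}\sum_{c=1}^{C}
    \frac{2\sum_{i=1}^{N} p_{i,c}\, g_{i,c} + \varepsilon}
         {\sum_{i=1}^{N} p_{i,c} + \sum_{i=1}^{N} g_{i,c} + \varepsilon}
\qquad
\mathcal{L}_{\text{CE}}
  = -\sum_{c=1}^{C} y_c \,
    \ln\!\left(\frac{e^{z_c}}{\sum_{k=1}^{C} e^{z_k}}\right)
```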
[0039] In one embodiment, the reconstructed feature map is mapped to a Softmax layer to generate a vascular segment classification probability map and output the cerebral artery segment classification result.
[0040] The present invention also provides a cerebral artery segmentation classification device based on MRI images, including a memory and a processor. The memory is used to store a computer program, and the processor is used to implement the cerebral artery segmentation classification method based on MRI images when the computer program is executed.
[0041] Compared with the prior art, the beneficial effects of the present invention include at least the following:
[0042] This invention addresses the challenges of accurate segmentation and classification of cerebral arteries on MRI by proposing an end-to-end cerebral artery segmentation and classification network, effectively achieving precise MRI cerebral artery segmentation and classification. First, specialized image preprocessing techniques effectively suppress inherent speckle noise and imaging artifacts in MRI images. Simultaneously, data augmentation strategies simulate complex imaging variations in real-world scenarios (such as illumination differences, local occlusion, and non-rigid deformation), significantly improving the model's robustness to clinical data noise. Subsequently, a Swin Transformer encoder extracts deep feature representations with a global receptive field, providing precise semantic information guidance for vessel segmentation. In the multi-scale feature fusion stage, a hierarchical feature interaction mechanism enhances the network's spatial structure perception capability of 3D MRI images, effectively capturing vascular morphological features at different scales. Finally, an optimized convolutional neural network decoder achieves accurate mapping from features to vessel segment classification results, significantly improving cerebral artery segmentation and classification performance while maintaining the integrity of the vascular topology.
Attached Figure Description
[0043] To more clearly illustrate the technical solutions in the embodiments of the present invention or the prior art, the accompanying drawings used in the description of the embodiments or the prior art will be briefly introduced below.
[0044] Figure 1 is a flowchart illustrating the cerebral artery segmentation and classification method based on MRI images provided by the present invention.
[0045] Figure 2 shows the original MRI image provided in the embodiment, wherein Figure 2(A) is the original MRI image in the axial plane, Figure 2(B) is the original MRI image in the sagittal plane, Figure 2(C) is the original MRI image in the coronal plane, and Figure 2(D) is a schematic diagram of the original MRI image.
[0046] Figure 3 is a schematic diagram illustrating the principle of the cerebral artery segmentation and classification method based on MRI images provided by the present invention.
[0047] Figure 4 shows the comparison results of the cerebral artery segmentation classification method based on MRI images provided in the examples.
Detailed Implementation
[0048] To make the objectives, technical solutions, and advantages of this invention clearer, the invention will be further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and do not limit the scope of protection of this invention.
[0049] To achieve accurate cerebral artery segmentation classification, this embodiment provides a cerebral artery segmentation classification method based on MRI images. By constructing an end-to-end U-shaped encoder-decoder network for MRI cerebral artery segmentation classification, accurate MRI cerebral artery segmentation classification is effectively achieved. First, professional image preprocessing techniques effectively suppress inherent speckle noise and imaging artifacts in MRI images. Simultaneously, data augmentation strategies simulate complex imaging variations in real-world scenes (such as illumination differences, local occlusion, and non-rigid deformation), significantly improving the model's robustness to clinical data noise. Subsequently, a Swin Transformer encoder is used to extract deep feature representations with a global receptive field, providing precise semantic information guidance for vessel segmentation. In the multi-scale feature fusion stage, a hierarchical feature interaction mechanism enhances the network's spatial structure perception ability of 3D MRI images, effectively capturing vascular morphological features at different scales. Finally, an optimized decoder network achieves accurate mapping from features to vessel segment classification results, significantly improving classification performance while maintaining the integrity of the vessel topology.
[0050] As shown in Figure 1, the method for classifying cerebral artery segments based on MRI images provided in the embodiment includes the following steps:
[0051] S1. Using 3D MRI images of cerebral arteries as input, preprocessing is performed to obtain standardized vessel volume images. Details are as follows:
[0052] In the embodiment, the original 3D MRI image of the cerebral arteries shown in Figure 2 is used as the input, wherein Figure 2(A) is the original MRI image in the axial plane, Figure 2(B) is the original MRI image in the sagittal plane, Figure 2(C) is the original MRI image in the coronal plane, and Figure 2(D) is a schematic diagram of the original MRI image. Data format conversion, image cropping, resampling, and standardization are performed in sequence to obtain a standardized blood vessel volume image.
[0053] Specifically, the data format conversion transforms the original 3D MRI images of the cerebral arteries from their original format into a standard four-dimensional NumPy array (number of modalities × 3D volume) and saves the image and annotation data for each sample; meta-information such as voxel spacing and spatial origin is recorded in a .pkl file. Image cropping identifies the non-zero region (i.e., the part containing actual structural information), calculates its minimum 3D bounding box, retains only the image and label within that region, and marks the background region as -1. The resampling stage unifies the voxel spacing so that different samples have a consistent spatial resolution in physical space, using third-order interpolation for images and nearest-neighbor interpolation for labels to complete the voxel-level transformation. Finally, the standardization step performs z-score standardization on the image intensity values, based on the mean and variance of each individual image, and clips the image intensity to suppress the influence of noise and extreme abnormal intensity values.
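The cropping and standardization steps of S1 can be sketched in NumPy as follows. This is an illustrative sketch under stated assumptions, not the patent's implementation: the function names, the percentile clipping bounds (0.5 and 99.5), and the small `eps` guard are hypothetical, and the resampling step is omitted because third-order interpolation would normally come from an external library.

```python
import numpy as np

def crop_to_nonzero(image, label):
    """Crop a (C, D, H, W) image and its (D, H, W) label to the minimal
    bounding box of non-zero voxels; voxels inside the box that are zero
    in every modality are marked -1 in the label."""
    mask = np.any(image != 0, axis=0)           # (D, H, W) non-zero mask
    coords = np.argwhere(mask)                  # assumes at least one non-zero voxel
    lo = coords.min(axis=0)
    hi = coords.max(axis=0) + 1
    box = tuple(slice(a, b) for a, b in zip(lo, hi))
    cropped_label = np.where(mask[box], label[box], -1)
    return image[(slice(None),) + box], cropped_label

def zscore_and_clip(image, low_pct=0.5, high_pct=99.5, eps=1e-8):
    """Z-score with the image's own mean and standard deviation, then clip
    extreme intensities (percentile bounds are an assumption)."""
    normed = (image - image.mean()) / (image.std() + eps)
    lo, hi = np.percentile(normed, [low_pct, high_pct])
    return np.clip(normed, lo, hi)

rng = np.random.default_rng(0)
image = np.zeros((1, 8, 8, 8))
image[0, 2:6, 1:5, 3:7] = rng.uniform(1.0, 2.0, size=(4, 4, 4))
label = np.zeros((8, 8, 8), dtype=np.int64)
label[3, 2, 4] = 5

img_c, lab_c = crop_to_nonzero(image, label)
img_n = zscore_and_clip(img_c)
print(img_c.shape, lab_c.shape)  # (1, 4, 4, 4) (4, 4, 4)
```

Statistics are computed per image, as the text describes, so each sample is standardized independently of the rest of the dataset.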
[0054] S2. Using standardized blood vessel volume images as input, hierarchical feature extraction is performed through a Swin Transformer-based encoder to obtain multi-scale features. Specifically:
[0055] As shown in Figure 3, the input is the standardized blood vessel volume image output by S1, and hierarchical feature extraction is achieved through an encoder based on the Swin Transformer architecture. The encoder consists of four sequentially stacked encoding stages, each composed of a Swin Transformer module, which is used to progressively capture semantic representations from local to global, and downsampling is performed after each stage through an image patch merging layer.
[0056] In terms of overall structure, the input standardized blood vessel volume image is first divided into fixed-size non-overlapping image patches (4×4×4 voxels) through a patch embedding layer, and then mapped to initial features via linear projection. This step achieves the transformation from the pixel domain to the feature domain, enabling subsequent attention mechanisms to model efficiently in a more compact representation space. Specifically, the i-th encoding stage (i = 1, 2, 3, 4) is composed of a Swin Transformer module, and each module contains, in sequence:
[0057] Window-based Multi-Head Self-Attention Layer (W-MSA): The initial input features are divided into several local windows. Multi-head self-attention is computed within each local window to capture short-range dependencies within the local region, resulting in the first attention feature. Compared to global attention mechanisms, W-MSA significantly reduces computational complexity while preserving fine-grained spatial sensitivity.
[0058] Multi-head self-attention layer based on shifted windows (SW-MSA): By introducing spatial offset on the basis of the previous window division, interaction is generated between different windows, thereby breaking through the locality limitation of fixed windows, enhancing the feature connection across regions, building a broader context awareness capability, and obtaining a second attention feature.
[0059] Multilayer Perceptron (MLP): The first attention feature and the second attention feature are input into the multilayer perceptron. The multilayer perceptron consists of two fully connected layers and an intermediate nonlinear activation function (such as GELU) to perform feature transformation and nonlinear mapping in the channel dimension, thereby enhancing the expressive power and discriminative power of the features.
[0060] Finally, layer normalization is applied before each attention layer and feedforward network to ensure training stability and a consistent feature distribution, thereby obtaining the enhanced features.
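The difference between the regular and shifted window partitions can be sketched in NumPy. This is a simplified illustration, assuming a window size of 4 and a shift of half the window: the attention here is single-head with identity query/key/value projections, and the attention mask and reverse shift used in full Swin Transformer implementations are omitted. Function names are hypothetical.

```python
import numpy as np

def window_partition_3d(x, ws):
    """(D, H, W, C) -> (num_windows, ws**3, C) using non-overlapping ws^3 windows."""
    d, h, w, c = x.shape
    x = x.reshape(d // ws, ws, h // ws, ws, w // ws, ws, c)
    x = x.transpose(0, 2, 4, 1, 3, 5, 6)
    return x.reshape(-1, ws ** 3, c)

def window_self_attention(windows):
    """Scaled dot-product attention computed independently inside each window.
    Simplification: one head, identity Q/K/V projections."""
    scale = windows.shape[-1] ** -0.5
    scores = windows @ windows.transpose(0, 2, 1) * scale
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ windows

x = np.random.default_rng(0).standard_normal((8, 8, 8, 16))
ws = 4

# W-MSA: attention within the regular window grid.
w_msa = window_self_attention(window_partition_3d(x, ws))

# SW-MSA: cyclically shift by half a window so new windows straddle old borders.
shifted = np.roll(x, shift=(-(ws // 2),) * 3, axis=(0, 1, 2))
sw_msa = window_self_attention(window_partition_3d(shifted, ws))

print(w_msa.shape, sw_msa.shape)  # (8, 64, 16) (8, 64, 16)
```

Because each window holds only 64 tokens, the attention matrices are 64×64 per window rather than 512×512 for the whole volume, which is the complexity saving the text attributes to W-MSA.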
[0061] Between stages, a patch merging layer performs hierarchical downsampling, resulting in a multi-scale feature representation. The image patch merging layer concatenates adjacent feature blocks and fuses them through a linear transformation, halving the spatial resolution at each stage from H × W × D to H/2 × W/2 × D/2, where H is the height, W is the width, and D is the depth; at the same time, the channel dimension is doubled, that is, from C to 2C. Through this hierarchical encoding mechanism, the network gradually extracts detailed structural information from the high-resolution features in the shallow layers and captures global semantic features in the deeper layers.
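A minimal NumPy sketch of such a patch merging step is shown below. It assumes that each 2×2×2 neighborhood is concatenated (giving 8 times the channels) and then reduced by a linear map to twice the input channels; the random `weight` matrix stands in for learned parameters, and the layer normalization that implementations usually apply before the linear map is left out.

```python
import numpy as np

def patch_merging_3d(x, weight):
    """(D, H, W, C) -> (D/2, H/2, W/2, 2C).
    Concatenate each 2x2x2 neighbourhood (8C channels), then apply a linear map."""
    d, h, w, c = x.shape
    x = x.reshape(d // 2, 2, h // 2, 2, w // 2, 2, c).transpose(0, 2, 4, 1, 3, 5, 6)
    merged = x.reshape(d // 2, h // 2, w // 2, 8 * c)
    return merged @ weight

rng = np.random.default_rng(0)
c = 48                                                   # hypothetical channel width
x = rng.standard_normal((8, 8, 8, c))
weight = rng.standard_normal((8 * c, 2 * c)) * 0.02      # stands in for learned weights
out = patch_merging_3d(x, weight)
print(out.shape)  # (4, 4, 4, 96)
```

Applying this after each of the four stages gives the progressively coarser, wider feature maps that make up the multi-scale features.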
[0062] Finally, the output consists of multi-scale features composed of feature layers of multiple scales, providing rich and structured feature inputs for subsequent image fusion and decoding stages.
[0063] S3. Based on multi-scale features, an image fusion module is used to establish a graph structure, forming multi-level features. The graph structure is then updated to update and enhance the graph node features, resulting in updated multi-level features. This step enables the collaborative integration and semantic enhancement of features at different scales. Specifically:
[0064] First, taking the concatenated multi-scale features as input, pooling operations with different scales are applied to the features at each level so that all features are unified to the same spatial size, resulting in multi-scale features of uniform size.
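The size unification can be sketched as follows. The choices here are assumptions made for illustration: average pooling, the coarsest scale as the common target size, cubic feature maps, and channel concatenation as the subsequent fusion. The patent text does not specify the pooling type or the target size.

```python
import numpy as np

def avg_pool_3d(x, factor):
    """Average-pool a (D, H, W, C) feature map by an integer factor per spatial axis."""
    if factor == 1:
        return x
    d, h, w, c = x.shape
    x = x.reshape(d // factor, factor, h // factor, factor, w // factor, factor, c)
    return x.mean(axis=(1, 3, 5))

def unify_and_fuse(features):
    """Pool every scale to the coarsest spatial size (cubic maps assumed),
    then concatenate along the channel axis."""
    target = min(f.shape[0] for f in features)
    pooled = [avg_pool_3d(f, f.shape[0] // target) for f in features]
    return np.concatenate(pooled, axis=-1)

rng = np.random.default_rng(0)
# Four hypothetical encoder outputs: resolution halves and channels double per stage.
scales = [rng.standard_normal((16 // 2**i,) * 3 + (8 * 2**i,)) for i in range(4)]
fused = unify_and_fuse(scales)
print(fused.shape)  # (2, 2, 2, 120)
```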
[0065] Subsequently, multi-scale features of uniform size are fused, and the fused features are then constructed into a graph structure to form multi-level features. Graph update operations are performed on the graph structure to capture the relationships and contextual information between nodes. Finally, the graph convolution-enhanced features are restored to their original spatial resolution through upsampling operations at different scales for use in subsequent decoding.
[0066] Specifically, the construction of the graph structure includes: constructing the fused multi-scale features into a graph structure whose nodes correspond to voxels and whose edges correspond to the local interactions between them. Each node corresponds to a voxel in the original volumetric image, representing the smallest sampling unit of a blood vessel in three-dimensional space. Each edge represents the information transmission relationship between one node and another.
[0067] In updating the graph structure, the feature differences between each voxel node are first calculated to characterize the local relationships between them. These differences are then subjected to a ReLU layer for nonlinear transformation, resulting in more discriminative difference features. Finally, the nonlinear differences are fused with the original voxel features to update and enhance the graph node features, yielding updated multi-level features.
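The update rule described above (feature differences, ReLU, fusion with the original node features) can be sketched in NumPy. Several details are assumptions not stated in the text: edges connect each voxel to its six face neighbors, the differences are aggregated with an element-wise maximum, fusion is a simple addition, and `np.roll` makes the grid wrap around at the borders for brevity.

```python
import numpy as np

def graph_update(features):
    """features: (D, H, W, C) voxel-node features on a regular grid.
    For each node, compute ReLU(neighbour - node) over 6-connected neighbours,
    max-aggregate, and add the result to the original node feature."""
    update = np.zeros_like(features)
    for axis in range(3):
        for step in (1, -1):
            neighbour = np.roll(features, shift=step, axis=axis)  # wraps at borders
            diff = neighbour - features          # feature difference along this edge
            update = np.maximum(update, np.maximum(diff, 0.0))    # ReLU, then max
    return features + update                     # fuse with pre-update features

grid = np.zeros((3, 3, 3, 1))
grid[1, 1, 1, 0] = 1.0                           # one strongly activated voxel
updated = graph_update(grid)
print(updated[1, 1, 1, 0], updated[0, 1, 1, 0], updated[0, 0, 0, 0])
```

In this toy grid the activated voxel keeps its value, its face neighbors receive information from it, and a corner voxel that shares no edge with it is unchanged.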
[0068] Finally, the updated multi-level features are fed into a multi-scale upsampling path. Specifically, progressive upsampling and feature fusion operations are performed on graph features at different levels to gradually restore their spatial resolution to match that of the original input. In this way, not only can high-level semantic information be preserved, but local detailed features can also be fully integrated during the upsampling process, thus providing a richer and more spatially consistent representation for subsequent decoding and fine prediction.
[0069] S4. Input the updated multi-level features into the decoding module, and reconstruct the feature map by upsampling layer by layer and fusing it with the multi-level features output by the image fusion module, and output the cerebral artery segmentation classification results.
[0070] In this embodiment, the fused features are first upsampled and reconstructed layer by layer through a decoding path composed of multiple convolutional blocks (CNNs) to gradually restore spatial resolution. Unlike the traditional U-Net structure, this step does not directly use shallow features from the encoder for skip connections during decoding. Instead, it introduces updated multi-scale features generated by S3 as supplementary information across layers. This design allows the model to utilize both global dependencies and local spatial details inherent in the graph structure during the upsampling stage, thereby achieving more semantically consistent and structurally constrained feature fusion and ultimately outputting accurate blood vessel segment classification results.
[0071] The specific process is as follows:
[0072] Hierarchical feature representations are learned from updated multi-scale features using CNN convolutional blocks;
[0073] The encoder extracts shallow and deep features from MRI images and combines them with the decoder via skip connections. This fuses the features extracted by the encoder with the features upsampled by the decoder, achieving effective integration of cross-layer information. The fused features are further refined and reconstructed by CNN convolutional blocks to enhance spatial detail and semantic consistency. This process is repeated across layers until the feature resolution is gradually restored to match the input. Finally, the restored feature maps are fed into a Softmax layer to generate vessel segmentation results with accurate boundaries and structural consistency, yielding the final vessel segment classification result.
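One decoding step followed by the Softmax classification can be sketched as below. The sketch assumes nearest-neighbor upsampling, channel concatenation for the skip fusion, and a single linear map standing in for the convolutional block and classification head; the channel widths and `num_classes = 5` are hypothetical values chosen only for the example.

```python
import numpy as np

def upsample_nearest(x, factor=2):
    """Nearest-neighbour upsampling of a (D, H, W, C) map along the spatial axes."""
    for axis in range(3):
        x = np.repeat(x, factor, axis=axis)
    return x

def softmax(logits, axis=-1):
    shifted = logits - logits.max(axis=axis, keepdims=True)  # numerical stability
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(0)
deep = rng.standard_normal((4, 4, 4, 32))    # coarse decoder features
skip = rng.standard_normal((8, 8, 8, 16))    # skip-connected features at the finer scale

# Upsample, then fuse with the skip features along the channel axis.
fused = np.concatenate([upsample_nearest(deep), skip], axis=-1)   # (8, 8, 8, 48)

num_classes = 5                                       # hypothetical number of segment classes
head = rng.standard_normal((48, num_classes)) * 0.1   # stands in for conv block + classifier
probs = softmax(fused @ head)                # per-voxel class probability map
segment_map = probs.argmax(axis=-1)          # per-voxel vessel segment label
print(fused.shape, segment_map.shape)  # (8, 8, 8, 48) (8, 8, 8)
```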
[0074] This embodiment uses the U-Net network as its foundation and introduces the Swin Transformer into its encoder section to construct a novel hybrid encoding structure. The feature encoding process consists of four stages, each composed of multiple stacked Swin Transformer modules forming a Transformer layer to progressively extract multi-scale features. Subsequently, graph fusion is performed on the features output from each encoder layer to enhance inter-layer semantic relationships. The fused features are then restored to their original scale layer by layer, and finally fused with the features from the corresponding CNN decoder layer, thereby achieving a progressive reconstruction of vascular structure information. During training, the loss function comprises a pixel-level loss and a cross-entropy loss;
[0075] The pixel-level loss is defined in terms of the following quantities: the total number of vessel segment categories; the predicted probability that the vessel segment in a given MRI image belongs to a given class; the total number of MRI images; the true label of the vessel segment in that MRI image for that class; and a smoothing constant;
[0076] The cross-entropy loss is defined in terms of the following quantities: the total number of vessel segment categories; the one-hot label of each vessel segment class (1 at the position of the true class and 0 elsewhere); and the logits output for each vessel segment class. The Softmax function transforms the logits of all classes into a probability distribution: the output value of each class is mapped to the interval [0, 1] and the probabilities of all classes sum to 1, thus representing the model's predicted probability for each vessel segment class. The logarithm is the natural logarithm, and the negative sign ensures that the loss becomes smaller as the prediction gets closer to the true class.
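A numerical sketch of the two loss terms is given below. It assumes that the pixel-level overlap loss takes the common soft Dice form and that the total loss is an unweighted sum of the two terms; neither the exact expression nor the weighting survives in this text, so both are assumptions, as are the function names and the `eps` value.

```python
import numpy as np

def softmax(logits, axis=-1):
    shifted = logits - logits.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)

def soft_dice_loss(probs, onehot, eps=1e-5):
    """probs, onehot: (num_samples, num_classes). Returns 1 - mean per-class Dice."""
    inter = (probs * onehot).sum(axis=0)
    denom = probs.sum(axis=0) + onehot.sum(axis=0)
    return float(1.0 - ((2.0 * inter + eps) / (denom + eps)).mean())

def cross_entropy_loss(logits, onehot):
    """Mean over samples of -sum_c y_c * ln(softmax(z)_c), via a stable log-softmax."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))
    return float(-(onehot * log_probs).sum(axis=-1).mean())

logits = np.array([[4.0, 0.0, 0.0],
                   [0.0, 4.0, 0.0],
                   [0.0, 0.0, 4.0]])
onehot = np.eye(3)

dice = soft_dice_loss(softmax(logits), onehot)
ce = cross_entropy_loss(logits, onehot)
total = dice + ce        # assumption: unweighted sum of the two terms
print(dice, ce, total)
```

The overlap term is computed per class before averaging, so a small vessel segment class contributes as much as a large one, which is one common way to counter the class imbalance discussed in the background section.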
[0077] Training was performed using the Adam optimizer with an initial learning rate of 0.001; a total of 1000 epochs were trained, with the learning rate decreasing to 80% of its value every 20 epochs; the batch size was 2 during training; the model parameters were saved every 50 epochs, and the weights on the validation set were updated.
[0078] On the other hand, the embodiment also provides a cerebral artery segmentation classification device based on MRI images, including a memory and a processor. The memory is used to store a computer program, and the processor is used to implement the cerebral artery segmentation classification method based on MRI images when the computer program is executed.
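The learning-rate schedule and checkpoint interval can be written out as a short sketch. It assumes the decay is a step decay, meaning the rate is multiplied by 0.8 after every 20 completed epochs; the text could also be read differently, so this interpretation and the helper names are assumptions.

```python
def learning_rate(epoch, base_lr=1e-3, decay=0.8, step=20):
    """Step decay: multiply the rate by `decay` once per `step` completed epochs.
    `epoch` is 0-based."""
    return base_lr * decay ** (epoch // step)

def should_checkpoint(epoch, every=50):
    """True when the 1-based epoch count is a multiple of `every`."""
    return epoch % every == 0

for epoch in (0, 19, 20, 40):
    print(epoch, learning_rate(epoch))

# 1-based epochs at which parameters would be saved during a 1000-epoch run.
checkpoints = [e for e in range(1, 1001) if should_checkpoint(e)]
print(len(checkpoints))  # 20
```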
[0079] To better illustrate the effectiveness of the method provided by this invention, qualitative and quantitative evaluations were performed on the cerebral artery dataset, and the vessel segmentation results on this dataset are presented in Figure 4. Experimental results show that the proposed method can achieve accurate segmentation and classification of cerebral vessels. The figure also shows the ground truth labels of the images in the dataset, the segmentation results of the comparison methods, and the results of the proposed method. The comparison methods are: an interleaved Transformer for volume segmentation (nnFormer), a Swin Transformer for image semantic segmentation (SwinUnetr), and a hybrid multi-scale U-net based on CNN and Transformer for medical image segmentation (Hmsunet). The comparison shows that the existing methods are prone to classification errors, while the proposed method achieves superior segmentation performance, demonstrating good accuracy and robustness.
[0080] To address the difficulty of accurately segmenting and classifying cerebral arteries in MRI, this method proposes an end-to-end cerebral artery segmentation and classification network. First, specialized image preprocessing suppresses the speckle noise and imaging artifacts inherent in MRI images. At the same time, data augmentation strategies simulate complex imaging variations encountered in real-world scenarios (such as illumination differences, local occlusion, and non-rigid deformation), improving the model's robustness to noise in clinical data. Next, a Swin Transformer encoder extracts deep feature representations with a global receptive field, providing semantic guidance for vessel segmentation. In the multi-scale feature fusion stage, a hierarchical feature interaction mechanism strengthens the network's perception of the spatial structure of 3D MRI images and captures vascular morphological features at different scales. Finally, an optimized CNN decoder maps the features to vessel segment classification results, improving classification performance while maintaining the integrity of the vessel topology.
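As an illustration of the preprocessing stage mentioned above, the following NumPy sketch crops a 3D volume to the bounding box of its non-zero voxels and then normalizes it with that image's own mean and standard deviation after clipping extreme intensities. The function names, the percentile thresholds (0.5 and 99.5), and the order of clipping before normalization are assumptions of this sketch; resampling to a common voxel spacing is omitted.

```python
import numpy as np

def crop_to_nonzero(volume):
    """Crop a 3D volume to the minimal bounding box of its non-zero voxels."""
    coords = np.argwhere(volume != 0)
    if coords.size == 0:
        return volume  # empty volume: nothing to crop
    lo = coords.min(axis=0)
    hi = coords.max(axis=0) + 1  # exclusive upper bound
    return volume[lo[0]:hi[0], lo[1]:hi[1], lo[2]:hi[2]]

def normalize(volume, lower_pct=0.5, upper_pct=99.5):
    """Clip extreme intensities, then z-score with this image's mean/std.

    The percentile thresholds are assumed values, not from the source.
    """
    lo, hi = np.percentile(volume, [lower_pct, upper_pct])
    clipped = np.clip(volume, lo, hi)
    std = clipped.std()
    return (clipped - clipped.mean()) / (std if std > 0 else 1.0)

# Hypothetical volume: a 3x3x3 block of signal inside a 6x6x6 background.
rng = np.random.default_rng(0)
volume = np.zeros((6, 6, 6))
volume[1:4, 2:5, 0:3] = rng.uniform(1.0, 2.0, size=(3, 3, 3))

cropped = crop_to_nonzero(volume)
normalized = normalize(cropped)
```

Cropping first keeps the zero-valued background from dominating the per-image statistics used in the normalization step.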
[0081] The specific embodiments described above illustrate the technical solution and beneficial effects of the present invention in detail. It should be understood that the above description is only the most preferred embodiment of the present invention and is not intended to limit the present invention. Any modifications, additions, and equivalent substitutions made within the scope of the principles of the present invention should be included within the protection scope of the present invention.
Claims
1. A cerebral artery segmentation classification method based on MRI images, characterized in that it comprises the following steps:
Step 1: Using 3D MRI images of cerebral arteries as input, preprocessing is performed to obtain standardized vessel volume images;
Step 2: Using the standardized vessel volume images as input, hierarchical feature extraction is performed through a Swin Transformer-based encoder to obtain multi-scale features. Specifically, the input standardized image is divided into multiple image patches through an image patch embedding layer, and each image patch is flattened and linearly projected to a high-dimensional feature space to form initial features. The initial features are input into a window-based multi-head self-attention layer in the Swin Transformer module, which divides the image into multiple local windows; multi-head self-attention over the initial features is calculated within each local window to capture short-range dependencies within the local region, resulting in a first attention feature. At the same time, a shifted-window-based multi-head self-attention layer spatially offsets the window division to establish long-range dependencies across windows, enhancing the model's global context modeling capability and resulting in a second attention feature. The first and second attention features undergo nonlinear transformation and channel-dimension feature enhancement through a multilayer perceptron, followed by residual connections and layer normalization, to obtain enhanced features. The enhanced features are input into an image patch merging layer for hierarchical downsampling to obtain the multi-scale features;
Step 3: Based on the multi-scale features, a graph structure is established using an image fusion module, including: unifying the input multi-scale features to the same spatial size through pooling operations to obtain multi-scale features of uniform size; fusing the unified multi-scale features and constructing a graph structure $G = (V, E)$ from the fused multi-scale features to form multi-level features, where the nodes $V$ of the graph structure correspond to voxels, the edges $E$ represent the local interaction relationships between voxel nodes, and a voxel represents the smallest sampling unit of a blood vessel in three-dimensional space; and updating the graph structure to update and enhance the graph node features, resulting in updated multi-level features; wherein updating the graph structure includes: performing a graph update operation on the graph structure, in which the feature differences between nodes are calculated, passed through a nonlinear transformation, and fused with the node features from before the graph structure update, to obtain the updated multi-level features and the updated graph structure;
Step 4: The updated multi-level features are input into the decoding module; by upsampling layer by layer and fusing with the multi-level features output by the image fusion module, the feature map is reconstructed and the cerebral artery segmentation classification result is output. The decoding module is a decoding path built from multi-layer convolutional blocks. Specifically, during decoding, the first-layer convolutional block learns a hierarchical feature representation from the updated multi-scale features to obtain the shallow and deep features of the MRI image; the shallow and deep features of the MRI image extracted by the encoder are fused, through skip connections, with the features upsampled by the decoder; and the fused features are passed through convolutional blocks to obtain the reconstructed feature map.
2. The cerebral artery segmentation classification method based on MRI images according to claim 1, characterized in that the preprocessing in step 1 includes data format conversion, image cropping, resampling, and normalization; the data format conversion includes: converting the input MRI images into a four-dimensional NumPy array format and saving the corresponding metadata; the image cropping includes: identifying the non-zero regions in the image after data format conversion, where the non-zero regions contain the actual cerebral artery structure information, calculating the minimum three-dimensional bounding box of the non-zero regions, and cropping the image to that bounding box; the resampling includes: resampling the cropped image to unify the voxel spacing of all samples, wherein a voxel represents the smallest sampling unit of a blood vessel in three-dimensional space; the normalization includes: normalizing the resampled MRI image based on the mean and variance of the single image, and clipping the image intensity to remove extreme noise intensity values.
3. The cerebral artery segmentation classification method based on MRI images according to claim 1, characterized in that the encoder in step 2 consists of four sequentially stacked encoding stages, each consisting of a Swin Transformer module that progressively captures semantic features from local to global, with downsampling performed after each stage through an image patch merging layer; each Swin Transformer module includes: a window-based multi-head self-attention layer, a shifted-window-based multi-head self-attention layer, and a multilayer perceptron.
4. The cerebral artery segmentation classification method based on MRI images according to claim 1, characterized in that, when reconstructing the feature map, the loss functions used include a pixel-level loss and a cross-entropy loss; the pixel-level loss $L_{pixel}$ is a metric function that computes the overlap region, defined as:
$$L_{pixel} = 1 - \frac{1}{C}\sum_{c=1}^{C}\frac{2\sum_{i=1}^{N} p_{i,c}\, g_{i,c} + \epsilon}{\sum_{i=1}^{N} p_{i,c} + \sum_{i=1}^{N} g_{i,c} + \epsilon}$$
wherein $C$ denotes the total number of vessel segment classes, $p_{i,c}$ is the predicted probability of a vessel segment of class $c$ for the $i$-th MRI image, $N$ is the total number of MRI images, $g_{i,c}$ is the true label of a vessel segment of class $c$ for the $i$-th MRI image, and $\epsilon$ is a smoothing constant; the cross-entropy loss $L_{CE}$ measures the difference between the predicted class probability distribution and the true label distribution at each pixel in the image, defined as:
$$L_{CE} = -\sum_{c=1}^{C} y_c \log\left(\mathrm{Softmax}(z)_c\right)$$
wherein $C$ denotes the total number of vessel segment classes; $y_c$ is the one-hot label for the $c$-th vessel segment class, in which the position corresponding to the true class is 1 and the rest are 0; $z_c$ is the logit output for the $c$-th vessel segment class; the Softmax function converts the logits of all vessel segment classes into a probability distribution, in which the output value of each class is mapped to the interval [0,1] and the probabilities of all classes sum to 1, giving the predicted probability of the $c$-th vessel segment class; and $\log$ is the natural logarithm function, the negative sign ensuring that the loss becomes smaller as the prediction gets closer to the true class.
5. The cerebral artery segmentation classification method based on MRI images according to claim 1, characterized in that the reconstructed feature map is passed through a Softmax layer to generate a vessel segment classification probability map, and the cerebral artery segment classification result is output.
6. A cerebral artery segmentation classification device based on MRI images, comprising a memory and a processor, wherein the memory is used to store a computer program, characterized in that the processor is configured to implement the cerebral artery segmentation classification method based on MRI images as described in any one of claims 1 to 5 when executing the computer program.