A blood vessel and lesion multi-task segmentation method based on an ultra-wide-angle fundus image
By employing an end-to-end multi-task semi-supervised learning algorithm, combined with a cross-level nonlocal graph module and a multi-head CNN network, fine segmentation of blood vessels and lesion regions in ultra-wide-angle fundus images was achieved. This solved the problems of high annotation difficulty and insufficient information utilization in existing technologies, thereby improving the accuracy and efficiency of diagnosis.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- NINGBO UNIVERSITY OF TECHNOLOGY
- Filing Date
- 2024-09-04
- Publication Date
- 2026-06-26
AI Technical Summary
Existing ultra-wide-angle fundus image segmentation methods mainly focus on single-task learning, which cannot fully utilize the multiple kinds of information in the image. Furthermore, the images are difficult to annotate, annotation consumes a lot of manpower and resources, and existing methods cannot output fine and accurate segmentation results.
An end-to-end multi-task semi-supervised learning algorithm is adopted. By constructing a multi-task semi-supervised learning network based on a weight control mechanism, and combining a cross-level nonlocal graph module and a multi-head CNN network, high-level features and low-level features are fused to achieve joint segmentation of blood vessel and lesion regions. Training is performed using a small amount of labeled data and a large amount of unlabeled data.
While reducing the workload of annotation, it maintains high precision and reliability, outputting detailed and accurate segmentation results of blood vessels and lesions, thus improving the accuracy and efficiency of fundus disease diagnosis.
Figure CN118918128B_ABST
Description
Technical Field
[0001] This invention relates to a method for segmenting ultra-wide-angle fundus images, specifically a multi-task segmentation method for blood vessels and lesions based on ultra-wide-angle fundus images, and belongs to the field of image processing.
Background Technology
[0002] Retinal vessels are the only vascular tissue in the human body that can be observed directly and non-invasively, which makes patient diagnosis more convenient and safer. Researching and analyzing retinal vessels helps to understand and identify early symptoms of vascular diseases, thus enabling early screening and diagnosis. The detection and identification of lesions in eye diseases are equally crucial. Fundus images may reveal lesions such as hemorrhages, exudates, hard exudates, and macular degeneration. Accurate segmentation and analysis of these lesions not only helps in understanding disease progression but also aids in developing more effective treatment plans and monitoring treatment outcomes.
[0003] Ultra-wide-field (UWF) fundus imaging has been widely used in the diagnosis and treatment of ophthalmic diseases due to its extremely wide field of view. UWF imaging is a non-invasive, non-contact ophthalmic imaging technology. A single UWF image can capture a 180 to 200 degree field of view, covering approximately 82% of the retina, which largely meets the needs of clinicians for diagnosing ophthalmic diseases. Clinical practice has shown that using computer-aided technology for the automatic detection and analysis of retinal vascular features and lesion areas in UWF images not only improves the accuracy and efficiency of diagnosis but also provides important support for the early detection and monitoring of ophthalmic diseases. Furthermore, UWF technology excels in capturing fundus details, enabling the identification of complex lesions such as retinal detachment, diabetic retinopathy, and macular degeneration, and providing valuable information for developing personalized treatment plans. Overall, UWF fundus imaging shows great application potential in ophthalmic clinical practice.
[0004] Efficiently and accurately extracting vascular structure information and lesion region features from UWF images is crucial. However, existing methods mainly focus on single-task learning, i.e., extracting only vascular structure information or segmenting only lesion regions. This approach, to some extent, inhibits the interaction of information, fails to fully utilize the diverse information in the image, and increases training complexity. Furthermore, UWF image annotations are difficult to obtain and extremely resource-intensive. Finally, outputting fine and accurate segmentation results is critical for the diagnosis of fundus diseases. To address these issues, this invention proposes an end-to-end multi-task semi-supervised learning algorithm designed to jointly segment vascular and lesion regions in UWF images. By utilizing a small amount of labeled data and a large amount of unlabeled data, it reduces the annotation workload while maintaining high accuracy and reliability.
Summary of the Invention
[0005] To address the shortcomings of existing technologies, this invention develops a multi-task segmentation method for blood vessels and lesions based on ultra-wide-angle fundus images. This method can effectively fuse high-level and low-level features of the image to complete multi-task segmentation of blood vessels and lesions. Weight sharing is performed between the two tasks to enhance the model's ability to extract detailed image features. While reducing the amount of annotation work, it maintains high accuracy and reliability in blood vessel and lesion segmentation.
[0006] To achieve the above objectives, the present invention employs the following technical solution:
[0007] A multi-task segmentation method for blood vessels and lesions based on ultra-wide-angle fundus images includes the following steps:
[0008] S1. Obtain the ultra-wide-angle UWF fundus image dataset, preprocess the images in the UWF fundus image dataset, and divide the dataset into a training set and a test set. The UWF fundus image dataset includes labeled data and unlabeled data.
[0009] S2. Construct a multi-task semi-supervised learning network based on a weight control mechanism. This network includes an encoder, a decoder, a cross-level nonlocal graph module, and a loss weight control mechanism. The network is trained using the training set. Images in the training set undergo feature extraction via the encoder portion of the network to obtain low-level features and high-level features;
[0010] S3. The low-level features and high-level features are fed into the decoder part of the multi-task semi-supervised learning network based on the weight control mechanism for feature decoding and feature reconstruction, yielding vascular structure segmentation images and lesion region segmentation images;
[0011] S4. The multi-task semi-supervised learning network based on the weight control mechanism is trained and optimized using the combined loss function and the total loss to obtain a multi-task segmentation model of blood vessels and lesions based on ultra-wide-angle fundus images;
[0012] S5. Input the data in the test set into the multi-task segmentation model of blood vessels and lesions based on ultra-wide-angle fundus images to obtain the UWF fundus blood vessel segmentation results and the UWF fundus lesion region segmentation results.
[0013] Furthermore, the preprocessing operations described in step S1 include data rotation and contrast enhancement.
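The two preprocessing operations named above can be illustrated with a minimal sketch. The source does not say which rotation angles or which contrast-enhancement technique are used, so the 90-degree rotation, the linear min-max stretch, and the function names `rotate90` and `stretch_contrast` are all assumptions chosen for illustration.

```python
def rotate90(image):
    """Rotate a 2-D image (list of rows) 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def stretch_contrast(image, lo=0.0, hi=255.0):
    """Linearly rescale pixel values so they span [lo, hi] (assumed enhancement)."""
    flat = [p for row in image for p in row]
    p_min, p_max = min(flat), max(flat)
    if p_max == p_min:  # constant image: nothing to stretch
        return [[lo for _ in row] for row in image]
    scale = (hi - lo) / (p_max - p_min)
    return [[lo + (p - p_min) * scale for p in row] for row in image]


# Toy 2x2 grayscale image standing in for a UWF fundus image.
image = [[50, 60], [70, 90]]
rotated = rotate90(image)
enhanced = stretch_contrast(image)
```

In practice a library routine such as adaptive histogram equalization might be preferred for contrast enhancement; the linear stretch is used here only because it is easy to verify by hand.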
[0014] Furthermore, a skip connection is adopted between the encoder and the decoder, and a cross-level nonlocal graph module is added to the skip connection. Step S3 specifically includes:
[0015] In the skip connection, convolutional layers in the cross-level nonlocal graph module decompose the low-level features and high-level features into overlapping patches, which are passed to three parallel fully connected layers to compute the query Q, key K, and value V. The high-level features serve as keys and values, and the low-level features serve as queries; the low-level features are matched to the high-level features by aligning feature dimensions. The similarity between the key and the query is computed to obtain an association matrix, and the cross-attention result is then obtained from the association matrix and the values. The calculation formula is expressed as:
[0016]
[0017] Where n represents the dimension of the feature embedding. Q represents the query, K represents the key, and V represents the value;
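The formula itself is not reproduced in this text. Given the definitions of n, Q, K, and V above, one plausible reading is standard scaled dot-product attention, softmax(Q K^T / sqrt(n)) V. The sketch below assumes that form; the softmax normalization, the helper names, and the toy inputs are assumptions, and the three fully connected projection layers that produce Q, K, and V are omitted.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one row."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(q, k, v):
    """Assumed form: softmax(Q K^T / sqrt(n)) V, with n the embedding dimension."""
    n = len(q[0])
    k_t = [list(col) for col in zip(*k)]
    scores = [[s / math.sqrt(n) for s in row] for row in matmul(q, k_t)]
    affinity = [softmax(row) for row in scores]  # association matrix
    return affinity, matmul(affinity, v)


# Toy patch embeddings: queries from low-level features,
# keys and values from high-level features.
q = [[1.0, 0.0], [0.0, 1.0]]
k = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
v = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
affinity, out = cross_attention(q, k, v)
```

Each row of `affinity` sums to one, so every low-level query patch receives a convex combination of the high-level value patches.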
[0018] The cross-attention result is then denoised by the feedforward module to obtain the noise-reduced cross-attention result. The calculation formula is expressed as:
[0019]
[0020] where the symbols in the formula denote, respectively, the affinity matrix, a deep convolutional layer, two fully connected linear mapping layers, and the ReLU activation function.
[0021] Furthermore, the combined loss function and total loss mentioned in step S4 specifically include:
[0022] The labeled and unlabeled data in the UWF fundus image dataset are denoted as U and J, respectively, where U' is the augmented data of the labeled data U and J' is the augmented data of the unlabeled data J. The formulas further involve the label of U, the "guessed" label of the unlabeled data J, the prediction results for the labeled data, and the prediction results for the unlabeled data. The combined loss function is expressed as:
[0023] ,
[0024] ,
[0025] ,
[0026] where the symbols in the formulas denote the cross-entropy loss function and the number of categories. The total loss consists of the cross-entropy loss on the labeled data and the squared loss on the unlabeled data, with a loss weighting coefficient that controls the contribution of the unlabeled data;
[0027] The multi-task loss during the training process of the multi-task semi-supervised learning network based on the weight control mechanism consists of vascular structure loss and lesion area loss. The vascular structure loss includes labeled data loss and unlabeled data loss, and the lesion area loss also includes labeled data loss and unlabeled data loss.
[0028] The formula for calculating the multi-task loss of labeled data is:
[0029] ,
[0030] where the two loss terms represent the vascular structure loss and the lesion region loss on labeled data, respectively, and the two weight scalars belong to the labeled-data vascular structure task and the labeled-data lesion task, respectively. The image corresponding to the labeled data has two labels: one indicates the probability that the labeled data belongs to a vascular structure region, and the other indicates the probability that it belongs to a lesion area. Pairwise products of these quantities give the two terms used to adjust the weight parameters between tasks;
[0031] The formula for calculating the unlabeled multi-task loss is:
[0032] ,
[0033] where the two loss terms represent the vascular structure loss and the lesion area loss on unlabeled data, respectively, and the two weight scalars belong to the vascular structures and the lesion regions in unlabeled data, respectively. The prediction probabilities for the vascular structure task and the lesion task obtained from the labeled task serve as soft weights that control the vascular structure loss and the lesion area loss on unlabeled data;
[0034] The formula for calculating total loss is: .
[0035] Furthermore, the training and optimization of the multi-task semi-supervised learning network based on the weight control mechanism are completed under the supervision of the combined loss function and the total loss.
[0036] Furthermore, the features output by the cross-level nonlocal graph module are concatenated with the features output by the decoder, an upsampling operation is performed, and a nonlinear transformation is applied using the Sigmoid function. The result is then processed by the multi-head CNN in the decoder section, which outputs the vascular structure segmentation image and the lesion region segmentation image, respectively.
[0037] The present invention also provides a computing device, including a memory configured to store computer-executable instructions; and a processor configured to execute a multi-task segmentation method for blood vessels and lesions based on ultra-wide-angle fundus images when the computer-executable instructions are executed by the processor.
[0038] The advantages of this invention are:
[0039] 1. This invention proposes for the first time an end-to-end multi-task semi-supervised learning algorithm to jointly segment blood vessels and lesion regions in UWF images. Through a weight control mechanism, it effectively performs quantitative analysis of retinal structural indicators, providing a novel and effective approach for the diagnosis of fundus diseases.
[0040] 2. This invention proposes for the first time a cross-level nonlocal graph module (CNGM), which fuses high-level and low-level features in the decoder stage. This module can capture the intrinsic relationships between different features while effectively recovering fine-grained details, thus improving the network's performance in the final segmentation task.
[0041] 3. This invention designs a multi-head CNN network to predict both blood vessels and lesions. By utilizing the same decoder, this invention achieves weight sharing between the two tasks, enhancing the model's ability to extract detailed image features, thereby outputting refined and accurate segmentation results. Two weight control strategies are employed for labeled and unlabeled data streams, enabling the blood vessel segmentation branch and the lesion region segmentation branch to learn simultaneously using weak datasets.
[0042] 4. This invention proposes a multi-task loss function: in the semi-supervised multi-task loss, the losses of labeled and unlabeled data are both taken into account. By using cross-entropy loss and squared loss, high accuracy and reliability are maintained while the amount of labeling work is reduced.
Attached Figure Description
[0043] The accompanying drawings are provided to further illustrate the invention and form part of the specification. They are used together with the embodiments to explain the invention and do not constitute a limitation thereof.
[0044] Figure 1 is a flowchart of the multi-task segmentation method for blood vessels and lesions based on ultra-wide-angle fundus images according to the present invention.
[0045] Figure 2 shows the multi-task semi-supervised learning network based on a weight control mechanism of the present invention.
[0046] Figure 3 shows the cross-level nonlocal graph module of the present invention.
[0047] Figure 4 shows the results of UWF image multi-task segmentation according to the present invention.
Detailed Implementation
[0048] The technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of the present invention, and not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.
[0049] Example 1
[0050] In this embodiment, as shown in Figure 1, the invention provides a multi-task segmentation method for blood vessels and lesions based on ultra-wide-angle fundus images, the specific steps of which include:
[0051] S1. Obtain an ultra-wide-angle UWF fundus image dataset, perform data rotation and contrast enhancement on the images in the UWF fundus image dataset, and divide the dataset into a training set and a test set. The UWF fundus image dataset includes labeled data and unlabeled data.
[0052] S2. As shown in Figure 2, a multi-task semi-supervised learning network based on a weight control mechanism is constructed. This network includes an encoder, a decoder, a cross-level nonlocal graph module (CNGM), and a loss weight control mechanism. The network is trained using the training set. Images in the training set undergo feature extraction via the encoder portion of the network to obtain low-level features and high-level features;
[0053] S3. The low-level features and high-level features are fed into the decoder part of the multi-task semi-supervised learning network based on the weight control mechanism for feature decoding and feature reconstruction, yielding a vascular structure segmentation image (the vessel output) and a lesion region segmentation image (the focal output);
[0054] Specifically, a skip connection is used between the encoder and the decoder, and a cross-level nonlocal graph module is added to the skip connection, as shown in Figure 3. In the skip connection, convolutional layers in the cross-level nonlocal graph module decompose the low-level features and high-level features into overlapping patches, which are passed to three parallel fully connected Linear layers to compute the query Q, key K, and value V. In the Cross-Attention module, the high-level features serve as keys and values and the low-level features serve as queries; the low-level features are matched to the high-level features by aligning feature dimensions. The similarity between the key and the query is computed to obtain an association matrix, and the cross-attention result is then obtained from the association matrix and the values. The calculation formula is expressed as:
[0055]
[0056] Where n represents the dimension of the feature embedding. Q represents the query, K represents the key, and V represents the value;
[0057] The cross-attention result is then denoised by the Feed Forward module to obtain the noise-reduced cross-attention result. The calculation formula is expressed as:
[0058]
[0059] where the symbols in the formula denote, respectively, the affinity matrix, a deep convolutional layer, two fully connected linear mapping layers, and the ReLU activation function;
[0060] Specifically, the features output by the cross-level nonlocal graph module are concatenated with the features output by the decoder, an upsampling operation is performed, and a nonlinear transformation is applied using the Sigmoid function. The result is then processed by the multi-head CNN in the decoder section, which outputs the vascular structure segmentation image (the vessel output) and the lesion region segmentation image (the focal output).
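The decoding path described above (concatenate, upsample, Sigmoid, then two heads) can be sketched on toy data. This is only an illustration under stated assumptions: nearest-neighbour 2x upsampling, 1x1-convolution heads with their own sigmoid, and the names `upsample2x`, `head`, and `decode` are hypothetical and not taken from the source.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def upsample2x(fmap):
    """Nearest-neighbour 2x upsampling of a 2-D map of feature vectors."""
    out = []
    for row in fmap:
        wide = [px for px in row for _ in range(2)]
        out.extend([wide, list(wide)])
    return out


def head(fmap, weights, bias):
    """Assumed 1x1-convolution head: one probability per pixel."""
    return [[sigmoid(sum(w * f for w, f in zip(weights, px)) + bias) for px in row]
            for row in fmap]


def decode(cngm_feats, decoder_feats, vessel_head, lesion_head):
    # Concatenate the two feature maps channel-wise, pixel by pixel.
    fused = [[a + b for a, b in zip(r1, r2)]
             for r1, r2 in zip(cngm_feats, decoder_feats)]
    fused = upsample2x(fused)
    fused = [[[sigmoid(c) for c in px] for px in row] for row in fused]
    # Both heads read the same shared features.
    return head(fused, *vessel_head), head(fused, *lesion_head)


# A 1x1 map with one channel from each source, and two hypothetical heads.
vessel, lesion = decode([[[0.0]]], [[[0.0]]],
                        ([1.0, 1.0], 0.0), ([0.0, 0.0], 0.0))
```

Because both heads consume the same fused features, the weights upstream of the heads are shared between the vessel and lesion tasks, which is the sharing the text attributes to the common decoder.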
[0061] S4. The multi-task semi-supervised learning network based on the weight control mechanism is trained and optimized using the combined loss function and the total loss to obtain a multi-task segmentation model of blood vessels and lesions based on ultra-wide-angle fundus images;
[0062] Specifically, the labeled data and unlabeled data in the UWF fundus image dataset are denoted as U and J, respectively, where U' is the augmented data of the labeled data U and J' is the augmented data of the unlabeled data J. The formulas further involve the label of U, the "guessed" label of the unlabeled data J, the prediction results for the labeled data, and the prediction results for the unlabeled data. The combined loss function is expressed as:
[0063] ,
[0064] ,
[0065] ,
[0066] where the symbols in the formulas denote the cross-entropy loss function and the number of categories. The total loss consists of the cross-entropy loss on the labeled data and the squared loss on the unlabeled data, with a loss weighting coefficient that controls the contribution of the unlabeled data;
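The structure described above, cross-entropy on labeled data plus a weighted squared loss on unlabeled data with guessed labels, can be sketched as follows. The formulas themselves are not reproduced in this text, so the averaging over samples, the division of the squared loss by the number of categories, and the names `combined_loss` and `lam` are assumptions in the style of common consistency-based semi-supervised losses.

```python
import math


def cross_entropy(target, pred, eps=1e-12):
    """Cross-entropy between a target distribution and a predicted one."""
    return -sum(t * math.log(p + eps) for t, p in zip(target, pred))


def squared_error(guess, pred):
    """Squared L2 distance between a guessed label and a prediction."""
    return sum((g - p) ** 2 for g, p in zip(guess, pred))


def combined_loss(labeled, unlabeled, num_classes, lam):
    """labeled: (label, prediction) pairs; unlabeled: (guessed label, prediction) pairs."""
    loss_u = sum(cross_entropy(y, p) for y, p in labeled) / len(labeled)
    loss_j = sum(squared_error(q, p) for q, p in unlabeled) / (num_classes * len(unlabeled))
    return loss_u, loss_j, loss_u + lam * loss_j


# One labeled and one unlabeled sample, two categories.
labeled = [([1.0, 0.0], [0.5, 0.5])]
unlabeled = [([0.5, 0.5], [1.0, 0.0])]
loss_u, loss_j, total = combined_loss(labeled, unlabeled, num_classes=2, lam=2.0)
```

The coefficient `lam` plays the role of the loss weighting coefficient for unlabeled data: setting it to zero recovers purely supervised training.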
[0067] The multi-task loss during the training process of the multi-task semi-supervised learning network based on the weight control mechanism consists of vascular structure loss and lesion area loss. The vascular structure loss includes labeled data loss and unlabeled data loss, and the lesion area loss also includes labeled data loss and unlabeled data loss.
[0068] The formula for calculating the multi-task loss of labeled data is:
[0069] ,
[0070] where the two loss terms represent the vascular structure loss and the lesion region loss on labeled data, respectively, and the two weight scalars belong to the labeled-data vascular structure task and the labeled-data lesion task, respectively. The image corresponding to the labeled data has two labels: one indicates the probability that the labeled data belongs to a vascular structure region, and the other indicates the probability that it belongs to a lesion area. Pairwise products of these quantities give the two terms used to adjust the weight parameters between tasks;
[0071] The formula for calculating the unlabeled multi-task loss is:
[0072] ,
[0073] where the two loss terms represent the vascular structure loss and the lesion area loss on unlabeled data, respectively, and the two weight scalars belong to the vascular structures and the lesion regions in unlabeled data, respectively. The prediction probabilities for the vascular structure task and the lesion task obtained from the labeled task serve as soft weights that control the vascular structure loss and the lesion area loss on unlabeled data;
[0074] The formula for calculating total loss is: .
[0075] Specifically, the training and optimization of the multi-task semi-supervised learning network based on the weight control mechanism are completed under the supervision of the combined loss function and the total loss.
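The weight control idea for the two data streams can be illustrated with a small sketch. The multi-task formulas are not reproduced in this text, so the weighted-sum form, the plain addition of the labeled and unlabeled terms, the numeric values, and the function names are all assumptions made for illustration.

```python
def multitask_loss(vessel_loss, lesion_loss, vessel_weight, lesion_weight):
    """Assumed form: weighted sum of the two task losses for one data stream."""
    return vessel_weight * vessel_loss + lesion_weight * lesion_loss


def total_multitask_loss(labeled, unlabeled):
    """Each argument is (vessel_loss, lesion_loss, vessel_weight, lesion_weight)."""
    return multitask_loss(*labeled) + multitask_loss(*unlabeled)


# Hypothetical values: weights derived from labels for the labeled stream,
# and soft weights (predicted probabilities) for the unlabeled stream.
labeled_terms = (0.40, 0.60, 1.0, 0.5)
unlabeled_terms = (0.20, 0.30, 0.9, 0.1)
total = total_multitask_loss(labeled_terms, unlabeled_terms)
```

With soft weights, an unlabeled image that the model considers unlikely to contain lesions contributes little to the lesion branch, which limits the influence of unreliable guessed labels.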
[0076] S5. Input the data in the test set into the multi-task segmentation model of blood vessels and lesions based on ultra-wide-angle fundus images to obtain the UWF fundus blood vessel segmentation results and the UWF fundus lesion region segmentation results.
[0077] Example 2
[0078] In this embodiment, comparative experiments were conducted on the UWF fundus image dataset to verify the effectiveness of the method of the present invention. The method of the present invention (denoted Ours) was compared with the existing segmentation methods Unet and SwinUnet, and the experimental results are shown in Tables 1 and 2.
[0079] Table 1. Comparison of blood vessel segmentation results between the method of the present invention and existing methods.
[0080]
[0081] Table 2 Comparison of lesion segmentation results between the method of the present invention and existing methods
[0082]
[0083] In this embodiment, the metrics are defined as follows. Dice coefficient: combines the precision and recall of the model by measuring the degree of overlap between the predicted positive class and the actual positive class; AUC: Area Under the Curve, the area under the ROC (Receiver Operating Characteristic) curve; FDR: False Discovery Rate; BACC: Balanced Accuracy; SEN: Sensitivity, also called the recall or true positive rate, which is the proportion of actually positive results that are correctly predicted as positive.
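The threshold-based metrics defined above can be computed from a confusion matrix, as in the following minimal sketch on flattened binary masks. The function names and toy masks are illustrative; AUC is omitted because it requires continuous scores rather than binary predictions, and the sketch does not guard against empty classes (division by zero).

```python
def confusion(pred, truth):
    """Count TP, FP, TN, FN for two equal-length binary masks (flattened)."""
    tp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 0)
    tn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 0)
    fn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 1)
    return tp, fp, tn, fn


def metrics(pred, truth):
    tp, fp, tn, fn = confusion(pred, truth)
    sen = tp / (tp + fn)                 # sensitivity (recall)
    spe = tn / (tn + fp)                 # specificity
    return {
        "dice": 2 * tp / (2 * tp + fp + fn),
        "fdr": fp / (fp + tp),           # false discovery rate
        "bacc": (sen + spe) / 2,         # balanced accuracy
        "sen": sen,
    }


pred = [1, 1, 0, 0, 1, 0]
truth = [1, 0, 0, 0, 1, 1]
scores = metrics(pred, truth)
```

For these toy masks there are 2 true positives, 1 false positive, 2 true negatives, and 1 false negative, so Dice, BACC, and SEN are all 2/3 and FDR is 1/3.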
[0084] In the comparative experiment, the Dice coefficient, AUC, false discovery rate (FDR), balanced accuracy (BACC), and sensitivity (SEN) were used as evaluation indicators. As can be seen from the tables, the method of the present invention outperforms the other comparative methods on all of the above indicators, indicating that it has significant advantages in multi-task segmentation of blood vessels and lesions in ultra-wide-angle fundus images and can effectively improve the accuracy of that segmentation.
[0085] Example 3
[0086] In this embodiment, as shown in Figure 4, Unet, SwinUnet, and the method of this invention (denoted Ours) were used to segment blood vessels and lesion regions in ultra-wide-angle fundus images. The first column in the figure is the original image, the first row is the lesion region segmentation result of each method, and the second row is the blood vessel segmentation result of each method. It can be seen that the method of this invention provides clearer blood vessel and lesion region segmentation results and higher segmentation accuracy than the other two methods, further demonstrating its effectiveness.
[0087] Example 4
[0088] A computing device includes a memory configured to store computer-executable instructions; and a processor configured to execute a multi-task segmentation method for blood vessels and lesions based on ultra-wide-angle fundus images when the computer-executable instructions are executed by the processor.
[0089] Finally, it should be noted that the above descriptions are merely preferred embodiments of the present invention and are not intended to limit the present invention. Although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art can still modify the technical solutions described in the foregoing embodiments or make equivalent substitutions for some of the technical features. Any modifications, equivalent substitutions, improvements, etc., made within the spirit and principles of the present invention should be included within the protection scope of the present invention.
Claims
1. A multi-task segmentation method for blood vessels and lesions based on ultra-wide-angle fundus images, characterized in that it includes the following steps:
S1. Obtain the ultra-wide-angle UWF fundus image dataset, preprocess the images in the UWF fundus image dataset, and divide the dataset into a training set and a test set. The UWF fundus image dataset includes labeled data and unlabeled data.
S2. Construct a multi-task semi-supervised learning network based on a weight control mechanism. This network includes an encoder, a decoder, a cross-level nonlocal graph module, and a loss weight control mechanism. The network is trained using the training set. Images in the training set undergo feature extraction via the encoder portion of the network to obtain low-level features and high-level features; specifically, a skip connection is adopted between the encoder and the decoder, and a cross-level nonlocal graph module is added to the skip connection: in the skip connection, convolutional layers in the cross-level nonlocal graph module decompose the low-level features and high-level features into overlapping patches, which are passed to three parallel fully connected layers to compute the query Q, key K, and value V. The high-level features serve as keys and values, and the low-level features serve as queries; the low-level features are matched to the high-level features by aligning feature dimensions. The similarity between the key and the query is computed to obtain an association matrix, and the cross-attention result is then obtained from the association matrix and the values. The calculation formula is expressed as: , where n represents the dimension of the feature embedding, Q represents the query, K represents the key, and V represents the value; the cross-attention result is then denoised by the feedforward module to obtain the noise-reduced cross-attention result. The calculation formula is expressed as: , where the symbols in the formula denote, respectively, the affinity matrix, a deep convolutional layer, two fully connected linear mapping layers, and the ReLU activation function;
S3. The low-level features and high-level features are fed into the decoder part of the multi-task semi-supervised learning network based on the weight control mechanism for feature decoding and feature reconstruction, yielding vascular structure segmentation images and lesion region segmentation images;
S4. The multi-task semi-supervised learning network based on the weight control mechanism is trained and optimized using the combined loss function and the total loss to obtain a multi-task segmentation model of blood vessels and lesions based on ultra-wide-angle fundus images; the combined loss function and the total loss specifically include: the labeled and unlabeled data in the UWF fundus image dataset are denoted as U and J, respectively, where U' is the augmented data of the labeled data U and J' is the augmented data of the unlabeled data J. The formulas further involve the label of U, the "guessed" label of the unlabeled data J, the prediction results for the labeled data, and the prediction results for the unlabeled data. The combined loss function is expressed as: , , , where the symbols in the formulas denote the cross-entropy loss function and the number of categories. The total loss consists of the cross-entropy loss on the labeled data and the squared loss on the unlabeled data, with a loss weighting coefficient that controls the contribution of the unlabeled data; the multi-task loss during the training process of the multi-task semi-supervised learning network based on the weight control mechanism consists of vascular structure loss and lesion area loss. The vascular structure loss includes labeled data loss and unlabeled data loss, and the lesion area loss also includes labeled data loss and unlabeled data loss. The formula for calculating the multi-task loss of labeled data is: , where the two loss terms represent the vascular structure loss and the lesion region loss on labeled data, respectively, and the two weight scalars belong to the labeled-data vascular structure task and the labeled-data lesion task, respectively. The image corresponding to the labeled data has two labels: one indicates the probability that the labeled data belongs to a vascular structure region, and the other indicates the probability that it belongs to a lesion area. Pairwise products of these quantities give the two terms used to adjust the weight parameters between tasks; the formula for calculating the unlabeled multi-task loss is: , where the two loss terms represent the vascular structure loss and the lesion area loss on unlabeled data, respectively, and the two weight scalars belong to the vascular structures and the lesion regions in unlabeled data, respectively. The prediction probabilities for the vascular structure task and the lesion task obtained from the labeled task serve as soft weights that control the vascular structure loss and the lesion area loss on unlabeled data; the formula for calculating total loss is: ; the training and optimization of the multi-task semi-supervised learning network based on the weight control mechanism are completed under the supervision of the combined loss function and the total loss.
S5. Input the data in the test set into the multi-task segmentation model of blood vessels and lesions based on ultra-wide-angle fundus images to obtain the UWF fundus blood vessel segmentation results and the UWF fundus lesion region segmentation results.
2. The multi-task segmentation method for blood vessels and lesions based on ultra-wide-angle fundus images according to claim 1, characterized in that the preprocessing described in step S1 includes data rotation and contrast enhancement.
3. The multi-task segmentation method for blood vessels and lesions based on ultra-wide-angle fundus images according to claim 2, characterized in that the features output by the cross-level nonlocal graph module are concatenated with the features output by the decoder, an upsampling operation is performed, and a nonlinear transformation is applied using the Sigmoid function; the result is then processed by the multi-head CNN in the decoder section, which outputs segmented images of vascular structures and lesion regions, respectively.
4. A computing device, characterized in that it includes a memory configured to store computer-executable instructions; and a processor configured to execute, when the computer-executable instructions are executed by the processor, the multi-task segmentation method for blood vessels and lesions based on ultra-wide-angle fundus images as described in any one of claims 1-3.