A method, device, computer equipment, and storage medium for predicting the efficacy of anti-VEGF therapy.

By fusing multimodal data from fundus color images and OCT images, the method addresses two weaknesses of existing anti-VEGF efficacy prediction, namely reliance on single-modality data and neglect of multi-task correlation, and achieves more accurate prediction of visual acuity changes.

CN119090852B · Active · Publication Date: 2026-06-30 · SHENZHEN UNIV


Patent Information

Authority / Receiving Office: CN · China
Patent Type: Patents(China)
Current Assignee / Owner: SHENZHEN UNIV
Filing Date: 2024-09-03
Publication Date: 2026-06-30


IPC: G06T7/00; G06T3/4038

AI Tagging

Technology Topics

Color image Imaging processing


AI Technical Summary

Technical Problem

Existing methods for predicting the efficacy of anti-VEGF therapy rely on single-modal data, which cannot fully reflect changes in the patient's condition and ignores multi-task associations, resulting in insufficient predictive accuracy.

Method used

A multimodal data fusion method is adopted that combines fundus color images and OCT images. Each original image is passed through its own prediction model (the fundus color image prediction model or the OCT image prediction model), the original image is stitched together with its predicted image, and the two stitched images are input into the vision change prediction model for a combined prediction.

Benefits of technology

It improves the comprehensiveness and accuracy of eye health assessment and enhances the precision of vision change prediction.


Abstract

This application belongs to the field of image processing technology and relates to a method, device, computer equipment, and storage medium for predicting the efficacy of anti-VEGF treatment. The method includes: acquiring original fundus color images and original OCT images to be predicted; inputting the original fundus color images into a fundus color image prediction model to perform fundus color image prediction operations, obtaining a predicted fundus color image; performing a fundus color image stitching operation on the original fundus color images and the predicted fundus color images to obtain a stitched fundus color image; inputting the original OCT image into an OCT image prediction model to perform OCT image prediction operations, obtaining a predicted OCT image; performing an OCT image stitching operation on the original OCT image and the predicted OCT image to obtain a stitched OCT image; and inputting the stitched fundus color image and the stitched OCT image into a vision change prediction model to perform a vision change prediction operation, obtaining a vision change prediction result. This application significantly improves the comprehensiveness and accuracy of assessing eye health status, thereby improving the accuracy of vision change prediction.


Description

Technical Field

[0001] This application relates to the field of image processing technology, and in particular to a method, apparatus, computer device, and storage medium for predicting the efficacy of anti-VEGF therapy.

Background Technology

[0002] Predicting individual treatment outcomes in the management of age-related macular degeneration (AMD) remains a significant and challenging issue. Currently, the most effective treatment is the injection of anti-vascular endothelial growth factor (VEGF); however, the efficacy of this treatment varies considerably among patients. Some patients experience improvement after treatment, while others fail to see effective improvement or even experience a worsening of their condition. Due to the high cost of anti-VEGF treatment, it is clearly unsuitable for patients who cannot achieve the desired results. Furthermore, patients experience visual changes during treatment, and the degree and magnitude of these changes are difficult to accurately assess, posing a challenge for clinicians in developing effective treatment plans.

[0003] Existing methods for predicting the efficacy of anti-VEGF treatments primarily rely on single-modality data, such as OCT images or fundus photography. These methods typically employ traditional machine learning algorithms or simple deep learning networks. While they have achieved some success in certain cases, they exhibit significant limitations when handling complex and diverse clinical data. First, single-modality data cannot comprehensively reflect changes in a patient's condition. Predicting the efficacy of anti-VEGF treatments requires considering multiple factors; using only OCT images or fundus photography cannot fully represent the treatment effect, and single-modality data is insufficient for accurately predicting treatment outcomes. Second, existing multi-task learning methods are rarely applied in predicting the efficacy of anti-VEGF treatments. Most methods focus only on optimizing a single task, neglecting the potential correlations between multiple tasks. This limits the model's generalization ability and predictive accuracy when handling complex tasks.

[0004] This shows that traditional methods for predicting the efficacy of anti-VEGF treatment suffer from reliance on single-modality data, neglect of multi-task correlation, and insufficient fusion of multimodal data.

Summary of the Invention

[0005] The purpose of this application is to propose a method, device, computer equipment, and storage medium for predicting the efficacy of anti-VEGF treatment, so as to solve the problems of traditional anti-VEGF efficacy prediction methods, such as reliance on single-modality data, neglect of multi-task correlation, and insufficient multimodal data fusion.

[0006] To address the aforementioned technical problems, this application provides a method for predicting the efficacy of anti-VEGF treatment, employing the following technical solution:

[0007] Obtain the original fundus color image and the original OCT image to be predicted;

[0008] The original fundus color image is input into the fundus color image prediction model to perform fundus color image prediction operation, and the predicted fundus color image is obtained.

[0009] The original fundus color image and the predicted fundus color image are stitched together to obtain a stitched fundus color image.

[0010] The original OCT image is input into the OCT image prediction model to perform OCT image prediction operation, and the predicted OCT image is obtained.

[0011] The original OCT image and the predicted OCT image are stitched together to obtain a stitched OCT image.

[0012] The stitched fundus color image and the stitched OCT image are input into the vision change prediction model to perform vision change prediction operation and obtain vision change prediction results.

[0013] Furthermore, the fundus image prediction model consists of a fundus image residual module, a blood vessel extraction module, and a fundus image global feature extraction module.

[0014] Furthermore, the predicted fundus color image $I_{\text{pre-Fundus}}$ is represented as:

[0015] $I_{\text{pre-Fundus}} = \mathrm{CA}(F_{\text{Fundus}} + \mathrm{ReLU}(\mathrm{Conv}(\mathrm{Cat}(V_{\text{Fundus}}, I_{\text{Fundus}}))) + G_{\text{Fundus}})$

[0016] $G_{\text{Fundus}} = G_{\text{global}}(F_{\text{Fundus}})$

[0017] $V_{\text{Fundus}} = H(I_{\text{Fundus}})$

[0018] $F_{\text{Fundus}} = \mathrm{ResBlock}_5(\cdots(\mathrm{ResBlock}_1(\mathrm{ReLU}(\mathrm{Conv}(\mathrm{Cat}(V_{\text{Fundus}}, I_{\text{Fundus}})))))\cdots)$

[0019] Where CA represents channel attention, $F_{\text{Fundus}}$ represents the feature map after processing by the fundus color image residual module, $V_{\text{Fundus}}$ represents the high-frequency feature map extracted by the blood vessel extraction module, $G_{\text{Fundus}}$ represents the global feature map output by the fundus color image global feature extraction module, ReLU represents the activation function, Conv represents the convolution operation, and Cat represents the concatenation along the channel dimension.

[0020] Furthermore, the blood vessel extraction module H(I) is represented as:

[0021] $H(I) = I - I * \mathrm{gaussian}(r, \sigma)$

[0022] Where H represents the blood vessel extraction module, I represents the input image, $*$ denotes convolution, and $\mathrm{gaussian}(r, \sigma)$ represents a low-pass Gaussian filter with radius r and spatial constant σ.

[0023] Furthermore, the OCT image prediction model consists of a downsampling module, an OCT image residual module, an upsampling module, and an OCT image global feature extraction module.

[0024] Furthermore, the predicted OCT image $I_{\text{pre-OCT}}$ is represented as:

[0025] $I_{\text{pre-OCT}} = \mathrm{ReLU}(\mathrm{Conv}(\mathrm{ReflectionPad}(\mathrm{Cat}(G_{\text{OCT}}, U_{\text{OCT}}))))$

[0026] $G_{\text{OCT}} = G_{\text{global}}(F_{\text{OCT}})$

[0027] $U_{\text{OCT}} = \mathrm{UpSample}_3(\mathrm{UpSample}_2(\mathrm{UpSample}_1(F_{\text{OCT}})))$

[0028] $F_{\text{OCT}} = \mathrm{ResBlock}_5(\cdots(\mathrm{ResBlock}_1(D_{\text{OCT}}))\cdots)$

[0029]-[0030] $D_{\text{OCT}} = \mathrm{DownSample}_3(\mathrm{DownSample}_2(\mathrm{DownSample}_1(\mathrm{ReLU}(\mathrm{InstanceNorm}(\mathrm{Conv}(\mathrm{ReflectionPad}(I_{\text{OCT}})))))))$

[0031] Where $D_{\text{OCT}}$ represents the OCT image after processing by the downsampling module, $\mathrm{DownSample}_i$ represents the i-th downsampling operation, $F_{\text{OCT}}$ represents the OCT feature map after processing by the OCT image residual module, $\mathrm{UpSample}_i$ represents the i-th upsampling operation, $U_{\text{OCT}}$ represents the OCT image after processing by the upsampling module, ReflectionPad represents reflection padding, and $G_{\text{OCT}}$ represents the global feature map of the OCT image.

[0032] Furthermore, the global feature extraction module for fundus color photography or the global feature extraction module for OCT images is represented as follows:

[0033] GFB(I)=Sigmoid(Conv(Cat(Max(I),Mean(I))))*I

[0034] Wherein, GFB represents the global feature processing module, I represents the input feature image, Sigmoid represents the non-linear activation layer, Conv represents the convolution operation, Cat represents the concatenation operation along the channel dimension, Max represents max pooling, and Mean represents average pooling.

[0035] To address the aforementioned technical problems, this application also provides an anti-VEGF efficacy prediction device, which employs the following technical solution:

[0036] The raw data acquisition module is used to acquire the original fundus color images and original OCT images to be predicted;

[0037] The fundus color image prediction module is used to input the original fundus color image into the fundus color image prediction model to perform fundus color image prediction operation and obtain the predicted fundus color image.

[0038] The fundus color image stitching module is used to perform a fundus color image stitching operation on the original fundus color image and the predicted fundus color image to obtain a stitched fundus color image.

[0039] The OCT image prediction module is used to input the original OCT image into the OCT image prediction model to perform OCT image prediction operations and obtain the predicted OCT image.

[0040] The OCT image stitching module is used to perform an OCT image stitching operation on the original OCT image and the predicted OCT image to obtain a stitched OCT image.

[0041] The vision change prediction module is used to input the stitched fundus color image and the stitched OCT image into the vision change prediction model to perform vision change prediction operations and obtain vision change prediction results.

[0042] To address the aforementioned technical problems, this application also provides a computer device that employs the following technical solution:

[0043] It includes a memory and a processor, wherein the memory stores computer-readable instructions, and the processor executes the computer-readable instructions to implement the steps of the anti-VEGF efficacy prediction method as described above.

[0044] To address the aforementioned technical problems, this application also provides a computer-readable storage medium, employing the technical solution described below:

[0045] The computer-readable storage medium stores computer-readable instructions, which, when executed by a processor, implement the steps of the anti-VEGF efficacy prediction method as described above.

[0046] This application provides a method for predicting the efficacy of anti-VEGF treatment, comprising: acquiring original fundus color images and original OCT images to be predicted; inputting the original fundus color images into a fundus color image prediction model to perform fundus color image prediction operations, obtaining a predicted fundus color image; performing a fundus color image stitching operation on the original fundus color images and the predicted fundus color images to obtain a stitched fundus color image; inputting the original OCT image into an OCT image prediction model to perform an OCT image prediction operation, obtaining a predicted OCT image; performing an OCT image stitching operation on the original OCT image and the predicted OCT image to obtain a stitched OCT image; inputting the stitched fundus color images and the stitched OCT image into a vision change prediction model to perform a vision change prediction operation, obtaining a vision change prediction result. Compared with the prior art, this application, by combining two different types of medical image data, fundus color images and OCT (Optical Coherence Tomography) images, enables the system to capture information on the state of eye health from multiple dimensions. Fundus photography primarily reflects the macroscopic structure and color changes of the retina, while OCT images provide microscopic tomographic information about the retina and deeper tissues. This fusion of multimodal data significantly improves the comprehensiveness and accuracy of eye health assessments, thereby enhancing the precision of vision change predictions.

Attached Figure Description

[0047] To more clearly illustrate the solutions in this application, the accompanying drawings used in the description of the embodiments of this application will be briefly introduced below. Obviously, the accompanying drawings described below are some embodiments of this application. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.

[0048] Figure 1 is an exemplary system architecture diagram to which this application can be applied;

[0049] Figure 2 is a flowchart illustrating the implementation of the anti-VEGF efficacy prediction method provided in Embodiment 1 of this application;

[0050] Figure 3 is a schematic diagram of a specific implementation of the multimodal, multi-task anti-VEGF efficacy prediction network provided in Embodiment 1 of this application;

[0051] Figure 4 is a schematic diagram of the anti-VEGF efficacy prediction device provided in Embodiment 2 of this application;

[0052] Figure 5 is a schematic diagram of the structure of one embodiment of the computer device according to this application.

Detailed Implementation

[0053] Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application pertains; the terminology used herein in the specification of the application is for the purpose of describing particular embodiments only and is not intended to be limiting of the application; the terms "comprising" and "having," and any variations thereof, in the specification, claims, and foregoing drawings of this application, are intended to cover non-exclusive inclusion. The terms "first," "second," etc., in the specification, claims, or foregoing drawings of this application are used to distinguish different objects, not to describe a particular order.

[0054] In this document, the term "embodiment" means that a particular feature, structure, or characteristic described in connection with an embodiment may be included in at least one embodiment of this application. The appearance of this phrase in various places throughout the specification does not necessarily refer to the same embodiment, nor is it a separate or alternative embodiment mutually exclusive with other embodiments. It will be explicitly and implicitly understood by those skilled in the art that the embodiments described herein can be combined with other embodiments.

[0055] To enable those skilled in the art to better understand the present application, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings.

[0056] As shown in Figure 1, system architecture 100 may include terminal device 101, network 102, and server 103. Terminal device 101 may be a laptop 1011, tablet 1012, or mobile phone 1013. Network 102 serves as the medium that provides a communication link between terminal device 101 and server 103. Network 102 may include various connection types, such as wired links, wireless communication links, or fiber optic cables.

[0057] Users can use terminal device 101 to interact with server 103 via network 102 to receive or send messages, etc. Various communication client applications can be installed on terminal device 101, such as web browser applications, shopping applications, search applications, instant messaging tools, email clients, social media platform software, etc.

[0058] Terminal device 101 can be various electronic devices with a display screen and support web browsing. In addition to laptops 1011, tablets 1012, or mobile phones 1013, terminal device 101 can also be an e-book reader, an MP3 player (Moving Picture Experts Group Audio Layer III), an MP4 player (Moving Picture Experts Group Audio Layer IV), a laptop computer, and a desktop computer, etc.

[0059] Server 103 can be a server that provides various services, such as a backend server that provides support for the pages displayed on terminal device 101.

[0060] It should be noted that the anti-VEGF efficacy prediction method provided in this application embodiment is generally executed by a server / terminal device, and correspondingly, the anti-VEGF efficacy prediction device is generally set in the server / terminal device.

[0061] It should be understood that the numbers of terminal devices, networks, and servers shown in Figure 1 are merely illustrative. Depending on implementation needs, any number of terminal devices, networks, and servers can be included.

[0062] Embodiment 1

[0063] Referring further to Figure 2, a flowchart of an embodiment of the anti-VEGF efficacy prediction method according to this application is shown. The anti-VEGF efficacy prediction method includes steps S201, S202, S203, S204, S205, and S206.

[0064] In step S201, the original fundus color photograph and the original OCT image to be predicted are obtained;

[0065] In step S202, the original fundus color image is input into the fundus color image prediction model to perform fundus color image prediction operation, and the predicted fundus color image is obtained.

[0066] In step S203, the original fundus color image and the predicted fundus color image are stitched together to obtain a stitched fundus color image.

[0067] In step S204, the original OCT image is input into the OCT image prediction model to perform OCT image prediction operation, and the predicted OCT image is obtained.

[0068] In step S205, the original OCT image and the predicted OCT image are stitched together to obtain a stitched OCT image.

[0069] In step S206, the stitched fundus color image and the stitched OCT image are input into the vision change prediction model to perform vision change prediction operation and obtain vision change prediction results.
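Steps S201 to S206 can be read as a simple data flow. The following is a minimal sketch of that flow, not the patented implementation: the three models are passed in as plain callables, the names `stitch` and `predict_vision_change` are hypothetical, and "stitching" is assumed here to mean concatenation along the channel axis, which the text does not state explicitly.

```python
from typing import Callable, List

# An image is stored as nested lists: channels x height x width.
Image = List[List[List[float]]]


def stitch(original: Image, predicted: Image) -> Image:
    """Concatenate two images along the channel axis (assumed meaning of 'stitching')."""
    same_height = len(original[0]) == len(predicted[0])
    same_width = len(original[0][0]) == len(predicted[0][0])
    if not (same_height and same_width):
        raise ValueError("original and predicted images must share height and width")
    return original + predicted


def predict_vision_change(
    fundus: Image,
    oct_image: Image,
    fundus_model: Callable[[Image], Image],
    oct_model: Callable[[Image], Image],
    vision_model: Callable[[Image, Image], object],
) -> object:
    """Run steps S202 to S206 on inputs already acquired in S201."""
    predicted_fundus = fundus_model(fundus)              # S202
    stitched_fundus = stitch(fundus, predicted_fundus)   # S203
    predicted_oct = oct_model(oct_image)                 # S204
    stitched_oct = stitch(oct_image, predicted_oct)      # S205
    return vision_model(stitched_fundus, stitched_oct)   # S206
```

With a three-channel fundus image and a one-channel OCT image, the vision change model in this sketch would receive six-channel and two-channel inputs respectively.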

[0070] In the embodiments of this application, referring to Figure 3, the multimodal, multi-task anti-VEGF efficacy prediction network consists of three main parts: an OCT image prediction branch, a fundus color image prediction branch, and a visual acuity change prediction branch. The two image prediction branches also incorporate corresponding discriminators for adversarial training.

[0071] The main body of the fundus color image prediction branch consists of five identical residual blocks. In addition, a blood vessel extraction module was designed to accurately predict the morphology of blood vessels after treatment. At the same time, a global feature extraction module was designed for feature learning of low- and mid-frequency lesions.

[0072] The OCT image prediction branch does not use a high-frequency vessel extraction module. Instead, before the residual blocks, it performs three downsampling operations that convert the (1, 256, 256) original image into a (256, 64, 64) feature map. After feature extraction by the residual blocks, three upsampling operations restore the feature map to size (1, 256, 256). The OCT prediction branch also uses the global feature extraction module to learn mid-to-low-frequency features.
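The shape change from (1, 256, 256) to (256, 64, 64) can be checked with the usual convolution output-size formula. The sketch below is only an illustration: the text gives no kernel sizes, strides, or intermediate channel counts, so the `ENCODER` settings are hypothetical values chosen to reproduce the stated shapes. Under this formula, three stride-2 stages would shrink 256 to 32, so this sketch assumes that one of the three stages keeps the spatial size.

```python
def conv_out(size: int, kernel: int, stride: int, padding: int) -> int:
    """Output length of a convolution along one spatial axis."""
    return (size + 2 * padding - kernel) // stride + 1


def trace_shapes(in_shape, layers):
    """Return the (channels, height, width) shape after each layer."""
    _, height, width = in_shape
    shapes = [in_shape]
    for out_channels, kernel, stride, padding in layers:
        height = conv_out(height, kernel, stride, padding)
        width = conv_out(width, kernel, stride, padding)
        shapes.append((out_channels, height, width))
    return shapes


# Hypothetical settings: (out_channels, kernel, stride, padding).
ENCODER = [(64, 7, 1, 3), (128, 3, 2, 1), (256, 3, 2, 1)]

if __name__ == "__main__":
    for shape in trace_shapes((1, 256, 256), ENCODER):
        print(shape)
```

The decoder would need to mirror these stages to return to (1, 256, 256); its settings are equally unspecified in the text.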

[0073] The vision prediction branch consists of two parallel classification blocks, which receive inputs from fundus color images and OCT images, respectively. These images are then processed by the classification blocks for feature extraction and classification. Finally, the two classification results are fused to obtain the final vision prediction result. It is worth noting that the images input to the classification blocks are feature maps obtained by stitching together the pre-treatment image with the corresponding predicted post-treatment image.
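The text says the two classification results are fused but does not say how. One common option, shown here purely as an assumption, is late fusion by a weighted average of the class probabilities from the two classification blocks. The function names and the `fundus_weight` parameter are hypothetical.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of class scores."""
    peak = max(logits)
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [value / total for value in exps]


def fuse_predictions(
    fundus_logits: Sequence[float],
    oct_logits: Sequence[float],
    fundus_weight: float = 0.5,
) -> List[float]:
    """Weighted average of the per-branch class probabilities (assumed fusion rule)."""
    if len(fundus_logits) != len(oct_logits):
        raise ValueError("both branches must score the same classes")
    fundus_probs = softmax(fundus_logits)
    oct_probs = softmax(oct_logits)
    return [
        fundus_weight * f + (1.0 - fundus_weight) * o
        for f, o in zip(fundus_probs, oct_probs)
    ]
```

Other fusion rules, such as concatenating branch features before a shared classifier, would fit the description equally well.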

[0074] This application provides a method for predicting the efficacy of anti-VEGF treatment, comprising: acquiring original fundus color images and original OCT images to be predicted; inputting the original fundus color images into a fundus color image prediction model to perform fundus color image prediction operations, obtaining a predicted fundus color image; performing a fundus color image stitching operation on the original fundus color images and the predicted fundus color images to obtain a stitched fundus color image; inputting the original OCT image into an OCT image prediction model to perform OCT image prediction operations, obtaining a predicted OCT image; performing an OCT image stitching operation on the original OCT image and the predicted OCT image to obtain a stitched OCT image; and inputting the stitched fundus color image and the stitched OCT image into a vision change prediction model to perform a vision change prediction operation, obtaining a vision change prediction result. Compared with the prior art, this application, by combining two different types of medical image data—fundus color images and OCT (optical coherence tomography) images—enables the system to capture information about the eye health status from multiple dimensions. Fundus photography primarily reflects the macroscopic structure and color changes of the retina, while OCT images provide microscopic tomographic information about the retina and deeper tissues. This fusion of multimodal data significantly improves the comprehensiveness and accuracy of eye health assessments, thereby enhancing the precision of vision change predictions.

[0075] In some optional implementations of the embodiments of this application, the above-mentioned predicted fundus color image $I_{\text{pre-Fundus}}$ is represented as:

[0076] $I_{\text{pre-Fundus}} = \mathrm{CA}(F_{\text{Fundus}} + \mathrm{ReLU}(\mathrm{Conv}(\mathrm{Cat}(V_{\text{Fundus}}, I_{\text{Fundus}}))) + G_{\text{Fundus}})$

[0077] $G_{\text{Fundus}} = G_{\text{global}}(F_{\text{Fundus}})$

[0078] $V_{\text{Fundus}} = H(I_{\text{Fundus}})$

[0079] $F_{\text{Fundus}} = \mathrm{ResBlock}_5(\cdots(\mathrm{ResBlock}_1(\mathrm{ReLU}(\mathrm{Conv}(\mathrm{Cat}(V_{\text{Fundus}}, I_{\text{Fundus}})))))\cdots)$

[0080] Where CA represents channel attention, $F_{\text{Fundus}}$ represents the feature map after processing by the fundus color image residual module, $V_{\text{Fundus}}$ represents the high-frequency feature map extracted by the blood vessel extraction module, $G_{\text{Fundus}}$ represents the global feature map output by the fundus color image global feature extraction module, ReLU represents the activation function, Conv represents the convolution operation, and Cat represents the concatenation along the channel dimension.

[0081] In some optional implementations of the embodiments of this application, the above-mentioned blood vessel extraction module H(I) is represented as:

[0082] $H(I) = I - I * \mathrm{gaussian}(r, \sigma)$

[0083] Where H represents the blood vessel extraction module, I represents the input image, $*$ denotes convolution, and $\mathrm{gaussian}(r, \sigma)$ represents a low-pass Gaussian filter with radius r and spatial constant σ.

[0084] In this embodiment, the high-frequency blood vessel extraction module first applies a Gaussian filter to remove high-frequency details, which yields a low-frequency component. The low-frequency component is then subtracted pixel-by-pixel from the original image to obtain an image containing the high-frequency blood vessel details. Applying this module to both the input pre-treatment image and the output post-treatment image yields pre-treatment and post-treatment blood vessel features. During model training, the mean absolute difference between these blood vessel features is calculated, which leads to more realistic post-treatment blood vessel predictions.

[0085] $H(I) = I - I * \mathrm{gaussian}(r, \sigma)$

[0086] Here, H refers to the blood vessel extraction module, I refers to the input image, $*$ denotes convolution, and $\mathrm{gaussian}(r, \sigma)$ refers to a low-pass Gaussian filter with radius r and spatial constant σ. During model training, the radius r and spatial constant σ are set to 26 and 9, respectively.
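The high-pass operation described above (blur the image with a Gaussian, then subtract the blur from the original) can be sketched in plain Python on a single-channel image stored as a list of rows. This is an illustrative sketch only: the separable kernel, the clamped border handling, and the function names are assumptions, and only the defaults r = 26 and σ = 9 come from the text. The `vessel_l1` helper shows a mean absolute difference between the vessel maps of two images; which pair of images the training loss compares is left to the caller.

```python
import math
from typing import List

Gray = List[List[float]]  # height x width


def gaussian_kernel(radius: int, sigma: float) -> List[float]:
    """Normalized 1-D Gaussian weights of length 2 * radius + 1."""
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma)) for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def blur_1d(values: List[float], kernel: List[float]) -> List[float]:
    """Convolve one row or column, clamping indices at the borders."""
    radius = len(kernel) // 2
    last = len(values) - 1
    return [
        sum(w * values[min(max(i + k - radius, 0), last)] for k, w in enumerate(kernel))
        for i in range(len(values))
    ]


def gaussian_blur(image: Gray, radius: int, sigma: float) -> Gray:
    """Separable low-pass Gaussian filter: rows first, then columns."""
    kernel = gaussian_kernel(radius, sigma)
    rows = [blur_1d(row, kernel) for row in image]
    cols = [blur_1d(list(col), kernel) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]


def extract_vessels(image: Gray, radius: int = 26, sigma: float = 9.0) -> Gray:
    """H(I) = I - I * gaussian(r, sigma): keep only high-frequency detail."""
    low = gaussian_blur(image, radius, sigma)
    return [[p - q for p, q in zip(row, low_row)] for row, low_row in zip(image, low)]


def vessel_l1(first: Gray, second: Gray, radius: int = 26, sigma: float = 9.0) -> float:
    """Mean absolute difference between the vessel maps of two images."""
    a = extract_vessels(first, radius, sigma)
    b = extract_vessels(second, radius, sigma)
    diffs = [abs(p - q) for ra, rb in zip(a, b) for p, q in zip(ra, rb)]
    return sum(diffs) / len(diffs)
```

A flat image has no high-frequency content, so `extract_vessels` returns values close to zero for it, while a thin bright structure survives the subtraction.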

[0087] In some optional implementations of the embodiments of this application, the above-mentioned predicted OCT image $I_{\text{pre-OCT}}$ is represented as:

[0088] $I_{\text{pre-OCT}} = \mathrm{ReLU}(\mathrm{Conv}(\mathrm{ReflectionPad}(\mathrm{Cat}(G_{\text{OCT}}, U_{\text{OCT}}))))$

[0089] $G_{\text{OCT}} = G_{\text{global}}(F_{\text{OCT}})$

[0090] $U_{\text{OCT}} = \mathrm{UpSample}_3(\mathrm{UpSample}_2(\mathrm{UpSample}_1(F_{\text{OCT}})))$

[0091] $F_{\text{OCT}} = \mathrm{ResBlock}_5(\cdots(\mathrm{ResBlock}_1(D_{\text{OCT}}))\cdots)$

[0092]-[0093] $D_{\text{OCT}} = \mathrm{DownSample}_3(\mathrm{DownSample}_2(\mathrm{DownSample}_1(\mathrm{ReLU}(\mathrm{InstanceNorm}(\mathrm{Conv}(\mathrm{ReflectionPad}(I_{\text{OCT}})))))))$

[0094] Where $D_{\text{OCT}}$ represents the OCT image after processing by the downsampling module, $\mathrm{DownSample}_i$ represents the i-th downsampling operation, $F_{\text{OCT}}$ represents the OCT feature map after processing by the OCT image residual module, $\mathrm{UpSample}_i$ represents the i-th upsampling operation, $U_{\text{OCT}}$ represents the OCT image after upsampling, ReflectionPad represents reflection padding, and $G_{\text{OCT}}$ represents the global feature map of the OCT image.

[0095] In some optional implementations of the embodiments of this application, the above-mentioned fundus color photograph global feature extraction module or OCT image global feature extraction module is represented as follows:

[0096] GFB(I)=Sigmoid(Conv(Cat(Max(I),Mean(I))))*I

[0097] Wherein, GFB represents the global feature processing module, I represents the input feature image, Sigmoid represents the non-linear activation layer, Conv represents the convolution operation, Cat represents the concatenation operation along the channel dimension, Max represents max pooling, and Mean represents average pooling.

[0098] In this embodiment, the global feature processing module can effectively extract global information and predict changes in lesions before and after treatment. This module concatenates the feature maps obtained by max pooling and average pooling along the channel dimension, and then performs feature extraction through a convolutional layer and an activation layer to obtain global features. Finally, this global feature is multiplied pixel-by-pixel with the original image to obtain the final global feature.

[0099] GFB(I)=Sigmoid(Conv(Cat(Max(I),Mean(I))))*I

[0100] Wherein, GFB represents the global feature processing module, I represents the input feature image, Sigmoid represents the non-linear activation layer, Conv represents the convolution operation, Cat represents the concatenation operation along the channel dimension, Max represents max pooling, and Mean represents average pooling.
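The GFB formula can be illustrated on a small feature map stored as nested lists. This sketch makes two assumptions that the text does not confirm: the max and mean pooling are taken across the channel axis at each pixel (as in CBAM-style spatial attention), and the convolution is reduced to a 1×1 convolution with two hypothetical weights and a bias so that the example stays short.

```python
import math
from typing import List

FeatureMap = List[List[List[float]]]  # channels x height x width


def global_feature_block(
    features: FeatureMap,
    w_max: float = 1.0,
    w_mean: float = 1.0,
    bias: float = 0.0,
) -> FeatureMap:
    """GFB(I) = Sigmoid(Conv(Cat(Max(I), Mean(I)))) * I with a 1x1 convolution."""
    channels = len(features)
    height = len(features[0])
    width = len(features[0][0])
    output = [[[0.0] * width for _ in range(height)] for _ in range(channels)]
    for y in range(height):
        for x in range(width):
            values = [features[c][y][x] for c in range(channels)]
            pooled_max = max(values)               # Max(I) across channels
            pooled_mean = sum(values) / channels   # Mean(I) across channels
            # 1x1 convolution over the two concatenated pooled maps.
            score = w_max * pooled_max + w_mean * pooled_mean + bias
            gate = 1.0 / (1.0 + math.exp(-score))  # Sigmoid
            for c in range(channels):
                output[c][y][x] = gate * features[c][y][x]  # pixel-wise product with I
    return output
```

Because the gate lies strictly between 0 and 1, the block rescales each pixel of the input without changing the shape of the feature map.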

[0101] In some optional implementations of this application, the network receives pre-treatment OCT images and fundus photographs as input, and post-treatment OCT and fundus photographs, along with visual acuity changes, as labels. The visual acuity change prediction branch receives the pre- and post-treatment OCT images and fundus photographs to predict the changes in visual acuity after treatment. The predicted changes are compared with the actual changes to calculate a loss, thus constraining the model. The OCT image prediction branch and the fundus photograph prediction branch output the predicted post-treatment OCT images and fundus photographs, respectively. These predicted images and the actual post-treatment images are then paired and input into three networks: on one hand, the discriminator calculates the adversarial loss to ensure the authenticity of the predicted images; on the other hand, a pre-trained VGG19 network maps the predicted images and the actual post-treatment images to the VGG feature space, calculating the content loss between features to ensure the accuracy of the predicted image content. Furthermore, the predicted images and the actual images are also input into a registration network to calculate the registration loss between them, maintaining the consistency of the image structure. Through these steps, the network can accurately predict the efficacy of anti-VEGF treatment with the synergistic effect of multi-task and multimodal data, and ensure the consistency of the prediction results in both visual and structural aspects.

[0102] The loss calculation in the above three training processes can be expressed as:

[0103]

[0104] Here, the three loss terms denote the adversarial loss, the content loss, and the registration loss, respectively; $I_{post}$ represents the actual post-treatment image, $I_{pred}$ represents the predicted post-treatment image, D() represents the discriminator network, VGG() represents the pre-trained VGG19 model, and Registration() represents the registration network.

[0105] The main evaluation indicators are:

[0106] (1) Mean Absolute Error (MAE): This measures the average absolute difference between each corresponding pixel in the predicted image and the actual image. The smaller the MAE, the smaller the difference between the predicted image and the actual image, meaning that the model's predicted image is closer to the actual image.

[0107] $\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} |x_i - y_i|$

[0108] Where n is the number of pixels, and $x_i$ and $y_i$ are the pixel values at the same location in the predicted image and the actual image, respectively.
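As a small worked example of the MAE definition, the following sketch computes the mean absolute error between two single-channel images stored as lists of rows. The function name is a hypothetical choice.

```python
from typing import List


def mean_absolute_error(predicted: List[List[float]], actual: List[List[float]]) -> float:
    """Average absolute pixel difference between two equally sized images."""
    xs = [value for row in predicted for value in row]
    ys = [value for row in actual for value in row]
    if len(xs) != len(ys) or not xs:
        raise ValueError("images must be non-empty and have the same number of pixels")
    return sum(abs(x - y) for x, y in zip(xs, ys)) / len(xs)


if __name__ == "__main__":
    predicted = [[0.0, 2.0], [4.0, 6.0]]
    actual = [[1.0, 2.0], [4.0, 4.0]]
    print(mean_absolute_error(predicted, actual))  # (1 + 0 + 0 + 2) / 4 = 0.75
```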

[0109] (2) Peak Signal-to-Noise Ratio (PSNR): This measures the mean square error (MSE) between the predicted image and the actual image, and converts it to a logarithmic scale. The higher the PSNR, the smaller the mean square error between the predicted image and the actual image, which means the higher the quality of the predicted image.

[0110] $\mathrm{PSNR} = 10 \log_{10}\left(\frac{\mathrm{max}^2}{\mathrm{MSE}}\right)$

[0111] Where `max` is the maximum possible value of a pixel (for example, for an 8-bit grayscale image, `max = 255`), and `MSE` is the mean squared error. The calculation method is as follows:

[0112] $\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} (x_i - y_i)^2$

[0113] Where n is the number of pixels, and $x_i$ and $y_i$ are the pixel values at the same location in the predicted image and the actual image, respectively.
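The PSNR and MSE definitions translate directly into code. This is a minimal sketch with hypothetical function names; returning infinity for identical images is a common convention rather than something the text specifies.

```python
import math
from typing import List


def mean_squared_error(predicted: List[List[float]], actual: List[List[float]]) -> float:
    """Average squared pixel difference between two equally sized images."""
    xs = [value for row in predicted for value in row]
    ys = [value for row in actual for value in row]
    if len(xs) != len(ys) or not xs:
        raise ValueError("images must be non-empty and have the same number of pixels")
    return sum((x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def psnr(predicted: List[List[float]], actual: List[List[float]], max_value: float = 255.0) -> float:
    """Peak signal-to-noise ratio in decibels; infinite when the images are identical."""
    error = mean_squared_error(predicted, actual)
    if error == 0.0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / error)
```

For 8-bit images whose every pixel differs by the full range of 255, the MSE equals `max_value ** 2` and the PSNR is 0 dB, which is the worst case for this range.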

[0114] (3) Structural Similarity Index (SSIM): This index considers the similarity of brightness, contrast, and structural information between the predicted image and the actual image. SSIM comprehensively considers the structural information of the predicted image, not just the similarity of pixel values. The SSIM value is in the range of [-1, 1], and the closer it is to 1, the more similar the predicted image is to the actual image.

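[0115] $$\mathrm{SSIM}(x, y) = \frac{\left(2\mu_x\mu_y + C_1\right)\left(2\sigma_{xy} + C_2\right)}{\left(\mu_x^2 + \mu_y^2 + C_1\right)\left(\sigma_x^2 + \sigma_y^2 + C_2\right)}$$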

[0116] Where $\mu_x$ and $\mu_y$ are the mean pixel values of the predicted image x and the actual image y, which measure image brightness; $\sigma_x^2$ and $\sigma_y^2$ are the variances of the pixel values of x and y, which measure the spread of pixel values (a larger variance indicates that the image contains more detail); and $\sigma_{xy}$ is the covariance between the pixel values of x and y, which measures the linear relationship between them. A positive covariance indicates a positive correlation, while a negative covariance indicates a negative correlation. $C_1$ and $C_2$ are constants that stabilize the calculation.
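The SSIM terms described above can be sketched as follows. This is a simplified version computed from global statistics over the whole image, whereas practical SSIM implementations usually average the index over local windows. The constants $C_1 = (0.01 \cdot \mathrm{max})^2$ and $C_2 = (0.03 \cdot \mathrm{max})^2$ are commonly used defaults and are an assumption here, since this application only states that they are stabilizing constants. The function name `global_ssim` is hypothetical.

```python
from typing import Sequence


def global_ssim(
    x: Sequence[float],
    y: Sequence[float],
    max_value: float = 255.0,
    k1: float = 0.01,  # assumed default, not specified in the application
    k2: float = 0.03,  # assumed default, not specified in the application
) -> float:
    """SSIM computed once from whole-image mean, variance, and covariance."""
    if len(x) != len(y) or not x:
        raise ValueError("images must be non-empty and of equal size")
    n = len(x)
    c1 = (k1 * max_value) ** 2
    c2 = (k2 * max_value) ** 2
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    var_x = sum((a - mu_x) ** 2 for a in x) / n
    var_y = sum((b - mu_y) ** 2 for b in y) / n
    cov_xy = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


# Illustrative pixel values only (not data from the application).
image = [10, 50, 90, 130]
print(global_ssim(image, image))         # identical images give a value of 1 (up to rounding)
print(global_ssim([0, 255], [255, 0]))   # inverted image gives a negative value
```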

[0117] In practical applications, the anti-VEGF efficacy prediction model proposed in this application has better evaluation indicators than existing models, and the predicted post-treatment results have good detail, especially in the generation of details in fundus color photographs and the prediction of lesion changes in OCT images.

[0118]

[0119] Table 1 Comparison of Test Results

[0120] Qualitative and quantitative analyses show that the added modules effectively improve the quality of the images generated by the model. In particular, the addition of the high-frequency feature processing module allows for more vascular details in the generated fundus images. The added global feature module significantly improves the relevant performance metrics of OCT images during training.

[0121] Comparative analysis leads to the conclusion that adding global features, registration loss, and vessel extraction modules significantly improves model performance. Specifically, while the global feature module has limited effect on improving image quality, it significantly enhances prediction accuracy. The registration loss module excels in improving image quality, particularly in reducing mean absolute error and improving peak signal-to-noise ratio and structural similarity coefficient, but it has a slight negative impact on prediction accuracy. The vessel extraction module achieves the best performance in reducing prediction error, improving image quality, and increasing prediction accuracy. In summary, the introduction of these three modules effectively improves the overall model performance, with the vessel extraction module showing particularly outstanding performance in improving image quality and prediction accuracy, while the global feature module plays a positive role in improving prediction accuracy.

[0122] The embodiments of this application can acquire and process relevant data based on artificial intelligence technology. Artificial intelligence (AI) refers to the theories, methods, technologies, and application systems that use digital computers or machines controlled by digital computers to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use that knowledge to obtain optimal results.

[0123] Foundational technologies for artificial intelligence generally include sensors, dedicated AI chips, cloud computing, distributed storage, big data processing, operating / interactive systems, and mechatronics. AI software technologies mainly encompass computer vision, robotics, biometrics, speech processing, natural language processing, and machine learning / deep learning.

[0124] Those skilled in the art will understand that all or part of the processes in the methods of the above embodiments can be implemented by instructing related hardware with computer-readable instructions. These computer-readable instructions can be stored in a computer-readable storage medium. When executed, the program can include the processes of the embodiments of the above methods. The aforementioned storage medium can be a non-volatile storage medium such as a magnetic disk, optical disk, or read-only memory (ROM), or random access memory (RAM).

[0125] It should be understood that although the steps in the flowcharts of the accompanying figures are shown sequentially as indicated by the arrows, these steps are not necessarily executed in the order indicated by the arrows. Unless explicitly stated herein, there is no strict order restriction on the execution of these steps, and they can be executed in other orders. Moreover, at least some steps in the flowcharts of the accompanying figures may include multiple sub-steps or multiple stages. These sub-steps or stages are not necessarily completed at the same time, but can be executed at different times, and their execution order is not necessarily sequential, but can be performed alternately or in turn with other steps or at least some of the sub-steps or stages of other steps.

[0126] Example 2

[0127] With further reference to Figure 4, as an implementation of the method shown in Figure 2 above, this application provides an embodiment of an anti-VEGF efficacy prediction device. This device embodiment corresponds to the method embodiment shown in Figure 2, and the device can be applied to various electronic devices.

[0128] As shown in Figure 4, the anti-VEGF efficacy prediction device 200 of this application embodiment includes:

[0129] The raw data acquisition module 210 is used to acquire the original fundus color image and the original OCT image to be predicted;

[0130] The fundus color image prediction module 220 is used to input the original fundus color image into the fundus color image prediction model to perform a fundus color image prediction operation and obtain the predicted fundus color image;

[0131] The fundus color image stitching module 230 is used to perform a fundus color image stitching operation on the original fundus color image and the predicted fundus color image to obtain a stitched fundus color image;

[0132] The OCT image prediction module 240 is used to input the original OCT image into the OCT image prediction model to perform an OCT image prediction operation and obtain the predicted OCT image;

[0133] The OCT image stitching module 250 is used to perform an OCT image stitching operation on the original OCT image and the predicted OCT image to obtain a stitched OCT image;

[0134] The vision change prediction module 260 is used to input the stitched fundus color image and the stitched OCT image into the vision change prediction model to perform a vision change prediction operation and obtain a vision change prediction result.
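The data flow through modules 210 to 260 can be sketched as below. This is a minimal illustration under stated assumptions, not the implementation of this application: an image is represented as a plain list of channels, "stitching" is assumed to mean concatenation along the channel dimension, and the three models are passed in as hypothetical callables (`fundus_model`, `oct_model`, `vision_model`) that stand in for the trained networks.

```python
from typing import Any, Callable, List

Image = List[Any]  # hypothetical representation: one list entry per channel


def stitch(original: Image, predicted: Image) -> Image:
    """Assumed stitching: concatenate the two images along the channel dimension."""
    return list(original) + list(predicted)


def predict_vision_change(
    fundus: Image,
    oct_image: Image,
    fundus_model: Callable[[Image], Image],
    oct_model: Callable[[Image], Image],
    vision_model: Callable[[Image, Image], Any],
) -> Any:
    """Run the pipeline on already-acquired inputs (module 210 supplies them)."""
    predicted_fundus = fundus_model(fundus)             # module 220
    stitched_fundus = stitch(fundus, predicted_fundus)  # module 230
    predicted_oct = oct_model(oct_image)                # module 240
    stitched_oct = stitch(oct_image, predicted_oct)     # module 250
    return vision_model(stitched_fundus, stitched_oct)  # module 260


# Stand-in models for illustration only: each image model echoes its input,
# and the vision model just reports how many channels it received.
fundus_input = [[0.1], [0.2], [0.3]]  # three color channels
oct_input = [[0.5]]                   # one grayscale channel
result = predict_vision_change(
    fundus_input,
    oct_input,
    fundus_model=lambda img: [list(ch) for ch in img],
    oct_model=lambda img: [list(ch) for ch in img],
    vision_model=lambda f, o: {"fundus_channels": len(f), "oct_channels": len(o)},
)
print(result)
```

Passing the models in as callables keeps the sketch independent of any particular deep learning framework; in practice each callable would wrap a trained network.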

[0135] In this embodiment, an anti-VEGF efficacy prediction device 200 is provided, comprising: a raw data acquisition module 210 for acquiring raw fundus color images and raw OCT images to be predicted; a fundus color image prediction module 220 for inputting the raw fundus color images into a fundus color image prediction model to perform fundus color image prediction operations, thereby obtaining a predicted fundus color image; a fundus color image stitching module 230 for stitching the raw fundus color images and the predicted fundus color images together to obtain a stitched fundus color image; an OCT image prediction module 240 for inputting the raw OCT image into an OCT image prediction model to perform OCT image prediction operations, thereby obtaining a predicted OCT image; an OCT image stitching module 250 for stitching the raw OCT image and the predicted OCT image together to obtain a stitched OCT image; and a vision change prediction module 260 for inputting the stitched fundus color image and the stitched OCT image into a vision change prediction model to perform vision change prediction operations, thereby obtaining a vision change prediction result. Compared to existing technologies, this application combines two different types of medical image data—fundus color photography and OCT (optical coherence tomography) images—to enable the system to capture information about eye health from multiple dimensions. Fundus color photography primarily reflects the macroscopic structure and color changes of the retina, while OCT images provide microscopic tomographic information about the retina and deeper tissues. This fusion of multimodal data significantly improves the comprehensiveness and accuracy of eye health assessment, thereby enhancing the precision of vision change prediction.

[0136] To address the aforementioned technical problems, embodiments of this application also provide a computer device. Please refer to Figure 5, which is a basic structural block diagram of a computer device according to an embodiment of this application.

[0137] The computer device 300 includes a memory 310, a processor 320, and a network interface 330 that are interconnected via a system bus. It should be noted that only the computer device 300 with components 310-330 is shown in the figure; however, it should be understood that it is not required to implement all the shown components, and more or fewer components can be implemented alternatively. Those skilled in the art will understand that the computer device described here is a device capable of automatically performing numerical calculations and / or information processing according to pre-set or stored instructions, and its hardware includes, but is not limited to, microprocessors, application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), digital signal processors (DSPs), embedded devices, etc.

[0138] The computer device can be a desktop computer, laptop, handheld computer, or cloud server, etc. The computer device can interact with the user via a keyboard, mouse, remote control, touchpad, or voice control.

[0139] The memory 310 includes at least one type of readable storage medium, including flash memory, hard disk, multimedia card, card-type memory (e.g., SD or DX memory), random access memory (RAM), static random access memory (SRAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), programmable read-only memory (PROM), magnetic memory, magnetic disk, optical disk, etc. In some embodiments, the memory 310 may be an internal storage unit of the computer device 300, such as the hard disk or memory of the computer device 300. In other embodiments, the memory 310 may also be an external storage device of the computer device 300, such as a plug-in hard disk, smart media card (SMC), secure digital (SD) card, flash card, etc. Of course, the memory 310 may also include both internal storage units and external storage devices of the computer device 300. In this embodiment, the memory 310 is typically used to store the operating system and various application software installed on the computer device 300, such as computer-readable instructions for predicting the efficacy of anti-VEGF treatment. Furthermore, the memory 310 can also be used to temporarily store various types of data that have been output or will be output.

[0140] In some embodiments, the processor 320 may be a central processing unit (CPU), controller, microcontroller, microprocessor, or other data processing chip. The processor 320 is typically used to control the overall operation of the computer device 300. In this embodiment, the processor 320 is used to execute computer-readable instructions stored in the memory 310 or to process data, for example, to execute computer-readable instructions for the anti-VEGF efficacy prediction method.

[0141] The network interface 330 may include a wireless network interface or a wired network interface, which is typically used to establish communication connections between the computer device 300 and other electronic devices.

[0142] The computer device provided in this application, by combining two different types of medical image data—fundus color photography and OCT (optical coherence tomography) images—can capture information about the state of eye health from multiple dimensions. Fundus color photography mainly reflects the macroscopic structure and color changes of the retina, while OCT images provide microscopic tomographic information about the retina and deeper tissues. This fusion of multimodal data significantly improves the comprehensiveness and accuracy of eye health assessment, thereby enhancing the accuracy of vision change prediction.

[0143] This application also provides another embodiment, namely, providing a computer-readable storage medium storing computer-readable instructions that can be executed by at least one processor to cause the at least one processor to perform the steps of the anti-VEGF efficacy prediction method as described above.

[0144] The computer-readable storage medium provided in this application, by combining two different types of medical image data—fundus color photography and OCT (optical coherence tomography) images—enables the system to capture information about eye health status from multiple dimensions. Fundus color photography primarily reflects the macroscopic structure and color changes of the retina, while OCT images provide microscopic tomographic information about the retina and deeper tissues. This fusion of multimodal data significantly improves the comprehensiveness and accuracy of eye health assessment, thereby enhancing the precision of vision change prediction.

[0145] Through the above description of the embodiments, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by means of software plus necessary general-purpose hardware platforms. Of course, they can also be implemented by hardware, but in many cases the former is a better implementation method. Based on this understanding, the technical solution of this application, in essence, or the part that contributes to the prior art, can be embodied in the form of a software product. This computer software product is stored in a storage medium (such as ROM / RAM, magnetic disk, optical disk), and includes several instructions to cause a terminal device (which may be a mobile phone, computer, server, air conditioner, or network device, etc.) to execute the methods described in the various embodiments of this application.

[0146] Obviously, the embodiments described above are only some embodiments of this application, not all embodiments. The accompanying drawings show preferred embodiments of this application, but do not limit the patent scope of this application. This application can be implemented in many different forms; these embodiments are provided so that the disclosure of this application is understood more thoroughly and comprehensively. Although this application has been described in detail with reference to the foregoing embodiments, those skilled in the art can still modify the technical solutions described in the foregoing specific embodiments, or make equivalent substitutions for some of the technical features. Any equivalent structures made using the content of this application's specification and drawings, directly or indirectly applied to other related technical fields, are similarly within the scope of patent protection of this application.

Claims

1. A method for predicting anti-VEGF efficacy, characterized in that it comprises the following steps: obtaining the original fundus color image and the original OCT image to be predicted; inputting the original fundus color image into the fundus color image prediction model to perform a fundus color image prediction operation and obtain the predicted fundus color image; stitching the original fundus color image and the predicted fundus color image together to obtain a stitched fundus color image; inputting the original OCT image into the OCT image prediction model to perform an OCT image prediction operation and obtain the predicted OCT image; stitching the original OCT image and the predicted OCT image together to obtain a stitched OCT image; and inputting the stitched fundus color image and the stitched OCT image into the vision change prediction model to perform a vision change prediction operation and obtain a vision change prediction result; wherein the fundus color image prediction model consists of a fundus color image residual module, a blood vessel extraction module, and a fundus color image global feature extraction module, and the predicted fundus color image is represented by a formula whose symbols respectively represent: channel attention; the feature map after processing by the fundus color image residual module; the high-frequency feature map extracted by the blood vessel extraction module; the original fundus color image; the global feature map output by the fundus color image global feature extraction module; an activation function; a convolution operation; and concatenation along the channel dimension; and wherein the OCT image prediction model consists of a downsampling module, an OCT image residual module, an upsampling module, and an OCT image global feature extraction module, and the predicted OCT image is represented by a formula whose symbols respectively represent: the OCT image after passing through the downsampling module; a downsampling operation; the OCT feature map after processing by the OCT image residual module; an upsampling operation; the OCT image processed by the upsampling module; instance normalization; reflection padding; and global feature processing of the OCT image.

2. The method for predicting the efficacy of anti-VEGF therapy according to claim 1, characterized in that the blood vessel extraction module is represented by a formula in which H represents the blood vessel extraction module, I represents the input image, and the filter term represents a low-pass Gaussian filter with a radius of r and a space constant of r.

3. The method for predicting the efficacy of anti-VEGF therapy according to claim 1, characterized in that the fundus color image global feature extraction module or the OCT image global feature extraction module is represented by a formula in which GFB represents the global feature processing module, I represents the input feature image, Sigmoid represents the non-linear activation layer, Conv represents the convolution operation, Cat represents the concatenation operation along the channel dimension, Max represents max pooling, and Mean represents average pooling.

4. A device for predicting the efficacy of anti-VEGF therapy, characterized in that it comprises: a raw data acquisition module, used to acquire the original fundus color image and the original OCT image to be predicted; a fundus color image prediction module, used to input the original fundus color image into the fundus color image prediction model to perform a fundus color image prediction operation and obtain the predicted fundus color image; a fundus color image stitching module, used to perform a fundus color image stitching operation on the original fundus color image and the predicted fundus color image to obtain a stitched fundus color image; an OCT image prediction module, used to input the original OCT image into the OCT image prediction model to perform an OCT image prediction operation and obtain the predicted OCT image; an OCT image stitching module, used to perform an OCT image stitching operation on the original OCT image and the predicted OCT image to obtain a stitched OCT image; and a vision change prediction module, used to input the stitched fundus color image and the stitched OCT image into the vision change prediction model to perform a vision change prediction operation and obtain a vision change prediction result; wherein the fundus color image prediction model consists of a fundus color image residual module, a blood vessel extraction module, and a fundus color image global feature extraction module, and the predicted fundus color image is represented by a formula whose symbols respectively represent: channel attention; the feature map after processing by the fundus color image residual module; the high-frequency feature map extracted by the blood vessel extraction module; the original fundus color image; the global feature map output by the fundus color image global feature extraction module; an activation function; a convolution operation; and concatenation along the channel dimension; and wherein the OCT image prediction model consists of a downsampling module, an OCT image residual module, an upsampling module, and an OCT image global feature extraction module, and the predicted OCT image is represented by a formula whose symbols respectively represent: the OCT image after passing through the downsampling module; a downsampling operation; the OCT feature map after processing by the OCT image residual module; an upsampling operation; the OCT image processed by the upsampling module; instance normalization; reflection padding; and global feature processing of the OCT image.

5. A computer device, comprising a memory and a processor, characterized in that the memory stores computer-readable instructions, and when the processor executes the computer-readable instructions, it implements the steps of the anti-VEGF efficacy prediction method as described in any one of claims 1 to 3.

6. A computer-readable storage medium, characterized in that the computer-readable storage medium stores computer-readable instructions which, when executed by a processor, implement the steps of the anti-VEGF efficacy prediction method as described in any one of claims 1 to 3.